Sep 07, 2021
--

Seqera Labs grabs $5.5M to help sequence COVID-19 variants and other complex data problems

Bringing order and understanding to unstructured information scattered across disparate silos has been one of the more significant breakthroughs of the big data era. Today, a European startup that has built a platform to tackle this challenge in the life sciences — one that, notably, labs have used to sequence and so far identify two major COVID-19 variants — is announcing funding to continue building out its tools for a wider set of use cases and to expand into North America.

Seqera Labs, a Barcelona-based data orchestration and workflow platform tailored to help scientists and engineers order and gain insights from cloud-based genomic data troves, as well as to tackle other life science applications that involve harnessing complex data from multiple locations, has raised $5.5 million in seed funding.

Talis Capital and Speedinvest co-led this round, with participation also from previous backer BoxOne Ventures and a grant from the Chan Zuckerberg Initiative, Mark Zuckerberg and Dr. Priscilla Chan’s effort to back open source software projects for science applications.

Seqera — a portmanteau of “sequence” and “era”, the age of sequencing data — had previously raised less than $1 million and, quietly, is already generating revenue, with five of the world’s biggest pharmaceutical companies in its customer base, alongside biotech and other life sciences customers.

Seqera was spun out of the Centre for Genomic Regulation, a biomedical research center based out of Barcelona, where it was built as the commercial application of Nextflow, open source workflow and data orchestration software originally created by the founders of Seqera, Evan Floden and Paolo Di Tommaso, at the CGR.

Floden, Seqera’s CEO, told TechCrunch that he and Di Tommaso were motivated to create Seqera in 2018 after seeing Nextflow gain a lot of traction in the life science community, and subsequently getting a lot of repeat requests for further customization and features. Both Nextflow and Seqera have seen a lot of usage: the Nextflow runtime has been downloaded more than 2 million times, the company said, while Seqera’s commercial cloud offering has now processed more than 5 billion tasks.

The COVID-19 pandemic is a classic example of the acute challenge that Seqera (and by association Nextflow) aims to address in the scientific community. With COVID-19 outbreaks happening globally, each time a test for COVID-19 is processed in a lab, a live genetic sample of the virus gets collected. Taken together, these millions of tests represent a goldmine of information about the coronavirus: how it is mutating, and when and where it is doing so. For a new virus about which so little is understood and which is still spreading, that’s invaluable data.

So the problem is not whether the data exists for better insights (it does); it is that legacy tools make it nearly impossible to view that data as a holistic body. It is in too many places, there is too much of it, and it is growing and changing every day — which means the traditional approach of porting data to a centralized location to run analytics on it would be inefficient and would cost a fortune to execute.

That is where Seqera comes in. The company’s technology treats each source of data across different clouds as a distinct pipeline that can be merged and analyzed as a single body, without the data ever leaving the boundaries of the infrastructure where it already resides. With the platform tuned for genomic troves, scientists can then query that information for more insights. Seqera was central to the discovery of both the Alpha and Delta variants of the virus, and work is still ongoing as COVID-19 continues to hammer the globe.
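Seqera’s actual implementation is built on Nextflow and is far more involved, but the core idea — run the computation where each dataset lives and merge only the results — can be loosely illustrated in a few lines of Python. The site names and lineage labels below are hypothetical stand-ins for distributed data stores:

```python
from collections import Counter

# Hypothetical per-site sample stores; in practice these would live in
# separate clouds, and the summarization code would run next to each one.
SITE_DATA = {
    "eu-lab":   ["B.1.1.7", "B.1.617.2", "B.1.1.7"],
    "us-lab":   ["B.1.617.2", "B.1.617.2"],
    "asia-lab": ["B.1.1.7"],
}

def local_summary(samples):
    """Runs where the data lives; only the summary leaves the site."""
    return Counter(samples)

def merged_view(sites):
    """Combine per-site summaries into one holistic result."""
    total = Counter()
    for samples in sites.values():
        total += local_summary(samples)
    return total

print(merged_view(SITE_DATA))  # lineage counts across all sites
```

The raw samples never cross site boundaries; only the compact per-site counts are shipped and merged, which is what makes the approach tractable when the underlying data is too large and too scattered to centralize.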

Seqera is being used in other kinds of medical applications, such as in the realm of so-called “precision medicine.” This is emerging as a very big opportunity in complex fields like oncology: cancer mutates and behaves differently depending on many factors, including genetic differences of the patients themselves, which means that treatments are less effective if they are “one size fits all.”

Increasingly, we are seeing approaches that leverage machine learning and big data analytics to better understand individual cancers and how they develop for different populations, to subsequently create more personalized treatments, and Seqera comes into play as a way to sequence that kind of data.

This also highlights something else notable about the Seqera platform: it is used directly by the people who are analyzing the data — that is, the researchers and scientists themselves, without data specialists necessarily needing to get involved. This was a practical priority for the company, Floden told me, but nonetheless, it’s an interesting detail of how the platform is inadvertently part of that bigger trend of “no-code/low-code” software, designed to make highly technical processes usable by non-technical people.

It’s both the existing opportunity and how Seqera might be applied in the future across other kinds of data that lives in the cloud that makes it an interesting company, and it seems an interesting investment, too.

“Advancements in machine learning, and the proliferation of volumes and types of data, are leading to increasingly more applications of computer science in life sciences and biology,” said Kirill Tasilov, principal at Talis Capital, in a statement. “While this is incredibly exciting from a humanity perspective, it’s also skyrocketing the cost of experiments to sometimes millions of dollars per project as they become computer-heavy and complex to run. Nextflow is already a ubiquitous solution in this space and Seqera is driving those capabilities at an enterprise level – and in doing so, is bringing the entire life sciences industry into the modern age. We’re thrilled to be a part of Seqera’s journey.”

“With the explosion of biological data from cheap, commercial DNA sequencing, there is a pressing need to analyse increasingly growing and complex quantities of data,” added Arnaud Bakker, principal at Speedinvest. “Seqera’s open and cloud-first framework provides an advanced tooling kit allowing organisations to scale complex deployments of data analysis and enable data-driven life sciences solutions.”

Although medicine and life sciences are perhaps Seqera’s most obvious and timely applications today, the framework originally designed for genetics and biology can be applied to any number of other areas: AI training, image analysis and astronomy are three early use cases, Floden said. Astronomy is perhaps especially apt, since it seems the sky is the limit.

“We think we are in the century of biology,” Floden said. “It’s the center of activity and it’s becoming data-centric, and we are here to build services around that.”

Seqera is not disclosing its valuation with this round.

Aug 19, 2021
--

Companies betting on data must value people as much as AI

The Pareto principle, also known as the 80-20 rule, asserts that 80% of consequences come from 20% of causes, rendering the remainder way less impactful.

Those working with data may have heard a different rendition of the 80-20 rule: A data scientist spends 80% of their time at work cleaning up messy data as opposed to doing actual analysis or generating insights. Imagine a 30-minute drive expanded to two-and-a-half hours by traffic jams, and you’ll get the picture.


While most data scientists spend more than 20% of their time at work on actual analysis, they still have to waste countless hours turning a trove of messy data into a tidy dataset ready for analysis. This process can include removing duplicate data, making sure all entries are formatted correctly and doing other preparatory work.

On average, this workflow stage takes up about 45% of the total time, a recent Anaconda survey found. An earlier poll by CrowdFlower put the estimate at 60%, and many other surveys cite figures in this range.

None of this is to say data preparation is not important. “Garbage in, garbage out” is a well-known rule in computer science circles, and it applies to data science, too. In the best-case scenario, the script will just return an error, warning that it cannot calculate the average spending per client, because the entry for customer #1527 is formatted as text, not as a numeral. In the worst case, the company will act on insights that have little to do with reality.
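To make the cost concrete, the cleanup described above — dropping duplicate rows and coercing text-formatted numbers before computing something as simple as an average — looks like this in a minimal Python sketch. The records and field names are hypothetical:

```python
# Hypothetical raw records: a duplicated row and a spend value stored as
# text ("1,204.50") rather than a number -- the kind of mess that eats
# up an analyst's time before any real analysis starts.
raw = [
    {"customer": 1526, "spend": 830.0},
    {"customer": 1527, "spend": "1,204.50"},   # text, not a numeral
    {"customer": 1528, "spend": 990.0},
    {"customer": 1528, "spend": 990.0},        # duplicate entry
]

def clean(records):
    seen, out = set(), []
    for r in records:
        key = r["customer"]
        if key in seen:             # drop duplicate rows
            continue
        seen.add(key)
        spend = r["spend"]
        if isinstance(spend, str):  # coerce text-formatted numbers
            spend = float(spend.replace(",", ""))
        out.append({"customer": key, "spend": spend})
    return out

tidy = clean(raw)
avg = sum(r["spend"] for r in tidy) / len(tidy)
print(round(avg, 2))  # average spend per customer
```

Without the cleaning step, the average either fails outright on the text entry or silently double-counts the duplicate — the two failure modes the article describes.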

The real question to ask here is whether re-formatting the data for customer #1527 is really the best way to use the time of a well-paid expert. The average data scientist is paid between $95,000 and $120,000 per year, according to various estimates. Having the employee on such pay focus on mind-numbing, non-expert tasks is a waste both of their time and the company’s money. Besides, real-world data has a lifespan, and if a dataset for a time-sensitive project takes too long to collect and process, it can be outdated before any analysis is done.

What’s more, companies’ quests for data often include wasting the time of non-data-focused personnel, with employees asked to help fetch or produce data instead of working on their regular responsibilities. More than half of the data being collected by companies is often not used at all, suggesting that the time of everyone involved in the collection has been wasted to produce nothing but operational delay and the associated losses.

The data that has been collected, on the other hand, is often only used by a designated data science team that is too overworked to go through everything that is available.

All for data, and data for all

The issues outlined here all play into the fact that save for the data pioneers like Google and Facebook, companies are still wrapping their heads around how to re-imagine themselves for the data-driven era. Data is pulled into huge databases and data scientists are left with a lot of cleaning to do, while others, whose time was wasted on helping fetch the data, do not benefit from it too often.

The truth is, we are still early in the data transformation. The success of tech giants that put data at the core of their business models set off a spark that is only now catching on more broadly. And even though the results are mixed so far, that is a sign that companies have yet to master thinking with data.

Data holds much value, and businesses are very much aware of it, as showcased by the appetite for AI experts in non-tech companies. Companies just have to do it right, and one of the key tasks in this respect is to start focusing on people as much as on AI.

Data can enhance the operations of virtually any component within the organizational structure of any business. As tempting as it may be to think of a future where there is a machine learning model for every business process, we do not need to tread that far right now. The goal for any company looking to tap data today comes down to getting it from point A to point B. Point A is the part in the workflow where data is being collected, and point B is the person who needs this data for decision-making.

Importantly, point B does not have to be a data scientist. It could be a manager trying to figure out the optimal workflow design, an engineer looking for flaws in a manufacturing process or a UI designer doing A/B testing on a specific feature. All of these people must have the data they need at hand all the time, ready to be processed for insights.

People can thrive with data just as well as models, especially if the company invests in them and makes sure to equip them with basic analysis skills. In this approach, accessibility must be the name of the game.

Skeptics may claim that big data is nothing but an overused corporate buzzword, but advanced analytics capacities can enhance the bottom line for any company as long as it comes with a clear plan and appropriate expectations. The first step is to focus on making data accessible and easy to use and not on hauling in as much data as possible.

In other words, an all-around data culture is just as important for an enterprise as the data infrastructure.

Jul 15, 2021
--

The CockroachDB EC-1

Every application is a palimpsest of technologies, each layer forming a base that enables the next layer to function. Web front ends rely on JavaScript and browser DOM, which rely on back-end APIs, which themselves rely on databases.

As one goes deeper down the stack, engineering decisions become ever more conservative — changing the location of a button in a web app is an inconvenience; changing a database engine can radically upend an entire project.

It’s little surprise then that database technologies are among the longest-lasting engineering projects in the modern software developer toolkit. MySQL, which remains one of the most popular database engines in the world, was first released in the mid-1990s, and Oracle Database, launched more than four decades ago, is still widely used in high-performance corporate environments.

Database technology can change the world, but the world in these parts changes very, very slowly. That’s made building a startup in the sector a tough equation: Sales cycles can be painfully slow, even when new features can dramatically expand a developer’s capabilities. Competition is stiff and comes from some of the largest and most entrenched tech companies in the world. Exits have also been few and far between.

That challenge — and opportunity — is what makes studying Cockroach Labs so interesting. The company behind CockroachDB attempts to solve a long-standing problem in large-scale, distributed database architecture: how to make data created in one place on the planet always available for consumption by applications that are thousands of miles away, immediately and accurately. That might sound like a simple use case, but in reality it’s a herculean task. Cockroach Labs’ story is one of an uphill struggle, but one that saw it turn into a next-generation database contender valued at $2 billion.

The lead writer of this EC-1 is Bob Reselman. Reselman has been writing about the enterprise software market for more than two decades, with a particular emphasis on teaching and educating engineers on technology. The lead editor for this package was Danny Crichton, the assistant editor was Ram Iyer, the copy editor was Richard Dal Porto, figures were designed by Bob Reselman and stylized by Bryce Durbin, and illustrations were drawn by Nigel Sussman.

CockroachDB had no say in the content of this analysis and did not get advance access to it. Reselman has no financial ties to CockroachDB or other conflicts of interest to disclose.

The CockroachDB EC-1 comprises four main articles numbering 9,100 words and a reading time of 37 minutes. Here’s what we’ll be crawling over:

We’re always iterating on the EC-1 format. If you have questions, comments or ideas, please send an email to TechCrunch Managing Editor Danny Crichton at danny@techcrunch.com.

Jul 15, 2021
--

CockroachDB, the database that just won’t die

There is an art to engineering, and sometimes engineering can transform art. For Spencer Kimball and Peter Mattis, those two worlds collided when they created the widely successful open-source graphics program, GIMP, as college students at Berkeley.

That project was so successful that when the two joined Google in 2002, Sergey Brin and Larry Page personally stopped by to tell the new hires how much they liked it and explained how they used the program to create the first Google logo.

Cockroach Labs was started by developers and stays true to its roots to this day.

In terms of good fortune in the corporate hierarchy, when you get this type of recognition in a company such as Google, there’s only one way you can go — up. They went from rising stars to stars at Google, becoming the go-to guys on the Infrastructure Team. They could easily have looked forward to a lifetime of lucrative employment.

But Kimball, Mattis and another Google employee, Ben Darnell, wanted more — a company of their own. To realize their ambitions, they created Cockroach Labs, the business entity behind their ambitious open-source database CockroachDB. Can some of the smartest former engineers in Google’s arsenal upend the world of databases in a market spotted with the gravesites of storage dreams past? That’s what we are here to find out.

Berkeley software distribution

Mattis and Kimball were roommates at Berkeley majoring in computer science in the early-to-mid-1990s. In addition to their usual studies, they became involved with the eXperimental Computing Facility (XCF), an organization of undergraduates with a keen, almost obsessive interest in CS.

Jul 15, 2021
--

How engineers fought the CAP theorem in the global war on latency

CockroachDB was intended to be a global database from the beginning. The founders of Cockroach Labs wanted to ensure that data written in one location would be viewable immediately in another location 10,000 miles away. The use case was simple, but the work needed to make it happen was herculean.

The company is betting the farm that it can solve one of the largest challenges for web-scale applications. The approach it’s taking is clever, but it’s a bit complicated, particularly for the non-technical reader. Given its history and engineering talent, the company is in the process of pulling it off and making a big impact on the database market, making it a technology well worth understanding. In short, there’s value in digging into the details.

Using CockroachDB’s multiregion feature to segment data according to geographic proximity fulfills Cockroach Labs’ primary directive: To get data as close to the user as possible.

In part 1 of this EC-1, I provided a general overview and a look at the origins of Cockroach Labs. In this installment, I’m going to cover the technical details of the technology with an eye to the non-technical reader. I’m going to describe the CockroachDB technology through three questions:

  1. What makes reading and writing data over a global geography so hard?
  2. How does CockroachDB address the problem?
  3. What does it all mean for those using CockroachDB?

What makes reading and writing data over a global geography so hard?

Spencer Kimball, CEO and co-founder of Cockroach Labs, describes the situation this way:

There’s lots of other stuff you need to consider when building global applications, particularly around data management. Take, for example, the question and answer website Quora. Let’s say you live in Australia. You have an account and you store the particulars of your Quora user identity on a database partition in Australia.

But when you post a question, you actually don’t want that data to just be posted in Australia. You want that data to be posted everywhere so that all the answers to all the questions are the same for everybody, anywhere. You don’t want to have a situation where you answer a question in Sydney and then you can see it in Hong Kong, but you can’t see it in the EU. When that’s the case, you end up getting different answers depending where you are. That’s a huge problem.

Reading and writing data over a global geography is challenging for pretty much the same reason that it’s faster to get a pizza delivered from across the street than from across the city. The essential constraints of time and space apply. Whether it’s digital data or a pepperoni pizza, the further away you are from the source, the longer stuff takes to get to you.
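The pizza analogy can be made quantitative with back-of-envelope arithmetic: light in optical fiber travels at roughly 200,000 km/s (about two-thirds of its speed in a vacuum), which puts a hard physical floor on round-trip latency before any database work happens at all. The distances below are illustrative, not from the article:

```python
# Minimum round-trip network latency imposed by distance alone,
# assuming signals in optical fiber cover ~200 km per millisecond.
FIBER_KM_PER_MS = 200.0

def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber; real latency is higher."""
    return 2 * distance_km / FIBER_KM_PER_MS

for label, km in [("same city", 50),
                  ("cross-country", 4_000),
                  ("Sydney to London", 17_000)]:
    print(f"{label}: >= {min_round_trip_ms(km):.1f} ms")
```

A single intercontinental round trip already costs on the order of 170 ms, and a naive protocol that needs several round trips per transaction multiplies that — which is exactly the constraint a globally distributed database has to engineer around.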

Jul 15, 2021
--

“Developers, as you know, do not like to pay for things”

In the previous part of this EC-1, we looked at the technical details of CockroachDB and how it provides accurate data instantaneously anywhere on the planet. In this installment, we’re going to take a look at the product side of Cockroach, with a particular focus on developer relations.

As a business, Cockroach Labs has many things going for it. The company’s approach to distributed database technology is novel. And, as more companies operate on a global level, CockroachDB has the potential to gain some significant market share internationally. The company is seven years into a typical 10-year maturity model for databases, has raised $355 million, and holds a $2 billion market value. It’s considered a double unicorn. Few database companies can say this.

The company is now aggressively expanding into the database-as-a-service space, offering its own technology in a fully managed package, expanding the spectrum of clients who can take immediate advantage of its products.

But its growth depends upon securing the love of developers while also making its product easier to use for new customers. To that end, I’m going to analyze the company’s pivot to the cloud as well as its extensive outreach to developers as it works to set itself up for long-term, sustainable success.

Cockroach Labs looks to the cloud

These days, just about any company of consequence provides services via the internet, and a growing number of these services are powered by products and services from native cloud providers. Gartner forecasted in 2019 that cloud services are growing at an annual rate of 17.5%, and there’s no sign that the growth has abated at all.

Its founders’ history with Google back in the mid-2000s has meant that Cockroach Labs has always been aware of the impact of cloud services on the commercial web. Unsurprisingly, CockroachDB could run cloud native right from its first release, given that its architecture presupposes the cloud in its operation — as we saw in part 2 of this EC-1.

Mar 03, 2021
--

Yugabyte announces $48M investment as cloud native database makes enterprise push

As demand for cloud native applications grows, Yugabyte, maker of the cloud native, open source YugabyteDB database, is seeing a corresponding rise in demand for its products, especially among large enterprise customers. Today, the company announced a $48 million financing round to help build on that momentum. The round is an extension of the startup’s $30 million Series B last June.

Lightspeed Venture Partners led the round with participation from Greenspring Associates, Dell Technologies Capital, Wipro Ventures and 8VC. That brings the startup’s total raised to $103 million, according to the company.

Kannan Muthukkaruppan, Yugabyte co-founder and president, says the startup saw a marked increase in interest in both the open source and commercial offerings in 2020 as the pandemic pushed many companies to the cloud faster than they might have gone otherwise, something many startup founders have pointed out to me.

“The distributed SQL space is definitely heating up, and if anything over the last six months almost in every vector in terms of enterprise customers — from Fortune 500 companies across financial, retail, ISP or telcos — are putting Yugabyte in production to be the system of record database to meet some of their business critical services needs,” Muthukkaruppan told me.

In addition, he’s seeing a similar rise in interest in the open source version of the product. “Similarly, the groundswell on the community and the open source adoption has been phenomenal. Our Slack [open source] user community quadrupled in 2020,” he said.

That kind of momentum led to the increased investor interest, says co-founder and CTO Karthik Ranganathan. “Some of the primary reasons to go and even ask for funding was that we realized we could accelerate some of this stuff, and we couldn’t do that with the original $30 million we had raised,” he said. The original thinking was to do a secondary raise in the $15 million to $20 million range, but multiple investors expressed interest in participating, and it ended up being $48 million when all was said and done.

Former Pivotal president Bill Cook came on board as CEO at the same time they were announcing their last funding round in June and brought some enterprise chops to the table. It was his job to figure out how to expand the market opportunity with larger high-value enterprise clients. “And so the last six or seven months has been about that, dealing with enterprise clients on one hand and then this emerging developer led cloud offering as well,” Cook said.

The company has a three tier offering that includes the open source YugabyteDB. Then there is a fully managed cloud version called Yugabyte Cloud, and finally there is a self-managed cloud version of the database called Yugabyte Platform. The latter is especially attractive to large enterprise customers, who want to be in the cloud, but still want to maintain control of their data and infrastructure, and so choose to manage the cloud installation themselves.

Yugabyte started last year with 50 employees, has since doubled that, and now expects to reach 200 by the end of this year. As it adds employees, the leadership team is cognizant of the importance of building a diverse and inclusive workforce, while recognizing the challenges in doing so.

“It’s work in progress as always. We’ve added diversity candidates right along the whole spectrum as we’ve grown but from my perspective it’s never sufficient, and we just need to keep pushing on it hard, and I think as a leadership team we recognize that,” Cook said.

The three leaders of the company have been working together remotely now since the announcement in June, and had only met briefly in person prior to the pandemic shutting down offices, but they say that it has gone smoothly. And while they would obviously like to meet in person again when the time is right, the momentum the company is experiencing shows that things are moving in the right direction, regardless of where they are getting their work done.

Note: The article originally stated this was a Series C round, but the company later clarified that was a B-1 round and we updated the article to reflect that.

Jan 27, 2021
--

Datastax acquires Kesque as it gets into data streaming

Datastax, the company best known for commercializing the open-source Apache Cassandra database, is moving beyond databases. As the company announced today, it has acquired Kesque, a cloud messaging service.

The Kesque team built its service on top of the Apache Pulsar messaging and streaming project. Datastax has now taken that team’s knowledge in this area and, combined with its own expertise, is launching its own Pulsar-based streaming platform by the name of Datastax Luna Streaming, which is now generally available.

This move comes right as Datastax is also, for the first time, announcing that it is cash-flow positive and profitable, as the company’s chief product officer, Ed Anuff, told me. “We are at over $150 million in [annual recurring revenue]. We are cash-flow positive and we are profitable,” he told me. This marks the first time the company is publicly announcing this data. In addition, the company also revealed today that about 20 percent of its annual contract value is now for DataStax Astra, its managed multi-cloud Cassandra service, and that the number of self-service Astra subscribers has more than doubled from Q3 to Q4.

The launch of Luna Streaming now gives the 10-year-old company a new area to expand into — and one that has some obvious adjacencies with its existing product portfolio.

“We looked at how a lot of developers are building on top of Cassandra,” Anuff, who joined Datastax after leaving Google Cloud last year, said. “What they’re doing is, they’re addressing what people call ‘data-in-motion’ use cases. They have huge amounts of data that are coming in, huge amounts of data that are going out — and they’re typically looking at doing something with streaming in conjunction with that. As we’ve gone in and asked, ‘What’s next for Datastax?’, streaming is going to be a big part of that.”

Given Datastax’s open-source roots, it’s no surprise the team decided to build its service on another open-source project and acquire an open-source company to help it do so. Anuff noted that while there has been a lot of hype around streaming and Apache Kafka, a cloud-native solution like Pulsar seemed like the better solution for the company. Pulsar was originally developed at Yahoo! (which, full disclosure, belongs to the same Verizon Media Group family as TechCrunch) and even before acquiring Kesque, Datastax already used Pulsar to build its Astra platform. Other Pulsar users include Yahoo, Tencent, Nutanix and Splunk.

“What we saw was that when you go and look at doing streaming in a scale-out way, that Kafka isn’t the only approach. We looked at it, and we liked the Pulsar architecture, we like what’s going on, we like the community — and remember, we’re a company that grew up in the Apache open-source community — we said, ‘okay, we think that it’s got all the right underpinnings, let’s go and get involved in that,’” Anuff said. And in the process of doing so, the team came across Kesque founder Chris Bartholomew and eventually decided to acquire his company.

The new Luna Streaming offering will be what Datastax calls a “subscription to success with Apache Pulsar.” It will include a free, production-ready distribution of Pulsar and an optional, SLA-backed subscription tier with enterprise support.

Unsurprisingly, Datastax also plans to remain active in the Pulsar community. The team is already making code contributions, but Anuff also stressed that Datastax is helping out with scalability testing. “This is one of the things that we learned in our participation in the Apache Cassandra project,” Anuff said. “A lot of what these projects need is folks coming in doing testing, helping with deployments, supporting users. Our goal is to be a great participant in the community.”

Oct 01, 2020
--

Altinity grabs $4M seed to build cloud version of ClickHouse open-source data warehouse

Earlier this month, cloud data warehouse Snowflake turned heads when it debuted on the stock market. Today, Altinity, the commercial company behind the open-source ClickHouse data warehouse, announced a $4 million seed round from Accel along with a new cloud service, Altinity.Cloud.

“Fundamentally, the company started out as an open-source services bureau offering support, training and [custom] engineering features into ClickHouse. And what we’re doing now with this investment from Accel is we’re extending it to offer a cloud platform in addition to the other things that we already have,” CEO Robert Hodges told TechCrunch.

As the company describes it, “Altinity.Cloud offers immediate access to production-ready ClickHouse clusters with expert enterprise support during every aspect of the application life cycle.” It also helps with application design and implementation and production assistance, in essence combining the consulting side of the house with the cloud service.

The company was launched in 2017 by CTO Alexander Zaitsev, who was one of the early adopters of ClickHouse. Up until now the startup has been bootstrapped with revenue from the services business.

Hodges came on board last year after a stint at VMware because he saw a company with tremendous potential, and his background in cloud services made him a good person to lead the company as it built the cloud product and moved into its next phase.

ClickHouse at its core is a relational database that can run in the cloud or on-prem with big improvements in performance, Hodges says. And he says that developers are enamored with it because you can start a project on a laptop and scale it up from there.

“We’re very simple to operate, just a single binary. You can start from a Docker image. You can run it anywhere, literally anywhere that Linux runs, from an Intel NUC all the way up to clusters with hundreds of nodes,” Hodges explained.

The investment from Accel should help the company finish building the cloud product, which has been in private beta since July, and fund a sales and marketing operation to sell it to the target enterprise market. The startup currently has 27 people, with plans to hire 15 more.

Hodges says that he wants to build a diverse and inclusive company, something he believes the tech industry at large has failed to achieve. One reason for that, he argues, is the blanket requirement of a computer science degree, which he says has created “a gate for women and people of color”; by hiring people with more diverse backgrounds, you can build a more diverse company.

“So one of the things that’s high up on my list is to get back to a more equitable and diverse population of people working on this thing,” he said.

Over time, the company sees the cloud business overtaking the consulting arm in terms of revenue, but consulting will always have a role in the revenue mix because the technology is complex by nature, even with a cloud service.

“Customers can’t just do it entirely by having a push-button interface. They will actually need humans that work with them, and help them understand how to frame problems, help them understand how to build applications that take care of that […] And then finally, help them deal with problems that naturally arise when you’re in production,” he said.

Jul
02
2020
--

QuestDB nabs $2.3M seed to build open source time series database

QuestDB, a member of the Y Combinator summer 2020 cohort, is building an open source time series database with speed top of mind. Today the startup announced a $2.3 million seed round.

Episode1 Ventures led the round, with participation from Seedcamp, 7percent Ventures, Y Combinator, Kima Ventures and several unnamed angel investors.

The database was originally conceived in 2013, when current CTO Vlad Ilyushchenko was building trading systems for a financial services company and grew frustrated with the performance limitations of the databases available at the time. In response, he began building a database that could ingest large amounts of data and process it extremely fast.

For a number of years, QuestDB was a side project, a labor of love for Ilyushchenko, until he met his co-founders Nicolas Hourcard, who became CEO, and Tancrede Collard, who became CPO. The three decided to build a startup on top of the open source project last year.

“We’re building an open source database for time series data, and time series databases are a multi-billion-dollar market because they’re central for financial services, IoT and other enterprise applications. And we basically make it easy to handle explosive amounts of data, and to reduce infrastructure costs massively,” Hourcard told TechCrunch.

He adds that it’s also about high performance. “We recently released a demo that you can access from our website that enables you to query a super large dataset — 1.6 billion rows with sub-second queries, mostly, and that just illustrates how performant the software is,” he said.
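The kind of query Hourcard describes leans on QuestDB's time-series SQL extensions, such as SAMPLE BY for time-bucketed aggregation. As a hedged sketch (the `trades` table, its columns and the local endpoint are illustrative assumptions, not details from the demo), a query against a locally running instance might look like:

```shell
# QuestDB exposes a REST query endpoint at /exec on port 9000.
# Here we compute hourly average prices over a hypothetical 'trades' table;
# SAMPLE BY 1h is QuestDB's time-series extension for bucketing rows
# by the table's designated timestamp column.
curl -G "http://localhost:9000/exec" \
  --data-urlencode "query=SELECT timestamp, avg(price) FROM trades SAMPLE BY 1h"
```

The server returns the result set as JSON, which is how the public demo can drive an interactive web page over a billion-row table.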

He sees open source as a way to build adoption from the bottom up inside organizations, winning the hearts and minds of developers first, then moving deeper in the company when they eventually build a managed cloud version of the product. For now, being open source also helps them as a small team to have a community of contributors help build the database and add to its feature set.

“We’ve got this open source product that is free to use, and it’s pretty important for us to have such a distribution model because we can basically empower developers to solve their problems, and we can ask for contributions from various communities. […] And this is really a way to spur adoption,” Hourcard said.

He says that working with YC has let them talk to other companies in the ecosystem that have built similar open source-based startups, which has been helpful. It has also taught them to set and meet goals, and given them access to some of the biggest names in Silicon Valley, including Marc Andreessen, who delivered a talk to the cohort the same day we spoke.

Today the company has seven employees, including the three founders, spread out across the US, EU and South America. He sees this geographic diversity helping when it comes to building a diverse team in the future. “We definitely want to have more diverse backgrounds to make sure that we keep having a diverse team and we’re very strongly committed to that.”

For the short term, the company wants to continue building its community, working on continuing to improve the open source product, while working on the managed cloud product.
