Aug 31, 2021

Databricks raises $1.6B at $38B valuation as it blasts past $600M ARR

Databricks this morning confirmed earlier reports that it was raising new capital at a higher valuation. The data- and AI-focused company has secured a $1.6 billion round at a $38 billion valuation, it said. Bloomberg first reported last week that Databricks was pursuing new capital at that price.

The Series H was led by Counterpoint Global, a Morgan Stanley fund. Other new investors included Baillie Gifford, UC Investments and ClearBridge. A grip of prior investors also kicked in cash to the round.

The new funding brings Databricks’ total private funding raised to $3.5 billion. Notably, its latest raise comes just seven months after the late-stage startup raised $1 billion on a $28 billion valuation. Its new valuation represents paper value creation in excess of $1 billion per month.

The company, which makes open source and commercial products for processing structured and unstructured data in one location, views its market as a new technology category. Databricks calls the technology a data “lakehouse,” a mashup of data lake and data warehouse.

Databricks CEO and co-founder Ali Ghodsi believes that its new capital will help his company secure market leadership.

For context, since the 1980s, large companies have stored massive amounts of structured data in data warehouses. More recently, companies like Snowflake and Databricks have provided a similar solution for unstructured data called a data lake.

In Ghodsi’s view, combining structured and unstructured data in a single place with the ability for customers to execute data science and business-intelligence work without moving the underlying data is a critical change in the larger data market.

“[Data lakehouses are] a new category, and we think there’s going to be lots of vendors in this data category. So it’s a land grab. We want to quickly race to build it and complete the picture,” he said in an interview with TechCrunch.

Ghodsi also pointed out that he is going up against well-capitalized competitors and that he wants the funds to compete hard with them.

“And you know, it’s not like we’re up against some tiny startups that are getting seed funding to build this. It’s all kinds of [large, established] vendors,” he said. That includes Snowflake, Amazon, Google and others who want to secure a piece of the new market category that Databricks sees emerging.

The company’s performance indicates that it’s onto something.

Growth

Databricks has reached the $600 million annual recurring revenue (ARR) milestone, it disclosed as part of its funding announcement. For context on how quickly it is growing at scale, the company closed 2020 at $425 million ARR.

Per the company, its new ARR figure represents 75% growth, measured on a year-over-year basis.

That’s quick for a company of its size; per the Bessemer Cloud Index, top-quartile public software companies are growing at around 44% year over year. Those companies are worth around 22x their forward revenues.

At its new valuation, Databricks is worth 63x its current ARR. So Databricks isn’t cheap, but at its current pace it should be able to grow to a size that makes its most recent private valuation easily tenable by the time it goes public, provided that it doesn’t set a new, higher bar for its future performance by raising again before then.
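As a quick back-of-the-envelope check on those multiples, here is a sketch using only the figures cited above; the forward-looking number is a hypothetical that simply assumes the reported 75% growth rate holds for another year.

```python
# Rough multiple math using the figures cited in this piece.
valuation = 38_000_000_000   # new valuation
current_arr = 600_000_000    # disclosed ARR milestone
growth_rate = 0.75           # reported year-over-year ARR growth

print(f"Current ARR multiple: {valuation / current_arr:.0f}x")   # ~63x

# If the 75% growth rate held for another year, the same valuation would
# imply a forward multiple much closer to the public-company comparables.
forward_arr = current_arr * (1 + growth_rate)
print(f"Implied forward ARR multiple: {valuation / forward_arr:.0f}x")  # ~36x
```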

Ghodsi declined to share timing around a possible IPO, and it isn’t clear whether the company will pursue a traditional IPO or if it will continue to raise private funds so that it can direct list when it chooses to float. Regardless, Databricks is now sufficiently valuable that it can only exit to one of a handful of mega-cap technology giants or go public.

Why hasn’t the company gone public? Ghodsi is enjoying a rare position in the startup market: He has access to unlimited capital. Databricks had to expand its latest round by another $100 million; it was originally set to close at just $1.5 billion. It doesn’t lack for investor interest, allowing its CEO to bring aboard the sort of shareholder he wants for his company’s post-IPO life — while enjoying limited dilution.

This also enables him to hire aggressively, possibly buy some smaller companies to fill in holes in Databricks’ product roadmap, and grow outside of the glare of Wall Street expectations from a position of capital advantage. It’s the startup equivalent of having one’s cake and eating it too.

But staying private longer isn’t without risks. If the larger market for software companies were rapidly devalued, Databricks could find itself too expensive to go public at its final private valuation. However, given the long bull market that we’ve seen in recent years for software shares, and the confidence Ghodsi has in his potential market, that doesn’t seem likely.

There’s still much about Databricks’ financial position that we don’t yet know — its gross margin profile, for example. TechCrunch is also incredibly curious about what all its fundraising and ensuing spending have done to Databricks’ near-term operating cash flow, as well as how its gross-margin-adjusted CAC payback period has evolved since the onset of COVID-19. If we ever get an S-1, we might find out.

For now, winsome private markets are giving Ghodsi and crew space to operate an effectively public company without the annoyances that come with actually being public. Want the same thing for your company? Easy: Just reach $600 million ARR while growing 75% year over year.

Jun 25, 2021

Edge Delta raises $15M Series A to take on Splunk

Seattle-based Edge Delta, a startup that is building a modern distributed monitoring stack that is competing directly with industry heavyweights like Splunk, New Relic and Datadog, today announced that it has raised a $15 million Series A funding round led by Menlo Ventures and Tim Tully, the former CTO of Splunk. Previous investors MaC Venture Capital and Amity Ventures also participated in this round, which brings the company’s total funding to date to $18 million.

“Our thesis is that there’s no way that enterprises today can continue to analyze all their data in real time,” said Edge Delta co-founder and CEO Ozan Unlu, who has worked in the observability space for about 15 years already (including at Microsoft and Sumo Logic). “The way that it was traditionally done with these primitive, centralized models — there’s just too much data. It worked 10 years ago, but gigabytes turned into terabytes and now terabytes are turning into petabytes. That whole model is breaking down.”

He acknowledges that traditional big data warehousing works quite well for business intelligence and analytics use cases. But that’s not real-time and also involves moving a lot of data from where it’s generated to a centralized warehouse. The promise of Edge Delta is that it can offer all of the capabilities of this centralized model by allowing enterprises to start to analyze their logs, metrics, traces and other telemetry right at the source. This, in turn, also gives them visibility into all of the data that’s generated there, unlike many of today’s systems, which only provide insights into a small slice of this information.

Competing services tend to have agents that run on a customer’s machine but typically only compress the data, encrypt it and then send it on to its final destination. Edge Delta’s agent, by contrast, starts analyzing the data right at the local level. With that, if you want to, for example, graph error rates from your Kubernetes cluster, you wouldn’t have to gather all of this data and send it off to your data warehouse, where it has to be indexed before it can be analyzed and graphed.

With Edge Delta, you could instead have every single node draw its own graph, which Edge Delta can then combine later on. With this, Edge Delta argues, its agent is able to offer significant performance benefits, often by orders of magnitude. It also allows businesses to run their machine learning models at the edge.
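As a rough illustration of that approach, the sketch below shows per-node pre-aggregation followed by a central merge. It is not Edge Delta’s actual agent or API; the log format and the summary shape are assumptions made purely for the example.

```python
from collections import Counter

def summarize_node_logs(log_lines):
    """Runs on each node: reduce raw log lines to a tiny summary instead of
    shipping every line to a central warehouse. Assumes each line starts
    with a severity level such as "ERROR" or "INFO"."""
    counts = Counter(line.split(" ", 1)[0] for line in log_lines)
    return {"total": sum(counts.values()), "errors": counts.get("ERROR", 0)}

def merge_summaries(summaries):
    """Runs centrally: combine the small per-node summaries into one
    cluster-wide error rate, the "graph" each node already drew locally."""
    total = sum(s["total"] for s in summaries)
    errors = sum(s["errors"] for s in summaries)
    return errors / total if total else 0.0

# Three nodes summarize locally; only the summaries travel over the network.
node_a = summarize_node_logs(["ERROR db timeout", "INFO ok", "INFO ok"])
node_b = summarize_node_logs(["INFO ok", "INFO ok"])
node_c = summarize_node_logs(["ERROR oom", "INFO ok"])
print(f"cluster error rate: {merge_summaries([node_a, node_b, node_c]):.2%}")
```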

“What I saw before I was leaving Splunk was that people were sort of being choosy about where they put workloads for a variety of reasons, including cost control,” said Menlo Ventures’ Tim Tully, who joined the firm only a couple of months ago. “So this idea that you can move some of the compute down to the edge and lower latency and do machine learning at the edge in a distributed way was incredibly fascinating to me.”

Edge Delta is able to offer a significantly cheaper service, in large part because it doesn’t have to run a lot of compute and manage huge storage pools itself, since a lot of that is handled at the edge. And while the customers obviously still incur some overhead to provision this compute power, it’s still significantly less than what they would be paying for a comparable service. The company argues that it typically sees about a 90 percent reduction in total cost of ownership compared to traditional centralized services.

Edge Delta charges based on volume, and it isn’t shy about comparing its prices with Splunk’s, which it does right on its pricing calculator. Indeed, in talking to Tully and Unlu, Splunk was clearly on everybody’s mind.

“There’s kind of this concept of unbundling of Splunk,” Unlu said. “You have Snowflake and the data warehouse solutions coming in from one side, and they’re saying, ‘hey, if you don’t care about real time, go use us.’ And then we’re the other half of the equation, which is: actually there’s a lot of real-time operational use cases and this model is actually better for those massive stream processing datasets that you [need] to analyze in real time.”

But despite this competition, Edge Delta can still integrate with Splunk and similar services. Users can still take their data, ingest it through Edge Delta and then pass it on to the likes of Sumo Logic, Splunk, AWS’s S3 and other solutions.

“If you follow the trajectory of Splunk, we had this whole idea of building this business around IoT and Splunk at the Edge — and we never really quite got there,” Tully said. “I think what we’re winding up seeing collectively is the edge actually means something a little bit different. […] The advances in distributed computing and sophistication of hardware at the edge allows these types of problems to be solved at a lower cost and lower latency.”

The Edge Delta team plans to use the new funding to expand its team and support all of the new customers that have shown interest in the product. For that, it is building out its go-to-market and marketing teams, as well as its customer success and support teams.

 

Apr 13, 2021

Meroxa raises $15M Series A for its real-time data platform

Meroxa, a startup that makes it easier for businesses to build the data pipelines to power both their analytics and operational workflows, today announced that it has raised a $15 million Series A funding round led by Drive Capital. Existing investors Root, Amplify and Hustle Fund also participated in this round, which together with the company’s previously undisclosed $4.2 million seed round now brings total funding in the company to $19.2 million.

The promise of Meroxa is that businesses can use a single platform for their various data needs and won’t need a team of experts to build their infrastructure and then manage it. At its core, Meroxa provides a single software-as-a-service solution that connects relational databases to data warehouses and then helps businesses operationalize that data.

“The interesting thing is that we are focusing squarely on [moving] relational and NoSQL databases into the data warehouse,” Meroxa co-founder and CEO DeVaris Brown told me. “Honestly, people come to us as a real-time Fivetran or real-time data warehouse sink. Because, you know, the industry has moved to this [extract, load, transform] format. But the beautiful part about us is, because we do change data capture, we get that granular data as it happens.” And businesses want this very granular data to be reflected inside of their data warehouses, Brown noted, but he also stressed that Meroxa can expose this stream of data as an API endpoint or point it at a webhook.

The company is able to do this because its core architecture is somewhat different from other data pipeline and integration services that, at first glance, seem to offer a similar solution. Because of this, users can use the service to connect different tools to their data warehouse but also build real-time tools on top of these data streams.

“We aren’t a point-to-point solution,” Meroxa co-founder and CTO Ali Hamidi explained. “When you set up the connection, you aren’t taking data from Postgres and only putting it into Snowflake. What’s really happening is that it’s going into our intermediate stream. Once it’s in that stream, you can then start hanging off connectors and say, ‘Okay, well, I also want to peek into the stream, I want to transfer my data, I want to filter out some things, I want to put it into S3.’ ”
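A toy version of that fan-out pattern might look like the following. This is a hedged sketch of the idea Hamidi describes, not Meroxa’s SDK or configuration format; the connector functions and the record shape are invented for illustration.

```python
# Hypothetical illustration: one change-data-capture stream, many consumers.
def warehouse_sink(record):
    print(f"-> warehouse: {record}")

def s3_sink(record):
    print(f"-> S3: {record}")

def webhook_sink(record):
    if record.get("table") == "orders":   # filter before forwarding
        print(f"-> webhook: {record}")

CONNECTORS = [warehouse_sink, s3_sink, webhook_sink]

def publish(record):
    """Every change event enters the intermediate stream once; each
    connector hanging off the stream decides what to do with it."""
    for consume in CONNECTORS:
        consume(record)

publish({"table": "orders", "op": "insert", "id": 42})
publish({"table": "users", "op": "update", "id": 7})
```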

With this flexibility, Hamidi noted, a lot of the company’s customers start with a pretty standard use case and then quickly expand into other areas as well.

Brown and Hamidi met during their time at Heroku, where Brown was a director of product management and Hamidi a lead software engineer. But while Heroku made it very easy for developers to publish their web apps, there wasn’t anything comparable in the highly fragmented database space. The team acknowledges that there are a lot of tools that aim to solve these data problems, but few of them focus on the user experience.

Image Credits: Meroxa

“When we talk to customers now, it’s still very much an unsolved problem,” Hamidi said. “It seems kind of insane to me that this is such a common thing and there is no ‘oh, of course you use this tool because it addresses all my problems.’ And so the angle that we’re taking is that we see user experience not as a nice-to-have, it’s really an enabler, it is something that enables a software engineer or someone who isn’t a data engineer with 10 years of experience in wrangling Kafka and Postgres and all these things. […] That’s a transformative kind of change.”

It’s worth noting that Meroxa uses a lot of open-source tools but the company has also committed to open-sourcing everything in its data plane as well. “This has multiple wins for us, but one of the biggest incentives is in terms of the customer, we’re really committed to having our agenda aligned. Because if we don’t do well, we don’t serve the customer. If we do a crappy job, they can just keep all of those components and run it themselves,” Hamidi explained.

Today, Meroxa, which the team founded in early 2020, has more than 24 employees (and is 100% remote). “I really think we’re building one of the most talented and most inclusive teams possible,” Brown told me. “Inclusion and diversity are very, very high on our radar. Our team is 50% black and brown. Over 40% are women. Our management team is 90% underrepresented. So not only are we building a great product, we’re building a great company, we’re building a great business.”  

Mar 22, 2021

No-code business intelligence service y42 raises $2.9M seed round

Berlin-based y42 (formerly known as Datos Intelligence), a data warehouse-centric business intelligence service that promises to give businesses access to an enterprise-level data stack that’s as simple to use as a spreadsheet, today announced that it has raised a $2.9 million seed funding round led by La Famiglia VC. Additional investors include the co-founders of Foodspring, Personio and Petlab.

The service, which was founded in 2020, integrates with more than 100 data sources, covering all the standard B2B SaaS tools, from Airtable to Shopify and Zendesk, as well as database services like Google’s BigQuery. Users can then transform and visualize this data, orchestrate their data pipelines and trigger automated workflows based on this data (think sending Slack notifications when revenue drops or emailing customers based on your own custom criteria).
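The revenue-alert example mentioned above boils down to a threshold check attached to a warehouse metric. The sketch below shows the general pattern only, not y42’s workflow engine; the Slack webhook URL and the numbers are placeholders.

```python
import json
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX"  # placeholder

def notify_slack(text):
    """POST a message to a Slack incoming webhook."""
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def check_revenue(today, yesterday, drop_threshold=0.2):
    """Alert when day-over-day revenue falls by more than the threshold."""
    if yesterday <= 0:
        return
    drop = (yesterday - today) / yesterday
    if drop > drop_threshold:
        notify_slack(f"Revenue dropped {drop:.0%} day over day.")

# In a real pipeline both figures would come from the warehouse query the
# workflow is attached to; they are hard-coded here to keep the sketch small.
check_revenue(today=8_000, yesterday=12_000)
```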

Like similar startups, y42 extends the idea of the data warehouse, which was traditionally used for analytics, and helps businesses operationalize this data. At the core of the service is a lot of open-source software, and the company, for example, contributes to GitLab’s Meltano platform for building data pipelines.

y42 founder and CEO Hung Dang. Image Credits: y42

“We’re taking the best-of-breed open-source software. What we really want to accomplish is to create a tool that is so easy to understand and that enables everyone to work with their data effectively,” y42 founder and CEO Hung Dang told me. “We’re extremely UX obsessed and I would describe us as a no-code/low-code BI tool — but with the power of an enterprise-level data stack and the simplicity of Google Sheets.”

Before y42, Vietnam-born Dang co-founded a major events company that operated in more than 10 countries and made millions in revenue (but with very thin margins), all while finishing up his studies with a focus on business analytics. And that in turn led him to also found a second company that focused on B2B data analytics.

Even while building his events company, he noted, he was always very product- and data-driven. “I was implementing data pipelines to collect customer feedback and merge it with operational data — and it was really a big pain at that time,” he said. “I was using tools like Tableau and Alteryx, and it was really hard to glue them together — and they were quite expensive. So out of that frustration, I decided to develop an internal tool that was actually quite usable and in 2016, I decided to turn it into an actual company.”

He then sold this company to a major publicly listed German company. An NDA prevents him from talking about the details of this transaction, but maybe you can draw some conclusions from the fact that he spent time at Eventim before founding y42.

Given his background, it’s maybe no surprise that y42’s focus is on making life easier for data engineers and, at the same time, putting the power of these platforms in the hands of business analysts. Dang noted that y42 typically provides some consulting work when it onboards new clients, but that’s mostly to give them a head start. Given the no-code/low-code nature of the product, most analysts are able to get started pretty quickly — and for more complex queries, customers can opt to drop down from the graphical interface to y42’s low-code level and write queries in the service’s SQL dialect.

The service itself runs on Google Cloud and the 25-person team manages about 50,000 jobs per day for its clients. The company’s customers include the likes of LifeMD, Petlab and Everdrop.

Until raising this round, Dang self-funded the company and had also raised some money from angel investors. But La Famiglia felt like the right fit for y42, especially due to its focus on connecting startups with more traditional enterprise companies.

“When we first saw the product demo, it struck us how on top of analytical excellence, a lot of product development has gone into the y42 platform,” said Judith Dada, general partner at La Famiglia VC. “More and more work with data today means that data silos within organizations multiply, resulting in chaos or incorrect data. y42 is a powerful single source of truth for data experts and non-data experts alike. As former data scientists and analysts, we wish that we had y42 capabilities back then.”

Dang tells me he could have raised more but decided that he didn’t want to dilute the team’s stake too much at this point. “It’s a small round, but this round forces us to set up the right structure. For the Series A, which we plan to be towards the end of this year, we’re talking about a dimension which is 10x,” he told me.

Mar 16, 2021

Noogata raises $12M seed round for its no-code enterprise AI platform

Noogata, a startup that offers a no-code AI solution for enterprises, today announced that it has raised a $12 million seed round led by Team8, with participation from Skylake Capital. The company, which was founded in 2019 and counts Colgate and PepsiCo among its customers, currently focuses on e-commerce, retail and financial services, but it notes that it will use the new funding to power its product development and expand into new industries.

The company’s platform offers a collection of what are essentially pre-built AI building blocks that enterprises can then connect to third-party tools like their data warehouse, Salesforce, Stripe and other data sources. An e-commerce retailer could use this to optimize its pricing, for example, thanks to recommendations from the Noogata platform, while a brick-and-mortar retailer could use it to plan which assortment to allocate to a given location.

“We believe data teams are at the epicenter of digital transformation and that to drive impact, they need to be able to unlock the value of data. They need access to relevant, continuous and explainable insights and predictions that are reliable and up-to-date,” said Noogata co-founder and CEO Assaf Egozi. “Noogata unlocks the value of data by providing contextual, business-focused blocks that integrate seamlessly into enterprise data environments to generate actionable insights, predictions and recommendations. This empowers users to go far beyond traditional business intelligence by leveraging AI in their self-serve analytics as well as in their data solutions.”

We’ve obviously seen a plethora of startups in this space lately. The proliferation of data — and the advent of data warehousing — means that most businesses now have the fuel to create machine learning-based predictions. What’s often lacking, though, is the talent. There’s still a shortage of data scientists and developers who can build these models from scratch, so it’s no surprise that we’re seeing more startups that are creating no-code/low-code services in this space. The well-funded Abacus.ai, for example, targets about the same market as Noogata.

“Noogata is perfectly positioned to address the significant market need for a best-in-class, no-code data analytics platform to drive decision-making,” writes Team8 managing partner Yuval Shachar. “The innovative platform replaces the need for internal build, which is complex and costly, or the use of out-of-the-box vendor solutions which are limited. The company’s ability to unlock the value of data through AI is a game-changer. Add to that a stellar founding team, and there is no doubt in my mind that Noogata will be enormously successful.”



Feb 18, 2021

Census raises $16M Series A to help companies put their data warehouses to work

Census, a startup that helps businesses sync their customer data from their data warehouses to their various business tools like Salesforce and Marketo, today announced that it has raised a $16 million Series A round led by Sequoia Capital. Other participants in this round include Andreessen Horowitz, which led the company’s $4.3 million seed round last year, as well as several notable angels, including Figma CEO Dylan Field, GitHub CTO Jason Warner, Notion COO Akshay Kothari and Rippling CEO Parker Conrad.

The company is part of a new crop of startups that are building on top of data warehouses. The general idea behind Census is to help businesses operationalize the data in their data warehouses, which was traditionally only used for analytics and reporting use cases. But as businesses realized that all the data they needed was already available in their data warehouses and that they could use that as a single source of truth without having to build additional integrations, an ecosystem of companies that operationalize this data started to form.

The company argues that the modern data stack, with data warehouses like Amazon Redshift, Google BigQuery and Snowflake at its core, offers all of the tools a business needs to extract and transform data (like Fivetran, dbt) and then visualize it (think Looker).

Tools like Census then essentially function as a new layer that sits between the data warehouse and the business tools that can help companies extract value from this data. With that, users can easily sync their product data into a marketing tool like Marketo or a CRM service like Salesforce, for example.
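Conceptually, a sync in that layer pairs a warehouse query with a destination object and a field mapping. The sketch below is a hedged illustration of the pattern rather than Census’s actual configuration format; the model, object and field names are made up.

```python
# Hypothetical sync definition: a warehouse model on one side, a business
# tool object and field mapping on the other.
sync = {
    "source_model": "SELECT user_id, email, plan, last_seen_at FROM analytics.active_users",
    "destination": "salesforce.Contact",
    "identity_key": {"email": "Email"},          # how records are matched
    "field_mapping": {
        "plan": "Subscription_Plan__c",
        "last_seen_at": "Last_Seen_Date__c",
    },
    "schedule": "every 15 minutes",
}

def plan_sync(definition):
    """Describe what the sync would do; a real tool would diff warehouse rows
    against the destination and push only the changes."""
    print(f"Query warehouse: {definition['source_model']}")
    match_field = next(iter(definition["identity_key"]))
    print(f"Upsert into {definition['destination']}, matched on {match_field}")

plan_sync(sync)
```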

“Three years ago, we were the first to ask, ‘Why are we relying on a clumsy tangle of wires connecting every app when everything we need is already in the warehouse? What if you could leverage your data team to drive operations?’ When the data warehouse is connected to the rest of the business, the possibilities are limitless,” Census explains in today’s announcement. “When we launched, our focus was enabling product-led companies like Figma, Canva, and Notion to drive better marketing, sales, and customer success. Along the way, our customers have pulled Census into more and more scenarios, like auto-prioritizing support tickets in Zendesk, automating invoices in Netsuite, or even integrating with HR systems.”

Census already integrates with dozens of different services and data tools and its customers include the likes of Clearbit, Figma, Fivetran, LogDNA, Loom and Notion.

Looking ahead, Census plans to use the new funding to launch new features like deeper data validation and a visual query experience. In addition, it also plans to launch code-based orchestration to make Census workflows versionable and make it easier to integrate them into an enterprise orchestration system.

Dec 16, 2020

Hightouch raises $2.1M to help businesses get more value from their data warehouses

Hightouch, a SaaS service that helps businesses sync their customer data across sales and marketing tools, is coming out of stealth and announcing a $2.1 million seed round. The round was led by Afore Capital and Slack Fund, with a number of angel investors also participating.

At its core, Hightouch, which participated in Y Combinator’s Summer 2019 batch, aims to solve the customer data integration problems that many businesses today face.

During their time at Segment, Hightouch co-founders Tejas Manohar and Josh Curl witnessed the rise of data warehouses like Snowflake, Google’s BigQuery and Amazon Redshift — that’s where a lot of Segment data ends up, after all. As businesses adopt data warehouses, they now have a central repository for all of their customer data. Typically, though, this information is then only used for analytics purposes. Together with former Bessemer Ventures investor Kashish Gupta, the team decided to see how they could innovate on top of this trend and help businesses activate all of this information.

Hightouch co-founders Kashish Gupta, Josh Curl and Tejas Manohar.

“What we found is that, with all the customer data inside of the data warehouse, it doesn’t make sense for it to just be used for analytics purposes — it also makes sense for these operational purposes like serving different business teams with the data they need to run things like marketing campaigns — or in product personalization,” Manohar told me. “That’s the angle that we’ve taken with Hightouch. It stems from us seeing the explosive growth of the data warehouse space, both in terms of technology advancements as well as like accessibility and adoption. […] Our goal is to be seen as the company that makes the warehouse not just for analytics but for these operational use cases.”

It helps that all of the big data warehousing platforms have standardized on SQL as their query language — and because the warehousing services have already solved the problem of ingesting all of this data, Hightouch doesn’t have to worry about this part of the tech stack either. And as Curl added, Snowflake and its competitors never quite went beyond serving the analytics use case either.

As for the product itself, Hightouch lets users create SQL queries and then send that data to different destinations — maybe a CRM system like Salesforce or a marketing platform like Marketo — after transforming it to the format that the destination platform expects.

Expert users can write their own SQL queries for this, but the team also built a graphical interface to help non-developers create their own queries. The core audience, though, is data teams — and they, too, will likely see value in the graphical user interface because it will speed up their workflows as well. “We want to empower the business user to access whatever models and aggregation the data user has done in the warehouse,” Gupta explained.
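In practice that flow comes down to running a SQL query against the warehouse, reshaping each row into the format the destination expects, and pushing the batch out. The sketch below assumes exactly that and nothing more; it is not Hightouch’s implementation, and the query, field names and marketing-tool payload are illustrative.

```python
# Illustrative only: reshape warehouse rows into a destination payload.
QUERY = "SELECT email, first_name, company, trial_ends_at FROM marts.trial_accounts"

def to_marketing_lead(row):
    """Map a warehouse row onto the field names a marketing tool expects."""
    return {
        "email": row["email"],
        "firstName": row["first_name"],
        "company": row["company"],
        "trialEndDate": row["trial_ends_at"],
    }

def run_sync(execute_query, send_batch):
    """execute_query and send_batch are stand-ins for the warehouse driver
    and the destination API client."""
    rows = execute_query(QUERY)
    send_batch([to_marketing_lead(row) for row in rows])

# Tiny fake backends so the sketch runs end to end.
fake_rows = [{"email": "a@example.com", "first_name": "Ada",
              "company": "Acme", "trial_ends_at": "2021-01-31"}]
run_sync(lambda query: fake_rows, lambda batch: print("push:", batch))
```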

The company is agnostic to how and where its users want to operationalize their data, but the most common use cases right now focus on B2C companies, where marketing teams often use the data, as well as sales teams at B2B companies.

“It feels like there’s an emerging category here of tooling that’s being built on top of a data warehouse natively, rather than being a standard SaaS tool where it is its own data store and then you manage a secondary data store,” Curl said. “We have a class of things here that connect to a data warehouse and make use of that data for operational purposes. There’s no industry term for that yet, but we really believe that that’s the future of where data engineering is going. It’s about building off this centralized platform like Snowflake, BigQuery and things like that.”

“Warehouse-native,” Manohar suggested as a potential name here. We’ll see if it sticks.

Hightouch originally raised its round after its participation in the Y Combinator demo day but decided not to disclose it until it felt like it had found the right product/market fit. Current customers include the likes of Retool, Proof, Stream and Abacus, in addition to a number of significantly larger companies the team isn’t able to name publicly.

Dec 9, 2020

Firebolt raises $37M to take on Snowflake, Amazon and Google with a new approach to data warehousing

For many organizations, the shift to cloud computing has played out more realistically as a shift to hybrid architectures, where a company’s data is just as likely to reside in one of a number of clouds as it might in an on-premise deployment, in a data warehouse or in a data lake. Today, a startup that has built a more comprehensive way to assess, analyze and use that data is announcing funding as it looks to take on Snowflake, Amazon, Google and others in the area of enterprise data analytics.

Firebolt, which has redesigned the concept of a data warehouse to work more efficiently and at a lower cost, is today announcing that it has raised $37 million from Zeev Ventures, TLV Partners, Bessemer Venture Partners and Angular Ventures. It plans to use the funding to continue developing its product and bring on more customers.

The company is officially “launching” today but — as is the case with so many enterprise startups these days operating in stealth — it has been around for two years already building its platform and signing commercial deals. It now has some 12 large enterprise customers and is “really busy” with new business, said CEO Eldad Farkash in an interview.

The funding may sound like a large amount for a company that has not really been out in the open, but part of the reason is because of the track record of the founders. Farkash was one of the founders of Sisense, the successful business intelligence startup, and he has co-founded Firebolt with two others who were on Sisense’s founding team, Saar Bitner as COO and Ariel Yaroshevich as CTO.

At Sisense, these three were coming up against an issue: When you are dealing in terabytes of data, cloud data warehouses were straining to deliver good performance to power Sisense’s analytics and other tools, and the only way to mitigate that was by piling on more cloud capacity.

Farkash is something of a technical savant and said that he decided to move on and build Firebolt to see if he could tackle this, which he described as a new, difficult and “meaningful” problem. “The only thing I know how to do is build startups,” he joked.

In his opinion, while data warehousing has been a big breakthrough in how to handle the mass of data that companies now amass and want to use better, it has started to feel like a dated solution.

“Data warehouses are solving yesterday’s problem, which was, ‘How do I migrate to the cloud and deal with scale?’ ” he said, citing Google’s BigQuery, Amazon’s RedShift and Snowflake as fitting answers for that issue. “We see Firebolt as the new entrant in that space, with a new take on design on technology. We change the discussion from one of scale to one of speed and efficiency.”

The startup claims that its performance is up to 182 times faster than that of other data warehouses. It’s a SQL-based system built on principles that Farkash said came out of academic research that had yet to be applied anywhere: handling data in a lighter way, using new techniques for compression and for how data is parsed. Data lakes, in turn, can be connected with a wider data ecosystem, and what this translates to is a much smaller requirement for cloud capacity.

This is not just a problem at Sisense. With enterprise data continuing to grow exponentially, cloud analytics is growing with it, and is estimated by 2025 to be a $65 billion market, Firebolt estimates.

Still, Farkash said the Firebolt concept was initially a challenging sell even to the engineers that it eventually hired to build out the business: It required building completely new warehouses from the ground up to run the platform, five of which exist today and will be augmented with more, on the back of this funding, he said.

And it should be pointed out that its competitors are not exactly sitting still either. Just yesterday, Dataform announced that it had been acquired by Google to help it build out and run better performance at BigQuery.

“Firebolt created a SaaS product that changes the analytics experience over big data sets,” Oren Zeev of Zeev Ventures said in a statement. “The pace of innovation in the big data space has lagged the explosion in data growth rendering most data warehousing solutions too slow, too expensive, or too complex to scale. Firebolt takes cloud data warehousing to the next level by offering the world’s most powerful analytical engine. This means companies can now analyze multi Terabyte / Petabyte data sets easily at significantly lower costs and provide a truly interactive user experience to their employees, customers or anyone who needs to access the data.”

Nov 12, 2020

Databricks launches SQL Analytics

AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And with that, enterprises can now easily connect their business intelligence tools like Tableau and Microsoft’s Power BI to these data repositories as well.

SQL Analytics will be available in public preview on November 18.

In many ways, SQL Analytics is the product Databricks has long been looking to build and that brings its concept of a “lake house” to life. It combines the performance of a data warehouse, where you store data after it has already been transformed and cleaned, with a data lake, where you store all of your data in its raw form. The data in the data lake, a concept that Databricks’ co-founder and CEO Ali Ghodsi has long championed, is typically only transformed when it gets used. That makes data lakes cheaper, but also a bit harder to handle for users.

“We’ve been saying Unified Data Analytics, which means unify the data with the analytics. So data processing and analytics, those two should be merged. But no one picked that up,” Ghodsi told me. But “lake house” caught on as a term.

“Databricks has always offered data science, machine learning. We’ve talked about that for years. And with Spark, we provide the data processing capability. You can do [extract, transform, load]. That has always been possible. SQL Analytics enables you to now do the data warehousing workloads directly, and concretely, the business intelligence and reporting workloads, directly on the data lake.”

The general idea here is that with just one copy of the data, you can enable both traditional data analyst use cases (think BI) and the data science workloads (think AI) Databricks was already known for. Ideally, that makes both use cases cheaper and simpler.

The service sits on top of an optimized version of Databricks’ open-source Delta Lake storage layer, which enables it to complete queries quickly. In addition, SQL Analytics provides auto-scaling endpoints to keep query latency consistent, even under high loads.

While data analysts can query these data sets directly, using standard SQL, the company also built a set of connectors to BI tools. Its BI partners include Tableau, Qlik, Looker and Thoughtspot, as well as ingest partners like Fivetran, Fishtown Analytics, Talend and Matillion.
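For a sense of what “standard SQL directly on the data lake” looks like in code, here is a sketch of a reporting-style query over a Delta Lake table using the open-source Spark and Delta packages. The session configuration follows the open-source Delta Lake setup; the events table and the query itself are assumptions for the example, and SQL Analytics exposes this through its own endpoints and BI connectors rather than a local Spark session.

```python
from pyspark.sql import SparkSession

# Assumes the open-source Delta Lake package is available, e.g. a session
# started with: pyspark --packages io.delta:delta-core_2.12:0.7.0
spark = (
    SparkSession.builder
    .appName("lakehouse-sql-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# A plain reporting-style aggregation, written in standard SQL, running
# straight against data stored in the lake as a Delta table named "events".
daily_errors = spark.sql("""
    SELECT date(event_time) AS day, count(*) AS error_count
    FROM events
    WHERE level = 'ERROR'
    GROUP BY date(event_time)
    ORDER BY day
""")
daily_errors.show()
```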

“Now more than ever, organizations need a data strategy that enables speed and agility to be adaptable,” said Francois Ajenstat, chief product officer at Tableau. “As organizations are rapidly moving their data to the cloud, we’re seeing growing interest in doing analytics on the data lake. The introduction of SQL Analytics delivers an entirely new experience for customers to tap into insights from massive volumes of data with the performance, reliability and scale they need.”

In a demo, Ghodsi showed me what the new SQL Analytics workspace looks like. It’s essentially a stripped-down version of the standard code-heavy experience with which Databricks users are familiar. Unsurprisingly, SQL Analytics provides a more graphical experience that focuses more on visualizations and not Python code.

While there are already some data analysts on the Databricks platform, this obviously opens up a large new market for the company — something that would surely bolster its plans for an IPO next year.

May 27, 2020

RudderStack raises $5M seed round for its open-source Segment competitor

RudderStack, a startup that offers an open-source alternative to customer data management platforms like Segment, today announced that it has raised a $5 million seed round led by S28 Capital. Salil Deshpande of Uncorrelated Ventures and Mesosphere/D2iQ co-founder Florian Leibert (through 468 Capital) also participated in this round.

In addition, the company also today announced that it has acquired Blendo, an integration platform that helps businesses transform and move data from their data sources to databases.

Like its larger competitors, RudderStack helps businesses consolidate all of their customer data, which is now typically generated and managed in multiple places — and then extract value from this more holistic view. The company was founded by Soumyadeb Mitra, who has a Ph.D. in database systems and worked on similar problems previously when he was at 8×8 after his previous startup, MarianaIQ, was acquired by that company.

Mitra argues that RudderStack is different from its competitors thanks to its focus on developers, its privacy and security options and its focus on being a data warehouse first, without creating yet another data silo.

“Our competitors provide tools for analytics, audience segmentation, etc. on top of the data they keep,” he said. “That works well if you are a small startup, but larger enterprises have a ton of other data sources — at 8×8 we had our own internal billing system, for example — and you want to combine this internal data with the event stream data — that you collect via RudderStack or competitors — to create a 360-degree view of the customer and act on that. This becomes very difficult with the SaaS-hosted data model of our competitors — you won’t be sending all your internal data to these cloud vendors.”

Part of its appeal, of course, is the open-source nature of RudderStack, whose GitHub repository now has more than 1,700 stars for the main RudderStack server. Mitra credits getting on the front page of Hacker News for its first sale. On that day, it received over 500 GitHub stars, a few thousand clones and a lot of signups for its hosted app. “One of those signups turned out to be our first paid customer. They were already a competitor’s customer, but it wasn’t scaling up, so [they] were looking to build something in-house. That’s when they found us and started working with us,” he said.

Because it is open source, companies can run RudderStack any way they want, but like most similar open-source companies, RudderStack also offers multiple hosting options itself, including cloud hosting, which starts at $2,000 per month with unlimited sources and destinations.

Current users include IFTTT, Mattermost, MarineTraffic, Torpedo and Wynn Las Vegas.

As for the Blendo acquisition, it’s worth noting that Blendo had only raised a small amount of money in its seed round. The two companies did not disclose the price of the acquisition.

“With Blendo, I had the opportunity to be part of a great team that executed on the vision of turning any company into a data-driven organization,” said Blendo founder Kostas Pardalis, who has joined RudderStack as head of Growth. “We’ve combined the talented Blendo and RudderStack teams together with the technology that both companies have created, at a time when the customer data market is ripe for the next wave of innovation. I’m excited to help drive RudderStack forward.”

Mitra tells me that RudderStack acquired Blendo instead of building its own version of this technology because “it is not a trivial technology to build — cloud sources are really complicated and have weird schemas and API challenges and it would have taken us a lot of time to figure it out. There are independent large companies doing the ETL piece.”
