Apr
13
2021
--

Meroxa raises $15M Series A for its real-time data platform

Meroxa, a startup that makes it easier for businesses to build the data pipelines to power both their analytics and operational workflows, today announced that it has raised a $15 million Series A funding round led by Drive Capital. Existing investors Root, Amplify and Hustle Fund also participated in this round, which together with the company’s previously undisclosed $4.2 million seed round now brings total funding in the company to $19.2 million.

The promise of Meroxa is that businesses can use a single platform for their various data needs and won’t need a team of experts to build their infrastructure and then manage it. At its core, Meroxa provides a single software-as-a-service solution that connects relational databases to data warehouses and then helps businesses operationalize that data.

Image Credits: Meroxa

“The interesting thing is that we are focusing squarely on relational and NoSQL databases into data warehouse,” Meroxa co-founder and CEO DeVaris Brown told me. “Honestly, people come to us as a real-time Fivetran or real-time data warehouse sink. Because, you know, the industry has moved to this [extract, load, transform] format. But the beautiful part about us is, because we do change data capture, we get that granular data as it happens.” Businesses want this very granular data to be reflected inside of their data warehouses, Brown noted, but he also stressed that Meroxa can expose this stream of data as an API endpoint or point it to a webhook.

The company is able to do this because its core architecture is somewhat different from other data pipeline and integration services that, at first glance, seem to offer a similar solution. Because of this, users can use the service to connect different tools to their data warehouse but also build real-time tools on top of these data streams.

“We aren’t a point-to-point solution,” Meroxa co-founder and CTO Ali Hamidi explained. “When you set up the connection, you aren’t taking data from Postgres and only putting it into Snowflake. What’s really happening is that it’s going into our intermediate stream. Once it’s in that stream, you can then start hanging off connectors and say, ‘Okay, well, I also want to peek into the stream, I want to transfer my data, I want to filter out some things, I want to put it into S3.’ ”
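To make Hamidi's description a little more concrete, here is a minimal sketch of the fan-out idea: a source publishes change events into one intermediate stream, and any number of connectors hang off that stream, each with its own filter, transform and sink. The `Stream` and `Connector` classes and their fields are hypothetical names invented for this illustration; they are not Meroxa's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A change event as a plain dict, e.g. {"op": "insert", "table": "users", "row": {...}}.
Event = Dict[str, object]


@dataclass
class Connector:
    """Hypothetical connector: an optional filter, an optional transform and a sink."""
    name: str
    sink: Callable[[Event], None]
    keep: Callable[[Event], bool] = lambda event: True
    transform: Callable[[Event], Event] = lambda event: event


@dataclass
class Stream:
    """Hypothetical intermediate stream that every source publishes into."""
    connectors: List[Connector] = field(default_factory=list)

    def attach(self, connector: Connector) -> None:
        self.connectors.append(connector)

    def publish(self, event: Event) -> None:
        # Each attached connector sees every event independently of the others.
        for connector in self.connectors:
            if connector.keep(event):
                connector.sink(connector.transform(event))


# Two in-memory lists stand in for a warehouse table and an object-store archive.
warehouse_rows: List[Event] = []
archive_rows: List[Event] = []

stream = Stream()
stream.attach(Connector(name="warehouse", sink=warehouse_rows.append))
stream.attach(Connector(
    name="archive",
    sink=archive_rows.append,
    keep=lambda e: e["op"] != "delete",            # filter out some events
    transform=lambda e: {**e, "archived": True},   # reshape the rest
))

stream.publish({"op": "insert", "table": "users", "row": {"id": 1}})
stream.publish({"op": "delete", "table": "users", "row": {"id": 1}})
```

In this sketch the warehouse connector receives both events while the archive connector receives only the insert. Adding a third destination means attaching another connector, without touching the source.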

With this flexibility, Hamidi noted, a lot of the company’s customers start with a pretty standard use case and then quickly expand into other areas as well.

Brown and Hamidi met during their time at Heroku, where Brown was a director of product management and Hamidi a lead software engineer. But while Heroku made it very easy for developers to publish their web apps, there wasn’t anything comparable in the highly fragmented database space. The team acknowledges that there are a lot of tools that aim to solve these data problems, but few of them focus on the user experience.

“When we talk to customers now, it’s still very much an unsolved problem,” Hamidi said. “It seems kind of insane to me that this is such a common thing and there is no ‘oh, of course you use this tool because it addresses all my problems.’ And so the angle that we’re taking is that we see user experience not as a nice-to-have, it’s really an enabler, it is something that enables a software engineer or someone who isn’t a data engineer with 10 years of experience in wrangling Kafka and Postgres and all these things. […] That’s a transformative kind of change.”

It’s worth noting that Meroxa uses a lot of open-source tools, and the company has also committed to open-sourcing everything in its data plane. “This has multiple wins for us, but one of the biggest incentives is in terms of the customer, we’re really committed to having our agenda aligned. Because if we don’t do well, we don’t serve the customer. If we do a crappy job, they can just keep all of those components and run it themselves,” Hamidi explained.

Today, Meroxa, which the team founded in early 2020, has more than 24 employees (and is 100% remote). “I really think we’re building one of the most talented and most inclusive teams possible,” Brown told me. “Inclusion and diversity are very, very high on our radar. Our team is 50% black and brown. Over 40% are women. Our management team is 90% underrepresented. So not only are we building a great product, we’re building a great company, we’re building a great business.”  

Mar
22
2021
--

No-code business intelligence service y42 raises $2.9M seed round

Berlin-based y42 (formerly known as Datos Intelligence), a data warehouse-centric business intelligence service that promises to give businesses access to an enterprise-level data stack that’s as simple to use as a spreadsheet, today announced that it has raised a $2.9 million seed funding round led by La Famiglia VC. Additional investors include the co-founders of Foodspring, Personio and Petlab.

The service, which was founded in 2020, integrates with more than 100 data sources, covering all the standard B2B SaaS tools, from Airtable to Shopify and Zendesk, as well as database services like Google’s BigQuery. Users can then transform and visualize this data, orchestrate their data pipelines and trigger automated workflows based on this data (think sending Slack notifications when revenue drops or emailing customers based on your own custom criteria).

Like similar startups, y42 extends the idea of the data warehouse, which was traditionally used for analytics, and helps businesses operationalize this data. At the core of the service is a lot of open-source software, and the company contributes, for example, to GitLab’s Meltano platform for building data pipelines.

y42 founder and CEO Hung Dang. Image Credits: y42

“We’re taking the best of breed open-source software. What we really want to accomplish is to create a tool that is so easy to understand and that enables everyone to work with their data effectively,” y42 founder and CEO Hung Dang told me. “We’re extremely UX obsessed and I would describe us as a no-code/low-code BI tool, but with the power of an enterprise-level data stack and the simplicity of Google Sheets.”

Before y42, Vietnam-born Dang co-founded a major events company that operated in more than 10 countries and made millions in revenue (but with very thin margins), all while finishing up his studies with a focus on business analytics. And that in turn led him to also found a second company that focused on B2B data analytics.

Image Credits: y42

Even while building his events company, he noted, he was always very product- and data-driven. “I was implementing data pipelines to collect customer feedback and merge it with operational data, and it was really a big pain at that time,” he said. “I was using tools like Tableau and Alteryx, and it was really hard to glue them together, and they were quite expensive. So out of that frustration, I decided to develop an internal tool that was actually quite usable and in 2016, I decided to turn it into an actual company.”

He then sold this company to a major publicly listed German company. An NDA prevents him from talking about the details of this transaction, but maybe you can draw some conclusions from the fact that he spent time at Eventim before founding y42.

Given his background, it’s maybe no surprise that y42’s focus is on making life easier for data engineers and, at the same time, putting the power of these platforms in the hands of business analysts. Dang noted that y42 typically provides some consulting work when it onboards new clients, but that’s mostly to give them a head start. Given the no-code/low-code nature of the product, most analysts are able to get started pretty quickly — and for more complex queries, customers can opt to drop down from the graphical interface to y42’s low-code level and write queries in the service’s SQL dialect.

The service itself runs on Google Cloud and the 25-person team manages about 50,000 jobs per day for its clients. The company’s customers include the likes of LifeMD, Petlab and Everdrop.

Until raising this round, Dang self-funded the company and had also raised some money from angel investors. But La Famiglia felt like the right fit for y42, especially due to its focus on connecting startups with more traditional enterprise companies.

“When we first saw the product demo, it struck us how on top of analytical excellence, a lot of product development has gone into the y42 platform,” said Judith Dada, general partner at La Famiglia VC. “More and more work with data today means that data silos within organizations multiply, resulting in chaos or incorrect data. y42 is a powerful single source of truth for data experts and non-data experts alike. As former data scientists and analysts, we wish that we had y42 capabilities back then.”

Dang tells me he could have raised more but decided that he didn’t want to dilute the team’s stake too much at this point. “It’s a small round, but this round forces us to set up the right structure. For the Series A, which we plan to be towards the end of this year, we’re talking about a dimension which is 10x,” he told me.

Mar
16
2021
--

Noogata raises $12M seed round for its no-code enterprise AI platform

Noogata, a startup that offers a no-code AI solution for enterprises, today announced that it has raised a $12 million seed round led by Team8, with participation from Skylake Capital. The company, which was founded in 2019 and counts Colgate and PepsiCo among its customers, currently focuses on e-commerce, retail and financial services, but it notes that it will use the new funding to power its product development and expand into new industries.

The company’s platform offers a collection of what are essentially pre-built AI building blocks that enterprises can then connect to third-party tools like their data warehouse, Salesforce, Stripe and other data sources. An e-commerce retailer could use this to optimize its pricing, for example, thanks to recommendations from the Noogata platform, while a brick-and-mortar retailer could use it to plan which assortment to allocate to a given location.

Image Credits: Noogata

“We believe data teams are at the epicenter of digital transformation and that to drive impact, they need to be able to unlock the value of data. They need access to relevant, continuous and explainable insights and predictions that are reliable and up-to-date,” said Noogata co-founder and CEO Assaf Egozi. “Noogata unlocks the value of data by providing contextual, business-focused blocks that integrate seamlessly into enterprise data environments to generate actionable insights, predictions and recommendations. This empowers users to go far beyond traditional business intelligence by leveraging AI in their self-serve analytics as well as in their data solutions.”

We’ve obviously seen a plethora of startups in this space lately. The proliferation of data — and the advent of data warehousing — means that most businesses now have the fuel to create machine learning-based predictions. What’s often lacking, though, is the talent. There’s still a shortage of data scientists and developers who can build these models from scratch, so it’s no surprise that we’re seeing more startups that are creating no-code/low-code services in this space. The well-funded Abacus.ai, for example, targets about the same market as Noogata.

“Noogata is perfectly positioned to address the significant market need for a best-in-class, no-code data analytics platform to drive decision-making,” writes Team8 managing partner Yuval Shachar. “The innovative platform replaces the need for internal build, which is complex and costly, or the use of out-of-the-box vendor solutions which are limited. The company’s ability to unlock the value of data through AI is a game-changer. Add to that a stellar founding team, and there is no doubt in my mind that Noogata will be enormously successful.”


Feb
18
2021
--

Census raises $16M Series A to help companies put their data warehouses to work

Census, a startup that helps businesses sync their customer data from their data warehouses to their various business tools like Salesforce and Marketo, today announced that it has raised a $16 million Series A round led by Sequoia Capital. Other participants in this round include Andreessen Horowitz, which led the company’s $4.3 million seed round last year, as well as several notable angels, including Figma CEO Dylan Field, GitHub CTO Jason Warner, Notion COO Akshay Kothari and Rippling CEO Parker Conrad.

The company is part of a new crop of startups that are building on top of data warehouses. The general idea behind Census is to help businesses operationalize the data in their data warehouses, which was traditionally only used for analytics and reporting use cases. But as businesses realized that all the data they needed was already available in their data warehouses and that they could use that as a single source of truth without having to build additional integrations, an ecosystem of companies that operationalize this data started to form.

The company argues that the modern data stack, with data warehouses like Amazon Redshift, Google BigQuery and Snowflake at its core, offers all of the tools a business needs to extract and transform data (like Fivetran, dbt) and then visualize it (think Looker).

Tools like Census then essentially function as a new layer that sits between the data warehouse and the business tools that can help companies extract value from this data. With that, users can easily sync their product data into a marketing tool like Marketo or a CRM service like Salesforce, for example.

Image Credits: Census

“Three years ago, we were the first to ask, ‘Why are we relying on a clumsy tangle of wires connecting every app when everything we need is already in the warehouse? What if you could leverage your data team to drive operations?’ When the data warehouse is connected to the rest of the business, the possibilities are limitless,” Census explains in today’s announcement. “When we launched, our focus was enabling product-led companies like Figma, Canva, and Notion to drive better marketing, sales, and customer success. Along the way, our customers have pulled Census into more and more scenarios, like auto-prioritizing support tickets in Zendesk, automating invoices in NetSuite, or even integrating with HR systems.”

Census already integrates with dozens of different services and data tools and its customers include the likes of Clearbit, Figma, Fivetran, LogDNA, Loom and Notion.

Looking ahead, Census plans to use the new funding to launch new features like deeper data validation and a visual query experience. It also plans to launch code-based orchestration to make Census workflows versionable and make it easier to integrate them into an enterprise orchestration system.

Dec
16
2020
--

Hightouch raises $2.1M to help businesses get more value from their data warehouses

Hightouch, a SaaS service that helps businesses sync their customer data across sales and marketing tools, is coming out of stealth and announcing a $2.1 million seed round. The round was led by Afore Capital and Slack Fund, with a number of angel investors also participating.

At its core, Hightouch, which participated in Y Combinator’s Summer 2019 batch, aims to solve the customer data integration problems that many businesses today face.

During their time at Segment, Hightouch co-founders Tejas Manohar and Josh Curl witnessed the rise of data warehouses like Snowflake, Google’s BigQuery and Amazon Redshift, which is where a lot of Segment data ends up, after all. As businesses adopt data warehouses, they now have a central repository for all of their customer data. Typically, though, this information is then only used for analytics purposes. Together with former Bessemer Venture Partners investor Kashish Gupta, the team decided to see how they could innovate on top of this trend and help businesses activate all of this information.

Hightouch co-founders Kashish Gupta, Josh Curl and Tejas Manohar.

“What we found is that, with all the customer data inside of the data warehouse, it doesn’t make sense for it to just be used for analytics purposes — it also makes sense for these operational purposes like serving different business teams with the data they need to run things like marketing campaigns — or in product personalization,” Manohar told me. “That’s the angle that we’ve taken with Hightouch. It stems from us seeing the explosive growth of the data warehouse space, both in terms of technology advancements as well as like accessibility and adoption. […] Our goal is to be seen as the company that makes the warehouse not just for analytics but for these operational use cases.”

It helps that all of the big data warehousing platforms have standardized on SQL as their query language — and because the warehousing services have already solved the problem of ingesting all of this data, Hightouch doesn’t have to worry about this part of the tech stack either. And as Curl added, Snowflake and its competitors never quite went beyond serving the analytics use case either.

Image Credits: Hightouch

As for the product itself, Hightouch lets users create SQL queries and then send that data to different destinations — maybe a CRM system like Salesforce or a marketing platform like Marketo — after transforming it to the format that the destination platform expects.
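The query-transform-send flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, a list stands in for a CRM's API, and the `sync` helper and the destination field names (`Email`, `Plan__c`) are hypothetical rather than anything taken from Hightouch.

```python
import sqlite3
from typing import Callable, Dict, List

Record = Dict[str, object]


def sync(conn: sqlite3.Connection, query: str,
         to_destination: Callable[[Record], Record],
         send: Callable[[Record], None]) -> int:
    """Run a SQL query, reshape each row and hand it to a destination."""
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
    count = 0
    for row in conn.execute(query):
        send(to_destination(dict(row)))
        count += 1
    return count


# An in-memory SQLite database stands in for the warehouse.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE users (email TEXT, plan TEXT, seats INTEGER)")
warehouse.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("a@example.com", "free", 1), ("b@example.com", "team", 12)],
)

crm_payloads: List[Record] = []  # stands in for calls to a CRM API

synced = sync(
    warehouse,
    "SELECT email, plan, seats FROM users WHERE seats >= 10",
    # Hypothetical field names for the format the destination expects.
    to_destination=lambda r: {"Email": r["email"], "Plan__c": r["plan"].upper()},
    send=crm_payloads.append,
)
```

The SQL query decides which records are worth syncing, and the mapping function is the only destination-specific part, so pointing the same query at a marketing tool would mean swapping the mapping and the `send` callable.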

Expert users can write their own SQL queries for this, but the team also built a graphical interface to help non-developers create their own queries. The core audience, though, is data teams — and they, too, will likely see value in the graphical user interface because it will speed up their workflows as well. “We want to empower the business user to access whatever models and aggregation the data user has done in the warehouse,” Gupta explained.

The company is agnostic to how and where its users want to operationalize their data, but the most common use cases right now focus on B2C companies, where marketing teams often use the data, as well as sales teams at B2B companies.

“It feels like there’s an emerging category here of tooling that’s being built on top of a data warehouse natively, rather than being a standard SaaS tool where it is its own data store and then you manage a secondary data store,” Curl said. “We have a class of things here that connect to a data warehouse and make use of that data for operational purposes. There’s no industry term for that yet, but we really believe that that’s the future of where data engineering is going. It’s about building off this centralized platform like Snowflake, BigQuery and things like that.”

“Warehouse-native,” Manohar suggested as a potential name here. We’ll see if it sticks.

Hightouch originally raised its round after its participation in the Y Combinator demo day but decided not to disclose it until it felt like it had found the right product/market fit. Current customers include the likes of Retool, Proof, Stream and Abacus, in addition to a number of significantly larger companies the team isn’t able to name publicly.

Dec
09
2020
--

Firebolt raises $37M to take on Snowflake, Amazon and Google with a new approach to data warehousing

For many organizations, the shift to cloud computing has played out more realistically as a shift to hybrid architectures, where a company’s data is just as likely to reside in one of a number of clouds as it might in an on-premise deployment, in a data warehouse or in a data lake. Today, a startup that has built a more comprehensive way to assess, analyse and use that data is announcing funding as it looks to take on Snowflake, Amazon, Google and others in the area of enterprise data analytics.

Firebolt, which has redesigned the concept of a data warehouse to work more efficiently and at a lower cost, is today announcing that it has raised $37 million from Zeev Ventures, TLV Partners, Bessemer Venture Partners and Angular Ventures. It plans to use the funding to continue developing its product and bring on more customers.

The company is officially “launching” today but — as is the case with so many enterprise startups these days operating in stealth — it has been around for two years already building its platform and signing commercial deals. It now has some 12 large enterprise customers and is “really busy” with new business, said CEO Eldad Farkash in an interview.

The funding may sound like a large amount for a company that has not really been out in the open, but part of the reason is because of the track record of the founders. Farkash was one of the founders of Sisense, the successful business intelligence startup, and he has co-founded Firebolt with two others who were on Sisense’s founding team, Saar Bitner as COO and Ariel Yaroshevich as CTO.

At Sisense, these three kept coming up against an issue: When dealing with terabytes of data, cloud data warehouses were straining to deliver good performance to power the company’s analytics and other tools, and the only way to mitigate that was to pile on more cloud capacity.

Farkash is something of a technical savant and said that he decided to move on and build Firebolt to see if he could tackle this, which he described as a new, difficult and “meaningful” problem. “The only thing I know how to do is build startups,” he joked.

In his opinion, while data warehousing has been a big breakthrough in how to handle the mass of data that companies now amass and want to use better, it has started to feel like a dated solution.

“Data warehouses are solving yesterday’s problem, which was, ‘How do I migrate to the cloud and deal with scale?’ ” he said, citing Google’s BigQuery, Amazon’s Redshift and Snowflake as fitting answers for that issue. “We see Firebolt as the new entrant in that space, with a new take on design on technology. We change the discussion from one of scale to one of speed and efficiency.”

The startup claims that its performance is up to 182 times faster than that of other data warehouses. It’s a SQL-based system that works on principles that Farkash said came out of academic research that had yet to be applied anywhere, around how to handle data in a lighter way, using new techniques in compression and how data is parsed. Data lakes can in turn be connected with a wider data ecosystem, and the result is a much smaller requirement for cloud capacity.

This is not just a problem at Sisense. With enterprise data continuing to grow exponentially, cloud analytics is growing with it and will be a $65 billion market by 2025, Firebolt estimates.

Still, Farkash said the Firebolt concept was initially a challenging sell even to the engineers that it eventually hired to build out the business: It required building completely new warehouses from the ground up to run the platform, five of which exist today and will be augmented with more, on the back of this funding, he said.

And it should be pointed out that its competitors are not exactly sitting still either. Just yesterday, Dataform announced that it had been acquired by Google to help it build out and run better performance at BigQuery.

“Firebolt created a SaaS product that changes the analytics experience over big data sets,” Oren Zeev of Zeev Ventures said in a statement. “The pace of innovation in the big data space has lagged the explosion in data growth rendering most data warehousing solutions too slow, too expensive, or too complex to scale. Firebolt takes cloud data warehousing to the next level by offering the world’s most powerful analytical engine. This means companies can now analyze multi Terabyte / Petabyte data sets easily at significantly lower costs and provide a truly interactive user experience to their employees, customers or anyone who needs to access the data.”

Nov
12
2020
--

Databricks launches SQL Analytics

AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And with that, enterprises can now easily connect their business intelligence tools like Tableau and Microsoft’s Power BI to these data repositories as well.

SQL Analytics will be available in public preview on November 18.

In many ways, SQL Analytics is the product Databricks has long been looking to build and that brings its concept of a “lake house” to life. It combines the performance of a data warehouse, where you store data after it has already been transformed and cleaned, with a data lake, where you store all of your data in its raw form. The data in the data lake, a concept that Databricks’ co-founder and CEO Ali Ghodsi has long championed, is typically only transformed when it gets used. That makes data lakes cheaper, but also a bit harder to handle for users.
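The difference between the two storage styles can be illustrated with a toy example. The sketch below is not Databricks code; the record layout and the function names are assumptions made up for this illustration. The warehouse-style loader cleans and transforms records before storing them, while the lake-style loader keeps everything raw and only transforms when a query runs.

```python
import json
from typing import Dict, List

RAW_EVENTS = [
    '{"user": "ann", "amount": "19.99", "currency": "usd"}',
    '{"user": "bob", "amount": "5", "currency": "USD"}',
    '{"user": "cat", "amount": "n/a", "currency": "usd"}',  # malformed amount
]


def clean(raw: str) -> Dict[str, object]:
    """Parse and normalize one raw record; raises ValueError on a bad amount."""
    record = json.loads(raw)
    return {
        "user": record["user"],
        "amount": float(record["amount"]),
        "currency": record["currency"].upper(),
    }


def load_warehouse(raw_events: List[str]) -> List[Dict[str, object]]:
    """Warehouse style: transform and clean first, store only the clean rows."""
    table = []
    for raw in raw_events:
        try:
            table.append(clean(raw))
        except ValueError:
            pass  # rejected at load time
    return table


def load_lake(raw_events: List[str]) -> List[str]:
    """Lake style: store every record in its raw form."""
    return list(raw_events)


def query_lake_total(lake: List[str]) -> float:
    """Lake style: the transformation happens only when the data is used."""
    total = 0.0
    for raw in lake:
        try:
            total += clean(raw)["amount"]
        except ValueError:
            continue  # the query has to cope with bad records itself
    return total
```

Loading into the lake is trivial and nothing is thrown away, but every query pays for parsing and has to handle malformed records, which is the "cheaper, but a bit harder to handle" trade-off described above.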

Image Credits: Databricks

“We’ve been saying Unified Data Analytics, which means unify the data with the analytics. So data processing and analytics, those two should be merged. But no one picked that up,” Ghodsi told me. But “lake house” caught on as a term.

“Databricks has always offered data science, machine learning. We’ve talked about that for years. And with Spark, we provide the data processing capability. You can do [extract, transform, load]. That has always been possible. SQL Analytics enables you to now do the data warehousing workloads directly, and concretely, the business intelligence and reporting workloads, directly on the data lake.”

The general idea here is that with just one copy of the data, you can enable both traditional data analyst use cases (think BI) and the data science workloads (think AI) Databricks was already known for. Ideally, that makes both use cases cheaper and simpler.

The service sits on top of an optimized version of Databricks’ open-source Delta Lake storage layer, which lets it complete queries quickly. Delta Lake also provides auto-scaling endpoints to keep the query latency consistent, even under high loads.

While data analysts can query these data sets directly, using standard SQL, the company also built a set of connectors to BI tools. Its BI partners include Tableau, Qlik, Looker and ThoughtSpot, and its ingest partners include Fivetran, Fishtown Analytics, Talend and Matillion.

“Now more than ever, organizations need a data strategy that enables speed and agility to be adaptable,” said Francois Ajenstat, chief product officer at Tableau. “As organizations are rapidly moving their data to the cloud, we’re seeing growing interest in doing analytics on the data lake. The introduction of SQL Analytics delivers an entirely new experience for customers to tap into insights from massive volumes of data with the performance, reliability and scale they need.”

In a demo, Ghodsi showed me what the new SQL Analytics workspace looks like. It’s essentially a stripped-down version of the standard code-heavy experience with which Databricks users are familiar. Unsurprisingly, SQL Analytics provides a more graphical experience that focuses more on visualizations and not Python code.

While there are already some data analysts on the Databricks platform, this obviously opens up a large new market for the company — something that would surely bolster its plans for an IPO next year.

May
27
2020
--

RudderStack raises $5M seed round for its open-source Segment competitor

RudderStack, a startup that offers an open-source alternative to customer data management platforms like Segment, today announced that it has raised a $5 million seed round led by S28 Capital. Salil Deshpande of Uncorrelated Ventures and Mesosphere/D2iQ co-founder Florian Leibert (through 468 Capital) also participated in this round.

The company also announced today that it has acquired Blendo, an integration platform that helps businesses transform and move data from their data sources to databases.

Like its larger competitors, RudderStack helps businesses consolidate all of their customer data, which is now typically generated and managed in multiple places, and then extract value from this more holistic view. The company was founded by Soumyadeb Mitra, who has a Ph.D. in database systems and worked on similar problems previously when he was at 8×8 after his previous startup, MarianaIQ, was acquired by that company.

Mitra argues that RudderStack is different from its competitors thanks to its focus on developers, its privacy and security options and its focus on being a data warehouse first, without creating yet another data silo.

“Our competitors provide tools for analytics, audience segmentation, etc. on top of the data they keep,” he said. “That works well if you are a small startup, but larger enterprises have a ton of other data sources — at 8×8 we had our own internal billing system, for example — and you want to combine this internal data with the event stream data — that you collect via RudderStack or competitors — to create a 360-degree view of the customer and act on that. This becomes very difficult with the SaaS-hosted data model of our competitors — you won’t be sending all your internal data to these cloud vendors.”

Part of its appeal, of course, is the open-source nature of RudderStack, whose GitHub repository now has more than 1,700 stars for the main RudderStack server. Mitra credits getting on the front page of Hacker News for its first sale. On that day, it received over 500 GitHub stars, a few thousand clones and a lot of signups for its hosted app. “One of those signups turned out to be our first paid customer. They were already a competitor’s customer, but it wasn’t scaling up so were looking to build something in-house. That’s when they found us and started working with us,” he said.

Because it is open source, companies can run RudderStack any way they want, but like most similar open-source companies, RudderStack offers multiple hosting options itself, too, including cloud hosting, starting at $2,000 per month, with unlimited sources and destinations.

Current users include IFTTT, Mattermost, MarineTraffic, Torpedo and Wynn Las Vegas.

As for the Blendo acquisition, it’s worth noting that the company only raised a small amount of money in its seed round. The two companies did not disclose the price of the acquisition.

“With Blendo, I had the opportunity to be part of a great team that executed on the vision of turning any company into a data-driven organization,” said Blendo founder Kostas Pardalis, who has joined RudderStack as head of Growth. “We’ve combined the talented Blendo and RudderStack teams together with the technology that both companies have created, at a time when the customer data market is ripe for the next wave of innovation. I’m excited to help drive RudderStack forward.”

Mitra tells me that RudderStack acquired Blendo instead of building its own version of this technology because “it is not a trivial technology to build — cloud sources are really complicated and have weird schemas and API challenges and it would have taken us a lot of time to figure it out. There are independent large companies doing the ETL piece.”

May
19
2020
--

Microsoft launches Azure Synapse Link to help enterprises get faster insights from their data

At its Build developer conference, Microsoft today announced Azure Synapse Link, a new enterprise service that allows businesses to analyze their data faster and more efficiently, using an approach that’s generally called “hybrid transaction/analytical processing” (HTAP). That’s a mouthful; it essentially enables enterprises to use the same database system for both analytical and transactional workloads. Traditionally, enterprises had to make a trade-off between building a single system for both, which was often highly over-provisioned, and maintaining separate systems for transactional and analytics workloads.

Last year, at its Ignite conference, Microsoft announced Azure Synapse Analytics, an analytics service that combines analytics and data warehousing to create what the company calls “the next evolution of Azure SQL Data Warehouse.” Synapse Analytics brings together data from Microsoft’s services and those from its partners and makes it easier to analyze.

“One of the key things, as we work with our customers on their digital transformation journey, there is an aspect of being data-driven, of being insights-driven as a culture, and a key part of that really is that once you decide there is some amount of information or insights that you need, how quickly are you able to get to that? For us, time to insight and a secondary element, which is the cost it takes, the effort it takes to build these pipelines and maintain them with an end-to-end analytics solution, was a key metric we have been observing for multiple years from our largest enterprise customers,” said Rohan Kumar, Microsoft’s corporate VP for Azure Data.

Synapse Link takes the work Microsoft did on Synapse Analytics a step further by removing the barriers between Azure’s operational databases and Synapse Analytics, so enterprises can immediately get value from the data in those databases without going through a data warehouse first.

“What we are announcing with Synapse Link is the next major step in the same vision that we had around reducing the time to insight,” explained Kumar. “And in this particular case, a long-standing barrier that exists today between operational databases and analytics systems is these complex ETL (extract, transform, load) pipelines that need to be set up just so you can do basic operational reporting or where, in a very transactionally consistent way, you need to move data from your operational system to the analytics system, because you don’t want to impact the performance of the operational system in any way because that’s typically dealing with, depending on the system, millions of transactions per second.”

ETL pipelines, Kumar argued, are typically expensive and hard to build and maintain, yet enterprises are now building new apps, and maybe even line-of-business mobile apps, where any action that consumers take, once registered in the operational database, is immediately available for predictive analytics, for example.

From the user perspective, enabling this only takes a single click to link the two, while it removes the need for managing additional data pipelines or database resources. That, Kumar said, was always the main goal for Synapse Link. “With a single click, you should be able to enable real-time analytics on your operational data in ways that don’t have any impact on your operational systems, so you’re not using the compute part of your operational system to do the query, you actually have to transform the data into a columnar format, which is more adaptable for analytics, and that’s really what we achieved with Synapse Link.”

Because traditional HTAP systems on-premises typically share their compute resources with the operational database, those systems never quite took off, Kumar argued. In the cloud, with Synapse Link, though, that impact doesn’t exist because you’re dealing with two separate systems. Now, once a transaction gets committed to the operational database, the Synapse Link system transforms the data into a columnar format that is more optimized for the analytics system — and it does so in real time.
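
To make the row-versus-column distinction concrete, here is a minimal sketch in Python. It is purely illustrative and not how Synapse Link is implemented; the `rows_to_columns` helper and the sample order records are assumptions invented for this example. It pivots transactional records, stored one row at a time, into a layout where each field is a contiguous list, which is the shape analytics queries tend to prefer.

```python
from typing import Any, Dict, List


def rows_to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into a column-oriented layout.

    Hypothetical helper for illustration only; fields missing from a
    row are filled with None so every column has the same length.
    """
    columns: Dict[str, List[Any]] = {}
    for index, row in enumerate(rows):
        # Backfill any column first seen in this row with None values.
        for key in row:
            if key not in columns:
                columns[key] = [None] * index
        # Append this row's value (or None) to every known column.
        for key, values in columns.items():
            values.append(row.get(key))
    return columns


# Row-oriented data, as an operational database would commit it.
orders = [
    {"order_id": 1, "customer": "a", "amount": 20.0},
    {"order_id": 2, "customer": "b", "amount": 35.5},
    {"order_id": 3, "customer": "a", "amount": 12.5},
]

columns = rows_to_columns(orders)

# An aggregate now scans one contiguous list instead of every full row.
total = sum(columns["amount"])
```

In the columnar form, a query like "total order amount" only touches the `amount` list, which is roughly why this layout suits analytical workloads better than row-by-row storage.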

For now, Synapse Link is only available in conjunction with Microsoft’s Cosmos DB database. As Kumar told me, that’s because that’s where the company saw the highest demand for this kind of service, but you can expect the company to add support for Azure SQL, Azure Database for PostgreSQL and Azure Database for MySQL in the future.

Apr
22
2020
--

Fishtown Analytics raises $12.9M Series A for its open-source analytics engineering tool

Philadelphia-based Fishtown Analytics, the company behind the popular open-source data engineering tool dbt, today announced that it has raised a $12.9 million Series A round led by Andreessen Horowitz, with the firm’s general partner Martin Casado joining the company’s board.

“I wrote this blog post in early 2016, essentially saying that analysts needed to work in a fundamentally different way,” Fishtown founder and CEO Tristan Handy told me, when I asked him about how the product came to be. “They needed to work in a way that much more closely mirrored the way the software engineers work and software engineers have been figuring this shit out for years and data analysts are still like sending each other Microsoft Excel docs over email.”

The dbt open-source project forms the basis of this. It allows anyone who can write SQL queries to transform data and then load it into their preferred analytics tools. As such, it sits in between data warehouses and the tools that load data into them on one end, and specialized analytics tools on the other.

As Casado noted when I talked to him about the investment, data warehouses have now made it affordable for businesses to store all of their data before it is transformed. So what was traditionally “extract, transform, load” (ETL) has now become “extract, load, transform” (ELT). Andreessen Horowitz is already invested in Fivetran, which helps businesses move their data into their warehouses, so it makes sense for the firm to also tackle the other side of this business.
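
The ELT ordering can be sketched in a few lines of Python. This is an assumption-laden illustration, not dbt’s API or Fivetran’s: an in-memory SQLite database stands in for the data warehouse, and the `raw_orders` and `customer_revenue` tables are hypothetical names. The point is only the sequence: raw data is loaded first, untouched, and the transformation happens afterward as SQL inside the warehouse.

```python
import sqlite3

# An in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")

# Extract + load: raw records land in the warehouse untransformed.
conn.execute(
    "CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "acme", 2000), (2, "globex", 3550), (3, "acme", 1250)],
)

# Transform: plain SQL, run inside the warehouse after loading.
conn.execute(
    """
    CREATE TABLE customer_revenue AS
    SELECT customer, SUM(amount_cents) / 100.0 AS revenue
    FROM raw_orders
    GROUP BY customer
    """
)

# Downstream analytics tools would read the transformed table.
result = conn.execute(
    "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
).fetchall()
conn.close()
```

Because the raw table is kept, an analyst who knows SQL can change the transformation later without re-extracting anything from the source system.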

“Dbt is, as far as we can tell, the leading community for transformation and it’s a company we’ve been tracking for at least a year,” Casado said. He also argued that data analysts — unlike data scientists — are not really catered to as a group.

Before this round, Fishtown hadn’t raised much money, except for a small SAFE round from Amplify, even though it has been around for a few years now.

But Handy argued that the company needed this time to prove that it was on to something and build a community. That community now consists of more than 1,700 companies that use the dbt project in some form and over 5,000 people in the dbt Slack community. Fishtown also now has over 250 dbt Cloud customers and the company signed up a number of big enterprise clients earlier this year. With that, the company needed to raise money to expand and also better service its current list of customers.

“We live in Philadelphia. The cost of living is low here and none of us really care to make a quadro-billion dollars, but we do want to answer the question of how do we best serve the community,” Handy said. “And for the first time, in the early part of the year, we were like, holy shit, we can’t keep up with all of the stuff that people need from us.”

The company plans to expand the team from 25 to 50 employees in 2020. With those new hires, it plans to improve and expand the product, especially its IDE for data analysts, which Handy admitted could use a bit more polish.
