Nov 12, 2020

Databricks launches SQL Analytics

AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And with that, enterprises can now easily connect their business intelligence tools like Tableau and Microsoft’s Power BI to these data repositories as well.

SQL Analytics will be available in public preview on November 18.

In many ways, SQL Analytics is the product Databricks has long been working toward, the one that brings its concept of a “lake house” to life. It combines the performance of a data warehouse, where you store data after it has already been transformed and cleaned, with the economics of a data lake, where you store all of your data in its raw form. In a data lake, a concept Databricks’ co-founder and CEO Ali Ghodsi has long championed, data is typically only transformed when it gets used. That makes data lakes cheaper, but also a bit harder for users to handle.

“We’ve been saying Unified Data Analytics, which means unify the data with the analytics. So data processing and analytics, those two should be merged. But no one picked that up,” Ghodsi told me. But “lake house” caught on as a term.

“Databricks has always offered data science, machine learning. We’ve talked about that for years. And with Spark, we provide the data processing capability. You can do [extract, transform, load]. That has always been possible. SQL Analytics enables you to now do the data warehousing workloads directly, and concretely, the business intelligence and reporting workloads, directly on the data lake.”

The general idea here is that with just one copy of the data, you can enable both traditional data analyst use cases (think BI) and the data science workloads (think AI) Databricks was already known for. Ideally, that makes both use cases cheaper and simpler.

The service sits on top of an optimized version of Databricks’ open-source Delta Lake storage layer, which enables it to complete queries quickly. In addition, the service provides auto-scaling endpoints to keep query latency consistent, even under high loads.
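
For a sense of what this looks like in practice, here is a minimal sketch of querying such an endpoint from Python, assuming the databricks-sql-connector package and placeholder connection details:

```python
# Minimal sketch: run standard SQL against a SQL Analytics endpoint.
# Hostname, HTTP path and token below are placeholders.
from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",
    http_path="/sql/1.0/endpoints/1234567890abcdef",
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cur:
        # Plain SQL, executed directly against tables in the data lake.
        cur.execute("SELECT region, SUM(revenue) FROM sales GROUP BY region")
        for row in cur.fetchall():
            print(row)
```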

While data analysts can query these data sets directly, using standard SQL, the company also built a set of connectors to BI tools. Its BI partners include Tableau, Qlik, Looker and Thoughtspot, as well as ingest partners like Fivetran, Fishtown Analytics, Talend and Matillion.

“Now more than ever, organizations need a data strategy that enables speed and agility to be adaptable,” said Francois Ajenstat, chief product officer at Tableau. “As organizations are rapidly moving their data to the cloud, we’re seeing growing interest in doing analytics on the data lake. The introduction of SQL Analytics delivers an entirely new experience for customers to tap into insights from massive volumes of data with the performance, reliability and scale they need.”

In a demo, Ghodsi showed me what the new SQL Analytics workspace looks like. It’s essentially a stripped-down version of the code-heavy experience Databricks users are familiar with. Unsurprisingly, SQL Analytics provides a more graphical experience, focused on visualizations rather than Python code.

While there are already some data analysts on the Databricks platform, this obviously opens up a large new market for the company — something that would surely bolster its plans for an IPO next year.

Nov 11, 2020

Mozart Data lands $4M seed to provide out-of-the-box data stack

Mozart Data founders Peter Fishman and Dan Silberman have been friends for over 20 years, working at various startups, and even launching a hot sauce company together along the way. As technologists, they saw companies building a data stack over and over. They decided to provide one for them and Mozart Data was born.

The company graduated from the Y Combinator Summer 2020 cohort in August and announced a $4 million seed round today led by Craft Ventures and Array Ventures with participation from Coelius Capital, Jigsaw VC, Signia VC, Taurus VC and various angel investors.

In spite of the detour into hot sauce, the two founders were mostly involved in data over the years and they formed strong opinions about what a data stack should look like. “We wanted to bring the same stack that we’ve been building at all these different startups, and make it available more broadly,” Fishman told TechCrunch.

They see a modern data stack as one that pulls together data from different databases, SaaS tools and other data sources, processes it and makes it ready for whatever business intelligence tool you use. “We do all of the parts before the BI tool. So we extract and load the data. We manage a data warehouse for you under the hood in Snowflake, and we provide a layer for you to do transformations,” he said.
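
As a rough sketch of what “all of the parts before the BI tool” means, an extract-load-transform flow could look something like this in Python, with a hypothetical SaaS endpoint and placeholder Snowflake credentials and tables:

```python
# Toy ELT flow: extract rows from a SaaS API, load them raw into
# Snowflake, then transform them with SQL inside the warehouse.
import json

import requests
import snowflake.connector  # pip install snowflake-connector-python

# Extract: pull raw records from a (hypothetical) SaaS API.
rows = requests.get("https://api.example-saas.com/v1/orders").json()

conn = snowflake.connector.connect(
    user="LOADER", password="<secret>", account="acme-xy12345"
)
cur = conn.cursor()

# Load: land the untransformed payloads in a raw table.
cur.execute("CREATE TABLE IF NOT EXISTS raw_orders (payload VARIANT)")
for row in rows:
    cur.execute("INSERT INTO raw_orders SELECT PARSE_JSON(%s)", (json.dumps(row),))

# Transform: reshape raw payloads into an analyst-friendly table.
cur.execute("""
    CREATE OR REPLACE TABLE orders AS
    SELECT payload:id::STRING AS order_id,
           payload:amount::NUMBER(10,2) AS amount
    FROM raw_orders
""")
conn.close()
```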

The service is aimed mostly at technical people who know some SQL, like data analysts, data scientists and sales and marketing operations. They founded the company earlier this year with their own money and joined Y Combinator in June. Today, they have about a dozen customers and six employees, and they expect to add 10 to 12 more employees in the next year.

Fishman says they have mostly hired from their networks, but have begun looking outward as they make their next hires, with a goal of building a diverse company. In fact, they have made offers to several diverse candidates who ultimately didn’t take the job, but he believes that investing in the top of the recruiting funnel pays off. “I think if you spend a lot of energy in terms of top of funnel recruiting, you end up getting a good, diverse set at the bottom,” he said.

The company has been able to start from scratch in the midst of a pandemic and add employees and customers because the founders had a good network to pitch the product to, but they understand that moving forward they will have to reach beyond that network. They plan to use their experience as users to drive their message.

“I think talking about some of the whys and the rationale is our strategy for adding value to customers […], it’s about basically how would we set up a data stack if we were at this type of startup,” he said.

Nov 10, 2020

Explo snags $2.3M seed to help build customer-facing BI dashboards

Explo, a member of the Y Combinator Winter 2020 class that helps customers build customer-facing business intelligence dashboards, announced a $2.3 million seed round today. Investors included Amplo VC, Soma Capital and Y Combinator, along with several individual investors.

The company originally was looking at a way to simplify getting data ready for models or other applications, but as the founders spoke to customers, they saw a big need for a simple way to build dashboards backed by that data and quickly pivoted.

Explo CEO and co-founder Gary Lin says the company was able to leverage the core infrastructure, data engineering and production that it had built while at Y Combinator, but the new service they created is much different from the original idea.

“In terms of the UI and the output, we had to build out the ability for our end users to create dashboards, for them to embed the dashboards and for them to customize the styles on these dashboards, so that it looks and feels as though it was part of their own product,” Lin explained.
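
Customer-facing embeds of this kind usually work by having the host application sign a dashboard URL for a specific end user and render it in an iframe. The sketch below shows that general pattern; the domain, parameters and signing scheme are hypothetical, not Explo’s actual API.

```python
# General pattern for a signed, customer-scoped dashboard embed URL.
# Everything here (secret, domain, parameters) is hypothetical.
import hashlib
import hmac
import time

EMBED_SECRET = b"demo-secret"  # shared secret between host app and BI service

def signed_embed_url(dashboard_id: str, customer_id: str) -> str:
    expires = int(time.time()) + 3600  # link valid for one hour
    payload = f"{dashboard_id}:{customer_id}:{expires}".encode()
    sig = hmac.new(EMBED_SECRET, payload, hashlib.sha256).hexdigest()
    return (
        f"https://embed.example.com/dashboards/{dashboard_id}"
        f"?customer={customer_id}&expires={expires}&sig={sig}"
    )

# The host product drops this URL into an iframe, styled to match its UI.
print(signed_embed_url("revenue-overview", "acme-corp"))
```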

While the founders had been working on the original idea since last year, they didn’t actually make the pivot until September, after hearing that this was what customers really needed more than the tool they had been building at Y Combinator. In fact, Chen says their YC mentors and investors have been highly supportive of the switch.

The company is just getting started with the four original co-founders — Lin, COO Andrew Chen, CTO Rohan Varma and product designer Carly Stanisic — but the plan is to use this money to beef up the engineering team with three to five new hires.

With a diverse founding team, the company wants to continue looking at diversity as it builds the company. “One of the biggest reasons that we think diversity is important is that it allows us to have a bigger perspective and a grander perspective on things. And honestly, it’s in environments where I have personally […] been involved where we’ve actually been able to create the best ideas was by having a larger perspective. And so we definitely are going to be as inclusive as possible and are definitely thinking about that as we hire,” Lin said.

As the company has grown up during the pandemic, the founding core is used to working remotely and the goal moving forward is to be a distributed company. “We will be a remote distributed company so we’re hiring people no matter where they are, which actually makes it a lot easier from a hiring perspective because we’re able to reach a much more diverse and large pool of applicants,” Lin said.

They are in the process of thinking about how they can build a culture as they bring in distributed employees. “I think the way that we’ve started to see it is that working distributed is not a reduced experience, but just a different one and we are thinking about different things like how we organize new people when they on-board, and maybe we can meet up as a team and have a retreat where we are located in the same place [when travel allows],” he said.

For now, they will remain remote as they take their first half-dozen customers and begin to build the company with the new investment.

Oct 28, 2020

MachEye raises $4.6M for its business intelligence platform

We’ve seen our fair share of business intelligence (BI) platforms that aim to make data analysis accessible to everybody in a company. Most of them are still fairly complicated, no matter what their marketing copy says. MachEye, which is launching its AI-powered BI platform today, offers a new twist on this genre. In addition to its official launch, the company also announced a previously unreported $4.6 million seed funding round led by Canaan Partners with participation from WestWave Capital.

MachEye is not just what its founder and CEO Ramesh Panuganty calls a “low-prep, no-prep” BI platform; it also uses natural language processing to allow anybody to query data in plain language, and it can then automatically generate interactive data stories on the fly that put the answer into context. That’s quite a different approach from its more dashboard-centric competition.

“I have seen the business intelligence problems in the past,” Panuganty said. “And I saw that traditional BI, even though it has existed for 30 or 40 years, had this paradigm of ‘what you ask is what you get.’ So the business user asks for something, either in an email, on the phone or in person, and then he gets an answer to that question back. That essentially has these challenges of being dependent on the experts, and there is time that is lost to get the answers — and then there’s a lack of exploratory capabilities for the business user. And the bigger problem is that they don’t know what they don’t know.”

Panuganty’s background includes time at Sun Microsystems and Bell Labs, working on their operating systems before becoming an entrepreneur. He built three companies over the last 12 years or so. The first was a cloud management platform, Cloud360, which was acquired by Cognizant. The second was analytics company Drastin, which got acquired by Splunk in 2017, and the third was the AI-driven educational platform SelectQ, which Thinkster acquired this April. He also holds 15 patents related to machine learning, analytics and natural language processing.

Given that track record, it’s probably no surprise that VCs wanted to invest in his new startup, too. Panuganty tells me that when he met with Canaan Partners, he wasn’t really looking for an investment. He had already talked to the team while building SelectQ, but Canaan never got to make an investment because the company was acquired before it needed to raise more funding. But after an informal meeting that ended up lasting most of the day, he received an offer the next morning.

MachEye’s approach is definitely unique. “Generating audio-visuals on enterprise data, we are probably the only company that does it,” Panuganty said. But it’s important to note that it also offers all of the usual trappings of a BI service. If you really want dashboards, you can build those, and developers can use the company’s APIs to use their data elsewhere, too. The service can pull in data from most of the standard databases and data warehousing services, including AWS Redshift, Azure Synapse, Google BigQuery, Snowflake and Oracle. The company promises that it only takes 30 minutes from connecting a data source to being able to ask questions about that data.

Interestingly, MachEye’s pricing is per seat and doesn’t limit how much data you can query. There’s a free plan without the natural language search and query capabilities, and an $18/month/user plan that adds those capabilities and additional search features, but it takes the enterprise plan to get the audio narrations and other advanced features. The team is able to use this pricing model because it can quickly spin up the container infrastructure to answer a query and then immediately shut it down again — all within about two minutes.
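
That spin-up-and-tear-down pattern is easy to picture with a container SDK. The sketch below uses Docker’s Python client to illustrate the general idea of running one query in a short-lived worker; the worker image and its command line are hypothetical, and this is not MachEye’s actual implementation.

```python
# Illustration of ephemeral per-query compute: start a worker container,
# wait for the result, then remove the container immediately.
import docker  # pip install docker

client = docker.from_env()

def run_query_in_ephemeral_worker(sql: str) -> str:
    container = client.containers.run(
        image="query-worker:latest",  # hypothetical worker image
        command=["run-query", sql],   # hypothetical CLI
        detach=True,
    )
    try:
        container.wait(timeout=120)   # roughly the two-minute budget above
        return container.logs().decode()
    finally:
        container.remove(force=True)  # shut the worker down right away
```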

Sep 15, 2020

Data virtualization service Varada raises $12M

Varada, a Tel Aviv-based startup that focuses on making it easier for businesses to query data across services, today announced that it has raised a $12 million Series A round led by Israeli early-stage fund MizMaa Ventures, with participation by Gefen Capital.

“If you look at the storage aspect for big data, there’s always innovation, but we can put a lot of data in one place,” Varada CEO and co-founder Eran Vanounou told me. “But translating data into insight? It’s so hard. It’s costly. It’s slow. It’s complicated.”

That’s a lesson he learned during his time as CTO of LivePerson, which he described as a classic big data company. And just like at LivePerson, where the team had to reinvent the wheel again and again to solve its data problems, every company — and not just the large enterprises — now struggles with managing its data and getting insights out of it, Vanounou argued.

[Image: Varada architecture diagram]

The rest of the founding team, David Krakov, Roman Vainbrand and Tal Ben-Moshe, already had a lot of experience in dealing with these problems, too, with Ben-Moshe having served as the chief software architect of Dell EMC’s XtremIO flash array unit, for example. They built the system for indexing big data that’s at the core of Varada’s platform (with the open-source Presto SQL query engine being one of the other cornerstones).

Essentially, Varada embraces the idea of data lakes and enriches it with its indexing capabilities, and those indexing capabilities are where Varada’s smarts can be found. As Vanounou explained, the company uses a machine learning system to understand when users tend to run certain workloads, and then caches the data ahead of time, making the system far faster than its competitors.

“If you think about big organizations and think about the workloads and the queries, what happens during the morning time is different from evening time. What happened yesterday is not what happened today. What happened on a rainy day is not what happened on a shiny day. […] We listen to what’s going on and we optimize. We leverage the indexing technology. We index what is needed when it is needed.”
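
As a toy illustration of that idea, imagine mining a query log for which tables are hot at which hour, then warming indexes for the hour ahead. The sketch below is a simplification to make the pattern concrete; it is not Varada’s engine.

```python
# Toy predictive warming: count (hour, table) pairs from a query log,
# then pick the most-queried tables for the upcoming hour.
from collections import Counter, defaultdict
from datetime import datetime

history = defaultdict(Counter)  # hour -> Counter of table names

def record_query(ts: datetime, table: str) -> None:
    history[ts.hour][table] += 1

def tables_to_warm(now: datetime, top_n: int = 3):
    upcoming = (now.hour + 1) % 24
    return [table for table, _ in history[upcoming].most_common(top_n)]

record_query(datetime(2020, 9, 14, 8), "sales")
record_query(datetime(2020, 9, 14, 8), "sales")
record_query(datetime(2020, 9, 14, 8), "clickstream")
# At 7am, warm what is usually queried at 8am.
print(tables_to_warm(datetime(2020, 9, 15, 7)))  # ['sales', 'clickstream']
```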

That helps speed up queries, but it also means less data has to be replicated, which also brings down the cost. As MizMaa’s Aaron Applbaum noted, since Varada is not a SaaS solution, the buyers still get all of the discounts from their cloud providers, too.

In addition, the system can allocate resources intelligently so that different users can tap into different amounts of bandwidth. You can tell it to give customers more bandwidth than your financial analysts, for example.

“Data is growing like crazy: in volume, in scale, in complexity, in who requires it and what the business intelligence uses are, what the API uses are,” Applbaum said when I asked him why he decided to invest. “And compute is getting slightly cheaper, but not really, and storage is getting cheaper. So if you can make the trade-off to store more stuff, and access things more intelligently, more quickly, more agile — that was the basis of our thesis, as long as you can do it without compromising performance.”

Varada, with its team of experienced executives, architects and engineers, ticked a lot of the firm’s boxes in this regard, but Applbaum also noted that unlike some other Israeli startups, the team understood that it had to listen to customers and understand their needs, too.

“In Israel, you have a history — and it’s become less and less the case — but historically, there’s a joke that it’s ‘ready, fire, aim.’ You build a technology, you’ve got this beautiful thing and you’re like, ‘alright, we did it,’ but without listening to the needs of the customer,” he explained.

The Varada team is not afraid to compare itself to Snowflake, which at least at first glance seems to make similar promises. Vanounou praised the company for opening up the data warehousing market and proving that people are willing to pay for good analytics. But he argues that Varada’s approach is fundamentally different.

“We embrace the data lake. So if you are Mr. Customer, your data is your data. We’re not going to take it, move it, copy it. This is your single source of truth,” he said. And in addition, the data can stay in the company’s virtual private cloud. He also argues that Varada isn’t so much focused on the business users but the technologists inside a company.

Aug 21, 2020

As the pandemic creates supply chain chaos, Craft raises $10M to apply some intelligence

During the COVID-19 pandemic, supply chains have suddenly become hot. Who knew that would ever happen? The race to secure PPE, ventilators and minor things like food was, and still is, an enormous issue. But perhaps predictably, the world of “supply chain software” could use some updating. Most of the platforms are deployed “empty” and require the client to populate them with their own data, or “bring their own data.” The UIs can be outdated and still have to be juggled alongside manual and offline workflows. So startups working in this space are now attracting some timely attention.

Thus Craft, the enterprise intelligence company, today announced it has closed a $10 million Series A financing round to build what it characterizes as a “supply chain intelligence platform.” With the new funding, Craft will expand its offices in San Francisco, London and Minsk, and grow remote teams across engineering, sales, marketing and operations in North America and Europe.

It competes with some large incumbents, such as Dun & Bradstreet, Bureau van Dijk and Thomson Reuters. These are traditional data providers focused primarily on financial data about public companies, rather than real-time information such as operating metrics, human capital and risk metrics.

The idea is to allow companies to monitor and optimize their supply chain and enterprise systems. The financing was led by High Alpha Capital, alongside Greycroft. Craft also has some high-flying angel investors, including Sam Palmisano, chairman of the Center for Global Enterprise and former CEO and chairman of IBM; Jim Moffatt, former CEO of Deloitte Consulting; Frederic Kerrest, executive vice chairman, COO and co-founder of Okta; and Uncork Capital, which previously led Craft’s seed financing. High Alpha partner Kristian Andersen is joining Craft’s board of directors.

The problem Craft is attacking is a lack of visibility into complex global supply chains. For obvious reasons, COVID-19 disrupted global supply chains, which revealed a lot of risks and structural weaknesses across industries, along with a lack of intelligence about how it all holds together. Craft’s solution is a proprietary data platform, API and portal that integrates into existing enterprise workflows.

While many business intelligence products require clients to bring their own data, Craft’s data platform comes pre-deployed with data from thousands of financial and alternative sources, covering 300+ data points that are refreshed using both machine learning and human validation. Its open-to-the-web company profiles appear in 50 million search results, for instance.
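
To give a flavor of how such a platform is consumed, a client might pull a company profile over the API and read off the refreshed metrics. The endpoint, fields and auth scheme below are hypothetical stand-ins, not Craft’s documented API.

```python
# Hypothetical sketch of consuming an enterprise-data API of this kind.
import requests

resp = requests.get(
    "https://api.example.com/v1/companies/acme-corp",  # placeholder endpoint
    headers={"Authorization": "Bearer <token>"},
    timeout=10,
)
resp.raise_for_status()
profile = resp.json()

# e.g. operating and human-capital metrics refreshed by the provider
print(profile.get("name"), profile.get("employee_count"))
```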

Ilya Levtov, co-founder and CEO of Craft, said in a statement: “Today, we are focused on providing powerful tracking and visibility to enterprise supply chains, while our ultimate vision is to build the intelligence layer of the enterprise technology stack.”

Kristian Andersen, partner with High Alpha, commented: “We have a deep conviction that supply chain management remains an underinvested and under-innovated category in enterprise software.”

Craft claims its revenue grew nearly threefold in the first half of 2020, with Fortune 100 companies, government and military agencies, and SMEs among its clients.

May 19, 2020

Microsoft launches Azure Synapse Link to help enterprises get faster insights from their data

At its Build developer conference, Microsoft today announced Azure Synapse Link, a new enterprise service that allows businesses to analyze their data faster and more efficiently, using an approach that’s generally called “hybrid transaction/analytical processing” (HTAP). That’s a mouthful; it essentially enables enterprises to use the same database system for analytical and transactional workloads on a single system. Traditionally, enterprises had to choose between building a single system for both, which was often highly over-provisioned, or maintaining separate systems for transactional and analytics workloads.

Last year, at its Ignite conference, Microsoft announced Azure Synapse Analytics, an analytics service that combines analytics and data warehousing to create what the company calls “the next evolution of Azure SQL Data Warehouse.” Synapse Analytics brings together data from Microsoft’s services and those from its partners and makes it easier to analyze.

“One of the key things, as we work with our customers on their digital transformation journey, there is an aspect of being data-driven, of being insights-driven as a culture, and a key part of that really is that once you decide there is some amount of information or insights that you need, how quickly are you able to get to that? For us, time to insight and a secondary element, which is the cost it takes, the effort it takes to build these pipelines and maintain them with an end-to-end analytics solution, was a key metric we have been observing for multiple years from our largest enterprise customers,” said Rohan Kumar, Microsoft’s corporate VP for Azure Data.

Synapse Link takes the work Microsoft did on Synapse Analytics a step further by removing the barriers between Azure’s operational databases and Synapse Analytics, so enterprises can immediately get value from the data in those databases without going through a data warehouse first.

“What we are announcing with Synapse Link is the next major step in the same vision that we had around reducing the time to insight,” explained Kumar. “And in this particular case, a long-standing barrier that exists today between operational databases and analytics systems is these complex ETL (extract, transform, load) pipelines that need to be set up just so you can do basic operational reporting or where, in a very transactionally consistent way, you need to move data from your operational system to the analytics system, because you don’t want to impact the performance of the operational system in any way because that’s typically dealing with, depending on the system, millions of transactions per second.”

ETL pipelines, Kumar argued, are typically expensive and hard to build and maintain, yet enterprises are now building new apps — and maybe even line-of-business mobile apps — where any action a consumer takes that is registered in the operational database should be immediately available for predictive analytics, for example.

From the user perspective, enabling this takes only a single click to link the two, and it removes the need to manage additional data pipelines or database resources. That, Kumar said, was always the main goal for Synapse Link. “With a single click, you should be able to enable real-time analytics on your operational data in ways that don’t have any impact on your operational systems, so you’re not using the compute part of your operational system to do the query. You actually have to transform the data into a columnar format, which is more adaptable for analytics, and that’s really what we achieved with Synapse Link.”

Because traditional HTAP systems on-premises typically share their compute resources with the operational database, those systems never quite took off, Kumar argued. In the cloud, with Synapse Link, though, that impact doesn’t exist because you’re dealing with two separate systems. Now, once a transaction gets committed to the operational database, the Synapse Link system transforms the data into a columnar format that is more optimized for the analytics system — and it does so in real time.
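
The row-to-columnar step at the heart of this is easy to see in miniature. Here is a small sketch using pyarrow, with made-up transaction records, that pivots row-oriented data into a columnar table of the kind analytics engines prefer:

```python
# Miniature row-to-columnar transformation with pyarrow.
import pyarrow as pa  # pip install pyarrow

# Row-oriented records, as an operational database would produce them.
transactions = [
    {"order_id": 1, "amount": 9.99},
    {"order_id": 2, "amount": 24.50},
]

# Pivot rows into columns, then build a columnar Arrow table.
columns = {key: [row[key] for row in transactions] for key in transactions[0]}
table = pa.table(columns)
print(table.schema)  # order_id: int64, amount: double
```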

For now, Synapse Link is only available in conjunction with Microsoft’s Cosmos DB database. As Kumar told me, that’s because that’s where the company saw the highest demand for this kind of service, but you can expect the company to add support for Azure SQL, Azure Database for PostgreSQL and Azure Database for MySQL in the future.

Apr 22, 2020

Fishtown Analytics raises $12.9M Series A for its open-source analytics engineering tool

Philadelphia-based Fishtown Analytics, the company behind the popular open-source data engineering tool dbt, today announced that it has raised a $12.9 million Series A round led by Andreessen Horowitz, with the firm’s general partner Martin Casado joining the company’s board.

“I wrote this blog post in early 2016, essentially saying that analysts needed to work in a fundamentally different way,” Fishtown founder and CEO Tristan Handy told me, when I asked him about how the product came to be. “They needed to work in a way that much more closely mirrored the way the software engineers work and software engineers have been figuring this shit out for years and data analysts are still like sending each other Microsoft Excel docs over email.”

The dbt open-source project forms the basis of this. It allows anyone who can write SQL queries to transform data and then load it into their preferred analytics tools. As such, it sits between the data warehouses and the tools that load data into them on one end, and specialized analytics tools on the other.
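
A dbt model is essentially a version-controlled SELECT statement that dbt materializes as a table or view in the warehouse. As a rough, hand-rolled illustration of that transform step (not dbt itself), here is the pattern with an in-memory SQLite database standing in for a warehouse:

```python
# Hand-rolled sketch of the SQL transformation step dbt manages:
# a SELECT over already-loaded raw data, materialized as a new table.
import sqlite3  # stand-in for a real warehouse connection

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 9.99, "paid"), (2, 24.50, "refunded"), (3, 5.00, "paid")],
)

# The "model": in dbt, this SELECT would live in its own versioned file.
conn.execute("""
    CREATE TABLE paid_orders AS
    SELECT id, amount FROM raw_orders WHERE status = 'paid'
""")
print(conn.execute("SELECT COUNT(*) FROM paid_orders").fetchone())  # (2,)
```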

As Casado noted when I talked to him about the investment, data warehouses have now made it affordable for businesses to store all of their data before it is transformed. So what was traditionally “extract, transform, load” (ETL) has now become “extract, load, transform” (ELT). Andreessen Horowitz is already invested in Fivetran, which helps businesses move their data into their warehouses, so it makes sense for the firm to also tackle the other side of this business.

“Dbt is, as far as we can tell, the leading community for transformation and it’s a company we’ve been tracking for at least a year,” Casado said. He also argued that data analysts — unlike data scientists — are not really catered to as a group.

Before this round, Fishtown hadn’t raised a lot of money, aside from a small SAFE round from Amplify, even though it has been around for a few years now.

But Handy argued that the company needed this time to prove that it was on to something and build a community. That community now consists of more than 1,700 companies that use the dbt project in some form and over 5,000 people in the dbt Slack community. Fishtown also now has over 250 dbt Cloud customers and the company signed up a number of big enterprise clients earlier this year. With that, the company needed to raise money to expand and also better service its current list of customers.

“We live in Philadelphia. The cost of living is low here and none of us really care to make a quadro-billion dollars, but we do want to answer the question of how do we best serve the community,” Handy said. “And for the first time, in the early part of the year, we were like, holy shit, we can’t keep up with all of the stuff that people need from us.”

The company plans to expand the team from 25 to 50 employees in 2020, and with those hires, the team plans to improve and expand the product, especially its IDE for data analysts, which Handy admitted could use a bit more polish.

Feb 24, 2020

Databricks makes bringing data into its ‘lakehouse’ easier

Databricks today announced the launch of its new Data Ingestion Network of partners and the launch of its Databricks Ingest service. The idea here is to make it easier for businesses to combine the best of data warehouses and data lakes into a single platform — a concept Databricks likes to call “lakehouse.”

At the core of the company’s lakehouse is Delta Lake, Databricks’ Linux Foundation-managed open-source project that brings a new storage layer to data lakes that helps users manage the lifecycle of their data and ensures data quality through schema enforcement, log records and more. Databricks users can now work with the first five partners in the Ingestion Network — Fivetran, Qlik, Infoworks, StreamSets, Syncsort — to automatically load their data into Delta Lake. To ingest data from these partners, Databricks customers don’t have to set up any triggers or schedules — instead, data automatically flows into Delta Lake.
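
For a sense of what landing data in Delta Lake looks like, here is a minimal PySpark sketch, assuming a Spark session configured with the open-source Delta Lake package and a placeholder path:

```python
# Minimal sketch: append rows to a Delta Lake table from PySpark.
# Assumes Spark is configured with the open-source delta package.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ingest-demo").getOrCreate()
df = spark.createDataFrame([(1, "2020-02-24")], ["order_id", "ingested_at"])

# Delta enforces the table's schema on write; a mismatched frame would fail.
df.write.format("delta").mode("append").save("/data/delta/orders")
spark.read.format("delta").load("/data/delta/orders").show()
```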

“Until now, companies have been forced to split up their data into traditional structured data and big data, and use them separately for BI and ML use cases. This results in siloed data in data lakes and data warehouses, slow processing and partial results that are too delayed or too incomplete to be effectively utilized,” says Ali Ghodsi, co-founder and CEO of Databricks. “This is one of the many drivers behind the shift to a Lakehouse paradigm, which aspires to combine the reliability of data warehouses with the scale of data lakes to support every kind of use case. In order for this architecture to work well, it needs to be easy for every type of data to be pulled in. Databricks Ingest is an important step in making that possible.”

Databricks VP of Product Marketing Bharath Gowda also tells me that this will make it easier for businesses to perform analytics on their most recent data and hence be more responsive when new information comes in. He also noted that users will be able to better leverage their structured and unstructured data for building better machine learning models, as well as to perform more traditional analytics on all of their data instead of just a small slice that’s available in their data warehouse.

Feb 19, 2020

Google Cloud opens its Seoul region

Google Cloud today announced that its new Seoul region, its first in Korea, is now open for business. The region, which it first talked about last April, will feature three availability zones and support for virtually all of Google Cloud’s standard services, ranging from Compute Engine to BigQuery, Bigtable and Cloud Spanner.

With this, Google Cloud now has a presence in 16 countries and offers 21 regions with a total of 64 zones. The Seoul region (with the memorable name of asia-northeast3) will complement Google’s other regions in the area, including two in Japan, as well as regions in Hong Kong and Taiwan, but the obvious focus here is on serving Korean companies with low-latency access to its cloud services.

“As South Korea’s largest gaming company, we’re partnering with Google Cloud for game development, infrastructure management, and to infuse our operations with business intelligence,” said Chang-Whan Sul, the CTO of Netmarble. “Google Cloud’s region in Seoul reinforces its commitment to the region and we welcome the opportunities this initiative offers our business.”

Over the course of this year, Google Cloud also plans to open more zones and regions in Salt Lake City, Las Vegas and Jakarta, Indonesia.
