Feb
18
2021
--

Census raises $16M Series A to help companies put their data warehouses to work

Census, a startup that helps businesses sync their customer data from their data warehouses to their various business tools like Salesforce and Marketo, today announced that it has raised a $16 million Series A round led by Sequoia Capital. Other participants in this round include Andreessen Horowitz, which led the company’s $4.3 million seed round last year, as well as several notable angels, including Figma CEO Dylan Field, GitHub CTO Jason Warner, Notion COO Akshay Kothari and Rippling CEO Parker Conrad.

The company is part of a new crop of startups that are building on top of data warehouses. The general idea behind Census is to help businesses operationalize the data in their data warehouses, which was traditionally only used for analytics and reporting use cases. But as businesses realized that all the data they needed was already available in their data warehouses and that they could use that as a single source of truth without having to build additional integrations, an ecosystem of companies that operationalize this data started to form.

The company argues that the modern data stack, with data warehouses like Amazon Redshift, Google BigQuery and Snowflake at its core, offers all of the tools a business needs to extract and transform data (like Fivetran, dbt) and then visualize it (think Looker).

Tools like Census then essentially function as a new layer that sits between the data warehouse and the business tools that can help companies extract value from this data. With that, users can easily sync their product data into a marketing tool like Marketo or a CRM service like Salesforce, for example.

Image Credits: Census

“Three years ago, we were the first to ask, ‘Why are we relying on a clumsy tangle of wires connecting every app when everything we need is already in the warehouse? What if you could leverage your data team to drive operations?’ When the data warehouse is connected to the rest of the business, the possibilities are limitless,” Census explains in today’s announcement. “When we launched, our focus was enabling product-led companies like Figma, Canva, and Notion to drive better marketing, sales, and customer success. Along the way, our customers have pulled Census into more and more scenarios, like auto-prioritizing support tickets in Zendesk, automating invoices in Netsuite, or even integrating with HR systems.”

Census already integrates with dozens of different services and data tools and its customers include the likes of Clearbit, Figma, Fivetran, LogDNA, Loom and Notion.

Looking ahead, Census plans to use the new funding to launch new features like deeper data validation and a visual query experience. In addition, it also plans to launch code-based orchestration to make Census workflows versionable and make it easier to integrate them into an enterprise orchestration system.

Dec
16
2020
--

Hightouch raises $2.1M to help businesses get more value from their data warehouses

Hightouch, a SaaS service that helps businesses sync their customer data across sales and marketing tools, is coming out of stealth and announcing a $2.1 million seed round. The round was led by Afore Capital and Slack Fund, with a number of angel investors also participating.

At its core, Hightouch, which participated in Y Combinator’s Summer 2019 batch, aims to solve the customer data integration problems that many businesses today face.

During their time at Segment, Hightouch co-founders Tejas Manohar and Josh Curl witnessed the rise of data warehouses like Snowflake, Google’s BigQuery and Amazon Redshift — that’s where a lot of Segment data ends up, after all. As businesses adopt data warehouses, they now have a central repository for all of their customer data. Typically, though, this information is then only used for analytics purposes. Together with former Bessemer Ventures investor Kashish Gupta, the team decided to see how they could innovate on top of this trend and help businesses activate all of this information.


Hightouch co-founders Kashish Gupta, Josh Curl and Tejas Manohar.

“What we found is that, with all the customer data inside of the data warehouse, it doesn’t make sense for it to just be used for analytics purposes — it also makes sense for these operational purposes like serving different business teams with the data they need to run things like marketing campaigns — or in product personalization,” Manohar told me. “That’s the angle that we’ve taken with Hightouch. It stems from us seeing the explosive growth of the data warehouse space, both in terms of technology advancements as well as like accessibility and adoption. […] Our goal is to be seen as the company that makes the warehouse not just for analytics but for these operational use cases.”

It helps that all of the big data warehousing platforms have standardized on SQL as their query language — and because the warehousing services have already solved the problem of ingesting all of this data, Hightouch doesn’t have to worry about this part of the tech stack either. And as Curl added, Snowflake and its competitors never quite went beyond serving the analytics use case either.

Image Credits: Hightouch

As for the product itself, Hightouch lets users create SQL queries and then send that data to different destinations — maybe a CRM system like Salesforce or a marketing platform like Marketo — after transforming it to the format that the destination platform expects.
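To make that flow concrete, here is a minimal Python sketch of the query, transform and send pattern described above. It is not Hightouch's code or API: an in-memory SQLite database stands in for the warehouse, and the table, the field mapping and the `send_to_destination` function are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical mapping from warehouse column names to the field names
# a destination (say, a CRM) expects. Not taken from any real product.
FIELD_MAP = {"email": "Email", "plan": "Plan__c", "seats": "Seats__c"}


def build_demo_warehouse():
    """Create an in-memory SQLite database standing in for a warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (email TEXT, plan TEXT, seats INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [("a@example.com", "pro", 12), ("b@example.com", "free", 1)],
    )
    return conn


def run_query(conn, sql):
    """Run a SQL query and return each row as a dict keyed by column name."""
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def transform(row, field_map):
    """Rename warehouse columns to the destination's field names."""
    return {field_map[name]: value for name, value in row.items()}


def send_to_destination(records):
    """Placeholder for an API call to the destination; here it only collects."""
    return list(records)


conn = build_demo_warehouse()
rows = run_query(conn, "SELECT email, plan, seats FROM accounts WHERE plan = 'pro'")
synced = send_to_destination(transform(row, FIELD_MAP) for row in rows)
print(synced)
```

In a real deployment the placeholder would presumably be replaced by calls to the destination's own API, with batching and error handling that this sketch leaves out.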

Expert users can write their own SQL queries for this, but the team also built a graphical interface to help non-developers create their own queries. The core audience, though, is data teams — and they, too, will likely see value in the graphical user interface because it will speed up their workflows as well. “We want to empower the business user to access whatever models and aggregation the data user has done in the warehouse,” Gupta explained.

The company is agnostic to how and where its users want to operationalize their data, but the most common use cases right now focus on B2C companies, where marketing teams often use the data, as well as sales teams at B2B companies.

Image Credits: Hightouch

“It feels like there’s an emerging category here of tooling that’s being built on top of a data warehouse natively, rather than being a standard SaaS tool where it is its own data store and then you manage a secondary data store,” Curl said. “We have a class of things here that connect to a data warehouse and make use of that data for operational purposes. There’s no industry term for that yet, but we really believe that that’s the future of where data engineering is going. It’s about building off this centralized platform like Snowflake, BigQuery and things like that.”

“Warehouse-native,” Manohar suggested as a potential name here. We’ll see if it sticks.

Hightouch originally raised its round after its participation in the Y Combinator demo day but decided not to disclose it until it felt like it had found the right product/market fit. Current customers include the likes of Retool, Proof, Stream and Abacus, in addition to a number of significantly larger companies the team isn’t able to name publicly.

Dec
03
2020
--

Microsoft launches Azure Purview, its new data governance service

As businesses gather, store and analyze an ever-increasing amount of data, tools for helping them discover, catalog, track and manage how that data is shared are also becoming increasingly important. With Azure Purview, Microsoft is launching a new data governance service into public preview today that brings together all of these capabilities in a new data catalog with discovery and data governance features.

As Rohan Kumar, Microsoft’s corporate VP for Azure Data, told me, this has become a major pain point for enterprises. While they may be very excited about getting started with data-heavy technologies like predictive analytics, those companies’ data and privacy-focused executives are very concerned to make sure that the way the data is used is compliant or that the company has received the right permissions to use its customers’ data, for example.

In addition, companies also want to make sure that they can trust their data and know who has access to it and who made changes to it.

“[Purview] is a unified data governance platform which automates the discovery of data, cataloging of data, mapping of data, lineage tracking — with the intention of giving our customers a very good understanding of the breadth of the data estate that exists to begin with, and also to ensure that all these regulations that are there for compliance, like GDPR, CCPA, etc, are managed across an entire data estate in ways which enable you to make sure that they don’t violate any regulation,” Kumar explained.

At the core of Purview is its catalog that can pull in data from the usual suspects, like Azure’s various data and storage services, but also third-party data stores, including Amazon’s S3 storage service and on-premises SQL Server. Over time, the company will add support for more data sources.

Kumar described this process as a “multi-semester investment,” so the capabilities the company is rolling out today are only a small part of what’s on the overall road map already. With this first release today, the focus is on mapping a company’s data estate.

Image Credits: Microsoft

“Next [on the road map] is more of the governance policies,” Kumar said. “Imagine if you want to set things like ‘if there’s any PII data across any of my data stores, only this group of users has access to it.’ Today, setting up something like that is extremely complex and most likely you’ll get it wrong. That’ll be as simple as setting a policy inside of Purview.”
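As a rough illustration of the kind of rule Kumar describes, the Python sketch below checks whether a user may read a column tagged as PII. It assumes nothing about how Purview will implement policies: the `Column` and `Policy` classes, the tag names and the group names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """A column in some data store, with classification tags such as 'PII'."""
    name: str
    tags: set = field(default_factory=set)


@dataclass
class Policy:
    """Hypothetical rule: only the listed groups may read data with this tag."""
    tag: str
    allowed_groups: set


def can_read(user_groups, column, policies):
    """Deny access if any policy covers the column and the user is in no allowed group."""
    for policy in policies:
        if policy.tag in column.tags and not (user_groups & policy.allowed_groups):
            return False
    return True


policies = [Policy(tag="PII", allowed_groups={"privacy-team"})]
email = Column("email", tags={"PII"})
country = Column("country")

analyst_sees_email = can_read({"analysts"}, email, policies)
privacy_sees_email = can_read({"privacy-team"}, email, policies)
analyst_sees_country = can_read({"analysts"}, country, policies)
```

The point of a central policy like this is that the rule is written once against a tag, rather than once per data store.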

In addition to launching Purview, the Azure team also today launched into general availability Azure Synapse, Microsoft’s next-generation data warehousing and analytics service. The idea behind Synapse is to give enterprises — and their engineers and data scientists — a single platform that brings together data integration, warehousing and big data analytics.

“With Synapse, we have this one product that gives a completely no-code experience for data engineers, as an example, to build out these [data] pipelines and collaborate very seamlessly with the data scientists who are building out machine learning models, or the business analysts who build out reports for things like Power BI.”

Among Microsoft’s marquee customers for the service, which Kumar described as one of the fastest-growing Azure services right now, are FedEx, Walgreens, Myntra and P&G.

“The insights we gain from continuous analysis help us optimize our network,” said Sriram Krishnasamy, senior vice president, strategic programs at FedEx Services. “So as FedEx moves critical high-value shipments across the globe, we can often predict whether that delivery will be disrupted by weather or traffic and remediate that disruption by routing the delivery from another location.”

Image Credits: Microsoft

Nov
12
2020
--

Databricks launches SQL Analytics

AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And with that, enterprises can now easily connect their business intelligence tools like Tableau and Microsoft’s Power BI to these data repositories as well.

SQL Analytics will be available in public preview on November 18.

In many ways, SQL Analytics is the product Databricks has long been looking to build and that brings its concept of a “lake house” to life. It combines the performance of a data warehouse, where you store data after it has already been transformed and cleaned, with a data lake, where you store all of your data in its raw form. The data in the data lake, a concept that Databricks’ co-founder and CEO Ali Ghodsi has long championed, is typically only transformed when it gets used. That makes data lakes cheaper, but also a bit harder to handle for users.

Image Credits: Databricks

“We’ve been saying Unified Data Analytics, which means unify the data with the analytics. So data processing and analytics, those two should be merged. But no one picked that up,” Ghodsi told me. But “lake house” caught on as a term.

“Databricks has always offered data science, machine learning. We’ve talked about that for years. And with Spark, we provide the data processing capability. You can do [extract, transform, load]. That has always been possible. SQL Analytics enables you to now do the data warehousing workloads directly, and concretely, the business intelligence and reporting workloads, directly on the data lake.”

The general idea here is that with just one copy of the data, you can enable both traditional data analyst use cases (think BI) and the data science workloads (think AI) Databricks was already known for. Ideally, that makes both use cases cheaper and simpler.

The service sits on top of an optimized version of Databricks’ open-source Delta Lake storage layer, which lets it complete queries quickly. Delta Lake also provides auto-scaling endpoints to keep the query latency consistent, even under high loads.

While data analysts can query these data sets directly, using standard SQL, the company also built a set of connectors to BI tools. Its partners include BI vendors Tableau, Qlik, Looker and ThoughtSpot, as well as ingest partners like Fivetran, Fishtown Analytics, Talend and Matillion.

Image Credits: Databricks

“Now more than ever, organizations need a data strategy that enables speed and agility to be adaptable,” said Francois Ajenstat, chief product officer at Tableau. “As organizations are rapidly moving their data to the cloud, we’re seeing growing interest in doing analytics on the data lake. The introduction of SQL Analytics delivers an entirely new experience for customers to tap into insights from massive volumes of data with the performance, reliability and scale they need.”

In a demo, Ghodsi showed me what the new SQL Analytics workspace looks like. It’s essentially a stripped-down version of the standard code-heavy experience with which Databricks users are familiar. Unsurprisingly, SQL Analytics provides a more graphical experience that focuses more on visualizations and not Python code.

While there are already some data analysts on the Databricks platform, this obviously opens up a large new market for the company — something that would surely bolster its plans for an IPO next year.

Sep
15
2020
--

Data virtualization service Varada raises $12M

Varada, a Tel Aviv-based startup that focuses on making it easier for businesses to query data across services, today announced that it has raised a $12 million Series A round led by Israeli early-stage fund MizMaa Ventures, with participation by Gefen Capital.

“If you look at the storage aspect for big data, there’s always innovation, but we can put a lot of data in one place,” Varada CEO and co-founder Eran Vanounou told me. “But translating data into insight? It’s so hard. It’s costly. It’s slow. It’s complicated.”

That’s a lesson he learned during his time as CTO of LivePerson, which he described as a classic big data company. And just like at LivePerson, where the team had to reinvent the wheel to solve its data problems, again and again, every company — and not just the large enterprises — now struggles with managing their data and getting insights out of it, Vanounou argued.


Image Credits: Varada

The rest of the founding team, David Krakov, Roman Vainbrand and Tal Ben-Moshe, already had a lot of experience in dealing with these problems, too, with Ben-Moshe having served as the chief software architect of Dell EMC’s XtremIO flash array unit, for example. They built the system for indexing big data that’s at the core of Varada’s platform (with the open-source Presto SQL query engine being one of the other cornerstones).

Image Credits: Varada

Essentially, Varada embraces the idea of data lakes and enriches that with its indexing capabilities. And those indexing capabilities are where Varada’s smarts can be found. As Vanounou explained, the company is using a machine learning system to understand when users tend to run certain workloads, and then caches the data ahead of time, making the system far faster than its competitors.

“If you think about big organizations and think about the workloads and the queries, what happens during the morning time is different from evening time. What happened yesterday is not what happened today. What happened on a rainy day is not what happened on a shiny day. […] We listen to what’s going on and we optimize. We leverage the indexing technology. We index what is needed when it is needed.”
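The idea of learning when workloads run and caching ahead of time can be sketched in a few lines of Python. This is a toy under stated assumptions, not Varada's system: simple per-hour frequency counts stand in for the machine learning model, and the class, method and query names are hypothetical.

```python
from collections import Counter, defaultdict


class WorkloadAwareCache:
    """Toy cache that pre-warms the queries seen most often at a given hour."""

    def __init__(self, top_n=2):
        self.top_n = top_n
        self.history = defaultdict(Counter)  # hour -> query -> count
        self.cache = {}

    def record(self, hour, query):
        """Log that a query ran during the given hour of the day (0-23)."""
        self.history[hour][query] += 1

    def predict(self, hour):
        """Return the queries most frequently seen at this hour."""
        return [q for q, _ in self.history[hour].most_common(self.top_n)]

    def prewarm(self, hour, execute):
        """Run the predicted queries ahead of time and keep their results."""
        self.cache = {q: execute(q) for q in self.predict(hour)}

    def get(self, query, execute):
        """Serve from the cache when possible, otherwise run the query."""
        if query in self.cache:
            return self.cache[query], True
        return execute(query), False


def fake_execute(query):
    """Hypothetical stand-in for a slow query against the data lake."""
    return f"result of {query}"


cache = WorkloadAwareCache(top_n=1)
for _ in range(3):
    cache.record(9, "daily_revenue")
cache.record(9, "churn_report")
cache.record(18, "churn_report")

cache.prewarm(9, fake_execute)
morning_hit = cache.get("daily_revenue", fake_execute)
morning_miss = cache.get("churn_report", fake_execute)
```

Here the morning's most common query is served from the cache, while the less frequent one still has to run against the lake.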

That helps speed up queries, but it also means less data has to be replicated, which also brings down the cost. As MizMaa’s Aaron Applbaum noted, since Varada is not a SaaS solution, the buyers still get all of the discounts from their cloud providers, too.

In addition, the system can allocate resources intelligently so that different users can tap into different amounts of bandwidth. You can tell it to give customers more bandwidth than your financial analysts, for example.

“Data is growing like crazy: in volume, in scale, in complexity, in who requires it and what the business intelligence uses are, what the API uses are,” Applbaum said when I asked him why he decided to invest. “And compute is getting slightly cheaper, but not really, and storage is getting cheaper. So if you can make the trade-off to store more stuff, and access things more intelligently, more quickly, more agile — that was the basis of our thesis, as long as you can do it without compromising performance.”

Varada, with its team of experienced executives, architects and engineers, ticked a lot of the company’s boxes in this regard, but he also noted that unlike some other Israeli startups, the team understood that it had to listen to customers and understand their needs, too.

“In Israel, you have a history — and it’s become less and less the case — but historically, there’s a joke that it’s ‘ready, fire, aim.’ You build a technology, you’ve got this beautiful thing and you’re like, ‘alright, we did it,’ but without listening to the needs of the customer,” he explained.

The Varada team is not afraid to compare itself to Snowflake, which at least at first glance seems to make similar promises. Vanounou praised the company for opening up the data warehousing market and proving that people are willing to pay for good analytics. But he argues that Varada’s approach is fundamentally different.

“We embrace the data lake. So if you are Mr. Customer, your data is your data. We’re not going to take it, move it, copy it. This is your single source of truth,” he said. And in addition, the data can stay in the company’s virtual private cloud. He also argues that Varada isn’t so much focused on the business users but the technologists inside a company.

 

Feb
24
2020
--

Databricks makes bringing data into its ‘lakehouse’ easier

Databricks today announced the launch of its new Data Ingestion Network of partners and the launch of its Databricks Ingest service. The idea here is to make it easier for businesses to combine the best of data warehouses and data lakes into a single platform — a concept Databricks likes to call “lakehouse.”

At the core of the company’s lakehouse is Delta Lake, Databricks’ Linux Foundation-managed open-source project that brings a new storage layer to data lakes that helps users manage the lifecycle of their data and ensures data quality through schema enforcement, log records and more. Databricks users can now work with the first five partners in the Ingestion Network — Fivetran, Qlik, Infoworks, StreamSets, Syncsort — to automatically load their data into Delta Lake. To ingest data from these partners, Databricks customers don’t have to set up any triggers or schedules — instead, data automatically flows into Delta Lake.
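For readers unfamiliar with schema enforcement and log records, the following Python sketch shows the general idea in miniature. It does not use Delta Lake's API and is not how Delta Lake is implemented: the schema, the `append` function and the in-memory table and log are hypothetical stand-ins.

```python
# Hypothetical table schema: column name -> required Python type.
SCHEMA = {"user_id": int, "event": str, "amount": float}


class SchemaError(ValueError):
    """Raised when a record does not match the table schema."""


def validate(record, schema):
    """Reject records with missing, extra, or wrongly typed columns."""
    if set(record) != set(schema):
        raise SchemaError(f"columns {sorted(record)} do not match {sorted(schema)}")
    for column, expected in schema.items():
        if not isinstance(record[column], expected):
            raise SchemaError(f"{column!r} should be {expected.__name__}")


def append(table, log, record, schema=SCHEMA):
    """Append a record only if it passes validation, and log the attempt."""
    try:
        validate(record, schema)
    except SchemaError as err:
        log.append(("rejected", str(err)))
        return False
    table.append(record)
    log.append(("committed", record["user_id"]))
    return True


table, log = [], []
ok = append(table, log, {"user_id": 1, "event": "purchase", "amount": 9.99})
bad = append(table, log, {"user_id": "2", "event": "purchase", "amount": 5.0})
```

The second record is rejected because its `user_id` is a string, so bad data never reaches the table and the log keeps a trace of both attempts.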

“Until now, companies have been forced to split up their data into traditional structured data and big data, and use them separately for BI and ML use cases. This results in siloed data in data lakes and data warehouses, slow processing and partial results that are too delayed or too incomplete to be effectively utilized,” says Ali Ghodsi, co-founder and CEO of Databricks. “This is one of the many drivers behind the shift to a Lakehouse paradigm, which aspires to combine the reliability of data warehouses with the scale of data lakes to support every kind of use case. In order for this architecture to work well, it needs to be easy for every type of data to be pulled in. Databricks Ingest is an important step in making that possible.”

Databricks VP of Product Marketing Bharath Gowda also tells me that this will make it easier for businesses to perform analytics on their most recent data and hence be more responsive when new information comes in. He also noted that users will be able to better leverage their structured and unstructured data for building better machine learning models, as well as to perform more traditional analytics on all of their data instead of just a small slice that’s available in their data warehouse.

Dec
17
2019
--

Satori Cyber raises $5.25M to help businesses protect their data flows

The amount of data that most companies now store — and the places they store it — continues to increase rapidly. With that, the risk of the wrong people managing to get access to this data also increases, so it’s no surprise that we’re now seeing a number of startups that focus on protecting this data and how it flows between clouds and on-premises servers. Satori Cyber, which focuses on data protection and governance, today announced that it has raised a $5.25 million seed round led by YL Ventures.

“We believe in the transformative power of data to drive innovation and competitive advantage for businesses,” the company says. “We are also aware of the security, privacy and operational challenges data-driven organizations face in their journey to enable broad and optimized data access for their teams, partners and customers. This is especially true for companies leveraging cloud data technologies.”

Satori is officially coming out of stealth mode today and launching its first product, the Satori Cyber Secure Data Access Cloud. This service provides enterprises with the tools to provide access controls for their data, but maybe just as importantly, it also offers these companies and their security teams visibility into their data flows across cloud and hybrid environments. The company argues that data is “a moving target” because it’s often hard to know how exactly it moves between services and who actually has access to it. With most companies now splitting their data between lots of different data stores, that problem only becomes more prevalent over time and continuous visibility becomes harder to come by.

“Until now, security teams have relied on a combination of highly segregated and restrictive data access and one-off technology-specific access controls within each data store, which has only slowed enterprises down,” said Satori Cyber CEO and co-founder Eldad Chai. “The Satori Cyber platform streamlines this process, accelerates data access and provides a holistic view across all organizational data flows, data stores and access, as well as granular access controls, to accelerate an organization’s data strategy without those constraints.”

Both co-founders (Chai and CTO Yoav Cohen) previously spent nine years building security solutions at Imperva and Incapsula (which Imperva acquired in 2014). Based on this experience, they understood that onboarding had to be as easy as possible and that operations would have to be transparent to the users. “We built Satori’s Secure Data Access Cloud with that in mind, and have designed the onboarding process to be just as quick, easy and painless. On-boarding Satori involves a simple host name change and does not require any changes in how your organizational data is accessed or used,” they explain.

Apr
02
2019
--

How to handle dark data compliance risk at your company

Slack and other consumer-grade productivity tools have been taking off in workplaces large and small — and data governance hasn’t caught up.

Whether it’s litigation, compliance with regulations like GDPR or concerns about data breaches, legal teams need to account for new types of employee communication. And that’s hard when work is happening across the latest messaging apps and SaaS products, which make data searchability and accessibility more complex.

Here’s a quick look at the problem, followed by our suggestions for best practices at your company.

Problems

The increasing frequency of reported data breaches and expanding jurisdiction of new privacy laws are prompting conversations about dark data and risks at companies of all sizes, even small startups. Data risk discussions necessarily include the risk of a data breach, as well as preservation of data. Just two weeks ago it was reported that Jared Kushner used WhatsApp for official communications and took screenshots of those messages for preservation, a practice that commentators say complies with record-keeping laws but raises questions about potential admissibility as evidence.

Aug
29
2018
--

Storage provider Cloudian raises $94M

Cloudian, a company that specializes in helping businesses store petabytes of data, today announced that it has raised a $94 million Series E funding round. Investors in this round, which is one of the largest we have seen for a storage vendor, include Digital Alpha, Fidelity Eight Roads, Goldman Sachs, INCJ, JPIC (Japan Post Investment Corporation), NTT DOCOMO Ventures and WS Investments. This round includes a $25 million investment from Digital Alpha, which was first announced earlier this year.

With this, the seven-year-old company has now raised a total of $174 million.

As the company told me, it now has about 160 employees and 240 enterprise customers. Cloudian has found its sweet spot in managing the large video archives of entertainment companies, but its customers also include healthcare companies, automobile manufacturers and Formula One teams.

What’s important to stress here is that Cloudian’s focus is on on-premises storage, not cloud storage, though it does offer support for multi-cloud data management, as well. “Data tends to be most effectively used close to where it is created and close to where it’s being used,” Cloudian VP of worldwide sales Jon Ash told me. “That’s because of latency, because of network traffic. You can almost always get better performance, better control over your data if it is being stored close to where it’s being used.” He also noted that it’s often costly and complex to move that data elsewhere, especially when you’re talking about the large amounts of information that Cloudian’s customers need to manage.

Unsurprisingly, companies that have this much data now want to use it for machine learning, too, so Cloudian is starting to get into this space, as well. As Cloudian CEO and co-founder Michael Tso also told me, companies are now aware that the data they pull in, whether from IoT sensors, cameras or medical imaging devices, will only become more valuable over time as they try to train their models. If they decide to throw the data away, they run the risk of having nothing with which to train their models.

Cloudian plans to use the new funding to expand its global sales and marketing efforts and increase its engineering team. “We have to invest in engineering and our core technology, as well,” Tso noted. “We have to innovate in new areas like AI.”

As Ash also stressed, Cloudian’s business is really data management — not just storage. “Data is coming from everywhere and it’s going everywhere,” he said. “The old-school storage platforms that were siloed just don’t work anywhere.”

Feb
28
2017
--

Reflect drops public beta to power developer-first data visualization

Data visualization has been done — we have publicly traded, interactive, real-time and heck even artificially intelligent companies promising data visualization. But despite all the noise, Portland-based Reflect is making a go of it in the space, opening up its public beta today. By putting developers first and letting them integrate and customize visualizations in their own…
