Jun 17, 2021

Transform launches with $24.5M in funding for a tool to query and build metrics out of data troves

The biggest tech companies have put a lot of time and money into building tools and platforms for their data science teams and those who work with them to glean insights and metrics out of the masses of data that their companies produce: how a company is performing, how a new feature is working, when something is broken, or when something might be selling well (and why) are all things you can figure out if you know how to read the data.

Now, three alums who worked with data in the world of big tech have founded a startup that aims to build a “metrics store” so that the rest of the enterprise world — much of which lacks the resources to build tools like this from scratch — can easily use metrics to figure out things like this, too.

Transform, as the startup is called, is coming out of stealth today, and it’s doing so with an impressive amount of early backing — a sign not just of investor confidence in these particular founders, but also the recognition that there is a gap in the market for, as the company describes it, a “single source of truth for business data” that could be usefully filled.

The company is announcing that it has closed, while in stealth, a Series A of $20 million, and an earlier seed round of $4.5 million — both led by Index Ventures and Redpoint Ventures. The seed, the company said, also had dozens of angel investors, with the list including Elad Gil of Color Genomics, Lenny Rachitsky of Airbnb and Cristina Cordova of Notion.

The big breakthrough Transform has made is that it has built a metrics engine that a company can apply to its own structured data — a tool similar to what big tech companies have built for internal use, but one that, until now, hasn’t really been available to companies outside of big tech.

Transform can work with vast troves of data in the warehouse, or data being tracked in real time, to generate insights and analytics about the different actions taken around a company’s products. It can also be used and queried by non-technical people who still have to work with data, Handel said.

The impetus for building the product came to Nick Handel, James Mayfield and Paul Yang — respectively Transform’s CEO, COO and software engineer — when they all worked together at Airbnb (previously Mayfield and Yang were also at Facebook together) in a mix of roles that included product management and engineering.

There, they saw first-hand the promise that data held for making decisions around a product, measuring how something is used and planning future features, but also the demands of harnessing it and of getting everyone on the same page to do so.

“There is a growing trend among tech companies to test every single feature, every single iteration of whatever. And so as a part of that, we built this tool [at Airbnb] that basically allowed you to define the various metrics that you wanted to track to understand your experiment,” Handel recalled in an interview. “But you also want to understand so many other things like, how many people are searching for listings in certain areas? How many people are instantly booking those listings? Are they contacting customer service, are they having trust and safety issues?” The tool Airbnb built was Minerva, optimized specifically for the kinds of questions Airbnb might typically have for its own data.

“By locking down all of the definitions for the metrics, you could basically have a data engineering team, a centralized data infrastructure team, do all the calculation for these metrics, and then serve those to the data scientists to then go in and do kind of deeper, more interesting work, because they weren’t bogged down in calculating those metrics over and over,” he continued. This platform evolved within Airbnb. “We were really inspired by some of the early work that we saw happen on this tool.”
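
To make that idea concrete, here is a minimal sketch in Python of what a centrally owned, declarative metric definition might look like. The schema, table and field names are hypothetical illustrations, not Airbnb’s Minerva format or Transform’s actual product:

```python
# Hypothetical, simplified metric definition: the metric is declared once,
# in one place, so a central data team can compute it consistently for everyone.
BOOKINGS_METRIC = {
    "name": "instant_bookings",
    "description": "Count of listings booked via Instant Book",
    "source_table": "fact_bookings",  # assumed warehouse table
    "measure": {"agg": "count", "column": "booking_id"},
    "filters": ["booking_type = 'instant'"],
    "dimensions": ["ds", "region", "listing_market"],
}

def to_sql(metric: dict) -> str:
    """Render a metric definition into a simple aggregation query."""
    where = " AND ".join(metric["filters"]) or "TRUE"
    dims = ", ".join(metric["dimensions"])
    agg = metric["measure"]
    return (
        f"SELECT {dims}, {agg['agg'].upper()}({agg['column']}) AS {metric['name']}\n"
        f"FROM {metric['source_table']}\n"
        f"WHERE {where}\n"
        f"GROUP BY {dims}"
    )

print(to_sql(BOOKINGS_METRIC))
```

Because the definition lives in one place, every team that asks for “instant bookings” gets the same query, which is exactly the consistency problem these tools are built to solve.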

The issue is that not every company is built to, well, build tools like these tailored to whatever their own business interests might be.

“There’s a handful of companies who do similar things in the metrics space,” Mayfield said, “really top-flight companies like LinkedIn, Airbnb and Uber. They have really started to invest in metrics. But it’s only those companies that can devote teams of eight or 10 engineers and designers who can build those things in-house. And I think that was probably, you know, a big part of the impetus for wanting to start this company: to say, not every organization is going to be able to devote eight or 10 engineers to building this metrics tool.”

And the other issue is that metrics have become an increasingly important — maybe the most important — lever for decision-making in the world of product design and wider business strategy for a tech (and maybe, by extension, any) company.

We have moved away from “move fast and break things.” Instead, we now embrace — as Mayfield put it — “If you can’t measure it, you can’t move it.”

Transform is built around three basic priorities, Handel said.

The first of these has to do with collective ownership of metrics: by building a single framework for defining and identifying them, the theory goes, it becomes easier for a whole company to get on the same page about using them. The second is to make the data team’s work easier and more efficient by turning the most repetitive parts of extracting insights into automated scripts that can be used and reused, so the team spends more time analyzing data rather than just building datasets. And the third is to provide customers with APIs that they can use to embed the metric-extracting tools into other applications, whether in business intelligence or elsewhere.
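
As a rough illustration of that third priority, here is what calling a metrics API could look like from Python. This is a hedged sketch: the endpoint, payload shape and response format are all hypothetical, not Transform’s documented API.

```python
import requests  # plain HTTP client; a real product would likely ship its own SDK

# Hypothetical endpoint and payload: callers request a governed metric by name
# instead of hand-writing SQL against the warehouse.
API_URL = "https://metrics.example.com/api/v1/query"  # placeholder URL

payload = {
    "metric": "weekly_active_users",  # assumed to be defined once in the framework
    "dimensions": ["region"],
    "start": "2021-05-01",
    "end": "2021-06-01",
}

resp = requests.post(API_URL, json=payload, timeout=30)
resp.raise_for_status()
for row in resp.json()["rows"]:  # assumed response shape
    print(row["region"], row["weekly_active_users"])
```

The design point is that a BI dashboard, a notebook and an internal app can all hit the same definition, so their numbers agree.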

The three products it’s introducing today, called Metrics Framework, Metrics Catalog and Metrics API, follow from these principles.

Transform is only really launching publicly today, but Handel said that it’s already working with a handful of unnamed customers in a closed beta, enough to be confident that what it’s built works as intended. The funding will be used to continue building out the product, to bring on more talent and, hopefully, to onboard more businesses.

“Hopefully” might be a more tenuous word than its investors would use; they are convinced that it’s filling a strong need in the market.

“Transform is filling a critical gap within the industry. Just as we invested in Looker early on for its innovative approach to business intelligence, Transform takes it one step further by providing a powerful yet streamlined single source of truth for metrics,” said Tomasz Tunguz, managing director at Redpoint Ventures, in a statement.

“We’ve seen companies across the globe struggle to make sense of endless data sources or turn them into actionable, trusted metrics. We invested in Transform because they’ve developed an elegant solution to this problem that will change how companies think about their data,” added Shardul Shah, a partner at Index Ventures.

May 20, 2021

How to ensure data quality in the era of Big Data

A little over a decade has passed since The Economist warned us that we would soon be drowning in data. The modern data stack has emerged as a proposed life-jacket for this data flood — spearheaded by Silicon Valley startups such as Snowflake, Databricks and Confluent.

Today, any entrepreneur can sign up for BigQuery or Snowflake and have a data solution that can scale with their business in a matter of hours. The emergence of cheap, flexible and scalable data storage solutions was largely a response to changing needs spurred by the massive explosion of data.

Currently, the world produces 2.5 quintillion bytes of data daily (there are 18 zeros in a quintillion). The explosion of data continues in the roaring ‘20s, both in terms of generation and storage — the amount of stored data is expected to continue to double at least every four years. However, one integral part of modern data infrastructure still lacks solutions suitable for the Big Data era and its challenges: Monitoring of data quality and data validation.
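
To make that gap concrete, here is a minimal sketch of the kinds of checks a data-quality tool runs, written with pandas against a hypothetical orders table. Real platforms run checks like these continuously at warehouse scale and alert on failures:

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame, expected_min_rows: int = 1_000) -> dict:
    """Three common data-quality checks: completeness, freshness and volume."""
    checks = {}
    # Completeness: the share of null customer IDs should stay near zero.
    checks["null_rate_ok"] = df["customer_id"].isna().mean() < 0.01
    # Freshness: the newest record should be less than a day old.
    latest = pd.to_datetime(df["created_at"], utc=True).max()
    checks["fresh"] = (pd.Timestamp.now(tz="UTC") - latest) < pd.Timedelta(days=1)
    # Volume: a sudden drop in row count often signals a broken upstream pipeline.
    checks["volume_ok"] = len(df) >= expected_min_rows
    return checks
```

Checks like these are trivial on one table; the hard part is running them continuously across thousands of fast-moving tables without drowning in false alarms.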

Let me go through how we got here and the challenges ahead for data quality.

The value vs. volume dilemma of Big Data

In 2005, Tim O’Reilly published his groundbreaking article “What is Web 2.0?”, truly setting off the Big Data race. The same year, Roger Mougalas from O’Reilly introduced the term “Big Data” in its modern context, referring to a large set of data that is virtually impossible to manage and process using traditional BI tools.

Back in 2005, one of the biggest challenges with data was managing large volumes of it, as data infrastructure tooling was expensive and inflexible, and the cloud market was still in its infancy (AWS didn’t publicly launch until 2006). The other was speed: As Tristan Handy from Fishtown Analytics (the company behind dbt) notes, before Redshift launched in 2012, performing relatively straightforward analyses could be incredibly time-consuming even with medium-sized data sets. An entire data tooling ecosystem has since been created to mitigate these two problems.

The emergence of the modern data stack (example logos and categories). Image Credits: Validio

Scaling relational databases and data warehouse appliances used to be a real challenge. Only 10 years ago, a company that wanted to understand customer behavior had to buy and rack servers before its engineers and data scientists could work on generating insights. Data and its surrounding infrastructure were expensive, so only the biggest companies could afford large-scale data ingestion and storage.

The challenge before us is to ensure that the large volumes of Big Data are of sufficiently high quality before they’re used.

Then came a (Red)shift. In October 2012, AWS presented the first viable solution to the scale challenge with Redshift — a cloud-native, massively parallel processing (MPP) database that anyone could use for the monthly price of a pair of sneakers ($100) — about 1,000x cheaper than the previous “local-server” setup. With a price drop of this magnitude, the floodgates opened and every company, big or small, could now store and process massive amounts of data and unlock new opportunities.

As Jamin Ball from Altimeter Capital summarizes, Redshift was a big deal because it was the first cloud-native OLAP warehouse, and it reduced the cost of owning an OLAP database by orders of magnitude. The speed of processing analytical queries also increased dramatically. And later on, Snowflake pioneered the separation of compute and storage, which, in overly simplified terms, means customers can scale their storage and computing resources independently.

What did this all mean? An explosion of data collection and storage.

Mar 23, 2021

Dataminr raises $475M on a $4.1B valuation for real-time insights based on 100k sources of public data

Significant funding news today for one of the startups making a business out of tapping huge, noisy troves of publicly available data across social media, news sites, undisclosed filings and more. Dataminr, which ingests information from a mix of 100,000 public data sources and, based on that, provides customers with real-time insights into ongoing events and new developments, has closed on $475 million in new funding. Dataminr has confirmed that this Series F values the company at $4.1 billion as it gears up for an IPO, most likely in 2023.
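
As a rough sketch of the general pattern behind this kind of product (scan a stream of public posts for signals, emit timestamped alerts), consider the toy Python below. Dataminr’s actual system applies AI across 100,000 sources; the class names and naive keyword matching here are purely illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Post:
    source: str
    timestamp: str
    text: str

# Toy signal list; a real system learns and ranks signals rather than
# matching fixed keywords.
SIGNALS = {"outage", "explosion", "recall", "data breach"}

def alerts(stream: Iterable[Post]) -> Iterator[str]:
    """Yield a timestamped alert for each post that mentions a signal term."""
    for post in stream:
        hits = {s for s in SIGNALS if s in post.text.lower()}
        if hits:
            yield f"[{post.timestamp}] {post.source}: possible {', '.join(sorted(hits))}"

# Example: one matching post in the feed produces one alert.
feed = [Post("newswire", "2021-03-23T09:14Z", "Major cloud provider reports outage")]
for alert in alerts(feed):
    print(alert)
```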

This Series F is coming from a mix of investors including Eldridge (a firm that owns the LA Dodgers but also makes a bunch of other sports, media, tech and other investments), Valor Equity Partners (the firm behind Tesla and many tech startups), MSD Capital (Michael Dell’s fund), Reinvent Capital (Mark Pincus and Reid Hoffman’s firm), ArrowMark Partners, IVP, Eden Global and investment funds managed by Morgan Stanley Tactical Value, among others.

To put its valuation into some context, the New York-based company last raised money in 2018 at a $1.6 billion valuation. And with this latest round, it has now raised over $1 billion in outside funding, based on PitchBook data. This latest round has been in the works for a while and was rumored last week at a lower valuation than what Dataminr ultimately got.

The funding is coming at a critical moment, both for the company and for the world at large.

In terms of the company, Dataminr has been seeing a huge surge of business.

Ted Bailey, the founder and CEO, said in an interview that it will be using the money to continue growing its business in existing areas: adding more corporate customers, expanding international sales and expanding its AI platform as it gears up for an IPO, most likely in 2023. In addition to being used by journalists and newsrooms, NGOs and other public organizations, its corporate business today, Bailey said, includes half of the Fortune 50 and a number of large public sector organizations. Revenue from that large-enterprise segment doubled over the last year.

“Whether it’s for physical safety, reputation risk or crisis management, or business intelligence or cybersecurity, we’re providing critical insights on a daily basis,” he said. “All of the events of the recent year have created a sense of urgency, and demand has really surged.”

Activity on the many platforms that Dataminr taps for information has been rising for years, but it grew especially sharply in the last year, as more people spent more time at home and online, away from physically interacting with each other. That means more data for Dataminr to crawl, but also, quite possibly, more at stake for all of us: there is so much more out there than before, and so much more to be gleaned from that information.

That also means that the wider context of Dataminr’s growth is not quite so clear cut.

The company’s data tools have indeed usefully helped first responders react in crisis situations, feeding them data faster than even their own channels might do; and it provides a number of useful, market-impacting insights to businesses.

But Dataminr’s role in helping its customers — which include policing forces — connect the dots on certain issues has not always been seen as a positive. One controversial accusation made last year was that Dataminr data was being used by police for racial profiling. In years past, it has been barred by specific partners like Twitter from sharing data with intelligence agencies. Twitter used to be a 5% shareholder in the company. Bailey confirmed to me that it no longer is but remains a key partner for data. I’ve contacted Twitter to see if I can get more detail on this and will update the story if and when I learn more. Twitter made $509 million in revenues from services like data licensing in 2020, up by about $45 million on the year before.

In defense of Dataminr, Bailey said that the negative spins on what it does result from “misperceptions,” since it can’t track people or do anything proactive. “We deliver alerts on events and it’s [about] a time advantage,” he said, likening it to the Associated Press, but “just earlier.”

“The product can’t be used for surveillance,” Bailey added. “It is prohibited.”

Of course, in the ongoing debate about surveillance, the question is how Dataminr’s customers might ultimately use the data they get through its tools; the criticism is about what it might enable rather than what it does directly.

Despite some of those persistent questions about the ethics of AI and other tools and how they are implemented by end users, backers are bullish on the opportunities for Dataminr to continue growing.

Eden Global Partners served as strategic partner for the Series F capital round.



Mar 7, 2017

John Deere partners with Kespry to bring drones and aerial data to construction and forestry

Heavy equipment maker Deere & Co., better known as John Deere, has forged a strategic alliance with drone-tech startup Kespry, the companies announced Tuesday in Las Vegas at CONEXPO, an international trade show for the construction industry. The deal could prove a boon for sales of Kespry’s drones and data analytics software. It could help John Deere tap into a new, high-tech means… Read More

Nov 19, 2016

How data science and rocket science will get humans to Mars

President Obama recently re-affirmed America’s commitment to sending a manned mission to Mars. Think your data science challenges are complicated? Imagine the difficulties involved in mining data to understand the health impacts of a trip to Mars. When sending humans “where no one has gone before,” there are a multitude of variables to consider, and NASA is hard at work… Read More

Jun 7, 2016

Celonis takes $27.5M led by Accel, 83North to grow the market for big data process mining

How do big businesses optimize IT-driven processes? Often by hiring management consultants to cast an expert eye over digital traceries and deliver recommendations for improving core operations like logistics and production. But Munich-based B2B SaaS startup Celonis reckons software can do a better job of flagging up areas where there’s room for business optimization. Its… Read More

May 29, 2014

Trifacta Raises $25 Million For Its Data Transformation Software

While companies spend trillions of dollars on big data technologies to compile, store, and deliver the volumes of data they’re collecting and the software and services that they have to make sense of that data once it’s collected, there’s a divide that separates the two parts of the process. San Francisco, Calif.-based Trifacta is selling software that purports to bridge… Read More
