May
07
2020
--

Harbr emerges from stealth to help build online data marketplaces

Harbr co-founder Anthony Cosgrove has been working with data for over 15 years, so he knows firsthand the problems associated with pulling data together in a way that makes it easy for others to consume, whether internally or externally. Like many entrepreneurs before him, he decided to start a company to solve that problem, and today it came out of stealth.

Cosgrove explained that in his experience, data platforms of the past had several problems. “They were too slow. They were too expensive and too risky, and when you got the data you then ended up working in a silo with really no repeatability of anything that you did for anybody else in your organization,” he explained.

Cosgrove started Harbr because he saw a dearth of tools to help with these issues. “We wanted to create an environment where organizations could share their data, collaborate on that data and create new versions of that data that were really optimized for very specific use cases,” he said.

For now, the company is concentrating on large data vendors, helping them package and monetize the data they produce as a business more efficiently, but Cosgrove sees a time when he could also help firms that produce data as a byproduct of conducting business to monetize that data more easily.

He says these big data businesses generally lack the agility to package data in ways that make sense for each customer, and his company’s product should help solve that. “They’re able to start working directly with their customers to move away from kind of sending data to actually selling services, models or insights, which is what customers really want,” he said.

Another distinguishing aspect of the tool is that it is a true platform, meaning you are not restricted to the data in your own system. You can pull together other data sources as well, and that could make for even more interesting ways to package the data for customers.

The company launched in London in 2017 and spent some time building the product. It recently opened offices in the United States and currently has 30 employees divided between the two locations. It has raised $6.5 million in seed capital led by Boldstart Ventures.

Apr
22
2020
--

Fishtown Analytics raises $12.9M Series A for its open-source analytics engineering tool

Philadelphia-based Fishtown Analytics, the company behind the popular open-source data engineering tool dbt, today announced that it has raised a $12.9 million Series A round led by Andreessen Horowitz, with the firm’s general partner Martin Casado joining the company’s board.

“I wrote this blog post in early 2016, essentially saying that analysts needed to work in a fundamentally different way,” Fishtown founder and CEO Tristan Handy told me, when I asked him about how the product came to be. “They needed to work in a way that much more closely mirrored the way the software engineers work and software engineers have been figuring this shit out for years and data analysts are still like sending each other Microsoft Excel docs over email.”

The dbt open-source project forms the basis of this. It allows anyone who can write SQL queries to transform data and then load it into their preferred analytics tools. As such, it sits in between data warehouses and the tools that load data into them on one end, and specialized analytics tools on the other.

As Casado noted when I talked to him about the investment, data warehouses have now made it affordable for businesses to store all of their data before it is transformed. So what was traditionally “extract, transform, load” (ETL) has now become “extract, load, transform” (ELT). Andreessen Horowitz is already invested in Fivetran, which helps businesses move their data into their warehouses, so it makes sense for the firm to also tackle the other side of this business.
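To make the ELT pattern that dbt operationalizes more concrete, here is a minimal sketch: raw records land in the warehouse first, and the transformation is expressed as SQL that runs inside the warehouse itself. The example uses Python's built-in sqlite3 module as a stand-in warehouse; the table names and query are illustrative and not taken from dbt.

```python
import sqlite3

# A stand-in "warehouse": in ELT, raw records land here untransformed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount_cents INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 1250, "complete"), (2, 800, "returned"), (3, 4300, "complete")],
)

# The "T" step: a SQL model that reshapes raw data into an analytics-ready
# table, run inside the warehouse rather than in a separate ETL pipeline.
conn.execute("""
    CREATE TABLE orders AS
    SELECT id,
           amount_cents / 100.0 AS amount_dollars,
           status = 'complete'  AS is_complete
    FROM raw_orders
""")

for row in conn.execute("SELECT * FROM orders"):
    print(row)  # analytics tools would query this modeled table
```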

“Dbt is, as far as we can tell, the leading community for transformation and it’s a company we’ve been tracking for at least a year,” Casado said. He also argued that data analysts — unlike data scientists — are not really catered to as a group.

Fishtown has been around for a few years now, but before this round it hadn’t raised much money, aside from a small SAFE round from Amplify.

But Handy argued that the company needed this time to prove that it was on to something and build a community. That community now consists of more than 1,700 companies that use the dbt project in some form and over 5,000 people in the dbt Slack community. Fishtown also now has over 250 dbt Cloud customers and the company signed up a number of big enterprise clients earlier this year. With that, the company needed to raise money to expand and also better service its current list of customers.

“We live in Philadelphia. The cost of living is low here and none of us really care to make a quadro-billion dollars, but we do want to answer the question of how do we best serve the community,” Handy said. “And for the first time, in the early part of the year, we were like, holy shit, we can’t keep up with all of the stuff that people need from us.”

The company plans to expand the team from 25 to 50 employees in 2020, and with those new hires it plans to improve and expand the product, especially its IDE for data analysts, which Handy admitted could use a bit more polish.

Apr
15
2020
--

Pinpoint releases dashboard to bring visibility to software engineering operations

As companies look for better ways to understand how different departments work at a granular level, engineering has traditionally been a black box of siloed data. Pinpoint, an Austin-based startup, has been working on a platform to bring this information into a single view, and today it released a dashboard to help companies understand what’s happening across software engineering from an operational perspective.

Jeff Haynie, co-founder and CEO at Pinpoint, says the company’s mission for the last two years has been giving greater visibility into the engineering department, something he says is even more important in the current context with workers spread out at home.

“Companies give engineering a bunch of money, and they build a bunch of amazing things, but in the end, it is just a black box, and we really don’t know what happens,” Haynie said. He says his company has been working to pull all of that data together, contextualize it and correlate it.

Today, they are introducing a dashboard that takes what they’ve been building and pulls it together into a single view, which is 100% self-serve. Prior to this, you needed a bunch of hand-holding from Pinpoint personnel to get it up and running, but today you can download the product and sign into your various services such as your git repository, your CI/CD software, your IDE and so forth.

It also provides a way for engineering personnel to communicate with one another without leaving the tool.

Pinpoint software engineering dashboard. Image Credit: Pinpoint

“Obviously, we will handhold and help people as they need it, and we have an enterprise version of the product with a higher level of SLA, and we have a customer success team to do that, but we’ve really focused this new release on purely self service,” Haynie said.

What’s more, while there is already a version for teams of under 10 people that’s free forever, with the release of today’s product the company is offering unlimited access to the dashboard free for three months.

Haynie says they’re like any startup right now, but having experience with several other startups and having lived through 9/11, the dot-com crash, 2008 and so forth, he knows how to hunker down and preserve cash. At the same time, he says they are seeing a lot of in-bound interest in the product, and they wanted to come up with a creative way to help customers through this crisis, while putting the product out there for people to use.

“We’re like any other startup or any other business frankly at this point: we’re nervous and scared. How do you survive this [and how long will it last]? The other side of it is that we’re rushing to take advantage of this inbound interest that we’re getting and trying to sort of seize the opportunity and try to be creative about how we help them.”

The startup hopes that, if companies find the product useful, after three months they won’t mind paying for the full version. For now, it’s just putting it out there for free and seeing what happens with it — just another startup trying to find a way through this crisis.

Apr
09
2020
--

Free tool helps manufacturers map where COVID-19 impacts supply chain

Assent Compliance, a company that helps large manufacturers like GE and Rolls Royce manage complex supply chains through an online data exchange, announced a new tool this week that lets any company, whether they’re a customer or not, upload bills of materials and see on a map where COVID-19 is having an impact on their supply chain.

Company co-founder Matt Whitteker says the Ottawa startup focuses on supply chain data management, which means it has the data and the tooling to develop a data-driven supply chain map based on WHO data identifying COVID hotspots. He believes that his is the only company to have done this.

“We’re the only ones that have taken supply chain data and applied it to this particular pandemic. And it’s something that’s really native to our platform. We have all that data on hand — we have location data for suppliers. So it’s just a matter of applying that with third-party data sources (like the WHO data), and then extracting valuable business intelligence from it,” he said.

If you want to participate, you simply go to the company website and fill out a form. A customer success employee will contact you and walk you through the process of uploading your data to the platform. Once they have your data, they generate a map showing the parts of the world where your supply chain is most likely to be disrupted, identifying the level of risk based on your individual data.

The company captures supply chain data in the normal course of doing business with the 1,000 customers and 500,000 suppliers currently on its platform. “When companies are manufacturing products they have what’s called a bill of materials, kind of like a recipe. And companies upload their bill of materials that basically outlines all their parts, components and commodities, and who they get them from, which basically represents their supply chain,” Whitteker explained.
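Conceptually, turning a bill of materials into a risk map is a join between supplier locations and region-level outbreak data. The sketch below is purely illustrative; the column names, sample figures and risk thresholds are made up and do not reflect Assent's actual implementation.

```python
import pandas as pd

# Illustrative bill of materials: each part with its supplier's location.
bom = pd.DataFrame([
    {"part": "sensor housing", "supplier": "Acme Plastics", "region": "Lombardy"},
    {"part": "wiring harness", "supplier": "Volt Cables", "region": "Hubei"},
    {"part": "fasteners", "supplier": "NordFix", "region": "Bavaria"},
])

# Illustrative region-level case counts, standing in for WHO hotspot data.
hotspots = pd.DataFrame([
    {"region": "Lombardy", "cases_per_100k": 310},
    {"region": "Hubei", "cases_per_100k": 95},
    {"region": "Bavaria", "cases_per_100k": 40},
])

# Join supplier locations to outbreak data and bucket results into risk levels.
merged = bom.merge(hotspots, on="region", how="left")
merged["risk"] = pd.cut(
    merged["cases_per_100k"],
    bins=[0, 50, 150, float("inf")],
    labels=["low", "medium", "high"],
)
print(merged[["part", "supplier", "region", "risk"]])
```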

After the company uploads the bill of materials, Assent opens a portal for the companies to exchange data, which might be tax forms, proof of sourcing or any kind of information and documentation the manufacturer needs to comply with legal and regulatory rules around procurement of a given part.

They decided to start building the COVID-19 map application when they recognized that the pandemic was going to cause the biggest supply chain disruption the world has seen since World War II. It took about a month to build. The tool went into beta last week with customers, and over 350 signed up in the first two hours. This week, they made it generally available to anyone, even non-customers, for free.

The company was founded in 2016 and has raised $220 million, according to Whitteker.

Mar
03
2020
--

Datastax acquires The Last Pickle

Data management company Datastax, one of the largest contributors to the Apache Cassandra project, today announced that it has acquired The Last Pickle (and no, I don’t know what’s up with that name either), a New Zealand-based Cassandra consulting and services firm that’s behind a number of popular open-source tools for the distributed NoSQL database.

As Datastax Chief Strategy Officer Sam Ramji, who you may remember from his recent tenure at Apigee, the Cloud Foundry Foundation, Google and Autodesk, told me, The Last Pickle is one of the premier Apache Cassandra consulting and services companies. The team there has been building Cassandra-based open-source solutions for the likes of Spotify, T Mobile and AT&T since it was founded back in 2012. And while The Last Pickle is based in New Zealand, the company has engineers all over the world who do the heavy lifting and help these companies successfully implement the Cassandra database technology.

It’s worth mentioning that Last Pickle CEO Aaron Morton first discovered Cassandra when he worked for WETA Digital on the special effects for Avatar, where the team used Cassandra to allow the VFX artists to store their data.

“There’s two parts to what they do,” Ramji explained. “One is the very visible consulting, which has led them to become world experts in the operation of Cassandra. So as we automate Cassandra and as we improve the operability of the project with enterprises, their embodied wisdom about how to operate and scale Apache Cassandra is as good as it gets — the best in the world.” And The Last Pickle’s experience in building systems with tens of thousands of nodes — and the challenges that its customers face — is something Datastax can then offer to its customers as well.

And Datastax, of course, also plans to productize The Last Pickle’s open-source tools like the automated repair tool Reaper and the Medusa backup and restore system.

As both Ramji and Datastax VP of Engineering Josh McKenzie stressed, Cassandra has seen a lot of commercial development in recent years, with the likes of AWS now offering a managed Cassandra service, for example, but there wasn’t all that much hype around the project anymore. But they argue that’s a good thing. Now that it is over ten years old, Cassandra has been battle-hardened. For the last ten years, Ramji argues, the industry tried to figure out what the de facto standard for scale-out computing should be. By 2019, it became clear that Kubernetes was the answer to that.

“This next decade is about what is the de facto standard for scale-out data? We think that’s got certain affordances, certain structural needs and we think that the decades that Cassandra has spent getting hardened puts it in a position to be data for that wave.”

McKenzie also noted that a number of Cassandra’s built-in features, like support for multiple data centers and geo-replication, rolling updates and live scaling, as well as wide support across programming languages, give it advantages over competing databases.

“It’s easy to forget how much Cassandra gives you for free just based on its architecture,” he said. “Losing the power in an entire datacenter, upgrading the version of the database, hardware failing every day? No problem. The cluster is 100 percent always still up and available. The tooling and expertise of The Last Pickle really help bring all this distributed and resilient power into the hands of the masses.”
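For a sense of what getting multi-datacenter replication "for free" looks like in practice, geo-replication in Cassandra is declared once, at the keyspace level, and the database keeps the replicas in sync. The sketch below uses the Python cassandra-driver; the contact point, keyspace and data center names are placeholders.

```python
from cassandra.cluster import Cluster

# Connect to any reachable node; the driver discovers the rest of the cluster.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

# Geo-replication is a keyspace-level declaration: three replicas in each
# data center, kept in sync by Cassandra itself with no application logic.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS orders
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'us_east': 3,
        'eu_west': 3
    }
""")
cluster.shutdown()
```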

The two companies did not disclose the price of the acquisition.

Jun
27
2019
--

Fungible raises $200 million led by SoftBank Vision Fund to help companies handle increasingly massive amounts of data

Fungible, a startup that wants to help data centers cope with the increasingly massive amounts of data produced by new technologies, has raised a $200 million Series C led by SoftBank Vision Fund, with participation from Norwest Venture Partners and its existing investors. As part of the round, SoftBank Investment Advisers senior managing partner Deep Nishar will join Fungible’s board of directors.

Founded in 2015, Fungible now counts about 200 employees and has raised more than $300 million in total funding. Its other investors include Battery Ventures, Mayfield Fund, Redline Capital and Walden Riverwood Ventures. Its new capital will be used to speed up product development. The company’s founders, CEO Pradeep Sindhu and Bertrand Serlet, say Fungible will release more information later this year about when its data processing units will be available and their on-boarding process, which they say will not require clients to change their existing applications, networking or server design.

Sindhu previously founded Juniper Networks, where he held roles as chief scientist and CEO. Serlet was senior vice president of software engineering at Apple before leaving in 2011 and founding Upthere, a storage startup that was acquired by Western Digital in 2017. Sindhu and Serlet describe Fungible’s objective as pivoting data centers from a “compute-centric” model to a data-centric one. While the company is often asked whether it considers Intel and Nvidia competitors, the founders say Fungible’s Data Processing Units (DPUs) complement tech, including central and graphics processing units, from other chip makers.

Sindhu describes Fungible’s DPUs as a new building block in data center infrastructure, allowing them to handle larger amounts of data more efficiently and also potentially enabling new kinds of applications. Its DPUs are fully programmable and connect with standard IPs over Ethernet local area networks and local buses, like PCI Express, which in turn connect to CPUs, GPUs and storage. Placed between the two, the DPUs act like a “super-charged data traffic controller,” performing computations offloaded by the CPUs and GPUs, as well as converting the IP connection into high-speed data center fabric.

This better prepares data centers for the enormous amounts of data generated by new technology, including self-driving cars, and industries such as personalized healthcare, financial services, cloud gaming, agriculture, call centers and manufacturing, says Sindhu.

In a press statement, Nishar said “As the global data explosion and AI revolution unfold, global computing, storage and networking infrastructure are undergoing a fundamental transformation. Fungible’s products enable data centers to leverage their existing hardware infrastructure and benefit from these new technology paradigms. We look forward to partnering with the company’s visionary and accomplished management team as they power the next generation of data centers.”

Jun
10
2019
--

With Tableau and Mulesoft, Salesforce gains full view of enterprise data

Back in the 2010 timeframe, it was common to say that content was king, but after watching Google buy Looker for $2.6 billion last week and Salesforce nab Tableau for $15.7 billion this morning, it’s clear that data has ascended to the throne in a business context.

We have been hearing about Big Data for years, but we’ve probably reached a point in 2019 where the data onslaught is really having an impact on business. If you can find the key data nuggets in the big data pile, it can clearly be a competitive advantage, and companies like Google and Salesforce are pulling out their checkbooks to make sure they are in a position to help you out.

While Google, as a cloud infrastructure vendor, is trying to help companies on its platform and across the cloud understand and visualize all that data, Salesforce as a SaaS vendor might have a different reason — one that might surprise you — given that Salesforce was born in the cloud. But perhaps it recognizes something fundamental. If it truly wants to own the enterprise, it has to have a hybrid story, and with Mulesoft and Tableau, that’s precisely what it has — and why it was willing to spend around $23 billion to get it.

Making connections

Certainly, Salesforce chairman Marc Benioff has no trouble seeing the connections between his two big purchases over the last year. He sees the combination of Mulesoft connecting to the data sources and Tableau providing a way to visualize that data as a “beautiful thing.”

May
02
2019
--

Couchbase’s mobile database gets built-in ML and enhanced synchronization features

Couchbase, the company behind the eponymous NoSQL database, announced a major update to its mobile database today that brings some machine learning smarts, as well as improved synchronization features and enhanced stats and logging support, to the software.

“We’ve led the innovation and data management at the edge since the release of our mobile database five years ago,” Couchbase’s VP of Engineering Wayne Carter told me. “And we’re excited that others are doing that now. We feel that it’s very, very important for businesses to be able to utilize these emerging technologies that do sit on the edge to drive their businesses forward, and both making their employees more effective and their customer experience better.”

The latter part is what drove a lot of today’s updates, Carter noted. He also believes that the database is the right place to do some machine learning. So with this release, the company is adding predictive queries to its mobile database. This new API allows mobile apps to take pre-trained machine learning models and run predictive queries against the data that is stored locally. This would allow a retailer to create a tool that can use a phone’s camera to figure out what part a customer is looking for.

To support these predictive queries, Couchbase mobile is also getting support for predictive indexes. “Predictive indexes allow you to create an index on prediction, enabling correlation of real-time predictions with application data in milliseconds,” Carter said. In many ways, that’s also the unique value proposition for bringing machine learning into the database. “What you really need to do is you need to utilize the unique values of a database to be able to deliver the answer to those real-time questions within milliseconds,” explained Carter.
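As a rough illustration of the pattern Carter describes, and not Couchbase's actual API, running a pre-trained model inside the local query path means computing the prediction once, keeping it indexed alongside the document, and then filtering on it like any other field. The model, documents and field names below are hypothetical.

```python
# Hypothetical stand-in for a pre-trained, on-device model.
def classify_part(photo_bytes: bytes) -> str:
    # A real app would run an ML model here; we fake a label for illustration.
    return "hex bolt" if len(photo_bytes) % 2 == 0 else "washer"

# Locally stored documents, as a mobile database might hold them.
documents = [
    {"id": "doc-1", "photo": b"\x01\x02\x03\x04"},
    {"id": "doc-2", "photo": b"\x05\x06\x07"},
]

# "Predictive index": precompute predictions so queries can correlate them
# with application data in milliseconds instead of re-running the model.
prediction_index = {doc["id"]: classify_part(doc["photo"]) for doc in documents}

# "Predictive query": filter documents by the indexed prediction.
matches = [doc["id"] for doc in documents if prediction_index[doc["id"]] == "hex bolt"]
print(matches)
```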

The other major new feature in this release is delta synchronization, which allows businesses to push far smaller updates to the databases on their employees’ mobile devices. That’s because they only have to receive the information that changed instead of a full updated database. Carter says this was a highly requested feature, but until now, the company always had to prioritize work on other components of Couchbase.
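A toy version of the idea behind delta synchronization, using hypothetical document fields: only the fields that changed between two revisions travel over the network, and the device patches its local copy with them.

```python
def compute_delta(old: dict, new: dict) -> dict:
    """Return only the fields that differ between two document revisions."""
    return {k: v for k, v in new.items() if old.get(k) != v}

# Server-side revisions of a (hypothetical) catalog document.
rev_1 = {"sku": "A-100", "price": 19.99, "stock": 42, "name": "Claw hammer"}
rev_2 = {"sku": "A-100", "price": 17.99, "stock": 40, "name": "Claw hammer"}

delta = compute_delta(rev_1, rev_2)   # {'price': 17.99, 'stock': 40}

# The device receives just the delta and applies it to its local copy.
local_copy = dict(rev_1)
local_copy.update(delta)
assert local_copy == rev_2
print(delta)
```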

This is an especially useful feature for the company’s retail customers, a vertical where it has been quite successful. These users need to keep their catalogs up to date, and quite a few of them supply their employees with mobile devices to help shoppers. Rumor has it that Apple, too, is a Couchbase user.

The update also includes a few new features that will be more of interest to operators, including advanced stats reporting and enhanced logging support.

Feb
20
2019
--

Why Daimler moved its big data platform to the cloud

Like virtually every big enterprise company, a few years ago, the German auto giant Daimler decided to invest in its own on-premises data centers. And while those aren’t going away anytime soon, the company today announced that it has successfully moved its on-premises big data platform to Microsoft’s Azure cloud. This new platform, which the company calls eXtollo, is Daimler’s first major service to run outside of its own data centers, though it’ll probably not be the last.

As Daimler’s head of its corporate center of excellence for advanced analytics and big data Guido Vetter told me, the company started getting interested in big data about five years ago. “We invested in technology — the classical way, on-premise — and got a couple of people on it. And we were investigating what we could do with data because data is transforming our whole business as well,” he said.

By 2016, the size of the organization had grown to the point where a more formal structure was needed to enable the company to handle its data at a global scale. At the time, the buzz phrase was “data lakes” and the company started building its own in order to build out its analytics capacities.

Electric lineup, Daimler AG

“Sooner or later, we hit the limits as it’s not our core business to run these big environments,” Vetter said. “Flexibility and scalability are what you need for AI and advanced analytics and our whole operations are not set up for that. Our backend operations are set up for keeping a plant running and keeping everything safe and secure.” But in this new world of enterprise IT, companies need to be able to be flexible and experiment — and, if necessary, throw out failed experiments quickly.

So about a year and a half ago, Vetter’s team started the eXtollo project to bring all the company’s activities around advanced analytics, big data and artificial intelligence into the Azure Cloud, and just over two weeks ago, the team shut down its last on-premises servers after slowly turning on its solutions in Microsoft’s data centers in Europe, the U.S. and Asia. All in all, the actual transition between the on-premises data centers and the Azure cloud took about nine months. That may not seem fast, but for an enterprise project like this, that’s about as fast as it gets (and for a while, it fed all new data into both its on-premises data lake and Azure).

If you work for a startup, then all of this probably doesn’t seem like a big deal, but for a more traditional enterprise like Daimler, even just giving up control over the physical hardware where your data resides was a major culture change and something that took quite a bit of convincing. In the end, the solution came down to encryption.

“We needed the means to secure the data in the Microsoft data center with our own means that ensure that only we have access to the raw data and work with the data,” explained Vetter. In the end, the company decided to use the Azure Key Vault to manage and rotate its encryption keys. Indeed, Vetter noted that knowing that the company had full control over its own data was what allowed this project to move forward.
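To make the key-management approach a little more concrete, here is a minimal sketch using the azure-identity and azure-keyvault-keys Python packages. The vault URL and key name are placeholders, and this is only an illustration of keeping key material under the data owner's control in Key Vault, not Daimler's actual setup.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.keys import KeyClient
from azure.keyvault.keys.crypto import CryptographyClient, EncryptionAlgorithm

# Placeholder vault URL and key name; not Daimler's actual configuration.
credential = DefaultAzureCredential()
key_client = KeyClient(
    vault_url="https://example-vault.vault.azure.net", credential=credential
)

# The key material lives behind the vault; the application only references
# it by name, which is what keeps control with the data owner.
key = key_client.create_rsa_key("data-lake-key")

crypto = CryptographyClient(key, credential)
result = crypto.encrypt(EncryptionAlgorithm.rsa_oaep, b"sensitive record")
restored = crypto.decrypt(EncryptionAlgorithm.rsa_oaep, result.ciphertext)
assert restored.plaintext == b"sensitive record"

# Rotation creates a new key version without the application changing.
key_client.rotate_key("data-lake-key")
```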

Vetter tells me the company obviously looked at Microsoft’s competitors as well, but he noted that his team didn’t find a compelling offer from other vendors in terms of functionality and the security features that it needed.

Today, Daimler’s big data unit uses tools like HD Insights and Azure Databricks, which cover more than 90 percent of the company’s current use cases. In the future, Vetter also wants to make it easier for less experienced users to use self-service tools to launch AI and analytics services.

While cost is often a factor that counts against the cloud, because renting server capacity isn’t cheap, Vetter argues that this move will actually save the company money and that storage costs, especially, are going to be cheaper in the cloud than in its on-premises data center (and chances are that Daimler, given its size and prestige as a customer, isn’t exactly paying the same rack rate that others are paying for the Azure services).

As with so many big data AI projects, predictions are the focus of much of what Daimler is doing. That may mean looking at a car’s data and error code and helping the technician diagnose an issue or doing predictive maintenance on a commercial vehicle. Interestingly, the company isn’t currently bringing to the cloud any of its own IoT data from its plants. That’s all managed in the company’s on-premises data centers because it wants to avoid the risk of having to shut down a plant because its tools lost the connection to a data center, for example.

Feb
06
2019
--

Big companies are not becoming data-driven fast enough

I remember watching MIT professor Andrew McAfee years ago telling stories about the importance of data over gut feeling, whether it was predicting successful wines or making sound business decisions. We have been hearing about big data and data-driven decision making for so long, you would think it has become hardened into our largest organizations by now. As it turns out, new research by NewVantage Partners finds that most large companies are having problems implementing an organization-wide, data-driven strategy.

McAfee was fond of saying that before the data deluge we have today, the way most large organizations made decisions was via the HiPPO — the highest paid person’s opinion. Then he would chide the audience that this was not the proper way to run your business. Data, not gut feelings, even those based on experience, should drive important organizational decisions.

While companies haven’t failed to recognize McAfee’s advice, the NVP report suggests they are having problems implementing data-driven decision making across organizations. There are plenty of technological solutions out there today to help them, from startups all the way to the largest enterprise vendors, but the data (see, you always need to go back to the data) suggests that it’s not a technology problem, it’s a people problem.

Executives can have the farsighted vision that their organizations need to be data-driven. They can acquire all of the latest solutions to bring data to the forefront, but unless they combine that with a broad cultural shift and a deep understanding of how to use that data inside business processes, they will continue to struggle.

The study’s authors, Randy Bean and Thomas H. Davenport, wrote about the people problem in their study’s executive summary. “We hear little about initiatives devoted to changing human attitudes and behaviors around data. Unless the focus shifts to these types of activities, we are likely to see the same problem areas in the future that we’ve observed year after year in this survey.”

The survey found that 72 percent of respondents have failed in this regard, reporting they haven’t been able to create a data-driven culture, whatever that means to individual respondents. Meanwhile, 69 percent reported they had failed to create a data-driven organization, although it would seem that these two metrics would be closely aligned.

Perhaps most discouraging of all is that the data is trending the wrong way. Over the last several years, the report’s authors say, the share of organizations calling themselves data-driven has actually dropped each year, from 37.1 percent in 2017 to 32.4 percent in 2018 to 31.0 percent in the latest survey.

This matters on so many levels, but consider that as companies shift to artificial intelligence and machine learning, these technologies rely on abundant amounts of data to work effectively. What’s more, every organization, regardless of its size, is generating vast amounts of data, simply as part of being a digital business in the 21st century. They need to find a way to control this data to make better decisions and understand their customers better. It’s essential.

There is so much talk about innovation and disruption, and understanding and affecting company culture, but so much of all this is linked. You need to be more agile. You need to be more digital. You need to be transformational. You need to be all of these things — and data is at the center of all of it.

Data has been called the new oil often enough to be cliché, but these results reveal that the lesson is failing to get through. Companies need to be data-driven now, this instant. This isn’t something to be working toward at this point. This is something you need to be doing, unless your ultimate goal is to become irrelevant.
