Apr 02, 2020

Collibra nabs another $112.5M at a $2.3B valuation for its big data management platform

GDPR and other data protection and privacy regulations — as well as a significant (and growing) number of data breaches and exposés of companies’ privacy policies — have put a spotlight not just on the vast troves of data that businesses and other organizations hold on us, but also on how they handle it. Today, one of the companies helping them manage that data in a better, and legal, way is announcing a huge round of funding to continue that work. Collibra, which provides tools to manage, warehouse, store and analyse data troves, is announcing that it has raised $112.5 million in funding, at a post-money valuation of $2.3 billion.

The funding — a Series F, from the looks of it — represents a big bump for the startup, which last year raised $100 million at a valuation of just over $1 billion. This latest round was co-led by ICONIQ Capital, Index Ventures, and Durable Capital Partners LP, with previous investors CapitalG (Google’s growth fund), Battery Ventures, and Dawn Capital also participating.

Collibra was originally a spin-out from Vrije Universiteit in Brussels, Belgium, and today it works with some 450 enterprises and other large organizations. Customers include Adobe, Verizon (which owns TechCrunch), insurer AXA and a number of healthcare providers. Its products cover a range of services focused on company data, including tools to help customers comply with local data protection policies and store data securely, and tools (and plug-ins) to run analytics and more.

These are all features and products that have long had a place in enterprise big data IT, but they have become increasingly used and in demand as data policies have expanded, security has become more of an issue, and the prospects of what can be discovered through big data analytics have become more advanced.

With that growth, many companies have realised that they are not in a position to use and store their data in the best possible way, and that is where companies like Collibra step in.

“Most large organizations are in data chaos,” Felix Van de Maele, co-founder and CEO, previously told us. “We help them understand what data they have, where they store it and [understand] whether they are allowed to use it.”

As you would expect with a big IT trend, Collibra is not the only company chasing this opportunity. Competitors include Informatica, IBM, Talend, and Egnyte, among a number of others, but the market position of Collibra, and its advanced technology, is what has continued to impress investors.

“Durable Capital Partners invests in innovative companies that have significant potential to shape growing industries and build larger companies,” said Henry Ellenbogen, founder and chief investment officer for Durable Capital Partners LP, in a statement (Ellenbogen was formerly an investment manager at T. Rowe Price, and this is his first investment in Collibra under Durable). “We believe Collibra is a leader in the Data Intelligence category, a space that could have a tremendous impact on global business operations and a space that we expect will continue to grow as data becomes an increasingly critical asset.”

“We have a high degree of conviction in Collibra and the importance of the company’s mission to help organizations benefit from their data,” added Matt Jacobson, general partner at ICONIQ Capital and Collibra board member, in his own statement. “There is an increasing urgency for enterprises to harness their data for strategic business decisions. Collibra empowers organizations to use their data to make critical business decisions, especially in uncertain business environments.”

Mar 10, 2020

BackboneAI scores $4.7M seed to bring order to intercompany data sharing

BackboneAI, an early-stage startup that wants to help companies deal with large amounts of data, particularly data coming from a variety of external sources, announced a $4.7 million seed investment today.

The round was led by Fika Ventures with participation from Boldstart Ventures, Dynamo Ventures, GGV Capital, MetaProp, Spider VC and several other unnamed investors.

Company founder Rob Bailey says he has spent a lot of time in his career watching how data flows in organizations. There are still a myriad of challenges related to moving data between organizations, and that’s what his company is trying to solve. “BackboneAI is an AI platform specifically built for automating data flows within and between companies,” he said.

This could involve any number of scenarios from keeping large, complex data catalogues up-to-date to coordinating the intricate flow of construction materials between companies or content rights management across an entertainment industry.

Bailey says that he spent 18 months talking to companies before he built the product. “What we found is that every company we talked to was, in some way or another, concerned about an absolute flood of data from all these different applications and from all the companies that they’re working with externally,” he explained.

The BackboneAI platform aims to solve a number of problems related to this. For starters, it automates the acquisition of this data, usually from third parties like suppliers, customers, regulatory agencies and so forth. Then it handles ingestion of the data, and finally it takes care of much of the actual processing of data from external sources, mapping it to internal systems like the company’s ERP system.
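
Bailey’s description maps onto a fairly standard acquire, ingest and map pipeline. The sketch below is a hypothetical, minimal illustration of that flow in Python; the field names, the FIELD_MAP and the ErpItem schema are invented for this example and are not BackboneAI’s API.

```python
# Hypothetical sketch of an intercompany data flow: pull product records from an
# external supplier feed, normalize them, and map them onto an internal ERP schema.
from dataclasses import dataclass


@dataclass
class ErpItem:
    sku: str
    description: str
    unit_price: float
    supplier_id: str


# How the supplier's column names map onto the internal ERP fields (assumed).
FIELD_MAP = {"PartNo": "sku", "Desc": "description", "Price": "unit_price"}


def ingest(supplier_id: str, rows: list) -> list:
    """Normalize raw supplier rows and map them into ERP records."""
    items = []
    for row in rows:
        mapped = {erp_field: row[src_field] for src_field, erp_field in FIELD_MAP.items()}
        items.append(
            ErpItem(
                sku=str(mapped["sku"]).strip().upper(),
                description=str(mapped["description"]).strip(),
                unit_price=float(mapped["unit_price"]),
                supplier_id=supplier_id,
            )
        )
    return items


if __name__ == "__main__":
    raw = [{"PartNo": " ab-1001 ", "Desc": "M8 hex bolt", "Price": "0.12"}]
    print(ingest("supplier-42", raw))
```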

As an example, he uses an industrial supply company that may deal with a million SKUs across a couple of dozen divisions. Trying to track that with manual or even legacy systems is difficult. “They take all this product data in [from external suppliers], and then process the information in their own [internal] product catalog, and then finally present that data about those products to hundreds of thousands of customers. It’s an incredibly large and challenging data problem as you’re processing millions and millions of SKUs and orders, and you have to keep that data current on a regular basis,” he explained.

The company is just getting started. It spent 2019 incubating inside of Boldstart Ventures. Today the company has close to 20 employees in New York City, and it has signed its first Fortune 500 customer. Bailey says they have 15 additional Fortune 500 companies in the pipeline. With the seed money, he hopes to build on this initial success.

Feb 27, 2020

London-based Gyana raises $3.9M for a no-code approach to data science

Coding and other computer science expertise remain some of the more important skills that a person can have in the working world today, but in the last few years, we have also seen a big rise in a new generation of tools providing an alternative way of reaping the fruits of technology: “no-code” software, which lets anyone — technical or non-technical — build apps, games, AI-based chatbots, and other products that used to be the exclusive terrain of engineers and computer scientists.

Today, one of the newer startups in the category — London-based Gyana, which lets non-technical people run data science analytics on any structured dataset — is announcing a round of £3 million to fuel its next stage of growth.

The round was led by U.K. firm Fuel Ventures, with Biz Stone of Twitter, Green Shores Capital and U+I also participating, and it brings the total raised by the startup to $6.8 million since it was founded in 2015.

Gyana (Sanskrit for “knowledge”) was co-founded by Joyeeta Das and David Kell, who were both pursuing post-graduate degrees at Oxford: Das, a former engineer, was getting an MBA, and Kell was doing a Ph.D. in physics.

Das said the idea of building this tool came out of the fact that the pair could see a big disconnect emerging not just in their studies, but also in the world at large — not so much a digital divide as a digital light year, in terms of the distance between those who do and those who don’t know how to work in the realm of data science.

“Everyone talks about using data to inform decision making, and the world becoming data-driven, but actually that proposition is available to less than one percent of the world,” she said.

Out of that, the pair decided to work on building a platform that Das describes as a way to empower “citizen data scientists,” by letting users upload any structured data set (for example, a .CSV file) and running a series of queries on it to be able to visualise trends and other insights more easily.
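
Under the hood, that kind of “citizen data science” amounts to loading a structured file, aggregating it and plotting the result, roughly the steps sketched below with pandas. The file name and columns (a hypothetical sales.csv with date, region and revenue) are invented for illustration and are not tied to Gyana’s product.

```python
# Rough illustration of the analysis a point-and-click tool would generate:
# load a structured CSV, aggregate it, and visualize the trend.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv", parse_dates=["date"])  # hypothetical file

# Aggregate revenue by month and region, then plot the trend.
monthly = (
    df.groupby([pd.Grouper(key="date", freq="M"), "region"])["revenue"]
    .sum()
    .unstack("region")
)
monthly.plot(title="Monthly revenue by region")
plt.show()
```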

While the longer term goal may be for any person to be able to produce an analytical insight out of a long list of numbers, the more practical and immediate application has been in enterprise services and building tools for non-technical knowledge workers to make better, data-driven decisions.

To prove out its software, the startup first built an app based on the platform that it calls Neera (Sanskrit for “water”), which specifically parses footfall and other “human movement” metrics, useful for applications in retail, real estate and civic planning — for example, to determine how well certain retail locations are performing, measure footfall in popular areas, inform decisions on where to place or remove stores, or price a piece of property.

Although it started out aiming at mid-market and smaller companies — those most likely not to have in-house data scientists to meet their business needs — the startup has already picked up a series of customers that are actually quite a lot bigger than that. They include Vodafone, Barclays, EY, Pret a Manger, Knight Frank and the UK Ministry of Defence. It says it currently has some £1 million in contracts with these firms.

That, in turn, has served as the trigger to raise this latest round of funding and to launch Vayu (Sanskrit for “air”) — a more general purpose app that covers a wider set of parameters that can be applied to a dataset. So far, it has been adopted by academic researchers, financial services employees, and others that use analysis in their work, Das said.

With both Vayu and Neera, the aim — refreshingly — is to make the whole experience as privacy-friendly as possible, Das noted. Currently, you download an app if you want to use Gyana, and you keep your data local as you work on it. Gyana has no “anonymization” and no retention of data in its processes, except things like analytics around where your cursor hovers, so that Gyana knows how it can improve its product.

“There are always ways to reverse engineer these things,” Das said of anonymization. “We just wanted to make sure that we are not accidentally creating a situation where, despite learning from anonymised materials, you can’t reverse engineer what people are analysing. We are just not convinced.”

While there is something commendable about building and shipping a tool with a lot of potential to it, Gyana runs the risk of facing what I think of as the “water, water everywhere” problem. Sometimes if a person really has no experience or specific aim, it can be hard to think of how to get started when you can do anything. Das said they have also identified this, and so while currently Gyana already offers some tutorials and helper tools within the app to nudge the user along, the plan is to eventually bring in a large variety of datasets for people to get started with, and also to develop a more intuitive way to “read” the basics of the files in order to figure out what kinds of data inquiries a person is most likely to want to make.

The rise of “no-code” software has been a swift one in the world of tech, spanning the proliferation of startups, big acquisitions, and large funding rounds. Companies like Airtable and DashDash are building analytics tools on interfaces that follow the basic design of a spreadsheet; AppSheet, a no-code mobile app building platform, was recently acquired by Google; and Roblox (for building games without needing to code) and Unqork (for app development) have both raised significant funding just this week. In the area of no-code data analytics and visualisation, there are biggies like Tableau, as well as Trifacta, RapidMiner and more.

Gartner predicts that by 2024, some 65% of all app development will be made on low- or no-code platforms, and Forrester estimates that the no- and low-code market will be worth some $10 billion this year, rising to $21.2 billion by 2024.
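
As a quick sanity check on the Forrester numbers quoted above, growing from roughly $10 billion in 2020 to $21.2 billion in 2024 implies a compound annual growth rate of a little over 20%:

```python
# Back-of-the-envelope check on the quoted market-size figures.
start, end, years = 10.0, 21.2, 4
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # roughly 20.7%
```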

That represents a big business opportunity for the likes of Gyana, which has been unique in using the no-code approach specifically to tackle the area of data science.

However, in the spirit of citizen data scientists, the intention is to keep a consumer version of the apps free to use as it works on signing up enterprise users with more enhanced paid products, which will be priced on an annual license basis (currently clients are paying between $6,000 and $12,000 depending on usage, she said).

“We want to do free for as long as we can,” Das said, both in relation to the data tools and the datasets that it will offer to users. “The biggest value add is not about accessing premium data that is hard to get. We are not a data marketplace but we want to provide data that makes sense to access,” adding that even with business users, “we’d like you to do 90% of what you want to do without paying for anything.”

Oct 15, 2019

Databricks brings its Delta Lake project to the Linux Foundation

Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to the Linux Foundation under an open governance model. The company announced the launch of Delta Lake earlier this year, and, even though it’s still a relatively new project, it has already been adopted by many organizations and has found backing from companies like Intel, Alibaba and Booz Allen Hamilton.

“In 2013, we had a small project where we added SQL to Spark at Databricks […] and donated it to the Apache Foundation,” Databricks CEO and co-founder Ali Ghodsi told me. “Over the years, slowly people have changed how they actually leverage Spark and only in the last year or so it really started to dawn upon us that there’s a new pattern that’s emerging and Spark is being used in a completely different way than maybe we had planned initially.”

This pattern, he said, is that companies are taking all of their data and putting it into data lakes and then doing a couple of things with this data, machine learning and data science being the obvious ones. But they are also doing things that are more traditionally associated with data warehouses, like business intelligence and reporting. The term Ghodsi uses for this kind of usage is “Lake House.” More and more, Databricks is seeing that Spark is being used for this purpose and not just to replace Hadoop and do ETL (extract, transform, load). “This kind of Lake House pattern we’ve seen emerge more and more and we wanted to double down on it.”

Spark 3.0, which is launching soon, enables more of these use cases and speeds them up significantly, and it adds a new feature that lets you add a pluggable data catalog to Spark.

Delta Lake, Ghodsi said, is essentially the data layer of the Lake House pattern. It brings support for ACID transactions to data lakes, scalable metadata handling and data versioning, for example. All the data is stored in the Apache Parquet format and users can enforce schemas (and change them with relative ease if necessary).
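
For readers who want a concrete picture, the snippet below is a minimal sketch of those features using the open-source Delta Lake APIs from PySpark, assuming a Spark session that already has the Delta Lake package on its classpath; the paths and sample data are placeholders.

```python
# Minimal sketch of Delta Lake's headline features: ACID writes, versioned data,
# Parquet storage and schema enforcement. Assumes the delta package is installed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-lake-sketch").getOrCreate()

events = spark.createDataFrame(
    [(1, "click", "2019-10-15"), (2, "view", "2019-10-15")],
    ["id", "action", "date"],
)

# Each write is an atomic, versioned commit; the files on disk are Parquet.
events.write.format("delta").mode("overwrite").save("/tmp/events_delta")

# Appends are transactional too, and rows with a mismatched schema are rejected
# unless the schema change is made explicit (schema enforcement).
more = spark.createDataFrame([(3, "click", "2019-10-16")], ["id", "action", "date"])
more.write.format("delta").mode("append").save("/tmp/events_delta")

# "Time travel": read the table as it looked at an earlier version.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/events_delta")
v0.show()
```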

It’s interesting to see Databricks choose the Linux Foundation for this project, given that its roots are in the Apache Foundation. “We’re super excited to partner with them,” Ghodsi said about why the company chose the Linux Foundation. “They run the biggest projects on the planet, including the Linux project but also a lot of cloud projects. The cloud-native stuff is all in the Linux Foundation.”

“Bringing Delta Lake under the neutral home of the Linux Foundation will help the open-source community dependent on the project develop the technology addressing how big data is stored and processed, both on-prem and in the cloud,” said Michael Dolan, VP of Strategic Programs at the Linux Foundation. “The Linux Foundation helps open-source communities leverage an open governance model to enable broad industry contribution and consensus building, which will improve the state of the art for data storage and reliability.”

Oct 08, 2019

Satya Nadella looks to the future with edge computing

Speaking today at the Microsoft Government Leaders Summit in Washington, DC, Microsoft CEO Satya Nadella made the case for edge computing, even while pushing the Azure cloud as what he called “the world’s computer.”

While Amazon, Google and other competitors may have something to say about that, marketing hype aside, many companies are still in the midst of transitioning to the cloud. Nadella says the future of computing could actually be at the edge, where computing is done locally before data is then transferred to the cloud for AI and machine learning purposes. What goes around, comes around.

But as Nadella sees it, this is not going to be about either edge or cloud. It’s going to be the two technologies working in tandem. “Now, all this is being driven by this new tech paradigm that we describe as the intelligent cloud and the intelligent edge,” he said today.

He said that to truly understand the impact the edge is going to have on computing, you have to look at research, which predicts there will be 50 billion connected devices in the world by 2030, a number even he finds astonishing. “I mean this is pretty stunning. We think about a billion Windows machines or a couple of billion smartphones. This is 50 billion [devices], and that’s the scope,” he said.

The key here is that these 50 billion devices, whether you call them edge devices or the Internet of Things, will be generating tons of data. That means you will have to develop entirely new ways of thinking about how all this flows together. “The capacity at the edge, that ubiquity is going to be transformative in how we think about computation in any business process of ours,” he said. As we generate ever-increasing amounts of data, whether we are talking about public sector use cases or any business need, it’s going to be the fuel for artificial intelligence, and he sees the sheer amount of that data driving new AI use cases.

“Of course when you have that rich computational fabric, one of the things that you can do is create this new asset, which is data and AI. There is not going to be a single application, a single experience that you are going to build, that is not going to be driven by AI, and that means you have to really have the ability to reason over large amounts of data to create that AI,” he said.

Nadella would be more than happy to have his audience take care of all that using Microsoft products, whether Azure compute, database, AI tools or edge computers like the Data Box Edge it introduced in 2018. While Nadella is probably right about the future of computing, all of this could apply to any cloud, not just Microsoft.

As computing shifts to the edge, it’s going to have a profound impact on the way we think about technology in general, but it’s probably not going to involve being tied to a single vendor, regardless of how comprehensive their offerings may be.

Sep 17, 2019

Data storage company Cloudian launches a new edge analytics subsidiary called Edgematrix

Cloudian, a company that enables businesses to store and manage massive amounts of data, announced today the launch of Edgematrix, a new unit focused on edge analytics for large data sets. Edgematrix, a majority-owned subsidiary of Cloudian, will first be available in Japan, where both companies are based. It has raised a $9 million Series A from strategic investors NTT Docomo, Shimizu Corporation and Japan Post Capital, as well as Cloudian co-founder and CEO Michael Tso and board director Jonathan Epstein. The funding will be used on product development, deployment and sales and marketing.

Cloudian itself has raised a total of $174 million, including a $94 million Series E round announced last year. Its products include the Hyperstore platform, which allows businesses to store hundreds of petabytes of data on-premises, and software for data analytics and machine learning. Edgematrix uses Hyperstore for storing large-scale data sets and its own AI software and hardware for data processing at the “edge” of networks, closer to where data is collected from IoT devices like sensors.

The company’s solutions were created for situations where real-time analytics is necessary. For example, it can be used to detect the make, model and year of cars on highways so targeted billboard ads can be displayed to their drivers.
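
In code, that kind of edge loop looks roughly like the sketch below: classify locally, act immediately, and only ship compact summaries upstream. Every function and field name here is a hypothetical placeholder rather than an Edgematrix API.

```python
# Illustrative edge-analytics loop: inference and the ad decision happen locally;
# only small summary records are sent to central storage.
import time


def capture_frame():
    # Placeholder for reading a frame from a roadside camera.
    return object()


def classify_vehicle(frame):
    # Placeholder for an on-device model that returns make/model/year.
    return {"make": "ExampleMotors", "model": "X", "year": 2018}


def choose_ad(vehicle):
    # Placeholder mapping from vehicle attributes to a billboard creative.
    return f"ad-for-{vehicle['make'].lower()}"


summaries = []
for _ in range(3):  # in production this would be a continuous loop
    frame = capture_frame()
    vehicle = classify_vehicle(frame)   # inference at the edge
    ad = choose_ad(vehicle)             # real-time decision, no cloud round trip
    summaries.append({"ts": time.time(), "vehicle": vehicle, "ad": ad})

# Only the compact summaries (not raw video) go upstream, which is the
# transmission-cost and latency argument made in the article.
print(summaries)
```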

Tso told TechCrunch in an email that Edgematrix was launched after Cloudian co-founder and president Hiroshi Ohta and a team spent two years working on technology to help Cloudian customers process and analyze their data more efficiently.

“With more and more data being created at the edge, including IoT data, there’s a growing need for being able to apply real-time data analysis and decision-making at or near the edge, minimizing the transmission costs and latencies involved in moving the data elsewhere,” said Tso. “Based on the initial success of a small Cloudian team developing AI software solutions and attracting a number of top-tier customers, we decided that the best way to build on this success was establishing a subsidiary with strategic investors.”

Edgematrix is launching in Japan first because spending on AI systems there is expected to grow faster than in any other market, at a compound annual growth rate of 45.3% from 2018 to 2023, according to IDC.

“Japan has been ahead of the curve as an early adopter of AI technology, with both the government and private sector viewing it as essential to boosting productivity,” said Tso. “Edgematrix will focus on the Japanese market for at least the next year, and assuming that all goes well, it would then expand to North America and Europe.”

Jul 29, 2019

Microsoft acquires data privacy and governance service BlueTalon

Microsoft today announced that it has acquired BlueTalon, a data privacy and governance service that helps enterprises set policies for how their employees can access their data. The service then enforces those policies across most popular data environments and provides tools for auditing policies and access, too.
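
As a rough illustration of what such a policy engine does, the toy sketch below declares which roles may see which columns, enforces that policy when rows are read, and records an audit trail. The policy format and field names are invented for this example and are not BlueTalon’s actual policy language.

```python
# Toy data-access policy engine: declare column-level entitlements per role,
# enforce them on reads, and keep an audit trail of every access decision.
POLICIES = {
    "analyst": {"orders": {"columns": ["order_id", "amount", "region"]}},
    "support": {"orders": {"columns": ["order_id", "customer_email"]}},
}

AUDIT_LOG = []


def enforce(role, table, rows):
    allowed = POLICIES.get(role, {}).get(table, {}).get("columns", [])
    AUDIT_LOG.append({"role": role, "table": table, "columns": allowed})
    # Strip every column the role is not entitled to see.
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


rows = [{"order_id": 1, "amount": 42.0, "region": "EU", "customer_email": "a@b.c"}]
print(enforce("analyst", "orders", rows))  # no customer_email in the result
print(AUDIT_LOG)
```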

Neither Microsoft nor BlueTalon disclosed the financial details of the transaction. Ahead of today’s acquisition, BlueTalon had raised about $27.4 million, according to Crunchbase. Investors include Bloomberg Beta, Maverick Ventures, Signia Venture Partners and Stanford’s StartX fund.

Image caption: the BlueTalon Policy Engine and how it works.

“The IP and talent acquired through BlueTalon brings a unique expertise at the apex of big data, security and governance,” writes Rohan Kumar, Microsoft’s corporate VP for Azure Data. “This acquisition will enhance our ability to empower enterprises across industries to digitally transform while ensuring right use of data with centralized data governance at scale through Azure.”

Unsurprisingly, the BlueTalon team will become part of the Azure Data Governance group, where the team will work on enhancing Microsoft’s capabilities around data privacy and governance. Microsoft already offers access and governance control tools for Azure, of course. As virtually all businesses become more data-centric, though, the need for centralized access controls that work across systems is only going to increase and new data privacy laws aren’t making this process easier.

“As we began exploring partnership opportunities with various hyperscale cloud providers to better serve our customers, Microsoft deeply impressed us,” BlueTalon CEO Eric Tilenius, who has clearly read his share of “our incredible journey” blog posts, explains in today’s announcement. “The Azure Data team was uniquely thoughtful and visionary when it came to data governance. We found them to be the perfect fit for us in both mission and culture. So when Microsoft asked us to join forces, we jumped at the opportunity.”

Jun 10, 2019

With Tableau and Mulesoft, Salesforce gains full view of enterprise data

Back in the 2010 timeframe, it was common to say that content was king, but after watching Google buy Looker for $2.6 billion last week and Salesforce nab Tableau for $15.7 billion this morning, it’s clear that data has ascended to the throne in a business context.

We have been hearing about Big Data for years, but we’ve probably reached a point in 2019 where the data onslaught is really having an impact on business. If you can find the key data nuggets in the big data pile, it can clearly be a competitive advantage, and companies like Google and Salesforce are pulling out their checkbooks to make sure they are in a position to help you out.

While Google, as a cloud infrastructure vendor, is trying to help companies on its platform and across the cloud understand and visualize all that data, Salesforce as a SaaS vendor might have a different reason — one that might surprise you — given that Salesforce was born in the cloud. But perhaps it recognizes something fundamental. If it truly wants to own the enterprise, it has to have a hybrid story, and with Mulesoft and Tableau, that’s precisely what it has — and why it was willing to spend around $23 billion to get it.

Making connections

Certainly, Salesforce chairman Marc Benioff has no trouble seeing the connections between his two big purchases over the last year. He sees the combination of Mulesoft connecting to the data sources and Tableau providing a way to visualize them as a “beautiful thing.”

May 15, 2019

Tealium, a big data platform for structuring disparate customer information, raises $55M at $850M valuation

The average enterprise today uses about 90 different software packages, with 30 to 40 of them touching customers directly or indirectly. The data that comes out of those systems can prove to be very useful — to help other systems and employees work more intelligently, to help companies make better business decisions — but only if it’s put in order. Now, a startup called Tealium, which has built a system precisely to do just that and works with the likes of Facebook and IBM to help manage their customer data, has raised a big round of funding to continue building out the services it provides.

Today, it is announcing a $55 million round of funding — a Series F led by Silver Lake Waterman, the firm’s late-stage capital growth fund; with ABN AMRO, Bain Capital, Declaration Partners, Georgian Partners, Industry Ventures, Parkwood and Presidio Ventures also participating.

Jeff Lunsford, Tealium’s CEO, said the company is not disclosing valuation, but he did say that it was “substantially” higher than when the company was last priced three years ago. That valuation was $305 million in 2016, according to PitchBook — a figure Lunsford didn’t dispute when I spoke with him about it, and a source close to the company says it is “more than double” this last valuation, and actually around $850 million.

He added that the company is close to profitability and is projected to make $100 million in revenues this year, and that this is being considered the company’s “final round” — presumably a sign that it will no longer need external funding, and that if it does need more capital, the next step might be getting acquired or going public.

This brings the total raised by Tealium to $160 million.

The company’s rise over the last eight years has dovetailed with the rapid growth of big data. The movement of services to digital platforms has resulted in a sea of information. Much of that largely sits untapped, but those who are able to bring it to order can reap the rewards by gaining better insights into their organizations.

Tealium had its beginnings in amassing and ordering tags from internet traffic to help optimise marketing and so on — a business where it competes with the likes of Google and Adobe.

Over time, it has expanded to capitalise on a much wider set of data sources that range well beyond web and commerce, and one use of the funding will be to continue expanding those data sources, as well as how they are used, with an emphasis on using more AI, Lunsford said.

“There are new areas that touch customers like smart home and smart office hardware, and each requires a step up in integration for a company like us,” he said. “Then once you have it all centralised you could feed machine learning algorithms to have tighter predictions.”
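
The “centralise first, then learn from it” idea in that quote can be pictured with a small sketch: stitch events from several customer-facing systems into one profile per visitor, which is the kind of unified record downstream machine learning would train on. The source names and fields below are invented for illustration.

```python
# Toy example of stitching events from different systems into visitor profiles.
from collections import defaultdict

events = [
    {"visitor_id": "v1", "source": "web", "page": "/pricing"},
    {"visitor_id": "v1", "source": "email", "campaign": "spring-sale"},
    {"visitor_id": "v2", "source": "smart_home", "device": "thermostat"},
]

profiles = defaultdict(lambda: {"sources": set(), "events": []})
for e in events:
    profile = profiles[e["visitor_id"]]
    profile["sources"].add(e["source"])
    profile["events"].append(e)

# A unified profile like this is what prediction models would be fed.
for visitor, profile in profiles.items():
    print(visitor, sorted(profile["sources"]), len(profile["events"]))
```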

That vast potential is one reason for the investor interest.

“Tealium enables enterprises to solve the customer data fragmentation problem by integrating and enriching data across sources, in real-time, to create audiences while providing data governance and fidelity,” said Shawn O’Neill, managing director of Silver Lake Waterman, in a statement. “Jeff and his team have built a great platform and we are excited to support the company’s continued growth and investment in innovation.”

The rapid growth of digital services has already seen the company getting a big boost in terms of the data that is passing through its cloud-based platform: it has had a 300% year-over-year increase in visitor profiles created, with current tech customers including the likes of Facebook, IBM, Visa and others from across a variety of sectors, such as healthcare, finance and more.

“You’d be surprised how many big tech companies use Tealium,” Lunsford said. “Even they have a limited amount of bandwidth when it comes to developing their internal platforms.”

People like to say that “data is the new oil,” but these days that expression has taken on perhaps an unintended meaning: just as the overconsumption of oil and fossil fuels is viewed as detrimental to the long-term health of our planet, the overconsumption of data has become a problematic spectre in our increasingly pervasive world of tech.

Governments — the European Union being one notable example — are taking up the challenge of that latter issue with new regulations, specifically GDPR. Interestingly, Lunsford says this has been a good thing rather than a bad thing for his company, as it gives a much clearer directive to companies about what they can use, and how it can be used.

“They want to follow the law,” he said of their clients, “and we give them the data freedom and control to do that.” It’s not the only company tackling the business opportunity of being a big-data repository at a time when data misuse is being scrutinised more than ever: InCountry, which launched weeks ago, is also banking on this gap in the market.

I’d argue that this could potentially be one more reason why Tealium is keen on expanding to areas like IoT and other sources of customer information: just like the sea, the pool of data that’s there for the tapping is nearly limitless.

Feb 20, 2019

Why Daimler moved its big data platform to the cloud

Like virtually every big enterprise company, a few years ago, the German auto giant Daimler decided to invest in its own on-premises data centers. And while those aren’t going away anytime soon, the company today announced that it has successfully moved its on-premises big data platform to Microsoft’s Azure cloud. This new platform, which the company calls eXtollo, is Daimler’s first major service to run outside of its own data centers, though it’ll probably not be the last.

As Guido Vetter, head of Daimler’s corporate center of excellence for advanced analytics and big data, told me, the company started getting interested in big data about five years ago. “We invested in technology — the classical way, on-premise — and got a couple of people on it. And we were investigating what we could do with data because data is transforming our whole business as well,” he said.

By 2016, the size of the organization had grown to the point where a more formal structure was needed to enable the company to handle its data at a global scale. At the time, the buzz phrase was “data lakes” and the company started building its own in order to build out its analytics capacities.

Electric lineup, Daimler AG

“Sooner or later, we hit the limits as it’s not our core business to run these big environments,” Vetter said. “Flexibility and scalability are what you need for AI and advanced analytics and our whole operations are not set up for that. Our backend operations are set up for keeping a plant running and keeping everything safe and secure.” But in this new world of enterprise IT, companies need to be able to be flexible and experiment — and, if necessary, throw out failed experiments quickly.

So about a year and a half ago, Vetter’s team started the eXtollo project to bring all the company’s activities around advanced analytics, big data and artificial intelligence into the Azure Cloud, and just over two weeks ago, the team shut down its last on-premises servers after slowly turning on its solutions in Microsoft’s data centers in Europe, the U.S. and Asia. All in all, the actual transition between the on-premises data centers and the Azure cloud took about nine months. That may not seem fast, but for an enterprise project like this, that’s about as fast as it gets (and for a while, it fed all new data into both its on-premises data lake and Azure).

If you work for a startup, then all of this probably doesn’t seem like a big deal, but for a more traditional enterprise like Daimler, even just giving up control over the physical hardware where your data resides was a major culture change and something that took quite a bit of convincing. In the end, the solution came down to encryption.

“We needed the means to secure the data in the Microsoft data center with our own means that ensure that only we have access to the raw data and work with the data,” explained Vetter. In the end, the company decided to use the Azure Key Vault to manage and rotate its encryption keys. Indeed, Vetter noted that knowing that the company had full control over its own data was what allowed this project to move forward.
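
The pattern Vetter describes, keeping the keys under the customer’s control and using the vault to wrap local data-encryption keys, can be sketched with Azure Key Vault’s Python SDK as below. The vault URL and key name are placeholders, and this is an illustration of the general approach, not Daimler’s actual implementation.

```python
# Sketch of customer-controlled encryption with Azure Key Vault: a key held in
# the vault wraps a locally generated data-encryption key.
import os

from azure.identity import DefaultAzureCredential
from azure.keyvault.keys import KeyClient
from azure.keyvault.keys.crypto import CryptographyClient, KeyWrapAlgorithm

credential = DefaultAzureCredential()
key_client = KeyClient(
    vault_url="https://example-vault.vault.azure.net",  # placeholder vault
    credential=credential,
)

# Creating a key under an existing name produces a new key version, which is
# one simple way to handle rotation.
kek = key_client.create_rsa_key("extollo-key-encryption-key", size=2048)

# Generate a local data-encryption key and wrap (encrypt) it with the vault key.
data_key = os.urandom(32)
crypto = CryptographyClient(kek, credential=credential)
wrapped = crypto.wrap_key(KeyWrapAlgorithm.rsa_oaep, data_key)

# Only the wrapped key is stored alongside the data; unwrapping requires access
# to the vault, so the raw keys stay under the customer's control.
unwrapped = crypto.unwrap_key(KeyWrapAlgorithm.rsa_oaep, wrapped.encrypted_key)
assert unwrapped.key == data_key
```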

Vetter tells me the company obviously looked at Microsoft’s competitors as well, but he noted that his team didn’t find a compelling offer from other vendors in terms of functionality and the security features that it needed.

Today, Daimler’s big data unit uses tools like HDInsight and Azure Databricks, which cover more than 90 percent of the company’s current use cases. In the future, Vetter also wants to make it easier for less experienced users to use self-service tools to launch AI and analytics services.

While cost is often a factor that counts against the cloud, because renting server capacity isn’t cheap, Vetter argues that this move will actually save the company money and that storage costs, especially, are going to be cheaper in the cloud than in its on-premises data center (and chances are that Daimler, given its size and prestige as a customer, isn’t exactly paying the same rack rate that others are paying for the Azure services).

As with so many big data AI projects, predictions are the focus of much of what Daimler is doing. That may mean looking at a car’s data and error code and helping the technician diagnose an issue or doing predictive maintenance on a commercial vehicle. Interestingly, the company isn’t currently bringing to the cloud any of its own IoT data from its plants. That’s all managed in the company’s on-premises data centers because it wants to avoid the risk of having to shut down a plant because its tools lost the connection to a data center, for example.
