Feb
20
2019
--

Why Daimler moved its big data platform to the cloud

Like virtually every big enterprise company, a few years ago, the German auto giant Daimler decided to invest in its own on-premises data centers. And while those aren’t going away anytime soon, the company today announced that it has successfully moved its on-premises big data platform to Microsoft’s Azure cloud. This new platform, which the company calls eXtollo, is Daimler’s first major service to run outside of its own data centers, though it’ll probably not be the last.

As Daimler’s head of its corporate center of excellence for advanced analytics and big data Guido Vetter told me, the company started getting interested in big data about five years ago. “We invested in technology — the classical way, on-premise — and got a couple of people on it. And we were investigating what we could do with data because data is transforming our whole business as well,” he said.

By 2016, the size of the organization had grown to the point where a more formal structure was needed to enable the company to handle its data at a global scale. At the time, the buzz phrase was “data lakes” and the company started building its own in order to build out its analytics capacities.

Electric lineup, Daimler AG

“Sooner or later, we hit the limits as it’s not our core business to run these big environments,” Vetter said. “Flexibility and scalability are what you need for AI and advanced analytics and our whole operations are not set up for that. Our backend operations are set up for keeping a plant running and keeping everything safe and secure.” But in this new world of enterprise IT, companies need to be able to be flexible and experiment — and, if necessary, throw out failed experiments quickly.

So about a year and a half ago, Vetter’s team started the eXtollo project to bring all the company’s activities around advanced analytics, big data and artificial intelligence into the Azure Cloud, and just over two weeks ago, the team shut down its last on-premises servers after slowly turning on its solutions in Microsoft’s data centers in Europe, the U.S. and Asia. All in all, the actual transition between the on-premises data centers and the Azure cloud took about nine months. That may not seem fast, but for an enterprise project like this, that’s about as fast as it gets (and for a while, it fed all new data into both its on-premises data lake and Azure).

If you work for a startup, then all of this probably doesn’t seem like a big deal, but for a more traditional enterprise like Daimler, even just giving up control over the physical hardware where your data resides was a major culture change and something that took quite a bit of convincing. In the end, the solution came down to encryption.

“We needed the means to secure the data in the Microsoft data center with our own means that ensure that only we have access to the raw data and work with the data,” explained Vetter. In the end, the company decided to use the Azure Key Vault to manage and rotate its encryption keys. Indeed, Vetter noted that knowing that the company had full control over its own data was what allowed this project to move forward.

Vetter tells me the company obviously looked at Microsoft’s competitors as well, but he noted that his team didn’t find a compelling offer from other vendors in terms of functionality and the security features that it needed.

Today, Daimler’s big data unit uses tools like HD Insights and Azure Databricks, which covers more than 90 percents of the company’s current use cases. In the future, Vetter also wants to make it easier for less experienced users to use self-service tools to launch AI and analytics services.

While cost is often a factor that counts against the cloud, because renting server capacity isn’t cheap, Vetter argues that this move will actually save the company money and that storage costs, especially, are going to be cheaper in the cloud than in its on-premises data center (and chances are that Daimler, given its size and prestige as a customer, isn’t exactly paying the same rack rate that others are paying for the Azure services).

As with so many big data AI projects, predictions are the focus of much of what Daimler is doing. That may mean looking at a car’s data and error code and helping the technician diagnose an issue or doing predictive maintenance on a commercial vehicle. Interestingly, the company isn’t currently bringing to the cloud any of its own IoT data from its plants. That’s all managed in the company’s on-premises data centers because it wants to avoid the risk of having to shut down a plant because its tools lost the connection to a data center, for example.

Feb
19
2019
--

Redis Labs raises a $60M Series E round

Redis Labs, a startup that offers commercial services around the Redis in-memory data store (and which counts Redis creator and lead developer Salvatore Sanfilippo among its employees), today announced that it has raised a $60 million Series E funding round led by private equity firm Francisco Partners.

The firm didn’t participate in any of Redis Labs’ previous rounds, but existing investors Goldman Sachs Private Capital Investing, Bain Capital Ventures, Viola Ventures and Dell Technologies Capital all participated in this round.

In total, Redis Labs has now raised $146 million and the company plans to use the new funding to accelerate its go-to-market strategy and continue to invest in the Redis community and product development.

Current Redis Labs users include the likes of American Express, Staples, Microsoft, Mastercard and Atlassian . In total, the company now has more than 8,500 customers. Because it’s pretty flexible, these customers use the service as a database, cache and message broker, depending on their needs. The company’s flagship product is Redis Enterprise, which extends the open-source Redis platform with additional tools and services for enterprises. The company offers managed cloud services, which give businesses the choice between hosting on public clouds like AWS, GCP and Azure, as well as their private clouds, in addition to traditional software downloads and licenses for self-managed installs.

Redis Labs CEO Ofer Bengal told me the company’s isn’t cash positive yet. He also noted that the company didn’t need to raise this round but that he decided to do so in order to accelerate growth. “In this competitive environment, you have to spend a lot and push hard on product development,” he said.

It’s worth noting that he stressed that Francisco Partners has a reputation for taking companies forward and the logical next step for Redis Labs would be an IPO. “We think that we have a very unique opportunity to build a very large company that deserves an IPO,” he said.

Part of this new competitive environment also involves competitors that use other companies’ open-source projects to build their own products without contributing back. Redis Labs was one of the first of a number of open-source companies that decided to offer its newest releases under a new license that still allows developers to modify the code but that forces competitors that want to essentially resell it to buy a commercial license. Ofer specifically notes AWS in this context. It’s worth noting that this isn’t about the Redis database itself but about the additional modules that Redis Labs built. Redis Enterprise itself is closed-source.

“When we came out with this new license, there were many different views,” he acknowledged. “Some people condemned that. But after the initial noise calmed down — and especially after some other companies came out with a similar concept — the community now understands that the original concept of open source has to be fixed because it isn’t suitable anymore to the modern era where cloud companies use their monopoly power to adopt any successful open source project without contributing anything to it.”

Feb
07
2019
--

Microsoft Azure sets its sights on more analytics workloads

Enterprises now amass huge amounts of data, both from their own tools and applications, as well as from the SaaS applications they use. For a long time, that data was basically exhaust. Maybe it was stored for a while to fulfill some legal requirements, but then it was discarded. Now, data is what drives machine learning models, and the more data you have, the better. It’s maybe no surprise, then, that the big cloud vendors started investing in data warehouses and lakes early on. But that’s just a first step. After that, you also need the analytics tools to make all of this data useful.

Today, it’s Microsoft turn to shine the spotlight on its data analytics services. The actual news here is pretty straightforward. Two of these are services that are moving into general availability: the second generation of Azure Data Lake Storage for big data analytics workloads and Azure Data Explorer, a managed service that makes easier ad-hoc analysis of massive data volumes. Microsoft is also previewing a new feature in Azure Data Factory, its graphical no-code service for building data transformation. Data Factory now features the ability to map data flows.

Those individual news pieces are interesting if you are a user or are considering Azure for your big data workloads, but what’s maybe more important here is that Microsoft is trying to offer a comprehensive set of tools for managing and storing this data — and then using it for building analytics and AI services.

(Photo credit:Josh Edelson/AFP/Getty Images)

“AI is a top priority for every company around the globe,” Julia White, Microsoft’s corporate VP for Azure, told me. “And as we are working with our customers on AI, it becomes clear that their analytics often aren’t good enough for building an AI platform.” These companies are generating plenty of data, which then has to be pulled into analytics systems. She stressed that she couldn’t remember a customer conversation in recent months that didn’t focus on AI. “There is urgency to get to the AI dream,” White said, but the growth and variety of data presents a major challenge for many enterprises. “They thought this was a technology that was separate from their core systems. Now it’s expected for both customer-facing and line-of-business applications.”

Data Lake Storage helps with managing this variety of data since it can handle both structured and unstructured data (and is optimized for the Spark and Hadoop analytics engines). The service can ingest any kind of data — yet Microsoft still promises that it will be very fast. “The world of analytics tended to be defined by having to decide upfront and then building rigid structures around it to get the performance you wanted,” explained White. Data Lake Storage, on the other hand, wants to offer the best of both worlds.

Likewise, White argued that while many enterprises used to keep these services on their on-premises servers, many of them are still appliance-based. But she believes the cloud has now reached the point where the price/performance calculations are in its favor. It took a while to get to this point, though, and to convince enterprises. White noted that for the longest time, enterprises that looked at their analytics projects thought $300 million projects took forever, tied up lots of people and were frankly a bit scary. “But also, what we had to offer in the cloud hasn’t been amazing until some of the recent work,” she said. “We’ve been on a journey — as well as the other cloud vendors — and the price performance is now compelling.” And it sure helps that if enterprises want to meet their AI goals, they’ll now have to tackle these workloads, too.

Jan
24
2019
--

Microsoft acquires Citus Data

Microsoft today announced that it has acquired Citus Data, a company that focused on making PostgreSQL databases faster and more scalable. Citus’ open-source PostgreSQL extension essentially turns the application into a distributed database and, while there has been a lot of hype around the NoSQL movement and document stores, relational databases — and especially PostgreSQL — are still a growing market, in part because of tools from companies like Citus that overcome some of their earlier limitations.

Unsurprisingly, Microsoft plans to work with the Citus Data team to “accelerate the delivery of key, enterprise-ready features from Azure to PostgreSQL and enable critical PostgreSQL workloads to run on Azure with confidence.” The Citus co-founders echo this in their own statement, noting that “as part of Microsoft, we will stay focused on building an amazing database on top of PostgreSQL that gives our users the game-changing scale, performance, and resilience they need. We will continue to drive innovation in this space.”

PostgreSQL is obviously an open-source tool, and while the fact that Microsoft is now a major open-source contributor doesn’t come as a surprise anymore, it’s worth noting that the company stresses that it will continue to work with the PostgreSQL community. In an email, a Microsoft spokesperson also noted that “the acquisition is a proof point in the company’s commitment to open source and accelerating Azure PostgreSQL performance and scale.”

Current Citus customers include the likes of real-time analytics service Chartbeat, email security service Agari and PushOwl, though the company notes that it also counts a number of Fortune 100 companies among its users (they tend to stay anonymous). The company offers both a database as a service, an on-premises enterprise version and the free open-source edition. For the time being, it seems like that’s not changing, though over time I would suspect that Microsoft will transition users of the hosted service to Azure.

The price of the acquisition was not disclosed. Citus Data, which was founded in 2010 and graduated from the Y Combinator program, previously raised more than $13 million from the likes of Khosla Ventures, SV Angel and Data Collective.

Sep
20
2018
--

MariaDB acquires Clustrix

MariaDB, the company behind the eponymous MySQL drop-in replacement database, today announced that it has acquired Clustrix, which itself is a MySQL drop-in replacement database, but with a focus on scalability. MariaDB will integrate Clustrix’s technology into its own database, which will allow it to offer its users a more scalable database service in the long run.

That by itself would be an interesting development for the popular open source database company. But there’s another angle to this story, too. In addition to the acquisition, MariaDB also today announced that cloud computing company ServiceNow is investing in MariaDB, an investment that helped it get to today’s acquisition. ServiceNow doesn’t typically make investments, though it has made a few acquisitions. It is a very large MariaDB user, though, and it’s exactly the kind of customer that will benefit from the Clustrix acquisition.

MariaDB CEO Michael Howard tells me that ServiceNow current supports about 80,000 instances of MariaDB. With this investment (which is actually an add-on to MariaDB’s 2017 Series C round), ServiceNow’s SVP of Development and Operations Pat Casey will join MariaDB’s board.

Why would MariaDB acquire a company like Clustrix, though? When I asked Howard about the motivation, he noted that he’s now seeing more companies like ServiceNow that are looking at a more scalable way to run MariaDB. Howard noted that it would take years to build a new database engine from the ground up.

“You can hire a lot of smart people individually, but not necessarily have that experience built into their profile,” he said. “So that was important and then to have a jumpstart in relation to this market opportunity — this mandate from our market. It typically takes about nine years, to get a brand new, thorough database technology off the ground. It’s not like a SaaS application where you can get a front-end going in about a year or so.

Howard also stressed that the fact that the teams at Clustrix and MariaDB share the same vocabulary, given that they both work on similar problems and aim to be compatible with MySQL, made this a good fit.

While integrating the Clustrix database technology into MariaDB won’t be trivial, Howard stressed that the database was always built to accommodate external database storage engines. MariaDB will have to make some changes to its APIs to be ready for the clustering features of Clustrix. “It’s not going to be a 1-2-3 effort,” he said. “It’s going to be a heavy-duty effort for us to do this right. But everyone on the team wants to do it because it’s good for the company and our customers.

MariaDB did not disclose the price of the acquisition. Since it was founded in 2006, though, the Y Combinator-incubated Clustrix had raised just under $72 million, though. MariaDB has raised just under $100 million so far, so it’s probably a fair guess that Clustrix didn’t necessarily sell for a large multiple of that.

Aug
29
2018
--

Storage provider Cloudian raises $94M

Cloudian, a company that specializes in helping businesses store petabytes of data, today announced that it has raised a $94 million Series E funding round. Investors in this round, which is one of the largest we have seen for a storage vendor, include Digital Alpha, Fidelity Eight Roads, Goldman Sachs, INCJ, JPIC (Japan Post Investment Corporation), NTT DOCOMO Ventures and WS Investments. This round includes a $25 million investment from Digital Alpha, which was first announced earlier this year.

With this, the seven-year-old company has now raised a total of $174 million.

As the company told me, it now has about 160 employees and 240 enterprise customers. Cloudian has found its sweet spot in managing the large video archives of entertainment companies, but its customers also include healthcare companies, automobile manufacturers and Formula One teams.

What’s important to stress here is that Cloudian’s focus is on on-premise storage, not cloud storage, though it does offer support for multi-cloud data management, as well. “Data tends to be most effectively used close to where it is created and close to where it’s being used,” Cloudian VP of worldwide sales Jon Ash told me. “That’s because of latency, because of network traffic. You can almost always get better performance, better control over your data if it is being stored close to where it’s being used.” He also noted that it’s often costly and complex to move that data elsewhere, especially when you’re talking about the large amounts of information that Cloudian’s customers need to manage.

Unsurprisingly, companies that have this much data now want to use it for machine learning, too, so Cloudian is starting to get into this space, as well. As Cloudian CEO and co-founder Michael Tso also told me, companies are now aware that the data they pull in, whether from IoT sensors, cameras or medical imaging devices, will only become more valuable over time as they try to train their models. If they decide to throw the data away, they run the risk of having nothing with which to train their models.

Cloudian plans to use the new funding to expand its global sales and marketing efforts and increase its engineering team. “We have to invest in engineering and our core technology, as well,” Tso noted. “We have to innovate in new areas like AI.”

As Ash also stressed, Cloudian’s business is really data management — not just storage. “Data is coming from everywhere and it’s going everywhere,” he said. “The old-school storage platforms that were siloed just don’t work anywhere.”

Apr
21
2018
--

BigID lands in the right place at the right time with GDPR

Every startup needs a little skill and a little luck. BigID, a NYC-based data governance solution has been blessed with both. The company, which helps customers identify sensitive data in big data stores, launched at just about the same time that the EU announced the GDPR data privacy regulations. Today, the company is having trouble keeping up with the business.

While you can’t discount that timing element, you have to have a product that actually solves a problem and BigID appears to meet that criteria. “This how the market is changing by having and demanding more technology-based controls over how data is being used,” company CEO and co-founder Dimitri Sirota told TechCrunch.

Sirota’s company enables customers to identify the most sensitive data from among vast stores of data. In fact, he says some customers have hundreds of millions of users, but their unique advantage is having built the solution more recently. That provides a modern architecture that can scale to meet these big data requirements, while identifying the data that requires your attention in a way that legacy systems just aren’t prepared to do.

“When we first started talking about this [in 2016] people didn’t grok it. They didn’t understand why you would need a privacy-centric approach. Even after 2016 when GDPR passed, most people didn’t see this. [Today] we are seeing a secular change. The assets they collect are valuable, but also incredibly toxic,” he said. It is the responsibility of the data owner to identify and protect the personal data under their purview under the GDPR rules, and that creates a data double-edged sword because you don’t want to be fined for failing to comply.

GDPR is a set of data privacy regulations that are set to take effect in the European Union at the end of May. Companies have to comply with these rules or could face stiff fines. The thing is GDPR could be just the beginning. The company is seeing similar data privacy regulations in Canada, Australia, China and Japan. Something akin go this could also be coming to the United States after Facebook CEO, Mark Zuckerberg appeared before Congress earlier this month. At the very least we could see state-level privacy laws in the US, Sirota said.

Sirota says there are challenges getting funded as a NYC startup because there hadn’t been a strong big enterprise ecosystem in place until recently, but that’s changing. “Starting an enterprise company in New York is challenging. Ed Sim from Boldstart [A New York City early stage VC firm that invests in enterprise startups] has helped educate through investment and partnerships. More challenging, but it’s reaching a new level now,” he said.

The company launched in 2016 and has raised $16.1 million to date. It scored the bulk of that in a $14 million round at the end of January. Just this week at the RSAC Sandbox competition at the RSA Conference in San Francisco, BigID was named the Most Innovative Startup in a big recognition of the work they are doing around GDPR.

Mar
09
2018
--

InfoSum’s first product touts decentralized big data insights

 Nick Halstead’s new startup, InfoSum, is launching its first product today — moving one step closer to his founding vision of a data platform that can help businesses and organizations unlock insights from big data silos without compromising user privacy, data security or data protection law. So a pretty high bar then. Read More

Aug
31
2017
--

Facebook to open source LogDevice for storing logs from distributed data centers

 Facebook is planning to open source LogDevice, the company’s custom-built solution for storing logs collected from distributed data centers. The company made the announcement as part of its Scale conference. Read More

Aug
29
2017
--

Salesforce is using AI to democratize SQL so anyone can query databases in natural language

 SQL is about as easy as it gets in the world of programming, and yet its learning curve is still steep enough to prevent many people from interacting with relational databases. Salesforce’s AI research team took it upon itself to explore how machine learning might be able to open doors for those without knowledge of SQL. Read More

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com