Sep 15, 2020

Data virtualization service Varada raises $12M

Varada, a Tel Aviv-based startup that focuses on making it easier for businesses to query data across services, today announced that it has raised a $12 million Series A round led by Israeli early-stage fund MizMaa Ventures, with participation by Gefen Capital.

“If you look at the storage aspect for big data, there’s always innovation, but we can put a lot of data in one place,” Varada CEO and co-founder Eran Vanounou told me. “But translating data into insight? It’s so hard. It’s costly. It’s slow. It’s complicated.”

That’s a lesson he learned during his time as CTO of LivePerson, which he described as a classic big data company. And just like at LivePerson, where the team had to reinvent the wheel to solve its data problems, again and again, every company — and not just the large enterprises — now struggles with managing their data and getting insights out of it, Vanounou argued.

[Image: Varada architecture diagram. Image Credits: Varada]

The rest of the founding team, David Krakov, Roman Vainbrand and Tal Ben-Moshe, already had a lot of experience in dealing with these problems, too, with Ben-Moshe having served as the chief software architect of Dell EMC’s XtremIO flash array unit, for example. They built the system for indexing big data that’s at the core of Varada’s platform (with the open-source Presto SQL query engine being one of the other cornerstones).


Essentially, Varada embraces the idea of data lakes and enriches it with its indexing capabilities. Those indexing capabilities are where Varada’s smarts can be found. As Vanounou explained, the company uses a machine learning system to understand when users tend to run certain workloads, and then caches the relevant data ahead of time, making the system far faster than its competitors.

“If you think about big organizations and think about the workloads and the queries, what happens during the morning time is different from evening time. What happened yesterday is not what happened today. What happened on a rainy day is not what happened on a shiny day. […] We listen to what’s going on and we optimize. We leverage the indexing technology. We index what is needed when it is needed.”
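For illustration only, here is a minimal Python sketch of the workload-aware idea Vanounou describes: mine historical query logs for the columns that are hot at a given hour and prepare those ahead of time. This is not Varada’s actual system; all names and data are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical query log: (timestamp, columns scanned by that query).
query_log = [
    (datetime(2020, 9, 14, 9, 5),   ["customer_id", "order_total"]),
    (datetime(2020, 9, 14, 9, 40),  ["customer_id", "region"]),
    (datetime(2020, 9, 14, 21, 10), ["clickstream_url"]),
    (datetime(2020, 9, 15, 9, 12),  ["customer_id", "order_total"]),
]

def hot_columns(log, hour, top_n=2):
    """Columns scanned most often during a given hour of the day."""
    counts = Counter(
        col for ts, cols in log if ts.hour == hour for col in cols
    )
    return [col for col, _ in counts.most_common(top_n)]

# Before the morning workload hits, index/cache what history says will be queried.
print(hot_columns(query_log, hour=9))  # ['customer_id', 'order_total']
```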

That helps speed up queries, but it also means less data has to be replicated, which also brings down the cost. As MizMaa’s Aaron Applbaum noted, since Varada is not a SaaS solution, the buyers still get all of the discounts from their cloud providers, too.

In addition, the system can allocate resources intelligently so that different users can tap into different amounts of bandwidth. You can tell it to give customers more bandwidth than your financial analysts, for example.

“Data is growing like crazy: in volume, in scale, in complexity, in who requires it and what the business intelligence uses are, what the API uses are,” Applbaum said when I asked him why he decided to invest. “And compute is getting slightly cheaper, but not really, and storage is getting cheaper. So if you can make the trade-off to store more stuff, and access things more intelligently, more quickly, more agile — that was the basis of our thesis, as long as you can do it without compromising performance.”

Varada, with its team of experienced executives, architects and engineers, ticked a lot of MizMaa’s boxes in this regard, but Applbaum also noted that unlike some other Israeli startups, the team understood that it had to listen to customers and understand their needs, too.

“In Israel, you have a history — and it’s become less and less the case — but historically, there’s a joke that it’s ‘ready, fire, aim.’ You build a technology, you’ve got this beautiful thing and you’re like, ‘alright, we did it,’ but without listening to the needs of the customer,” he explained.

The Varada team is not afraid to compare itself to Snowflake, which at least at first glance seems to make similar promises. Vanounou praised the company for opening up the data warehousing market and proving that people are willing to pay for good analytics. But he argues that Varada’s approach is fundamentally different.

“We embrace the data lake. So if you are Mr. Customer, your data is your data. We’re not going to take it, move it, copy it. This is your single source of truth,” he said. And in addition, the data can stay in the company’s virtual private cloud. He also argues that Varada isn’t so much focused on the business users but the technologists inside a company.

 

Aug 6, 2020

Mode raises $33M to supercharge its analytics platform for data scientists

Data science is the name of the game these days for companies that want to improve their decision making by tapping the information they are already amassing in their apps and other systems. And today, a startup called Mode Analytics, which has built a platform incorporating machine learning, business intelligence and big data analytics to help data scientists fulfill that task, is announcing $33 million in funding to continue making its platform ever more sophisticated.

Most recently, for example, the company has started to introduce tools (including SQL and Python tutorials) for less technical users, specifically those in product teams, so that they can structure queries that data scientists can subsequently execute faster and with more complete responses — important for the many follow-up questions that arise when a business intelligence process has been run. Mode claims that its tools can help produce answers to data queries in minutes.
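As a generic sketch of that SQL-then-Python workflow (not Mode’s product or API; the in-memory SQLite table is a hypothetical stand-in for a real warehouse), a notebook might run an analyst’s query and then answer the follow-up questions in Python:

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'EMEA', 120.0), (2, 'AMER', 80.0), (3, 'EMEA', 45.5);
""")

# Step 1: the SQL query an analyst (or a product manager following a tutorial) writes.
revenue = pd.read_sql_query(
    "SELECT region, SUM(amount) AS revenue FROM orders GROUP BY region", conn
)

# Step 2: the follow-up questions are answered in Python on the result set.
revenue["share"] = revenue["revenue"] / revenue["revenue"].sum()
print(revenue.sort_values("share", ascending=False))
```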

This Series D is being led by SaaS specialist investor H.I.G. Growth Partners, with previous investors Valor Equity Partners, Foundation Capital, REV Venture Partners and Switch Ventures all participating. Valor led Mode’s Series C in February 2019, while Foundation and REV respectively led its A and B rounds.

Mode is not disclosing its valuation, but co-founder and CEO Derek Steer confirmed in an interview that it was “absolutely” an up-round.

For some context, PitchBook notes that last year its valuation was $106 million. The company now has a customer list that it says covers 52% of the Forbes 500, including Anheuser-Busch, Zillow, Lyft, Bloomberg, Capital One, VMware and Conde Nast. It says that to date it has processed 830 million query runs and 170 million notebook cell runs for 300,000 users. (Pricing is based on a freemium model, with a free “Studio” tier and Business and Enterprise tiers priced based on size and use.)

Mode has been around since 2013, when it was co-founded by Steer, Benn Stancil (Mode’s current president) and Josh Ferguson (initially the CTO and now chief architect).

Steer said the impetus for the startup came out of gaps in the market that the three had found through years of experience at other companies.

Specifically, when all three were working together at Yammer (they were early employees and stayed on after the Microsoft acquisition), they were part of a larger team building custom data analytics tools for Yammer. At the time, Steer said Yammer was paying $1 million per year to subscribe to Vertica (acquired by HP in 2011) to run it.

They saw an opportunity to build a platform that could provide similar kinds of tools — encompassing things like SQL Editors, Notebooks and reporting tools and dashboards — to a wider set of users.

“We and other companies like Facebook and Google were building analytics internally,” Steer recalled, “and we knew that the world wanted to work more like these tech companies. That’s why we started Mode.”

All the same, he added, “people were not clear exactly about what a data scientist even was.”

Indeed, Mode’s growth so far has mirrored the rise of data science overall: both the discipline itself and the business case for employing data scientists to help figure out what is “going on” beyond the day-to-day by tapping all the data that’s being amassed in the process of just doing business. That means Mode’s addressable market has also been growing.

But even as the trove of potential buyers of Mode’s products has been growing, so has the competition. There has been a big swing in data science and big data analytics in the last several years, with a number of tech companies building tools to help those who are less technical “become data scientists” by introducing more intuitive interfaces like drag-and-drop features and natural language queries.

They include the likes of Sisense (which has been growing its analytics power with acquisitions like Periscope Data), Eigen (focusing on specific verticals like financial and legal queries), Looker (acquired by Google) and Tableau (acquired by Salesforce).

Mode’s approach up to now has been closer to that of another competitor, Alteryx, focusing on building tools that are still aimed primarily at helping data scientists themselves. You have any number of database tools on the market today, Steer noted, “Snowflake, Redshift, BigQuery, Databricks, take your pick.” The key now is in providing tools to those using those databases to do their work faster and better.

That pitch, and how well the company executes on it, is what has won Mode traction with both customers and investors.

“Mode goes beyond traditional Business Intelligence by making data faster, more flexible and more customized,” said Scott Hilleboe, managing director, H.I.G. Growth Partners, in a statement. “The Mode data platform speeds up answers to complex business problems and makes the process more collaborative, so that everyone can build on the work of data analysts. We believe the company’s innovations in data analytics uniquely position it to take the lead in the Decision Science marketplace.”

Steer said that fundraising was planned long before the coronavirus outbreak to start in February, which meant that it was timed as badly as it could have been. Mode still raised what it wanted to in a couple of months — “a good raise by any standard,” he noted — even if it’s likely that the valuation suffered a bit in the process. “Pitching while the stock market is tanking was terrifying and not something I would repeat,” he added.

Given how many acquisitions there have been in this space, Steer confirmed that Mode too has been approached a number of times, but it’s staying put for now. (And no, he wouldn’t tell me who has been knocking, except to say that it’s large companies for whom analytics is an “adjacency” to bigger businesses, which is to say, the very large tech companies have approached Mode.)

“The reason we haven’t considered any acquisition offers is because there is just so much room,” Steer said. “I feel like this market is just getting started, and I would only consider an exit if I felt like we were handicapped by being on our own. But I think we have a lot more growing to do.”

Jul 23, 2020

Quantexa raises $64.7M to bring big data intelligence to risk analysis and investigations

The wider field of cybersecurity — not just defending networks, but identifying fraudulent activity — has seen a big boost in activity in the last few months, and that’s no surprise. The global health pandemic has led to more interactions and transactions moving online, and the contractions we’re feeling across the economy and society have led some to take more desperate and illegal actions, using digital channels to do it.

Today, a U.K. company called Quantexa — which has built a machine learning platform branded “Contextual Decision Intelligence” (CDI) that analyses disparate data points to get better insight into nefarious activity, as well as to (more productively) build better profiles of a company’s entire customer base — is raising a growth round of funding to address that opportunity.

The London-based startup has picked up $64.7 million, a Series C it will be using to continue building out both its tools and the use cases for applying them, as well as expanding geographically, specifically in North America, Asia-Pacific and more European territories.

The mission, said Vishal Marria, Quantexa’s founder and CEO, is to “connect the dots to make better business decisions.”

The startup built its business on the back of doing work for major banks and others in the financial services sector, and Marria added that the plan will be to continue enhancing tools for that vertical while also expanding into two growing opportunities: working with insurance and government/public sector organizations.

The backers in this round speak to how Quantexa positions itself in the market, and the traction it’s seen to date for its business. It’s being led by Evolution Equity Partners — a VC that specialises in innovative cybersecurity startups — with participation also from previous backers Dawn Capital, AlbionVC, HSBC and Accenture, as well as new backers ABN AMRO Ventures. HSBC, Accenture and ABN AMRO are all strategic investors working directly with the startup in their businesses.

Altogether, Quantexa has “thousands of users” across 70+ countries, it said, with large enterprise customers including Standard Chartered, OFX and Dun & Bradstreet.

The company has now raised some $90 million to date, and reliable sources close to the company tell us that the valuation is “well north” of $250 million — which to me sounds like it’s between $250 million and $300 million.

Marria said in an interview that he initially got the idea for Quantexa — which I believe may be a creative portmanteau of “quantum” and “context” — when he was working as an executive director at Ernst & Young and saw “many challenges with investigations” in the financial services industry.

“Is this a money launderer?” is the basic question that investigators aim to answer, but they were going about it, “using just a sliver of information,” he said. “I thought to myself, this is bonkers. There must be a better way.”

That better way, as built by Quantexa, is to solve it in the classic approach of tapping big data and building AI algorithms that help, in Marria’s words, connect the dots.

Typically, for example, an investigation needs to do significantly more than just track the activity of one individual or one shell company; you need to seek out the most unlikely connections between a number of actions in order to build up an accurate picture. When you think about it, trying to identify, track, shut down and catch a large money launderer (a typical use case for Quantexa’s software) is a classic big data problem.
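To make that concrete, here is a toy entity-resolution sketch in Python: records that share an identifier are collapsed into a single entity via union-find. It is purely illustrative and far simpler than Quantexa’s platform; every record is invented.

```python
from collections import defaultdict

# Toy account records; some refer to the same real-world entity.
records = [
    {"id": "acct-1", "name": "J. Smith",   "phone": "555-0101", "address": "1 High St"},
    {"id": "acct-2", "name": "John Smith", "phone": "555-0101", "address": "9 Elm Rd"},
    {"id": "acct-3", "name": "ACME Ltd",   "phone": "555-0199", "address": "9 Elm Rd"},
    {"id": "acct-4", "name": "B. Jones",   "phone": "555-0777", "address": "4 Oak Ave"},
]

# Union-find over record ids.
parent = {r["id"]: r["id"] for r in records}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

# Link any two records that share a phone number or an address.
for key in ("phone", "address"):
    by_value = defaultdict(list)
    for r in records:
        by_value[r[key]].append(r["id"])
    for ids in by_value.values():
        for other in ids[1:]:
            union(ids[0], other)

# Group records by their resolved entity.
entities = defaultdict(list)
for r in records:
    entities[find(r["id"])].append(r["id"])
print(list(entities.values()))  # acct-1/2/3 collapse into one entity; acct-4 stands alone
```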

While there is a lot of attention these days on data protection and security breaches that leak sensitive customer information, Quantexa’s approach, Marria said, is to sell software, not ingest proprietary data into its engine to provide insights. He said that these days deployments typically either are done on premises or within private clouds, rather than using public cloud infrastructure, and that when Quantexa provides data to complement its customers’ data, it comes from publicly available sources (for example, Companies House filings in the U.K.).

There are a number of companies offering services in the same general area as Quantexa. They include those that present themselves more as business intelligence platforms that help detect fraud (such as Looker) through to those that are secretive and present themselves as AI businesses working behind the scenes for enterprises and governments to solve tough challenges, such as Palantir, through to others focusing specifically on some of the use cases for the technology, such as ComplyAdvantage and its focus on financial fraud detection.

Marria says that it has a few key differentiators from these. First is how its software works at scale: “It comes back to entity resolution that [calculations] can be done in real time and at batch,” he said. “And this is a platform, software that is easily deployed and configured at a much lower total cost of ownership. It is tech and that’s quite important in the current climate.”

And that is what has resonated with investors.

“Quantexa’s proprietary platform heralds a new generation of decision intelligence technology that uses a single contextual view of customers to profoundly improve operational decision making and overcome big data challenges,” said Richard Seewald, founding and managing partner of Evolution, in a statement. “Its impressive rapid growth, renowned client base and potential to build further value across so many sectors make Quantexa a fantastic partner whose team I look forward to working with.” Seewald is joining the board with this round.

Jul 2, 2020

SEC filing indicates big data provider Palantir is raising $961M, $550M of it already secured

Palantir, the sometimes controversial, but always secretive, big data and analytics provider that works with governments and other public and private organizations to power national security, health and a variety of other services, has reportedly been eyeing a public listing this autumn. But in the meantime it’s also continuing to push ahead in the private markets.

The company has filed a Form D — its first in four years — indicating that it is in the process of raising nearly $1 billion — $961,099,010, to be exact — with $549,727,437 of that already sold, and a further $411,371,573 remaining to be raised.

(A Reuters report from June confirmed that Palantir had closed funding from two strategic investors that both work with the company: $500 million from Japanese insurance company Sompo Holdings, and $50 million from Fujitsu. Together, it seems like these might account for $550 million noted as already sold on the Form D.)

The Form D also notes that 58 investors are already attached to the offering, and that “of the total remaining to be sold, all but $671,576.25 represents shares of common stock already subscribed for.” This means that Palantir has already secured commitments for nearly all of the remaining part of the $961 million raise, although the offering has not closed.
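The arithmetic behind those figures is straightforward (amounts as reported above):

```python
# Form D figures, in US dollars, as reported above.
total_offering = 961_099_010
already_sold = 549_727_437
remaining = 411_371_573
assert already_sold + remaining == total_offering

# Of the remainder, all but $671,576.25 is already subscribed for.
unsubscribed = 671_576.25
effectively_committed = already_sold + (remaining - unsubscribed)
print(f"${effectively_committed:,.2f} of ${total_offering:,} committed")
```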

Palantir declined to comment on the filing, except to note that this is related to primary investments, not secondary stakes.

It’s not clear if this latest fundraise, as detailed in the Form D, signals a delay to a public listing, or if the intention is to complement it.

The filing also appears to confirm a report from September 2019 that Palantir was seeking to raise between $1 billion and $3 billion, its first fundraising in four years.

That report noted Palantir was targeting a $26 billion valuation, up from $20 billion four years ago. The Reuters article in June put its valuation based on secondary market trades at between $10 billion and $14 billion.

To date, Palantir has raised at least $3.3 billion in funding, according to PitchBook, which names no fewer than 108 investors on its cap table.

The PitchBook data (some of which is behind a paywall) also indicates that Palantir has raised a number of previous rounds of undisclosed amounts.

Palantir was last valued at $20 billion when it raised money four years ago, but several data points suggest a bigger valuation today.

While the coronavirus pandemic has all but halted the IPO market, we are starting to see some movement again, and Palantir’s own business activity points to what might be a strong candidate to usher in more activity.

In April, according to a Bloomberg report, the company briefed investors with documents showing that it expects to make $1 billion in revenues this year, up 38% on 2019, and to break even for the first time since being founded 16 years ago by Peter Thiel, Nathan Gettings, Joe Lonsdale, Stephen Cohen and current CEO Alex Karp.

(The Bloomberg report didn’t explain why Palantir was briefing investors, whether for a potential public listing, or for the fundraise we’re reporting on here, or something else.)

On top of that, the company has been in the news a lot around the global novel coronavirus pandemic.

Specifically, it’s been winning business, in the form of projects in major markets like the U.K. (where it’s part of a consortium of companies working with the NHS on a COVID-19 data trove) and the U.S. (where it’s been working on a COVID-19 tracker for the federal government and a project with the CDC), and possibly others. Those projects will presumably need a lot of upfront capital to set up and run, alongside other business deals that Palantir has been securing — possibly one reason it is raising money now.

Updated throughout, including with response from Palantir.

Apr 2, 2020

Collibra nabs another $112.5M at a $2.3B valuation for its big data management platform

GDPR and other data protection and privacy regulations — as well as a significant (and growing) number of data breaches and exposés of companies’ privacy policies — have put a spotlight on not just the vast troves of data that businesses and other organizations hold on us, but also how they handle it. Today, one of the companies helping them cope with that data in a better and legal way is announcing a huge round of funding to continue that work. Collibra, which provides tools to manage, warehouse, store and analyse data troves, is today announcing that it has raised $112.5 million in funding, at a post-money valuation of $2.3 billion.

The funding — a Series F, from the looks of it — represents a big bump for the startup, which last year raised $100 million at a valuation of just over $1 billion. This latest round was co-led by ICONIQ Capital, Index Ventures, and Durable Capital Partners LP, with previous investors CapitalG (Google’s growth fund), Battery Ventures, and Dawn Capital also participating.

Collibra was originally a spin-out from the Vrije Universiteit in Brussels, Belgium and today it works with some 450 enterprises and other large organizations. Customers include Adobe, Verizon (which owns TechCrunch), insurer AXA and a number of healthcare providers. Its products cover a range of services focused around company data, including tools to help customers comply with local data protection policies and store it securely, and tools (and plug-ins) to run analytics and more.

These are all features and products that have long had a place in enterprise big data IT, but they have become increasingly used and in demand as data policies have expanded, security has become more of an issue, and the prospects of what can be discovered through big data analytics have become more advanced.

With that growth, many companies have realised that they are not in a position to use and store their data in the best possible way, and that is where companies like Collibra step in.

“Most large organizations are in data chaos,” Felix Van de Maele, co-founder and CEO, previously told us. “We help them understand what data they have, where they store it and [understand] whether they are allowed to use it.”
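A toy sketch of the question Van de Maele describes (what data exists, where it lives, and whether a given use is permitted) might look like the following. It is illustrative only, not Collibra’s product; the catalog entries are hypothetical.

```python
# A toy data-catalog lookup: what data exists, where it lives, and whether a given
# use is permitted. Purely illustrative; the entries below are hypothetical.
catalog = {
    "customer_emails": {
        "location": "s3://crm-bucket/emails/",
        "contains_pii": True,
        "allowed_uses": {"support", "billing"},
    },
    "web_clickstream": {
        "location": "s3://analytics-bucket/clicks/",
        "contains_pii": False,
        "allowed_uses": {"support", "billing", "marketing"},
    },
}

def may_use(dataset: str, purpose: str) -> bool:
    """Return True only if the catalog explicitly permits this use of the dataset."""
    return purpose in catalog[dataset]["allowed_uses"]

print(may_use("customer_emails", "marketing"))  # False: not cleared for marketing
print(may_use("web_clickstream", "marketing"))  # True
```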

As you would expect with a big IT trend, Collibra is not the only company chasing this opportunity. Competitors include Informatica, IBM, Talend, and Egnyte, among a number of others, but the market position of Collibra, and its advanced technology, is what has continued to impress investors.

“Durable Capital Partners invests in innovative companies that have significant potential to shape growing industries and build larger companies,” said Henry Ellenbogen, founder and chief investment officer for Durable Capital Partners LP, in a statement (Ellenbogen was formerly an investment manager at T. Rowe Price, and this is his first investment in Collibra under Durable). “We believe Collibra is a leader in the Data Intelligence category, a space that could have a tremendous impact on global business operations and a space that we expect will continue to grow as data becomes an increasingly critical asset.”

“We have a high degree of conviction in Collibra and the importance of the company’s mission to help organizations benefit from their data,” added Matt Jacobson, general partner at ICONIQ Capital and Collibra board member, in his own statement. “There is an increasing urgency for enterprises to harness their data for strategic business decisions. Collibra empowers organizations to use their data to make critical business decisions, especially in uncertain business environments.”

Mar 10, 2020

BackboneAI scores $4.7M seed to bring order to intercompany data sharing

BackboneAI, an early-stage startup that wants to help companies dealing with lots of data, particularly coming from a variety of external sources, announced a $4.7 million seed investment today.

The round was led by Fika Ventures with participation from Boldstart Ventures, Dynamo Ventures, GGV Capital, MetaProp, Spider VC and several other unnamed investors.

Company founder Rob Bailey says he has spent a lot of time in his career watching how data flows in organizations. There are still a myriad of challenges related to moving data between organizations, and that’s what his company is trying to solve. “BackboneAI is an AI platform specifically built for automating data flows within and between companies,” he said.

This could involve any number of scenarios, from keeping large, complex data catalogues up to date, to coordinating the intricate flow of construction materials between companies, to content rights management across the entertainment industry.

Bailey says that he spent 18 months talking to companies before he built the product. “What we found is that every company we talked to was, in some way or another, concerned about an absolute flood of data from all these different applications and from all the companies that they’re working with externally,” he explained.

The BackboneAI platform aims to solve a number of problems related to this. For starters, it automates the acquisition of this data, usually from third parties like suppliers, customers, regulatory agencies and so forth. Then it handles ingestion of the data, and finally it takes care of much of the actual processing of that external data, mapping it to internal systems like the company’s ERP.

As an example, he uses an industrial supply company that may deal with a million SKUs across a couple of dozen divisions. Trying to track that with manual or even legacy systems is difficult. “They take all this product data in [from external suppliers], and then process the information in their own [internal] product catalog, and then finally present that data about those products to hundreds of thousands of customers. It’s an incredibly large and challenging data problem as you’re processing millions and millions of SKUs and orders, and you have to keep that data current on a regular basis,” he explained.
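As a hypothetical illustration of that external-to-internal mapping (a sketch, not BackboneAI’s actual pipeline), a supplier’s SKU feed can be joined against an internal catalog, with unmatched rows surfaced for review:

```python
import pandas as pd

# Hypothetical supplier feed and internal catalog.
supplier_feed = pd.DataFrame({
    "supplier_sku": ["AB-100", "AB-200", "ZZ-999"],
    "description":  ["M6 hex bolt", "M8 hex bolt", "Unknown widget"],
    "unit_price":   [0.12, 0.18, 4.50],
})

internal_catalog = pd.DataFrame({
    "supplier_sku": ["AB-100", "AB-200"],
    "internal_sku": ["BOLT-M6-001", "BOLT-M8-001"],
})

# Map incoming supplier SKUs onto internal SKUs; unmatched rows surface for review.
merged = supplier_feed.merge(internal_catalog, on="supplier_sku", how="left")
unmatched = merged[merged["internal_sku"].isna()]
print(merged)
print(f"{len(unmatched)} SKU(s) need manual mapping")
```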

The company is just getting started. It spent 2019 incubating inside Boldstart Ventures. Today the company has close to 20 employees in New York City, and it has signed its first Fortune 500 customer. Bailey says they have 15 additional Fortune 500 companies in the pipeline. With the seed money, he hopes to build on this initial success.

Feb 27, 2020

London-based Gyana raises $3.9M for a no-code approach to data science

Coding and other computer science expertise remain some of the more important skills that a person can have in the working world today, but in the last few years, we have also seen a big rise in a new generation of tools providing an alternative way of reaping the fruits of technology: “no-code” software, which lets anyone — technical or non-technical — build apps, games, AI-based chatbots, and other products that used to be the exclusive terrain of engineers and computer scientists.

Today, one of the newer startups in the category — London-based Gyana, which lets non-technical people run data science analytics on any structured dataset — is announcing a round of £3 million to fuel its next stage of growth.

The round was led by U.K. firm Fuel Ventures, with other investors including Biz Stone of Twitter, Green Shores Capital and U+I, and it brings the total raised by the startup to $6.8 million since being founded in 2015.

Gyana (Sanskrit for “knowledge”) was co-founded by Joyeeta Das and David Kell, who were both pursuing post-graduate degrees at Oxford: Das, a former engineer, was getting an MBA, and Kell was doing a Ph.D. in physics.

Das said the idea of building this tool came out of the fact that the pair could see a big disconnect emerging not just in their studies, but also in the world at large — not so much a digital divide as a digital light year, in terms of the distance between those who know how to work in the realm of data science and those who don’t.

“Everyone talks about using data to inform decision making, and the world becoming data-driven, but actually that proposition is available to less than one percent of the world,” she said.

Out of that, the pair decided to work on building a platform that Das describes as a way to empower “citizen data scientists”: users upload any structured data set (for example, a .CSV file) and run a series of queries on it to visualise trends and other insights more easily.

While the longer term goal may be for any person to be able to produce an analytical insight out of a long list of numbers, the more practical and immediate application has been in enterprise services and building tools for non-technical knowledge workers to make better, data-driven decisions.

To prove out its software, the startup first built an app based on the platform that it calls Neera (Sanskrit for “water”), which specifically parses footfall and other “human movement” metrics, useful for applications in retail, real estate and civic planning — for example, to determine how well certain retail locations are performing, to measure footfall in popular locations, to decide where to place or remove stores, or to price a piece of property.
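Under the hood, that kind of footfall query reduces to a simple aggregation. The sketch below is a generic illustration with made-up data, not Gyana’s implementation:

```python
import io
import pandas as pd

# Made-up footfall data, standing in for a user's uploaded CSV.
csv_upload = io.StringIO(
    "date,store,footfall\n"
    "2020-02-01,Soho,420\n"
    "2020-02-01,Camden,310\n"
    "2020-02-02,Soho,465\n"
    "2020-02-02,Camden,298\n"
)
df = pd.read_csv(csv_upload, parse_dates=["date"])

# A point-and-click "trend" in the UI is essentially a group-by behind the scenes.
daily_by_store = df.groupby(["store", "date"])["footfall"].sum().unstack("date")
print(daily_by_store)
```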

Starting out with the aim of serving mid-market and smaller companies — those most likely not to have in-house data scientists to meet their business needs — the startup has already picked up a series of customers that are actually quite a lot bigger than that. They include Vodafone, Barclays, EY, Pret a Manger, Knight Frank and the UK Ministry of Defence. It says it has some £1 million in contracts with these firms currently.

That, in turn, has served as the trigger to raise this latest round of funding and to launch Vayu (Sanskrit for “air”) — a more general purpose app that covers a wider set of parameters that can be applied to a dataset. So far, it has been adopted by academic researchers, financial services employees, and others that use analysis in their work, Das said.

With both Vayu and Neera, the aim — refreshingly — is to make the whole experience as privacy-friendly as possible, Das noted. Currently, you download an app if you want to use Gyana, and you keep your data local as you work on it. Gyana has no “anonymization” and no retention of data in its processes, except things like analytics around where your cursor hovers, so that Gyana knows how it can improve its product.

“There are always ways to reverse engineer these things,” Das said of anonymization. “We just wanted to make sure that we are not accidentally creating a situation where, despite learning from anonymised materials, you can reverse engineer what people are analysing. We are just not convinced.”

While there is something commendable about building and shipping a tool with a lot of potential to it, Gyana runs the risk of facing what I think of as the “water, water everywhere” problem. Sometimes if a person really has no experience or specific aim, it can be hard to think of how to get started when you can do anything. Das said they have also identified this, and so while currently Gyana already offers some tutorials and helper tools within the app to nudge the user along, the plan is to eventually bring in a large variety of datasets for people to get started with, and also to develop a more intuitive way to “read” the basics of the files in order to figure out what kinds of data inquiries a person is most likely to want to make.

The rise of “no-code” software has been a swift one in the world of tech, spanning the proliferation of startups, big acquisitions and large funding rounds. Companies like Airtable and DashDash are aimed at building analytics leaning on interfaces that follow the basic design of a spreadsheet; AppSheet, a no-code mobile app building platform, was recently acquired by Google; and Roblox (for building games without needing to code) and Unqork (for app development) have both raised significant funding just this week. In the area of no-code data analytics and visualisation, there are biggies like Tableau, as well as Trifacta, RapidMiner and more.

Gartner predicts that by 2024, some 65% of all app development will be made on low- or no-code platforms, and Forrester estimates that the no- and low-code market will be worth some $10 billion this year, rising to $21.2 billion by 2024.

That represents a big business opportunity for the likes of Gyana, which has been unique in using the no-code approach specifically to tackle the area of data science.

However, in the spirit of citizen data scientists, the intention is to keep a consumer version of the apps free to use as it works on signing up enterprise users with more enhanced paid products, which will be priced on an annual license basis (currently clients are paying between $6,000 and $12,000 depending on usage, she said).

“We want to do free for as long as we can,” Das said, both in relation to the data tools and the datasets that it will offer to users. “The biggest value add is not about accessing premium data that is hard to get. We are not a data marketplace but we want to provide data that makes sense to access,” adding that even with business users, “we’d like you to do 90% of what you want to do without paying for anything.”

Oct 15, 2019

Databricks brings its Delta Lake project to the Linux Foundation

Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to the Linux Foundation under an open governance model. The company announced the launch of Delta Lake earlier this year, and, even though it’s still a relatively new project, it has already been adopted by many organizations and has found backing from companies like Intel, Alibaba and Booz Allen Hamilton.

“In 2013, we had a small project where we added SQL to Spark at Databricks […] and donated it to the Apache Foundation,” Databricks CEO and co-founder Ali Ghodsi told me. “Over the years, slowly people have changed how they actually leverage Spark and only in the last year or so it really started to dawn upon us that there’s a new pattern that’s emerging and Spark is being used in a completely different way than maybe we had planned initially.”

This pattern, he said, is that companies are taking all of their data and putting it into data lakes and then doing a couple of things with this data, machine learning and data science being the obvious ones. But they are also doing things that are more traditionally associated with data warehouses, like business intelligence and reporting. The term Ghodsi uses for this kind of usage is “Lake House.” More and more, Databricks is seeing that Spark is being used for this purpose and not just to replace Hadoop and doing ETL (extract, transform, load). “This kind of Lake House patterns we’ve seen emerge more and more and we wanted to double down on it.”

Spark 3.0, which is launching soon, enables more of these use cases and speeds them up significantly, in addition to the launch of a new feature that lets you add a pluggable data catalog to Spark.

Delta Lake, Ghodsi said, is essentially the data layer of the Lake House pattern. It brings support for ACID transactions to data lakes, scalable metadata handling and data versioning, for example. All the data is stored in the Apache Parquet format and users can enforce schemas (and change them with relative ease if necessary).
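A minimal PySpark sketch of those features (transactional writes, Parquet storage, versioned reads) might look like this, assuming a Spark session launched with the open-source Delta Lake package available, for example via --packages io.delta:delta-core; the path and data are illustrative.

```python
from pyspark.sql import SparkSession

# Assumes Spark was started with the Delta Lake package on the classpath, e.g.
#   spark-submit --packages io.delta:delta-core_2.12:<version> ...
# The /tmp path and the toy data are illustrative.
spark = SparkSession.builder.appName("delta-sketch").getOrCreate()

events = spark.createDataFrame([(1, "click"), (2, "purchase")], ["user_id", "event"])

# Each write is an ACID transaction; data lands as Parquet files plus a _delta_log.
events.write.format("delta").mode("overwrite").save("/tmp/events_delta")

# Appends go through the same transaction log; a mismatched schema would be rejected.
spark.createDataFrame([(3, "click")], ["user_id", "event"]) \
    .write.format("delta").mode("append").save("/tmp/events_delta")

# Data versioning: read the table as it existed at an earlier version ("time travel").
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/events_delta")
print(v0.count())  # 2 rows in the first committed version
```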

It’s interesting to see Databricks choose the Linux Foundation for this project, given that its roots are in the Apache Foundation. “We’re super excited to partner with them,” Ghodsi said about why the company chose the Linux Foundation. “They run the biggest projects on the planet, including the Linux project but also a lot of cloud projects. The cloud-native stuff is all in the Linux Foundation.”

“Bringing Delta Lake under the neutral home of the Linux Foundation will help the open-source community dependent on the project develop the technology addressing how big data is stored and processed, both on-prem and in the cloud,” said Michael Dolan, VP of Strategic Programs at the Linux Foundation. “The Linux Foundation helps open-source communities leverage an open governance model to enable broad industry contribution and consensus building, which will improve the state of the art for data storage and reliability.”

Oct 8, 2019

Satya Nadella looks to the future with edge computing

Speaking today at the Microsoft Government Leaders Summit in Washington, DC, Microsoft CEO Satya Nadella made the case for edge computing, even while pushing the Azure cloud as what he called “the world’s computer.”

While Amazon, Google and other competitors may have something to say about that, marketing hype aside, many companies are still in the midst of transitioning to the cloud. Nadella says the future of computing could actually be at the edge, where computing is done locally before data is then transferred to the cloud for AI and machine learning purposes. What goes around, comes around.

But as Nadella sees it, this is not going to be about either edge or cloud. It’s going to be the two technologies working in tandem. “Now, all this is being driven by this new tech paradigm that we describe as the intelligent cloud and the intelligent edge,” he said today.

He said that to truly understand the impact the edge is going to have on computing, you have to look at research, which predicts there will be 50 billion connected devices in the world by 2030, a number even he finds astonishing. “I mean this is pretty stunning. We think about a billion Windows machines or a couple of billion smartphones. This is 50 billion [devices], and that’s the scope,” he said.

The key here is that these 50 billion devices, whether you call them edge devices or the Internet of Things, will be generating tons of data. That means you will have to develop entirely new ways of thinking about how all this flows together. “The capacity at the edge, that ubiquity is going to be transformative in how we think about computation in any business process of ours,” he said. As we generate ever-increasing amounts of data, whether we are talking about public sector use cases or any business need, it’s going to be the fuel for artificial intelligence, and he sees the sheer amount of that data driving new AI use cases.

“Of course when you have that rich computational fabric, one of the things that you can do is create this new asset, which is data and AI. There is not going to be a single application, a single experience that you are going to build, that is not going to be driven by AI, and that means you have to really have the ability to reason over large amounts of data to create that AI,” he said.

Nadella would be more than happy to have his audience take care of all that using Microsoft products, whether Azure compute, database, AI tools or edge computers like the Data Box Edge it introduced in 2018. While Nadella is probably right about the future of computing, all of this could apply to any cloud, not just Microsoft.

As computing shifts to the edge, it’s going to have a profound impact on the way we think about technology in general, but it’s probably not going to involve being tied to a single vendor, regardless of how comprehensive their offerings may be.

Sep 17, 2019

Data storage company Cloudian launches a new edge analytics subsidiary called Edgematrix

Cloudian, a company that enables businesses to store and manage massive amounts of data, announced today the launch of Edgematrix, a new unit focused on edge analytics for large data sets. Edgematrix, a majority-owned subsidiary of Cloudian, will first be available in Japan, where both companies are based. It has raised a $9 million Series A from strategic investors NTT Docomo, Shimizu Corporation and Japan Post Capital, as well as Cloudian co-founder and CEO Michael Tso and board director Jonathan Epstein. The funding will be used on product development, deployment and sales and marketing.

Cloudian itself has raised a total of $174 million, including a $94 million Series E round announced last year. Its products include the Hyperstore platform, which allows businesses to store hundreds of petabytes of data on premise, and software for data analytics and machine learning. Edgematrix uses Hyperstore for storing large-scale data sets and its own AI software and hardware for data processing at the “edge” of networks, closer to where data is collected from IoT devices like sensors.

The company’s solutions were created for situations where real-time analytics is necessary. For example, it can be used to detect the make, model and year of cars on highways so targeted billboard ads can be displayed to their drivers.

Tso told TechCrunch in an email that Edgematrix was launched after Cloudian co-founder and president Hiroshi Ohta and a team spent two years working on technology to help Cloudian customers process and analyze their data more efficiently.

“With more and more data being created at the edge, including IoT data, there’s a growing need for being able to apply real-time data analysis and decision-making at or near the edge, minimizing the transmission costs and latencies involved in moving the data elsewhere,” said Tso. “Based on the initial success of a small Cloudian team developing AI software solutions and attracting a number of top-tier customers, we decided that the best way to build on this success was establishing a subsidiary with strategic investors.”

Edgematrix is launching in Japan first because spending on AI systems there is expected to grow faster than in any other market, at a compound annual growth rate of 45.3% from 2018 to 2023, according to IDC.

“Japan has been ahead of the curve as an early adopter of AI technology, with both the government and private sector viewing it as essential to boosting productivity,” said Tso. “Edgematrix will focus on the Japanese market for at least the next year, and assuming that all goes well, it would then expand to North America and Europe.”
