Oct 15, 2019

Databricks brings its Delta Lake project to the Linux Foundation

Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to the Linux Foundation under an open governance model. The company announced the launch of Delta Lake earlier this year, and, even though it’s still a relatively new project, it has already been adopted by many organizations and has found backing from companies like Intel, Alibaba and Booz Allen Hamilton.

“In 2013, we had a small project where we added SQL to Spark at Databricks […] and donated it to the Apache Foundation,” Databricks CEO and co-founder Ali Ghodsi told me. “Over the years, slowly people have changed how they actually leverage Spark and only in the last year or so it really started to dawn upon us that there’s a new pattern that’s emerging and Spark is being used in a completely different way than maybe we had planned initially.”

This pattern, he said, is that companies are taking all of their data and putting it into data lakes and then doing a couple of things with this data, machine learning and data science being the obvious ones. But they are also doing things that are more traditionally associated with data warehouses, like business intelligence and reporting. The term Ghodsi uses for this kind of usage is “Lake House.” More and more, Databricks is seeing that Spark is being used for this purpose and not just to replace Hadoop and do ETL (extract, transform, load). “This kind of Lake House patterns we’ve seen emerge more and more and we wanted to double down on it.”

Spark 3.0, which is launching soon, enables more of these use cases and speeds them up significantly; it also introduces a new feature that lets you add a pluggable data catalog to Spark.
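
For readers curious what that looks like in practice, a minimal PySpark sketch follows; the catalog name and plugin class here are hypothetical, and the class would need to implement Spark 3.0's catalog plugin interface:

    from pyspark.sql import SparkSession

    # Register a custom catalog named "my_catalog"; the implementing class
    # is hypothetical and would come from a vendor or in-house plugin.
    spark = (
        SparkSession.builder
        .appName("catalog-demo")
        .config("spark.sql.catalog.my_catalog", "com.example.MyCatalogPlugin")
        .getOrCreate()
    )

    # Tables in the plugged-in catalog are then addressable by name.
    spark.sql("SHOW TABLES IN my_catalog.analytics").show()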

Delta Lake, Ghodsi said, is essentially the data layer of the Lake House pattern. It brings support for ACID transactions, scalable metadata handling and data versioning to data lakes, for example. All the data is stored in the Apache Parquet format, and users can enforce schemas (and change them with relative ease if necessary).
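
A short PySpark sketch makes these features concrete; the paths and column names are illustrative, and it assumes a Spark session with the open-source Delta Lake package available:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("delta-demo").getOrCreate()

    # Writes land as Parquet files plus a transaction log, giving ACID semantics.
    df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
    df.write.format("delta").save("/tmp/events")

    # Schema is enforced on append; evolving it requires an explicit opt-in.
    df2 = spark.createDataFrame([(3, "carol", "us")], ["id", "name", "region"])
    df2.write.format("delta").mode("append").option("mergeSchema", "true").save("/tmp/events")

    # Data versioning: query the table as it looked at an earlier version.
    v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/events")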

It’s interesting to see Databricks choose the Linux Foundation for this project, given that its roots are in the Apache Foundation. “We’re super excited to partner with them,” Ghodsi said about why the company chose the Linux Foundation. “They run the biggest projects on the planet, including the Linux project but also a lot of cloud projects. The cloud-native stuff is all in the Linux Foundation.”

“Bringing Delta Lake under the neutral home of the Linux Foundation will help the open-source community dependent on the project develop the technology addressing how big data is stored and processed, both on-prem and in the cloud,” said Michael Dolan, VP of Strategic Programs at the Linux Foundation. “The Linux Foundation helps open-source communities leverage an open governance model to enable broad industry contribution and consensus building, which will improve the state of the art for data storage and reliability.”

Oct 8, 2019

Nadella warns government conference not to betray user trust

Microsoft CEO Satya Nadella, delivering the keynote at the Microsoft Government Leaders Summit in Washington, DC today, had a message for attendees: maintain user trust in their tools and technologies above all else.

He said it is essential to earn user trust, regardless of your business. “Now, of course, the power law here is all around trust because one of the keys for us, as providers of platforms and tools, trust is everything,” he said today. But he says it doesn’t stop with the platform providers like Microsoft. Institutions using those tools also have to keep trust top of mind or risk alienating their users.

“That means you need to also ensure that there is trust in the technology that you adopt, and the technology that you create, and that’s what’s going to really define the power law on this equation. If you have trust, you will have exponential benefit. If you erode trust it will exponentially decay,” he said.

He says Microsoft sees trust along three dimensions: privacy, security and ethical use of artificial intelligence. All of these come together in his view to build a basis of trust with your customers.

Nadella said he sees privacy as a human right, pure and simple, and it’s up to vendors to ensure that privacy or lose the trust of their customers. “The investments around data governance is what’s going to define whether you’re serious about privacy or not,” he said. For Microsoft, that means looking at how transparent it is about how it uses data, its terms of service and how it uses technology to ensure all of that is being carried out at runtime.

He reiterated the call he made last year for a federal privacy law. With GDPR in Europe and California’s CCPA coming online in January, he sees a centralized federal law as a way to streamline regulations for business.

As for security, as you might expect, he defined it in terms of how Microsoft was implementing it, but the message was clear that you needed security as part of your approach to trust, regardless of how you implement that. He asked several key questions of attendees.

“Cyber is the second area where we not only have to do our work, but you have to [ask], what’s your operational security posture, how have you thought about having the best security technology deployed across the entire chain, whether it’s on the application side, the infrastructure side or on the endpoint side, and most importantly, around identity,” Nadella said.

The final piece, one which he said was just coming into play, was how you use artificial intelligence ethically, a sensitive topic for a government audience, but one he wasn’t afraid to broach. “One of the things people say is, ‘Oh, this AI thing is so unexplainable, especially deep learning.’ But guess what, you created that deep learning [model]. In fact, the data on top of which you train the model, the parameters and the number of parameters you use — a lot of things are in your control. So we should not abdicate our responsibility when creating AI,” he said.

Whether Microsoft or the U.S. government can adhere to these lofty goals is unclear, but Nadella was careful to outline them both for his company’s benefit and this particular audience. It’s up to both of them to follow through.

Oct 8, 2019

Satya Nadella looks to the future with edge computing

Speaking today at the Microsoft Government Leaders Summit in Washington, DC, Microsoft CEO Satya Nadella made the case for edge computing, even while pushing the Azure cloud as what he called “the world’s computer.”

While Amazon, Google and other competitors may have something to say about that, marketing hype aside, many companies are still in the midst of transitioning to the cloud. Nadella says the future of computing could actually be at the edge, where computing is done locally before data is then transferred to the cloud for AI and machine learning purposes. What goes around, comes around.

But as Nadella sees it, this is not going to be about either edge or cloud. It’s going to be the two technologies working in tandem. “Now, all this is being driven by this new tech paradigm that we describe as the intelligent cloud and the intelligent edge,” he said today.

He said that to truly understand the impact the edge is going to have on computing, you have to look at research, which predicts there will be 50 billion connected devices in the world by 2030, a number even he finds astonishing. “I mean this is pretty stunning. We think about a billion Windows machines or a couple of billion smartphones. This is 50 billion [devices], and that’s the scope,” he said.

The key here is that these 50 billion devices, whether you call them edge devices or the Internet of Things, will be generating tons of data. That means you will have to develop entirely new ways of thinking about how all this flows together. “The capacity at the edge, that ubiquity is going to be transformative in how we think about computation in any business process of ours,” he said. As we generate ever-increasing amounts of data, whether we are talking about public sector use cases or any business need, it’s going to be the fuel for artificial intelligence, and he sees the sheer amount of that data driving new AI use cases.

“Of course when you have that rich computational fabric, one of the things that you can do is create this new asset, which is data and AI. There is not going to be a single application, a single experience that you are going to build, that is not going to be driven by AI, and that means you have to really have the ability to reason over large amounts of data to create that AI,” he said.

Nadella would be more than happy to have his audience take care of all that using Microsoft products, whether Azure compute, database, AI tools or edge computers like the Data Box Edge it introduced in 2018. While Nadella is probably right about the future of computing, all of this could apply to any cloud, not just Microsoft’s.

As computing shifts to the edge, it’s going to have a profound impact on the way we think about technology in general, but it’s probably not going to involve being tied to a single vendor, regardless of how comprehensive their offerings may be.

Oct 3, 2019

T4 wants to transform market research data with a combination of AI and humans

When T4 co-founder and CEO Maks Khurgin was working at Bain and Company, he ran into a common problem for analysts looking for market data. He spent way too much time searching for it and felt there had to be a better way. He decided to build a centralized market data platform himself, and T4 was born. This week the company competes in the TechCrunch Disrupt SF Startup Battlefield.

What he created with the help of his long-time friend and CTO, Yev Spektor, was built on a couple of key components. The first is an industry classification system, a taxonomy that organizes markets by industries and sub-industries. Using search and aggregation tools powered by artificial intelligence, it scours the web looking for information sources that match their taxonomy labels.

As they researched the tool, the founders realized that the AI could only get them so far. There were always pieces that it missed. So they built a second part to provide a way for human indexers to fill in those missing parts to offer as comprehensive a list of sources as possible.

“AI alone cannot solve this problem. If we bring people into this and avoid the last mile delivery problem, then you can actually start organizing this information in a much better way than anyone else had ever done,” Khurgin explained.
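
T4 hasn’t published its internals, but the division of labor Khurgin describes can be sketched generically; every name below is hypothetical:

    from dataclasses import dataclass, field
    from typing import List

    # Hypothetical sketch only: each taxonomy node holds sources found by the
    # AI crawler, and human indexers add the sources the crawler missed.
    @dataclass
    class Source:
        title: str
        url: str
        is_paid: bool
        added_by: str  # "crawler" or "human"

    @dataclass
    class TaxonomyNode:
        label: str  # e.g. "Cybersecurity"
        sources: List[Source] = field(default_factory=list)
        children: List["TaxonomyNode"] = field(default_factory=list)

    node = TaxonomyNode("Cybersecurity")
    node.sources.append(Source("Endpoint Security Market 2019", "https://example.com/r1", True, "crawler"))
    # A human indexer fills in a report the automated search missed.
    node.sources.append(Source("SMB Security Survey", "https://example.com/r2", False, "human"))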

It seems simple enough, but it’s a problem that well-heeled companies like Bain have been trying to solve for years, and there was a lot of skepticism when Khurgin told his superiors he was leaving to build a product to solve this problem. “I had a partner at Bain and Company actually tell me, ‘You know, every consulting firm has tried to do something like this — and they failed. Why do you think you can do this?’”

He knew that figuring out the nature of the problem and why the other attempts had failed was the key to solving the puzzle. He decided to take the challenge, and on his 30th birthday, he quit his job at Bain and started T4 the next day — without a product yet, mind you.

This was not the first time he had left a high-paying job to try something unconventional. “Last time I left a high paying job, actually after undergrad, I was a commodities derivatives trader for a financial [services company]. I left that to pursue a lifelong dream of being in the Marine Corps,” Khurgin said.

T4 was probably a less risky proposition, but it still took the kind of leap of faith that only a startup founder who believes in his idea can understand. “I felt the problem first-hand, and the big realization that I had was that there is actually a finite amount of information out there. Market research is created by humans, and you don’t necessarily have to take a pure AI approach,” he said.

The product searches for all of the related information on a topic, finds all of the data related to a category and places it in an index. Users can search by topic and find all of the free and paid reports related to that search. The product shows which reports are free and which will cost you money, and like Google, you get a title and a brief summary.

The company is just getting started with five main market categories so far, including cloud computing, cybersecurity, networking, data centers and eSports. The founders plan to add additional categories over time, and have a bold goal for the future.

“Our long-term vision is that we become your one-stop shop to find market research in the same way that if you need to buy something, you go to Amazon, or you need financial data, you go on Bloomberg or Thomson. If you need market research, our vision is that T4 is the place that you go,” Khurgin said.


Sep 29, 2019

Why is Dropbox reinventing itself?

According to Dropbox CEO Drew Houston, 80% of the product’s users rely on it, at least partially, for work.

It makes sense, then, that the company is refocusing to try to cement its spot in the workplace; to shed its image as “just” a file storage company (in a time when just about every big company has its own cloud storage offering) and evolve into something more immutably core to daily operations.

Earlier this week, Dropbox announced that the “new Dropbox” would be rolling out to all users. It takes the simple, shared folders that Dropbox is known for and turns them into what the company calls “Spaces” — little mini collaboration hubs for your team, complete with comment streams, AI for highlighting files you might need mid-meeting, and integrations into things like Slack, Trello and G Suite. With an overhauled interface that brings much of Dropbox’s functionality out of the OS and into its own dedicated app, it’s by far the biggest user-facing change the product has seen since launching 12 years ago.

Shortly after the announcement, I sat down with Dropbox VP of Product Adam Nash and CTO Quentin Clark. We chatted about why the company is changing things up, why they’re building this on top of the existing Dropbox product, and the things they know they just can’t change.

You can find these interviews below, edited for brevity and clarity.

Greg Kumparak: Can you explain the new focus a bit?

Adam Nash: Sure! I think you know this already, but I run products and growth, so I’m gonna have a bit of a product bias to this whole thing. But Dropbox… one of its differentiating characteristics is really that when we built this utility, this “magic folder”, it kind of went everywhere.

Sep 25, 2019

QC Ware Forge will give developers access to quantum hardware and simulators across vendors

Quantum computing is almost ready for prime time, and, according to most experts, now is the time to start learning how to best develop for this new and less than intuitive technology. With multiple vendors like D-Wave, Google, IBM, Microsoft and Rigetti offering commercial and open-source hardware solutions, simulators and other tools, there’s already a lot of fragmentation in this business. QC Ware, which is launching its Forge cloud platform into beta today, wants to become the go-to middleman for accessing the quantum computing hardware and simulators of these vendors.

Forge, which like the rest of QC Ware’s efforts is aimed at enterprise users, will give developers the ability to run their algorithms on a variety of hardware platforms and simulators. The company argues that developers won’t need to have any previous expertise in quantum computing, though having a bit of background surely isn’t going to hurt. From Forge’s user interface, developers will be able to run algorithms for binary optimization, chemistry simulation and machine learning.
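
To ground the binary-optimization use case: such problems are typically posed as QUBOs (quadratic unconstrained binary optimization). The toy brute-force solver below stands in for the hardware and simulator backends Forge brokers; it is not Forge’s actual API:

    from itertools import product

    # A QUBO asks for the bitstring x minimizing sum of Q[i, j] * x[i] * x[j].
    # On a platform like Forge, the same problem definition would be routed to
    # an annealer, a gate-model machine or a simulator instead.
    Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}

    def solve_qubo_brute_force(Q, n_vars):
        return min(
            product([0, 1], repeat=n_vars),
            key=lambda x: sum(c * x[i] * x[j] for (i, j), c in Q.items()),
        )

    print(solve_qubo_brute_force(Q, 2))  # (0, 1): picking exactly one variable is optimal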

“Practical quantum advantage will occur. Most experts agree that it’s a matter of ‘when’ not ‘if.’ The way to pull that horizon closer is by having the user community fully engaged in quantum computing application discovery. The objective of Forge is to allow those users to access the full range of quantum computing resources through a single platform,” said Matt Johnson, CEO, QC Ware. “To assist our customers in that exploration, we are spending all of our cycles working on ways to squeeze as much power as possible out of near-term quantum computers, and to bake those methods into Forge.”

Currently, QC Ware Forge offers access to hardware from D-Wave, as well as open-source simulators running on Google’s and IBM’s clouds, with plans to support a wider variety of platforms in the near future.

Initially, QC Ware also told me that it offered direct access to IBM’s hardware, but that’s not yet the case. “We currently have the integration complete and actively utilized by QC Ware developers and quantum experts,” QC Ware’s head of business development Yianni Gamvros told me. “However, we are still working with IBM to put an agreement in place in order for our end-users to directly access IBM hardware. We expect that to be available in our next major release. For users, this makes it easier for them to deal with the churn. We expect different hardware vendors will lead at different times and that will keep changing every six months. And for our quantum computing hardware vendors, they have a channel partner they can sell through.”

Users who sign up for the beta will receive 30 days of access to the platform and one minute of actual quantum computing time to evaluate the platform.

Sep 20, 2019

Vianai emerges with $50M seed and a mission to simplify machine learning tech

You don’t see a startup get a $50 million seed round all that often, but such was the case with Vianai, an early-stage startup launched by Vishal Sikka, former Infosys managing director and SAP executive. The company launched recently with a big check and a vision to transform machine learning.

Just this week, the startup had a coming out party at Oracle Open World, where Sikka delivered one of the keynotes and demoed the product for attendees. Over the last couple of years, since he left Infosys, Sikka has been thinking about the impact of AI and machine learning on society and the way it is being delivered today. He didn’t much like what he saw.

It’s worth noting that Sikka got his PhD from Stanford with a specialty in AI in 1996, so this isn’t something that’s new to him. What’s changed, as he points out, is the growing compute power and increasing amounts of data, all fueling the current AI push inside business. What he saw when he began exploring how companies are implementing AI and machine learning today was a lot of complex tooling, which, in his view, was far more complex than it needed to be.

He saw dense Jupyter notebooks filled with code. He said that if you looked at a typical machine learning model, and stripped away all of the code, what you found was a series of mathematical expressions underlying the model. He had a vision of making that model-building more about the math, while building a highly visual data science platform from the ground up.
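
His point is easy to illustrate with a generic toy example (not Vianai’s product): stripped of framework scaffolding, a logistic-regression classifier reduces to a single expression, y_hat = sigmoid(w · x + b):

    import math

    # The "model", reduced to its underlying math: y_hat = sigmoid(w . x + b)
    def predict(w, b, x):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-z))

    print(predict(w=[0.8, -0.4], b=0.1, x=[1.0, 2.0]))  # ~0.52, a probability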

The company has been iterating on a solution over the last year with two core principles in mind: explorability and explainability, which involves interacting with the data and presenting it in a way that helps the user attain their goal faster than the current crop of model-building tools.

“It is about making the system reactive to what the user is doing, making it completely explorable, while making it possible for the developer to experiment with what’s happening in a way that is incredibly easy. To make it explainable means being able to go back and forth with the data and the model, using the model to understand the phenomenon that you’re trying to capture in the data,” Sikka told TechCrunch.

He says the tool isn’t just aimed at data scientists; it’s about business users and data scientists sitting down and iterating together to get the answers they are seeking, whether it’s finding a way to reduce user churn or discovering fraud. These models do not live in a data science vacuum. They all have a business purpose, and he believes the only way to be successful with AI in the enterprise is to have both business users and data scientists sitting together at the same table, working with the software to solve a specific problem while taking advantage of one another’s expertise.

For Sikka, this means refining the actual problem you are trying to solve. “AI is about problem solving, but before you do the problem solving, there is also a [challenge around] finding and articulating a business problem that is relevant to businesses and that has a value to the organization,” he said.

He is very clear that he isn’t looking to replace humans; instead he wants to use AI to augment human intelligence to solve actual human problems. He points out that this product is not automated machine learning (AutoML), which he considers a deeply flawed idea. “We are not here to automate the jobs of data science practitioners. We are here to augment them,” he said.

As for that massive seed round, Sikka knew it would take a big investment to build a vision like this, and with his reputation and connections, he felt it would be better to get one big investment up front so he could concentrate on building the product and the company. He says he was fortunate enough to have investors who believe in the vision, even though, as he says, no early business plan survives the test of reality. He didn’t name specific investors, referring only to friends and wealthy and famous people and institutions. A company spokesperson reiterated that they were not revealing a list of investors at this time.

For now, the company has a new product and plenty of money in the bank to get to profitability, which he states is his ultimate goal. Sikka could have taken a job running a large organization, but like many startup founders, he saw a problem, and he had an idea how to solve it. That was a challenge he couldn’t resist pursuing.

Sep 19, 2019

Quilt Data launches from stealth with free portal to access petabytes of public data

Quilt Data’s founders, Kevin Moore and Aneesh Karve, have been hard at work for the last four years building a platform to search for data quickly across vast repositories on AWS S3 storage. The idea is to give data scientists a way to find data in S3 buckets, then package that data in forms that a business can use. Today, the company launched out of stealth with a free data search portal that not only proves what they can do, but also provides valuable access to 3.7 petabytes of public data across 23 S3 repositories.

The public data repository includes publicly available Amazon review data along with satellite images and other high-value public information. The product works like any search engine, where you enter a query, but instead of searching the web or an enterprise repository, it finds the results in S3 storage on AWS.

The results not only include the data you are looking for, they also include all of the information around the data, such as Jupyter notebooks, the standard workspace that data scientists use to build machine learning models. Data scientists can then use this as the basis for building their own machine learning models.
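
As a rough sketch of that workflow, here is what browsing and fetching a package looks like with Quilt’s open-source quilt3 Python client; the bucket and package names are illustrative, and the exact calls may have changed since:

    import quilt3

    # Browse a versioned data package in an S3-backed registry (names illustrative).
    pkg = quilt3.Package.browse("examples/sales", registry="s3://my-company-bucket")

    print(pkg.meta)  # package-level metadata travels with the data

    # Materialize a single file locally to feed into a notebook or model.
    pkg["reviews.parquet"].fetch("./reviews.parquet")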

The public data, which includes more than 10 billion objects, is a resource that data scientists should greatly appreciate, but Quilt Data is offering access to this data out of more than pure altruism. It’s doing so because it wants to show what the platform is capable of, and in the process it hopes to get companies to use the commercial version of the product.

Quilt Data search results with data about the data found (Image: Quilt Data)

Customers can try Quilt Data for free or subscribe to the product in the Amazon Marketplace. The company charges a flat rate of $550 per month for each S3 bucket. It also offers an enterprise version with priority support, custom features and education and on-boarding for $999 per month for each S3 bucket.

The company was founded in 2015 and was a member of the Y Combinator Summer 2017 cohort. It has received $4.2 million in seed money so far from Y Combinator, Vertex Ventures, Fuel Capital and Streamlined Ventures, along with other unnamed investors.

Sep 17, 2019

Boston-based DataRobot raises $206M Series E to bring AI to enterprise

Artificial intelligence is playing an increasingly large role in enterprise software, and Boston’s DataRobot has been helping companies build, manage and deploy machine learning models for some time now. Today, the company announced a $206 million Series E investment led by Sapphire Ventures.

Other participants in this round included new investors Tiger Global Management, World Innovation Lab, Alliance Bernstein PCI, and EDBI along with existing investors DFJ Growth, Geodesic Capital, Intel Capital, Sands Capital, NEA and Meritech.

Today’s investment brings the total raised to $431 million, according to the company. It has a pre-money valuation of $1 billion, according to PitchBook. DataRobot would not confirm this number.

The company has been catching the attention of these investors by offering a machine learning platform aimed at analysts, developers and data scientists that helps them build predictive models much more quickly than traditional methods allow. Once a model is built, the company provides a way to deliver it in the form of an API, simplifying deployment.
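
DataRobot’s actual routes and payloads aren’t covered here, so the snippet below is only a generic sketch of the “model as an API” pattern, with a hypothetical endpoint and token:

    import requests

    # Hypothetical endpoint and token; real deployment URLs and payloads differ.
    url = "https://example-datarobot-host.com/deployments/abc123/predictions"
    rows = [{"age": 42, "visits_last_30d": 7}]

    resp = requests.post(url, json=rows, headers={"Authorization": "Bearer <api-token>"})
    print(resp.json())  # predictions scored by the hosted model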

The late-stage startup plans to use the money to continue building out its product line, while looking for acquisition opportunities where it makes sense. The company also announced the availability of a new product today, DataRobot MLOps, a tool to manage, monitor and deploy machine learning models across a large organization.

The company, which was founded in 2012, claims it has had triple-digit recurring revenue growth dating back to 2015, as well as one billion models built on the platform to date. Customers contributing to that number include a broad range of companies such as Humana, United Airlines, Harvard Business School and Deloitte.

Sep 11, 2019

Explorium reveals $19.1M in total funding for machine learning data discovery platform

Explorium, a data discovery platform for machine learning models, received a couple of unannounced funding rounds over the last year — a $3.6 million seed round last September and a $15.5 million Series A round in March. Today, it made both of these rounds public.

The seed round was led by Emerge with participation from F2 Capital. The Series A was led by Zeev Ventures with participation from the seed investors. The total raised is $19.1 million.

The company founders, who have a data science background, found that it was problematic to find the right data to build a machine learning model. Like most good startup founders confronted with a problem, they decided to solve it themselves by building a data discovery platform for data scientists.

CEO and co-founder Maor Shlomo says the company wanted to focus on the quality of the data because not much work has been done there. “A lot of work has been invested on the algorithmic part of machine learning, but the algorithms themselves have very much become commodities. The challenge now is really finding the right data to feed into those algorithms,” Shlomo told TechCrunch.

It’s a hard problem to solve, so they built a kind of search engine that can go out and find the best data wherever it happens to live, whether that’s internally, in an open data set, in public data or in premium databases. The company has partnered with thousands of data sources, according to Shlomo, to help data scientist customers find the best data for their particular model.

“We developed a new type of search engine that’s capable of looking at the customer’s data, connecting and enriching it with literally thousands of data sources, while automatically selecting what are the best pieces of data, and what are the best variables or features, which could actually generate the best performing machine learning model,” he explained.
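
Explorium’s engine is proprietary, but the enrich-then-select loop Shlomo describes can be illustrated generically with pandas and scikit-learn; the data and column names are invented:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectFromModel

    customers = pd.DataFrame({"zip": ["94103", "10001", "60601", "73301"],
                              "churned": [0, 1, 0, 1]})
    external = pd.DataFrame({"zip": ["94103", "10001", "60601", "73301"],
                             "median_income": [112000, 86000, 95000, 78000],
                             "population": [46000, 21000, 33000, 15000]})

    # Step 1: enrich the customer's data by joining external sources on a shared key.
    enriched = customers.merge(external, on="zip")

    # Step 2: keep only the enriched variables that actually help the model.
    X, y = enriched[["median_income", "population"]], enriched["churned"]
    selector = SelectFromModel(RandomForestClassifier(n_estimators=50, random_state=0)).fit(X, y)
    print(X.columns[selector.get_support()].tolist())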

Shlomo sees a big role for partnerships, whether that involves data sources or consulting firms, who can help push Explorium into more companies.

Explorium has 63 employees spread across offices in Tel Aviv, Kiev and San Francisco. It’s still early days, but Shlomo reports “tens of customers.” As more customers try to bring data science to their companies, especially with a shortage of data scientists, having a tool like Explorium could help fill that gap.
