May
18
2021
--

Google Cloud launches Vertex AI, a new managed machine learning platform

At Google I/O today Google Cloud announced Vertex AI, a new managed machine learning platform that is meant to make it easier for developers to deploy and maintain their AI models. It’s a bit of an odd announcement at I/O, which tends to focus on mobile and web developers and doesn’t traditionally feature a lot of Google Cloud news, but the fact that Google decided to announce Vertex today goes to show how important it thinks this new service is for a wide range of developers.

The launch of Vertex is the result of quite a bit of introspection by the Google Cloud team. “Machine learning in the enterprise is in crisis, in my view,” Craig Wiley, the director of product management for Google Cloud’s AI Platform, told me. “As someone who has worked in that space for a number of years, if you look at the Harvard Business Review or analyst reviews, or what have you — every single one of them comes out saying that the vast majority of companies are either investing or are interested in investing in machine learning and are not getting value from it. That has to change. It has to change.”

Image Credits: Google

Wiley, who was also the general manager of AWS’s SageMaker AI service from 2016 to 2018 before coming to Google in 2019, noted that Google and others who were able to make machine learning work for themselves saw how it can have a transformational impact, but he also noted that the way the big clouds started offering these services was by launching dozens of services, “many of which were dead ends,” according to him (including some of Google’s own). “Ultimately, our goal with Vertex is to reduce the time to ROI for these enterprises, to make sure that they can not just build a model but get real value from the models they’re building.”

Vertex then is meant to be a very flexible platform that allows developers and data scientist across skill levels to quickly train models. Google says it takes about 80% fewer lines of code to train a model versus some of its competitors, for example, and then help them manage the entire lifecycle of these models.

Image Credits: Google

The service is also integrated with Vizier, Google’s AI optimizer that can automatically tune hyperparameters in machine learning models. This greatly reduces the time it takes to tune a model and allows engineers to run more experiments and do so faster.

Vertex also offers a “Feature Store” that helps its users serve, share and reuse the machine learning features and Vertex Experiments to help them accelerate the deployment of their models into producing with faster model selection.

Deployment is backed by a continuous monitoring service and Vertex Pipelines, a rebrand of Google Cloud’s AI Platform Pipelines that helps teams manage the workflows involved in preparing and analyzing data for the models, train them, evaluate them and deploy them to production.

To give a wide variety of developers the right entry points, the service provides three interfaces: a drag-and-drop tool, notebooks for advanced users and — and this may be a bit of a surprise — BigQuery ML, Google’s tool for using standard SQL queries to create and execute machine learning models in its BigQuery data warehouse.

We had two guiding lights while building Vertex AI: get data scientists and engineers out of the orchestration weeds, and create an industry-wide shift that would make everyone get serious about moving AI out of pilot purgatory and into full-scale production,” said Andrew Moore, vice president and general manager of Cloud AI and Industry Solutions at Google Cloud. “We are very proud of what we came up with in this platform, as it enables serious deployments for a new generation of AI that will empower data scientists and engineers to do fulfilling and creative work.”

Dec
08
2020
--

AWS expands on SageMaker capabilities with end-to-end features for machine learning

Nearly three years after it was first launched, Amazon Web Services’ SageMaker platform has gotten a significant upgrade in the form of new features, making it easier for developers to automate and scale each step of the process to build new automation and machine learning capabilities, the company said.

As machine learning moves into the mainstream, business units across organizations will find applications for automation, and AWS is trying to make the development of those bespoke applications easier for its customers.

“One of the best parts of having such a widely adopted service like SageMaker is that we get lots of customer suggestions which fuel our next set of deliverables,” said AWS vice president of machine learning, Swami Sivasubramanian. “Today, we are announcing a set of tools for Amazon SageMaker that makes it much easier for developers to build end-to-end machine learning pipelines to prepare, build, train, explain, inspect, monitor, debug and run custom machine learning models with greater visibility, explainability and automation at scale.”

Already companies like 3M, ADP, AstraZeneca, Avis, Bayer, Capital One, Cerner, Domino’s Pizza, Fidelity Investments, Lenovo, Lyft, T-Mobile and Thomson Reuters are using SageMaker tools in their own operations, according to AWS.

The company’s new products include Amazon SageMaker Data Wrangler, which the company said was providing a way to normalize data from disparate sources so the data is consistently easy to use. Data Wrangler can also ease the process of grouping disparate data sources into features to highlight certain types of data. The Data Wrangler tool contains more than 300 built-in data transformers that can help customers normalize, transform and combine features without having to write any code.

Amazon also unveiled the Feature Store, which allows customers to create repositories that make it easier to store, update, retrieve and share machine learning features for training and inference.

Another new tool that Amazon Web Services touted was Pipelines, its workflow management and automation toolkit. The Pipelines tech is designed to provide orchestration and automation features not dissimilar from traditional programming. Using pipelines, developers can define each step of an end-to-end machine learning workflow, the company said in a statement. Developers can use the tools to re-run an end-to-end workflow from SageMaker Studio using the same settings to get the same model every time, or they can re-run the workflow with new data to update their models.

To address the longstanding issues with data bias in artificial intelligence and machine learning models, Amazon launched SageMaker Clarify. First announced today, this tool allegedly provides bias detection across the machine learning workflow, so developers can build with an eye toward better transparency on how models were set up. There are open-source tools that can do these tests, Amazon acknowledged, but the tools are manual and require a lot of lifting from developers, according to the company.

Other products designed to simplify the machine learning application development process include SageMaker Debugger, which enables developers to train models faster by monitoring system resource utilization and alerting developers to potential bottlenecks; Distributed Training, which makes it possible to train large, complex, deep learning models faster than current approaches by automatically splitting data across multiple GPUs to accelerate training times; and SageMaker Edge Manager, a machine learning model management tool for edge devices, which allows developers to optimize, secure, monitor and manage models deployed on fleets of edge devices.

Last but not least, Amazon unveiled SageMaker JumpStart, which provides developers with a searchable interface to find algorithms and sample notebooks so they can get started on their machine learning journey. The company said it would give developers new to machine learning the option to select several pre-built machine learning solutions and deploy them into SageMaker environments.

Dec
08
2020
--

AWS announces SageMaker Clarify to help reduce bias in machine learning models

As companies rely increasingly on machine learning models to run their businesses, it’s imperative to include anti-bias measures to ensure these models are not making false or misleading assumptions. Today at AWS re:Invent, AWS introduced Amazon SageMaker Clarify to help reduce bias in machine learning models.

“We are launching Amazon SageMaker Clarify. And what that does is it allows you to have insight into your data and models throughout your machine learning lifecycle,” Bratin Saha, Amazon VP and general manager of machine learning told TechCrunch.

He says that it is designed to analyze the data for bias before you start data prep, so you can find these kinds of problems before you even start building your model.

“Once I have my training data set, I can [look at things like if I have] an equal number of various classes, like do I have equal numbers of males and females or do I have equal numbers of other kinds of classes, and we have a set of several metrics that you can use for the statistical analysis so you get real insight into easier data set balance,” Saha explained.

After you build your model, you can run SageMaker Clarify again to look for similar factors that might have crept into your model as you built it. “So you start off by doing statistical bias analysis on your data, and then post training you can again do analysis on the model,” he said.

There are multiple types of bias that can enter a model due to the background of the data scientists building the model, the nature of the data and how they data scientists interpret that data through the model they built. While this can be problematic in general it can also lead to racial stereotypes being extended to algorithms. As an example, facial recognition systems have proven quite accurate at identifying white faces, but much less so when it comes to recognizing people of color.

It may be difficult to identify these kinds of biases with software as it often has to do with team makeup and other factors outside the purview of a software analysis tool, but Saha says they are trying to make that software approach as comprehensive as possible.

“If you look at SageMaker Clarify it gives you data bias analysis, it gives you model bias analysis, it gives you model explainability it gives you per inference explainability it gives you a global explainability,” Saha said.

Saha says that Amazon is aware of the bias problem and that is why it created this tool to help, but he recognizes that this tool alone won’t eliminate all of the bias issues that can crop up in machine learning models, and they offer other ways to help too.

“We are also working with our customers in various ways. So we have documentation, best practices, and we point our customers to how to be able to architect their systems and work with the system so they get the desired results,” he said.

SageMaker Clarify is available starting to day in multiple regions.

Apr
21
2020
--

AWS and Facebook launch an open-source model server for PyTorch

AWS and Facebook today announced two new open-source projects around PyTorch, the popular open-source machine learning framework. The first of these is TorchServe, a model-serving framework for PyTorch that will make it easier for developers to put their models into production. The other is TorchElastic, a library that makes it easier for developers to build fault-tolerant training jobs on Kubernetes clusters, including AWS’s EC2 spot instances and Elastic Kubernetes Service.

In many ways, the two companies are taking what they have learned from running their own machine learning systems at scale and are putting this into the project. For AWS, that’s mostly SageMaker, the company’s machine learning platform, but as Bratin Saha, AWS VP and GM for Machine Learning Services, told me, the work on PyTorch was mostly motivated by requests from the community. And while there are obviously other model servers like TensorFlow Serving and the Multi Model Server available today, Saha argues that it would be hard to optimize those for PyTorch.

“If we tried to take some other model server, we would not be able to quote optimize it as much, as well as create it within the nuances of how PyTorch developers like to see this,” he said. AWS has lots of experience in running its own model servers for SageMaker that can handle multiple frameworks, but the community was asking for a model server that was tailored toward how they work. That also meant adapting the server’s API to what PyTorch developers expect from their framework of choice, for example.

As Saha told me, the server that AWS and Facebook are now launching as open source is similar to what AWS is using internally. “It’s quite close,” he said. “We actually started with what we had internally for one of our model servers and then put it out to the community, worked closely with Facebook, to iterate and get feedback — and then modified it so it’s quite close.”

Bill Jia, Facebook’s VP of AI Infrastructure, also told me, he’s very happy about how his team and the community has pushed PyTorch forward in recent years. “If you look at the entire industry community — a large number of researchers and enterprise users are using AWS,” he said. “And then we figured out if we can collaborate with AWS and push PyTorch together, then Facebook and AWS can get a lot of benefits, but more so, all the users can get a lot of benefits from PyTorch. That’s our reason for why we wanted to collaborate with AWS.”

As for TorchElastic, the focus here is on allowing developers to create training systems that can work on large distributed Kubernetes clusters where you might want to use cheaper spot instances. Those are preemptible, though, so your system has to be able to handle that, while traditionally, machine learning training frameworks often expect a system where the number of instances stays the same throughout the process. That, too, is something AWS originally built for SageMaker. There, it’s fully managed by AWS, though, so developers never have to think about it. For developers who want more control over their dynamic training systems or to stay very close to the metal, TorchElastic now allows them to recreate this experience on their own Kubernetes clusters.

AWS has a bit of a reputation when it comes to open source and its engagement with the open-source community. In this case, though, it’s nice to see AWS lead the way to bring some of its own work on building model servers, for example, to the PyTorch community. In the machine learning ecosystem, that’s very much expected, and Saha stressed that AWS has long engaged with the community as one of the main contributors to MXNet and through its contributions to projects like Jupyter, TensorFlow and libraries like NumPy.

Dec
02
2019
--

New Amazon tool simplifies delivery of containerized machine learning models

As part of the flurry of announcements coming this week out of AWS re:Invent, Amazon announced the release of Amazon SageMaker Operators for Kubernetes, a way for data scientists and developers to simplify training, tuning and deploying containerized machine learning models.

Packaging machine learning models in containers can help put them to work inside organizations faster, but getting there often requires a lot of extra management to make it all work. Amazon SageMaker Operators for Kubernetes is supposed to make it easier to run and manage those containers, the underlying infrastructure needed to run the models and the workflows associated with all of it.

“While Kubernetes gives customers control and portability, running ML workloads on a Kubernetes cluster brings unique challenges. For example, the underlying infrastructure requires additional management such as optimizing for utilization, cost and performance; complying with appropriate security and regulatory requirements; and ensuring high availability and reliability,” AWS’ Aditya Bindal wrote in a blog post introducing the new feature.

When you combine that with the workflows associated with delivering a machine learning model inside an organization at scale, it becomes part of a much bigger delivery pipeline, one that is challenging to manage across departments and a variety of resource requirements.

This is precisely what Amazon SageMaker Operators for Kubernetes has been designed to help DevOps teams do. “Amazon SageMaker Operators for Kubernetes bridges this gap, and customers are now spared all the heavy lifting of integrating their Amazon SageMaker and Kubernetes workflows. Starting today, customers using Kubernetes can make a simple call to Amazon SageMaker, a modular and fully-managed service that makes it easier to build, train, and deploy machine learning (ML) models at scale,” Bindal wrote.

The promise of Kubernetes is that it can orchestrate the delivery of containers at the right moment, but if you haven’t automated delivery of the underlying infrastructure, you can over (or under) provision and not provide the correct amount of resources required to run the job. That’s where this new tool, combined with SageMaker, can help.

“With workflows in Amazon SageMaker, compute resources are pre-configured and optimized, only provisioned when requested, scaled as needed, and shut down automatically when jobs complete, offering near 100% utilization,” Bindal wrote.

Amazon SageMaker Operators for Kubernetes are available today in select AWS regions.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com