Jan 26, 2021

Run:AI raises $30M Series B for its AI compute platform

Run:AI, a Tel Aviv-based company that helps businesses orchestrate and optimize their AI compute infrastructure, today announced that it has raised a $30 million Series B round. The new round was led by Insight Partners, with participation from existing investors TLV Partners and S Capital. This brings the company’s total funding to date to $43 million.

At the core of Run:AI’s platform is the ability to effectively virtualize and orchestrate AI workloads on top of its Kubernetes-based scheduler. GPUs have traditionally been hard to virtualize, so even as demand for training AI models has increased, many physical GPUs often sat idle for long periods because it was hard to dynamically allocate them between projects.
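
Run:AI hasn’t published the internals of its scheduler, but the basic idea of pooling GPUs and handing them out dynamically to queued jobs can be sketched in a few lines of Python. This is an illustrative toy only (the class, job names and greedy FIFO policy are assumptions for the sake of the example), not Run:AI’s actual algorithm:

```python
# Illustrative toy only -- not Run:AI's scheduler. It shows why pooling helps:
# GPUs idled by one team's finished job can go to whichever job is waiting next.
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    gpus_needed: int


class GpuPool:
    def __init__(self, total_gpus: int):
        self.free = total_gpus
        self.queue = deque()          # jobs waiting for GPUs
        self.running = {}             # job name -> GPUs currently held

    def submit(self, job: Job) -> None:
        self.queue.append(job)
        self._schedule()

    def finish(self, name: str) -> None:
        # Return the job's GPUs to the shared pool and try to start waiting jobs.
        self.free += self.running.pop(name)
        self._schedule()

    def _schedule(self) -> None:
        # Greedy FIFO scheduling: start any queued job that fits in the free pool.
        while self.queue and self.queue[0].gpus_needed <= self.free:
            job = self.queue.popleft()
            self.free -= job.gpus_needed
            self.running[job.name] = job.gpus_needed


pool = GpuPool(total_gpus=8)
pool.submit(Job("team-a-training", 6))
pool.submit(Job("team-b-experiment", 2))   # starts immediately on the leftover GPUs
pool.finish("team-a-training")             # frees 6 GPUs for the next queued job
```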


The promise behind Run:AI’s platform is that it lets users abstract away the underlying AI infrastructure and pool all of their GPU resources, whether those sit in the cloud or on-premises. This also makes it easier for businesses to share these resources between users and teams. In the process, IT teams also get better insights into how their compute resources are being used.

“Every enterprise is either already rearchitecting themselves to be built around learning systems powered by AI, or they should be,” said Lonne Jaffe, managing director at Insight Partners and now a board member at Run:AI. “Just as virtualization and then container technology transformed CPU-based workloads over the last decades, Run:AI is bringing orchestration and virtualization technology to AI chipsets such as GPUs, dramatically accelerating both AI training and inference. The system also future-proofs deep learning workloads, allowing them to inherit the power of the latest hardware with less rework. In Run:AI, we’ve found disruptive technology, an experienced team and a SaaS-based market strategy that will help enterprises deploy the AI they’ll need to stay competitive.”

Run:AI says that it is currently working with customers in a wide variety of industries, including automotive, finance, defense, manufacturing and healthcare. These customers, the company says, are seeing their GPU utilization increase from 25% to 75% on average.

“The new funds enable Run:AI to grow the company in two important areas: first, to triple the size of our development team this year,” the company’s CEO Omri Geller told me. “We have an aggressive roadmap for building out the truly innovative parts of our product vision — particularly around virtualizing AI workloads — a bigger team will help speed up development in this area. Second, a round this size enables us to quickly expand sales and marketing to additional industries and markets.”

Nov 02, 2020

AWS launches its next-gen GPU instances

AWS today announced the launch of its newest GPU-equipped instances. Dubbed P4d, these new instances are launching a decade after AWS launched its first set of Cluster GPU instances. This new generation is powered by Intel Cascade Lake processors and eight of Nvidia’s A100 Tensor Core GPUs. These instances, AWS promises, offer up to 2.5x the deep learning performance of the previous generation — and training a comparable model should be about 60% cheaper with these new instances.


For now, there is only one size available: the p4d.24xlarge, in AWS parlance. Its eight A100 GPUs are connected over Nvidia’s NVLink communication interface and also support the company’s GPUDirect interface.

With 320 GB of high-bandwidth GPU memory and 400 Gbps networking, this is obviously a very powerful machine. Add to that the 96 CPU cores, 1.1 TB of system memory and 8 TB of SSD storage, and it’s perhaps no surprise that the on-demand price is $32.77 per hour (though that price drops to less than $20/hour for one-year reserved instances and to $11.57 for three-year reserved instances).
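
For reference, spinning one of these up is the same API call as for any other EC2 instance type; only the instance type string changes. Here is a minimal boto3 sketch, where the AMI ID, key pair name and region are placeholders you would substitute yourself, plus a quick cost check based on the on-demand price quoted above:

```python
# Minimal sketch: launching a single p4d.24xlarge with boto3.
# The AMI ID and key name below are placeholders, not real values.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # e.g. a deep learning AMI of your choice
    InstanceType="p4d.24xlarge",       # eight A100s, 96 vCPUs, 1.1 TB of RAM
    KeyName="my-key-pair",
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])

# Rough cost check using the on-demand price quoted in the article.
hours_of_training = 24
print(f"~${32.77 * hours_of_training:,.2f} on demand for {hours_of_training} hours")
```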


On the extreme end, you can combine 4,000 or more GPUs into an EC2 UltraCluster, as AWS calls these machines, for high-performance computing workloads on what is essentially a supercomputer-scale machine. Given the price, you’re not likely to spin up one of these clusters to train your model for your toy app anytime soon, but AWS has already been working with a number of enterprise customers to test these instances and clusters, including Toyota Research Institute, GE Healthcare and Aon.

“At [Toyota Research Institute], we’re working to build a future where everyone has the freedom to move,” said Mike Garrison, Technical Lead, Infrastructure Engineering at TRI. “The previous generation P3 instances helped us reduce our time to train machine learning models from days to hours and we are looking forward to utilizing P4d instances, as the additional GPU memory and more efficient float formats will allow our machine learning team to train with more complex models at an even faster speed.”

Oct 10, 2018

Nvidia launches Rapids to help bring GPU acceleration to data analytics

Nvidia, together with partners like IBM, HPE, Oracle, Databricks and others, is launching a new open-source platform for data science and machine learning today. Rapids, as the company is calling it, is all about making it easier for large businesses to use the power of GPUs to quickly analyze massive amounts of data and then use that to build machine learning models.

“Businesses are increasingly data-driven,” Nvidia’s VP of Accelerated Computing Ian Buck told me. “They sense the market and the environment and the behavior and operations of their business through the data they’ve collected. We’ve just come through a decade of big data and the output of that data is using analytics and AI. But most of it is still done using traditional machine learning to recognize complex patterns, detect changes and make predictions that directly impact their bottom line.”

The idea behind Rapids then is to work with the existing popular open-source libraries and platforms that data scientists use today and accelerate them using GPUs. Rapids integrates with these libraries to provide accelerated analytics, machine learning and — in the future — visualization.

Rapids is based on Python, Buck noted; it has interfaces that are similar to Pandas and scikit-learn, two very popular data analysis and machine learning libraries, and it’s based on Apache Arrow for in-memory database processing. It can scale from a single GPU to multiple nodes, and IBM notes that the platform can achieve improvements of up to 50x for some specific use cases when compared to running the same algorithms on CPUs (though that’s not all that surprising, given what we’ve seen from other GPU-accelerated workloads in the past).
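
To make the “Pandas- and scikit-learn-like interfaces” point concrete, the sketch below shows roughly what that looks like with RAPIDS’ cuDF and cuML libraries. The file name and column names are placeholders, and the exact API surface may differ between RAPIDS releases:

```python
# Sketch of the pandas/scikit-learn-style API that RAPIDS exposes on the GPU.
# "transactions.csv" and its columns are made-up placeholders.
import cudf                      # GPU DataFrame library with a pandas-like API
from cuml.cluster import KMeans  # GPU ML library with a scikit-learn-like API

# Read and filter data on the GPU, just as you would with pandas on the CPU.
gdf = cudf.read_csv("transactions.csv")
gdf = gdf[gdf["amount"] > 0]

# Fit a clustering model on the GPU with a scikit-learn-style estimator.
model = KMeans(n_clusters=8)
model.fit(gdf[["amount", "frequency"]])
print(model.cluster_centers_)
```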

Buck noted that Rapids is the result of a multi-year effort to develop a rich enough set of libraries and algorithms, get them running well on GPUs and build the relationships with the open-source projects involved.

“It’s designed to accelerate data science end-to-end,” Buck explained. “From the data prep to machine learning and for those who want to take the next step, deep learning. Through Arrow, Spark users can easily move data into the Rapids platform for acceleration.”
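
Since both sides speak Arrow, moving a small CPU-side DataFrame into a GPU DataFrame can be sketched as follows. Treat it as illustrative only; in a real Spark pipeline, Spark’s Arrow-based export would take the place of the pandas step shown here:

```python
# Illustrative sketch: Arrow as the interchange format between CPU and GPU dataframes.
import pandas as pd
import pyarrow as pa
import cudf

cpu_df = pd.DataFrame({"user_id": [1, 2, 3], "spend": [9.5, 3.2, 7.8]})

# Convert to an Arrow table (the columnar in-memory format RAPIDS builds on)...
arrow_table = pa.Table.from_pandas(cpu_df)

# ...and hand it to cuDF, which materializes the same columns on the GPU.
gpu_df = cudf.DataFrame.from_arrow(arrow_table)
print(gpu_df["spend"].sum())
```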

Indeed, Spark is surely going to be one of the major use cases here, so it’s no wonder that Databricks, the company founded by the team behind Spark, is one of the early partners.

“We have multiple ongoing projects to integrate Spark better with native accelerators, including Apache Arrow support and GPU scheduling with Project Hydrogen,” said Spark founder Matei Zaharia in today’s announcement. “We believe that RAPIDS is an exciting new opportunity to scale our customers’ data science and AI workloads.”

Nvidia is also working with Anaconda, BlazingDB, PyData, Quansight and scikit-learn, as well as Wes McKinney, the head of Ursa Labs and the creator of Apache Arrow and Pandas.

Another partner is IBM, which plans to bring Rapids support to many of its services and platforms, including its PowerAI tools for running data science and AI workloads on GPU-accelerated Power9 servers, IBM Watson Studio and Watson Machine Learning, and the IBM Cloud with its GPU-enabled machines. “At IBM, we’re very interested in anything that enables higher performance, better business outcomes for data science and machine learning — and we think Nvidia has something very unique here,” Rob Thomas, the GM of IBM Analytics, told me.

“The main benefit to the community is that through an entirely free and open-source set of libraries that are directly compatible with the existing algorithms and subroutines that they’re used to — they now get access to GPU-accelerated versions of them,” Buck said. He also stressed that Rapids isn’t trying to compete with existing machine learning solutions. “Part of the reason why Rapids is open source is so that you can easily incorporate those machine learning subroutines into your software and get the benefits of it.”

Sep 12, 2018

Nvidia launches the Tesla T4, its fastest data center inferencing platform yet

Nvidia today announced its new GPU for machine learning and inferencing in the data center. The new Tesla T4 GPUs (where the ‘T’ stands for Nvidia’s new Turing architecture) are the successors to the current batch of P4 GPUs that virtually every major cloud computing provider now offers. Google, Nvidia said, will be among the first to bring the new T4 GPUs to its Cloud Platform.

Nvidia argues that the T4s are significantly faster than the P4s. For language inferencing, for example, the T4 is 34 times faster than using a CPU and more than 3.5 times faster than the P4. Peak performance for the T4 is 260 TOPS for 4-bit integer operations and 65 TFLOPS for floating point (FP16) operations. The T4 sits on a standard low-profile 75-watt PCIe card.

What’s most important, though, is that Nvidia designed these chips specifically for AI inferencing. “What makes Tesla T4 such an efficient GPU for inferencing is the new Turing tensor core,” said Ian Buck, Nvidia’s VP and GM of its Tesla data center business. “[Nvidia CEO] Jensen [Huang] already talked about the Tensor core and what it can do for gaming and rendering and for AI, but for inferencing — that’s what it’s designed for.” In total, the chip features 320 Turing Tensor cores and 2,560 CUDA cores.

In addition to the new chip, Nvidia is also launching a refresh of its TensorRT software for optimizing deep learning models. This new version also includes the TensorRT inference server, a fully containerized microservice for data center inferencing that plugs seamlessly into an existing Kubernetes infrastructure.
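
The article doesn’t go into deployment details, but because the inference server ships as a container, wiring it into an existing Kubernetes cluster looks roughly like the sketch below, using the Kubernetes Python client. The container image tag, port, GPU request and replica count are assumptions for illustration, not official values:

```python
# Sketch: deploying a containerized inference server onto an existing Kubernetes
# cluster. Image tag, port and replica count are placeholders, not official values.
from kubernetes import client, config

config.load_kube_config()  # use the current kubectl context

container = client.V1Container(
    name="inference-server",
    image="nvcr.io/nvidia/tensorrtserver:latest",  # placeholder tag
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="trt-inference"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "trt-inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "trt-inference"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```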

Sep 21, 2017

Google Cloud adds support for more powerful Nvidia GPUs

Google Cloud Platform announced support for some powerful Nvidia GPUs on Google Compute Engine today. For starters, the company is making Nvidia K80 GPUs generally available. At the same time, it’s launching support for Nvidia P100 GPUs in beta along with a new sustained pricing model. For companies working with machine learning workloads, having access to GPUs in the cloud provides…

Aug 07, 2017

IBM touts improved distributed training time for visual recognition models

Two months ago, Facebook’s AI Research Lab (FAIR) published some impressive training times for massively distributed visual recognition models. Today IBM is firing back with some numbers of its own. IBM’s research group says it was able to train ResNet-50 for 1k classes in 50 minutes across 256 GPUs — which is just the polite way of saying “my model trains faster than…

May 09, 2017

Nvidia is surging after its income more than doubled year-over-year

Nvidia’s ballooning GPU business and big bets on divisions like autonomous driving continue to look better and better, with the company’s shares jumping more than 10% after it reported its first-quarter earnings. In the first quarter this year, the company said it brought in $507 million in net income — up from $208 million in the first quarter a year ago. That doubled…
