Sep
12
2018
--

Nvidia launches the Tesla T4, its fastest data center inferencing platform yet

Nvidia today announced its new GPU for machine learning and inferencing in the data center. The new Tesla T4 GPUs (where the ‘T’ stands for Nvidia’s new Turing architecture) are the successors to the current batch of P4 GPUs that virtually every major cloud computing provider now offers. Google, Nvidia said, will be among the first to bring the new T4 GPUs to its Cloud Platform.

Nvidia argues that the T4s are significantly faster than the P4s. For language inferencing, for example, the T4 is 34 times faster than using a CPU and more than 3.5 times faster than the P4. Peak performance for the T4 is 260 TOPS for 4-bit integer operations and 65 teraFLOPS for floating point operations. The T4 sits on a standard low-profile 75 watt PCI-e card.

What’s most important, though, is that Nvidia designed these chips specifically for AI inferencing. “What makes Tesla T4 such an efficient GPU for inferencing is the new Turing tensor core,” said Ian Buck, Nvidia’s VP and GM of its Tesla data center business. “[Nvidia CEO] Jensen [Huang] already talked about the Tensor core and what it can do for gaming and rendering and for AI, but for inferencing — that’s what it’s designed for.” In total, the chip features 320 Turing Tensor cores and 2,560 CUDA cores.

In addition to the new chip, Nvidia is also launching a refresh of its TensorRT software for optimizing deep learning models. This new version also includes the TensorRT inference server, a fully containerized microservice for data center inferencing that plugs seamlessly into an existing Kubernetes infrastructure.

 

 

May
30
2018
--

Nvidia launches colossal HGX-2 cloud server to power HPC and AI

Nvidia launched a monster box yesterday called the HGX-2, and it’s the stuff that geek dreams are made of. It’s a cloud server that is purported to be so powerful it combines high performance computing with artificial intelligence requirements in one exceptionally compelling package.

You know you want to know the specs, so let’s get to it: It starts with 16x Nvidia Tesla V100 GPUs. That’s good for 2 petaFLOPS for AI with low precision, 250 teraFLOPS for medium precision and 125 teraFLOPS for those times when you need the highest precision. It comes standard with half a terabyte of memory and 12 Nvidia NVSwitches, which enable GPU-to-GPU communications at 300 GB per second. They have doubled the capacity from the HGX-1 released last year.
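For a rough sense of what those aggregate numbers mean per GPU, here is a minimal sketch that simply divides the quoted totals evenly across the 16 GPUs. It assumes that "half a terabyte" means 512 GB and that the load is spread uniformly; the names `AGGREGATE_TFLOPS` and `per_gpu_share` are illustrative only and come from no Nvidia tool or API.

```python
# Aggregate HGX-2 figures as quoted in the article.
NUM_GPUS = 16
AGGREGATE_TFLOPS = {
    "low precision (AI)": 2000.0,  # 2 petaFLOPS = 2,000 teraFLOPS
    "medium precision": 250.0,
    "highest precision": 125.0,
}
# Assumption: "half a terabyte" is read as 512 GB.
AGGREGATE_MEMORY_GB = 512


def per_gpu_share(total: float, gpus: int = NUM_GPUS) -> float:
    """Evenly divide an aggregate figure across the GPUs in the box."""
    return total / gpus


per_gpu_tflops = {name: per_gpu_share(v) for name, v in AGGREGATE_TFLOPS.items()}
per_gpu_memory_gb = per_gpu_share(AGGREGATE_MEMORY_GB)

for name, tflops in per_gpu_tflops.items():
    print(f"{name}: {tflops:g} teraFLOPS per GPU")
print(f"memory: {per_gpu_memory_gb:g} GB per GPU")
```

Under those assumptions each GPU contributes 125 teraFLOPS at low precision and 32 GB of memory, which is the pool the NVSwitch fabric lets programs treat as one.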


Paresh Kharya, group product marketing manager for Nvidia’s Tesla data center products, says this communication speed enables them to treat the GPUs essentially as one giant GPU. “And what that allows [developers] to do is not just access that massive compute power, but also access that half a terabyte of GPU memory as a single memory block in their programs,” he explained.


Unfortunately, you won’t be able to buy one of these boxes. In fact, Nvidia is distributing them strictly to resellers, who will likely package these babies up and sell them to hyperscale datacenters and cloud providers. The beauty of this approach for cloud resellers is that when they buy it, they have the entire range of precision in a single box, Kharya said.

“The benefit of the unified platform is as companies and cloud providers are building out their infrastructure, they can standardize on a single unified architecture that supports the entire range of high performance workloads. So whether it’s AI, or whether it’s high performance simulations the entire range of workloads is now possible in just a single platform,” Kharya explained.

He points out this is particularly important in large scale datacenters. “In hyperscale companies or cloud providers, the main benefit that they’re providing is the economies of scale. If they can standardize on the fewest possible architectures, they can really maximize the operational efficiency. And what HGX allows them to do is to standardize on that single unified platform,” he added.

As for developers, they can write programs that take advantage of the underlying technologies and program in the exact level of precision they require from a single box.

The HGX-2-powered servers will be available later this year from partner resellers including Lenovo, QCT, Supermicro and Wiwynn.

Mar
27
2018
--

Pure Storage teams with Nvidia on GPU-fueled Flash storage solution for AI

As companies gather increasing amounts of data, they face a choice over bottlenecks. They can have it in the storage component or the backend compute system. Some companies have attacked the problem by using GPUs to streamline the back end problem or Flash storage to speed up the storage problem. Pure Storage wants to give customers the best of both worlds.

Today it announced Airi, a complete data storage solution for AI workloads in a box.

Under the hood, Airi starts with a Pure Storage FlashBlade, a storage solution that Pure created specifically with AI and machine learning workloads in mind. Nvidia contributes the pure power with four Nvidia DGX-1 supercomputers, delivering four petaFLOPS of performance with Nvidia Tesla V100 GPUs. Arista provides the networking hardware to make it all work together with Arista 100GbE switches. The software glue layer comes from the Nvidia GPU Cloud deep learning stack and the Pure Storage AIRI Scaling Toolkit.


One interesting aspect of this deal is that the FlashBlade product operates as a separate product inside of the Pure Storage organization. They have put together a team of engineers with AI and data pipeline expertise, focused on finding ways to move beyond the traditional storage market and on figuring out where the market is going.

This approach certainly does that, but the question is whether companies want to chase the on-prem hardware approach or take this kind of data to the cloud. Pure would argue that the data gravity of AI workloads would make this difficult to achieve with a cloud solution, but we are seeing increasingly large amounts of data moving to the cloud, with the cloud vendors providing tools for data scientists to process that data.

If companies choose to go the hardware route over the cloud, each vendor in this equation — whether Nvidia, Pure Storage or Arista — should benefit from a multi-vendor sale. The idea ultimately is to provide customers with a one-stop solution they can install quickly inside a data center if that’s the approach they want to take.

Mar
15
2018
--

The red-hot AI hardware space gets even hotter with $56M for a startup called SambaNova Systems

Another massive financing round for an AI chip company is coming in today, this time for SambaNova Systems — a startup founded by a pair of Stanford professors and a longtime chip company executive — to build out the next generation of hardware to supercharge AI-centric operations.

SambaNova joins an already quite large class of startups looking to attack the problem of making AI operations much more efficient and faster by rethinking the actual substrate where the computations happen. The GPU has become increasingly popular among developers for its ability to handle, in very speedy fashion, the kinds of lightweight mathematics necessary for AI operations. Startups like SambaNova look to create a new platform from scratch, all the way down to the hardware, that is optimized exactly for those operations. The hope is that by doing that, it will be able to outclass a GPU in terms of speed, power usage, and even potentially the actual size of the chip. SambaNova today said it has raised a massive $56 million series A financing round co-led by GV and Walden International, with participation from Redline Capital and Atlantic Bridge Ventures.

SambaNova is the product of technology from Kunle Olukotun and Chris Ré, two professors at Stanford, and is led by former Oracle SVP of development Rodrigo Liang, who was also a VP at Sun for almost 8 years. When looking at the landscape, the team at SambaNova looked to work their way backwards, first identifying what operations need to happen more efficiently and then figuring out what kind of hardware needs to be in place in order to make that happen. That boils down to a lot of calculations stemming from a field of mathematics called linear algebra done very, very quickly, but it’s something that existing CPUs aren’t exactly tuned to do. And a common criticism from most of the founders in this space is that Nvidia GPUs, while much more powerful than CPUs when it comes to these operations, are still ripe for disruption.

“You’ve got these huge [computational] demands, but you have the slowing down of Moore’s law,” Olukotun said. “The question is, how do you meet these demands while Moore’s law slows. Fundamentally you have to develop computing that’s more efficient. If you look at the current approaches to improve these applications based on multiple big cores or many small, or even FPGA or GPU, we fundamentally don’t think you can get to the efficiencies you need. You need an approach that’s different in the algorithms you use and the underlying hardware that’s also required. You need a combination of the two in order to achieve the performance and flexibility levels you need in order to move forward.”

While a $56 million funding round for a series A might sound massive, it’s becoming a pretty standard number for startups looking to attack this space, which has an opportunity to beat massive chipmakers and create a new generation of hardware that will be omnipresent in any device that is built around artificial intelligence, whether that’s a chip sitting on an autonomous vehicle doing rapid image processing or a server within a healthcare organization training models for complex medical problems. Graphcore, another chip startup, got $50 million in funding from Sequoia Capital, while Cerebras Systems also received significant funding from Benchmark Capital.

Olukotun and Liang wouldn’t go into the specifics of the architecture, but they are looking to redo the operational hardware to optimize for the AI-centric frameworks that have become increasingly popular in fields like image and speech recognition. At its core, that involves a lot of rethinking of how interaction with memory occurs and what happens with heat dissipation for the hardware, among other complex problems. Apple, Google with its TPU, and reportedly Amazon have taken an intense interest in this space to design their own hardware that’s optimized for products like Siri or Alexa, which makes sense because dropping that latency to as close to zero as possible with as much accuracy as possible in the end improves the user experience. A great user experience leads to more lock-in for those platforms, and while the larger players may end up making their own hardware, GV’s Dave Munichiello — who is joining the company’s board — says this is basically a validation that everyone else is going to need the technology soon enough.

“Large companies see a need for specialized hardware and infrastructure,” he said. “AI and large-scale data analytics are so essential to providing services the largest companies provide that they’re willing to invest in their own infrastructure, and that tells us more investment is coming. What Amazon and Google and Microsoft and Apple are doing today will be what the rest of the Fortune 100 are investing in in 5 years. I think it just creates a really interesting market and an opportunity to sell a unique product. It just means the market is really large, if you believe in your company’s technical differentiation, you welcome competition.”

There is certainly going to be a lot of competition in this area, and not just from those startups. While SambaNova wants to create a true platform, there are a lot of different interpretations of where it should go, such as whether it should be two separate pieces of hardware that handle either inference or machine training. Intel, too, is betting on an array of products, as well as a technology called Field Programmable Gate Arrays (or FPGAs), which would allow for a more modular approach in building hardware specialized for AI and are designed to be flexible and change over time. Both Munichiello’s and Olukotun’s arguments are that these require developers who have special expertise in FPGAs, which is a sort of niche-within-a-niche that most organizations will probably not have readily available.

Nvidia has been a massive beneficiary of the explosion of AI systems, but that explosion has clearly exposed a ton of interest in investing in a new breed of silicon. There’s certainly an argument for developer lock-in on Nvidia’s platforms like CUDA. But there are a lot of new frameworks, like TensorFlow, that are creating a layer of abstraction and are increasingly popular with developers. That, too, represents an opportunity for both SambaNova and other startups, who can just work to plug into those popular frameworks, Olukotun said. Cerebras Systems CEO Andrew Feldman actually also addressed some of this on stage at the Goldman Sachs Technology and Internet Conference last month.

“Nvidia has spent a long time building an ecosystem around their GPUs, and for the most part, with the combination of TensorFlow, Google has killed most of its value,” Feldman said at the conference. “What TensorFlow does is, it says to researchers and AI professionals, you don’t have to get into the guts of the hardware. You can write at the upper layers and you can write in Python, you can use scripts, you don’t have to worry about what’s happening underneath. Then you can compile it very simply and directly to a CPU, TPU, GPU, to many different hardwares, including ours. If in order to do that work, you have to be the type of engineer that can do hand-tuned assembly or can live deep in the guts of hardware, there will be no adoption… We’ll just take in their TensorFlow, we don’t have to worry about anything else.”

(As an aside, I was once told that CUDA and those other lower-level platforms are really used by AI wonks like Yann LeCun building weird AI stuff in the corners of the Internet.)

There are also two big question marks for SambaNova. First, it’s very new, having started in just November while many of these efforts for both startups and larger companies have been years in the making. Munichiello’s answer to this is that the development for those technologies did, indeed, begin a while ago, and that’s not a terrible thing as SambaNova just gets started in the current generation of AI needs. The second, among some in the valley, is that most of the industry just might not need hardware that does these operations in a blazing fast manner. The latter, you might argue, could just be alleviated by the fact that so many of these companies are getting so much funding, with some already reaching close to billion-dollar valuations.

But, in the end, you can now add SambaNova to the list of AI startups that have raised enormous rounds of funding, one that stretches out to include a myriad of companies around the world like Graphcore and Cerebras Systems, as well as a lot of reported activity out of China with companies like Cambricon Technology and Horizon Robotics. This effort does, indeed, require significant investment, not only because it’s hardware at its base, but also because it has to actually convince customers to deploy that hardware and start tapping the platforms it creates, which supporting existing frameworks hopefully alleviates.

“The challenge you see is that the industry, over the last ten years, has underinvested in semiconductor design,” Liang said. “If you look at the innovations at the startup level all the way through big companies, we really haven’t pushed the envelope on semiconductor design. It was very expensive and the returns were not quite as good. Here we are, suddenly you have a need for semiconductor design, and to do low-power design requires a different skillset. If you look at this transition to intelligent software, it’s one of the biggest transitions we’ve seen in this industry in a long time. You’re not accelerating old software, you want to create that platform that’s flexible enough [to optimize these operations] — and you want to think about all the pieces. It’s not just about machine learning.”

Sep
21
2017
--

Google Cloud adds support for more powerful Nvidia GPUs

Google Cloud Platform announced support for some powerful Nvidia GPUs on Google Compute Engine today. For starters, the company is making Nvidia K80 GPUs generally available. At the same time, it’s launching support for Nvidia P100 GPUs in Beta along with a new sustained pricing model. For companies working with machine learning workloads, having access to GPUs in the cloud provides…

Aug
07
2017
--

IBM touts improved distributed training time for visual recognition models

Two months ago, Facebook’s AI Research Lab (FAIR) published some impressive training times for massively distributed visual recognition models. Today IBM is firing back with some numbers of its own. IBM’s research group says it was able to train ResNet-50 for 1k classes in 50 minutes across 256 GPUs — which is just the polite way of saying “my model trains faster than…

Aug
02
2017
--

HP’s new Nvidia-powered backpack VR PC is designed for work, not play

HP has a new entrant in that most curious PC niche – the backpack computer. A product of the virtual reality computing wave, the backpack PC provides all the power needed to drive high-quality VR headsets like Oculus Rift and HTC Vive, but with a form factor that allows the user to roam about untethered. The new HP Z VR Backpack is a bit different from the rest of the field, though,…

May
09
2017
--

Nvidia is surging after its income more than doubled year-over-year

Nvidia’s ballooning GPU business and big bets on divisions like autonomous driving continue to look better and better, with the company’s shares jumping more than 10% after it reported its first-quarter earnings. In the first quarter this year, the company said it brought in $507 million in net income — up from $208 million in the first quarter a year ago. That doubled…

Jan
05
2017
--

Nvidia hits prime time at CES this year

What do you do when you rapidly become one of the most important chip manufacturers in the world and your stock price more than triples in a single year? For Nvidia, that means you throw a massive keynote stuffed with announcements that are setting the stage for a suite of products built around your core technology — building GPUs — that will make you the center of the…
