Aug 08, 2018

AI Chip startup Cerebras Systems picks up a former Intel top exec

While some of the largest technology companies in the world are racing to figure out the next generation of machine learning-focused chips that will support devices — whether that’s data centers or edge devices — there’s a whole class of startups that are racing to get there first.

That includes Cerebras Systems, one of the startups that has raised a significant amount of capital, which is looking to continue targeting next-generation machine learning operations with the hiring of Dhiraj Mallick as its Vice President of Engineering and Business Development. Prior to joining Cerebras, Mallick served as the VP of architecture and CTO of Intel’s data center group. That group generated more than $5.5 billion in the second quarter this year, up from nearly $4.4 billion in the second quarter of 2017, and has generated more than $10 billion in revenue in the first half of this year. Prior to Intel, Mallick spent time at AMD and SeaMicro.

That data center experience is going to be a big part of the puzzle, as Google looks to lock in customers in its cloud platform with tools like the Tensor Processing Unit, the third generation of which was announced at Google I/O earlier this year. Data centers are able to handle some of the heavy lifting when it comes to training the models that handle machine learning processes like image recognition as they don’t necessarily have to worry about space (or partly heat, in the case of the TPU running with liquid cooling) constraints. Google is betting on that with the TPU, optimizing its hardware for its TensorFlow machine learning framework and trying to build a whole developer ecosystem that it can lock into its hardware with that and its new edge-focused TPU for inference.

Cerebras Systems is one of a class of startups that want to figure out what the next generation of machine learning hardware looks like, and most of them have raised tens of millions of dollars. It’s one of the startups that has been working on its technology for a considerable amount of time. Others include Mythic, SambaNova, Graphcore, and more than a dozen others that are all looking at different pieces of the machine learning ecosystem. But the end goal for all of them is to capture part of the machine learning process — whether that’s inference on the device or training in a server somewhere — and optimize a piece of hardware for just that.

And while Google looks to lock developers into its TensorFlow ecosystem with the TPU, the fact that there are a number of different frameworks for machine learning may actually open the door for some startups like the ones mentioned above. There are frameworks like PyTorch and Caffe2, and having a kind of third-party piece of equipment that works across a number of different developer frameworks may end up being attractive to some companies. Nvidia has been one of the largest beneficiaries of the emergence of GPUs as a go-to piece of hardware for machine learning, but these startups are all betting on room for a new piece of hardware that’s even better at those specialized operations.

Apr 02, 2018

SiFive gets $50.6M to help companies get their custom chip designs out the door

With the race to next-generation silicon in full swing, the waterfall of venture money flowing into custom silicon startups is already showing an enormous amount of potential for some more flexible hardware for an increasingly changing technology landscape — and Naveed Sherwani hopes to tap that for everyone else.

That’s the premise of SiFive, a startup that’s designed to help entrepreneurs — or any company — come up with a custom designed chip for their needs. But rather than having to raise tens of millions of dollars from a venture firm or have a massive production system in place, SiFive’s goal is to help get that piece of silicon in the hands of the developer quickly so they can see if it actually works based off a set of basic hardware and IP offered, and then figure out when and how to move it into full-scale production. The company starts by offering templates and then allows them to make some modifications for what eventually ends up as a piece of RISC-V silicon that’s in their hands. SiFive today said it has raised $50.6 million in venture financing in a round led by Sutter Hill Ventures, Spark Capital, and Osage University Partners.

“The way we view it, is that we think we should not depend on people learning special languages and things of that nature to be able to modify the architecture and enhance the architecture,” Sherwani said. “What we believe is there could be a high-level interface, which is what we’re building, which will allow people to take existing cores, bring them into their design space, and then apply a configuration. Moving those configurations, you can modify the core, and then you can get the new modified core. That’s the approach we take, we don’t have to learn a special language or be an expert, it’s the way we present the core. We’d like to start with cores that are verified, and each of these modifications does not cause to become non-verifiable.”

SiFive is based on a design framework for silicon called RISC-V. You could consider it a kind of open source analog to designs by major chip fab firms, but the goal for RISC-V chips is to lean on the decades of experience since the original piece of silicon came out of Intel to develop something that is less messy while still getting the right tasks done. Sherwani says that RISC-V chips have more than 50 instructions while common chips will have more than 1,000. By nature, they aren’t at the kind of scale of an Intel, so the kinds of efficiencies those firms might have don’t exist. But SiFive hopes to serve a wide array of varying needs rather than mass-producing a single style of silicon.

There are two flows for developers looking to build out silicon using SiFive. First is the prototype flow, where developers will get a chance to spec out their silicon and figure out their specific needs. The goal there is to get something into the hands of the developer they can use to showcase their ideas or technology, and SiFive works with IP vendors and other supply chain partners — during this time, developers aren’t paying for IP. Once the case is proved out (and the startup has, perhaps, raised money based on that idea) they can switch to a production flow with SiFive where they will start paying for IP and services. There’s also a potential marketplace element as more and more people come up with novel ideas for operational cores.

“For any segment in the market there will be a few templates available,” Sherwani said. “We’ll have some tools and methodologies there, and among all the various templates are available show what would be the best for [that customer]. We also have an app store — we are expecting people who have designed cores who are willing to share it, because they don’t need it to be proprietary. If anyone uses that template, then whatever price they can put on it, they can make some money doing that. This whole idea of marketplaces will get more people excited.”

As there is an intense rush to develop new customized silicon, it may be that services like the ones offered by SiFive become more and more necessary. But there’s another element to the bet behind SiFive: making the chip itself less ambiguous and trying to remove black boxes. That doesn’t necessarily make it wildly more secure than the one next to it, but at the very least, it means when there is a major security flaw like Intel’s Spectre problems, there may be a bit more tolerance from the developer community because there are fewer black boxes.

“All these complications are there and unless you have all this expertise, you can’t do a chip,” Sherwani said. “Our vision is that we deliver the entire chip experience to that platform and people can be able to log in. They don’t need a team, any tools, they don’t need FPGAs because all those will be available on the web. As a result the cost goes down because it’s a shared economy, they’re sharing tools, and that is how we think dramatically you can do chips at much lower cost.”

While there is a lot of venture money flowing into the AI chip space — with many different interpretations of what that hardware looks like — Sherwani said the benefit of working with SiFive is to be able to rapidly adapt an idea to a changing algorithm. Developers have already proven out a lot of different tools and frameworks, but once a piece of silicon is in production it’s not easy to change on the fly. Should those best practices or algorithms change, developers will have an opportunity to reassess and redesign the chip as quickly as possible.

The idea of that custom silicon is going to be a big theme going forward as more and more use cases emerge that could be easier with a customized piece of hardware. Already there are startups like Mythic and SambaNova Systems, which have raised tens of millions of dollars and specialize in the rapid-fire calculations for typical AI processes. But this kind of technology is now showing up in devices ranging from an autonomous vehicle to a fridge, and each use case may carry different needs. Intel and other chip design firms probably can’t hit every niche, and the one-size-fits-all (or even something more modular like an FPGA from Intel) might not hit each sweet spot. That, in theory, is the hole that a company like SiFive could fill.

Mar 20, 2018

Mythic nets $40M to create a new breed of efficient AI-focused hardware

Another huge financing round is coming in for an AI company today, this time for a startup called Mythic getting a fresh $40 million as it appears massive deals are closing left and right in the sector.

Mythic particularly focuses on the inference side of AI operations — basically making the calculation on the spot for something based off an extensively trained model. The chips are designed to be low power, small, and achieve the same kind of performance you’d expect from a GPU in terms of the lightning-fast operations that algorithms need to perform to figure out whether that thing your car is about to run into is a cat or just some text on the road. SoftBank Ventures led this most recent round of funding, with a strategic investment also coming from Lockheed Martin Ventures. ARM executive Rene Haas will also be joining the company’s board of directors.

“The key to getting really high performance and really good energy efficiency is to keep everything on the chip,” Mythic CEO Michael Henry said. “The minute you have to go outside the chip to memory, you lose all performance and energy. It just goes out the window. Knowing that, we found that you can actually leverage flash memory in a very special way. The limit there is, it’s for inference only, but we’re only going after the inference market — it’s gonna be huge. On top of that, the challenge is getting the processors and memory as close together as possible so you don’t have to move around the data on the chip.”

Mythic, like other startups, is looking to ease the back-and-forth trips to memory on the processors in order to speed things up and lower the power consumption, and CEO Michael Henry says the company has figured out how to essentially do the operations — based in a field of mathematics called linear algebra — on flash memory itself.

Mythic’s approach is designed to be what Henry calls more analog. To visualize how it might work, imagine a set-up in Minecraft, with a number of different strings of blocks leading to an end gate. If you flipped a switch to turn 50 of those strings on with some unit value, leaving the rest off, and joined them at the end and saw the combined final result of the power, you would have completed something similar to an addition operation leading to a sum of 50 units. Mythic’s chips are designed to do something not so dissimilar, finding ways to complete those kinds of analog operations for addition and multiplication in order to handle the computational requirements for an inference operation. The end result, Henry says, consumes less power and dissipates less heat while still getting just enough accuracy to get the right solution (more technically: the calculations are 8-bit results).
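The analogy above can be sketched in a few lines of Python. This is only a toy model of the idea, not Mythic's design: the function names (`analog_sum`, `analog_dot`, `quantize_8bit`), the 64-string layout, and the mapping onto 256 levels are all assumptions made for illustration. It shows addition as the combined output of joined lines, multiplication as a stored weight scaling its input, and an 8-bit readout of the result.

```python
def analog_sum(currents):
    """Joining the lines at the end: the combined output is the sum of the inputs."""
    return sum(currents)


def analog_dot(inputs, weights):
    """Each weight scales its input (multiplication); the scaled values
    are then combined on a shared line (addition)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return analog_sum(x * w for x, w in zip(inputs, weights))


def quantize_8bit(value, max_value):
    """Read a value in [0, max_value] out as one of 256 levels (0..255).
    Hypothetical readout step, standing in for the 8-bit results."""
    clipped = min(max(value, 0.0), max_value)
    return round(clipped / max_value * 255)


# 50 strings switched on at one unit each, the remaining 14 left off
strings = [1.0] * 50 + [0.0] * 14
total = analog_sum(strings)                        # 50.0 units
reading = quantize_8bit(total, max_value=64.0)     # 8-bit level for 50 of 64
```

The readout step is where precision is traded away: the sum is exact in this toy model, but only 256 distinct output levels survive.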

After that, the challenge is sticking a layer on top of that to make it look and behave like a normal chip to a developer. The goal, as with other players in the AI hardware space, is to just plug into frameworks like TensorFlow. Those frameworks abstract out all the complicated tooling and tuning required for such a specific piece of hardware and make it very approachable and easy for developers to start building machine learning projects. Andrew Feldman, CEO of another AI hardware startup called Cerebras Systems, said at the Goldman Sachs Technology and Internet conference last month that frameworks like TensorFlow had killed most of the value Nvidia had built up by creating an ecosystem for developers on its own system.

Henry, too, is a big TensorFlow fan. And for good reason: it’s frameworks like TensorFlow that allow next-generation chip ideas to even get off the ground in the first place. These kinds of frameworks, which have become increasingly popular with developers, have abstracted out the complexity of working with specific low-level hardware like a field programmable gate array (FPGA) or a GPU. That’s made building machine learning-based operations much easier for developers and led to an explosion of activity when it comes to machine learning, whether it’s speech or image recognition among a number of other use cases.

“Things like TensorFlow make our lives so much easier,” Henry said. “Once you have a neural network described on TensorFlow, it’s on us to take that and translate that onto our chip. We can abstract that difficulty by having an automatic compiler.”
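What Henry describes, taking a network written at the framework level and translating it for a particular chip, can be illustrated with a toy lowering step. The sketch below is hypothetical throughout: the network description, the `BACKENDS` table, the kernel names, and `compile_network` are invented for illustration and do not correspond to TensorFlow's or Mythic's actual APIs.

```python
# Hypothetical, framework-neutral description of a small network.
network = [
    {"op": "matmul", "name": "fc1"},
    {"op": "relu", "name": "act1"},
    {"op": "matmul", "name": "fc2"},
]

# Hypothetical lowering tables: one mapping from abstract ops to
# hardware-specific kernels per backend.
BACKENDS = {
    "cpu": {"matmul": "cpu_gemm", "relu": "cpu_relu"},
    "accelerator": {"matmul": "accel_matmul_tile", "relu": "accel_relu"},
}


def compile_network(layers, backend):
    """Translate each abstract layer into the kernel the chosen backend provides."""
    table = BACKENDS[backend]
    program = []
    for layer in layers:
        kernel = table.get(layer["op"])
        if kernel is None:
            raise NotImplementedError(
                f"{backend} has no kernel for op {layer['op']!r}"
            )
        program.append((layer["name"], kernel))
    return program


cpu_program = compile_network(network, "cpu")
accel_program = compile_network(network, "accelerator")
```

The developer only ever writes `network`; swapping the backend string is the whole porting effort in this toy model, which is the appeal for a new chip vendor.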

While many of these companies are talking about getting massive performance gains over a GPU — and, to be sure, Henry hopes that’ll be the case — the near term goal for Mythic is to match the performance of a $1,000 GPU while showing it can take up less space and consume less power. There’s a market for the card that customers can hot swap in right away. Henry says the company is focused on using a PCI-E interface, a very common plug-and-play system, and that’s it.

The challenge for Mythic, however, is going to be getting into the actual design of some of the hardware that comes out. It’s one thing to sell a bunch of cards that companies can stick into their existing hardware, but it’s another to get embedded into the actual pieces of hardware themselves — which is what’s going to need to happen if it wants to be a true workhorse for devices on the edge, like security cameras or things handling speech recognition. That makes the buying cycle a little more difficult, but at the same time, there will be billions of devices out there that need advanced hardware to power their inference operations.

“If we can sell a PCI card, you buy it and drop it in right away, but those are usually for low-volume, high-selling price products,” Henry said. “The other customers we serve design you into the hardware products. That’s a longer cycle, that can take upwards of a year. For that, typically the volumes are much higher. The nice thing is that you’re really really sticky. If they design you into a product you’re really sticky. We can go after both, we can go after board sales, and then go after design.”

There are probably going to be two big walls for Mythic, as well as for any of the other players out there. The first is that none of these companies have shipped a product. While Mythic, or other companies, might have a proof-of-concept chip that they can drop on the table, getting a production-ready piece of next-generation silicon is a dramatic undertaking. Then there’s the process of not only getting people to buy the hardware, but actually convincing them that they’ll have the systems in place to ensure that developers will build on that hardware. Mythic says it plans to have a sample for customers by the end of the year, with a production product by 2019.

That also explains why Mythic, along with those other startups, is able to raise enormous rounds of money — which means there’s going to be a lot of competition amongst all of them. Here’s a quick list of what fundraising has happened so far: SambaNova Systems raised $56 million last week; Graphcore raised $50 million in November last year; Cerebras Systems’s first round was $25 million in December 2016; and this isn’t even counting an increasing amount of activity happening among companies in China. There’s still definitely a segment of investors that consider the space way too hot (and there is, indeed, a ton of funding) or potentially unnecessary if you don’t need the bleeding edge efficiency or power of these products.

And there are, of course, the elephants in the room in the form of Nvidia and to a lesser extent Intel. The latter is betting big on FPGA and other products, while Nvidia has snapped up most of the market thanks to GPUs being much more efficient at the kind of math needed for AI. The play for all these startups is they can be faster, more efficient, or in the case of Mythic, cheaper than all those other options. It remains to be seen whether they’ll unseat Nvidia, but nonetheless there’s an enormous amount of funding flowing in.

“The question is, is someone going to be able to beat Nvidia when they have the valuation and cash reserves,” Henry said. “But the thing, is we’re in a different market. We’re going after the edge, we’re going after things embedded inside phones and cars and drones and robotics, for applications like AR and VR, and it’s just really a different market. When investors analyze us they have to think of us differently. They don’t think, is this the one that wins Nvidia, they think, are one or more of these powder keg markets explode. It’s a different conversation for us because we’re an edge company.”

Mar 15, 2018

The red-hot AI hardware space gets even hotter with $56M for a startup called SambaNova Systems

Another massive financing round for an AI chip company is coming in today, this time for SambaNova Systems — a startup founded by a pair of Stanford professors and a longtime chip company executive — to build out the next generation of hardware to supercharge AI-centric operations.

SambaNova joins an already quite large class of startups looking to attack the problem of making AI operations much more efficient and faster by rethinking the actual substrate where the computations happen. The GPU has become increasingly popular among developers for its ability to handle the kinds of lightweight mathematics necessary for AI operations in very speedy fashion. Startups like SambaNova look to create a new platform from scratch, all the way down to the hardware, that is optimized exactly for those operations. The hope is that by doing that, it will be able to outclass a GPU in terms of speed, power usage, and even potentially the actual size of the chip. SambaNova today said it has raised a massive $56 million series A financing round co-led by GV and Walden International, with participation from Redline Capital and Atlantic Bridge Ventures.

SambaNova is the product of technology from Kunle Olukotun and Chris Ré, two professors at Stanford, and is led by former Oracle SVP of development Rodrigo Liang, who was also a VP at Sun for almost 8 years. When looking at the landscape, the team at SambaNova looked to work their way backwards, first identifying what operations need to happen more efficiently and then figuring out what kind of hardware needs to be in place in order to make that happen. That boils down to a lot of calculations stemming from a field of mathematics called linear algebra done very, very quickly, but it’s something that existing CPUs aren’t exactly tuned to do. And a common criticism from most of the founders in this space is that Nvidia GPUs, while much more powerful than CPUs when it comes to these operations, are still ripe for disruption.
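To make "a lot of calculations stemming from linear algebra" concrete, here is a minimal sketch of the operation at the heart of most neural network layers: a matrix-vector product, built entirely from multiply-accumulate steps. The names `matvec` and `mac_count` and the tiny 3x2 example are illustrative assumptions, not anything SambaNova has described.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: one dot product per row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mac_count(rows, cols):
    """Number of multiply-accumulate operations in a rows x cols matrix-vector product."""
    return rows * cols


# A tiny "layer": 3 outputs computed from 2 inputs
weights = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
inputs = [10.0, 1.0]

outputs = matvec(weights, inputs)   # one weighted sum per output
small_layer_macs = mac_count(3, 2)
large_layer_macs = mac_count(4096, 4096)
```

The work grows with the product of the layer's dimensions, so a single 4096 x 4096 layer already needs about 16.8 million multiply-accumulates per input. That volume of simple, repetitive arithmetic is what specialized hardware tries to speed up.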

“You’ve got these huge [computational] demands, but you have the slowing down of Moore’s law,” Olukotun said. “The question is, how do you meet these demands while Moore’s law slows. Fundamentally you have to develop computing that’s more efficient. If you look at the current approaches to improve these applications based on multiple big cores or many small, or even FPGA or GPU, we fundamentally don’t think you can get to the efficiencies you need. You need an approach that’s different in the algorithms you use and the underlying hardware that’s also required. You need a combination of the two in order to achieve the performance and flexibility levels you need in order to move forward.”

While a $56 million funding round for a series A might sound massive, it’s becoming a pretty standard number for startups looking to attack this space, which has an opportunity to beat massive chipmakers and create a new generation of hardware that will be omnipresent among any device that is built around artificial intelligence — whether that’s a chip sitting on an autonomous vehicle doing rapid image processing or potentially even a server within a healthcare organization training models for complex medical problems. Graphcore, another chip startup, got $50 million in funding from Sequoia Capital, while Cerebras Systems also received significant funding from Benchmark Capital.

Olukotun and Liang wouldn’t go into the specifics of the architecture, but they are looking to redo the operational hardware to optimize for the AI-centric frameworks that have become increasingly popular in fields like image and speech recognition. At its core, that involves a lot of rethinking of how interaction with memory occurs and what happens with heat dissipation for the hardware, among other complex problems. Apple, Google with its TPU, and reportedly Amazon have taken an intense interest in this space to design their own hardware that’s optimized for products like Siri or Alexa, which makes sense because dropping that latency to as close to zero as possible with as much accuracy as possible in the end improves the user experience. A great user experience leads to more lock-in for those platforms, and while the larger players may end up making their own hardware, GV’s Dave Munichiello — who is joining the company’s board — says this is basically a validation that everyone else is going to need the technology soon enough.

“Large companies see a need for specialized hardware and infrastructure,” he said. “AI and large-scale data analytics are so essential to providing services the largest companies provide that they’re willing to invest in their own infrastructure, and that tells us more investment is coming. What Amazon and Google and Microsoft and Apple are doing today will be what the rest of the Fortune 100 are investing in in 5 years. I think it just creates a really interesting market and an opportunity to sell a unique product. It just means the market is really large, if you believe in your company’s technical differentiation, you welcome competition.”

There is certainly going to be a lot of competition in this area, and not just from those startups. While SambaNova wants to create a true platform, there are a lot of different interpretations of where it should go — such as whether it should be two separate pieces of hardware that handle either inference or training. Intel, too, is betting on an array of products, as well as a technology called Field Programmable Gate Arrays (or FPGAs), which would allow for a more modular approach in building hardware specified for AI and are designed to be flexible and change over time. Both Munichiello’s and Olukotun’s arguments are that these require developers who have special expertise in FPGAs, which is a sort of niche-within-a-niche that most organizations will probably not have readily available.

Nvidia has been a massive beneficiary of the explosion of AI systems, but that explosion has also clearly exposed a ton of interest in investing in a new breed of silicon. There’s certainly an argument for developer lock-in on Nvidia’s platforms like CUDA. But there are a lot of new frameworks, like TensorFlow, that create a layer of abstraction and are increasingly popular with developers. That, too, represents an opportunity for both SambaNova and other startups, who can just work to plug into those popular frameworks, Olukotun said. Cerebras Systems CEO Andrew Feldman actually also addressed some of this on stage at the Goldman Sachs Technology and Internet Conference last month.

“Nvidia has spent a long time building an ecosystem around their GPUs, and for the most part, with the combination of TensorFlow, Google has killed most of its value,” Feldman said at the conference. “What TensorFlow does is, it says to researchers and AI professionals, you don’t have to get into the guts of the hardware. You can write at the upper layers and you can write in Python, you can use scripts, you don’t have to worry about what’s happening underneath. Then you can compile it very simply and directly to a CPU, TPU, GPU, to many different hardwares, including ours. If in order to do that work, you have to be the type of engineer that can do hand-tuned assembly or can live deep in the guts of hardware, there will be no adoption… We’ll just take in their TensorFlow, we don’t have to worry about anything else.”

(As an aside, I was once told that CUDA and those other lower-level platforms are really used by AI wonks like Yann LeCun building weird AI stuff in the corners of the Internet.)

There are also two big question marks for SambaNova. First, it’s very new, having started in just November while many of these efforts for both startups and larger companies have been years in the making. Munichiello’s answer to this is that the development for those technologies did, indeed, begin a while ago — and that’s not a terrible thing as SambaNova just gets started in the current generation of AI needs. And the second, among some in the valley, is that most of the industry just might not need hardware that does these operations in a blazing fast manner. The latter, you might argue, could just be alleviated by the fact that so many of these companies are getting so much funding, with some already reaching close to billion-dollar valuations.

But, in the end, you can now add SambaNova to the list of AI startups that have raised enormous rounds of funding — one that stretches out to include a myriad of companies around the world like Graphcore and Cerebras Systems, as well as a lot of reported activity out of China with companies like Cambricon Technology and Horizon Robotics. This effort does, indeed, require significant investment, not only because it’s hardware at its base, but also because the company has to actually convince customers to deploy that hardware and start tapping the platforms it creates, which supporting existing frameworks hopefully alleviates.

“The challenge you see is that the industry, over the last ten years, has underinvested in semiconductor design,” Liang said. “If you look at the innovations at the startup level all the way through big companies, we really haven’t pushed the envelope on semiconductor design. It was very expensive and the returns were not quite as good. Here we are, suddenly you have a need for semiconductor design, and to do low-power design requires a different skillset. If you look at this transition to intelligent software, it’s one of the biggest transitions we’ve seen in this industry in a long time. You’re not accelerating old software, you want to create that platform that’s flexible enough [to optimize these operations] — and you want to think about all the pieces. It’s not just about machine learning.”
