
AI chip race: Groq CEO takes on Nvidia, claims most startups will use speedy LPUs by end of 2024


Everyone is talking about Nvidia’s jaw-dropping earnings results — up a whopping 265% from a year ago. But don’t sleep on Groq, the Silicon Valley-based company creating new AI chips for large language model (LLM) inference (making decisions or predictions on existing models, as opposed to training). Last weekend, Groq suddenly enjoyed a viral moment most startups just dream of.

Sure, it wasn’t as big a social media splash as even one of Elon Musk’s posts about the entirely unrelated large language model Grok. But I’m certain the folks at Nvidia took notice after Matt Shumer, CEO of HyperWrite, posted on X about Groq’s “wild tech” that’s “serving Mixtral at nearly 500 tok/s” with answers that are “pretty much instantaneous.”

Shumer followed up on X with a public demo of a “lightning-fast answers engine” showing “factual, cited answers with hundreds of words in less than a second” — and suddenly it seemed like everyone in AI was talking about and trying out Groq’s chat app on its website, where users can choose from output served up by Llama and Mistral LLMs.
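A quick back-of-the-envelope check shows why those numbers read as “instantaneous.” This sketch assumes the roughly 500 tokens/sec throughput Shumer reported and a rough rule of thumb of about 1.3 tokens per English word (both figures are assumptions for illustration, not Groq specifications):

```python
# Rough latency estimate for a "hundreds of words in less than a
# second" answer, at the ~500 tok/s throughput reported for Mixtral
# on Groq. The tokens-per-word ratio is a rough rule of thumb.
TOKENS_PER_WORD = 1.3   # assumption: ~1.3 tokens per English word
THROUGHPUT_TPS = 500    # reported tokens per second

def generation_time(words: int, tps: float = THROUGHPUT_TPS) -> float:
    """Seconds to stream an answer of `words` words at `tps` tokens/sec."""
    return words * TOKENS_PER_WORD / tps

# A ~300-word cited answer would indeed stream in under a second:
print(f"{generation_time(300):.2f}s")  # ~0.78s
```

At those rates even a long, multi-paragraph answer finishes before a typical chat UI would render its first loading spinner.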

This was all on top of a CNN interview over a week ago in which Groq CEO and founder Jonathan Ross showed off Groq powering an audio chat interface that “breaks speed records.”

While no company can challenge Nvidia’s dominance right now — Nvidia enjoys over 80% of the high-end chip market; other AI chip startups like SambaNova and Cerebras have yet to make much headway, even with AI inference; Nvidia just reported $22 billion in fourth-quarter revenue — Groq CEO and founder Jonathan Ross told me in an interview that the eye-watering costs of inference make his startup’s offering a “super-fast,” cheaper option, especially for LLM use.

In a bold claim, Ross told me that “we’re probably going to be the infrastructure that most startups are using by the end of the year,” adding that “we’re very favorable towards startups — reach out and we’ll make sure you’re not paying as much as you would elsewhere.”

Groq LPUs vs. Nvidia GPUs

Groq’s website describes its LPUs, or ‘language processing units,’ as “a new type of end-to-end processing unit system that provides the fastest inference for computationally intensive applications with a sequential component to them, such as AI language applications (LLMs).”

In contrast, Nvidia GPUs are optimized for parallel graphics processing, not LLMs. Since Groq’s LPUs are specifically designed to deal with sequences of data, like code and natural language, they can serve up LLM output faster than GPUs by bypassing two areas that GPUs or CPUs struggle with: compute density and memory bandwidth.

In addition, when it comes to their chat interface, Ross claims that Groq also differentiates itself from companies like OpenAI because Groq doesn’t train models — and therefore doesn’t need to log any data and can keep chat queries private.

With ChatGPT estimated to run more than 13 times faster if it were powered by Groq chips, would OpenAI be a potential Groq partner? Ross wouldn’t say specifically, but the demo version of a Groq audio chat interface told me it’s “possible that they could collaborate if there’s a mutual benefit. OpenAI may be interested in leveraging the unique capabilities of LPUs for their language processing initiatives. It could be an exciting partnership if they share similar goals.”

Are Groq’s LPUs really an AI inference game-changer?

I was supposed to speak with Ross months ago, ever since the company’s PR rep reached out to me in mid-December calling Groq the “US chipmaker poised to win the AI race.” I was curious, but never had time to take the call.

But now I definitely made time: I wanted to know if Groq is just the latest entrant in the fast-moving AI hype cycle of “PR attention is all you need.” Are Groq’s LPUs really an AI inference game-changer? And what has life been like for Ross and his small 200-person team (they call themselves ‘Groqsters’) over the past week, after a special moment of tech hardware fame?

Shumer’s posts were “the match that lit the fuse,” Ross told me on a video call from a Paris hotel, where he had just had lunch with the team from Mistral — the French open source LLM startup that has enjoyed several of its own viral moments over the past couple of months.

He estimated that over 3,000 people reached out to Groq asking for API access within 24 hours of Shumer’s post, but laughed, adding that “we’re not billing them because we don’t have billing set up. We’re just letting people use it for free at the moment.”

But Ross is hardly inexperienced when it comes to the ins and outs of running a startup in Silicon Valley — he has been beating the drum about the potential of Groq’s tech since it was founded in 2016. A quick Google search unearthed a Forbes story from 2021 that detailed Groq’s $300 million fundraising round, as well as Ross’s backstory of helping invent Google’s tensor processing unit, or TPU, and then leaving Google to launch Groq in 2016.

At Groq, Ross and his team built what he calls “a very unusual chip, because if you’re building a car, you can start with the engine or you can start with the driving experience. And we started with the driving experience — we spent the first six months working on a compiler before we designed the chip.”

Feeding the hunger for Nvidia GPU access has become big business

As I reported last week, feeding the widespread hunger for access to Nvidia GPUs, which was the top gossip of Silicon Valley last summer, has become big business across the AI industry.

It has minted new GPU cloud unicorns (Lambda, Together AI and CoreWeave), while former GitHub CEO Nat Friedman announced yesterday that his team had even created a Craigslist for GPU clusters. And, of course, there was the Wall Street Journal report that OpenAI CEO Sam Altman wants to meet the demand by reshaping the world of AI chips — with a project that could cost trillions and has a complex geopolitical backdrop.

Ross claims that some of what’s going on now in the GPU space is actually a response to things that Groq is doing. “There’s a little bit of a virtuous cycle,” he said. For example, “Nvidia has found sovereign nations are a whole thing they’re doing, and I’m on a five-week tour in the process of trying to lock down some deals here with countries…you don’t see this when you’re on the outside, but there’s a lot of stuff that’s been following us.”

He also pushed back boldly on Altman’s effort to raise as much as $7 trillion for a massive AI chip project. “All I’ll say is that we could do it for 700 billion,” he said. “We’re a bargain.”

He added that Groq could also contribute to the supply of AI chips, with plenty of capacity.

“By the end of this year, we will definitely have 25 million tokens a second of capacity, which is where we estimate OpenAI was at the end of 2023,” he said. “However, we’re working with countries to deploy hardware which would increase that number. Like the UAE, like many others. I’m in Europe for a reason — there are all sorts of countries that would be interested in this.”
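To put that 25-million-tokens-per-second figure in context, here is a minimal sketch of how many full-speed user streams such aggregate capacity could serve, assuming each stream runs at the roughly 500 tokens/sec rate from the viral demo (the per-stream rate is an assumption for illustration):

```python
# Capacity sketch: simultaneous full-speed streams that 25M tok/s of
# aggregate throughput could serve, assuming ~500 tok/s per stream.
AGGREGATE_TPS = 25_000_000   # Ross's projected end-of-year capacity
PER_STREAM_TPS = 500         # assumption: demo-speed rate per user

concurrent_streams = AGGREGATE_TPS // PER_STREAM_TPS
print(concurrent_streams)  # 50000
```

In other words, that capacity would amount to on the order of tens of thousands of users all receiving demo-speed output at once, which is why Ross frames it relative to where he estimates OpenAI stood at the end of 2023.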

But in the meantime, Groq also has to tackle mundane present-day issues — like getting people to pay for the API in the wake of the company’s viral moment last week. When I asked Ross if he planned on figuring out Groq’s API billing, Ross said “We’ll look into it.” His PR rep, also on the call, quickly jumped in: “Yes, that will be one of the first orders of business, Jonathan.”

VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.


