How Nvidia created the chip driving the generative artificial intelligence boom.
In 2022, US chipmaker Nvidia released the H100, one of the most powerful processors it has ever built – and one of the most expensive, costing around $40,000 each. The rollout seemed ill-timed, as businesses scrambled to cut spending amid rampant inflation.
Then, in November, ChatGPT was launched.
“We turned from a pretty tough year last year to an overnight turnaround,” Nvidia CEO Jensen Huang said. OpenAI’s successful chatbot was an “aha moment,” he said. “It created immediate demand.”
ChatGPT’s sudden popularity has set off an arms race among the world’s top tech companies and startups, which are rushing to acquire the H100 – which Huang describes as “the world’s first computer [chip] made for generative artificial intelligence” – artificial intelligence systems that can quickly create human-like text, images and content.
The value of getting the right product at the right time became apparent this week. On Wednesday, Nvidia forecast sales of $11 billion for the three months ending in July, more than 50 percent ahead of Wall Street’s previous estimates, thanks to a revival in data center spending by Big Tech and surging demand for its AI chips.
Investors’ response to the forecast added $184 billion to Nvidia’s market capitalization in a single day on Thursday, taking what was already the world’s most valuable chip company to nearly $1 trillion.
Nvidia is an early winner in the astronomical rise of generative artificial intelligence, a technology that threatens to transform industries, deliver massive productivity gains and displace millions of jobs.
This technological leap will be accelerated by the H100, which is based on a new Nvidia chip architecture dubbed ‘Hopper’ – named after American programming pioneer Grace Hopper – and has suddenly become the hottest commodity in Silicon Valley.
“This whole thing kicked off when we started production on Hopper,” Huang said, adding that large-scale production began only a few weeks before ChatGPT’s debut.
Huang’s confidence in continued growth stems in part from the fact that the chipmaker can work with TSMC to increase production of the H100 to meet exploding demand from cloud providers such as Microsoft, Amazon and Google, internet groups such as Meta, and enterprise customers.
“It’s one of the most scarce engineering resources on the planet,” said Brannin McBee, chief strategy officer and founder of CoreWeave.
Some customers have waited up to six months to receive the thousands of H100 chips they want to use to train their large AI models. AI startups have expressed concern that there will be a shortage of H100s just as demand peaks.
Elon Musk, who has bought thousands of Nvidia chips for his new AI startup X.ai, told a Wall Street Journal event this week that GPUs (graphics processing units) are currently “much harder to get than drugs”, joking that this “wasn’t really a high bar in San Francisco”.
“The cost of computing has become astronomical,” Musk added. “The minimum ante should be $250 million in server hardware [to build generative AI systems].”
The H100 has proven particularly popular with big tech companies like Microsoft and Amazon, which are building entire data centers focused on AI workloads, as well as generative AI startups like OpenAI, Anthropic, Stability AI, and Inflection AI, because it promises greater performance that can speed up product launches or reduce training costs over time.
“In terms of access, yes, this is what it looks like when you ramp up a GPU with a new architecture,” said Ian Buck, head of Nvidia’s hyperscale and high-performance computing business, who has the daunting task of increasing H100 supply to meet demand. “It’s happening at hyperscale,” he added, with some big customers seeking tens of thousands of GPUs.
The unusually large chip, an “accelerator” designed to operate in data centers, contains 80 billion transistors, five times as many as the processors powering the latest iPhones. Although twice as expensive as its predecessor, the A100 released in 2020, early adopters say the H100 boasts at least three times the performance.
“The H100 solves the scalability issue that has so far plagued [AI] model makers,” said Emad Mostaque, co-founder and CEO of Stability AI, one of the companies behind the image generation service Stable Diffusion. “This is important because it allows us all to train larger models faster, as it turns from a research problem into an engineering problem.”
While the timing of the H100’s launch was ideal, Nvidia’s breakthrough in artificial intelligence dates back nearly two decades to software innovation rather than silicon.
Created in 2006, Nvidia’s Cuda software allows GPUs to be used as accelerators for workloads other than graphics. Then, around 2012, Buck explained, “AI found us.”
Researchers in Canada discovered that GPUs were ideally suited to creating neural networks – a form of AI inspired by the way neurons interact in the human brain – which then became a new focus for artificial intelligence development. “It took us almost 20 years to get to where we are today,” Buck said.
Nvidia now has more software engineers than hardware engineers, enabling it to support the many AI frameworks that have emerged in the years since and to make its chips more efficient at the statistical computation needed to train AI models.
Hopper was the first architecture optimized for “transformers”, the approach to AI that underpins OpenAI’s “generative pre-trained transformer” chatbot. Nvidia’s close cooperation with artificial intelligence researchers allowed it to spot the emergence of the transformer in 2017 and start tuning its software accordingly.
“Nvidia arguably saw the future ahead of everyone else when it came to making GPUs programmable,” said Nathan Benaich, general partner at Air Street Capital, an investor in AI startups. “It spotted an opportunity, bet big, and consistently outpaced its competitors.”
Benaich estimates that Nvidia has a two-year head start on its rivals, but adds, “Its position is far from unassailable in both hardware and software.”
Stability AI’s Mostaque agrees. “Next-generation chips from Google, Intel and others are catching up, [and] with software standardization, even Cuda will become less of a moat.”
To some in the AI industry, Wall Street’s enthusiasm this week looks overly optimistic. Still, “for now,” said Jay Goldberg, founder of chip consulting firm D2D Advisory, “the AI semiconductor market looks set to remain a winner-takes-all market for Nvidia.”
Additional reporting by Madhumita Murgia