When did NVIDIA Corporation (NVDA) hold its June 2, 2024 earnings call?

NVIDIA Corporation held its June 2, 2024 earnings call on June 2, 2024.

How can I access NVIDIA Corporation earnings call transcripts via API?

NVIDIA Corporation (NVDA) earnings call transcripts and historical archives are available through the EarningsCalls.dev API. The API returns full transcripts, speaker segments and AI-generated summaries for NVIDIA Corporation and 9,000+ listed companies across 70+ countries, typically within ~30 minutes of the call's publication.

NVIDIA Corporation (NVDA) Earnings Call Transcript & Summary

June 2, 2024

executive

#10

So a robotic factory is designed with 3 computers: you train the AI on NVIDIA AI, you have the robot running on the PLC systems for orchestrating the factories, and then you, of course, simulate everything inside Omniverse. Well, the robotic arm and the robotic AMRs are also the same way, 3 computer systems. The difference is the 2 Omniverses will come together. So they'll share 1 virtual space. When they share 1 virtual space, that robotic arm will be inside the robotic factory. And again, 3 computers, and we provide the computer, the acceleration layers and pretrained AI models. We connected NVIDIA Manipulator and NVIDIA Omniverse with Siemens, the world's leading industrial automation software and systems company. This is really a fantastic partnership, and they're working on factories all over the world. SIMATIC Pick AI now integrates Isaac Manipulator, and SIMATIC Pick AI runs and operates ABB, KUKA, Yaskawa, Fanuc, Universal Robots and Techman. And so Siemens is a fantastic integration. We have all kinds of other integrations. Let's take a look. [Presentation]

Jensen Huang

executive

#11

Robotics is here. Physical AI is here. This is not science fiction, and it's being used all over Taiwan and just really, really exciting. And that's the factory, the robots inside and, of course, all the products are going to be robotic. So there are 2 very high-volume robotics products. One, of course, is the self-driving car or cars that have a great deal of autonomous capability. NVIDIA again builds the entire stack. Next year, we're going to go to production with the Mercedes fleet, and after that in 2026, the JLR fleet. We offer the full stack to the world. However, you're welcome to take whichever parts, whichever layer of our stack, just as the entire drive stack is open. The next high-volume robotics product that's going to be manufactured by robotic factories with robots inside will likely be humanoid robots. And there has been great progress in recent years in both the cognitive capability, because of foundation models, and also the world understanding capability that we're in the process of developing. I'm really excited about this area because obviously, the easiest robots to adapt to the world are humanoid robots because we built the world for us. We also have vastly more data to train these robots than other types of robots because we have the same physique. And so the amount of training data we can provide through demonstration capabilities and video capabilities is going to be really great. And so we're going to see a lot of progress in this area. Well, I think we have some robots that we'd like to welcome. Here we go. About my size. And we have some friends to join us. So the future of robotics is here, the next wave of AI. And of course, Taiwan builds computers with keyboards. You build computers for your pocket. You build computers for data centers in the cloud. In the future, you're going to build computers that walk and computers that roll around. And so these are all just computers. 
And as it turns out, the technology is very similar to the technology of building all of the other computers that you already built today. So this is going to be a really extraordinary journey for us. Well, I want to thank -- I want to -- I want to thank -- I've made one last video, if you don't mind, something that we really enjoyed making. And if you -- let's run it. [Presentation]

Jensen Huang

executive

#12

[Foreign Language] Thank you. Thank you all for coming. Have a great COMPUTEX.

Marc Hamilton

executive

#13

Welcome, everyone. It's my great privilege to be kicking off the COMPUTEX Forum this morning. Last week, one of my engineers was showing me a computer-generated, an AI-generated news post on a social media site, and it said, Marc Hamilton, the famous actor, will be kicking off COMPUTEX Forum talking about AI. And the post went on to say even what movies I had starred in. I can assure you, I am not a famous actor. I actually work for NVIDIA and I build, help our customers and our partners build the AI factories that Jensen has been talking about. Those of you who know NVIDIA and know NVIDIA culture will know that we like to do. We don't like to talk. And so we talk quickly and we talk concisely, and we explain in simple terms what we do. So one day, several months ago, Jensen wrote a very short email, and he said, "Marc, what your team does is IBTG, infra, build, train, go." IBTG, and we use this now at NVIDIA. An AI factory is about the infrastructure. The infrastructure is, of course, not just many boxes of GPUs. The infrastructure has to be built, and we need all of our ecosystem partners to help us build that infrastructure and then it needs to be assembled into an AI factory. And why? As soon as you assemble it, you want to start training models. And after you train the models, you go. You run the models, you run your business, you start making money. So let's go ahead and get started, and I'll talk a little bit about how my team works with our partners, works with the ecosystem to help IBTG. These observations are not from PowerPoint slides or studies. This comes from installing hundreds of thousands of GPUs in AI factories. Our first AI factory at NVIDIA was built 8 years ago, 2016. My team and other engineers at NVIDIA started to build our first AI factory. We had just introduced at COMPUTEX, the P100, the NVIDIA P100 GPU. I'll show some compares of how we've gone in 8 years. 
But we've also learned many, many things in those 8 years about installing these systems. An AI factory is not a regular data center or regular cloud, it is AI computing. And AI computing is not just about the GPU or adding the GPU to the server. An AI factory is an end-to-end solution. It is nearly impossible to simply build one part of the factory. If you only build a GPU, if you only build some software, if you only build part of the network, it's impossible to think through how you optimize from end to end. And it is, in fact, because NVIDIA, for 8 years, has been building our own AI factories for our engineers to use. If you listen to the keynote or read at all about NVIDIA, you know how we don't just sell AI factories, we don't just build AI factories, but we use AI factories. We build robotics software. We build AI software for autonomous vehicles. We build AI into our platforms for PC gaming and computer gaming that so many enthusiasts across Taipei, across Taiwan love. What are some of the lessons? How do I measure my team when they build hundreds of thousands of GPUs into hundreds of AI factories? We've actually come up with 3 very simple metrics. This is one of the things I love about NVIDIA. We take something very complicated, building the world's most complex AI factory to train Llama 3, Mistral, other leading open-source models, and we condense or summarize it into 3 things. The first, and I measure this for every AI factory I build, is time to first train, because if you're lucky enough to receive 1,000 or 10,000 GPUs, be it an end user, a cloud provider, another partner, you want to be able to, as quickly as possible, start training your models. You want to train your model so that you can be ahead of your competition, so that you can use the model to start running your business. Traditional supercomputers have often taken 6 months, 12 months, 18 months to set up. 
There has been, for many decades, a list of the world's TOP500 fastest supercomputers and many of these, of course, can run AI today if they are accelerated and be used. But these supercomputers were built by large national labs, by large universities where they had many PhD students, many researchers that -- their job was to build supercomputers. As AI factories move into the cloud, move into regular enterprises, we can't build every one of them differently like the TOP500 supercomputers. We need a recipe so that we can deploy them quickly. The GPU is simply too precious a resource to be deployed and sit in your data center while you try to assemble it. So time to first train. AI supercomputers actually are very good at traditional science as well. In fact, many of them end up being put on the TOP500 supercomputer list. There's a pharmaceutical company in the United States, Recursion Pharmaceutical. And several months ago, they ordered one of our DGX SuperPOD AI factories, and they waited a little bit too long. And so they were going to miss the deadline for the TOP500 list in June. So they came to us and said, "How quickly can you install it?" It was 512 GPUs, relatively small by the scale of systems today, but still relatively complex. That many GPUs, you might take months traditionally to set up. And the TOP500 deadline, by the time they receive their systems, was 1 week away, 5 days away. We were able to install their system in 1 week, run the TOP500 benchmark and get it on the top 50 of the TOP500 supercomputers. And it's there today. You can go look at top500.org and see Recursion Pharmaceutical. So time to first train. Second is GPU availability. Now you might say this is a funny metric. 100% of the GPU should be available. Now remember, Jensen talked about our latest GPU, the GB200 NVLink72. One rack, 600,000 components in that rack. And now many of those racks, millions of components in an AI factory. 
If you wait for every single one of those millions of components to be perfectly running, you will always be waiting. The software working together with the hardware has to work around failures in the system, be resilient to failures and continue to train the model, because these large models, 1 trillion parameter model may use tens of thousands of GPUs, many racks of GB200 for many weeks, maybe many months. Impossible to keep all the equipment going. So what is your availability at the time to first train? And finally, what is the time to train? To measure how good a job we did putting together the cluster and measure how good a job the OEM and our other partners did to make sure they followed our recipe, we compare against our internal supercomputer that is our reference, and I'll talk about some of our internal supercomputers. We get a time to train. We run this test for about 6 hours, can be run shorter or longer, depending on your needs. The test runs NVIDIA's own Megatron open-source LLM. Many people have maybe never heard about Megatron, certainly not as common as GPT or Llama or even Mistral, some of the other models. Because Megatron is open source, it's an open reference. Any OEM, any customer can get that benchmark, run it on their system, run it on their cloud and compare how good it is. And so for 250,000 GPUs that my team installs every few months, simply 3 numbers: time to first train, GPU availability and time to train. And with those 3 numbers, which we know for every size cluster, 1,000, 10,000, 20,000 and for every OEM, we can compare and judge how well we're doing. And why is that so important? It's, of course, so important because of the advances in generative AI that you're all hearing about. Now ChatGPT did a great thing. NVIDIA and many other companies had, of course, been working on generative AI for many years. [ Beta with ] researchers, it was in the lab, maybe some data scientists. 
The average CEO, the children and the parents of the average CEO had never heard of generative AI in large language models. And then in November of 2022, OpenAI released ChatGPT. And now not only every CEO, but the CEO's children or grandchildren and parents or grandparents for the younger CEOs, all of a sudden could see the power of GenAI. But of course -- and it's funny to talk about ChatGPT as that early or first generation of generative AI was relatively simple compared to what's being done today. It was simply text, chat, type in, type a few hundred words, get a few thousand words back. But now think about all the ways that we're learning to use generative AI. Generative AI is, of course, multimodal. It may not -- it could be text in, it could be audio in, it could be an image in, in any of those out. Think 100 words in, many images out. 100 words in, a multi-gigabyte video out. And then as GenAI applications are connected together with agents, you have one GenAI application calling another, calling another and interacting and then bringing you back a story operating a robot in a factory, showing a car how to drive down the street, automating many other tasks. So how does NVIDIA help customers move forward and get started? Beyond simply building GenAI factories. If we were to come in and build a GenAI factory or if you were to rent a GenAI factory from your favorite cloud provider, and all of the cloud providers work with NVIDIA and follow our blueprint for putting together these GenAI factories, then what do you do with that? Well, NVIDIA inference microservices or NIMs, a very short word, easy to pronounce, N-I-M, NIM. What is a NIM? Well, the challenge with -- the good thing about GenAI is that so much of it is done in the open source. You have, of course, proprietary models owned by companies. Not that proprietary is bad or ChatGPT, Gemini, other proprietary models. And then you have fully open-source models. 
In the open-source community, everyone can see as the code develops, as the models are released, can take advantage of those improvements in the model and other researchers can improve on top of that. And of course, companies will continue to [indiscernible] proprietary. And then when they're ready, release amazing new features for you to use and through APIs for their partners to use. But as a data scientist, as a developer in an enterprise, how do you get started with GenAI, with all these open-source models? A very good analogy is perhaps 20 years ago in the early days of Linux. Linux has, of course, open-source versions. The fact that the Linux kernel is open means that different companies around the world can go in and build Linux operating systems, add their value and distribute them. Red Hat, Red Hat Enterprise Linux, is perhaps one of the best-known enterprise distributions of Linux. And so what NVIDIA felt the world needed was not another GenAI model. NVIDIA, of course, has our own models under the Megatron name. Some are closed, some are open. There are great models out today and there are many open-source models. So we'll continue to develop our models when there's a need, but the open-source models are so good; what we needed was an operating system for AI. Just as an enterprise starting to use Linux went to an operating system provider to get an enterprise copy of Linux, where does an enterprise go to get an enterprise copy of everything you need to build and to run AI? So that is NVIDIA AI Enterprise, our AI operating system and our NVIDIA inference microservices. A microservice, of course, means that you do not need to install the entire operating system. You can install the entire operating system in your data center and run a microservice on top of it using our APIs or you can simply call an NVIDIA inference microservice that is running in any of the clouds. 
So again, what then besides -- what is at the core of an NVIDIA inference microservice? We take all of the leading open-source models: Llama 3, Mistral, on and on, our own models, and we optimize them for our GPUs, not just for one GPU, but for all of our GPUs. The A100, which is still in many clouds, the H100 and the B100 and of course, thinking today about how do we optimize models in the future for new Vera Rubin GPUs coming out in 2026. Only NVIDIA could be thinking today about how we optimize AI software for a chip that is not even built yet. So what's the result when we put the 2 of these together? These are some of the internal supercomputers that NVIDIA has built. I mentioned that our first supercomputer was built back in 2016. It was called SATURNV. At the time, like the SATURNV rocket from NASA, we thought it was a very big supercomputer. It turned out being the 30th fastest supercomputer in the TOP500 list at the time. There was no measure back in 2016 of how fast an AI factory or AI supercomputer is. Today, of course, there's the MLPerf set of benchmarks. And these supercomputers that you see here have been, year after year, #1 in performance on MLPerf. And in a sense, that's not quite -- that's not very surprising. NVIDIA builds the chips. We build much of the software that's inside the large language model. We build entire supercomputers. So of course, it makes sense that we should be able to optimize and be among the fastest in the world. So our new superchip is called Blackwell, and we've heard a lot about it in the last several months, so I won't go and repeat too much here. But again, it includes many components built by our partners. The chip itself is, of course, fabricated by TSMC. We work very closely with TSMC on the 4N process for the transistors. 
And again, as great as TSMC is, we take all of their tools, all of their knowledge, our engineers sit together here in Taiwan and across the world, and we figure out how to make the process even better for Blackwell and future chips. It uses fast HBM memory from the 3 leading HBM memory providers. I believe Micron is talking later today, which is one of our partners for HBM. And what's the result when we put Blackwell together with all of our experience building AI factories? Well, again, this is looking at it in several different levels. And the first is you'll say, "Well, Marc, NVIDIA's chips keep on getting more expensive." And that's true if you look at the price of an individual chip. In fact, for $1 billion back in 2016, you could have bought about 45,000 chips. And today, you can only buy about 11,000 chips. But look at the performance improvements that you get. The GPU performance is drastically improved, and the time it takes to train a model and the amount of power you use to run the model are drastically reduced. So again, continuing to look at energy efficiency and to look at performance and price performance, i.e., value, and delivering the best solutions. Our building blocks today for the AI factory of the future are around the GB200 NVLink72. As these systems get more complex, we have, again, so many great partners here in Taiwan that are building these individual parts, taking the NVLink Switch tray, taking the compute tray, taking the superchip module and building it into servers. As I said, it's not just about the server. It's understanding how all the connections work and how this works all the way out to the software. Thinking about just one component, NVLink. There's been a lot of discussion about NVLink in the last few days. NVLink is not just a wire connecting two GPUs. NVLink, now in its fifth generation, is the entire end-to-end system for connecting the 72 GPUs in the rack. 
Jensen showed the large wiring connectors, 6 feet tall, the ability to send a signal out of one GPU through a set of NV switches through 6 feet of copper out to another GPU. NVLink in a sense is -- starts with software inside the GPU that initiates the communication, goes through the entire set of switches and cables, comes out the other end. It makes those 2 GPUs, those 72 GPUs act like one giant GPU. Now building the AI factory is a lot more than just putting the servers in the rack. In fact, the rack comes prebuilt in the factory, 2 miles of copper cables in the back of the rack. It weighs several thousand kilograms. You have tens of gallons of cooling liquid per minute flowing through the rack all the way to the GPUs, and it consumes 120 kilowatts of power. What it saves is thousands of kilowatts of power compared to older GPUs. Again, this system will cost millions of dollars, will start being deployed soon this year. And we're already in the process of planning very large data centers around the world for our partners, very large AI factories that will be deployed in every large cloud, in many regional clouds and in private data centers by private companies. So if you're not quite ready for this 72-GPU rack, we, of course, have smaller ways to get started. One of the new products we announced here this week at COMPUTEX was the GB200 NVLink 2. It's an industry standard MGX server that fits in any standard rack and it puts together 2 of the GB200 Superchip modules connected by NVLink. Very easy to deploy, very easy to scale out and ideal for inference. Remember, as Jensen has said, training the AI model, that can be done anywhere subject to constraints of where your data is. AI doesn't care where it goes to school. You don't always have to train your models in the city or the country or the data center. You can move that training to places where energy is less costly and more available. 
But then when you run your model, you want to run your model as close as possible to where the activity is. So this GB200 NVLink 2 being able to fit in a standard server in any data center of the world is super important. Now going the other way, to build an entire data factory, one rack of NVLink72 will not be enough. You, of course, read every day about how companies are using thousands or tens of thousands of GPUs to build AI models. So how do you connect together multiple of these racks? Again, this is very different from a normal data center or normal cloud where you would simply take your racks of web servers or database servers and connect them together with regular networking. An AI factory is, of course, not regular compute, it's accelerated compute. So the DGX SuperPOD with DGX GB200 systems is a blueprint on how you can build an AI factory. This particular blueprint scales up to 576 GPUs in one scalable unit. And then you can put multiple scalable units together: with the 576-GPU building block, you can go up to 9,000 GPUs. Now of course, as I said, you've heard of customers using more than 9,000 GPUs today. So we continue to scale up. You can put together using a building block of 1,152 GPUs. You can build up to 18,000 GPUs. So this will be one of our standard AI factories deploying up to 18,000 GPUs. And to anyone in the audience that would like to build an AI factory of more than 18,000 GPUs, don't worry. Please come talk to your NVIDIA representative or to your OEM partner and we can -- we have designs to scale beyond this as well. And finally, while many people are used to InfiniBand networking, which Jensen showed continuing on our road map, for people who absolutely want to or would like to use Ethernet, we can now build AI factories with our Spectrum Ethernet. So I'd like to thank everyone for being here today, thank all of our partners that make the NVIDIA AI factories possible. 
We cannot build any of the AI factories, we cannot IBTG without these partners. So thank you very much.
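
[Editor's note] Two figures quoted in this talk imply some simple arithmetic that is easy to check: the "$1 billion" chip comparison and the scalable-unit sizes for the DGX SuperPOD blueprint. The sketch below is not part of the talk. The function names are hypothetical, and rounding up to whole scalable units is an assumption, since the speaker's 9,000 and 18,000 are round figures rather than exact counts.

```python
import math


def implied_price_per_chip(budget_usd: float, chips: int) -> float:
    """Average price per chip implied by 'this budget buys N chips'."""
    return budget_usd / chips


def scalable_units_needed(target_gpus: int, gpus_per_unit: int) -> int:
    """Whole scalable units needed to reach at least target_gpus.

    ASSUMPTION: clusters are built from whole units, so we round up.
    """
    return math.ceil(target_gpus / gpus_per_unit)


# Figures as quoted in the talk.
BUDGET_USD = 1_000_000_000
price_2016 = implied_price_per_chip(BUDGET_USD, 45_000)
price_2024 = implied_price_per_chip(BUDGET_USD, 11_000)

# 72 GPUs per rack and 576 GPUs per scalable unit give 8 racks per unit.
racks_per_unit = 576 // 72

units_for_9k = scalable_units_needed(9_000, 576)
units_for_18k = scalable_units_needed(18_000, 1_152)

if __name__ == "__main__":
    print(f"Implied price per chip, 2016: ${price_2016:,.0f}")
    print(f"Implied price per chip, 2024: ${price_2024:,.0f}")
    print(f"Racks per 576-GPU scalable unit: {racks_per_unit}")
    print(f"576-GPU units for ~9,000 GPUs: {units_for_9k} ({units_for_9k * 576:,} GPUs)")
    print(f"1,152-GPU units for ~18,000 GPUs: {units_for_18k} ({units_for_18k * 1_152:,} GPUs)")
```

Under these assumptions, both quoted cluster sizes correspond to 16 building blocks, which suggests the 9,000 and 18,000 figures are rounded from 9,216 and 18,432 GPUs.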

Unknown Attendee

attendee

#14

Thank you, Mr. Hamilton, for your inspiring presentation.

For developers and AI pipelines

Programmatic access to NVIDIA Corporation earnings transcripts and 32,000+ others is available through the EarningsCalls.dev REST API. Plans from $24.99/month — full transcripts, speaker segments, full-text search, and the recently-added /api/v1/transcripts/recent polling endpoint for ETL pipelines.
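
As a rough illustration of how a polling client for the endpoint named above might be prepared, here is a minimal sketch using only the Python standard library. Only the path `/api/v1/transcripts/recent` comes from the text. The host `https://earningscalls.dev`, the Bearer-token header, and the `symbol` query parameter are assumptions made for illustration and should be checked against the provider's documentation. The sketch builds the request without sending it.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlencode, urljoin

# ASSUMPTION: host inferred from the product name; not confirmed by the text.
BASE_URL = "https://earningscalls.dev"
# Path as named in the text above.
RECENT_PATH = "/api/v1/transcripts/recent"


def build_recent_request(
    api_key: str, params: Optional[dict] = None
) -> urllib.request.Request:
    """Build (but do not send) a GET request for the recent-transcripts endpoint.

    ASSUMPTIONS: Bearer-token authentication and any query parameters
    passed in `params` are hypothetical.
    """
    url = urljoin(BASE_URL, RECENT_PATH)
    if params:
        url = f"{url}?{urlencode(params)}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
            "Accept": "application/json",
        },
        method="GET",
    )


if __name__ == "__main__":
    # "symbol" is a hypothetical filter parameter used only for illustration.
    request = build_recent_request("YOUR_API_KEY", {"symbol": "NVDA"})
    print(request.get_method(), request.full_url)
```

To actually poll, the request could be passed to `urllib.request.urlopen`; the response format is not described in the text, so parsing is left out here.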
