NVIDIA Corporation (NVDA) Earnings Call Transcript & Summary

June 4, 2025

NASDAQ US Information Technology Semiconductors and Semiconductor Equipment conference_presentation 45 min

Earnings Call Speaker Segments

Vivek Arya

analyst
#1

Good morning, everyone. Thank you so much for joining us on day 2 of the BofA Securities Global Technology Conference. I'm Vivek Arya. I cover semiconductors and semicap equipment here at BofA. And I'm absolutely delighted and honored to have Ian Buck, the Head of Accelerated Computing at NVIDIA, join us for this keynote. I think most of you are probably familiar with Ian, but if not, Ian heads all the hardware and software product lines, third-party enablement and marketing activities for GPU computing at NVIDIA. He joined the company in 2004. Same year, I joined Merrill Lynch. So I guess that's the only thing we have in common, I believe. And he created CUDA, which remains the established leading platform for accelerated parallel computing. And before joining NVIDIA, he was a development lead on Brook, which is a forerunner to generalized computing on GPUs. So we are absolutely thrilled to have Ian with us. And before I get into the Q&A, I was just asked to read a brief statement. So as a reminder, this presentation contains forward-looking statements, and investors are advised to read NVIDIA's reports filed with the SEC for information related to risks and uncertainties facing their business. So with that, a very warm welcome to you, Ian. Really appreciate having you. This is, I think, our third keynote session. So I really appreciate you joining us.

Ian Buck

executive
#2

Yes. We're running in the AI time. So I've -- a year ago, feels like a lifetime. One of the most challenging parts of my job often is to try to predict the future. But AI is always surprising us.

Vivek Arya

analyst
#3

That's right. Bigger and better.

Vivek Arya

analyst
#4

So NVIDIA -- sorry, Ian, let's just start with the big news that kind of rocked at least Wall Street early this year, which was the DeepSeek moment. So how much of that news was a surprise to you, right, because you have followed the industry for a long time. And what does it really mean for investors who are looking at that as some big seminal game-changing moment. So what are the positive and negative implications of that DeepSeek moment from your perspective?

Ian Buck

executive
#5

DeepSeek was -- there are a couple of inflection points in AI, for sure. You can go back to the original Google cat moment where AI recognized cats. You can go through the ResNet moment, you can go through the ImageNet moment. In 2022, we had the ChatGPT moment, which I'm sure the investor community all noticed as well. This -- and in January, we had the DeepSeek moment. DeepSeek itself wasn't a surprise. I think the company, DeepSeek and High-Flyer have been around for a while. I think if you look at the history of the papers they've been publishing, it is amazing work. Actually, they're one of the best CUDA developers out there in terms of getting all the way down. And if you read that DeepSeek-R1 paper and the V3, which it was based on, the amount of optimization that they've done for GPUs, for NVLink, for GPUDirect RDMA for sending data across from the GPU over PCIe to the NIC over NVLink to build a training and inferencing platform and solution and technology is truly amazing. What -- the moment though, that really activated was reasoning. It was the first open world-class reasoning model, and it was truly open. They explained how they built, how they trained it and the optimizations that they did to make it, to train it at the level of intelligence and optimize the execution of the training and inference stack. And there's some amazing graphs in that paper that taught it. It basically -- it was a barn door moment for reasoning models in AI. And today, I think the world would agree, you can't really publish or celebrate a new model without it being a reasoning model. Reasoning wasn't new, OpenAI had been publishing papers about using reasoning. o3, o4-mini, excellent -- and Gemini, all reasoning models. But DeepSeek really made it ubiquitous, open, and democratized it. The implications for and the impact was not understood when it got launched. First off, it -- by being open, anyone can run it anywhere. Today, DeepSeek-R1 is, call it, $1 per million tokens.
Whereas a traditional LLM, like Llama 70B, might be $0.60 per million tokens. It's a big model, 671 billion parameters. I think it's 38 billion active parameters. It has over 120-plus odd layers and 250 experts or the shared expert. Like that is like stuff that only folks like Gemini or OpenAI that level of complexity and technology, you had a world -- truly world-class open model. Running that level of complexity is really hard. What I've -- what has happened is now that -- and what makes reasoning so useful is the fact that your output tokens, you let the model think, you teach the model to think and really kind of think out loud. If you've ever used DeepSeek-R1, it's quite amusing to watch it think. It actually is just talking out loud asking itself questions. It's actually trained itself to come up with an answer by thinking out loud and then it doesn't give you that answer right away. It actually -- you can see it, it checks the answer. So it's taught itself to check the answer and make sure it's right by double checking its math. And then it doesn't give you the answer again, it checks it a second time. And that's very intentional. They actually train the model to think for as long as it can until it comes up with an answer, check it once, check it twice and then give you the answer. As a result, we're seeing an explosion in the number of tokens generated. You ask Llama a question. You get an answer back at about 100 words. That's it. You paid for those hundred words of -- or, call it, 200 some-odd tokens, $0.60. DeepSeek, you're actually -- it reasons for about 1,000 words, and then it gives you that 100 word answer, and it's right. And while all those tokens you're paying for, by the way, you value at $1. So in general, DeepSeek has kind of made every model a reasoning model. The inference demand as a result has kind of exploded. The opportunity for multi-GPU, multi-node inference is everywhere.
It's actually a great time for GB200 because of all those GPUs connected with NVLink and Blackwell, and you're seeing that now. With the increase in value, even of a free open model like DeepSeek-R1 at $1 per million tokens, it generates about 13x more tokens, 13x more tokens. That's like 20x more total market opportunity for inferencing because of reasoning. Actually, they just announced a new rev of DeepSeek-R1. On the AIME math benchmark, they were getting about 70% accuracy, 69% or 70% accuracy. It's kind of like a C minus, 70% is like you're getting 2 out of 3 questions right. That's not that great. They just did -- the new one they did, they just updated the R1, same model, better weights, the same cost, is now 89% accurate. So they went to kind of a B+, which is basically 9 out of 10 questions right, versus 2 out of 3. And the way they did that, they taught the model to think longer. So they just doubled the number of tokens they're generating, and how much thinking out loud they did. So again, as these models are getting smarter, it is driving more output tokens, more thinking and more opportunity for token revenue.
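The "13x more tokens" and "about 20x more market opportunity" figures in the answer above can be reconciled with a little arithmetic. The sketch below uses only the round numbers quoted in the answer ($0.60 and $1 per million tokens, roughly 200 tokens for a short answer, a 13x token multiplier); the variable names and the per-query framing are illustrative assumptions, not NVIDIA's own model.

```python
# Illustrative arithmetic only. All figures are the round numbers quoted
# in the answer above; the names and per-query framing are assumptions.
LLAMA_PRICE_PER_M = 0.60   # dollars per million tokens, traditional LLM
R1_PRICE_PER_M = 1.00      # dollars per million tokens, reasoning model
TOKEN_MULTIPLIER = 13      # reasoning model emits about 13x more tokens


def cost_usd(tokens, price_per_million):
    """Cost in dollars of generating `tokens` output tokens."""
    return tokens * price_per_million / 1_000_000


llama_tokens = 200                          # a ~100-word answer
r1_tokens = llama_tokens * TOKEN_MULTIPLIER  # thinking plus the answer

llama_cost = cost_usd(llama_tokens, LLAMA_PRICE_PER_M)
r1_cost = cost_usd(r1_tokens, R1_PRICE_PER_M)

# Revenue per question scales with both the token count and the price.
revenue_multiplier = r1_cost / llama_cost

print(f"traditional answer: ${llama_cost:.6f}")
print(f"reasoning answer:   ${r1_cost:.6f}")
print(f"revenue multiplier: {revenue_multiplier:.1f}x")
```

Under these assumptions the multiplier is 13 times the price ratio (1.00 / 0.60), which lands a little above 20x and is consistent with the "like 20x" remark.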

Vivek Arya

analyst
#6

Do you think anything that DeepSeek is doing or what's happening in China as a proxy for let's call it, CapEx constrained computing. So there is a lot more effort being made to make these things a lot more efficient because they may not have access. Do you think they are able to bend the cost curve in a way that has implications on how much spending needs to happen in this industry?

Ian Buck

executive
#7

No. Actually the opposite. They just made it public. What everyone was doing, they just talked about in an academic paper. Computing has always been constrained. Access to compute, amount of compute, dollars of compute, capital expenditure of compute. The AI race is about, regardless of how much compute you have, how efficiently you're using it, how intelligently you're using it and how much value you bring. Everybody wants -- wanted Hopper with the ChatGPT moment. That wasn't unique to DeepSeek. It was around the world. It's just do you have the engineering talent to capitalize on it, to advance, to code your CUDA, do your -- know your InfiniBand, know your NVLink, know your -- optimize your transformer layer. One of the big innovations that DeepSeek did is they used a new technique called MLA, which actually is a statistical method for approximating the weights and the KV layers of the transformer layer. It wasn't a new idea. It had actually been deployed in image generation, all those fun, drawing a picture of a teddy bear, swimming an Olympic lap. They were using this MLA statistical technique, but it compressed the bejesus out of the transformer layer, made it a lot cheaper by approximating and they were able to apply it to DeepSeek-V3 and R1. That was the first time it had been publicly talked about. Trust me, these methods are being deployed and optimized, just not everyone wants. DeepSeek themselves are doing the world a favor by sharing some of the state-of-the-art research they're doing with the world. But it's happening everywhere. And it was happening back in Hopper. It was happening even back in the A100 days as well.

Vivek Arya

analyst
#8

Got it. You talked with a lot of cloud customers. Many of them are developing their frontier models. Are you seeing any kind of saturation or diminishing returns in the size of -- the benefits from increasing the size of these models. There was this kind of public story about Meta's large language model where they are not getting enough ROI on it, right? So do you see any saturation in the effectiveness of these models that again, because what this community cares about is CapEx at the end of the day. So is there anything that is happening from a Western large language model perspective, that gives you a pause on how long and how big can Western AI CapEx be?

Ian Buck

executive
#9

So the -- I won't get too hung up on the Behemoth question. There is -- Behemoth is an open model. There's a competition in the open space. It's hard to launch a model if it's not world class, and it relates to your brand and what you're doing versus what -- how is it compared to all the models that are out there. What I am seeing right now is the drive toward first reasoning models. They just add so much more value. They're able to think and solve a problem. And that is only based upon 2 things. One is how much knowledge they know, which is the size of the model, and how good are they at thinking using that knowledge to come up with an answer to a question. Traditional LLMs simply regurgitated what they knew. Traditional Llama 70B, 70 billion parameters, it was trained on the corpus of the Internet. When you ask a question, it is really just trying to reconsolidate the information it knows and answer your question, but it can't really think. And the -- what the DeepSeek and the other models are doing right now is they take the corpus of the Internet, they use that information to think and answer your question. So what I'm seeing -- and the more they know the quicker they can think, the more accurate the answer they come up with or the cheaper their answer is. So we have a conflation of taking all of the knowledge that they know and baking it into the model repeatedly. And the more questions get asked, the more data now or answers, they can invest into the model itself. We don't need to -- like you and I don't need to know that 50 plus 50 is 100. But that's because we just know it. A first grader needs to actually do the math and carry the 1 and they could happen. But once they've done that, it's now part of their inherent knowledge. Think about ChatGPT, think about Grok. Think about Meta AI. Every time someone is asking a question, they are expanding the corpus of -- they think about that answer.
Now that answer gets baked into the model itself and the models are constantly training, and retraining, retraining, retraining. So they are both inferring, adding -- making money or adding value to the customers and also make it being smarter and their intelligence, how much they know is strictly the size of their model. So that's why the models when we were talking last year 100 billion parameter model plus was a rarity. Now 100 billion is kind of like sort of table stakes going to 600 billion. And obviously, we have models out there that are in the trillion, but they're not open. So the -- that's because they're adding value. There's a benefit to that model being smarter to answer the question quicker or answer more valuable questions even further. The tricks that are happening are the tricks in executing the model. The MoE experts, which is a hard thing to do actually picking through the whole model, which parts of that knowledge I should pull from and compute on versus skip is where a lot of the innovation is happening. So there's a little bit of this race right now of model size and active parameters, traditional LLMs, they're not MoE. They just compute on every piece of knowledge they know. And you and I both know that's not very efficient to take all the knowledge you know, and process it relative to what my answer is. So the question of inference, so that's what experts are. They split the model up in little pieces. And throughout the whole thinking path, they're trying to prune and only pull in the right parts. And DeepSeek made public what a lot of what we were doing, which is having the experts in every layer of the stack. So we kind of are -- the models are getting bigger. The active parameter -- it's a race between that and the active parameters to answer your question. You're only seeing a small glimpse in the public papers of what the true behind the scenes world-class work has actually been able to do.
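The expert routing described above, where only a few pieces of a large model are computed for a given input, can be sketched as a toy top-k mixture-of-experts layer. This is a minimal illustration under stated assumptions: the experts are made-up scalar functions, the router scores are supplied by hand rather than learned, and none of it reflects DeepSeek's or any vendor's actual implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by weight."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]


# Hypothetical toy setup: 8 experts, each just scales its input.
NUM_EXPERTS = 8
experts = [lambda x, s=s: x * (s + 1) for s in range(NUM_EXPERTS)]

# Hand-written router scores; a real router computes these per token.
router_scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.7]

y, used = moe_layer(3.0, experts, router_scores, k=2)
active_fraction = len(used) / NUM_EXPERTS

print("experts used:", used)
print("fraction of experts computed:", active_fraction)
```

With 2 of 8 experts selected, only a quarter of the expert parameters are touched for this input, which is the gap between total model size and active parameters that the answer refers to.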

Vivek Arya

analyst
#10

So a year from now, how large of model sizes will you be talking about?

Ian Buck

executive
#11

We're already using trillion parameter models today. You just don't know it. The active parameters are highly variant and the techniques and every piece of idea that you can use to trim how much compute you use, like you said in your previous question, is being applied, researched, figured out. What then happens is that the other way of optimizing for compute is distillation. So you take the trillion parameter model. And if you fine-tune -- you can limit the use case or limit the application to a vertical or a narrow work space, you can get -- you can reduce down to a 70 or 7 billion parameter model, and there's lots of that. Quick small models like for doing search text, when you type in your text on your phone, it's expanding the sentence for you, that's a very small model, which can be finely tuned to you, personalized to you and what you may be doing at that moment. So we see an explosion like this of vertical models. Hugging Face right now, I can't remember there's -- if you search for Llama on Hugging Face, you're going to find bazillions of distilled models. By the way, all those distilled models also need to be computed on and they're constantly being regenerated. One of the big consumers of GPU is distillation, taking a big model, running inference on it, creating smaller models. So we are -- they start from a really highly intelligent one, and they distill down. So I think we're all getting to 1 trillion parameter models now. There's talk of when do we get to the 10T and how many active parameters, and what that model actually looks like in terms of the optimization stack is pretty funky.

Vivek Arya

analyst
#12

The next topic, Ian, would love to get your perspective on is NVIDIA's competitiveness as the world moves to more inference, right? In that, training, I think there is recognition that NVIDIA has done an outstanding job. But as we go to inference, there's a fragmentation of workloads, optimization, et cetera, et cetera. One of your GPU competitor has added a lot more high-bandwidth memory, and they are saying, that's better for inference. There's a whole bunch of start-ups, right, who are promising lower cost per token, et cetera. So how do you view NVIDIA's competitiveness when it comes to the inference market? And even if we could compare it against a lot of the ASIC players that are out there.

Ian Buck

executive
#13

It's a good question. NVIDIA thrives at things that are hard. We just do. We're an engineering and technology company. I've got a boss who is passionate about solving the hard problems and letting other people make money and innovate on top of what we can provide as a platform. And my life is, I want to update my bio, I'm just a platform guy. I'm just constantly building technology platforms to help other people make money. The inference is really hard. It's wickedly hard. It's actually, in many cases, while training is hard for different reasons, trying to do 100,000 GPUs or going to 1 million GPU distributed training clusters and keeping that thing going at scale is a data center scale, reliability, networking, 1 giant GPU problem. Inference is a myriad of optimizations. You start with numerical precision, 32-bit floating point, 16-bit floating point, 8-bit floating point, 4-bit floating point, just to be -- if we can use the opportunity, Blackwell has 20 petaflops of FP4 per GPU. Petaflops, that's a lot. The fastest supercomputer in the world is measuring exaflops, which is only 1,000 petaflops, we got that in FP4. But making 4 bits work and come up with the right answer, you only have 4 zeros and ones, like that's not a lot of numbers. So that mathematically, numerically getting an accurate answer by using only that is -- requires expertise in numerical and quantization primitives that are extremely complicated. Go up from there, you have that -- you now distribute the model. The model isn't on a single GPU, single piece of silicon, I don't care who you are, and in order to get performance, you have to have multiple -- you have to connect multiple chips together to run in parallel within the node. And then if you're going to do the high-value models, you're going to actually have to run multi-node and connect them all together. And you've seen how complex and we share how complex the GB200 NVL72 is. On top of that, you have diversity of workload.
An AI factory is not going to run just 1 model all day long. It's easy to benchmark on one model, it is easy to optimize for one model, and certainly, it can be easy to build. If you want to run just 1 thing, you could build just -- you can tune your architecture for that, but AI factories are going to run every kind of model and the models are going to change. You're buying a $1 billion AI factory, you're going to need to capitalize that expenditure for 5 years. You damn well better make sure that whatever you buy for now, you're going to be able to run and capitalize and create value for 5 years. The future of AI is, you go back 5 years ago, we were launching the first A100. I think I was still talking about ResNet back then. So that's a really important and strategic investment for companies to make sure that they're building an AI factory that can do all of those optimizations, all those techniques, run all those models today and next year and the year after that, all the way out to 2030. That's why the platform is so critical. That's why NVIDIA has got to work with every single and we do with every single AI company to make sure that our platform is constantly innovating. The innovations, we don't -- we invent some of that technology, but the vast majority of it actually comes from all of those companies like OpenAI, like Meta, like Grok -- the Grok model at xAI and as well as the entire academic community and amazing innovations come from there and also DeepSeek. FlashAttention was a student, he's now a professor at Princeton. And just right there, doubled the transformer performance because he figured out a way to run it more efficiently, more accurately and with less cost. So that is the inference market, it is about running every model across all those factories now into the future. It's a fascinating business model.
We think that data centers are bought with billions of dollars, 5 years of CapEx, and you end up charging dollars per hour or per million tokens at the end.
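To make the "only 4 zeros and ones" point concrete: 4 bits can encode just 16 distinct values, so every weight in a block has to be snapped to one of a handful of levels. The sketch below uses a simple symmetric 4-bit integer scheme with one shared scale per block. This is an assumed, simplified stand-in chosen for illustration; it is not the FP4 floating-point format that Blackwell hardware implements, and the sample weights are made up.

```python
def quantize_4bit(values):
    """Symmetric 4-bit quantization with one shared scale for the block.

    Simplified integer scheme for illustration; codes are limited to -7..7.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map 4-bit codes back to approximate real values."""
    return [c * scale for c in codes]


# Hypothetical block of weights.
weights = [0.02, -0.50, 0.31, 0.07, -0.12, 0.44]

codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

distinct_levels = 2 ** 4  # all the values 4 bits can represent

print("codes:", codes)
print("restored:", [round(r, 3) for r in restored])
print("worst-case error in this block:", round(max_error, 4))
```

The rounding error per weight is bounded by half the scale, which is why the choice of scale per block (and the numerical expertise the answer mentions) matters so much at this precision. Separately, by the figures quoted above, 1,000 petaflops divided by 20 petaflops per GPU is 50 GPUs' worth of peak FP4 throughput per exaflop, although supercomputer rankings are measured at much higher precision.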

Vivek Arya

analyst
#14

So if, let's say, you were the head of AWS, how would you go about making the decision between ASICs or GPUs for your AI factory?

Ian Buck

executive
#15

You should ask Matt that question. He's a good guy. I worked with him.

Vivek Arya

analyst
#16

Well they talk a lot about Trainium.

Ian Buck

executive
#17

I'm sure. I know. And they should, right, I mean, it's -- building silicon is hard. Talking to somebody who's been involved with it for 20 years. It's hard and getting even more complicated. So it's no small feat to be able to achieve even what they've achieved. And I'm super happy. I mean that's impressive what they've been able to -- anyone who's gotten over the -- survived it and been able to do multiple generations and stuck with it is -- requires almost founder level CEO commitment to make it happen. Their values and every hyperscaler, they're all building their own silicon. They are people -- and they're both our customers and also provide -- looking at alternatives and they rightly should, their own and other silicon and other opportunities out there. Each of them has to find what they need to optimize for and what they need to go serve and what they're going to do for their business. So I can't speak for Matt's business exactly where he's going to be applying all those, likewise with TPU. They're all looking at -- they have an internal workload and an external opportunity. They're all very passionate about making sure they provide our time to market, the latest NVIDIA GPUs and the customers and workloads that we bring to their clouds. So our business with AWS and with everyone is extremely healthy and it continues to grow. AWS launched -- the first launch action, the B200 HGX. We talked a lot about NVL72, but the existing B200 HGX platform, which is just 8 GPUs, NVLink connected, the same architecture that ChatGPT ran on with Hopper. We also do it with Blackwell. It's a fantastic inference platform. It runs all the same Hopper workloads, all on x86. It carried over and immediately provided a 3x boost for inferencing. So everyone who is on Hopper using H100, H200 HGX, as soon as they're going to B300, immediately you're getting a 3x boost. And you see that in the Artificial Analysis benchmarks and everything else in terms of performance.
So AWS is an excellent partner. How are they going to apply and where they see their opportunity? Everyone's got to define that niche or that area that they're going to add value with and then how they're going to engage in a community. It's actually -- it's one thing also to win on a benchmark or do a certain workload, it's a whole another game to try to activate an ecosystem and developers and your platform into the market. Not all need to do that. And certainly, some have chosen to work on certain opportunities. But the undeniable part of it is that we're constantly making things faster. We are lowering costs. We are making things more profitable as per the DeepSeek B200 example. And we just -- we're doing that like annually. So each of them have to kind of choose where they're going to provide value or differentiate.

Vivek Arya

analyst
#18

So if I ask the question in a different way, which is, today, if I look at $100 of spend on AI, $10 to $15 of that is going into ASICs. If we go out the next 3, 4, 5 years, what makes this $10 to $15 go to $20 to $25. What do you think would have changed in the industry or can change in the industry to make it more towards ASICs and away from merchant silicon.

Ian Buck

executive
#19

Well, you look at the problem because of the profitability. In your revenue, your performance is actually your profitability, your gross margin. And you can look at like the cost reduction and we have a component. But generally, when we look at it, we look at it in terms of there's $1 billion of AI factory that you're going to generate. How many tokens is it going to output compared to the previous generation and how much more value that those tokens are going to -- not just in strict dollar -- same dollar per token in the same model, but if you can deliver 3x more tokens per second, you would pay more for that. So the reasoning of the -- like in a reasoning model, you get your answer faster or be able to reason within a certain amount of time, you actually pay a premium for that. Asking what is 50 plus 50, go away for an hour, come back versus getting it right away is more valuable. So the -- it's a little bit the dollar spend on a data center on chips is actually pretty small. If you actually look at the chip silicon cost or even just the price of the dollars they're spending on the chips versus everything that goes around the chips is it is increasingly a really important part of the value because if AI really isn't our inference and certainly training is because of the value of reasoning in these large models is not a single GPU chip business anymore. It's about connecting all those chips together with a high-speed signaling with and as a result, liquid cooling to fit them all in 1 small space so they can all talk to each other at those speeds. The more you spread them apart the slower the signals have to travel. And so that's why liquid cool brings it all together. The complexity and the value that, that brings is driving up the -- it's not because we want to spend that much more money, and we want to run that fastest because the value that we bring with bringing that together drives up the revenue side of it. 
So I think the -- we will always look at previous generation. We'll always look at what the opportunity is and what others are able to actually achieve on the basket of workloads that we know is valuable now and we do our best to predict what's going to be valuable in a year or 2 years' time. And then the good news is NVIDIA is always coming up with new GPUs every year now, new architectures every year now and also optimizing the data center design every year. So I -- that makes my job a little easier. You used to have to predict a 3-year horizon. Now I can think about now in the future. And if I get -- we get to see another opportunity or we get it a little bit wrong, we can just keep fixing and fixing and fixing it. So that's the -- in terms of how do ASICs or alternatives play, I think it's going to be basically what niche, what vertical, what workload do they want to optimize for, what use case and what they want to decide. NVIDIA's goal is not to run every AI model everywhere. Certainly, what goes in a Ring doorbell should be what the silicon inside of a Ring doorbell should do or a hockey puck on your kitchen counter or what's inside of your phone and how they want to work there, where we're going to focus -- or I focus on is just the AI factory for inference and the training clusters at scale. And increasingly, those 2 things are melding together. And then also, providing it as a platform from -- with all my cloud providers so that all the startups, all the innovators, the next OpenAIs and every enterprise can get access to the technology and capitalize on the opportunity of the revenue that the tokens bring to them and also the token serving companies can make money on the top. So it's really important to look at the overall end-to-end value that the inference brings in terms of revenue, add to the cost of compute, which is actually going up in percent or the benefit in revenue and benefit is going up in X factors.
That's -- we're seeing that and but -- and only by providing that kind of percent to X factors, do you get a growth trajectory that NVIDIA can hopefully provide and will continue to provide in the future. So when we look at our value props, we look at our pricing, we look at our models, we're always looking at that net of through the chain, is everybody adding value? Is everybody able to capitalize it and be able to continue to scale up and grow. And if you just look at it over time, it's percents to X factors to big X factors. At GTC, you often see the big X factors kind of in there. But there is that whole model that actually gets played out in that world.

Vivek Arya

analyst
#20

Maybe 1 last 1 or 2 things. The new sovereign AI opportunity, how incremental is it? Is it just a lot of the Western companies just deciding to spend overseas? Or is this truly incremental versus like the original build-out of the internet was pretty kind of concentrated. Now as we are starting to see all these new AI factories open up, is this truly incremental demand for this?

Ian Buck

executive
#21

It definitely is. So when you go and talk to governments or nations or -- and actually, a lot of the supercomputing. My other job is HPC. I've been doing supercomputing for -- it's where kind of this whole thing started from. Those same people are now like getting -- are in the center of attention in every country because computing is important for their nations. We just did I believe it was a 10,000 Blackwell GPU AI factory in Taiwan. It's for Taiwan industry. It's owned by Taiwan. It's there to help apply AI to manufacturing, whether it be silicon or automotive or city or civil or as a resource for the country. We have -- we're seeing Japan, a country that is rich with data with unique industries, with a unique population and demographics and a country that's facing significant change and how to grow. They're building AI -- they're building their own -- they're using that data, building their own -- they see AI as a national need or computing need in order to basically apply their data, apply AI, apply computing to their industries. And by consolidating -- by the government stepping in, by the nation sitting in, they can actually consolidate that as a national resource versus waiting for every single company or every single industry to necessarily build their own, and they can pull some of those resources, and they're a good partner with NVIDIA. Seeing the same happening in Germany. It has happened already in the U.K. These are basically -- and they know how to build them because most of those countries know why supercomputing is important. Now it's really elevated with AI to execute. So yes, my -- the HPC and supercomputing side of the business has exploded as a result, and they know how to execute. So it is a really exciting opportunity. And every nation sees the opportunity to be a player on the stage and apply that. It starts with keeping their data, keeping their computing local and also prioritizing it.

Vivek Arya

analyst
#22

How large do you think it can be over time?

Ian Buck

executive
#23

It's a good question. Today, we are seeing about 100 AI factories being built and assembled right now across the world.

Vivek Arya

analyst
#24

And AI factories how much like 1 billion-ish or how much is an AI factory?

Ian Buck

executive
#25

Stuart and the other teams can talk to it, but we track it as a data center build that has either Blackwell or Hopper and is specifically designed for serving inference tokens for industry. And that is a number that's just going to continue to track and grow over time. The -- actually next week is GTC Paris and also ISC, 2 events at the same time, International Supercomputing Conference. You'll hear a lot about AI factories and sovereign AI and the activities.

Vivek Arya

analyst
#26

So the European Commission actually announced big projects earlier this year.

Ian Buck

executive
#27

Europe gets it. They absolutely get the fact that they can deploy, and they have the capability to. U.S. as well, last week launched NERSC down -- over in Berkeley across the bay, 9,000 Vera Rubin. It's actually our first supercomputer announcement with our next-generation Rubin architecture. It was announced with the Secretary of Energy, and actually Jensen participated in the announcement, and it will be deployed next year. 9,000 Vera Rubins, and the mission of NERSC is open science and also for industry for -- and named after -- the supercomputer is actually named the Doudna supercomputer, named after Dr. Doudna who invented -- I guess, discovered CRISPR, and she was there, a wonderful woman, brilliantly intelligent. And it's an example of why computing is important for health care and pharma discovery. So this -- and one of the purposes of the supercomputer is to combine and figure out how to apply both traditional simulation and AI together to advance scientific discovery and needs for the nation.

Vivek Arya

analyst
#28

Got it. And maybe 1 last question. What do you think will create a constraint on this growth? Is it access to power? Is it customers may not be able to adopt this kind of annual cadence of products? Is it just that CapEx demands are going up? Like what do you worry about the most as you look over the horizon?

Ian Buck

executive
#29

There's a diversification that's happening. Of course, it was -- the business is expanding. The number of players in the data center world is expanding. Certainly, power: how many megawatts do you have and how many gigawatts do you have, and we track that very closely with all of our CSP partners, but also now increasingly with all of the NVIDIA cloud partners and GPU data center partners. You've obviously heard of CoreWeave, but there's Lambda, there is data, maybe there's many, many players now. And the template of how to secure a data center, secure GPUs for that data center and align with customers. There's actually -- and also on top of that, the software and infrastructure necessary to operate and run. It's not even just a cloud, just a GPU factory; an AI factory, a token factory is starting to become fine-tuned and executable and operational. So there's multiple things that are coming together to help accelerate the growth. Certainly, the hyperscalers know how to do it, and they're investing the time. You can see how many megawatts and how many data centers. Microsoft just talked about the fact that they are deploying more new capacity this year alone than all the capacity they had 3 years ago. So there is an up-and-to-the-right curve in terms of -- and they shared what their next-generation 100,000 -- hundreds of thousands of Blackwell GPUs under 1 site that they're building, and they talked about it in their Build keynote. Look at Scott Guthrie's keynotes. It was great to see them talk about it. That is now -- but there's a diversification happening in terms of where can everybody get their compute. Certainly, as more enterprises need it, as more start-ups need it, they're both going to the public clouds for sure, but they're also looking at all the regional clouds, and what they can do from a data center capacity.
So the growth is being tracked by gigawatts of compute that's being put online, not just by CSPs but by all the players in the world. The speed at which the AI -- the deployment software and stacks get standardized or commoditized or understood, or how fast they can deploy. And that is, as a result, diversified. You certainly hear about the big, big ones, obviously, but that is a portion of the business. There's a very long tail, and a sizable part of the business, that is distributed, that's happening in the world, which is exciting because it's more people being able to contribute, deliver the compute and make it available. I think the only other limiter right now is the speed at which people are coming up with new high-value models and bringing them to the enterprise. The enterprise, and that's all the Fortune 500s, their ability to take an AI model and have it add value to their business, whether it's straight uplifting ChatGPT and putting that into a help or top of the search bar, or applying it to ad revenue, to better connecting a feed, to inserting the right ad or right product placement, to closing and making it profitable for them. So that is certainly happening. And the limiter there is just how many models, how many different techniques can be deployed in all those different use cases. It's also really hard to track. I feel bad for you guys trying to figure that out. We get to -- but if you see the activity around AI for enterprise, that is the demand generation that we're seeing across all of our consumption of our GPUs.

Vivek Arya

analyst
#30

Got it. I know we are out of time. I did want to ask just 1 last question. What is NVIDIA's ability to monetize software? And where are you in that journey?

Ian Buck

executive
#31

Sure. I'm going to pause on the public statements on software monetization because I don't have that off the top of my head. I don't want to say anything about it. But I think we get to see some of the things we've said in the past. Our -- we have sort of -- NVIDIA is an open company. So my job is to make sure that our computing platform is available everywhere. And to provide that compute, whether it be in the cloud directly, going all the way down to CUDA, or all the way up to running PyTorch or running a model off Hugging Face. For the enterprises, there's -- companies want to work directly with NVIDIA. We have the opportunity to monetize working directly with NVIDIA on specific models, and make it available. It's not to supplant the community but to provide direct engagement, and that comes in the form of providing a supported Nemotron model, which is a model NVIDIA generates, trained by, actually, my team, to provide that extra value directly to them. The other opportunity is in the data center software itself. A lot of our partners are looking for help to provide the infrastructure, and we've talked about Lepton before, that software to support the cloud to take -- it's one thing to stand up a data center full of GPUs. It's another thing to operate it as a data center and be able to serve and host and schedule and execute. That's another use case where we can provide that value. And in general, our software stack, all of our libraries, all of our CUDA-X and all of the inferencing software like Dynamo and everything else, customers want to be able to engage directly with NVIDIA. We also offer that as an enterprise support so that they can have a direct relationship with NVIDIA. As our software footprint expands, as they want to engage directly with us, we can directly monetize or provide a service to them, which they want to pay for.
They want that engagement and of course, as that value and as that goes to the broader enterprise, you'll continue to see that number increase.

Vivek Arya

analyst
#32

I can go on for another hour, but we are out of time. Thank you so much, Ian. Really appreciate your insights. Thanks, everyone, for joining.
