NVIDIA Corporation (NVDA) Earnings Call Transcript & Summary
September 9, 2021
Earnings Call Speaker Segments
Pierre Ferragu
analyst
I told you at the beginning of the day, we really wanted to bring together in this first edition of our Big Ideas Conference people commenting from different perspectives. And I think it's absolutely fantastic to see Bryan taking the stage right after Cliff. Bryan is another PhD from Berkeley like Mark. He works at NVIDIA. He's the Vice President in charge of Applied Deep Learning Research. So he really has another unique perspective on the industry. He's inside NVIDIA, the most formidable designer of AI chips, and he's there to work on the next frontier, to work on the most challenging problems, the most challenging models, how to implement them, scale them out. He works on generic problems and can come up with innovations that can be useful to NVIDIA's clients. But he's also working on the models that NVIDIA will eventually leverage in-house, internally in their own business and design processes. And so what we asked Bryan is really to tell us about exactly that, the next frontier, the challenges of scaling out the models that are growing in size and complexity much, much faster than Moore's law is actually increasing the performance of our chips. So with that, Bryan, thanks again for joining us, and I give you the floor.
Bryan Catanzaro
executive
Well, thank you so much for inviting me. It's my honor to be here. As Pierre said, I lead a lab of researchers whose mission it is to invent new Artificial Intelligence models, and then apply them inside NVIDIA to make NVIDIA's products better and our work more efficient. And so what I'd like to talk about today is some of the challenges that I think we and the rest of the industry face in scaling up our models to extreme sizes and the benefits that we see from doing that as well as talk a little bit about how we're using deep learning to change NVIDIA as a company. And if we can go ahead and start presenting my slides, that would be fantastic. Thank you. Let's go to the next slide. Here are some disclaimers: we're going to be talking about research that my team is undertaking. And as you know, research isn't always the same thing as fully finished products, so everyone should be aware of that. Okay. Next slide. So the first thing that I wanted to talk about is language modeling. And we've been really seeing an enormous explosion in the complexity and in every dimension of language models recently. What is a language model? Some of you may have heard about this idea before and some of you may have not. It's actually quite simple. Our goal is to build a model that can connect words together and understand their meanings in the same way that humans would. And if you think about it, this is maybe the most important challenge in Artificial Intelligence because all of human wisdom and knowledge has been encoded in language. When we communicate, when we write papers, when we read news articles or books, we're communicating ideas through this code that we call language. So if we can create a model that is able to understand language and is able to produce language, that has some equivalences with actually understanding and producing ideas. And those kinds of models are incredibly valuable because they can be used for many different business problems.
And with that, we've been seeing just an enormous explosion and investment in language model technology across the industry. I have a graph here, starting from the sort of big bang of deep learning with AlexNet, which was an image classification neural network that really took the industry by surprise back in 2012. And going to ResNet, which was the state of the art in image classification in 2016, and you can see on the Y-axis, the amount of compute that's required to train these models. And that is in petaflop days. So a petaflop is a large amount of compute. It's 10 to the 15 floating point operations per second, and then we multiply that out for a day. So that's the unit of compute we're talking about. If you had a petaflop machine and you ran it for a day, that's 1 petaflop day. In order to train AlexNet, it actually took 1/100 of a petaflop day. So that's what the Y-axis is telling us, 1/100 of a petaflop day. And when we scaled to ResNet in 2016, we were about 1/10 of a petaflop day. So over that span of 3 or so years, we increased the amount of compute that was required to train that model by about 10x. Now let's look at the language model. So the language model that really kicked off neural language modeling was BERT, back in 2018. And BERT took about 10 petaflop days to train. Since then, there's been a lot of investment from a lot of different companies and organizations, and we've just seen this incredible ramp in complexity. The most advanced public language model that people have been excited about recently is GPT-3 from OpenAI. And to train GPT-3, it took about 3,640 petaflop days, okay? So we went from 1/10th of a petaflop day in ResNet in 2016, and now we're at 3,640 petaflop days to train it. So that's just an enormous rate of growth, much faster than Moore's Law. And if we extrapolate out this trend, it could lead to single models with 100 trillion parameters trained by 2023. And this rate of growth far outpaces the hardware.
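The petaflop-day arithmetic described above can be sketched in a few lines of Python. This is only an illustration using the figures quoted in the talk; the function and variable names are hypothetical and not part of any NVIDIA tool.

```python
# Illustrative sketch only: the compute figures are the ones quoted in the
# talk, and all names here are hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60
FLOPS_PER_PETAFLOP = 10**15  # floating point operations per second


def petaflop_days_to_operations(petaflop_days):
    """Total floating point operations in the given number of petaflop-days."""
    return petaflop_days * FLOPS_PER_PETAFLOP * SECONDS_PER_DAY


# Training compute in petaflop-days, as stated by the speaker.
TRAINING_COMPUTE = {
    "AlexNet": 0.01,
    "ResNet": 0.1,
    "BERT": 10.0,
    "GPT-3": 3640.0,
}


def growth_factor(earlier, later):
    """How many times more training compute the later model needed."""
    return TRAINING_COMPUTE[later] / TRAINING_COMPUTE[earlier]


if __name__ == "__main__":
    for name, pf_days in TRAINING_COMPUTE.items():
        ops = petaflop_days_to_operations(pf_days)
        print(f"{name}: {pf_days} petaflop-days = {ops:.3g} operations")
    print(f"ResNet -> GPT-3 growth: {growth_factor('ResNet', 'GPT-3'):,.0f}x")
```

On these numbers, the step from ResNet to GPT-3 is a factor of roughly 36,400 in training compute, compared with the factor of about 10 between AlexNet and ResNet.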
The reason that we've been able to see this growth is that people are training on much, much larger machines. AlexNet was trained on 2 GPUs. To train GPT-3, took approximately 10,000 GPUs. So people are not just looking at what can we train with 1 or 2 GPUs, they are looking at what can we do with tens of thousands of GPUs. And when you think about this as an economics question, it becomes rather interesting because the technology that we use to train these models is quite scalable. In fact, as I work at NVIDIA, I'm able to see how hard NVIDIA works to make our systems very scalable all the way from small systems to the most largest supercomputers that humans have ever built. And to make sure that we actually deliver scalable performance all the way up to that scale. And it requires an enormous amount of investment from NVIDIA in software, in frameworks and libraries, in networking as well as the GPU chips itself. And so I see all of this investment happening, and it's really remarkable that our technology platform has been able to scale to this size. We can go further. So the question of what is constraining us from going further is an economic question, because training these models is expensive. And the reason that you would invest a lot to train a large model is because you're expecting a big return. So let's talk a little bit about why that would be the case. Next slide. Really, the thing that is pushing interest towards these very large language models right now is an idea called few-shot learning. And stripping away the technical jargon, the goal is an Artificial Intelligence that can actually adapt and solve new tasks that it hasn't been exposed to before. And on the right, I have an example. So this is actually a real-life example that's taken from a language model where the text that's in green was the input to the language model, and the text that's in black was the output. 
And what I did was I gave the language model a few examples of some simple sentences in English, along with their translation in Spanish, and then asked it to translate another sentence, and it was able to do that. Now the thing that's so exciting about this and so valuable is that this was just a generic language model that had been trained on the Internet. It had never been exposed to the idea of translation. It had never been formally exposed to the concept of English or Spanish, right? No computer scientist decided to build a translation model when we created this. We just made a generic language model, and we trained it on text on the Internet, which happens to contain both English and Spanish, as well as parallel text and pages that talk about how to translate back and forth. The model was able to learn enough about the properties of language, about the fact that English is a different kind of language than Spanish, the fact that there are different words that have parallel vocabularies in these 2 languages, and the fact that I was asking it to perform a new task that it had never been asked to do before, just by giving it a few examples. And so the fact that the language model is able to do this simple translation task is quite extraordinary, because it really represents a step towards generalized Artificial Intelligence. And that's what computer scientists mean when we say few-shot learning: it's able to attempt to solve a new task with just a few examples. And the reason that it can do that is that language describes all human activity. And as I said earlier, everything that humans do, we encode in language. And if you train a model that's able to understand the connections between words and language and manipulate them semantically and then produce language that is appropriate in context, then you have actually created a model that can think just a little bit.
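The few-shot setup described above amounts to putting a handful of worked examples into the text given to the model, followed by the new input to complete. The sketch below shows one way such a prompt could be assembled. The sentence pairs and the `build_few_shot_prompt` helper are hypothetical, they are not the examples from the slide, and no particular model API is assumed.

```python
# Illustrative sketch only: builds the text of a few-shot translation prompt.
# The example sentences and this helper are hypothetical; sending the prompt
# to an actual language model is deliberately left out.

def build_few_shot_prompt(examples, query, source="English", target="Spanish"):
    """Format (source, target) example pairs followed by an unanswered query."""
    lines = []
    for source_text, target_text in examples:
        lines.append(f"{source}: {source_text}")
        lines.append(f"{target}: {target_text}")
        lines.append("")  # blank line between examples
    # The final entry has no translation: the model is expected to continue it.
    lines.append(f"{source}: {query}")
    lines.append(f"{target}:")
    return "\n".join(lines)


if __name__ == "__main__":
    examples = [
        ("The cat is black.", "El gato es negro."),
        ("I like coffee.", "Me gusta el café."),
    ]
    print(build_few_shot_prompt(examples, "The house is big."))
```

Nothing in the prompt names the task as translation. The model has to infer it from the pattern of the examples, which is the point being made in the talk.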
And what's the value of such a model? What's the economics behind such a model? GPT-3, if you search on the Internet, you'll find that it may have cost about $12 million of compute time to train that model. When we think about assets that companies have that cost $12 million, they're usually quite tangible, like maybe a building or maybe a fleet of cars with management dedicated to maintaining that. What does it look like in the future where companies are building language models that take millions or maybe even $1 billion to train of compute time? How do we care for that model? How do we maintain it? What kinds of disruption would that model have in order to justify that kind of investment? And I think that's a really exciting prospect. And these kinds of questions are really what's driving the investment across the industry. And if we were to think about like what would a model look like that required $1 billion to train? And how would it be used? It would need to generate many, many billions of dollars of return in order to justify that investment. And that really means that it would need to reinvent an entire company. We're not there yet. These models aren't that powerful yet, but the rate of growth is astounding. And the reason is because as the industry continues investing in these models, we keep seeing more and more exciting results, and that justifies more investment. Next slide. My team at NVIDIA has been working on showing how to train these models using large clusters of GPUs very efficiently. We have this project called Project Megatron. These models are built on this computer science idea called The Transformer. And so we thought Megatron, being the biggest and baddest transformer, was a good name for this project. Because our goal is really to show how efficiently these large models can be trained. 
And when you're investing $5 million to $10 million of compute time to train a single model, if you can make the training process even a few percent more efficient, then that's enormous savings and definitely worth the optimization. And so we've been working on this for many years. We have this open source project. And I think the results are quite extraordinary. We are using all of NVIDIA's technologies together, all of NVIDIA's software, all of NVIDIA's hardware. It's built on CUDA. It's using NVIDIA's CUDA-X AI libraries, using NVIDIA's interconnects, NVSwitch, Mellanox InfiniBand, the DGX SuperPOD, NCCL, our communications library, cuBLAS and cuDNN, which are our libraries for accelerating neural networks on the GPU. And PyTorch, which is an industry standard deep learning framework. And we're able to combine all of these technologies, and each of them is running full tilt in order to train one of these extraordinary language models. And we've been able to show that we can scale these models really efficiently to many thousand GPUs. And as we do that, our efficiency remains very high. In fact, when we train a 1 trillion parameter language model, which is about 5x bigger than GPT-3, maybe 6x bigger than GPT-3, we are actually sustaining 52% of the theoretical peak performance of the math units on each GPU. Only there's 3,000 of them doing that simultaneously. And usually, when people have a peak performance number, that is a speed limit, not -- it's kind of like the speed of light. It's very hard to get very close to that number. The fact that we've been able to get over half of the theoretical peak speed of the math unit when we're actually running a problem that requires an entire machine and an entire data center.
Storage devices are running full tilt, all of the networking at every scale from across the data center, to within a cabinet, to a single system, to inside the system between the GPUs, each GPU is built out of many different units and all of the interconnects inside the GPU and all of the caches are running full tilt in order to support this kind of efficiency. And this is an example of what accelerated computing can provide. The way we work at NVIDIA to build platforms for machine learning is we optimize everything together. All of the hardware and all the software. And when we set our focus on something important like supporting the training of the world's most advanced language models, then we're able to achieve quite extraordinary things. So I'm really proud of that. And this technology is being used by many companies and institutions training the most advanced language models today. So it's been an honor for us to be involved with that. So next slide. So I've been talking about language models. I'm very excited about them. I think from a business perspective, they're probably the biggest thing that's happening right now that's pushing innovation and really pushing big ideas of what Artificial Intelligence can do to transform the world, and I'm very excited about that. And we have a lot of opportunities to use these language models at NVIDIA in our own products. And my team is hard at work, along with other teams at NVIDIA, figuring out the best ways of doing that. So it's a very exciting time. I wanted to talk about a few other things that we're doing at NVIDIA to reinvent our core business using deep learning, in addition to language modeling. One of them is called DLSS. And that stands for Deep Learning Super Sampling. And it's a graphics technology. As you know, NVIDIA has its roots in the 3D graphics space. And what we've been able to do with DLSS is reinvent computer graphics to be much more efficient by applying extra computation through deep learning.
And that's a little bit counterintuitive. How is it that you can make something go faster by doing more work? Well, let me tell you how that works. What we do is we have world simulations that are taking a lot of computation to render all of the ray tracing effects that make video games realistic these days. And we train a neural network to accept a lower resolution input that requires fewer samples and is therefore cheaper to produce. And then we pass that through a neural network to do a temporal reconstruction to produce a high-quality, high-resolution output. When we train this model, we actually compare it against supercomputer rendered ground truth that is infeasibly perfect. You could never do this in real time, but it's kind of the ground truth. We compare our results of this reconstruction with that, and then use that to update the model. And by so doing, we've created a model that can make a small GPU look like a big one. And as you know, to our gaming customers, the value of the GPU depends on its speed. And basically, its size, so people are going to buy a more expensive GPU if it's bigger, and therefore, can run faster. So if we're able to take a small GPU and make it perform like a large GPU because we've applied deep learning in this way that makes the rendering process smarter and, therefore, more efficient, then that's really exciting for our customers. Next slide. So here's an example. It may be hard to see on Zoom, but there's a lot of websites where you can take a look. On the left, we have a native 4K image. And on the right, we have the same image, only being rendered with DLSS. And if you look, the images are quite similar. The only thing that's different is that the frame rate counter has gone from 108 frames per second to 141. And so we've been able to create an experience that's essentially identical, but that runs quite a bit faster. Next slide. 
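The frame-rate comparison quoted above can be turned into a speedup and a per-frame time budget with a short calculation. This is only a sketch using the two frame rates mentioned in the talk; the helper names are hypothetical.

```python
# Illustrative sketch only: converts the two frame rates quoted in the talk
# into a speedup factor and per-frame times. Helper names are hypothetical.

def frame_time_ms(frames_per_second):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def speedup(baseline_fps, improved_fps):
    """Ratio of the improved frame rate to the baseline frame rate."""
    return improved_fps / baseline_fps


if __name__ == "__main__":
    native_fps, dlss_fps = 108, 141  # figures quoted in the talk
    print(f"Speedup: {speedup(native_fps, dlss_fps):.2f}x")
    print(f"Native frame time: {frame_time_ms(native_fps):.2f} ms")
    print(f"DLSS frame time:   {frame_time_ms(dlss_fps):.2f} ms")
```

On these figures the speedup is roughly 1.3x, which means the cheaper low-resolution render plus the neural network reconstruction together take less time per frame than rendering natively at 4K.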
And if we zoom in a little bit, we can actually see that the neural net is doing a little bit better job in reconstructing some of the fine details in the scene, for example, the text on this backpack that this person is carrying. And so that is a really exciting thing for us and speaks to how we're thinking about reinventing our core businesses with deep learning. DLSS is one example. We have many other examples. Basically, every part of NVIDIA on everything that we do as a company, we're reconsidering, how can we make this better with AI? How can we make it more efficient? How can we give our products and our engineers new capabilities so that they can really push the state of our technology forward? And it's -- I think it's important to say that we're really thinking about this deeply in the sense of reinventing our core work as a company with artificial intelligence. Next slide. These are some quotes from some reviewers that corroborate that this technology is actually useful. Digital Foundry said "it's impressive to the point where you'd be nuts not to use it." And Hardware Unboxed said that the upscaling power of this algorithm is extremely impressive, it's basically a free performance button. So that was great to see. And again, it speaks to our core technologies being reinvented with AI. Next slide. We also care, as another example, a lot about self-driving. That's also one of NVIDIA's core projects these days. It's very important. We consider self-driving a strategic initiative that we're investing a lot in. And when you think about the possibilities of using Artificial Intelligence for self-driving, they're obviously quite great. 
And so we are working really hard on building platform infrastructure and service, all built around AI in very many different forms, both in terms of perceiving the world, mapping the world, simulating the world, simulating the way the car interacts with the world, simulating the way the world should interact with the car, building data factories to help us create new data sets to train this. All of this is driven by AI, and it's something that's very strategic to us. I don't have a slide on our efforts to apply AI to chip design and library design, but I just wanted to briefly mention that this is something that's very actively happening at NVIDIA as well. If you think about NVIDIA's core work, a lot of it revolves around systems, building systems, whether that's data centers or DGXs or individual chips, networking equipment, GPUs. In all of those things, there are opportunities to rethink how we do that by applying AI. And we have some really exciting projects underway, some of which are actually already in production. So that's really exciting for us. Next slide. Okay. So I'm going to close out my prepared remarks here, just to leave you with these 3 takeaways. The first is that AI is growing at a super exponential rate. Language modeling, I think, is one great example of that. I think we're going to see a continued investment in that just because language is so useful. As I said, it encodes all of human wisdom and knowledge. And so models that can understand and manipulate it meaningfully are extraordinarily valuable. And the idea of a generalized AI model that doesn't have to be retrained for every task, I think that's the dream that Artificial Intelligence research has been working on for 70 years. But with these large language models, we're actually starting to see it come true. And that's very exciting. At NVIDIA, we build the software and systems that are used to create the most important AI.
And we're really excited about the prospects of continuing to do that for the future. We optimize everything all together. That's kind of our accelerated computing philosophy. All of the software and all the hardware that's necessary for researchers to build the future, we are continuously improving that and iterating on that. And we are very excited about the possibility of applying AI to our own work and our core work, not peripheral things, but core things. And we feel like that is the way forward for our company to improve in the future, by applying AI everywhere throughout our work. We feel like that will give us advantages that will help take us to the next level. And we offer the tools for other companies to do so as well. I think artificial intelligence is the kind of technology that really transforms everything in every industry. And there's just been so much growth. When I rejoined NVIDIA 5 years ago, I was asked to build a new lab applying AI to NVIDIA's own work. And I was excited about that mission. When I went about trying to hire people to do that, at the time, NVIDIA wasn't as well known for our own AI, and that was a question that I got from some candidates as well, is NVIDIA really an AI company? Or is it a chip company that builds GPUs as I know about? And so I had to tell them, "Well, we are committed to using AI for our core business. And if you come join us, there's going to be some great opportunities to rethink how NVIDIA does its work and really make a difference." I think 5 years later, NVIDIA's strength in AI and applying AI to our own work has really developed. And I think these days we're a lot more recognized for that. I don't get this question from candidates anymore. Now when I'm talking to people, they want to join our efforts because they see that NVIDIA is being successful in applying AI to our own work, and they want to be part of that.
And I think that story of how AI develops in application is not unique to NVIDIA, I think that has happened at other companies and it will happen in the future to many other companies as well. And so I think we're just at the beginning of applying AI, and I can't wait to see where we go next. So with that, I'll close and be happy to take questions.
Pierre Ferragu
analyst
Thanks, Bryan. Thanks for the great presentation, and thanks for giving us a background on the name Megatron. I never made the connection. It made me laugh. I didn't think of the Transformer. That's great. Let me ask you about -- so it's very interesting because I realized, like speaking to Google, just before we spoke with you. And looking at the industry, we see many people developing in-house their own AI chips, their infrastructure, and you're kind of almost doing the exact same thing on the other side of the mirror, like going from doing chips to actually providing your clients with a much broader and richer AI infrastructure. So tell me how you look at that? How do you understand like Google or Tesla going into developing their own systems rather than just leveraging what NVIDIA is offering? And how do you see that evolving over time?
Bryan Catanzaro
executive
Well, first of all, I think that is a reflection of the intense investment and focus on Artificial Intelligence. I think it makes sense for there to be many different options for people to consider building Artificial Intelligence. It's -- NVIDIA is a company that's used to competition. At the founding of NVIDIA, there were a lot of companies making GPUs. And I think that the philosophy that we developed over the past 3 decades on how to build accelerated computing and co-evolve the hardware and the software to provide a platform that's very flexible and can be used to invent new things but is also very fast, is pretty special. So I think it's great that other companies are working on other things. I think it's good to see what kinds of technologies are out there. I think also it's important to say that there's nothing we wouldn't consider to evolve our technology to make it better for AI. The GPU may have started with rendering 3D graphics, but these days, AI is at the top of the list. And because our accelerated computing philosophy is to co-optimize all of the software and all of the hardware together, there's really nothing that we wouldn't consider doing. So I guess to your question, how do I think about it? I wish them good luck. I think it's interesting. There have been a lot of competitors building other ways of thinking about compute for deep learning for a long time now. This isn't news, right? And yet NVIDIA's platform remains, in my mind, the best. It's the most flexible and it's the fastest. And it's really hard to do both of those things at the same time. And that's a reflection, I think, of the software that we've built that allows people to innovate, not just run benchmarks, but actually build new things as well as the continuous investments that we've made in the platform itself. So I expect we're going to continue to do that. It's our highest priority.
Pierre Ferragu
analyst
And a quick follow-up on that. How do you see -- I think the way the whole thing started is that there was x86 running data centers and very large-scale workloads, and then suddenly you realized there are ways to accelerate sets of compute with a lot of [styles] like matrix multiplication and things like that. That's how NVIDIA became what it is today in the data center, the platform of choice for accelerated computing. So there is in there like an element of forking where x86 is like...
Bryan Catanzaro
executive
Pause you there for a second. So NVIDIA started working on CUDA in the early 2000s, right? The investments that NVIDIA has made in software, in order to enable the AI revolution, I think, often go overlooked, right? Because it took approximately 15 years of investment in the software platform of CUDA before it started to be very widely adopted. So I think the story of GPUs taking over the data center because they have a matrix multiply unit isn't really quite accurate. I think that making a matrix multiply unit is not actually enough. In order for people to be able to build new Artificial Intelligence, they need a platform that's flexible enough for them to dream new dreams and invent new things. And it was a very long-term investment for NVIDIA to make our platform into something that could be that flexible as well as use good matrix multiply units. But I feel like the software work -- more people at NVIDIA work on the software on these platforms than on the hardware because the software problem is actually more difficult, in many ways, than the hardware.
Pierre Ferragu
analyst
Yes. That makes good sense. It's actually a mistake to reduce it to just an architecture. But the point I wanted to make is that the overall ecosystem NVIDIA has developed is very differentiated from the x86 ecosystem. So that's kind of like you differentiate a lot from the mainstream architecture in order to do something more specific much, much better. And so the question I wanted to come to is, do you think that this accelerated computing world is going to concentrate into like one mainstream architecture? Or do you think you'll have additional forking, where you'll see some innovations doing things very different in order to address very different sets of problems or computations down the line? Do you see -- whether it's inside NVIDIA or outside NVIDIA is another question, but is one GPU for all acceleration good? Or will we have different chips for acceleration over time?
Bryan Catanzaro
executive
Yes. Well, this is, I think, a core question in computer science, this tension between specific architectures and programmable general purpose things. And there's this idea of Makimoto's Wave, that we sometimes go back and forth. There's kind of a pendulum that goes back and forth between these things. So I don't think that that is an easy question to answer and it's hard to forecast. But what I can say is that this is the core of NVIDIA's work to build the best systems for Artificial Intelligence. This is the key question, what should we make specific? And what should we make general? What should be programmable? What should be hardwired? And this is the exact question that NVIDIA has been wrestling with for the past 30 years, as we built the GPU into what it is today. And when we look at the core work that happens in NVIDIA to develop our platform these days, of course, the platform is far more than the GPU, because we're really talking about data center scale computers, and we have to optimize all of that together. And when we think about that work, what is involved in inventing that work, the central question is, what should we hardwire? And what should we make programmable? Now I think NVIDIA has a really excellent balance between those 2 points, and that's the reason why our platform is widely used, because it is fast and it's also flexible, and it's very hard to do both of those at the same time. In the future, do I think that there's going to be continued proliferation and more special purpose things? Perhaps, in some ways. I also think that there are forces toward consolidation. I think software is maybe an often overlooked force for consolidation, but it turns out that rewriting software is very difficult. And building software that's easy to use tends to unify people around that software because people know how to get their work done. So I think the software that we've built is really important.
And I also think that the scale of investment also is a force towards consolidation. There used to be many different firms that produced semiconductor chips, right? There used to be many, many different kinds of fabs and many different companies that were building chips. These days, there are fewer of them. And the reason, I think, is because of the scale of investment that's required to build state-of-the-art chips. I've just been talking about Artificial Intelligence at extreme scales. And I think that that kind of investment also is a force towards consolidation. So I don't think that there is a proven answer to your question. I do think that it is a central question and it's one that I think NVIDIA is uniquely good at thinking about, and that's the reason why the GPU is what it is today. But it's hard to say exactly where the future will take us.
Pierre Ferragu
analyst
Thanks. You've shown a pretty impressive example of what one of these natural language processing models can learn from just a few shots, like learning to translate from a few examples. The one thing I'm wondering is, where do we see these models getting used at a very large scale? So where do you see like the GPT-3 type of model being used today in commercial applications? And where do you see the most immediate opportunities in the next few years?
Bryan Catanzaro
executive
Yes. I think these models are being used a lot right now in Internet scale businesses just because they deal with text a lot. Things like search and social media because models that can understand language can really help people have a better experience, direct people to things they're interested in. And so I think that's probably where they're being used the most right now. I think there is a big shift happening as people start thinking about adopting these models for conversational AI applications. For example, customer service and retail applications where, today, we might be speaking to a computer, but it's not actually really intelligent. Wouldn't it be great if we could solve problems by speaking to a computer that actually could understand what we meant? And so I think that's pretty close at hand. And over the next year or 2, I think we're going to see some big strides made in deploying these models there. I'm also really interested in applications at the edge. So there's a lot of devices around us. And I think it's going to be fantastic when 1 day we can interact with them very naturally and sort of have embedded intelligence around us. That's the dream, I guess, of Artificial Intelligence research, and I think we'll be working towards that over time. One thing I'd just say about that as well is that the inference requirements for these models are quite high. And so there's always a lot of work to figure out how we can train a big model and then make it smaller for deployment. And then along those lines, also how can we deploy bigger models. And we're seeing a lot of increase in our NVIDIA GPU-powered systems for inference as well, in order to support these models in various applications. And I think that kind of shift takes time, getting access to enough compute to actually deploy these models widely.
It's a shift that's underway, but I do expect that all of our lives are going to get a lot better in the next few years because we're going to start interacting with these models, and it will be much better than what we have today.
Pierre Ferragu
analystExcellent. Let me sneak in just one last quick question for you. So we now have exascale kinds of computers running for several days to train these models. You talked about how much it curves and how much effect of the others are still growing very fast. And so we're heading towards a $1 billion model. How do you think that industry, if I can call it so, will be structured a few years from now? Will you have a very small number of extremely large models that are then reused for specific cases by thousands of different organizations? Or will any company in the world have its own exascale computer and be training massive models for its own use all the time?
Bryan Catanzaro
executiveI think that's going to be dependent on basically the economics of these models, how much value they provide. I think that, as I said, sort of the forces of consolidation for very large investments may also start applying to AI as well. And it may be that people and companies start licensing access to very large models. We've already seen that with OpenAI, right? They provide API access to GPT-3. Microsoft has been really interested in that and has actually paid OpenAI for access to that model. And so that's already starting to happen. I would expect that in the future, there's going to be more of that. In the present, I think it is right now the case that most people deploying conversational AI are not doing so with these very heroic language models. They're doing so with models that are a lot less unwieldy. In that paradigm, what often happens is people fine-tune their models. So they take a pretrained model, and then they try to adapt it to a particular task. And at NVIDIA, we've been working to make that better. We have a toolkit called TAO, which helps people fine-tune their models. This is a software product that we have. And people find that really useful for conversational AI. So at the moment, it's a lot of smaller-scale models that are being fine-tuned. I think over time, we may shift towards fewer larger-scale models that are licensed. And rather than being fine-tuned, they're just being used in a few-shot learning capacity, because I think the results will be better and the flexibility will be greater. And so I think that's where we're headed longer term, but that shift is underway.
Pierre Ferragu
analystBryan, it was absolutely amazing to have you participate in our first Big Ideas Conference. You're definitely working on very big things. Megatron and these gigantic models are exciting, and I can't wait to see how this whole industry develops and changes our day-to-day lives.
Bryan Catanzaro
executiveThank you. It's been my pleasure.