NVIDIA Corporation (NVDA) Earnings Call Transcript & Summary

May 28, 2020

NASDAQ US Information Technology Semiconductors and Semiconductor Equipment special 91 min

Earnings Call Speaker Segments

Operator

operator
#1

Ladies and gentlemen, thank you for standing by and welcome to the NVIDIA presentation and Q&A with CEO, Jensen Huang. I would now like to hand the conference over to Simona Jankowski, Vice President of Investor Relations at NVIDIA. Thank you. Please go ahead.

Simona Stefan Jankowski

executive
#2

Thank you. Good morning, and thanks, everyone, for joining us today. We're excited for this opportunity to discuss our recent GTC product announcements and data center opportunities with you in more detail. Today's call will be moderated by Evercore ISI semiconductor analyst, C.J. Muse. I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until May 28, 2021. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent. During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties, and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in last week's earnings release, our most recent forms 10-K and 10-Q and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All our statements are made as of today, May 28, 2020, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements. Okay. So with that, let me turn the call over to C.J. to get us started. C.J.?

Christopher Muse

analyst
#3

Yes. Thank you, Simona, and good morning, everyone. This is C.J. Muse with Evercore ISI, and it's truly my pleasure to host this event as part of our virtual semiconductor series over the coming weeks. Today, we have the pleasure of having Paresh Kharya, Director of Product Marketing for Accelerated Computing at NVIDIA, present key product announcements from the GTC Digital, followed by roughly 1 hour Q&A with CEO, Jensen Huang. With the normal Analyst Day at GTC canceled due to epidemic, this should be an excellent time to dig deeper into NVIDIA's product announcements from the last 2 weeks, and we certainly have plenty to discuss. Paresh will go through his presentation, and then we'll move to Q&A. If you do have questions, please e-mail those to me at [email protected], and I would be happy to direct those to Jensen. And with that, let me turn it over to Paresh.

Paresh Kharya;Director of Product Management and Marketing

executive
#4

Thank you, C.J. I hope you all have had a chance to see Jensen's kitchen keynote, where he went into the details of our company strategy and had a slew of exciting new product announcements. While we couldn't do our in-person GTC this year, our digital GTC helped us expand our reach even further. Jensen's keynote has already been viewed 8 million times. We've had 60,000 registered users attending technical sessions conducted over the last few weeks. Compare that to the 10,000 that would have attended our in-person GTC. On the data center side, Jensen made several exciting new product announcements, and that was the result of some of the amazing work our entire company has been focused on for a few years. My task today is to briefly recap those announcements and hand over to Jensen for Q&A. So let's get started. First, let me talk about the 2 important and very powerful trends that are driving the adoption of AI today. The first trend, which is the chart on the left, is that AI advances are demanding exponentially higher compute. Think of an AI model as the brain of intelligent apps and machines. More complex tasks require larger brains or larger AI models. Computer vision was amongst the first use cases of deep learning-powered AI. Bugs have computer vision. Other tasks like understanding natural language and responding intelligently require higher levels of intelligence. So the size of the AI brain, or the AI model, is increasing exponentially, along with the compute needed to train these models. And we can expect this trend to continue because we're just getting started with AI. The state-of-the-art models have evolved from the computer vision model ResNet-50, which was the state of the art when we launched our Volta GPUs, to today's NVIDIA Megatron-BERT, the world's most accurate AI for reading comprehension, developed by NVIDIA Research. And it takes 3,000x more compute to train today's Megatron-BERT than it took to train ResNet-50.
The second trend is powered by these advanced and accurate models. The use cases are everywhere, from conversational AI to recommender systems to detecting diseases in medical scans. AI applications are now pervasive, and they support millions of concurrent users. However, addressing each query requires only a small amount of acceleration. So there are these 2 competing trends that are playing out together. On one hand, you need massive scale-up acceleration to train these models. On the other hand, you need scale-out acceleration to support millions of concurrent users on AI-powered services. And these 2 competing trends have resulted in a data center that has become fragmented. Because of the immense application diversity, today's fragmented data centers look something like this. There's a cluster of storage servers, shown in gray. There's a sea of CPU servers, shown in blue. Then there is a cluster of scale-up servers for AI training, powered by our V100 GPUs connected with NVLink and high-speed NVIDIA Mellanox InfiniBand networking, shown in the middle. Then there is a cluster of inference servers powered by our T4 GPUs for scale-out applications, and clusters to run general accelerated workloads with PCIe V100 cards. If you look at this data center, the number of each type of cluster needed is hard to predict because applications are spiky and the demands on them change throughout the day. It's impossible to optimize this data center for high utilization so that the cost can be cut down. The cloud that these data centers represent is a $100 billion industry today, growing at 40% per year into an IT infrastructure industry of about $1 trillion. This is the largest growth opportunity in the computer industry.
To advance this industry, what if we were able to create a data center architecture that not only increases the throughput of the scale-up and scale-out applications needed to meet all the demands but is also fungible enough to adapt as the application demands and the workloads change throughout the day? To create this fungible data center, we had to reimagine our GPU, which is what Jensen announced at our GTC. It has been 3 years since we launched the Volta architecture, and we've been hard at work reimagining our GPU. Our new architecture provides 3 major breakthroughs to transform computing. First is performance: 20x higher performance, not 20%, 20x higher. Second, it unifies AI training and inference acceleration. And finally, it provides massive scalability, so a single server can scale up as 1 giant GPU or scale out to as many as 50 different accelerators. Our next-generation GPU enables flexible, elastic, universal acceleration, something that we've been seeking for multiple generations. And so it was very exciting for us when Jensen announced the NVIDIA Ampere GPU, NVIDIA A100, the GPU that we created for this new type of computer, to deliver data center scale acceleration. It's powered by 54 billion transistors. It's the world's largest 7-nanometer chip, and we put those transistors to great use. The Ampere architecture provides the greatest generational leap out of our 8 generations of GPUs. For AI training, where the default math is FP32, or single precision, A100 is 20x more performant than Volta. For AI inference that uses 8-bit integer math, A100 is again 20x more powerful than Volta, delivering over a petaop of performance on a single GPU. And we've continued to make progress on HPC, making A100 2.5x faster than Volta for double-precision math. To make all of this work, and to help our customers make the most of this performance, each A100 can also be partitioned into 7 independent GPUs. It took us 5 miracles for A100 to achieve these breakthroughs.
First, with 54 billion transistors, Ampere is the largest 7-nanometer chip ever made. It uses our -- the state-of-the-art HBM2 memory with 3D stacking, and it can sustain 1.5 terabytes per second of memory bandwidth, a 70% jump over the prior generation. Secondly, it features third generations of our Tensor Cores, the key technology for A100 massive speed-ups, and we'll talk about that in a bit. There's a brand-new capability called sparsity acceleration to harness the inherent sparsity of AI models for higher performance. Fourthly, a new technology called multi-instance GPU, which provides the core foundational capability for A100 to power elastic data centers. And finally, third-generation of NVLink and NVSwitch that enable A100 servers to act as a gigantic GPU connected by NVLink and NVSwitch. Each NVLink provides 600 gigabytes per second of interconnect, almost 10x that of PCIe Gen 4. And new NVSwitch allows each GPU to directly talk to each other at the maximum possible NVLink speeds. There are 3 technologies that make Ampere truly special, and nothing like that has been done before. First is the third-generation Tensor Cores with Tensor Float 32 support. Tensor Float 32 is a new math format that accelerates single-precision AI training out of the box. So think of math formats as similar to rulers. So the length of the ruler determines how large of an object it can measure, whereas the fine lines on the ruler determine how precisely an object can be measured. These 2 things are expressed by math formats as well using bits. The number of bits in a format exponent determines how large of an object it can measure, how large of a variable it can measure. And the number of bits in the format's mantissa determines how fine are those lines for measuring things precisely. And the best format needs to strike the balance for AI. You want as few bits as possible so you can optimize data movement and computation but as many bits as necessary to not have any accuracy loss. 
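The ruler analogy above can be made concrete with a small sketch. It splits a standard single-precision value into its sign, exponent, and mantissa fields, then shows what happens when only some of the mantissa bits are kept. The function names are hypothetical, and dropping bits by plain truncation is an assumption made for simplicity; the talk does not say how the hardware rounds.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent, mantissa) of x stored as IEEE 754 single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits: the "length of the ruler" (range)
    mantissa = bits & 0x7FFFFF        # 23 bits: the "fine lines" (precision)
    return sign, exponent, mantissa


def keep_mantissa_bits(x, kept):
    """Keep only the top `kept` mantissa bits of x; the exponent is untouched.

    Truncation (not rounding) is an illustrative assumption.
    """
    if not 0 <= kept <= 23:
        raise ValueError("kept must be between 0 and 23")
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    drop = 23 - kept
    mask = (0xFFFFFFFF >> drop) << drop   # zero the lowest `drop` bits
    return struct.unpack(">f", struct.pack(">I", bits & mask))[0]


if __name__ == "__main__":
    print(float32_fields(1.0))
    # A step of 2**-10 survives with 10 mantissa bits; a step of 2**-11 does not.
    print(keep_mantissa_bits(1.0 + 2**-10, 10))
    print(keep_mantissa_bits(1.0 + 2**-11, 10))
    # Range is unaffected because all 8 exponent bits are kept.
    print(keep_mantissa_bits(3.0e38, 10))
```

Calling `keep_mantissa_bits(x, 10)` leaves the 8-bit exponent alone and keeps 10 mantissa bits, which mirrors the TF32 layout the speaker goes on to describe: fewer fine lines on the ruler, but a ruler of the same length.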
And it shouldn't require heroic efforts from developers to use. TensorFloat-32, or TF32, in the NVIDIA Ampere architecture is a brand-new hybrid format that strikes this balance. It uses the same 10-bit mantissa as the half-precision format, so it's very precise, ensuring accuracy. And it adopts the same 8-bit exponent as today's popular single-precision format, so it can handle variables as large as FP32 covers. This combination makes TF32 a superior tool. With this new precision, A100 offers 20x more compute than single precision and provides out-of-the-box acceleration. And because developers can continue to use their inputs as single precision and get output back as single precision, they do not need to do anything differently, and they benefit automatically. That's why the AI framework communities are very excited. TensorFlow and PyTorch are excited to make this math the default when running AI models on A100 for training. The second capability is sparsity. Neural networks have many connections that do not contribute to accurate predictions and can be pruned. Think of it like playing a game of Jenga, where you can remove some of the bricks without impacting the tower. So this involves reducing those weak connections to zero and making the underlying math that represents them sparse. This property of AI models has been known for a while, but no one has managed to successfully use it to accelerate performance in a predictable and consistent manner. A100 introduces support for sparse matrix operations in Tensor Cores and accelerates them by up to 2x. While this capability primarily benefits inference, it will also be useful for training. And finally, A100 is extremely powerful, and not all applications need the full performance of A100. So MIG, or multi-instance GPU, allows A100 to be shared amongst different applications.
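The pruning idea described above, setting weak connections to zero so the underlying math becomes sparse, can be sketched in a few lines. The talk does not say which pruning pattern the hardware expects. The two-of-four grouping below is an assumption, chosen because halving the nonzero values lines up with the "up to 2x" figure quoted. The function names are hypothetical.

```python
def prune_two_of_four(weights):
    """Zero the two smallest-magnitude weights in every group of four.

    The 2-of-4 grouping is an illustrative assumption, not a pattern
    stated in the talk.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two strongest connections in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0) / len(weights)


if __name__ == "__main__":
    dense = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
    sparse = prune_two_of_four(dense)
    print(sparse)
    print(sparsity(sparse))
```

Because every group keeps the same number of nonzero values, the amount of work skipped is known in advance, which is one way to read the speaker's point about accelerating sparse models "in a predictable and consistent manner".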
While one could have run multiple applications on the same GPU with our prior-generation GPUs as well, using things like Multi-Process Service, the resources within the GPU, like memory bandwidth, are shared. And so if one application is using more memory bandwidth, it can starve all the other applications sharing the same GPU. The breakthrough in MIG is that each GPU instance gets its own dedicated hardware resources such as compute cores, memory, cache and memory bandwidth. And it enables each of these instances to work independently without any interference from each other. Each instance is like a stand-alone GPU for applications. And with MIG, each A100 can be partitioned into up to 7 GPUs or any other configurations with varying amounts of compute and memory. This provides the key capability that we needed to build fungible and elastic data centers. So what does this all mean in terms of delivered performance on applications? For AI training, it means 6x higher performance than Volta. For AI inference, it means 7x higher performance than Volta. And we've chosen BERT, an advanced AI model, to showcase this performance because BERT represents a very important class of AI models today. It has revolutionized natural language processing, delivering superhuman accuracy in understanding language. And BERT neural networks are massive. They have 350 million parameters, and they are computationally complex to train. Applications based on BERT are transforming industries. Think video conferences, smart speakers, retail assistants, call centers, legal transcriptions, search, analyzing medical reports to identify disease outbreaks and so on, applications in every industry. And because BERT-based neural networks are deployed for conversational AI applications, inference has [to happen] in real time. So speed matters.
Because A100 is 7x faster, it can be divided into 7 smaller instances with MIG, each of those instances providing many real-time BERT inferences. We announced that A100 is in full production. And the first systems that are shipping A100 are DGX A100s, which have 8 A100 GPUs. DGX A100 is the world's first 5-petaflop server, a milestone that's never been achieved before. One DGX A100 is 20x the peak performance of DGX V100. Whereas the original DGX was designed for training, DGX A100 can be used as scale-up compute for data analytics with Spark or any form of deep learning training, and also for scale-out applications like inference. It can be partitioned out to developers. And one [indiscernible] can be shared by 56 different developers all working on it simultaneously. And it comes with 9 Mellanox ConnectX-6 adapters. The new DGX is faster, it's smaller and it's more flexible than any DGX system before it. And it's shipping for $199,000. The first orders of DGX have gone to the U.S. Department of Energy's Argonne National Lab, which will use the cluster's AI and computing capability to better fight COVID-19. We are an open computing company. As we develop these vertically, fully integrated systems like DGX to pioneer new form factors and constantly develop new software to run on them, we also open up our entire system and turn it into elemental building blocks for the ecosystem, and the entire industry can buy pieces or it can buy the whole. The version of A100 for the cloud and partner server makers is called HGX. H stands for hyperscale. It's the guts behind the NVIDIA DGX. And pretty much everybody in the world has come out with support for it. All the key cloud providers and system makers in the world are adopting Ampere. Alibaba Cloud, AWS, Baidu Cloud, Google Cloud, Azure, Oracle, Tencent, and also all the computer makers in the world are adopting Ampere. So look forward to A100 powering clouds and data centers all over the world.
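The figure of 56 developers follows from the two numbers given in the talk: 8 A100 GPUs per DGX A100 and up to 7 MIG instances per GPU. The sketch below enumerates those slots and hands them out one per developer. It is plain bookkeeping under those two stated numbers and does not call any NVIDIA tool or API; the function names and developer labels are hypothetical.

```python
from itertools import product

GPUS_PER_DGX_A100 = 8        # stated in the talk
MAX_INSTANCES_PER_GPU = 7    # stated in the talk


def instance_slots(gpus=GPUS_PER_DGX_A100, per_gpu=MAX_INSTANCES_PER_GPU):
    """List every (gpu_index, instance_index) pair on one server."""
    return list(product(range(gpus), range(per_gpu)))


def assign(developers, slots):
    """Give each developer a dedicated slot; fail if there are not enough."""
    if len(developers) > len(slots):
        raise ValueError("more developers than GPU instances")
    return dict(zip(developers, slots))


if __name__ == "__main__":
    slots = instance_slots()
    print(len(slots))
    team = [f"dev{n:02d}" for n in range(56)]   # hypothetical developer labels
    placement = assign(team, slots)
    print(placement["dev00"], placement["dev55"])
```

Each slot stands for an instance with its own dedicated compute and memory, which is why the assignment is one developer per slot rather than many developers time-sharing a GPU.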
Let me show you some quick benefits of performance, and more importantly, the unification of computing nodes into one system. As we discussed earlier, more and more businesses depend on AI-driven applications. And they have to maintain different infrastructures to build and run their applications. So this example is modeled on a commercial customer who has multiple AI applications. There's natural language processing going on. There are recommendations. There's a need search. And they have multiple applications in training to improve their models, and infrastructure to support their customers. And the IT department has to manage both, while at different points of the year, and even different points during the day, some of these servers are underutilized while others are just starved, leading to a lot of inefficiencies. To build their applications, this customer requires around 50 DGX-1 systems to create and refine their models. And to serve, they require around 600 CPU servers to handle the millions of queries that their applications get. This setup costs $11 million, and it takes up 25 racks, consuming 630 kilowatts of power. There should surely be a better way, and there is with DGX A100. The same performance today can be delivered with just 5 DGX A100s in 1 rack, consuming 28 kilowatts. This is the amazing benefit of our accelerated computing platform. When you get a full stack accelerated, you can achieve a level of performance that saves a tremendous amount of money. And that's why you hear Jensen say, the more you buy, the more you save. If you have large computing requirements, there's no question accelerated computing is the path forward. This is an example of how A100 combines 3 computing models into one. A modern AI data center really needs one type of computer. That one computer can be used for training, inference, analytics, scale up and scale out, and utilization goes up tremendously. Now DGX A100 is -- think of it like a new steel of AI infrastructure.
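The comparison above can be checked with simple arithmetic. The rack counts, power figures, and the $11 million cost come from the talk. The cost of the consolidated setup is an assumption: it multiplies the $199,000 DGX A100 price quoted earlier by 5 systems and ignores networking, storage, and anything else a real deployment would need. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    racks: int
    power_kw: float
    cost_usd: float


DGX_A100_PRICE_USD = 199_000  # price quoted earlier in the talk

legacy = Deployment("50 DGX-1 + 600 CPU servers", racks=25,
                    power_kw=630.0, cost_usd=11_000_000)
# Assumption: hardware cost only, 5 systems at the quoted price.
consolidated = Deployment("5 DGX A100", racks=1,
                          power_kw=28.0, cost_usd=5 * DGX_A100_PRICE_USD)


def reduction(before, after):
    """How many times smaller `after` is than `before`."""
    return before / after


if __name__ == "__main__":
    print(f"racks: {reduction(legacy.racks, consolidated.racks):.1f}x fewer")
    print(f"power: {reduction(legacy.power_kw, consolidated.power_kw):.1f}x less")
    print(f"cost:  {reduction(legacy.cost_usd, consolidated.cost_usd):.1f}x less")
```

Under these assumptions the power draw falls by a factor of 22.5 and the rack count by 25, while the hardware cost ratio comes out at roughly 11.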
So we wanted to prove to the world that you can build remarkable things with this new steel. At GTC, Jensen announced DGX SuperPOD, built on DGX A100s. It integrates 140 of these DGX A100 systems interwoven with Mellanox network fabric, with 4 petabytes of all-flash storage, and it uses 15 kilometers of optical cable. So we built it, and we've seen some amazing computational power that puts it near the top of the world's most powerful supercomputing clusters. In the current TOP500 list, for example, it would be the equivalent of a top 20 supercomputer. But the most incredible thing is that because it's architected with DGX, the entire solution is systematized and deploys incredibly fast. We stood up the DGX SuperPOD with DGX A100s in just 3 weeks. And as NVIDIA is a data center scale computing company, we run our own supercomputers to advance the field of AI, create new products and serve as a resource to our customers, and we have our modern AI data center, which we call NVIDIA Saturn V. We've expanded it with the new DGX A100 systems, and it has now gained 2.8 exaflops of AI computing. In total, there are now nearly 5 exaflops of AI computing running today. It makes Saturn V the world's fastest AI supercomputer today. So as AI advances, every device is going to need AI, from smart speakers to smart cars to smart robots to smart industrial machines. And these machines will be always on. They will be always sensing information from their sensors. They'll be understanding and inferring what they've sensed. And they'll be acting on that information, making decisions continuously. They'll also be in very distant and remote places. And there will be trillions of these things throughout the world. Now how are we going to make all of it happen? Modern AI cannot operate in real time without acceleration. It's going to need safety and security.
Some of the data they will process, from health care devices to home devices, involves information that needs to be handled with care, handled with sensitivity. And finally, because these devices are going to be spread out, they're going to need fast, reliable, encrypted connectivity. So we need a new kind of accelerator for the edge. To address these needs, at GTC Jensen also announced the next generation of NVIDIA EGX products. It combines A100 GPU acceleration and, on the same device, Mellanox ConnectX-6 Dx. By combining the technology of these 2 amazing products, we can deliver a great AI edge product, one that offers, on one hand, incredible AI processing because of the Ampere architecture and Tensor Cores and, at the same time, high-performance smart networking with the ConnectX-6 SmartNIC. And together, we can do more things. First, it will provide a secure, authenticated boot of the GPU and SmartNIC from a hardware root of trust to ensure that the device firmware and the life cycle of the software can be securely managed. Secondly, there is a confidential AI enclave that uses a new GPU security engine to load encrypted AI models and encrypt all AI outputs to prevent any potential theft of valuable IP. AI models and their outputs are all valuable IP of customers. Finally, as the edge moves to encrypted high-resolution sensors, these sensors are constantly streaming encrypted data for processing at the edge. The Mellanox SmartNIC will bring support for in-line cryptographic acceleration to the solution. This allows the encrypted data feeds to be decrypted and sent directly to GPU memory, bypassing the CPU and system memory, while letting the network connectivity operate at the full line rate. As a part of this announcement, EGX, in combination with our AI application frameworks and portfolio of solutions, will bring secure edge AI to key vertical industries.
And they include NVIDIA Metropolis for smart cities; NVIDIA Clara for health care; NVIDIA Aerial for telecom; NVIDIA Jarvis for conversational AI; and NVIDIA Isaac for robotics. It's supported by an ecosystem that includes over 100 technology companies worldwide, from start-ups to established software vendors, cloud service providers and global server and device makers. We had a ton of new announcements at GTC, and I just touched upon very few of them. The good news is the digital keynote is available online, and Jensen delved into each of these areas in great detail. Apart from what we covered in this call, Jensen also announced 3 software frameworks for 3 giant AI applications that are accelerated by GPUs. First is the Jarvis conversational AI SDK. It democratizes the building of state-of-the-art conversational AI services for enterprises. Second, Merlin, NVIDIA's largest SDK, to help people build AI-based recommendation systems as the world tries to make sense of the exploding choice of content and information. Recommenders are the engines of the Internet. Finally, Spark 3.0, to bring GPU acceleration to the world's leading data analytics platform for the first time. I know Jensen and C.J. are planning to cover these in more detail in the Q&A, so we can skip this. So in summary, we launched Ampere, our eighth-generation GPU architecture and the biggest generational leap in our history. The first chip in this new family, the A100, delivers 8x the performance of its predecessors and is already shipping worldwide. We also announced DGX A100, the world's most advanced AI system, and EGX A100, the first edge AI product that combines our Ampere architecture-based GPUs with an NVIDIA Mellanox SmartNIC. With that, I'll hand over to C.J. and Jensen for Q&A.

Christopher Muse

analyst
#5

Paresh, thank you very much for that thorough summary. Very, very helpful. So thank you. So Jensen, great to have you on the line. I guess I'll start off with a big-picture question. I think looking back to 2019, roughly 50%, 60% of the data center business for you is coming from the cloud, with the rest from enterprise. And I guess as you think about compute requirements doubling every 3 to 4 months, as you think about digital transformation across most organizations globally, how do you see that mix evolving over time?

Jen-Hsun Huang

executive
#6

Yes. First of all, C.J., thanks for having us. It's a great pleasure to be with you. The -- if you look at -- if we step back and look at some of the most important foundational trend in the computing industry, what does it look like at the core of the computer industry today? And I've had the benefit of being in the computer industry now for almost 40 years. And the seismic shifts that are happening right now are pretty extraordinary, and there are 3 powerful forces. The first powerful force is cloud computing. And it's partly driven by the fact that Moore's law has come to an end at the same time that bandwidth is made available. Of course, there's a lot of great inventions of softwares and -- Kubernetes, for example, Hadoop, that led to eventually Spark. And there's a lot of things that happened along the way that made it possible, but cloud computing is one very powerful trend. The industry, as you guys know, on big numbers, just very roughly, the IT infrastructure industry, all in, is probably, call it, $1 trillion. And out of that $1 trillion, only about $100 billion of it is cloud computing today. And it's the fastest-growing segment of the computer industry, but it's still quite nascent. The amount of computing that's done in the cloud is -- has a long ways to go. And so one of the most important forces is cloud. The second, and it's unquestionably the most powerful technology force that any of us that had ever seen, is artificial intelligence. And why is it so powerful? Well, because for the very first time, we could use computers, which, in the past, required human coding to automate tasks. Computers can now learn to automate tasks by itself. This is the automation of automation. And when you wrap your head around that, the ability to grow exponentially in capability is now upon us in the area of intelligence and automation. And based on the things that Paresh said earlier, you could see the rate of progress in the last 10 years, less than 10 years. 
Software has achieved superhuman levels of capabilities in computer vision, in natural language understanding, speech synthesis, speech understanding. The list of things that has achieved superhuman levels in just less than 10 years is astounding. It's growing at exponential rates. That's what happens when anything becomes automated. It grows exponentially. And that's one of the enormously exciting part of AI and why we're running so fast. And then the third is the technological revolution of every industry. Agriculture is going to be a technology industry. Manufacturing is going to be obviously a technology industry. They're going to be -- retail will be a technology industry. Everything is going to become a technology industry with edge AI edge computing. And these 3 forces probably contribute more to just about everything we do than anything. Now at the core of it, back to your question originally with that foundation, cloud is a data center business. AI starts out as a data center business, and then AI as edge takes the data center right out to the world. You're not going to just have data centers in one giant place in the future. We're going to have data centers, millions of them spread out everywhere with no IT managers. That's the revolution. The ability to have data centers close by in a warehouse, in a bank, in a school, in a building, in an airport, in a train station, everywhere. The world is huge, and we're going to have data centers in all those places. And those data centers will require no management. They'll be secure, using state-of-the-art security technology. The attack surface will be tiny, impenetrable. If it's ever tampered with, it disables itself. And so the data centers will be sprinkled all over the place. We call them -- some people call it IoT. We call it edge, edge computing and edge AI. And so I think data centers is unquestionably going to be a giant industry, and it's going to be everywhere.

Christopher Muse

analyst
#7

That's really helpful. I was struck in your presentation by the scale and scope of your focus on your own software. 4 out of your 9 webcasts were focused on software. So I guess first question on this front is why is this so important to the building of your platform.

Jen-Hsun Huang

executive
#8

Yes. That's a great question. You come back to the core of computing. It takes software to create markets. Without Microsoft, there would have been no PC industry. Without AWS, there would have been no cloud computing. Without Android, there would have been no smartphone. The bottom line is without software, there's no market. Markets start with software, not chips. Chips follow the software. And they follow the software pretty substantially, oftentimes by 2, 3, 4 years. And so this is the new industry and where we play and what we're trying to pioneer. And as the pioneer of this field, we have to go and create that market. And what we're pioneering are 2 things simultaneously, and one of them is accelerated computing, period. Accelerated computing is something that our company has been doing for over 2 decades. And the original patents of our company, the first patents, were not computer graphics. The first patents were related to accelerated computing. And the letters used were called UDA, Unified Driver Architecture, which is where CUDA came from, C-U-D-A. So we've been doing this now for almost 30 years, solving problems that normal computers can't solve. Pioneering accelerated computing requires a full stack. We have to go create the stack because the traditional stack of normal computing, you can't accelerate. It runs on CPUs, and it's single threaded. It's not parallelized. And so you have to go create the acceleration domain. Each one of the domains requires a full stack. And each one of the domains requires developing the market, evangelizing to developers, creating new systems. And that's why NVIDIA is a full-stack company. Number one, to pioneer accelerated computing. Now that Moore's law has come to an end, it's just turbocharged.
And because there's a lot of problems that simply can't be solved without accelerated computing anymore, and people can't wait around for Moore's law to make transistors and CPUs go faster, they have to jump on a new approach. And like all computing platforms that were ever created in time, there was a pent-up need where people gave up on the alternative. And cloud was that way. Mobile was that way. Accelerated computing is that way. The second driver -- the second area where we're trying to create the market is artificial intelligence. AI computing requires the computing platform to reach out and solve the stack, the tools, the computing platform. It's a brand-new way of computing. For a computer to go off and learn by itself from a giant amount of data is a brand-new way of computing, a brand-new way of developing software, a brand-new way of working. And we were fortunate to have thought through that nearly a decade ago now and then to have all of the experience by working with the world. We're the only AI company in the world, I believe, that works with every AI company in the world, because we're an open platform. And so we have the benefit of a lot of engagements and a lot of learning. And that led us to realize what the software stack should be. The software stack that I talked about is 2 major areas. And let me just frame it very quickly. 2 major areas, and they're both super important. And I'm just incredibly proud of the work that we did in the last 5 years building the stacks. The first part is the development of the AI, creating the AI. We call that NVIDIA AI. NVIDIA AI is not just TensorFlow, and that's -- we're super proud of that. We contribute an enormous amount to that. TensorFlow and PyTorch are both native TensorFloat-32. They're both native TF32, which is a new numerical format that we invented for Ampere. But that's just probably about 30% of the workload.
The entire machine learning pipeline starts with data processing, and data processing is intensely difficult. And I talked about that. And when you get a chance, look at the work that we did with RAPIDS and Spark acceleration. Those 2 software stacks are potentially the largest body of work we have ever done, and that was a huge lift. I'm super proud of the team for getting that done. So data processing, preparing the data for training is the first phase. The second is the training frameworks, TensorFlow, scikit-learn, PyTorch, all fully accelerated now. And third is the inference part. TensorRT 7.1 is our latest release. I think the most important takeaway is 7.1. We've been working on it for 7 generations. And we finally are able to accelerate every form of neural network out there. And we still have plenty of work to do, but this compiler -- optimizing compiler is immensely complex. And imagine optimizing code that is billions and billions of parameters large, so large no humans could read it. And so now this optimizing compiler has to go target a machine. So TensorRT -- from Spark to TensorFlow to TensorRT, we now have it end to end. So that's the development part. The edge part is where all the application frameworks came in. Unless we create a tool, a way for the industrial companies, the Walmarts, the USPSs, the BMWs, all of these -- all of the world's largest companies to apply AI, to use and adopt the technology we've created to apply to their own problems, the hopes of them being able to invest the level of scale of a cloud service provider ourselves, to understand and benefit from the power of AI is very low. And so we created application frameworks in several large industries. Health care, for example, is Clara. The robotics manufacturing industry is called Isaac. And we announced a huge partnership with BMW to automate their manufacturing with robotics. The locations, we called Metropolis. 
So for example, smart retail work that we did with a partnership we have with Walmart. And so those application frameworks, think of them as new college grads. They are extremely well taught. They just got their PhD in artificial intelligence, robotics. And they were trained by the best professors in the world. They were trained by the best teachers. And then these new college grads then go to the industries that they get a job in. And those industries adapt, refine, retrain them for the specific skill. And so that's what the NVIDIA EGX application frameworks are all about. They're pretrained AIs that you can then adapt to yourself. So I think the long answer to that, sorry about that, is that it's so important that if you want to create new markets and you want to grow into new markets that are not commodity, to solve problems that the world has not solved before, you have to start with software. And accelerated computing is all about software. AI is all about software. And to engage these 2 gigantic opportunities, we have to be software intensive.
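
As an illustration of the TF32 format mentioned in this answer (a sketch, not something presented on the call): TF32 keeps the 8-bit exponent of float32 but only 10 mantissa bits. The helper name `to_tf32` is hypothetical, and it emulates the reduced precision by simply truncating the low 13 mantissa bits of a float32 value; the rounding that real hardware applies may differ.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 precision by truncating a float32 to 10 mantissa bits.

    Assumption: plain truncation (round toward zero) is used here for
    simplicity; this is an illustration, not NVIDIA's implementation.
    """
    # Reinterpret the value as the 32 raw bits of an IEEE-754 float32.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 has 23 mantissa bits; clear the lowest 13 to keep 10.
    bits &= 0xFFFFE000
    # Reinterpret the masked bits as a float32 again.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2.0 ** -10, 1.0 + 2.0 ** -11, 3.0e38):
        print(value, to_tf32(value))
```

The sign and exponent bits are untouched, so the representable range matches float32 while the precision drops to roughly three decimal digits.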

Christopher Muse

analyst
#9

Now that's great. I got a bunch of questions on the different platforms that you just outlined, but I wanted to hit on high level. Do you get the sense that these industry verticals understand how disruptive your platform, your solution will be to them? Would love to hear your thoughts there.

Jen-Hsun Huang

executive
#10

They didn't use to, but they do now. The retail companies -- the retail company have an existential imperative. The retailers have an existential imperative. If they don't engage AI, they will perish. They know that. It's completely -- it is top of mind. It is existential for them, and it's coming at them in 2 different ways. One of them is just called Amazon. Unless they become world-class at this -- at the most important machine learning pipeline in the world today, the economic machine learning pipeline is called the recommender. I highly advise everybody look into it. It is the single most important machine learning pipeline on the planet today. It drives vastly Internet commerce. News recommendation, friends recommendation, job recommendation, product recommendation, movie, books, music, recommendation, all based on this really complicated machine learning pipeline called recommenders. It is evolving incredibly fast. It's moving from simple to deep learning. It is going to be multimodal. It collects hundreds of terabytes of data. It's going to move to petabytes of data. It is the reason why we built Spark and the thing called Merlin. And so they know. They get it now. It is completely existential. And then there's the other part, which is the dream come true. Every industry would like to be a tech industry, frankly. And every industry would love to have the smartphone moment. They would love to have the smartphone moment, where they sell you a phone and they stay connected with you and they offer you services for as long as you shall live. The car industry initially thought that it's about driver assistance, it's about automating cars, but that's just a piece of the puzzle. The bigger piece of the puzzle, the more exciting part is that in the future, every car is going to be a car that's built around a service. And that car will be updated and enhanced and offered more services for as long as those cars are on the road. 
And a software-defined car that is essentially like iPhone on wheels, it is incredibly exciting to them, and they have an opportunity to revolutionize their industry. And so you see, on the one hand, it's existential. On the other hand, they're super excited that, finally, we have the pieces of software necessary and the capability necessary for them to become a tech company. And so I think they're starting to see the light. And the companies we partner with surely do. And they're world leaders. USPS is the world's single largest logistics company; and Walmart, the world's largest retailer. And we have -- the number of partners we partner with now on EGX, which is edge AI, industrial AI, it covers every industry practically, and they're the largest companies in the world.

Christopher Muse

analyst
#11

So I'd like to hit on some of these platforms. So I guess to start with Spark 3.0. And if I think about that in terms of supporting the leading compute engine used in ETL data processing, combined with RAPIDS AI, it appears as though you're creating a de facto platform for big data. So how are you thinking about your go-to-market strategy there? What does this market opportunity look like for you? And if I could throw in one of the questions I got via e-mail, what is the spend on data processing per every $1 spent on GPU training?

Jen-Hsun Huang

executive
#12

Wow, good ones. Okay. All right. I'll answer that last one first right away. The amount that is currently -- the amount of money currently used to do big data processing is 0. And the reason for that is because you can't accelerate big data processing today. Spark is the big data computing engine of this time. As you guys know, big data started with Hadoop. And it was a great invention and it made a lot of sense. And it ran out of steam. And it ran out of steam big time. And when people moved from hundreds of gigabytes to terabyte scale and now we're in the hundreds of terabytes into the petabyte scale, Hadoop simply couldn't keep up with it. And the type of data processing that we want to do, to do what is called human prior feature engineering or predictive feature engineering, that work requires a lot of processing. And Spark is used in almost 20,000 companies around the world. It is the de facto big data computing engine, whether they use Spark or some [ installed ] version of Spark, basically it's Spark. It came out of Berkeley, a really wonderful piece of work, basically turns an entire data center into a giant computer, okay? And so today, it's about 0. The Spark work is hard because it was designed -- the entire stack was designed for single-threaded computing. That's just the nature that was created. And it was the only thing that was available in data centers at the time. And so it used -- the fundamental premise of computing was based on single-threaded CPUs, and it distributed the work -- partitioned the work across a large number of computers. And it does the scheduling. It understands exactly how to control a CPU. And it does the scheduling, it does the reduction, it does all the resilience, the recompute, if necessary. And so it does all that. It's one giant compute framework that's completely based on CPUs. 
Now when they ran into the wall, and you couldn't -- it was harder to scale, you can't even imagine any more scaling from 500 or 1,000 CPUs to 10,000. And the reason for that is the computing overhead, the overhead of managing all of those workers, if you will, that's what they're called, if you -- the overhead of managing workers becomes greater than the workers doing their jobs. And so there needs to be a way to accelerate it. And so we had to basically take the entire stack and redo it. And the hard -- the reason why it's so hard is because Spark's great advantage and its great value is the number of people using it around the world. There are thousands of companies all over the world using it, and you don't want to change the API. And so within the confines, within the constraints of the API, whether it's Spark SQL or their data frame, Spark DataFrame, all of their scheduling methodologies, we don't want to change any of that. And yet, under the hood, we need to find a way to refactor, redesign everything, all the way from the way we read from storage, stream data from storage, the way that it's vectorized into memory, the way that our GPUs and our GPU memory are now visible to Spark for scheduling, the way that work is distributed into multi -- into GPU, into multi-GPU, into multi-node, multi-GPU, that hierarchical way of doing computing had to be all designed into Spark. And so it was a giant body of work. And the entire open source community helped us. I'm super grateful. And now we've done this amazing thing, Spark on CUDA. It's CUDA accelerated. And the go-to-market, C.J., is through all of our enterprise partners. That's one of the reasons why NVIDIA is both full stack and open at the same time. On the one hand, we're full stack so that we could create new applications and new markets. Otherwise, how would Spark happen? We would have to wait another decade or a new Spark would have to come along. 
And so we have the ability to, on the one hand, create new markets like Clara and Isaac and DRIVE and Jarvis, and yet accelerate large ecosystem because we have a full stack expertise. And on the one hand -- on the other hand, because we build them in such a way with the discipline necessary to be an open platform so that all of our partners could take to market, HPE and Dell and Cisco and Inspur and just the list of OEMs around the world, Lenovo, just every single computer maker in the world is going to take it to market. And then, lastly, we -- our computing platform, as you know well, is now integrated into every cloud in the world. And so Spark acceleration will be available on Google Cloud, on Azure, on AWS, on Alibaba Cloud, you name it, on Oracle Cloud, on IBM Cloud. They'll be available everywhere. And it will be fully accelerated because all of our stacks are there. And so that's how we're going to go to market.
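
The scaling wall described in this answer, where the overhead of managing workers outgrows the work the workers do, can be sketched with a toy model. Everything here is an assumption made for illustration: the function names are hypothetical, the management cost is taken to grow linearly with the number of workers, and the work and overhead figures are invented rather than measured from Spark.

```python
import math


def job_time(total_work_s: float, workers: int, overhead_per_worker_s: float) -> float:
    """Toy model: perfectly divided compute plus per-worker management cost."""
    compute = total_work_s / workers              # shrinks as workers are added
    management = overhead_per_worker_s * workers  # grows as workers are added
    return compute + management


def best_worker_count(total_work_s: float, overhead_per_worker_s: float) -> float:
    """Worker count that minimises job_time under this model.

    Setting the derivative of W/n + c*n to zero gives n = sqrt(W / c).
    """
    return math.sqrt(total_work_s / overhead_per_worker_s)


if __name__ == "__main__":
    work_s, overhead_s = 10_000.0, 0.01  # hypothetical figures
    for n in (500, 1_000, 10_000):
        print(n, "workers ->", job_time(work_s, n, overhead_s), "seconds")
    print("sweet spot ~", best_worker_count(work_s, overhead_s), "workers")
```

Under these assumed numbers, adding workers past the sweet spot makes the job slower, which is the motivation given on the call for making each worker faster rather than adding more of them.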

Christopher Muse

analyst
#13

That's great. You also announced 2 other application frameworks or platforms as I think of them: Jarvis with conversational AI and Merlin recommendation systems. I guess can you speak to how you are thinking about target customers, go-to-market strategy there as well?

Jen-Hsun Huang

executive
#14

Sure. Jarvis is for conversational AI. Conversational AI is the single most exciting inference application we know of. For a computer to be able to speak to you like it's human, to be able to have a conversation with you, for you to interrupt that computer in mid-sentence because you want to change the course of your conversation or you phrase -- you have a better way of phrasing the question or you've got what you needed, you want to move on to another thread of the conversation, that conversational AI is, number one, requires 6, 7 different breakthroughs in artificial intelligence first. From speech recognition to language models for enhancing the speech recognition, to understanding the conversation, to recommending an answer for the conversation, to speech synthesis, to having speech synthesis have emotional, if you will, contextual sensibility instead of sounding like a machine, if you guys get a chance to watch this, it's really fantastic. Our opening to GTC is called I Am AI. At the end of it, we revealed it. The I Am AI, the entire speaking, the entire -- the person talking is an AI. It was -- and it sounded like a person all the way to the end. And so having the emotional contextual sensibility based on the things that are being said. And conversational AI is very, very complicated because of all the breakthroughs that are necessary to make it state-of-the-art. The second is performance. You can't have conversation with somebody, if you say something, and then they look at you for 3 seconds or, in the case of running it on CPUs, they would have to look at you for like 25 seconds, you say something and it doesn't come back for another 25 seconds, but it's hard to have a conversation. And you need -- and so you need this ability to be able to do it interactively within a few hundred milliseconds, a couple -- 200, 300, 400 milliseconds. 
And now we've taken the entire end-to-end state-of-the-art -- we pretrained it, so you just have to refine it and adapt it to your domain. We pretrained it, it's fully accelerated, and we call that Jarvis. And so now, it's a reference design for all of the cloud service providers so that they could accelerate their own conversational AI. And they have the expertise to do it, but they -- in creating the models, but they still benefit from us doing the acceleration. And then the big opportunity is this. Health care, call centers, financial services call centers, education, anybody with a product in retail, call centers everywhere, video conferencing to have multidirectional conversation, you hear one person because most of it is simplex today that only one person can talk at a time. However, you can read and you can read from a whole lot of parties at the same time. So a lot of -- when they're talking, you could see the closed captioning. You could do summarization. There are all kinds of things that you can now do if you just had a conversational chat bot accelerated engine. And so that's -- for all of the other industries, whether it's in retail or health care or financial services or others, you now have this application framework, this PhD, if you will, PhD AI, that is pretrained. And all you have to do is adapt it to your own domain. And so that's what Jarvis did. Merlin is for recommender system. All the things I said about Jarvis, it's for recommender systems. And it's an application framework that is -- it's not just for AI. It's not AI, that domain, it's very specifically for recommender AIs. Instead of TensorFlow for all AI and all machine learning, Jarvis is specifically for conversational AI. Isaac is specifically for robotics. And so these pretrained expert AIs include state-of-the-art models that are pretrained by us, and that's what our supercomputers are used for, and we pretrained these models. 
And then we give them tools, we give our customers tools that adapts them to their domain. And the recommender system is so complicated only -- let's -- I would say 10 companies on the planet today have the ability to create recommender systems at a scale that is fairly, fairly large. And they're the [ Fangs ], the [ BATs ]. These companies have such large engineering organizations, and they collect so much data it's possible for them to do it. They understand infrastructure, they have their own infrastructure. But for the rest of the world's industries, it's just simply not possible. And so we decided we would do the fundamental engineering and to create something that's relatively simple for people to then adapt and then they could have state-of-the-art recommender systems as well.

Christopher Muse

analyst
#15

That's great. I want to hit on one last platform, Clara. And I think it's illustrative of what the opportunity could look like. So you acquired Parabricks, I think, in January 2020. And it's a framework supporting genomic sequencing. And if I look out there, there's a company, Illumina, $50 billion market cap, essentially in that business, using an ASIC, not a GPU. Quite frankly, how can others compete with you as you move into these types of markets?

Jen-Hsun Huang

executive
#16

Yes. There are 2 -- there's some amount of overlap in the work that we do, and then there's a gigantic underlap. We don't do the NGS, next-generation sequencing. And we'll work with everybody who does the sequencing, whether it's short reads or long reads. And for example, Oxford Nanopore, a really fantastic company, and they're doing the long-read version of NGS. And they're all accelerated by NVIDIA GPUs. And so there are going to be all kinds of sequencing companies, and they'll use different types. I think the most obvious answer for sequencing is really GPUs. And I think for a lot of reasons, the algorithm is -- because the GPU is programmable. It's easier to program. It's very cost effective. It's off the shelf, you can buy it all over the place. And the body of tools that are available on GPUs is really rich. So I think over time, as we continue to advance this area and more of the sequencing processing is going to rely on AI, just as image reconstruction has really largely moved to AI, computer vision has moved to AI, sequencing has a very similar problem. Cryo-electron microscopy has moved to AI and our GPUs. And so instruments after instruments after instruments have moved to AI and using our GPUs. And I think that CT machines, for example, ultrasound machines, are all using hybrid versions of computational methods and AI, and our GPUs are so good at that. And as you heard Paresh talk about Ampere being a 20x speedup, you're just not going to do that with an FPGA anytime soon. And so I think the computational platform is rich. It's moving fast. And when it improves in performance, it's effectively reducing cost. But the thing that is really great is if you think about the entire pipeline of understanding human genomes and trying to figure out ultimately what you want to do with it, sequencing is the first step. Reading the DNA is the first step. Understanding the DNA is the second step. 
And then correlating it to the world's population, which, as you guys know, humans are not exactly just humans that were close to each other, but there's variations in human genomes, fairly dramatic impact. And so whole population sequencing, correlation, what they call a tertiary analysis, basically doing data analytics. And data analytics today, we have 2 platforms for them. If they would like to use Python, they can use RAPIDS, if they would like to -- the Pythonic world can use RAPIDS. And then for everybody who's built on Spark, we have Spark. And so when they study, when they sequence at a lower and lower cost, which is going to happen, it's fantastic, we're going to collect gigantic amounts of data. And the first step, of course, is understanding the person's genome. But the second step is the population genome and finding cures, finding therapies, finding genetic disease of certain types and finding markers. All of those type of analytics are done at the population level. And then the next step is equally exciting, which is studying the genome of everything else. Plant and foods and animals and the work that we have to do in genetic computing is gigantic. We're at the tip of the iceberg, the tip of the tip of the iceberg. And so I think this is such an exciting area for us. We've been working on it for quite some time, and we have a platform today that accelerates perfectly down to the answer, the 2 most important platforms for genomics computation: GATK and DeepVariant. And these 2 frameworks are so vital to the industry, and you have to be perfectly compatible to them or it's difficult for people to share information, share data. And so this is a gigantic area. I think we have some amount of overlap with them, but we have a gigantic underlap. And where we are most -- where I think the big opportunity is in what is called tertiary analysis, which is the data analytics part of it at a global population and then eventually multi-species scale.

Christopher Muse

analyst
#17

So if I heard you right, as you discussed these myriad platforms, you're creating new markets, you're accelerating large ecosystems, you're getting edge customers to effectively standardize on your platform, which I would think would effectively mean the cloud would have to standardize as well. How does your new Ampere architecture, the A100, fit with that strategy?

Jen-Hsun Huang

executive
#18

Yes. Ampere -- so backing off, if you look at the data center -- a data center has multiple dimensions. And we know that where -- a microprocessor used to be the unit of computing. The unit of computing is what a developer has in their mind. A developer used to have a computer, and that CPU or the microprocessor used to be their unit of computing. That's how they saw the world, and that's how I learned the world. And -- but the new unit of computing is no longer that. The amazing thing about cloud computing is that the new unit of computing, the computer that a developer has fit in their head is the entire data center. That's the shocking part. I'm so proud of the industry for making so much productivity gains over the years. That one software engineer could write a piece of software that runs on an entire data center and activates every electron in that data center. And so we're in an era not of microprocessor computing, CPU computing, we're now in the world of accelerated data center-scale computing. And now how do you think about the data center in today's -- if that's the unit of computing, then what are the metrics of its success? Well, some of the things that you would say, of course, throughput. Whatever work you want to do, because there are so many people using it, the throughput is vital. Second, what are all the different types of applications you could run on it? Is it just Hadoop? Is it just queries? Is it just indexing the world's websites? Or is it also -- is it streaming music and videos? Or is it streaming games or even streaming an entire workstation? Or is it doing conversation with you? What are the different types of workloads that you have, the type of number of applications you could run? The third is what's the cost of the servers? What's the capital expense? If you could do a whole bunch of more stuff but every server is a lot more expensive, the data center is unsustainable. And then, of course, what's the cost of operation? 
If I simply said a data center has these 4 metrics: its throughput; its performance, the number of applications it could run; the cost of it, which is the TCO of it; and then the operating cost, Ampere raised the bar gigantically high. If you -- and it takes upon all of the things that we've learned and everything we've known about accelerated computing: the workloads that we believe that the world needs to accelerate, whether it's genomic computing or big data analytics or conversational AI; the applications of it; and then the cost of the servers, which is, ultimately, the cost per workload, not the cost per chassis or the cost per sheet metal but the cost of getting the work done; and then, of course, energy efficiency, Ampere raised the bar so high. And that's -- it was just the embodiment of all the work that we've done. And some of the things that Paresh said earlier, one additional thing that was really a great advantage is the flexibility of Ampere. It's -- you could use it for data processing. You could use it for machine learning. You could use it for inference. And it's both used for scale up as well as scale out. And so instead of all these different clusters and islands of specialized servers, some of them accelerated by V100s, some of them accelerated by T4s, some of them accelerated by V100 SXMs with NVLinks, you can now combine all of those into one. And as a result, the flexibility is greater. And so we raised the bar with a 20x speedup in performance. We raised the bar by introducing new applications: Spark, conversational AI, genomics. We raised the bar because our speedup is so great that -- and I say it tongue in cheek, but it's absolutely true. The more you buy, the more you'll save because you have a certain amount of workload you have to do anyways, and instead of an $11 million machine learning platform with DGX-1 and a whole bunch of CPU servers, you could replace that $10 million server with a $1 million DGX Ampere. That's amazing. 
10x reduction in cost for the same throughput or 10x the throughput for the same cost. And the energy efficiency is off the charts because 1 rack is surely lower power than 25 racks. And we do that. We save about 15, 20x the power. And so the operating cost goes way down. And so those are the things that we thought about for Ampere. And it is just a thrill to launch it. And it's also the third-generation of AI servers that we've launched. The first one was Pascal, then it was Volta, then it was Turing. And so -- I'm sorry, it's the fourth generation of AI computing we've launched. And the world's data center in the beginning was -- felt that we were really used for a niche where only if you need it for training. But over the years, as you know, we've been expanding the aperture of all the applications that we've run from training to inference and now big data analytics. And now computing has come and cloud gaming is in there. Our GPU is used in so many applications in the cloud now that acceleration, GPU acceleration, is no longer a niche thing. It's common sense now. And so whereas the first generation, we had 1 or 2 customers at the time of ramp, now almost the entire world's data centers are ready to go to town and move to the next generation. So Ampere has been a great success.
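
The cost argument in this answer is simple arithmetic and can be written out as a sketch. The figures are the round numbers quoted on the call ($10 million versus $1 million for the same throughput, and 25 racks versus 1 rack); note that the answer also mentions $11 million at one point, and $10 million is used here because it matches the stated 10x. The function names are hypothetical, and treating throughput as a single relative number is a simplifying assumption.

```python
def cost_per_unit_throughput(capex_usd: float, relative_throughput: float) -> float:
    """Capital cost divided by the amount of work the system gets done."""
    return capex_usd / relative_throughput


def throughput_for_budget(budget_usd: float, unit_capex_usd: float,
                          unit_throughput: float) -> float:
    """Total throughput bought by spending a budget on identical units."""
    return (budget_usd / unit_capex_usd) * unit_throughput


if __name__ == "__main__":
    # Figures quoted on the call, both delivering the same throughput (1.0).
    old_capex, new_capex = 10_000_000.0, 1_000_000.0
    old_racks, new_racks = 25, 1

    old_cost = cost_per_unit_throughput(old_capex, 1.0)
    new_cost = cost_per_unit_throughput(new_capex, 1.0)
    print("cost reduction for same throughput:", old_cost / new_cost)
    print("throughput for same cost:",
          throughput_for_budget(old_capex, new_capex, 1.0))
    print("rack reduction:", old_racks / new_racks)
```

The power saving of "about 15, 20x" quoted in the answer is not derived here, because per-rack power figures were not given.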

Christopher Muse

analyst
#19

That's great. So we've got a flexible solution. And now at the same time, you're pursuing a complete solution, NVIDIA AI. And you pivoted your focus to a data center-scale, compute type of focus. And so curious, now that you brought Mellanox in-house, you acquired Cumulus Networks, it certainly looks like you can offer now a complete solution from semiconductors to systems to software. So how are you thinking about having the 3 assets combined? And what does the end game look like for NVIDIA in this type of complete solution?

Jen-Hsun Huang

executive
#20

Yes. Well, the -- our basic game plan, our approach, is to engineer full scale, engineer full stack and engineer from end-to-end. Whatever the scale of the -- whatever the scope of the problem is, in order to do great engineering, you want to have the entire freedom to engineer within that box. And the important piece is inside the data center -- for data center scale computing are the acceleration components, the acceleration platforms, the software stack that runs on top of it and the networking that fuses it all together. And the reason why we are so convinced that our collaboration with Mellanox was perfect, it was ideal and we went hand-in-hand because we were working together across the board every single day. All of the optimizations ran into networking bottleneck. And the reason for that is very clear. If you have a whole bunch of compute nodes and they're worker nodes, and those worker nodes are slow and they take a long time to do something, then it turns out the management overhead is low. And because you give them tasks, they're going to think about it for a very long time anyways. However, if the worker nodes are superfast, then the critical path is the management overhead and the fabric, the management fabric, that connects all of them to get them to work together. And so we have the superfast, supercharged GPU-accelerated node. The bottleneck is all in the data processing, moving data to us so that we could process it as fast as possible. The work that we did with them resulted in this technology, which is really, really enormously great. It's called GPUDirect Storage. The ability to move the data into our nodes, process it incredibly fast and then to work together, we call it GPUDirect, another piece of technology that we created with Mellanox. And we invented this thing called NCCL. They invented this thing called [ RDMA ]. 
And then at the switch level, because there's now so much traffic going back and forth, and it's not equally balanced, and so the switch, the data flow of the switch engine, they are incredibly good at. And to be able to manage the flow of information so that thousands of nodes could be working together harmoniously basically means being a little bit -- being fair is being unfair. So you've got to throttle some people back. And so -- and then there's some computation that you could do in the network to reduce the amount of communications back and forth, and that's called reduction. And they have a technology called SHARP that makes it possible to do that. And so now we know that most of the critical components in distributed computing are under one roof, we could engineer solutions and platforms from end-to-end, from top to bottom. And so that's really about engineering. That's really about innovation. It's really about invention, so that we could improve the overall performance and TCO of the unit of computing, which is data center scale. The second thing that we want to do, of course, is we want to make it available openly so that we can have reach. One of the things that all computing platforms need is developers, and developers want reach. There's only one way to get developers, and it's to have reach, it's to have installed base. And the computing platform with the largest installed base also has the largest number of developers, which attracts the largest number of problems solved, which allows us to have more go-to-market partners to generate more sales. And so that positive feedback loop is really, really hard to get going. And it's evident now that the NVIDIA accelerated computing platform has passed the tipping point, and we just want to fuel it. 
And so the way we're going to take our platform to market is to break it up into elemental parts that work together perfectly, and then we go through our network of partners and allow them to do their differentiation on top of it for all of the different specific needs and wants of the customers that we can serve around the world. And so we'll go to market through HPE and Dell and Cisco and IBM and Inspur and Lenovo and all of the computer makers around the world.
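
The idea of doing reduction in the network, mentioned earlier in this answer, can be illustrated with a small sketch. This is not the Mellanox protocol: the function names, the tree of aggregation points and the fan-in of 32 are all assumptions chosen to show why summing data on the way up reduces the traffic arriving at any single point.

```python
from typing import List, Tuple

Vector = List[int]


def sum_group(group: List[Vector]) -> Vector:
    """Element-wise sum of a non-empty list of equal-length vectors."""
    total = list(group[0])
    for vec in group[1:]:
        total = [a + b for a, b in zip(total, vec)]
    return total


def reduce_at_root(vectors: List[Vector]) -> Tuple[Vector, int]:
    """Every worker sends its vector to one root, which does all the sums.

    Returns the sum and the number of vectors arriving at the root.
    """
    return sum_group(vectors), len(vectors)


def reduce_in_network(vectors: List[Vector], fan_in: int) -> Tuple[Vector, int]:
    """Aggregation points each sum up to `fan_in` vectors and forward one.

    Returns the sum and the largest number of vectors arriving at any
    single aggregation point. `vectors` must be non-empty.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = vectors
    busiest = 0
    while len(level) > 1:
        groups = [level[i:i + fan_in] for i in range(0, len(level), fan_in)]
        busiest = max(busiest, max(len(g) for g in groups))
        level = [sum_group(g) for g in groups]  # one vector leaves each point
    return level[0], busiest


if __name__ == "__main__":
    workers = [[i, 2 * i] for i in range(1024)]  # hypothetical gradients
    print(reduce_at_root(workers))
    print(reduce_in_network(workers, fan_in=32))
```

Both approaches produce the same sum; the difference is that no single point in the tree has to receive more than `fan_in` vectors, whereas the root in the first approach receives one from every worker.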

Christopher Muse

analyst
#21

That's great. I think we got time for one last question. And I think that the common perception out there is that NVIDIA is a great semiconductor maker but -- and some good software. But I want to turn it on its head and highlight how you're probably maybe more of a software company that actually also makes great semiconductors. So would love to hear your thoughts on how does your hardware make your software better. And based on everything that you know from offering complete solutions and building the software stacks, how does that give you a competitive advantage for building the best custom hardware to work along with that?

Jen-Hsun Huang

executive
#22

Yes. I think that's absolutely right. And the way we describe it in our company is that NVIDIA is an accelerated computing platform company. A computing company is not a chip company, and a computing company isn't a box company. A computing company really is a software company. What makes AWS a computing platform is not so much the buildings that they're in, or the servers that they buy. AWS -- when you go to AWS, what you see is actually a website with a programming -- with programming SDKs. And that's what computing platforms are. Windows is a computing platform. Android is a computing platform. There's all kinds of computing platforms. Now we're not a horizontal computing platform like the examples that I gave. We're really the world's first accelerated computing platform. And accelerated computing requires full stacks. And the full stacks are optimized per domain. And the reason why that's important to know is because, otherwise, how could we accelerate applications by 20 to 50x? If we are horizontal like everybody else, it would be -- it would require tremendous amount of arrogance to think that somehow we're going to accelerate every application on the planet 50, 100x better than the rest of the industry combined. It's not -- and we obey laws of physics like everybody else. And so NVIDIA is a domain acceleration computing platform company. And we methodically, step-by-step, open up our aperture to have more and more domains of application acceleration. We started with computer graphics a long time ago. And slowly and methodically, we opened up the aperture to have more and more domains of acceleration. With every domain, there's a brand-new stack. So for example, AI, there's NVIDIA AI; for robotics, there's Isaac; for autonomous vehicle, there's DRIVE; for computational health care, life sciences, there's Clara. And so each one of these domains has vertical stacks and really complicated at that. However, we built it on top of one architecture. 
That one architecture gives us enormous leverage. It's the reason why we're able to simultaneously be successful in gaming while being able to process genomics. The reason for that is because, on the one hand, we are one architecture, and that requires discipline. The benefit of that one architecture is that genomics benefits from gaming. Gaming is going to benefit from physics, which is scientific computing. Deep learning benefits from gaming. Gaming benefits from deep learning. And so you're going to see leverage across all of these applications and industries and, therefore, our economics. If we stay disciplined and we have one architecture, then the investment that we make for each domain is far, far less than anybody else would have to make to deliver that same level of performance. And so we bring enormous amount of scale to new markets, whether it's genomics or others, that is illogical compared to the size of the market at the time. And so we have the benefit of -- through one architecture, through leverage, through methodical, thoughtful opening of apertures creating the vertical stacks, we end up with an accelerated computing platform that is used by millions of developers all over the world. And so that's basically the methodology. But if I come back at it the other way, the simple logic of this: NVIDIA is not a take share company. That's -- you don't ever hear us talk like that. You hear the whole company talk about new markets and how we can make a difference for that new market. And what kind of new capabilities we can invent as a result of being able to do things the way we do it and how we could help the ecosystem grow. We're always talking about growth. We're always talking about acceleration. We're always talking about doing things that otherwise weren't possible before. And therefore, we could grow into markets that didn't exist before. If we didn't do all of the software that we're talking about, the only thing that we could do is take share. 
It would be figure out what the socket is and go take share. And we have no trouble with any of that. We grew up in that world. But if we want to be a much, much greater company and make a far deeper and longer contribution to the world, we have to go invent new things that create new markets that drive growth for all of our partners. And that's really the architecture of our company. And your question kind of taps right into that.

Christopher Muse

analyst
#23

Well, I think that's a great ending for the call. Jensen, Paresh, I want to thank you both for spending your morning with us. I want to thank everyone who dialed in. But really great to have spent time with you, so thank you.

Jen-Hsun Huang

executive
#24

Thank you, C.J. I hope all of you guys stay safe and well, and I look forward to talking to you guys soon. Thanks a lot, C.J.

Christopher Muse

analyst
#25

Take care.

Operator

operator
#26

Ladies and gentlemen, this concludes today's webcast, and you may now disconnect. Thank you.
