Advanced Micro Devices, Inc. (AMD) Earnings Call Transcript & Summary
February 20, 2024
Earnings Call Speaker Segments
Brett Simpson
analyst
Okay. Hi, everyone. It's Brett Simpson here from Arete Research, and thanks for dialing in. Just to say a big thanks to the AMD IR team for helping out here, specifically Mitch and Suresh, and they're standing by on this call. I'm excited to welcome Mark Papermaster, AMD's Chief Technology Officer, to our fireside event today. Now you all know Mark's background. He's one of the luminaries of the compute world, with a distinguished career that spans IBM, Apple and Cisco and today with AMD. I think you've been with AMD 13 years, Mark, is that right?
Mark Papermaster
executive
It is coming on 13 years by the end of '24, that's right.
Brett Simpson
analyst
Yes. And with all the technology change that we're seeing in the market today, particularly as the AI compute market inflects, I think it's great to sit down and get Mark's perspective on how the future of compute is likely to play out. So Mark, appreciate you coming on to talk with us today. Looking forward to the conversation.
Mark Papermaster
executive
Brett, thanks very much for having me with you and with our listeners this afternoon.
Brett Simpson
analyst
And before we jump in, just to say we're going to work through specific questions we have for Mark over the next 45 minutes, and then we'll open up to Q&A for the first -- for the final 15 minutes. [Operator Instructions]
Brett Simpson
analyst
So with that, Mark, maybe we can just start by summarizing the market outlook for AI. And AMD late last year laid out a $400 billion market size. It's a big upgrade on where you were last thinking about the market. So can you talk a bit about what's driving that change of thinking? And what portion of the market AMD is going to focus on? Yes.
Mark Papermaster
executive
You bet. Well, Brett, I think we surprised some folks when we came out with that type of view of a TAM projection as you look to now through 2027, but when you think about what's happened since we made that announcement, I think there's a lot of corroborating evidence that, in fact, the market is exploding at that type of rate and speed. Since that time, you saw major hyperscalers announce even their imminent CapEx spends growing significantly year-over-year. You've had Sam Altman come out and call for trillions of dollars of investment to build up the necessary AI infrastructure to drive the capabilities that the researchers see are imminently achievable if the computing infrastructure can scale. And that's exactly what we were looking at as we looked at the TAM projection. When we think about $400 billion, it's an accelerator TAM. It is GPUs, it is the memory around it, it is other bespoke accelerators, specifically targeted for this AI infrastructure build-out. And it is just that. Think about it as a build-out like when the Internet launched and you had the -- an entire build-out, not just of compute but networking, infrastructure, et cetera. It's a new platform. And so it is indeed a major investment, and with that investment comes the monetization, which will occur as you'll see thousands upon thousands of applications, which are now being developed at a torrid pace. So there's the economics that have to be behind it and are behind it. And when you look at that $400 billion TAM, a lot of it, of course, is the hyperscalers and these massive AI cluster build-outs. So that's going to be first-party workloads of the hyperscalers but also third party. And that's where the largest models are needed, the LLMs that are taking on broad questions that they're helping answer and broad productivity savings that they're driving. And so that is a big piece of that $400 billion TAM. 
But it is much more than that because when you look at what's going to happen over the next several years, AI is not just in the domain of these largest hyperscalers, these massive clusters. What happens is that businesses focus, and they have AI needs that address their own business needs: not the world's AI problem set that needs to be solved, but actually driving productivity of their business. And those models are typically smaller in size. They can be handled -- they can either be run in smaller clusters on the cloud or on-prem. And frankly, where you need more quick response and even lower latency, think about automotive applications with self-driving, think about the factory floor. That build-out will be embedded devices, and then right to the end point where you're seeing now PCs, we launched our AI-accelerated PC last year with the Ryzen 7040, and now that we look at 2024, it's a big transition year where AI comes to the PC, and we're out ahead of that. So there's a lot behind that TAM. We're expecting the AI market to grow at a 70% kind of CAGR. I know that seems like a huge number, but the work our team has done, we think -- you can debate the numbers, but the fact is it's very large, and we are investing to capture that growth.
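To make the compounding behind a "70% kind of CAGR" concrete, here is a minimal arithmetic sketch. The $400 billion 2027 figure and the 70% rate come from the conversation; the four-year compounding window (2023 to 2027) and the helper names `implied_base` and `project` are assumptions made purely for illustration, and the starting value is derived from the other two numbers rather than stated by the speaker.

```python
def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting value that compounds to `target` after `years` at rate `cagr`."""
    return target / (1.0 + cagr) ** years


def project(base: float, cagr: float, years: int) -> list[float]:
    """Year-by-year values, starting at `base` and growing at `cagr`."""
    return [base * (1.0 + cagr) ** n for n in range(years + 1)]


TARGET_BILLIONS = 400.0  # 2027 accelerator TAM quoted in the conversation
CAGR = 0.70              # "70% kind of CAGR" quoted in the conversation
YEARS = 4                # assumption: compounding from 2023 to 2027

base = implied_base(TARGET_BILLIONS, CAGR, YEARS)
for year, value in zip(range(2023, 2023 + YEARS + 1), project(base, CAGR, YEARS)):
    print(f"{year}: ${value:,.0f}B")
```

Under those assumptions the implied starting point is a little under $50 billion, which shows how steep a 70% compound rate is: the market multiplies more than eightfold in four years.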
Brett Simpson
analyst
Yes. And Mark, just in terms of the pricing for compute, the platforms today, they are clearly built around aggressive pricing, high pricing from NVIDIA. And there's been a lot of inflation as you look generation to generation, whether it's the A100 to the H100 and now the GH200. Obviously, AMD is starting to ramp into this market now. But do you think the pricing of AI compute in general needs to reset for this market to really grow meaningfully towards that TAM that you're suggesting?
Mark Papermaster
executive
Well, Brett, first of all, I think there's an underlying premise that I think everyone inherently knows, but you have to think about it in regard to pricing, and that is that the compute demand for these AI applications, these large language model applications and driving to more and more accuracy and more and more human-like artificial general intelligence type of capabilities, the compute demand is insatiable. And so when you think about what's driving the compute devices underneath, it is trying to drive more and more compute into a smaller area and into a more efficient power per FLOP, floating-point operation, the key element -- mathematical element underneath the AI calculations, which are done through the training and inferencing. And so that's the backdrop you have. So the base devices are growing in content. So I think what's important, when you just step back, is to look at total cost of ownership. Not just one GPU, one accelerator, but total cost of ownership. But now when you also look at the macro, if there's not competition in the market, you're going to see not only a growth of the price of these devices due to the added content that they have, but you're -- without a check and balance, you're going to see very, very high margins, more than could be sustained in a competitive environment. And what I think is very key with -- as AMD has brought competition in this market for these most powerful AI training and inference devices is you will see that check and balance. And we have a very innovative approach. We've been a leader in chiplet design, and so we have the right technology for the right purpose of the AI build-out that we do. We have, of course, a GPU accelerator, but there's a lot of other circuitry associated with being able to scale and build out these large clusters. And we're very, very efficient in our design. And so we think, one, we will bring competition, and that will be an ameliorating factor. 
But more importantly, you have to look at AI from a portfolio standpoint. It's not all the LLM of these largest cluster build-outs in the cloud. With AI, we're getting more and more astute in how to tailor our applications to the problem which is being addressed. As I mentioned earlier, you're going to see a build-out of really leveraging more of the CPU build-out that you have. A lot of inferencing can be done on CPU, and we certainly are supporting that with our EPYC servers. You're also seeing it move to the edge, and we've added AI acceleration into our Versal line of products, our embedded devices, and then right into end point, as I mentioned, with the Ryzen PC. And so that also has an effect of helping manage the overall cost structure, the CapEx spend, as AI is really spread across from the largest cloud installations, these massive clusters, all the way to end point devices.
Brett Simpson
analyst
I know the MI300X is off to a great start. Can you maybe give us a sense as to how you're shaping the road map beyond MI300X? And I think you've always said in the past customers buy road maps. So what are customers asking for? Or what are the discussions you're having that may be landing in 2 or 3 years' time? Maybe share with us how you're thinking about the evolution of the road map.
Mark Papermaster
executive
Oh, absolutely. Well, I think the first thing that I'll highlight is what we did to arrive at this point, where we are a competitive force, we've been investing for years in building up our GPU road map to compete in both HPC and AI. We had a very strong hardware train that we've been on, but we had to build our muscle in the software enablement. And so we started years ago development of the ROCm software stack. It competes head on with CUDA. We're able to go head on. We're a GPU company just like NVIDIA. We've competed with NVIDIA for years, so it's not surprising that a lot of the -- even the programming semantics that we use are similar because we've been, frankly, traversing the same journey for decades. And so that brought us up to December 6 when we announced the MI300. We brought that competition. And so I know some people ask, well, why did we not present the whole -- a multiyear road map at this time? Well, the first thing that you have to do in a race is you have to fire the starting gun and you have to start the race. And that's exactly what we did in December. We put out there a highly competitive design in AI inferencing, a leadership design. And in fact, we executed that plan to a T. We brought it out to market as we projected in 2023. We're now shipping. We're now ramping. And that's exactly what we wanted. We -- and it allowed us then to create yet a different environment of how we're working with our largest customers. We worked closely with them and got input from them on the MI300. But now as they're adopting -- and you saw the large companies on stage with us, hyperscalers, OEMs, end users, application developers, on stage with us, because we demonstrated that we could compete. We demonstrated that competitive and leadership design. And so that got us a seat at the table. It's incredibly hard to earn that seat at the table to really understand the details of what's needed next. 
And what you saw play out is, in fact, NVIDIA reacted to our announcement. They've actually accelerated their road map. We're not standing still. We made adjustments to accelerate our road map with both memory configurations around the MI300 family, derivatives of MI300, the generation next. And so we've been very closely working with our customers. And what we can tell you is, I'll tell you right now, that race has begun, and it's going to be a competitive race. You're going to see that back and forth like you've always seen when you have competition. It's going to be great for the market. It's certainly spurring us at AMD to be at our very best of innovation. And I think it's going to spur everyone to be at the top of their game. So very, very exciting. We've launched that foundation with MI300. And stay tuned with us as we'll share more details forthcoming on that road map because it is indeed a multiyear road map that we've laid out.
Brett Simpson
analyst
Yes. And one thing we keep hearing from the work we do speaking to folks in the industry, looking at the road maps, they say, look, there's a memory wall in this whole road map. Memory is pretty challenging. You need a lot more density, faster memory. How do you solve that memory wall -- look at the MI300X, and you've got a lot more memory than other AI platforms in the market. So how do you go even -- to an even greater extent with memory, because it has such a big effect on performance and cost of ownership?
Mark Papermaster
executive
Yes. Brett, that's a great question. I mean, when you have these massive compute engines -- and again, it is math that's driving AI. It breaks right down to the fundamental floating-point operations, the multiply-accumulate math functions. And you have to feed that beast. You have to bring memory in at high bandwidth and high capacity or you don't get the end performance that you need. We get that at AMD. We were actually the first to bring high-bandwidth memory, HBM, to market in a 2.5D configuration. What do I mean by that? Our GPU chip sitting on silicon connected on a silicon substrate. So a silicon-to-silicon connection to the HBM memory, we launched that with our Fiji product in 2015. So that was 9 years ago. So we're -- we have been extremely experienced at bringing memory into the GPU compute cluster. We led the way in what is now CoWoS at TSMC, which is the most widely used silicon substrate connectivity, to have the most efficient connection of high-bandwidth memory to compute. And we worked extremely closely with all 3 memory vendors. So that is why we led with MI300, and we decided to invest more in the HBM complex. So we have a higher bandwidth, and that is fundamental. Along with the CDNA, which is our name of our IP, that's our GPU computation IP for AI. Along with that, it was HBM know-how that allowed us to establish our leadership position in AI inferencing. And with that, we architected for the future. So we have 8-high stacks. We architected for 12-high stacks. We are shipping with MI300 HBM3. We've architected for HBM3E. So we understand memory. We have the relationship and we have the architectural know-how to really stay on top of the capabilities needed. And because of that deep history that we have, not only with the memory vendors but also with TSMC and the rest of the substrate supplier and OSAT community, we've been focused as well on delivery and supply chain.
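The "feed that beast" point can be illustrated with the standard roofline model, which is a general textbook idea and not something the speaker describes: attainable throughput is capped either by peak compute or by memory bandwidth times the number of operations performed per byte moved. The sketch below is illustrative only. The function name `attainable_tflops` and the peak and bandwidth figures are hypothetical placeholders, not specifications of MI300 or any other product.

```python
def attainable_tflops(peak_tflops: float, bandwidth_tb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof.

    TB/s multiplied by FLOP/byte gives TFLOP/s, so the units line up.
    """
    return min(peak_tflops, bandwidth_tb_s * flops_per_byte)


# Hypothetical accelerator, chosen only to make the arithmetic easy to follow.
PEAK_TFLOPS = 1000.0
BANDWIDTH_TB_S = 5.0

# Arithmetic intensity at which the two roofs meet.
ridge = PEAK_TFLOPS / BANDWIDTH_TB_S
print(f"ridge point: {ridge:.0f} FLOP/byte")

for intensity in (1.0, 10.0, 100.0, 400.0):
    bound = attainable_tflops(PEAK_TFLOPS, BANDWIDTH_TB_S, intensity)
    limiter = "memory-bound" if intensity < ridge else "compute-bound"
    print(f"{intensity:6.1f} FLOP/byte -> {bound:7.1f} TFLOP/s ({limiter})")
```

In this toy setup, any workload doing fewer than 200 operations per byte is limited by memory rather than by the math units, which is why more bandwidth translates directly into delivered performance for such workloads.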
Brett Simpson
analyst
Maybe we can talk a bit about inference, Mark, because I often say this. We're in the R&D stage of AI, very training-centric today. But as we deploy services, and we're just starting to see Copilot and ChatGPT get rolled out, but there's a wave of things going to come. Obviously, the deployment phase is going to be pretty significant for inference demand. Can you maybe just show us how you're thinking about inference over the next couple of years? And do you think we're going to see a bifurcation, where training platforms need lots of large clusters? Inference maybe takes a different direction, and maybe there's -- you don't need clusters and maybe less HBM. And maybe it's a different product line. Can you maybe just talk a little bit from a computer architecture perspective what's different about inference requirements? And -- yes.
Mark Papermaster
executive
Well, first of all, there, of course, is a big difference between the training and the inference. I mean the training is much more dependent on just the raw computation, the floating-point operations per second that you can build out in a vast cluster for the toughest AI training needs. So as you have ever larger models, I look at GPT-4, it's over 1 trillion parameters in that model. It's massive. And so there, you have to have the raw horsepower, Brett. And so it's about building out these massive clusters. And again, that's what we've -- we're attacking with MI300 because it does have a scale up, building the base compute node. And if you look at our capabilities, we partner with the industry. We don't have just one solution to scale out and build clusters. We partner with the ecosystem to give vendors options as to how they can build out that training infrastructure and tailoring the networking providers that they have, tailoring how they may use different vendors to provide that solution. So we're very, very open as we build out our training solutions. That trend will continue. So larger and larger clusters for training. And for those large language model inferencing applications, it will still largely be GPU-based like what those massive LLMs need for the training. But they're different. Rather than being that same massive cluster build-out, there, it's about latency. How quick can you get that answer? Think, Brett, if you asked GPT or Bard, you're asking that question, it's a very broad question. You're waiting for the response. You need that calculation to be done very, very quickly. And again, that's where the memory configuration is extremely important in how you architect to improve latency. 
And so those trends with LLMs will continue to build out in terms of large GPU-based clusters. As you get more and more applications, and those applications aren't just broad LLMs and artificial general intelligence-targeted cluster build-outs, you start seeing more bespoke inferencing applications. I've trained my model, but now I actually want to tailor it and I want it to do more bespoke tasks. There, you will see more bifurcation, Brett. You're going to see leverage of -- with a well-trained model, and without that being a massive model size, you're going to be running on the CPU build-out that you already have today in running your business. You're going to see, as I said, more edge applications. Llama 2 is a great example of a model. There's Llama 70B, 70 billion parameters, and then moving down from that, smaller models that actually are incredibly effective. So when you take the problem space down that you're addressing with that AI application, the inferencing can become more and more concise and less power demand to get the job done and less compute demand. So it's more energy and cost-efficient. That's the bifurcation you're going to see: continued build-out for the broad LLM in the search for AGI, and then a drive to much more cost- and power-efficient solutions for the broad spectrum of AI applications, which you now see in development. And we can talk more about that in our chat here, but there's a lot of other elements underneath that, including, we think, open source will be a huge factor in the build-out of those applications, but we can get to that later.
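A rough back-of-the-envelope sketch shows why model size and memory configuration drive inference latency and cost. It rests on simplifying assumptions that are not from the conversation: weights dominate memory traffic, every weight is read once per generated token (batch size 1, memory-bound decoding), and the accelerator has a hypothetical 5 TB/s of memory bandwidth. The 70-billion-parameter size is the one mentioned above; the 7-billion-parameter case stands in for "smaller models", and the function names are illustrative.

```python
def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1e9 params * bytes / 1e9)."""
    return params_billions * bytes_per_param


def min_ms_per_token(weights_gb: float, bandwidth_tb_s: float) -> float:
    """Lower bound on per-token latency if all weights are streamed once per token."""
    seconds = weights_gb / (bandwidth_tb_s * 1000.0)  # TB/s -> GB/s
    return seconds * 1000.0


BANDWIDTH_TB_S = 5.0  # hypothetical figure, not a product specification

# (label, parameters in billions, bytes per parameter)
CASES = [
    ("70B at 16-bit", 70.0, 2.0),
    ("70B at 8-bit", 70.0, 1.0),
    ("7B at 16-bit", 7.0, 2.0),
    ("7B at 8-bit", 7.0, 1.0),
]

for label, params_b, bytes_pp in CASES:
    gb = weight_footprint_gb(params_b, bytes_pp)
    ms = min_ms_per_token(gb, BANDWIDTH_TB_S)
    print(f"{label:14s} weights = {gb:6.1f} GB, at least {ms:5.2f} ms/token")
```

Under these assumptions a model ten times smaller needs ten times less memory and streams its weights ten times faster, which is the intuition behind running narrower models on smaller, cheaper and lower-power hardware.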
Brett Simpson
analyst
Sure, sure. I'm sure we can. Maybe just along those same lines, Mark, I wanted to get your perspective on ASICs. We've seen some news from NVIDIA the other day that they're going to be developing ASICs. And there's been a few hyperscalers that stood up and said, "We don't think GPUs are efficient enough for inference, so we're going to do our own ASIC." How do you look at the role of ASICs? I mean, I guess, on one hand, the technology is moving so fast, that doesn't lend itself so well to an ASIC model. But how do you see the ASIC opportunity? And does it make any sense for AMD to offer an ASIC solution to customers?
Mark Papermaster
executive
Well, there should be no surprise that you're seeing ASICs that are being targeted to an area by a large customer who has a massive compute demand. When they have a piece of an algorithm, an application they're running that's well defined and they are the absolute expert of that application, they control the parameters around it, then it makes sense to create a more tailored solution because you can really optimize and suboptimize on exactly the specific elements of a more stable application environment. Then you'll get the return of a tailored bespoke ASIC silicon device. We've seen that time and time again in the industry. I mean you go back, and I talked earlier about a comparison to the whole networking infrastructure build-out. You can look at Cisco, one of the leaders. They would often create tailored ASICs for a number of those applications, yet when it became -- and where you really need high programmability, it's the programmable devices that are used. It's FPGAs, and it's CPUs and GPUs that are being used. And you're seeing the same thing here. So we work closely with all of the players which are building, in fact, these tailored ASICs. And the size of the demand is, as I said earlier, growing so insatiably you actually need all of this. You're going to need continued growth. They need us and our competitors to continue to drive more performant, more efficient CPUs and GPUs, these heterogeneous highly programmable elements forward. And that allows very rapid innovation on algorithms. Because the hardware is not tailored to any one algorithm, it's supporting the basic math functions. It's accelerating those math functions. It's providing incredible bandwidth and low latency to memories and providing incredible scale-out efficiency. That will continue. And you'll see a continued development of these ASICs, which are focused on more specific algorithmic needs. 
So when you have such an expansion of TAM as we're seeing now, we need innovation on all fronts, and that's what we're seeing.
Brett Simpson
analyst
Interesting. And another topic that we've been hearing a lot about recently is power, data center power. And I guess as we see more AI compute, you talked about insatiable demand, the power requirements here are very different. The power densities, the power per rack, the requirements for water cooling. We're moving up to higher speed SerDes. How do you think about this challenge that -- is this going to slow down the industry? Do you think there's going to be a difficulty getting the power infrastructure that's needed to really deliver the demand, deliver supply?
Mark Papermaster
executive
It is an absolutely huge challenge. When you look at data centers today, they are power-gated. So it's not floor space, it's not another limiter. It is power-gated in terms of being able to drive up their compute capacity. And so what are the trends? The trend is that the basic building block that is predominant today for AI training and inferencing is the GPU -- that CPU/GPU cluster as it gets built out. And we are very, very focused on the innovation around that cluster. I mean the optimization of that CPU and how it works together with the GPU is critical. We've been doing that since 2007 -- the acquisition of ATI is when AMD started working on optimizing CPU with GPUs. But it's more than that, Brett. It's really what I'd call holistic design. So what we're seeing to drive the energy efficiency now is we can't just be a hardware vendor anymore. We have to be thinking about all the way through the application. So yes, we're driving energy efficiency in our design itself. We have many, many power-gating features, and we innovate at every generation on the power elements. We have entire microcontrollers whose job -- their only job that they're doing is optimizing the power of every program that is running on our compute clusters. We also work with our foundry partner. Our primary partner is TSMC. We operate in a deep, what we call DTCO, or Design-Technology Co-Optimization. And we're driving a very high optimization of how the transistors themselves and how they're manufactured drive less energy consumption. But then we work ourselves up the entire stack. We're putting elements in to really drive optimization with math formats, with approximation. So where you can run an AI, a math approximation, versus a -- let's say a traditional high-performance application would be running at 32-bit and 64-bit floating point. Well, HPC applications needed that incredible accuracy. AI applications do not. So that's another source of energy efficiency. 
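The trade-off between math formats can be seen in a small sketch using only Python's standard `struct` module, which can encode IEEE 754 double (64-bit), single (32-bit) and half (16-bit) values. This is an illustration of reduced-precision storage in general, under the assumption that those three formats are representative; it says nothing about which formats any particular AMD product uses, and the helper name `round_trip` is illustrative.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Encode `value` in the given struct format and decode it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Little-endian struct format codes for three IEEE 754 widths.
FORMATS = {"FP64": "<d", "FP32": "<f", "FP16": "<e"}

value = 0.1  # not exactly representable in binary floating point
for name, fmt in FORMATS.items():
    stored = round_trip(value, fmt)
    size = struct.calcsize(fmt)
    rel_err = abs(stored - value) / value
    print(f"{name}: {size} bytes, stored as {stored!r}, relative error {rel_err:.2e}")
```

Each halving of the width halves the bytes that must be moved and stored per value, at the cost of a larger rounding error. That is the kind of accuracy-for-efficiency trade the speaker says HPC workloads often cannot make and AI workloads often can.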
So it is this holistic design that we're driving. We've actually committed, from 2020, with our heterogeneous HPC and AI design base to drive a 30x improvement by 2025 with all those elements I just described, and we're on track to that. We're on track to that. So that's going to be a huge piece to manage the energy. And the other piece will be the trend I talked about earlier, that all models aren't created equal. You need models which are smaller, and therefore, a much more energy-efficient computing based on the application at hand. But I have to say, what you'll see then, what we will do is we will be increasing the power and generation on that base compute, CPU and GPU cluster for the largest LLM applications, and yet we'll be offering much more energy-efficient solutions as you look at edge applications and device applications. And so that -- it's very important, when people think about energy consumption of AI, that they step back and look at the macro. The macro is you're going to see an incredible growth of AI applications, and those AI applications all won't be on those highest, most incredible power consumption demands in the cloud, but they'll be moving to the -- as well to the edge and to end points.
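As a quick check on what a 30x improvement between 2020 and 2025 implies per year, the sketch below computes the constant annual factor that would compound to 30x over five years. The 30x figure and the 2020 to 2025 window are from the conversation; treating the gain as evenly compounded each year is an assumption made only for illustration, as is the function name `annual_factor`.

```python
def annual_factor(total_gain: float, years: int) -> float:
    """Constant yearly multiplier that compounds to `total_gain` over `years`."""
    return total_gain ** (1.0 / years)


TOTAL_GAIN = 30.0  # 30x improvement cited in the conversation
YEARS = 5          # 2020 to 2025

factor = annual_factor(TOTAL_GAIN, YEARS)
print(f"required gain per year: {factor:.2f}x")

# Cumulative gain at the end of each year under the even-compounding assumption.
for n in range(1, YEARS + 1):
    print(f"{2020 + n}: {factor ** n:5.1f}x")
```

Evenly spread, the goal works out to nearly doubling efficiency every year for five years in a row.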
Brett Simpson
analyst
And we see -- you mentioned some of the smaller models. Do you think AMD needs a separate road map for the low-end AI? Like model sizes are maybe under 100 billion parameters. Maybe MI300X is too powerful or maybe that's not what it was designed for. So do you think it makes sense to see a segmentation at some point where AMD maybe promotes some of the higher-performance client GPUs into a distinct category for lower, smaller AI, if you like?
Mark Papermaster
executive
Absolutely, Brett. What we have done at AMD is we were a leader in modular design. It's been a key to the turnaround at AMD over the past decade. And that modularity gives us a chance, the opportunity, that we've been deploying now for years. And that is to make sure that we rightsize the product offering to the task at hand. You will see that in our GPU road map. So again, the first thing we had to do was create competition at the most demanding training and inference applications. Because that's what proved that there can indeed be competition in the market. That's what earned us that seat at the table. And so what you'll see now is as we proceed, you'll see other variants. We already have today the MI250 in a PCIe form factor. So it's a much lower power PCIe attach for those smaller model sizes. It's incredibly cost- and energy-efficient when you have those smaller model sizes. You'll see a continuation of that type of PCIe product line going forward in our road map. And as well, we have added AI to every product in our portfolio. And so you are seeing it -- again, we already started last year with the Ryzen 7040 with our AI-enabled PC. We boosted the AI inferencing capabilities with our announcement at CES, and we brought it into the desktop form factor. And we have also a broad acceptance in our Embedded devices. You can look at Alveo and other aspects of our FPGA and adaptive compute road map that brings AI into Embedded. You're even going to see it in Gaming, because AI in gaming is giving you more energy-efficient image visualization and rasterization. So it is across our entire portfolio. And as I talked about Gaming, we have enabled now AI in our ROCm AI enablement stack to Radeon. And so that was a big announcement we made last year. 
And we look to expanding that AI support across Radeon, because that gives a broad range in our portfolio, our GPU-based portfolio, from the absolute optimized CDNA devices for the highest capabilities also now extended with our RDNA devices.
Brett Simpson
analyst
Yes, yes. Let's talk a bit about AI PC, Mark, because you were touching on some of the changes you've made to Radeon. But can you maybe share with us what is an AI PC? Is it a specific spec? Do you think that, in general, we should be expecting GPU attach rates in PC to climb because folks are going to want to buy a more performant machine that's accelerated PC for -- to do AI? Or is it specifically more about sort of like NPUs conforming to maybe the next operating system that Microsoft is going to roll out? But any sort of help from a high level? How do we -- how should we think about AI PC? And why are you so excited that this is going to be something that might drive an upgrade cycle for the industry?
Mark Papermaster
executive
Yes. Brett, it's a great question. And I do believe that this indeed is a new cycle for PCs. You think back and PCs were thought of a few years ago as potentially having lived their life. And what we saw is by no means. The PC remains the dominant content creation device. It's a dominant workhorse in both our business needs as well as any of our content creation and often our intercommunication needs that we all have in our work and business lives every day. And so what drives a new cycle on PCs? It's indeed new capabilities. So it is the new capabilities that can leverage an NPU, a neural processing unit, like we have embedded in our Ryzen 7040. We've actually shipped millions of units already, so we're out ahead of the curve. We're actually future-proofing those that purchase those Ryzen AI-based units today because the applications are now just starting. When you think about here we are in the video conference, and I don't have my background blurred, but often you might want to choose to blur your background. That's an application today that you can offload on to that neural processing unit, that inference accelerator and as well as some of the -- we clicked an option here that said, hey, let's create captions. Let's translate the words that you and I have into written text and have that captioning. That's another example that's today. Well, that's nothing compared to what's coming. Because when you look at the kind of applications, and Microsoft has spoken publicly that 2024 is the year that they're going to enable a very -- a broader set of AI inference on applications for the PC and a developers' ecosystem around that, it's going to create, indeed, a new cycle. Because now you're going to have capabilities we've never had before. This is an international broadcast we're doing today. So imagine if you're in France listening to this and say, I would like to hear this in my native language. 
Click French and you get a very accurate simultaneous translation on the device. And it's not tying up your capability to process the conference and be doing other tasks on the side. It's in that inference accelerator. Think about content creation, where you look at Adobe and others are creating an incredible capability to interface with the natural language and create new visual content capability. It's going to transform content creation. The PC is a content creation device. And with that AI offload capability and working in conjunction with that integrated CPU and GPU, it's just incredible capabilities that are coming. And Copilot. We're using a number of Copilot applications at AMD. We've -- are using that. I don't think there's an e-mail I create that I don't have Copilot help me draft that e-mail, certainly check the e-mail, make sure it's context-aware, it's appropriate. All of this is still at the very, very beginning, and we're really pleased with our work not only with Microsoft, but with many, many ISV developers out there. And so we're very excited with the road map. Just launched Ryzen 8000, and again, the first desktop CPUs that are out there to be AI-powered, and we have a very, very strong road map. So we're driving up significantly that very energy-efficient, very, very low-latency AI processing capability in our PC road map.
Brett Simpson
analyst
Great. We're going to open up to questions in a second. Maybe just one question before we do that. I wanted to ask about Intel 18A. We get a lot of inbound from investors who are quite keen for your perspective on the impact that 18A might have on Intel's ability to rebuild. And as they ramp their platforms for PC and server, does it make AMD want to pull in their process road map? Do you see sort of 2-nanometer-based chips as something that needs to happen quicker for AMD? Anything you can share in terms of that process upgrade coming from Intel and how AMD thinks about it from your own perspective.
Mark Papermaster
executiveWell, certainly. Intel has been very public on their efforts in driving their fab technology. And what we do at AMD is we always assume the competition will execute. So we drive ourselves to make sure that we will continue to be competitive and actually leadership capabilities. And that's exactly what we're doing in our PC road map. We, for years, Brett, when you go back really a decade ago and you look at where we were at before we launched our Zen product line and before we shook up the market with 7-nanometer and lower lithographies, we always competed with a disadvantage on process nodes. And so we had to really hone our design capabilities. And so you had to win by design and make up where you might have shortfalls in process. What we did from 7-nanometer and beyond in our client road map across -- as well across more broadly a road map, but I'll focus here on this topic with our Ryzen lines for PCs is, again, we leverage that DTCO, that deep technology co-optimization (sic) [ Design-Technology Co-Optimization ] with TSMC, and we've made certain that we are hitting the sweet spot on the curve of every node so that we get the right performance, the right power and performance trade-off and the right cost power and performance trade-off. And that's what's enabled us to position our road map very, very well. We've grown PC share tremendously since the launch of our Zen and our Ryzen product line, and we're very excited going forward. We have -- we worked with the right timing of when M3 and M2 (sic) [ N3 and N2 ] come on our road map. But I will say, again, it's not just process. You have to look at the core design. 
So we -- leveraging our Zen CPU product line, we continue to have our high-performance optimized cores, but we also have cores which are very dense and power-optimized, and we have the ability to support hybrid cores across design, so the applications that don't need the highest performance can run more energy-efficient, and the ones that do need that performance get it. And by the way, it's the same implementation of the instructions and architectures. So programmers don't know the difference of which core it's running on. It behaves exactly the same regardless of the hybrid core which it's being run on. So we put a lot of design innovation in and continue our deep partnership with TSMC on node technology as well as packaging technology.
Brett Simpson
analystYes. Great. Well, look, I think this is a good junction to open up for Q&A. So I have my colleague, Janco, who's also standing by to help on the questions. So Janco, do you want to take it away?
Janco Venter
analystYes. The first question comes from Akhilesh Kumawat from Bernstein. What is your strategy for ROCm to compete with CUDA? What are the key technical milestones for us to keep an eye on to validate ROCm's competitiveness?
Mark Papermaster
executiveNo, it's a great question. Thank you. ROCm, as I said earlier, is our software enablement stack. And it's critical because when customers run, they program often at a high level and framework. More and more people are programming today at a framework level, which is actually independent of the ultimate device you're running on. You're running on NVIDIA, you're running on AMD, and you have to have a stack that translates that high-level framework or a programming language like Triton, which is, again, vendor-agnostic, and you have to optimize it to really deliver that value, that total cost of ownership. So we've been developing for years and ensuring that we have a competitive stack. ROCm first went to a fully competitive position in high-performance compute, HPC, with ROCm 5.0. And all that while we were preparing in that stack as well for the broad range of AI applications. And when we announced ROCm 6.0 at our December 6 AI event, now it's out there, it's open source, which is a huge differentiator for us because it's not just us developing. We bring the community with us. And so it's now out there today. It's running highly performant. It's supported by all the widely used LLMs today. And as I said earlier, it's now expanded as well to Radeon. But I talked about how we're optimized for that general use case of running from a framework or a vendor-agnostic library. But the other thing that we've been investing on is the ability, if you did indeed code at a low level in CUDA, how to very, very efficiently port that over into ROCm, into our stack. And we've done that, and we have a number of customer testimonials out there. In fact, we shared that at our December event, of those who took existing CUDA applications and ported them over. Again, we're a GPU just like NVIDIA. We have a shared history, a shared journey over decades, and so it's not surprising that, that porting is a very straightforward process. 
There's a lot of work underneath because you have to have very detailed and performant libraries that get called from those GPU type of semantics, but we've done exactly that. We've gotten it out there. We've tested. We have a very, very strong at-scale suite, and now we've earned that credibility. We have a seat at the table. So we're working with customers, building out more and more of those use cases. I would point right now to -- for instance, Hugging Face has thousands of open-source AI LLMs out there. They are regressing not just on NVIDIA but also on AMD before any of those models goes through their nightly releases with updates. PyTorch, we're a founding member of the PyTorch 2.0 and the PyTorch Foundation. And again, we are a full-fledged supported offering of PyTorch. And across with more and more customers every day, we're building out those examples of where, in a very, very facile way, our customers are able to adopt and really get TCO advantage with ROCm. So it was an absolute turning point for us with ROCm 6.0. And we're not slowing down. We've created -- we're really growing the division we have. The AI group is an entire division of the company focused on our software capability and ISV enablement.
Brett Simpson
analystGreat. Janco, can we take the next question?
Janco Venter
analystThe next question comes from Fred Holt from Polar Capital. There was some news that blew up on Twitter overnight, and that has been around for a while, about a company called Groq, which is claiming that it can blow everyone out of the water on AI inferencing. They have designed and built an LPU that uses SRAM memory and not HBM, of course. Given MI300 is looking to take share in the inferencing market, do you have any view on Groq LPUs versus AMD GPUs?
Mark Papermaster
executiveWell, what I'll say broadly is there is an incredibly broad range of inferencing applications. And it turns out that the correlation of what you're trying to do and your inferencing to how the model was actually trained and how it can be deployed in a very, very broad inferencing application really matters. So if you're looking at the largest LLM applications that were trained on GPUs, it turns out the inferencing is -- needs to be run at scale on GPUs. That doesn't mean that there's not room for innovation across inferencing. We're going to see -- per my comment earlier, there's so many inferencing applications that there's going to be just a huge continued demand for our CPU/GPU-based clusters on both training and inferencing. You've seen how we've expanded our portfolio to have the kind of tailored inferencing needs where it makes sense. We talked earlier about ASICs that large hyperscalers are creating. And of course, we now have start-ups that can create efficient inferencing applications. But the trick is to look at those applications and say, what is the software stack and application demand? Can it be widely deployed? Or is it, in fact, a more bespoke application, which is fine. That still means that there's markets for it, and that's exactly what you're going to see. But again, the compute capacity demands are growing so astronomically that when you hear these announcements, you don't have to think, oh, well, this must disrupt someone else in the industry. It's not necessary at all. Of course, there's -- I would never say there's not an opportunity for disruption, but I haven't seen it yet in the AI space. What we're seeing instead is just a broadening of the tool chest of engines that can be applied to the broad set of training and inferencing applications.
Janco Venter
analystGreat. The next question comes from Greg Hart from Viking Global. Can you talk about AMD's AI networking road map and strategy to close the GPU networking gap versus NVIDIA in building multi-GPU training clusters? To what extent does AMD plan to develop its own networking silicon or new networking fabrics to improve training performance?
Mark Papermaster
executiveThanks. It's a great question and absolutely pertinent in terms of our scale-out capabilities to take on those largest training clusters, and we're incredibly focused on that space. First, what I'll say is we differentiate from our competitor in this regard in that just like our software stack is a fully open software stack and is explicitly put out there to drive our collaboration and community engagement all the way down to our library optimizations up through the end application stack. Likewise, when you look at our hardware build-out strategy and our networking and scale-out capability, it is the same. It's about ecosystem. We announced at December 6 that we're opening up the key aspect of our Infinity Fabric, that fabric that allows our CPUs to GPUs and our GPU to GPUs to communicate very, very effectively and efficiently. And we're opening up the key aspects of that specification to networking vendors so that in their plethora of solutions that they have today leveraging Ethernet, Ethernet is the most vastly used scale-out networking capability out there. And so adding that protocol to be able to very efficiently scale out our GPU complex and to build the largest of training clusters is a fundamental in not only our training capabilities, but the ecosystem around it and our customers having choice with what vendor configurations that they may wish to use in building out those huge AI clusters. The other aspect is in the pod build-out itself. How do you build out that GPU cluster? That does involve the know-how that we have because we're using -- we're finely tuning those -- the scale-up capability, and we have that know-how in AMD. So we have -- through the acquisitions of Xilinx and Pensando, we have outstanding networking skills, and they're focused on ensuring that we have the most efficient scale-up capabilities. 
The other thing you're going to see as well is innovation around switch devices that can really bring that type of capability to expand GPU clusters to a higher radix using industry standard switches, and we're excited about that as well in future road maps.
Brett Simpson
analystExcellent.
Janco Venter
analystThe next question comes from [ Richard Clove ] from [ Yannis ]. Sam Altman and [ Musk ] all jumping on the AI silicon bandwagon there. But if you started today, even with limitless money, how long before you had a viable chip in production with the necessary software stack?
Mark Papermaster
executiveWell, that's -- it's a great question. It's a tough question. I mean I made the analogy earlier to our journey. Frankly, our journey is decades long. People might look and say, look at that, AMD came out of nowhere, and here's this AMD Instinct MI300 that is taking on -- not only taking on NVIDIA, beating them in these key large language model inferencing applications. It didn't come out of nowhere. It came out of a long journey that we've been on, hardware and software journey that we've been on, the GPU heritage that we have, which is so many years in the making. And again, it took all of that to earn the seat at the table to where we now really understand where the algorithms are going. The hardest thing for a start-up today is if you haven't earned that trust of the leaders in the industry, you're not understanding where the new algorithms are going. You're not understanding where the new demands are, where they need flexibility and programmability, because the change is occurring so quickly. So I think it is a high barrier to entry. There will, of course, be more competition. And I think where you'll see the most traction is where it's focused on more bespoke applications that you can tailor and you can develop a unique advantage and you can gain that toehold in the market. For these broad, highly performant GPU applications, that's where the highest barrier to entry would be, and again, one that we were working on many, many years to arrive at where we are today with our leadership Instinct road map.
Brett Simpson
analystJanco, next question, if we can.
Janco Venter
analystThe next question comes from [ Peter Rector ]. Will you bifurcate your CDNA devices at some point, offering a device optimized for scientific compute, double precision optimized, and a device optimized for AI, small data-type optimized?
Mark Papermaster
executiveNo, it's a good question. And what I'll tell you is if you look at what we've done across our road map, we did bifurcate, first of all, our GPU road map with CDNA and RDNA. So one, the RDNA to focus on gaming applications. It still supports AI and it's going to share the base constructs and the learning that we do of AI optimization with our CDNA, but it's gaming-focused. The CDNA, on the other hand, is all about HPC and AI. And we'll mature that road map over time. At this point, all of our devices do support both HPC and AI, meaning for HPC, it does support those high-precision floating-point operations, FP64, FP32, single and double precision floating point which you need for HPC. But we'll continue to watch the space. What I will tell you is we're really seeing HPC and AI on an eventual convergence path. The developers of HPC are finding that they can, in fact, in many cases, take advantage of the math approximations. And so we'll continue to watch the space. If it's necessary to have a version of CDNA that -- one that supports single and double precision and another version that doesn't, if that provides a strong TCO advantage, we'd absolutely consider adding that. But we're going to listen to our customers. It's what we do best at AMD. We partner, we listen, we will adapt our road map to what our customers need.
Brett Simpson
analystGreat. I think we've got time for one more question, Janco, so we'll make that the last one.
Janco Venter
analystThe last question comes from McLane Cover from First Republic. Can you talk about instances and applications where AMD's GPU can be used instead of NVIDIA's H100?
Mark Papermaster
executiveAny application. I mean if you don't have something that you wrote at a very, very low level to some unique NVIDIA constructs, which you can. I mean you can actually code such that you are tied to an H100, so those, you have to segregate and put aside. But we are a high-performance data center GPU that supports all the common frameworks. We support even low-level semantics that may have been written in CUDA that you can port over unless you used, as I said, some very unique elements that are highly proprietary. So it's -- we are after indeed the broad TAM of data center GPU compute.
Brett Simpson
analystAnd does that -- Mark, just to be -- just to clarify, are you seeing more demand this year for internal requirements? So some of your customers keeping the GPU -- rather than put them in public cloud, they want to keep the GPUs for their own use? Or are you expecting the majority to be rolled out into public cloud? And when do we see instances with MI300 in the market, roughly?
Mark Papermaster
executiveYes. Great question. I mean this is by far the fastest product launch that we've ever had. We first -- Lisa put out there that we thought we had a $2 billion market opportunity in 2024. She revised it at our last earnings to greater than $3.5 billion, and we're tracking to that growth rate. It is being led by first-party applications, because you have hyperscalers who are really starved for the kind of compute capabilities that they need. But you're going to see a quick follow. You're going to see in the first half of this year, here we are in February, but in the first half, you will see instances stood up that are there for third-party applications. And you're seeing the OEMs that will be servicing the enterprise market and bringing their products to market. And so we're really -- it is not just the fastest ramp that we've had from a revenue opportunity, it's the fastest ramp that we've had across a broad data center application suite, from first-party applications, third-party applications and enterprise.
Brett Simpson
analystExciting times. Yes. It's going to be an interesting year.
Mark Papermaster
executiveWell said.
Brett Simpson
analystMark, I just wanted to say a huge thanks for coming on today and sharing your perspective on what is a really exciting time for AI and for AMD. And fascinating discussion. Thanks very much. And I also wanted to extend my thanks to Suresh and Mitch, who are on the line as well, for making this possible. Any final remarks, Mark, that you wanted to leave with the audience?
Mark Papermaster
executiveYes. First of all, thanks for having me. I mean we're really excited. We're passionate about what we're doing at AMD. We're thrilled to be bringing competition to the highest performing levels of AI, and we're thrilled to be bringing AI across our entire portfolio, all with one software stack that we have essentially developed and supported in the company. As you said, Brett, exciting times ahead. Thanks so much.
Brett Simpson
analystYes. Great. Appreciate it. Great discussion. Thanks, Mark. And thanks, everyone, for dialing in. Thank you.