Broadcom Inc. (AVGO) Earnings Call Transcript & Summary

March 20, 2024

NASDAQ US Information Technology Semiconductors and Semiconductor Equipment special 130 min

Earnings Call Speaker Segments

Ji Yoo

executive
#1

Good morning, and welcome, everyone. I'm Ji Hung Yoo, Head of Investor Relations at Broadcom, and welcome to Broadcom's Enabling AI Infrastructure event. On behalf of Broadcom's executive team, I'm pleased and excited to welcome our in-person attendees and our virtual audience. As a reminder, today's presentation includes forward-looking statements. Please see our recent filings with the SEC on risk factors that could cause our actual results to differ materially from those forward-looking statements. Today's presentations are being webcast live, and the slides and recording will be available on Broadcom's IR website after the conclusion of the investor meeting. With that, I'll walk you through the agenda. We're going to start with an overview with Charlie Kawwas, Broadcom's President, Semiconductor Solutions Group. Following that, Ram Velaga, General Manager of our Core Switching Group, will discuss scalable AI networks. Jas Tremblay, General Manager of our Data Center Solutions Group, will talk about AI server interconnects; and Near Margalit, General Manager of our Optical Systems division, will talk about optical interconnects. Following will be Vijay Janapaty, General Manager of our Physical Layer Products division, talking about our foundational technology for AI interconnects; and Frank Ostojic, General Manager of our ASIC product division, will talk about custom AI accelerators. And in conclusion, we'll have closing remarks from Charlie Kawwas. Now with that, after a short video, I'm pleased to introduce Charlie Kawwas, President, Broadcom Semiconductor Solutions Group. [Presentation]

Charlie Kawwas

executive
#2

All right. Well, first of all, welcome to Broadcom semiconductor headquarters in San Jose, California. We're at the heart of Silicon Valley and I'm very excited you're here with us. This is the first time we actually have such an event here in our campus, and I'm glad you'll be here with us to actually visit our labs later on. First of all, thank you for taking the time. Some of you told me were in London yesterday. Others came actually from Asia, Others traveled across the country. So thank you for taking a couple of hours this morning to be with us. And for those who are on the webcast, including my 82-year-old mom who dialed in internationally, hello, mom, I love you. Thank you for being with us. With that, I have to tell you, we're at an inflection point at this point in time with incredible technology that everybody is talking about, by the way, not only this week, I think we'll be talking about this for a while. This is going to be an inflection point in the industry that will change our lives and how we work. You're going to see us here with my colleagues committed to pushing boundaries and pioneering breakthroughs on enabling AI and infrastructure. And with that, the video that you just saw, the script of the video, the images of the video, the actual video and the voice all was generated using a generative AI tool that's based on Broadcom technology. It is incredible what this technology will be able to do for us. Now as the video said, our heritage and DNA is based on technology innovation, and it goes back more than half a century, actually, more than 60 years ago. The transistor that powers every single chip that we use in our daily lives was invented by this company, by Bell Labs. The laser that connects all the data centers, the clouds in the world was also invented by this company by HP Microelectronics, which is the grand parent of Avago. For the past 20 years, we've been driving several acquisitions. 
I would say up until 2016, we've actually acquired all of the semiconductor franchises that I'm going to talk to you about specifically in networking. Since then, we started acquiring software companies in 2017, starting with Brocade and up until November last year, the latest acquisition, VMware. That created the infrastructure software group. Today, we're going to be focused on the semiconductor side and specifically AI. Now the interesting thing is in the last 8 years, we have not acquired a single semiconductor company. And over that period of time, we have driven organic growth. Just to take a look at the last 5 years, the business was $17 billion in 2019. We finished last year at $28 billion. That's all organic growth, and that is all based on the heritage and DNA that you saw in the video and you're going to see throughout the day today based on technology leadership. That represents about 13% CAGR, much faster than the semiconductor industry. And that's all based on large scale of investment, over $3 billion of R&D that we invest in this business. Over the last 5 years, that's $15 billion of R&D that drove this organic growth. Not too many semiconductor companies can say they do that. Now how do we deliver this organic growth? We have actually a very simple strategy that Hock and I apply across Broadcom, and we've been doing this actually for the last 2 decades. It actually has 3 pillars. The first pillar starts with the market, not what many people think of Broadcom when we do and implement the strategy. So we choose markets that are durable. And we look at these markets for a span of 10 years. And the first question we ask ourselves, will this market be there in 10 years from now? Now most CEOs that I worked for in the past basically look for green shoots and markets that have hockey sticks. We are not interested in that, if that's all the markets you looked at. We're actually very interested in markets that will be there in 10 years. 
And by the way, some of the markets that we're in actually decline low single digits. And by the way, that's actually very interesting for us as long as the market is durable. Now in the case of AI, which we've been investing in for almost a decade, we just happened to hit a market that's actually growing significantly, and we're happy to be part of that. But that's not the criteria. Now the second pillar, which is the most important pillar, is technology. And that is the heritage of the DNA of Broadcom and especially the semiconductor group here. That is the area that we're determined in that market that we will bring leadership and innovation. And you cannot do that without investments, R&D investments, and you cannot do that without the engineers who you see here on this campus walking around. These are the engineers who actually bring us this leadership and bring the technologies we're going to share with you today. That is the most important thing for us at Broadcom. And the third thing is as you play in that market for 10 years and bring the best technology over that span of time, ultimately, in these spaces that we play in, with excellence in execution and seamless capabilities, we tend to be #1 in each of these categories. And as a result of these 3 things, we have coined a term that's called sustainable franchise. And that is the core definition of every business unit or division we have. So today in Broadcom, we have 26 of them. And inside the semiconductor group, there's about 17 of them. Now out of these 17, we've selected 5 end markets that we play in. Networking, which is the largest market for us and the largest business for us, wireless, server storage, broadband and industrial. And as I said, out of the 26 sustainable franchises, 17 of them are part of this group. Today, we're going to focus on networking. And inside networking, we're going to double-click on the subsegment that is AI and how do we enable AI and infrastructure. 
So let's start with the market, that's the first pillar. The way we look at this market is it is composed of 2 distinct markets. One, is the consumer AI space, and this is a space that has very few players, but these few players have billions of users. And the way they make money is based on ad and end user and consumer engagement. The interesting thing over the past few years that we've learned with them is that engagement is directly coupled to the amount of investment that they put in artificial intelligence and machine learning. And as a result, each of these consumer AI players, who are the majority of the market today, are investing tens of billions of dollars in this space. On a personal note, I see that working with my kids. You probably do the same. And it works. And the larger the cluster they build, the better the engagement, which means a better financial returns. The other market is on the enterprise side, which could be cloud or on-prem. In this space, a lot of people are trying to invest in AI. But to be honest with you, the business case is yet to be proven. I think there's lots of initiatives, AI initiatives. I've spoken to several CIOs yesterday, it's still wait and see. Each of these folks are building small clusters to trial these, even the cloud guys are trialing some, but there is no real tangible business case as we see on the consumer AI. So let's go to the second pillar, which is the technology, which products do we believe play in here? From a Broadcom vantage point, we focus on 2 products. One is what we call the AI accelerator. Many people call it GPUs, some people call it TPUs, others call it NPUs. For today, let's refer to it as XPUs. There are 2 ways you actually can develop these products. One, you can develop a general product that fits everybody's needs, which is great. 
However, if you're a consumer AI company and you're building these large-scale platforms, these general processors or GPUs are actually too powerful in terms of power consumption and too expensive to actually deploy into their networks. Some of them have no choice today because they don't have the ability to do a custom capability. But the few that have the scale of billions of users generating hundreds of billions, over $0.5 trillion in revenue, have that capability. And that's why we coin that as custom XPU or custom AI accelerator. Underpinning all of this, you have to connect all these XPUs and hence, you need a networking technology or AI connectivity. And the de facto today and for the future, will be a merchant play. So today, you will hear my colleagues and I focus on the consumer AI build-out, the very large scale build-out. From a product point of view and a technology, you will see us focus on the custom AI accelerators and then ultimately, showing you the entire AI connectivity portfolio that Broadcom has. And in both of these categories, our execution over the last 10 years has been stellar, #1 in each of these categories with amazing execution. So when you look at the strategy that I showed you, we're selecting the market to be in consumer AI on the semiconductor side. We're selecting the 2 products and technologies that we're going to invest in billions of dollars. And ultimately, our track record of execution plus what we're going to show you coming as well throughout the day today will keep us in that leadership position. So let's talk about that journey. We have not started looking at AI in the last 1 or 2 years. We've been in this for a long, long time. The revenue of AI in the semiconductor has been less than 5% for the longest time up until 2022. We're actually started seeing these consumer AI starting to spend a lot of money. And as a result, that jumped by more than 2x to 10% of our revenues in the semiconductor space. 
In 2023, we said we'll hit 15%, we hit 15%. And we predicted and targeted that we will do 25% for 2024. Well, guess what? It has accelerated. And at this point in time, we revised our forecast, and we said we'll actually do more than $10 billion of revenue in these 2 categories that I shared with you. Now with that, let me start with the first category. That first category, because it's very few players, it requires very deep strategic and multiyear as well as multigenerational engagement with these very few consumer AI players. And we're very proud of these engagements. We actually do them outside the franchises in AI. That's what we do with many of our customers. So that's in our DNA. We've applied this with the first customer for a decade. And my colleagues will show you that engagement. Well, they'll actually show you all the XPUs we've built, are building and planning to build. And that's obviously been in production for a while and will continue over the next few years. The second customer, we were pleased to share that with you recently. I think many of you celebrated that addition of another customer. And we just started ramping earlier this year and on production today. Very pleased to have had at least 4 years of engagement with that customer, multiple generations that we built, and it is in production at this point in time and will continue for the next few years. Well, since you're all here today and you traveled here, we wanted to actually share with you that we actually have a third customer. I don't hear any excitement. Come on. So we're very, very honored and pleased and happy to tell you that third customer is also in the consumer AI space. And we are in the ramp phase. And we will be shipping products in the next few months to that customer. And this is something that we believe will continue as well over the next few years. So in this space, the key customers around the world are probably -- can be counted on one hand. 
And 3 of them, we have deep multiyear strategic engagements that we're very proud of, and my colleagues here will share more details on that. Now this is a build-out that's been happening for a few years, and this build out, if we go back to 2 years ago, started with a cluster at the time that was state-of-the-art with 4,096 XPUs. The XPU at the time was a couple of hundred watts, and to interconnect 4,000 of these was fairly, compared to today, simple single layer networking layer using our Tomahawk switches. And at the time, we're pleased to actually have achieved that with a single customer. In 2023, we actually built a cluster that is using this XPU, that is over 10,000 nodes of XPUs and it requires 2 layers of Tomahawk or Jericho switches to achieve that. And this is the lowest power XPU in the industry today, merchant or custom. Less than 600 watts and using the latest technology. That's been shipping since last year. Now as you go towards 2024, we were going to extend this to over 30,000. And then the plan and the objective of many of these consumer AI customers that we have is how do we take this to 1 million, hundreds of thousands and 1 million. And as you could imagine, we're going to need breakthrough technologies to do this. So while this is what we have done and are shipping today in massive volume, this is what we're working on. And I wanted just to show that to you to show you the contrast in terms of the size of XPU and the capability that this team and these engineers in this facility are innovating. Here's -- and it's hard to see this here, but during the demo session, you'll be able to not only see, but touch some of these cool toys that we have. I call them toys because many of our engineers, including me, really are driven through some of these breakthroughs. So a couple of things about this cool XPU. It has 4 of these things on 1, not 2, 4 of these things. 
It has 50% more than what anybody else has announced they're going to be able to do in terms of bandwidth and in terms of memory. You can actually see the memories here, there's 12 of them. These engineers were able to fit 12 of these HBMs, the latest and greatest high-bandwidth memory, on a single chip. Here we have 6. Others have 8. We have 50% more. The chip-to-chip connectivity that you see with the 4 core dies versus 2 can do 25% higher speeds than anybody else can do in the industry. Remember, the tagline from the video was we connect everything. Connectivity is what we do. It's in our DNA. So when you have to create an XPU that has core dies that need to connect to each other, these core dies have to connect to HBMs. And then these HBMs and core dies have to connect to a chiplet that takes all of the bandwidth out. That is what Broadcom does. That's what we're good at. This is our heritage, and we can do it better and faster and in much lower power than anybody else. Now with that, what might be interesting here, I thought to share with you, is how do you build the cluster? How many of you have seen how to build a cluster? Just raise your hand. Nobody? Impossible. Okay. Well, let's build one together. So it starts with an XPU. Typically, there's 8 of them, unless you're Broadcom and building custom XPUs, and you'll see a server, an open server in the demo area that can have 12 or 24 XPUs, not 4 or 8. We can have as many as 24 because when you customize it, you can actually significantly cut the size and power. You have to connect these together and that function is called scaling up these processors. That can be done through just directly meshing them or having a PCIe switch or a proprietary switch or even an ethernet switch. After you do that scale-up function, you bring x86 or ARM processors that you need to use as the control plane to interconnect with these XPUs. And that interconnect is done through PCIe switches. 
And to get these to exit the server, you need network interface cards, and these are the NIC cards that you see in here. This basic building block is called the AI server or the AI node. That is the basic building block. Now here's the cool thing. Broadcom's color is red. Anything you see in red is part of my SAM. This is what we're interested in today. So if it's a custom consumer AI, these XPUs are part of the SAM we play in. When you scale up, that's part of the SAM we play in. You have PCIe switches and NICs, that's part of the SAM we play in. We don't build processors, x86 or ARM processors. These are blue. We do not play in that space. Now we have to take many of these servers and scale it out. That's the next step of architecting a cluster. And the cool thing about doing this when you scale it up, is you're going to have to use the best networking technology. And we believe that the best networking technology is ethernet. And as you scale up typically in 10,000 or 30,000 nodes, you need at least 2 layers of leaf and spine. And you will hear my colleagues here talk about the flexibility and architecture of how we do this. We're the only company that actually can support multiple ways of doing it, either based on the Tomahawk leadership products we have or the Jericho leadership products we have. So scale up and scale out is the back end. But look, all of this as a training cluster has to go to the Internet. And to do that, you need top-of-rack switches, spine switches, also Tomahawk typically, could be Jerichos, and that's called the front end. Once you do that, you need to interconnect all of these things with cool, golden colors. These are our optics products. And guess what? Now everything that you see that's in red and has gold colors is part of our SAM. This is a beautiful picture. Now it comes with some challenges. If you scale this beyond 30,000, to hundreds of thousands and 1 million, guess what? The #1 problem you're going to have is not money. 
Even though your CFOs will not be happy with any of you to spend that type of money, even if they give you unlimited funds, power is the #1 problem. And just to give you an example. Remember, the lowest power XPU that's in the industry today is ours. That's about 600 watts. The next one that's coming out either from Broadcom or others would be in the range of probably 1,000 watts. If you want to do 30,000 of these this year, just the XPUs will use 30 megawatts of power. Well, guess what? Most data centers, that's the maximum power they can give you. You have not put in power supplies, you have not cooled it down. You have not added networking. You will not be able to run that with the power strategies that exist today. Two, complexity. This is a heterogeneous system. And this system needs to figure out a way to scale amongst multiple players in the ecosystem. It cannot all be built by a single company. There's no single company in the world that can build everything in that data center or cluster. There are people that are required to work with each other. Some companies might have the most important and critical capabilities in here, but not all of it. So at the end of the day, it has to have some support at the ecosystem level. And to solve these, there are 3 things that we are investing our technologies in. One, we believe, this important inflection point in the industry has to be open and it has to be driven by open standards like ethernet, PCIe and other standard capabilities at the memory level even. I'll give you an example in here where some of my colleagues at foundational technologies level have supported the IEEE standard of 100-gigabits per second, meeting the spec of 2 meters reach, but we went far beyond that standard to double that to significantly lower power and cost and enable new architectures that the standard today cannot support unless you actually can exceed that. Yet, we're fully interoperable. Two, scale. How do you scale to a 1 million cluster? 
The most important thing in these architectures is not going to be just the XPU. Actually, our vision and the way we see this moving forward is going to be centered around the network. As you move beyond 10,000 and 20,000 and 30,000 XPUs, it becomes a distributed compute challenge. A distributed compute challenge will not be solved with the best networking architecture that will be out there. That is a commitment that we're putting in place and scaling up and scaling out and interconnecting these networks. And lastly, but most importantly, power efficient technologies. And to do this, we're back to our DNA, technology innovation and leadership and delivering it in a sustainable way. So my colleagues next, Ram will cover the scalable networks ethernet piece. Jas will talk about server interconnect and PCIe. Near will show you some of the coolest and greatest optics technologies and foundational technologies such as SerDes and DSPs will be covered by Vijay. And to bring all of this together, we have Frank covering custom XPUs here. And with that, I'd like to pass it to Ram.
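The power arithmetic quoted earlier in these remarks (30,000 XPUs at roughly 1,000 watts each, against roughly 600 watts for the current part) can be checked with a minimal sketch. The function name is hypothetical and introduced only for illustration; the estimate covers the XPUs alone and, as the speaker notes, excludes power supplies, cooling, and networking.

```python
def xpu_power_megawatts(num_xpus: int, watts_per_xpu: int) -> float:
    """Power drawn by the XPUs alone, in megawatts.

    Excludes power supplies, cooling, and networking, which the
    remarks above point out come on top of this figure.
    """
    return num_xpus * watts_per_xpu / 1_000_000


# Figures quoted in the remarks: 30,000 XPUs at ~1,000 W each.
next_gen_mw = xpu_power_megawatts(30_000, 1_000)

# Same cluster size with the ~600 W XPU described as shipping today.
current_gen_mw = xpu_power_megawatts(30_000, 600)

# The 1 million XPU target, assuming (hypothetically) ~1,000 W per XPU.
million_mw = xpu_power_megawatts(1_000_000, 1_000)

print(f"30,000 XPUs at 1,000 W: {next_gen_mw} MW")
print(f"30,000 XPUs at 600 W:   {current_gen_mw} MW")
print(f"1,000,000 XPUs at 1,000 W: {million_mw} MW")
```

Under these assumptions, a 1 million XPU cluster would draw about a gigawatt before any overhead, which is why power efficiency is listed as one of the three investment areas.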

Ram Velaga

executive
#3

Good morning. My name is Ram Velaga. I run the switching and routing group at Broadcom. Been doing this for about 12 years. Prior to that, I was with a large OEM in the networking business for another 12-plus years. So about 20-plus years of experience in this business. Before I get started, a couple of things I really want you to think about, right? One is, are we ready to go back to the age of the mainframe where everything is vertically integrated, chips, hardware, software, application stack. But more importantly, to go back into the world of the mainframe, where you're hoping that you can get hyperscale volume. That's one, right? Or would you believe in actually an alternate thesis, an alternate thesis that basically says, look, to achieve scale, the history of technology basically says, you were able to take individual blocks that all interoperate across a common set of standards and you're able to build them with economics that you could only see by taking individual blocks. That's thesis number one versus two, that you need to think very hard about. The second thing I'd like you to think about is this is a distributed computing problem. If there's one thing I'd like you to take away from here today, it is that in a distributed computing problem, it doesn't matter how big a GPU you make because it's not big enough to have the entire workload run on one GPU. You have to take multiple GPUs and connect them together, right? Whether they're connected inside a data center or across data centers, you cannot get around the fact that you have to connect multiple GPUs. Once you accept the fact that it is a distributed computing problem and you need a network, then I would make a very strong case for you that the best network in the world over multiple generations, over and again, has been ethernet, right? And I want to give you a few examples on thinking about why it's ethernet. 
About 10 years ago, we had these large cloud customers who were building large-scale data centers. They came to us and said, look, we need a lot of bandwidth in the switches. Even large OEMs said, look, you don't need this much bandwidth. We're not building those switches. They came to us and as a merchant silicon vendor, every 18 to 24 months, we double the bandwidth of our switches, right? And today, when you think about almost all the cloud deployments in the world, they're all based on merchant silicon switches. Six years ago, large telcos like AT&T came to us, and they said, hey, we want routers because we want the same economics that our cloud customers are deploying or our cloud competition is deploying, can you help us? Most people in the world said, no way in hell that you can have a large service provider, a telco network like AT&T's core network running on merchant silicon. Six years later, everything is running merchant silicon. 60% of traffic of AT&T's core network today runs on merchant silicon. Why am I telling you this? 18 months ago, I've heard a lot of people tell me, InfiniBand is going to rule the world. Every machine learning implementation is going to be based on InfiniBand. There's going to be no ethernet. I can tell you today, some of the largest deployments are based on ethernet and they're continuing to get deployed in the next year over ethernet. And as Charlie mentioned about it, what we're scaling to is a scale of 1 million-plus clusters, right? Now why do you need 1 million-plus GPUs in a cluster? Could be anything. It could be a large language model of whatever the reason is, but we take it for granted that the scale is needed, right? When this kind of scale is needed, really what ends up happening is the only way you can connect this is to have a network. By the way, so this is a picture of Google's cloud about 20 years ago, okay? And you look at this picture here. Do you see a mainframe? Anybody sees a mainframe here? No. Okay. 
What do you see here is a bunch of commodity servers. And they actually had the hind -- foresight back then to saying, they didn't pick the most expensive or the highest-performing CPU. You can go back and read literature on this. Instead, what they picked was the CPU that offered the best cycles per given cost. And they took that CPU, built these boards and they connected it over the network. You see all these blue, yellow, orange cables? That's all ethernet. So they built the world's largest distributed computing system that was not just inside a rack, but inside a data center, but actually extended between data centers, right? That's how you build the world's largest compute systems. And that's where they actually used ethernet to build it, right? Now if you think about it, now Sun actually coined this network as a computer. I didn't understand it 20 years ago. But as you actually start to think about the scale of the problem we're facing, this really comes to fruition. The idea that the network is the only way you can actually distribute computing and build a very large computer. Now you'd say, okay, great, networking. But why is networking so important, right? And this was a slide that was presented about 18 months ago at the OCP by Meta. What they showed in here is when they are running these large workloads, anywhere between 18% to 57% of the time, the traffic is just sitting in the network. That means during this period of time, the GPUs are actually sitting idle, right? Now think about it. If on an average, somebody is charging somebody between $20,000 to $30,000 per GPU and you've got 100,000 GPUs, you're talking about a $2 billion to $3 billion infrastructure. And if $2 billion to $3 billion infrastructure is sitting idle for 18% to 57% of the time, that's a lot of money, right? So the whole idea that, look, we have this expensive infrastructure, but it's sitting idle because traffic is sitting in the network. We need to fix it. 
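The idle-infrastructure arithmetic above can be written out as a short sketch. It uses only the figures quoted in the remarks (100,000 GPUs, $20,000 to $30,000 per GPU, 18% to 57% of time spent waiting on the network); the function name is hypothetical, and treating idle time as a proportional share of capital cost is a simplifying assumption made here for illustration.

```python
def idle_capital_usd(num_gpus: int, price_per_gpu_usd: int, idle_percent: int) -> int:
    """Share of GPU capital cost attributable to time spent waiting on the network.

    Assumes idle time maps proportionally onto capital cost, which is a
    simplification for illustration. Integer math avoids rounding noise.
    """
    capex = num_gpus * price_per_gpu_usd
    return capex * idle_percent // 100


NUM_GPUS = 100_000

# Infrastructure cost range quoted in the remarks: $2 billion to $3 billion.
capex_low = NUM_GPUS * 20_000
capex_high = NUM_GPUS * 30_000

# Best case: cheaper GPUs, 18% idle. Worst case: pricier GPUs, 57% idle.
idle_low = idle_capital_usd(NUM_GPUS, 20_000, 18)
idle_high = idle_capital_usd(NUM_GPUS, 30_000, 57)

print(f"Capex range: ${capex_low:,} to ${capex_high:,}")
print(f"Capital idle on the network: ${idle_low:,} to ${idle_high:,}")
```

On these assumptions, somewhere between a few hundred million and well over a billion dollars of GPU capital is effectively waiting on the network.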
Then you say, okay, what about machine learning causes this traffic to sit in a network versus actually kind of the GPUs being super active? There's a couple of things. One, okay, most of you, if I ask you, hey, what do you think the bandwidth of a CPU is, if you just go buy a CPU, what the bandwidth is. You probably agree that the fastest CPUs are, at best, pushing 50 gigabits per second. 50 gigabits per second, okay? The GPUs today are pushing 400 gigabits per second. And we expect that sometime next year, late next year, these GPUs are going to be pushing 800 gigabits per second. So think about it, you're going from 50 gigabits per second to somewhere between 400 to 800 gigabits per second. That's 20x more bandwidth, okay? So that's number one. GPUs push a lot of bandwidth. Number two, these GPUs are very, very idiosyncratic in the way they work, okay? What happens is they all want to talk to each other at the same time. And when they talk to each other, they actually want to talk a lot at the same time, okay? So to put that into context, just think about it. In the morning, everybody wants to get to work, assuming you actually want to get to work versus working from home. And you want to get to work at 9:00, okay? So everybody decides to drive to work at 9:00. And then you also decide I'm not going to take my car, I'm going to take an 18-wheeler and I'm going to drive an 18-wheeler truck to work, okay? Number two. Number three, then you decide, all of us are going to use the same one lane, even though I have an 8-lane highway. What do you think is going to happen? Everybody at the same time, 18-wheelers all using one lane, all the remaining 7 lanes are empty, right, creates congestion. The congestion starts to back up. And once it starts to back up, then you essentially have all of these GPUs sitting idle waiting for the network to be effective, right? 
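The highway analogy above can be illustrated with a toy model: several large flows start at the same time, and the transfer finishes only when the most heavily loaded lane drains. This is a minimal sketch, not a model of any real switch; the flow sizes, lane rate, and function name are all hypothetical values chosen for illustration.

```python
def completion_time(flow_sizes, lane_assignment, num_lanes, lane_rate):
    """Time for all flows to finish when each flow is pinned to one lane.

    Every lane drains at `lane_rate` units per second, so the slowest
    (most loaded) lane determines when the whole exchange completes.
    """
    load = [0] * num_lanes
    for size, lane in zip(flow_sizes, lane_assignment):
        load[lane] += size
    return max(load) / lane_rate


NUM_LANES = 8
LANE_RATE = 100          # units per second per lane (illustrative)
flows = [400] * 8        # eight equally large flows, all starting together

# Everybody picks the same lane: 7 lanes stay empty, one backs up.
one_lane = completion_time(flows, [0] * len(flows), NUM_LANES, LANE_RATE)

# Flows spread evenly across all lanes, one flow per lane.
spread = completion_time(flows, list(range(NUM_LANES)), NUM_LANES, LANE_RATE)

print(f"All flows on one lane: {one_lane} s")
print(f"Flows spread across 8 lanes: {spread} s")
```

In this toy setup, spreading the flows finishes 8 times sooner than pinning them to one lane, which is the gap the traffic management described in the talk is meant to close.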
So these 3 attributes are extremely important, high bandwidth, the way these guys talk to each other, wanting to talk at the same time and just creating a congestion. And one of the things that's worth noting here is these workloads also last for a very, very long time. It's like saying, look, you actually are going to go to work, but your commute is very long. It's a 100-mile commute. And then if your car breaks down in between, you got to go back and restart again. And that's exactly what happens here, which is these workloads run for a very long time, they checkpoint frequently. But if there's a failure, they got to go back and start from the checkpoint. So if you're 70% into your job and there's a failure, what do you end up doing? You've got to go back and start again and this tremendous amount of infrastructure is wasted. So these are the things you got to really think about as you think about machine learning and these very, very large-scale deployments, right? So we've thought about this problem for a long time. And to a certain extent, we also stumbled into it by luck, so to say, right? But luck favors -- helps those who are prepared. And here's how we solve this problem, 2 different ways. One, because, first, let's just make sure we all are very clear. It's all about traffic management. You want to be able to actually decide how traffic gets onto these highways, how they're spread across the different lanes and they go from point A to point B. There's 2 ways we do it. One, what we call endpoint scheduled. And what we mean by endpoint scheduled is really a dumb network, okay? This is a fancy way of saying a dumb network. And what we mean by this is, there are customers like Amazon, and they've been very open about it. They have their own NIC, that's called the Annapurna NIC. And that NIC actually does all the traffic scheduling. 
The NIC says, okay, I'm going to put the traffic on this network, and I'm going to send it on Path A versus I'm going to send it on Path B and Path C. And then all we do with our Tomahawk class of devices in between, is listen to what the NIC is saying and we just forward the traffic as fast as we can. So that's what endpoint scheduled is. The other one is what we call switch scheduled. Because customers have a heterogeneous NIC environment or they may not necessarily have the NICs, which are capable of this traffic management, they'll say, hey, look, I don't want to do it, let the network do it. And that's where we have our class of products, which are called switch scheduled. The switch essentially takes care of the path that the traffic takes from point A to point B. It manages all the congestion so that effectively, all the lanes in the highway are well utilized, and you don't have traffic being dropped between point A and point B. So those are 2 approaches, and we actually have our products in both of these, right? The second one is where we have our Jericho3-AI chip. We've actually announced this product about 18 months ago, in production now, and you can actually deploy large clusters of up to 32,000 GPUs. It might be hard to see from where you're sitting, but this is our Jericho3-AI chip, right? And by the way, these chips, they're 800 square millimeters, the largest that you can build in a reticle and they have multiple HBMs. So building large chips, large chips with HBMs, advanced packaging comes very naturally to us, okay? This is the chip, by the way, and this is the architecture that Meta just recently published a paper on. And if you want access to it, I'll make sure Ji gets it to you. They compared 2 clusters: one cluster based on InfiniBand because there's a raging debate that somehow InfiniBand is magical, right, other than the fact that it's super expensive, and that ethernet would not work. So they ran 2 clusters of 24,000 GPUs each. 
They tested it. Guess what? Ethernet works. It actually works fine. And it's half the price or cheaper and it doesn't melt down. It's a lot of good stuff. It actually works. And more importantly, you can actually see, it's 10% better performance than other alternatives. Now you could say, hey, 10%, big deal, right? But think about it. If you have a $2 billion to $10 billion infrastructure, 10% is about anywhere from $200 million to $1 billion. And I'd be lucky if I got paid that much for the network. So that's infinite returns. For every dollar I spend, I save $1 and more, right? So I challenge anybody who tells me InfiniBand is better than ethernet. This is one. Two, you look at what we're doing in what I call the dumb fabric right now. Just because I call it dumb, please do not expect it to be cheap, okay? What I mean by dumb is paying respect to those who build the NICs. I want to be nice to them. And there, we have the Tomahawk class of devices. One thing I can tell you is every 18 to 24 months, we are doubling the bandwidth, okay? By the way, we are never in the habit of announcing products before we ship them. Never. I have seen others who announced products 2 years ahead of us, and we are shipping a year ahead of them. So you might be sitting there wondering, hey, somebody announced a 100-terabit switch. Where is yours? I'll just say, we don't announce; we let the product speak for itself. All of you are very good at drawing linear regressions, you can figure out where things fall, right? So this is our Tomahawk 5 device, based on 512 100-gig SerDes. One other thing, actually, by the way, worth pointing out and without taking too much time from my colleague, Vijay, who will talk about it, making 512 100-gig SerDes work, and actually all of them not interfere with one another and deploying this in production, is both an art and a science that we've actually perfected over multiple generations. When we say a chip comes, it comes and it works, right? 
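As a rough check on the Tomahawk 5 figures quoted above, the aggregate bandwidth follows directly from the SerDes count. This is a minimal sketch: the 512 lanes and 100 gigabits per lane come from the talk, while the 400-gig and 800-gig port speeds used for the breakdown are illustrative assumptions, not a statement of the product's supported configurations.

```python
# Aggregate switch bandwidth from the SerDes figures quoted in the talk.
SERDES_LANES = 512     # from the talk
GBPS_PER_LANE = 100    # from the talk


def aggregate_tbps(lanes: int, gbps_per_lane: int) -> float:
    """Total bandwidth across all lanes, in terabits per second."""
    return lanes * gbps_per_lane / 1000


def ports_at(port_speed_gbps: int,
             lanes: int = SERDES_LANES,
             gbps_per_lane: int = GBPS_PER_LANE) -> int:
    """How many ports of a given speed the lanes could be grouped into."""
    return (lanes * gbps_per_lane) // port_speed_gbps


total = aggregate_tbps(SERDES_LANES, GBPS_PER_LANE)
# Hypothetical port speeds, chosen only to illustrate the grouping.
port_counts = {speed: ports_at(speed) for speed in (400, 800)}
print(f"{total} Tb/s aggregate; ports by speed: {port_counts}")
```

Under these assumptions the same silicon could be presented as many slower ports or fewer faster ones; the lane count, not the port count, sets the total.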
Then you look at this and say, okay, you guys are good at switches. What else are you doing? We understand that eventually for the switch to get the traffic between the GPU and the switch, you need a NIC. If there's one company that's going to control the NIC and says, oh, my NIC is not going to work with somebody else's switches, we'll have to say, what do we do there? So we decided we are going to build a NIC. And this NIC, by the way, is called Thor 2 because we also have Thor 1 before it. But here's what this NIC is not. It's not a smart NIC, okay? Please do not confuse it at all. It's not a smart NIC. And it is not a super NIC. I've heard this new word called super NIC now, okay? Soon, you might hear the word called cosmos NIC. We are not any of those things. We are a performance NIC, which basically means we're focused on 2 things: bandwidth, right? 400 gig, 800 gig, 1.6 terabit, so on and so forth and performance of RDMA because it's all about RDMA traffic. But as we do this, and the reason we've chosen this architecture is because the power of the NIC becomes extremely important because as I mentioned before, the GPUs need a lot of bandwidth. And the NIC has to keep up with the amount of GPU bandwidth that's coming out. So if you build a NIC that requires very high power, you won't be able to scale. So our entire focus is on a performance NIC with very high RDMA performance that we can consistently scale from 400 gigs to 800 gigs to 1.6 terabit and so on and so forth. And that's our approach here. Now here's another thing that we're doing. When we do this NIC, we realize that customers are going to use it in different ways. There are those that will take a NIC in a board and they'll plug it into anybody's GPU. Or there are customers or partners who say they're building their GPUs internally. And they actually want some kind of a chiplet interface that they can use our NIC because it has some of these networking capabilities. 
Or they might just decide, hey, just give me that IP and I'm going to put it in a big chip that we are building, and we're going to make this NIC available in different form factors to anybody who wants to be able to leverage it across these form factors, right? So what this all means is ethernet works, works extremely well. And the beautiful part about ethernet, it's all based on standard interfaces. And you can pick any of these building blocks from us. You can buy our Jericho switches. You can buy our Tomahawk class of switches. You can just buy our NIC and we will work with everybody. We're not going to scare you by saying this is not going to work if you have to work with somebody else. We won't do that. That's not how ethernet has been built, right? And to just kind of double-click on it, to show ethernet's performance. Somebody might tell you, InfiniBand is the best thing for ML. But what if I told you, I'm at 10% higher performance than InfiniBand at half the price. Then they might come and say, I'll give it to you for free because I'm going to collect money someplace else. But what this tells you is across different packet sizes, consistently, we can deliver a higher performance than InfiniBand when InfiniBand works. Even better, if you think about it, if you just even build a cluster of 4,000 GPUs, you need 8,000 optics or actually slightly more, about 9,000-plus optics, okay? Optics, even with the best electronics in them, they're flaky. And generally, you get to see about at least a 2% failure rate per year, if not 5%, okay? That's why actually some of these large mega scale cloud customers will tell you that they have a bone pile of optics when they have these large deployments. When you have that kind of a failure rate, you could experience as much as 15 failures per month. 
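The failure figure above can be reproduced with simple arithmetic. This is a minimal sketch, assuming failures are spread evenly over the year; the optic count (about 9,000) and the 2% and 5% annual failure rates come from the talk, and the function name is hypothetical.

```python
# Expected optics failures per month for a cluster, from the talk's figures.
NUM_OPTICS = 9000  # "about 9,000 plus optics" for roughly 4,000 GPUs


def expected_failures_per_month(num_optics: int, annual_failure_rate: float) -> float:
    """Average failures per month, assuming failures are uniform over the year."""
    return num_optics * annual_failure_rate / 12


low = expected_failures_per_month(NUM_OPTICS, 0.02)   # 2% per year
high = expected_failures_per_month(NUM_OPTICS, 0.05)  # 5% per year
print(f"2% rate: {low:.1f} failures/month, 5% rate: {high:.1f} failures/month")
```

At the 2% rate this gives the 15 failures per month quoted in the talk; at 5% the same cluster would see more than twice that.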
Now think about it, a job that runs for a very long period of time, connected with optics and these optics have very high failure rates, and you have to keep going and doing checkpointing. That, combined with the fact that InfiniBand takes at least 30x longer to converge than ethernet basically means you have to ship an army of people standing by with cables and standing there to make sure this thing actually runs. That's why they actually say, if you're going to deploy InfiniBand, make sure your cable length is measured, make sure the tape is in the right place, make sure it's powered exactly the way it should be because it's fragile, okay? So this is what actually ends up happening, which is why does ethernet converge so much faster? Because it's based on distributed protocols. For the longest time, we have things like BGP and other protocols that are looking at things like, what does my neighbor look like, right? What's my link health? And all of this is happening both in software and hardware versus InfiniBand, where everything that happens is in a centralized controller, okay? This is one of the biggest Achilles heels for why people will not deploy. And then as we kind of just think about these very, very large networks, right, you have the GPUs, you have the NICs, then you have the optics and the other switches. Optics actually end up consuming a lot of power and cost. So what are we doing in our switches and actually in a lot of our technologies? We're saying, avoid the use of optics as much as you can by extending the reach of copper, okay? And we can actually have the reach of copper go 4 meters, that is twice what the standard asks for. And just to give you a context of what 4 meters is, it's the size of an elephant, okay? We say, okay, go as far as you can on copper. And if you no longer can use copper, use optics. But when you use optics, avoid using too many electronics inside the optics, which is what we call direct drive optics. 
And by the way, eventually, there will be a period of time when the amount of bandwidth coming out of these switches and accelerators is going to be so high that you no longer can just use pluggables, and you'll have to use something that's called co-packaged. And we can do these co-packaged optics that, first, will give you very high density and, two, at a fraction of the power and the cost. So these are the things that we're doing to actually make the entire interconnect come together, right? So what I'd like to leave you with in the next few slides is, one, there's no question in anybody's mind with an exception of one customer who still happens to be on InfiniBand, but eventually, I think in the next year or 2, we will move them to ethernet. Ethernet is the de facto standard for these large machine learning clusters. And this is not the front-end network. We've already got the front-end network. This is the back-end network, okay? And then you may say, what are the sizes of these clusters? I'm only sharing with you the data that actually is publicly available. You can do Google searches and find it, so this is what is publicly available. Amazon has clusters based on ethernet that are over 60,000 servers. Oracle, over 30,000 servers. Meta, over 20,000. Tencent, over 10,000, right? Some of them are bigger than these, but these are the numbers that they've actually shown publicly. And this is all back end. This is all ethernet. This is all machine learning, right? Now we know we can do 10,000, 20,000, 30,000, 60,000, 100,000 today, but there was also this consortium that was cofounded by Broadcom and a couple of others about 2 years ago. And the idea is let's actually take this to 1 million-plus nodes, okay? And when you start thinking about million-plus nodes, the biggest issue that needs to be actually solved is RDMA. And you probably heard about RDMA, which is remote direct memory access, right? RDMA came about 25 years ago. 
And then the idea was 2 CPUs want to talk to each other and share their memory. So it was built for 2 machines to talk to each other. And then slowly, it scaled from 2 to 16 to 32, 64, 128, 512, but it was never built for thousands or hundreds of thousands of CPUs or GPUs talking to each other. So there's a whole bunch of things that actually break down in RDMA. And what we as an industry are doing is actually making significant enhancements to RDMA so it can scale to 1 million plus clusters. And by the way, this is not something you're going to say -- they're going to say, you're going to see it 5 years from now. This is something you're actually going to see as products in 18 to 24 months from now, fully interoperable solutions across different vendors, having very high-scale RDMA between those who are building internal accelerators, those who are building merchant silicon accelerators and everyone in between, right? So why is this so important? I think this is probably the biggest takeaway slide. You will not have a world where you have millions of GPUs, where there is one mainframe solution being sold. The only way this already has played out in history is you have multiple vendors, multiple solutions. And when you have these multiple vendors and multiple solutions, what you need is a fabric that interconnects all of these together because this is a distributed computing problem. Just going and saying I can build the biggest GPU doesn't solve the problem. You build the GPUs that can scale and can be networked across a very, very large fabric. And ethernet is the fabric, will be the fabric, and you can actually hold me accountable for it, right? And lastly, not only do we believe in ethernet, but we also believe in actually making ethernet based on a very open ecosystem. We don't go and say, hey, we're building the whole box, by the way, along with the cables, do you want to buy it? No. What we do is we have the silicon. 
We have a whole bunch of vendors that build hardware all around the world. Then we have a whole bunch of partners who actually build software on top of it, along with all the management stuff that goes on top of it. That's the approach we're going to take. Build the best networking devices, make it available to a very, very broad ecosystem and believe in this idea that this is a distributed computing problem. And the only way you're going to solve this problem at scale is by not building a mainframe. So with that, I'd like to thank you for your time and hand over to Jas, please.
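The highway analogy from this talk can be put into numbers with a toy model. This is a minimal sketch under stated assumptions, not a description of how Tomahawk, Jericho or any NIC actually schedules traffic: flows are equal-sized, the fabric has 8 parallel paths ("lanes"), and the two policy functions (`single_lane`, `least_loaded`) are hypothetical stand-ins for "everybody on one lane" versus traffic spread across all lanes.

```python
# Toy model: how path selection changes the worst-case load on any one lane.
NUM_LANES = 8


def lane_loads(flow_sizes, num_lanes, pick_lane):
    """Assign each flow to a lane using pick_lane and return per-lane totals."""
    loads = [0] * num_lanes
    for index, size in enumerate(flow_sizes):
        loads[pick_lane(index, loads)] += size
    return loads


def single_lane(index, loads):
    """Every flow picks the same lane (the congested case in the analogy)."""
    return 0


def least_loaded(index, loads):
    """Each flow goes to whichever lane currently carries the least traffic."""
    return loads.index(min(loads))


flows = [100] * 64  # 64 equal flows, arbitrary units
congested = lane_loads(flows, NUM_LANES, single_lane)
balanced = lane_loads(flows, NUM_LANES, least_loaded)

# The busiest lane bounds how long the whole exchange takes.
print("busiest lane, single path:", max(congested))
print("busiest lane, spread out: ", max(balanced))
```

In this model the busiest lane carries 8 times less traffic once flows are spread out, which is the effect both the endpoint-scheduled and switch-scheduled approaches are after; they differ in whether the NIC or the switch makes the choice.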

Jas Tremblay

executive
#4

Good morning, everyone. So I'll give you guys 5 seconds to cool down your fingers. I've never seen so much frenetic typing. So my name is Jas Tremblay. I've had the privilege to work for Broadcom for 18 years. I'm currently the General Manager for the Data Center Solutions group, which focuses on server connectivity. And today, we're going to talk about not how we network tens of thousands of AI servers, but how do we create a network inside the AI server. So going back to the mainframe days. Silicon to software systems, everything vertically integrated. You did not need to worry about the connectivity, you just built the connectivity for your own solutions and the same company would build all this technology. But what happened over the course of the years is, in the data center space, more companies wanted to innovate and participate. And you need to come together at the connectivity level. And one of the most important protocols for that has been PCIe. PCIe is the most used protocol inside systems that go inside data centers. And the governing body for PCIe is called PCI-SIG. It was founded in 1992. We've been through 5 generations of PCIe. And there is quite a large community of about 900 companies, extensive plugfests. We come together to make sure our products work together. So PCIe is the protocol to interconnect inside the server. The data center has been built up of compute servers. Ram showed the picture of the Google data center 20 years ago, it was compute servers. Within that compute server, you'll find the CPU. And to the CPU you attached peripherals, Ethernet NICs, NVMe drives, storage adapters, multiple types of technology, and they come together with PCIe as a protocol in a point-to-point fashion. So here, you're not building a network, you have the CPU at the heart and you have peripherals attached to it. And in the majority of cases, they connect together with PCIe as a point-to-point networking protocol. But now fast forward to the AI server. 
And we'll have -- in the demo area, we'll have 3 instances of different AI servers. They're big. They're complicated. They are pieces of art from a mechanical cooling perspective. What you'll find inside these AI servers is multiple CPUs, a dozen NVMe drives, 8 to 11 ethernet NICs, 8 to 12 XPUs and other types of devices. So the point-to-point methodology does not work. You actually need to build a network inside that AI server. And the network of choice for that is PCIe. It's very low latency. It's ubiquitous. It's standards-based. And it allows companies to bring the pieces that they need together. And in fact, having an open internal fabric inside the AI server is key to freedom so that you can pick the components that you want. If you're building your own NIC in house, as a cloud provider, if you want to use different types of accelerators, having an open fabric allows you to pick and choose the components that you want and build an AI server that's more adapted to your needs. The other element is if you're a server OEM or ODM, it's very hard to build a complete system for every type of accelerator. So you want to have an architecture that can support both merchant, custom and different types of XPUs inside the AI server. So that's where the use of PCIe switching as the internal network inside the AI servers is very important. Okay. So picture on the left, we've got a rack. This is an OCP AI rack. It's composed of multiple AI servers, pictured in the middle. And this AI server is about 15 inches tall, and we'll show you, we actually have this system in the demo area. And there's 3 trays inside this AI server. Top tray has compute CPU and the fabric. Middle tray has peripherals. That's where you'll stack up all your NICs, your NVMe drives. And the bottom tray, where a lot of the power is and you need a lot of cooling is where you could put your XPUs. And this one can support custom NPUs and multiple providers of merchant NPUs. So it's really an open platform. 
And it has PCIe as the internal fabric and ethernet as the scale-out fabric. So let's take the top tray, which has CPUs and the fabric and let's double-click on this a little bit. So you can see at the top there, there's 4 heatsinks. Each of these heatsinks has one of these -- get it out, one of these little chips inside. This is a Broadcom PCIe Gen5 switch. It's 4.6 terabits per second. You can attach up to 72 devices to it, either an ethernet NIC, an NVMe drive and so forth. In this specific server, there's 4 of these 144-lane PCIe switches. They're interconnected together and each of them aggregates 1 CPU, 2 NICs, 2 XPUs and 4 NVMe drives. And there's -- this picture here, it's replicated 4x in the AI server. That's a building block of the AI server, and then that interconnects to the scale-out network utilizing ethernet. So this network needs to be ultra-low latency. So we're talking 120 nanosecond latency. It needs to be high bandwidth. But most importantly, it needs to be trusted. It needs to interact with many, many types of devices out there. It needs to be standards compliant, and it needs to have advanced telemetry and diagnostics. So if you're deploying tens of thousands of these AI servers in your network, you need to have capabilities inside the network inside the AI server to tell you what's going on. Are the devices behaving properly or not? So we invest in performance, lowest power, and advanced telemetry and diagnostics. Okay. So we talked about the switch. The switch is the core element of that network. We need to run this over, effectively, a backplane. This is not a wired network. It's a backplane trace network, very low cost. Low power. But in some cases, the server is 15 inches tall, you need to go from one tray to the other and maneuver yourself around. You may need to go further than the PCIe spec, in terms of distance. The first way we do this is with our SerDes that we'll show you a little bit later. 
We can typically go 40% further than the standards from a SerDes perspective. But sometimes, you need to go further than that and you need to have a retimer. A retimer basically extends the reach of the PCIe switch using that protocol. It's a companion device to the switch. So we've got a portfolio of switches and the companion chip retimers. Two weeks ago, we announced the industry's first 5-nanometer retimer companion chip to the switch. If you use our switch with our retimer because of the same SerDes on both sides being standards compliant, we can go 40% longer reach with 35% less power. So this is really important because you might have 4 switches, 12 retimers in one of these systems, you really want to optimize it for power and cooling. Okay. So Charlie talked about the franchise and the fact that we invest in markets that will exist for decades. So we've been doing behind the scenes PCIe switches for quite some time. In fact, we introduced the first PCIe switch in 2003 when there was no AI server, when people did not need a lot of PCIe switches. We were first to market in 2003, and we've been first to market with a PCIe switch for every generation of PCIe for the past 20 years. So first to market for 5 generations, and market leader for 5 generations of PCIe switching. But now starting in Gen 4, that's when people started to build up AI servers and needed an internal fabric. And we are taking this franchise that we've been investing in for 20 years, but doubling down on it, increasing the investments for AI. So AI needs faster, faster. And what I mean by that is we need to accelerate the cadence and go faster to the new protocol speeds. We need more bandwidth, more connectivity, more capabilities inside this fabric. So we're shipping in volume PCIe Gen 5 switches today, which power the vast majority of the industry's AI servers, across custom and merchant accelerators. This is the network of choice inside the AI server. 
We announced our retimer in Gen 5, 5-nanometer. We announced our retimer in PCIe Gen 6, 5-nanometer. And we also announced that we're going to be sampling PCIe Gen 6 switches at the end of this year. Another thing that we did is we're accelerating the cadence from -- if you look at Gen 3 to Gen 4, this was CPU speed in that point-to-point model. The importance of PCIe performance was not that critical, 8 years between Gen 3 to Gen 4, 4 years between Gen 4 to Gen 5, and that's where AI really started. Two years between Gen 5 and Gen 6, and then we're going to 1-year cadence. We need to speed things up. That network needs to be extremely high performance, low latency. Now the other thing -- so I've been talking about building the internal fabric to connect CPUs, NICs, NVMe drives and the XPUs, that internal fabric. The other aspect is the scale-up fabric, high-performance networking from XPU to XPU. There's different ways to do this today, but we believe we need an open, low-power, high-performance, low-latency way to do this, and we've partnered up with AMD. AMD had the MI300 launch event in December of last year. Forrest Norrod and myself announced that we're partnering on building a scale-up solution where Broadcom would be building the switch. AMD would be building the accelerators, and we're going to work together on this in an open way and bring this to standards bodies. So today, our switches are used in internal fabric. And with this, we're extending it to be used as a scale up. Okay. So on that note, you're building one of these complex AI servers, you need an internal fabric. PCIe is the protocol of choice for that. It allows openness and choices for customers. We introduced with no fanfare, the first PCIe Gen 1 switch in 2003. And for the past 20 years, 5 generations, we've been first to market 5 times, and we've been market leaders 5 times. 
And now with the dawn of AI servers, we're doubling down, increasing investment in this space and increasing the cadence of innovation between generations. So thank you very much, and I'll pass it on to Near.
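The switch figures from this talk hang together arithmetically. This is a minimal sketch: the 144 lanes, the 72-device limit, the per-switch device mix and the 4 switches per server come from the talk; the 32 GT/s per-lane rate is the PCIe 5.0 signaling rate; treating the 4.6 terabit figure as the raw lane total and the 72-device limit as 2 lanes per device are assumptions about how those numbers were derived.

```python
# PCIe Gen5 switch arithmetic using the figures quoted in the talk.
SWITCH_LANES = 144            # from the talk
GEN5_GT_PER_LANE = 32         # PCIe 5.0 raw signaling rate per lane (GT/s)
LANES_PER_DEVICE = 2          # assumption: narrowest attach is a x2 link
SWITCHES_PER_SERVER = 4       # from the talk

# Raw lane total, before encoding and protocol overhead (assumed derivation).
raw_tbps = SWITCH_LANES * GEN5_GT_PER_LANE / 1000
max_devices = SWITCH_LANES // LANES_PER_DEVICE

# Devices aggregated by each switch in the server described in the talk.
per_switch = {"CPU": 1, "NIC": 2, "XPU": 2, "NVMe": 4}
per_server = {name: count * SWITCHES_PER_SERVER for name, count in per_switch.items()}

print(f"raw switch bandwidth: {raw_tbps:.3f} Tb/s")
print(f"max attached devices at x{LANES_PER_DEVICE}: {max_devices}")
print(f"devices per server: {per_server}")
```

The per-server totals (4 CPUs, 8 NICs, 8 XPUs, 16 NVMe drives) sit at the low end of the device ranges given earlier in the talk for a typical AI server.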

Near Margalit

executive
#5

Thank you, Jas. Good morning, everybody. I'm Near Margalit, I'm the General Manager of the Optical Systems division here at Broadcom. I've been involved in optical components for over 30 years. I'm really excited to talk to you guys about some of the optical technology that we're developing here at Broadcom. So we'll start with a slide that Charlie laid out really well is, what do these AI clusters look like? We know we want to get to bigger and bigger clusters. So everything inside the rack, you'd like to do in a copper fashion and be able to interconnect with PCIe or direct attached copper across these things. But once you go any kind of scale or distance, you've got to start looking at optical links. And these are these golden lines, both the front-end network and the back-end network. We need to be able to scale to really large bandwidth. These AI systems are continuing to consume more and more bandwidth across the system. So we need the optical technology to be able to support that, both scaling and cost and being able to provide the higher level bandwidth. So within Broadcom, we're going to show -- talk about 3 core technologies that we have under our group. The first one is the vertical cavity surface emitting laser. This is kind of a workhorse for AI technology across the industry. It can be used on ethernet, InfiniBand, NVLink technology. It does have limitations in distance because of the multi-mode fiber itself, limiting to 100 meters or so, but very low-power, low-cost technology and it's being deployed widely in most of the AI systems in the world today. When you go a little bit further, you need to scale to bigger and bigger clusters, going to hundreds of thousands or millions of units, you want to be able to travel beyond that 100-meter reach. So there you look to indium phosphide-based technology or electro-absorption modulated lasers. And that gives you the reach. And again, in both these markets, we're a leading supplier for this technology. 
And we're going to talk to you guys a little bit more today about a new technology that we're putting together, which is co-packaged optics, which is the integration of high-speed silicon photonics directly integrated on ASICs, whether it's switches, PCIe switches or accelerators across the system. And that really provides the future generation for both power and cost leadership for these future generation systems. So -- and you may have seen last week, we did a press release on the VCSEL and EML technology where we said we shipped more than 20 million channels of 100-gig per lane technology. So really demonstrating core leadership in our optical technology. So just going back a little bit. We've got a long history of leadership in this area. We own the 2 fabs that make these optical components, one in Pennsylvania and one in Singapore. One making VCSELs, one making the indium phosphide technology. And we've been in this business for a very, very long time. With the VCSELs going back to HP days, we were one of the original people building VCSEL technology. I think back when I was in grad school, they started doing these VCSEL technologies, and we've maintained leadership through the decades in all this technology. And in addition, the indium phosphide technology, it's the original Bell Labs fab that we own that has the indium phosphide technology. And again, we've continued to maintain leadership from way back, direct modulated lasers to the highest state of the art 100-gig per lane electro-absorption modulators. So where are we at today? We're obviously talking a lot about AI and the AI data center world. So we continue our leadership in this area. We're the only people shipping high-volume 100-gigabit VCSEL technology. Very complex technology. For a long time, people thought this was not even possible to do at 100 gigabit. And we were told VCSELs are dead. 
You don't need to continue any more work on them because you just can't get them to work at the data rates that you need in the future. So not only are we shipping in mass volume, millions of units, of the 100-gigabit VCSEL technology, we're actually going to demonstrate to you today, 200-gigabit VCSEL technology. Again, a really, really large technological barrier. Not ready yet for production, but it's something that we're excited about and think that we can continue to deliver on our long history of leadership in this market. And with the indium phosphide technology, we are ready for mass production at 200 gigabit EML. We announced last week that we're ready for mass production on this. This will be able to go with all the future 200-gigabit per lane links, right? The line speeds continue to go up in all these AI data centers to reduce power and reduce cost. We are ready for that technology. And we're continuing to work even on future technologies of EML to be able to run even higher speed links. And we'll talk a lot about the technology. We talked about co-packaged optics. We announced also last week, the first commercial shipments of our Bailly system. This is the combination of our 51-terabit Tomahawk switch with complete optical links directly integrated onto the package. So all 512 lanes directly optically attached to the switch itself. And we'll talk about why that's important. We continue to work on future technologies. We think this is a foundational technology that could scale for all kinds of applications where optics are needed. So first and foremost, why co-package? Obviously, pluggable transceivers have been around for a long time, have been very effective and serve the industry extremely well. Well, with these AI systems, the bandwidth and the number of components continue to scale and the cost of the optics continues to be a problem in that scalability. 
So how do you solve a road map to continue to reduce cost in optics to be able to scale with these larger and larger clusters and GPUs? Our solution to that is integration, and looking for integration specifically into silicon photonics to be able to put more and more components directly onto an individual chip. That has historically been the way that semiconductors have reduced costs, and we believe in optics, that is the correct way to go. So we see continued CPO providing the -- not only today, but also in the future, the lowest cost per bit capability. The second benefit you get for co-packaged optics is that the actual optics are right where the signals are. So you get rid of the complex electrical lanes between the ASICs and the optics. These lanes burn power, add cost, add complexity to the link that's unnecessary if you can get the optics to be directly on the substrate themselves. So what we see, we're giving you an example here. On the right, just how much power did that save? It actually is quite a bit of power. So a typical 800-gig pluggable transceivers today in the market today are 14 watts. We're showing the Bailly system that we're shipping right now at 5 watts. So almost a 70% savings over a typical deployment being done today. And does it scale for 200 gig? Yes, it's going to scale as a pluggable transceiver scale to 1.6 terabit, 25 watts. Again, we're going to have similar proportional savings. So it's actually a big deal in terms of the power savings and the optics. I think we talked a lot on optics and power being an important issue. This is a pathway to reduce the cost significantly. And the last thing that's not so obvious is what about reliability. People are saying, you integrate everything together, can reliability be there. And we think actually, you can enhance the reliability of systems with co-packaged optics because you are integrating more and more components into core silicon elements. 
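The power comparison above works out as follows. This is a minimal sketch: the 14 watt and 25 watt pluggable figures and the 5 watt co-packaged figure come from the talk; treating the 5 watts as the power for the same 800 gigabits of bandwidth, and applying the same ratio to 1.6 terabit, are assumptions made to illustrate "similar proportional savings".

```python
# Optics power comparison using the figures quoted in the talk.
PLUGGABLE_800G_WATTS = 14.0    # typical 800-gig pluggable, from the talk
CPO_800G_WATTS = 5.0           # co-packaged figure, assumed per 800 gig
PLUGGABLE_1600G_WATTS = 25.0   # 1.6-terabit pluggable, from the talk


def savings_fraction(baseline_watts: float, improved_watts: float) -> float:
    """Fraction of power saved relative to the baseline."""
    return 1 - improved_watts / baseline_watts


savings = savings_fraction(PLUGGABLE_800G_WATTS, CPO_800G_WATTS)

# Assumption: the same proportional saving carries over to 1.6 terabit.
projected_cpo_1600g_watts = PLUGGABLE_1600G_WATTS * (1 - savings)

print(f"saving at 800 gig: {savings:.0%}")
print(f"projected co-packaged power at 1.6 terabit: {projected_cpo_1600g_watts:.1f} W")
```

With these inputs the saving comes to about 64%, which the talk rounds to "almost 70%".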
So we see the ability to integrate components on silicon photonics itself as a way to enhance reliability. Now we're not ready yet for building lasers directly on silicon. That's still science fiction. So what we've done with our system is still maintain the laser system itself, which is not core silicon component as a pluggable component in the system. So we still maintain the serviceability for the element that's not core silicon. Everything else is built on core silicon technology, similar to the switches and PCIe and the long history of very reliable. So we actually think CPO is a way is to enhance reliability of optical links. I think Ram had in his talk 2% failure rates for a pluggable transceivers. That's a pretty bad failure rate. And we see pluggable -- sorry, CPO systems being offering a way to get rid of that as a kind of poor reliability of transceivers. And just looking at what does it mean integration, just giving you a little bit of visuals of what this looks like. When you do pluggable transceivers, traditionally, we've been in that business a long time ago, you integrate lots of different heterogeneous components all on a PCB. It's all put together. And the industry has recognized that. And actually, several suppliers have started to look at how do we integrate more of these components into silicon photonics to improve the cost and reliability of these systems. And people have done this on -- putting it on to pluggables. We know several suppliers have put silicon photonics on pluggables. And that's actually a good direction, and it actually has moved the industry forward. But Broadcom's kind of aimed a little bit further ahead and say, how do we put these silicon photonics truly in a high-density fashion, not for 8 channels. We're shooting for 64 channels on an individual photonic chip. 
So what that allows you is when you provide that density, you can actually start moving the optics off these subsystems that are pluggable transceivers and be able to move them directly on to ASIC substrates, again, getting the benefits of that technology. And what can you put it on? So obviously, our first product that we're announcing putting that on, is on our Tomahawk 5 switch. 512 lanes of optical connectivity across the entire device, so all 512 lanes. Not only are the SerDes working, we've got full optical capability. But that doesn't end there. We see this technology being applied to a lot of different areas. Frank will show you the ability to put it on custom accelerators if you want optics directly coming out of your end nodes, for high-bandwidth connectivity. So again, lots of applications for this, and it's a pretty foundational technology that will hopefully drive the industry forward. So how do we build this? What have we done? So the first and foremost is we focused from day 1 on high-density silicon photonics, not just low density silicon photonics, and we're showing you here an optical chip that's got full 64 lanes of 100-gig capability and about the size of a quarter. So if you think about -- you'll see in the other demo room, what -- what's 128 optical transceivers look like. This is replaced with these little tiny engines. So we integrate the muxing technology, the modulation, the photodiodes, the optical coupling all onto these individual chips. We couple it with advanced node silicon CMOS technology for drivers and TIAs to maintain lowest power and lowest cost capability across the system. We then use advanced packaging techniques to be able to stack these chips together to provide the best performance and best reliability. That -- those optical engines can then go on any kind of substrate that you have an ASIC on. And of course, showing here the 51-terabit system with 8 optical engines around the size of the die. 
And finally, how does the end customer consume this, right? We have ODM partners that we announced that are putting these systems into boxes that look very similar to a mini pack 3, right? This is like a -- very looks very much similar to a standard pluggable optical box. But what's the difference? We gained tremendous amount of power savings. We gained tremendous amount of cost savings. So you can consume it in the same format you've been doing pluggable transceivers but you gain the power, the cost, the reliability advantages of this system. So it's a very exciting direction. Not only that, we're going to show you guys a little video. We're focusing on integration and cost, and we're going to show you a short video on how this gets put together. We're showing here all the different steps of assembling these co-package engine. This is an example of like bonding the electronic die to the photonic die. Of course, there's optical fibers that have to be attached to these engines that can't just attach themselves. So we have robotic assembly of optical components to allow you to attach fibers directly to the optical engines. Once all that process is done, we obviously do testing at an individual die level for photonic engines. And ultimately place it on the end substrate or end product here, as an example, of 8 optical engines. And the key thing to focus when we do these videos is integration, robotics, minimizing human touch. Why? Because we know that, that improves reliability. We know that it improves cost. It improves scalability of the system. So we're really excited as this being our first product with co-package technology, and we hope to see a lot of future technologies use this capability. And so I'll end my portion of the talk today just going through 3 key points. We've shown industry leadership on optical components for over a long history, specifically now at 100-gig per lane. We're doing extremely well in delivering for AI applications. 
We've shown that we're going to continue to scale at 200 gig, with both the VCSEL technology and the EML technology. We're also shipping the first commercial system of co-packaged optics with a pluggable laser. This is a really exciting technology that provides both cost and power benefits, with up to 70% reduction in power and 30% cost savings. So I'm going to pass it on to Vijay. He's going to talk a little bit more about some of our foundational technologies.
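
The power and lane figures quoted in this segment can be checked with a short sketch. It uses only numbers stated in the talk (14 watts per 800-gig pluggable, 5 watts for Bailly, 8 optical engines of 64 lanes at 100 gig each). The function and constant names are illustrative, and treating the 5-watt figure as "per 800 gig of bandwidth" is an assumption made here so the two numbers are comparable.

```python
# Figures below are the ones quoted in the talk; all names are illustrative.
PLUGGABLE_WATTS_PER_800G = 14.0  # typical 800G pluggable transceiver
CPO_WATTS_PER_800G = 5.0         # Bailly, assumed to be per 800G of bandwidth


def fractional_savings(baseline_watts: float, improved_watts: float) -> float:
    """Fraction of power saved relative to the baseline."""
    return (baseline_watts - improved_watts) / baseline_watts


def total_gbps(engines: int, lanes_per_engine: int, gbps_per_lane: int) -> int:
    """Aggregate optical bandwidth across all engines, in Gb/s."""
    return engines * lanes_per_engine * gbps_per_lane


def watts_saved_per_switch(switch_gbps: int, port_gbps: int = 800) -> float:
    """Power saved if every 800G port moves from a pluggable to CPO."""
    ports = switch_gbps // port_gbps
    return ports * (PLUGGABLE_WATTS_PER_800G - CPO_WATTS_PER_800G)


bandwidth = total_gbps(engines=8, lanes_per_engine=64, gbps_per_lane=100)
print(f"lanes: {8 * 64}, bandwidth: {bandwidth / 1000} Tb/s")
print(f"savings per 800G: {fractional_savings(14.0, 5.0):.0%}")
print(f"watts saved per switch: {watts_saved_per_switch(bandwidth)}")
```

With these inputs the 8 engines give 512 lanes and 51.2 Tb/s, matching the switch described above. The per-port saving works out to about 64%, which the talk rounds to "almost 70%", and a fully populated switch would save roughly 576 watts under the stated assumption.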

Vijay Janapaty

executive
#6

Good morning. Can you hear me okay? Yes. Good morning. My name is Vijay Janapaty. I'm the General Manager for a division called Physical Layer Products division. I've been here with the company for more than 25 years now, joined as a young engineer and rose up the ranks. Today, I think our presentation, I'm going to do is focus on foundation technology. And in particular, we are actually going to focus more on high-speed links. And these links are typically sort of made up with these cores called SerDes cores, also these DSPs that are used in the pluggable modules, okay? So before I do that, let me actually set up what the problem statement is. And I think Charlie and Ram talked about the 1 million accelerator cluster, right? And if you go into this 1 million cluster, at a typical case, you will find there are 10 million high-speed links. And these links are 400-gig, going to 800-gig and maybe 2 years down the road, it's going to be 1.6 terabits per second, right? So these links kind of made up today, let's say, 400 gig is made up of 4 lanes of 100 gig and then tomorrow, it will be either 4 lanes of 200 gig and then thereafter, it would be 8 lanes of 200 gig at 1.6, right? So these are very fast links. And the interesting thing about that in this market and in other networking markets, the bandwidth of these links actually doubles every 2 years. So we have to come up with a new -- a faster link every 2 years, right? And if you look at the sheer number of these links and the fact that it is doubling in speed, is these links are the number 2 source of power and cost in an AI cluster. So any savings you would make here is saving that helps bring the power down or some more power available for -- so the AI accelerators, right? So it's very important to sort of focus on the power and cost of these high-speed links. And that's what we do at Broadcom, okay? So I'm going to double-click on these links now, right, high-speed links. 
So if you took those 10-plus million links, a predominant portion of that links today are copper. They are either on a backplane or they are on a direct attached copper cable within the rack and their reach is about 5 meters or so, and they have the lowest power and lowest cost. Literally, they're free, right? Now if you look at -- if you can't do 5, if you have to go beyond 5, then, of course, you have to use optics. And when you have optics, the reach is much greater, but they have the highest power and highest cost. And so it's very important to figure out how do we reduce these power and cost of these optics, right? So on the copper links, the technology that we use to drive these copper links is SerDes. And these SerDes cores are embedded inside a Tomahawk switch or the XPUs that Frank is going to talk about or the NICs that we are -- Ram talked about. So they're embedded in that. So -- and on the DSPs, they're embedded inside the pluggable optics, right? And I think most of you know what those are. So for us, what are our objectives? And our objectives are how do we make sure that as many links stay on copper as possible because it's free, right? And then second, whatever links don't stay on copper, how do we reduce the cost and power for those links. So those are the 2 objectives that we drive our development philosophy with, okay? And so before I go into sort of how we are going to do this, let me just review the history. Broadcom has a history and a legacy of driving the best SerDes in the industry for at least 4 generations. So if you look at 10 gig, 25 gig, 50, 100, we have been the leader in SerDes, and that powered our products not only to double the bandwidth but be the first in the market. And I'll talk more about what that leadership is in the next slide when I talk about 100 gig SerDes. But the foundation of that leadership comes from what we believe are 4 key elements. The first is we have very deep high-speed analog expertise. 
It's not low-speed analog, a high-speed analog expertise. Then second, we pair that with a DSP expertise that is a custom build for that high-speed communication, that we have lots of those, right? And third, we had to take those 2 and go into technology nodes that are leading edge, right, for today's 3-nanometer technology, for example. And not only do we actually do in that new technology, but we do it actually concurrently with the foundry. When the foundry is developing the process, we actually develop our SerDes technology so that when the foundry is ready, we are also ready for our products to be first in market and have the right power and cost characteristics. And last but not least, with the scale of operation that we have, we are deploying our SerDes in hundreds and hundreds of systems with literally hundreds of cores, right? For example, Tomahawk 5 has 500 of those and some of the XPUs have hundreds of those cores. So it is the system know-how that we have gained over time, and it is generationally, we have gained. These 4 elements we are the reason why we are the #1 in the world. And today, what I'll do is I'm going to click into 100 gig to show you what we did, okay? So in 100 gig, as I mentioned, the objective 1 is keep everything on copper as much as possible, right? So that's what you see on the top 2 there, 45 dB backplane channels. So once you do that, you have a larger fan out, you can connect more chips together. And the second thing we do is we do actually 4-meter cable, right, for DACs, direct attach copper. And that's not only in the rack, but also inter rack, too. So you can actually go some links to the next rack on copper, and that's something that allows a lot of these links to stay in copper, which is going to give you a lot of power and cost benefit. Now but we didn't stop there. In this SerDes, which we actually call Peregrine, which is our leading 5-nanometer, 100-gig SerDes, we did something very unique. 
We built a native equalization capability for optics so that you don't have to have a DSP or a retimed pluggable. You can actually drive optics directly from the SerDes itself, from the switch, from the XPU, from the NIC. And so what that does is it enables a very disruptive use cases like CPO. You saw CPO from Near. He talked about that. It also enabled for the first time, an LPO or an LDO and people call it different names. And both of them actually dramatically reduce the power and cost of optics, okay? So this SerDes is available in every product that Broadcom will build in 5-nanometer, for 100 gig. It may be switches, routers, XPUs, NICs, and all of them carry the same benefits. So you can connect any of them to an LDO or to a CPO or to a 45 dB backplane or to a DAC cable, right? So that's -- so we cut and paste across all the products, and we have a fantastic deployment with this SerDes. So today, we have actually some new news that we want to share with you, is our next-generation SerDes. And this is a codename inside our company is Condor. It's built on 3 nanometer, not 4. And it is -- has the same benefits that I talked about in the 100 gig. Very long reach, 45-plus dB, 2-plus meters of DAC cable. So everything in the rack is pretty much covered without any retimers, you don't need active in the rack. And again, it does provide the same benefits, CPO, linear optics, it's available now and all of our product teams are designing with this right now. And again, I think with these specs we have, we absolutely believe and we're very confident that customers are going to be delighted with this and we will be the #1 SerDes, again, in 200 gig. In the demo area today, we're actually going to show you a demo of the SerDes running on DAC cables, which is the hardest thing to do. And please do see that demo. I'm going to turn to DSPs now. Yes, after doing all of this, there is still going to be some DSPs, right, and still going to be pluggable modules. 
And these pluggable modules, of course, are very prevalent today. And in April of 2022, I did a key chain for investors where we talked about our renewed effort in this area with innovation to integrate as many components as possible to drive down, again, cost and power, right, for pluggables. So we are integrating drivers, which are typically in a non-CMOS technology, and TIAs, which are again non-CMOS. We brought everything into CMOS, integrated everything into a single device, and that was going to drive cost and power in the module lower, and we actually made that promise in 2022. So today, I'm very happy to tell you that on 400 gig optics, we did that. We achieved what we wanted to achieve. We also did that on 800 gig, in 7-nanometer. And we've actually overachieved. We got even lower power by taking all that and moving into 5-nanometer. So the modules built with Broadcom's integrated DSPs are some of the lowest power and cost in the marketplace today, and we can drive both multimode and single mode. And that's really to help customers reduce their overall spend as well as the power that they consume for these high-speed links, okay? So on 200-gig, actually, we started investing even sooner. And of course, as you know, at 1.6T, it is 8 lanes of 200, right? That's how you get to the 1.6. And we have this family of chips called [ Xian ]. We have today a demonstration of Xian and that demonstration will be with a device that has the driver integrated. Later this year, we're going to have a device that will also have the TIA integrated. And again, we support multimode and single mode. And one of the good things about this is the performance of these Xian-based modules is the best in the industry today. We have at least 3 decades of margin over competition and even specifications. And so we have the best producing, best-performing module. And why is that important, right? 
Why is that important is the fail-over rate that Ram talked about is something that is very clearly proportional to the error rate that you actually get. And secondly, if your performance is very high, you actually can bypass a lot of error correction and that gives you lower latency. And when you have that lower latency, it's going to be better for training workloads and things like that. So the better performance is going to be very good for the AI industry. So we are happy to show you that also today. And we have modules with both EMLs based on Broadcom's EMLs as well as Broadcom's CW lasers, which is a silicon photonics-based technology. So we will have both of those in the demo area, and we would like you to take a look at that, okay? So to wrap up on foundation high-speed links. We believe we delivered some of the best-in-class 100-gig per lane ecosystem, driving the objectives that I've outlined. We are absolutely on track to lead the industry again at 200 gig, both on SerDes and DSPs. And our -- as I also mentioned, our differentiation really comes from core expertise in analog DSP as well as our system know-how and the scale of deployment that we have been doing for many, many generations, okay? Thank you. And I will hand it over to Frank.
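
The link arithmetic in this segment (lanes per link, links per accelerator, and bandwidth doubling every 2 years) can be written out as a small sketch. The lane configurations are the ones stated in the talk; the function names, the dictionary, and the choice of 400 gig as the starting point for the projection are illustrative assumptions.

```python
# Lane configurations as described in the talk: (number of lanes, Gb/s per lane).
LINK_CONFIGS = {
    "400G": (4, 100),
    "800G": (4, 200),
    "1.6T": (8, 200),
}


def link_gbps(lanes: int, gbps_per_lane: int) -> int:
    """Total link bandwidth in Gb/s."""
    return lanes * gbps_per_lane


def links_per_accelerator(total_links: int, accelerators: int) -> float:
    """Average number of high-speed links per accelerator in a cluster."""
    return total_links / accelerators


def projected_gbps(base_gbps: float, years: float, doubling_years: float = 2.0) -> float:
    """Link bandwidth after `years`, assuming it doubles every `doubling_years`."""
    return base_gbps * 2 ** (years / doubling_years)


for name, (lanes, per_lane) in LINK_CONFIGS.items():
    print(f"{name}: {lanes} x {per_lane}G = {link_gbps(lanes, per_lane)} Gb/s")

print(links_per_accelerator(10_000_000, 1_000_000), "links per accelerator")
print(projected_gbps(400, years=4), "Gb/s after two doublings")
```

On the talk's numbers, a 1 million accelerator cluster with 10 million links averages 10 links per accelerator, and two doublings from 400 gig reach the 1.6 terabit link built from 8 lanes of 200 gig.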

Frank Ostojic

executive
#7

Thank you, Vijay. Good morning, Frank Ostojic. I've been doing custom silicon since I was born. It's been a long time. That's not my high-school graduation picture. That's when Hock Tan asked me to run the custom division like 16 years ago. It's -- I loved every minute of it. But most importantly, let me tell you about my team. My team and I, we came from Hewlett Packard. Whatever Hewlett Packard had that was hard in custom, we did those chips, whether they were compute, printers, graphics, that's the kind of stuff that we did. And then shortly after that, we became Agilent. So on top of the Hewlett Packard chips, we started doing some analog chips and some really cool stuff for Agilent. Then after that, we became Avago and Hock Tan showed up and then we acquired LSI. So I met some incredible, good engineers, customer engineers from LSI, and we integrated them into my team. Several of those sit on the third floor right there in this building. And we gained scale. We gained more customers. We've been working on that. After that, you know the story. We acquired Broadcom, and we got some amazing analog engineers and system engineers and other engineers that allowed us to have some crazy IP, some beautiful investments. So let me show you what my team and I have been able to accomplish. And thanks to all that effort, we've been the #1 custom silicon for 10 years, and I give all the credit to those amazing engineers, and to those great customers that have been loyal to us and been working with us for a long time. Now something happened in 2014. We met a customer that decided to do something really cool in AI, and we developed an AI chip for them and we started shifting resources and our focus to AI. And that's what we're going to talk about today. So first question, why do these consumer AI customers want their own chips? Why do they want to partner with us to create these XPUs? Why can't they use GPUs, merchant chips? 
What's the benefit? So let's discuss that, okay? The benefit can be explained by a simple equation: performance divided by total cost of ownership. What is total cost of ownership? It's the cost of the chip, the cost of the power and the cost of the infrastructure to put it together. So let's digest it. Let's go ahead and zoom in. When you take an XPU and you are one of these consumer AI companies, you have some internal workloads that are very important for your revenue generation and for your applications. So if you customize the architecture of your accelerator and your bandwidth and the ratio of bandwidth, accelerator to I/O, you might be able to do that specific workload, or those very specific workloads that you care a lot about, much more efficiently than when you have general hardware. So what happens? We work with our customers to customize the architecture that they have, which comes from them, to make sure that they can maximize performance for what they care about. And then when you look at efficiency and optimizing, there's another really good effect, which is when you optimize hardware, you make it smaller. You make it cheaper. You use less real estate. So when these companies start using the science that we co-developed, then they save millions of dollars and billions of dollars of CapEx because they're designed exactly for what they want, with the right ratio of memory, with the right ratio of I/O. So there's another benefit, which is when you optimize something and you optimize the energy, which is power, picojoules, right, picojoules per bit or picojoules per terabit, whatever it might be, you are optimizing cost: lower power, lower cost. And as you heard, power is an amazing commodity. You may or may not build the data center, depending on what power footprint you're going to have. You might determine where you build the data center or whether you're going to do it at all. 
So there is simple economics that are extremely strong that drive this business and this investment from the type of customers that we have been working with. So let's zoom in. Let's talk about what we're doing. All right. So you've seen the presentations from Vijay about SerDes, from a Ram about the optimized NICs that we have, from Ram about the Jericho AI. By the way, switching was one of the huge assets that we acquired when we acquired Broadcom, and that is available. You saw Near on what we're doing with co-packaged optics. And then in my division, we have a large investment in advanced packaging and buffer memory IP. Charlie discussed, this is the type of investment that we have. We're putting our money where our mouth is. We've been investing and no doubt, the priority of this $2 billion is #1 AI for this type of market. So as I mentioned before, what's this for? Super simple. Lowest power, best performance for optimized workloads in these XPUs allow us to have the best performance by TCO. That is what we're doing, very focused. That is the mantra of my division. We want to do one thing and do it right and pick the right customers. So let's zoom in a little more. Now let's do a little dissection of these chips. What are they about? What are the -- what matters in this chip? Here's a little diagram that shows the different aspects of XPU. And we're going to just touch on them. Number one, compute. Number two, memory. Number three, the network I/O, and you can see them on the top and the bottom as chiplets. And number four, last and absolutely not least, one of the hardest parts is a reliable packaging technology, right? So let's talk about that. Number one, the architecture comes from all those brilliant geniuses that work for our customers. They are looking at the workloads, they are looking at what they want to do in 3 years, in 5 years, in 10 years. That's where the architecture comes, from the accelerator. 
We've been working for those folks for a long time. And then we have developed a flow that allows them to optimize the construction of that compute. We have several software engineers in my team that all they do is optimize the flow to build those accelerators, really small, really fast and with a minimal area so we can go ahead and reduce the cost and improve the TCO. And obviously get it done really fast. Number two, so now we're -- compute is owned between the customer and us. It's a shared responsibility. Number two, memory, this is something that we own in Broadcom, which would take this ability to have HBMs or other memory solutions and have the right size, the right connectivity, the right cooling, the right testing to make sure they're reliable and they're ready to go. As Charlie discussed, we're going to demonstrate today how we're running these interfaces significantly faster than any standard out there or any competitor that we have with proprietary techniques that we have about everything from power distribution, from testing, et cetera, et cetera. So number three, well, you listen to my friend, Ram Velaga, right? We bought this incredible asset from Broadcom. And we're using that asset to its full extent. This, at the end of the day is going to be dominated by the network difficulty. So all the cool IP that Ram has, we have created software tools that allows us to put together chiplets that can make this I/Os wider, thinner to match the exact type of precision and ratio that the customer wants for their workloads. So we can do a 200 gig, 100 gig PCIe express. If they need another one next time, we just change it, and we can go to production quickly. So a lot of flexibility due to software automation that we've done for all the years that we've been working with them. So that's the network I/O and is done with full solutions. And not only that, we have the hardware, the firmware and the software. 
And we can give that to the customer before they even start doing the XPU so they can emulate. They can simulate, they can solve all the problems. Again, packaging and my favorite is, we can do 2.5D. We -- I'm going to show you some cool stuff on 3D and silicon photonics. This is really hard. And I'm going to talk about some of the things that we have resolved there. Great. So let's continue. Now let's talk about experience. We talked about investment. Now let's talk about experience. So a few days ago, I landed in San Jose, I'm from Colorado. And I saw that -- I would say all the signs are AI for this, AI for that. AI test their software for you. You go to the bookstore, there's books on AI. You're going to go to Starbucks, you're going to have AI latte. Just everything is AI, right? Everybody is putting the AI badge, right, to participate in this cool stuff. Not us. We've been doing AI custom chips since 2014, and we've been implementing a flow mechanically, electrically, thermally and then also for design to make sure we do it better and better and better, on the chips that we have taken to production, on the chips we're developing, on the chips that we're discussing architecture, for 1 decade. All right. Now we talked about that we're selective of our customers, right? So let's talk a little bit about them. With one of our customers, we have 10 years of chips and 10 chips. And we have learned our mistakes, their mistakes, our vendors' mistakes, and we have coded in software all the solutions for that type of a flow to make sure that we're doing it like a machine, with a lot of discipline, with a lot of automation to avoid errors. We have taken the cool stuff that we invented, and we have gone together with another customer, and we have been working with them for about 4 years. We have done about 4 chips for them. And there's time to go to production, right? 
And of course, there are some chips that we're developing with each one of them, and there are some chips that we're discussing architecture. And as Charlie discussed, we're excited that we have a third customer who we've done a chip, and we're starting to take it out to production. So we have one customer that we've been in a long time, midterm, and we have a new one. So very exciting but focused and they all have similar goals. All right. So how do we do this? How about the time to market? All right. So let's look at this. In this chart here, I want you to look at this point right here. That's when the XPU officially starts. So what we've been able to do, and let me share a couple of examples here. This device, we taped out in 7 months. This device, we taped out in 9 months. And it's not because we're working in another planet and overnight. It's because we have created a flow that's automated and debugged. And we have all these tools ready for plug and play in all the IP. And we have prequalified for thermal mechanical and electrical, the packages that we need. So the co-development, we engage with them. They're familiar with us. We're familiar with them. It's the same flow we've been using, but we've been improving for a decade. We take it to the fab. And because everything is proven, the only thing that's new is the guts of the computer architecture inside, which we, together, we quickly check and verify, and obviously, we emulate before that, and we can start production levels at 3 months. That is basically hard work that happens long before. Let me give you an analogy. I'm sure some of you like cool cars. Let's talk about a Corvette, let's say. It's like building a custom Corvette for a very custom track. You know the pitches of the track, the length, the curves and all that kind of stuff. The customers working on the engine. 
While the customer is working on the engine, we got the brakes ready, we have the chassis assembled, the tires, the music is on, everything is ready and the hood is open just waiting with the pit crew to put it in, hit the gas and we go. Speed is incredibly important in this market so you can have the performance by TCO correctly at the right time. So this is what we specialize in. One thing, this type of devices. Now you've seen the ship that Charlie showed. Pretty cool stuff. This is what we're doing for our customers that can enable 12 HVMs with a lot of silicon. You can see we have 2 NICs here and 2 cores. And our customer can feel that with the most precise elements that they need for their internal workloads, right? So we're going to talk a little bit now about the architecture phase. What happens in the architecture phase? This is for the future, right? So we work with these customers for a long time. We know what they need. We know their struggles. We know our struggles. So we have a huge R&D investment on technology for the future. That's what Charlie showed you, right? We have the silicon photonics. And I'm going to show you here, this is the chip that Near was talking about. This is 2.5D package with HBMs and an accelerator in the silicon photonics connector. This can save 80 watts of system power. Imagine putting 1 million of this together, what is the green savings? What is the OpEx savings that you can have with that? And you're going to see this chip working in our demo with real traffic and real testing conditions. And we're hitting everything we have, optics tests, reliability test, thermal test, to make sure everything is ready for production, right? Now on this big chip that Charlie showed that looks like a coaster where you can put your drink, it's an actual chip and you'll be able to see it. Tremendously difficult to get the warpage correctly, to get the mechanical to make sure it doesn't crack. This is not our first trial on this chip. 
We've done several and we fix all the mistakes that now allowed it to be a production product. So that's the investment. Now this is the part I'm really excited about. This is, I'm making sure it can capture the light, a 3D wafer. And it has a close to 800-millimeter square chip in the bottom with a close to 700-millimeter square chip on the top. And we're going to put 2 of those right here plus the NICs. You do the math. That's north of 3,000 millimeter squares available for networking, for I/O, for acceleration, for whatever are the right things that our customer needs to put. But we have to invest years ahead. And it's not just showing up with a flag that we can do it. you have to put all the engineering and all the capability, right? So you'll be able to see this when we have the demo, I'll be right there. And finally, I was trying, how do I show complexity? How can I show complexity in a way that makes sense? So I came up with this chart, which is pretty simple. It looks intimidating, it's pretty simple. This is time, and the size of the bubble is a complexity measure of an XPU. So if a bubble has about twice the area as another bubble, you can assume the bandwidth is probably twice, and the content of silicon is probably twice and mechanical problems are probably twice as hard to do, okay? So to make it easy to see, I coated this one as green. You can see the first chip that we did. It seemed hard back then, by the way. And then they have a little bit of blue and then the purple. They look like bowling balls they're going to hit you. So what do you see here? What trend do you see? What's interesting? Before I do that, let's analyze the question. Complexity is a function of compute performance, network bandwidth, memory bandwidth, power delivery, thermal integrity and mechanical reliability. These 3 last ones, they take a long time to figure out. Very difficult to do, right? So some observations, as you can see, there's more content. There's more complexity. 
And do you see any green balls around here? They're all big. So let me give you an example. I am a surfer, I surf with my daughter in Santa Cruz, I try to keep up with her. And we -- my daughter and I look at the waves in 3 different ways. That's a 2-foot wave, 2 to 3 foot waves, you can learn how to surf on those. Have fun, catch a wave with a long board. These are 6 foot waves. You got to know what you're doing. You could get hurt, and it's really hard to go through those to go to surf. These monsters are 15- to 20-foot waves. You don't mess around with that. These are very difficult, right? However, when I see this, I see that the surf is favorable for Broadcom because we're good at surfing those big monsters. That's what we've been doing, right? And it creates a very difficult barrier to entry, right? So -- and for those of you that are not surfers, basically, imagine skiing, green, blue, double black with skeletons, okay? That's actually what it looks like. And that's what I like to ski with my kids in Colorado, and I cannot keep up with them. All right. So in summary, pretty simple. Number one, focus. We want to do one thing right, and we've been doing that for 10 years, do these difficult consumer AI chips just like we did when we started at Hewlett Packard 30 years ago. Investment, $3 billion focused primarily and prioritized on AI. Experience. Many years of fixing things, of learning from mistakes, of improving the flow and being diligent on that discipline to use the same flow. And a 3-year investment, or 4, on future items that we need and we think our customers are going to need. I look forward to seeing you in some of the demos. Thank you so much. And now I'm going to give the time to Charlie.

Charlie Kawwas

executive
#8

All right. Thank you, Frank. I brought my skiing gear as well as my surfing gear. Hopefully, you can join us later on at the beach or the ski hills over there. I hope you've enjoyed all of the innovations that my colleagues and I have shared with you. I know it's been a little bit longer than what we planned, but hopefully, this was worth it. We've really covered how we enable AI infrastructure with silicon. But we all know silicon alone is not enough. And in order for the silicon to run, we need software. So I've decided to bring a special guest, another colleague of mine from VMware, Paul Turner, who's a special guest, if you can join me here, Paul. Welcome, Paul. Ladies and gentlemen.

Paul Turner

attendee
#9

Thank you so much, and great to catch up with everybody.

Charlie Kawwas

executive
#10

So Paul is a colleague who just joined us on November 22. He's the VP of Products for VMware VCF. And he's the product boss. So if you have questions for Paul, obviously, stick around. You better catch him before he leaves. But more importantly, what Paul did earlier this week is he announced, at the beginning of the week, a very cool thing. It's the first release of Private AI Foundation. So it has AI right in the middle of this. Just as Frank was saying, everything has AI. So Paul, tell us a bit more about this announcement.

Paul Turner

attendee
#11

Sure, Charlie, actually delighted to. So you heard a lot about consumer. And the other side to the AI picture, of course, is enterprise AI. And enterprise AI is slightly different because what we are -- what we've done with Private AI Foundation, which we first released actually with NVIDIA, because we've realized that you can take foundational models, ones that have been done probably on custom silicon and optimized by all of those top-end companies out there. But you can take those now as open source models, bring them into your enterprise and actually use -- optimize them just for your use. Do the fine tuning, do the prompt tuning, do the RAG, the retrieval-augmented generation, so that you can optimize those models, and you can do -- you can deliver new applications within months by just optimizing with only tens of GPUs for these customers because 98%, 99% of the work is already done for them, and they can build up those foundational models. So yes, we just released VMware Private AI Foundation with NVIDIA, which is actually a jointly engineered solution with our NVIDIA friends. So yes, first to market, we announced it back at VMware Explore and now released into the market.

Charlie Kawwas

executive
#12

Awesome. Well, this is exciting, but I want more clarification. What does private in private AI mean?

Paul Turner

attendee
#13

Yes. Good question. So our customers are some of the biggest enterprises of the world and some of the most secure government agencies and data centers in the world. And their data is their IP. Not only is it important IP to them, they're actually very concerned about that data and that IP and the privacy of that data. And so what we -- what private AI is doing is really working out how can we bring the foundational model? How can we bring the Gen AI capability to the data versus bringing the data to the Gen AI models. And that's what our customers want. They want to actually have this ability that inside in their data center, inside in their secured environments, they can actually optimize on top of these models and do it quickly and iterate really quickly. So that's the path.

Charlie Kawwas

executive
#14

Awesome. So as you've heard me talk, one of the things that we really believe in is open solutions in AI. We talked about open, scalable, power efficient. The open piece, I think, is something that VMware since the beginning, I think it's over 2 decades, has led the world in the data center. So as you release this out, tell us a bit more about your partners' ecosystem and how you enable -- continue to enable, in a private AI world, the open ecosystem?

Paul Turner

attendee
#15

Yes. So it's a huge factor for us. So we have more than 300,000 customers out there that run their data centers on VMware. So open for us means that we must support the ecosystem that's inside in that data center. That's the application ecosystem and, of course, the hardware ecosystem. So in this, we have actually done integration, the first releases, we've done integration with NVIDIA. And we tightly coupled in with their NeMo framework so that you can do that optimization and tuning. We've integrated in with their DPUs and GPUs. But we don't stop there. We've actually gone and we're working very closely with other hardware partners. We're working with Intel, AMD. We actually have a whole set of software partners. So think of people like Hugging Face. Hugging Face is very interesting. Think of it as like the GitModel repository for all of the open source models that you can use. Build your optimized model to a particular vertical, bring it down, do the fine-tuning of it. We work with companies like DCube. We work with a whole set of companies to actually help customers not just on the hardware side, but also on the software side, innovate. So go online to vmware.com and you can actually go in and look at our AI partnerships. And importantly, with every one of those, we build best practices. We give free guidance out to customers to actually help them understand the breadth of this ecosystem because honestly, it's totally confusing to them. So how do we make it very easy and give them the best recommendations in terms of tool sets, tool chains that are right for them to use and then be able to support it on any of the hardware ecosystem they support.

Charlie Kawwas

executive
#16

Awesome. Well, I'd like to thank you for taking the time and joining us today, even though this is a semiconductor event, but he is part of Broadcom. And you can see now Broadcom is not just focused on the consumer and semiconductor side of it, obviously, the enterprise and an open enterprise solution is key. Thank you so much.

Paul Turner

attendee
#17

Thanks so much. Take care.

Charlie Kawwas

executive
#18

Appreciate it. Thank you. All right. So with that, let me just wrap it up. Remember, this is the slide that we've talked about earlier on, and my colleague, Paul, here discussed it as well. So on the semiconductor side, we're focused on these big shiny blue squares that you see here. The market is consumer AI. We're very excited about this. There is a business case that makes sense for this today. As the enterprise model evolves, our brothers and sisters on the VMware side are already investing and taking products out to help the CIOs leverage what we can on the AI side as that business case develops. Remember, the second pillar is technology. You've heard from all my colleagues how we're #1 in each of these categories. We will focus and we are focused. We've been focused for 10 years on custom XPUs. We are the market leader for the last 10 years. And the same thing on the AI connectivity. We've been the market leader on connectivity for over 10 years, almost 2 decades. And with that, we've talked about the 2 customers. You've heard my colleague, Frank, talk about the experiences and these blocks that he was showing on these road maps are actual chips that we've built and either have shipped and are in production or in co-development, the middle ones or in full production. Second to none in that space. And the super exciting news today for us is now Frank has a third consumer AI customer that's just ramping and we'll be shipping in volume this year. Then if you look at the broad portfolio that we have, this portfolio of AI connectivity and networking is second to none, and it's been second to none for over a decade, leading the industry not just in the past but more important today and in the future. In ethernet, you heard Ram with Jericho AI and Tomahawk, absolutely taking the industry to the next level and continue that leadership with all of the hyperscalers, especially the consumer space. You've heard Jas talk about PCIe. 
We are not just delivering PCIe switches. We realize in order for the system to work, we will deliver the total end-to-end PCIe solution. That IP is not only in the switches and retimers, it actually goes also on the XPUs. It goes in the switches. And we are -- we have been and are still the #1 for 5 generations, and we think we will be in the Gen 6 time frame. On the Optics side, very cool technology that Near shared with you, #1 in VCSELs, and we are showing you today the impossible in the labs. Our MIT and PhD engineers said, 2 years ago, nobody can do 200 gig VCSELs, literally. Today, you're going to see it working in our labs. That is mission impossible. We're the only one shipping 100 gig VCSELs. Every system that uses 100 gig VCSELs today, it doesn't matter who supplies it, we play into that space. EML, we're the leader in 200 gig. And the cool thing that's coming out now is how we will change the power equation with CPO not just for the switches, which we announced last week, we took Tomahawk 5 and these CPO tiles and have created the first 51T CPO, which saves over 70% of the power, more than 30% of the cost. But now we actually are having these consumer AI customers tell us if each of these tiles, as Frank showed you, can save 80 watts and you have to put a million of these, that's 80 million watts. That's huge. That's the size of 2 data centers today just by using that technology. And then lastly, but definitely not least, all of these things would not happen, including the custom side, without Vijay's foundational technologies. It starts with the SerDes, it continues with the DSP, and we enable copper, single-mode and multimode capability. And with that, that is the market we play in. Remember, all the red squares, the golden lines, that's where the money is. And we play in an open platform to enable AI infrastructure across any size cluster in this space. 
At the end of the day, as I was just chatting with Paul, we believe in an open, scalable and power-efficient system that can allow you to get to this million cluster. You cannot do that without the best networks in the world, and that's ethernet. The only way we believe we will continue the success that we've done over the last 2 decades is through deep and large and scalable investments in our technology and our engineers across all of these technologies we shared with you. We will continue that innovation and rest assured, and I'm hoping you will join us with -- later on with the demos to see that. But at the end of the day, even if you're in the right market, even if you have the right technology, that sustainable franchise does not get completed if you do not seamlessly execute on that plan. And I'm very proud to say that the teams we have on the semiconductor side are the best in the world. And in these spaces, we continue to execute. And with this, thank you again for taking the time with us. I personally and the entire Broadcom team here appreciate you taking the time with us. And hopefully, we can spend a bit more time with you later on. Thank you again.
