International Business Machines Corporation (IBM) Earnings Call Transcript & Summary
September 21, 2023
Earnings Call Speaker Segments
Keri Olson
executive
Hello, everyone, and welcome. Thank you so much for joining us today for this webinar on AI-assisted mainframe application modernization. Today, we're going to talk about IBM watsonx Code Assistant for Z. This is an upcoming product that will be available later this year, and we are excited to be here to talk with you about it today. My name is Keri Olson. I'm the VP of Product Management for the IBM watsonx Code Assistant product family. With me today are 3 of my amazing colleagues: Ruchir Puri, who is an IBM Fellow and Chief Scientist with IBM Research; Kyle Charlet, who is an IBM Fellow and our Chief Technology Officer for IBM Z Software; and finally, Rich Larin, who is a Product Manager for IBM watsonx Code Assistant for Z. It's great to have all of you here with me today, and thank you once again to everyone who's tuning into our webinar. We're very excited to be here with you. Generative AI has recently generated a lot of excitement in the industry. Organizations are starting to see how generative AI can truly be applied across their businesses and provide business value. Today, we're going to talk about IBM's approach to generative AI and, more specifically, we're going to get into AI for code and what we're doing with IBM watsonx Code Assistant. Before we get into the details, I do need to share with you that we are going to be providing some forward-looking content and some forward-looking information. Please understand that all of this information, and IBM's strategy and plans, are subject to change at our sole discretion. This is our intent though, and we're excited to share it with you. So over the course of the next hour, you're going to see information about IBM's view on generative AI for business. We'll give an overview of IBM watsonx Code Assistant, and then we're going to get into the meat of the presentation with mainframe application modernization, IBM watsonx Code Assistant for Z specifically.
And the star of the show: a demo of watsonx Code Assistant itself. You will see it running live with a demo from Kyle. And finally, Ruchir is going to take us through the details of our code-only large language model. This is the state-of-the-art model that is powering IBM watsonx Code Assistant solutions. So let's get into it. Even though generative AI is relatively new to the market, the widespread popularity of ChatGPT has really created significant interest in large language models and foundation models. Interest in what they can do for us as a consumer, as an individual, but more importantly, there's a lot of interest in what foundation models can do for business. Now it took quite a bit of time for companies to really embrace and move towards traditional AI. We're seeing something different with generative AI. Generative AI is seeing massive early interest. As a matter of fact, 80% of enterprises are either already working with or planning to leverage foundation models and adopt generative AI for their business. Goldman Sachs has also estimated that generative AI will have a very deep economic impact. They're suggesting that gen AI could raise global GDP by 7% within 10 years, and that's very significant. And Boston Consulting Group has suggested that generative AI is expected to represent 30% of the overall market by 2025. All of this data points to extremely high adoption for generative AI. We know that it's great for consumer applications. We know that it's great for individual use cases: writing a poem, writing a paper, generating a recipe. But IBM's discussion around generative AI is about how the technology will help business to move forward, and that's what you're going to see today. When it comes to generative AI, one aspect of IBM's approach is building and providing foundation models. Our foundation models are trained on large amounts of unlabeled data, and then they can be adapted to new scenarios and new use cases through fine-tuning.
Foundation models are already the norm in natural language processing, or NLP. But language is just the beginning. As you see on the chart in front of you, IBM is building a set of domain-specific foundation models trained on all kinds of different data, many of which you see in front of you: code, tabular data, time series data, geospatial data, semi-structured data and more. The models that IBM will be providing will empower applications ranging from code creation to drug discovery to cybersecurity. And these models will dramatically impact how people interact with technology. This will change not only how business is done, but how we actually think about our business moving forward. The flexibility and scalability of IBM's foundation models will significantly accelerate adoption. Instead of treating AI as a tactical add-on, we believe that enterprises will be empowered to put AI to work at the strategic core of their business. Now beyond the foundation models themselves, IBM's strategy is focused on a full stack approach to generative AI. That starts at the bottom of the stack with our hybrid cloud technology. This foundation is built on open source. Moving up, you see our Data Fabric services and our AI and data platform. The AI and data platform provides everything that enterprises need to build, to train and to deploy AI across their business. It provides access to that data through our fit-for-purpose data lakehouse service, and data and AI governance will be available to enable responsible, transparent and explainable AI for your workflows. Moving up the stack, you see that we will enable ecosystem integration through SDKs and APIs. And finally, we will have a robust set of AI assistants that will help organizations to close skills gaps and help users to be more productive in their day-to-day work. And that's where we want to focus today with IBM watsonx Code Assistant.
So as we take a look at IBM watsonx Code Assistant, this represents a family of offerings that support AI-assisted content generation, code development and application modernization. Our objective here is to help organizations to address skills gaps and to increase productivity for both developers and IT operators. IBM initially announced IBM watsonx Code Assistant at our Think Conference in May, where we talked about IBM watsonx Code Assistant for Red Hat Ansible Lightspeed. This solution provides generative AI to help individuals to create Ansible content and Ansible playbooks more efficiently and more effectively. Fast forward, and now we have recently announced our plan to also deliver a second product in this portfolio, IBM watsonx Code Assistant for Z. The watsonx Code Assistant for Z offering is purely focused on addressing the application modernization life cycle for existing Z applications. Our focus is to help organizations to continue to leverage the power of the mainframe z/OS platform while transforming and modernizing their applications. You're going to hear a lot more about this specific product in just a few moments from Rich and from Kyle. But before I pass over to Rich, I want to show you one more view of our product family. As you look at the chart in front of you now, you can see the details of how we start with our base foundation model. And as I mentioned earlier, we take that base foundation model, and we fine-tune it for specific use cases. When we fine-tune it for a specific use case, it will provide greater performance and greater accuracy for that use case, providing significant value for our customers, for our users and for businesses that are looking to get value from generative AI for code. So you see at the bottom, we have a foundation model. Ruchir is going to talk about this in detail at the end of our presentation today.
From there, we fine-tune it with Ansible content, we fine-tune it with COBOL to Java content that feeds into our watsonx Code Assistant, which then provides these solutions for our customers tailored to the needs of those specific use cases, and this is how we will provide tremendous value through IBM watsonx Code Assistant. So now I'm going to pass on over to Rich. He's going to talk in more detail about watsonx Code Assistant, and then we'll see a demo.
Rich Larin
executive
Thanks, Keri, and hi, everyone. My name is Rich Larin. I'm the Product Manager for watsonx Code Assistant for Z. Before we dive too deep into the details of the product, I want to put it in the context of our mainframe application modernization strategy. Over the years, our clients have been asking questions around some of the topics you see on the screen here. How can they get more agile? How can they address -- all together? Well, the good news is we're not starting today. We've been addressing these challenges and solving them for our clients for years now. Enterprise DevOps doesn't have to be a distributed solution. It extends to the mainframe. We have clients committing code 20 times a day on the mainframe. Skills: Java. We're going to talk a lot about Java today. Leveraging Java on z/OS is not a new thing. We've been investing in Java on this platform for 25 years, and we have customers using it today at scale. Cost: we now have consumption-based pricing for mainframe software. And pulling this all together, you're not left on your own anymore. We have detailed guidebooks, Redbooks and Redpapers on how to go about doing all this. And so that's what our strategy has been all about and what it continues to be. And the good news is watsonx Code Assistant for Z fits right into this. We're now bringing generative AI into this same strategy. And so just to give a little bit of illustration of how that strategy looks: again, Z is not off on an island anymore; it's fully integrated into a hybrid cloud platform. That is the solution that we recommend to address all these challenges. And there are 2 core tenets that we're driving at. One is enterprise standardization and the other is platform integration. Then, by adopting this type of approach, we really can leverage the tools that you see in the middle of that little cloud, [ offers ] Cloud Paks and now watsonx, to really get the full power of a hybrid cloud solution.
And this is how we address these pain points: speed to market, talent, access to data and costs. This is our strategy when it comes to mainframe application modernization, and it's been paying dividends for years for our customers. And now watsonx Code Assistant for Z fits right into this strategy. It's going to help achieve all of these goals here. And so some of you are asking, how do we do this? How do we go about achieving and realizing these solutions? What we've done is we've built a set of entry points that help our clients to modernize. And the reason you see this not as a path but more as a menu is that it depends on the situation each of our customers is facing. In some cases, they want to adopt DevOps. In some cases, it's around performance or productivity; in some cases, it's around accessing data. Whatever the situation is, there's an entry point now with prescriptive guides and explanations and capabilities and products to support that type of modernization. And now, this is where generative AI enters the picture. Keri introduced the watsonx Code Assistant family. And these 2 products slot right into 2 of these entry points here. Red Hat Ansible Lightspeed is going to be a tool for automating and standardizing. But watsonx Code Assistant for Z is going to be all about enhancing and modernizing applications. So the key point that I just want to reemphasize: our strategy remains unchanged. Generative AI is just going to give us tools that are even more powerful to deliver on the strategy. So now let's explore watsonx Code Assistant for Z. I want to start off by explaining a little bit of what our strategy is. We now have generative AI; we're able to translate COBOL to Java. But what I want to show here is that this solution is more than just a generative AI magic wand. We are building an end-to-end solution that's going to be highly differentiated versus what's existed in the past, and for 3 reasons. The first one is life cycle.
The second one is code quality, and the third is interoperability and optimization. From a life cycle approach, what you'll see is that it takes more than just code translation capabilities to modernize your application. And we have a whole suite of capabilities that are packaged together here to help you with that entire journey. I'll come back to that in just a moment. The second is the quality of the code. COBOL to Java conversion tools have been around for decades. And the quality of those tools can be suspect. The challenge is it's hard to express COBOL in true object-oriented Java. And so the output in the past hasn't really looked like that. That's where our generative AI capabilities are going to make a huge difference. Ruchir will talk about the AI model specifically, but what we're going to be able to do is actually generate well-architected Java that looks familiar to Java developers. And then lastly is the interoperability and optimization. We're talking about COBOL to Java, but COBOL and Java are both strategic to the mainframe. We know our customers need both, and it depends on the situation which one you need to use when. And so that's why it's so important to have that interoperability capability. So that, again, no matter whether you choose COBOL or Java, you're able to reap the benefits that the mainframe offers. Whether it's resilience, performance or security, it's independent of language; those benefits are assured. So this is our strategy here. When it comes to modernization, generative AI can help assist in this modernization: you can do these projects much more quickly, at lower cost and with less risk with this type of technology. That's what the strategy is for this portfolio. So now let's dive a little bit into what I talked about here. From a life cycle standpoint, this tool is going to follow some of the same basic steps that many enterprises have been doing for years.
You see this wheel diagram here with these 6 steps. And these steps are pretty typical when it comes to modernizing a mainframe application. You have to start off by understanding what's happening in that application and where all of the dependencies are. Then you want to go and get at some code, so you have to do some refactoring to actually get the code you want to work on. Then you're going to transform it. Here, we're talking about language conversion, but there are all different types of modernization that you can go after. Once you transform it, you validate it: make sure you didn't break anything and everything works. Then you deploy it; there might be some recommendations from a platform architecture standpoint. You get it back up and running, you observe it with observability frameworks and tools, and modernization project Phase 1 is complete. So this is the cycle that has been going on, and people have been going about this for years. And when it comes to the modernization we're talking about here, with the new language, you're either having to piece together tools or you're running things manually. That's all changing now with watsonx Code Assistant for Z. And so the scope of our product is that we're bringing tools for steps 1, 2, 3 and 4 to really help accelerate this process and address some of the challenges on the left side, right, around skills, productivity and optimization for Z, and of course, to align with an industry-standard DevOps approach. So now let me talk a little bit about what these steps 1, 2, 3 and 4 are. The first thing is understanding the applications. Before we can get to any code translation, we have to understand all the dependencies between the code and the databases. And that is what this first step is. The first part of this solution is going to give you the technology to map all those dependencies into a repository so that we actually understand what is happening. And that's the basis.
That's the foundation for this type of incremental refactoring. So we need to start here. And so that's going to be the first part of the solution: understanding, having those dependencies mapped out, is step 1. Okay. Great. We've got that done. Step 2. Now that we understand and have that basis for the application, we're going to decide, all right, what do we start with? What is the first piece of the application that we want to go and modernize? And that's what this refactoring tooling is for. So we have some brand-new technology here that's going to help assist with extraction and decomposition of these highly complex COBOL applications. Now if you see this puzzle piece in the bottom left, that's the current state, the starting point for many, many of our customers. And for the most part, 80% of the code in the application is fine. It's working fine. But we have a specific need to modernize some business function or some capability. And so this tool leverages that first part, the output of step 1. Now we can use this tool to actually extract the puzzle pieces that we want to modernize, right? This is an incremental journey. You're going to do it one step at a time, one step at a time, one step at a time. This is tooling to do that, and you'll see in the demo exactly the power of this tool to really make the developer's life easy.
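The understand and refactor steps described above amount to building a dependency map and then asking which pieces belong to one business function. The following is a minimal sketch of that idea, not the product's implementation. It is written in Python purely for brevity, and every module and table name in it is invented for illustration.

```python
from collections import deque

# Hypothetical dependency map: each key calls or accesses the names listed.
# All module and table names are invented for illustration only.
DEPENDENCIES = {
    "MENU": ["ONBOARD", "QUOTE"],
    "ONBOARD": ["VALIDATE", "CUSTDB"],
    "QUOTE": ["RATING"],
    "VALIDATE": [],
    "CUSTDB": ["TABLE:CUSTOMER"],
    "RATING": ["TABLE:POLICY"],
    "TABLE:CUSTOMER": [],
    "TABLE:POLICY": [],
}


def reachable_from(graph, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def callers_of(graph, target):
    """Return every node from which target can be reached."""
    return {
        node
        for node in graph
        if node != target and target in reachable_from(graph, node)
    }
```

Under these assumptions, `reachable_from(DEPENDENCIES, "ONBOARD")` gives the candidate piece to extract for one business function, and `callers_of(DEPENDENCIES, "TABLE:CUSTOMER")` lists everything that would be affected by a change to that table. A real dependency repository would also track data structures, paragraphs and transaction boundaries, which this sketch ignores.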
Kyle Charlet
executive
So let me -- this one-step-at-a-time approach is really important, because what's happening in other places in the market is that it's all done at once. You feed the entirety of a COBOL application into some transformer that's largely rules-based in market, largely doing line-by-line syntax translation, right? Rich will get to the transform element in a moment. But in doing so, you're sort of taking a big bang, sledgehammer approach to the whole solution, right? The importance of refactor here is that we're taking these monolithic applications -- and the term monolith is important to define here, right? Simply, anything that's been around for a long, long time gets into this state, right? This is not a COBOL problem. This is an every-language problem. It comes with the type of longevity that applications have had on the Z platform. When you have that type of longevity, you have a lot of different architectural voices, a lot of authors, right? These applications are decades old at times. And when you have different authors and different voices, they all have different design principles and different places they develop. And you end up with this layered approach where, over the years, people reintroduce the same sort of shared services differently over time. And I've seen this with Java applications that have been around for a while. And so what refactor enables you to do is take these monolithic applications that are hard to detangle, and it does the detangling for you and will extract a discrete service from this monolith. So it's not only a stand-alone service; it will be integrated, by the way, with that monolith that you extracted it from, right? Fully integrated, fully callable between the 2 -- and that service then forms the basis for what we can do with the transformation steps.
So the idea here is, as Rich just stated, you're doing this 1 business service at a time, not 1 giant 30 million, 40 million, 50 million line application at a time. It's a significant difference. It's at your run rate and pace, which significantly reduces the risk associated with such a motion. So Rich, I just wanted to add that in. Back to you.
Rich Larin
executive
Yes. Thanks, Kyle. So now let's talk a little bit about the generative AI piece. I'm not going to go into too much detail on the large language model because we'll cover that later. But this is where AI enters the picture of this modernization journey. This is where we've trained our model to really produce well-architected Java. And that's the benefit of our experience on the platform. We know COBOL. We know Java on z/OS better than anyone. And so we're imparting that knowledge into this model to really ensure that we get a high-quality output. Again, this is optimized for Z 1 time to [indiscernible] a service. So that's the design of this model. And the one other point I'll make, and you'll see it in the demo, is that this is really intended to be a developer experience, right? We want the developer to have the context of the IDE they're operating in when interacting with this model. So the whole motion here is a developer motion, so that the developer is as efficient as possible as they leverage the AI capability. So that's the transform step. This is where the generative AI piece comes into play. And then lastly, what I want to talk about is validate. We work with a lot of customers who have been doing this type of exercise manually. And the feedback that we get is that even when you do the conversion manually, one of the most resource-intensive parts is the actual testing and validation of that new code. And so what good is generative AI if you still have to do comprehensive testing manually as well? So that's why it's so important to bring new tools to support this modernization exercise. And so that's what we're doing here. So there are going to be 2 things that you're going to be able to do. One is we need to ensure equivalence between the new Java and the original COBOL. And so there's going to be a capability for automated unit tests to be generated to ensure that you have that equivalence, right?
That's going to be the most important part. And then secondly, we want to adhere to frameworks that are in place to really make this maintainable both from a COBOL standpoint and a Java standpoint. And the frameworks that we're going to be leveraging are zUnit for COBOL and JUnit for Java. And so that's what this automated test case generation is going to be focusing on: adhering to that. So those are the 4 steps that are going to be packaged together in this watsonx Code Assistant for Z solution. And again, the intent is that it supports this journey of incremental modernization over time. So the last piece I want to cover here is just pulling it all together before we then dive into the demo, so you can see the approach. I'm going to relate back to the puzzle piece and some of the comments Kyle made earlier when we talked about refactor. So as you can see, we had this application that's operating in one of our run times, accessing the data, VSAM files, depending on the architecture and the application in question. But what we're doing is minimally invasive surgery, right? That's why we say a derisked approach. So you're only taking out the parts that you intend to modernize at this point for your business reasons. But everything else remains untouched. The data model and the transactional model remain in place. And that takes out a majority of the risk of this type of modernization, because everything is able to stay as is; you're just touching the pieces you need to touch. And the last point I want to make here is that you see the new puzzle pieces in the bottom, COBOL or Java. Again, I just want to reinforce the point I made at the beginning: both of these languages are hugely strategic to us and our customers. And we want to be able to support both of them as part of this modernization journey. So with that, Kyle, I'll turn it over to you.
And if there's any point you want to make here before we go on to the demo.
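The validate step described above rests on one idea: drive the original service and the transformed service with the same inputs and confirm the outputs match. The following is a minimal sketch of that idea under stated assumptions. It is written in Python for brevity, the two functions are invented stand-ins for the extracted COBOL service and the generated Java, and the business rule is hypothetical.

```python
def legacy_premium(age, has_claims):
    """Stand-in for the extracted legacy service (hypothetical rule)."""
    base = 500
    if has_claims:
        return base + 200
    return base + (100 if age < 25 else 0)


def modern_premium(age, has_claims):
    """Stand-in for the transformed service; written differently on purpose."""
    surcharge = 200 if has_claims else (100 if age < 25 else 0)
    return 500 + surcharge


def check_equivalence(old, new, cases):
    """Run both implementations on each case and collect any mismatches."""
    mismatches = []
    for case in cases:
        expected, actual = old(*case), new(*case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


# One case per branch of the rule, so every path is exercised at least once.
CASES = [(30, True), (30, False), (20, False), (20, True)]
```

An empty result from `check_equivalence(legacy_premium, modern_premium, CASES)` means the two versions agree on every generated case. It does not prove equivalence in general; it only shows agreement on the branches the cases cover, which is why choosing cases that drive each branch matters.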
Kyle Charlet
executive
Thanks a bunch, Rich. Yes. No, I just want to -- now we're going to get to the demo, which hopefully will be the most exciting part. Not that the speakers weren't very exciting, but demos are always pretty cool. Technology tends to win sometimes, we all believe. I'm really going to highlight, as Rich mentioned, the refactor, transform and validate phases of this, right? And so the application that I'm going to show you is a basic insurance application. It's got about 10 different services that are all within the confines of the single application. And we've already gone through the understand step, right? We've already done the ADDI scanning of the application to understand all of the dependencies of this application, including all the data dependencies this application has. And the output of understand is really, as Rich said earlier, the input to our refactor step. So I'm going to quickly share my screen for you, and then we can get rolling. So what I'm showing you now is the output of ADDI brought into our service designer solution, right? You'll see a number of modules listed here. And as I click around, right, you'll see, for example, if I go to this, this is actually a Db2 table, and you'll see all of the call paths that lead to any sort of activity in this particular table, right? For this example, we want to extract the business service that onboards a new client. That business service ultimately results in an insert to this particular table, so you can follow the chain going back. We'll start at the head of the chain, and we'll look at all the paragraphs in this particular COBOL module. And as we look at those paragraphs, we'll see there is one called onboard customer. We'll select that, right? And here is where, as Rich mentioned, sort of that semi-invasive surgery metaphor comes in, where we can actually do what's called slicing, so that makes it seem even more like surgery, I suppose.
But from that module, we're actually going to extract all of the code slices across all the blocks of code, really from that module all the way down to the actual module that does the ultimate insert into the table. So this technology can scan all of that and really surgically remove those elements of code along the call chain and then extract those into that single consolidated business service, which is what we've just done, right? Here are the different code blocks that were involved in this. And at this point, we have the stand-alone service. I'm going to export that service, right? And then we're going to switch into our VS Code experience. Now you'll see that the extracted service just appeared here. Really important point to make: what I just showed you was, obviously, a web UI form factor. That's what it is today. We are integrating that whole experience into VS Code because we believe very strongly that the motions of refactor, transform and validate are very, very much developer motions. As such, that service designer experience absolutely is going to be integrated into this VS Code editor experience because we don't want context switching, right? We want them all to be in the same domain space. It makes this whole thing a lot more seamless and smooth. But in any event, I have this new COBOL program now. Down here, you'll see a little watsonx Code Assistant for Z. I'm going to import COBOL into that. I'm going to select the one we just exported. So now we have it down here. We have a couple of operations we can do. The first one is generate Java classes, right? So I'm now going to generate the Java classes. One thing that allows you to do at this point is some semantic mapping, right? Here are some of the COBOL data structures, right? And you can now control what the resulting Java variable names look like, right? Do you want them camel case? Mostly, camel case is how we're going to go here.
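The name mapping described above, where COBOL data names are given camel case names before classes are generated, can be sketched as a simple default rule plus explicit overrides. This is an illustrative assumption about how such a mapping might behave, not the product's logic. It is written in Python for brevity, and the COBOL names used with it are invented.

```python
import re


def to_camel_case(cobol_name):
    """Convert a hyphenated COBOL data name to a camelCase identifier."""
    parts = [p for p in re.split(r"[-_]+", cobol_name.strip().lower()) if p]
    if not parts:
        raise ValueError("empty COBOL name")
    return parts[0] + "".join(p.capitalize() for p in parts[1:])


def build_mapping(cobol_names, overrides=None):
    """Map each COBOL name to a target name, letting explicit overrides win."""
    overrides = overrides or {}
    return {name: overrides.get(name, to_camel_case(name)) for name in cobol_names}
```

For example, `to_camel_case("CUSTOMER-FIRST-NAME")` yields `customerFirstName`, while passing an override such as `{"CA-POLICY-NUM": "policyNumber"}` to `build_mapping` lets a developer replace a cryptic legacy name with a readable one, which is the kind of control the mapping screen in the demo appears to offer.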
But that's all available here in that mapping. And then we can actually issue the generate, right? So what's happening now is I'm going to indicate the package. And it's now going through and generating an entire class hierarchy, which you just saw pop up here. What's interesting to note here is that when we brought in this actual COBOL application, it brought in not only that service, but all of the actual COBOL data structures in support of that service, the ones that service needs and uses as it ultimately onboards a new client, right? These are all now reflected here, right? When we talk about well-designed Java following object-oriented principles, this is what we're talking about, right? In market today, you'll see a lot of what is basically line-by-line translation. And what ends up happening is you get COBOL syntax expressed in Java. That's pretty gross. It's largely unmaintainable -- it's largely unrecognizable to a Java developer, to a Java professional. We actually have worked with clients where their Java professionals cannot operate on the Java that was generated. They've had to become COBOL professionals to go back in, update the COBOL and regenerate the Java to deploy it, right? I think we'd all agree that's basically a 4GL nightmare at its worst, right? And so that's what we're seeing in market. What we've done here is we've modeled an entire Java class hierarchy, and the transform step is going to transform into the specific areas of that class hierarchy where the relevant code needs to go. So if we look at it just for a moment, we'll see there's a policy class, right? A standard policy with a bunch of getters and setters, right? Now we actually have built in a hierarchy inheritance model, where we have, for example, a motor policy, right? This is a specific version of a policy. What makes a motor policy unique?
So it obviously shares all the attributes of a general policy and adds in the specific elements that make it a motor policy, right? So all good there. Now if I look in here, I look at customer request, I see there are a couple of method signatures here that have no implementation. They're not filled in yet, both insert customer and obtain customer. We'll start with insert customer. I'll go ahead and I'll generate that method. What's happening now is we're making a call to our watsonx Code LLM through watsonx Code Assistant, right? And it's letting you know it's actually generating this method in this Java class, generated from this COBOL file. We wanted to be very clear to separate the necessary subject matter expertise of the COBOL developer and the Java developer. Hey, they might be the same person, but they very well might not be. So we see the COBOL subject matter expert doing the service designer motion that we showed you during refactor, right? You need to obviously have COBOL skills in order to do that because, as a part of that extraction I showed you, you can actually move those extracted snippets of code around. You can actually modify and add code in line, right, which again goes to why this needs to be integrated into a VS Code experience, because you can actually manipulate the code as a part of the refactor process. But here it's letting you know it's generated from this COBOL file, which you can open if you want to, from this particular paragraph. Now here's the generated code. I can insert it, I can rate it. We'll insert it. You'll see all of that code now in line in this method, right? So we've now -- and this is where Rich was going. It's 1 method at a time.
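The class hierarchy described above has three ingredients: a general policy class, a more specific motor policy that inherits from it, and a request class whose method signatures exist before their bodies are filled in. The following sketch shows that shape only. It is written in Python rather than Java for brevity, and every field name and method body is an invented assumption, not output from the product.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Attributes every policy shares (field names are invented)."""
    policy_number: str
    customer_id: str
    annual_premium: float


@dataclass
class MotorPolicy(Policy):
    """A specific kind of policy: inherits the shared fields, adds its own."""
    vehicle_make: str
    vehicle_model: str


class CustomerRequest:
    """Method signatures exist up front; bodies are filled in one at a time."""

    def __init__(self):
        self._customers = {}

    def insert_customer(self, customer_id, name):
        # Filled-in method: plays the role of the body inserted in the demo.
        if customer_id in self._customers:
            raise ValueError(f"customer {customer_id} already exists")
        self._customers[customer_id] = name
        return customer_id

    def obtain_customer(self, customer_id):
        # Still a stub, like the unimplemented signature shown in the demo.
        raise NotImplementedError("not generated yet")
```

The design point is that the structure comes first and the translated logic is placed into it method by method, so a developer reviewing the result sees ordinary classes with inheritance rather than one long procedure.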
We can actually -- we're looking to batch all of this together, right, to get more and more scale out of that but it's actually going through and properly identifying kind of proper Java and then implementing that into within the confines of the class hierarchy. The last thing I want to show you is the validate step, right? How do you trust that the code extraction and transformation actually works, right? This is where that validation step comes in, where it's actually generating a bunch of tests. In this case, the COBOL application only had 2 call [ paths ]. There was basically an [indiscernible] branch, right? And so the test cases made sure it drove each branch of logic in the actual application. And what it shows you is that here's the inputs, there's the Java input and the equivalent COBOL input. They're driving unit tests against both the extracted COBOL service and the transform Java service to make sure that there's semantic equivalents between the 2. So we'll generate 2 different tests validating the outputs remained the same, right? So that is really the kind of the crux of what it is that this whole wheel does as it pertains to refactor, transform and validate, right? These tests and validate will absolutely be lead behind tests. As Rich said, there'll be [indiscernible] test to integrate right into your CI/CD pipeline. There'll be [ Xena ] tests, which really is a game of equivalent for Cobalt. -- right? So the [ Xena ] tests that are also lead behind fully integrated into your CI/CD pipeline. So all of that information is going to be there for you to lean on and leverage. The ones that of tests I didn't show yet today are ones that drive that original COBOL monolith application, that original insurance application because we have to ensure that the extraction that we did all didn't adversely impact the actual running application itself because what we're going to be doing is not only extracting that service. 
But generating code in that monolith so that it can then call either the extracted COBOL service or the extracted Java service. Right, as Rich mentioned, we are language experts on the platform. We have clients worldwide betting their business on the ability to interoperate between COBOL and Java in the same transactional unit of work, right? So if that COBOL application was running in IMS, that Java application will run in IMS; if it was running in CICS, it will be running in CICS, mixed language. So COBOL calling Java, Java calling COBOL, all there in the same transaction, needs to work, right? This is the value of running Java on the platform. You don't have to convert all your COBOL to Java. You don't have to run that Java in a different app server. You run it right in CICS alongside your COBOL, right in IMS TM right alongside your COBOL. We have clients that are running 400 million transactions a day with mixed language COBOL Java; 85% of those transactions are running mixed language workload, right? Significant value for these clients, very, very critical to their business. So what did I show you, right? We talked about this, and Rich did as well, right? We talked about continuous and targeted modernization, right? It's very targeted to a specific business service. It's not a sledgehammer, Big Bang approach, right? Well-designed Java, right? And Ruchir is going to talk a lot more about our differentiated LLM, our Code LLM. Very well-designed Java is important for us. It doesn't look like [ JOBOL ], right? It's actually not just COBOL syntax expressed in Java, right? That's a mess. z/OS optimized outcome. I answered the question earlier, right, about what does this transformed Java look like. This transformed Java is leveraging 20-plus years of SDK expertise on the z/OS platform, right? We've had SDKs that we've curated for over 20 years. We've had Java available for 25 years on our platform now. 
It's as much legacy as any other language now, quite frankly. But it's immensely popular and clients want to continue to bet their business on Java and COBOL both. And we're going to add PL/1 into the mix here as well, right? Looking ahead, we're certainly looking to bring a PL/1 motion in here for an optimized PL/1 to Java LLM tuned model. So the optimized outcome here is very, very important because it's leveraging SDK technology we've had here on the platform for many years. So it will use Db2 SDKs, [ IMS TM SDKs, CICS SDKs ], VSAM SDKs, [ JSAs SDKs ], right, to make sure that transformed Java application is running in the same app server, accessing the same data sources, right? Again, using proven SDKs that have been out there for quite a long time. Obviously, from a talent pool perspective, this is a big deal for clients whose strategic direction does include Java. For those clients whose strategic intent does include Java, it opens up the Java talent pool for them, right? And an SDK is an SDK is an SDK, right? It's got Javadoc. SDKs are brought into the language all the time. You look at the SDK, you follow the Javadoc, you're off and running, right? You don't need Java z/OS developers. You need Java developers, right? And they can use any SDK, no different here. That's the talent pool that we're unlocking with this motion. Finally, I'll talk about where we're heading, right? What I talked to you about today is content generation and content validation. We're going hard and strong and fast at this, a long road map ahead of us here, but a very rich road map ahead of us here. We're certainly going to be going to content optimization and content explanation, right? Content explanation is key. When we showed you what ADDI can do, it can certainly tell you sort of what your modules are, where they are and what their dependencies are, but it's not telling you what the actual application is doing, right? And that's what content explanation can do. 
Tell me what this COBOL application is doing. Tell me what this Java application is doing. This comes into play very, very cleanly when we transform our COBOL service into a Java service. With content explanation technology, we can add in-line Javadoc, for example, or in-line COBOL comments, right in line, right? We're debating right now how much of the COBOL comments we carry over into Java. For one, comments lie, code doesn't, right? We can't validate the accuracy of the comments, right? We could certainly use some sort of AI capabilities to look at the natural language expression to see if it's more of a business expression or something different to determine, hey, this might make sense. But even if that were the case, since we're not doing a line-by-line syntax translation, that comment might have no place in the transformed code, right, because it's not line-by-line. So content explanation can really, really help here. And then certainly, optimization, right, optimize my COBOL. As Rich said, COBOL is the future of our platform, just as much as Java and PL/1 are. So optimize my COBOL, right, optimize my Java, right? So there's a rich road map ahead of us here. We're starting with generation and validation, but there's so much more to come. Ruchir, I'm going to hand it back over to you to talk about the greatness of our model.
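[Editor's illustration] The validate step described in this segment drives the same inputs through the extracted COBOL service and the transformed Java service and checks that the outputs stay the same. The sketch below shows that idea only; it is not the product's test generator. The three service functions, the age rule and the input list are hypothetical stand-ins invented for this example, written in Python so the comparison logic is easy to run.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    reference: Callable[[int], str],
    candidate: Callable[[int], str],
    inputs: Iterable[int],
) -> List[Tuple[int, str, str]]:
    """Run both services on every input and collect any differing outputs."""
    mismatches = []
    for value in inputs:
        expected = reference(value)
        actual = candidate(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


# Hypothetical stand-in for the extracted COBOL service (two branches).
def extracted_cobol_service(age: int) -> str:
    return "REJECT" if age < 18 else "ACCEPT"


# Hypothetical stand-in for the transformed Java service: same behavior,
# different structure, since the transformation is not line-by-line.
def transformed_java_service(age: int) -> str:
    if age >= 18:
        return "ACCEPT"
    return "REJECT"


# Hypothetical faulty transformation with an off-by-one boundary.
def faulty_java_service(age: int) -> str:
    return "REJECT" if age <= 18 else "ACCEPT"


# One input per branch, so each path of the two-branch logic is driven.
BRANCH_INPUTS = [17, 18]

if __name__ == "__main__":
    print(find_mismatches(extracted_cobol_service, transformed_java_service, BRANCH_INPUTS))
    print(find_mismatches(extracted_cobol_service, faulty_java_service, BRANCH_INPUTS))
```

An empty result means the two services agreed on every supplied input; it says nothing about inputs that were not supplied, which is why the talk stresses driving every branch.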
Ruchir Puri
executiveHey, Kyle, thank you very much. Let me just quickly say, we are incredibly excited to bring the power of generative AI technology to the platform, which has stood the test of time. And we know many of you, or all of you really, love the platform and would like to see the latest and greatest of the technology on it. As Keri said in the very beginning of this webinar, we are starting from a base which is truly state-of-the-art. We feel proud to announce this to all of you, to bring the latest of this code-only base model for large language modeling technology. Our model is called the [ Granite ] 20 billion code model. 20 billion refers to the number of parameters in the large language model. Just to compare this to other models outside, a state-of-the-art code-only large language model outside is roughly around 15.1 billion to 15.5 billion parameters. This model, standing at 20 billion parameters code-only, is the best in the industry. Second part. It's critically important that the large language model, given it has a larger number of parameters, is able to see a higher number of code-only tokens as well. Think of a token as a word or a piece of a word, typically. So fantastic, for example, in English language will be broken into fan, tas, tic, as an example. So this model, as compared to a state-of-the-art code-only model outside, has seen 1.6 trillion tokens, 1.63 trillion to be precise, as opposed to a model outside, which would have seen 1 trillion tokens. Third distinction and differentiation. As Kyle pointed out, and Rich as well, there is a rich context in the use case we are looking at. This is different than what all of you have dealt with in ChatGPT. This is different than when you type in an English-language prompt, write an algorithm in Python to bubble sort a string of numbers, okay? And it's going to give you code. In the case of this particular use case, as Kyle showed you, we are transforming an application. 
We are understanding an application. We are refactoring an application. We are transforming an application. There are a million lines of code already, or more, in many of your applications. So there is a very rich context. And this came to us through many different users asking, can you please enable a larger context window? What is context? Think of it as the payload that you put in the query when you ask the large language model to transform that code into, for example, Java in this particular case. The more context I can provide, the better the translation or the transformation can be. The state-of-the-art model outside is roughly around 8,000 tokens currently. We have brought to you a context window size of 32,000 tokens, 4x larger than the best model out there. And again, I don't want to say we are beating our chest on, hey, we have 4x better than others. We truly are being driven by what is the need of the use case. Again, I think this goes to IBM's perspective on AI for business. We are not doing this for the sake of AI only. Large language models are a means to an end. They are not an end in themselves. And that's why all of us in this talk are focusing on that entire wheel, not just 1 part of the wheel only. Most importantly, I think just to give you confidence in the power of the model, and we are amazingly excited about this model, it's actually trained on 115 programming languages, as opposed to models that are roughly in the 90s range, I would say. This is a very powerful model and it understands many different languages. And this enables us to address a lot of the questions that are being asked in the chat window as well, related to, would you be supporting that language? Can I do this? Yes, we can do a lot of those things because the base model is very powerful. 
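[Editor's illustration] The two ideas above, tokens as pieces of words and the context window as a budget for the payload, can be sketched in a few lines. This is a toy under stated assumptions: the three-piece vocabulary is invented to reproduce the "fantastic" example, real tokenizers learn much larger vocabularies, and the 20,000-token payload is a made-up figure. Only the 8,000 and 32,000 window sizes come from the talk.

```python
from typing import List, Set


def greedy_tokenize(word: str, vocab: Set[str]) -> List[str]:
    """Split a word into subword tokens by greedy longest match.

    Characters not covered by the vocabulary become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def fits_in_context(token_count: int, context_window: int) -> bool:
    """Return True if a payload of token_count tokens fits in the window."""
    return token_count <= context_window


# Hypothetical toy vocabulary, chosen only to mirror the spoken example.
TOY_VOCAB = {"fan", "tas", "tic"}

# Window sizes quoted in the talk.
OUTSIDE_WINDOW = 8_000
QUOTED_WINDOW = 32_000

if __name__ == "__main__":
    print(greedy_tokenize("fantastic", TOY_VOCAB))
    print(QUOTED_WINDOW / OUTSIDE_WINDOW)
    # A made-up 20,000-token payload fits one window but not the other.
    print(fits_in_context(20_000, OUTSIDE_WINDOW), fits_in_context(20_000, QUOTED_WINDOW))
```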
But we believe, and this is where I will go next, in bringing the power of highly curated data to really let that base model shine in the context of the solution that we are looking at, and that's where we will be focusing a lot of our work. We are standing on a substrate or a platform in this particular case, which is very strong in terms of the base code-only large language model, which is state-of-the-art, but it is the additional highly trusted, highly curated data from the experts who have decades of experience developing mainframe COBOL code that really fine-tunes this model to make it shine in the context of what we are looking at. In terms of how did we do our evaluations? I'm going to show you some of the results. Some of the questions actually are related to it. We are continuing to develop these COBOL benchmarks, but I'm currently focused on 2 of them here. And I'm going to talk about others as well and how we are developing these benchmarks. As an exemplar, I'm showing you 2 benchmarks. One is of short code type, actually, like small snippets. Think of it this way, tens of lines of code. It checks the base capability of the model in terms of taking simple COBOL code, if I may say, short simple COBOL code, and translating it to Java. We have generated a lot of pairs, which are functionally equivalent as well as algorithmically equivalent. We have other benchmarks where we are saying they are functionally equivalent but they may not be algorithmically equivalent, so we can test their validation, but then we obviously cannot test line-by-line or in terms of tokens whether the naturalness is actually valid or not. The second benchmark we are looking at gets closer to reality, where we are looking at hundreds of lines of code, which we call the long type, the long code type, which gets, as I said, closer to a reflection of what you may be encountering. 
Those are also functionally equivalent as well as algorithmically equivalent. And the reason I'm pointing these 2 out, as you will see in the results that I'll show later on, is that there we can measure what we call the naturalness property of the Java as well. Kyle emphasized this quite eloquently: our goal is not just to be able to translate. A COBOL to Java translation with the Java being [ JOBOL ] is as good or as bad as the COBOL that was there originally. If you can't understand it, you can do nothing with it in either case. So I think that really, the point is to have it be translated into Java, and then to have Java that human developers would naturally produce as well. So just in terms of these 2 benchmarks, I would say there are 3 measures of accuracy. Many times, you will see these AI models talk about sort of AI accuracy scores. Because we are in the context of a particular solution, we are less worried about just the accuracy of the model only. What I really care about is how this model is doing in the context I am in. There are 3 tests that we look at. Number one, when I translate the code from COBOL to Java, does the Java compile? That's the first test, actually. Now if it doesn't compile, kind of the discussion doesn't even start. Second one, it compiles, but can I run it, okay? That's the second test. The third test is, just because I can run it doesn't mean it does the task I wanted it to do. It runs, but it may be doing something else. So for the third one, we have unit tests for these short and long benchmarks as well. And we actually test whether it compiled, it ran, and it ran successfully and met all my validation tests as well. What I'm measuring here is the performance with respect to the tightest of those bounds, the tightest of those criteria: translation accuracy as a percentage of programs that pass validation tests. 
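[Editor's illustration] The three checks described above (does it compile, does it run, does it pass the validation tests) nest inside one another, and the headline accuracy is the share of programs that clear the tightest one. The sketch below shows that bookkeeping under stated assumptions: the record type, the function names and the four sample outcomes are invented for illustration and are not IBM's benchmark harness or data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TranslationOutcome:
    """Outcome of one translated program (hypothetical record type)."""
    compiled: bool
    ran: bool
    passed_validation: bool


def accuracy_report(outcomes: List[TranslationOutcome]) -> Dict[str, float]:
    """Return the percentage of programs clearing each successive gate."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("at least one outcome is required")
    # Each gate only counts programs that also cleared the earlier gates.
    compiled = sum(1 for o in outcomes if o.compiled)
    ran = sum(1 for o in outcomes if o.compiled and o.ran)
    passed = sum(1 for o in outcomes if o.compiled and o.ran and o.passed_validation)
    return {
        "compile_rate": 100.0 * compiled / total,
        "run_rate": 100.0 * ran / total,
        "translation_accuracy": 100.0 * passed / total,
    }


# Made-up outcomes for four programs, for illustration only.
SAMPLE_OUTCOMES = [
    TranslationOutcome(compiled=True, ran=True, passed_validation=True),
    TranslationOutcome(compiled=True, ran=True, passed_validation=False),
    TranslationOutcome(compiled=True, ran=False, passed_validation=False),
    TranslationOutcome(compiled=False, ran=False, passed_validation=False),
]

if __name__ == "__main__":
    print(accuracy_report(SAMPLE_OUTCOMES))
```

Reporting all three rates side by side shows where translations fail, while the last figure is the strict measure the speaker says he is quoting.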
Again, I think as an example, in the case of the watsonx Code Assistant COBOL model, which is built and fine-tuned on top of the base model that I just talked about with highly curated data produced by experts with decades of mainframe development experience, we are able to produce in benchmark 1, which is the short benchmark, an 89% success rate. And in the case of the long benchmark, a 63% success rate, which, if you really look at how tight that criterion is, makes these numbers phenomenal. I think the second one I would like to point out is the one that many questions out there are about as well. Kyle emphasized it, what I'll call naturalness of Java. We measure it with what we call edit distance from human-generated Java, edit distance being how close it is to the human-generated Java. And in both of these benchmarks, we are able to actually achieve closer to the 90s range. This is actually 1 thing that sometimes is misunderstood: well, I'm not very sure AI-based translation will work or not. The cool thing about generative AI technology is that because it learns from human patterns, it's much more able to reproduce those human patterns and is much closer to natural-looking Java that a human developer would produce than any rule-based translation will ever be, and those rule-based translations can get really [indiscernible] as well. So we are sort of bringing this technology to the use case with, as I said earlier, highly curated examples that are trained on top of it and really make it shine as well. Now let me go to really the conclusion part of it and address some of the questions directly there. There have been many questions that have been asked on general-purpose models versus specialized models, big models versus small models. It's clear to us that purpose-built foundation models for COBOL understanding and transformation are critical to the success of this end-to-end solution. 
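[Editor's illustration] The naturalness measure mentioned above is described only as edit distance from human-generated Java; the talk does not say how it is tokenized or normalized. The sketch below is one common way to turn an edit distance into a 0-to-1 closeness score, assuming whitespace-separated tokens and normalization by the longer sequence. Both choices, and the function names, are assumptions made for this example.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution_cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,                      # delete item_a
                current[j - 1] + 1,                   # insert item_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closeness(generated: str, reference: str) -> float:
    """Score in [0, 1]; 1.0 means the token sequences are identical.

    Assumption: tokens are whitespace-separated and the distance is
    normalized by the longer of the two sequences.
    """
    generated_tokens = generated.split()
    reference_tokens = reference.split()
    longest = max(len(generated_tokens), len(reference_tokens))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(generated_tokens, reference_tokens) / longest


if __name__ == "__main__":
    human = "int sum = a + b ;"
    model = "int total = a + b ;"
    print(closeness(model, human))
```

A score near 1 means few edits separate the generated code from the human reference, which is the sense in which the quoted results sit "closer to the 90s range".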
These purpose-built models have quality at their core, which means higher performing Java, modernized for the IBM Z mainframe itself. It's important. I think Rich pointed this out. We've been running Java on the mainframe platform for decades. This is not a new thing for us. And once we transform into Java, it actually becomes a lot more maintainable, you can deploy more skills on it and so on. Second question. The question was also in the forum as well. What are we doing with data? As I mentioned earlier, this model has been trained with high-quality data curated by subject matter experts, which is where we really differentiate. You would probably agree, we know a thing or 2 about mainframe development. It's trusted data created and curated, as I said, by mainframe developers. And this is where we truly differentiate. Another thing I wanted to point out, and I didn't mention this explicitly earlier: if you run a general model, you can pick ChatGPT, you can pick StarCoder, you can pick others, and we have done the benchmarking on it as well, out of the box, in terms of the test accuracy, the accuracy is below 10%. And we have done our tests on this particular one, and we'll be publishing more results on it as we continue to move forward. But this is where this solution actually shines. The other place where it shines, as Kyle covered very briefly, is that because we are connecting this end-to-end across understanding, refactoring and transform, we actually extract metadata from earlier steps, which is given as a payload to the context, which makes the translation a lot better. The third point I wanted to emphasize and conclude on is that the model is designed for continuous improvements. Many people will ask, well, that pattern was not covered. I think we have a process set up now through which we will be doing fast and continuous quality improvement for highly adaptive field deployments. 
We may not have every pattern covered today that your enterprise may see in your Java. So if you are not seeing some of that, we can actually get feedback from you, incorporate it and get deployments out very, very fast. So I think I'm just going to end there, and we'll continue to answer questions as well, but we are truly proud to bring the power of generative AI to you, and we are proud of the achievements we have had, and are looking forward to partnering with you on this journey. Rich?
Rich Larin
executiveOkay. We covered a lot of ground over the past hour. Thank you all for all the amazing questions. We've got hundreds of questions, and we're answering as many as we can. But the good news is this is just the start of the journey of working with you all on this product. So please take note of some of the links here because we want to work with you and tell you more and answer your questions live and understand your use cases. So we are offering briefings and demos to go even deeper beyond what we did today. We have a web page where we're going to be continuously posting more information, and the GA is right around the corner, right? We're only a few days away from the fourth quarter, so we'll be releasing the first version of watsonx Code Assistant for Z very soon. So thank you all so much for joining us today. Thank you to all my colleagues, who did an amazing job with their presentations, and we'll see you all soon. That concludes our webcast today. Thank you.