Roche Holding AG (ROG) Earnings Call Transcript & Summary

November 29, 2023

SIX Swiss Exchange CH Health Care Pharmaceuticals special 122 min

Earnings Call Speaker Segments

Bruno Eschli

executive

#1

So welcome to our third IR event on the topic of digitalization. Looking back to 2020, when we first held our event on this topic, I think really a lot has moved, and we have seen a further acceleration, especially on the topics of generative AI, Machine Learning and Large Language Models. Can I have the next slide? Okay. Let me quickly take you through today's agenda. We have 5 speakers with us today. The first one will be Alan Hippe, our Chief Financial and Information Officer. Alan will talk about our ongoing investments in informatics and how we drive integrated data platforms and the infrastructure. He will also present the first 2 use cases, GALILEO and ASPIRE. Our second speaker for today will be Moritz Hartmann, Head of Roche Information Solutions, which is located within Roche Diagnostics. Moritz will talk about how our increasing commercial portfolio enables laboratory efficiencies and advanced clinical decision support. The third speaker for today, also from the Diagnostics division, will be Kent Kost, our Global Head of Diagnostics Operations. Kent will present a couple of more recent examples of how AI and machine learning-based algorithms are starting to improve our manufacturing and distribution capabilities. And after that, we will have 2 very exciting presentations focusing on how AI and machine learning-based tools transform our early drug development, both at gRED and pRED. So the fourth speaker for today is Aviv Regev, a well-known computational biologist who joined Genentech back in 2020 from MIT, and she is now leading gRED. Aviv will present the gRED Lab-in-a-loop concept, providing you with many examples of how computational approaches are starting to make inroads into every aspect of early drug development. And finally, we'll be joined by Scott Oloff, our Global Head of Data and Analytics at pRED. 
Scott's presentation will include examples from clinical drug development, especially to call out [ TA ] digital biomarkers for neurological diseases and deep learning algorithms in ophthalmology. So the total time for the presentations is set at 85 minutes, and then we will have a 35-minute Q&A session. Can I have the next slide, please? As mentioned before, like in previous years, we have picked 20 use cases really throughout the entire value chains of both our traditional businesses, pharma and diagnostics. And this has proven in the past to be a very efficient way to really cover this very broad and complex topic. Looking at the use cases we have picked this time, I think we really got good coverage of the entire value chains. This time, we might have more of a focus on the preclinical and early clinical drug development part, with examples spanning novel drug target identification, identifying new indications for drugs currently in development, or optimized molecule designs for small molecules, antibodies or cancer excellence. And these are just a few examples. Some listeners will also notice that we provide updates today on use cases which we presented previously, back in 2020 and 2021. This holds especially true for the increasing product offerings in diagnostics, now organized under Roche Information Solutions, where we have bundled some of our efforts and also opened a marketplace for healthcare algorithms, which is open to third-party developers. And finally, I would like to mention here something rather new, which we have not really had the opportunity to talk about. We are also currently in the process of building our own in-house Large Language Model infrastructure, a topic that will be covered both by Alan and by Aviv. On this slide, I just wanted to summarize on a very high level what we hope to achieve with all these ongoing initiatives in digitalization. And probably, this picture is not complete. 
But as you can see here, going from the left to the right, at the starting point in early research and development, it's about gaining new insights in basic biology and human disease, identifying new drug targets and new drugs to be designed. And then if you move on, you see here it's about improving speed to market and better health economic assessments. On the manufacturing and distribution side, it's about producing ever more individualized drugs at an affordable price and also then improving global access. On the diagnostics side, shown here in green, you see that basically it's about improving diagnosis and disease prevention, but also improving treatment for patients by building more holistic solutions and by also supporting physicians and hospitals. But there is also a nice feedback loop here. As you can see, these improved diagnostics allow us to gain additional insights in human disease, and this nicely feeds back to the beginning of our journey. On the next slide, this is just a slide I really had to sneak in before we kick off the event, because it really raises a question which probably is on the mind of many of our call participants and might come up later in the Q&A. When the IR team was trying to generate a cover picture for today's event using OpenAI's DALL-E program, initially, you see this on the left side, we asked for a protein flying over a circuit board. But however we phrased this challenge, DALL-E was really not able to generate an image which suited our needs. Admittedly, as you can see here on the left, there were quite some creative and really funny solutions. For example, I really enjoyed what looks like a measure of a protein chain with better fills, organized in the shape of a circuit board, or the word protein or a similar word just written on something which resembles a circuit board. 
In the end, it really turned out that generative AI was unable to deliver any picture which even loosely resembled the [indiscernible] protein structure. However, when we then changed the request and asked for a DNA helix flying over a circuit board, we got thousands of useful pictures within a few seconds. So making this observation, but also watching recent AI successes, such as predicting protein structures with ChatGPT, the question came up: where do we really stand? And what is there to come in the next few years? And with that, actually with these closing remarks, I would like to hand over to Alan here to kick off the event. Alan, please.

Alan Hippe

executive

#2

Yes. Bruno, thanks for the nice hand over -- the nice introduction. And certainly, I think it's great to see how you apply AI in Investor Relations to facilitate how we bring the presentation along. That's great to see. I think what your introduction also explained is that I'm surrounded by superstars, really with Moritz, with Kent, with Scott and certainly with Aviv. So I know that nobody is really waiting for me here. So that's why I have 10 minutes and just 10 minutes. And I'm promising that it will be crisp. We thought it makes sense to give a little bit of a view on what informatics does in Roche because, if you like, we're generating the backbone for everything that you will hear about today. And it's a great pleasure to lead this organization. It's not really without effort and challenge. But certainly, it's great to be in the middle of great technologies. What do we do in Informatics overall? We drive platforms end-to-end in the company, and we drive integrated data end-to-end in the company. That's what we're going for, based certainly on great technology and great knowledge, and hopefully that is brought along by the presentation. When I go to my first slide, you see really, okay, the technology trends driving the next wave of opportunities, and I don't want to go through them because you know them. I think that's pretty clear. And you see the current digital wave that we are going through. I think we're all aware that these technologies have a huge potential to drive impact. And certainly, what is on everybody's mind at the moment is generative AI. And you see today, out of the 20 cases, 7 are focused on that topic. You see that on the right-hand side of the slide, and you also see the other opportunities in healthcare, where we think we can make a difference with digital. And I think we covered the field pretty well with what we want to present today. 
Let me also say what I find really great about the presentation we're giving is that we're focused on use cases. It's not that we give you a presentation about technologies in general. We really tell you what we're doing with it and where we're seeing the benefit lies. I think really, when we look at the industry as a whole, where do we stand and where does pharma stand? And I think it's perhaps worth taking a quick look. You see on the left-hand side really an analysis that I also presented in 2021. In 2021, I spoke about how pharma is well behind, and what you're seeing now in 2022, when you look at the Digital Acceleration Index, is that healthcare and pharma are speeding up. You see really the second highest increase of the Digital Acceleration Index by industry now. So really with a high acceleration rate. And what is interesting is this analysis also shows that this has quite a profit impact, a positive profit impact. So evidently, healthcare is now moving in the right direction, and as we all know, we have the funds available. I think if we move in a certain direction and we're decisive, I think we can make great strides. And I think that's what's happening now. On the right-hand side, you see really what has happened with funding in Artificial Intelligence over the last couple of years. I think we're all not surprised that this has increased significantly. So lots of funds available. We will see how sustainable that is in the current environment. But at least we can say a lot of funds have been applied in the industry. So I think we can all expect great outcomes here. I think we now ask ourselves, how is Roche tackling all of this and tackling these opportunities? And you see that slide, Slide 11. And I would say there are 4 messages coming from that slide. First is, I think we have a global footprint with informatics. 
And wherever we have a major site, might it be in San Francisco or in Basel or wherever, I think we have that co-location piece, and we are well-connected to our key sites. And then we also have some additional informatics hubs around the world that might be driven by knowledge, might be driven by cost. I think that's certainly then the complementing element. What I'm still -- something I have to digest is, well, we are spending more than CHF 3 billion on digital per year. And the informatics organization itself is spending more than CHF 2 billion per year. These are massive numbers. And that, I think, underlines and substantiates how serious we are in investing in digital. 23% of the informatics budget goes to software and cloud. We want to increase that over time. We want to bring the running costs down and invest more into the innovative stuff. I think we have made major strides, but there is more to do. And then certainly, we have 4,400 internal employees in informatics and a multiple of external ones. Good. Let me talk for a quick moment about where we come from. And when I gave this presentation recently, I got a lot of feedback about that slide. Because when you look at the left-hand side, where we are coming from as informatics is really a pretty siloed structure. We had 4 separate units in the past. There was pharma informatics, diagnostics informatics, group functions informatics and global infrastructure and solutions. And what we're doing now, and this is the transformation we're right in at the moment, is that we are becoming division-agnostic. So we really bring informatics together. And as said, what we want to drive is not just the global infrastructure. That's what we have done in the past already, where we really standardized a lot and really connected it better. But now we want to drive the global platforms end to end, also from a process point of view, from a data point of view. 
And certainly with that, we want to create data lakes, global data lakes that can be mined and can be analyzed in a much better way compared to the past. And I think really, we want to make strides from finance to R&D here. And I think there is a great opportunity for us ahead. Even in 2024, we will do more here. Certainly, that comes along also with the topic of decommissioning and making our infrastructure and our whole setup much less complex. Do we have partners? Certainly, we have partners we collaborate with. And here are the ones that are more on the innovative side, if you like, that drive us. Certainly, we work with a lot more partners, very clearly. I think recently, NVIDIA. I'm sure that Aviv will talk about NVIDIA. When it comes to drug discovery, Prescient is something that we have brought up, and a nice collaboration here between gRED and group informatics. So I think that's something we are benefiting from. But certainly, we couldn't drive innovation in this company without bringing in external innovation. Good. Is that all showing? Is that something which really makes a difference for the organization? And certainly, I would say yes, but I might be biased. When we look really at external assessments, and you have 2 here on that slide, you see on the left-hand side the Pharma AI Readiness Index, which comes from CB Insights, external. And they have ranked us #1 in pharma AI readiness. And they do that based on the assessment of 3 dimensions: talent, execution and innovation. And talent is certainly the ability to attract AI talent and to retain them. You see execution, really the ability to bring AI-powered products and services to market, and you will hear about that, which I think certainly is a key element. We're in business. And then I think, certainly, innovation, where we have really a very high score, and that's really the track record of developing or acquiring novel AI capabilities. 
So with quite some pride, I think, I'm happy to show that assessment. On the right-hand side, you see something which at least is equally important. Genentech, which is a major part of our informatics organization, has been ranked #1 best place to work in IT by Computerworld, which I think certainly we take pride in, and very clearly, it's testament to the exceptional talent and the collective expertise of our integrated informatics team on a worldwide basis. So I think we feel encouraged. So let me go to the use cases. And I have 2 and want to be brief about that. I think really, one is about generative AI and how we use that in the group. You will hear a lot of use cases and specific use cases today. And then certainly, I think, our major investment into an ERP project, Enterprise Resource Planning, which is called ASPIRE. So let me start with GALILEO. GALILEO is really a program that we've established to drive forward our next-generation AI strategy and deliver on the best high-value AI opportunities, and we base that on 4 pillars. Let me go through them. I think certainly, we want to support the use cases which provide the highest value to Roche and to patients. I think that's the first one. And the next one, the second one, is establishing a next-generation AI platform that allows us to build scalable AI applications, if you like, on a group-wide level. So really, you can tap into different models; I will come to them. And then you can apply the models that work best for you in your workplace, if you like. That's what we would like to achieve because, well, let's face it, I think I see it as tools which really create major productivity, if applied well. And we would like to give everybody in our organization the opportunity to do this. Third point is we prepare the organization to be AI-ready by upskilling our workforce. 
And that, as said, applies to everybody. You can get in contact with the GALILEO team and you can ask for advice, and you can get some really usable and tangible advice on how you can really improve your workplace and your outcomes by applying AI. And then the last point, a little bit overlooked. The fourth one is to ensure we understand and use AI responsibly. I think we are in an industry with very high ethical standards, and we at Roche, we adhere to the highest ethical standards. And very clearly, I think when it comes to governance, principles, practices for best support, when it comes to ethical, trusted platforms and solutions, I think certainly, we would like to guide people to get there and help them adhere to these standards. So how do we do that? Very quickly, I think, really, you see on the bottom of the slide, we start with solid data and an infrastructure as the foundation. You see really cloud computing certainly plays a major role here. And I can tell you our cloud cost next year, 2024, will go up quite significantly. And then on top of that, we are iteratively building the GALILEO operating system, so to say, that allows our developers to build AI-powered applications at a faster pace. Three parts to it, as you can see. On one hand, the model marketplace, which gives access to commercial, public and internally developed models, and you see really the list here. Then we have the retrieval-augmented generation engine that provides connectors to load data and create embeddings that can be used to ground the model's output. And then I think last but not least, the responsible AI tools, as mentioned already, that regulate AI applications and allow Roche to be compliant with external regulations and internal policies. In terms of generative AI applications, we have rolled out Roche chat. 
So we have a ChatGPT, which is in-house, that allows Roche users to access this technology in a secure manner and certainly with full control over the data. And I think that's a key element to have. So we have our own Roche GPT. And similarly, we have rolled out the developer copilot that can write code, test cases and documentation to improve developer productivity. I think that's my first use case. I hope you got a little bit of a feel for how we approach it. And now let me switch over to ASPIRE. And ASPIRE, that's just huge. I think, let me say, this is not just an IT project and informatics project. This is a whole business transformation program, which has been running for quite a while now, and 2024 and 2025 will be the years where we have the major deployments around the world. You see on the right-hand side the value chain processes and the enabling processes that basically we standardize and optimize in the company as a whole. These are backbone processes. When we started the project, we had 113 different processes in the company to pay an invoice. I think nowadays, we have 3, and that's also thanks to this program. And I think there is so much more going to manufacturing, going to other areas. But what a Herculean effort it is, the left-hand side demonstrates: first, it is one of the largest S/4HANA projects worldwide. We have 500 colleagues involved. We will deploy in 200 legal entities. The investment in total is over CHF 2 billion. And certainly, basically, every user in Roche will be affected by this program. A little bit more in detail, but I don't want to spend too much time on that. I think you see the core of ASPIRE that we bring in. And certainly, that is, on one hand, the process optimization; the other piece is really the technology, but what we can do is we can bring many more technologies on top. 
And certainly, this will help us to apply Artificial Intelligence, but also other technologies to help us, to get more productive and to make Roche a less complex company that can apply complexity in other areas much better. Good. With that, I'm over time, yes. And I'm happy to hand over to Moritz.

Moritz Hartmann

executive

#3

Thank you very much, Alan, for kicking us off with these 2 essential use cases. Also from my side, a very warm welcome. I'm Moritz Hartmann, the Global Head of Roche Information Solutions, RIS, and I'm very pleased to have the opportunity to present to you today the direction we're taking with our Insights business. So I'll be focusing, as you can see here, more on the commercialization side and really speak about the use cases that we are also commercializing. And before I go into the details of this RIS business, I would like to share with you a little about the bigger picture of the Insights business at Roche. As you know, we have a very strong footprint in pharma and diagnostics, and we believe that the intersection of the two is where the power of digital insights lies. And we brand that, as you can see in that picture in the center, NAVIFY, which is a combination of navigating the data and actually verifying for decision support. With our broad healthcare expertise, we're in a leading position to bring healthcare insights solutions to our customers that are truly generating value for them, whether that's in the lab, whether that's in the clinical setting, to healthcare systems as a whole or ultimately to patients. How does that look in practice now? At Roche Information Solutions, we focus on 4 main segments: laboratory insights, then the closely intertwined clinical workflow optimization and clinical decision support, as well as remote patient monitoring. Those segments are interlinked and build on each other. In lab insights, with our position in Diagnostics, this is very close to our core business. And here, we want to become the digitalization partner of choice for our in vitro diagnostics customers. Our broad portfolio and already existing large footprint will allow us to connect all diagnostics disciplines in workflow systems and insight solutions. 
In the clinical space, we have the ambition to shape clinical workflows and lead the clinical decision support market. We believe we're uniquely placed to do that because we can derive insights from all the diagnostics disciplines and really provide multimodal decision support that we can then expand into modalities beyond diagnostics as well. Lastly, we also aim to support remote patient management to help patients, clinicians and health systems as they move to a more decentralized approach to care. Adding remote monitoring solutions to our offering, we can cover the entire healthcare journey, spanning from the lab through the clinical setting and the home setting, where we can provide a 360-degree view of the patient's journey. And across all of these settings, we have 3 distinct value propositions for our customers. First of all, medical insights. This is, of course, as a health care company, what our customers expect from us and where our expertise lies. Ultimately, we want to help our customers to make confident decisions and impact the care that they provide to patients. In addition to that, we're offering workflow solutions to improve the way that healthcare providers operate, be it in the lab or in the hospital. This is not only important from a financial point of view, but it's also the basis for a smoother and more integrated patient experience. Lastly, neither of the 2 value propositions can really work without the necessary infrastructure in place, and this is a need that we see frequently from our customers. Providing that, and recognizing it as a fulfillable and also commercial need, really helps us not just to set our customers up for success, but also drives the adoption of our operational and medical Insight Solutions. All of these value propositions are reflected in the way we have built our portfolio. 
And as you can see, we already have quite a comprehensive portfolio in market or at late stage across these value propositions. Our most mature products can naturally be found in the lab setting, but we're rapidly expanding into the clinical setting, and we also have first successes with our products in the home setting. First, I would like to talk a bit about a product we call NAVIFY Integrator, which really forms the basis of our entire portfolio. As mentioned earlier, to derive insights, we need to make sure that data can flow and be accessed by and for our products. In that sense, the Integrator is really the WiFi of our portfolio. You know this from your daily smartphone use: you cannot download or use any of your apps if you're not connected to WiFi. And this is exactly what the NAVIFY Integrator does. In addition, it provides customers a gateway to our portfolio through the NAVIFY Portal. And this marketplace is like an app store through which our own as well as third-party applications can be accessed. So that is the NAVIFY Marketplace. The NAVIFY Portal actually allows our customers to then access their apps, and it functions like the display on your smartphone, where you can really organize your applications and your workplace in the way that best serves you and in which you as an individual user like to be set up for the use of our solutions. Now through this portal, you then access our entire portfolio of applications, whether that's the operational excellence or the medical value portfolio, and it includes third-party solutions as well. Now let's move to the operational excellence portfolio, which today specifically focuses on the laboratory. And what you see here on this slide is really the flow of a blood sample through an entire lab and through its process, from sample retrieval to a result that is delivered to a healthcare professional for decision-making. 
Our digital solutions cover the entire journey with 5 different products, some of which I'd like to highlight here. Starting with sample tracking. It's important to note that about 0.5% of all tests have an error today. Now 0.5% sounds really small. But if you consider that by our customers alone 25 billion tests are being made, and globally more than 25 billion samples are being processed, that actually amounts to roughly 100 million errors a year in the lab. And that translates to healthcare costs of USD 15 billion. 2/3 of all these errors have their root cause before the sample even reaches the laboratory. And that is what we're tackling with NAVIFY Sample Tracking, which allows our customers to optimize the pre-analytic operations from sample collection to transportation and reception in the lab. There are already great solutions out there. And often, these are very locally tied to systems, for example, transportation systems. So we have built this solution in the form of an open API solution that also connects all of these existing offerings. But its real differentiation is the integration into the lab process, so that when the samples arrive in the lab, the sample and the patient are already known to the lab and to the respective process. With inventory management and monitoring as well as control, these are different solutions that help to address some specific needs, as their names indicate, to control the manufacturing process. But I would really like to go here to NAVIFY Analytics because this is a true insights product, and it takes data from across the lab and provides lab managers with an easy way to identify operational trends and challenges. Some of our customers tell us that they actually weren't even aware of issues until our analytics software outlined them. And they have a huge impact with sometimes just small adjustments in the lab process. 
With this product, we become a true digitalization partner to our labs and help them harness the power of insights and make a difference in their respective environments. Let's move now into the clinical setting and our offering called the NAVIFY Algorithm Suite. The Algorithm Suite is a one-stop shop product for medical algorithms. This platform can be seamlessly integrated into the hospital's EMR system and provides healthcare professionals with a library of verified Roche algorithms as well as algorithms from partners to use in their everyday practice. We are constantly -- this is a different slide order. I'd like to share with you here -- excuse me, I was on the wrong slide. So this is here -- we're back. What you can see here is our algorithm menu, which we're constantly expanding and which, as you can see, is a real mix of both Roche and third-party algorithms. Our main focus to date has been on oncology and cardiology, obviously 2 areas that we are particularly strong in. But we're continuing to grow and expand, in particular in the area of chronic diseases, including kidney as well as infectious diseases. A little further down the line, we will also have panels for women's health and neurology. Of great interest to us as well is that some of the algorithms have been developed using machine learning technologies, such as ColonFlag. This particular algorithm has the additional benefit that it also applies machine learning at an individual patient level, so that the longer the algorithm knows the patient, the more accurate it actually becomes. I would also like to deep dive on another algorithm example that clearly demonstrates how Roche products interact with each other, as shown at the beginning of the presentation. And this is the GAAD algorithm. The GAAD algorithm supports the diagnosis of early-stage hepatocellular carcinoma, or HCC. 
HCC often doesn't show symptoms until it's at an advanced stage, and therefore the majority of HCC cases are diagnosed only at this advanced stage. The survival rate at this stage is very poor, at less than 5% 5-year survival, while in the early stages, 5-year survival rates are up to 70%. There's a strong need to detect patients early in order to improve their outcomes, and regular surveillance can help with that. The international guidelines recommend testing for HCC every 6 months in risk groups using a combination of ultrasound and serum AFP. Ultrasound may miss more than half of the early-stage HCCs, and therefore other methods of diagnosis, such as CT scans or MRIs, are used when available. The GAAD score, on the other hand, can easily be calculated based on patient demographics and the measurement of blood-based tumor markers using a very small sample from a blood draw. So the GAAD score is a very practical and effective surveillance test, especially where ultrasonography equipment and trained radiologists are scarce or expensive. Its use can lead to improving the effectiveness of cancer-control programs and delivers great value to patients, as potential early detection dramatically improves the outcome of their therapy. I'd also like to give a brief outlook on the latest expansion of our portfolio, remote patient monitoring, which particularly addresses the fact that healthcare systems are moving increasingly to provide care in the home setting. Here, we are currently exploring a multi-disease solution that serves as a secure and scalable platform and enables care for patients wherever they are. These solutions aim to work on patient engagement, for example through care plans or symptom tracking, and on care team workflow optimization when it comes to, for example, prioritizing patients or detecting longitudinal trends. 
And it also allows us to provide actionable and on-demand care pathways, for example, for risk prediction or treatment optimization. All of this will be enabled through data that is connected, both from Internet of Things devices as well as digital biomarkers, and integration into EHRs and laboratory information systems. And that we do through the NAVIFY Integrator, which I shared as one of the foundations of our portfolio. As mentioned, this is still in the build, but we are already commercializing one very exciting part of this journey, and the digital biomarker for Parkinson's disease will actually be shared in more detail by my colleague, Scott, later in this presentation. Lastly, I'd like to speak about the importance of our business model and how we believe we are able to increase the revenues from our digital portfolio. First, as we enter into the development of our digital solutions, we have built a process that very early assesses the willingness to pay for our solutions on the customer side. So we work closely with customers to identify the value that these particular solutions add to their work and assess their willingness to share that value with us. Secondly, we're already seeing that our digital solutions and products are offering competitive advantages that allow us to win with our customers in our respective pharmaceutical and diagnostics businesses, and we have also sold a number of our products as stand-alone offerings. Therefore, we are now increasingly making sure that we monetize separately wherever we add value through our digital products and solutions. We're also ensuring that our products are built in a modular way, so that we can deliver the individual value propositions for our customers in smaller and easier-to-deploy products instead of delivering very large and complex solutions that have a lot of additional benefits without being monetized. 
And like every good software company, we are selling these products as a service and on an annual subscription basis, which then creates recurring revenue from our customer side. Outside of this, we also have components that we are providing free of charge to our customers, together with certain products that enable our portfolio and that ensure the stickiness of our digital solutions to our core offering. I hope the presentation has given you a comprehensive overview of where we stand today with our Insights business and portfolio offering, and I would like to thank you. And with this, I hand over to my colleague, Kent, who is also connected to what you have seen with how we integrate with our systems, and who will speak about how Global Operations then adds value through the use of AI and Machine Learning.

Kent Kost

executive

#4

Thanks very much, Moritz. And a big welcome from my side to everyone. I am Kent Kost, Head of Global Operations for Roche Diagnostics. We've heard from Alan on the value of laying the foundational infrastructure for our digital platforms. And we've heard from Moritz on how to enable lab efficiency and clinical decision support. I'm going to shift gears now and talk briefly about what's going on in my world in operations. So let's go to the next slide. There's 4 use cases that I will cover, and they really span the entire spectrum of plan, source, make and deliver, and that's how we think about it, making sure that we're looking at this from a true end-to-end perspective. So let's go to the next slide. And here, we are unlocking the value of digitalization at each and every step of the process. So in planning, we're really striving to dramatically increase our forecast accuracy and do it far more efficiently. And we believe that doing this will allow us to fundamentally lower our inventory and drive down our write-offs. On the sourcing side, it's a fascinating connection, getting a real-time Roche connection back into our suppliers. And we do this for a couple of reasons. Number one, we want to be able to transmit to them what we think our latest demand signals are. And on the other side, the use case I'm going to talk about is how we fundamentally improve our approach to risk management and then address it preventatively. On the manufacturing side, this one is absolutely fascinating, and I could spend most of the morning talking about what we're doing in the digitalization of our manufacturing processes, and it's all driven by improved efficiency. In many cases, we combine this with our lean programs. So we digitalize that and then we go in and we lean out the process. 
And one of the case studies, which I won't go into in detail, but it shows the power of this, is that we actually have one of our manufacturing lines where we doubled the output, we fundamentally doubled the output without changing a single piece of hardware. So it just shows the power of what we can do in operations. And then finally, on the delivery side, it is all about making sure that we deliver to our customers on time. We do so in the most efficient way, and we've got the digital tools to allow us to do that. Let's go to the next slide. I'm going to take a deeper dive now on use cases for each one of these. And I think, as Alan indicated at the outset, these are not theoretical approaches. These are real, tangible results that are driven across our value chain. So the first one was in the planning area, to establish a machine learning engine for time series forecasts. And there's a couple of things that we've combined. So we've taken classical statistical modeling, and we've combined that with modern machine learning algorithms and, of course, one of the biggest inputs to this is the real-time customer demand patterns. We started to roll this out to multiple countries where we get the demand signals, and two of them are depicted here, and I didn't cherry-pick these. These were just 2 very representative examples where you can see the adoption rate of the automated forecasting. So in the first example, the adoption rate from January to October of this year went up roughly 7x and we've improved the forecast accuracy by roughly 14%, which is certainly a solid step in the right direction. In the next example, the adoption rate was a bit more modest and required more training and some work on the interface, but we've increased the adoption rate of automated forecasting by a factor of 3x and the forecast accuracy went up by 17%. So we're absolutely seeing steps in the right direction and we see a significant benefit in both lowering inventory and reducing write-offs. 
And again, it's early days. But I expect that we've got roughly a 10% improvement in the near term in lowering our inventory, as well as a similar amount in reducing our write-offs. So a really good step in the right direction. Let's go to the next slide. Sourcing and risk management. Why is this important? I want to take you back over the last 18 to 24 months, where many of you probably remember the tremendous challenges in the electronic component industry. And there were extreme shortages. Lead times went, in our world, some of them from 4 weeks to 12 to 18 months, and prices increased exponentially. So we had examples where lead times were going out and price increases were up by a factor of [ 20x ]. One of the things that we've done is we've introduced risk monitoring for our suppliers. The traditional approach was paper-based, typically done on an annual basis. What we've got is a fully automated real-time solution today. So we've got more than 1,800 suppliers that we're monitoring daily. And we're actually taking that a step further now, and we're looking at category data, looking at different commodities and looking at the risk profile, which surprisingly changes rather frequently. With this automated monitoring, we've now got a 7x higher probability to proactively prevent a supplier risk event, and we put this into practice in the example that I talked about, the electronic component shortage, and we've successfully navigated that. So really good progress on the sourcing side. Let's go to the next slide. And this one is actually one of my favorite examples. It's a Point of Care sensor. This was a sensor that we acquired via acquisition several years ago. And I'm going to take you to the right-hand side, where you see that in 2015, yield was roughly 6%. Now 6% is not a very viable product. It puts tremendous pressure on product supply. It puts pressure on quality, and obviously we see enormous pressure on cost. 
So that was the starting place. We took a step back and we said, let's digitize every step of the process. And so we did just that, and we started collecting data, data that we knew was likely to be important and, equally important, data where we had no clue whether it was going to lead to anything or not. But we had this idea that it was probably a multivariable effect. We put this into practice. And you can see that we've got roughly 8 years of data now and we've driven yields from 6% to 80%. And early on in the process, we actually had some external experts come in and look at this particular manufacturing process, and they said, you'll never get it above 35%, and we're routinely now in the 80% range, which has driven our manufacturing cost down dramatically, a 50% reduction in manufacturing costs. Our customer complaints are nonexistent. And obviously, customer satisfaction has improved substantially. I think the important thing out of all of this is that this laid the foundation for improvements across our entire manufacturing network. So we've got a next-generation sensor coming along and we've been able to take a lot of these processes and practices and embed them into our next-generation design, which gives us an efficiency gain right out of the study. Let's go to the next slide, all right. So on the delivery side, I'm going to talk a little bit about our delivery mode optimization. Now why is this important? It's important for a number of reasons. Number one, we live in a competitive environment. And, obviously, on-time delivery is essential in my world. Distribution costs are a significant cost driver. It depends upon mix, but they generally run between 3% and 5% of revenue. They can be extremely volatile. So we obviously saw distribution costs vary dramatically over the course of the pandemic, but they can also be driven by natural disasters and political events, which will fundamentally change how you want to optimize your mix. 
Now one other thing that I do want to mention is that air freight is significantly more expensive than sea freight. In our world, it's roughly 7x more expensive to ship by air versus shipping by sea, and it also has a major impact on our CO2 emissions. Of course, there is a downside to sea freight, and that's the element of time. So obviously, when you put something on the ocean, it takes a bit longer, and so you tie up cash a bit longer. And so finding that optimization is essential. So we have built a system, and it's actually unique to Roche, where we optimize to always select the right mode. It may sound rather simple, but there's more than a dozen variables that go into this. So it's things like weight, volume, contracted costs, cold chains, shelf life, customer requirements, et cetera. And so there's a number of different factors that go into this. And we've had some dramatic results. So we've increased our sea to air on an absolute basis by about 8%, and it's led to an 18% year-over-year cost reduction and a 34% reduction in our CO2 emissions. And so we're really bullish on further developing this capability. So let's go to the next slide. So I talked about the manufacturing example, and I talked briefly about how we are going to digitize everything. Some of it is what you know, and some of it is also what you don't know, but it's going to lead to more predictive analytics that's going to streamline our processes and obviously then improve both speed of execution and cost efficiency. Thanks very much for the time today. And with that, I will turn it over to Aviv.
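
As an illustration of the mode-selection trade-off described above (air is faster but roughly 7x more expensive than sea, and shelf life and customer requirements constrain the choice), here is a minimal sketch in Python. It is not Roche's system: the `Shipment` fields, the transit times, and the per-kilogram costs are all hypothetical, and only the roughly 7x air-to-sea cost ratio is taken from the talk. The real system is said to weigh more than a dozen variables; this sketch uses three.

```python
from dataclasses import dataclass


@dataclass
class Shipment:
    weight_kg: float
    days_until_needed: int          # customer requirement (hypothetical field)
    shelf_life_days: int            # remaining shelf life at dispatch
    min_shelf_life_on_arrival: int  # shelf life the customer must still receive


# Hypothetical parameters; only the ~7x air/sea cost ratio comes from the talk.
MODES = {
    "sea": {"cost_per_kg": 1.0, "transit_days": 35},
    "air": {"cost_per_kg": 7.0, "transit_days": 4},
}


def feasible(shipment: Shipment, mode: str) -> bool:
    """A mode is feasible if it arrives on time with enough shelf life left."""
    transit = MODES[mode]["transit_days"]
    arrives_in_time = transit <= shipment.days_until_needed
    shelf_ok = (shipment.shelf_life_days - transit
                >= shipment.min_shelf_life_on_arrival)
    return arrives_in_time and shelf_ok


def select_mode(shipment: Shipment):
    """Return the cheapest feasible mode, or None if no mode is feasible."""
    candidates = [m for m in MODES if feasible(shipment, m)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda m: MODES[m]["cost_per_kg"] * shipment.weight_kg)
```

With these assumed numbers, a shipment needed in 60 days goes by sea, one needed in 10 days goes by air, and one needed in 2 days has no feasible mode.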

Aviv Regev

attendee

#5

Okay, and let me share my screen. And there we go. And so I'm Aviv Regev. I'm the Head of Genentech Research and Early Development. And I'm going to tell you today about our mission to transform drug R&D through something we call lab-in-a-loop, which is the way in which we combine experiments with algorithms. Now I'm going to start with a slightly personal note. As Bruno said, I came to gRED about 3 years ago. And I was motivated by a very particular fact. And this is that every step in making medicines is very hard. It's hard to infer the right target, and then it's hard to generate the right medicine and to predict the right dose and to go to the right patient. And as you know, the result of that being so hard is that not only does it take more than 10 years to develop a single medicine, but more than 90% of drug candidates fail in preclinical research or in clinical trials in this industry as a whole. And as we are in digital day, I think it begs comparing that to the amazing performance of our favorite AI methods on many different kinds of problems, like vision or text or speech, that, you know, just a decade ago were deemed just as hard and had horrible failure rates. And so you can't help but feel like there is something that can be done. And that feeling is actually what brought me here. Now there's a reason that these problems are so difficult, and this reason is scale. So there's thousands of different cell types and states in the human body. There's about 20,000 genes in our genome. There's 4x10^8 possible hypothetical variants for these genes and maybe 10^13 ways in which even the ones that we know about could hypothetically combine. Now of course, the vast majority of these things never actually happen. Our problem is that we can't just naively predict from, say, looking at a human genome, which ones would occur, what happens if they do and if it matters in disease. And then when you start thinking about medicines, these numbers get even bigger. 
And so there's about 10^60 possible drug-like small molecules, maybe 20^32 different relevant therapeutic-like antibody sequences one might want to consider, and billions of people and about 10,000 diseases. And so these numbers are big. Some of them are actually bigger than the number of atoms in the universe. We know upfront we can't test all of them in a lab or in a patient population. So it's not surprising that it's hard to find targets and to make drugs. But in the last decade, there have been multiple scientific breakthroughs that should really make a big impact and a dent in these numbers now. And so we believe that there's 4 such catalysts. We call them levers that can fundamentally change the picture. The first one is human biology. That's our ability to study disease processes directly in patients or their samples or in human-derived models, so that human becomes our model organism. The second is what we call high resolution and massive scale lab methods. They give us data across many targets at once, but at the depth that before we could only do for one target at a time, and they do this for only a marginal added cost. The third are the advances in therapeutic modalities that help us tackle unprecedented targets that were not really addressable before, or in ways that are much more efficient or better or safer for patients. And what ties this all together is when we pull our final lever of Machine Learning and AI. So these algorithms can take the data that we have from human biology at high resolution and massive scale and across different therapeutic modalities, and they can finally span these massive numbers and help us discover targets that we wouldn't find otherwise, make new and better molecules for [indiscernible] outcomes better and increase our capacity and speed at the same time. That's our hypothesis. But each of these levers on its own is actually not enough. What we need to do is find a way to put them together. 
And we do this in a unique way that we call a lab-in-a-loop. So at the basic level, the loop is very simple. We start with an experiment. We collect data, ideally at high resolution and massive scale. We train a model on the data, and then we use this model in order to predict the next set of experiments, and we basically iterate this loop. This sounds very simple. And as the algorithms get better, you can almost think of this as a self-driving lab. But you can only operate this if you work at the right scale. And for this, we have to change not just our models and algorithms, but also the way we think about science, in the essence of our work on algorithms and in the [indiscernible]. And so in this talk today, I'm going to show you different ways in which we can use this loop across our R&D work in gRED. And so first, I want to take you to look at when we start with biology and therapeutical processes and we wish to learn more about disease mechanisms or targets. So biology is our entry point now, and the exit point will be something about our disease mechanism. So one central set of questions that we always have to answer is: what are my target molecule and cell, and where are they? We need to do this in order to choose disease indications, in order to find combinations of targets for things like bi- or multi-specifics, in order to predict on-target toxicities and for many, many other problems. Well, the question is how can data and algorithms help us in doing this. So for this, we're going to take a page from how Google does search. So first, when you want to do search, you actually have to index the web. You take all the data out there in the Internet and you make them searchable. By analogy, we built a tool called SCHub that has cell profiles for more than 200 million cells from over 2,000 studies, at high resolution and massive scale, and that spans hundreds of diseases from all over the body. Okay. So now we have a massive world to search in. 
We want to find a cell of interest. Now, unlike in regular search, where you can just type the name of something, we may not even have a name for the cells that we want to search. So what we want to mimic is actually something called reverse image search. So imagine you wanted to find a particular pharma CEO. You don't actually know their name or where they work, but you do have a picture of them. All you have to do is upload the photo into reverse image search by Google. And it will find you their name and plenty of other photos and websites, news stories and so on. So we basically use the same style, the same architecture of algorithms now, except that we have a reverse cell search. We input the profile of the kind of cell that we like and we get their name and wherever they are in the human body across all 200 million cells in our index. This is what we call a deep metric learning architecture. And once you train the model, just like the Google reverse image search, all you have to do is bring your new query cell, click a button and in a fraction of a second, voila, here is your cell. Now why does searching for cells actually matter for our portfolio? So a recent example is our Vixarelimab program. It is an anti-OSMR antibody. We originally in-licensed Vixarelimab based on our data that already suggested that signaling through OSMR in fibroblasts drives fibrosis in interstitial lung disease. But now, what you want to know is where there might be other fibroblasts that could use this same OSMR signaling mechanism. Well, if you actually reverse cell search for the OSMR fibroblasts, each of these blue circles is a tissue and each of the inner circles is actually a particular condition where we can see these cells. So if you actually search for them, one of the places you definitely find them in is the lung, and the conditions are different interstitial lung diseases, just like we expected. 
But in addition to this, you also find them in the gut, and specifically in IBD, in a different subset of fibroblasts that are called inflammatory fibroblasts. And on top of that, OSMR, when you look at human genetics data, is associated with IBD risk by genome-wide association studies. And so from these data, we formulated a second therapeutic hypothesis, that targeting OSMR using Vixarelimab in IBD would block inflammatory pathways that drive IBD disease. I hope you can still see my screen because it disappeared on my own screen. Can you still see my screen? Yes, hopefully. I can't hear anything. So that drives it in IBD, and blocking these pathways would bring a benefit for patients. And based on this, we actually launched a separate Phase II study in this indication with no additional lab experiments, nothing in animal models, only data in [indiscernible]. Okay. My next example is when we start the loop with massive scale, high resolution experiments, and our goal is to find better molecular pathways and mechanisms. So there is now a very large toolbox of approaches that we have established in gRED, which have one underlying principle in common. We can do very large-scale functional screens and at the same time read very complex, high-content readouts. It can be cellular profiles, like what I just showed you. It can be images at the individual cell and even the tissue level. And we can do these screens in cell cultures, we can do them in organoids and we can do them in animal models. To illustrate what we can do with these, I'm going to start with a simple example using one of these methods that we call Perturb-seq. It has a pooled [indiscernible] and uses single-cell RNA-seq as the readout to characterize the function of large numbers of genes. So in this example, this is a large family of E3 ligases in the innate immune response in dendritic cells. 
Dendritic cells are important for both inflammatory disease and cancer immunology and are of great interest for our scientists looking for targets. This study had about 1 million cell profiles in one experiment. More than 1,000 genes were perturbed, and the cells naturally spanned multiple kinds of dendritic cells and macrophages. So we can also determine the impact on multiple cells at once. And once we collect the data, we first use a machine learning algorithm to fit a model of the regulatory circuitry of how the cells are actually being run by these genes. And this model, which I'm showing you here in a simplified form, connects the E3s, adapters, substrates and so on into these co-functional modules of genes that, when we perturb them, have similar effects on the response of the cells. And it organizes the responding genes in the cell into programs, and those programs capture different aspects of immune function: the response to LPS, the presentation of antigens, ER stress and more. And now we have a full map of what all of these E3s are actually doing in the immune response in the dendritic cell, and for every step in the life cycle of dendritic cells, we now have E3s, adapters and downstream transcription factors that control each aspect, which our scientists can now aim to manipulate in order to generate the desired effect. That's the first step in order to get to targets. That's already very useful for our scientists, but the algorithms actually go further. Next, they connect for our scientists these large genetic experiments that were done in immune cells in a dish with human genetics, what actually happens in patients. So this second algorithm tests which part of what I found in a dish by machine learning, which reconstructed for me what happens in cells, actually explains the heritability of inflammatory and autoimmune disease risks, so that I know that this will translate to patients. 
And then finally, the models do one better: we can learn a model that predicts what happens when we perturb multiple E3s in one cell. That is, we give the model as training input cells that were perturbed in 2 genes or more. We train a deep variational autoencoder this time, and it can then predict for us the outcome of other pairs of perturbations that we never measured in an experiment. And that's very important because, you remember those big numbers, we will never be able to test all the combinations in the lab. Now we've taken these kinds of approaches into our research portfolio programs. So this project is actually in collaboration with Recursion, which Alan mentioned a bit earlier, where we focus on two disease areas. One is colon cancer and the other is neurodegenerative diseases. So here, we use two kinds of rich, massively parallel phenotypes. One is [Perturb-seq], just like I showed you for E3s, and the other is cell images. We also use two different kinds of perturbations. One is genetics, like I showed you for the E3s, and the other one is small molecule perturbations. So we're now screening for new small molecules. And in this way, we don't only identify targets, but we simultaneously also find small molecule hits that can enter our portfolio projects. So now I'm going to move from biological discovery to molecule making once we have a [target]. The same idea applies. Each time, we have a loop, and in each loop, we generate data in the lab, for example, on large molecules and small molecules or mRNA vaccines and so on. We use the data to train the model. We use the model to generate and predict the properties of new therapeutics, and we make and test them in the lab again. That yields more data, and that allows us to iterate the loop, both to reach our goal for a particular program and to make a better algorithm that can be used across all programs. And I'm going to give you a couple of examples. So the first one comes from small molecules. 
And our method here is called [GNE prop]. This is an encoder classifier model, and we first train it with the results of a high-throughput small molecule screen of something. For example, of 1 million molecules assessed for antibiotic resistance, sorry, for antibiotic activity; this is the example that I'm going to show. Then we use the trained model as an oracle: I'm going to show it a virtual molecule, and I'm going to ask my oracle to predict whether it's going to be active or not. And in this way, we can screen billions of virtual molecules and predict the activity of each of them with my trained oracle. And the oracle will propose to me new small molecules that should be active, and then we're going to synthesize those in the lab, test them and iterate. And so in this real-life example, I'm going to take you through one cycle of this loop. We trained the model on a 1.2 million small molecule high-throughput screen, a phenotypic screen for antibiotics; the screen was done in 2017, well before the algorithm existed. Then we used this model as an oracle in a virtual screen of 1.36 billion molecules for their activity. The algorithm predicted 345 compounds as active. We made those and we tested them in the lab. And 82 of them, or 24%, were active. This is about a 50-fold better hit rate compared to the original approach, when an expert group of medicinal [chemists], our best and brightest, made the choices. And even more exciting than that, more than 1/3 of these 82 molecules were actually with new scaffolds that were not in the training, that were not in our screening [indiscernible]. And the algorithm also predicted correctly when major changes in activity occurred, even when it was actually only a very small change in the molecule. This is something known as activity cliffs, which are very difficult for both human medicinal chemists and algorithms to actually predict. We use a similar lab in a loop for antibodies. 
We start with antibody sequences. We use them to train a model. We use the model to design new antibody sequences, then we make those antibodies in the lab, test their properties, and we iterate. Now, as we already heard again briefly from Alan, just over 2 years ago, we acquired what was then a tiny-tiny proto company called Prescient Design. It basically consisted of the three founders at the time, and it was pre- [indiscernible] stage. And then we grew them ourselves, and they became our machine learning for drug discovery accelerator. And together with our antibody engineers, they built the lab in a loop for antibody machine learning and drug discovery. So we have gone through many cycles of the loop already with Prescient, and the antibody optimization algorithms have gotten better and better through these multiple cycles of our loop, and through development of multiple algorithms to tackle different goals that you have when you make an antibody. So when Prescient started, we actually assigned them practice targets. But by now, this has become part of our portfolio projects, while continuing to improve the algorithms. Doing these iterations is only possible when you have both ends of the effort in your hands, computational and experimental, and they can work fully together and fully transparently on all data in all ways. This is something we simply cannot do with an external partner in the same way, and that's actually why we chose to make an acquisition and invest in this heavily, internally. Now Prescient developed a whole zoo of models and methods; some of them focus on optimizing an antibody that we found experimentally, and others are generative AI that makes molecules to [indiscernible]. Now Bruno showed you that it's not so easy to just ask [ DALL-E ] to make you a protein, but you can use generative AI to make proteins. So one method that I can describe today is called [indiscernible] diffuser. 
And it allows us to do fast prediction of antibody structures. So just like the [indiscernible], which we use for the small molecules, we like to use algorithms as oracles in virtual screens for antibodies, too. But here, you have to generate a very large number of antibody structures, not just sequences, and that is a lot more computationally heavy than generating a lot of small molecules virtually. Now even in silico, this takes time. I'm sure you've all heard of and probably used [indiscernible], where machine learning is used to predict structure from sequence. But antibodies are unusual. They have these regions that are super variable, so you can't compare them to anything else. And running just one structure for an antibody can easily take you about an hour. Now if you just want one antibody, that's not a problem, but if you want to do a huge virtual screen, that's actually not feasible. And so one advance that the team made was to combine new coarse-grained models, something called geometric deep learning, together with diffusion models in order to generate protein structure predictions, and I ran one live for you right now, and design more [indiscernible] antibodies. And the accuracy of these models is state-of-the-art, but they are 1,000-fold faster. Now diffusion models are the architecture behind things like DALL-E. So I did want to tell Bruno that Gen AI can make you awesome new proteins. And not only do they look nice on the screen, they actually express when you make them in the lab, and they can even bind their desired targets. Okay. So next up, I want to turn to an example in our patients. And for this, I'm going to switch to autogene cevumeran, which is our personalized neoantigen cancer vaccine, which we're developing in collaboration with BioNTech. I'll invite you to think of this example a little bit as the clinic in the loop. 
With the cancer vaccine, we aim to target the immune system to recognize neoantigens that are unique to each patient's tumor. And so a patient's tumor is sequenced, and then we use an algorithm to select neoantigens from the sequence, and then a personalized vaccine is synthesized and is given to each patient. For the vaccine, it's absolutely crucial that we choose the best neoantigens, which means they would be presented on MHC class I and they would elicit a good T cell response. And for this, we incorporated, amongst many things, a transformer-based model; transformers are the family of models that are now very popular as LLMs or large language models. Now these kinds of models are showing superior prediction performance, and generalize across dozens of MHC [indiscernible] when they're trained well. And what is also [cool] from kind of a business side is that once we have a model architecture like this in place, we can sometimes reuse or extend it with some modifications to other related applications that we need. So one example of this is moving from prediction of MHC class I presentation for the cancer vaccine to MHC class II presentation, which is related, but it's a harder computational problem. And if we can predict MHC class II presentation, that actually helps us tackle antigen -- sorry, antibody immunogenicity, so that we can predict it, and then we can engineer it out of our antibodies before we ever head to patients. And so I showed you many examples of our lab in a loop, and I want to turn to this loop one last time in thinking about the next level at which we try to tie this loop together, in how our algorithms enable our creative scientists. So we want to be able to give the scientists the strongest and most transformative tools in their hands. I showed you how machine learning in general and generative AI in particular are at the center of this loop to discover targets and generate better molecules, but there is more that we can still pull in. 
On the data side, we have these massive volumes of written knowledge in old slides and lab notebooks, electronic lab notebooks, fortunately, and in non-text-based experimental results, in images and gels; you name it, it's there. On the algorithm side, there's now fantastic tools to build foundational models, not just of antibodies and sequences and cells like I showed you already, but also of our human knowledge, like multimodal LLMs. And from such data, they should enable our scientists to reach their maximal creativity. And so in the beginning of this year, as Alan pointed out, we leveraged our existing investment in talent in this area to launch an effort for all of Roche, to train our own large language models on a combination of public and proprietary Roche data. And this means text data and multimodal data from the notebooks, the large data sets, imaging and more. And I'm going to close in the next 2 minutes by telling you a little bit about it. So in Roche, our LLM approach relies on three pillars. The first is Galileo, which Alan already described; it's the use pillar, where we're working to deploy existing public large language models and any future private LLM into our work, and fine-tuning the existing LLMs for tasks that are important for us. The second is that we're architecting and training new LLMs, where we use internal data in the training of text-only and multimodal models. And finally, in the third, we're developing new algorithms to solve key problems by incorporating an LLM component in other algorithmic setups, like prompting models with data or using autonomous agents. So in terms of use, just from an R&D perspective, you've got a broader one from Alan. LLMs are already rapidly impacting everything that we do. 
So this includes things like writing or optimizing code, annotating cell types, summarizing and searching the scientific literature, creating talking points, suggesting analysis methods, writing lab protocols, even checking clinical symptoms. You can see some of the snippets from a Slack channel, a real Slack channel inside [indiscernible]. That's already great, but we think our true advantage will come from our training of new LLMs that incorporate Roche and Genentech proprietary data. This includes our documents and our lab data. Our Prescient team are experts in LLMs and other generative AI approaches, because these architectures are actually the basis of all of these great molecular designs that I showed you. And so that allowed us to hit the ground running. We've already trained and tested both a 0.5 billion parameter and a 7 billion parameter model, and we have a 30 billion parameter model that's under testing now. These models are trained from scratch, which allows us to emphasize our own data and biomedical corpora in general, because those are more important for our applications. We're in second-round alpha testing across all parts of Roche, and we're already observing improved performance over commercial and open source models for the kinds of specialized tasks that are important for us. We also believe that the most exciting component will be using our model to address further scientific questions that help drive the lab in the loop. So for this, we plan to prompt the model not only with human language, but also with experimental data and with autonomous agents. And then the model can answer with human language to a human user, but it can also provide input to the autonomous agent, so that the autonomous agent can continue doing work and help the scientists as they progress, as their co-pilot, I guess, for the lab, to empower our scientists in their search for targets and formats.
And so today, I talked to you about our strategy in AI for research and early development. Through my examples, I highlighted our three pillars of differentiation. First and foremost -- sorry, my slide advanced too quickly. First and foremost, our lab in the loop. AI does not stand alone, and our power is in the ability to iterate with experiments again and again across all aspects of R&D, up to self-driving labs, one day. Second, in data, you need both scale and resolution, and we have the data generation capabilities to work like this. In the world of AI, quantity becomes quality, and it pays off to be big. So we're maximizing the benefit of our large size, our proprietary legacy data and our ongoing data generation capacity. And then finally, we make sure to have the right partnerships and the right opportunities. In some cases, it's about acquiring and then investing and growing, like we did with Prescient Design. This ensures that we have the top capabilities in-house to run our loop with full transparency. But we also seek partnerships with the best out there for unique data generation capacity, for unique hardware and for other specialized expertise. The latest example of that is the partnership that we announced recently with NVIDIA, to bring together the power that we have in our lab in a loop with NVIDIA's infrastructure, resources and technical expertise in order to solve these very challenging questions for patients. And with this, I will conclude, and I'm going to turn it over back to my colleague, Scott in [indiscernible]
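
The neoantigen selection step described in this presentation (choose candidates that are both presented on MHC class I and likely to elicit a T cell response) can be pictured as a ranking problem. The following is only a minimal sketch under stated assumptions, not the method used at Genentech: the two scores stand in for the outputs of prediction models such as the transformer-based one mentioned above, and the peptide names, score values and the rule of multiplying the two scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    peptide: str           # placeholder identifier, not a real neoantigen
    presentation: float    # assumed model output: P(presented on MHC class I)
    immunogenicity: float  # assumed model output: P(elicits a T cell response)


def select_neoantigens(candidates, k):
    """Rank candidates by the product of both scores and keep the top k.

    Multiplying the scores is an illustrative choice: a candidate must do
    well on both criteria to rank highly.
    """
    ranked = sorted(
        candidates,
        key=lambda c: c.presentation * c.immunogenicity,
        reverse=True,
    )
    return ranked[:k]


# Hypothetical scores for four candidate peptides from one tumor.
candidates = [
    Candidate("PEPTIDE_A", 0.90, 0.20),  # well presented, weak response
    Candidate("PEPTIDE_B", 0.60, 0.70),
    Candidate("PEPTIDE_C", 0.30, 0.95),  # strong response, poorly presented
    Candidate("PEPTIDE_D", 0.80, 0.75),
]

selected = select_neoantigens(candidates, k=2)
print([c.peptide for c in selected])
```

In this toy data the two candidates that score reasonably on both criteria are selected ahead of the ones that excel on only one.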

Unknown Attendee

attendee

#6

Great. Hi, everyone. Pleasure to be here and glad you could make the call. So today, I'll talk about how, in pRED, we're transforming drug discovery through data and analytics. And as a first step, I'll talk about our overall approach. We do it in two ways. One, on the left-hand side, is how we use AI, ML and automation to run that R&D engine, that lab in the loop as Aviv highlighted, as fast as possible. How do we learn from those insights and the data that we have to make the next best step in molecule design? And the second, on the right-hand side, is more about how we change that loop. How do we change how we do drug discovery? What are those smart bets that we can make? And here, I'm going to highlight a few examples in both categories. So first off -- Aviv covered our discovery area so well, I'm just going to throw in one example there of what we're doing in pRED. And then I'm going to give three more examples in our early development space. The first is how we run that lab optimization loop, and here we've got an example with our ML Ops environment. Starting at the top, we've got about 150 models running in our ML Ops environment that leverage all the scientific content we have available to us for small molecule design, both internal IP and publicly available content. And we use those models to predict the best molecules and properties that we want to optimize in our Design Hub platform. This is an environment where project scientists can work together to really just brainstorm and give each other feedback on which are the best molecules we should synthesize next. But once they've decided which molecules they want to move forward into the synthesis phase, you have to actually come up with a synthetic pathway for how you want to make that molecule.
We leverage chemical by AI and some internal tools to draw on all the publicly known chemical reactions, and the millions of proprietary chemical reactions we have available to us within Roche, to really design the best synthetic pathway, both from an efficiency and time perspective and from an overall yield perspective. Once we've made the molecule, then of course, you have to test it. And it's usually not just one test such as a potency assay, but selectivity assays, ADME properties and others. We track that and enable it through a whole workflow cascade. You can almost view it like a decision tree. If one assay gives you the result you want, you might kick off the next set of assays in the cascade. Here, it automates that whole process in terms of the planning and getting the reagent orders in place, but it also helps us track where we are in that assay progress. Once we've gotten the results, we view them, of course, in our D360 analytics environment, but that's not all that is running in that lab in the loop. The majority of the models in ML Ops take the data that we're generating every day and rebuild themselves, so that they can give the next best set of predictions for the next optimization round. So we really leverage ML and AI not only for the design and the prediction, but also for automating that process as fast as we can. The next example I'll give is in real-world data. I'm proud to say that Roche is by far the leader in how we leverage real-world evidence, more than any other pharma in the world. And we do that in strong partnership with our Flatiron sister company. So in this particular example, we had an oncology trial, which we designed in a way that we were trying to find the right patient for the right molecule that we had designed, at the right time. And we found that we had a slow recruitment rate.
But using real-world data, we were able to recognize that if we were to open up the scope a little bit in terms of allowing certain pretreatments, there would be no significant change to the patients' progression-free survival rates. So we could actually increase our patient population by 15%, and it doubled the onboarding rate for patients in our clinical studies. So that helped us overall, both from an efficiency perspective and by improving patient inclusion in the trials, to make sure we get our molecules to the right and the best patients. The next example I'll give is in the digital biomarker space, something Moritz already highlighted earlier. We use digital biomarkers as a means of increasing the sensitivity of how we measure patient progression, and also of recognizing how patients respond to and benefit from the molecules that we're designing. I'll speak to both our Parkinson's digital biomarker and our Huntington's one, both leveraging a digital motor score. In the case of Huntington's, we have over 1,000 patients that have participated in our clinical trials, and 2.5 million hours of passive monitoring data. That's 300 patient years of data that we have. It's an order of magnitude above any other pharma from a Huntington's dataset perspective. And when we look at the results of how we can use that information, it increases the sensitivity with which we recognize how a patient is responding to a particular treatment. So in the case of prasinezumab and our PASADENA study, looking at that digital motor score, we can get the same clinical readouts with 40% to 70% fewer patients. And Huntington's is even more significant. If you look at week 20, in the blue column, using our digital motor score, you'll see that we get the same level of sensitivity at 20 weeks compared to 68 weeks when we use a more traditional biomarker, really showing the impact that it can have, not only from a time perspective, but also from an overall patient population perspective.
In this case, 75% fewer patients are needed to get that same clinical readout. So it's really impressive work that we're able to do using digital biomarkers. And then the last example I'll give is in the ophthalmology space. In diabetic macular edema, patients have what are called HRFs. These are small hyperreflective foci in their retina, and it's been known for over a decade that they exist. In our clinical studies, we have over 50,000 retina scans. So we wanted to create all of the clinical data ingestion pipelines and workflows so that, for all of these images, we can recognize which patient it was from, from what trial visit, which eye it is from, and which slice of the retina scan it is. And we put all of that automation in place so that we can process all of those images and do deep learning to recognize how our patients are responding to our particular treatments. And here, I'll show you a video of a rendering of one of those retina scans. Here, we have a DME patient, starting off with the blue being fluid in their retina and the red dots being those HRF specks. And you'll see that after just one treatment, at week 16, there is a dramatic reduction in both the fluid and the HRFs. We really think this helps us understand the biology, what is actually happening in the patients. And we think it's also related to the design of VABYSMO, with the dual-action inhibition that VABYSMO is targeting. So in a post-hoc analysis comparing faricimab and aflibercept, or EYLEA, we're able to show through this deep learning process that the reduction in HRFs is actually more significant than with EYLEA, and we get a better overall readout. And we think this is related to that dual therapeutic potential that we see with VABYSMO. But it doesn't stop here. We're leveraging this information for our understanding of inflammation as a whole, and we can apply it in other disease areas.
And because we've gotten such significant benefit where these clinical images get back in the hands of our clinicians within 2 weeks of us receiving them, we're applying these same data ingestion processes for all of our clinical studies and other disease areas. How can we recognize other biomarkers and get that data in the hands of our scientists to analyze and learn truly in that lab in the loop, understanding the biology and designing molecules in the future. So with that, I will hand it over to Bruno, who will walk us through the Q&A.
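
The assay workflow cascade described in this presentation ("you can almost view it like a decision tree") can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Roche's workflow system: the assay names, thresholds and result values are hypothetical, and a real cascade would also handle planning and reagent ordering.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AssayStep:
    name: str
    passes: Callable[[float], bool]  # hypothetical pass criterion for the readout
    next_steps: List["AssayStep"] = field(default_factory=list)


def run_cascade(
    step: AssayStep,
    results: Dict[str, float],
    log: Optional[List[Tuple[str, str]]] = None,
) -> List[Tuple[str, str]]:
    """Walk the cascade depth-first and record the status of each assay.

    An assay without a result yet is 'pending'. Follow-up assays are only
    visited when the current assay passes.
    """
    if log is None:
        log = []
    if step.name not in results:
        log.append((step.name, "pending"))
        return log
    ok = step.passes(results[step.name])
    log.append((step.name, "pass" if ok else "fail"))
    if ok:
        for nxt in step.next_steps:
            run_cascade(nxt, results, log)
    return log


# Hypothetical three-level cascade: potency, then selectivity, then ADME.
adme = AssayStep("adme_clearance", lambda v: v <= 20.0)
selectivity = AssayStep("selectivity_fold", lambda v: v >= 100.0, [adme])
potency = AssayStep("potency_ic50_nM", lambda v: v <= 50.0, [selectivity])

# Results so far for one molecule; the ADME assay has not been run yet.
results = {"potency_ic50_nM": 12.0, "selectivity_fold": 250.0}
status = run_cascade(potency, results)
print(status)
```

The returned list doubles as the progress tracker mentioned in the talk: it shows which assays passed and which one is next in line.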

Bruno Eschli

executive

#7

Thanks a lot, Scott, and thanks also to all the other speakers for these very insightful presentations. We will take the first question from the phone, and it comes from Charlie Mabbutt from Morgan Stanley. Charlie, please.

Charles Mabbutt

analyst

#8

Bruno, Charlie Mabbutt from Morgan Stanley. So I guess, firstly, with technology [indiscernible] drug discovery, what percentage of R&D costs do you think could ultimately be saved, Alan? And do you think long term that could lead to declines in R&D expenditure across the industry? And secondly, how are the regulatory authorities reacting to studies, such as you suggested in IBD where the only evidence is in an AI model? And also use of digital biomarkers, which you talked about in the last presentation in Parkinson's, et cetera.

Alan Hippe

executive

#9

Yes. Charlie, I think -- I wish I could come up with a great prediction here. And honestly, I don't have one. I think it's really early days. I think Aviv is very clearly the leading computational biologist on this planet. I think it's great that we really have this -- how should I say, this knowledge in-house and that we can work on it. And I think you've seen how she has demonstrated it. For us, I think the first thing is outcome: be better, be faster, have better outcomes. And as you know, the major cost is the cost of failure. If we can bring that down over time, I think that would be a major, major achievement. Aviv said that with 90% of what we're doing in pharma, we are failing. If we bring that to 88% or 87%, I think that would be a fantastic outcome here. But as I said, I think it's really early days. I don't have a good prediction here.

Charles Mabbutt

analyst

#10

And the second question was about IBD and how AI models will get accepted by the regulators.

Aviv Regev

attendee

#11

I think that was for me on [indiscernible]. So I think some things are important to mention there -- this molecule was already added to Phase II. It was safe. It was preclinically tested. It was in patients before; it was efficacious, not for IBD, but for other conditions. So if you rewound 7, 8 years earlier, these tools didn't exist, of course, at the time. But if they had, and that is how the predictions had been made, you would still probably need to do preclinical work today, appropriately, also in the context of safety and other things. There's a lot of work also on predictive safety and predictive analytics, but we didn't talk about them today. That's just to be clear for this particular example. The second really important point to make is that what the regulators see is not just an AI prediction. What they see is that this is done on human data. This is actually stronger data than when you have an animal model, because it is based on data actually from patients with IBD, and what their cells are like, and what their human genetics is like. And so the algorithm makes a prediction, but it makes it based on very high-quality, high-resolution human biology -- and I actually think it's hard to compete with that with lab information. Lab information is often more distant from the disease than the kind of data that the algorithms actually helped us wade through to find this particular hypothesis. The trial is on clinicaltrials.gov. So clearly, this is appropriately done work.

Alan Hippe

executive

#12

And maybe if I can also add a recent example here: in our Phase III EMBARK study in DMD, among the secondary endpoints, we had the SV95C endpoint, which is the first EMA-accepted digital endpoint. And it was interesting to see the results, because when it came to the North Star, the primary endpoint readout, we would see significant differences between individual countries, but interestingly, when you look at this digital endpoint, the results were much more in line and similar. So I think this really tells you something about the power of these tools, and I think the significance will only increase.

Bruno Eschli

executive

#13

Did we -- Charlie, did we answer all your questions?

Charles Mabbutt

analyst

#14

I think there was one on the digital biomarker.

Bruno Eschli

executive

#15

Yes, I think they were related, but on the digital biomarkers, maybe, Scott, do you want to add something about it?

Unknown Attendee

attendee

#16

Yes, I can just add very briefly. We are using them as confirmatory measures at the moment, but we're also in active discussions with regulators on whether we can use them as our primary endpoint. And if you think about movement-based disorders, how better to measure the impact on patients than measuring the movement overall? So there, I think we have some promising progress.

Bruno Eschli

executive

#17

Then let's take a second question from the phone. This would be Peter Welford, Jefferies. Peter?

Peter Welford

analyst

#18

Two questions. One is [indiscernible] a similar vein. When do you think we actually start to see the possible benefits of this? If you were to say that this has now started the process of improving R&D in its initial stages, how many years do you think we need to see before you can confidently say that these processes are actually improving success rates? And equally, I guess, one for Aviv: how far are we along in this journey? Do you feel at the moment that this is really just the start, because we hear a lot about AI? Or do you feel that you've now sort of got the tools that are needed, and we are beginning to actually reap the rewards, if you like? And then the second question is just on, I guess, data sharing, if you want to call it that. I mean, presumably, most of the major pharma companies are all building up these libraries of billions of data points across billions of cells that always [indiscernible] whatever the numbers were. And I guess my question is, where does the value lie? I mean, are any of these data points shared across the industry? Is every company building similar data sets, and in the end is it the analysis tools that differentiate you? Or how do you think one pharma company's data set and ability are differentiated versus another's, if that makes sense?

Aviv Regev

attendee

#19

Alan wants to start with the first part, and I'll do the other two.

Alan Hippe

executive

#20

Yes. I think it's anyway related, Aviv. Let me make a quick comment here. I think are we starting to see the benefits? I think we have to distinguish between really the use case. I would argue, everything which is related to productivity, what Kent has talked about, what I have talked about, that is imminent that is there. I think we see that every day and it's increasing, and it makes a huge difference moving forward. I leave that to you, Aviv, to go to the R&D side, but here I feel that it will take a little bit, but perhaps you can be more precise about that.

Aviv Regev

attendee

#21

Yes. I think it does take time in R&D, and let me try and explain it in a little bit more of a timeline way. So first of all, I'm actually going to go to other fields. It's always hard to predict the future, but there are lessons from the past. Usually with technological advances in science, I believe you tend to be on roughly a 10-year timeline. This is because the first 3 years are the proof-of-concept period, the first 5 are really the build, and the second 5 are really the reaping of the benefit. And by the end of the second 5, people are like, yes, it's been here all the time, we can't imagine working in any other way. I've seen that happen multiple times in my own life, so that's where I draw the conclusion from. I think the distinction, of course, in our unique field of drug R&D is that you have a very lagging indicator of impact in the patients, and in some diseases, you really have to wait a while for that indicator. By the way, some of what [indiscernible], the digital biomarkers, you sometimes can know sooner as a result of them where you're headed, but sometimes we still actually have to wait a fair amount of time for the ultimate measure of success, although you have indicators along the way. In terms of where we are specifically in this journey -- you remember the first 3 years and then the 5 and so on -- we're kind of at the close of those first 3 years. We're starting to have a more and more robust way of working. It's not just having brought in the people and built the tools, it's integrating them together into this loop. We are now seeing it deployed in our work. Those are the examples I actually showed you, right? These were actual projects of an actual portfolio at different phases. And that will have to increase while we're still improving the tools.
The tools are not all there for all problems, and their engagement and deployment means changing everything that you do. So it is a gradual process of this full stack change. So I would say we're maybe 1/3, 1/4 of the way through to the promised land, but prophecy was given to the fools, so you never know exactly. But I think so. And it's a hard road, it's not an easy one. Biology problems, in particular, are harder than some of the other problems that people have worked on. Causality is harder, and we need causality for targets. So that's probably the hardest problem of them all. And in terms of the productivity impact of that, I would say that if you go after the right targets, then everything improves in your productivity. So that's kind of the most ultimate thing. But where you see the earlier impact, it will be in the molecules. And actually, digital biomarkers, design of trials and so on, that's impacted even earlier than that. So it's almost like walking backwards on the R&D process in terms of when you see each of the impacts, but ultimately, you have to go to the root cause. Your second question was on data sharing and where the competitive advantage lies. So I'll start with the data piece. The trick is actually to use both your data and that of the rest of the world. In the example I showed with vixarelimab, those 200 million cell profiles are a combination of data that is Genentech and Roche proprietary and data that is actually from the public world. We use everything that's out there in the world that's available to us. We don't limit ourselves to our own data. Genentech is also committed to publications and we do publish, but of course, with certain constraints. And there are areas where we are in consortia with other pharmaceutical companies and academic institutions, such as in human genetics, which has brought huge benefit, I think, to all companies and to society. These consortia exist, and we are part of them, often as a founding member.
And there are other areas like that as well. So I think we use a very balanced approach in terms of the data itself. Where I think the competitive advantage lies is twofold. The first is having the right mindset, the right questions and the right models and algorithms to address them, so that you can frame something for the algorithm that is actually meaningful to you. In many, many cases, people just use methods that are out there. But they're not actually the right methods for the problems of target discovery, drug discovery and drug development. They were developed for other problems, and as a result, they don't give you exactly the right answers. So modifying things so that they suit our problems actually requires a lot of innovation. A great example of that is protein structure in general, and antibody protein structure in particular. We make therapeutic antibodies. Those proteins matter to us. They have very unique characteristics and they need their own algorithms. You can't just use general purpose tools and get the answers. So that's one aspect of it. And the second, and I think the most material aspect, is this iteration. It's never just the algorithms alone. AI on its own is not enough. AI needs data and ideally iterations, and that's what we're building, and that's unique. I think I answered all three questions.

Bruno Eschli

executive

#22

Yes, I think so. Peter, did we answer all your questions? Yes. And maybe here, another comment from my side, because I know you're interested in the pipeline. I think you've seen the example from Scott on prasinezumab in Parkinson's disease. I think this is one of the molecules where we kept the development program going, and we will have Phase II results coming next year. We would probably have stopped this molecule based on the traditional endpoints alone, but we saw this consistent strong signal in some of the biomarkers. So let's see how this plays out next year, but it already has an impact on one or the other molecule in our current portfolio. With that, I would now maybe read a couple of questions which came in here. The first one would go to Kent. COVID-19 caused problems for many forecasting models. How did you address the demand shock when building your model? And is the new model capable of recognizing future demand shocks?

Kent Kost

executive

#23

Yes, thanks, Bruno. It's a great question. What I'd say is that demand during COVID was nearly impossible to predict, both going up as well as going down. And we went through multiple waves. I think we weren't alone in that. I think the entire healthcare industry was hit by that. So our model is primarily built around detecting and responding to this plus or minus 25%, and we put that into routine. Let me talk briefly about the extreme case, because I think the question gets at what we would do during the next pandemic. And again, I think there are a couple of things. Number one, we have a good handle on what our installed base capacity is, so how many tests you could run theoretically if you were running 24/7. We've got a good idea of our manufacturing capacity across instruments, reagents and consumables, so everything it takes to run a test, and we know exactly how long it takes to ramp up these manufacturing lines. So what we do in an extreme case is monitor the external environment. We're looking for triggers that could signal the next pandemic, so that we are, in fact, ready to respond. I think monkeypox was a good example, where it came on our alert early. We were able to address the capacity, and then we monitored the uptake and consequently put ourselves in a good position should this develop further. So again, I think you have to separate the routine from something that would be extraordinary, but I believe we're well prepared now for both. And the lessons from COVID were certainly invaluable.

Bruno Eschli

executive

#24

Thanks a lot, Kent. We'll take another question, and this is about people and talent. The question is: what professional profiles will be necessary to deploy these health strategies based on information management, and of these different talents you need, which are the ones hardest to find? I think yes, whoever wants to...

Unknown Attendee

attendee

#25

I'll go ahead and then pass over to you, Aviv, if that's okay. I think one important profile is really that combination of clinical information and medical information. So it's the marriage of the two: understanding the disease, understanding the patient pathway and understanding the needs in the care system, and matching that with the possibilities that data give us. That's a key element. We see that a lot with this profile of Chief Medical Information Officers that are coming up. That's really an important marriage, and a key element that every organization will need to combine to be successful in digital healthcare. I think another important part, and that's certainly an ever-increasing challenge, is the availability of talent in the area of cybersecurity and privacy, which is so critical for healthcare and health data in particular. And that would probably be my tip for every young talent today. Actually, if you want top job security in the coming years, that's a good area to find jobs. Aviv, I'm sure you have some further profiles.

Bruno Eschli

executive

#26

Maybe Aviv, ask one question to put on top for you because I think you're the right person to take it. Excluding the U.K. and Ireland, what would be the best universities to be trained as computational biologists? Maybe since you have close ties to academia, you might have some thoughts on this one as well.

Aviv Regev

attendee

#27

I wanted to answer that, not the talent question. Okay. So I'll pitch in on talent and then I'll come back to this. I would organize it on a spectrum from what I would call the most computationally technical to where it's more in our domain. It can be biology, it can be chemistry, and I'm definitely restricting it to the R&D side. So on the most technical side, you have two major phenotypes of people that you need in order to build these capabilities. The first are your computer scientists, ideally with machine learning expertise and related areas. It's not just machine learning, there are multiple advanced computational approaches that are important, but I'm just putting it under this title, I'll call it algorithms. And the second is engineers, both machine learning engineers and software engineers more generally, who can really take algorithms and make them into robust, very high-performing code in an enterprise environment. There's UX that goes with that. There's a whole world, right? That's a particular kind of core technical expertise. The second category you have are what I would call analysts. In many places, people will call them data scientists. These are people who understand the use of computational tools. They don't necessarily invent new algorithms, they typically actually don't. But they are the best today at putting them to use together with a domain-specific question. They can be in the biological domain, and then they're computational biologists. They can be in the chemistry domain, and maybe they're called chemical [indiscernible] chemists or coming from [indiscernible] and so on. That is a very crucial interface, and today it is really inhabited by people who are bilingual. Some of them come first from the domain and into the other language, some have made the other career path, but that's where their strength lies.
And the third category, which I think is often neglected in these kinds of conversations, is actually changing, at the core, the way that all our scientists do their work. So when you're a lab biologist, or a classically trained biologist, I would call it, or a chemist and so on, you actually need to think about biology and chemistry differently when you realize this is in the world. And that is a major part of the shift that is happening. You do your experiments differently, you design them differently. And you don't design them alone. It's always kind of a joint activity. And what we're seeing, and what I saw even before coming to Roche, is a generational shift. Scientists trained today are much more native in thinking across this spectrum than scientists trained even 10 years ago. And as a result, when they come, they don't actually always need an analyst; they can analyze their data on their own. They can think of new problems in a way that's formulated well for an algorithm, because they're used to working in this way. And that's a gradual shift in the talent landscape. I would say finding great people is always hard, and it will always be hard. In a field where there's opportunity, there is demand. It's kind of an obvious thing. What we need to do is provide environments for people where they can do things that they can't do anywhere else. That's what attracts people. That's a mission. And if you give them that, they do the work because there's no other place where they can achieve it. To the question on training, there are so many good places to train. I think it's often a mistake to think there's just a select few. That's actually not true in the world, which is very big. They said excluding the U.K. and Ireland; that is basically the rest of the world, and there are so many good places to do this kind of work.
If your focus is on the interface between, say, the computational sciences and one of the other scientific areas, I would look for a place that has enough critical mass in both. And if we're talking about grad school, I think what's primarily important is finding places where there's a critical mass of strong research labs. And that you can find in multiple institutes on the East Coast or West Coast, the center and the South of the United States, in Canada, in France, in Germany, in -- I can't. I mean, it's not innumerable at that level. And I will also say, if you're at the point that you're kind of making a career choice, you can always send an e-mail, including to me, and I would always reply, [indiscernible] never ask the question.

Bruno Eschli

executive

#28

Very good. Then I'll pick one more question here. This comes from [ Shan Hamad ] and the question is, what percentage of drugs currently in Roche's clinical pipeline is covered through AI/machine learning algorithms? I think we have seen a couple of examples where AI is contributing already: prasinezumab, [VABYSMO], for example. We have the gene therapy, which we mentioned in D&D, but the question is really, I would say, where there's a fundamental contribution to this drug entering the pipeline. And maybe Aviv, I'll probably first give it to you here. When are the first molecules entering Phase I which are really based on AI approaches?

Aviv Regev

attendee

#29

So I actually showed you, it's not a classical molecule, but the RNA vaccine is based on machine learning -- it's an algorithm that chooses which antigens, which neoantigens to put in the vaccine. I think that -- maybe I wasn't very clear about that. That's not chosen -- that requires an algorithm. You actually can't design it without an algorithm because you have to predict which peptides will be presented on MHC Class I. It's actually a hard prediction problem. So in clear disclosure, the specific graphs I showed were not from patient data. They're from a paper, it's a slightly different method, but the general statement is very accurate and correct. For other types of molecules, antibodies and small molecules, I cannot make a specific statement on a specific program. But what I will say is that we have programs in advanced stages in research that have this next-generation AI component to them. I will also say that for small molecule drug discovery, for at least 20 years, if not longer, people have been using machine learning approaches, but they weren't these next-generation methods. They were kind of the previous generation of machine learning. So in full clarity, I thought that was important to state.

Bruno Eschli

executive

#30

And you had the example of the new lead structures in the antibiotics. So this is something...

Aviv Regev

attendee

#31

New lead structures in antibiotics. We have things in targets all over the portfolio. I just can't comment on specific ones beyond the examples that you saw.

Bruno Eschli

executive

#32

So we should expect, I think, in the next 3 years the first molecules to really show up here in the pipeline, I assume. Very good. Scott, anything to add from you?

Unknown Attendee

attendee

#33

I think Aviv has covered it extremely well. We use ML/AI to augment the design of all of our molecules, all modalities. I think it's really essential. It helps us remove properties from those molecules that we don't want, and I don't know that we could do it without those algorithms and rule-based tools. So I think it's going to continue to evolve, but it's already there.

Bruno Eschli

executive

#34

So there's one more question from the phone here from Harry Gillis.

Harry Gillis

analyst

#35

Great presentation. So I just wanted to ask how you think about measuring returns on your digital investments or your AI investments. So for example, I can imagine this is a lot easier for some of the maybe -- the supply chain processes versus the earlier stage R&D work. And then, I guess, thinking about the R&D investments, will it really just take time until we see some of these drugs discovered by AI ultimately reach the clinic to be able to assess sort of the returns on these investments, how that ultimately plays out? And then related, just given sort of the breadth of exciting opportunities and capabilities you've highlighted, I was just wondering how you make sure that you continue to spend on the right programs and capabilities? And I guess, if the scientists are going to the CFO with all these amazing projects, sort of what processes are in place to maybe hold that back and ensure that the money is spent wisely?

Alan Hippe

executive

#36

That's a huge question. I think we could spend hours on this one, to answer it. So let me give it a try. I think everything where it makes sense to measure -- when it comes to digital, I think certainly, we come up with a return. I think when I -- when we do an ERP program like ASPIRE, I think very clearly, we have a business case behind it and I had it even on my slide, we have a positive NPV, and it's even a massive positive NPV behind it because we go through a business transformation. So that's pretty clear. I can well imagine Kent surely will make a comment on it, and he even said it in his presentation. I think there are very clear applications where you can measure the outcome, and where the returns are clear. It's very hard, as Aviv said, when it really comes to the R&D points, how we do that. And I think certainly also what Scott has said. I think here, we go in and say, fine, that's part of the R&D budget and you guys hopefully spend it wisely. And you see also what we do when it comes to the large language models, GALILEO, we even leverage that in the company. So I think it's not like this is really Genentech exclusive or gRED exclusive. I think we leverage it into other areas like a tool that we use. I think it's really like, okay, if you want to calculate a return on Excel, if you like, in our company. But very clearly, I think wherever it makes sense to come up with a business case, certainly, we do that. But there is a bulk of investment that we do on the digital side which basically has no explicit return, and I would even argue it would be very hard to calculate it. And where that's very hard and the predictions are anyway wrong, honestly, then we stay away from it and leave it by [indiscernible] and put it in a budget.

Unknown Attendee

attendee

#37

I'll give -- Aviv covered it really well. It's easier in the clinical space where you've got huge swaths of data than in the discovery space where -- sometimes those data points help move the needle in the right direction, but are inflections. So one example I didn't highlight is we use a platform called Edison for all clinical data ingestion, both on the RED side but also in PD, and we save about 400,000 per clinical study, helping us put all of the metadata around the data coming in so that we know how to process it, where is it coming from, and how do we use that for ML and AI. Those are things that we previously had to do manually. So it's a direct return there. And in the ophthalmology example I put forward, we're able to measure the volumetric impact of those HRF reductions, the volumetric impact of the fluid reductions, things that just were not physically possible. You couldn't do it manually. So with those deep learning algorithms, you're able to actually quantifiably measure science and biology that you couldn't do before. So I think there's already huge benefits there.

Aviv Regev

attendee

#38

So if I can chime in on that, taking it back again to early development as well as research. First of all, I think Scott hit the nail on the head that there's two kinds of gains that we get. One of them is efficiency gains. And the other is things we simply wouldn't have succeeded with otherwise. So in the efficiency gains, there's many that are easily measurable. Another example in the clinical domain is automating the draw of the clinical data back from sites, which takes you from things that take 24 days to things that take half an hour, from things that require people to things that don't require them at all, and now you can quantify it per trial and you can just see your costs drop. Your costs drop and it's faster and it's more accurate. That's an easy win, and it's easy to count. In the molecular realm, we are increasingly developing ways to actually track this contribution per molecule, so that we also know which are the investments that are starting to show signs that they are panning out, so that we can continue investing into them, and what are areas that might have seemed like a great idea and there's a lot of activity, but in the end, they're not really generating what we were hoping for, which is totally possible in science. You never know until you try things out. So an example like Vixarelimab that I gave you is a good example, because the original financials for this asset were based on one indication. Now we have two, and that changes the calculus. Of course, it would still have to succeed. But if it does, that's a major outcome for actually a very modest investment in this case on the algorithm side. In terms of where we invest, we do focus on identifying areas for critical mass because you are right, people can otherwise run all over the place with all sorts of creative ideas, while leaving enough room for some unbridled creativity, because you never know where the best idea will come from. 
But the places where we decide to double and triple and quadruple down are well defined, because you have to actually have a sustained investment over time to reap the big benefits. And once we build a platform, that's something that Alan alluded to, we all use it. So for example, anything that Prescient Design develops is available to antibody engineers in pRED, just as much as it is in [indiscernible]. In fact, it's also useful for the entire projects. So it spans the whole of Roche. And that's the third aspect that I think we've been very conscious of, which is to centralize work where the skills are, and to maximize its impact in this way. So for example, the training of large language models and the fine-tuning is centralized in the hands of one expert team on all of Roche data, rather than having many, many, many parallel things that would take too much financial investment, but also might not necessarily bring the payoff in the end. So we're very conscious, I think, of these details.

Bruno Eschli

executive

#39

Then there's maybe one final written question. And I think then we are already done with the Q&A. And this one, Alan, might go directly to you. How do you want to ensure data integrity in an AI environment?

Alan Hippe

executive

#40

No, it's a great question. And certainly, how should I say it, a very important one. Let me first make a comment that comes to my mind. I think when it comes really to the efficiency when it comes to R&D, I think certainly, the major point, the Holy Grail, is bringing the failure rate down. I think we would be making this a different industry if we were achieving this with all the tools that we apply. Let me turn to data integrity. Look, I think for me, there are perhaps three pillars that I look at: accuracy, completeness and consistency of data when it comes to data integrity, and with that certainly also the regulatory compliance. Why am I not so concerned? Well, look, I think it's not like it's the first time that we have tackled that problem. We do a lot of clinical trials. We deal with patient data every day. So I think we have a huge experience in dealing with that. And I would argue that is also the reason why we came up with the responsible artificial intelligence framework in the company, right from the beginning, where we give good hints, where we really outlined policies, where we make sure that people who use external models are careful with the data that they put into these external models, while we have internal models where you can do that very safely. So I think we have a lot of instructions to get onto the right path here. And certainly, from a data quality point of view and whatever, I think it's in our best interest that we have well-structured, high-quality data that we mix, if you like, with our internal data and do this. Having said this, I think certainly even in that business with [indiscernible], I think as you know, we even sell data externally to other pharma companies, to bring that in. So I would argue there's a high level of expertise in how to ensure data integrity, and we work that -- we work on that every day.

Bruno Eschli

executive

#41

Thank you, Alan. And I think with that, we are at the end of today's present session. I would like to thank the many contributors here who made this event happen, especially, thanks again to all the speakers and their respective teams for the time and the commitment, and also to the IR team members who worked on the individual slide decks. I have to call out here, [ Jon Maya and Jerry Tobin,] who worked on Alan's deck and the overall management of the deck [ Alina Levchuk ] at big Matas to work on Moritz and Kent's deck, and then [ Laurent Kalman and Tang ] who worked on and managed Aviv's deck and then [ Jan Philip ] who worked on Scott's deck. And from the back office, thanks also to [indiscernible]. If there are any remaining questions, then please reach out to the IR team any time. We are happy to follow up and assist you. And with that, I would wish you a good day and hopefully talk to you soon.
