BioNTech SE (BNTX) Earnings Call Transcript & Summary
October 1, 2025
Earnings Call Speaker Segments
Unknown Executive
Executives
Hi. Good afternoon, everybody. So welcome to all of you in London in the beautiful venue of the Science Museum, and to everyone joining us by webcast, to our second AI Day. And we're really excited today to show you some interesting new things. But before that, the formalities. So here are our forward-looking statements, which you can see. And you can refer to those in the presentation, if you'd like more details. We do not commit to updating these, and these are current as of today. So let's have a look at the agenda that you can see here. So first of all, we will have some upfront sections from Ugur and from Karim that will talk about how AI is fully integrated into development and our entire business model in BioNTech. And then we'll move on to some InstaDeep examples where you're going to see lab-validated results and their applications. So we're really excited, and that will show an evolution from what you saw last year. So now I'd like to introduce to the stage, Ugur Sahin, our CEO of BioNTech, and he's going to present the opening presentation on advancing a disruptive tech-bio company.
Ugur Sahin
Executives
Yes. Thank you, Michael. Thanks, everyone. I would like to welcome everyone also on behalf of Karim. And I would like to give you the scientific biological background, why we need AI and for what we are using AI. But first of all, it's a really great pleasure to be here in this place. Actually, this place here, the Science Museum, is the place where the first COVID-19 vaccines, the empty vials, I saw it here. So on December 8, Margaret Keenan was the first person on the planet who received an approved COVID-19 vaccine, and this vial is straight here, close to the lancet from Edward Jenner, who introduced the very first vaccine study worldwide. But today, it's about AI. And this is just showing you that our AI approach is not limited to London. It's a global approach. We have sites where we do AI on multiple continents. And actually, we have been doing AI and 2019 was the first time that we met Karim and started to work with InstaDeep, but we are -- we were doing AI before that, but we didn't call that AI. It was machine learning, yes. And what is really new was InstaDeep coming in. While we were developing our tools based on existing technologies, I think what we can say about InstaDeep is really research on developing new technologies, completely new technology. And therefore, we are not only doing research and development in pharmaceuticals but also research and development for our AI tools. So a few words about BioNTech. We are a clinical -- late-stage clinical company with multiple programs in oncology, which is our core focus, but we also have a pipeline in infectious diseases, particularly for diseases with high medical need, for example, TB, malaria, HIV and others. And AI is in the meantime, really fully integrated into BioNTech. There is -- there are -- there is -- I would say, there are only a few projects where we don't use AI approaches. 
And the most important is that we are continuing, continuing to improve our methods and technologies with AI. I give you a little bit of background that you understand what we are doing and how AI is connected. Our core focus is oncology. And oncology is making a lot of progress in the last 20 years. And one of the breakthroughs in oncology was immunotherapy, the use of the patient's immune system to fight cancer. And there were a number of breakthroughs that we started in improved survival of patients, but still there's a huge medical need. More recently, we've seen new treatments based on ADCs and bispecific antibodies. And of course, we believe in the future of messenger RNA therapeutics immunotherapies that could provide us additional benefit for eliminating tumor cells. But let's start with immune modulators. So the classical immune modulator, which is the most widely used category is anti-PD-1 antibodies. An example is nivolumab or pembrolizumab, which are used -- have been used in hundred thousands of patients. And we were developing in the last years bispecific antibodies because we were interested to increase the fraction of patients who can respond to bispecific antibodies. And one of the molecules that came in from a partnership with a Chinese company, Biotheus, is BNT327, Pumitamig. And this is a highly interesting molecule. It combines 2 mode of actions. It's anti-PD-1, which releases immune cells, which are inhibited by the tumor cells and enable immune cells to act and cure tumor cells. And on the other side, it inhibits the generation of vasculature based on a mechanism, which is blocking VEGF, and it has a number of additional magic tricks, which result in immune responses and objective responses in cancer patients across multiple cancer entities. And this is really exciting development in the whole field. 
We call this the bispecific anti-PD-L1/VEGF class, which is expected not only to reach the tumor types that are currently addressed by anti-PD-1 treatments, on the left side, but also can go into categories, cancer indications where anti-PD-1 treatments are not approved yet. And we realized in the last 18 months, this is really a big, big, big opportunity, and it's too big to do it alone. Therefore, we decided to go into a partnership and announced a few months ago a partnership with Bristol Myers Squibb. It's a global partnership to develop this class of antibody, Pumitamig, in multiple cancer indications. We have data in the meantime from more than 1,000 patients in more than 10 indications, giving us the direction which indications could benefit from that. And this will not only be monotherapies, but also combination therapies. So we believe that these types of treatments can help us to control tumors and, in some patients, also provide a lasting clinical benefit, ideally cures. But cancer is very complicated. And most of the patients who have initial control progress over time. And the reason is here depicted in a simple cartoon. So cancer is evolving from healthy cells by DNA mutations, and these DNA mutations are just accumulating over 5 to 20 years. And so that means during the accumulation, the tumor cells, because these are all random mutations, generate a heterogeneity and an individual cancer by this reason is really individual, no 2 patients share the same type of mutations. But the bigger problem is even that we have an intratumoral heterogeneity that means every tumor cell carries another set of mutations, which means that we have a situation where cancer can evolve over time and it's an evolution not only against the treatment, but also an evolution against the immune system. 
And so knowing that from the very beginning of the 1990s where tumor immunology really became molecular, we were interested in cancer vaccines, because cancer vaccines come up with the promise that we might be able to induce immune responses against multiple epitopes. So if the tumor is polyclonal, the idea is here to induce a polyclonal T cell response that goes into different directions so that we can combine multiple antigens. And this polyspecific activity is expected to remove the last million tumor cells that remain after treatment, for example, with checkpoint blockade. So we have developed and pioneered several approaches to that. Two of them are shown here. One is aiming really to completely individualize the treatment. That means it's based on identification of the mutations in individual cancer patients by sequencing. And then these mutations, because they are recognized by the immune response, are called neoantigens. And then we assemble a vaccine, which is tailored to these mutations. So the first description of this approach was in 2011. We showed the preclinical approach. And in the meantime, we have multiple clinical trials running in various indications, including pancreatic cancer. The second approach uses a complementary concept. That means if we say certain tumors tend to have certain type of tumor antigens. So we identified them. And for example, in melanoma, we have identified 4 antigens, which cover 100% of patients with melanoma. In lung cancer, we have identified 8 antigens covering more than 90% of patients. So it's a combination vaccine approach where the vaccine is off the shelf and then used directly for vaccination. So there is no need to generate de novo. So this is the approach shown in more detail, taking the individual sample from the patient and mapping the mutations by comparing with normal, the neoantigen prediction, which is computationally done, then on-demand mRNA manufacturing and transportation to the patient. 
And of course, this all is driven by data and algorithms. And the manufacturing is just in time. We can deliver the vaccine in less than 8 weeks, and our aim is to be able to deliver a vaccine in less than 4 weeks. And this shows you an example how this works. So the sequencing of the tumor can yield up to thousands of mutations. In some tumors, only dozens of mutations. So there are tumors with high number of mutations, and there are tumors with lower number. And we do the computational ranking of these mutations based on the idea that these epitopes can bind to the human leukocyte antigens, which are presenting this. And the way how they bind is defined by patterns, binding patterns. And this can be computationally calculated. And on this calculation, many other features, for example, how much the mRNA encoding these mutations is expressed in the tumor and whether we expect heterogeneity of the gene. So whether there is frequencies or the fraction of tumor cells in the tumor because of the heterogeneity, it is quite possible that you target the mutations, which is only in 75% of tumor cells, so you are inducing a selection of 25% of tumor cells. So this is our ranking algorithm. And this is based on computational approaches. We used the classical approaches. And in the meantime, we worked with InstaDeep to develop deep learning approaches, including a number of additional aspects of the antigens, for example, in which cell compartment they are expressed, which type of molecular patterns they have? So it's multiple additional features, which come in. I would like to give you here an example of, at the end of the day, the terminal biological mechanism that we need for this type of approaches. It's about killing of the tumor cells by lymphocytes. And this is an approach in biology in nature, which is among the most deeply quality-controlled biological events. Because at the end of the day, it is about a cell in the body killing another cell. 
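The ranking step described above (HLA binding, expression level, and the fraction of tumor cells carrying a mutation) can be illustrated with a minimal Python sketch. This is not BioNTech's or InstaDeep's algorithm: the field names, the rule of multiplying the three features, and every example value are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One mutated-peptide candidate. All fields are hypothetical."""
    name: str
    hla_binding: float      # predicted HLA binding score in [0, 1]
    expression: float       # normalised mRNA expression in [0, 1]
    clonal_fraction: float  # fraction of tumor cells carrying the mutation


def score(c: Candidate) -> float:
    # Assumed scoring rule: a simple product, so a candidate must do
    # reasonably well on every feature to rank highly.
    return c.hla_binding * c.expression * c.clonal_fraction


def rank(candidates: list[Candidate], top_k: int) -> list[Candidate]:
    # Sort from best to worst score and keep the first top_k entries.
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Made-up example values, not patient data.
candidates = [
    Candidate("mut_A", 0.90, 0.80, 1.00),
    Candidate("mut_B", 0.90, 0.80, 0.75),  # present in only 75% of cells
    Candidate("mut_C", 0.40, 0.90, 1.00),  # weak predicted binding
    Candidate("mut_D", 0.95, 0.10, 1.00),  # barely expressed
]
selected = rank(candidates, top_k=2)
```

In this toy setup the subclonal candidate `mut_B` still makes the cut, but below the otherwise identical clonal `mut_A`, which mirrors the point in the talk about targets present in only 75% of tumor cells.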
It needs to be authorized. And the authorization is really done by a complex process, and this complex process is also misused by cancer cells to avoid killing by using mechanism and that circumvent that. But if you go deeper into that, one of the key aspects here is the recognition of the tumor cell by T cells. And this is happening by the T cell receptor. So that means these T cells have T cell receptors. And every T cell has a different T cell receptor. There is a complexity here. So our biology in humans allow that we have around 500 million different T cell receptors in our body. So there is a huge library of T cells that can recognize something. Then we have the HLA molecule that is on the tumor side, which is presenting an antigen which is inside the cell. So the mechanism is here actually evolved to recognize hidden viruses in our body cells. So it's an immune attack against infections. And our HLA diversity in humans have been evolved that viruses don't circumvent the killing by just avoiding the patterns that are presented on individual HLAs. But this makes it extremely difficult. And then you have the peptide, which is a fragment of 8 to up to 15, 16 amino acids depending on the HLA class. And we have now extremely complex interaction with the T cell receptor. Actually, these have 2 chains. We have the HLA, and we have the peptide, and only if the combination of all 3 works, we get a killing. And the problem is even bigger because the T cells have to avoid that this interaction does not kill any other cells. So it's also quality control. So -- and this is one of the biggest challenges in AI, to identify the T cell receptors that can recognize an MHC peptide sequence. So this is part of our prediction algorithm. We know the HLA of the patient, we know the mutated epitope and we would like to understand whether there is an immune response. And nature is doing that extremely well. 
So the COVID-19 pandemic showed us that 2 different COVID-19-infected patients could develop the same T cell receptors when they have the same HLA. So that means this discovery of hundreds of millions of T cell receptors works very well. The question is -- we have 2 types of questions. Can we identify T cell receptors that are recognizing tumor antigens? And the second question, of course, is, can we do it better than nature does? And we will have, I think, 2 talks about this, about these problems and how AI is used there. So this is complex and computation is important. But the situation is even more complex in cancer. So we have several levels of diversity and heterogeneity in cancer. So this is here the cancer heterogeneity and the clonal evolution. And on the other side, we have the immune system, and the immune system can also evolve. So I compare that sometimes like a Go game. So the immune system is playing against cancer, but the situation is even more complex because other factors are also playing a role, HLA molecules, the microbiome of the patient, environmental factors and so on. So the question is, are we going to understand -- if this is a game, yes, are we going to understand the moves of the cancer cells by reading this? And we believe into that. We believe that if we feed enough data into AI and if we really bring in the biology that we can make the evolution of a cancer cell predictable. And if we know if it's predictable, we can interfere with multi-specific approaches. So we are translating that into a collaborative approach. On the one side, we are generating data from our personalized vaccine studies, preclinical and clinical studies. And on the other side, InstaDeep and colleagues are developing new tools like DeepChain. We are using these tools, of course, not only for optimization of cancer vaccines, but also optimization of proteins, mRNA structures and so on. 
And Karim will talk about our in-house supercomputing clusters and how this contributes to obtaining better results. So in summary, this slide shows our vision. And I think this is more than our vision. It's really the view of our future fully integrated AI tech company, which combines a number of capabilities. Our vision in future is that we can take clinical samples, do a personalized omics, understand what is going to happen, and use our pipeline of molecules to come up with a combination treatment, which is consisting of off-the-shelf drugs, for example, Pumitamig, or our ADC, plus a personalized vaccine. So this is, in principle, we have shown that this is doable, but we need to do it at scale, and we need to do it in an affordable manner. So I think this is a good introduction for Karim, who now will come and introduce the capabilities that InstaDeep has built.
Karim Beguir
Executives
Thank you so much, Ugur, for this exciting presentation. So I'm Karim Beguir, I'm the Co-Founder and CEO of InstaDeep, the AI unit of BioNTech Group. And I mean, it's a very exciting time to be building in AI, and there is a lot to talk about. But very briefly, I will try to give you a sense of our approach and the opportunities that we are developing and how we work collaboratively with our BioNTech colleagues to get things done with results as well. So I mean, AI, I've been working in the field for more than 10 years. And if you were to summarize what's been happening since the beginning of the deep learning revolution in 2012, really it is a triple exponential. So you have data growing exponentially, I think, for example, about the cost of like whole genome sequencing, which is now around $100, which is absolutely insane. So massive exponential growth of data, but it's also compute and finally, model innovation. So the compute side of things, I mean, it's pretty incredible how predictable things are. I mean this is for me one of like the most impressive sort of like graphs in the history of like computing and machine learning more recently. But if you look at since the 1940s, since everything started with the first computers like Colossus, ENIAC and the others. And despite technology changing so much, I mean, at the time, it was vacuum tubes, now we're like in chips and semiconductors, in the future, it may be quantum computing, yet everything is so sort of like linear in log space. So here, the y-axis is actually sort of like increasing by a factor of 10 at every point. So you see like how this exponential keeps going and sort of like this is also famous as Moore's Law, which in general for computing is like compute efficiency for a given budget, you get twice as much compute every roughly 2 years. But in AI, it's actually Moore's Law on steroids, like the amount of compute, which is deployed in ML workflows doubles every 4 to 5 months. 
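To make the two doubling times quoted above comparable, a short Python sketch can convert a doubling time into a growth factor over a fixed horizon. The 24-month horizon is an arbitrary choice for illustration; the doubling times (roughly 2 years for Moore's Law, 4 to 5 months for AI compute) are the ones stated in the talk.

```python
def growth_factor(doubling_time_months: float, horizon_months: float) -> float:
    """Multiplicative growth over the horizon for a given doubling time."""
    return 2.0 ** (horizon_months / doubling_time_months)


HORIZON = 24  # months; arbitrary comparison window

moore = growth_factor(24, HORIZON)    # one doubling in two years
ai_fast = growth_factor(4, HORIZON)   # six doublings in two years
ai_slow = growth_factor(5, HORIZON)   # 4.8 doublings in two years

print(f"Moore's Law over {HORIZON} months: x{moore:.0f}")
print(f"AI compute, 4-month doubling:   x{ai_fast:.0f}")
print(f"AI compute, 5-month doubling:   x{ai_slow:.1f}")
```

Over the same two years, a 24-month doubling time gives a factor of 2, while a 4-month doubling time gives a factor of 64.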
So this is actually pretty insane. But it doesn't stop there. And the third point is model innovation because, yes, we have more and more data. Yes, we have more and more compute or at least more affordable compute. But the third point and perhaps the least understood is the efficiency of the models themselves. In 2012, in the beginning of the deep learning revolution, you needed to label literally like millions of data points to get an algorithm to learn. You don't need to do that anymore. With self-supervised learning and recent progress, you can literally feed the entire Internet or entire databases and get the system to learn. And incredibly, actually, the efficiency of those models improves every 8 months by a factor 2. What do I mean by that is, if you want to get to a certain level of performance, every 8 months, you need just half the compute to get there. So not only do we have a crazy amount of compute coming in and becoming available, a crazy amount of data, but the efficiency of the models is absolutely incredible. So it is a triple exponential. But yet, if you follow what's been going on in terms of like progress and the like, it does feel that progress is almost vertical. And sometimes one wonder like, is this hype? Is this a bubble? Or is it true that progress in AI is going to be incredible in coming years? And I would tell you like having been training neural nets for more than 20 years, I think that this is real, and we're going to see incredible progress in coming years. And actually, there's something qualitatively different that is happening that didn't happen before. And what is that? Well, it is that AI is so competent that is actually now accelerating itself. So if you think of AI as a plane with 3 engines, and these engines like we saw our data, compute and model innovation, AI is so competent now that it is at the point where it's going to accelerate every one of these drivers and hence, push progress much faster and deeper. 
And so this new era that Richard Sutton, one of the godfathers of AI named this year as the era of experience is taking us to new heights and allowing progress in a way that was simply impossible before. And here, I want to show you a little bit like what happened since the beginning of the deep learning revolution. In the very early days here, you had progress, which was coming from games, like you had with technologies such as reinforcement learning, you had a score and you had a system that would learn by maximizing this score. And then we had the big ChatGPT moment, and this is like became the golden age of large language models around 2022. But then again, we are coming back to a time now where systems based on trial and error are coming back to become very effective. And the reason is, AI is so competent now that what used to be possible only for games becomes possible for a larger class of problems. And so systems now can actually improve data. They can create synthetic data for the problem at hand and using reinforcement learning, improve their own answer to those questions. So this is what has been driving, for example, like progress in reasoning systems. You could see it also, for example, in a lab to optimize a certain protein sequence, like lab results are understood by AI that's going to use those to further improve its answer. So effectively, we're not limited by data anymore. We still need a lot of data, but systems now can generate synthetic data and take advantage of it and progress. So AI now is driving data, which is the first engine. The second engine like we said, is compute. Well, today, I don't know if you noticed, but NVIDIA keeps coming with these GPUs much faster than before, roughly now a new generation every year. This is because those hardware systems are actually co-designed with AI. It's not only very clever engineers. 
It's also AI systems using technologies such as reinforcement learning that are really accelerating progress in hardware. This is the same for Google with its TPU v7 chips and other that are codesigned by AI. So you see how AI is boosting its own hardware. And finally, model innovation itself, one of the things which is the most amazing about this period is that we actually reached the gold medal at the International Math Olympiads (sic) [ International Math Olympiad ], at the International Olympiads of Informatics (sic) [ International Olympiad in Informatics ] Computer Science this summer. To give you an idea, only a few years ago, like 4 years ago in 2021, we thought it would take another 20 years to get to these results. And these were the experts in the field. And yet we are there. And again, technologies such as reinforcement learning are key here because you have a score. You can actually evaluate a chain of reasoning or code. And this means that now AI systems can conduct machine learning research or more precisely work hand-in-hand with experts to push progress forward. So really like this is accelerating faster. And again, technologies like reinforcement learning, becoming more and more important. So at InstaDeep, we have anticipated those trends for a long time. And I'm very proud to say that we've been active in reinforcement learning research, in particular, for many years, and this work is coming to fruition. So today, I'm very happy to let you know that at the next NeurIPS Conference, which will take place in December 2025, NeurIPS is the largest and most influential machine learning and AI conference in the world, we actually have multiple research RL papers accepted, including for the first time in our history, an Oral excerpt on top of a Spotlight excerpt. So really, congratulations for the team for pushing the envelope in terms of algorithmic innovation in RL. It's really exciting. But it doesn't stop here. 
And what's exciting also is that we've had the most productive 12 months in InstaDeep history. If we look at where we were at the last AI Day to this day, we've had actually 6 Nature journal publications on biology and AI. And for me, this is a testimony of the quality of the innovation that is taking place between BioNTech and InstaDeep. This is collaborative work between our different teams and really like super, super exciting, including for the first time in our history, a cover of Nature Machine Intelligence in June. So as you can see, we've been having fun in R&D and innovation. But in reality, you need a lot more than that to win in AI today. And like Ugur mentioned, you need a full integrated approach. We need to be competitive at every step in the process. And so what does it mean? If you think about what I was saying earlier about the 3 engines of AI or the 3 engines of the AI plane, where you kind of see them, it means being excellent in compute and model scaling, you need to train those models at incredible large size. We're talking about like hundreds of billions of parameters, trillion levels of parameters. But you need also the AI innovation, model innovation, which we discussed. And you need also to have a great data strategy and an ability also to use AI to accelerate your data acquisition. And if you do these 3 things, then you get to exciting applications. And this is exactly what we're going to cover in this order and starting with compute and model scaling with Alex, who is going to present our latest results.
Alexandre Laterre
Executives
Hi, everyone. It's a pleasure to be here. So I'm Alex. I'm the Head of AI Research at InstaDeep. And echoing what Karim said, we had indeed a very interesting summer, a very exciting one. So we have seen that AI has made the headlines with major accomplishments: achieving a gold medal at the International Math Olympiad, winning programming contests with generally capable models and even now stepping into the real world, right, with advances in robotics and physical intelligence. And I would say, in my view, these are not isolated milestones. These are predictable outcomes of the scaling laws, right? So the scaling laws, let's say, it's an empirical law stating that the performance of a modern AI system is a predictable function of the resources spent to train such a system, being data, time, compute, memory and so on. And so it's not only a single scaling law nowadays, actually. It's the pretraining scaling laws as we know, but it's also post-training, which is really enabling agents to actually interact with an environment, being simulated or real, and learning from these interactions to accomplish even greater tasks. We also have the inference time, the test-time inference scaling laws, which state that you can spend more compute to refine and polish the results of what the AI system will produce. So the question for us is, okay, as a company, how do we position ourselves there to perform in this new environment? And the philosophy of InstaDeep has always been to build an integrated AI ecosystem, starting from the hardware going to the orchestration and software. Because it's only through -- this is our belief, it's only through a tight hardware, software integration that we can gain the performances, the cost efficiency and the control required to achieve our objective. So how does it work in practice? Last year, it's for this reason that we had the pleasure to announce that we built the Kyber cluster, which is an AI supercomputer made of NVIDIA H100s. 
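The scaling-law idea mentioned earlier in this segment, that performance is a predictable function of training resources, is often illustrated with a power law. The sketch below assumes the simplified form loss = a * compute^(-b), fits it in log-log space, and extrapolates one order of magnitude. The functional form, the coefficients and the data points are all synthetic assumptions; none of them come from the talk or from InstaDeep's models.

```python
import math


def fit_power_law(compute: list[float], loss: list[float]) -> tuple[float, float]:
    """Least-squares fit of loss = a * compute**(-b) in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


# Synthetic "training runs" generated from a known power law.
compute = [1e18, 1e19, 1e20, 1e21]          # training FLOPs (made up)
loss = [5.0 * c ** -0.05 for c in compute]  # made-up loss values

a, b = fit_power_law(compute, loss)
predicted = a * (1e22) ** (-b)  # extrapolate to a 10x larger run
```

Real scaling-law fits usually include an irreducible-loss term and separate dependence on data and parameters; the pure power law here is kept only to show why a straight line in log-log space makes outcomes predictable.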
This contains 14 of these racks, which have been engineered in-house by our bare metal team to optimize the performances for our own AI workflow, for example, scaling large language models training or running simulation for RL training, like Karim mentioned. It brings our total compute capacity to 500 petaFLOPs and is now our major source of compute power for the company. Now that we have this hardware, which is critical for work, we need to make it very easily accessible by all engineers to empower their work. And so there also, we built our own product, our own platform, which is called AIchor. It's a full product available to our customer, and really enable us to run very seamlessly experiments on Kyber. So with just a GitHub process, git commit, we can run experiment very easily. And that's why our engineers, around 200, 300 of our engineers have actually submitted more than 15,000 experiments a month on average in 2025. We also keep our GPUs and hardware very busy, where we maintain a very high usage of 75% of GPU usage on our cluster. The next step in building on top of that is obviously the ML software stack that we have to design to, let's say, squeeze the most performance out of each hardware accelerator. And that's why we've been building an entire ML ecosystem that is meant to be very efficient, scalable, modular, such that we can answer the requirement of the research development in which we operate. So let me give you 2 examples of this in action. The first one is about scaling large language models. So LLMs are part of our daily life and sometimes we train them, quite often actually. And so here, we took the challenge of trying to scale our Nucleotide Transformer models, which we published in Nature Methods late last year, to a 100-billion-parameter model. The first challenge here is that a 100-billion-parameter model does not fit on a single device, on a single GPU. 
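A back-of-the-envelope Python sketch shows why a model of that size cannot sit on one accelerator. It counts weights only, assumes 2 bytes per parameter (16-bit precision) and assumes an 80 GB device; both figures are illustrative assumptions rather than details given in the talk, and gradients, optimizer state and activations would add to the total.

```python
GB = 1e9  # decimal gigabyte


def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to store the model weights alone, in GB."""
    return n_params * bytes_per_param / GB


N_PARAMS = 100e9         # 100 billion parameters, as stated in the talk
DEVICE_MEMORY_GB = 80    # assumed per-GPU memory
GPUS_PER_NODE = 8        # one 8-GPU node

total_gb = weight_memory_gb(N_PARAMS)       # weights in 16-bit precision
fits_on_one_gpu = total_gb <= DEVICE_MEMORY_GB
per_gpu_gb = total_gb / GPUS_PER_NODE       # evenly sharded across the node

print(f"weights: {total_gb:.0f} GB, fits on one GPU: {fits_on_one_gpu}")
print(f"sharded over {GPUS_PER_NODE} GPUs: {per_gpu_gb:.0f} GB each")
```

Under these assumptions the weights alone take 200 GB, which is why some form of sharding across devices is required before training can start.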
So the first thing we have to solve is actually how do we distribute it, how do we shard it across GPUs? And our answer here is to use fully sharded data parallelism within a single DGX, within 8 GPUs, and then horizontally scale that across all the racks of Kyber. Here, I just depicted 3, but we have more than that. Horizontally scale that using data parallel. If we were to grow a model even more, we could use tensor parallelism, we could use pipeline parallelism or even like sequence parallelism, if we want to handle a very, very long context length. So that's the first point. The second one, at the code level, we have to do a lot of optimization as well. We can use advanced CUDA kernels like Flash Attention. We can do mixed precision and quantization or we can try to optimize the XLA compiler and use better network configuration and so on and so forth. The result is a staggering 66% Model FLOPs Utilization. So just the definition means that basically, we keep our hardware busy at 66% of the theoretical limit of the hardware, right? And so just to give you a reference point, the large public run of Llama 3.1, which contains like 400 billion parameters, was around -- the MFU was around 40%. Of course, it's a much larger network. It's run on thousands of GPUs, which will give you a sense of the meaning of that number and how high it is. And we're actually going to talk a lot more about the foundation model we built in the next section in a few minutes about our AI innovation. The second example I want to give you is about scientific computing. Traditionally, when scientists are trying to discover and look for a new molecule, a new drug, chemicals, materials, well, they start with thousands of candidates. But realistically, only a few of them can be tested in the wet lab, right? So scientists face not only this kind of discovery problem, but this smart selection problem, right? 
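The Model FLOPs Utilization figure quoted above can be made concrete with a short Python sketch. It uses the common approximation that training a dense transformer costs about 6 FLOPs per parameter per token, and divides the achieved rate by the aggregate hardware peak. The GPU count, the per-GPU peak and the throughput below are hypothetical inputs, picked only so the result lands near the figure mentioned in the talk; they are not measurements from the Kyber cluster.

```python
def model_flops_utilization(
    n_params: float,
    tokens_per_second: float,
    n_gpus: int,
    peak_flops_per_gpu: float,
) -> float:
    """MFU = achieved model FLOPs per second / aggregate peak FLOPs per second."""
    # Approximation: forward + backward pass ~ 6 FLOPs per parameter per token.
    achieved = 6.0 * n_params * tokens_per_second
    return achieved / (n_gpus * peak_flops_per_gpu)


mfu = model_flops_utilization(
    n_params=100e9,             # 100 billion parameters (from the talk)
    tokens_per_second=70_000,   # hypothetical training throughput
    n_gpus=64,                  # hypothetical number of GPUs
    peak_flops_per_gpu=989e12,  # assumed dense 16-bit peak per GPU
)
print(f"MFU = {mfu:.1%}")
```

Because the denominator is the theoretical peak, MFU penalises every source of idle time, including communication between shards, which is why values well below 100% are normal for large distributed runs.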
The problem is that if you choose the wrong candidate among the many potential ones, well, you waste time and resources. So how can we do a smart selection here? And lucky for us, let's say, most of these properties can actually be accurately estimated using quantum chemistry. The problem here is that quantum chemistry is extremely low -- extremely slow, sorry. It's accurate, but slow. On the other end of the spectrum, you have classical force fields that are extremely fast, but really prone to errors. So how do we handle this? Our objective has been trying to combine the best of both worlds, the quantum level accuracy, but orders of magnitude faster. And our answer to achieve that is MLIP, machine learning interatomic potential. These are a class of machine learning models that are trained on quantum chemistry data, so very accurate data, but that run much faster. And the result is indeed very impressive. In terms of accuracy, we see that there's a near perfect correlation between MLIP and the reference DFT calculation, energy level, whereas classical force fields are prone to error as you can see. So it's accurate, one. But second, it's also much, much faster and cheaper, actually up to 10,000x cheaper. For any dollar spent on MLIP, you have to spend more than $10,000 of classic DFT calculation. So it's usually a huge improvement. And in addition to that, as opposed to classic quantum chemistry methods that don't scale very well, MLIP does. You can run simulation on tens of thousands, if not hundreds of thousands of atoms very efficiently. So we are very excited about this technology. It's the early days, but we're excited about the potential because of its application to so many different domains and application of interest for us. So I invite you to keep an eye on that and we have a booth downstairs about it. 
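The cost comparison quoted above translates directly into how many candidates can be screened for a fixed budget. The Python sketch below takes the 10,000x ratio from the talk; the per-candidate DFT cost and the budget are hypothetical round numbers chosen only to make the arithmetic visible.

```python
def candidates_within_budget(budget_usd: int, cost_per_candidate_usd: int) -> int:
    """Whole number of candidates that can be evaluated for the budget."""
    return budget_usd // cost_per_candidate_usd


COST_RATIO = 10_000     # DFT cost / MLIP cost, upper figure from the talk
DFT_COST_USD = 50       # hypothetical cost of one DFT evaluation
BUDGET_USD = 1_000      # hypothetical screening budget

dft_candidates = candidates_within_budget(BUDGET_USD, DFT_COST_USD)

# An MLIP evaluation costs DFT_COST_USD / COST_RATIO. Scaling the budget by
# the ratio instead of dividing the cost keeps the arithmetic in integers.
mlip_candidates = candidates_within_budget(BUDGET_USD * COST_RATIO, DFT_COST_USD)

print(f"DFT:  {dft_candidates} candidates")
print(f"MLIP: {mlip_candidates} candidates")
```

With these placeholder numbers the same budget covers 20 candidates with DFT and 200,000 with the cheaper model, which is the sense in which a fast surrogate turns a selection problem into a screening problem.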
So that, I hope, gives you a sense of what we've been doing at InstaDeep in terms of developing our AI stack, going from the hardware level with Kyber, to the orchestration level with AIchor, our product there, and the machine learning software stack. Now I want to give some space for Bernardo, who is going to describe how we've been using this stack to develop the next generation of foundation models in genomics. Bernardo.
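As a rough illustration of the Model FLOPs Utilization figure quoted in this segment, the ratio can be sketched as below. This is a minimal sketch under stated assumptions: the "about 6 FLOPs per parameter per token" rule is a common approximation for dense transformer training and is not something stated in the talk, the function name is hypothetical, and every number is a made-up placeholder rather than a Kyber or InstaDeep measurement.

```python
def model_flops_utilization(n_params, tokens_per_second, n_gpus, peak_flops_per_gpu):
    """Return MFU: achieved model FLOPs per second divided by the hardware's peak."""
    # Assumption: ~6 FLOPs per parameter per token (forward + backward pass)
    # for a dense transformer; attention FLOPs are ignored for simplicity.
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    # Theoretical peak of the whole cluster.
    peak_flops_per_second = n_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Placeholder values chosen only to make the arithmetic easy to follow.
mfu = model_flops_utilization(
    n_params=1e9,             # hypothetical 1-billion-parameter model
    tokens_per_second=1e6,    # hypothetical training throughput
    n_gpus=8,                 # one node of 8 GPUs
    peak_flops_per_gpu=1e15,  # hypothetical peak of 1 PFLOP/s per GPU
)
print(f"MFU = {mfu:.0%}")
```

With these placeholders, the job performs 6e15 useful FLOPs per second against a peak of 8e15, so three quarters of the theoretical capacity is doing model work; the 66% and 40% figures in the talk are ratios of the same kind.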
Bernardo Almeida
ExecutivesThank you. Thank you, Alex, and well, hello, everyone. It's a pleasure to be here. I'm Bernardo, Senior Research Scientist at InstaDeep. It's my pleasure to present our work on AI applied to genomics. So genomics is the study of our genome, our genes and how they play together in our cells. And I want to show you how we are using AI to understand that. So the first thing we did last year was to publish our first model, our foundation model for genomics called Nucleotide Transformer. And since then, it has become one of the most popular genomics AI models in the field, used in many papers and to develop many new models. On our side, we have used Nucleotide Transformer, here on the left, to develop new iterations or kind of fine-tuned versions for different applications that were published over this year. We used Nucleotide Transformer to annotate the genome at single nucleotide resolution with SegmentNT, so the second model. We built Isoformer, which combines DNA, RNA and proteins to perform different tasks. And we even combined Nucleotide Transformer with a conversational agent, ChatNT, which made the cover of Nature Machine Intelligence. So all this is published, but we are already working beyond this. So if I put into perspective the current models that exist in the field, nowadays, we have models that learn from genomes. So Nucleotide Transformer is an example, Evo as well. So they are just trained on genomes and then fine-tuned on different downstream tasks. And on the other side, we have models that learn from functional data. So Borzoi or AlphaGenome from Google. 
So today, we are very proud to announce the release of our new Nucleotide Transformer version, NTv3, where we try to unify both paradigms into a single model that learns from genomes, in this case more than 150,000 species genomes, but at the same time is also post-trained on thousands of functional data sets from many different experiments across different organisms. So what is NTv3? So NTv3 combines a full set of capabilities, being multispecies but also multimodal, going from genomes to functional tracks and genome annotation, all at once. It goes from human genomics to plants and metagenomics, and is now capable of processing sequences of 1 million nucleotides, so the longest that exists nowadays. And it's also generative. So you can design de novo DNA sequences with given properties, and I will show you some validations in the lab as well. We built a suite of models from a small, very affordable 10 million parameter model up to 4 billion parameter models, and it is also designed for efficiency even with this long context and model size. So now we will dive into the details of NTv3, starting with the main pretraining phase. So we take NTv3, and we pretrain it on more than 150,000 species' genomes, so that's about 8 trillion nucleotides. And we do it in different phases, from short to longer sequence lengths, to cover the whole tree of life, from very small virus and plasmid sequences to human genome sequences of almost 1 million nucleotides. All this through 15 trillion tokens. So the longest pretraining existing in genomics. So you can do this using a masked language modeling objective where you perturb the sequence, for example, you mask 15% of the nucleotides and you ask the model to reconstruct them. And if you do it over and over again, the training loss, or the error at this objective, kind of starts going down, so the model starts learning this objective. And by doing this at different model scales, we can really see the scaling laws of AI in action. 
So our smallest model gets this performance, and the bigger the model, up to 4 billion, gets even better. So we have these different models of different sizes, different efficiencies that you can use now for various applications. So starting with some inference time just to show you how efficient our model is. This is the current set of models in the field. So when you compare across different sequence lengths up to 1 million, the efficiency in terms of inference time, you can see that they all suffer and it's very hard to scale these models to long sequences. So that's a common issue in the field. NTv3 was designed for efficiency for this problem. So with our 3 NTv3 models, shown here, you can go to 1 million nucleotide sequences with a minimal loss in terms of inference time. So it's really very affordable and possible to use for downstream tasks at this scale. We tested this NTv3 on a first set of long-range tasks, around 44 tasks that go from gene expression, chromatin accessibility, genome annotation across various human tissues. And this is just to show you kind of a busy plot of all the tasks where we have compared NTv3 against other competitive models. But if I summarize all this information and group it by quantitative and classification tasks, we can observe that our models are better than the competitive models, and particularly our small model, just 10 million parameters, is already very efficient. So that's the main message, the first main message. So a very good small foundation model, very easy to use. But if you scale the model size, you can see that you gain performance on both types of tasks. So larger models, better performance. So that's on a set of kind of downstream tasks that are already useful for people, but we wanted to take a step further and bring all these functional data of genomic tracks and genome annotation into an additional post-training phase. 
So we take our NTv3 model, and post-train it on genome annotation and genomic tracks experimental data. So this means that we take for a few set of species, all the introns, exons, splice sites, all these elements that matter in the genome, and we try to use NTv3 to predict them from the sequence. And at the same time, NTv3 needs to predict all these experimental data, so around 17,000 experiments from 16 animals and 6 plant species at single nucleotide resolution. So these are the kind of example profiles that NTv3 needs to predict. And doing this with sequences up to 1 million nucleotides long. So you do the post-training on all these data, and then we can show you how we perform with NTv3 on genome annotation and genomic tracks experimental data. So we start with genomic tracks. And just to give you an idea of what the actual predictions look like. This is a piece of our genome with 2 different genes. This is 1 million nucleotide window. And here, I'm showing experimental data from K562 leukemia cells. So at the top, you have the experimental data, for example, from RNA-seq, DNA-seq and other different assays. And at the bottom, you see the NTv3 predictions. So it's -- in one go, the NTv3 can predict for a 1 million sequence, single nucleotide profiles that match very well the experimental performance. So you see with NTv3, you can predict and recapitulate these assays. This is an example for 2 genes. But if we now look across the genome and just compare with the state-of-the-art model, so the Borzoi model across these different experimental readouts in human and mouse, we are showing an improvement over the state-of-the-art across all of them. So we are outperforming the current models on this single nucleotide prediction of experimental data from human and mouse cells. So that's on genomic tracks. We can also evaluate now our model on genome annotation. Again, this is a busy plot, but that's how our genome looks like. 
So we have a 1 million window with many genes at the top. And our model has to predict all these different elements, where is the gene, the intron, the exon, splice sites, and all these elements have different resolution. So I'm again showing you the actual annotation with the predictions of NTv3. And if we zoom in -- to be easier, if we zoom in into a gene, now you can see kind of a better pattern of the gene with all the exons and the introns in these lines, we can see that NTv3 predicts that indeed it's a gene, the locations of all the introns and the locations of all the exons and even the splice sites, which are just 1 nucleotide out of this 1 million context window, and the same for the UTR regions. So very rich predictions from NTv3. They look like the actual annotation and we can again summarize the performance and compare with the state-of-the-art model, SegmentNT. So the percentage improvement across all these different elements of 14 elements that we train NTv3, again, showing that we outperform the current state-of-the-art on gene finding, regulatory elements like promoters and enhancers and also splice sites. So we take the pretraining, learn from genomes. We take the post-training, learn from functional data, and we outperformed the state-of-the-art models there. And we can even now bring the model further. And instead of just being predictive, like previous models, the previous version of NT and former, we want to bring this model to the generative space as well. So nowadays, we have models like Evo that are generative, but we don't have models that do the 2. So NTv3 is the first model that can do the 2 in 1 go. So NTv3 learns from these native predictions and functional data, but can also do de novo and conditional sequence generation. And that's thanks to the masked discrete diffusion framework that we implemented in NTv3, where you can guide NTv3 to generate sequences with a given property. 
So we are unifying representation capabilities with these fine-tuning approaches with generative capabilities. And I want to demonstrate this using an example that we actually took to the lab and validated. So we collaborated with researchers in Vienna from the IMP Institute to design enhancers that are promoter specific. They activate specific genes. And so enhancers are sequence elements that modulate the expression of genes. So they can be very useful for gene therapy, to activate genes in different cell types. So we wanted to design enhancers specific for promoters, but that are active at different levels as well. So we took NTv3 with this masked diffusion approach, made it generative and generated enhancers for different tasks, which I will show you after, and validated them in the lab through reporter assays. So the first experiment that we did was to prompt NTv3 to design enhancers with different activity. So you take a gene of interest and you want to design an enhancer that activates the gene at low, medium or high levels. We trained NTv3 to do that, generated a few sequences in the computer, sent them to the lab. They generated the sequences and added them into cells in a reporter assay. And in this plot, I'm showing the experimental results. So in gray are the native enhancers from the cells. So you see that you have enhancers that activate the gene at low levels, medium and high levels. And I'm very happy to say that when we tested the generated NTv3 enhancers, we observed the same kind of phenomenon. Our prompted enhancers for low activity were indeed lowly active; they activated the gene less. But we could also design enhancers that activated the gene even more strongly than the native enhancers. So this was a success in terms of generating enhancers that activate genes at different levels, again validated in the lab. So this was the first experiment. Then the second one was to design enhancers that activate specific genes, specific promoters. 
So you prompt NTv3, for example, with a high activity in one promoter and low activity in the other. And then we test in the lab the activity of the 2 promoters with the same enhancer. So here, I will show you the fold change between the prompted high-active promoter and the low-active promoter. So you want high fold changes, so high specificity. And we tested 2 different promoters. So this is the DSCP gene, and compared with the state-of-the-art generative model using DeepSTARR, we observed a stronger specificity for the DSCP gene and an even stronger difference for this RpS12 gene. So this is showing that our models can design highly specific enhancers towards specific genes. And again, in gene therapy, for example, this can be very promising. So these were 2 experiments validated in the lab. And I just want to come back again to the whole presentation and the different key points that I mentioned today. So NTv3 can be used to predict experimental data that we call genomic tracks from different cells. So think about gene expression, chromatin accessibility, et cetera. It can be used to predict the annotation of genomes, for example, genes, splice sites, et cetera, and can be applied across different species. By predicting all these properties, we can now interrogate NTv3 to predict the impact of variants on all these different properties. And you can even bring this further and generate sequences with specific properties, like enhancers in this case. So very, very happy, I think, for this milestone, to present NTv3 here today. And with this, yes, thanks a lot.
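The masked language modeling objective described in the pretraining part of this talk (hide about 15% of the nucleotides, ask the model to reconstruct them) can be illustrated with a toy sketch. This is not NTv3's actual training code: single-nucleotide tokens, the `[MASK]` symbol, the function names and the trivial stand-in "model" are all assumptions made for illustration only.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the talk


def mask_sequence(seq, mask_fraction=0.15, seed=0):
    """Hide a random fraction of nucleotides; return masked tokens and hidden targets."""
    rng = random.Random(seed)
    n_mask = max(1, round(len(seq) * mask_fraction))
    positions = rng.sample(range(len(seq)), n_mask)
    tokens = list(seq)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # remember the true nucleotide
        tokens[pos] = MASK          # hide it from the model
    return tokens, targets


def reconstruction_accuracy(predicted, targets):
    """Fraction of masked positions where the prediction equals the hidden nucleotide."""
    hits = sum(1 for pos, nt in targets.items() if predicted[pos] == nt)
    return hits / len(targets)


sequence = "ACGT" * 25  # toy 100-nucleotide sequence
tokens, targets = mask_sequence(sequence)

# Stand-in for a model: always guess "A" at every masked position.
guess = {pos: "A" for pos in targets}
baseline = reconstruction_accuracy(guess, targets)
print(f"masked {len(targets)} of {len(sequence)} positions, baseline accuracy {baseline:.2f}")
```

A real model would replace the constant guess with a predicted distribution over nucleotides at each masked position, and training would push the reconstruction error down over many such masked sequences.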
Karim Beguir
ExecutivesThank you so much, Bernardo. And really like I want to congratulate you, [ Thomas ] and the entire NT team. NTv3 is a breakthrough. And really like it's so extraordinary to see the team getting to build the largest context window in genomics today, state-of-the-art performance, an order of magnitude faster inference than anybody else in the field, and all this at very reasonable budgets, if we compare to frontier labs. It's really like a testament to the incredible innovation happening at InstaDeep and BioNTech. But it doesn't stop here. We've just shown you state-of-the-art lab-validated results in genomics, but we are also very active in protein space. And we're going to have Bora introduce our latest cutting-edge results in protein design. Bora.
Bora Guloglu
ExecutivesPerfect. Thank you very much, Karim. So hi, everyone. My name is Bora. I'm a Research Scientist here at InstaDeep. And today, I'd like to take a little bit of time to talk to you about our use of GenAI for protein and specifically antibody engineering. So I'd like to start by taking a few seconds just to set the scene. So when we are normally designing a protein, we're not just designing for one property. We're actually optimizing multiple properties all at once. And the solution essentially needs to satisfy multiple constraints. Now the traditional way to approach this would be to develop N models for N different tasks and then apply them one after the other. The problem with this is that it's very, very inflexible. If the task at some point should change, so maybe your internal pipelines or the actual experimental pipeline changes, then you need to go all the way back to scratch, develop new models, curate new data and so on and so forth. So we want to flip this on its head a little bit. What we envision is essentially just one big model that has been trained with as much of the data of interest as possible and so is aware of all of these things. Essentially, it's learned a very rich joint distribution over all of the different attributes that we care about. That means that at inference time, the scientists using this model and interacting with it can essentially prompt the model specifically with only the things that they care about. So one model essentially becomes all of these previously mentioned models. Another advantage here is that because you're training the model with lots of data, the model can also learn correlations that were previously invisible, and that drives our performance. So we spent a lot of time thinking about what sort of model, so what sort of architecture, what ML paradigm is the thing to go with here. And we ended up using Bayesian Flow Networks. These are very well suited to the different types of data which we encounter in scientific settings. 
And we first started by publishing a proof-of-concept paper where we introduced our models, ProtBFN and AbBFN. These are sequence-only models, and we actually showed that compared to leading autoregressive models, BERT-style transformers and diffusion models, they outperform them in terms of sequence naturalness, diversity and all the things that we care about. But today, I'd like to take this a little step further and introduce AbBFN2. AbBFN2 is our first truly multimodal antibody design model, and it allows a scientist essentially to flexibly interact with the model, design antibodies for any task that they're interested in and optimize them on multiple fronts. So when I say antibody, what I'm really referring to in this case is the Fv region. So that is made up of these 2 chains, the heavy chain and the light chain, and it's actually part of the larger molecule. The reason why we focus on these Fv regions is because essentially in the past years, we've seen a massive, massive expansion in the different formats of antibody-based therapeutics. You've got your kind of standard IgG molecules, but also antibody-drug conjugates, bispecifics, slightly more esoteric novel versions of bispecifics or multi-specifics and so on and so forth. But the one thing that's common to all of these things is that the key recognition of the antigen happens via an Fv. And so that's why we need to model this, and we need to model it very, very accurately. The problem is made even more complex because Fvs are highly, highly diverse. So a very, very conservative estimate would be that there are more than 10^16 possible naive antibody sequences, as we call them, which makes this a massive needle-in-a-haystack problem. But the issue is also that antibodies are weird molecules. 
Normally, a protein is expressed from 1 single gene, whereas for an antibody, 5 different random genes are essentially spliced together to produce the molecule, and the biophysics of the molecules are also very interesting. So that means that your haystack is now huge, but it's also multidimensional. So you really need fine grain control over the generative process to actually pick something out from here that works for your purposes. And that's what AbBFN2 does. I'm not going to bore you with kind of the details of these things, but this is essentially 45-plus different modalities or attributes of an antibody that the model includes explicitly. So any design task that we can express in terms of these modalities, the model can tackle. If we don't care about one property at one time, it doesn't matter. We just ignore it, and we focus on the other ones. So this includes stuff like the genetics of the antibody, the biophysics of the antibody, but also the sequence. And we're constantly developing new capabilities. So we now can do per-residue energetics to stabilize an antibody. We also look at things like germline families and also genetic information at the residue level. And we're also working on including structure of both the antibody and the antigen as well as quantum accuracy energetics. So a couple of results here. The first thing that we do is essentially use the model to label known sequences, that is, I have an antibody sequence. I want you to tell me everything there is about the sequence, and I want you to label it very accurately. So here, we've tried 23 different tasks, and we find that AbBFN2 outperforms every other baseline that we've tested on all of these tasks, sometimes by a very large margin. This is very nice because it essentially means that the model has really learned the relationship between sequence and metadata or attributes. And it also means that practically, the model is essentially a one-stop labeling tool. 
Rather than using 5, 10 different tools, all of which have different software requirements, you can just put your sequence through AbBFN2 and get all of the information that you care about. It also means that we can tackle the inverse problem. So that means I have a specific requirement and I want to design an antibody that satisfies that requirement. So as an example here, I've chosen to show you some stabilization results. So stabilization of an antibody here refers to the interface of the heavy and the light chain. So this is where they bind to each other. And this is really, really, really important, both in the clinic, but also naturally. If an antibody is very stably bound to its kind of paired chain, then that means that it's more stable, which means it's easier to express in large quantities, which brings down costs. It also means that it's easier to store, and it's just generally something that we're interested in. This is also specifically very important in the case of bispecific antibodies because there, you really need fine-grain control over which chains will pair up with each other. So last year, we were able to essentially recapitulate natural interface stability. So these are interface stabilities that you would expect to see in a natural immune repertoire, so sequences that come from actual human immune systems. This year, we've pushed this even further, and we can now actually arbitrarily set the energy that we want. And so we can tune essentially the stability of a given heavy/light chain pair. Another thing that we're interested in is multiparameter optimization. So this is, you have 5 different properties or 10 different properties that you want to optimize all at once. And as I said, traditionally, you would use 5 or 10 different tools, one after the other. The problem is that these tools are unaware of each other, so they might undo each other's effect, so to speak. And also, they will introduce more mutations than are strictly necessary. 
In our case, we make use of AbBFN2's capability to understand all of the attributes all at once. And we also make use of inference-time compute scaling. So we tell the model, here is the starting sequence. Here are the 5 things that I want you to optimize, so bring into those blue regions. And then essentially, we allow the model to think about its response, edit it here and there and make changes progressively. And we see really, really nice results with this. So when we look at all of the antibodies that we've tested here, we have an 80% success rate. If we actually look at only the antibodies that you would, in the first place, take a little bit further during preclinical developments or the tractable ones, the success rate shoots up to more than 90%. And the very, very interesting results in this case is that the number of mutations for one objective, for instance, is at 46.6%. This is roughly in line with experimental approaches to doing this. But when we add 4 extra objectives that we optimize for, we actually only need 10 more mutations. So the model is really aware of if I make this change, this actually satisfies multiple things at once. So this is the best one to choose. Now part of this is also sequence humanization or essentially reducing the risk of an adverse immune response of a sequence. Traditionally, again, with a purely experimental approach, this is often done in a kind of trial-and-error way. You take your starting sequence, you introduce a few mutations, you check that everything still works. You do that again. If something breaks, you revert back to a previous state. Do that again over and over until you essentially find your idealized candidate. This can take a very long time. But you might also, at some point during this process, realize, oh, this antibody was never going to be optimized. So what we want to do is essentially integrate models like AbBFN2 into the experimental workflow. 
So rather than having this iterative approach, we essentially use AbBFN2 to optimize the immunogenicity risk. This takes 20 minutes. And then afterwards, you can still do all of the things that you were going to do, including affinity optimization. And this really is as easy as I make it sound because we've also ensured that the model is usable, right? So we've packaged the model and it's now available on DeepChain, and we've essentially made sure that certain workflows that people might be interested in are easily accessible. So in this case, for instance, we can do conditional generation where I have certain attributes that I want in an antibody. So I could say, for instance, oh, you know, I have this specific CDR-H3, so loop-length in mind, I have the light chain sequence already, and I have most of the heavy chain. So I want you to just generate me the rest of the heavy chain, generate me a library that I can then take forward. Alternatively, for the humanization workflow, we've actually packaged this as well. So in this case, all we need to do is enter the sequences that we're interested in, set essentially how many times we want the model to iterate on these sequences and then press go. So to save us the time here, I've actually pre-run one of these humanization experiments. And you can see here that the input sequence is given. And you can see, as the kind of the model works its magic, changes are made progressively and over time, the humanness increases. We can also then, for instance, scroll down and check that the sequence still folds up in the same way, so nothing has been disrupted. And this is really just to make life easier for the bench scientists using the model. So with that, I'll take you back to the slides because we've actually tested these things in the lab. In silico results are well and good, but you always need to demonstrate that these things work. 
So in our case, we've taken 4 antibodies, these are clinical stage antibodies, against 4 diverse targets. And these are antibodies that have actually undergone a humanization procedure experimentally. We've also done this with AbBFN2 and tested that they still bind. In all of these cases, the antibodies still bind with good affinity. But what's really remarkable is in most of them, we actually need far fewer mutations, which allows you much more space to then do further optimization according to your needs, be that what it may be. So this is really, really exciting. We've done the work on a computer essentially, and we can show that it works in the lab. And with that, I just kind of want to pull it back and say that the aim of the model is really to integrate into pre-existing workflows. No one should have to change their experimental workflows to fit the way a model works, but rather the model should be able to fit to your needs. And this is really possible with AbBFN's kind of, as we like to call it, condition anywhere, generate anywhere paradigm. And with that, I'd like to thank you all for listening and hand back to Karim.
Karim Beguir
ExecutivesThanks, Bora. And it's really exciting to see the progress on our Bayesian Flow Network models. And as you can see, I think one of the differences with last year is this time, we have lab-validated results. You saw that for Nucleotide Transformer, you're now seeing it for our generative protein models. And we are really focused on having an impact. And so where are we now in this presentation? We're past the halfway point. And as you have seen, we've been looking at compute, our Kyber cluster, results on scientific computing, then we looked at algorithmic innovation. And now we're going to get closer and closer to applications. And I think a specific point, which is extremely important, is working hand-in-hand with our BioNTech colleagues on the data front, making sure we can extract as much insight as possible from the data. And in this context, I'm very happy to introduce Nicolas, the Head of our BioNTechAI team, as well as Youssef, to tell us more about the work we're doing in data.
Nicolas Lopez Carranza
ExecutivesHello. Pleasure to be here. My name is Nicolas. I am the Head of the BioNTech AI team at InstaDeep. Hi, Youssef.
Youssef Ben Dhieb
ExecutivesHi, I'm Youssef and I'm a Machine Learning Engineer at InstaDeep.
Nicolas Lopez Carranza
ExecutivesBasically, the BioNTechAI strategy is quite simple. As Ugur mentioned, it's driven by data, and there is always potential for continuous improvement of our algorithms as we generate more and more data. We will show that in the context of the iNeST1 personalized vaccines, but this is all across the company. And we are also aiming to learn as much as possible from the tumor. This is where the information is, and this is where we need to develop algorithms to leverage as much of this information as possible for the design of effective vaccines. So we would like to walk you through 2 examples of how we are designing AI algorithms and tools to actually learn from the tumor and learn from the data itself, one in the Sequence Space and one in the Image Space. First, let's talk about the Sequence Space. And for that, I want to introduce you to the concept of the Dark Proteome. The Dark Proteome encompasses uncharacterized proteins from hidden translation products beyond the canonical proteins and known PTMs. Not those proteins that come from protein coding genes, the traditional, classical protein coding genes. So there is a whole new world of proteins or peptides that are not born the same way. They come from aberrant splicing events or gene fusions or long noncoding RNA sequences or noncanonical open reading frames. So how can we look at this? We wish we had a sort of lantern to illuminate the dark proteome. And for this, we developed InstaNovo, a tool that does de novo peptide sequencing, library-free peptide sequencing. And I will tell you why this is very important. Sequencing peptides is very complex, right? It's not as simple as sequencing DNA. You need to chop your peptide into pieces, into fragments, and then accelerate those fragments in a magnetic field. These fragments have a mass-to-charge ratio, so they follow a trajectory, and then we end up having a spectrum like the one you see here, the MS2 spectra here. 
In traditional mass spectrometry, what you do is you have a library, a reference library, where you really know what you are looking for. For the canonical human proteome, that is easy, but for noncanonical peptides, for the dark proteome, that's a bit more complex. And once you have the library, you do a database search. So you try to match this spectrum with your library to finally get, in this case, your, you know, dark tumor antigens. What InstaNovo does is solve the problem of needing this library, which we don't really know in the context of these noncanonical peptides. Another interesting thing is that these peptides could be very cancer specific. So they are great for designing targets, new targets, target discovery or biomarkers for cancer. Ugur said that in the end, with the cancer-fighting cells, like your immune system fighting your cancer cells, well, you want to kill the ones that are cancerous, right? So your target needs to be cancer specific. So just to give you an idea of how we are using InstaNovo, we see a table here where we have tumor and normal identifications, and we find a few peptides where you see that the number of tumor identifications is much larger. These peptides come from the output of InstaNovo. So it has already shown this potential in detecting tumor-specific epitopes from these undocumented open reading frames. InstaNovo has been published in Nature Machine Intelligence, and we made it available for the whole community to use it and try it. And it has also been covered by Science Magazine in an article on next-generation de novo peptide sequencing. This is work that has been done in collaboration with Professor Tim Jenkins and DTU. 
And we are extending this collaboration for introducing InstaNovo V2, and even larger model, 63 million labelled spectra, where you see the increase in the peptide spectrum matches, and it has a higher accuracy like 10%, 15% increase in accuracy in the data set that we have been testing. So we are very excited to apply it in BioNTech for the discovery of new targets and biomarkers -- cancer specific targets and biomarkers. With this, I would like to leave the place to Youssef to show us a bit of how we are trying to improve our digital pathology algorithms.
Youssef Ben Dhieb
ExecutivesThank you, Nicolas. Hi, everyone. So last year, we showed our AI-assisted annotation tool and how we increased the efficiency of pathologists fivefold. However, 5x faster pathologists is still not enough because we have thousands of whole slide images to annotate. And the question we had to answer is how can we reduce the pathologists' annotation efforts while ensuring the best model performance? And the answer to this is data. In computer vision, usually, when you look at your data, unlabeled and labeled, it's different points like you see here. And what we do usually is that we take random points from the data to use for the model training. This works when you have a lot of data, thousands or millions of data points that you can label. But when you have only a few data points, and we actually want to reduce the pathologists' efforts, and you take your data and you plot it, for example, in a t-SNE graph like this one -- it's a real t-SNE graph of a data set -- you will see that your data points are not covering all of the patterns. So here, each cluster is actually a different pattern in your data set. And you will be missing the highlighted patterns here, for example. When you test your model after that, you are not sure you will be getting good results on these patterns because the model didn't see them. And what you actually want is to cover all of your patterns without having to label a lot of data. And for that, we actually took the leading open source software in data curation and histopathology visualization. And we built our own internal product on that, which helped us to explore, understand and work with our histopathology data. And here, I will show a demo for that. So what you are seeing here is actually the real clusters of data. This is the CRC 100K data set, for example. And when you look at one of the clusters, here, for example, I guess, it's the tumor, you will see the same pattern there. 
And when you go on the other side, this one, I think it's the adipose or the fat cells. Yes. And you see a totally different pattern. And for this data set, we have the ground truth labels. So if you visualize the labels here, you will see that actually the foundation model is doing really well in clustering the data set. So you can see here that for the different clusters, you have specific colors. For example, the yellow one is the debris and the green one is the tumor. And it's doing even better because for the tumor, for example, here, you see that we have a lot of different clusters. So if you take this part from the tumor here, a specific pattern, and you take another part here, it will give you a totally different pattern in the tumor. So we have even subclasses for each class. And it doesn't only work on these patches. We also made it work on whole-slide images. And you can take a cohort, for example, for the MSI/MSS task, and you can see all your whole slide images. And you can also see their embeddings and their t-SNE graph. And here we fine-tuned the model a little bit on the task itself. And when we visualize the labels, you can actually see that most of the MSI ones are grouped together, and you have the MSI-low and MSS ones elsewhere. You can also look at your data to find the outliers, the most unique ones. So we can visualize the uniqueness here. Yes. So the brighter the point, the more unique it is. Here, for example, if we take this point and we investigate it, let me open this one here. And we can also investigate the whole slide images inside the app. And when we zoom in, actually here, we find it's the most unique because it's out of focus. And that's how we can find the outliers or the wrong data. Actually, the focus is on the marker made by the pathologists and not on the cells themselves. Another feature you can also use here, if you can go back to the presentation. 
One of the features is that you can actually see your whole slide images and you can zoom up to the cellular level to investigate them. And we actually built a nice module on there where you can test different AI agents from different providers. For example, here, we are just assessing image developed by Google DeepMind and we want to see its answer to the question. So you can select the region and then you can ask the agent. For example, here, we are asking if it can confirm the presence of invasive colorectal cancer in the image. Yes. And here you get the response. Yes, it confirms that. Yes, you can also give it a try after that in the booth after the presentation.
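As an illustration of the data-selection point made in this segment (covering every pattern with a small labeling budget rather than sampling at random), here is a minimal sketch. The talk does not say which selection method the internal tool uses; greedy farthest-point sampling is swapped in here purely as an example, and the 2D points, cluster sizes and function names are all made-up stand-ins for real slide embeddings.

```python
import math
import random


def farthest_point_sample(points, k, start=0):
    """Greedily pick k indices; each new pick is the point farthest from all picks so far."""
    k = min(k, len(points))
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return chosen


def make_clusters(centers, sizes, spread=0.5, seed=0):
    """Toy 2D 'embeddings': one Gaussian blob per pattern (hypothetical data)."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (center, n) in enumerate(zip(centers, sizes)):
        for _ in range(n):
            points.append(tuple(c + rng.gauss(0, spread) for c in center))
            labels.append(label)
    return points, labels


# One dominant pattern and three rare ones, far apart in embedding space.
centers = [(0, 0), (20, 0), (0, 20), (20, 20)]
sizes = [500, 5, 5, 5]
points, labels = make_clusters(centers, sizes)

budget = 4  # number of items we can afford to send for annotation
diverse = farthest_point_sample(points, budget)
random_pick = random.Random(1).sample(range(len(points)), budget)

print("patterns covered, diverse pick:", sorted({labels[i] for i in diverse}))
print("patterns covered, random pick: ", sorted({labels[i] for i in random_pick}))
```

With well-separated clusters, the greedy pick lands in a new cluster each time, so a budget equal to the number of patterns covers all of them, whereas a random pick of the same size is dominated by the largest cluster.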
Nicolas Lopez Carranza
ExecutivesThank you very much. So you see how we are empowering digital pathologists at BioNTech with these tools. And yes, you are more than welcome to give it a try downstairs soon. Now, over to Karim for more applications of AI at BioNTech.
Karim Beguir
ExecutivesThank you, guys. Thanks, Nicolas, and thank you, Youssef. It's really exciting to see the progress we're making in terms of improving the quality of the data that we have, and also the quantity. And so if we summarize, if you remember, at the beginning we said we have 3 engines that are powering the AI plane. The first one is compute, and that's what we saw with Alex. And then we looked at AI innovation with Bernardo and Bora. And finally, now, the data front with Nicolas, Youssef and the BioNTech AI team. So this is all very nice, you could say, but then what can we do with all this? And what is really exciting with having all those capabilities under the same roof at InstaDeep and BioNTech is that we can now start to tackle truly hard biotech problems. And today, we're going to show you our first results in terms of applications, starting with nanoparticle design with Lexi and Cheng.
Lexi Walls
AttendeesHi, everyone. My name is Lexi, I'm a scientist at BioNTech. And I've been working together with Cheng for the last year. One thing that we're really interested in is how to develop the best vaccine. And in order to do this, we look first at what our immune system is trained to respond to. Oftentimes, that is viruses and bacteria, and these viruses are large, and they have a highly repetitive surface and sometimes that surface is symmetrical. So what we can also do is look at what some historically successful vaccines have looked like. They've actually taken advantage of and harnessed this capability of having something that is large, something that has a repetitive system on the surface and is symmetrical. Some examples of this include the Hepatitis B vaccine against the Hepatitis B virus, the human papilloma virus vaccine that helps against cancer and, more recently, the malaria vaccine. Now all of these really harness what our immune system is trained to respond to. They have an antigen on their surface in this large repetitive manner. And so we would like to combine with InstaDeep to be able to do this from scratch using AI-assisted de novo protein design. But that's not the only thing that we want to do with this innovation, as we also want to marry this together with the power of mRNA technology, which has been so, so successful for many vaccines. Now what does this look like practically? What this looks like is we would like to deliver mRNA and utilize the cell to build our nanoparticles from scratch. This begins by starting with a single protein component that must first find its friends and velcro to 3 other components of the same thing in a really oriented way. Once they have found these friends, they need to continue to assemble into up to 20 of these trimers coming together to form these beautiful repetitive arrays to form a nanoparticle vaccine. 
And this nanoparticle vaccine will eventually hold antigens of interest that we want to tailor to our specific vaccine of choice. Now what can this look like? We want to be able to design not just one of these, but ideally, we would have a library of these tools that are tailored and fine-tuned to the application at hand. And here is just an image showing how many of these nanoparticle designs we want to be able to build and bring to life. So just to really drive home how complex a process we are attempting here: what we really are asking is to build a protein from scratch that we can launch from mRNA and have this protein really interact at the molecular level with not just 3 other proteins, but come together and form a 60-mer, up to a 60-mer of these proteins in this beautiful, amazing nanoparticle array. And so to walk you through some of these details, I'm going to hand it over to Cheng so that she can tell you about the amazing advances that they've made.
Cheng Zhang
ExecutivesThank you, Lexi. Hello, everyone. I'm Cheng, Research Engineer at InstaDeep. So now let's see how we can build a nanoparticle step-by-step. Just as you can see in the video, it's like building a house. So we start by designing some small pieces, the building blocks. In our case, they are the trimers, each of which is an assembly of 3 identical proteins. So using generative AI models, we can design thousands of de novo trimers, as you can see here, all with different sizes and shapes. These trimers will form the basis of building blocks to build our nanoparticles, okay? Now that we've built our building blocks, how do we construct the nanoparticle exactly? Just as houses have their architecture, nanoparticles will have their symmetries. As you can see here on the left, they can be a tetrahedron, which consists of 4 trimers, or in the middle an octahedron, which consists of 8 trimers, or even, on the right, you can see the biggest one, an icosahedron consisting of 20 trimers. So all these previously generated building blocks can be computationally assembled into these various user-defined shapes, and this leads to thousands of symmetric nanoparticle assemblies. Until now, we've only designed the 3D structure of the nanoparticles, but in order to make a house habitable, you need to add [indiscernible] to consolidate the structure. So in the case of protein design, we will need to design the amino acid sequences to make the protein really functional and really form the desired shape. In this case, we use AI models to generate hundreds of amino acid sequences per nanoparticle, which are supposed to really form the desired structure. Okay. Now we've generated hundreds of thousands of nanoparticles, but it's extremely challenging for these small pieces of proteins to find themselves and really assemble exactly as we want. To confirm this, we will need laboratory testing. But it's usually time consuming and very limited by capacity. 
So the question is how we can select the most promising candidates so that we can test them more efficiently and achieve a higher success rate. Here comes InstaDeep's solution, DeepChain Folding Studio. It integrates state-of-the-art protein folding models and allows large-scale screening within a short amount of time. So just to give you an idea, we can screen 10,000 designs within 1 day. So now Lexi will show you how these narrowed-down, high-quality designs perform in in vitro testing.
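The numbers in this segment can be tied together with a little arithmetic. The sketch below uses only figures stated in the talk (3 proteins per trimer; 4, 8 and 20 trimers for the three symmetries; 10,000 designs screened per day). The figure of 200,000 designs is an assumption chosen to stand in for "hundreds of thousands", and the names are illustrative only.

```python
import math
from dataclasses import dataclass

SUBUNITS_PER_TRIMER = 3          # a trimer is an assembly of 3 identical proteins
SCREENED_PER_DAY = 10_000        # screening throughput quoted in the talk


@dataclass(frozen=True)
class Architecture:
    name: str
    trimers: int  # one trimer per triangular face of the polyhedron

    @property
    def subunits(self) -> int:
        """Total number of protein copies in the fully assembled particle."""
        return self.trimers * SUBUNITS_PER_TRIMER


ARCHITECTURES = [
    Architecture("tetrahedron", 4),
    Architecture("octahedron", 8),
    Architecture("icosahedron", 20),
]


def days_to_screen(n_designs: int, per_day: int = SCREENED_PER_DAY) -> int:
    """Whole days needed to screen n_designs at the given daily throughput."""
    return math.ceil(n_designs / per_day)


for arch in ARCHITECTURES:
    print(f"{arch.name}: {arch.trimers} trimers -> {arch.subunits} subunits")

# Assumed library size, standing in for "hundreds of thousands" of designs.
print("days to screen 200,000 designs:", days_to_screen(200_000))
```

The icosahedral case gives 60 subunits, which is the largest assembly described in the preceding segment.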
Lexi Walls
AttendeesThanks, Cheng. So this is the moment of truth for a biologist is to go into the lab and see how we actually designed these proteins to structurally form what we want them to do. And what I have the pleasure of sharing with you today is that, yes, we can do this. We can build these nanoparticles, as you can see the models on the top of the screen, a variety of different shapes and sizes. And then we can go into the lab and utilize an electron microscope to see that, yes, we are able to build these nanoparticles as Cheng and her team have designed. But we didn't decide to stop there. What we're really interested in is functionalizing these nanoparticles and placing antigens of interest on to the surface. And so we took it this step further, and we can also show that we can place antigens on the surface of these nanoparticles and they still can structurally come together as designed and intended as again shown by these electron micrographs. So this is really an amazing feat of AI-assisted de novo protein design and structural biology coming together for enhanced vaccine. So thank you so much, and I'll hand it back to Karim.
Karim Beguir
ExecutivesThanks a lot, Lexi and Cheng. And really, I don't know about you, but for me, this is really magic, to think that you can design a protein sequence purely with AI and have it self-assemble into a trimer and then self-assemble again into much larger motifs. And then potentially use this as a scaffold to put antigens on and trigger immune responses. And so that's a significant challenge that we managed to overcome in this project, but the applications don't stop here. And for our last but not least presentation, we're going to show you amazing work done on the other side of developing an immune response, which is: could you actually design particular TCRs for a given antigen target? And this is what Mike and Antoine are going to tell us about.
Michael Rooney
AttendeesGreat. Thank you, Karim. So my name is Mike. Nice to meet you all. I'll be presenting today with my InstaDeep colleague, Antoine, on our work on T cell receptors, also known as TCRs, and specifically how to make these into as strong binders as possible, something that we think is critical to unlocking their full therapeutic potential. So why focus on TCRs? Well, one reason is that TCRs can unlock antigens that are otherwise not available with conventional antibody-based therapies such as ADCs. And the reason for this is that antibodies need to target things that are on the cell membrane. And the limiting factor here is that membrane targets, by and large, are not usually cleanly tumor-specific, which means that there's some residual expression on normal tissues, which limits the dose. TCRs, on the other hand, recognize antigen in a completely different way. And this is something that was actually mentioned earlier, but we have a process called MHC presentation, where proteins inside cells -- the whole proteome is subject to this -- are digested into peptides at the end of their life cycle and sent to the cell surface on a molecule called MHC. And that's what TCRs can recognize. And so what's essential here is that T cells can see basically the whole proteome, not just the component that's on the cell surface as membrane protein. And because of this, this unlocks antigens that are some of the highest quality cancer antigens we know of, like oncoviruses, cancer mutations, neoantigens, as well as genes that are expressed due to dysregulated gene expression in cancer. The other reason we really like T cells and TCRs is that we believe that they are likely critical to getting durable responses in cancer. Probably the best example of this is checkpoint blockade, where we now have data showing how durable these responses can be. 
So this is data from nivolumab in non-small cell lung cancer showing that 5 years out, we had this tremendous divide between patients who got nivolumab versus chemotherapy. And then more recently, we have data from TCR-T. TCR-T is a cell therapy where patients' cells are engineered to express the cancer-specific TCR. And the data we've seen so far is that these can also have very durable responses. This is a TCR against a cancer antigen called PRAME. So our thesis here is that to get the most durable effects with cancer therapy, we likely want to be bringing T cells into the fight. But there's a challenge with T cells, which is that their natural binding affinity to their targets is actually quite weak. It's in the micromolar range. And this is okay for their day job, which is going after viruses and bacteria, which are very highly expressed. But when we want to go into cancer, where the antigens are more typically weakly expressed or variably expressed, we need these to be very strong binders. So to be in the TCR-T cell space -- the cell therapy space, we probably need our binders to be nanomolar binders. And to get that, we either need to be very lucky and find the very rare natural T cells that bind at that very strong level, or we need to do some sequence engineering to make these into stronger binders. Now if we want to go with an off-the-shelf biologic and avoid cell therapy, we probably need an even stronger binder, something in the picomolar range. And that's going to be a million-fold increase in binding over what you would typically see with a natural TCR. That's a huge increase. You're not going to get there ever with a natural TCR; it's probably going to take 10 to 15 mutations. So that's a serious mathematical engineering problem to solve. And one thing that we've realized now after several runs is that we need to have a really strong computational process. The standard approach to this problem is something called phage display. 
So phage display is a fully experimental process that randomly explores the sequence space. It's done for each of the 6 CDR loops, the complementarity-determining regions of the TCR. And at the end of the day, this will typically explore about 1 billion sequence variants, which sounds great. However, the true space, which I said is about 10 to 15 point mutations away from the natural TCR, is 10 to the 32. That's a huge number. So even with phage display, we're just scratching the surface of all those variations, and to find a TCR that is developable, that binds strongly, that binds specifically, you'd have to be quite lucky with just a random exploration of the sequence space. So what we've developed is a new approach. We replaced the phage display with something that's AI-guided and rational. It's choosing new variants in this huge space of 10 to the 32, but in a way that we think is much more effective. And because of this, we are having success in finding TCRs that can check all these boxes. But again, the computation is key. And after having done this multiple times now, what we realize is that having a solid understanding of the peptide-MHC-TCR structure, which varies target to target, is critical. And Antoine will talk about our advances on that difficult problem.
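To put the figures quoted in this segment side by side, here is a small arithmetic sketch. It uses only the numbers given in the talk (about 1 billion variants explored experimentally, a space of 10 to the 32, and the micromolar, nanomolar and picomolar affinity ranges). Taking exactly 1 µM, 1 nM and 1 pM as representative values for each range is an assumption made for round numbers.

```python
from fractions import Fraction

# Sequence-space figures quoted in the talk.
VARIANTS_EXPLORED = 10**9     # roughly what one experimental display campaign covers
SEQUENCE_SPACE = 10**32       # variants within ~10 to 15 point mutations of the natural TCR

coverage = Fraction(VARIANTS_EXPLORED, SEQUENCE_SPACE)

# Representative dissociation constants in mol/L (assumed round values per range).
# A smaller dissociation constant means a stronger binder.
KD_NATURAL = Fraction(1, 10**6)    # micromolar: typical natural TCR
KD_TCR_T = Fraction(1, 10**9)      # nanomolar: target for TCR-T cell therapy
KD_BIOLOGIC = Fraction(1, 10**12)  # picomolar: target for an off-the-shelf biologic

fold_for_tcr_t = KD_NATURAL / KD_TCR_T
fold_for_biologic = KD_NATURAL / KD_BIOLOGIC

print(f"fraction of the space explored: 1 in {coverage.denominator:.0e}")
print(f"fold improvement needed for TCR-T:    {int(fold_for_tcr_t):,}")
print(f"fold improvement needed for biologic: {int(fold_for_biologic):,}")
```

Going from micromolar to picomolar is six orders of magnitude, which matches the million-fold figure mentioned above, while random exploration touches only one part in 10 to the 23 of the candidate space.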
Antoine Delaunay
ExecutivesThanks, Mike. So yes, let's dive a little bit into the structures of TCR-MHC complexes. So the reason why we are interested in this structure is because we want to understand the physical interactions between the TCR on one side and the MHC on the other side. So now when we talk about TCR affinity structures, we often mention the CDR loops. So there are 6 CDR loops, 3 on the alpha chain, 3 on the beta chain. And these loops are highly flexible regions of the TCR that are getting in contact with the MHC. So if you look at this left graph here, where we represented 12 structures aligned on the same MHC, you can see that the loops tend to cluster into the same regions. So that tells you that overall, the docking mode is conserved. However, if you want to know the exact shape and position of each loop, then you have a huge diversity. And that's really the key problem when it comes to TCR-pMHC structure prediction. And we have a good example here where we took the CDR2 beta and MHC of 2 different complexes. And these segments of the structure are actually fully determined by the genome. So you would expect that if they share the same sequence, they will have the same interactions. Okay? But it turns out it's not the case. So this is why we really need to have a very accurate TCR-pMHC structure model if we want to be able to understand these interactions. So now let's talk a bit about how you can model this in silico. And there have been a lot of improvements in the field over the last few years, a lot of exciting work in the community. So based on this amazing work, we've decided that we would actually build our own model, right? When we want to benchmark these models, we actually are interested in the accuracy of the CDR loops that I've just mentioned. So here we've benchmarked these models on a set of completely unseen targets, and you can see that our model performs better than our competitors. 
Now our competitors are generative models. So this means that a common strategy, if you want to boost their performance, is just to sample many more structures for each target. But actually, you can see that if you do this, well, they don't even match the performance of our internal model. So now you may wonder, okay, how do we leverage the structure in our pipeline to design these T cells? So we start from the natural TCR that has a low binding affinity, then we obtain the structure, and then we use 2 different AI algorithms. So the first one is the variant sampler. So it's going to propose candidate mutations. And the second one is the affinity predictor. So the affinity predictor is here to rank all of these mutations and help us to select them. So then we go to the lab, we make experimental measurements of the binding affinity, and we repeat this process 3 times until we reach the desired TCR binding affinity. And the nice thing is that we don't need to actually test thousands of mutations. We can just restrict to a few hundred. So on the right side, you have an example of a very successful campaign that we had. So on this graph, we are representing the dissociation constant. So the lower the value, the stronger the binding affinity is. So initially, we start with the [ WT ] TCR at round 0. And then after just one round, we enter into the nanomolar range. So this unlocks the first therapeutic modality, which is called TCR-T. And then we continue for 2 more rounds, and at round 3, we reach the picomolar range, which unlocks the second therapeutic modality [indiscernible] TCR. And the nice thing with this pipeline is that we are actually able to repeat this process, and we repeated this on 4 different targets. And on average, we had a binding affinity enhancement of 50,000 fold. So now I'm going to show you something even nicer, which is in vivo results in an animal model. So the experiment is actually quite simple. 
You take the subject and you implant the tumor. And then every day, you measure the volume of the tumor and you inject the treatment. So if the treatment works, the tumor should not grow, and if the treatment doesn't work, then it grows. So we've tested this on 2 different cancer targets. So we have 3 curves here. So the gray one is our first control. So this is what happens if we don't inject any treatment. Then the second one is the red one. So this is our second control. And this is what happens if you inject the wild-type TCR, so without any affinity enhancement. And then the last one, the yellow one, is the tumor controlled with our enhanced TCR. So you can see the results are quite striking, I think, here, yes. Now yes, we are quite happy with this pipeline. We've built this very robust pipeline. We hope that we can move forward and, in the long run, make very good progress to elicit durable immune responses for patients affected by cancer. Thank you.
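The design loop described in this segment (propose candidate mutations, rank them with a predictor, measure only a shortlist in the lab, repeat for 3 rounds) can be sketched as a skeleton. Everything below is a toy under stated assumptions: the talk does not describe the actual models, so the variant sampler here is a random point mutator, the "lab" is a made-up additive score where lower means stronger binding, and the predictor is just a noisy copy of that score. All names and parameter values are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def propose_variants(parent, n, rng):
    """Hypothetical variant sampler: n distinct random single-point mutants of the parent."""
    variants = set()
    while len(variants) < n:
        pos = rng.randrange(len(parent))
        aa = rng.choice(AMINO_ACIDS)
        if aa != parent[pos]:
            variants.add(parent[:pos] + aa + parent[pos + 1:])
    return sorted(variants)


def make_toy_assay(length, rng):
    """Toy stand-in for a lab measurement: a hidden additive score per position and residue."""
    table = [{aa: rng.uniform(0.0, 1.0) for aa in AMINO_ACIDS} for _ in range(length)]

    def measure(seq):
        # Lower is better, loosely mimicking a dissociation constant on a log scale.
        return -sum(table[i][aa] for i, aa in enumerate(seq))

    return measure


def run_campaign(wild_type, propose, predict, measure, rng,
                 rounds=3, n_proposed=300, n_tested=50):
    """Iterate: propose variants, rank by prediction, measure a shortlist, keep the best."""
    best, best_value = wild_type, measure(wild_type)
    history = [best_value]  # entry 0 is the wild type, then one entry per round
    for _ in range(rounds):
        candidates = propose(best, n_proposed, rng)
        shortlist = sorted(candidates, key=predict)[:n_tested]  # best predicted first
        for seq in shortlist:  # only the shortlist is "measured"
            value = measure(seq)
            if value < best_value:
                best, best_value = seq, value
        history.append(best_value)
    return best, history


rng = random.Random(0)
wild_type = "".join(rng.choice(AMINO_ACIDS) for _ in range(30))
measure = make_toy_assay(len(wild_type), rng)
# Hypothetical affinity predictor: the true toy score plus Gaussian noise.
predict = lambda seq: measure(seq) + rng.gauss(0.0, 0.2)

best, history = run_campaign(wild_type, propose_variants, predict, measure, rng)
print("toy score by round (lower is stronger):", [round(v, 2) for v in history])
```

The point of the structure is that the expensive step, measurement, is applied to a small ranked shortlist per round rather than to every proposed variant; the quality of the ranking decides how much each round gains.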
Karim Beguir
ExecutivesThank you, guys, and really like very exciting results and I'd like to congratulate the joint teams at BioNTech and InstaDeep for the results. And like Ugur mentioned, this wouldn't be possible without like a lot of collaboration. And perhaps like these 2 examples that you have seen really show you the power of combining together AI expertise, compute at scale with lab experiment capabilities, but also the significant biotechnology expertise of our colleagues at BioNTech. And so this is really integral to getting results and state-of-the-art results like we've shown you today. So really, congrats to the 2 teams. And we're going to have a Q&A with Ugur, and we're going to make this part a bit more interactive. So Ugur, if you'd like to join us.
Unknown Executive
ExecutivesAny questions from the audience?
Karim Beguir
ExecutivesQuestions? Don't hesitate. Yes, we have one here.
Unknown Attendee
AttendeesI'm Francisco, Genomics England. I have a question about your approach to understanding tumor biology. So we have excellent insights and applications based on the understanding of the human -- or the tumor genome proteome. I wonder if you can tell us about your strategy to also incorporate those interactions between the tumor and its microenvironment and how that can lead to new treatments?
Ugur Sahin
ExecutivesYes. This is an excellent question. Of course, in the tumor, there is much more information than the genomic information. We also do transcriptomic analysis, and the transcriptomic analysis gives us, of course, an understanding of, for example, whether T cells infiltrate the tumor and the activation status of the T cells, but we can also decipher more or less all types of cells that are infiltrating the tumor. And in cancer immunotherapy, there is, at the moment, a simple categorization of tumors into PD-L1 positive tumors. PD-L1 high positive, low positive tumors. So at the moment, the pharmaceutical industry is running with just a single parameter. But we see that the information in the tumor is much, much bigger. We can also see evolutionary processes happening in the tumor. For example, we have found tumors where beta-2 microglobulin, which is the key molecule for presentation, presentation of these epitopes, is lost. So there is much more information, and this will come over time. Yes, we will make use of the information to see how the battle between tumor and immune system is going on.
Unknown Analyst
AnalystsHello. I have a question from the webcast here. This is from Jeana Han from TD Cowen. Where are you mining data from to train your models? And secondarily, how do you incorporate data generated in-house at BioNTech, either preclinically or from clinical trials to help train and improve the models?
Karim Beguir
ExecutivesUgur, perhaps I can say a few words on the part about data mining and Ugur can comment on the clinical side. So really, what we've been trying to do, and in some cases managed to do quite well, is collect all the available open source data that we can get our hands on. This is what we did, for example, for the nucleotide transformer series. For NTv3, we reached 15 trillion nucleotides in total in terms of data to train on. And so that's one part. And then working collaboratively with all the different biotech teams to have specific data adding to that mix. I want to mention also that data is not all of the same type. We start with pretraining at very large scale. But then there is the value of data that is specific to a given experiment, if you think about, for example, what we have shown on TCR affinity maturation. So lab data, developed in collaboration with the different teams, is essential here, and that also makes a huge difference in terms of getting results. So get as much data as possible externally and add specific but very high-quality data internally. And we're working more and more with the clinical teams also to unlock these capabilities. And I don't know if you wanted to say a few words on that, Ugur.
Ugur Sahin
ExecutivesYes. It is indeed, yes, with regard to the clinical data, we are not yet in the space of big data, but we are in the space of deep data. With deep data, I mean, really, the multidimensional information that we can get for patients, including, for example, the image from the histology. And even with a few hundred patients and without using our sophisticated AI systems, but more or less unbiased statistical testing. We are able to see amazing, amazing correlations between survival and biomarker data.
Karim Beguir
ExecutivesAnd actually, Ugur, you told me sometimes even like looking at data manually with your expertise, you could see like patterns, yes.
Ugur Sahin
ExecutivesYes. This is actually one good friend of mine is using the term AI for actual intelligence. And we can really benefit from them until we have much more data and then actual intelligence could become artificial intelligence. So this is the way at end of the day, the learning systems are really based on our human knowledge, where we need to pay attention. And later on, the power of the systems of the AI systems is to go beyond that and see patterns that we as humans can't see.
Karim Beguir
ExecutivesYes. And we see actually even like an opportunity to learn directly from the expertise of the biotech experts. If you think about like processes like RLHF like -- from human feedback, it's really about that. And so modern systems could actually learn directly interacting with the different scientists and this is like one of the projects we're working on. So there is a lot more to uncover.
Michael Pye
AnalystsMichael Pye from Baillie Gifford. It sounds like a lot of the advances you've made in models, so NTv3, the antibody discovery design model, intuitively sound to me, as a non-specialist, like they could be very valuable outside BioNTech. How do you strike a balance between advancing the field and kind of monetizing the work that you've done in this space?
Karim Beguir
ExecutivesI would say -- so first, like we try to bring all this innovation to our DeepChain platform, and you've shown -- you've seen some live demos of it today. I would say, in terms of priorities, clearly, our #1 priority is to work with our BioNTech colleagues to progress the different projects we have in Oncology. But biology is extremely vast. So there are lots of opportunities to develop collaborations with external partners when these do make sense for BioNTech Group, which is very often the case. Not everybody is focused on the same problems. And like you said, those models are quite generic, if you understand antibody structure, different properties that has like a broad application. And I would say, even broader for nucleotide transformer. To give you an example, we've built -- we built partnerships even in plant genomics. So yes, there is a broad appeal, but our first mission is to support our BioNTech colleagues. And when there is a chance to build win-win partnerships, we will take it.
Michael Pye
AnalystsAnd a very quick follow-on, if I may, further. Can you help us to understand what can you do with nanoparticle mRNA that you cannot do with today's mRNA constructs?
Ugur Sahin
ExecutivesI think one of the key aspects is the duration of T cell responses. So as we know, mRNA can induce really high antibody titers. But we also know that the titers are dropping with the half-life of antibodies, in the range of 21 to 30 days. And with nanoparticles, we hope to see more stable antibody titers because these nanoparticles remain in the body for a pretty long time.
Unknown Executive
ExecutivesMaybe another one from the webcast then. Agentic Systems are the new frontier for AI models, such as Gemini and ChatGPT just to name a couple. When do you think BioAI agents will become viable and/or useful to us?
Karim Beguir
ExecutivesI mean they're already useful. I think, as you've seen, we're very excited about the applications and some of these are already coming to fruition. But it is true that we're going to see much more in coming years. Perhaps a limitation that we have with the current systems is the fact that, if you look at models like ChatGPT, Gemini, Grok and others, they're really focused on learning and training on the Internet as a whole. So when they understand biology, they understand it from reading articles or web pages. They do not, most of the time, understand biological sequences themselves. And when you bring in deep understanding of a biological sequence at the nucleotide level, you really see magic being uncovered. And we have that example with our ChatNT work that made the cover of Nature Machine Intelligence. This was really bringing together an expert nucleotide biological sequence model with a general purpose language model, and that showed a lot of promise. So I think that's the frontier. And in this particular case, the team was able to push the frontier forward, but we're going to see a lot more of that. And the future of Agentic Systems is systems that really understand the biological data in multiple modalities, but that can also read scientific literature in real time to be able to provide perspective and almost generate novel ideas. I remember, Ugur, you had given us this outlook a few years ago, and this is starting to become true. The systems are capable of formulating scientific hypotheses, and we see this coming and becoming more and more frequent. But if I had to say what's the year where you see this really starting to play at scale, I would say probably 2026, so next year.
Unknown Executive
ExecutivesGreat. Maybe one final question from the webcast. Given the broad potential for AI, both at BioNTech in general, which technologies, modalities and applications do you think will be prioritized at BioNTech specifically to deploy our AI tools or AI capabilities?
Ugur Sahin
ExecutivesThe way we develop BioNTech is really about solving one big problem, yes, how to improve cures for cancer. And this is not a single application. It is, as you have seen today, really a series of modular tasks that, if combined, provide powerful capabilities to develop novel antibodies, better ADCs, better mRNA therapeutics, better vaccines. And in this setting, I think one of the presenters said, having a system which is universally aware of all the tools has an advantage as compared to a very specialized model. So we have to see how this evolves, but we are very confident that this approach -- this holistic approach, understanding immunity, understanding cancer, supporting development of therapeutics on the broader scale, is the way future pharmaceutical companies should be built.
Unknown Executive
ExecutivesCool. I think this was our last question. So thanks again for everybody who attended either in-person or online. It was a pleasure. And yes, stay tuned for more progress. Thanks a lot.