MindWalk Holdings Corp. (HYFT) Earnings Call Transcript & Summary

December 13, 2024

NASDAQ US Health Care Life Sciences Tools and Services special 30 min

Earnings Call Speaker Segments

Shuji Sato

executive
#1

My name is Shuji Sato, I'm Senior Director at ImmunoPrecise Antibodies. And here today, I have with us Dirk Van Hyfte, the Head of Innovation at BioStrand, the in silico team of ImmunoPrecise Antibodies, and Adam Root, the VP and Head of Protein Sciences from one of the pioneering companies of AI biologics generation, a company called Generate Biomedicines, locally here in the Boston area. So the title of the session here is Beyond Conventional Biologics: The Intersection of Machine Learning and Biological Engineering to Invent Novel Medicines. So really our focus here is on biologics. We're going to touch on a few key topics here: how AI and machine learning have really transformed the way that we design biologics and really shortened timelines while exponentially increasing throughput, and also address some of the current gaps. We have made a lot of improvements, but there are still challenges in de novo design that require wet lab validation. And then we'll also touch upon some multi-omics data solutions to provide detailed and scalable insight. So I'll have the guests and presenters here introduce themselves.

Adam Root

attendee
#2

All right. So thank you very much for this opportunity and for the invitation to speak today. As Shuji said, I'm Adam, I lead the protein sciences team at Generate Biomedicines. I've been there for about 4.5 years now. Before that, I worked at Pfizer, and Wyeth before that, in traditional drug discovery and pharma, and joined Generate really to help build a lot of the traditional protein biochemistry and protein sciences capabilities in the company. They were at that point around 2.5 years old and were really just getting off the ground in terms of some of the wet lab capabilities. In terms of my background coming into this, obviously, I was coming from large pharma, traditional drug discovery, working on biologics, working on a number of different protein modalities: antibodies, cytokine fusions, targeted nanoparticles, bispecifics, things like that. So on joining this company, we were talking before the session about why we join these companies, and it's really about where you see the future going, and a little bit of fear of missing out as well as fear of potentially losing your job someday to AI and ML. And it's been an incredible opportunity for me for professional growth, but also really to be a part of amazing technology development that I think we're all experiencing across the field in biologics. In terms of my role at Generate, as I said, I'm responsible for all the protein sciences activities. So that includes, as you can imagine, the protein production and the characterization of all the different proteins that our in silico platforms are producing. And as you can imagine, the computers can produce more protein variants and designs than we can feasibly make.
So a lot of the work that we've done over the last few years has been to build those traditional capabilities around protein production, expression, purification, all of the analytical capabilities as well as some of the functional ones, and more recently we've invested heavily in structural biology, and I'll talk about that in a moment. But it was really about building state-of-the-art, and then in the last year or 2, it's been pushing more towards the future of where the space is going in biologics: producing at scale, in terms of hundreds of thousands to millions of designs, to really test sequence space in new areas and to provide data for our machine learning models. So things like microfluidics, things like cell-free expression where, again, we're not alone there. A lot of companies are doing this and are able to really rapidly build, test and then learn off of the protein designs. And then as I mentioned, the structural biology component. When you think about what's important for computational drug development and drug design, it's data, and it's functional data, it's biophysical data, it's the types of attributes that you want in a therapeutic protein. And for a company like ours, structural information is really critical. And so we made a big investment in cryogenic electron microscopy about 2 years ago, and we now have a 4-microscope facility in Andover, Massachusetts, where our goal is not just to produce project-specific structures but to industrialize that, to be producing on the order of thousands of new structures per year to really kind of surpass what is out there. And I know there are hundreds of thousands of structures in the PDB, but what's lacking is oftentimes the specific protein-protein interaction structures that are related to disease, that are related to actual therapeutic modalities. And so we're trying to fill some of those gaps. I know that's one of the questions we'll touch on, the gaps in this space.
And yes, I mean, I think that it's been a really incredible journey for me professionally as well as helping to build the new capabilities that we're now seeing become more and more prevalent in the industry. I guess I didn't talk too much about Generate, but I can come back as soon as we come to it.

Unknown Executive

executive
#3

Perfect. Perfect. So yes, indeed, you said it already, it's all about data. And that's also, let's say, going back to my roots. So I'm a clinician, studied medicine in Leuven, did also the specialization in psychiatry and did my PhD in medical artificial intelligence. So already a long time ago, and it was really about precision medicine. So I really have built a psychoactive drug selection system, so really to improve rational drug prescription. And one of the biggest issues that I had to solve was how to integrate all this data, because if you prescribe a medicine, then you need to look at clinical data of the patient, but also psychopharmacological data, and try to find a certain algorithm to bring all this data together. So in the year 2000, I was working on that element and said, okay, how can I improve rational drug discovery or prescription. And today, I'm working in the area of rational drug discovery. So also there, how can we integrate all that knowledge, all that data, to build end-to-end solutions. So that's a key thing. And so when I finished my PhD in the year 2000, I started my first company around NLP, natural language processing, because what I'd learned during my PhD was that the most important, valuable information is in unstructured information, so in text, in clinical notes. So how could we solve that, how could we extract this knowledge out of that data set. And so we started really building from scratch our own NLP, natural language processing, engine. And so one of the key things of that technology, and it's still valid even today, even with all these large language models, is that it can immediately extract all the meaningful word groups and the interrelationships between these meaningful word groups. So my company was acquired in 2010 by InterSystems. It's a U.S.-based, also here Boston-based, database provider that's underneath quite a lot of big health care systems. And so after the acquisition, I worked in, let's say, predictive medicine.
Lots of projects there. And at some point, I also started looking into genomic, omics data: DNA, RNA and amino acids. And I made a discovery. I really found very specific fingerprints in omics data. I said, wow, these are the words of biology. And so yes, of course, I was very excited by that. I said, okay, I need to patent this technology. And I'm really happy that I did this because it's also granted already, and I said, okay, let's start the company. Let's start the company, BioStrand, and then do my next journey indeed, rational drug discovery. And so in 2020, I started that. And then in 2022, we became a part of ImmunoPrecise, so IPA. And that is indeed -- that's also the message. The statement that we want to make is to develop rational drugs, especially antibodies, to make them safe, but also very efficacious. So I'm very glad that you also encountered, let's say, the problem of data and data integration. I think we will talk further on that. But can you just give a global overview of, let's say, the current wet lab procedures, how long it takes to come to an IND, for instance. And then yes, also the costs that are associated with that. So just to give an idea of the current wet lab procedures.

Adam Root

attendee
#4

Yes, absolutely. And I think maybe to help frame those questions, I think akin to your vision on linking together the words to understand biology, I think that was a similar mindset to the founders of Generate, Molly and Gevorg, around learning the language of proteins and learning the language of how these amino acids come together and how they form tertiary structures, quaternary structures, to then drive function. And so turning that into a programmatic pursuit, as they say. And so I think from that perspective, it's really about getting the data that then informs the ability to learn and then not just read but write the proteins in a generative way, so that you can create new proteins that maybe haven't existed before, but can do something specific that you want. So in terms of the question around the wet lab capabilities, which again was part of the responsibilities of the Protein Sciences team as well as our pharmacology siblings, it was to really take what is often siloed research that's happening in different groups, with data being captured in Excel notebooks and on someone's personal drive, and turn that into a centralized system with a highly organized database that is easily capturing the data and then translating it through to our generation scientists on the computational side. And so in terms of the wet lab capabilities, it's not all that different in the sense that they're traditional techniques of expression, purification and analytical characterization, but it's really thinking about how you're establishing those assays and workflows, where you're capturing data and what types of data through that process, and then how you're rapidly taking that back into your -- in our case, an internally built system, highly structured so that it allows for rapid generation and learning.

Unknown Executive

executive
#5

Perfect. And so yes, wet lab and AI, they go hand in hand. So how do you see the relation between them? And what are the wet lab capabilities that also support, and learn from, building your AI and machine learning algorithms?

Adam Root

attendee
#6

Yes. No, great question. And I think when we first started our company, we felt that our competitive advantage was on the machine learning side, but we very quickly realized that it was that integration of the wet and the dry lab and that ability to build and test and learn as quickly as possible, and also scaling that. When you think about machine learning data, you often need a lot of data to train these models on, and also the right types of data, in the areas where the models aren't getting things correct. So we invested pretty heavily in those areas to find the missing pieces of data: structural data that didn't exist in the PDB or where AlphaFold didn't get things right, and certainly function and tying that into structure. As you know, making a biologic can be very challenging and can fail for a lot of biophysical reasons during manufacturing. And so we're gathering all the types of data that we need from a biophysical standpoint and feeding that back in and, again, just trying as best as possible, because we're oftentimes working with third-party instrumentation and software, to extract that and feed that back into our models.

Unknown Executive

executive
#7

And of course, we are very enthusiastic about all these new capabilities and so on, but do you still see gaps in that process?

Adam Root

attendee
#8

Yes. I mean, definitely. And I think part of what we've been doing, in addition to building the tools to capture it, is building the tools to understand where the data is having impact on our models and our designs, and then areas where our models are still getting things wrong. And so again, I'm speaking very specifically around the protein design space. I think as we think about the broader spectrum of discovery and into development, there's obviously a lot of functional biological data that I will talk about in a moment. That is also a gap. I think that is probably the bigger gap in some ways: as our models get better and better at learning the properties of amino acids and proteins, that part is more of a realistic thing, where we're now able to make protein designs, but it's more important to ask where we then point those proteins and why, and tying that back together to the biology.

Unknown Executive

executive
#9

Yes. Well, that's a perfect bridge to what I'm interested in, and especially, let's say, one of the problems that I tried to solve is what we have called the information integration dilemma. And so really, what this means is, if we look at all the different types of data that we have to bring together: it's, on the one hand, unstructured information. So it could be text, but sequential data is also unstructured information. We also have, of course, the classical structured information like the lab results and so on. So that's a part of it. But also, more and more, we have, call it, synthetic data. So data that resides in, for instance, all these large language models. And so the basic question that I ask myself is how can we integrate that? How can we bring all these different types of data and different types of modalities together? And that's why I came up with this famous fingerprint. So it's a kind of indexing system. And with 660 million of these patterns, we can really rewrite and index whatever exists in the biosphere. So it's really very generic. And an important aspect of these fingerprints, which are based on, let's say, an entropy model, is that we know exactly what's possible and what's not possible in the biosphere. And it helps us to really build generative AI, because you can generate whatever. But that's endless. It's also computationally very hard to do. But if you know the boundaries, that's a very helpful thing. And so that is the basis of this approach, what we have called the integration information [indiscernible].

Adam Root

attendee
#10

And I guess I'll put the question back around to you where you still see the gaps. It sounds like you have a lot of data that's out there. But where do you see opportunities to get more or the like types of data that are missing. And I guess, tying that to the successes of the outputs of these models? And where is it working maybe where it could still improve with maybe the right types of data?

Unknown Executive

executive
#11

Yes. Well, that brings us to the next point. The fact is, indeed, like we said, it's quite a lot of data. Also quite a lot of it is generated by all the classical AI algorithms, but they are really black boxes. People don't understand exactly what's going on. And the researcher who is doing the interpretation wants to understand, okay, what is really going on? What is the interpretation of that data? And so one of the biggest and very interesting challenges is to put context around your information, around your data. And just to give you a specific example, we have, for instance, our immunogenicity screening, where we analyze its binding and so on. We also do a check in the human proteome. And again, with these patterns, you can do that very quickly. So it's really in just a few minutes that you can do this kind of look. But then you want to associate your information really at the patient population level and bring that in context. So it's not just a matter of saying, okay, here, I have an immunogenic zone. You want to bring information in context. And of course, going more and more towards [indiscernible].

Adam Root

attendee
#12

And I guess, do you see that moving more towards precision medicine, because now you can get very specific in terms of that specific patient's biological uniqueness?

Unknown Executive

executive
#13

Yes. That's why precision medicine is so important. Yes, but it's all about data and the access to data. And more and more, people already in the labs start thinking in that process about what is really the meaning of precision medicine, that you need to integrate all this information. It also starts with, for instance, if you look at rational drug development, understanding the target, understanding the biology. And also, not only the biology but the structure, the physicochemical properties of that, and how that relates to the population.

Adam Root

attendee
#14

And are you able to look at where you've been applying these sorts of models and see where they've been successful and where maybe they've been less successful, to then understand those gaps? Is it because of -- I know we talked a little bit about omics -- is it because maybe it's not representing the disease population well enough, or maybe because it is so specific to a certain disease population that it's not as translatable to others? I guess I'm just trying to understand where you've seen it work and not work, and where those gaps are.

Unknown Executive

executive
#15

Well. Also a very interesting thing is that you see companies that try to build the biggest large language model. "We have the biggest model, and we have integrated," but it's still first in between that model. And so with our indexing system, we can associate, let's say, the most important information from different large language models to our objects. And then we start seeing the differences between the different large language models and see the nuances, and then start seeing the gaps, because if you are only looking within one single world and say, okay, this is the biggest large language model and here I can solve whatever, you don't see the nuance, you don't see the edges of it. And so that's quite an interesting approach.

Adam Root

attendee
#16

And one thing that I think we were paying close attention to in our own designs, you see with things like ChatGPT, where it can hallucinate, it can be based on wrong data, maliciously created data. And so certainly, we're not feeding bad data. We're trying not to feed bad data into our models, but there's always a risk that, that can happen sometimes and of course, if we're using publicly available data, there's that risk. And I'm just curious, from your perspective, with these generative models, there is always that risk. And are you less concerned because of just the scale of the sheer volume of data that you have that it mitigates the presence of data that can be misinforming. And are you seeing things like hallucinations?

Unknown Executive

executive
#17

Yes. Of course, and especially also in -- as a good example, say languages innovation. And also, I think people don't take that too much into account, that it's always data in, data out. So these generative AI models are very good at generating data, but you really need to feed them with the exact right data or the exact prompts or whatever. And that's -- that's the hardest part. And so I think, also in the way you look at the data sets, you want to build your own structural models so that you have control over your input and then use all this AI and so on to generate models, but the input and the exact question or the problems that you are asking of these generative models, that's the key thing.

Adam Root

attendee
#18

That makes a lot of sense and aligns very much with how we've thought about our data and focusing it on the therapeutic areas and the modalities that we are most interested in. Our platform has been built to be modality agnostic, so very generalizable. And what's nice about that is that no matter what protein format we're working on, it's the same model that we're using to produce all of those, and it's the same data capture and feedback. So we are learning across all of those, which I assume is similar in your case, whether it's different therapeutic areas or indications within any given therapeutic area that you're working on.

Unknown Executive

executive
#19

So it's really about an integration of data. And so also maybe to explain a little bit, we are not only looking at sequential information, but also structural information and functional information. We put that in one big, huge graph. And we associate, like I said already, information from different large language models to all these levels. So we are really stacking different large language models on top of each other. And that's very powerful. And for instance, one of our applications where we see this is our expansion pipeline. So for instance, you start with, let's say, a handful of binders, let's say 5 binders, but then you have your repertoire, maybe even repertoire. And then how can I bring or how can I map this -- and as you know, classically people are just looking at sequence homology and even maybe just a small part, only the CDR3 for instance, and look at that. But if you have the capability to look at the full sequence level and the structure level and these physicochemical properties in one single framework, that's where the magic starts.

Adam Root

attendee
#20

It's like you're talking to my founder right now, similar visions of having that holistic view. And then I guess maybe that was another question I had for you, thinking about your background as well as the way that you've worked on different things over the years. Are you using the technologies that you've been building in a way to focus the research around certain biologies? Or are you now focusing more on the protein designs themselves? Or is it both? I guess one thing maybe I can speak about at Generate is that we've been heavily invested in the protein design and characterization aspect of it. We haven't invested as heavily in things like omics, data mining, data capture. That is an area that, of course, we want to move into, but it's a separate business in a way, and it seems you have kind of both things under one roof. And so I'm just curious how you balance those things as you think about projects you're starting, and how you do your design against those different therapeutics?

Unknown Executive

executive
#21

Well, it fits perfectly, let's say, in the so-called fee-for-service model of IPA as a CRO. And so that's why we said, okay, let's first build an end-to-end pipeline covering the whole 3 different steps, so coming from target identification, the discovery part and then also the lead optimization. So it's really -- and that's our primary focus. But intrinsically, the model that we are using is generic, so it can be applied in different application areas.

Adam Root

attendee
#22

And I guess one other question I think we had talked about before this is kind of right now, there's -- I would say, out of the spectrum of activities across the drug discovery and development range of activities that companies undertake, there are applications across that entire spectrum from target ID, protein design, clinical trial design, even to commercialization. And so just curious where you see the next kind of horizon, the next generation of focus and where we can be applying these types of technologies.

Unknown Executive

executive
#23

Bringing this to the clinic. So that's the whole thing, and bringing it more and more to that vision that I mentioned already, so how clinicians will apply and prescribe the antibodies to the patients, but bring that back as fast as possible to the lab so that people start to understand it. And that's also one of the key things that I've learned, let's say, even during my PhD: it was the multidisciplinarity, that you really need to talk from wet lab people to AI people and even to the clinicians. So -- and the better that we can understand this, also the way that everyone has their own vision in that, but trying to solve the multidisciplinarity of the field.

Shuji Sato

executive
#24

Happy to open the floor off to some questions. We have a few minutes here.

Unknown Analyst

analyst
#25

Thank you. It was wonderful, very interesting. So a question is, it's very important to make good in silico screening methods when you design proteins. But nowadays, there are no working AI models for, for example, docking. Even AlphaFold cannot do good screening when making complexes. For example, it always tries to make complexes, even when there is no complex. So it's interesting. So what do you think about docking problems here?

Adam Root

attendee
#26

No, it's a really nice question. And I think that hits at the core of our founder's vision for protein generative design: to design with intent to bind in a specific way. So that's been probably one of the earliest pursuits and still continues to be one of the biggest focuses of our company. We had a recent publication in Nature around our Chroma model. That's one of our models. We have additional generative models that are designed specifically to design for protein-protein interaction. So not just an optimization stack of optimizing something that already exists, but to design from scratch a protein to bind to another in a very specific way. And again, that kind of goes back to the future of this field and how we think, like having the right therapeutic hypothesis, but also the right modality engaging that target in the right way. And we've been investing heavily, as I said, in the wet lab capabilities to produce and test these, to confirm and see where the models are getting it wrong. And so we refer to this as our de novo technology stack at Generate, where we spend a large amount of resources every year on this initiative. We're producing hundreds of thousands of different protein designs against all different types of targets. We run these large 5x5 matrix experiments against different targets. We produce hundreds of thousands of designs against each target, and then we screen those designs against the other targets as well, so we can monitor not only our hit rate, but our off-target hit rate. We then take those through conversion. Those are often done in a display system, and then we test those in traditional recombinant format for biophysical properties. And now that we have these structural capabilities, we're starting to then close that loop back to get the structure of where these proteins are actually binding and how accurate or how inaccurate the model got it.
And that is really important when you think about the data feedback, so the model can actually see where it got it wrong and then improve on that. But you're right, with the models that are out there right now, I think maybe besides some of what we're seeing out of the Baker Lab, there have really been limited successes in that true de novo generation of interactions, and that's been an area of heavy focus for us.
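The matrix experiment described in this answer (designs made against each target, then screened against every target) lends itself to a small worked example. The sketch below is only an illustration under stated assumptions: the function name `hit_rates`, the count-matrix layout, the definition of off-target rate as the fraction of (design, other-target) pairs that bind, and all numbers are hypothetical and are not taken from the speaker's platform.

```python
def hit_rates(hits, designs_per_target):
    """Return (on_target_rate, off_target_rate) for each intended target.

    hits[d][t] is the number of designs made against target d that were
    observed to bind target t (hypothetical layout; d == t is on-target).
    designs_per_target[d] is how many designs were made against target d.
    """
    n = len(hits)
    rates = []
    for d in range(n):
        made = designs_per_target[d]
        # On-target: fraction of designs that bind the target they were made for.
        on_target = hits[d][d] / made
        # Off-target (assumed definition): fraction of all (design, other target)
        # pairs that show binding.
        off_hits = sum(hits[d][t] for t in range(n) if t != d)
        off_target = off_hits / (made * (n - 1)) if n > 1 else 0.0
        rates.append((on_target, off_target))
    return rates


# Made-up 3x3 example; the same function works for a 5x5 matrix.
example_hits = [
    [50, 2, 0],
    [1, 30, 1],
    [0, 0, 10],
]
example_designs = [1000, 1000, 500]
for target, (on, off) in enumerate(hit_rates(example_hits, example_designs)):
    print(f"target {target}: on-target {on:.3f}, off-target {off:.4f}")
```

Tracking both numbers per target is what lets a screen separate designs that bind specifically from designs that are simply sticky.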

Unknown Analyst

analyst
#27

Yes. So I have a question for Adam as well. Just to follow on that point, can you elaborate a little bit more on the wet lab and other things, right? As you mentioned, you can design hundreds of proteins. But can you talk a little bit more about how you actually make them, how you actually test them, in terms of the entire timeline, the modality of the proteins, is it cell-free or some other method? I would appreciate that.

Adam Root

attendee
#28

Yes, absolutely. And yes, I mean that's been a lot of the effort over the years, to build those capabilities. So we refer to these things as mega sets. So these are designs of tens or hundreds of thousands of unique computationally defined proteins. And so you can imagine there's a DNA aspect of it as well, where you have to make all of the individual DNA constructs either individually or as pools, but not produce things that are maybe outside of the model's proposals, because you don't want to confuse the model. You want to make everything that the model had proposed and then test those for binding or function. So we do use a lot of display systems, yeast to mammalian display. We are advancing some of our clonal workflows to get into the tens of thousands of designs using things like cell-free, so that we're able to make these and actually have a very specific sequence-to-function relationship. But with clever barcoding, we can track back through pool-based systems to learn on that. Happy to -- I know we're at time probably, so I'm happy to follow up with questions in the other sessions.
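The idea of tracking designs back through a pooled screen by barcode can be sketched in a few lines. This is a generic illustration, not the speaker's workflow: the barcode-to-design table, the read lists, the pseudocount smoothing and the function name `barcode_enrichment` are all assumptions made for the example.

```python
from collections import Counter


def barcode_enrichment(barcode_to_design, reads_before, reads_after, pseudocount=1.0):
    """Map sequencing barcodes back to designs and compute per-design enrichment.

    barcode_to_design: dict of barcode -> design identifier (hypothetical table).
    reads_before / reads_after: barcodes read from the pool before and after a
    binding selection. Reads with unknown barcodes are ignored.
    Enrichment is the smoothed read frequency after selection divided by the
    smoothed frequency before; values above 1 suggest the design was selected for.
    """
    before = Counter(bc for bc in reads_before if bc in barcode_to_design)
    after = Counter(bc for bc in reads_after if bc in barcode_to_design)
    n = len(barcode_to_design)
    total_before = sum(before.values()) + pseudocount * n
    total_after = sum(after.values()) + pseudocount * n
    enrichment = {}
    for bc, design in barcode_to_design.items():
        freq_before = (before[bc] + pseudocount) / total_before
        freq_after = (after[bc] + pseudocount) / total_after
        enrichment[design] = freq_after / freq_before
    return enrichment


# Made-up pool with two designs; "NNNN" stands for an unreadable barcode.
table = {"AAAC": "design_1", "GGTT": "design_2"}
pre_selection = ["AAAC", "GGTT", "AAAC", "GGTT"]
post_selection = ["AAAC", "AAAC", "AAAC", "GGTT", "NNNN"]
print(barcode_enrichment(table, pre_selection, post_selection))
```

The pseudocount keeps the ratio defined for designs that drop out of the pool entirely after selection.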

Shuji Sato

executive
#29

Thank you very much, everybody. We're ImmunoPrecise Antibodies, and we have a booth at the very end on the right-hand side. So we really appreciate your attention here. And thank you, Dirk. Thank you, Adam. We're honored to have you.
