From 10,000 Structures to 1.8 Billion Interactions: Breaking the Data Bottleneck to Engineer Efficacious Therapeutics - Part 1

Antibody therapeutics have transformed modern medicine, but for many scientists, developing new candidates still feels like searching for a needle in a haystack—a slow, expensive, and unpredictable process. Structural biology and high-throughput data generation are now collapsing that haystack, offering unprecedented visibility into the molecular handshake that drives life: protein-protein interactions.

In this episode from the Smart Biotech Scientist Podcast, David Brühlmann meets Troy Lionberger, Chief Business Officer at A-Alpha Bio, a biotechnology company harnessing synthetic biology and machine learning to measure, predict, and engineer protein-protein interactions. 

Key Topics Discussed

  • Antibody development is shifting from a handcrafted, trial-and-error process to a systematic, data-driven engineering discipline.
  • Traditional in vivo and in vitro discovery methods rely on limited structural data and face challenges such as species differences and cross-reactivity.
  • Animal-based antibody discovery offers natural diversity but is constrained by ethical, biological, and translational limitations.
  • In vitro display technologies enable speed and scale but often introduce developability, stability, and human compatibility challenges.
  • AI and machine learning now augment antibody discovery through in silico design, with the potential to reduce or replace wet-lab steps.
  • High-throughput data combined with machine learning compresses antibody lead optimization from years to months without sacrificing quality.
  • Massive quantitative affinity datasets generated by advanced platforms are transforming biologic drug discovery by improving speed, cost, and reproducibility.

Episode Highlights

  • The historical and current challenges in characterizing these interactions at scale [06:22]
  • How new technologies—especially high-throughput platforms—are changing the needle-in-the-haystack approach [08:40]
  • A comparison of traditional in vivo and in vitro antibody discovery methods, along with their strengths and limitations [09:06]
  • The evolving role of AI and machine learning in antibody discovery and lead optimization [12:11]
  • Real-world examples of how A-Alpha Bio’s approach is compressing years of work into months without sacrificing quality [13:58]
  • The science behind A-Alpha Bio’s AlphaSeq technology and how it leverages yeast display and genomics for large-scale affinity measurements [20:43]
  • The practical affinity range the technology can measure, covering most therapeutic applications [23:25]

In Their Words

As a biologist, I would tell you that ultimately life doesn't exist if proteins aren't on some level interacting with other proteins. So whether it's catalyzing force in your muscles or replicating DNA, proteins have to interact with other proteins to carry out all of the cell functions that are necessary for life. If there's ever cell dysfunction, it's oftentimes in some way, shape, or form tied back to some sort of protein–protein interaction that's both the origin of many disease states, but also the opportunity for therapeutic intervention.

Episode Transcript: From 10,000 Structures to 1.8 Billion Interactions: Breaking the Data Bottleneck to Engineer Efficacious Therapeutics - Part 1

David Brühlmann [00:00:35]:
Protein–protein interactions govern almost every biological process and hold the key to treating cancer, infectious diseases, and neurological disorders. Yet, with only 10,000 antibody–antigen structures in public databases, we're building tomorrow’s medicines on yesterday's data. Today, Troy Lionberger, Chief Business Officer at A-Alpha Bio, reveals how measuring millions of interactions simultaneously changes everything. By generating unprecedented quantities of high-quality data, they are accelerating the discovery of rare antibodies, engineering better protein therapeutics, and training AI models that predict what works before you ever touch the lab bench.

Let's explore how. Welcome, Troy, to the Smart Biotech Scientist. It’s good to have you on today.

Troy Lionberger [00:02:42]:
Thanks, David. It's a pleasure to be here.

David Brühlmann [00:02:43]:
Troy, share something that you believe about biotherapeutic development that most people disagree with.

Troy Lionberger [00:02:51]:
It's an interesting question. Given my background, I think the most controversial view I harbor right now runs against an overwhelming historical acceptance that antibody therapeutic development is artisanal and bespoke, that you're really hunting for needles in a haystack, as the common analogy goes.

The controversial statement I would make is that it’s far more systematic today than I ever imagined. For example, most people are surprised when I tell them that there are tractable and reproducible ways to make antibodies that have the same affinity for their human therapeutic target as for the corresponding animal targets. I mean not just cross-reactive, but the same quantitative affinity, which could help streamline preclinical development of therapeutic antibodies. Most people I’ve spoken with seem to think that is a flight of fancy, but there are processes that make this happen today. That is the most surprising thing I share with people on a day-to-day basis.

David Brühlmann [00:03:52]:
And this will open avenues to novel therapeutics and also more efficacious drugs.

Troy Lionberger [00:03:59]:
That's right. I mean, preclinical development of antibodies is fundamentally constrained by the challenges in developing these therapeutic molecules. In large part, getting them to work with the animal models required in your studies is problematic. Oftentimes, the affinities of your molecules for those animal targets are far worse than for your human targets. So while you may have a drug that works quite well in humans, you can’t get it to the clinic because the animal might have toxicity issues, simply because you had to administer so much drug into it.

David Brühlmann [00:04:33]:
Before we dive deeper into today's topic, take us back to the beginning. What first sparked your interest in biotech, and how did that journey lead you to A-Alpha Bio?

Troy Lionberger [00:04:43]:
The origin for me was really in college, when a faculty member teaching structural biology started describing proteins as nanomachines. That visual has always stayed with me. It got me interested in science and wanting to understand how these fascinating machines, which operate with very different materials and properties than anything we’ve created as human beings, function and work.

That naturally led to understanding that these proteins, which we barely understand, are ultimately at the root of all human disease—leading to cell dysfunction, which we describe as disease states. It’s ultimately the understanding of these basic building blocks of life that drives biotechnology: figuring out how these proteins can be manipulated and controlled to elicit therapeutic effects.

To answer your question about how I ended up at A-Alpha Bio, my career in biotechnology started at a life science tools company called Berkeley Lights, where I helped invent an exciting technology to discover therapeutic antibodies. That experience naturally led to working with many teams in the industry to support their discovery efforts, and I became increasingly aware of the next major constraint: the preclinical development of those drugs. That is, in large part, the problem we are trying to solve at A-Alpha Bio right now.

David Brühlmann [00:06:05]:
When I looked at your website, what struck me is that your company, A-Alpha Bio, describes itself as a protein–protein interaction company. Why are these interactions so fundamental to drug discovery, and why are they also so difficult to characterize at scale?

Troy Lionberger [00:06:22]:
It's a great question. My background is in biology. As a biologist, I would tell you that ultimately life doesn't exist if proteins aren't on some level interacting with other proteins. So whether it's catalyzing force in your muscles or replicating DNA, proteins have to interact with other proteins to carry out all of the cell functions necessary for life. If there's ever cell dysfunction, it's oftentimes in some way, shape, or form tied back to some sort of protein–protein interaction.

That's both the origin of many disease states, but also the opportunity for therapeutic development.

Many technologies have come forth to help characterize these protein–protein interactions. Surface plasmon resonance (SPR) is an industry-standard way to study them. What it measures is the strength of the interaction, which we call affinity, and that is ultimately how biophysical characterization describes these protein–protein interactions.

The problem historically is that, despite how advanced these technologies are, they are also quite costly and difficult to use. And when I say difficult, I don’t mean it’s impossible—people do this every day in labs all around the world. It’s just that if your goal is to make millions and millions of those measurements, it’s not a scalable technology.

A great example: one of our experiments at A-Alpha Bio takes a few weeks, and we do a million measurements at a time. Generating the equivalent amount of affinity data using SPR would cost you between $1 and $500 million if done at a CRO. That math illustrates the fundamental constraint in the industry. Despite increasing awareness that this volume of data is transformative for understanding biology, no one is going to pay $100 million for a weeks-long experiment.
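To make that math concrete, here is a minimal sketch of the per-measurement cost the quoted range would imply. It assumes the range is read as $1 million to $500 million in total for one million measurements; that reading is an interpretation of the transcript, not a confirmed figure, and the function name is hypothetical.

```python
def implied_cost_per_measurement(total_cost_usd: float, n_measurements: int) -> float:
    """Return the per-measurement cost implied by a total quote."""
    if n_measurements <= 0:
        raise ValueError("n_measurements must be positive")
    return total_cost_usd / n_measurements


# "We do a million measurements at a time" (from the episode).
N_MEASUREMENTS = 1_000_000

# Assumed reading of the quoted range: $1 million to $500 million in total.
LOW_TOTAL_USD = 1_000_000
HIGH_TOTAL_USD = 500_000_000

low = implied_cost_per_measurement(LOW_TOTAL_USD, N_MEASUREMENTS)
high = implied_cost_per_measurement(HIGH_TOTAL_USD, N_MEASUREMENTS)
print(f"Implied CRO cost per SPR measurement: ${low:,.0f} to ${high:,.0f}")
```

Under that reading, even a modest per-measurement price becomes prohibitive once it is multiplied by a million data points.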

So the constraint we’re trying to solve is making these data, which are otherwise far too expensive and too hard to generate, easier and more affordable to produce.

David Brühlmann [00:08:40]:
That's exciting, and that's definitely the way to go—to be able to screen a lot more and then find, quote-unquote, “needles in the haystack,” but for a much smaller, modest budget. If we just look at the general picture—because drug discovery has been done for decades—how do companies do this traditionally? What are the traditional workflows and methods? Let’s start with the basics.

Troy Lionberger [00:09:06]:
The branch of therapeutic discovery that I come from is called in vivo discovery. In in vivo discovery, you are typically relying on an animal model whose competent immune system is ultimately responding to the presence of an antigen that’s presented, raising an immune response against that antigen. On the discovery side, scientists will access those antibody-producing cells, identify the ones producing an antibody very specific to your disease target, and then sequence those antibodies to move forward in developing them into a drug.

There’s a complementary approach called in vitro discovery, where you use what are literally called panning methods. You can imagine gold miners panning for gold, which gives you an appreciation for the basic philosophy behind current discovery: needles in a haystack, mining for gold. In phage panning, you use bacteriophage to express a version of your therapeutic and access very large diversities—many different combinations of molecules. You expose these to your therapeutic targets, find those that bind, sequence them, and move them forward in the development process.

David Brühlmann [00:10:20]:
I imagine there are a lot of advantages to this traditional approach. Can you highlight what those advantages are, and also what the limitations are?

Troy Lionberger [00:10:27]:
The advantages of the in vivo approaches using animals are that you're taking advantage of really one of the world's most sophisticated ways of generating diverse sequences of antibodies, which is a competent immune system. To date, while there is promise in AI, AI has not been able to generate the diversity of functional antibodies that a competent immune system can. That is the promise for the future of in silico methods. But to date, hands down, one of the finest ways of generating a diverse antibody response is using an immune system.

The advantages are the diversity. The disadvantages are that, in many cases, you're not able to get human antibodies because it would be unethical to immunize human beings for the purpose of generating antibodies for therapeutics. So we’re limited to animal models that produce antibodies that then have to be further engineered to be compatible with human biology. We have developed humanized animal models to solve that problem, but these are expensive and not commonplace. That is the challenge there.

On the in vitro side, using phage panning, it's much faster. The downside is there are often biophysical characterization issues with those molecules. For example, we’re phage panning at room temperature, but antibodies have to survive body temperature. If they melt or denature at body temperature, that's a problem. So there are other liabilities with the in vitro technologies.

David Brühlmann [00:11:58]:
With the new technologies advancing very rapidly, what is the picture you're seeing? Are we going to have a side-by-side approach, or eventually will AI, machine learning, and so on take over?

Troy Lionberger [00:12:11]:
It's definitely top of mind for me personally, and I should be upfront and say this is the first time I've worked on the machine learning and AI side of the industry. I'm definitely new to the game. So with that caveat, I’ll just say I’ve mentioned in vivo, in vitro, and now, as you mentioned, in silico approaches, which are now complementing the first two antibody discovery approaches.

De novo antibody design is the name of the field that is essentially trying to predict sequences of antibodies that will bind efficaciously to a therapeutic target. Right now, I see all of these as complementing one another. As I said, there are advantages to in vivo and in vitro technologies. In silico approaches often take the output of those approaches as inputs to their models. They’re absolutely interlinked today.

I think the promise of in silico methods is to eventually amass enough data that you can generalize these models so that you don’t actually need a wet bench. But I would argue there is always a constraint: even if you didn’t need data to train your models, you would still need data to validate your outputs. There’s no escaping the data cycle in this space.

There’s a lot of talk in AI about AI models taking the lead. I think there are advantages to de novo design in terms of epitopic accessibility and creating next-generation molecules. But as it stands today, I would describe them as complementary.

David Brühlmann [00:13:43]:
And I imagine that with AI and machine learning, you’ll be able to accelerate the workflows. It's not one or the other, as I’m hearing, but it’s definitely making certain steps of the process faster and more efficient.

Troy Lionberger [00:13:58]:
That's absolutely right. And I’ll give you a great example—an area that A-Alpha Bio is heavily involved in right now: optimizing antibodies to be lead candidate molecules for preclinical teams. Typically, after the discovery of an antibody, you want to optimize that candidate. That could involve making it more human so that it doesn’t interact negatively with the human immune system when injected into patients. It could involve improving developability, like reducing the propensity for aggregation or increasing how much you can produce from cells at scale. But you’re also optimizing affinities.

Historically, this has been a very complicated process. You’d focus on driving affinities to where you want them, then worry about secondary characteristics of molecules that may affect manufacturability downstream. After making those changes, you’d have to go back to ensure your affinity hasn’t veered off course. It’s a slow, recursive, iterative process that’s expensive and time-consuming at each phase, often ping-ponging back and forth through various parts of the value chain. This could take over a year of hard work and significant investment.

What we’re doing now, leveraging disruptive data generation to inform machine learning models in a bespoke way, is transformative. We can generate datasets that train models to predict higher-affinity antibodies while simultaneously optimizing other characteristics. For example, we can insert up to 21 mutations into an antibody and still achieve greater than 50% accuracy in maintaining higher affinity than the parental antibody. That shows the flexibility now possible—where previously you could only add two or three mutations to achieve desired characteristics.

We can now predict sequences that are more human, more developable, higher expressing, higher affinity, and more stable—all at the same time—in a three-month process. This condenses what used to take potentially years of effort into just a few months.

David Brühlmann [00:16:25]:
So traditionally, drug discovery would take years—like two years, three years? What is the benchmark?

Troy Lionberger [00:16:33]:
It really depends. Having been in this space, it depends on the approaches taken and the complexity of the target. But in terms of lead optimization, the part of the process I’m referring to, that could traditionally take one to three years, which we are now compressing into a three-month process.
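As a quick check on the figures quoted here, the compression factor can be worked out directly. This is only arithmetic on the numbers mentioned in the conversation (one to three years versus three months); the variable names are illustrative.

```python
MONTHS_PER_YEAR = 12

# Traditional lead optimization: one to three years (from the episode).
traditional_months = (1 * MONTHS_PER_YEAR, 3 * MONTHS_PER_YEAR)

# Compressed process: three months (from the episode).
compressed_months = 3

# Speed-up factor at the low and high end of the traditional range.
speedup = tuple(t / compressed_months for t in traditional_months)
print(f"Speed-up: {speedup[0]:.0f}x to {speedup[1]:.0f}x")
```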

David Brühlmann [00:16:54]:
Wow, that’s massive. Yeah, that’s a big change.

Troy Lionberger [00:16:56]:
Yeah. And after that three-month process, we’ve also done things that historically couldn’t be done. For example, we can drive the affinities of preclinical animal model antibodies to match human target affinities. That’s where it gets really exciting—asking how this changes the dynamics of the overall ecosystem if everyone’s projects could be accelerated through preclinical development. These molecules are essentially tailor-made for the experiments planned in preclinical studies.

David Brühlmann [00:17:30]:
At the end of the day, we need to find effective antibodies—the purpose of the whole drug discovery process is to find the most efficacious antibody. Now what I’m hearing is that we have technologies to accelerate the workflow. My question is: do these technologies also enable finding that very effective antibody, or do we need additional technologies on top of that?

Troy Lionberger [00:17:59]:
No, I would argue that the problem I just described is not just about making a faster, more efficient process. We are also hitting the target product profiles required for these therapeutics. There’s no sacrificing quality by accelerating the timeline, which is, I think, a rare example. In this three-month process, you’re actually starting to guarantee results—something I never thought would be possible with a service provider.

David Brühlmann [00:18:28]:
Can you lead us into how you’re achieving that? What is the technology used, and what are the major steps?

Troy Lionberger [00:18:34]:
At a high level, here’s how it works: we generate tens of thousands of mutants of a parental antibody by randomly inserting point mutations, one, two, or three at a time. We then measure the affinity of those tens of thousands of antibodies against a panel of different targets: not just the human target, but maybe mouse and cynomolgus targets, as well as point mutants of targets or comparable family members. This gives us an early readout of nonspecific binding so we can avoid it.
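The library-building step described above can be sketched in a few lines. This is an illustrative toy, not A-Alpha Bio's actual method: the parent sequence, the library size, and the function names are all hypothetical, and it assumes uniform random substitutions at distinct positions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def random_mutant(parent: str, n_mutations: int, rng: random.Random) -> str:
    """Return a copy of `parent` with `n_mutations` random point substitutions."""
    positions = rng.sample(range(len(parent)), n_mutations)  # distinct positions
    seq = list(parent)
    for pos in positions:
        # Choose any residue other than the one already at this position.
        choices = [aa for aa in AMINO_ACIDS if aa != seq[pos]]
        seq[pos] = rng.choice(choices)
    return "".join(seq)


def build_library(parent: str, size: int, seed: int = 0) -> set[str]:
    """Build a set of unique mutants carrying one, two, or three mutations each."""
    rng = random.Random(seed)
    library: set[str] = set()
    while len(library) < size:
        library.add(random_mutant(parent, rng.choice([1, 2, 3]), rng))
    return library


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical CDR-like stretch, used only for illustration.
PARENT = "ARDYYGSSYWYFDV"
library = build_library(PARENT, size=1000)
print(len(library), "unique mutants")
```

Each mutant would then be measured against every target in the panel, giving one affinity value per antibody and target pair.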

We gather data not only on binding affinity and cross-reactivity, but also specificity from each of those tens of thousands of molecules. We then use all that data to fine-tune AlphaBind, our computational platform. The AlphaBind model has been trained on close to a billion different affinity measurements generated by the company. Fine-tuning the model with data from a specific parental antibody trains it to predict mutations that can be introduced into the original molecule.

The machine learning picks up on synergistic and compensatory mutations that might not be obvious to the human eye but are clear in the data. As a result, we can be greater than 90% confident in generating antibodies with higher affinity, even with up to 15 point mutations.
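One common way to quantify "synergistic and compensatory" mutations is to compare a double mutant's measured effect with the sum of the single-mutant effects on a log scale. The sketch below assumes that framing; the KD values are made up for illustration, and this is not a description of how AlphaBind works internally.

```python
import math


def log_fold_improvement(kd_parent_m: float, kd_mutant_m: float) -> float:
    """log10 fold improvement in affinity; positive means tighter binding (lower KD)."""
    return math.log10(kd_parent_m / kd_mutant_m)


def epistasis(kd_parent_m: float, kd_a_m: float, kd_b_m: float, kd_ab_m: float) -> float:
    """Observed double-mutant effect minus the additive expectation (log10 units).

    Zero means the two mutations act independently; a positive value means the
    pair is better together than the single mutants predict.
    """
    additive = (log_fold_improvement(kd_parent_m, kd_a_m)
                + log_fold_improvement(kd_parent_m, kd_b_m))
    observed = log_fold_improvement(kd_parent_m, kd_ab_m)
    return observed - additive


# Hypothetical KD values in molar units.
KD_PARENT = 10e-9   # parent: 10 nM
KD_A = 5e-9         # mutation A alone: 2-fold better
KD_B = 20e-9        # mutation B alone: 2-fold worse
KD_AB = 1e-9        # A and B together: 10-fold better

score = epistasis(KD_PARENT, KD_A, KD_B, KD_AB)
print(f"Epistasis: {score:.2f} log10 units")
```

In this toy case mutation B looks harmful on its own, yet the pair outperforms what either single mutant would suggest, which is the kind of pattern that is hard to spot without large datasets.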

In parallel, we use off-the-shelf developability models to downselect molecules. While optimizing affinity, we simultaneously evaluate expression, solubility, and melting temperature to ensure the molecules are manufacturable.
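A downselection step like this could look roughly as follows. The field names, thresholds, and candidate values are all hypothetical and chosen only for illustration; the episode does not say which developability models or cutoffs are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    predicted_affinity_gain: float  # log10 fold improvement over the parent
    predicted_tm_c: float           # predicted melting temperature, degrees C
    predicted_expression: float     # relative to parent (parent = 1.0)
    predicted_solubility: float     # arbitrary 0 to 1 score


def downselect(candidates, min_tm_c=65.0, min_expression=1.0, min_solubility=0.5):
    """Keep candidates passing every developability filter, best affinity first.

    The default thresholds are illustrative assumptions, not recommended values.
    """
    passing = [
        c for c in candidates
        if c.predicted_tm_c >= min_tm_c
        and c.predicted_expression >= min_expression
        and c.predicted_solubility >= min_solubility
    ]
    return sorted(passing, key=lambda c: c.predicted_affinity_gain, reverse=True)


candidates = [
    Candidate("v1", 1.2, 70.0, 1.1, 0.8),
    Candidate("v2", 1.5, 58.0, 1.3, 0.9),  # fails the melting temperature filter
    Candidate("v3", 0.7, 72.0, 1.0, 0.6),
    Candidate("v4", 1.4, 68.0, 0.6, 0.7),  # fails the expression filter
]

selected = downselect(candidates)
print([c.name for c in selected])
```

Filtering first and ranking second means the highest-affinity variant is dropped if it is predicted to be hard to manufacture.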

David Brühlmann [00:20:33]:
So you generate millions of affinity measurements in a week. Are you using yeast display and genomics to do that, or what’s the trick behind it?

Troy Lionberger [00:20:43]:
Yeah, let me go a bit under the hood. Our wet-lab technology is called AlphaSeq. It’s a yeast display system that takes advantage of yeast mating pathways in nature. In yeast genetics, the a and α (alpha) mating types have mating receptors that engage to form a diploid cell containing the genetics of both parent cells.

What our co-founder and CEO, David Younger, proved while in David Baker’s lab is that there’s nothing extremely specific about these mating receptors. You can attach molecules of interest to the outside of these cells, and if those molecules have measurable affinity, they will increase the rate of diploid formation—or yeast mating.

Here’s how it works in practice: you have a culture with a library of a-type strains displaying antibodies and α-type strains displaying targets of interest. Under optimized culture conditions, mating occurs. After some time, you sequence everything. The number of times you see a specific gene pair correlates with binding affinity.

The diploid cells contain the genetics of both “mom” and “dad,” so each readout provides a genomic barcode corresponding to a specific antibody–antigen pair. You can quantitatively relate that to affinity in molar units, and the result correlates well (a correlation of about 0.85) with standard measurements like SPR (surface plasmon resonance) or BLI (biolayer interferometry).

The trick here is extracting a biophysical measurement from a genomic readout. Next-generation sequencing provides the scalable data we need, and we can reliably relate it to binding affinities for millions of molecules simultaneously.
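The idea of turning sequencing counts into an affinity signal can be sketched as below. This is a simplified illustration, not the AlphaSeq analysis pipeline, which is not described in the episode: the reads, barcode names, and reference KD values are invented, and the toy data are constructed to be perfectly correlated, whereas the correlation quoted in the conversation is about 0.85.

```python
import math
from collections import Counter


def pair_counts(reads):
    """Count how often each (antibody, antigen) barcode pair was sequenced."""
    return Counter(reads)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy diploid reads: each read is one (antibody barcode, antigen barcode) pair.
reads = [("ab1", "ag")] * 400 + [("ab2", "ag")] * 40 + [("ab3", "ag")] * 4
counts = pair_counts(reads)

# Hypothetical reference KD values in molar units (for example from SPR).
reference_kd_m = {"ab1": 1e-9, "ab2": 1e-8, "ab3": 1e-7}

# Compare log read counts with pKD (-log10 KD): tighter binders mate more often.
log_counts = [math.log10(counts[(ab, "ag")]) for ab in reference_kd_m]
pkd = [-math.log10(kd) for kd in reference_kd_m.values()]

r = pearson(log_counts, pkd)
print(f"Correlation between log counts and pKD: {r:.2f}")
```

A real analysis would also need to account for how abundant each strain was before mating, which this sketch ignores.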

David Brühlmann [00:23:18]:
You have a good correlation. I’m curious—what is the affinity range you’re able to measure?

Troy Lionberger [00:23:25]:
Great question. The affinity range we usually see—though we can tune our culture conditions to adjust sensitivity—is typically from hundreds of picomolar up to tens of micromolar. So it’s a very large dynamic range that covers the therapeutic range most people are interested in.
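To put that dynamic range in numbers, the sketch below converts two representative endpoints into orders of magnitude and into binding free energies using the standard relation ΔG = RT ln(KD). The endpoints of 100 pM and 10 µM are assumptions chosen to stand in for "hundreds of picomolar" and "tens of micromolar".

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def decades_of_range(kd_tight_m: float, kd_weak_m: float) -> float:
    """Orders of magnitude spanned between the tightest and weakest KD."""
    return math.log10(kd_weak_m / kd_tight_m)


def binding_free_energy(kd_m: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol, relative to a 1 M standard state."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_m)


# Assumed representative endpoints of the measurable range.
KD_TIGHT_M = 100e-12  # 100 pM
KD_WEAK_M = 10e-6     # 10 uM

decades = decades_of_range(KD_TIGHT_M, KD_WEAK_M)
dg_tight = binding_free_energy(KD_TIGHT_M)
dg_weak = binding_free_energy(KD_WEAK_M)

print(f"Dynamic range: about {decades:.0f} orders of magnitude")
print(f"Binding free energy: {dg_tight:.1f} to {dg_weak:.1f} kcal/mol")
```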

David Brühlmann [00:23:53]:
That wraps up part one of our conversation with Troy Lionberger on the data revolution in antibody discovery. We’ve explored the limitations of traditional methods and how A-Alpha Bio’s AlphaSeq platform is changing the game.

In part two, we’ll discover how machine learning transforms this massive dataset into predictive power. If you’re finding value in this episode, please leave us a review on Apple Podcasts or your favorite platform—it helps other scientists like you discover the show.

All right, smart scientists, that’s all for today on the Smart Biotech Scientist Podcast. Thank you for tuning in and joining us on your journey to bioprocess mastery. If you enjoyed this episode, please leave a review on Apple Podcasts or your preferred platform. By doing so, we can empower more scientists like you.

For additional bioprocessing tips, visit us at www.bruehlmann-consulting.com. Stay tuned for more inspiring biotech insights in our next episode. Until then, let’s continue to smarten up biotech.

Disclaimer: This transcript was generated with the assistance of artificial intelligence. While efforts have been made to ensure accuracy, it may contain errors, omissions, or misinterpretations. The text has been lightly edited and optimized for readability and flow. Please do not rely on it as a verbatim record.

Next Step

Book a free consultation to help you get started on any questions you may have about bioprocess development: https://bruehlmann-consulting.com/call

About Troy Lionberger

Troy Lionberger serves as Chief Business Officer at A-Alpha Bio. He started his career as a research scientist after earning his PhD from the University of Michigan and completing postdoctoral training at UC Berkeley. During nearly a decade at Berkeley Lights, Troy held senior leadership positions spanning R&D, product strategy, and business development.

Before joining A-Alpha Bio, he was Chief Business Officer at Abbratech, where he guided the company from stealth into a partner-focused antibody discovery biotech. At A-Alpha Bio, Troy applies his strong technical background and experience scaling platform companies through strategic partnerships to help position the company as a key enabler in the industry.

Connect with Troy Lionberger on LinkedIn.

David Brühlmann is a strategic advisor who helps C-level biotech leaders reduce development and manufacturing costs to make life-saving therapies accessible to more patients worldwide.

He is also a biotech technology innovation coach, technology transfer leader, and host of the Smart Biotech Scientist podcast—the go-to podcast for biotech scientists who want to master biopharma CMC development and biomanufacturing.  


Hear It From The Horse’s Mouth

Want to listen to the full interview? Go to Smart Biotech Scientist Podcast

Want to hear more? Do visit the podcast page and check out other episodes. 
Do you wish to simplify your biologics drug development project? Contact Us

© 2026 Brühlmann Consulting – All rights reserved