From Data Silos to Autonomous Biomanufacturing: Digital Twins and AI-Driven Scale-Up - Part 2

Biomanufacturing has always dealt with the challenge of turning vast, complex datasets and intricate production steps into life-changing therapies. But when batch records multiply and process deviations loom, how do biotech teams make sense of it all? In this episode, we move beyond theory to the nuts and bolts of how AI—when thoughtfully deployed—can turn bioprocessing chaos into actionable intelligence, paving the way for the factory of the future.

Our guest, Ilya Burkov, Global Head of Healthcare and Life Sciences Growth at Nebius AI, doesn’t just talk about data wrangling and algorithms—he’s spent years building tools and strategies to help scientists organize, contextualize, and leverage real-world datasets. 

Having worked across tech innovation and pharmaceuticals, Ilya Burkov bridges cutting-edge computation with the practical realities of CMC development and manufacturing, making him a trusted voice on how bioprocessing is rapidly changing.

Key Topics Discussed

  • Biotech struggles with massive, unstructured data. AI (especially Generative AI) is essential for labeling, structuring, and turning raw data into actionable insights.
  • The industry is moving toward Industry 4.0/5.0, featuring fully interconnected, self-learning manufacturing facilities that use real-time AI to autonomously optimize yield and quality.
  • Small companies should begin AI implementation by focusing on clean data from one process unit and targeting high-impact areas (e.g., media optimization) to demonstrate proven ROI before scaling.
  • Cloud solutions offer superior scalability and cost efficiency compared to managing expensive, obsolescence-prone on-premise compute infrastructure for handling big data.
  • Effective AI adoption is a partnership between technology and human expertise: scientists stay in control while AI amplifies their work.
  • Investing now in clean data, tools, and skilled teams is critical for competitive survival in the evolving bioprocessing landscape.

Episode Highlights

  • Advice for biotech scientists on learning from innovations in other industries [02:21]
  • Tackling the complexities of organizing huge and often unstructured datasets in bioprocessing [03:08]
  • Techniques and tools to structure, label, and prepare data for AI—including Nebius’s in-house tool, TractoAI [06:24]
  • The vision for the “factory of the future”: AI-driven, interconnected, and self-learning manufacturing environments [08:11]
  • Strategies for startups and small teams—how to begin implementing AI and what areas of bioprocessing to focus on first [10:12]
  • Navigating the decision between on-premise and cloud computing for scalable, cost-effective AI workloads [12:32]
  • The importance of partnership between scientists and AI, emphasizing collaboration and data-driven decisions [15:47]

In Their Words

AI isn't just a tool for faster experiments. It's transforming how we develop, how we optimize, and how we manufacture biologics from start to finish. When integrated thoughtfully, it can empower a lot of scientists to improve quality, to accelerate timelines, and ultimately it can help get a lot of therapies to patients faster. AI doesn't replace human expertise — it amplifies it.

Episode Transcript: From Data Silos to Autonomous Biomanufacturing: Digital Twins and AI-Driven Scale-Up - Part 2

David Brühlmann [00:00:28]:
Welcome to Part Two with Ilya Burkov from Nebius AI. In the first half of our conversation, we explored how AI is transforming process development from data overload to autonomous DOE studies. Now we're diving into the challenges many of you are facing: how to organize huge datasets, where to store your data, and what the factory of the future looks like. We'll also get practical, so we'll answer this question: If you want to start using AI tomorrow, where should you begin? Well, let's jump back in and talk about manufacturing's AI-enabled future.

You, or rather your company, work across many parts of our industry and even other industries. When I was working in technology innovation, we would always look beyond our own field at what other industries were doing and try to learn from them. There's no doubt that on a lot of innovations and trends, other industries are much further ahead of biotech, since the biotech industry is more conservative, for good reasons. What do you think? Should we as biotech scientists learn from other industries? And what should we borrow, whether it's technology, a mindset, or ways of collaborating, to make bioprocessing even better?

Ilya Burkov [00:03:04]:
Yeah, that's a great question. When you look at it, no two processes are the same. But whenever you're working with huge volumes of data — everything from drug discovery and development to genomics to imaging data from CT or MRI scans — you're working with a lot of unstructured data. Understanding how to label it and how to prepare it beforehand is key, because there's no point in having tons and tons of data if you can't use it for any workloads or don't deeply understand what it means. So I think understanding the statistical background, and how we can actually use that data, is key.

A lot of mathematics and algorithmic work is needed, irrespective of which industry you're coming from: understanding how to really structure that data and how to prepare it for training runs. There are a lot of reports, and humans might miss things in them. And if you don't program the pipeline in the right way, the code is not going to fix the data magically for you. That work needs to be done up front.

In terms of specific industries, I don't think one industry does it better than another. It's just that they're working with different types of data. When you look at drug discovery and drug development, a lot of the pharma companies are sitting on exabytes of data. That doesn't mean the data is ready to be used immediately.

David Brühlmann [00:04:29]:
Yeah, unstructured data — that's a huge challenge. Because if you look, for instance, at the manufacturing department, they have batch records, investigation reports, operator notes, all kinds of analytical data, and so on. There's so much going on. And now, as companies move toward real-time release, I'm wondering: what is the right way to go about that? Where do you start, and how do you make sure you actually organize the data the right way? Can you give us some advice here?

Ilya Burkov [00:05:01]:
Sure. Generative AI can read, summarize, and connect these data sources to identify patterns and root causes. It can transform a lot of the raw information into actionable intelligence, help teams prevent the kinds of deviations that have occurred in the past, and optimize processes a lot faster. And AI doesn't replace the operators, as I said, and it doesn't replace the process engineers — it augments them. It's a tool for reading and synthesizing unstructured reports.

AI surfaces insights that humans might miss, so you use it as a guide for a smarter decision process on the next batch. It's like giving a team a microscope for process intelligence, helping every production run learn from the last. You're giving them very clear insights they can use to understand the process and decide what direction to take.
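
To make Ilya's point concrete, here is a minimal Python sketch of the pattern he describes: feeding plain-text batch records and deviation reports to a generative model for summarization and root-cause hints. The file layout and the query_llm helper are illustrative placeholders, not any specific product's API.

```python
"""Sketch: mining unstructured deviation reports with a generative model.

All names here are illustrative. Wire query_llm to whatever LLM endpoint
(cloud API or on-prem model server) your organization actually uses.
"""
from pathlib import Path

PROMPT = """You are a bioprocess deviation analyst.
From the batch records and deviation reports below, list:
1. Recurring deviation patterns (parameter, unit operation, frequency).
2. Plausible root causes, citing the source document for each.
3. Suggested preventive actions for the next batch.

Documents:
{documents}
"""


def query_llm(prompt: str) -> str:
    # Placeholder: replace with a call to your validated, access-controlled
    # model endpoint. Returning a stub keeps the sketch runnable as-is.
    return "<model response goes here>"


def analyze_batch_records(record_dir: str) -> str:
    # Concatenate plain-text exports of batch records and operator notes.
    docs = [f"--- {p.name} ---\n{p.read_text()}"
            for p in sorted(Path(record_dir).glob("*.txt"))]
    return query_llm(PROMPT.format(documents="\n\n".join(docs)))


if __name__ == "__main__":
    print(analyze_batch_records("batch_records/"))
```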

David Brühlmann [00:05:59]:
And any advice about how to structure the data — like what system to use, especially if you don't have that many resources? Can you use a simple database for that, or do you need a sophisticated program? I mean, you see a lot of people building some simple AI stuff. I'm not so sure if this is suitable for the biopharmaceutical world, but what is your take on that?

Ilya Burkov [00:06:24]:
There are various commercially available tools out there that work with databases and large datasets. At Nebius we have an in-house tool called TractoAI, which is used to accelerate data preparation for pre-training: to label the data, identify it, and get it ready for a large training round. Which tool is right just depends on the volume and size of the data and what people are using. For us, for high-performance compute, Tracto is very good at working with petabytes and exabytes of data. So when teams have huge volumes of information, especially if it isn't well structured, that's what we recommend.
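
TractoAI's actual interface isn't discussed here, so the following is a generic sketch only, not the TractoAI API. It illustrates the shape of the preparation step Ilya describes: structuring and labeling raw text files into a dataset ready for a training run. The rule-based labeler and all file names are assumptions for illustration.

```python
"""Generic sketch of the data-prep step: raw text -> labeled JSONL records.

Not the TractoAI API. This only illustrates the shape of the task:
structure, label, and package unstructured files for a training run.
"""
import json
from pathlib import Path


def label_record(text: str) -> str:
    # Illustrative rule-based labeler; a real pipeline might use a model
    # or human-in-the-loop review for this step.
    lowered = text.lower()
    if "deviation" in lowered:
        return "deviation_report"
    if "batch" in lowered:
        return "batch_record"
    return "other"


def build_dataset(raw_dir: str, out_file: str) -> int:
    # One JSON record per source file: provenance, label, and raw text.
    count = 0
    with open(out_file, "w") as out:
        for path in sorted(Path(raw_dir).glob("*.txt")):
            text = path.read_text()
            out.write(json.dumps({"source": path.name,
                                  "label": label_record(text),
                                  "text": text}) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    print(f"wrote {build_dataset('raw_docs/', 'training_data.jsonl')} records")
```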

David Brühlmann [00:07:05]:
And I guess speed is also an issue once you have that much data, right?

Ilya Burkov [00:07:09]:
Yes, speed. Even though it's text data, if you're working with petabytes or exabytes, everything is going to slow down. If you make one mistake, you have to repeat the run, and those processes typically take anywhere from a few hours to a few weeks. So having the right tools in place significantly reduces your time to market as well. Between training rounds, the iteration period needs to be as short as possible to cut the time you spend on those processes. Nobody likes this work — it's not fun — but it needs to be done, because otherwise you're going to get rubbish coming out.

David Brühlmann [00:07:48]:
I'd like to touch upon another part of manufacturing, because we hear a lot about the factory of the future, and I know in bioprocessing a lot of people have talked about Industry 4.0. We're probably now at Industry 5.0 with all the AI and so on. What is the new trend there? How is bioprocessing going to evolve in the next few years?

Ilya Burkov [00:08:11]:
That's a great question. I love thinking about the future, and we don't know the answers, but I would say it's an AI-enabled biomanufacturing facility that is fully interconnected — and by interconnected I mean sensors across every unit operation, from cell culture to purification to fill-finish — a factory that feeds all of its real-time data into an AI system. These AI systems then analyze, predict, and optimize continuously, adjusting parameters autonomously to maintain the yield, quality, and consistency of the product being made.

It's a self-learning ecosystem where every process informs the next, and having that automated would be incredible. It transforms production into something much more predictive and responsive, where the system corrects mistakes or deviations in real time, before they impact production and before the facility wastes time and effort. So the factory of the future, for me, is much faster, smarter, and far more reliable than anything we see at the moment. Having those checkpoints in place to automate the system would be ideal.
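
As a toy illustration of the closed loop Ilya sketches (sensor readings stream in, a model predicts drift, setpoints are nudged within validated ranges), here is a minimal Python sketch. Every value and function is a stand-in; a real facility would layer GMP controls and human review on top of any autonomous adjustment.

```python
"""Toy sketch of the self-correcting loop in an AI-enabled facility:
read sensors -> predict drift -> adjust setpoints within validated bounds.
Illustrative only; a real plant adds GMP controls and human oversight.
"""
import random
import time

SETPOINT = {"temp_C": 36.8, "pH": 7.0}
VALIDATED_RANGE = {"temp_C": (36.5, 37.2), "pH": (6.8, 7.2)}


def read_sensors() -> dict:
    # Stand-in for the historian / OPC UA feed from each unit operation.
    return {k: v + random.uniform(-0.15, 0.15) for k, v in SETPOINT.items()}


def predict_drift(reading: dict) -> dict:
    # Stand-in for a trained model; here drift is just deviation from setpoint.
    return {k: reading[k] - SETPOINT[k] for k in reading}


def correct(drift: dict) -> None:
    # Damped correction, clamped to the validated operating range.
    for k, d in drift.items():
        lo, hi = VALIDATED_RANGE[k]
        target = min(max(SETPOINT[k] - 0.5 * d, lo), hi)
        print(f"adjust {k}: drift {d:+.3f} -> new target {target:.3f}")


if __name__ == "__main__":
    for _ in range(3):  # three control cycles
        correct(predict_drift(read_sensors()))
        time.sleep(1)
```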

David Brühlmann [00:09:39]:
Yeah, it's definitely going to be exciting to see where everything is going. It's very hard to predict — there's so much happening every day. Let's make this very practical now. Assume I'm the CEO of a startup and I want to make the best use of all the AI and other new technologies out there. Where should I start? It can be very overwhelming, and it can also be dangerous to jump on every train. There are so many things going on. What should I pursue, and where should I maybe wait?

Ilya Burkov [00:10:12]:
That's a great one as well. I've been asked that a lot, especially by teams that are just adopting AI and starting to work with it. The key, as I always say, is to start with clean and structured data, because even a single process unit, like a bioreactor, can generate very valuable insights if the data is well organized. From there, teams should focus on high-impact areas like feed strategies or media optimization, where small incremental improvements in process parameters can yield significant gains in the end product.

And you don't need a full AI team from day one. Start with the data you already have and set really clear, measurable goals. Don't try to optimize everything at once — it's impossible. Pick one process that is well instrumented and high value — for example, upstream cell culture or a purification step. Collect the historical and real-time data, apply simple predictive models, and use those insights to guide decisions. Keep it measurable and data-driven — I like to say data says a lot more than opinions — and use those measurable improvements to expand to other areas.

Small pilot projects are the fastest way to demonstrate a return on investment and build confidence, and AI is the tool that can amplify a lot of this work. That, I would say, is the best way to start: using AI to identify patterns and predict outcomes in one area. The scientists always remain in control; they interpret the results and guide the next steps. That's the collaboration that's needed between technology and scientists. And don't think you have to do everything at once — go step by step. Start somewhere, work with that, and progress.
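
As a minimal sketch of the "simple predictive model" step, under stated assumptions: the CSV of historical bioreactor runs and its column names are invented for illustration, and ridge regression stands in for whatever model a team would actually validate. A cross-validated score on one well-instrumented unit is exactly the kind of measurable, data-driven evidence Ilya recommends gathering before expanding to other areas.

```python
"""Minimal sketch of the 'start small' advice: fit a simple predictive
model on historical runs from one well-instrumented unit (a bioreactor).
The CSV layout and column names are illustrative assumptions.
"""
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Historical runs: one row per batch, process parameters plus the outcome.
df = pd.read_csv("bioreactor_runs.csv")
X = df[["temp_C", "pH", "feed_rate_mL_h", "do_percent"]]
y = df["final_titer_g_L"]

# Cross-validated fit: a measurable, data-driven baseline before scaling up.
model = Ridge(alpha=1.0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f} +/- {scores.std():.2f}")

# Fit on all runs and inspect which parameters move titer the most.
model.fit(X, y)
for name, coef in zip(X.columns, model.coef_):
    print(f"{name}: {coef:+.3f}")
```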

David Brühlmann [00:12:06]:
Yeah, exactly — start simple, start somewhere, keep going, and then improve as you progress. Add another layer as you grow.

Ilya Burkov [00:12:16]:
That's what science is. That's what science has been for centuries. You learn, you iterate, you repeat, and you improve.

David Brühlmann [00:12:23]:
Before we wrap up, Ilya, what burning question haven’t I asked that you think our biotech scientist community should hear about?

Ilya Burkov [00:12:32]:
Oh, that's an interesting one. I suppose you haven't asked much about using compute on-prem versus cloud. We can talk about that and the pros and cons of each system. You touched a little bit at the beginning on security and safety, but we also need to highlight that the level of scale is very different.

If someone is doing on-prem compute, they might have a section of their building filled with GPUs, with the required power to run them, and so on. But say they discover a process that makes things a lot easier for them, and now they need more compute — as they say in the industry, they want to “throw more compute at it.”

To do that in-house, they’d need to build the next building next door, host a few thousand more GPUs, connect all the power, make sure everything works, and get it all up and running. That process—from start to finish—would take months, if not years, for inexperienced groups, and it costs a lot of money. Millions of dollars are at stake there.

Whereas the other option is to keep the workload they’re comfortable running on-prem, and rely on the cloud for burst capacity or expansion needs—because those needs aren’t always constant. They may need 100 GPUs in-house, but then have a workload that needs 1,000 for half a year or a year, and after that period, they don’t need that many anymore.

If they build for the maximum in-house, it will be wasted—it won’t be used. So having the flexibility to scale up and down is key.

And I think a lot of companies are at a stage now where they compare the cost of building on-prem versus using cloud compute. Sometimes they don’t factor in everything. They think that if they buy it and use it, they won’t need to think about it later. But GPUs get outdated quickly. Every few years you get a newer, better version, and you’re essentially investing millions—if not hundreds of millions—into infrastructure that will be redundant in five years. So why take on all that burden and cost when you can rely on a cloud provider like Nebius?
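
To put rough numbers on that trade-off, here is a back-of-the-envelope cost model in Python. Every figure (GPU prices, hourly rates, horizons) is an illustrative placeholder, not a quote from Nebius or anyone else; the point is only the shape of the comparison between buying for peak demand and bursting in the cloud.

```python
"""Back-of-the-envelope model of the on-prem vs. cloud trade-off.
Every number below is an illustrative placeholder, not a real quote.
"""


def on_prem_cost(gpus: int, capex_per_gpu: float, years: float,
                 opex_per_gpu_year: float) -> float:
    # Buy for peak demand: capex up front plus power/cooling/ops every
    # year, whether or not the hardware is busy.
    return gpus * (capex_per_gpu + opex_per_gpu_year * years)


def cloud_cost(base_gpus: int, burst_gpus: int, burst_years: float,
               price_per_gpu_hour: float, years: float) -> float:
    # Pay per GPU-hour: a steady base fleet plus a temporary burst fleet.
    hours_per_year = 24 * 365
    return price_per_gpu_hour * hours_per_year * (
        base_gpus * years + burst_gpus * burst_years)


if __name__ == "__main__":
    # Scenario from the conversation: a steady need for 100 GPUs, plus a
    # 1,000-GPU workload for half a year, over a five-year horizon.
    print(f"on-prem, sized for peak: ${on_prem_cost(1100, 30_000, 5, 8_000):,.0f}")
    print(f"cloud, base + burst:     ${cloud_cost(100, 1000, 0.5, 2.0, 5):,.0f}")
```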

David Brühlmann [00:14:54]:
Great point. And that’s an important question more broadly—whether it’s computing, relying on a manufacturer, or relying on a CRO for analytics. What do you keep in-house and what do you outsource? These are crucial decisions to navigate. So, with everything we’ve covered today, Ilya, what is the most important takeaway?

Ilya Burkov [00:15:18]:
Well, the most important takeaway is that AI isn’t just a tool for faster experiments. It’s transforming how we develop, optimize, and manufacture biologics from start to finish. When integrated thoughtfully, it empowers scientists to improve quality, accelerate timelines, and ultimately help deliver therapies to patients faster.

AI doesn’t replace human expertise—it amplifies it. The key idea is that the future of bioprocessing is a partnership: humans guiding strategy and AI providing predictive insights and autonomous optimizations. Together, they make processes faster, smarter, and more reliable.

And if there’s only one thing to remember from this entire podcast, it’s that investing in clean data, AI-driven tools, and skilled teams today is how companies stay competitive tomorrow. It determines how they grow, how they expand. Organizations that embrace AI now will define the next generation of biomanufacturing. Those that don’t risk being overtaken by the ones that do.

David Brühlmann [00:16:36]:
Thank you, Ilya, for sharing your perspective on AI and where the industry is heading. I think it’s an important conversation. Where can people get a hold of you?

Ilya Burkov [00:16:48]:
Absolutely. LinkedIn would be the first place to start. Nebius.com as well—we have a dedicated Life Science and Healthcare section. Feel free to reach out on LinkedIn or via email—the usual channels.

David Brühlmann [00:17:02]:
All right, great. I’ll leave the information in the show notes. And Ilya, thank you very much for being on the show today.

Ilya Burkov [00:17:08]:
Thank you, David. It’s been really fun.

David Brühlmann [00:17:11]:
This wraps up our conversation with Ilya Burkov from Nebius AI. From predictive scale-up to autonomous production ecosystems, we’ve seen that AI isn’t replacing bioprocess scientists—it’s amplifying what you do best. The future of biologics manufacturing is happening now, and you are part of it.

If you found value here, please leave us a review on Apple Podcasts or wherever you’re listening from. Your support keeps this show going, so thank you so much. I’ll see you next time, and keep doing biotech the smart way.

All right, smart scientists—that’s all for today on the Smart Biotech Scientist Podcast. Thank you for tuning in and joining us on your journey to bioprocess mastery. If you enjoyed this episode, please leave a review on Apple Podcasts or your favorite podcast platform. By doing so, we can empower more scientists like you.
For additional bioprocessing tips, visit us at www.bruehlmann-consulting.com. Stay tuned for more inspiring biotech insights in our next episode. Until then, let’s continue to smarten up biotech.

Disclaimer: This transcript was generated with the assistance of artificial intelligence. While efforts have been made to ensure accuracy, it may contain errors, omissions, or misinterpretations. The text has been lightly edited and optimized for readability and flow. Please do not rely on it as a verbatim record.

Next Step

Book a free consultation to help you get started on any questions you may have about bioprocess development: https://bruehlmann-consulting.com/call

About Ilya Burkov

As the Global Head of Healthcare and Life Sciences Growth at Nebius AI, Ilya Burkov focuses on driving cloud adoption across the EMEA region. His background, which includes a PhD in Medicine and eight years in the life sciences sector, allows him to bridge the gap between complex healthcare challenges and advanced AI solutions.

His role involves planning and executing sophisticated projects to deliver genuine value to customers and partners. He is dedicated to maximizing growth, managing significant portfolios, and cultivating strong relationships with C-level executives. Ilya is passionate about leveraging strategic methods and data analysis to accelerate innovation and transformation in healthcare.

Connect with Ilya Burkov on LinkedIn.

David Brühlmann is a strategic advisor who helps C-level biotech leaders reduce development and manufacturing costs to make life-saving therapies accessible to more patients worldwide.

He is also a biotech technology innovation coach, technology transfer leader, and host of the Smart Biotech Scientist podcast—the go-to podcast for biotech scientists who want to master biopharma CMC development and biomanufacturing.  


Hear It From The Horse’s Mouth

Want to listen to the full interview? Go to Smart Biotech Scientist Podcast

Want to hear more? Do visit the podcast page and check out other episodes. 
Do you wish to simplify your biologics drug development project? Contact Us

Free Bioprocessing Insights Newsletter

Join 400+ biotech leaders for exclusive bioprocessing tips, strategies, and industry trends that help you accelerate development, cut manufacturing costs, and de-risk scale-up.


When you sign up, you'll receive regular emails with additional free content.
Most biotech leaders struggle to transform promising molecules into market-ready therapies. We provide strategic C-level bioprocessing expert guidance to help them fast-track development, avoid costly mistakes, and bring their life-saving biologics to market with confidence.