Today we're talking about one of the most frustrating bottlenecks in biosimilar development: how to efficiently screen a long list of quality-modulating compounds when your standard tools (OFAT on one end, a massive DoE on the other) both break down under the weight of the problem.
This is a two-part episode of the Smart Biotech Scientist Podcast. In Part 1, I'll walk you through the problem and the conceptual framework we developed to solve it. In Part 2, we go hands-on: how to actually build this in your lab.
Your Parallel Screening Playbook — From 17 Candidates to Process Winner
In Part 1, we covered why OFAT takes 12 months and misses interaction effects, why a single large DoE with 17 factors fails under dilution, masking, and combinatorial toxicity, and how the parallel group design sidesteps all three. We also covered the three-tool multivariate pipeline: PCA, Mahalanobis distance, and decision tree. Today we build it.
Section 1: How to Group Your Compounds
Let’s start with compound grouping, because this is where the design logic either holds together or falls apart.
The governing principle: group by biological mechanism, not by convenience.
Four rules.
One: maximum five factors per group.
This is a hard limit. Capping each group at five factors keeps dilution effects manageable, with at most five stock additions per well, and prevents a single dominant compound from masking everything else.
Two: include one to two anchor compounds in every group.
These are well-characterized compounds with documented glycan effects — compounds you trust to give a consistent, interpretable signal. In our study, manganese and asparagine served this role. Their effects are documented across many CHO cell lines, and they provide internal calibration points that enable cross-group comparison, even though the groups are independent experiments.
Three: separate known strong modulators.
If a compound drives a dominant glycosylation effect — for example, a mannosidase inhibitor pushing high mannose to 80–90% — it should be in its own group, or at least be the only potent modulator in that group. Otherwise, its signal overwhelms the others, and you lose most of the information those compounds could provide.
Four: don’t guess unknown mechanisms.
If you don’t know how a compound works, don’t guess. Run a short univariate screen in shake flasks — low, mid, high concentration. Measure the glycan response, then group by response pattern: compounds that increase high mannose together, compounds affecting sialylation together.
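If it helps to make the four rules concrete, here is a minimal Python sketch that checks a proposed group against them. The `Compound` fields, the `check_group` helper, the mechanism labels, and `compound_A`/`compound_B` are illustrative assumptions, not something taken from our study; only manganese as an anchor comes from the example above.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_FACTORS = 5  # rule one: hard limit per group


@dataclass
class Compound:
    name: str
    mechanism: Optional[str]  # None means the mechanism is unknown
    anchor: bool = False      # well-characterized calibration compound
    strong: bool = False      # known dominant modulator


def check_group(group: List[Compound]) -> List[str]:
    """Return the rule violations for one proposed group (empty list = OK)."""
    problems = []
    if len(group) > MAX_FACTORS:
        problems.append("more than five factors")
    n_anchor = sum(c.anchor for c in group)
    if not 1 <= n_anchor <= 2:
        problems.append("needs one to two anchor compounds")
    if sum(c.strong for c in group) > 1:
        problems.append("more than one strong modulator")
    unknown = [c.name for c in group if c.mechanism is None]
    if unknown:
        problems.append("screen first, mechanism unknown: " + ", ".join(unknown))
    return problems


# Hypothetical group: mechanism labels and compound_A/B are placeholders.
group = [
    Compound("manganese", "mechanism_1", anchor=True),
    Compound("compound_A", "mechanism_1", strong=True),
    Compound("compound_B", None),
]
print(check_group(group))
```

A check like this will not choose the groups for you. It only catches a proposed layout that breaks one of the rules before you commit plates to it.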
A note on scale.
The parallel group method is designed for when your candidate list is too large for a single clean experiment. With fewer than three compounds, OFAT or a simple dose–response is the right tool. With three to five, run a single multi-factor DoE. With six to nine, use two or three groups.
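As a rough sketch of that scale rule, assuming a hypothetical helper called `plan_design`: the group count it returns for larger lists is only a lower bound from the five-factor cap. Anchor compounds repeated across groups take up slots, so the real number of groups can be higher.

```python
import math


def plan_design(n_candidates, max_per_group=5):
    """Suggest a screening strategy and a minimum group count (sketch only)."""
    if n_candidates < 1:
        raise ValueError("need at least one candidate compound")
    if n_candidates < 3:
        return ("OFAT or simple dose-response", 0)
    if n_candidates <= max_per_group:
        return ("single multi-factor DoE", 1)
    # Lower bound only: repeated anchors can push the real count higher.
    return ("parallel groups", math.ceil(n_candidates / max_per_group))


for n in (2, 4, 8, 17):
    print(n, plan_design(n))
```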
Section 2: Concentration Range Selection
Concentration range selection is the step most teams underinvest in, and it’s the one that can invalidate entire experimental groups before you’ve analyzed a single result.
The problem: if your upper concentration level is too high for a potent compound, cells die in those wells. Dead cells produce no glycan data. And because data loss within a group is clustered (all conditions share the same plate), one mis-calibrated compound can corrupt 20–30% of a group’s wells before you’ve analyzed a single result.
Three things to do.
One: known potent compounds
For enzyme inhibitors, glycosidase inhibitors, and metabolic modulators, start from published concentration ranges in the literature. Set your upper bound conservatively. You can always extend upward in a follow-up. This is far cheaper than re-running a group.
Two: unknown or poorly characterized compounds
Don’t estimate. Run a preliminary dose–response study in shake flasks: three to five concentration levels over about one week. Identify where growth inhibition begins, then set your DoE upper bound below that threshold. This is the most important preparation step for the entire screen.
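Here is one way that rule could look in code, as a minimal sketch. The function name `doe_upper_bound`, the 0.9 relative-growth cutoff for "inhibition begins", and the example numbers are all assumptions for illustration; pick the cutoff that matches your own cell line and assay noise.

```python
def doe_upper_bound(levels, growth_rel, inhibition_threshold=0.9):
    """Pick the highest tested level below the onset of growth inhibition.

    levels: tested concentrations (any consistent unit)
    growth_rel: growth relative to the untreated control (1.0 = no effect)
    Returns None if even the lowest level is already inhibited.
    """
    safe = None
    for conc, growth in sorted(zip(levels, growth_rel)):
        if growth < inhibition_threshold:
            break  # inhibition begins here; stay below this level
        safe = conc
    return safe


# Hypothetical five-level dose-response from a one-week shake flask study.
levels = [1, 5, 10, 50, 100]
growth_rel = [1.00, 0.98, 0.95, 0.70, 0.30]
print(doe_upper_bound(levels, growth_rel))
```

If the function returns `None`, the compound is already inhibitory at your lowest level, and the dose-response study needs to be repeated at lower concentrations before the compound goes into a group.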
Three: pre-qualify osmolality
Multiple stock additions can increase medium osmolality in non-obvious ways. This is a hidden confounder that affects both cell growth and glycosylation independently of compound effects. Measure or estimate osmolality for each condition before the screen, and adjust sodium chloride to bring all conditions back to the same target. We saw this specifically with raffinose, a trisaccharide that meaningfully contributes to osmolality at effective concentrations. We covered this in Episode 227.
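The osmolality adjustment is simple arithmetic, sketched below under stated assumptions. The conversion factor of about 1.86 mOsm/kg per mM of sodium chloride (two ions times an osmotic coefficient of roughly 0.93) is an approximation, the helper `nacl_to_add` and the example values are hypothetical, and the sketch only adds salt to bring lower conditions up to the highest one. It ignores the small dilution from the NaCl stock itself, so re-measure after adjusting.

```python
NACL_MOSM_PER_MM = 1.86  # assumption: ~2 ions x osmotic coefficient ~0.93


def nacl_to_add(measured, target=None):
    """Return mM NaCl to add per condition to reach a common osmolality.

    measured: {condition: osmolality in mOsm/kg}
    target: common target; defaults to the highest measured value.
    """
    if target is None:
        target = max(measured.values())
    additions = {}
    for condition, osm in measured.items():
        if osm > target:
            raise ValueError(f"{condition} is already above the target")
        additions[condition] = (target - osm) / NACL_MOSM_PER_MM
    return additions


# Hypothetical measurements for two conditions.
measured = {"reference": 300.0, "high_raffinose": 330.0}
for condition, mm in nacl_to_add(measured).items():
    print(f"{condition}: add {mm:.1f} mM NaCl")
```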
Section 3: Running the 96-Well Screen
A few execution details determine whether your data is trustworthy or misleading.
Evaporation. Use vented lids and validate your evaporation correction. In a 96-well plate, edge wells evaporate more than center wells. This creates position-dependent changes in medium concentration and osmolality, which can generate false biological signals if uncorrected. Measure evaporation rates across the plate layout and incorporate corrections into data processing.
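A minimal sketch of what an evaporation correction can look like, assuming you have measured the fractional volume loss for edge and center wells. The helper names and the loss values are hypothetical. The correction rests on one assumption: the amount of solute is unchanged, so a concentration measured in a shrunken volume is scaled back to the nominal volume by the factor (1 - loss).

```python
def is_edge_well(well, rows="ABCDEFGH", n_cols=12):
    """True for wells in the outer ring of a 96-well layout, e.g. 'A5'."""
    row, col = well[0], int(well[1:])
    return row in (rows[0], rows[-1]) or col in (1, n_cols)


def correct_for_evaporation(measured_conc, volume_loss_fraction):
    """Scale a measured concentration back to the nominal well volume."""
    if not 0 <= volume_loss_fraction < 1:
        raise ValueError("volume loss must be in [0, 1)")
    return measured_conc * (1 - volume_loss_fraction)


# Hypothetical measured losses; replace with your own plate validation data.
LOSS = {"edge": 0.10, "center": 0.04}

for well, conc in {"A5": 1.10, "D6": 1.04}.items():
    loss = LOSS["edge"] if is_edge_well(well) else LOSS["center"]
    print(well, round(correct_for_evaporation(conc, loss), 3))
```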
Liquid handling. Use robotics. Stock solution additions are often in the low microliter range, where manual pipetting error can exceed the biological effects you are trying to detect. If robotics are not available, that is a real limitation of the method.
Reference wells. Include replicate reference-condition wells distributed across both plates, not clustered in one area. These provide a noise estimate and a check for positional effects.
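To show how the reference wells turn into a noise estimate, here is a small sketch. The function `reference_noise`, the plate and well labels, and the readings are made up for illustration. It reports the overall coefficient of variation and the mean per plate, so a plate-to-plate shift stands out.

```python
from statistics import mean, stdev


def reference_noise(readings):
    """readings: list of (plate, well, value) for reference-condition wells.

    Returns (overall CV in percent, {plate: mean value}).
    """
    values = [value for _, _, value in readings]
    cv_percent = 100 * stdev(values) / mean(values)
    by_plate = {}
    for plate, _, value in readings:
        by_plate.setdefault(plate, []).append(value)
    plate_means = {plate: mean(vals) for plate, vals in by_plate.items()}
    return cv_percent, plate_means


# Hypothetical readings of one glycoform (%) in distributed reference wells.
readings = [
    ("plate1", "B2", 10.0), ("plate1", "G11", 12.0),
    ("plate2", "C7", 11.0), ("plate2", "F4", 11.0),
]
cv, plate_means = reference_noise(readings)
print(f"CV = {cv:.1f}%", plate_means)
```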
Feeding. Your feed schedule must exactly replicate your production process. Changes in feed timing or composition shift the metabolic baseline, which shifts the glycan profile. Without this control, you are comparing against a non-representative system.
Scale-down validation. Before the full screen, run your reference condition in both 96-deep-well format and shake tube format, then compare glycan profiles. If they do not track closely, your screening results will not translate into confirmation experiments. Everything downstream depends on this correlation holding true.
Section 4: Making the Statistics Accessible
I want to address the statistics barrier directly. You don’t need a bioinformatics background. These tools are built into the standard statistical software most bioprocess labs already have access to.
What you need is to understand what each tool is doing.
PCA. Run it on your full glycan dataset. The goal is to compress your 13-dimensional quality space into two or three dimensions you can visualize on a score plot.
How many components to retain: look at the scree plot — cumulative variance explained versus component number. Keep components up to the natural inflection point. In our study, three components explained 76% of total glycan variance.
Then project the reference product as an external point onto the same score plot. Conditions close to that reference point are your candidates. Conditions far away are not.
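Your statistics package does the PCA itself, but the two steps around it are easy to sketch: reading cumulative variance off the eigenvalues, and projecting the reference product as an external point. Everything below is a sketch under assumptions. The function names and all numbers are hypothetical, and the projection assumes the PCA was fitted on autoscaled data, so the reference is centered and scaled with the screening data's means and standard deviations.

```python
def cumulative_variance(eigenvalues):
    """Cumulative fraction of variance explained, largest component first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running, fractions = 0.0, []
    for ev in ordered:
        running += ev
        fractions.append(running / total)
    return fractions


def project(sample, means, sds, loadings):
    """Project an external sample (e.g. the reference product) onto the PCs.

    means, sds: per-glycoform statistics of the screening data
    loadings: one list of weights per retained component
    """
    z = [(x - m) / s for x, m, s in zip(sample, means, sds)]
    return [sum(zi * w for zi, w in zip(z, component)) for component in loadings]


# Hypothetical eigenvalues exported from the PCA (not our study's values).
print(cumulative_variance([4.0, 2.0, 1.0, 0.5, 0.5]))

# Hypothetical two-glycoform example with a single retained component.
print(project([2.0, 4.0], means=[1.0, 2.0], sds=[1.0, 2.0], loadings=[[0.6, 0.8]]))
```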
Mahalanobis distance. Calculated in your PCA score space. It converts visual closeness into a single number per condition: how far each condition is from the reference product in the full glycan space.
Unlike Euclidean distance, it accounts for correlations between glycoforms — and glycoforms do not change independently.
You then rank all conditions by distance, smallest first, and take the closest 20–25%. These are your shake tube confirmation candidates.
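The distance and ranking steps can be sketched in a few lines. One assumption makes this simple: if the PCA was fitted on the same screening conditions, the scores are uncorrelated, so the covariance matrix in score space is diagonal and the Mahalanobis distance reduces to a variance-weighted distance per component. The function names, the 25% cutoff, and the numbers are illustrative only.

```python
import math


def mahalanobis_in_score_space(scores, reference, variances):
    """Distance from one condition to the reference in PCA score space.

    variances: per-component score variances from the fitted PCA.
    Assumes uncorrelated scores (diagonal covariance), see the note above.
    """
    return math.sqrt(
        sum((s - r) ** 2 / v for s, r, v in zip(scores, reference, variances))
    )


def top_fraction(distances, fraction=0.25):
    """Return the conditions closest to the reference (smallest distance)."""
    ranked = sorted(distances, key=distances.get)
    n_keep = max(1, math.floor(len(ranked) * fraction))
    return ranked[:n_keep]


# Hypothetical scores on two components for four conditions.
reference = (0.5, -0.2)
variances = (4.0, 1.0)
scores = {
    "cond_1": (0.7, 0.1), "cond_2": (3.0, 1.5),
    "cond_3": (-2.0, -0.4), "cond_4": (0.4, 2.0),
}
distances = {
    name: mahalanobis_in_score_space(s, reference, variances)
    for name, s in scores.items()
}
print(top_fraction(distances))
```

If your software computes the distance from a full covariance matrix instead, the ranking logic stays the same and only the distance function changes.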
Decision tree. Takes your Mahalanobis rankings and identifies which compound levels reliably predict whether a condition ends up in the top group or bottom group.
Two non-negotiable rules: cross-validate (sevenfold is standard) and prune the tree after validation. An unvalidated, unpruned tree will overfit your data and produce rules that only work on the conditions you already ran.
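To show what cross-validation does here, the sketch below uses a deliberately simplified stand-in: a one-split tree (a decision stump), which is about as pruned as a tree can get, scored with sevenfold cross-validation. It is not the tree routine from our study. The function names, the coded factor levels, and the labels are hypothetical, and in practice you would use your statistics package's tree with its own pruning options.

```python
def fit_stump(X, y):
    """Find the single split (feature, threshold, label_above) that best
    separates top-group (1) from bottom-group (0) conditions."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for above in (0, 1):
                hits = sum(
                    (above if row[j] > t else 1 - above) == label
                    for row, label in zip(X, y)
                )
                if best is None or hits > best[0]:
                    best = (hits, j, t, above)
    if best is None:
        raise ValueError("no factor varies in the training data")
    return best[1:]


def predict(stump, row):
    j, t, above = stump
    return above if row[j] > t else 1 - above


def cross_validate(X, y, k=7):
    """k-fold accuracy: each condition is predicted by a stump that never saw it."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        stump = fit_stump([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(stump, X[i]) == y[i] for i in test)
    return correct / len(X)


# Hypothetical data: 14 conditions, two factors coded -1/+1.
# Label 1 = condition ranked in the top group by Mahalanobis distance.
X = [[(-1) ** i, 1 if i % 4 < 2 else -1] for i in range(14)]
y = [1 if row[0] == 1 else 0 for row in X]

j, t, above = fit_stump(X, y)
print(f"IF factor_{j} > {t} THEN group {above} ELSE group {1 - above}")
print("cross-validated accuracy:", cross_validate(X, y))
```

The point of the cross-validated number is the comparison: if accuracy on held-out conditions is much lower than on the conditions used to fit the rule, the rule describes your plate, not your process.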
The output you want is simple, interpretable if–then rules you can explain to any stakeholder.
No black box.
Section 5: Three Things I'd Do Differently Today
Three specific things I would change if I were running this study today.
First: pre-qualify every unknown compound with a dose-response study before the screen. This is about flagging compounds that could cause cell death at your intended concentration range before you commit them to a group. One week in shake flasks, three to five levels. If growth inhibition appears, you adjust your DoE bounds before the screen, not after you’ve lost 30% of a group’s wells. The cost-benefit is clear in retrospect.
Second: measure beyond your primary quality target. We tracked glycan profile as our primary CQA, but we also monitored aggregation and charge variants across all conditions. That broader analytical panel revealed quality effects that glycan data alone would not surface. Whatever your primary target is, add at least two secondary quality readouts from Day 1. You are already running the cells and doing the analytics — the incremental cost is small, and occasionally you find effects that change which condition you advance.
Third: integrate hybrid modeling at two stages. At initial screen design, if you have historical bioprocess data, a hybrid model combining mechanistic understanding with machine learning can predict which concentration ranges and combinations are most informative, allowing you to design a better experiment before running any wells. At the shake tube validation stage, the model identifies the minimum set of conditions needed to confirm results, reducing the number of shake tube runs and therefore analytical cost and workload.
The Bigger Lesson
Here's the mindset I want to leave you with.
Process development is an information problem, not a time problem. As you are starting out, you may move slowly because you generate information slowly — one experiment, one question, one answer at a time. The parallel group method changes that by letting you ask multiple questions simultaneously, in the same calendar window as asking one. The multivariate analysis pipeline changes it further: instead of extracting one data point per condition, you extract the full picture — all 13 glycoforms, ranked and explained, against a single target.
The teams that reach IND fastest are not running more experiments. They're running smarter ones. Rational compound grouping, parallel execution, multivariate selection.
This framework extends beyond media optimization. Clone selection, feed development, process characterization, scale-up decisions — wherever you have too many variables for sequential testing and too many interactions for a single large experiment, this logic applies.
Ask better questions. Ask them in parallel. Let the data lead. That's what smarter biotech looks like.
Further Reading
The full peer-reviewed paper: D. Brühlmann et al., "Parallel Experimental Design and Multivariate Analysis Provides Efficient Screening of Cell Culture Media Supplements to Improve Biosimilar Product Quality," Journal of Biotechnology.
Further Listening
Episodes 05 - 06: Hybrid Modeling: The Key to Smarter Bioprocessing with Michael Sokolov
Episodes 173 - 174: Mastering Hybrid Model Digital Twins: From Lab Scale to Commercial Bioprocessing with Krist Gernaey
Episodes 99 - 100: From Raw Data to Actionable Insights: Unlocking the Power of Process Models with Fabian Feidl
Episodes 137 - 138: Skip 90% of Bioreactor Runs: The In Silico Revolution in Bioprocess Development with Yossi Quint
Next Step
If you found value in today’s episode, take a moment to like, follow, and leave a review on Apple Podcasts or your favorite platform—it helps us reach and support more scientists like you.
Thanks for tuning in to the Smart Biotech Scientist podcast and being part of this journey toward bioprocess mastery. For more insights and practical tips, visit www.smartbiotechscientist.com.
David Brühlmann is a strategic advisor who helps C-level biotech leaders reduce development and manufacturing costs to make life-saving therapies accessible to more patients worldwide.
He is also a biotech technology innovation coach, technology transfer leader, and host of the Smart Biotech Scientist podcast—the go-to podcast for biotech scientists who want to master biopharma CMC development and biomanufacturing.

