Strain Design by Wishful Thinking

Context

Economic challenge

One of the big challenges precision fermentation faces worldwide is price parity with conventional products, in other words its technoeconomics. For precision-fermented (PF) food, 30% of US consumers would pay 25% more for PF products (Hartman Group, 2023). In Germany, 14% of consumers are willing to pay more for PF food (Kühl et al., 2024). In the UK, the potential market share of PF foods is projected to fall from 22% to 2% when their price doubles relative to conventional products (Slade & Thomas, 2023).

Designing more productive strains is one crucial step toward better PF economics. That means making more product from less feedstock (higher yield) in less time (higher productivity). In practice, it means reaching a higher titer (product concentration) within a defined bioprocess time.
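To make the three metrics concrete, here is a minimal sketch that computes them from end-of-batch measurements. The function name, the units, and the example numbers are illustrative assumptions, not data from a real process.

```python
def fermentation_metrics(product_g, volume_l, substrate_consumed_g, duration_h):
    """Return (titer, yield, productivity) for one batch.

    titer        : product concentration, g/L
    yield        : product formed per substrate consumed, g/g
    productivity : volumetric productivity, g/L/h
    """
    titer = product_g / volume_l
    product_yield = product_g / substrate_consumed_g
    productivity = titer / duration_h
    return titer, product_yield, productivity


# Hypothetical batch: 50 g product in 10 L, 500 g sugar consumed, 48 h run
titer, product_yield, productivity = fermentation_metrics(50.0, 10.0, 500.0, 48.0)
print(f"titer={titer:.2f} g/L, yield={product_yield:.2f} g/g, "
      f"productivity={productivity:.3f} g/L/h")
```

Because productivity is titer divided by run time, fixing the bioprocess time makes titer the number to push up.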

Strain design is complex for several reasons:

It is unclear what the bottlenecks for titer are
The bottlenecks are distributed across hundreds to thousands of interacting cellular processes
The number of possible combinations of gene edits is vast, making it infeasible to test them all
Strain behavior varies from small to commercial reactors
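A quick count shows how fast the number of gene-edit combinations grows. This is a rough sketch that assumes a genome of 4,000 editable genes (a placeholder, not a figure from any specific organism) and three edit types per gene: knockout, activation, and repression.

```python
from math import comb


def count_designs(n_genes, n_edits, n_edit_types=3):
    """Number of designs that edit exactly n_edits distinct genes,
    each with one of n_edit_types edit types."""
    return comb(n_genes, n_edits) * n_edit_types ** n_edits


# Assumed genome size of 4,000 genes; adjust for your organism
for k in (1, 3, 5, 10):
    print(f"{k:>2} edits: {count_designs(4000, k):.3e} designs")
```

Even a handful of simultaneous edits puts exhaustive testing out of reach, which is why the search has to be guided.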

Let's focus on the first three challenges for now.

Strain design challenge

We want to design a microbe that expresses a protein product that gets secreted at high titers. We're allowed to make gene edits to the chassis strain, including gene knockouts, activation and repression. Let's assume the protein sequence is fixed, as we need to stay within the natural products space.

As computational biologists, how would we address this design challenge?

We're essentially writing a program that executes a strain design algorithm. So we can take inspiration from software engineering, which also builds complex programs with many moving parts and layers of dependencies.

Computational Strain Design by Wishful Thinking

This approach is inspired by Programming by Wishful Thinking. Put simply: design your program imagining you already have any function you wish to solve your problem. This simplifies code that uses those functions, even if the functions themselves are complex (c2.com).

Despite the name, wishful thinking does not imply a lack of (scientific) rigor. Rather, it provides a useful framework for approaching complex problems by clarifying the logic flow before implementing the technical details. It ensures we first pin down the outcomes we want and only then choose the best way to achieve them.

This is a useful way to design strains with code. Here's how we can apply this framework to our strain design challenge.

Start from the End

def suggest_strain_modifications(protein, organism_model):
  """ Ultimately, we want strain design recommendations. """

Ultimately, we want the program to recommend how to modify our strain to maximize protein titer.

Next, let's call on required functions, imagining they're already implemented.

Implement your strain design approach

We'll use a simple iterative approach: make several gene edits, culture the strain, and measure protein titer. Then repeat, adding or removing gene edits in each round, until titer stops improving. It's essentially trial and error, but sped up roughly 1,000x because everything is simulated on a computer.

Here's a barebones implementation:

def suggest_strain_modifications(protein, organism_model, max_iter=100):
  """ Ultimately, we want strain design recommendations. """

  # Initial titer is zero
  final_titer = 0
  # No gene edits initially
  final_gene_edits = []

  # Iterate until max iterations
  for _ in range(max_iter):
    # Suggest new gene edits starting from the best design so far,
    # then predict the titer of the edited strain
    gene_edits = suggest_gene_edits_for_max_titer(
      current_edits=final_gene_edits, number_of_edits=3)
    titer = predict_titer(protein, organism_model, gene_edits)

    if titer > final_titer:
      # Keep new design if better than previous one
      final_titer = titer
      final_gene_edits = gene_edits
    else:
      # If no improvement found, stop and return best strategy so far
      break

  return final_gene_edits, final_titer

Wishful functions to fill later

The implementation above calls two key functions. We define their signatures now and defer the details, listing in each docstring what the function will need.

def predict_titer(protein, organism_model, gene_edits):
  """ Will need:
  - simulate titer given protein sequence and organism model
  - simulate protein expression
  - simulate protein secretion / translocation
  - account for gene edits to modify organism model
  - dynamic simulation to predict titer (not just yield)
  """
  pass

def suggest_gene_edits_for_max_titer(current_edits, number_of_edits):
  """ Will need:
  - optimization method with gene edits as decision variables
  - number of edits allowed
  """
  pass
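Before filling in these functions, it helps to settle how a gene edit is represented. The sketch below is one possible choice, not a fixed design: `GeneEdit`, `EDIT_FACTORS`, `apply_gene_edits`, the gene names, and the fold-change values are all hypothetical, and the "organism model" is reduced to a dictionary of relative expression levels.

```python
from dataclasses import dataclass

# Assumed effect of each edit type on relative expression (illustrative values)
EDIT_FACTORS = {"knockout": 0.0, "repression": 0.5, "activation": 2.0}


@dataclass(frozen=True)
class GeneEdit:
    gene: str
    kind: str  # one of the keys in EDIT_FACTORS


def apply_gene_edits(expression, gene_edits):
    """Return a copy of a toy model (gene -> relative expression) with edits applied."""
    edited = dict(expression)  # copy so the wild-type model stays untouched
    for edit in gene_edits:
        if edit.gene not in edited:
            raise KeyError(f"unknown gene: {edit.gene}")
        edited[edit.gene] *= EDIT_FACTORS[edit.kind]
    return edited


wild_type = {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0}
edits = [GeneEdit("geneA", "knockout"), GeneEdit("geneB", "activation")]
edited_model = apply_gene_edits(wild_type, edits)
print(edited_model)
```

Keeping edits as small immutable records makes it easy to add, remove, or compare them between rounds, and returning a copy means every candidate design is evaluated against the same unedited baseline.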

Next steps

Now that the logic flow and design approach are defined, here are the next steps.

Implement the core functions
Add supporting functions
Verify everything with test data
Run the program with real data

When implementing the core functions, you have a choice of simulators and optimization algorithms. Options include flux balance analysis, multi-scale simulators, dynamic bioprocess simulation, data-driven surrogate machine learning models, bilevel optimization algorithms for strain design, and more.
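As a taste of the dynamic bioprocess simulation option, here is a minimal sketch of a batch culture with Monod growth and growth-associated product formation, integrated with a simple Euler scheme. The function name and all parameter values are illustrative assumptions. A real `predict_titer` would derive rates from the organism model and the gene edits.

```python
def simulate_batch(x0=0.1, s0=20.0, mu_max=0.4, ks=0.5, y_xs=0.5,
                   y_px=0.2, t_end=48.0, dt=0.01):
    """Euler integration of a batch culture.

    x: biomass (g/L), s: substrate (g/L), p: secreted product (g/L).
    Returns a list of (time_h, x, s, p) tuples, one per time step.
    """
    x, s, p = x0, s0, 0.0
    trajectory = [(0.0, x, s, p)]
    n_steps = round(t_end / dt)
    for step in range(1, n_steps + 1):
        mu = mu_max * s / (ks + s)       # Monod specific growth rate, 1/h
        dx = mu * x * dt                 # biomass formed this step
        dx = min(dx, s * y_xs)           # cannot use more substrate than remains
        x += dx
        s = max(s - dx / y_xs, 0.0)      # substrate consumed for growth
        p += y_px * dx                   # product formed in proportion to growth
        trajectory.append((step * dt, x, s, p))
    return trajectory


trajectory = simulate_batch()
for hours in (8, 12, 24):
    t, x, s, p = trajectory[round(hours / 0.01)]
    print(f"t={t:5.1f} h  biomass={x:5.2f}  substrate={s:5.2f}  titer={p:5.2f} g/L")
```

Reading the trajectory at a fixed time is what separates titer from yield: two strains with the same yield can reach very different titers at the time the process ends.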

We'll discuss these points in a subsequent article.

Sidenotes

The trial-and-error approach can be improved by formulating it as one big optimization problem. For many gene edits (10 or more), we'd likely need scalable methods such as iterative local search (Lun et al., 2009) or linearization techniques (Yang et al., 2011).
Protein expression and secretion can be simulated using multiscale models of metabolism and macromolecular expression (ME models) (Lloyd et al., 2018).
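To give a feel for what a local search over gene edits looks like, here is a generic hill-climbing sketch. It is not the method of Lun et al. (2009), and the gene names, gains, and interaction penalty in `toy_score` are made up to stand in for a real titer simulator.

```python
import random


def local_search(candidate_genes, score, max_edits=3, n_rounds=200, seed=0):
    """Hill-climb over sets of edited genes.

    Each round proposes adding, removing, or swapping one gene and
    keeps the proposal only if the score improves.
    """
    rng = random.Random(seed)
    best = frozenset()
    best_score = score(best)
    for _ in range(n_rounds):
        proposal = set(best)
        move = rng.choice(["add", "remove", "swap"])
        if move in ("remove", "swap") and proposal:
            proposal.remove(rng.choice(sorted(proposal)))
        if move in ("add", "swap") and len(proposal) < max_edits:
            proposal.add(rng.choice(candidate_genes))
        proposal = frozenset(proposal)
        proposal_score = score(proposal)
        if proposal_score > best_score:
            best, best_score = proposal, proposal_score
    return best, best_score


# Hypothetical per-gene titer gains, standing in for a simulator
GAINS = {"g1": 1.0, "g2": 0.8, "g3": 0.5, "g4": -0.3, "g5": 0.2}


def toy_score(edits):
    total = sum(GAINS[g] for g in edits)
    if {"g1", "g2"} <= edits:
        total -= 1.2  # assumed negative interaction between g1 and g2
    return total


best_edits, best_score = local_search(sorted(GAINS), toy_score)
print(sorted(best_edits), round(best_score, 2))
```

The interaction penalty is there to show why edits cannot be ranked one at a time: the two best single edits are not the best pair.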
