NO.157 Formal methods for the synthesis of biomolecular circuits

Shonan Village Center

September 2 - 6, 2019 (Check-in: September 1, 2019 )


  • François Fages
    • Inria Saclay IdF, France
  • Katsumi Inoue
    • National Institute of Informatics, Japan
  • Heinz Koeppl
    • Technische Universität Darmstadt, Germany
  • Chris Myers
    • University of Utah, USA
  • Yoshihiro Shimizu
    • RIKEN Center for Biosystems Dynamics Research, Japan


Description of the meeting

The proposed seminar will bring together researchers from synthetic biology, from formal methods for synthesis and verification and from biophysical modeling and engineering in order to think about new ways of rationally building synthetic biomolecular circuits. The seminar will be highly interdisciplinary.

1.1 Topics of the seminar

Synthetic biology advocates an engineering approach to biology and hence it comes at no surprise that concepts from traditional electrical engineering, computer science and control theory resurface in this field. The use of these concepts appears justified even though the biochemical computing substrate of synthetic biology is very different from that of electronics. Key features of engineered systems such as modularity and composability appear desirable also in synthetic biology. Large libraries of characterized promoters, terminators, insulator sequences, transcription factors and so forth have been generated and these elements were composed semi-rationally into more complex circuits. Often a screening approach is taken, where different circuit variants (e.g. different order of genes on a plasmid, different insulator sequences) are generated and characterized experimentally in order to determine functional variants. For circuit design in synthetic biology one ultimately aims at providing a formal behavioral specification to a software tool and receiving the optimized sequence of DNA of the desired circuit as an output.

Recently successful attempts have been made in this direction by automatically generating DNA templates of combinational circuits according to a behavioral specification given in terms of the classical hardware description language Verilog [1]. Gates from a library of characterized ​E. coli ​NOR gates were composed to generate around 60 different logic circuits. Another group managed to compose a 12-input combinational circuit using RNA-based regulation [2]. The generation of such library and the rules of which gates are compatible with which other gates, however, are most often informed by trial-and-error screens that were done beforehand. In other words, for each gate configuration it was tested whether the abstraction into a (in this case) simple Boolean operation is permissible under the specific conditions. Whether such abstraction remains permissible upon addition of an new element to the library or upon changing the cellular microenvironment of the host cell cannot be answered and requires new tests. With the increase of the library size and the complexity of the desired circuit such procedures soon become infeasible. Biophysical models can be used to predict such host-circuit inferences and gate-gate crosstalks and to ultimately guide us what level of abstraction is permissible.

Given this state of affairs, we believe that synthetic biology can profit from concepts, and eventually methods, of EDA and program synthesis from computer science that have reached a considerable sophistication over the last decades (see e.g. [3]). We identify the following key aspects that we want to investigate during the seminar and that will also define working groups and according to which we are roughly classifying invitees (see Section 3).

1. Abstraction of biophysics (AB).

​Digital circuit design proved to be so scalable because the Boolean abstraction of devices that are governed by real-valued semiconductor physics was so robust and accurate. In synthetic biology, such robust abstraction has not been defined. Put differently, we were yet not able to optimize biomolecular parts such that they permit a robust Boolean abstraction. We expect that new abstractions will be necessary that can incorporate context-dependencies effects through more expressive interfaces at the part and circuit level. In order to efficiently study this problem, will require wet-lab and dry-lab synthetic biologists to team up with experts in synthesis of digital circuits, cyber-physical systems and of software.

2. Specification languages (SL).

Based on the above considerations, traditional description languages of digital circuits such as Verilog will be insufficient to specify circuit behavior even if the desired circuit should perform Boolean computation. For instance, additional specification may capture the host’s cellular state under which such operations are allowed. Beyond the plain Boolean formalisms, temporal logics such as LTL (linear temporal logics) or STL (signal temporal logic) have already been used to specify behavior of circuits in systems and synthetic biology. While such languages are powerful enough to express most behaviors required in synthetic biology, they lack features to express the context-dependency of circuits. For instance, one may wish to specify that the an LTL specification must be met without allocating host cellular resources (e.g. ATPs, ribosomes) beyond a certain threshold. In software engineering resource consumptions have been built-in in languages such as ABS (abstraction behavioral specification). Hence, to date, no framework exists that meet the requirements of synthetic biology and the proposed seminar will provide the ideal setting to come up with new ideas. Among the invitees are also founders of SBOL (Synthetic Biology Open Language) that emerges as a standard to specify the parts of a biomolecular circuits. In order to ultimately allow for automatic synthesis tools and adoption in the community any envisioned behavioral specification language needs to be developed in tight coordination with SBOL.

3. Synthesis methods (SM).

The problem of synthesis in synthetic biology has for now been limited to either brute-force sampling and testing of all circuit topologies and parameters or just to the parameter synthesis problem alone. For the latter, first attempts have been made where Markov chain Monte Carlo (MCMC) sampling schemes are guided by the degree violation of a LTL formula with real-valued quantifiers. Parameter synthesis can greatly profit from advanced techniques in machine learning of which we have representatives on our invitation list.

On the other hand, formal methods for the synthesis of hardware, software and embedded cyber-physical systems advanced considerably in the last decade [3]. Deductive methods (i.e. SAT or SMT solvers) were combined with inductive methods that resort to machine learning techniques to arrive at candidate models from labelled behavioral examples. It is the goal of this seminar to investigate which of the concepts and methods developed in this domain can be transferred to the synthetic biology design process.

1.2 Goals of the seminar

The goal of the seminar is to initiate new research threads for the design automation within synthetic biology. We believe that an exchange with the formal methods community will be essential to move this field forward in the sense that practical design cycle will become shorter and that biology eventually really becomes engineerable as promised by synthetic biology.

Synergies and research questions. Fine-grained biophysical of biomolecular circuits are often based on continuous-time Markov chains that are also subjects of considerable research efforts in the domain of probabilistic model checking. Hence, even for the aspect AB we expect synergies due to the presence of computational modelers and formal methods researchers from this subdomain. For aspects ​SL and ​SM it is evident that dry-lab and wet-lab synthetic biologist need to engage with formal methods researchers to generate new solutions to those standing problems. The list of invitees also include the main drivers of computer-aided design tools for synthetic biology (e.g. Peccoud/GenoCAD, Densmore/Cello) and we expect an interesting exchange of ideas between those and researchers from the formal methods domain. Exemplary questions to be addressed are:

1. What type of biomolecular circuits are currently in use and promising and what is their target behavioral class (e.g. Boolean, combinational, sequential, analog)?

2. What are appropriate abstractions of such circuits and their behavior and how to engineer components (e.g. through insulation or orthogonality) to make them adhere to a certain desired abstraction level?

3. What are behavioral specification languages that are powerful enough to express stochasticity and context-dependency, intuitive enough to be adopted by the synthetic biology community but yet formal enough to be the starting point for formal synthesis?

4. How do we extend machine learning techniques for parameter synthesis to circuit synthesis and how do we incorporate deductive methods into the synthetic biology design flow?

Impact on research community​.

​The seminar will stimulate new research in the field of synthetic biology design automation and raise the awareness in the formal methods community of the many challenging open problems in this field. Existing synergistic efforts between formal methods and systems biology (e.g. see CMSB – International Conference on Computational Methods in Systems Biology) could, as a result, include new synthetic biology problem statements. The results of the seminar will also have an immediate impact on the current synthetic biology LOEWE cluster CompuGene1 at Technische Universität Darmstadt.

The community needs to gear toward real-life applications of synthetic circuits. The days when modeling and subsequent experimental demonstration of a two-input AND gate in ​E. coli are becoming a thing of the past. While these fundamental developments have set the stage for the current state of the art, the next phase of computer-aided design must be able to generate circuits of substantial utility in application ranging from biomanufacturing and metabolic engineering to environmental monitoring, diagnostic tools and novel therapies. The success metrics are the ability to dramatically cut down the time it takes to develop a product. The members of modeling community will hear from the experimental practitioners about specific unmet needs in circuit engineering and will be able to channel their efforts toward tangible goals with clear success criteria. We believe that this (re)assessment of the goals by the community will be critical in order to move the field of synthetic biology to the forefront of real-word biotechnology and biomedicine.

Societal impact​.

Synthetic biology in the sense of large-scale genetic engineering and manipulation of biological systems, is becoming widespread in many biotechnological applications, such as the production of fine chemicals or biofuels as well as in biomedical applications such as bioanalytics, diagnostics, and therapies. However, successful products, such as the yeast strain producing Artemisin (Keasling lab) or the strain producing a biofuel Farnesin (Amyris) are still a product of years if not decades of labor and hundreds of millions of investment dollars. Very high anticipated costs of the CAR-T cell therapies and of the first approved gene therapies (Glybera, T-Vec) further reinforce this pattern. In order to deliver successful applications within reasonable timeframe and budget, the main hurdle remains the establishment of a rational design pipeline to provide a small number of robust solutions in an expedient and reproducible manner. Therefore the topic of the seminar is very timely and essential to make synthetic biology into a major driver of economic development.

Relevance for mainstream media.

​Recent developments of technologies that manipulate or recreate genomes have received widespread media coverage, which is in part driven by ethical concerns over the use of such technologies, particularly over the uncertainty of unanticipated consequences of genome editing. For example, the CRISPR/Cas9 technology brings unprecedented potential to make targeted changes to DNA sequences, which is already accelerating fundamental research, and could also be used to treat genetic diseases. However, there are naturally questions about the predictability of off-target effects of CRISPR/Cas9, which could have latent consequences that become apparent in later development. The development of more formal descriptions and characterization of the biophysics of genome editing systems could be instrumental in protecting against unanticipated consequences.
1 See ​

1.3 References

[1] Nielsen AAK, Der BS, Shin J, Vaidyanathan P, Paralanov V, Strychalski EA, Ross D, Densmore D, Voigt CA. Genetic circuit design automation. ​Science​, 352, 6281, 2016.

[2] Green AA, Kim J, Ma D, Silver PA, Collins JJ, and Yin P. Ribocomputing devices for sophisticated ​in vivo​ logic computation. ​Proceedings of the 3rd ACM International Conference on Nanoscale Computing and Communication,​ 2016.

[3] Sethia, SA. Combining induction, deduction, and structure for verification and synthesis. ​Proceedings of the IEEE​, 103(11), 2015.