Research Article: Rule-Based Design of Plant Expression Vectors Using GenoCAD

Date Published: July 6, 2015

Publisher: Public Library of Science

Author(s): Anna Coll, Mandy L. Wilson, Kristina Gruden, Jean Peccoud, Martina Stromvik.


Plant synthetic biology requires software tools to assist in the design of complex multi-genic expression plasmids. Here, a vector design strategy to express genes in plants is formalized and implemented as a grammar in GenoCAD, a Computer-Aided Design software for synthetic biology. It includes a library of plant biological parts organized in structural categories and a set of rules describing how to assemble these parts into large constructs. The rules developed here are divided into three main subsections according to the aim of the final construct: protein localization studies, promoter analysis and protein-protein interaction experiments. The GenoCAD plant grammar guides users through the design while allowing them to customize vectors according to their needs. The plant grammar implemented in GenoCAD will therefore help plant biologists take advantage of methods from synthetic biology to design expression vectors supporting their research projects.

Partial Text

Synthetic biology aims at bioengineering organisms that perform beneficial functions, generally by means of a rational design approach [1]. Plants have been largely unexploited for synthetic biology, but they offer great potential [2]. To fully benefit from synthetic biology, significant efforts have been dedicated to the development of robust, less demanding, and more reliable methods to assemble increasingly complex designs (see review [3]). Beyond the assembly of constructs, the design of complex multigene vectors is a major challenge. Editing large DNA sequences increases the risk of introducing errors. Furthermore, identifying suitable biological parts is becoming more difficult as the number of parts for synthetic biology increases. Therefore, there is a need for software tools that guide plant synthetic biologists through the design of application-specific expression vectors.

GenoCAD is a Computer-Aided Design (CAD) software for synthetic biology that allows the user to quickly design protein expression vectors, artificial gene networks and other genetic constructs based on the notion of genetic parts [4]. GenoCAD includes a system to manage annotated and user-defined genetic parts. It also guides users through the design by means of a set of predefined rules that describe the design strategy for a specific type of application and that can be expressed as a context-free grammar [5]. By default, GenoCAD includes a simple grammar used for demonstration purposes. However, the grammar editor embedded in GenoCAD enables users to develop brand new grammars, and therefore provides biologists with a tool to formalize custom design strategies. Several grammars have already been added by users, including (i) a grammar to design a family of vaccine vectors derived from vesicular stomatitis virus (VSV) [6], (ii) a grammar to design Chlamydomonas reinhardtii expression vectors [7] and (iii) a grammar to design synthetic transcription factors in eukaryotes [8].
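The idea that design rules can be expressed as a context-free grammar may be easier to follow with a small example. The Python sketch below is only an illustration: the category names (CASSETTE, PROMOTER, CDS, GENE, TAG, TERMINATOR) and the rewriting rules are hypothetical and are not taken from the GenoCAD plant grammar. Each rule rewrites a category into a sequence of categories, and a design is accepted only if its sequence of part categories can be derived from the start category.

```python
from itertools import product

# Hypothetical toy grammar for illustration only; these categories and rules
# are assumptions and do NOT reproduce the actual GenoCAD plant grammar.
RULES = {
    "CASSETTE": [["PROMOTER", "CDS", "TERMINATOR"]],
    "CDS": [["GENE"], ["GENE", "TAG"], ["TAG", "GENE"]],
}


def expand(symbol):
    """Return every sequence of terminal categories derivable from symbol.

    Works only for non-recursive grammars such as this toy one.
    """
    if symbol not in RULES:  # terminal category: nothing left to rewrite
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Expand each symbol on the right-hand side, then combine the choices.
        for combo in product(*(expand(s) for s in rhs)):
            results.append([cat for part in combo for cat in part])
    return results


def is_valid(design, start="CASSETTE"):
    """Check whether a list of part categories follows the grammar."""
    return list(design) in expand(start)
```

For example, `is_valid(["PROMOTER", "GENE", "TAG", "TERMINATOR"])` holds under these toy rules, whereas a design that places the gene before the promoter is rejected. A real grammar may contain recursive rules, in which case a proper parser would be needed instead of exhaustive expansion.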

Plant transformation for functional analysis experiments has become a routine tool in plant research. Commercial plant expression plasmids are now available for different applications. However, they are a rigid option, and changing any element of the backbone plasmid is time-consuming. Moreover, most of them are based on classical or Gateway cloning systems. There is a need for simple and versatile design strategies that allow high-throughput approaches in synthetic biology studies. Therefore, we developed a GenoCAD grammar to design constructs for in planta functional analysis studies (S1 File). The plant grammar is available on Figshare.

We implemented a plant grammar in GenoCAD, which guides the user through the design of plant expression vectors for functional studies. The protein-protein interaction (PPI) branch of the grammar includes rules that express the requirement that genetic parts located on different plasmids interact with one another. For instance, rules describing the use of split fluorescent proteins to visualize the interactions between two proteins require the two plasmids to use sequences that complement each other. This type of systems-level analysis relying on trans-interactions between genetic parts has been proposed before [33], but this is the first time that this design constraint has been captured in a grammar.
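The cross-plasmid constraint described above can be sketched as a simple check. This is a minimal illustration under stated assumptions, not the way GenoCAD encodes the rule: the part labels (N_FRAGMENT, C_FRAGMENT, PROMOTER, GENE_A, GENE_B, TERMINATOR) and the function names are hypothetical, and each plasmid is modeled as a plain list of part labels.

```python
# Hypothetical labels for the two halves of a split fluorescent protein;
# these names are assumptions made for illustration.
COMPLEMENT = {"N_FRAGMENT": "C_FRAGMENT", "C_FRAGMENT": "N_FRAGMENT"}


def fragments(plasmid):
    """Return the split-protein fragments present on one plasmid."""
    return [part for part in plasmid if part in COMPLEMENT]


def is_complementary_pair(plasmid_a, plasmid_b):
    """True if each plasmid carries exactly one fragment and the two
    fragments complement each other."""
    frags_a, frags_b = fragments(plasmid_a), fragments(plasmid_b)
    if len(frags_a) != 1 or len(frags_b) != 1:
        return False
    return COMPLEMENT[frags_a[0]] == frags_b[0]


# Usage: two plasmids designed together for one interaction experiment.
bait = ["PROMOTER", "GENE_A", "N_FRAGMENT", "TERMINATOR"]
prey = ["PROMOTER", "GENE_B", "C_FRAGMENT", "TERMINATOR"]
```

With these definitions, `is_complementary_pair(bait, prey)` holds, while pairing `bait` with itself fails because both plasmids would carry the same fragment. The point of the sketch is that the rule spans two constructs at once, which is what distinguishes it from rules that only constrain the order of parts within a single plasmid.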