Date Published: June 12, 2019
Publisher: Public Library of Science
Author(s): Chloe Singleton, James Gilman, Jessica Rollit, Kun Zhang, David A. Parker, John Love, Michael H. Kogut.
Geobacillus thermoglucosidans DSM2542 is an industrially important microbe; however, the complex nutritional requirements of Geobacilli confound metabolic engineering efforts. Previous studies have utilised semi-defined media recipes containing complex, biologically derived nutrients whose ingredients are unknown and therefore cannot be quantified during metabolic profiling. This study used design of experiments to investigate how individual nutrients, and interactions between these nutrients, contribute to growth. A mathematically derived defined medium has been formulated and shown to robustly support growth of G. thermoglucosidans in two different environmental conditions (96-well plate and shake flask) and with a variety of lignocellulose-based carbohydrates. This enabled the catabolism of industrially relevant carbohydrates to be investigated.
The Geobacillus have great potential for use as an industrially relevant microbial chassis [2–4]. In particular, Geobacillus thermoglucosidans DSM 2542 (NCIMB 1195) (hereafter G. thermoglucosidans) is a promising candidate as it has previously been engineered for bioethanol production.
The growth characteristics of G. thermoglucosidans were investigated in both complex and defined media in order to inform the development of this bacterium as an industrial host. This study highlighted the need for a new defined medium recipe capable of sustaining G. thermoglucosidans growth in the absence of complex bionutrients, which would otherwise confound efforts to metabolically profile the species. The DoE approach to developing a defined medium allowed a statistically robust investigation into media formulation, including the use of the relatively new technology of artificial neural networks (ANNs). In this study, ANNs were not used to directly predict an optimised media formulation. Instead, an ANN was used to create a weighted ensemble of 15 linear regression models, each of which predicted ΔOD600 as a function of various media components (S8 Table). Ensembling is a commonly accepted approach, analogous to averaging a continuous measurement of interest across biological replicates to obtain an estimate that deviates less from the true or ideal value. By training multiple models on the same task and then combining the outputs of these models, ensembling attempts to exploit information about the design space that might otherwise have been judged redundant if only a single “best” performing model were isolated [38–40]. In principle, ANNs could have been trained to predict ΔOD600 directly as a function of the identified media components [41–43]. However, the “black-box” nature of ANNs complicates their use in assessing the contribution of individual media components to culture growth.
The combination of linear regression techniques, through which the effect of individual media components can be calculated, and an ANN to create a non-linear ensemble was shown in this investigation to yield a model that can both accurately predict culture growth in an optimised growth medium and improve our fundamental understanding of the contribution of individual media components to culture growth.
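The ensemble idea described above can be illustrated with a minimal sketch. It is not the study's model: the data are synthetic, only three single-predictor regressions are fitted (rather than the 15 models of S8 Table), and the network size, learning rate and all names (simulated_growth, base_models, ensemble_predict) are assumptions made for illustration. Interpretable linear models are fitted first, and a small one-hidden-layer network then learns a non-linear combination of their predictions.

```python
import math
import random

random.seed(0)
N_COMPONENTS = 3  # hypothetical number of media components

def simulated_growth(x):
    # Made-up response surface standing in for measured growth (an assumption)
    return 0.5 + 0.3 * x[0] + 0.2 * x[1] - 0.1 * x[2] + 0.15 * x[0] * x[1]

# Synthetic "experiments": coded component levels in [-1, 1] plus noise
X = [[random.uniform(-1, 1) for _ in range(N_COMPONENTS)] for _ in range(60)]
y = [simulated_growth(x) + random.gauss(0, 0.02) for x in X]

def fit_line(xs, ys):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return my - slope * mx, slope

# Base models: one linear regression per component, so each slope is interpretable
base_models = [fit_line([x[j] for x in X], y) for j in range(N_COMPONENTS)]

def base_predictions(x):
    return [a + b * x[j] for j, (a, b) in enumerate(base_models)]

Z = [base_predictions(x) for x in X]  # inputs to the combining network

HIDDEN = 4  # hypothetical hidden-layer width
net = {
    "W1": [[random.uniform(-0.5, 0.5) for _ in range(N_COMPONENTS)]
           for _ in range(HIDDEN)],
    "b1": [0.0] * HIDDEN,
    "W2": [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)],
    "b2": 0.0,
}

def forward(z):
    hidden = [math.tanh(sum(w * v for w, v in zip(row, z)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    return hidden, sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]

def ensemble_predict(x):
    return forward(base_predictions(x))[1]

def mse():
    return sum((forward(z)[1] - t) ** 2 for z, t in zip(Z, y)) / len(y)

def train(epochs=300, lr=0.05):
    """Full-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        g_W1 = [[0.0] * N_COMPONENTS for _ in range(HIDDEN)]
        g_b1, g_W2, g_b2 = [0.0] * HIDDEN, [0.0] * HIDDEN, 0.0
        for z, t in zip(Z, y):
            hidden, out = forward(z)
            d_out = 2.0 * (out - t) / len(y)
            g_b2 += d_out
            for k in range(HIDDEN):
                g_W2[k] += d_out * hidden[k]
                d_hid = d_out * net["W2"][k] * (1.0 - hidden[k] ** 2)
                g_b1[k] += d_hid
                for j in range(N_COMPONENTS):
                    g_W1[k][j] += d_hid * z[j]
        net["b2"] -= lr * g_b2
        for k in range(HIDDEN):
            net["W2"][k] -= lr * g_W2[k]
            net["b1"][k] -= lr * g_b1[k]
            for j in range(N_COMPONENTS):
                net["W1"][k][j] -= lr * g_W1[k][j]

loss_before = mse()
train()
loss_after = mse()
print(f"training MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In this arrangement the slopes stored in base_models remain directly readable as per-component effects, while the network only decides how to weight and bend the base predictions, which mirrors the division of labour described in the text.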
The mathematically derived defined medium developed in this work using a design of experiments approach should allow future metabolic profiling studies to be performed with G. thermoglucosidans (DSM2542), thereby allowing a more comprehensive understanding of its metabolism and providing a better starting point for metabolic engineering of this industrially important microorganism.
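For readers unfamiliar with design of experiments, the following sketch shows how main effects and a two-factor interaction can be estimated from a two-level full factorial design. The source does not specify this design type, so it should be read as a generic illustration only: the factor names (nutrient_A, nutrient_B, nutrient_C), the response function and its coefficients are all hypothetical.

```python
import math
from itertools import product

# Hypothetical nutrients, each tested at a low (-1) and high (+1) level
factors = ["nutrient_A", "nutrient_B", "nutrient_C"]
design = list(product([-1, 1], repeat=len(factors)))  # 2^3 = 8 runs

def simulated_growth(run):
    # Made-up response: A and B matter, C does not, and A and B interact
    a, b, _ = run
    return 0.6 + 0.2 * a + 0.1 * b + 0.05 * a * b

responses = [simulated_growth(run) for run in design]

def effect(*columns):
    """Mean response where the product of the chosen columns is +1,
    minus the mean response where it is -1."""
    signs = [math.prod(run[i] for i in columns) for run in design]
    high = [r for s, r in zip(signs, responses) if s == 1]
    low = [r for s, r in zip(signs, responses) if s == -1]
    return sum(high) / len(high) - sum(low) / len(low)

main_effects = {name: effect(i) for i, name in enumerate(factors)}
interaction_ab = effect(0, 1)

for name, value in main_effects.items():
    print(f"main effect of {name}: {value:.3f}")
print(f"A x B interaction effect: {interaction_ab:.3f}")
```

Because the design is balanced, each effect is estimated independently of the others, and each estimate equals twice the corresponding coefficient of the coded model.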