Research Article: Reaction Factoring and Bipartite Update Graphs Accelerate the Gillespie Algorithm for Large-Scale Biochemical Systems

Date Published: January 6, 2010

Publisher: Public Library of Science

Author(s): Sagar Indurkhya, Jacob Beal, Mark Isalan.

Abstract: ODE simulations of chemical systems perform poorly when some of the species have extremely low concentrations. Stochastic simulation methods, which can handle this case, have been impractical for large systems due to computational complexity. We observe, however, that when modeling complex biological systems: (1) a small number of reactions tend to occur a disproportionately large percentage of the time, and (2) a small number of species tend to participate in a disproportionately large percentage of reactions. We exploit these properties in LOLCAT Method, a new implementation of the Gillespie Algorithm. First, factoring reaction propensities allows many propensities dependent on a single species to be updated in a single operation. Second, representing dependencies between reactions with a bipartite graph of reactions and species requires substantially less storage than a dependency graph that includes only reactions. Together, these improvements allow our implementation of LOLCAT Method to execute orders of magnitude faster than existing Gillespie Algorithm variants when simulating several yeast MAPK cascade models.

Partial Text: Dynamic Monte Carlo methods are a common means of simulating the time-evolution of chemical systems. The Gillespie Algorithm (SSA) [1] is the standard algorithm for this process, and has inspired a variety of derivative methods that speed up computation, including the Optimized Direct Method (ODM) [2] and the Next Reaction Method (NRM) [3]. These methods, however, are still computationally costly. Speeding up the Gillespie Algorithm and related hybrid methods will likely play an important role in advancing the productivity of computational systems biology.
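For orientation, the baseline that ODM, NRM and LOLCAT Method all accelerate can be sketched as follows. This is a minimal illustration of the Gillespie direct method, not the authors' implementation. The function name `gillespie_direct` and the reaction tuple layout are assumptions made for this sketch, and propensities use simple mass-action kinetics with distinct reactant species.

```python
import random

def gillespie_direct(x, reactions, t_end, rng):
    """Direct-method SSA sketch. Returns (time, final state).

    x: dict mapping species name -> copy number.
    reactions: list of (rate_constant, reactants, state_change), where
      reactants is a tuple of distinct species names and state_change
      maps species name -> integer change (layout assumed for this sketch).
    """
    t = 0.0
    x = dict(x)  # work on a copy
    while True:
        # Recompute every propensity: rate constant times reactant counts.
        props = []
        for k, reactants, _ in reactions:
            a = k
            for s in reactants:
                a *= x[s]
            props.append(a)
        a0 = sum(props)
        if a0 <= 0.0:
            return t, x  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a0.
        dt = rng.expovariate(a0)
        if t + dt > t_end:
            return t_end, x
        t += dt
        # Choose reaction j with probability props[j] / a0.
        r = rng.random() * a0
        acc = 0.0
        for j, a in enumerate(props):
            acc += a
            if r < acc:
                break
        # Apply the state change of the chosen reaction.
        for s, d in reactions[j][2].items():
            x[s] += d

# Usage: irreversible conversion A -> B.
t_final, state = gillespie_direct(
    {"A": 10, "B": 0}, [(1.0, ("A",), {"A": -1, "B": +1})], 1000.0, random.Random(1)
)
```

Each step in this naive form recomputes all propensities and scans them linearly, which is the per-event cost that the derivative methods try to reduce.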

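We make two observations that appear to apply in many models of large-scale biochemical systems: (1) a small number of reactions tend to occur a disproportionately large percentage of the time, and (2) a small number of species tend to participate in a disproportionately large percentage of reactions.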

We experimentally verified the speed advantage of LOLCAT Method on a set of yeast MAPK cascade models obtained from the Yeast Pheromone Model repository [10]. Note that we are concerned only with the fact that these are complex biochemical models that a scientist would reasonably wish to simulate, not with the correctness of these particular models. Six different versions of the cascade model were used, each with a different number of reactions and species. Each model was run to steady-state for seconds (about hours) of simulation time. We then changed the pheromone concentration from 0 nM to 100 nM for each model, and benchmarked ODM, NRM, MODM and LOLCAT Method.

LOLCAT Method uses two key ideas: (1) grouping reactions with common reactants and updating the propensities of many reactions in a single operation, and (2) using a bipartite update dependency graph of species and reactions, which gives a much more compact representation than a graph of reactions alone. Note that factoring the reactions allows the dependency graph to be compressed further, beyond the simple species-based dependency graph used by MODM. These two principles allow LOLCAT Method to outperform other popular methods by orders of magnitude on the chemical systems we benchmarked. Furthermore, the performance advantage of LOLCAT Method is expected to increase as the size of the systems being modeled increases. LOLCAT Method is also able to gracefully handle systems with a large number of interdependent reaction propensities, which previous methods cannot do.
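The two ideas can be illustrated with a small sketch. It is not the authors' code: the hub network, the class `FactoredGroup`, and all function names are hypothetical, and mass-action propensities with distinct reactants are assumed. The first part compares edge counts for a bipartite species/reaction graph against a reaction-only graph. The second part shows how the summed propensity of reactions sharing a reactant can be refreshed with one multiplication when that reactant's count changes.

```python
def make_hub_network(n):
    """Hypothetical network: hub species S binds each of n partners X_i."""
    return [
        (0.1 * (i + 1), ("S", f"X{i}"), {"S": -1, f"X{i}": -1, f"SX{i}": +1})
        for i in range(n)
    ]

def build_readers(reactions):
    """Species -> indices of reactions whose propensity reads that species."""
    readers = {}
    for j, (_, reactants, _) in enumerate(reactions):
        for s in reactants:
            readers.setdefault(s, []).append(j)
    return readers

def affected(j, reactions, readers):
    """Reactions to refresh after reaction j fires, found by walking
    reaction -> changed species -> reactions that read those species."""
    out = set()
    for s in reactions[j][2]:
        out.update(readers.get(s, ()))
    return out

def bipartite_edge_count(reactions):
    """One edge per (reaction, species read) and per (reaction, species changed)."""
    return sum(len(reactants) + len(change) for _, reactants, change in reactions)

def reaction_only_edge_count(reactions, readers):
    """One edge per (reaction, dependent reaction) pair."""
    return sum(len(affected(j, reactions, readers)) for j in range(len(reactions)))

class FactoredGroup:
    """Reactions sharing the reactant `hub`: group total = x[hub] * partial_sum."""

    def __init__(self, hub, members, reactions, x):
        self.hub = hub
        self.partial_sum = 0.0
        for j in members:
            k, reactants, _ = reactions[j]
            p = k  # rate constant times the non-hub reactant counts
            for s in reactants:
                if s != hub:
                    p *= x[s]
            self.partial_sum += p

    def total(self, x):
        # A change in x[hub] costs one multiplication for the whole group.
        return x[self.hub] * self.partial_sum

# Usage: 100 reactions that all consume the hub species S.
net = make_hub_network(100)
readers = build_readers(net)
print(bipartite_edge_count(net), reaction_only_edge_count(net, readers))
```

In this hub-dominated example the bipartite graph grows linearly with the number of reactions, while the reaction-only graph grows quadratically, because every reaction changes S and therefore depends on every other. Note that in this sketch a change to a non-hub species still requires its term in `partial_sum` to be adjusted individually.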


