Research Article: Repairing Boolean logical models from time-series data using Answer Set Programming

Date Published: March 25, 2019

Publisher: BioMed Central

Author(s): Alexandre Lemos, Inês Lynce, Pedro T. Monteiro.

http://doi.org/10.1186/s13015-019-0145-8

Abstract

Boolean models of biological signalling-regulatory networks are increasingly used to formally describe and understand complex biological processes. These models may become inconsistent as new data become available and need to be repaired. In the past, the focus has been shed on the inference of (classes of) models given an interaction network and time-series data sets. However, repair of existing models against new data is still in its infancy, where the process is still manually performed and therefore slow and prone to errors.

In this work, we propose a method with an associated tool to suggest repairs over inconsistent Boolean models, based on a set of atomic repair operations. Answer Set Programming is used to encode the minimal repair problem as a combinatorial optimization problem. In particular, given an inconsistent model, the tool provides the minimal repairs that render the model capable of generating dynamics coherent with a (set of) time-series data set(s), considering either a synchronous or an asynchronous updating scheme.

The method was validated using known biological models from different species, as well as synthetic models obtained from randomly generated networks. We discuss the method’s limitations regarding each of the updating schemes and the considered minimization algorithm.

The online version of this article (10.1186/s13015-019-0145-8) contains supplementary material, which is available to authorized users.

Partial Text

Computational biology plays a crucial role in the modern understanding of biology itself [1]. In particular, modelling helps to build systematic representations of biological systems, that can be used to simulate and make predictions in silico. However, most biological models are manually defined requiring a great amount of effort by the modeller. Also, many computational models can coherently explain the same time-series data set, and consequently, different modellers are likely to reach different models given the same data.

In this section, we introduce the required definitions concerning logical formalism and ASP. We then review the literature on the use of ASP for the model repair problem.

In this work, we developed a tool to repair inconsistent models implemented in C++. The tool encapsulates an ASP solver (clingo [32] solver version 5.1.0) providing the user with an easy way to generate the ASP facts. Figure 2 gives an overview of the tool main components. The tool receives a model in the DNF format and one or more time-series as matrices. Not all values have to be present in the time-series matrices. If not present, the missing values will be computed according to the chosen dynamics. As the tool repairs models with different updating schemes, it is required to specify the preferred updating scheme (steady state, asynchronous or synchronous). The user can also choose which type of repairs is desirable by combining the atomic repair operations, making sure the result meets the user requirements. Finally, the modeller can also provide a list of repairable nodes where the problem may reside, reducing the search space and potentially the execution time. The output of the tool is all the cardinality minimal repaired models. These models are exported in DNF more precisely in the boolSim format. Note that, if the process is interrupted before finding the optimal solution, then the current best solution will be returned. The tool does not guarantee to return models with minimized functions since the minimization algorithm is not executed after repairing the model.Fig. 2Overview of the tool. The different components of the proposed tool

Ostrowski et al. [9] successfully used ASP to infer networks based on time-series data. The objective is to find all networks that satisfy the time-series data sets. To achieve this goal, all combinations of edges and Boolean functions are tested. The considered dynamic allows any number of components to be updated at the same time. Another approach is to use genetic algorithms [35] to optimize Boolean networks from time-series data. These authors consider an asynchronous updating scheme to generate the dynamics. The training set is a set of time-series data which the model has to reproduce. Considering that the original models are large, it becomes difficult to reason over these models. With this in mind, the objective is to find the smallest possible sub-network to describe all the experimental values. However, not all nodes can be removed. These nodes are defined by the user and can represent key experimental readouts. Moreover, the optimization process tries to maintain the largest possible number of edges, removing only the edges that are inconsistent with the time-series data.

In this section, we evaluate and compare our method with the one recently proposed by Merhej et al. [18], the synchronous updating scheme.

In this work, we proposed an ASP-based tool capable of repairing the logical functions of a Boolean logical model, in order to make it consistent with a (set of) time-series data sets. The extension to multivalued logical models would be straightforward by applying a Boolean mapping [14].

 

Source:

http://doi.org/10.1186/s13015-019-0145-8

 

0 0 vote
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments