Infoscience

Thesis

Evolutionary reverse engineering of gene networks

The expression of genes is controlled by regulatory networks, which implement fundamental information processing and control mechanisms in a cell. Unraveling and modelling these networks will be indispensable to gain a systems-level understanding of biological organisms and genetically related diseases. In this thesis, we present an evolutionary reverse engineering method that simultaneously infers both the wiring and the nonlinear dynamical models of gene regulatory networks from gene expression data. The proposed method reconstructs gene networks by mimicking the natural evolutionary process that constructed them. This is achieved by modelling both the way in which gene networks are encoded in the biological genome, and the different types of mutations and recombinations that drive their evolution, using an artificial genome called Analog Genetic Encoding (AGE). Since AGE mimics the evolutionary forces and constraints that shape biological gene networks, the reconstruction is naturally guided towards biologically plausible solutions. Consequently, the search space is explored more efficiently, and the networks are recovered more reliably, than with alternative methods. We have confirmed the state-of-the-art performance of AGE both in vivo (on real gene networks) and in silico (on simulated networks). In particular, AGE achieved winning performance in the in vivo gene network inference challenge of the 2nd DREAM (Dialogue on Reverse Engineering Assessment and Methods) conference, which consisted of predicting the structure of a synthetic-biology gene network in Saccharomyces cerevisiae from time-series data. In vivo performance assessment of network-inference methods is problematic because it is in general not possible to systematically validate predictions, except for a few well-characterized gene networks. Consequently, in silico benchmarks are essential to understand the performance of network-inference methods.
We have developed tools to generate biologically plausible in silico gene networks, which allow realistic performance assessment of network-inference methods. In contrast to previous in silico benchmarks, we generate network structures by extracting modules from known gene networks of model organisms, instead of using random graphs. Furthermore, we simulate network dynamics using more realistic kinetic models, which include both mRNA and proteins. We have implemented this framework in an open-source Java tool called GeneNetWeaver (GNW). Using GNW, we have generated benchmarks for community-wide challenges of the 3rd and 4th DREAM conferences (the DREAM in silico network challenges). Here, we assess the performance of 29 network-inference methods, which were applied independently by the teams participating in the DREAM3 challenge. Performance profiling on individual network motifs reveals that current inference methods are affected, to various degrees, by three types of systematic prediction errors. We find that these errors are induced by inaccurate prior assumptions of prevalent gene-network models. The evolutionary reverse engineering approach, which would have ranked 3rd in this challenge, can be used with a wide range of nonlinear models. It could thus provide the necessary framework for the development of models that better approximate different types of gene regulation, thereby enabling ever more accurate reconstruction of gene networks.
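
To make the idea of a kinetic model "with both mRNA and proteins" concrete, the following is a minimal sketch of one regulated gene, integrated with a simple forward Euler scheme. It is not the model used in GNW or in the thesis: the Hill-type activation function, all parameter values, and the names `hill_activation` and `simulate_gene` are assumptions chosen purely for illustration.

```python
def hill_activation(regulator, k=0.5, n=2.0):
    """Assumed Hill-type activation: fraction of maximal transcription
    driven by the regulator protein level (illustrative, not from GNW)."""
    return regulator**n / (k**n + regulator**n)


def simulate_gene(regulator, steps=40000, dt=0.001,
                  max_transcription=1.0, mrna_decay=1.0,
                  translation=1.0, protein_decay=0.5):
    """Integrate mRNA and protein levels of one target gene with forward Euler.

    The regulator protein level is held constant; all rate constants are
    hypothetical values picked for the example.
    """
    mrna, protein = 0.0, 0.0
    for _ in range(steps):
        # mRNA: regulated transcription minus first-order degradation
        d_mrna = max_transcription * hill_activation(regulator) - mrna_decay * mrna
        # protein: translation from mRNA minus first-order degradation
        d_protein = translation * mrna - protein_decay * protein
        mrna += dt * d_mrna
        protein += dt * d_protein
    return mrna, protein


if __name__ == "__main__":
    mrna, protein = simulate_gene(regulator=0.5)
    print(f"mRNA = {mrna:.3f}, protein = {protein:.3f}")
```

In this sketch the steady state follows directly from setting both derivatives to zero: the mRNA level settles at `max_transcription * hill_activation(regulator) / mrna_decay`, and the protein level at `translation / protein_decay` times that value. Modelling the two species separately lets a protein respond with a delay to changes in its mRNA, which a single-variable model cannot capture.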
