Complex networks : from biological applications to exact theoretical solutions

Caretta Cartozo, Cécile

doi:10.5075/epfl-thesis-4462

doctoral thesis

2009

The last decade has witnessed a fundamental change in the role of network theory. Alongside the growing importance of interdisciplinary research, this ensemble of mathematical tools and methods has met with such success, and found such a wide range of applications, that it has grown into a scientific field in its own right: complex networks, which brings together researchers from different communities. Two different approaches can be distinguished. The first is descriptive and tightly related to the experimental world. Reality is permeated with complex systems that lend themselves naturally to a network representation. The amount of data on natural, social and artificial systems is constantly increasing, and network theory provides a powerful framework for the analysis of such large datasets. The second is theoretical, aimed at producing models and analytical solutions that reproduce and explain the properties of real networks. The results presented in this Thesis reflect the interdisciplinarity of the field of complex networks as well as its twofold nature. The first part of the manuscript describes new methods for the study of gene expression and protein interactions. Biological systems are highly complex and sensitive to the environment. Experimental data are affected by errors and poor reproducibility. We propose a method for the comparative analysis of similar datasets. We present an application to three independent studies on gene expression in the cell cycle of the fission yeast that show very poor agreement. In an attempt to reconcile them, we define a periodic genes network in which each node represents a gene identified as periodically regulated in the cell cycle and an edge between two nodes is drawn according to the phase difference between the expression peaks of the corresponding genes.
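The periodic genes network described above can be illustrated with a minimal sketch. The abstract does not give the exact edge rule, so the threshold criterion used here (link two genes when their peak phases differ by at most `max_phase_gap`), the function names and the example gene labels are all assumptions made for illustration, not the method of the Thesis.

```python
import math
from itertools import combinations


def circular_phase_difference(phi_a, phi_b):
    """Smallest angular distance between two peak phases, in radians (0 to pi)."""
    d = abs(phi_a - phi_b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def build_periodic_gene_network(peak_phase, max_phase_gap):
    """Link genes whose expression peaks lie within max_phase_gap of each other.

    peak_phase maps a gene name to its peak phase in radians.
    The threshold rule is an assumption; the abstract only says that edges
    depend on the phase difference between expression peaks.
    """
    adjacency = {gene: set() for gene in peak_phase}
    for a, b in combinations(peak_phase, 2):
        if circular_phase_difference(peak_phase[a], peak_phase[b]) <= max_phase_gap:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical genes and phases, for illustration only.
example_phases = {"geneA": 0.1, "geneB": 0.3, "geneC": 3.2, "geneD": 6.2}
network = build_periodic_gene_network(example_phases, max_phase_gap=0.5)
```

Because the cell cycle is periodic, the phase difference is measured on a circle, so a gene peaking just before the end of the cycle is treated as close to one peaking just after its start.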
The analysis of the topological structure of the network reveals a universal picture that overcomes the discrepancies and that is strongly related to the biological structure of the cell cycle and to the regulation of its progression. The method is able to group genes that are co-expressed during the cell cycle and to identify putative regulators of the transition from one phase to the next. Proteins are the fundamental product of gene expression. They are characterized by a modular structure comprising independent domains that act as mediators of protein interactions and are common to evolutionarily related proteins. We propose the study of domain-centered interaction networks as a complementary tool for the understanding of protein interactions. We explore their topological structure, looking for stable similarity groups, and we investigate domain-peptide correlations that could be used as patterns for predicting the properties and functions of unknown proteins. The second part of this Thesis presents an exact theoretical solution for the navigability of small-world networks with an inverse power-law distribution of the length of the connections. Optimal navigability is crucial for real networks involving transportation or communication processes (Internet, electric power grid, neural networks). Most of these systems are small worlds, characterized by a surprisingly small distance between any two nodes that grows only logarithmically with the system size. How they are able to exploit this property for optimized performance without complete knowledge of the structure is still an open question. It has been shown that a small world can arise from a proper compromise between local and long-range connections.
In the specific case in which the length of the connections is power-law distributed and a greedy algorithm is used to forward a message from a source to a target, we obtain an original exact solution for the behavior of the expected delivery time in any dimension. We discuss the resulting scenario for the navigability and the implications of the different structural parameters.
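The greedy forwarding setting can be sketched numerically. The example below is a simplified one-dimensional illustration, not the exact solution of the Thesis. Its assumptions are a ring of `n` nodes, one long-range connection per node whose length `r` is drawn with probability proportional to `r**(-alpha)`, and a message that always moves to the neighbour closest to the target. All function and parameter names are hypothetical.

```python
import random


def ring_distance(a, b, n):
    """Shortest distance between nodes a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def build_small_world_ring(n, alpha, rng):
    """Give each node one long-range contact with P(length = r) ~ r**(-alpha).

    Assumption: a single shortcut per node on a 1D ring; the Thesis treats
    the problem in any dimension.
    """
    lengths = list(range(1, n // 2 + 1))
    weights = [r ** (-alpha) for r in lengths]
    shortcuts = []
    for node in range(n):
        r = rng.choices(lengths, weights=weights, k=1)[0]
        direction = rng.choice((-1, 1))
        shortcuts.append((node + direction * r) % n)
    return shortcuts


def greedy_delivery_time(source, target, shortcuts, n):
    """Count the hops needed when each node forwards to its neighbour nearest the target."""
    current, steps = source, 0
    while current != target:
        neighbours = ((current - 1) % n, (current + 1) % n, shortcuts[current])
        # One ring neighbour is always strictly closer, so the loop terminates.
        current = min(neighbours, key=lambda v: ring_distance(v, target, n))
        steps += 1
    return steps


rng = random.Random(0)
n = 1000
shortcuts = build_small_world_ring(n, alpha=1.0, rng=rng)
steps = greedy_delivery_time(0, n // 2, shortcuts, n)
```

Averaging `greedy_delivery_time` over many source-target pairs and several values of `alpha` gives a rough numerical view of how the exponent affects the expected delivery time in this toy setting.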

Name

EPFL_TH4462.pdf

Access type

restricted

Size

20.95 MB

Format

Adobe PDF

Checksum (MD5)

82a367a250e33b1af32f4fa0e3e02fae