Thesis

Computational methods and modeling of cellular function for the optimization of protein production and the study of cellular disease states

Systems biology is a multidisciplinary field that weaves together the basic sciences through computational and bioinformatics tools to provide a more integrative view of the complex molecular interactions taking place within and among cells. Successes in developing and improving high-throughput -omics techniques have rapidly increased the amount of available data. The complexity of the biological systems these data describe requires tools to process and analyze them properly. Computational models describe the system's interactions mathematically, allowing intrinsic properties of the data to emerge that would otherwise be overlooked. These models provide context to the data and are used to predict the behavior of the system and to simulate a broader landscape of hypotheses, saving the time and cost of performing numerous experiments. However, the number of parameters required to formulate the system mathematically grows with the model complexity needed to integrate the available data. Thus, the development of estimation procedures and workflows to retrieve these values from the literature and databases becomes crucial. In this work, we developed a set of computational models, analysis tools, and pipelines to support the study of two biological systems crucial to cell survival: metabolism and protein synthesis. Metabolism is responsible for the production of most cell biomass, including proteins. Together, these two systems balance the renewal of protein in the cell, where metabolism provides the amino acids obtained from protein breakdown to the mRNA translation machinery. Deregulation of these systems is known to cause multiple disorders, such as neurodegenerative diseases and cancer. In the study of protein synthesis, we employ a combination of deterministic and stochastic modeling approaches to understand its intrinsic mechanistic properties and its rate-limiting steps.
A better understanding of the system's properties can have a profound impact on the development of drug targets and, in particular, on the optimization of heterologous protein production. Our studies revealed that more than one factor affects the speed of translation, namely competition for tRNA resources and the type of cognate binding interaction between the tRNA and the mRNA-ribosome complex. We also derived an equation that, given certain intracellular parameters of the host organism of interest, can assist in the design of transcripts for optimizing heterologous protein production. For the study of human metabolism, we established a pipeline to generate tissue-specific reduced metabolic models that can be used to study the metabolic reprogramming of different cancers and compare it with the metabolic phenotype of a healthy cell type. Although presented herein for human models, this pipeline is general and can be applied to the models of any organism. Starting from a human genome-scale metabolic model, the pipeline improves compound annotation, identification, and thermodynamic parameter retrieval, and facilitates data integration by connecting several compound databases in a semi-automated fashion. This work sets a standard for metabolic model assessment and curation and improves on existing tools to generate the first thermodynamically feasible reduced model of human metabolism, which is specifically tailored to the physiology and conditions under study.

Fulltext

  • Thesis submitted - Forthcoming publication