Abstract

Gene regulatory networks (GRNs) control gene expression levels and therefore play an essential role in mammalian development and function. Regulation of gene expression is the result of a complex interplay between DNA regulatory elements and their binding partners, known as transcription factors (TFs). Because TFs play a vital role in development, intercellular signalling, the cell cycle, and disease development, elucidating the mechanisms by which they regulate gene expression is crucial to understanding the vast majority of biological processes. In particular, understanding how each TF contributes to the expression output of its respective target gene in space and time will help to elucidate how GRNs behave under different physiological or pathological conditions. Although extensive work has been accomplished in characterizing the key TFs involved in many biological processes, almost no quantitative information is currently available in the literature. To gain deeper insight into the complex mechanisms underlying the regulation of gene expression, we need to acquire quantitative information, since the abundance of TFs within the cell can be linked to their transcriptional capabilities. Such information would be of utmost importance for building accurate in silico quantitative DNA binding models that could predict and explain the particular properties of gene regulatory mechanisms. The quantification of TFs is a difficult task due to their naturally low abundance in cells, and their reliable detection therefore depends heavily on the overall sensitivity of current technologies. In recent years, a new mass spectrometry (MS)-based technology termed selected reaction monitoring (SRM) has gained popularity due to its targeted approach, which allows the detection and quantification of proteins in complex samples with exceptional sensitivity and specificity.
As I will show in this thesis, this approach is particularly well suited for targeting low-abundance proteins such as TFs, which are otherwise difficult to identify with conventional shotgun proteomics experiments. Consequently, the main focus of my thesis research project was the development of an SRM-based platform aimed at quantifying TFs in absolute amounts based on in vitro protein expression during the terminal stage of adipogenesis, using the pre-adipocyte 3T3-L1 cell line. Interestingly, our initial efforts led to the creation of an atlas of TF-specific peptide data, which could be readily used for the design of quantitative assays. In the first phase, abundance measurements in terms of copies per cell were derived at precise differentiation time-points for two major adipogenic players, PPARγ and RXRα. In the second phase, we expanded the number of adipogenic TFs that can be monitored in one assay, allowing for the quantification of up to 10 TFs in a single, integrated SRM run. Such upscaling increases the practical usefulness of the methodology while reducing the associated costs, and ultimately allows for appreciable time savings. The availability of absolute protein copy number data ultimately permitted us to examine the relationship between the number of genome-wide DNA binding events and the number of TF molecules. We derived a quantitative DNA binding model that allowed the prediction of the number of PPARγ ChIP-seq binding events given its nuclear abundance, chromatin state, and DNA binding energetics. As such, we were able to explain the paradoxical observation of a significant increase in PPARγ binding sites despite a saturation in the number of PPARγ molecules. We thus demonstrate how TF abundance data can be modeled in conjunction with large-scale DNA occupancy and chromatin state data to further our understanding of the gene regulatory mechanisms mediating cellular differentiation.
We are now building on our pioneering work to quantify, in absolute terms, the key players of the entire core adipogenic GRN, with the aim of providing a quantitative explanation of the regulatory mechanisms at play during the terminal phase of adipocyte differentiation. Moreover, to increase the explanatory power of our methodology and to alleviate the throughput limitation that comes with obtaining absolute protein measurements, we decided to estimate copy numbers for a larger set of adipogenic TFs using a modified version of our original approach. At the cost of a modest loss of accuracy, we are now aiming to develop a sensitive and robust methodology that will allow the quantification of entire GRNs at low cost and in a time-effective manner. This is consistent with the overall goal in the life sciences and clinical research of accurately and reproducibly quantifying entire pathways or biological networks to improve our systems-level understanding of biological processes.
