Understanding hormone signaling and its risk on estrogen receptor-positive breast cancer
Next-generation sequencing (NGS), in vivo models, and drug screening have reshaped research. In breast cancer, transcriptomics plays an important role in subtyping the disease, in vivo models are used to better understand it, and drug screens are used to find new actionable targets. However, no universal framework exists for analyzing transcriptomic data in both research and clinical settings. For in vivo models and drug screening, bioscientists lack data analysis methods that are both methodologically sound and easy to use.
In this thesis, we present an embedding approach termed molecular EMBeddER (EMBER) that creates a unified space of 11,000 breast cancer transcriptomes and predicts phenotypes of transcriptomic profiles on a single-sample basis. EMBER accurately captures the five molecular subtypes. Key biological pathways, such as estrogen receptor (ER) signaling, cell proliferation, DNA repair, and epithelial-mesenchymal transition, determine a sample's position in the space. Of direct clinical importance, we show that the EMBER-based ER signaling score is superior to the immunohistochemistry (IHC)-based ER index used in current clinical practice to select patients for endocrine therapy. As such, EMBER provides a calibration and reference tool that paves the way for using RNA sequencing (RNA-seq) as a standard diagnostic and predictive tool for ER+ breast cancer. Our work is also available as an R package (https://chronchi.github.io/ember/).
We also propose a workflow for longitudinal data analysis. We show the importance of exploratory data analysis and use Bayesian inference with hierarchical modeling to improve data interpretation by focusing on effect sizes and their directions. We also provide a new randomization algorithm that incorporates the proposed workflow to reduce the unnecessary use of experimental units. Our freely available open-source R package biogrowleR (https://upbri.gitlab.io/biogrowleR/) contains tutorials, pipelines, and helper functions for analyzing longitudinal data from cancer research experiments.