000148590 001__ 148590
000148590 005__ 20190619003300.0
000148590 0247_ $$2doi$$a10.5075/epfl-thesis-4494
000148590 02470 $$2urn$$aurn:nbn:ch:bel-epfl-thesis4494-0
000148590 02471 $$2nebis$$a6045554
000148590 037__ $$aTHESIS
000148590 041__ $$aeng
000148590 088__ $$a4494
000148590 245__ $$aPrediction of Survival and Risk Assessment Using Joint Analysis of Microarray Gene Expression Data
000148590 269__ $$a2010
000148590 260__ $$bEPFL$$c2010$$aLausanne
000148590 300__ $$a197
000148590 336__ $$aTheses
000148590 520__ $$aGene expression profiles have been widely used in molecular classification, diagnosis and prediction, particularly in oncology, where accurate and early diagnosis is needed for appropriate treatment. Avoiding under- or over-treatment can extend a patient's survival and prevent disease recurrence. High-throughput assay technologies have generated terabytes of data, which have been exploited extensively to provide insights into cancer biology and the underlying mechanisms of disease progression. The ultimate goal is to identify treatment and therapy, possibly tailored to the individual patient, for personalized medicine. Analysis of microarray data is constrained by the following characteristics: the data are (i) noisy, due to missing or erroneous values; (ii) high dimensional, due to a large number of genes versus a small number of samples in which their expression levels are measured; and (iii) costly, due to expensive microarray experiments. Abundant microarray gene expression data should be processed by appropriate computational and statistical learning methodologies such as machine learning techniques. These methods are robust to noisy data and have a great capacity to analyze high dimensional data. Their power is nevertheless limited by the size of the sample on which they are built. These algorithms have been widely applied to microarray gene expression data to identify a set of genes, known as a gene signature, whose expression levels are highly correlated with a target value or outcome such as disease status, tumor subtype, a patient's survival time, risk of mortality or cancer relapse. Predicting survival time and a patient's risk, which are unknown at diagnosis, is a more challenging task for machine learning methods than classifying tumor subtype or disease, which is already established by oncologists.
The properties of microarray data cited above, the limited number of samples from cancer patients and the dependence of machine learning methods' performance on sample size justify joint analysis of microarray data to increase the number of samples. We applied joint analysis methods to breast and lung cancer data sets to improve survival prediction and risk assessment. Overall, joint analysis produced neither a significant improvement nor a significant deterioration in prediction accuracy. However, increasing the sample size helped to identify robust or stable gene signatures predictive of survival time and risk. Our results and the lessons learned from joint analysis of microarray gene expression data can serve as a guideline for future research in classification and prediction.
000148590 6531_ $$amicroarray gene expression data
000148590 6531_ $$asurvival prediction
000148590 6531_ $$ajoint analysis
000148590 6531_ $$amachine learning
000148590 6531_ $$agene signature
000148590 6531_ $$arobust prediction
000148590 6531_ $$adonnées d'expression de gènes
000148590 6531_ $$apuces à ADN
000148590 6531_ $$aprédiction de survie
000148590 6531_ $$aanalyse combinatoire
000148590 6531_ $$améthodes d'apprentissage
000148590 6531_ $$asignature de gènes
000148590 6531_ $$aprédiction robuste
000148590 700__ $$aYasrebi, Haleh
000148590 720_2 $$aBucher, Philipp$$edir.$$g113607$$0244404
000148590 720_2 $$aNaef, Félix$$edir.
000148590 8564_ $$uhttps://infoscience.epfl.ch/record/148590/files/EPFL_TH4494.pdf$$zTexte intégral / Full text$$s2854668$$yTexte intégral / Full text
000148590 909C0 $$xU11260$$0252319$$pUPNAE
000148590 909C0 $$xU11780$$0252244$$pGR-BUCHER
000148590 909CO $$ooai:infoscience.tind.io:148590$$qGLOBAL_SET$$pSV$$pthesis$$pthesis-bn2018$$pthesis-public$$pDOI$$qDOI2
000148590 918__ $$dEDCI$$cISREC$$aSV
000148590 919__ $$aGR-BUCHER
000148590 919__ $$aUPNAE
000148590 920__ $$b2010
000148590 970__ $$a4494/THESES
000148590 973__ $$sPUBLISHED$$aEPFL
000148590 980__ $$aTHESIS