Batch Mode Reinforcement Learning for Controlling Gene Regulatory Networks and Multi-model Gene Expression Data Enrichment Framework
Over the last decade, modeling and controlling gene regulation has received much attention. In this thesis, we address two problems: (i) controlling gene regulation systems and (ii) generating high-quality artificial gene expression data. For controlling gene regulation systems, we propose three control solutions based on Batch Mode Reinforcement Learning (Batch RL): one for fully observable and two for partially observable gene regulation systems. For fully observable systems, we propose a method that produces approximate control policies directly from gene expression data, without relying on any computational model. Results show that our method produces approximate control policies for regulation systems of several thousand genes in seconds without significant loss of performance, whereas existing studies get stuck even at several tens of genes. For partially observable systems, we first propose Batch Mode TD(λ), a novel Batch RL framework for partially observable environments. Its idea is to produce approximate stochastic control policies that map observations directly to actions probabilistically, without estimating the actual internal states of the regulation system. Results show that Batch Mode TD(λ) produces successful stochastic policies for regulation systems of several thousand genes in seconds, whereas existing studies cannot produce control solutions even for several tens of genes. To the best of our knowledge, Batch Mode TD(λ) is the first framework for solving non-Markovian decision tasks from a limited number of experience tuples. Second, we propose a method that constructs a Partially Observable Markov Decision Process (POMDP) directly from gene expression data.
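The abstract does not give implementation details, but the core ingredients it names (value estimates learned from a fixed batch of experience tuples, eligibility traces, and a stochastic observation-to-action policy) can be illustrated with a minimal sketch. All names here (`batch_td_lambda`, `stochastic_policy`), the SARSA-style TD error, the softmax policy, and the tabular representation are illustrative assumptions, not the thesis's actual formulation:

```python
import numpy as np

def batch_td_lambda(episodes, n_obs, n_actions,
                    gamma=0.9, lam=0.8, alpha=0.1, sweeps=50):
    """Batch-mode TD(lambda) sketch over stored experience tuples.

    episodes: list of trajectories, each a list of (obs, action, reward)
    steps collected beforehand (the "batch"); no further interaction with
    the system is needed.  Returns tabular Q[obs, action] estimates.
    """
    Q = np.zeros((n_obs, n_actions))
    for _ in range(sweeps):                    # replay the fixed batch repeatedly
        for episode in episodes:
            e = np.zeros_like(Q)               # eligibility traces, per episode
            for t, (o, a, r) in enumerate(episode):
                if t + 1 < len(episode):       # bootstrap from the next tuple
                    o2, a2, _ = episode[t + 1]
                    target = r + gamma * Q[o2, a2]
                else:                          # terminal step of the trajectory
                    target = r
                delta = target - Q[o, a]       # SARSA-style TD error
                e[o, a] += 1.0                 # accumulating trace
                Q += alpha * delta * e
                e *= gamma * lam               # decay all traces
    return Q

def stochastic_policy(Q, temperature=1.0):
    """Softmax over values: maps each observation to action probabilities."""
    z = Q / temperature
    z -= z.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)
```

The policy is deliberately stochastic: because observations alias hidden internal states, committing deterministically to one action per observation can be arbitrarily bad, whereas a probabilistic mapping hedges across the states consistent with the observation.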
Our POMDP construction method calculates approximate observation-action values for each possible observation, then applies hidden-state identification techniques to these values to build the final POMDP. Results show that the constructed POMDPs outperform existing solutions in both running time and solution quality. For generating high-quality artificial gene expression data, we propose a novel multi-model gene expression data enrichment framework. It combines four gene expression data generation models into one unified framework so as to benefit from all of them concurrently: we sample from each generative model separately, pool the generated samples, and output the best ones according to a multi-objective selection mechanism. Results show that our framework produces high-quality artificial samples from which the inferred regulatory networks are better than those inferred from the original datasets alone.
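The sample-pool-select loop of the enrichment framework can be sketched generically. The four generative models and the actual selection objectives are specific to the thesis and not reproduced here; the function names (`pareto_front`, `enrich`) and the example scoring functions below are hypothetical stand-ins, with Pareto non-domination assumed as one plausible multi-objective selection rule:

```python
import numpy as np

def pareto_front(scores):
    """Indices of non-dominated rows (higher is better on every objective)."""
    n = scores.shape[0]
    keep = []
    for i in range(n):
        dominated = any(
            np.all(scores[j] >= scores[i]) and np.any(scores[j] > scores[i])
            for j in range(n) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

def enrich(generators, n_per_model, score_fns):
    """Pool samples from several generative models, keep the Pareto-best.

    generators: callables returning (n, n_genes) arrays of expression profiles.
    score_fns: one callable per selection objective, mapping a profile to a
    score; each generator contributes equally to the candidate pool.
    """
    pool = np.vstack([g(n_per_model) for g in generators])
    scores = np.array([[f(sample) for f in score_fns] for sample in pool])
    return pool[pareto_front(scores)]
```

Because selection operates on the pooled candidates rather than per model, a model that happens to fit one region of the expression space well can dominate there while weaker models still contribute elsewhere, which is the sense in which all four models are exploited concurrently.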