Synthetic data is increasingly present in everyday life. The transportation domain is particularly interested in improving methods for generating synthetic data in order to address the privacy and availability issues of real data. Since we want to generate data for Activity-Based Models, the key challenge of this project is to extend the existing simulation-based generation method, Markov Chain Monte Carlo (MCMC), so that it generates data about the activities of individuals. This allows us to anonymize people's trips and to analyze how people's behavior relates to their trips (e.g. home-work-supermarket-home for people living alone, or home-study-sport-home for students). The generated data can be useful for other studies or for planning in professional transportation practice. Once data is generated, we have to validate how representative the synthetic sample is of the real one. The first step in using MCMC is to prepare the inputs by constructing conditional probabilities. The construction of these conditional probability vectors varies depending on the type of data that we want to generate (e.g. continuous or discrete). The current version of the existing framework handles only discrete attributes. We plan to extend it to the generation of continuous attributes and sequential data. The data used come from the Swiss Mobility and Transport Microcensus (MTMC), a national survey conducted by the Federal Office for Spatial Development (ARE) and the Federal Statistical Office (FSO). This data sample gathers information on people's mobility behavior. Respondents report their socioeconomic characteristics, their daily mobility routines (such as time or distance to work), and detailed records of their travel over a reference period (1 day).
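To make the role of the conditional probabilities concrete, the following is a minimal sketch of a Gibbs sampler, one common MCMC variant, for discrete attributes. It is not the thesis framework: the function names, the two attributes, and the toy records are all hypothetical and chosen only for illustration, and the conditionals are simply counted from the full toy sample.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy records (age_group, occupation); not MTMC data.
toy_records = [
    ("young", "student"), ("young", "employed"),
    ("adult", "employed"), ("adult", "employed"), ("adult", "student"),
    ("senior", "employed"), ("senior", "retired"),
]

def build_conditionals(records, n_attrs):
    """For each attribute i, count its values given all other attributes."""
    conditionals = [defaultdict(Counter) for _ in range(n_attrs)]
    for rec in records:
        for i in range(n_attrs):
            context = rec[:i] + rec[i + 1:]  # values of the other attributes
            conditionals[i][context][rec[i]] += 1
    return conditionals

def gibbs_sample(records, n_samples, burn_in=100, thin=5, seed=0):
    """Draw synthetic records by resampling one attribute at a time."""
    rng = random.Random(seed)
    n_attrs = len(records[0])
    conditionals = build_conditionals(records, n_attrs)
    state = list(rng.choice(records))  # start from an observed record
    samples = []
    for sweep in range(burn_in + n_samples * thin):
        for i in range(n_attrs):
            context = tuple(state[:i] + state[i + 1:])
            counts = conditionals[i][context]
            values = list(counts)
            weights = [counts[v] for v in values]
            # Sample attribute i from its empirical conditional distribution.
            state[i] = rng.choices(values, weights=weights, k=1)[0]
        # Discard the burn-in sweeps, then keep every `thin`-th state.
        if sweep >= burn_in and (sweep - burn_in) % thin == 0:
            samples.append(tuple(state))
    return samples

synthetic = gibbs_sample(toy_records, n_samples=200)
```

Because this sketch conditions each attribute on all the others, it can only reproduce attribute combinations already present in the toy sample. Conditionals built from partial views of the data would allow new combinations, and continuous or sequential attributes would need a different construction of the conditionals.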
2023_06_23_master_thesis_qbochud.pdf