The growing availability of geographically referenced, correlated data has increased the demand for techniques of spatial data analysis. Spatial data consisting of discrete counts pose particular difficulties, and the challenges grow when a large proportion (typically 50% or more) of the counts are zero-valued. Such scenarios arise in many fields of research, and it is often desirable to draw inferences about subtleties of the process even though the lack of substantive information obscures the underlying stochastic mechanism generating the data. An ecological example provides the impetus for the research in this thesis: when observations for a species are recorded over a spatial region and many of the counts are zero-valued, are the abundant zeros due to bad luck, or do aspects of the region make it unsuitable for the survival of the species? In the framework of generalized linear models, we first develop a zero-inflated Poisson generalized linear regression model, which explains the variability of the responses given a set of measured covariates and additionally distinguishes two kinds of zeros: sampling zeros ("bad luck" zeros) and structural zeros (zeros that provide insight into the data-generating process). We then adapt this model to the spatial setting by incorporating dependence via a general, leniently defined quasi-likelihood strategy, which provides consistent, efficient and asymptotically normal estimators, even under erroneous assumptions about the covariance structure. In addition to this robustness to dependence misspecification, our quasi-likelihood model does not require the complete specification of a probability model, which makes it very general and relevant to many settings.
To complement the developed regression model, we further propose methods for the simulation of zero-inflated spatial stochastic processes. This is done by deconstructing the entire process into a mixed, marked spatial point process: we augment existing algorithms for the simulation of spatial marked point processes to include a stochastic mechanism that generates zero-abundant marks (counts) at each location. We propose several such mechanisms, and consider interaction and dependence processes for random locations as well as over a lattice.
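As a rough illustration of the idea of a marked point process with zero-abundant marks, the sketch below simulates random locations on the unit square and attaches a zero-inflated Poisson count to each one. It is a minimal sketch under simplifying assumptions that are not taken from the thesis: locations follow a homogeneous Poisson process, marks are independent of each other and of the locations, and the names `simulate_zip_marked_process`, `intensity`, `pi_zero` and `lam` are hypothetical. The mechanisms with interaction and dependence proposed in the thesis are not reproduced here.

```python
import numpy as np


def simulate_zip_marked_process(intensity, pi_zero, lam, rng=None):
    """Simulate points on the unit square with zero-inflated Poisson marks.

    Assumptions (illustrative only): homogeneous Poisson locations and
    independent marks. `pi_zero` is the probability of a structural zero,
    `lam` is the Poisson mean at locations that are not structural zeros.
    """
    rng = np.random.default_rng(rng)

    # Number of points in the unit square is Poisson with mean `intensity`.
    n = rng.poisson(intensity)
    locations = rng.uniform(0.0, 1.0, size=(n, 2))

    # Structural zeros: the location cannot produce a positive count.
    structural = rng.random(n) < pi_zero

    # Poisson counts everywhere; a zero drawn here is a sampling zero.
    counts = rng.poisson(lam, size=n)
    counts[structural] = 0

    return locations, counts, structural


def zip_zero_probability(pi_zero, lam):
    """Marginal probability of a zero count under the zero-inflated Poisson."""
    return pi_zero + (1.0 - pi_zero) * np.exp(-lam)


if __name__ == "__main__":
    locs, counts, structural = simulate_zip_marked_process(
        intensity=500, pi_zero=0.5, lam=2.0, rng=0
    )
    print("points simulated:", len(counts))
    print("observed zero fraction:", np.mean(counts == 0))
    print("theoretical zero fraction:", zip_zero_probability(0.5, 2.0))
```

In this toy setting the observed zeros mix both kinds: a zero at a location flagged `structural` arises from the inflation component, while a zero elsewhere is a sampling zero from the Poisson draw. Only the simulator can tell them apart, which is the distinction a zero-inflated regression model tries to recover from data.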