On Detecting Incomplete Soft or Hard Selective Sweeps Using Haplotype Structure
We present a new haplotype-based statistic (nS(L)) for detecting both soft and hard sweeps in population genomic data from a single population. We compare our new method with classic single-population haplotype and site frequency spectrum (SFS)-based methods and show that it is more robust, particularly to recombination rate variation. However, all statistics show some sensitivity to the assumptions of the demographic model. Additionally, we show that nS(L) has at least as much power as other methods under a number of different selection scenarios, most notably in the cases of sweeps from standing variation and incomplete sweeps. This conclusion holds up under a variety of demographic models. In many aspects, our new method is similar to the iHS statistic; however, it is generally more robust and does not require a genetic map. To illustrate the utility of our new method, we apply it to HapMap3 data and show that in the Yoruban population, there is strong evidence of selection on genes relating to lipid metabolism. This observation could be related to the known differences in cholesterol levels, and lipid metabolism more generally, between African Americans and other populations. We propose that the underlying causes for the selection on these genes are pleiotropic effects relating to blood parasites rather than their role in lipid metabolism.