The influence of genomic context on mutation patterns in the human genome inferred from rare variants
Understanding patterns of spontaneous mutations is of fundamental interest in studies of human genome evolution and genetic disease. Here, we used extremely rare variants in humans to model the molecular spectrum of single-nucleotide mutations. Compared with common variants in humans and human-chimpanzee fixed differences (substitutions), rare variants, on average, arose more recently in the human lineage and are less affected by the potentially confounding effects of natural selection, population demographic history, and biased gene conversion. We analyzed variants obtained from a population-based sequencing study of 202 genes in >14,000 individuals. We observed considerable variability in the per-gene mutation rate, which was correlated with local GC content but not with recombination rate. Using >20,000 variants with a derived allele frequency ≤ 10^-4, we examined the effect of local GC content and recombination rate on individual variant subtypes and performed comparisons with common variants and substitutions. The influence of local GC content on rare variants differed from that on common variants or substitutions, and the differences varied by variant subtype. Furthermore, recombination rate and recombination hotspots have little effect on rare variants of any subtype, yet both have a relatively strong impact on multiple variant subtypes in common variants and substitutions. This observation is consistent with the effect of biased gene conversion or selection-dependent processes. Our results highlight the distinct biases inherent in the initial mutation patterns and subsequent evolutionary processes that affect segregating variants.
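
The quantities named in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: it assumes that "extremely rare" means a derived allele frequency at or below the 10^-4 cutoff, that frequency is the derived allele count divided by the number of sampled chromosomes, and that the GC association is summarized with a plain Pearson correlation. All function names and the example counts are hypothetical.

```python
from math import sqrt

# Assumed cutoff for "extremely rare" variants (direction of the
# inequality is an assumption of this sketch).
RARE_DAF_THRESHOLD = 1e-4


def derived_allele_frequency(derived_count, n_chromosomes):
    """Fraction of sampled chromosomes that carry the derived allele."""
    return derived_count / n_chromosomes


def is_rare(derived_count, n_chromosomes, threshold=RARE_DAF_THRESHOLD):
    """True if the variant's derived allele frequency is at or below the cutoff."""
    return derived_allele_frequency(derived_count, n_chromosomes) <= threshold


def gc_content(sequence):
    """Fraction of called bases (A, C, G, T) that are G or C; 0.0 if none are called."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

As a hypothetical usage, with 14,000 diploid individuals (28,000 chromosomes) a variant seen on two chromosomes passes `is_rare(2, 28000)`, while one seen on three does not. Passing a list of per-gene GC fractions and a list of per-gene rare-variant densities to `pearson_r` gives the kind of per-gene association the abstract describes.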