Comparative genomics of Esx genes from clinical isolates of Mycobacterium tuberculosis provides evidence for gene conversion and epitope variation
The 23-membered Esx protein family is involved in the host-pathogen interactions of Mycobacterium tuberculosis. These secreted proteins are among the most immunodominant antigens recognized by the human immune system and have thus been used to develop vaccines and immunodiagnostic tests for tuberculosis (TB). Gene pairs for 10 Esx proteins are contained in the ESX-1 to ESX-5 loci, encoding type VII secretion systems. A subset of Esx proteins can be further classified into the Mtb9.9, QILSS, and TB10.4 subfamilies. To survey genetic diversity in the Esx family and its potential for antigenic variation, we sequenced all esx genes from 108 clinical isolates of M. tuberculosis from different clades by using a targeted approach. A total of 109 unique single nucleotide polymorphisms (SNPs) were observed, and 59 of these were nonsynonymous. Some of the resultant amino acid substitutions affect known Esx epitopes, including two in the EsxB (CFP-10) and EsxH (TB10.4) antigens. Assessment of the SNP distribution across the Esx proteins revealed high genetic variability, especially in the Mtb9.9 and QILSS subfamilies, and more conservation in the ESX-1 to ESX-4 loci. Comparison of the DNA sequences of variable esx genes provided clear evidence for recombination events between different genes in the same strain, some of which are predicted to truncate the corresponding protein. Many of these polymorphisms escape detection by ultrahigh-throughput sequencing using short sequence reads, as such approaches cannot distinguish between closely related genes. The esx gene family is dynamic, and sequence changes likely lead to immune variation.
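The split of the observed SNPs into synonymous and nonsynonymous changes can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `classify_snp`, the 0-based position convention, and the toy coding sequence are all assumptions made for illustration, and the sequence is not a real esx gene. The sketch uses the standard codon assignments and also labels substitutions that create a premature stop codon, since such changes would truncate the protein.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: coding sequence (uppercase DNA, reading frame starts at index 0)
    pos: 0-based position of the substituted base (assumed convention)
    alt: the alternative base at that position
    Returns (reference amino acid, alternative amino acid, label).
    """
    offset = pos % 3                      # position of the SNP within its codon
    start = pos - offset                  # index of the first base of the codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        label = "synonymous"
    elif alt_aa == "*":
        label = "nonsense"                # premature stop, truncates the protein
    else:
        label = "nonsynonymous"
    return ref_aa, alt_aa, label


# Hypothetical toy sequence (Met-Ala-Gln-stop), not taken from any esx gene.
toy_cds = "ATGGCTCAATAA"
for pos, alt in [(5, "C"), (3, "A"), (6, "T")]:
    print(pos, alt, classify_snp(toy_cds, pos, alt))
```

In this sketch a premature stop is reported separately from other amino acid changes; a tally comparable to the counts above would presumably group both under nonsynonymous changes, but that grouping is an assumption here.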