Macro-array and bioinformatic analyses reveal mycobacterial 'core' genes, variation in the ESAT-6 gene family and new phylogenetic markers for the Mycobacterium tuberculosis complex
To better understand the biology and the virulence determinants of the two major mycobacterial human pathogens, Mycobacterium tuberculosis and Mycobacterium leprae, their genome sequences were recently determined. In silico comparisons revealed that among the 1439 genes common to both M. tuberculosis and M. leprae, 219 code for proteins that show no similarity to proteins from other organisms. These 'core' genes could therefore be specific to mycobacteria or even to the intracellular mycobacterial pathogens. To test whether these genes really are mycobacteria-specific, they were included in a focused macro-array, which also contained genes from previously defined regions of difference (RD) known to be absent from Mycobacterium bovis BCG relative to M. tuberculosis. DNA from 40 strains of the M. tuberculosis complex was hybridized to the array, and the genes were compared in silico with the near-complete genome sequences of Mycobacterium avium, Mycobacterium marinum and Mycobacterium smegmatis. The results showed that, of the 219 conserved genes, only a very few were missing from one or more of the strains tested. Some of these missing genes code for proteins of the ESAT-6 family, a group of highly immunogenic small proteins whose presence and number vary among the genomically highly conserved members of the M. tuberculosis complex. Overall, the results suggest that, with few exceptions, the 'core' genes conserved between M. tuberculosis H37Rv and M. leprae are also highly conserved among other mycobacterial strains, which makes them interesting potential targets for developing new specific anti-mycobacterial drugs. In contrast, the genes from the RDs showed great variability among certain members of the M. tuberculosis complex, and some new specific deletions in Mycobacterium canettii, Mycobacterium microti and seal isolates were identified and further characterized during this study.
Together with the distribution of a particular 6 or 7 bp micro-deletion in the gene encoding the polyketide synthase pks15/1, these results confirm and further extend the recently presented revised phylogenetic model for the M. tuberculosis complex.