To further unravel the mechanisms responsible for the attenuation of the tuberculosis vaccine Mycobacterium bovis BCG, we used comparative genomics to identify single nucleotide polymorphisms (SNPs) that differed between sequenced strains of M. bovis and M. bovis BCG. These SNPs were assayed in M. bovis isolates from France and the United Kingdom and in different BCG vaccines in order to identify those that arose during the attenuation process that gave rise to BCG. Informative data sets were obtained for 658 SNPs from 21 virulent M. bovis strains and 13 BCG strains; these SNPs showed phylogenetic clustering that was consistent with the geographical origin of the strains and with previous schemes for BCG genealogies. The data revealed a closer relationship between BCG Tice and BCG Pasteur than was previously appreciated and allowed us to position BCG Beijing within a grouping of BCG Denmark-derived strains. Only 186 SNPs were found to differ between virulent M. bovis strains and all BCG strains; 115 of these were nonsynonymous, affecting important functions such as global regulators, transcription factors, and central metabolism, and might therefore influence virulence. We therefore refine previous genealogies of BCG vaccines and define a minimal set of SNPs that differ between virulent M. bovis strains and the attenuated BCG strain; this set will underpin future functional analyses.
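The filtering idea behind the minimal SNP set can be illustrated with a short sketch. This is not the authors' pipeline: the genotype table, strain names, and the function `group_separating_snps` are hypothetical, and the selection rule (no allele shared between the virulent group and the BCG group) is an assumed reading of "SNPs between virulent M. bovis strains and all BCG strains".

```python
from typing import Dict, List

# Hypothetical toy data (not from the study): SNP id -> {strain name: allele}.
GENOTYPES: Dict[str, Dict[str, str]] = {
    "snp_1": {"bovis_A": "C", "bovis_B": "C", "bcg_X": "T", "bcg_Y": "T"},
    "snp_2": {"bovis_A": "G", "bovis_B": "A", "bcg_X": "A", "bcg_Y": "A"},
    "snp_3": {"bovis_A": "G", "bovis_B": "G", "bcg_X": "G", "bcg_Y": "C"},
}


def group_separating_snps(
    genotypes: Dict[str, Dict[str, str]],
    virulent: List[str],
    bcg: List[str],
) -> List[str]:
    """Return SNP ids whose alleles never overlap between the two groups.

    Assumed criterion: a SNP is kept only if no allele seen in any
    virulent strain is also seen in any BCG strain.
    """
    selected = []
    for snp_id, calls in genotypes.items():
        virulent_alleles = {calls[strain] for strain in virulent}
        bcg_alleles = {calls[strain] for strain in bcg}
        if virulent_alleles.isdisjoint(bcg_alleles):
            selected.append(snp_id)
    return selected


if __name__ == "__main__":
    virulent_strains = ["bovis_A", "bovis_B"]
    bcg_strains = ["bcg_X", "bcg_Y"]
    print(group_separating_snps(GENOTYPES, virulent_strains, bcg_strains))
```

In this toy table only `snp_1` separates the two groups: `snp_2` varies among the virulent strains and `snp_3` varies among the BCG strains, so both are excluded under the assumed rule.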