Title: Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence

Authors: Cole, S T; Brosch, R; Parkhill, J; Garnier, T; Churcher, C; Harris, D; Gordon, S V; Eiglmeier, K; Gas, S; Barry, C E; Tekaia, F; Badcock, K; Basham, D; Brown, D; Chillingworth, T; Connor, R; Davies, R; Devlin, K; Feltwell, T; Gentles, S; Hamlin, N; Holroyd, S; Hornsby, T; Jagels, K; Krogh, A; McLean, J; Moule, S; Murphy, L; Oliver, K; Osborne, J; Quail, M A; Rajandream, M A; Rogers, J; Rutter, S; Seeger, K; Skelton, J; Squares, R; Squares, S; Sulston, J E; Taylor, K; Whitehead, S; Barrell, B G

Year: 1998
Record date: 2010-09-07
DOI: 10.1038/31159
URL: https://infoscience.epfl.ch/handle/20.500.14299/53210
PMID: 9634230
Type: text::journal::journal article::research article
Subject: Genome, Bacterial

Abstract: Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.