Tuberculosis is a major cause of human death worldwide and is caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. The bacterium infects 1.8 billion people yearly, which equals one-third of the world population. Pathogenic bacterial genomes contain many perfect, imperfect and approximate tandem repeats that can serve as markers for genotyping these pathogens. Tandem repeats are generated by duplications over successive generations, which change genome structure, thereby providing diversity and improving the fitness of the pathogen during infection. In the present study, we retrieved the reference genomes of 30 Mycobacterium tuberculosis strains and used the Tandem Repeats Finder tool to predict genome-wide repeats. The publicly available M. tuberculosis genomes varied in size from 4.38 Mb to 4.42 Mb. After filtering out low-quality tandem repeats, we found that M. tuberculosis CCDC5079 had the highest tandem repeat density number (DN), while M. tuberculosis KIT87190 had the highest density length (DL). M. tuberculosis Erdman had the lowest repeat density number and length. This study should help in mapping and identifying pathogenic bacterial species and in understanding the importance of tandem repeats in their evolution.
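The abstract does not define the density number (DN) and density length (DL) it reports. As a minimal sketch, assume DN is the number of tandem repeats per megabase of genome and DL is the total repeat length in base pairs per megabase; both definitions are assumptions made for illustration, not the authors' stated method. The helper names `parse_trf_dat` and `repeat_densities` are hypothetical, and the parser assumes Tandem Repeats Finder `.dat`-style rows whose first two whitespace-separated fields are the start and end coordinates of a repeat.

```python
def parse_trf_dat(lines):
    """Yield (start, end) for each repeat row in TRF .dat-style text.

    Assumption: data rows begin with two integer coordinates, while
    header rows (e.g. "Sequence:", "Parameters:") do not.
    """
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            yield int(fields[0]), int(fields[1])


def repeat_densities(repeats, genome_length_bp):
    """Return (DN, DL) under the assumed definitions.

    DN: number of repeats per megabase (assumed definition).
    DL: total repeat length in bp per megabase (assumed definition).
    Overlapping repeats are counted separately; no merging is done.
    """
    repeats = list(repeats)
    genome_mb = genome_length_bp / 1_000_000
    dn = len(repeats) / genome_mb
    dl = sum(end - start + 1 for start, end in repeats) / genome_mb
    return dn, dl
```

Under these assumptions, comparing strains amounts to running the same calculation on each genome's repeat table and ranking the resulting DN and DL values. Any quality filtering of repeats would need to be applied before the rows are passed to `repeat_densities`.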
Original language: English
Journal: Journal of Computational Methods in Sciences and Engineering
Publication status: Published - 2016

