TY - JOUR
T1 - Repeat sequence analysis of Mycobacterium tuberculosis
AU - Paul, Bobby
AU - Gupta, Himanshu
AU - Thokur, Murali S
AU - G, Vasudevan T
AU - Kapaettu, Satyamoorthy
PY - 2016
Y1 - 2016
N2 - Tuberculosis is a major cause of human death around the world and is caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. The bacterium infects 1.8 billion people yearly, which equals one-third of the world population. Pathogenic bacterial genomes contain many perfect, imperfect and approximate tandem repeats that can serve as markers for genotyping these pathogens. Tandem repeats are generated by duplications during successive generations, which change genome structures, thereby providing diversity and improving the fitness of the pathogen during infection. In the present study, we retrieved reference genomes of 30 Mycobacterium tuberculosis strains and used the Tandem Repeats Finder tool to predict genome-wide repeats. The genomes of publicly available Mycobacterium tuberculosis strains varied in size from 4.38 Mb to 4.42 Mb. After trimming out low-quality tandem repeats, we found the strain M. tuberculosis CCDC5079 to have the maximum tandem repeat density number (DN), while density length (DL) was found to be maximum in M. tuberculosis KIT87190. We found the lowest repeat density number and length in M. tuberculosis Erdman. This study would help in mapping and identifying pathogenic bacterial species and in understanding the importance of tandem repeats in their evolution.
AB - Tuberculosis is a major cause of human death around the world and is caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. The bacterium infects 1.8 billion people yearly, which equals one-third of the world population. Pathogenic bacterial genomes contain many perfect, imperfect and approximate tandem repeats that can serve as markers for genotyping these pathogens. Tandem repeats are generated by duplications during successive generations, which change genome structures, thereby providing diversity and improving the fitness of the pathogen during infection. In the present study, we retrieved reference genomes of 30 Mycobacterium tuberculosis strains and used the Tandem Repeats Finder tool to predict genome-wide repeats. The genomes of publicly available Mycobacterium tuberculosis strains varied in size from 4.38 Mb to 4.42 Mb. After trimming out low-quality tandem repeats, we found the strain M. tuberculosis CCDC5079 to have the maximum tandem repeat density number (DN), while density length (DL) was found to be maximum in M. tuberculosis KIT87190. We found the lowest repeat density number and length in M. tuberculosis Erdman. This study would help in mapping and identifying pathogenic bacterial species and in understanding the importance of tandem repeats in their evolution.
U2 - 10.3233/JCM-160597
DO - 10.3233/JCM-160597
M3 - Article
SN - 1472-7978
JO - Journal of Computational Methods in Sciences and Engineering
JF - Journal of Computational Methods in Sciences and Engineering
ER -