Repeat sequence analysis of Mycobacterium Tuberculosis

Research output: Contribution to journal › Article

Abstract

Tuberculosis is a major cause of human death around the world and is caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. The bacterium infects 1.8 billion people yearly, which equals one-third of the world population. Pathogenic bacterial genomes contain many perfect, imperfect and approximate tandem repeats that can serve as markers for genotyping these pathogens. Tandem repeats are generated by duplications during successive generations, which change genome structure, thereby providing diversity and improving the fitness of the pathogen during infection. In the present study, we retrieved the reference genomes of 30 Mycobacterium tuberculosis strains and used the Tandem Repeats Finder tool to predict genome-wide repeats. The genomes of the publicly available Mycobacterium tuberculosis strains varied in size from 4.38 Mb to 4.42 Mb. After trimming out low-quality tandem repeats, we found that strain M. tuberculosis CCDC5079 had the highest tandem repeat density number (DN), while the density length (DL) was highest in M. tuberculosis KIT87190. We found the lowest repeat density number and length in M. tuberculosis Erdman. This study should help in mapping and identifying pathogenic bacterial species and in understanding the importance of tandem repeats in their evolution.
Original language: English
Journal: Journal of Computational Methods in Sciences and Engineering
DOI: 10.3233/JCM-160597
Publication status: Published - 2016

Fingerprint

Tuberculosis
Sequence Analysis
Genes
Genome
Pathogens
Trimming
Bacteria
Duplication
Imperfect
Fitness
Infection
Lowest
Prediction

Cite this

@article{9420ba94c5e944dcbd47caad6f33b3d0,
title = "Repeat sequence analysis of Mycobacterium Tuberculosis",
abstract = "Tuberculosis is a major cause of human death around the world and is caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. The bacterium infects 1.8 billion people yearly, which equals one-third of the world population. Pathogenic bacterial genomes contain many perfect, imperfect and approximate tandem repeats that can serve as markers for genotyping these pathogens. Tandem repeats are generated by duplications during successive generations, which change genome structure, thereby providing diversity and improving the fitness of the pathogen during infection. In the present study, we retrieved the reference genomes of 30 Mycobacterium tuberculosis strains and used the Tandem Repeats Finder tool to predict genome-wide repeats. The genomes of the publicly available Mycobacterium tuberculosis strains varied in size from 4.38 Mb to 4.42 Mb. After trimming out low-quality tandem repeats, we found that strain M. tuberculosis CCDC5079 had the highest tandem repeat density number (DN), while the density length (DL) was highest in M. tuberculosis KIT87190. We found the lowest repeat density number and length in M. tuberculosis Erdman. This study should help in mapping and identifying pathogenic bacterial species and in understanding the importance of tandem repeats in their evolution.",
author = "Bobby Paul and Himanshu Gupta and Thokur, {Murali S} and G, {Vasudevan T} and Satyamoorthy Kapaettu",
year = "2016",
doi = "10.3233/JCM-160597",
language = "English",
journal = "Journal of Computational Methods in Sciences and Engineering",
issn = "1472-7978",
publisher = "IOS Press",
}

TY  - JOUR
T1  - Repeat sequence analysis of Mycobacterium Tuberculosis
AU  - Paul, Bobby
AU  - Gupta, Himanshu
AU  - Thokur, Murali S
AU  - G, Vasudevan T
AU  - Kapaettu, Satyamoorthy
PY  - 2016
Y1  - 2016
AB  - Tuberculosis is a major cause of human death around the world and is caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. The bacterium infects 1.8 billion people yearly, which equals one-third of the world population. Pathogenic bacterial genomes contain many perfect, imperfect and approximate tandem repeats that can serve as markers for genotyping these pathogens. Tandem repeats are generated by duplications during successive generations, which change genome structure, thereby providing diversity and improving the fitness of the pathogen during infection. In the present study, we retrieved the reference genomes of 30 Mycobacterium tuberculosis strains and used the Tandem Repeats Finder tool to predict genome-wide repeats. The genomes of the publicly available Mycobacterium tuberculosis strains varied in size from 4.38 Mb to 4.42 Mb. After trimming out low-quality tandem repeats, we found that strain M. tuberculosis CCDC5079 had the highest tandem repeat density number (DN), while the density length (DL) was highest in M. tuberculosis KIT87190. We found the lowest repeat density number and length in M. tuberculosis Erdman. This study should help in mapping and identifying pathogenic bacterial species and in understanding the importance of tandem repeats in their evolution.
U2  - 10.3233/JCM-160597
DO  - 10.3233/JCM-160597
M3  - Article
JO  - Journal of Computational Methods in Sciences and Engineering
JF  - Journal of Computational Methods in Sciences and Engineering
SN  - 1472-7978
ER  -