Bacterial populations are routinely characterized by microscopic examination, colony formation, and biochemical tests. In recent years, however, bacterial identification, classification, and nomenclature have been strongly influenced by genome sequence information. Advances in bioinformatics and the growth of genome databases have placed genome-based metadata analysis in the hands of researchers who will require taxonomic experience to resolve intricacies. To achieve this, different tools are now available to quantitatively measure genome relatedness among members of the same species, and genome-wide average nucleotide identity (gANI) is one such reliable measure of genome similarity. A genome assembly with a gANI score of <95% at the intraspecies level is generally considered indicative of a separate species. In this study, we analysed 300 whole-genome sequences belonging to 26 different bacterial species available in the NCBI Genome database and calculated their similarity at the intraspecies level based on gANI score. At the intraspecies level, nine bacterial species showed less than 90% gANI and more than 10% unaligned regions. We suggest the appropriate use of available bioinformatics resources after genome assembly to arrive at proper bacterial identification, classification, and nomenclature, and to avoid erroneous species assignments and disparity due to diversity at the intraspecies level.
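The thresholds stated above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the class and function names are hypothetical, the pairwise gANI and unaligned-region values are assumed to have been computed beforehand by a separate tool, and treating a species as flagged when a single genome pair meets both criteria is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable

# Thresholds taken from the text above (all values are percentages).
SPECIES_THRESHOLD = 95.0   # gANI below this suggests a separate species
FLAG_GANI = 90.0           # intraspecies gANI below this is flagged
FLAG_UNALIGNED = 10.0      # unaligned fraction above this is flagged


@dataclass
class PairwiseResult:
    """One pairwise comparison within a species (hypothetical record type)."""
    genome_a: str
    genome_b: str
    gani: float        # genome-wide average nucleotide identity, in percent
    unaligned: float   # share of unaligned regions, in percent


def below_species_threshold(gani: float) -> bool:
    """True if the gANI score falls under the 95% species boundary."""
    return gani < SPECIES_THRESHOLD


def flag_species(results: Iterable[PairwiseResult]) -> bool:
    """True if any pair shows gANI < 90% together with > 10% unaligned regions.

    Assumption: both criteria are evaluated on the same genome pair.
    """
    return any(
        r.gani < FLAG_GANI and r.unaligned > FLAG_UNALIGNED for r in results
    )
```

Given a list of `PairwiseResult` records for one species, `flag_species` indicates whether that species would warrant a closer taxonomic review under these assumptions.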
All Science Journal Classification (ASJC) codes
- Molecular Biology