TY - JOUR
T1 - An information set-based robust text-independent speaker authentication
AU - Medikonda, Jeevan
AU - Bhardwaj, Saurabh
AU - Madasu, Hanmandlu
N1 - Funding Information:
This is a part of the ongoing project on “Personal Authentication using Multimodal Behavioral Biometrics: Voice and Gait” and the authors express their gratitude to the Department of Science and Technology, Government of India (Grant No. SB/S3/EECE/0127/2013) for funding the project.
Publisher Copyright:
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2020/4/1
Y1 - 2020/4/1
AB - This paper presents a method for extracting twofold information set (TFIS) features for text-independent speaker recognition. The method takes the Mel frequency cepstral coefficients from the frames of a sample speech signal and arranges them in a matrix. From this matrix, spatial and temporal information components are derived using the information set concept within an entropy framework. The TFIS features, which combine these two components, are few in number, which reduces computation time and complexity and improves performance in noisy environments. The proposed approach is tested on three datasets, namely NIST-2003, the VoxForge 2014 speech corpus and the VCTK speech corpus, in terms of speed, computational complexity, memory requirement and accuracy. Its performance is validated under different noisy environments at different signal-to-noise ratios.
UR - http://www.scopus.com/inward/record.url?scp=85071006926&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071006926&partnerID=8YFLogxK
U2 - 10.1007/s00500-019-04277-9
DO - 10.1007/s00500-019-04277-9
M3 - Article
AN - SCOPUS:85071006926
SN - 1432-7643
VL - 24
SP - 5271
EP - 5287
JO - Soft Computing
JF - Soft Computing
IS - 7
ER -