TY - JOUR
T1 - Advanced text documents information retrieval system for search services
AU - H S, Chiranjeevi
AU - Shenoy, Manjula K.
N1 - Funding Information:
This work is supported by the Vision Group of Science and Technology (VGST), Government of Karnataka, India [grant number 629, under RFTT Scheme, 25/08/2017, and submitted on 27/06/2019]. We thank our industry mentors who supported this research work: Dr Syam Sundar, IBM India.
Publisher Copyright:
© 2021 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.
PY - 2020
Y1 - 2020
AB - Advances in information technology have driven the growth of text document data in many organizations, and structuring such voluminous data is a complex task. Handling text document data is challenging because it involves not only training models but also numerous additional procedures, e.g., data pre-processing, transformation, and dimensionality reduction. In this paper, we describe the system’s architecture, the technical challenges, and the novel solution we have built. We propose a Recurrent Convolutional Neural Network (RCNN)-based text information retrieval system that efficiently retrieves text documents and information for a user query. The system implements pre-processing using tokenization and stemming, retrieval using TF-IDF (Term Frequency-Inverse Document Frequency), and an RCNN classifier that captures contextual information. A real-time advanced search system is developed on a large dataset from MAHE University. The performance of the proposed text document retrieval system is compared with that of other existing algorithms, and the efficacy of the method is discussed. The proposed RCNN-based text document information retrieval model performs better than the compared algorithms in terms of precision, recall, and F-measure. A high-quality and high-performance text document retrieval search system is presented.
UR - http://www.scopus.com/inward/record.url?scp=85100038573&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100038573&partnerID=8YFLogxK
U2 - 10.1080/23311916.2020.1856467
DO - 10.1080/23311916.2020.1856467
M3 - Article
AN - SCOPUS:85100038573
SN - 2331-1916
VL - 7
JO - Cogent Engineering
JF - Cogent Engineering
IS - 1
M1 - 1856467
ER -