Analysis of feature selection and extraction algorithm for loan data: A big data approach

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Fraudulent activities in financial institutions can damage a country's economic system. These activities can be identified using clustering and classification algorithms, whose effectiveness depends on the quality of the input data. Financial data comes from many sources and in many forms, such as financial statements and stakeholders' activities, among others, and taken together it is large, unstructured big data. Parallel, distributed pre-processing is therefore important for improving data quality. The objective of this work is dimensionality reduction, using feature selection and extraction algorithms, for large volumes of financial data. The paper examines the implications of applying a feature extraction and transformation algorithm, Principal Feature Analysis, to financial data. The effect of the reduced dimension is studied on various classification algorithms for financial loan data. The parallel and distributed implementation is carried out on the IBM Bluemix cloud platform with a Spark notebook. The results show that reducing the number of features significantly improved execution time without compromising accuracy.
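The abstract names Principal Feature Analysis (PFA) without describing its steps. As a rough illustration only, PFA as commonly described runs PCA, clusters the per-feature loading vectors, and keeps the original feature nearest each cluster centre. The sketch below is a single-machine NumPy version, not the paper's Spark implementation; the function name `principal_feature_analysis`, its parameters, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def principal_feature_analysis(X, n_components, n_select, n_iter=50, seed=0):
    """Hypothetical helper: return sorted column indices of X chosen by PFA.

    X is a (samples x features) array. At most n_select indices are returned.
    """
    rng = np.random.default_rng(seed)

    # Standardize columns so no feature dominates through scale alone.
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

    # PCA via eigen-decomposition of the covariance matrix.
    cov = np.cov(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = np.argsort(eigvals)[::-1][:n_components]
    A = eigvecs[:, top]  # row i holds feature i's loadings on the top components

    # Plain k-means over the rows of A (one point per original feature).
    centers = A[rng.choice(A.shape[0], size=n_select, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_select):
            if np.any(labels == k):
                centers[k] = A[labels == k].mean(axis=0)

    # Keep the original feature whose loading vector is closest to each centre.
    dists = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
    return sorted({int(dists[:, k].argmin()) for k in range(n_select)})


# Usage on synthetic data (assumed for illustration; not the paper's loan data).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 6))
selected = principal_feature_analysis(X_demo, n_components=3, n_select=3)
X_reduced = X_demo[:, selected]  # reduced matrix to pass to a classifier
```

Unlike plain PCA, this keeps a subset of the original columns rather than linear combinations of them, so the reduced data stays interpretable in terms of the source fields.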

Original language: English
Title of host publication: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2147-2151
Number of pages: 5
Volume: 2017-January
ISBN (Electronic): 9781509063673
Publication status: Published - 30-11-2017
Event: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017 - Manipal, Mangalore, India
Duration: 13-09-2017 → 16-09-2017

Conference

Conference: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
Country/Territory: India
City: Manipal, Mangalore
Period: 13-09-17 → 16-09-17

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
