Analysis of feature selection and extraction algorithm for loan data

A big data approach

Pai M.M. Manohara, Girija Attigeri, Radhika M. Pai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Fraudulent activities in financial institutions can damage a country's economic system. These activities can be identified using clustering and classification algorithms, whose effectiveness depends on the quality of the input data. Moreover, financial data comes from various sources and in various forms, such as financial statements, stakeholders' activities, and others. The combined data is vast, unstructured big data, so parallel, distributed pre-processing is important for improving its quality. The objective of this work is dimensionality reduction, considering feature selection and extraction algorithms, for large volumes of financial data. The paper attempts to understand the implications of a feature extraction and transformation algorithm, using Principal Feature Analysis, on financial data. The effect of the reduced dimension is studied on various classification algorithms for financial loan data. The parallel and distributed implementation is carried out on the IBM Bluemix cloud platform with a Spark notebook. The results show that reducing the number of features significantly improved execution time without compromising accuracy.
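As a rough illustration of the feature-reduction step named in the abstract, the following is a minimal single-machine sketch of Principal Feature Analysis as it is commonly described: compute the principal components, cluster the per-feature loading vectors with k-means, and keep the original feature closest to each cluster centre. It is not the paper's Spark implementation. The function name, its parameters, and the synthetic data are assumptions made for this example only.

```python
import numpy as np


def principal_feature_analysis(X, n_components, n_features, n_iter=50, seed=0):
    """Return sorted indices of selected columns of X (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Standardize columns so that no feature dominates because of its scale.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std
    # Eigen-decomposition of the covariance matrix (eigh: ascending eigenvalues).
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    top = np.argsort(eigvals)[::-1][:n_components]
    # Row i of A describes how feature i loads on the retained components.
    A = eigvecs[:, top]
    # Plain k-means on the rows of A, one cluster per feature to keep.
    centers = A[rng.choice(A.shape[0], size=n_features, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_features):
            if np.any(labels == k):
                centers[k] = A[labels == k].mean(axis=0)
    # Final assignment, then keep the feature nearest to each cluster centre.
    dists = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    selected = set()
    for k in range(n_features):
        members = np.where(labels == k)[0]
        if members.size:  # a cluster may end up empty; skip it
            selected.add(int(members[dists[members, k].argmin()]))
    return sorted(selected)


# Synthetic stand-in for loan data: 3 signals, each duplicated with small noise.
demo_rng = np.random.default_rng(1)
base = demo_rng.normal(size=(200, 3))
X = np.hstack([base, base + 0.01 * demo_rng.normal(size=(200, 3))])
selected = principal_feature_analysis(X, n_components=3, n_features=3)
print("selected column indices:", selected)
```

Unlike a PCA projection, this keeps a subset of the original columns, so a downstream classifier can be trained on X[:, selected] with interpretable features. Because k-means can leave a cluster empty, the function may return fewer than n_features indices.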

Original language: English
Title of host publication: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2147-2151
Number of pages: 5
Volume: 2017-January
ISBN (Electronic): 9781509063673
DOIs: https://doi.org/10.1109/ICACCI.2017.8126163
Publication status: Published - 30-11-2017
Event: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017 - Manipal, Mangalore, India
Duration: 13-09-2017 to 16-09-2017

Conference

Conference: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
Country: India
City: Manipal, Mangalore
Period: 13-09-17 to 16-09-17

Fingerprint

Feature extraction
Electric sparks
Big data
Economics
Processing

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems

Cite this

Manohara, P. M. M., Attigeri, G., & Pai, R. M. (2017). Analysis of feature selection and extraction algorithm for loan data: A big data approach. In 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017 (Vol. 2017-January, pp. 2147-2151). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICACCI.2017.8126163
@inproceedings{db499b2c22c841e4976ffa4811611849,
title = "Analysis of feature selection and extraction algorithm for loan data: A big data approach",
abstract = "Fraudulent activities in financial institutions can damage a country's economic system. These activities can be identified using clustering and classification algorithms, whose effectiveness depends on the quality of the input data. Moreover, financial data comes from various sources and in various forms, such as financial statements, stakeholders' activities, and others. The combined data is vast, unstructured big data, so parallel, distributed pre-processing is important for improving its quality. The objective of this work is dimensionality reduction, considering feature selection and extraction algorithms, for large volumes of financial data. The paper attempts to understand the implications of a feature extraction and transformation algorithm, using Principal Feature Analysis, on financial data. The effect of the reduced dimension is studied on various classification algorithms for financial loan data. The parallel and distributed implementation is carried out on the IBM Bluemix cloud platform with a Spark notebook. The results show that reducing the number of features significantly improved execution time without compromising accuracy.",
author = "Manohara, {Pai M.M.} and Girija Attigeri and Pai, {Radhika M.}",
year = "2017",
month = "11",
day = "30",
doi = "10.1109/ICACCI.2017.8126163",
language = "English",
volume = "2017-January",
pages = "2147--2151",
booktitle = "2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN
T1 - Analysis of feature selection and extraction algorithm for loan data
T2 - A big data approach
AU - Manohara, Pai M.M.
AU - Attigeri, Girija
AU - Pai, Radhika M.
PY - 2017/11/30
Y1 - 2017/11/30
AB - Fraudulent activities in financial institutions can damage a country's economic system. These activities can be identified using clustering and classification algorithms, whose effectiveness depends on the quality of the input data. Moreover, financial data comes from various sources and in various forms, such as financial statements, stakeholders' activities, and others. The combined data is vast, unstructured big data, so parallel, distributed pre-processing is important for improving its quality. The objective of this work is dimensionality reduction, considering feature selection and extraction algorithms, for large volumes of financial data. The paper attempts to understand the implications of a feature extraction and transformation algorithm, using Principal Feature Analysis, on financial data. The effect of the reduced dimension is studied on various classification algorithms for financial loan data. The parallel and distributed implementation is carried out on the IBM Bluemix cloud platform with a Spark notebook. The results show that reducing the number of features significantly improved execution time without compromising accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85042675377&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042675377&partnerID=8YFLogxK
U2 - 10.1109/ICACCI.2017.8126163
DO - 10.1109/ICACCI.2017.8126163
M3 - Conference contribution
VL - 2017-January
SP - 2147
EP - 2151
BT - 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
PB - Institute of Electrical and Electronics Engineers Inc.
ER -
