A study of various varieties of distributed data mining architectures

Sukriti Paul, Nisha P. Shetty, Balachandra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Owing to the explosion of data in today’s world, datasets are enormous, geographically distributed, and heterogeneous. Data mining aims at extracting useful information from the voluminous repositories where data is stored. Predictive analysis of hidden patterns in massive datasets poses a challenge. The problems faced when using the data warehousing model for such datasets are privacy, centralization of data present at multiple independent sites, bandwidth limitations, complexity of integration, and analysis of the data at a global level. Distributed algorithms have been designed to address these problems. Distributed data mining (DDM) techniques regard the distributed datasets as one virtual table and assume the existence of a global model that could be built if the data were combined centrally. This paper presents distributed data mining systems and frameworks for analyzing data and mining the required knowledge from it. Emphasis is laid on the architectures of such models. Factors such as computation resources, communication, hardware, and usage of distributed data resources are considered when analyzing or designing distributed algorithms. Such algorithms primarily aim at reducing memory expense and distributing the working load evenly. Distributed data mining finds application in e-commerce, e-business, intrusion detection systems, and sensor networks.
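
The idea of treating distributed datasets as one virtual table with a global model can be illustrated with a minimal sketch. This is not taken from the paper: the site names, the sample values, and the helper names (`LocalSummary`, `summarize_site`, `merge_summaries`) are hypothetical, and a mean/variance "model" is assumed purely for illustration. Each site shares only a small summary, and the coordinator recovers the same result it would get from the centrally combined data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class LocalSummary:
    """Sufficient statistics a site shares instead of its raw records."""
    count: int
    total: float
    total_sq: float


def summarize_site(values: Iterable[float]) -> LocalSummary:
    """Runs locally at one site; only three numbers leave the site."""
    vals = list(values)
    return LocalSummary(len(vals), sum(vals), sum(v * v for v in vals))


def merge_summaries(summaries: List[LocalSummary]) -> Dict[str, float]:
    """Runs at the coordinator; builds the global model from local summaries."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return {"count": n, "mean": mean, "variance": variance}


# Hypothetical horizontally partitioned data held at three independent sites.
sites = {
    "site_a": [2.0, 4.0, 6.0],
    "site_b": [8.0, 10.0],
    "site_c": [12.0],
}

global_model = merge_summaries([summarize_site(v) for v in sites.values()])

# Centralized baseline, computed here only to check the distributed result.
pooled = [v for vals in sites.values() for v in vals]
central_mean = sum(pooled) / len(pooled)
central_var = sum((v - central_mean) ** 2 for v in pooled) / len(pooled)

print(global_model)
```

Only the per-site summaries cross the network in this sketch, which is one simple way to respect the privacy and bandwidth constraints the abstract mentions; real DDM architectures exchange richer local models.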

Original language: English
Title of host publication: Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA
Publisher: Springer Verlag
Pages: 77-88
Number of pages: 12
ISBN (Print): 9789811075629
DOI: https://doi.org/10.1007/978-981-10-7563-6_9
Publication status: Published - 01-01-2018
Event: 6th International Conference on Frontiers of Intelligent Computing: Theory and Applications, FICTA 2017 - Bhubaneswar, India
Duration: 14-10-2017 to 15-10-2017

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 701
ISSN (Print): 2194-5357

Conference

Conference: 6th International Conference on Frontiers of Intelligent Computing: Theory and Applications, FICTA 2017
Country: India
City: Bhubaneswar
Period: 14-10-17 to 15-10-17

Fingerprint

Data mining
Parallel algorithms
Data warehouses
Intrusion detection
Sensor networks
Explosions
Hardware
Bandwidth
Data storage equipment
Communication
Industry
Predictive analytics

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science (all)

Cite this

Paul, S., Shetty, N. P., & Balachandra (2018). A study of various varieties of distributed data mining architectures. In Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA (pp. 77-88). (Advances in Intelligent Systems and Computing; Vol. 701). Springer Verlag. https://doi.org/10.1007/978-981-10-7563-6_9
Paul, Sukriti ; Shetty, Nisha P. ; Balachandra. / A study of various varieties of distributed data mining architectures. Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA. Springer Verlag, 2018. pp. 77-88 (Advances in Intelligent Systems and Computing).
@inproceedings{40bb5b33f75741eabfdbd323513b9fe3,
title = "A study of various varieties of distributed data mining architectures",
abstract = "Owing to the explosion of data in today’s world, datasets are enormous, geographically distributed, and heterogeneous. Data mining aims at extracting useful information from the voluminous repositories where data is stored. Predictive analysis of hidden patterns in massive datasets poses a challenge. The problems faced when using the data warehousing model for such datasets are privacy, centralization of data present at multiple independent sites, bandwidth limitations, complexity of integration, and analysis of the data at a global level. Distributed algorithms have been designed to address these problems. Distributed data mining (DDM) techniques regard the distributed datasets as one virtual table and assume the existence of a global model that could be built if the data were combined centrally. This paper presents distributed data mining systems and frameworks for analyzing data and mining the required knowledge from it. Emphasis is laid on the architectures of such models. Factors such as computation resources, communication, hardware, and usage of distributed data resources are considered when analyzing or designing distributed algorithms. Such algorithms primarily aim at reducing memory expense and distributing the working load evenly. Distributed data mining finds application in e-commerce, e-business, intrusion detection systems, and sensor networks.",
author = "Paul, Sukriti and Shetty, {Nisha P.} and Balachandra",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-981-10-7563-6_9",
language = "English",
isbn = "9789811075629",
series = "Advances in Intelligent Systems and Computing",
volume = "701",
publisher = "Springer Verlag",
pages = "77--88",
booktitle = "Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA",
address = "Germany",

}

Paul, S, Shetty, NP & Balachandra 2018, A study of various varieties of distributed data mining architectures. in Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA. Advances in Intelligent Systems and Computing, vol. 701, Springer Verlag, pp. 77-88, 6th International Conference on Frontiers of Intelligent Computing: Theory and Applications, FICTA 2017, Bhubaneswar, India, 14-10-17. https://doi.org/10.1007/978-981-10-7563-6_9

A study of various varieties of distributed data mining architectures. / Paul, Sukriti; Shetty, Nisha P.; Balachandra.

Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA. Springer Verlag, 2018. p. 77-88 (Advances in Intelligent Systems and Computing; Vol. 701).

TY - GEN

T1 - A study of various varieties of distributed data mining architectures

AU - Paul, Sukriti

AU - Shetty, Nisha P.

AU - Balachandra

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Owing to the explosion of data in today’s world, datasets are enormous, geographically distributed, and heterogeneous. Data mining aims at extracting useful information from the voluminous repositories where data is stored. Predictive analysis of hidden patterns in massive datasets poses a challenge. The problems faced when using the data warehousing model for such datasets are privacy, centralization of data present at multiple independent sites, bandwidth limitations, complexity of integration, and analysis of the data at a global level. Distributed algorithms have been designed to address these problems. Distributed data mining (DDM) techniques regard the distributed datasets as one virtual table and assume the existence of a global model that could be built if the data were combined centrally. This paper presents distributed data mining systems and frameworks for analyzing data and mining the required knowledge from it. Emphasis is laid on the architectures of such models. Factors such as computation resources, communication, hardware, and usage of distributed data resources are considered when analyzing or designing distributed algorithms. Such algorithms primarily aim at reducing memory expense and distributing the working load evenly. Distributed data mining finds application in e-commerce, e-business, intrusion detection systems, and sensor networks.

UR - http://www.scopus.com/inward/record.url?scp=85045659728&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045659728&partnerID=8YFLogxK

U2 - 10.1007/978-981-10-7563-6_9

DO - 10.1007/978-981-10-7563-6_9

M3 - Conference contribution

SN - 9789811075629

T3 - Advances in Intelligent Systems and Computing

SP - 77

EP - 88

BT - Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA

PB - Springer Verlag

ER -

Paul S, Shetty NP, Balachandra. A study of various varieties of distributed data mining architectures. In Information and Decision Sciences - Proceedings of the 6th International Conference on FICTA. Springer Verlag. 2018. p. 77-88. (Advances in Intelligent Systems and Computing). https://doi.org/10.1007/978-981-10-7563-6_9