An experimental study of the effect of frequency of co-occurrence of features in clustering

Radhika M. Pai, V. S. Ananthanarayana

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper explores the effect of the frequency of co-occurrence of features on the accuracy of clustering results by incorporating a frequency component into the clustering algorithm. By frequency, we mean the number of times a sequence of features appears in the data set. We use this component in the algorithm and study its effect on the resulting accuracy. The algorithm used is the PC-tree (pattern count tree) based clustering algorithm. The PC-tree is a compact and complete representation of the data set. It is independent of data order and incremental. It can be applied to changing data and changing knowledge, i.e., dynamic databases. Each node of the PC-tree has, in addition to other fields, a count field, which keeps track of the number of features shared by the pattern. In the literature, the PC-tree has been used for clustering, but the count field was used only to retrieve the transactions. In this paper, we make use of this field in clustering. We also use the partitioned PC-tree based algorithm and study the effect of frequency on its accuracy. We conducted extensive experiments on the OCR handwritten digit dataset, a real dataset, and observed the effect of frequency on the clustering results. The results of all our experiments are tabulated.
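
As an illustration only, the count bookkeeping described in the abstract can be sketched as a small prefix tree in Python. This is not the authors' implementation: the names `PCNode`, `PCTree`, `insert` and `patterns` are hypothetical, and the sketch assumes that the count at a node records how many inserted patterns pass through that node, which is what makes it possible to retrieve the original transactions with their frequencies.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Tuple

Pattern = Tuple[int, ...]


@dataclass
class PCNode:
    """One node of the sketch tree (hypothetical layout)."""
    feature: int
    count: int = 0  # assumed: number of patterns passing through this node
    children: Dict[int, "PCNode"] = field(default_factory=dict)


class PCTree:
    """Prefix tree with per-node counts; an assumption-laden sketch."""

    def __init__(self) -> None:
        self.root = PCNode(feature=-1)  # sentinel root, holds no feature

    def insert(self, pattern: Pattern) -> None:
        """Add one pattern; shared prefixes reuse nodes and bump counts."""
        node = self.root
        for f in pattern:
            child = node.children.get(f)
            if child is None:
                child = PCNode(feature=f)
                node.children[f] = child
            child.count += 1
            node = child

    def patterns(self) -> Iterator[Tuple[Pattern, int]]:
        """Yield (pattern, frequency) pairs recovered from the counts."""
        stack = [(self.root, ())]
        while stack:
            node, prefix = stack.pop()
            # Patterns ending here = arrivals minus those continuing deeper.
            ends_here = node.count - sum(c.count for c in node.children.values())
            if node is not self.root and ends_here > 0:
                yield prefix, ends_here
            for child in node.children.values():
                stack.append((child, prefix + (child.feature,)))


# Usage: the same pattern inserted twice is stored once with frequency 2.
tree = PCTree()
for p in [(1, 2, 3), (1, 2, 4), (1, 2, 3)]:
    tree.insert(p)
recovered = sorted(tree.patterns())
```

Because insertion only creates nodes or increments counts, the recovered pattern frequencies in this sketch do not depend on the order in which patterns arrive, and new patterns can be added at any time without rebuilding the tree.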

Original language: English
Title of host publication: 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007, Proceedings
DOI: https://doi.org/10.1109/ISSPA.2007.4555535
Publication status: Published - 2007
Event: 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007 - Sharjah, United Arab Emirates
Duration: 12-02-2007 to 15-02-2007

Conference

Conference: 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007
Country: United Arab Emirates
City: Sharjah
Period: 12-02-07 to 15-02-07

Fingerprint

Clustering algorithms
Optical character recognition
Data structures
Experiments

All Science Journal Classification (ASJC) codes

  • Signal Processing

Cite this

Pai, R. M., & Ananthanarayana, V. S. (2007). An experimental study of the effect of frequency of co-occurrence of features in clustering. In 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007, Proceedings [4555535] https://doi.org/10.1109/ISSPA.2007.4555535
@inproceedings{91ac39e376cc4106aeb5c67ae53e5ad9,
title = "An experimental study of the effect of frequency of co-occurrence of features in clustering",
author = "Pai, {Radhika M.} and Ananthanarayana, {V. S.}",
year = "2007",
doi = "10.1109/ISSPA.2007.4555535",
language = "English",
isbn = "1424407796",
booktitle = "2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007, Proceedings",

}

TY - GEN

T1 - An experimental study of the effect of frequency of co-occurrence of features in clustering

AU - Pai, Radhika M.

AU - Ananthanarayana, V. S.

PY - 2007

Y1 - 2007

UR - http://www.scopus.com/inward/record.url?scp=51549104346&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=51549104346&partnerID=8YFLogxK

U2 - 10.1109/ISSPA.2007.4555535

DO - 10.1109/ISSPA.2007.4555535

M3 - Conference contribution

SN - 1424407796

SN - 9781424407798

BT - 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007, Proceedings

ER -
