A diverse assimilation of sequence and structure dependent features for amyloid plaque prediction using Random Forests

Smitha Sunil Kumaran Nair, N. V. Subba Reddy, K. S. Hareesha, S. Balaji

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

The failure of proteins to fold correctly results in amyloidosis. Therefore, amyloid plaque prediction has become significant for narrowing down the exploration of anti-amyloidosis and related drugs. In this research article, we propose a unique hybrid approach to computationally predict the formation of amyloid plaques by exploiting diversity in the feature vector extracted from protein sequences and structures. The diversity in the sequence feature space is exploited using structure-dependent features in addition to physico-chemical information from amino acid chemistry and frequency-spectrum-based parameters. We explored the prediction capability of independent and integrated feature vectors with an ensemble machine learning classifier, Random Forests. Computational analysis provides evidence that the assimilation of the diverse feature set outperforms the individual feature arrays, with a balanced prediction accuracy of 0.830 and an area under the Receiver Operating Characteristic curve of 0.918 on a stratified 10-fold cross-validation test.
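
The evaluation terms used in the abstract (stratified 10-fold cross-validation, balanced accuracy, ROC curve area) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' pipeline: the function names, the toy labels and scores, and the reading of "balanced accuracy" as the mean of sensitivity and specificity are all assumptions made for illustration, and no Random Forest is trained here.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = amyloid-forming, 0 = non-amyloid.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]  # classifier scores for class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(balanced_accuracy(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))            # 0.9375
```

In a full experiment, a classifier would be trained on nine folds and scored on the held-out fold, and both metrics would be computed over the pooled or averaged held-out predictions.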

Original language: English
Pages (from-to): 38-44
Number of pages: 7
Journal: Current Proteomics
Volume: 10
Issue number: 1
DOI: 10.2174/15701646112099990006
Publication status: Published - 2013

Fingerprint

Amyloid Plaques
Amyloidosis
Amyloid
Proteins
Amino Acids
Learning systems
Classifiers
Research
Pharmaceutical Preparations
Forests
Machine Learning

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology

Cite this

@article{ded6c91ee90e4f38b7f5670a30f650ce,
title = "A diverse assimilation of sequence and structure dependent features for amyloid plaque prediction using Random Forests",
abstract = "The failure of proteins to fold correctly results in amyloidosis. Therefore, amyloid plaque prediction has become significant for narrowing down the exploration of anti-amyloidosis and related drugs. In this research article, we propose a unique hybrid approach to computationally predict the formation of amyloid plaques by exploiting diversity in the feature vector extracted from protein sequences and structures. The diversity in the sequence feature space is exploited using structure-dependent features in addition to physico-chemical information from amino acid chemistry and frequency-spectrum-based parameters. We explored the prediction capability of independent and integrated feature vectors with an ensemble machine learning classifier, Random Forests. Computational analysis provides evidence that the assimilation of the diverse feature set outperforms the individual feature arrays, with a balanced prediction accuracy of 0.830 and an area under the Receiver Operating Characteristic curve of 0.918 on a stratified 10-fold cross-validation test.",
author = "Nair, {Smitha Sunil Kumaran} and Reddy, {N. V. Subba} and Hareesha, {K. S.} and Balaji, S.",
year = "2013",
doi = "10.2174/15701646112099990006",
language = "English",
volume = "10",
pages = "38--44",
journal = "Current Proteomics",
issn = "1570-1646",
publisher = "Bentham Science Publishers B.V.",
number = "1",

}

A diverse assimilation of sequence and structure dependent features for amyloid plaque prediction using Random Forests. / Nair, Smitha Sunil Kumaran; Reddy, N. V. Subba; Hareesha, K. S.; Balaji, S.

In: Current Proteomics, Vol. 10, No. 1, 2013, p. 38-44.

TY - JOUR

T1 - A diverse assimilation of sequence and structure dependent features for amyloid plaque prediction using Random Forests

AU - Nair, Smitha Sunil Kumaran

AU - Reddy, N. V. Subba

AU - Hareesha, K. S.

AU - Balaji, S.

PY - 2013

Y1 - 2013

N2 - The failure of proteins to fold correctly results in amyloidosis. Therefore, amyloid plaque prediction has become significant for narrowing down the exploration of anti-amyloidosis and related drugs. In this research article, we propose a unique hybrid approach to computationally predict the formation of amyloid plaques by exploiting diversity in the feature vector extracted from protein sequences and structures. The diversity in the sequence feature space is exploited using structure-dependent features in addition to physico-chemical information from amino acid chemistry and frequency-spectrum-based parameters. We explored the prediction capability of independent and integrated feature vectors with an ensemble machine learning classifier, Random Forests. Computational analysis provides evidence that the assimilation of the diverse feature set outperforms the individual feature arrays, with a balanced prediction accuracy of 0.830 and an area under the Receiver Operating Characteristic curve of 0.918 on a stratified 10-fold cross-validation test.

UR - http://www.scopus.com/inward/record.url?scp=84881492491&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84881492491&partnerID=8YFLogxK

U2 - 10.2174/15701646112099990006

DO - 10.2174/15701646112099990006

M3 - Article

VL - 10

SP - 38

EP - 44

JO - Current Proteomics

JF - Current Proteomics

SN - 1570-1646

IS - 1

ER -