A combination of compositional index and genetic algorithm for predicting transmembrane helical segments

Nazar Zaki*, Salah Bouktif, Sanja Lazarova-Molnar

*Kontaktforfatter for dette arbejde

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

Resumé

Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.

OriginalsprogEngelsk
Artikelnummere21821
TidsskriftPLOS ONE
Vol/bind6
Udgave nummer7
ISSN1932-6203
DOI
StatusUdgivet - 1. aug. 2011

Fingeraftryk

amino acid sequences
Genetic algorithms
topology
Amino Acids
Proteins
amino acids
Topology
methodology
proteins
protein structure
amino acid composition
bioinformatics
membrane proteins
Bioinformatics
Computational Biology
prediction
Membrane Proteins
testing
Datasets
Chemical analysis

Citer dette

@article{a9f04208057444b687d4babfcfdfce8a,
title = "A combination of compositional index and genetic algorithm for predicting transmembrane helical segments",
abstract = "Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8{\%} of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.",
author = "Nazar Zaki and Salah Bouktif and Sanja Lazarova-Molnar",
year = "2011",
month = "8",
day = "1",
doi = "10.1371/journal.pone.0021821",
language = "English",
volume = "6",
journal = "P L o S One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "7",

}

A combination of compositional index and genetic algorithm for predicting transmembrane helical segments. / Zaki, Nazar; Bouktif, Salah; Lazarova-Molnar, Sanja.

I: PLOS ONE, Bind 6, Nr. 7, e21821, 01.08.2011.

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

TY - JOUR

T1 - A combination of compositional index and genetic algorithm for predicting transmembrane helical segments

AU - Zaki, Nazar

AU - Bouktif, Salah

AU - Lazarova-Molnar, Sanja

PY - 2011/8/1

Y1 - 2011/8/1

N2 - Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.

AB - Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.

UR - http://www.scopus.com/inward/record.url?scp=79960779998&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0021821

DO - 10.1371/journal.pone.0021821

M3 - Journal article

VL - 6

JO - P L o S One

JF - P L o S One

SN - 1932-6203

IS - 7

M1 - e21821

ER -