Fulltext:
70580.pdf
Embargo:
until further notice
Size:
442.9Kb
Format:
PDF
Description:
Publisher’s version
Publication year
2008
Source
Bioinformatics, 24, 7, (2008), pp. 908-15
Publication type
Article / Letter to editor
Organization
Bioinformatics
CMBI
Former Organization
Bioinformatics (umcn)
Journal title
Bioinformatics
Volume
vol. 24
Issue
iss. 7
Page start
p. 908
Page end
p. 915
Subject
Bioinformatics; NCMLS 3: Growth and differentiation; NCMLS 7: Chemical and physical biology
Abstract
MOTIVATION: Recent advances in sequencing techniques have yielded enormous amounts of protein sequence data from various species. This large dataset allows sequence comparison between paralogous and orthologous proteins to identify motifs or functional positions that account for the differences between functional subgroups ('specificity' positions). Algorithms such as SDPpred and the two-entropies analysis (TEA) have been developed to detect such specificity positions from a multiple sequence alignment (MSA) grouped into classes according to certain biological functions. Other algorithms, such as TreeDet, compute a classification and then predict specificity positions associated with it. However, there are still many unresolved questions: Was the optimal subdivision of a protein family achieved? Do the definitions at different levels of the phylogenetic tree affect the prediction of specificity positions? Can the whole phylogenetic tree be used instead of only one level in it to predict specificity positions?
RESULTS: Here we present a novel method, TEA-O (Two-entropies analysis-Objective), to trace the evolutionary pressure from the root to the branches of the phylogenetic tree. At each level of the tree, a TEA plot is produced to capture the signal of the evolutionary pressure. A consensus TEA-O plot is composed from the whole series of plots to provide a condensed representation. Positions related to functions that evolved early (conserved) or later (specificity) are close to the lower-left or upper-left corner of the TEA-O plot, respectively. This novel approach allows an unbiased, user-independent analysis of residue relevance in a protein family. We compared our TEA-O method with various algorithms using both synthetic and real protein sequences. The results show that our method is robust, sensitive to subtle differences in evolutionary pressure during evolution, and comprehensive, because all positions in the MSA are presented in the consensus plot.
AVAILABILITY: All computer programs and datasets used in this work are available at http://nava.liacs.nl/kye/TEA-O/ for academic use.
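The idea behind a single TEA plot can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each alignment column gets two Shannon entropies: one over all sequences, and one averaged over predefined subfamilies. It also assumes the within-subfamily average is plotted on the horizontal axis and the global entropy on the vertical axis, which matches the abstract's description of conserved positions at the lower left and specificity positions at the upper left. The function names, the toy alignment, and the grouping are hypothetical, gaps are ignored, and the per-level tree traversal and consensus step of TEA-O are not covered.

```python
import math
from collections import Counter


def shannon_entropy(residues):
    """Shannon entropy (natural log) of a list of residue symbols."""
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def two_entropies(alignment, groups):
    """Return one (average within-group entropy, global entropy) pair per column.

    alignment: equal-length aligned sequences (hypothetical toy input).
    groups: one subfamily label per sequence (assumed to be given in advance).
    """
    labels = sorted(set(groups))
    points = []
    for col in range(len(alignment[0])):
        column = [seq[col] for seq in alignment]
        global_h = shannon_entropy(column)
        within = [
            shannon_entropy([r for r, g in zip(column, groups) if g == label])
            for label in labels
        ]
        points.append((sum(within) / len(within), global_h))
    return points


# Toy alignment: column 0 is fully conserved, column 1 differs only
# between subfamilies, column 2 varies inside the subfamilies as well.
alignment = ["ACD", "ACE", "AGF", "AGD"]
groups = ["s1", "s1", "s2", "s2"]
points = two_entropies(alignment, groups)

for index, (within_h, global_h) in enumerate(points):
    print(f"column {index}: within={within_h:.3f} global={global_h:.3f}")
```

Under these assumptions, column 0 sits at the origin (conserved), column 1 has zero within-subfamily entropy but non-zero global entropy (a specificity-like position), and column 2 has non-zero values on both axes.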