Identifying sequence-structure pairs undetected by sequence alignments

Citation
S. Miyazawa and R.L. Jernigan, Identifying sequence-structure pairs undetected by sequence alignments, PROTEIN ENG, 13(7), 2000, pp. 459-475
Number of citations
51
Language
English
Article type
Article
Subject categories
Biochemistry & Biophysics
Journal title
PROTEIN ENGINEERING
ISSN journal
0269-2139
Volume
13
Issue
7
Year of publication
2000
Pages
459 - 475
Database
ISI
SICI code
0269-2139(200007)13:7<459:ISPUBS>2.0.ZU;2-W
Abstract
We examine how effectively simple potential functions previously developed can identify compatibilities between sequences and structures of proteins for database searches. The potential function consists of pairwise contact energies, repulsive packing potentials of residues for overly dense arrangement and short-range potentials for secondary structures, all of which were estimated from statistical preferences observed in known protein structures. Each potential energy term was modified to represent compatibilities between sequences and structures for globular proteins. Pairwise contact interactions in a sequence-structure alignment are evaluated in a mean field approximation on the basis of probabilities of site pairs to be aligned. Gap penalties are assumed to be proportional to the number of contacts at each residue position, and as a result gaps will be more frequently placed on protein surfaces than in cores. In addition to minimum energy alignments, we use probability alignments made by successively aligning site pairs in order by pairwise alignment probabilities. The results show that the present energy function and alignment method can detect well both folds compatible with a given sequence and, inversely, sequences compatible with a given fold, and yield mostly similar alignments for these two types of sequence and structure pairs. Probability alignments consisting of most reliable site pairs only can yield extremely small root mean square deviations, and including less reliable pairs increases the deviations. Also, it is observed that secondary structure potentials are usefully complementary to yield improved alignments with this method. Remarkably, by this method some individual sequence-structure pairs are detected having only 5-20% sequence identity.
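Illustrative note: the abstract describes two mechanical ideas that a short sketch may make concrete, namely gap penalties proportional to the number of contacts at each residue position, and probability alignments built by adding site pairs in order of decreasing alignment probability. The Python below is only a minimal sketch under stated assumptions, not the authors' implementation. The function names, the `scale` and `threshold` parameters, and the rule that a new pair must preserve sequence order relative to pairs already accepted are all hypothetical; the pairwise probabilities are assumed to have been computed elsewhere.

```python
def gap_penalties(contact_counts, scale=1.0):
    """Per-position gap penalty proportional to the contact count.

    `scale` is a hypothetical proportionality constant. Buried positions
    (many contacts) get large penalties, surface positions small ones.
    """
    return [scale * n for n in contact_counts]


def probability_alignment(pair_prob, threshold=0.0):
    """Greedy alignment from a matrix of pairwise alignment probabilities.

    pair_prob[i][j] is the (precomputed) probability that sequence site i
    aligns with structure site j. Pairs are visited from most to least
    probable and accepted only if they keep the alignment one-to-one and
    order-preserving with respect to every pair accepted so far. That
    consistency rule is an assumption of this sketch.
    """
    candidates = sorted(
        (
            (p, i, j)
            for i, row in enumerate(pair_prob)
            for j, p in enumerate(row)
            if p >= threshold
        ),
        key=lambda t: -t[0],
    )
    accepted = []
    for _, i, j in candidates:
        # (i - a) * (j - b) > 0 means (i, j) lies strictly before or
        # strictly after (a, b) in both the sequence and the structure.
        if all((i - a) * (j - b) > 0 for a, b in accepted):
            accepted.append((i, j))
    return sorted(accepted)
```

Raising `threshold` keeps only the most reliable site pairs, which corresponds to the abstract's observation that such restricted alignments can give very small root mean square deviations, while lowering it admits less reliable pairs.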