Using the CATH domain database to assign structures and functions to the genome sequences

Citation
F. Pearl et al., Using the CATH domain database to assign structures and functions to the genome sequences, BIOCH SOC T, 28, 2000, pp. 269-275
Number of citations
31
Language
English
Article type
Article
Subject categories
Biochemistry & Biophysics
Journal title
BIOCHEMICAL SOCIETY TRANSACTIONS
ISSN journal
0300-5127
Volume
28
Year of publication
2000
Part
2
Pages
269 - 275
Database
ISI
SICI code
0300-5127(200002)28:<269:UTCDDT>2.0.ZU;2-X
Abstract
The CATH database of protein structures contains approximately 18000 domains organized according to their (C)lass, (A)rchitecture, (T)opology and (H)omologous superfamily [1]. Relationships between evolutionarily related structures (homologues) within the database have been used to test the sensitivity of various sequence search methods in order to identify relatives in GenBank and other sequence databases [2]. Subsequent application of the most sensitive and efficient algorithms, gapped BLAST and the profile-based method Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST) [3], could be used to assign structural data to between 22 and 36% of microbial genomes in order to improve functional annotation and enhance understanding of biological mechanism. However, on a cautionary note, an analysis of functional conservation within fold groups and homologous superfamilies in the CATH database revealed that, whilst function was conserved in nearly 55% of enzyme families, function had diverged considerably in some highly populated families. In these families, functional properties should be inherited far more cautiously and the probable effects of substitutions in key functional residues carefully assessed.