J. Bowles et al., Phylogeny of the SOX family of developmental transcription factors based on sequence and structural indicators, DEVELOP BIO, 227(2), 2000, pp. 239-255
Members of the SOX family of transcription factors are found throughout the
animal kingdom, are characterized by the presence of a DNA-binding HMG
domain, and are involved in a diverse range of developmental processes.
Previous attempts to group SOX genes and deduce their structural,
functional, and evolutionary relationships have relied largely on complete
or partial HMG box sequence of a limited number of genes. In this study, we
have used complete HMG domain sequence, full-length protein structure, and
gene organization data to study the pattern of evolution within the family.
For the first time, a substantial number of invertebrate SOX sequences have
been included in the analysis. We find support for subdivision of the
family into groups A-H, as has been suggested in some previous studies, and
for the assignment of two new groups, I and J. For vertebrate genes, it
appears that relatedness as suggested by HMG domain sequence is congruent
with relatedness as indicated by overall structure of the full-length
protein and intron-exon structure of the genes. Most of the SOX groups
identified in vertebrates were represented by a single SOX sequence in each
invertebrate species studied. We have named anonymous sequences and, where
appropriate, have suggested systematic names for some previously identified
sequences. In addition, we identify an HMG domain signature motif which may
be considered representative of the SOX family. Based on our data, we
propose a robust phylogeny of SOX genes that reflects their evolutionary
history in metazoans. (C) 2000 Academic Press.