A variety of nuclear localization signals (NLSs) are experimentally known,
although only one motif was available for database searches through PROSITE.
We initially collected a set of 91 experimentally verified NLSs from the
literature. Through iterated 'in silico mutagenesis' we then extended the
set to 214 potential NLSs. This final set matched in 43% of all known nuclear
proteins and in no known non-nuclear protein. We estimated that >17% of
all eukaryotic proteins may be imported into the nucleus. Finally, we found
an overlap between the NLS and DNA-binding region for 90% of the proteins
for which both the NLS and DNA-binding regions were known. Thus, evolution
seemed to have used part of the existing DNA-binding mechanism when
compartmentalizing DNA-binding proteins into the nucleus. However, only 56
of our 214 NLS motifs overlapped with DNA-binding regions. These 56 NLSs
enabled a de novo prediction of partial DNA-binding regions for ~800
proteins in human, fly, worm and yeast.
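
The motif-extension idea described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the acceptance rule (a variant must match at least one nuclear protein and no non-nuclear protein), the function names, and the toy sequences are all assumptions made for illustration. Only PKKKRKV, the NLS of the SV40 large T antigen, is a known signal; the other patterns are hypothetical.

```python
import re

def coverage(motifs, sequences):
    """Fraction of sequences matched by at least one motif (regex)."""
    if not sequences:
        return 0.0
    hits = sum(any(re.search(m, s) for m in motifs) for s in sequences)
    return hits / len(sequences)

def accept_variant(variant, nuclear, non_nuclear):
    """Assumed acceptance rule for a mutated motif: it must match at least
    one nuclear sequence and must not match any non-nuclear sequence."""
    if any(re.search(variant, s) for s in non_nuclear):
        return False
    return any(re.search(variant, s) for s in nuclear)

# Toy sequences, invented for this example (not real proteins).
nuclear = ["MAPKKKRKVEDL", "MSTRKRKLAAG", "MGLSAEELLK"]
non_nuclear = ["MKTLLVAGAS", "MAGSWLLPQE"]

# Start from one verified signal and try two hypothetical variants.
motifs = ["PKKKRKV"]                    # SV40 large T antigen NLS
candidates = ["[KR]K[KR]K", "K."]       # illustrative patterns only

for variant in candidates:
    if accept_variant(variant, nuclear, non_nuclear):
        motifs.append(variant)

print("accepted motifs:", motifs)
print("nuclear coverage:", coverage(motifs, nuclear))
print("non-nuclear coverage:", coverage(motifs, non_nuclear))
```

Requiring zero matches among non-nuclear proteins keeps the extended set specific, at the cost of coverage: a permissive pattern such as `K.` is rejected as soon as it hits a single non-nuclear sequence.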