Fred Hutchinson Cancer Research Center, Seattle
Inteins are proteins that catalyze their own excision from host proteins, ligating the host flanks with a peptide bond. This protein splicing activity is autocatalytic and does not depend on any host-specific factors: it occurs in heterologous organisms, in different in vitro systems, and with various natural and synthetic flanks. Sequence analysis and experimental work have shown that some inteins also contain a homing endonuclease activity that can mediate horizontal transfer. Inteins have a diverse and sporadic distribution across species and proteins. They occur in all three domains of life but so far have been found in just 21 species and strains. Inteins are currently known in only 24 types of proteins, but these include metabolic enzymes, DNA and RNA polymerases, a protease, and a vacuolar ATPase.
Using advanced sequence analysis methods, I have identified three distinct modular domains in inteins. The distal domains are predicted to perform the protein splicing reaction. An optional central endonuclease domain is shown to correspond to different types of homing endonucleases in different inteins. Intein N-domain motifs are also found in the autocatalytic C-terminal domains present in hedgehog and other protein families. These results predict functional roles for specific residues and suggest that inteins are of extremely ancient origin, now persisting in only a few conserved niches in proteins.