Database of predictions for entire proteomes


Large-scale genome sequencing has provided us with the building blocks of living organisms. However, to obtain new insights into physiological and biochemical processes, it is essential to analyse and catalogue the structural and functional features of each individual protein encoded in the genome. Such predictions for entire proteomes suggest conclusions in the context of comparative genomics and provide crucial information in the context of structural genomics.

PEP is a database of Predictions for Entire Proteomes. The database contains summaries of analyses of protein sequences from a range of organisms representing all three major kingdoms of life: eukaryotes, prokaryotes and archaea. The analyses cover structural and functional features, including:

• coiled-coil regions predicted by COILS
• 3-state secondary structure predicted by PROFsec
• percentage relative solvent accessibility predicted by PROFacc
• transmembrane helices assigned by PHDhtm
• low sequence complexity regions according to SEG
• long stretches of non-regular secondary structure (NORS)
• presence and location of signal peptide cleavage sites identified by SignalP
• PROSITE motifs
• nuclear localization signals
• cellular functional classes assigned by EUCLID
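
As an illustration of one feature in the list above, the following minimal sketch shows how long stretches of non-regular secondary structure could be located in a 3-state secondary structure string. It is not the PEP implementation: the function name `find_nonregular_stretches`, the state letters (`H` for helix, `E` for strand, anything else treated as non-regular) and the `min_len` default are assumptions made for this example, and the actual NORS criteria used by the database may differ.

```python
from typing import List, Tuple


def find_nonregular_stretches(ss: str, min_len: int = 70) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs of uninterrupted runs of
    non-regular residues that are at least min_len residues long.

    ss is a 3-state secondary structure string. 'H' (helix) and 'E' (strand)
    count as regular; every other character counts as non-regular.
    min_len is an illustrative threshold, not a value taken from PEP.
    """
    stretches = []
    run_start = None  # start index of the current non-regular run, if any
    for i, state in enumerate(ss):
        if state not in ("H", "E"):
            if run_start is None:
                run_start = i
        else:
            # A regular residue ends the current run; keep it if long enough.
            if run_start is not None and i - run_start >= min_len:
                stretches.append((run_start, i))
            run_start = None
    # Handle a run that extends to the end of the sequence.
    if run_start is not None and len(ss) - run_start >= min_len:
        stretches.append((run_start, len(ss)))
    return stretches
```

A call such as `find_nonregular_stretches(prediction, min_len=70)` on a per-residue prediction string would return the candidate regions as index pairs; a more realistic criterion would probably tolerate a small fraction of helix or strand residues inside a stretch rather than requiring an uninterrupted run.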

The PEP database can be accessed through SRS, PSI-BLAST and BLASTP interfaces. It can also be downloaded as flat files.

http://cubic.bioc.columbia.edu/PEP/query.html