Computational protein kinase substrate identification

Post-translational modification by phosphorylation is the most abundant type of cellular regulation, affecting essentially every cellular process, including metabolism, growth, differentiation, motility, membrane transport, learning and memory. Defects in protein kinase function result in a variety of diseases, and kinases are major targets for drug design.

The identification of protein kinase substrates requires understanding the peptide specificity of protein kinases. Understanding phosphorylation specificity will therefore contribute to understanding the roles of protein kinases in health and disease, and help identify new therapeutic targets and strategies for protein kinase inhibition and anti-kinase drug development.

In eukaryotes, protein kinases phosphorylate mainly Ser or Thr residues (protein Ser/Thr kinases) or Tyr residues (protein Tyr kinases), although phosphorylation of His residues, as well as of other amino acids, also occurs.
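As a rough illustration of the starting point for substrate prediction, the sketch below lists every Ser, Thr and Tyr residue in a protein sequence together with its flanking residues, since it is this local peptide context that specificity-based methods evaluate. The function name candidate_sites, the flank parameter and the example peptide are assumptions made for this illustration and are not taken from any of the tools described here.

```python
# Amino acids that eukaryotic protein kinases mainly phosphorylate.
PHOSPHOACCEPTORS = {"S": "Ser", "T": "Thr", "Y": "Tyr"}


def candidate_sites(sequence, flank=3):
    """Return (position, residue, window) for every Ser/Thr/Tyr in `sequence`.

    Positions are 1-based. The window holds `flank` residues on each side of
    the site, padded with '-' where it runs past either end of the sequence.
    """
    seq = sequence.upper()
    padded = "-" * flank + seq + "-" * flank
    sites = []
    for i, residue in enumerate(seq):
        if residue in PHOSPHOACCEPTORS:
            # In `padded`, the site sits at index i + flank, so this slice
            # is centred on it.
            window = padded[i:i + 2 * flank + 1]
            sites.append((i + 1, PHOSPHOACCEPTORS[residue], window))
    return sites


if __name__ == "__main__":
    # Made-up peptide, used only to illustrate the output format.
    for position, residue, window in candidate_sites("MKRRASLGY"):
        print(position, residue, window)
```

Every residue listed this way is only a candidate; whether a given kinase actually phosphorylates it depends on the surrounding sequence and on cellular context.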

The three-dimensional structures are known for a number of protein kinases, some with bound substrates and nucleotides. The characteristic fold consists of a smaller N-terminal “lobe”, comprising a five-stranded β-sheet and one or two α-helices, and a larger C-terminal lobe that usually contains six major α-helices and two small β-sheets (as shown in the figure below).




The peptide substrate is held in the groove between the two lobes. The phosphate group is transferred from an ATP molecule bound close to the substrate, towards the small lobe. A conserved Asp residue is essential for catalysis.


Databases:

1. The Phospho.ELM database contains a collection of experimentally verified serine, threonine and tyrosine phosphorylation sites in eukaryotic proteins. The entries, manually annotated and based on the scientific literature, provide information about the phosphorylated proteins and the exact positions of known phosphorylation sites.

http://phospho.elm.eu.org/

2. General databases on post-translational modifications, such as the Human Protein Reference Database (HPRD)

www.hprd.org/

3. The RESID Database of Protein Modifications is a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications.

http://www.ebi.ac.uk/RESID/

Prediction tools:

1. ELM is a resource for predicting functional sites in eukaryotic proteins.

www.elm.eu.org/

2. Identification of phosphorylation sites
The NetPhos 2.0 server produces neural network predictions for serine, threonine and tyrosine phosphorylation sites in eukaryotic proteins.

http://www.cbs.dtu.dk/services/NetPhos/

3. Prediction of kinase-specific phosphorylation sites (for example PKA sites)
NetPhosK produces neural network predictions of kinase-specific eukaryotic protein phosphorylation sites. It covers the following kinases: PKA, PKC, PKG, CKII, Cdc2, CaM-II, ATM, DNA PK, Cdk5, p38 MAPK, GSK3, CKI, PKB, RSK, INSR, EGFR and Src.

http://www.cbs.dtu.dk/services/NetPhosK/

4. Scansite
Scansite searches for motifs within proteins that are likely to be phosphorylated by specific protein kinases or bind to domains such as SH2 domains, 14-3-3 domains or PDZ domains.

http://scansite.mit.edu/

5. PredPhospho
PredPhospho predicts phosphorylation sites in protein sequences.

http://pred.ngri.re.kr/PredPhospho.htm

6. AutoMotif
The AutoMotif Server (AMS) tool allows identification of post-translational modification (PTM) sites in proteins.

http://automotif.bioinfo.pl/


7. GPS (group-based phosphorylation predicting and scoring method)
GPS covers a larger number of protein kinase families and is reported to have greater sensitivity and specificity than Scansite and PredPhospho.

http://bioinformatics.lcd-ustc.org/gps_web/predict.php

8. Predikin
Predikin is a computer program that can be used to predict substrates for serine/threonine protein kinases.

http://florey.biosci.uq.edu.au/kinsub/home.htm

Predikin can predict peptide specificities directly from the amino acid sequences and can therefore be used for most kinases, including hypothetical and uncharacterized ones.
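Several of the tools listed above rank candidate sites by how well the residues around them match the preferences of a given kinase, and comparisons between tools are usually expressed as sensitivity and specificity. The sketch below shows the general idea with a small position weight matrix and a score threshold. It is not the algorithm of any tool named here: the weights in TOY_MATRIX, the threshold, the test sequence and the set of "known" sites are all invented for illustration (the matrix loosely favours Arg at positions -3 and -2, a preference commonly described for PKA).

```python
# Hypothetical weights for positions -3, -2, 0 and +1 relative to the
# phosphoacceptor (position 0). All numbers are invented for this sketch.
TOY_MATRIX = {
    -3: {"R": 2.0, "K": 1.0},
    -2: {"R": 2.0, "K": 1.0},
    0: {"S": 1.0, "T": 0.5},
    1: {"L": 0.5, "I": 0.5, "V": 0.5},
}
DEFAULT_WEIGHT = -0.5  # penalty for a residue not listed at a scored position


def score_site(sequence, index, matrix=TOY_MATRIX, default=DEFAULT_WEIGHT):
    """Sum the matrix weights around the 0-based `index` in `sequence`.

    Positions that fall outside the sequence contribute nothing.
    """
    total = 0.0
    for offset, weights in matrix.items():
        j = index + offset
        if 0 <= j < len(sequence):
            total += weights.get(sequence[j], default)
    return total


def predict_sites(sequence, threshold=3.0):
    """Return 1-based positions of Ser/Thr residues scoring >= threshold."""
    seq = sequence.upper()
    return [i + 1 for i, residue in enumerate(seq)
            if residue in "ST" and score_site(seq, i) >= threshold]


def sensitivity_specificity(predicted, known, candidates):
    """Compute (sensitivity, specificity) over a set of candidate positions.

    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A value is None when its denominator is zero.
    """
    predicted, known, candidates = set(predicted), set(known), set(candidates)
    tp = len(predicted & known)
    fn = len(known - predicted)
    fp = len(predicted - known)
    tn = len(candidates - predicted - known)
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


if __name__ == "__main__":
    sequence = "MKRRASLGTPSY"  # made-up sequence
    candidates = [i + 1 for i, aa in enumerate(sequence) if aa in "ST"]
    predicted = predict_sites(sequence)
    known = {6, 9}  # invented "experimentally verified" sites
    print(predicted, sensitivity_specificity(predicted, known, candidates))
```

Raising the threshold generally trades sensitivity for specificity, which is why a comparison between predictors is only meaningful when both values are reported on the same set of verified sites.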