iProt-Sub
Protease Specificity Prediction Server

Introduction

Proteases are enzymes that hydrolyze peptide bonds at specific recognition sites in their protein substrates. They play a central role in "life and death" processes, such as neural, endocrine and cardiovascular signalling, digestion, degradation of misfolded or unwanted proteins, immunity, cell division, and apoptosis.

The key to understanding the physiological role of a protease is to identify its natural substrates. Many proteases have the potential to cleave multiple proteins in different physiological compartments, with cleavage influenced by factors such as substrate sequence, substrate conformation and accessibility. Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target substrates; however, this information can at present only be derived from experimental approaches. In the absence of such data, the targets of a protease cannot be deduced a priori from its structure or sequence. Solving the "substrate identification" problem is fundamental both for understanding protease biology and for developing therapeutics that target specific protease-regulated pathways.

To address this problem, we developed iProt-Sub, an integrated feature-based server for predicting novel substrates and their cleavage sites for 38 different protease families from primary sequences. The iProt-Sub server combines support vector regression with multi-feature extraction, making predictions from the primary sequence and from structural characteristics inferred or predicted from the amino acid sequence. The features are derived from 11 encoding schemes: AAC, ACC, AAindex, Binary, Blosum, CHR, CKSAAP, DISO, KNN, PSI and PSSM. Based on these features, iProt-Sub offers important advantages over traditional substrate specificity prediction servers in its ability to identify novel substrates, and achieves greater coverage and accuracy than previous predictors. The iProt-Sub server is freely available at http://iProt-Sub.erc.monash.edu.au. To our knowledge, iProt-Sub is the most comprehensive server capable of predicting the cleavage sites of multiple proteases within a single substrate sequence using machine learning techniques. It is anticipated to be a valuable tool for in silico identification of protease cleavage sites.
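
To give a sense of what sequence-based feature encoding involves, the following minimal Python sketch computes three of the schemes named above (Binary, AAC and CKSAAP) for a short window of residues around a candidate cleavage site. It is an illustration only and not the iProt-Sub implementation: the function names, the eight-residue window, the range of gaps (k = 0 to 2) and the normalization are all assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order used by every encoding below.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_encoding(window):
    """Binary (one-hot) encoding: 20 values per residue.

    Non-standard residues (e.g. 'X') are encoded as all zeros.
    """
    vec = []
    for res in window:
        vec.extend(1 if res == aa else 0 for aa in AMINO_ACIDS)
    return vec


def aac(window):
    """Amino acid composition: fraction of each standard residue in the window."""
    n = len(window)
    return [window.count(aa) / n for aa in AMINO_ACIDS]


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    For each gap k, counts residue pairs separated by k positions and
    normalizes by the number of such pairs in the window (assumed scheme).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = len(window) - k - 1  # number of k-spaced pairs in the window
        for i in range(total):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard residues
                counts[pair] += 1
        features.extend(counts[p] / total if total > 0 else 0.0 for p in pairs)
    return features


def encode_window(window):
    """Concatenate the three encodings into one feature vector."""
    return binary_encoding(window) + aac(window) + cksaap(window)


# Hypothetical eight-residue window spanning a candidate cleavage site.
window = "DEVDGSAK"
features = encode_window(window)
print(len(features))  # 8*20 + 20 + 3*400 values
```

A vector of this kind, built for every candidate position in a substrate and extended with the remaining encodings, is the sort of input a regression model such as support vector regression could be trained on.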