sigcleave predicts the site of cleavage between a signal sequence and the mature exported protein. The predictive accuracy is estimated to be around 75-80% for both prokaryotic and eukaryotic proteins.
sigcleave uses the method of von Heijne, as modified in his later book, where the treatment of positions -1 and -3 in the matrix is slightly altered (see references).
The answer is partly because these sites can be relevant in some biological cases (additional pre-processing for example), but mostly because ...
There is one thing in bioinformatics you cannot be certain of ... the start of a protein sequence. The end is easy to predict. The start depends on promoters, transcriptional controls, splicing, etc.
Most importantly, sigcleave is not perfect - you should check the results and decide whether you like the prediction.
Also, remember you can put -send 50 on the command line to make sure it only checks the first 50 residues.
By default sigcleave writes a 'motif' report file.
Here is the default file for eukaryotic signals:
# Amino acid counts for 161 Eukaryotic Signal Peptides,
# from von Heijne (1986), Nucl. Acids. Res. 14:4683-4690
#
# The cleavage site is between +1 and -1
# Sample: 161 aligned sequences
#
# R -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1  +1  +2 Expect
# - --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ------
  A  16  13  14  15  20  18  18  17  25  15  47   6  80  18   6   14.5
  C   3   6   9   7   9  14   6   8   5   6  19   3   9   8   3    4.5
  D   0   0   0   0   0   0   0   0   5   3   0   5   0  10  11    8.9
  E   0   0   0   1   0   0   0   0   3   7   0   7   0  13  14   10.0
  F  13   9  11  11   6   7  18  13   4   5   0  13   0   6   4    5.6
  G   4   4   3   6   3  13   3   2  19  34   5   7  39  10   7   12.1
  H   0   0   0   0   0   1   1   0   5   0   0   6   0   4   2    3.4
  I  15  15   8   6  11   5   4   8   5   1  10   5   0   8   7    7.4
  K   0   0   0   1   0   0   1   0   0   4   0   2   0  11   9   11.3
  L  71  68  72  79  78  45  64  49  10  23   8  20   1   8   4   12.1
  M   0   3   7   4   1   6   2   2   0   0   0   1   0   1   2    2.7
  N   0   1   0   1   1   0   0   0   3   3   0  10   0   4   7    7.1
  P   2   0   2   0   0   4   1   8  20  14   0   1   3   0  22    7.4
  Q   0   0   0   1   0   6   1   0  10   8   0  18   3  19  10    6.3
  R   2   0   0   0   0   1   0   0   7   4   0  15   0  12   9    7.6
  S   9   3   8   6  13  10  15  16  26  11  23  17  20  15  10   11.4
  T   2  10   5   4   5  13   7   7  12   6  17   8   6   3  10    9.7
  V  20  25  15  18  13  15  11  27   0  12  32   3   0   8  17   11.1
  W   4   3   3   1   1   2   6   3   1   3   0   9   0   2   0    1.8
  Y   0   1   4   0   0   1   3   1   1   2   0   5   0   1   7    5.6
If you use matrix tables with a different number of residues before or after the cleavage site, you must also set the advanced parameters nval and pval.
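To make the role of the count table concrete, here is a minimal Python sketch of weight-matrix scoring in the style of von Heijne: each count is turned into a log-odds weight, ln(count / Expect), and a window of 13 + 2 residues is slid along the sequence. This is not sigcleave's own code. The names load_weights, score_windows, n_before, n_after and pseudocount are hypothetical, the example sequence is made up, and replacing zero counts by 1 is an assumption; the altered treatment of positions -1 and -3 mentioned above is not reproduced here.

```python
import math

# Counts copied from the eukaryotic table above.
# Columns: residue, positions -13..-1, +1, +2, then the Expect value.
MATRIX_TEXT = """\
A  16  13  14  15  20  18  18  17  25  15  47   6  80  18   6  14.5
C   3   6   9   7   9  14   6   8   5   6  19   3   9   8   3   4.5
D   0   0   0   0   0   0   0   0   5   3   0   5   0  10  11   8.9
E   0   0   0   1   0   0   0   0   3   7   0   7   0  13  14  10.0
F  13   9  11  11   6   7  18  13   4   5   0  13   0   6   4   5.6
G   4   4   3   6   3  13   3   2  19  34   5   7  39  10   7  12.1
H   0   0   0   0   0   1   1   0   5   0   0   6   0   4   2   3.4
I  15  15   8   6  11   5   4   8   5   1  10   5   0   8   7   7.4
K   0   0   0   1   0   0   1   0   0   4   0   2   0  11   9  11.3
L  71  68  72  79  78  45  64  49  10  23   8  20   1   8   4  12.1
M   0   3   7   4   1   6   2   2   0   0   0   1   0   1   2   2.7
N   0   1   0   1   1   0   0   0   3   3   0  10   0   4   7   7.1
P   2   0   2   0   0   4   1   8  20  14   0   1   3   0  22   7.4
Q   0   0   0   1   0   6   1   0  10   8   0  18   3  19  10   6.3
R   2   0   0   0   0   1   0   0   7   4   0  15   0  12   9   7.6
S   9   3   8   6  13  10  15  16  26  11  23  17  20  15  10  11.4
T   2  10   5   4   5  13   7   7  12   6  17   8   6   3  10   9.7
V  20  25  15  18  13  15  11  27   0  12  32   3   0   8  17  11.1
W   4   3   3   1   1   2   6   3   1   3   0   9   0   2   0   1.8
Y   0   1   4   0   0   1   3   1   1   2   0   5   0   1   7   5.6
"""


def load_weights(text, pseudocount=1.0):
    """Return {residue: [log-odds weight for each matrix column]}.

    Assumption: zero counts are replaced by `pseudocount` so that the
    logarithm is defined. sigcleave's real handling may differ.
    """
    weights = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        counts = [float(x) for x in fields[1:-1]]
        expect = float(fields[-1])
        weights[fields[0]] = [
            math.log((c if c > 0 else pseudocount) / expect) for c in counts
        ]
    return weights


def score_windows(seq, weights, n_before=13, n_after=2, send=None):
    """Score every full window of n_before + n_after residues.

    Returns (score, position) pairs, where position is the 1-based index
    of the residue in the +1 column (first residue after the cleavage).
    `send` limits the scan to the first `send` residues.
    """
    seq = seq.upper()[:send]
    width = n_before + n_after
    results = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(residue not in weights for residue in window):
            continue  # skip windows containing non-standard residues
        score = sum(weights[residue][i] for i, residue in enumerate(window))
        results.append((score, start + n_before + 1))
    return results


weights = load_weights(MATRIX_TEXT)
example = "MKTLLLLAVLLLALSSAQAAPEVKDEL"  # hypothetical sequence, illustration only
best_score, best_pos = max(score_windows(example, weights, send=50))
print(f"best candidate: mature peptide starts at residue {best_pos} "
      f"(score {best_score:.2f})")
```

In this sketch, n_before and n_after must match the number of columns before and after the cleavage site in the table, which is the same constraint that applies when a different matrix is used.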
Original program "SIGCLEAVE" (EGCG 1989) by