The prot2pept command takes one or more protein sequences as input, performs an in silico digest on them and returns the resulting peptides as output. By default, a trypsin digest is simulated, but other proteases can be simulated by supplying a cleavage pattern with the --pattern parameter. This command runs entirely locally and doesn't connect to any server.

Input

The prot2pept command expects protein sequences as input via standard input. A single protein sequence per line is expected.

Example
$ cat input.txt
AALTERAALE
MDGTEKYIIVK
$ cat input.txt | prot2pept
AALTER
AALE
MDGTEK
YIIVK

Output

The prot2pept command writes the resulting peptides to standard output, one peptide per line.

Fasta support

The prot2pept command supports input in fasta format. This format consists of a fasta header (a line starting with a >), followed by one or more lines containing the protein sequence. When this format is detected, the command behaves slightly differently. The main difference is that line breaks within a sequence are ignored: all lines between two fasta headers are treated as a single protein. In addition, the fasta headers are also written to output.

Example
$ cat input.txt
AALTE
AALRTER
$ cat input.txt | prot2pept
AALTE
AALR
TER
$ cat input.txt
> fasta header
AALTE
AALRTER
> other header
PEPTIDE
$ cat input.txt | prot2pept
> fasta header
AALTEAALR
TER
> other header
PEPTIDE
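
The fasta behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the code prot2pept itself uses: the function names digest and digest_lines are hypothetical, and the sketch assumes that fasta input is detected by checking whether the first non-empty line starts with a >.

```python
import re

# Default trypsin rule from this page: cut after K or R unless followed by P.
TRYPSIN = re.compile(r"([KR])([^P])")


def digest(protein, regex=TRYPSIN):
    """Hypothetical helper: cut between group 1 and group 2 of every match."""
    peptides, start, pos = [], 0, 0
    while True:
        match = regex.search(protein, pos)
        if match is None:
            break
        cut = match.end(1)
        if cut > start:
            peptides.append(protein[start:cut])
            start = cut
        # Always move forward so the loop terminates.
        pos = max(cut, match.start() + 1)
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def digest_lines(lines):
    """Hypothetical helper: digest plain or fasta input given as a list of lines."""
    lines = [line.strip() for line in lines if line.strip()]
    # Assumption: fasta mode is triggered by a header on the first line.
    if not lines or not lines[0].startswith(">"):
        return [pep for line in lines for pep in digest(line)]
    output, sequence = [], []
    for line in lines:
        if line.startswith(">"):
            # A new header closes the previous record: join its lines and digest.
            output.extend(digest("".join(sequence)))
            sequence = []
            output.append(line)
        else:
            sequence.append(line)
    output.extend(digest("".join(sequence)))
    return output


print("\n".join(digest_lines(["AALTE", "AALRTER"])))
print("\n".join(digest_lines(
    ["> fasta header", "AALTE", "AALRTER", "> other header", "PEPTIDE"]
)))
```

With the two inputs from the example, the first call prints AALTE, AALR and TER, while the second joins AALTE and AALRTER into one protein and prints the headers along with AALTEAALR, TER and PEPTIDE.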

Command-line options

--pattern / -p Specify cleavage pattern

By default, proteins are split by simulating a trypsin digest. This corresponds to splitting the input string using the regular expression ([KR])([^P]), i.e. cleaving after every K or R that is not followed by a P. The --pattern option allows you to specify an alternative (Ruby-style) regular expression to split the sequences.

Example
$ echo "LGAARPLGAGLAKVIGAGIGIGK" | prot2pept
LGAARPLGAGLAK
VIGAGIGIGK
$ echo "LGAARPLGAGLAKVIGAGIGIGK" | prot2pept --pattern '([KR])([^V])'
LGAAR
PLGAGLAKVIGAGIGIGK
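
To make the role of the two capture groups concrete, the following Python sketch cuts a sequence between the first and the second group of every match. It is a minimal illustration under stated assumptions, not the implementation used by prot2pept: the function name digest is hypothetical, and Python's re syntax is used here, which agrees with Ruby for simple patterns like these but is not identical in general.

```python
import re


def digest(protein, pattern=r"([KR])([^P])"):
    """Hypothetical helper: split protein after group 1 of each pattern match.

    Assumes the pattern has at least one capture group, as in the
    default ([KR])([^P]).
    """
    regex = re.compile(pattern)
    peptides, start, pos = [], 0, 0
    while True:
        match = regex.search(protein, pos)
        if match is None:
            break
        cut = match.end(1)  # the cleavage site lies right after group 1
        if cut > start:
            peptides.append(protein[start:cut])
            start = cut
        # Resume right after the cut so adjacent sites (e.g. "KK") are both found.
        pos = max(cut, match.start() + 1)
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


print(digest("LGAARPLGAGLAKVIGAGIGIGK"))
# ['LGAARPLGAGLAK', 'VIGAGIGIGK']
print(digest("LGAARPLGAGLAKVIGAGIGIGK", r"([KR])([^V])"))
# ['LGAAR', 'PLGAGLAKVIGAGIGIGK']
```

Both calls reproduce the output of the example: with the default pattern the R before P is skipped and the K before V is cleaved, while with ([KR])([^V]) the situation is reversed.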

--help / -h Display help

This flag displays the help.