Assume we have a FASTA file of proteins called transdecoder.pep, with IDs generated by Trinity and TransDecoder.
Truncate the names by stripping everything up to and including the first :: on each header line, as follows.
cat transdecoder.pep | sed -r 's/[^:]*::/>/' > transdecoder_truncated.pep
Note that on a Mac you should use -E instead of -r.
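To see what the substitution does, here is a minimal sketch run on a single made-up record. The ID and sequence are hypothetical, and real TransDecoder headers may carry different fields. The pattern uses only basic regular-expression syntax, so the sketch omits -r/-E, which should behave the same with both GNU and BSD sed.

```shell
# Hypothetical TransDecoder-style record (made up for illustration).
truncated=$(printf '>TRINITY_DN100_c0_g1::TRINITY_DN100_c0_g1_i1::g.1::m.1 len:120\nMKTLLVLAVC\n' |
  sed 's/[^:]*::/>/')   # drop everything up to the first "::", keep the ">"

printf '%s\n' "$truncated"
# >TRINITY_DN100_c0_g1_i1::g.1::m.1 len:120
# MKTLLVLAVC
```

Only the first :: on a line is consumed, so the rest of the header is left alone, and sequence lines pass through unchanged because they contain no ::.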
Then run SignalP as normal.
signalp -f short transdecoder_truncated.pep > signalp_truncated.out
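Finally, restore the names by prepending the Trinity gene ID and :: to each result line. This assumes the prefix that was removed was the gene ID (of the form TRINITY_..._c0_g1).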
cat signalp_truncated.out | awk '/#/{print $0};match($0,/TRINITY_[0-9A-Z]+_c[0-9]+_g[0-9]+/){ printf("%s::%s\n",substr($0,RSTART,RLENGTH),$0)} ' > signalp.out
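As a quick check of the restore step, the same awk program can be run on a small stand-in for the SignalP output. The two input lines below are hypothetical and simplified (real short-format output has more columns); only the name column matters here.

```shell
# Hypothetical, simplified stand-in for SignalP short-format output.
restored=$(printf '# name Cmax pos\nTRINITY_DN100_c0_g1_i1::g.1::m.1 0.108 20\n' |
  awk '/#/{print $0}; match($0,/TRINITY_[0-9A-Z]+_c[0-9]+_g[0-9]+/){ printf("%s::%s\n",substr($0,RSTART,RLENGTH),$0)}')

printf '%s\n' "$restored"
# # name Cmax pos
# TRINITY_DN100_c0_g1::TRINITY_DN100_c0_g1_i1::g.1::m.1 0.108 20
```

Comment lines are passed through untouched, and any line without a TRINITY_..._c<N>_g<N> match that is not a comment is dropped, so it is worth confirming that the line counts of signalp_truncated.out and signalp.out agree.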