This is a small experiment on the alignment of ~50bp INDELs. The query sequences are shown in 0.01.fq
below, where seq_ori
is a 204bp sequence extracted from the human reference genome, seq_del54
contains a 54bp deletion in the middle, seq_del84
contains a 84bp deletion in a 120bp read, and seq_ins40
contains a 40bp insertion in a 140bp read. These four short sequences were mapped to the human reference genome with Bowtie2, BWA-MEM, LAST, Novoalign, SNAP and Stampy with default settings. Non-default scoring functions were also tested for Bowtie2 (--rdg 5,1 --rfg 5,1), BWA-MEM (-A2 -E1) and LAST (-r2 -q4). The output by various mappers/settings can be found in this gist. The following table gives my summary:
Mapper | Setting | -84bp | -54bp | +40bp |
---|---|---|---|---|
BBMAP | default | Yes | Yes | Yes |
Bowtie2 | default | No | No | No |
Bowtie2 | --rdg 5,1 --rfg 5,1 | as insertion | as insertion | Yes |
BWA-MEM | default | as split | Yes | Yes |
BWA-MEM | -A2 -E1 | Yes | Yes | Yes |
LAST | default | as split | as split |