Stampy is a package for mapping short reads from Illumina sequencing machines onto a reference genome. It is recommended for most workflows, including genomic resequencing, RNA-Seq and ChIP-seq. Stampy excels at mapping reads that contain sequence variation relative to the reference, in particular insertions or deletions. For instance, it can map reads from a highly divergent species to a reference genome. Stampy achieves high sensitivity and speed by combining a fast hashing algorithm with a detailed statistical model.
Stampy has the following features:
- Maps single, paired-end and mate pair Illumina reads to a reference genome
- Fast: about 20 Gbase per hour in hybrid mode (using BWA)
- Low memory footprint: 2.7 GB of shared memory for a 3 Gbase genome
- High sensitivity for indels and divergent reads, with divergence of up to 10-15%
- Low mapping bias for reads with SNPs
- Well calibrated mapping quality scores
- Input: Fastq and Fasta; gzipped or plain
- Output: SAM, Maq’s map file
- Optionally calculates per-base alignment posteriors
- Optionally processes part of the input
- Handles reads of up to 4500 bases
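
To make the "well calibrated mapping quality scores" item concrete: in SAM output, mapping quality is Phred-scaled, i.e. -10 log10 of the probability that the reported mapping position is wrong. The Python sketch below is a generic illustration of that scale, not Stampy's own code. The function names and the cap of 60 are assumptions chosen for the example.

```python
import math

def mapq_from_error_prob(p_wrong, cap=60):
    """Convert the probability that a mapping is wrong into a Phred-scaled score.

    The cap is a hypothetical upper bound chosen for this example.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        return cap  # avoid log10(0); report the maximum score instead
    return min(cap, int(round(-10.0 * math.log10(p_wrong))))

def error_prob_from_mapq(mapq):
    """Inverse conversion: the error probability implied by a Phred-scaled score."""
    return 10.0 ** (-mapq / 10.0)

if __name__ == "__main__":
    for p in (0.5, 0.1, 0.01, 0.001):
        print(p, mapq_from_error_prob(p))
```

Under this scale, scores are well calibrated when roughly 1 in 100 reads reported with quality 20 is placed incorrectly, 1 in 1000 at quality 30, and so on.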
Citation: Lunter and Goodson. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res. 2011. 21:936-939.