Bowtie 2

Fast and sensitive read alignment

Latest Release

Bowtie2 2.3.2 05/05/17 

Please cite: Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012, 9:357-359.

Links

Related Tools

Bowtie: Ultrafast short read alignment
Crossbow: Genotyping, cloud computing
Myrna: Cloud, differential gene expression
Tophat: RNA-Seq splice junction mapper
Cufflinks: Isoform assembly, quantitation
Lighter: Fast error correction

Indexes

Consider using Illumina's iGenomes collection. Each iGenomes archive contains pre-built Bowtie 2 and Bowtie indexes.

H. sapiens, UCSC hg18 3.5 GB
 or: part 1 (1.5 GB), part 2 (651 MB), part 3 (1.5 GB)
H. sapiens, UCSC hg19 3.5 GB
 or: part 1 (1.5 GB), part 2 (650 MB), part 3 (1.5 GB)
H. sapiens, NCBI GRCh38 3.5 GB
M. musculus, UCSC mm10 3.2 GB
 or: part 1 (1.3 GB), part 2 (600 MB), part 3 (1.3 GB)
M. musculus, UCSC mm9 3.2 GB
 or: part 1 (1.3 GB), part 2 (593 MB), part 3 (1.3 GB)
R. norvegicus, UCSC rn4 3.1 GB
 or: part 1 (1.3 GB), part 2 (580 MB), part 3 (1.3 GB)

Some unzip programs cannot handle archives >2 GB. If you have problems downloading or unzipping a >2 GB index, try downloading in parts.

News archive

Version 2.3.2 - May 05, 2017

  • Now reports MREVERSE SAM flag for unaligned end when only one end of a pair aligns
  • Fixed issue where first character of some read names was omitted from SAM output when using tabbed input formats
  • Added --sam-no-qname-trunc option, which causes the entire read name, including spaces, to be written to SAM output. This violates the SAM specification, but can be useful in applications that immediately postprocess the SAM.
  • Fixed compilation error caused by pointer comparison issue in aligner_result.cpp
  • Removed termcap and readline dependencies introduced in v2.3.1
  • Fixed compilation issues caused by the gzbuffer function when compiling with zlib v1.2.3.5 and earlier. Users compiling against these libraries will use the zlib default buffer size of 8 KB when decompressing read files.
  • Fixed issue that would cause Bowtie 2 to hang when aligning FASTA inputs with more than one thread
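
The MREVERSE change above concerns one bit of the SAM FLAG field. As a rough illustration, the sketch below decodes FLAG values using the bit names from the SAM specification. The example pair (mate 1 aligned to the reverse strand, mate 2 unaligned) and the helper name decode_flag are hypothetical, and this is not Bowtie 2's own code.

```python
# Bit names follow the SAM specification; this is an illustration, not Bowtie 2 code.
SAM_FLAGS = {
    0x1: "PAIRED",
    0x2: "PROPER_PAIR",
    0x4: "UNMAP",
    0x8: "MUNMAP",
    0x10: "REVERSE",
    0x20: "MREVERSE",
    0x40: "READ1",
    0x80: "READ2",
    0x100: "SECONDARY",
    0x200: "QCFAIL",
    0x400: "DUP",
    0x800: "SUPPLEMENTARY",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# Hypothetical pair: mate 1 aligns to the reverse strand, mate 2 does not align.
mate1_flag = 0x1 | 0x8 | 0x10 | 0x40  # paired, mate unmapped, reverse, first in pair
mate2_flag = 0x1 | 0x4 | 0x20 | 0x80  # paired, unmapped, mate reverse, second in pair

print(mate1_flag, decode_flag(mate1_flag))
print(mate2_flag, decode_flag(mate2_flag))
```

In this sketch the unaligned mate carries MREVERSE (0x20) because its partner aligned to the reverse strand, which is the situation the release note describes.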

Version 2.3.1 - Mar 03, 2017

Please note that as of this release Bowtie 2 now has dependencies on zlib and readline libraries. Make sure that all dependencies are met before attempting to build from source.

  • Added native support for gzipped read files. The wrapper script is no longer responsible for decompression. This simplifies the wrapper and improves speed and thread scalability for gzipped inputs.
  • Fixed a bug that caused 'bowtie2-build' to crash when the first FASTA sequence contains all Ns.
  • Added support for interleaved FASTQ input via the --interleaved option.
  • Empty FASTQ inputs would yield an error in Bowtie 2 2.3.0, whereas previous versions would simply align 0 reads and report the SAM header as usual. This version returns to the pre-2.3.0 behavior, resolving a compatibility issue between TopHat2 and Bowtie 2 2.3.0.
  • Fixed a bug whereby combining '--un-conc*' with '-k' or '-a' would cause 'bowtie2' to print duplicate reads in one or both of the '--un-conc*' output files, causing the ends to be misaligned.
  • The default '--score-min' for '--local' mode is now 'G,20,8'. That was the stated default in the documentation for a while, but the actual default was 'G,0,10' for many versions. Now the default matches the documentation and, we find, yields more accurate alignments than 'G,0,10'.
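
To see what the two --score-min settings mean in practice, the sketch below evaluates the function strings at a given read length. It assumes the manual's convention that 'G,a,b' means f(x) = a + b * ln(x), with L and S denoting linear and square-root forms. The helper name min_score is hypothetical, and any rounding Bowtie 2 applies internally is not modeled.

```python
import math


def min_score(spec, read_length):
    """Evaluate a --score-min style string 'TYPE,constant,coefficient' at a read length.

    Only the L (linear), S (square-root) and G (natural-log) types are handled here.
    """
    kind, const, coeff = spec.split(",")
    const, coeff = float(const), float(coeff)
    if kind == "L":
        return const + coeff * read_length
    if kind == "S":
        return const + coeff * math.sqrt(read_length)
    if kind == "G":
        return const + coeff * math.log(read_length)
    raise ValueError("unsupported function type: " + kind)


# Compare the new and old local-mode defaults for a few read lengths.
for length in (50, 100, 250):
    new = min_score("G,20,8", length)
    old = min_score("G,0,10", length)
    print(length, round(new, 2), round(old, 2))
```

For a 100 bp read, G,20,8 gives a threshold of about 56.8 while G,0,10 gives about 46.1, so the documented default is the stricter of the two at that length.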

Version 2.3.0 - Dec 13, 2016

This is a major release with some larger and many smaller changes. These notes emphasize the large changes. See commit history for details.

  • Code related to read parsing was completely rewritten to improve scalability to many threads. In short, the critical section is simpler and parses input reads in batches rather than one at a time. The improvement applies to all read formats.
  • TBB is now the default threading library. We consistently found TBB to give superior thread scaling. It is widely available and widely installed. That said, we are also preserving a "legacy" version of Bowtie that, like previous releases, does not use TBB. To compile Bowtie source in legacy mode use NO_TBB=1. To use legacy binaries, download the appropriate binary archive with "legacy" in the name.
  • Bowtie now uses a queue-based lock rather than a spin or heavyweight lock. We find this gives superior thread scaling; we saw an order-of-magnitude throughput improvement at 120 threads in one experiment, for example.
  • Unnecessary thread synchronization removed
  • Fixed issue with parsing FASTA records with greater-than symbol in the name
  • Now detects and reports inconsistencies between --score-min and --ma
  • Changed default for --bmaxdivn to yield better memory footprint and running time when building an index with many threads

Bowtie2 developers note

As of Nov 2015 we had to fix the bowtie2 github repo and relabel the entire history. Developers and contributors should re-clone the bowtie2 github repo from this current state.

Version 2.2.9 - Apr 22, 2016

  • Fixed the multiple-threads issue in bowtie2-build.
  • Fixed a TBB related build issue impacting TBB v4.4.

Version 2.2.8 - Mar 10, 2016

  • Various website updates.
  • Fixed the bowtie2-build issue that made TBB compilation fail.
  • Fixed the static build for Win32 platform.

Version 2.2.7 - Feb 10, 2016

  • Added a parallel index build option: bowtie2-build --threads <# threads>.
  • Fixed an issue whereby IUPAC codes (other than A/C/G/T/N) in reads were converted to As. Now all non-A/C/G/T characters in reads become Ns.
  • Fixed some compilation issues, including for the Intel C++ Compiler.
  • Removed debugging code that could impede performance for many alignment threads.
  • Fixed a few typos in documentation.
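
The IUPAC fix above amounts to a simple rule: any read character other than A, C, G or T becomes N. A minimal sketch of that rule follows. The function name normalize_read is hypothetical, and upper-casing the input first is an assumption made for this illustration rather than a documented Bowtie 2 behavior.

```python
import re

# Anything that is not A, C, G or T (after upper-casing) is treated as ambiguous.
_NON_ACGT = re.compile(r"[^ACGT]")


def normalize_read(seq):
    """Replace every non-A/C/G/T character in a read sequence with N."""
    return _NON_ACGT.sub("N", seq.upper())


print(normalize_read("ACGTRYKN"))
```

Under this rule the IUPAC codes R, Y and K are no longer silently read as A, so they cannot contribute spurious matches.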

Version 2.2.6 - Jul 22, 2015

  • Switched to a stable sort to avoid potential confusion about reproducibility.
  • Added 'install' target for *nix platforms.
  • Added an Intel TBB option, which gives better performance in most situations. TBB is not enabled by default in the current build but can be enabled by compiling the source code with the WITH_TBB=1 option.
  • Fixed a bug that caused the seed length to depend on the order of the -L and -N parameters.
  • Fixed a bug that caused --local followed by -N to reset the seed length to 22, which is actually the default value for global mode.
  • Enabled compilation on FreeBSD and with clang, although the gmake port is still required.
  • Fixed an issue that made the bowtie2 compilation process fail on Snow Leopard.
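
The release note does not say which sort was changed, but the general reason a stable sort helps reproducibility can be shown with a small sketch. The hit records and the function rank_hits below are hypothetical. Python's sorted is guaranteed to be stable, so records with equal scores keep their input order, whereas an unstable sort may order ties differently from one implementation or platform to another.

```python
# Hypothetical candidate hits: (reference name, offset, score).
hits = [
    ("chr1", 100, 42),
    ("chr2", 500, 42),
    ("chr1", 900, 40),
    ("chr3", 7, 42),
]


def rank_hits(candidates):
    """Sort by descending score; ties keep input order because sorted() is stable."""
    return sorted(candidates, key=lambda hit: -hit[2])


for hit in rank_hits(hits):
    print(hit)
```

The three hits with score 42 come out in the same relative order they went in, so the result does not depend on how the sort routine happens to break ties.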

Version 2.2.5 - Mar 9, 2015

  • Fixed some situations where a Mavericks platform could be detected incorrectly.
  • Fixed some issues in the manual, including some bad HTML formatting.
  • Made sure the wrapper correctly identifies the platform under OSX.
  • Fixed the --rg/--rg-id options, where embedded spaces were handled incorrectly.
  • Various documentation fixes added by contributors.
  • Fixed incorrect behavior when file names passed as parameters contain spaces.
  • Fixed bugs related to spaces in the path where the bowtie binaries are stored.
  • Improved exception handling for misformatted quality values.
  • Improved redundancy checks by correctly accounting for soft clipping.

Lighter released

  • Lighter is an extremely fast and memory-efficient program for correcting sequencing errors in DNA sequencing data. For details on how error correction can help improve the speed and accuracy of downstream analysis tools, see the paper in Genome Biology. Source and software available at GitHub.

Version 2.2.4 - Oct 22, 2014

  • Fixed a Mavericks OSX specific bug caused by some linkage ambiguities.
  • Added lz4 compression option for the wrapper.
  • Fixed the vanishing --no-unal help line.
  • Added the static linkage for MinGW builds.
  • Added extra seed-hit output.
  • Fixed missing 0-length read in fastq --passthrough output.
  • Fixed an issue that would cause different output in -a mode depending on random seed.

Version 2.2.3 - May 30, 2014

  • Fixed a bug that made loading an index into memory crash sometimes.
  • Fixed a silent failure to warn the user in case the -x option is missing.
  • Updated the --al, --un, --al-conc and --un-conc options to avoid confusion in cases where the user does not provide a base file name.
  • Fixed a spurious assert that made bowtie2-inspect debug fail.

Version 2.2.2 - April 10, 2014

  • Improved performance for cases where the reference contains ambiguous or masked nucleobases represented by Ns.

Version 2.2.1 - February 28, 2014

  • Improved the way in which index files are loaded for alignment. This should fix efficiency problems on some filesystems.
  • Fixed a bug that made older systems unable to correctly deal with bowtie relative symbolic links.
  • Fixed a bug that, for very big indexes, could fail to determine file offsets correctly.
  • Fixed a bug where using --no-unal option incorrectly suppressed --un/--un-conc output.
  • Dropped a perl dependency that could cause problems on old systems.
  • Added --no-1mm-upfront option and clarified documentation for parameters governing the multiseed heuristic.

Bowtie 2 on GitHub - February 4, 2014

Version 2.2.0 - February 17, 2014

  • Improved index querying efficiency using "population count" instructions available since SSE4.2
  • Added support for large and small indexes, removing 4-billion-nucleotide barrier. Bowtie 2 can now be used with reference genomes of any size.
  • Fixed bug that could cause bowtie2-build to crash when reference length is close to 4 billion.
  • Added a CL: string to the @PG SAM header to preserve information about the aligner binary and parameters.
  • No longer releasing 32-bit binaries. Simplified manual and Makefile accordingly.
  • Credits to the Intel® enabling team for performance optimizations included in this release. Thank you!
  • Phased out CygWin support. MinGW can still be used for Windows building.
  • Added the .bat generation for Windows.
  • Fixed some issues with some uncommon chars in fasta files.
  • Fixed wrappers so bowtie can now be used with symlinks.

Version 2.1.0 - February 21, 2013

  • Improved multithreading support so that Bowtie 2 now uses native Windows threads when compiled on Windows and uses a faster mutex. Threading performance should improve on all platforms.
  • Improved support for building 64-bit binaries for Windows x64 platforms.
  • Bowtie 2 uses a lightweight mutex by default.
  • Test option --nospin is no longer available. However, bowtie2 can always be recompiled with EXTRA_FLAGS="-DNO_SPINLOCK" in order to drop the default spinlock usage.

Version 2.0.6 - January 27, 2013

  • Fixed issue whereby spurious output would be written in --no-unal mode.
  • Fixed issue whereby multiple input files combined with --reorder would cause truncated output and a memory spike.
  • Fixed spinlock datatype for Win64 API (LLP64 data model) which made it crash when compiled under Windows 7 x64.
  • Fixed bowtie2 wrapper to handle filename/paths operations in a more platform independent manner.
  • Added pthread as a default library option under cygwin, and pthreadGC for MinGW.
  • Fixed some minor issues that made MinGW compilation fail.

Version 2.0.5 - January 4, 2013

  • Fixed an issue that would cause excessive memory allocation when aligning to very repetitive genomes.
  • Fixed an issue that would cause a pseudo-randomness-related assert to be thrown in debug mode under rare circumstances.
  • When bowtie2-build fails, it will now delete index files created so far so that invalid index files don't linger.
  • Tokenizer no longer has limit of 10,000 tokens, which was a problem for users trying to index a very large number of FASTA files.
  • Updated manual's discussion of the -I and -X options to mention that setting them farther apart makes Bowtie 2 slower.
  • Renamed COPYING to LICENSE and created a README to be GitHub-friendly.

Version 2.0.4 - December 17, 2012

  • Fixed issue whereby --un, --al, --un-conc, and --al-conc options would incorrectly suppress SAM output.
  • Fixed minor command-line parsing issue in wrapper script.
  • Fixed issue on Windows where wrapper script would fail to find bowtie2-align.exe binary.
  • Updated some of the index-building scripts and documentation.
  • Updated author's contact info in usage message

Version 2.0.3 - December 14, 2012

  • Fixed thread-safety issues that could cause crashes with a large number of threads. Thanks to John O’Neill for identifying these issues.
  • Fixed some problems with pseudo-random number generation that could cause unequal distribution of alignments across equally good candidate loci.
  • The --un, --al, --un-conc, and --al-conc options (and their compressed analogs) are all much faster now, making it less likely that they become the bottleneck when Bowtie 2 is run with large -p.
  • Fixed issue with inaccurate mapping qualities, XS:i, and YS:i flags when --no-mixed and --no-discordant are specified at the same time.
  • Fixed some compiler warnings and errors when using clang++ to compile.
  • Fixed race condition in bowtie2 script when named pipes are used.
  • Added more discussion of whitespace in read names to manual.

Sourceforge spam

  • Spam on the sourceforge tracker (i.e. where feature requests and bug reports go) was getting out of control, so I disabled posting by anonymous users. This means you'll have to use some set of credentials when posting on the tracker. Sourceforge allows you to use various credentials, e.g., your Google credentials. Sorry for the inconvenience, but I think this will make the experience better overall.

Version 2.0.2 - October 31, 2012

  • Fixes a couple small issues pointed out to me immediately after 2.0.1 release
  • Mac binaries now built on 10.6 in order to be forward-compatible with more Mac OS versions
  • Small changes to source to make it compile with gcc versions up to 4.7 without warnings

Version 2.0.1 - October 31, 2012

  • First non-beta release.
  • Fixed an issue that would cause Bowtie 2 to use excessive amounts of memory for closely-matching and highly repetitive reads under some circumstances.
  • Fixed a bug in --mm mode that would fail to report when an index file could not be memory-mapped.
  • Added --non-deterministic option, which better matches how some users expect the pseudo-random generator inside Bowtie 2 to work. Normally, if you give the same read (same name, sequence and qualities) and --seed, you get the same answer. --non-deterministic breaks that guarantee. This can be more appropriate for datasets where the input contains many identical reads (same name, same sequence, same qualities).
  • Fixed a bug in bowtie2-build that would yield corrupt index files when memory settings were adjusted in the middle of indexing.
  • Clarified in manual that --un-* options print reads exactly as they appeared in the input, and that they are not necessarily written in the same order as they appeared in the input.
  • Fixed issue whereby wrapper would incorrectly interpret arguments with --al as a prefix (e.g. --all) as --al.
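
The default behavior that --non-deterministic turns off can be pictured as re-seeding a pseudo-random generator per read from the read's name, sequence, qualities and the --seed value. The sketch below illustrates that idea only: the SHA-256 based seed derivation and the names read_rng and pick_locus are hypothetical stand-ins, not Bowtie 2's generator.

```python
import hashlib
import random


def read_rng(name, seq, quals, seed=0):
    """Build a generator whose state depends only on the read and the seed value."""
    key = "\t".join([name, seq, quals, str(seed)]).encode()
    digest = hashlib.sha256(key).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def pick_locus(candidates, name, seq, quals, seed=0):
    """Choose among equally good candidate loci, reproducibly for a given read and seed."""
    return read_rng(name, seq, quals, seed).choice(candidates)


loci = ["chr1:100", "chr5:2000", "chrX:31"]
first = pick_locus(loci, "read1", "ACGTACGT", "IIIIIIII", seed=0)
second = pick_locus(loci, "read1", "ACGTACGT", "IIIIIIII", seed=0)
print(first, second)
```

Because the choice depends only on the read and the seed, many identical reads (same name, sequence and qualities) would all break ties the same way, which is the case the note says --non-deterministic can handle more appropriately.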

Version 2.0.0-beta7 - July 13, 2012

  • Fixed an issue in how Bowtie 2 aligns longer reads in --local mode. Some alignments were incorrectly curtailed on the left-hand side.
  • Fixed issue whereby --un or --un-conc would fail to output unaligned reads when --no-unal was also specified.
  • Fixed issue whereby --un or --un-conc were significantly slowing down Bowtie 2 when -p was set greater than 1.
  • Fixed issue that could cause hangs in -a mode or when -k was set high.
  • Fixed issue whereby the SAM FLAGS field could be set incorrectly for secondary paired-end alignments with -a or -k > 1.
  • When input reads are unpaired, Bowtie 2 no longer removes the trailing /1 or /2 from the read name.
  • -M option is now deprecated. It will be removed in subsequent versions. What used to be called -M mode is still the default mode, and -k and -a are still there as alternatives to the default mode, but adjusting the -M setting is deprecated. Use the -D and -R options to adjust the effort expended to find valid alignments.
  • Gaps are now left-aligned in a manner similar to BWA and other tools.
  • Fixed issue whereby wrapper script would not pass on exitlevel correctly, sometimes spuriously hiding non-0 exitlevel.
  • Added documentation for YT:Z to manual.
  • Fixed documentation describing how Bowtie 2 searches for an index given an index basename.
  • Fixed inconsistent documentation for the default value of the -i parameter

Added rat pre-built index - May 17 2012

Version 2.0.0-beta6 - May 7, 2012

  • Bowtie 2 now handles longer reads in a more memory-economical fashion, which should prevent many out-of-memory issues for longer reads.
  • Error message now produced when -L is set greater than 32.
  • Added a warning message to warn when bowtie2-align binary is being run directly, rather than via the wrapper. Some functionality is provided by the wrapper, so Bowtie 2 should always be run via the bowtie2 executable rather than bowtie2-align.
  • Fixed an occasional crashing bug that was usually caused by setting the seed length relatively short.
  • Fixed an issue whereby the FLAG, RNEXT and PNEXT fields were incorrect for some paired-end alignments. Specifically, this affected paired-end alignments where both mates aligned and one or both mates aligned non-uniquely.
  • Fixed issue whereby compressed input would sometimes be mishandled.
  • Renamed --sam-* options to omit the sam- part for brevity. The old option names will also work.
  • Added --no-unal option to suppress SAM records corresponding to unaligned reads, i.e., records where FLAG field has 0x4 set.
  • Added --rg-id option and enhanced the documentation for both --rg-id and --rg. Users were confused by the need to specify --rg "ID:(something)" in order for the @RG line to be printed; hopefully this is clearer now.
  • Index updates: the pre-built indexes listed under Indexes have been updated to include the unplaced contigs appearing in the UCSC "random" FASTA files. This makes the indexes more complete. Also, an index for the latest mouse assembly, mm10 (AKA "GRCm38"), has been added.
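
The --no-unal item above defines unaligned records as those with 0x4 set in the FLAG field. As a rough post hoc equivalent, the sketch below filters SAM text on that bit. The function name drop_unaligned and the sample records are hypothetical, and the filter works record by record only; it does not try to reproduce how Bowtie 2 itself treats pairs where just one mate aligns.

```python
def drop_unaligned(sam_lines):
    """Yield header lines and records whose FLAG does not have 0x4 (unmapped) set."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & 0x4:
            yield line


# Hypothetical SAM content: one header line, one aligned and one unaligned record.
sample = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
]

for kept in drop_unaligned(sample):
    print(kept)
```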

Version 2.0.0-beta5 - December 15, 2011

  • Added --un, --al, --un-conc, and --al-conc options that write unpaired and/or paired-end reads to files depending on whether they align at least once or fail to align.
  • Added --reorder option. When enabled, the order of the SAM records output by Bowtie 2 will match the order of the input reads even when -p is set greater than 1. This is disabled by default; enabling it makes Bowtie 2 somewhat slower and use somewhat more memory when -p is set greater than 1.
  • Changed the default --score-min in --local mode to G,20,8. This ought to improve sensitivity and accuracy in many cases.
  • Fixed some minor documentation issues.
  • Note: I am aware of an issue whereby longer reads (>10,000 bp) drive the memory footprint way up and often cause an out-of-memory exception. This will be fixed in a future version.
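
One way to picture why --reorder costs extra memory and time with several threads is that finished records must be held back until every earlier read has been written. The sketch below shows that buffering idea with a min-heap. It is an illustration under that assumption, not Bowtie 2's implementation, and the name emit_in_input_order is hypothetical.

```python
import heapq


def emit_in_input_order(results):
    """Yield records in read order, given (read_index, record) pairs in completion order.

    Records that finish early wait in a heap until all earlier reads have been emitted.
    """
    pending = []
    next_index = 0
    for item in results:
        heapq.heappush(pending, item)
        while pending and pending[0][0] == next_index:
            yield heapq.heappop(pending)[1]
            next_index += 1


# Hypothetical completion order from several threads.
finished = [(2, "rec2"), (0, "rec0"), (1, "rec1"), (4, "rec4"), (3, "rec3")]
print(list(emit_in_input_order(finished)))
```

The heap grows whenever a slow read blocks the records behind it, which matches the note's remark that the option uses somewhat more memory when -p is greater than 1.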

Version 2.0.0-beta4 - December 6, 2011

  • Accuracy improvements.
  • Speed improvements in some situations.
  • Fixed a handful of crashing bugs.
  • Fixed some documentation bugs.
  • Fixed bug whereby --version worked incorrectly.
  • Fixed formatting bug with MD:Z optional field that would sometimes fail to follow a mismatch with a number.
  • Added -D option for controlling the maximum number of seed extensions that can fail in a row before we move on. This option or something like it will eventually replace the argument to -M.
  • Added -R option to control maximum number of times re-seeding is attempted for a read with repetitive seeds.
  • Changed default to --no-dovetail. Specifying --dovetail turns it back on.
  • Added second argument for --mp option so that user can set maximum and minimum mismatch penalties at once. Also tweaked the formula for calculating the quality-aware mismatch penalty.
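
For orientation, the sketch below computes a quality-aware mismatch penalty from the two --mp values. It assumes the formula given in later versions of the manual, MN + floor((MX - MN) * min(Q, 40) / 40), with defaults MX = 6 and MN = 2. Whether this matches the exact formula introduced in beta4 is not confirmed by the release note, and the function name mismatch_penalty is hypothetical.

```python
def mismatch_penalty(q, mx=6, mn=2):
    """Penalty for a mismatch at a base with Phred quality q (assumed formula).

    Integer arithmetic is used so the floor is exact for non-negative inputs.
    """
    return mn + ((mx - mn) * min(q, 40)) // 40


# Penalty grows with base quality and is capped at mx for q >= 40.
for q in (0, 10, 20, 30, 40, 60):
    print(q, mismatch_penalty(q))
```

Under this formula a mismatch at a low-quality base costs close to the minimum, while one at a high-quality base costs the maximum.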

Version 2.0.0-beta3 - November 1, 2011

  • Accuracy improvements.
  • Speed improvements in some situations.
  • Fixed a handful of crashing bugs.
  • Fixed a bug whereby number of repetitively aligned reads could be misreported in the summary output.
  • As always, thanks to everyone for their reports and feedback! Please keep it coming.

Version 2.0.0-beta2 - October 16, 2011

  • Added manual, both included in the download package and on the website. The website will always have the manual for the latest version.
  • Added Linux 32-bit and 64-bit binary packages. Mac OS X packages to come. Still working on a Windows package.
  • Fixed a bug that led to crashes when seed-alignment result memory was exhausted.
  • Changed the --end-to-end mode --score-min default to be less permissive. The previous threshold seemed to be having an adverse effect on accuracy, though the fix implemented in this version comes at the expense of some sensitivity.
  • Changed the --end-to-end mode -M default to be lower by 2 notches. This offsets any adverse effect that the previous change would have had on speed, without a large adverse impact on accuracy. As always, setting -M higher will yield still greater accuracy at the expense of speed.

Version 2.0.0-beta1 - September 22, 2011

  • First public release
  • Caveats: as of now, the manual is incomplete, there's no tutorial, and no example genome or example reads. All these will be fixed in upcoming releases.
  • Only a source package is currently available. Platform-specific binaries will be included in future releases.