Bowtie

An ultrafast memory-efficient short read aligner

Related Tools

    Crossbow: Genotyping, cloud computing
    TopHat: RNA-Seq splice junction mapper
    Cufflinks: Isoform assembly, quantitation

Pre-built indexes

H. sapiens, UCSC hg18 2.7 GB
 or: part 1 - 1.7 GB, part 2 - 1.0 GB
H. sapiens, UCSC hg19 2.7 GB
 or: part 1 - 1.7 GB, part 2 - 1.0 GB
H. sapiens, NCBI 36.3 2.7 GB
 or: part 1 - 1.7 GB, part 2 - 1.0 GB
H. sapiens, NCBI 37.1 2.7 GB
 or: part 1 - 1.7 GB, part 2 - 1.0 GB
M. musculus, UCSC mm9 2.4 GB
 or: part 1 - 1.5 GB, part 2 - 905 MB
M. musculus, NCBI 37.1 2.4 GB
 or: part 1 - 1.5 GB, part 2 - 905 MB
B. taurus, UMD 3.0 2.1 GB
 or: part 1 - 1.3 GB, part 2 - 801 MB
D. melanogaster, Flybase, r5.22 153 MB
A. thaliana, TAIR, TAIR9 119 MB
C. elegans, Wormbase, WS200 77 MB
S. cerevisiae, CYGD 15 MB
E. coli, NCBI, st. 536 5.0 MB

All indexes are for assemblies, not contigs. Unplaced or unlocalized sequences and alternate haplotype assemblies are excluded.

Some unzip programs cannot handle archives >2 GB. If you have problems downloading or unzipping a >2 GB index, try downloading in two parts.

Check .zip file integrity with MD5s.

Pre-built indexes are compatible with Bowtie versions 0.9.8 and later. For older indexes, please contact us.

News archive

0.12.0 release - Coming soon

  • ABI SOLiD colorspace support
    • Colorspace indexes are distinct from standard letterspace indexes and must be built with a separate invocation of bowtie-build (with -C option)
    • Running bowtie with -C causes Bowtie to align in colorspace. Both index and reads must be in colorspace.
    • Colorspace memory requirement is the same as paired-end alignment in letterspace (normal) mode. Paired-end alignment does not increase the memory requirement further in colorspace.
    • csfasta, csfastq, and "raw" read formats are all supported with -C. In read sequences, 0 means "blue" and is interchangeable with A; likewise, 1 (C) means "green", 2 (G) means "orange" and 3 (T) means "red".
    • New manual section discussing colorspace features
  • Fixed a few SAM output issues
    • @PG line now properly uses colons instead of equals signs
    • Removed /1, /2 suffixes for paired-end reads in SAM mode
    • Added --sam-RG option that permits the user to insert set values for flags that appear on the @RG line
  • Fixed lingering pthreads bugs that could cause Bowtie to hang or crash toward the end of execution with -p > 1.
  • bowtie -f now supports fasta files with reads split across multiple lines
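
The digit/letter correspondence for colorspace reads described above can be sketched in Python. This is only an illustration, not Bowtie's code: the function names are hypothetical, and the encode_colorspace step reflects the general SOLiD two-base encoding (an assumption brought in from outside these notes) to show why a colorspace index differs from a letterspace one.

```python
# Illustrative sketch only; function names are hypothetical, not Bowtie's API.

# Correspondence stated in the release notes: 0<->A (blue), 1<->C (green),
# 2<->G (orange), 3<->T (red). The letters here stand for colors, not bases.
COLOR_TO_LETTER = {"0": "A", "1": "C", "2": "G", "3": "T"}
LETTER_TO_COLOR = {letter: digit for digit, letter in COLOR_TO_LETTER.items()}


def colors_to_letters(colors: str) -> str:
    """Rewrite a string of color digits using the interchangeable letters."""
    return "".join(COLOR_TO_LETTER[c] for c in colors)


def letters_to_colors(letters: str) -> str:
    """Rewrite color letters (A/C/G/T) as color digits (0-3)."""
    return "".join(LETTER_TO_COLOR[c] for c in letters.upper())


# Assumption (general SOLiD two-base encoding, not taken from these notes):
# each color is the XOR of the 2-bit codes of two adjacent nucleotides.
NUCLEOTIDE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_colorspace(nucleotides: str) -> str:
    """Encode a nucleotide string as the colors between adjacent bases."""
    seq = nucleotides.upper()
    return "".join(
        str(NUCLEOTIDE_CODE[a] ^ NUCLEOTIDE_CODE[b]) for a, b in zip(seq, seq[1:])
    )
```

Note that a nucleotide sequence of length n yields n-1 colors, which is one reason both the index and the reads must be in colorspace when -C is used.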

Crossbow paper out - 11/20/09

New pre-built index list - 11/14/09

  • The roster of pre-built indexes has been changed to address a number of user requests and issues and to streamline maintenance. Now:
    • All indexes are for well known assemblies. Contig indexes have been removed.
    • Indexes for UCSC builds hg18, hg19 and mm9 have been added.
    • Rarely used indexes (chimp, rat, dog, chicken) have been removed.
  • If there is an index not on the list that you use on a regular basis and would like to be made available pre-built, contact us. We may add more if enough users request them.
  • Note: neither UCSC nor NCBI indexes contain unplaced contigs (UCSC: "random" files, NCBI: Chromosome "Un") or alternative haplotype assemblies.

Bowtie on Galaxy - 10/15/09

  • The Galaxy project has integrated Bowtie as one of the tools available for aligning short reads (under "NGS: Mapping" in the "Tools" box). Many thanks to Anton Nekrutenko for his work on that.

0.11.3 release - 10/12/09

  • Fixed crashing bug in -S/--sam mode when the number of reference sequences in the index is very large.
  • Added --sam-nohead option to suppress output of SAM headers in -S/--sam mode.
  • Added --sam-nosq option to suppress output of @SQ SAM headers in -S/--sam mode. These can become a nuisance when the reference index contains a very large number of sequences.
  • Fixed a bug in bowtie-build's auto-configure mode that would cause it to underestimate the amount of memory required by a set of parameters. This in turn would cause the index to be corrupted.

Crossbow released - 10/9/09

0.11.2 release - 10/7/09

  • Fixed issue whereby --max option was disabled.

0.11.1 mini-release - 10/5/09

  • SAM output: changed XS:i optional field to be named XA:i to avoid a conflict with TopHat's XS:i field.

0.11.0 release - 10/5/09

  • Initial SAM output support with -S/--sam option. bowtie sets all fields according to the SAM spec (Version 0.1.2-draft, 20090820). See the new SAM Output section of the manual for details.
  • The Getting Started Guide and TUTORIAL file have been updated to use SAMtools instead of Maq for the SNP calling step.
  • Added --shmem option: --shmem is similar to --mm in that it allows concurrent bowtie processes querying the same index to share a single memory image of the index. Unlike --mm, shared memory allocated by --shmem is permanent.
  • The alignment summary printed to stderr at the end of an alignment run is now more friendly and includes data about the number and proportion of reads that aligned, failed to align, or were suppressed via the -m option.
  • When too-short reads are encountered, bowtie now always prints warnings, not errors. --quiet now suppresses those warnings.
  • By default, when bowtie prints a reference sequence name it now stops at the first whitespace. In 0.10.1, the default was to print the entire name, which could cause confusion when parsing bowtie output. To revert to printing the full name, use the new --fullref option.
  • bowtie now prints the command-line before exiting with an error.
  • Fixed mistake in the manual's "Default output" section: the offset in field 4 is 0-based, not 1-based. To obtain a 1-based offset instead, use the -B 1 option.
  • Various minor bug fixes.
  • Deprecated: -z/--phased, -b/--binout, bowtie-maptool, bowtie-maqconvert. These features will be removed in a future version of Bowtie. Note that -b/--binout, bowtie-maptool, and bowtie-maqconvert are largely superseded by the SAM output format (-S/--sam), BAM, and SAMtools. Contact the authors if this is a problem.
  • Removed: --unfq/--unfa/--maxfq/--maxfa/--alfq/--alfa. Please use --un/--max/--al instead. Contact the authors if this is a problem.
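
Two of the output changes above (reference names stopping at the first whitespace unless --fullref is given, and 0-based offsets unless -B 1 is given) can be sketched as follows. The function names and parameters are hypothetical and only mirror the behavior described in these notes; this is not Bowtie's implementation.

```python
# Illustrative sketch only; names and signatures are hypothetical.

def format_reference_name(name: str, fullref: bool = False) -> str:
    """Return the name up to the first whitespace, or the full name
    when fullref is True (mirroring the described --fullref behavior)."""
    if fullref:
        return name
    parts = name.split(None, 1)
    return parts[0] if parts else name


def report_offset(offset_zero_based: int, offbase: int = 0) -> int:
    """Shift a 0-based offset by the chosen base (mirroring -B <int>)."""
    return offset_zero_based + offbase
```

For example, a FASTA header such as "chr1 Homo sapiens chromosome 1" would be reported as "chr1" by default, and an alignment at 0-based offset 99 would be reported as 100 with an offset base of 1.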

Cufflinks released - 9/28/09

  • Cufflinks, a tool that assembles transcripts and estimates their abundances in RNA-Seq samples, was released today. The author is Cole Trapnell, who is also the author of TopHat.

10,000 downloads - 9/21/09

  • Since the first version was released a little over a year ago, Bowtie has been downloaded more than 10,000 times. Our sincere thanks to users whose bug reports, suggestions and help have made this possible. Thank you!

New human index - 8/26/09

  • Added index files for NCBI 37.1 assembly.

New cow index - 8/26/09

  • Added index files for the 3.0 version of the University of Maryland Bos Taurus (cow) assembly. See sidebar.

0.10.1 release - 7/19/09

  • Now when -3/-5 are used in combination with -I/-X, the -I/-X constraints are interpreted as applying to the original insert, not the trimmed insert.
  • Fixed issue whereby -I option was ignored; -I option works now.
  • Fixed a bug whereby some large indexes were incorrectly reported as corrupt by bowtie-build.
  • Fixed issue whereby negative quality values were wrongly rejected when both --integer-quals and --solexa-quals were specified.
  • The -l/--seedlen parameter can now be adjusted down to 5 (previously had to be >= 20).
  • Fixed several minor memory leaks and out-of-bounds issues. The Linux version of the bowtie aligner now receives a clean bill of health from valgrind's memcheck.
  • Other minor bugfixes.

0.10.0.2 release - 6/28/09

  • Second bugfix for Windows version. src and bin-win32 packages updated. Linux and Mac users are not affected. Thanks for your bug reports and patience.

0.10.0.1 release - 6/23/09

  • Fix for crashing bug in Windows version. src and bin-win32 packages updated. Linux and Mac users are not affected.

Note to TopHat users

  • TopHat is not yet compatible with Bowtie 0.10.0. Stay tuned to the TopHat site for updated information on TopHat/Bowtie 0.10.0 compatibility.

0.10.0 release - 6/15/09

  • Major change: All alignment modes are now unstratified by default. The --nostrata option has been removed, since it is now the default. A --strata option has been added to override the default and force stratified reporting. Reporting is stratified if and only if --strata is specified. --strata now cannot be specified without also specifying --best. Please note that, because of this change, specifying the same arguments to this version of Bowtie may yield different reported results.
  • Replaced the --unfa/--unfq options with a single --un option, which writes unaligned reads to an output file (or pair of output files) but keeps reads in their original form. This is in contrast to the old --unfa/--unfq options, which only supported FASTA or FASTQ formats, and which would print a post-trimming and post-quality-value translation version of the read. Likewise, the --alfa/--alfq and --maxfa/--maxfq options have been replaced with --al and --max options. The old options are still present, but are deprecated and will be removed in a future version.
  • Added --nofw and --norc options, allowing alignment to just one reference strand or the other.
  • Added --mm option that causes bowtie to use memory-mapped files instead of traditional file I/O to access the reference index. This allows multiple bowtie processes running on the same computer to share a single in-memory image of a given index. This is a useful feature for parallelizing bowtie in situations where memory is limited and where -p is inappropriate or insufficient. This feature is not available in the Windows version of Bowtie.
  • Added a section to the manual ("Reporting Modes") clarifying and giving examples of how to use Bowtie's reporting options.
  • The --al and -z/--phased options previously interacted in such a way that the --al file could contain multiple entries for the same aligned read. --al and -z/--phased are now incompatible.
  • The --oldpmap option, deprecated in version 0.9.8, has been removed.

0.9.9.3 release - 5/12/09

  • Fixed an issue where bowtie --best would sometimes use excessive amounts of memory to store path descriptors. There is now a per-thread 32-MB ceiling (configurable with new option --chunkmbs <int>) on the memory taken by path descriptors. If the ceiling is exceeded Bowtie will skip the offending read, print a warning message identifying the read, and continue.
  • More options are available for defining the quality-value format, including new --phred64-quals/--solexa1.3-quals options appropriate for the 64-based-Phred output of Illumina's GA Pipeline 1.3. Added option --phred33-quals to handle the more typical 33-based-Phred scale (the default). The --solexa-quals option still handles the 64-based-Solexa scale output by GA Pipeline versions prior to 1.3.
  • bowtie-build now checks output files for obvious corruption due, for example, to disk exhaustion.
  • Specifying - (meaning stdin) as an input to bowtie is now supported and documented.
  • Fixed a bug whereby bowtie-maqconvert could fail to notice that it had exhausted memory and output a corrupt Maq map file.
  • Fixed a bug whereby bowtie would crash when trying to use an index built on a machine with different endianness.
  • Fixed several issues that prevented Bowtie from compiling on Solaris. I confirm that Bowtie builds and runs on Solaris 10.
  • Added _LARGEFILE_SOURCE _FILE_OFFSET_BITS=64 _GNU_SOURCE to the default build options in an attempt to resolve some of the large-file issues users are having.
  • Clarified column 7 in the manual. We received many queries from users curious about this number.
  • Moderate speed improvements in --best mode.
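
The three quality encodings mentioned above differ in their ASCII offset and, for the older Solexa scale, in the meaning of the score itself. The sketch below decodes each to a Phred score. The function names are hypothetical, and the Solexa-to-Phred formula is the standard conversion rather than something stated in these notes. Rounding to the nearest integer is a choice made here for illustration.

```python
# Illustrative sketch only; function names are hypothetical.
import math


def phred33_to_phred(ch: str) -> int:
    """Decode a 33-based Phred character (the --phred33-quals scale)."""
    return ord(ch) - 33


def phred64_to_phred(ch: str) -> int:
    """Decode a 64-based Phred character (the --phred64-quals scale)."""
    return ord(ch) - 64


def solexa_to_phred(q_solexa: int) -> int:
    """Convert a Solexa-scaled score to the Phred scale.

    Standard relation: Q_phred = 10 * log10(10 ** (Q_solexa / 10) + 1).
    """
    return round(10 * math.log10(10 ** (q_solexa / 10) + 1))


def solexa64_to_phred(ch: str) -> int:
    """Decode a 64-based Solexa character (the --solexa-quals scale)."""
    return solexa_to_phred(ord(ch) - 64)
```

The two scales agree closely at high quality and diverge at low quality; Solexa scores can be negative while Phred scores cannot.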

0.9.9.2 release - 4/6/09

  • Paired-end alignment is now available in all alignment modes, including all -n modes.
  • --best now provides better guarantees. Reported alignments are now guaranteed to be "best" both in terms of stratum (i.e. number of mismatches, or mismatches in the seed in the case of -n mode), and in terms of the quality values at the mismatched position(s). Stratum always trumps quality when determining best alignments. Also, --best mode resolves the strand bias issue that is present in the default mode (see manual for a discussion of the issue).
  • Speed improvements for --best mode in most alignment modes.
  • Major speed improvement for the -v 3 alignment mode (except when -z is also used).
  • The "Reported X alignments..." message is now printed to stderr rather than stdout. Only alignments are written to stdout.
  • In bowtie-maqconvert, read names longer than Maq's limit (36) are now truncated to a suffix of the original name, rather than a prefix. This mimics Maq's behavior and prevents /1 and /2 suffixes for paired-end reads from being destroyed.
  • Added --alfq/--alfa options to dump aligned reads to FASTQ and/or FASTA files.
  • Removed many extraneous source files.
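
The read-name truncation change in bowtie-maqconvert can be illustrated with a short sketch. The 36-character limit comes from the notes above; the function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical.
MAQ_NAME_LIMIT = 36  # Maq's read-name limit, as stated in the release notes


def truncate_to_prefix(name: str, limit: int = MAQ_NAME_LIMIT) -> str:
    """Old behavior: keep the first `limit` characters."""
    return name[:limit]


def truncate_to_suffix(name: str, limit: int = MAQ_NAME_LIMIT) -> str:
    """New behavior: keep the last `limit` characters, so a trailing
    /1 or /2 mate suffix survives."""
    return name if len(name) <= limit else name[-limit:]
```

With a 42-character name ending in "/1", the prefix form drops the mate suffix while the suffix form preserves it.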

TopHat paper out - 3/17/09


0.9.9.1 release - 3/11/09

  • Added paired-end alignment for -v 2 and -v 3 alignment modes (-n modes coming soon).
  • Minor bug fixes and speed improvements for all paired-end modes.
  • Added -s/--skip <int> option to skip over the first <int> reads or pairs in the input.
  • --unfq/--unfa/--maxfq/--maxfa modes no longer create empty output files.
  • All Bowtie tools now compile under GCC 4.3.3.
  • Fixed bug whereby bowtie -b would sometimes write garbage into the reference offset field.
  • Paired-end info is now persisted in the -b format, allowing bowtie-maptool output to add /1 and /2 suffixes as appropriate.

Final paper out - 3/10/09


0.9.9 release - 2/19/09

  • Added some preliminary support for paired-end alignment in -v 0 and -v 1 modes. -1/-2 options to specify the paired-end files, -I/-X to specify min and max insert sizes, and --fr/--rf/--ff specify relative orientation of upstream and downstream mates. bowtie-build now builds two additional files: NAME.3.ebwt and NAME.4.ebwt. Together, these files store a bitpacked version of the reference and they are required for paired-end alignment. If your index does not include these files and you would like to perform a paired-end alignment, you will have to rebuild the index with bowtie-build version 0.9.9 or later. Paired-end alignment is not compatible with -z mode, and it incurs about a 30% greater memory overhead than single-end mode.
  • Pre-built indexes available from Bowtie website have been updated to include .3/.4.ebwt index files. These new pre-built indexes are no longer compatible with bowtie versions prior to 0.9.8.
  • New -B/--offbase option allows user to specify how bowtie numbers reference positions in its output. E.g. -B 1 causes bowtie to number leftmost char as 1. -B 0 is the default, but -B 1 will likely become the default in the 1.0 release.
  • Fixed a bug that caused trimming options -3 and -5 not to work properly in -r (raw input) mode.
  • bowtie-build now prints a friendly error message and exits if an input file doesn't exist.
  • Fixed a bug that caused the Win32 version of bowtie to hang just before it would normally have exited.
  • Fixed bug that could prevent successful read-in of very large (>1GB) .2.ebwt index files.
  • Removed --maxns option since it's mostly redundant with what -v and -n already do.
  • Removed --ntoa option.
  • bowtie usage message is now divided into sections for clarity.
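
As a rough illustration of what a "bitpacked version of the reference" means, the sketch below stores four nucleotides per byte using two bits each. The bit order and function names are assumptions made for this example; the actual layout of the .3.ebwt and .4.ebwt files is not described in these notes and may differ.

```python
# Illustrative sketch only; NOT the actual .3.ebwt/.4.ebwt file layout.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, 2 bits per base,
    lowest-order bits first (an arbitrary choice for this sketch)."""
    out = bytearray((len(seq) + 3) // 4)
    for i, ch in enumerate(seq):
        out[i // 4] |= CODE[ch] << (2 * (i % 4))
    return bytes(out)


def unpack(packed: bytes, length: int) -> str:
    """Recover the first `length` bases from a packed buffer."""
    return "".join(
        BASES[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(length)
    )
```

Packing reduces the reference to a quarter of its one-byte-per-base size, which is consistent with the modest memory overhead reported for paired-end mode.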

Added AGBT poster - 2/5/09

  • ...to Other Documentation section in the right-hand sidebar

Paper coming soon - 1/27/09

  • Bowtie paper Ultrafast and memory-efficient alignment of short DNA sequences to the human genome is accepted at Genome Biology. A link to the paper will be posted when available.
  • Added Related Tools, Publications, Other Documentation sections to right-hand sidebar.

Paper out - 3/4/09

  • The provisional PDF of our paper Ultrafast and memory-efficient alignment of short DNA sequences to the human genome is available at Genome Biology.

Current goals - 3/4/09

  • Current goals in rough order of priority are:
    • Expanding preliminary paired-end support (new in 0.9.9) into full paired-end support.
    • Better alignment guarantees.
    • Gapped alignment.
  • Also of interest: colorspace support and mapping qualities.

0.9.8.1 release - 1/7/09

  • Fixed all known problems with the --unfa/--unfq options:
    • They now work properly with multiple threads.
    • Fixed issue where sequence and quals were sometimes reversed.
    • Fixed other issues causing spurious omission of unaligned reads.
  • Added --maxfa/--maxfq options so that reads that don't align due to the -m limit can be dumped separately from reads that don't align at all.
  • Alignment output is now guaranteed to be "deterministic" even when multiple threads are used. I.e., given the same input reads (in any order) and the same --seed, bowtie will produce the same alignments every time it is run, though not necessarily in the same order. This does not hold across different versions of Bowtie.
  • Multiple other bug fixes.

0.9.7 release - 11/8/08

  • Added new reporting option -m <int> which suppresses all alignments for a particular read if more than <int> reportable alignments exist for it.
  • Threads now buffer all alignments for a particular read/phase then output all alignments in one critical section. This guarantees that all alignments for a given read/phase appear in one consecutive block of the output, even when multiple threads are operating in parallel.
  • Separated the quality-conversion and parsing aspects of the old --solexa-quals argument into separate arguments: --solexa-quals (quality conversion) and --integer-quals (parsing).
  • bowtie-convert now handles the new (post-0.7.0) Maq alignment format. The new format allows Maq tools to handle reads up to 127 bases, whereas the old format was limited to 63 bases. Added a -o option to opt for the old Maq format.
  • New --refout argument sends alignments to a set of files named refXXXXX.map, where XXXXX is the 0-padded index of the reference sequence aligned to. Useful for dealing with large datasets aligned to, e.g., the assembled human genome.
  • Improved tutorial to use a simple simulated read set (included) to do SNP calls with Maq.
  • Added --nota option to bowtie-build
  • Fixed make_h_sapiens_asm.sh script to include mitochondrial DNA.
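
The effect of the -m <int> reporting option can be sketched as a filter over per-read alignment lists. The data structure and function name are hypothetical; only the rule itself (suppress every alignment for a read that has more than <int> reportable alignments) comes from the notes above.

```python
# Illustrative sketch only; the data layout and names are hypothetical.
from typing import Dict, List, Tuple


def apply_m_limit(
    alignments_by_read: Dict[str, List[str]], m: int
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split reads into those reported and those suppressed by the limit.

    A read is suppressed entirely when it has more than `m`
    reportable alignments.
    """
    reported: Dict[str, List[str]] = {}
    suppressed: List[str] = []
    for read, alignments in alignments_by_read.items():
        if len(alignments) > m:
            suppressed.append(read)
        else:
            reported[read] = alignments
    return reported, suppressed
```

With m set to 1, this keeps only reads that align to a single location, which is a common way to restrict output to unique alignments.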

0.9.8 release - 11/25/08

  • --unfa/--unfq <filename> options cause bowtie to dump unaligned reads to FASTA and/or FASTQ files.
  • bowtie-build now selects its memory-efficiency parameters (--bmax, --dcv, --packed) automatically by default; this makes it far easier to build an index under memory constraints by eliminating tedious trial-and-error. New -a option disables this, yielding old behavior.
  • bowtie-build-packed is no longer a separate binary. Supplying the new -p/--packed argument to bowtie-build is the new equivalent.
  • New tool bowtie-maptool converts between Bowtie's output formats.
  • New tool bowtie-inspect recreates reference strings from Bowtie index.
  • Renamed bowtie-convert to bowtie-maqconvert for clarity.
  • New universal Mac binary combines i386 & x86_64 binaries. PowerPC still not supported.
  • Added --nomaqround option to bowtie.
  • Fixed memory leaks in bowtie.
  • Switched to a new scheme for mapping positions in "joined" reference string to positions in original strings. This changes the index format. bowtie-build's --oldpmap parameter reverts to the old format. Versions of bowtie prior to 0.9.8 cannot search indexes produced by bowtie-build 0.9.8 unless bowtie-build is run with --oldpmap. bowtie 0.9.8 can search either index format. Pre-built indexes are still in the old format, but will switch to new format when Bowtie 1.0 is released.
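
The general problem the last item refers to, translating a position in a concatenated ("joined") reference back to a sequence index and an offset within that sequence, can be sketched with a binary search over start offsets. This is a generic approach chosen for illustration; it is not a description of either of Bowtie's mapping schemes, and the function names are hypothetical.

```python
# Illustrative sketch only; a generic technique, not Bowtie's index format.
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def build_starts(lengths: List[int]) -> List[int]:
    """Start offset of each original sequence within the joined string."""
    return [0] + list(accumulate(lengths))[:-1]


def joined_to_original(pos: int, starts: List[int]) -> Tuple[int, int]:
    """Map a joined-string position to (sequence index, offset in sequence)."""
    idx = bisect_right(starts, pos) - 1
    return idx, pos - starts[idx]
```

For three sequences of lengths 100, 50 and 200, the start offsets are 0, 100 and 150, so joined position 149 maps to offset 49 of the second sequence.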

0.9.7.1 release - 11/11/08

  • Fixed an issue that caused a spurious loss of sensitivity between Bowtie versions 0.9.6 and 0.9.7 in certain modes. Many thanks to Ali Mortazavi for bringing this to our attention.

TopHat released - 11/8/08

  • Cole Trapnell has completed the initial release of TopHat, a fast splice junction mapper for RNA-Seq reads. TopHat aligns RNA-Seq reads to mammalian-sized genomes using Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

Illumina link - 10/22/08


Coming soon...

  • A new addition to the Bowtie family: TopHat. TopHat is a complete Bowtie-based toolchain for RNA-seq applications. Check back soon for the official release.

0.9.6 release - 10/10/08

  • bowtie now supports a host of options that allow the user to specify which and how many valid alignments to report per read. The default is still to report 1 "good" alignment, which is by far the fastest mode. See -k/-a/--best/--nostrata options described in the manual for details.
  • bowtie now supports reads up to 1024 bases long. Note that for reads much longer than, say, 35 bases, the user must be careful to set alignment policy parameters (especially -e) appropriately.
  • --fast flag eliminated, double-index mode is now the default. Added the -z/--phased flag to revert to phased, half-index mode.
  • --concise output mode now officially supported. Now outputs one alignment per line.
  • Changed bowtie-build default back to --bmaxdivn 4.
  • -h/--help now prints much more verbose help for bowtie and bowtie-build (verbatim from MANUAL file).
  • BWT-searching code streamlined; much old code eliminated

0.9.5 release - 9/27/08

  • Last column of output now additionally reports the reference and query bases (in that order) for mismatches. E.g., old: 30,32 new: 30:C>A,32:C>T.
  • Eliminated spurious trailing space in first column of output.
  • Minor performance and sensitivity improvements.
  • New option -p spawns a user-specified number of pthreads for parallel processing of reads. For example, use -p 4 to run bowtie on 4 processor cores simultaneously.
  • Due to the new -p option, bowtie needs pthreads to compile and run. To compile bowtie without pthreads support (which disables the -p option), use make BOWTIE_PTHREADS=0.
  • Also due to '-p' option, the Windows version of Bowtie now comes with the pthreadGC2.dll file from the pthreads for Win32 project. This library is released under the LGPL license.
  • New option --fast causes Bowtie to load both the "forward" and "mirror" halves of the index at once, which eliminates the need for multiple phases and speeds up matching at the cost of using about twice as much memory. --fast also causes bowtie to scale better when used in combination with -p.
  • Fixed crashing bug with -o/--offrate in bowtie.
  • Improved error reporting.
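
The new mismatch descriptor format in the last output column (for example 30:C>A,32:C>T, where the reference base comes before the query base) is straightforward to parse. The sketch below is one way to do it; the function name is hypothetical, and treating an empty field as "no mismatches" is an assumption.

```python
# Illustrative sketch only; the function name is hypothetical.
from typing import List, Tuple


def parse_mismatches(field: str) -> List[Tuple[int, str, str]]:
    """Parse a descriptor such as '30:C>A,32:C>T' into
    (offset, reference base, query base) tuples."""
    mismatches: List[Tuple[int, str, str]] = []
    if not field:
        # Assumption: an empty field means the alignment has no mismatches.
        return mismatches
    for item in field.split(","):
        offset, bases = item.split(":")
        ref_base, query_base = bases.split(">")
        mismatches.append((int(offset), ref_base, query_base))
    return mismatches
```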

0.9.4 release - 9/16/08

  • New method for handling gaps and ambiguity codes in the reference. New bowtie-build handles long stretches of gaps gracefully. New bowtie rejects alignments that overlap a gap or ambiguous character in the reference.
  • Due to above change, index file format has been changed. All pre-built indexes available on this site have been updated to the new format. To obtain indexes with the old format, contact us.
  • In bowtie unnamed reads are now given ordinal names (rather than "default") in the alignment output. Works for all input modes.
  • New bowtie input mode: Raw, activated with -r. Expects one read sequence per line; no quality values or names.
  • Fixed bowtie bug whereby trimming did not work in -c mode.
  • Changed bowtie-build default to not use blockwise mode.
  • Fixed bowtie-build to avoid certain infinite-loop and very-long-runtime scenarios.
  • Packaging improvements: archives now explode into subdirectories and scripts are executable.
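
The raw input mode (-r) and the ordinal naming of unnamed reads can be sketched together. The notes do not say which ordinal the first read receives or how blank lines are treated, so the starting value is a parameter here and blank lines are skipped; both are assumptions, as is the function name.

```python
# Illustrative sketch only; names and edge-case handling are assumptions.
from typing import Iterable, List, Tuple


def read_raw(lines: Iterable[str], first_ordinal: int = 0) -> List[Tuple[str, str]]:
    """Read one sequence per line and name each read by its ordinal.

    `first_ordinal` is a placeholder: the release notes do not state
    whether numbering starts at 0 or 1.
    """
    reads: List[Tuple[str, str]] = []
    for line in lines:
        seq = line.strip()
        if not seq:
            continue  # assumption: blank lines are ignored
        reads.append((str(first_ordinal + len(reads)), seq))
    return reads
```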

0.9.3 release - 9/6/08

  • Major reference-name bug fixes to bowtie-convert

0.9.2 release - 9/4/08

  • Now allows 3-mismatches: -n and -v options accept 3
  • Output format prints reference name instead of id in third column
  • Pre-built indexes updated to encode reference names
  • Ns in reads now match nothing (previously, they matched As/Ts)
  • Dropped -l/--linerate and -i/--linesperside arguments to bowtie-build
  • Fixed bug in Maq-like mode that allowed some poor alignments
  • Minor speed improvements
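
The change to how Ns in reads are treated, combined with a -v style mismatch limit, can be sketched as a simple position-by-position comparison. This is a model of the rule only (an N in the read counts as a mismatch against any reference base); Bowtie's index-based search works differently, and the function names are hypothetical.

```python
# Illustrative sketch only; models the rule, not Bowtie's search algorithm.

def count_mismatches(read: str, ref: str) -> int:
    """Count mismatches between a read and an equal-length reference
    window, with N in the read matching nothing."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must be the same length")
    return sum(
        1 for r, g in zip(read.upper(), ref.upper()) if r == "N" or r != g
    )


def valid_under_v(read: str, ref: str, v: int) -> bool:
    """True if the alignment has at most `v` mismatches (-v style policy)."""
    return count_mismatches(read, ref) <= v
```

Because each N consumes one mismatch, a read with more Ns than the allowed number of mismatches cannot align at all under this model.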

0.9.1 release - 8/25/08

  • Integrated relevant SeqAn-1.1 sources into Bowtie source release
  • Now builds on Windows under MinGW (needs pthreads and zlib)
  • Binary releases for Linux (i386, x86_64), Windows (i386) and MacOS X (i386)

0.9.0 release - 8/18/08

The first public release of the Bowtie source is now available for download. The release includes the three core Bowtie tools: the indexer bowtie-build, the read aligner bowtie and the converter from Bowtie's to Maq's mapping output format, bowtie-convert. This is a stable release, and compatible pre-built indexes for many model organisms are also available. See "Source releases" and "Pre-built indexes" on the sidebar. Please report any issues using the Sourceforge bug tracker.

Features of this release include:

  • FASTA, FASTQ inputs supported; tested with Solexa FASTQ
  • Supports Maq alignment policy (-n and -e behave as in Maq)
  • Supports X-mismatch policy (-v option behaves as in SOAP)
  • -n and -v options accept 0, 1, or 2
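
The Maq-like policy can be sketched as a validity check. The interpretation used here is an assumption not spelled out in these notes: -n limits mismatches within the seed (the first seedlen bases of the read), and -e limits the sum of quality values at all mismatched positions. Quality rounding is not modeled, and the function name and signature are hypothetical.

```python
# Illustrative sketch only; a simplified model of a Maq-like policy.
from typing import Sequence


def maq_like_valid(
    read: str,
    ref: str,
    quals: Sequence[int],
    n: int,
    e: int,
    seedlen: int,
) -> bool:
    """Check an ungapped alignment against seed-mismatch and
    quality-sum limits.

    n: maximum mismatches allowed within the first `seedlen` bases.
    e: maximum sum of Phred qualities over all mismatched positions.
    """
    if not (len(read) == len(ref) == len(quals)):
        raise ValueError("read, ref and quals must have equal length")
    seed_mismatches = 0
    quality_sum = 0
    for i, (r, g) in enumerate(zip(read, ref)):
        if r != g:
            if i < seedlen:
                seed_mismatches += 1
            quality_sum += quals[i]
    return seed_mismatches <= n and quality_sum <= e
```

Unlike the -v policy, this check tolerates additional mismatches outside the seed as long as they fall on low-quality bases.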