Site Map
Latest Release
- Please cite: Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.
- For release updates, subscribe to the mailing list.
Bowtie 1.3.1 | 09/13/21 |
|
|
|
Links
Related Tools
Pre-built indexes
All indexes are for assemblies, not contigs. Unplaced or unlocalized sequences and alternate haplotype assemblies are excluded.
Some unzip programs cannot handle archives >2 GB. If you have problems downloading or unzipping a >2 GB index, try downloading in two parts.
Check .zip file integrity with MD5s.
Publications
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25.
- Langmead B, Schatz M, Lin J, Pop M, Salzberg SL. Searching for SNPs with cloud computing. Genome Biology 10:R134.
- Trapnell C, Pachter L, Salzberg SL, TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 2009 25(9):1105-1111.
Contributors
- Ben Langmead
- Cole Trapnell
- Daehwan Kim
- Rone Charles
- Chris Wilks
- Valentin Antonescu
Related links
News archive
1.3.1 - 09/13/2021
- Fixed an overflow issue in
bowtie-build
that would sometimes yield corrupt "large" (64-bit) indexes; the resulting index would sometimes causebowtie
to hang. Note:bowtie2-build
does not have this issue. - Fixed an issue in
bowtie
causing XM:i SAM optional field to sometimes be off by 1 when using the-m/-M
flags. - Fixed an issue that would sometimes cause deadlocks in
bowtie
when running multithreaded. - Fixed an issue causing build errors when compiling against a pre-C++11 standard.
1.3.0 - 07/22/2020
- Fixed an issue causing
bowtie
to report incorrect results when using a Bowtie 2 index. - New, more efficient implementation of --reorder for keeping SAM output lines in same order as input reads
- Added -x parameter for specifying index.
bowtie
still supports specifying an index via positional parameter, but this behavior will be deprecated. - Migrated
python
scripts topython3
. - Fully removed colorspace functionality.
- Added support for compiling on ARM architectures.
- Fixed an issue preventing
bowtie
from outputting newlines in --max and --un output files. - Fixed an issue causing alignment results to vary based on read names.
- Fixed an issue preventing --no-unal from suppressing unmapped reads.
- Removed dependence on some third-party libraries, simplifying the code and improving portability.
- Fix an issue preventing
bowtie
from running with many threads on big-endian machines.
1.2.3 - 07/05/2019
- Added support for reading and inspecting Bowtie 2 indexes. Bowtie 2 indexes can now be used with either Bowtie or Bowtie 2.
- Added support for building an index from a gzipped-compressed FASTA.
- Fixed issue preventing bowtie from reporting repeated alignments when -M is specified.
- Fixed issue with -F mode omitting final base of each read.
- Fixed clipping of first letter of first read in batches after first.
- Fixed an issue preventing bowtie wrapper script from finding indexes.
1000-Genomes major-allele SNP references -- 4/26/2019
- For each base where the typical reference has the non-majority allele (according to the 1000 Genomes Project, we substituted in the majority allele instead
- Links for indexes added to sidebar, as are links for the edited FASTA files
- We made versions both for GRCh38 primary assembly and hg19 assembly
- See how we created them
- Only SNPs (single-base substitutions) are considered for now; indels are future work
- Because only SNPs are considered, coordinates (e.g. gene annotations) are the same as for typical GRCh38 and hg19 assemblies. Most downstream tools are unaffected as long as major-allele-edited FASTAs are used wherever genome sequences are required.
1.2.2 - 12/11/2017
Update (12/12/2017): We have had to re-release this version of bowtie to address an issue when compiling with pthreads (make NO_TBB=1).
- Fixed major issue causing corrupt SAM output when using many threads (-p/--threads) on certain systems
- Fixed major issue with incorrect alignment offsets being reported in --large-index mode
- Fixed major issue with reads files being skipped when multiple inputs were specified together with -p/--threads
- The official LICENSE of Bowtie was changed to Artistic License 2.0. This fixes an issue with the previous LICENSE, which mistakenly combined elements of different open-source licenses.
- Fixed issue where bowtie would still run for a long time even when -u was set to a small number.
- Fixed spurious "Reads file contained a pattern with more than 1024 quality values" error for some colorspace inputs.
- Fixed issue with --strata sometimes failing to suppress alignments at lower strata.
- Fixed issue with ends of paired-end reads sometimes appearing in non-adjacent lines of the SAM output with -p/--threads >1
- Fixed issue whereby the read name of end #2 was not always truncated at the first whitespace character
- Code simplifications
1.2.1.1 - 06/13/2017
- Fixed an issue causing Bowtie to segfault when processing reads from stdin
1.2.1 - 06/12/2017
Please note that Bowtie will be switching to the Artistic 2.0 license in the next release.
Pre-build binaries now include statically linked TBB and zlib libraries no longer requiring that
these libraries be pre-installed.
- Fixed an issue which caused Bowtie to hang during parallell index building when running an optimized binary
- Deprecated --refout option. It will be fully removed in the next release
- Added parallel index building with the bowtie2-build --threads option (credit to Aidan Reilly)
- Added native support for gzipped read files. The wrapper script is no longer responsible for this, which simplifies the wrapper and improves speed and thread scaling.
- Added support for interleaved paired-end FASTQ inputs (--interleaved)
- Fixed issue where first character of some read names was omitted from SAM output when using tabbed input formats
- Fixed issue that caused Bowtie to hang when aligning FASTA inputs with more than one thread
- Bowtie wrapper now works even when invoked via a symlink in a different directory from the executables
- Fixed issue preventing reading --12 input on stdin
- Added --no-unal option for suppressing unmapped reads in SAM output
1.2.0 - 12/12/2016
This is a major release with some larger and many smaller changes. These notes emphasize the large changes. See commit history for details.
- Code related to read parsing was completely rewritten to improve scalability to many threads. In short, the critical section is simpler and parses input reads in batches rather than one at a time. The improvement applies to all read formats.
- --reads-per-batch command line parameter added to specify the number of reads to read from the input file at once
- TBB is now the default threading library. We consistently found TBB to give superior thread scaling. It is widely available and widely installed. That said, we are also preserving a "legacy" version of Bowtie that, like previous releases, does not use TBB. To compile Bowtie source in legacy mode use NO_TBB=1. To use legacy binaries, download the appropriate binary archive with "legacy" in the name.
- Bowtie now uses a queue-based lock rather than a spin or heavyweight lock. We find this gives superior thread scaling; we saw an order-of-magnitude throughput improvements at 120 threads in one experiment, for example.
- Unnecessary thread synchronization removed
- Fixed colorspace parsing when primer base is present
- Fixed bugs related to --skip command line option
1.1.2 - 6/23/2015
- Fixed the building process for Mac OS X Yosemite.
- Added install target (make install) for Linux to better aid package building process and the overall installation process.
- Added support for Intel TBB threading, providing better thread scaling in most situations. The default build still uses TinyThread but TBB is used with make WITH_TBB=1.
- Fixed minor issue related with managing the number of threads spawned.
- Fixed minor issue which may have caused a memory leak after an exception was thrown.
- Fixed bug that caused bowtie to crash if a read was trimmed more than the read's length on 5' end.
- Added minor corrections/addition to the manual.
- Fixed bug that caused the wrapper to incorrectly identify the bowtie binary.
Lighter released
- Lighter is an extremely fast and memory-efficient program for correcting sequencing errors in DNA sequencing data. For details on how error correction can help improve the speed and accuracy of downstream analysis tools, see the paper in Genome Biology. Source and software available at GitHub.
1.1.1 - 10/1/2014
- Fixed a compiling linkage problem related with Mac OS X Mavericks.
- Improved performance for cases where the reference contains many stretches of Ns.
- Some minor automatic tests updates.
1.1.0 - 7/19/2014
- Added support for large and small indexes, removing 4-billion-nucleotide barrier. Bowtie can now be used with reference genomes of any size.
- No longer releasing 32-bit binaries. Simplified manual and Makefile accordingly.
- Phased out CygWin support.
- Improved efficiency of index files loading.
- Fixed a bug that made bowtie-inspect fail in some situations.
- (This release was briefly given version number 2.0.0, but we changed it to 1.1.0 to avoid confusion with Bowtie 2.)
1.0.1 release - 3/14/2014
- Improved index querying efficiency using "population count" instructions available since SSE4.2.
- Credits to the Intel(r) enabling team for performance optimizations included in this release. Thank you!
Bowtie on GitHub - 4/11/13
- Bowtie source now lives in a public GitHub repository.
1.0.0 release - 4/9/13
- Finally, a 64-bit Windows binary!
- Due to general performance improvements spinlocking is now used by default. The EXTRA_FLAGS=-DNO_SPINLOCK may be used to reverse this during compilation.
- Fixed some race conditions.
- CygWin builds now use pthreads library.
- Changed MinGW to optionally use pthread library on Win32 platforms.
- Changed the Windows build to use native Windows threads by default.
- Renamed COPYING to LICENSE to be more GitHub-friendly.
- Tokenizer no longer has limit of 10,000 tokens, which was a problem for users trying to index a very large number of FASTA files.
- Removed references to no-longer-implemented --cutoff option for bowtie-build
0.12.9 release - 12/16/12
- Fixed a bug whereby read names would not be truncated at first whitespace character in unmapped or maxed-out SAM records.
- Fixed errors and warnings when compiling with clang++.
- Fixed most errors and warnings when compiling with recent versions of g++, though you may need to add EXTRA_FLAGS=-Wno-enum-compare to avoid all warnings.
0.12.8 release - 5/6/12
- Fixed a bug that would sometimes cause an immediate segmentation fault in --sam mode.
- Fixed make_galGal3.sh script to not omit chromosome 25.
- Removed -B option from usage message for bowtie-build; that option is not implemented.
- Fixed issue that could cause Bowtie not to compile when BOWTIE_PTHREADS is left undefined and pthreads.h is not present.
- Elaborated documentation for -B/--offbase option to indicate that it only affects offsets in the default output mode, not in SAM mode.
Bowtie 2 beta released - 10/16/2011
- Bowtie 2 2.0.0-beta2 is available now.
- Differences between Bowtie 2 and Bowtie 1 include:
-
- For reads longer than about 50 bp Bowtie 2 is generally faster, more sensitive, and uses less memory than Bowtie 1. For relatively short reads (e.g. less than 50 bp) Bowtie 1 is sometimes faster and/or more sensitive.
- Bowtie 2 supports gapped alignment with affine gap penalties. Number of gaps and gap lengths are not restricted, except by way of the configurable scoring scheme. Bowtie 1 finds just ungapped alignments.
- Bowtie 2 supports local alignment, which doesn't require reads to align end-to-end. Local alignments might be "trimmed" ("soft clipped") at one or both extremes in a way that optimizes alignment score. Bowtie 2 also supports end-to-end alignment which, like Bowtie 1, requires that the read align entirely.
- There is no upper limit on read length in Bowtie 2. Bowtie 1 has an upper limit of around 1000 bp.
- Bowtie 2 allows alignments to overlap ambiguous characters (e.g. Ns) in the reference. Bowtie 1 does not.
- Bowtie 2 does away with Bowtie 1's notion of alignment "stratum", and its distinction between "Maq-like" and "end-to-end" modes. In Bowtie 2 all alignments lie along a continuous spectrum of alignment scores where the scoring scheme, similar to Needleman-Wunsch and Smith-Waterman.
- Bowtie 2's paired-end alignment is more flexible. E.g. for pairs that do not align in a paired fashion, Bowtie 2 attempts to find unpaired alignments for each mate.
- Bowtie 2 does not align colorspace reads.
Fixed chicken index - 7/19/2011
- The pre-built chicken genome available from this website was missing chr25. chr25_random was included, but chr25 was erroneously excluded. This is fixed as of today - the index files linked to from the sidebar now contains all chicken chromosomes.
0.12.7 release - 9/7/10
- Fixes the all-gap reference sequence issue that was present in Bowtie 0.12.6. Index files produced by bowtie-build 0.12.5 and earlier (back to 0.10.*) are compatible with bowtie 0.12.7. Index files produced by bowtie-build 0.12.7 are backward compatible with bowtie 0.12.5 and earlier as long as the first reference sequence is not all-gaps (or, when colorspace indexes, as long as the first reference sequence has two consecutive ACGT characters in it somewhere).
- Indexes where the first sequence consists of all gaps or other non-ACGT characters are not handled properly by Bowtie versions 0.12.5 and older, but are handled properly by Bowtie 0.12.7.
- REMOVED: bowtie-build's --old-reverse option; the old reverse- index scheme is again the default. The new scheme is disabled pending further refinement.
0.12.6 release - 8/29/10
- Modified bowtie-inspect's default mode to use the bit-encoded reference portion of the index to reconstruct the reference sequence, rather than the ebwt portion. This makes bowtie-inspect much faster and uses less memory, and the output for a colorspace index will now be in nucleotide space. To get the behavior of the old default, use the new -e/--ebwt-ref option.
- Fixed bug whereby SOLiD QV strings without a trailing space would fail to parse.
- Moved to a new default way of building the reverse index. Revert to the old behavior with bowtie-build's new --old-reverse option. The new reverse index format is forward and backward compatible with `bowtie`, unless otherwise noted in a future version.
- Fixed issue that would sometimes cause bowtie-build to crash when building a large index with a low --offrate.
- Fixed build issue that would cause bowtie-build built on one version of Linux to die with a "floating point error" on other versions. bowtie now simply skips reads with 0 characters. Previously it would print an error and exit.
- Fixed a bug whereby alignment cost could sometimes be miscalculated. Stratum was unaffected.
- bowtie now simply skips reads with 0 characters. Previously it would print an error and exit.
Myrna paper out - 8/11/10
- The provisional version of the Myrna paper appeared today in Genome Biology. See the Myrna site for details about the software.
Myrna released - 7/22/10
- Myrna is cloud-scale tool for calculating differential gene expression in large RNA-seq datasets. It is very similar to Crossbow in design and philosophy; e.g. it uses Bowtie, it's "tri-mode" (runs in the cloud or on a Hadoop cluster or a single computer), it has a web-form-based GUI for composing and submitting cloud jobs, and its cloud mode uses Elastic MapReduce.
Major update to Crossbow - 7/22/10
- Crossbow has received a major update to
version 1.0.4. New features:
- New "tri-mode" design; Crossbow jobs can be run in the cloud (on Amazon AWS), on a Hadoop cluster, or on a single computer, exploiting multiple processor cores
- Added a web-form-based GUI for composing and submitting cloud jobs.
- Cloud version of Crossbow now runs on Elastic MapReduce (EMR) rather than directly on Elastic Compute Cloud (EC2).
Cufflinks paper out - 5/3/10
- The Cufflinks paper, by Cole Trapnell et al is out in Nature Biotechnology. See the supplement for a detailed discussion of the methods behind Cufflinks.
0.12.5 release - 4/10/10
- Fixed spurious "Error while writing string output; not all characters written" errors in -S/--sam mode.
0.12.4 release - 4/5/10
- Periods in read sequences are now treated as Ns instead of ignored. This should help with some problems where Bowtie erroneously reports "Reads file contained a pattern with more than 1024 quality values..." for data from recent versions of the Illumina GA pipeline.
- Fixed a bug whereby some error and warning messages would be printed on top of each other in -p/--threads mode.
- Chunk-exhaustion warnings messages are now suppressed when --quiet is specified.
- Fixed small issue in quality decoding whereby no-confidence colors would incorrectly influence decoded quality of adjacent bases.
0.12.3 release - 2/17/10
- Fixed a significant bug in -C/--color mode whereby quality values for SNP nucleotide positions are erroneously penalized.
- Fixed a bug in -S/--sam mode whereby if whitespace occurred in the original read name, it would be printed in the QNAME field in violation of the SAM spec. Bowtie now truncates read names at the first whitespace character before printing.
- When input is FASTQ and -C/--color is enabled, Bowtie is now tolerant of output from FASTQ converters that include the primer base and where the quality string is one character shorter than the sequence string (due to the primer). This fixes issues users have reported using output from BFAST's solid2fastq.
- Added support for SOLiD-style _QV files, via the -Q/--quals, --Q1 and --Q2 options. These options are used in combination with -C/--color and -f to align parallel colorspace read/quality files without having to convert to FASTQ.
0.12.2 release - 2/2/10
- When -C/--color is enabled, the default paired-end orientation is now --ff, not --fr. The new default fits typical SOLiD output.
- Fixed a bug whereby very large reads could cause bowtie to crash in --best mode
- After 0.12.1, some issues remained whereby bowtie would fail to trim the primer in colorspace (e.g. in -c mode). All input modes should now have fully-functioning colorspace primer trimming.
- Fixed a bug whereby decoded colorspace qualities could overflow and erroneously become low qualities.
- Fixed a bug that could produce incorrect paired-end alignment results in -n mode when using paired-end orientation modes other than --fr Even with the bug, reported results are reasonable; but the seed edit constraint ( -n ) may have been applied to the wrong end of one of the mates.
- Changed --chunkmbs default up to 64 from 32.
- Better error checking and reporting for some bowtie options.
- Some basic testing scripts are now bundled with Bowtie (in scripts/test), which should make it easier to regression-test.
Fixed mm9 colorspace index
- Previously posted version of the colorspace mm9 index was not actually colorspace. This has been fixed.
0.12.1 release - 1/8/10
- IMPORTANT: Fixed bug whereby bowtie would fail to remove both the primer base and the first color when parsing .csfasta files with primer bases in -C -f mode. A workaround for users of version 0.12.0 is to use -5 1 in that situation.
- Fixed bug whereby, when -M limit was exceeded for an unpaired read, the number output in the 7th column (or the XM:i optional field) for the randomly-selected alignment was too low by 1.
- Added discussion to manual about a pitfall regarding SOLiD paired-end input, i.e., not all entries necessarily have corresponding mates in the other file.
0.12.0 release - 12/23/09
- Added missing README.markdown file
- Minor documentation additions
0.12.0-beta1 release - 12/15/09
- Added SOLiD colorspace support
- Colorspace indexes are distinct from standard letterspace indexes and must be built with a separate invocation of bowtie-build (with -C option)
- Running bowtie with -C causes Bowtie to align in colorspace. Both index and reads must be in colorspace.
- Colorspace memory requirement is the same as paired-end alignment in letterspace (normal) mode. Paired-end alignment does not increase the memory requirement further in colorspace.
- csfasta, csfastq, and "raw" read formats are all supported with -C. In read sequences, 0 means "blue" and is intechangeable with A, likewise 1 (C) means "green", 2 (G) means "orange" and 3 (T) means "red".
- Colorspace versions of indexes added (see sidebar)
- New manual section discussing colorspace features
- Fixed a few SAM output issues
- @PG line now properly uses colons instead of equals signs
- Removed /1, /2 suffixes for paired-end reads in SAM mode
- Added --sam-RG option that permits the user to insert set values for flags that appear on the @RG line
- Fixed lingering pthreads bugs that could cause Bowtie to hang or crash toward the end of execution with -p > 1.
- Fixed performance-related bug that would cause paired-end alignment to be artificially slow in many situations.
- Fixed issue with random number generation that would result in non-random selection of alignments in some situations.
- bowtie -f now supports fasta files with reads split across multiple lines
- New --suppress option suppresses unwanted columns of output
- The MANUAL file was converted to markdown format, facilitating conversion to various other formats using tools like pandoc
- Deprecated: bowtie: --concise, bowtie-build: --big, --little
- Removed: -z/--phased, -b/--binout, bowtie-maptool, bowtie-maqconvert
Crossbow paper out - 11/20/09
- The provisional version of the Crossbow paper appeared today in Genome Biology.
New pre-built index list - 11/14/09
- The roster of pre-built indexes has been changed to address a number of user requests and issues and to streamline maintenance. Now:
- If there is an index not on the list that you use on a regular basis and would like to be made availble pre-built, contact us. We may add more if enough users request them.
- Note: neither UCSC nor NCBI indexes contain unplaced contigs (UCSC: "random" files, NCBI: Chromosome "Un") or alternative haplotype assemblies.
Bowtie on Galaxy - 10/15/09
- The Galaxy project has integrated Bowtie as one of the tools available for aligning short reads (under "NGS: Mapping" on in the "Tools" box). Many thanks to Anton Nekrutenko for his work on that.
0.11.3 release - 10/12/09
- Fixed crashing bug in -S/--sam mode when the number of reference sequences in the index is very large.
- Added --sam-nohead option to suppress output of SAM headers in -S/--sam mode.
- Added --sam-nosq option to suppress output of @SQ SAM headers in -S/--sam mode. These can become a nuisance when the reference index contains a very large number of sequences.
- Fixed a bug in bowtie-build's auto-configure mode that would cause it to underestimate the amount of memory required by a set of parameters. This in turn would cause the index to be corrupted.
Crossbow released - 10/9/09
- Crossbow is a cloud-computing pipeline for whole genome resequencing. It combines Bowtie and SoapSNP, an accurate genotyper, on top of Hadoop. Crossbow includes scripts that run automatically on Amazon's EC2 utility computing service for a fee, allowing researchers without extravagant computing resources to conduct large experiments efficiently. Authors are Ben Langmead and Michael C. Schatz.
0.11.2 release - 10/7/09
- Fixed issue whereby --max option was disabled.
0.11.1 mini-release - 10/5/09
- SAM output: changed XS:i optional field to be named XA:i to avoid a conflict with TopHat's XS:i field.
0.11.0 release - 10/5/09
- Initial SAM output support with -S/--sam option. bowtie sets all fields according to the SAM spec (Version 0.1.2-draft, 20090820). See the new SAM Output section of the manual for details.
- The Getting Started Guide and TUTORIAL file have been updated to use SAMtools instead of Maq for the SNP calling step.
- Added --shmem option: --shmem is similar to --mm in that it allows concurrent bowtie processes querying the same index to share a single memory image of the index. Unlike --mm, shared memory alocated by --shmem is permanent.
- The alignment summary printed to stderr at the end of an alignment run is now more friendly and includes data about the number and proportion of reads that aligned, failed to align, or were suppressed via the -m option.
- When too-short reads are encountered, bowtie now always prints warnings, not errors. --quiet now suppresses those warnings.
- By default, when bowtie prints a reference sequence name it now stops at the first whitespace. In 0.10.1, the default was to print the entire name, which could cause confusion when parsing bowtie output. To revert to printing the full name, use the new --fullref option.
- bowtie now prints the command-line before exiting with an error.
- Fixed mistake in the manual's "Default output" section: the offset in field 4 is 0-based, not 1-based. To obtain a 1-based offset instead, use the -B 1 option.
- Various minor bug fixes.
- Deprecated: -z/--phased, -b/--binout, bowtie-maptool, bowtie-maqconvert. These features will be removed in a future version of Bowtie. Note that -b/--binout, bowtie-maptool, and bowtie-maqconvert are largely superseded by the SAM output format (-S/--sam), BAM, and SAMtools. Contact the authors if this is a problem.
- Removed: --unfq/--unfa/--maxfq/--maxfa/--alfq/--alfa. Please use --un/--max/--al instead. Contact the authors if this is a problem.
Cufflinks released - 9/28/09
- Cufflinks, a tool that assembles transcripts and estimates their abundances in RNA-Seq samples was released today. The author is Cole Trapnell, who is also the author of TopHat.
10,000 downloads - 9/21/09
- Since the first version was released a little over a year ago, Bowtie has been downloaded more than 10,000 times. Our sincere thanks to users whose bug reports, suggestions and help have made this possible. Thank you!
New human index - 8/26/09
- Added index files for NCBI 37.1 assembly.
New cow index - 8/26/09
- Added index files for the 3.0 version of the University of Maryland Bos Taurus (cow) assembly. See sidebar.
0.10.1 release - 7/19/09
- Now when -3/-5 are used in combination with -I/-X, the -I/-X constraints are interpreted as applying to the original insert, not the trimmed insert.
- Fixed issue whereby -I option was ignored; -I option works now.
- Fixed a bug whereby some large indexes were incorrectly reported as corrupt by bowtie-build.
- Fixed issue whereby negative quality values were wrongly rejected when both --integer-quals and --solexa-quals were specified.
- The -l/--seedlen parameter can now be adjusted down to 5 (previously had to be >= 20).
- Fixed several minor memory leaks and out-of-bounds issues. The Linux version of the bowtie aligner now receives a clean bill of health from valgrind's memcheck.
- Other minor bugfixes.
0.10.0.2 release - 6/28/09
- Second bugfix for Windows version. src and bin-win32 packages updated. Linux and Mac users are not affected. Thanks for your bug reports and patience.
0.10.0.1 release - 6/23/09
- Fix for crashing bug in Windows version. src and bin-win32 packages updated. Linux and Mac users are not affected.
Note to TopHat users
- TopHat is not yet compatible with Bowtie 0.10.0. Stay tuned to the TopHat site for updated information on TopHat/Bowtie 0.10.0 compatibility.
0.10.0 release - 6/15/09
- Major change: All alignment modes are now unstratified by default. The --nostrata option has been removed, since it is now the default. A --strata option has been added to override the default and force stratified reporting. Reporting is stratified if and only if --strata is specified. --strata now cannot be specified without also specifying --best. Please note that, because of this change, specifying the same arguments to this version of Bowtie may yield different reported results.
- Replaced the --unfa/--unfq options with a single --un option, which writes unaligned reads to an output file (or pair of output files) but keeps reads in their original form. This is in contrast to the old --unfa/--unfq options, which only supported FASTA or FASTQ formats, and which would print a post-trimming and post-quality-value translation version of the read. Likewise, the --alfa/--alfq and --maxfa/--maxfq options have been replaced with --al and --max options. The old options are still present, but are deprecated and will be removed in a future version.
- Added --nofw and --norc options, allowing alignment to just one reference strand or the other.
- Added --mm option that causes bowtie to use memory-mapped files instead of traditional file I/O to access the reference index. This allows multiple bowtie processes running on the same computer to share a single in-memory image of a given index. This is a useful feature for parallelizing bowtie in situations where memory is limited and where -p is inappropriate or insufficient. This feature is not available in the Windows version of Bowtie.
- Added a section to the manual ("Reporting Modes") clarifying and giving examples of how to use Bowtie's reporting options.
- The --al and -z/--phased options previously interacted in such a way that the --al file could contain multiple entries for the same aligned read. --al and -z/--phased are now incompatible.
- The --oldpmap option, deprecated in version 0.9.8, has been removed.
0.9.9.3 release - 5/12/09
- Fixed an issue where bowtie --best would sometimes use excessive amounts of memory to store path descriptors. There is now a per-thread 32-MB ceiling (configurable with new option --chunkmbs <int>) on the memory taken by path descriptors. If the ceiling is exceeded Bowtie will skip the offending read, print a warning message identifying the read, and continue.
- More options are available for defining the quality-value format, including new --phred64-quals/--solexa1.3-quals options appropriate for the 64-based-Phred output of Illumina's GA Pipeline 1.3. Added option --phred33-quals to to handle the more typical 33-based-Phred scale (the default). The --solexa-quals option still handles the 64-based-Solexa scale output by GA Pipeline versions prior to 1.3.
- bowtie-build now checks output files for obvious corruption due, for example, to disk exhaustion.
- Specifying - (meaning stdin) as an input to bowtie is now supported and documented.
- Fixed a bug whereby bowtie-maqconvert could fail to notice that it had exhausted memory and output a corrupt Maq map file.
- Fixed a bug whereby bowtie would crash when trying to use an index built on a machine with different endianness.
- Fixed several issues that prevented Bowtie from compiling on Solaris. I confirm that Bowtie builds and runs on Solaris 10.
- Added _LARGEFILE_SOURCE _FILE_OFFSET_BITS=64 _GNU_SOURCE to the default build options in an attempt to resolve some of the large-file issues users are having.
- Clarified column 7 in the manual. We received many queries from users curious about this number.
- Moderate speed improvements in --best mode.
0.9.9.2 release - 4/6/09
- Paired-end alignment is now available in all alignment modes, including all -n modes.
- --best now provides better guarantees. Reported alignments are now guaranteed to be "best" both in terms of stratum (i.e. number of mismatches, or mismatches in the seed in the case of -n mode), and in terms of the quality values at the mismatched position(s). Stratum always trumps quality when determining best alignments. Also, --best mode resolves the strand bias issue that is present in the default mode (see manual for a discussion of the issue).
- Speed improvements for --best mode in most alignment modes.
- Major speed improvement for the -v 3 alignment mode (except when -z is also used).
- The "Reported X alignments..." message is now printed to stderr rather than stdout. Only alignments are written to stdout.
- In bowtie-maqconvert, read names longer than Maq's limit (36) are now truncated to a suffix of the original name, rather than a prefix. This mimics Maq's behavior and prevents /1 and /2 suffixes for paired-end reads from being destroyed.
- Added --alfq/--alfa options to dump aligned reads to FASTQ and/or FASTA files.
- Removed many extraneous source files.
TopHat paper out - 3/17/09
- The TopHat paper, by Cole Trapnell, Lior Pachter, and Steven L. Salzberg is out in Bioinformatics advance access.
0.9.9.1 release - 3/11/09
- Added paired-end alignment for -v 2 and -v 3 alignment modes (-n modes coming soon).
- Minor bug fixes and speed improvements for all paired-end modes.
- Added -s/--skip <int> option to skip over the first <int> reads or pairs in the input.
- --unfq/--unfa/--maxfq/--maxfa modes no longer create empty output files.
- All Bowtie tools now compile under GCC 4.3.3.
- Fixed bug whereby bowtie -b would sometimes write garbage into the reference offset field.
- Paired-end info is now persisted in the -b format, allowing bowtie-maptool output to add /1 and /2 suffixes as appropriate.
Final paper out - 3/10/09
- The Bowtie paper in Genome Biology is now final.
0.9.9 release - 2/19/09
- Added some preliminary support for paired-end alignment in -v 0 and -v 1 modes. -1/-2 options to specify the paired-end files, -I/-X to specify min and max insert sizes, and --fr/--rf/--ff specify relative orientation of upstream and downstream mates. bowtie-build now builds two additional files: NAME.3.ebwt and NAME.4.ebwt. Together, these files store a bitpacked version of the reference and they are required for paired-end alignment. If your index does not include these files and you would like to perform a paired-end alignment, you will have to rebuild the index with bowtie-build version 0.9.9 or later. Paired-end alignment is not compatible with -z mode, and it incurs about a 30% greater memory overhead than single-end mode.
- Pre-built indexes available from Bowtie website have been updated to include .3/.4.ebwt index files. These new pre-built indexes are no longer compatible with bowtie versions prior to 0.9.8.
- New -B/--offbase option allows user to specify how bowtie numbers reference positions in its output. E.g. -B 1 causes bowtie to number leftmost char as 1. -B 0 is the default, but -B 1 will likely become the default in the 1.0 release.
- Fixed a bug that caused trimming options -3 and -5 not to work properly in -r (raw input) mode.
- bowtie-build now prints a friendly error message and exits if an input file doesn't exist.
- Fixed a bug that caused the Win32 version of bowtie to hang just before it would normally have exited.
- Fixed bug that could prevent successful read-in of very large (>1GB) .2.ebwt index files.
- Removed --maxns option since it's mostly redundant with what -v and -n already do.
- Removed --ntoa option.
- bowtie usage message is now divided into sections for clarity.
Added AGBT poster - 2/5/09
- ...to Other Documentation section in the right-hand sidebar
Paper coming soon - 1/27/09
- Bowtie paper Ultrafast and memory-efficient alignment of short DNA sequences to the human genome is accepted at Genome Biology. A link to the paper will be posted when available.
- Added Related Tools, Publications, Other Documentation sections to right-hand sidebar.
Paper out - 3/4/09
- The provisional PDF of our paper Ultrafast and memory-efficient alignment of short DNA sequences to the human genome is available at Genome Biology.
Current goals - 3/4/09
- Current goals in rough order of priority are:
- Expanding preliminary paired-end support (new in 0.9.9) into full paired-end support.
- Better alignment guarantees.
- Gapped alignment.
- Also of interest: colorspace support and mapping qualities.
0.9.8.1 release - 1/7/09
- Fixed all known problems with the --unfa/--unfq options:
- They now work properly with multiple threads.
- Fixed issue where sequence and quals were sometimes reversed.
- Fixed other issues causing spurious omission of unaligned reads.
- Added --maxfa/--maxfq options so that reads that don't align due to the -m limit can be dumped separately from reads that don't align at all.
- Alignment output is now guaranteed to be "deterministic" even when multiple threads are used. I.e., given the same input reads (in any order) and the same --seed, bowtie will produce the same alignments every time it is run, though not necessarily in the same order. This does not hold across different versions of Bowtie.
- Multiple other bug fixes.
0.9.7 release - 11/8/08
- Added new reporting option -m <int> which suppresses all alignments for a particular read if more than <int> reportable alignments exist for it.
- Threads now buffer all alignments for a particular read/phase then output all alignments in one critical section. This guarantees that all alignments for a given read/phase appear in one consecutive block of the output, even when multiple threads are operating in parallel.
- Separated the quality-conversion and parsing aspects of the old --solexa-quals argument into separate arguments: --solexa-quals (quality conversion) and --integer-quals (parsing).
- bowtie-convert now handles the new (post-0.7.0) Maq alignment format. The new format allows Maq tools to handle reads up to 127 bases, whereas the old format was limited to 63 bases. Added a -o option to opt for the old Maq format.
- New --refout argument sends alignments to a set of files named refXXXXX.map, where XXXXX is the 0-padded index of the reference sequence aligned to. Useful for dealing with large datasets aligned to, e.g., the assembled human genome.
- Improved tutorial to use a simple simulated read set (included) to do SNP calls with Maq.
- Added --nota option to bowtie-build
- Fixed make_h_sapiens_asm.sh script to include mitochondrial DNA.
0.9.8 release - 11/25/08
- --unfa/--unfq <filename> options cause bowtie to dump unaligned reads to FASTA and/or FASTQ files.
- bowtie-build now selects its memory-efficiency parameters (--bmax, --dcv, --packed) automatically by default; this makes it far easier to build an index under memory constraints by eliminating tedious trail-and-error. New -a option disables this, yielding old behavior.
- bowtie-build-packed is no longer a separate binary. Supplying the new -p/--packed argument to bowtie-build is the new equivalent.
- New tool bowtie-maptool converts between Bowtie's output formats.
- New tool bowtie-inspect recreates reference strings from Bowtie index.
- Renamed bowtie-convert to bowtie-maqconvert for clarity.
- New universal Mac binary combines i386 & x86_64 binaries. PowerPC still not supported.
- Added --nomaqround option to bowtie.
- Fixed memory leaks in bowtie.
- Switched to a new scheme for mapping positions in "joined" reference string to positions in original strings. This changes the index format. bowtie-build's --oldpmap parameter reverts to the old format. Versions of bowtie prior to 0.9.8 cannot search indexes produced by bowtie-build 0.9.8 unless bowtie-build is run with --oldpmap. bowtie 0.9.8 can search either index format. Pre-built indexes are still in the old format, but will switch to new format when Bowtie 1.0 is released.
0.9.7.1 release - 11/11/08
- Fixed an issue that caused a spurious loss of sensitivity between Bowtie versions 0.9.6 and 0.9.7 in certain modes. Many thanks to Ali Mortazavi for bringing this to our attention.
TopHat released - 11/8/08
- Cole Trapnell has completed the initial release of TopHat, a fast splice junction mapper for RNA-Seq reads. TopHat aligns RNA-Seq reads to mammalian-sized genomes using Bowtie, and then analyzes the mapping results to identify splice junctions between exons.
Illumina link - 10/22/08
- Bowtie is now listed on Illumina's third-party tools page
Coming soon...
- A new addition to the Bowtie family: Tophat. Tophat is a complete Bowtie-based toolchain for RNA-seq applications. Check back soon for the official release.
0.9.6 release - 10/10/08
- bowtie now supports a host of options that allow the user to specify which and how many valid alignments to report per read. The default is still to report 1 "good" alignment, which is by far the fastest mode. See -k/-a/--best/--nostrata options described in the manual for details.
- bowtie now supports reads up to 1024 bases long. Note that for reads much longer than, say, 35 bases, the user must be careful to set alignment policy parameters (especially -e) appropriately.
- --fast flag eliminated, double-index mode is now the default. Added the -z/--phased flag to revert to phased, half-index mode.
- --concise output mode now officially supported. Now outputs one alignment per line.
- Changed bowtie-build default back to --bmaxdivn 4.
- -h/--help now prints much more verbose help for bowtie and bowtie-build (verbatim from MANUAL file).
- BWT-searching code streamlined; much old code eliminated
0.9.5 release - 9/27/08
- Last column of output now additionally reports the reference and query bases (in that order) for mismatches. E.g., old: 30,32 new: 30:C>A,32:C>T.
- Eliminated spurious trailing space in first column of output.
- Minor performance and sensitivity improvements.
- New option -p spawns a user-specified number of pthreads for parallel processing of reads. For example, use -p 4 to run bowtie on 4 processor cores simultaneously.
- Due to the new -p option, bowtie needs pthreads to compile and run. To compile bowtie without pthreads support (which disables the -p option), use make BOWTIE_PTHREADS=0.
- Also due to '-p' option, the Windows version of Bowtie now comes with the pthreadGC2.dll file from the pthreads for Win32 project. This library is released under the LGPL license.
- New option --fast causes Bowtie to load both the "forward" and "mirror" halves of the index at once, which eliminates the need for multiple phases and speeds up matching at the cost of using about twice as much memory. --fast also causes bowtie to scale better when used in combination with -p.
- Fixed crashing bug with -o/--offrate in bowtie.
- Improved error reporting.
0.9.4 release - 9/16/08
- New method for handling gaps and ambiguity codes in the reference. New bowtie-build handles long stretches of gaps gracefully. New bowtie rejects alignments that overlap a gap or ambiguous character in the reference.
- Due to above change, index file format has been changed. All pre-built indexes available on this site have been updated to the new format. To obtain indexes with the old format, contact us.
- In bowtie unnamed reads are now given ordinal names (rather than "default") in the alignment output. Works for all input modes.
- New bowtie input mode: Raw, activated with -r. Expects one read sequence per line; no quality values or names.
- Fixed bowtie bug whereby trimming did not work in -c mode.
- Changed bowtie-build default to not use blockwise mode.
- Fixed bowtie-build to avoid certain infinite-loop and very-long-runtime scenarios.
- Packaging improvements: archives now explode into subdirectories and scripts are executable.
0.9.3 release - 9/6/08
- Major reference-name bug fixes to bowtie-convert
0.9.2 release - 9/4/08
- Now allows 3-mismatches: -n and -v options accept 3
- Output format prints reference name instead of id in third column
- Pre-built indexes updated to encode reference names
- Ns in reads now match nothing (previously, they matched As/Ts)
- Dropped -l/--linerate and -i/--linesperside arguments to bowtie-build
- Fixed bug in Maq-like mode that allowed some poor alignments
- Minor speed improvements
0.9.1 release - 8/25/08
- Integrated relevant SeqAn-1.1 sources into Bowtie source release
- Now builds on Windows under MinGW (needs pthreads and zlib)
- Binary releases for Linux (i386, x86_64), Windows (i386) and MacOS X (i386)
0.9.0 release - 8/18/08
The first public release of the Bowtie source is now available for download. The release includes the three core Bowtie tools: the indexer bowtie-build, the read aligner bowtie and the converter from Bowtie's to Maq's mapping output format, bowtie-convert. This is a stable release, and compatible pre-built indexes for many model organisms are also available. See "Source releases" and "Pre-built indexes" on the sidebar. Please report any issues using the Sourceforge bug tracker.
Features of this release include:
- FASTA, FASTQ inputs supported; tested with Solexa FASTQ
- Supports Maq alignment policy (-n and -e behave as in Maq)
- Supports X-mismatch policy (-v option behaves as in SOAP)
- -n and -v options accept 0, 1, or 2