Latest Release
- Please cite: Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.
| Bowtie 0.11.3 | 10/12/09 |
Pre-built indexes
| H. sapiens, UCSC hg18 | 2.7 GB |
| H. sapiens, UCSC hg19 | 2.7 GB |
| H. sapiens, NCBI 36.3 | 2.7 GB |
| H. sapiens, NCBI 37.1 | 2.7 GB |
| M. musculus, UCSC mm9 | 2.4 GB |
| M. musculus, NCBI 37.1 | 2.4 GB |
| B. taurus, UMD 3.0 | 2.1 GB |
| D. melanogaster, Flybase, r5.22 | 153 MB |
| A. thaliana, TAIR, TAIR9 | 119 MB |
| C. elegans, Wormbase, WS200 | 77 MB |
| S. cerevisiae, CYGD | 15 MB |
| E. coli, NCBI, st. 536 | 5.0 MB |
All indexes are for assemblies, not contigs. Unplaced or unlocalized sequences and alternate haplotype assemblies are excluded.
Some unzip programs cannot handle archives >2 GB. If you have problems downloading or unzipping a >2 GB index, try downloading in two parts.
Check .zip file integrity with MD5s.
Pre-built indexes are compatible with Bowtie versions 0.9.8 and later. For older indexes, please contact us.
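The MD5 check mentioned above can be scripted. The following is a minimal sketch, not part of Bowtie: the function names and the chunk size are illustrative assumptions, and the published MD5 string is whatever value accompanies the archive you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that archives larger than 2 GB
    do not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_md5.strip().lower()
```

A mismatch usually means the download was truncated or corrupted and should be repeated.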
Publications
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25.
- Langmead B, Schatz M, Lin J, Pop M, Salzberg SL. Searching for SNPs with cloud computing. Genome Biology 10:R134.
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 2009 25(9):1105-1111.
Manual
What is Bowtie?
Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. It aligns 35-base-pair reads to the human genome at a rate of 25 million reads per hour on a typical workstation. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: for the human genome, the index is typically about 2.2 GB (for unpaired alignment) or 2.9 GB (for paired-end alignment). Multiple processor cores can be used simultaneously to achieve greater alignment speed. Bowtie can also output alignments in the standard SAM format, allowing Bowtie to interoperate with other tools supporting SAM, including the SAMtools consensus, SNP, and indel callers. Bowtie runs on the command line under Windows, Mac OS X, Linux, and Solaris.
What isn't Bowtie?
Bowtie is not a general-purpose alignment tool like MUMmer, BLAST or Vmatch. Bowtie works best with short reads (though it supports reads up to 1024 bases in length) and is designed to be extremely fast for read sets where a) many of the reads have at least one good, valid alignment, b) many of the reads are relatively high-quality, and c) the number of alignments reported per read is small (close to 1). These criteria are generally satisfied in the context of mammalian resequencing projects, but poor running times may be observed in other contexts.
Bowtie does not yet work in ABI color space and does not yet report gapped alignments. These features are future work.
Obtaining Bowtie Binaries
Obtain Bowtie binaries for your platform from the Download section of the Sourceforge project site. Binaries are currently available for Intel architectures (i386 and x86_64) running Linux, Windows, and Mac OS X.
Obtaining and Building Bowtie Sources
Building Bowtie requires a GNU-like environment that includes GCC, GNU Make and other basics. It should be possible to build Bowtie on a vanilla Linux or Mac installation.
Bowtie can also be built on Windows using Cygwin or MinGW (recommended). If building with MinGW, first install MinGW and MSYS, the zlib library and the pthreads library. You may also need the GnuWin32 core and other utilities to drive the build process.
Bowtie depends on code from Maq and from SeqAn. However, all supporting code is included in the Bowtie source archive, so there is no need to download additional sources.
Obtain Bowtie sources from the Download section of the Sourceforge project site. Extract the sources, change to the directory where they were extracted, and build the Bowtie tools by running GNU make (usually with the command make, but sometimes with gmake) with no arguments. If building with MinGW, run GNU make from the MSYS environment.
Due to the -p option, Bowtie needs the pthreads library to compile and run. To compile Bowtie without pthreads support (which disables the -p option), use make BOWTIE_PTHREADS=0.
Using the bowtie Aligner
The bowtie aligner takes an index and a set of reads as input and outputs a list of alignments. Alignments are selected according to a combination of the -v/-n/-e/-l options (plus the -I/-X/--fr/--rf/--ff options for paired-end alignment), which define which alignments are legal, and the -k/-a/-m/--best/--strata options, which define which and how many legal alignments should be reported. By default, Bowtie enforces an alignment policy equivalent to Maq's quality-aware policy (-n 2 -l 28 -e 70), but it can also be made to enforce an end-to-end k-difference policy equivalent to SOAP's (-v 2).
Bowtie is designed to be very fast for read sets where a) many of the reads have at least one good, valid alignment, b) many of the reads are relatively high-quality, and c) the number of alignments reported per read is small (close to 1). These criteria are generally satisfied in the context of modern short-read analyses such as RNA-seq, ChIP-seq, other types of -seq, and especially mammalian genotyping (e.g.
the 1000 Genomes Project). You may observe longer running times in other research contexts. If you find Bowtie's performance to be disappointingly slow, please try the hints described in the "High Performance Tips" section below. If Bowtie continues to be too slow, please contact us and tell us the nature of your research application and the parameters you are using to run Bowtie. We are eager to hear your feedback.
A result of Bowtie's indexing strategy is that alignments involving one or more ambiguous reference characters (N, -, R, Y, etc.) are considered invalid by Bowtie, regardless of the alignment policy. This is true only for ambiguous characters in the reference; alignments involving ambiguous characters in the read are legal, subject to the alignment policy. Also, alignments that "fall off" the reference sequence are not considered legal by Bowtie, though some such alignments will become legal once gapped alignment is implemented.
The process by which bowtie chooses an alignment to report is randomized in order to avoid "mapping bias" - the phenomenon whereby an aligner systematically fails to report a particular class of good alignments, causing spurious "holes" in the comparative assembly. Whenever bowtie reports a subset of the valid alignments that exist, it makes an effort to sample them randomly. This randomness flows from a simple seeded pseudo-random number generator and is "deterministic" in the sense that Bowtie will always produce the same results for the same read when run with the same initial "seed" value (see documentation for --seed option).
In the default mode, bowtie can exhibit strand bias. Strand bias occurs when input reference and reads are such that (a) some reads align equally well to sites on the forward and reverse strands of the reference, and (b) the number of such sites on one strand is different from the number on the other strand.
When this happens for a given read, bowtie effectively chooses one strand or the other with 50% probability, then reports a randomly-selected alignment for that read from among the sites on the selected strand. This tends to overassign alignments to the sites on the strand with fewer sites and underassign to sites on the strand with more sites. The effect is mitigated, though it may not be eliminated, when reads are longer or when paired-end reads are used. Running Bowtie in --best mode eliminates strand bias by forcing Bowtie to select one strand or the other with a probability that is proportional to the number of best sites on the strand.
Gapped alignments are not currently supported, but we do plan to implement this in the future. Alignment in ABI "color space" is also not currently supported.
Maq-like Policy
When the -n option is specified (and it is by default), Bowtie determines which alignments are valid according to the following policy, which is equivalent to Maq's default policy:
The N, L and E parameters are configured using Bowtie's -n, -l and -e options. The -n (Maq-like) option is mutually exclusive with the -v (end-to-end k-difference) option.
If there are many possible alignments satisfying these criteria, Bowtie will prefer to report alignments with fewer mismatches and where the sum from criterion 2 is smaller. However, Bowtie does not guarantee that the reported alignment(s) are "best" in terms of the number of mismatches (i.e. the alignment "stratum") or in terms of the quality values at the mismatched positions unless the --best option is specified. Bowtie is about 1 to 2.5 times slower when --best is specified.
Note that Maq internally rounds base qualities to the nearest 10 and rounds qualities greater than 30 to 30. To maintain compatibility with Maq, Bowtie does the same. Rounding can be suppressed with the --nomaqround option.
Bowtie is not fully sensitive in -n 2 and -n 3 modes by default. In these modes Bowtie imposes a "backtracking limit" to limit effort spent trying to find valid alignments for low-quality reads that are unlikely to have any. This may cause bowtie to miss some legal 2- and 3-mismatch alignments. The limit is set to a reasonable default (125 without --best, 800 with --best), but the user may decrease or increase the limit using the --maxbts and/or -y options. -y mode is slow but guarantees full sensitivity.
End-to-end k-difference Policy
The policy has one criterion: Alignments may have no more than V mismatches. Quality values are ignored. The number of mismatches permitted is configurable with the -v option. The -v (end-to-end) option is mutually exclusive with the -n (Maq-like) option.
If there are many possible alignments satisfying this criterion, Bowtie will prefer to report alignments with fewer mismatches. However, for reads where the "best" alignment has one or more mismatches, Bowtie does not guarantee that the reported alignment(s) will be best unless the --best option is specified.
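The two validity policies can be compared with a small sketch. This is an illustration only, not Bowtie's implementation. It assumes that criterion 1 of the Maq-like policy limits mismatches within the first L bases of the read (the "seed") to N, and that criterion 2 limits the sum of the quality values at all mismatched positions to E. Positions are assumed to be 0-based from the 5' end, and the tie-breaking used when rounding a quality to the nearest 10 (halves round up) is also an assumption. All function names are hypothetical.

```python
def maq_round(quality):
    """Round a quality to the nearest 10 and cap it at 30.

    Rounding halves upward (25 -> 30) is an assumption of this sketch.
    """
    return min(30, ((quality + 5) // 10) * 10)


def valid_maq_like(mismatch_positions, quals, n=2, l=28, e=70, maq_rounding=True):
    """Check the assumed Maq-like criteria for one candidate alignment.

    mismatch_positions: 0-based read offsets where read and reference differ.
    quals: per-base quality values for the read.
    """
    # Criterion 1 (assumed): at most n mismatches within the first l bases.
    seed_mismatches = sum(1 for pos in mismatch_positions if pos < l)
    if seed_mismatches > n:
        return False
    # Criterion 2 (assumed): summed quality at all mismatches is at most e.
    total = 0
    for pos in mismatch_positions:
        total += maq_round(quals[pos]) if maq_rounding else quals[pos]
    return total <= e


def valid_end_to_end(mismatch_positions, v=2):
    """End-to-end k-difference policy: at most v mismatches, qualities ignored."""
    return len(mismatch_positions) <= v
```

For a 36-base read with every quality equal to 30, mismatches at offsets 3 and 30 pass the default Maq-like policy (one seed mismatch, quality sum 60), while mismatches at offsets 0, 10 and 30 fail it because the quality sum of 90 exceeds 70.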
Bowtie is typically about 1 to 2.5 times slower when --best is specified.
Reporting Modes
With the -k, -a, -m, --best and --strata options, Bowtie gives the
user a great deal of flexibility in selecting which alignments get
reported. Here we give a few examples that demonstrate a few ways
they can be combined to achieve a desired result. All examples are
using the e_coli index that comes packaged with Bowtie. Alignment
summary output is elided.
Example 1: -a
$ ./bowtie -a -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -a instructs bowtie to report all valid alignments,
subject to the alignment policy: -v 2. In this case, bowtie finds
5 inexact hits in the E. coli genome; 1 hit (the 2nd one listed)
has 1 mismatch and 4 hits have 2 mismatches. Note that they are
not necessarily listed in best-to-worst order.
Example 2: -k 3
$ ./bowtie -k 3 -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -k 3 instructs bowtie to report up to 3 valid
alignments. In this case, a total of 5 valid alignments exist (see
Example 1); bowtie reports 3 out of those 5. -k can be set to any
integer greater than 0.
Example 3: -k 6
$ ./bowtie -k 6 -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -k 6 instructs bowtie to report up to 6 valid
alignments. In this case, a total of 5 valid alignments exist, so
bowtie reports all 5.
Example 4: default (-k 1)
$ ./bowtie -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Leaving the reporting options at their defaults causes Bowtie to
report the first valid alignment it encounters. Because --best was
not specified, we are not guaranteed that bowtie will report the
best alignment, and in this case it does not (the 1-mismatch
alignment from the previous example would have been better). The
default reporting mode is equivalent to -k 1.
Example 5: -a --best
$ ./bowtie -a --best -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -a --best results in the same alignments being printed
as if just -a had been specified, but they are guaranteed to be
reported in best-to-worst order.
Example 6: -a --best --strata
$ ./bowtie -a --best --strata -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying --strata in addition to -a and --best causes Bowtie to
report only those alignments in the best alignment "stratum". The
alignments in the best stratum are those having the least number of
mismatches (or mismatches just in the "seed" portion of the
alignment in the case of -n mode). Note that if --strata is
specified, --best must also be specified.
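The interaction of -k, -a, --best and --strata in Examples 1 to 6 can be modeled with a short sketch. It is a simplification under stated assumptions: alignments are represented as hypothetical (site, mismatches) pairs, and the order of the input list stands in for the order in which bowtie happens to encounter alignments, whereas the real aligner samples among valid alignments pseudo-randomly. The function name is illustrative.

```python
def report(alignments, k=1, report_all=False, best=False, strata=False):
    """Select which valid alignments get reported for one read.

    alignments: list of (site, mismatches) pairs in encounter order.
    """
    if strata and not best:
        raise ValueError("--strata requires --best")
    candidates = list(alignments)
    if best:
        # Best-to-worst order; sort() is stable, so ties keep encounter order.
        candidates.sort(key=lambda item: item[1])
    if strata and candidates:
        # Keep only the stratum with the fewest mismatches.
        fewest = min(mismatches for _, mismatches in candidates)
        candidates = [item for item in candidates if item[1] == fewest]
    return candidates if report_all else candidates[:k]


# Hypothetical hits shaped like Example 1: five hits, the 2nd has 1 mismatch.
hits = [("site1", 2), ("site2", 1), ("site3", 2), ("site4", 2), ("site5", 2)]
print(report(hits))                                        # default, like -k 1
print(report(hits, report_all=True, best=True, strata=True))  # best stratum only
```

With these hypothetical hits, the default mode returns the first hit encountered, which has 2 mismatches, while -a --best --strata returns only the single 1-mismatch hit.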
Example 7: -a -m 3
$ ./bowtie -a -m 3 -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -m 3 instructs bowtie to refrain from reporting any
alignments for reads having more than 3 reportable alignments. The
-m option is useful when the user would like to guarantee that
reported alignments are "unique", for some definition of unique.
Example 1 showed that the read has 5 reportable alignments when -a
and -v 2 are specified, so the -m 3 limit causes bowtie to output
no alignments.
Example 8: -a -m 5
$ ./bowtie -a -m 5 -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -m 5 instructs bowtie to refrain from reporting any
alignments for reads having more than 5 reportable alignments.
Since the read has exactly 5 reportable alignments, the -m 5 limit
allows bowtie to print them as usual.
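The -m limit in Examples 7 and 8 acts as an all-or-nothing filter per read. A minimal sketch of that behavior follows; the dictionary layout, read names and function name are illustrative assumptions, not Bowtie data structures.

```python
def apply_m_limit(reportable_by_read, m):
    """Suppress every alignment of any read with more than m reportable alignments.

    reportable_by_read: dict mapping a read name to its list of reportable
    alignments (after the alignment policy and other reporting options).
    """
    kept = {}
    for read, alignments in reportable_by_read.items():
        kept[read] = list(alignments) if len(alignments) <= m else []
    return kept


# Hypothetical reads: one with 5 reportable alignments, one with a single hit.
reads = {
    "repetitive_read": ["a1", "a2", "a3", "a4", "a5"],
    "unique_read": ["b1"],
}
print(apply_m_limit(reads, 3))
print(apply_m_limit(reads, 5))
```

With -m 3 the read with 5 reportable alignments yields no output at all, while with -m 5 all of its alignments are reported, because the limit is exceeded only by reads with more than m reportable alignments.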
Example 9: -a -m 3 --best --strata
$ ./bowtie -a -m 3 --best --strata -v 2 e_coli --concise -c ATGCATCATGCGCCAT
Specifying -m 3 instructs bowtie to refrain from reporting any alignments for reads having more than 3 reportable alignments. As we saw in Example 6, the read has only 1 reportable alignment when -a, --best and --strata are specified, so the -m 3 limit allows bowtie to print that alignment as usual.
Intuitively, the -m option, when combined with the --best and --strata options, guarantees a principled, though somewhat weaker, form of "uniqueness." A stronger form of uniqueness is enforced when -m is specified but --best --strata are not.
Paired-end Alignment
Bowtie can align paired-end reads when paired read files are specified using the -1 and -2 options (for pairs of raw, FASTA, or FASTQ read files), or using the --12 option (for Tab-delimited read files). A valid paired-end alignment satisfies the following criteria:
Policies governing which paired-end alignments are reported for a given read are specified using the -k, -a and -m options as usual. The --strata and --best options do not apply in paired-end mode.
A paired-end alignment is reported as a pair of mate alignments, each on a separate line, where the alignment for each mate is formatted the same as an unpaired (singleton) alignment. The alignment for the mate that occurs closest to the beginning of the reference sequence (the "upstream" mate) is always printed before the alignment for the downstream mate. Read files containing paired-end reads will sometimes name the reads according to whether they are the #1 or #2 mates by appending a "/1" or "/2" suffix to the read name. If no such suffix is present in Bowtie's input, the suffix will be added when Bowtie prints read names in alignments.
Finding a valid paired-end alignment where both mates align to repetitive regions of the reference can be very time-consuming. By default, Bowtie avoids much of this cost by imposing a limit on the number of "tries" it makes to match an alignment for one mate with a nearby alignment for the other. The default limit is 100. This causes Bowtie to miss some valid paired-end alignments where both mates lie in repetitive regions, but the user may use the --pairtries or -y options to increase Bowtie's sensitivity as desired.
Paired-end alignments where one mate's alignment is entirely contained within the other's are considered invalid.
Because Bowtie uses an in-memory representation of the original reference string when finding paired-end alignments, its memory footprint is larger when aligning paired-end reads. For example, the human index has a memory footprint of about 2.2 GB in single-end mode and 2.9 GB in paired-end mode.
High Performance Tips
Tip 1: Use 64-bit bowtie if possible.
The 64-bit version of Bowtie is substantially faster (usually more than 50% faster) than the 32-bit version, due to Bowtie's use of 64-bit arithmetic when searching both in the index and in the reference. If possible, download the 64-bit binaries for Bowtie and run them on a 64-bit machine. If you are building Bowtie from sources, you may need to pass the -m64 option to g++ to compile the 64-bit version; you can do this by supplying argument BITS=64 to the make command; e.g.: make BITS=64 bowtie. To determine whether your version of bowtie is 64-bit or 32-bit, run bowtie --version.
Tip 2: If your computer has multiple processors/cores, try -p.
The -p <int> option causes Bowtie to launch <int> parallel search threads. Each thread runs on a different processor/core and all threads find alignments in parallel, increasing alignment throughput by approximately a multiple of <int>.
Tip 3: If reporting many alignments per read, try tweaking bowtie-build --offrate.
If you are using the -k, -a or -m options and Bowtie is reporting many alignments per read (an average of more than about 10 per read) and you have some physical memory to spare, then consider building an index with a denser SA sample. To build an index with a denser SA sample, specify a smaller --offrate value when running bowtie-build. A denser SA sample leads to a larger index, but is also particularly effective at speeding up alignment when the number of alignments reported per read is large. For example, if the number of alignments per read is very large, decreasing the index's --offrate by 1 could as much as double alignment performance, and decreasing by 2 could quadruple alignment performance, etc. On the other hand, decreasing --offrate increases the size of the Bowtie index, both on disk and in memory when aligning reads. At the default --offrate of 5, the SA sample for the human genome occupies about 375 MB of memory when aligning reads.
Decreasing the --offrate by 1 doubles the memory taken by the SA sample, and decreasing by 2 quadruples the memory taken, etc.
Tip 4: If bowtie "thrashes", try tweaking bowtie --offrate.
If bowtie is very slow and consistently triggers more than a few page faults per second (as observed via top or vmstat on Mac/Linux, or via a tool like Process Explorer on Windows), then try giving bowtie the --offrate <int> option with a larger <int> value than the value used when building the index. For example, bowtie-build's default --offrate is 5 and all pre-built indexes available from the Bowtie website are built with --offrate 5; so if bowtie thrashes when querying such an index, try using bowtie --offrate 6. If bowtie still thrashes, try bowtie --offrate 7, etc. A higher --offrate causes bowtie to use a sparser sample of the suffix-array than is stored in the index; this saves memory but makes alignment reporting slower (which is especially slow when using -a or large -k).
Command Line
The following is a detailed description of the options used to control the bowtie aligner:
Usage: bowtie [options]* <ebwt> {-1 <m1> -2 <m2> | --12 <r> | <s>} [<hit>]
Default output
The bowtie aligner outputs each alignment on a separate line. Each line is a collection of 8 fields separated by tabs; from left to right, the fields are:
SAM output
Following is a brief description of the SAM format as output by Bowtie when the -S/--sam option is specified. For more details, see the SAM format specification.
When -S/--sam is specified, bowtie will always print a SAM header with @HD, @SQ and @PG lines. Each subsequent line corresponds to a read or an alignment. Each line is a collection of at least 12 fields separated by tabs; from left to right, the fields are:
Using the bowtie-build Indexer
Use bowtie-build to build a Bowtie index from a set of DNA sequences. bowtie-build outputs a set of 6 files with suffixes .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, and .rev.2.ebwt, where the prefix is the <ebwt_outfile_base> parameter supplied by the user on the command line. These files together constitute the index: they are all that is needed to align reads to the reference sequences. The original sequence files are no longer used by Bowtie once the index is built.
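Before aligning, it can be useful to confirm that all six index files are present. The sketch below derives the expected file names from a basename using the suffixes listed above; the helper names are hypothetical and the check is a convenience, not something Bowtie requires.

```python
import os

# Suffixes of the six files that bowtie-build writes, as listed above.
EBWT_SUFFIXES = (
    ".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt", ".rev.1.ebwt", ".rev.2.ebwt",
)


def index_files(basename):
    """Return the six file names that make up the index for `basename`."""
    return [basename + suffix for suffix in EBWT_SUFFIXES]


def missing_index_files(basename):
    """Return the expected index files that do not exist on disk."""
    return [path for path in index_files(basename) if not os.path.isfile(path)]
```

For example, index_files("e_coli") lists e_coli.1.ebwt through e_coli.rev.2.ebwt, and an empty result from missing_index_files means the index is complete.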
Use of Karkkainen's blockwise algorithm (see reference #4 below) allows bowtie-build to trade off between running time and memory usage. bowtie-build has three options governing how it makes this trade: -p/--packed, --bmax/--bmaxdivn, and --dcv. By default, bowtie-build will automatically search for the settings that yield the best running time without exhausting memory. This behavior can be disabled using the -a/--noauto option.
The indexer provides options pertaining to the "shape" of the index, e.g. --offrate governs the fraction of Burrows-Wheeler rows that are "marked" (i.e., the "density" of the suffix-array sample; see reference #2). All of these options are potentially profitable trade-offs depending on the application. They have been set to defaults that are reasonable for most cases according to our experiments. See High Performance Tips for additional details.
Because bowtie-build uses 32-bit pointers internally, it can handle up to a maximum of 2^32-1 (somewhat more than 4 billion) characters in an index. If your reference exceeds 2^32-1 characters, bowtie-build will print an error message and abort. To resolve this, divide your reference sequences into smaller batches and/or chunks and build a separate index for each.
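Dividing a large reference into batches can be planned ahead of time from the sequence lengths alone. The following sketch groups sequences greedily, in input order, so that each batch stays within the limit. It assumes the limit applies to the total number of characters across the sequences in one index, and the function name and input layout are illustrative.

```python
MAX_INDEX_CHARS = 2**32 - 1  # limit imposed by 32-bit pointers


def batch_references(lengths, limit=MAX_INDEX_CHARS):
    """Group sequences into batches whose total length stays within `limit`.

    lengths: dict mapping sequence name to its length in characters.
    Sequences are packed greedily in input order; a single sequence longer
    than the limit cannot be batched and must be split into chunks first.
    """
    batches, current, total = [], [], 0
    for name, length in lengths.items():
        if length > limit:
            raise ValueError(f"{name} exceeds the limit and must be chunked")
        if current and total + length > limit:
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += length
    if current:
        batches.append(current)
    return batches
```

Each resulting batch would then be passed to a separate bowtie-build run, and reads would be aligned against each index in turn.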
If your computer has more than 3-4 GB of memory and you would like to exploit that fact to make index building faster, you must use a 64-bit version of the bowtie-build binary. The 32-bit version of the binary is restricted to using less than 4 GB of memory. If a 64-bit pre-built binary does not yet exist for your platform on the sourceforge download site, you will need to build one from source.
The Bowtie index is based on the FM Index of Ferragina and Manzini, which in turn is based on the Burrows-Wheeler transform. The algorithm used to build the index is based on the blockwise algorithm of Karkkainen. For more information on these techniques, see these references:
- Burrows M, Wheeler DJ: A block sorting lossless data compression algorithm. Digital Equipment Corporation, Palo Alto, CA 1994, Technical Report 124.
- Ferragina, P. and Manzini, G. 2000. Opportunistic data structures with applications. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (November 12 - 14, 2000). FOCS
- Ferragina, P. and Manzini, G. 2001. An experimental study of an opportunistic index. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (Washington, D.C., United States, January 07 - 09, 2001). 269-278.
- Karkkainen, J. 2007. Fast BWT in small space by blockwise suffix sorting. Theor. Comput. Sci. 387, 3 (Nov. 2007), 249-257
Command Line
Usage: bowtie-build [options]* <reference_in> <ebwt_outfile_base>
| Arguments: | |
| <reference_in> | A comma-separated list of FASTA files containing the reference sequences to be aligned to, or, if -c is specified, the sequences themselves. E.g., this might be chr1.fa,chr2.fa,chrX.fa,chrY.fa, or, if -c is specified, this might be GGTCATCCT,ACGGGTCGT,CCGTTCTATGCGGCTTA. |
| <ebwt_outfile_base> | The basename of the index files to write. By default, bowtie-build writes files named NAME.1.ebwt, NAME.2.ebwt, NAME.3.ebwt, NAME.4.ebwt, NAME.rev.1.ebwt, and NAME.rev.2.ebwt, where NAME is the basename. |
| Options: | |
| -f | The reference input files (specified as <reference_in>) are FASTA files (usually having extension .fa, .mfa, .fna or similar). |
| -c | The reference sequences are given on the command line. I.e. <reference_in> is a comma-separated list of sequences rather than a list of FASTA files. |
| -a/--noauto | Disable the default behavior whereby bowtie-build automatically selects values for --bmax/--dcv/--packed parameters according to the memory available. User may specify values for those parameters. If memory is exhausted during indexing, an error message will be printed; it is up to the user to try new parameters. |
| -p/--packed | Use a packed (2-bits-per-nucleotide) representation for DNA strings. This saves memory but makes indexing 2-3 times slower. Default: off. This is configured automatically by default; use -a/--noauto to configure manually. |
| --bmax <int> | The maximum number of suffixes allowed in a block. Allowing more suffixes per block makes indexing faster, but increases memory overhead. Overrides any previous specification of --bmax, --bmaxmultsqrt or --bmaxdivn. Default: --bmaxdivn 4. This is configured automatically by default; use -a/--noauto to configure manually. |
| --bmaxdivn <int> | The maximum number of suffixes allowed in a block, expressed as a fraction of the length of the reference. Overrides any previous specification of --bmax, --bmaxmultsqrt or --bmaxdivn. Default: --bmaxdivn 4. This is configured automatically by default; use -a/--noauto to configure manually. |
| --dcv <int> | Use <int> as the period for the difference-cover sample. A larger period yields less memory overhead, but may make suffix sorting slower, especially if repeats are present. Must be a power of 2 no greater than 4096. Default: 1024. This is configured automatically by default; use -a/--noauto to configure manually. |
| --nodc | Disable use of the difference-cover sample. Suffix sorting becomes quadratic-time in the worst case (where the worst case is an extremely repetitive reference). Default: off. |
| -r/--noref | Do not build the NAME.3.ebwt and NAME.4.ebwt portions of the index, which contain a bitpacked version of the reference sequences and are (currently) only used for paired-end alignment. |
| -3/--justref | Build only the NAME.3.ebwt and NAME.4.ebwt portions of the index, which contain a bitpacked version of the reference sequences and are (currently) only used for paired-end alignment. |
| -o/--offrate <int> | To map alignments back to positions on the reference sequences, it's necessary to annotate ("mark") some or all of the Burrows-Wheeler rows with their corresponding location on the genome. The offrate governs how many rows get marked: the indexer will mark every 2^<int> rows. Marking more rows makes reference-position lookups faster, but requires more memory to hold the annotations at runtime. The default is 5 (every 32nd row is marked; for human genome, annotations occupy about 340 megabytes). |
| -t/--ftabchars <int> | The ftab is the lookup table used to calculate an initial Burrows-Wheeler range with respect to the first <int> characters of the query. A larger <int> yields a larger lookup table but faster query times. The ftab has size 4^(<int>+1) bytes. The default is 10 (ftab is 4MB). |
| --ntoa | Convert Ns in the reference sequence to As before building the index. By default, Ns are simply excluded from the index and bowtie will not find alignments that overlap them. |
| --big --little | Endianness to use when serializing integers to the index file. Default: little-endian (recommended for Intel- and AMD-based architectures). |
| --seed <int> | Use <int> as the seed for pseudo-random number generator. |
| --cutoff <int> | Index only the first <int> bases of the reference sequences (cumulative across sequences) and ignore the rest. |
| --oldpmap | bowtie-build switched schemes for mapping "joined" reference locations to original reference locations in version 0.9.8. The new scheme has the advantage that it does not use padding. This option activates the old padding-based scheme used in versions prior to 0.9.8. Versions of bowtie prior to 0.9.8 can query only indexes that use the old scheme. Versions of bowtie starting with 0.9.8 can query indexes using either scheme. This option will be deprecated in version 1.0. |
| -q/--quiet | bowtie-build is verbose by default. With this option bowtie-build will print only error messages. |
| -h/--help | Print detailed description of tool and its options (from MANUAL). |
| --version | Print version information and quit. |
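The memory figures quoted for -o/--offrate and -t/--ftabchars follow from simple arithmetic, sketched below. The sketch assumes each marked row costs 4 bytes (consistent with the 32-bit pointers mentioned earlier) and uses a round figure of 3 billion rows purely for illustration; it is not the exact length of any reference. The function names are hypothetical.

```python
def sa_sample_bytes(num_rows, offrate=5, bytes_per_entry=4):
    """Approximate memory for the suffix-array sample.

    Every 2^offrate-th Burrows-Wheeler row is marked; each mark is assumed
    to cost `bytes_per_entry` bytes.
    """
    return (num_rows // 2**offrate) * bytes_per_entry


def ftab_bytes(ftabchars=10):
    """Size of the ftab lookup table: 4^(ftabchars + 1) bytes."""
    return 4 ** (ftabchars + 1)


rows = 3_000_000_000  # illustrative round figure, not a real genome length
print(sa_sample_bytes(rows, offrate=5))  # 375000000
print(sa_sample_bytes(rows, offrate=4))  # 750000000
print(ftab_bytes(10))                    # 4194304
```

Lowering the offrate by 1 doubles the sample size, which matches the doubling described in the High Performance Tips, and the default ftab of 4^11 bytes is the 4 MB quoted in the table.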
Using the bowtie-inspect Index Inspector
bowtie-inspect extracts information from a Bowtie index about the original reference sequences used to build it. By default, the tool will output a FASTA file containing the sequences of the original references (with all non-A/C/G/T characters converted to Ns). It can also be used to extract just the reference sequence names using the -n option.
Command Line
Usage: bowtie-inspect [options]* <ebwt_base>
| Arguments: | |
| <ebwt_base> | The basename of the index to be inspected. The basename is the name of any of the index files up to but not including the first period. bowtie first looks in the current directory for the index files, then looks in the indexes subdirectory under the directory where the currently-running bowtie executable is located, then looks in the directory specified in the BOWTIE_INDEXES environment variable. |
| Options: | |
| -a/--across <int> | When printing FASTA output, output a newline character every <int> bases (default: 60). |
| -n/--names | Print reference sequence names only; ignore sequence. |
| -v/--verbose | Print verbose output (for debugging). |
| --version | Print version information and quit. |
| -h/--help | Print detailed description of tool and its options (from MANUAL). |
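The default FASTA output described above, including the line width set by -a/--across, can be approximated as follows. This is a sketch of the output format only and does not read a Bowtie index; the function name is hypothetical, and upper-casing the input before replacing non-A/C/G/T characters is an assumption.

```python
def to_fasta(name, sequence, across=60):
    """Format one sequence as FASTA with `across` bases per line.

    Characters other than A, C, G and T are written as N. Upper-casing the
    input first is an assumption of this sketch.
    """
    if across <= 0:
        raise ValueError("across must be a positive integer")
    cleaned = "".join(c if c in "ACGT" else "N" for c in sequence.upper())
    lines = [cleaned[i:i + across] for i in range(0, len(cleaned), across)]
    return "\n".join([">" + name] + lines) + "\n"


print(to_fasta("toy_ref", "ACGTRYACGT", across=4), end="")
```

With across=4, the 10-base toy sequence is written as three lines of 4, 4 and 2 bases, and the ambiguous R and Y characters appear as N.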


