BEETL: Burrows-Wheeler Extended Tool Library
BEETL is a suite of applications for building and manipulating the Burrows-Wheeler Transform (BWT) of collections of DNA sequences. The algorithms employed in BEETL are intended to scale to collections of sequences containing one billion entries or more.
The initial release implemented two flavours of an algorithm for building the BWT of a sequence collection - BCR and BCRext.
Subsequent releases are adding functionality for efficient inversion and querying of BWTs.
Questions and queries can be sent to the BEETL mailing list
Release History
Version 1.0.0 (29th August 2014)
- BEETL is now released under the BSD 2-Clause license, enjoy!
- New "bwt_v3" Run Length Encoding format for BWT files, set as default (and described in doc/FileFormats.md)
Version 0.10.0 (17th June 2014)
- New beetl-fastq tool to convert FASTQ files into a set of smaller and searchable files (including quality scores and read ids)
- Added RLE53 class (run length encoding: 3 bits for base, 5 bits for run length), but RLE44 is still our default BWT format
- Beetl-convert handles BWT44 -> BWT53 conversions
- Integrated LCP code with main BCR, making it faster
- Beetl-extend can now rebuild the whole reads' bases with --propagate-sequence (instead of stopping at the $ signs)
- Replaced compile-time PROPAGATE_SEQUENCE with runtime --propagate-sequence in beetl-compare
- Beetl-compare output now correctly goes into the directory specified by --output
- Beetl-search/beetl-extend --use-shm stores pre-parsed index files in shared memory for faster tool startup when doing many small searches
- Thanks to Shaun Jackman and Rayan Chikhi for compilation patches
Version 0.9.0 (9th December 2013)
- New K-mer search and propagation tools (see example with beetl-search/beetl-extend in doc/BEETL.md)
- MetaBeetl filtering changed a bit, in preparation for further improvements
- MetaBeetl database construction scripts and instructions updated
- Better distribution packaging, with 'make distcheck'
Version 0.8.0 (29th October 2013)
- New BWT-based tumour-normal filter (see example in doc/BEETL.md)
- Beetl-compare modes: 'split' renamed to 'splice'
- Beetl-compare modes: new 'tumour-normal'
- 16-way parallel beetl-compare
- MetaBEETL is now able to run without sequence propagation (which is faster)
- Upgraded requirement: autoconf 2.65 (improved regression testing)
Version 0.7.0 (18th September 2013)
- New beetl-correct tool
- Updated regression tests
- Detection of required libraries and OpenMP in configure script
Version 0.6.0 (27th August 2013)
- Important: Harmonised file encodings - you may need to rebuild your BWT files or just convert bwt-B00 from BWT_ASCII to BWT_RLE with beetl-convert (bwt-B00 was always ASCII-encoded. It now follows the same encoding as the other piles)
- Now requires GCC with C++11 support (gcc 4.3 or above)
- New Markdown documentation style
- Faster beetl-compare and metaBEETL
- Lossless quality compression scripts added, even though they are not using BWT
- New features:
- BWT creation with reverse-complemented reads (--add-rev-comp)
- BWT creation with multiple reads per sequence (--sequence-length)
- Support for BCL and BCL.gz in beetl-convert
Version 0.5.0 (10th June 2013)
Faster metaBEETL- Pruning of BKPT to reduce the number of MTAXA tests and output lines
- New beetl-compare --subset option for distributed computing
Version 0.4.0 (10th April 2013)
Robustness improvementsVersion 0.3.0 (7th April 2013)
Metagenomics metaBEETL code added to main treeVersion 0.2.0 (19th March 2013)
- Longest Common Prefix computation integrated under --generate-lcp in beetl-bwt
- New beetl-unbwt command line interface
Version 0.1.0 (28th February 2013)
- New command line interface
- Support for FASTA, FASTQ, raw SEQ and cycle-by-cycle files as input
- Distinct intermediate and final formats
- Support for Huffman encoding as intermediate or final format
- Support for intermediate "incremental run-length-encoded" format
- Automatic detection of format
- Performance prediction for automatic use of fastest algorithm and options
Version 0.0.2 (25th June 2012)
Code to build SAP array, as described in this paper.Version 0.0.1 (18th November 2011)
This contains initial implementations of the BCR and BCRext algorithms as described in our CPM paper.Contributors to BEETL, past and present
- Christina Ander, University of Bielefeld
- Markus J. Bauer, Illumina UK (now at Boehringer Ingelheim)
- Anthony J. Cox, Illumina UK (project lead)
- Tobias Jakobi, University of Bielefeld
- Lilian Janin, Illumina UK
- Giovanna Rosone, University of Palermo
- Ole Schulz-Trieglaff, Illumina UK
BEETL-ography: papers about BEETL