Bzip2
From Wikipedia, the free encyclopedia
File extension: | .bz2, .tar.bz2, .tbz2, .tb2 |
---|---|
MIME type: | application/x-bzip |
Type code: | Bzp2 |
Developed by: | Julian Seward |
Type of format: | data compression |
Developer: | Julian Seward |
---|---|
Latest release: | 1.0.3 / February 15, 2005 |
OS: | Cross-platform |
Use: | data compression |
License: | Bzip2 |
Website: | www.bzip.org |
- The correct title of this article is bzip2. The initial letter is shown capitalized due to technical restrictions.
bzip2 is a free software/open source data compression algorithm and program developed by Julian Seward. Seward made the first public release of bzip2, version 0.15, in July 1996. The compressor's stability and popularity grew over the next several years, and Seward released version 1.0 in late 2000.
Contents |
[edit] Compression efficiency
bzip2 compresses most files more effectively than more traditional gzip or ZIP but is slower. In this manner it is fairly similar to other recent-generation compression algorithms. Unlike other formats such as RAR or ZIP (and similar to gzip), bzip2 is only a data compressor, not an archiver. The program itself has no facilities for multiple files, encryption or archive-splitting, in the UNIX tradition instead relying on separate external utilities such as tar and GnuPG for these tasks.
In some cases, bzip2 is surpassed by 7z and RAR formats in terms of absolute compression efficiency. According to the author, bzip2 gets within ten to fifteen percent of the "best" class of compression algorithms currently known (PPM), although it is roughly twice as fast at compression and six times faster at decompression.
bzip2 uses the Burrows-Wheeler transform to convert frequently recurring character sequences into strings of identical letters, and then applies a move-to-front transform and finally Huffman coding. In bzip2 the blocks are all the same size in plaintext, which can be selected by a command-line argument, and are marked in compresstext by an arbitrary bit sequence derived from the decimal representation of π.
Originally, bzip2's ancestor bzip used arithmetic coding after the blocksort; this was discontinued because of the patent restriction.
[edit] Use
In Unix, bzip2 can be used combined with or independently of tar: bzip2 file to compress and bzip2 -d file.bz2 to uncompress (the alias bunzip2 for decompression may also be used).
bzip2's command line flags are mostly the same as in gzip. So, to extract from a bzip2-compressed tar-file:
bzcat archivefile.tar.bz2 | tar -xvf -
To create a bzip2-compressed tar-file:
tar -cvf - filenames | bzip2 > archivefile.tar.bz2
GNU tar supports a -j flag, which allows creation of tar.bz2 files without a pipeline:
tar -cvjf archivefile.tar.bz2 file-list
Decompressing in GNU tar:
tar -xvjf archivefile.tar.bz2
[edit] See also
[edit] External links
- The bzip2 and libbzip2 home page
- bzip2 for Windows
- Graphical bzip2 for Windows(WBZip2)
- MacBzip2 (for Classic Mac OS; under Mac OS X, the standard bzip2 is available at the command line)
- bzip2smp (a parallel implementation of bzip2 for use on multiprocessor/multicore machines)
- 4 Parallel bzip2 Implementations at The Data Compression News Blog