mirror of
git://sourceware.org/git/bzip2.git
synced 2024-11-23 11:43:28 +08:00
bzip2-1.0.2
parent
795b859eee
commit
099d844292
88
CHANGES
@ -134,7 +134,7 @@ Several minor bugfixes and enhancements:
|
|||||||
|
|
||||||
* Advance the version number to 1.0, so as to counteract the
|
* Advance the version number to 1.0, so as to counteract the
|
||||||
(false-in-this-case) impression some people have that programs
|
(false-in-this-case) impression some people have that programs
|
||||||
with version numbers less than 1.0 are in someway, experimental,
|
with version numbers less than 1.0 are in some way, experimental,
|
||||||
pre-release versions.
|
pre-release versions.
|
||||||
|
|
||||||
* Create an initial Makefile-libbz2_so to build a shared library.
|
* Create an initial Makefile-libbz2_so to build a shared library.
|
||||||
@ -165,3 +165,89 @@ There are no functionality changes or bug fixes relative to version
|
|||||||
1.0.0. This is just a documentation update + a fix for minor Win32
|
1.0.0. This is just a documentation update + a fix for minor Win32
|
||||||
build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is
|
build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is
|
||||||
utterly pointless. Don't bother.
|
utterly pointless. Don't bother.
|
||||||


1.0.2
~~~~~
A bug fix release, addressing various minor issues which have appeared
in the 18 or so months since 1.0.1 was released. Most of the fixes
are to do with file-handling or documentation bugs. To the best of my
knowledge, there have been no data-loss-causing bugs reported in the
compression/decompression engine of 1.0.0 or 1.0.1.

Note that this release does not improve the rather crude build system
for Unix platforms. The general plan here is to autoconfiscate/
libtoolise 1.0.2 soon after release, and release the result as 1.1.0
or perhaps 1.2.0. That, however, is still just a plan at this point.

Here are the changes in 1.0.2. Bug-reporters and/or patch-senders in
parentheses.

* Fix an infinite segfault loop in 1.0.1 when a directory is
  encountered in -f (force) mode.
     (Trond Eivind Glomsrod, Nicholas Nethercote, Volker Schmidt)

* Avoid double fclose() of output file on certain I/O error paths.
     (Solar Designer)

* Don't fail with internal error 1007 when fed a long stream (> 48MB)
  of byte 251. Also print useful message suggesting that 1007s may be
  caused by bad memory.
     (noticed by Juan Pedro Vallejo, fixed by me)

* Fix uninitialised variable silly bug in demo prog dlltest.c.
     (Jorj Bauer)

* Remove 512-MB limitation on recovered file size for bzip2recover
  on selected platforms which support 64-bit ints. At the moment
  all GCC supported platforms, and Win32.
     (me, Alson van der Meulen)

* Hard-code header byte values, to give correct operation on platforms
  using EBCDIC as their native character set (IBM's OS/390).
     (Leland Lucius)

* Copy file access times correctly.
     (Marty Leisner)

* Add distclean and check targets to Makefile.
     (Michael Carmack)

* Parameterise use of ar and ranlib in Makefile. Also add $(LDFLAGS).
     (Rich Ireland, Bo Thorsen)

* Pass -p (create parent dirs as needed) to mkdir during make install.
     (Jeremy Fusco)

* Dereference symlinks when copying file permissions in -f mode.
     (Volker Schmidt)

* Majorly simplify implementation of uInt64_qrm10.
     (Bo Lindbergh)

* Check the input file still exists before deleting the output one,
  when aborting in cleanUpAndFail().
     (Joerg Prante, Robert Linden, Matthias Krings)

Also a bunch of patches courtesy of Philippe Troin, the Debian maintainer
of bzip2:

* Wrapper scripts (with manpages): bzdiff, bzgrep, bzmore.

* Spelling changes and minor enhancements in bzip2.1.

* Avoid race condition between creating the output file and setting its
  interim permissions safely, by using fopen_output_safely().
  No changes to bzip2recover since there is no issue with file
  permissions there.

* do not print senseless report with -v when compressing an empty
  file.

* bzcat -f works on non-bzip2 files.

* do not try to escape shell meta-characters on unix (the shell takes
  care of these).

* added --fast and --best aliases for -1 -9 for gzip compatibility.

|
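Note on the fopen_output_safely() entry above: the C helper itself is not part of this hunk, so what follows is only a shell analogue of the idea. It assumes the point is to create the output file exclusively and with owner-only permissions in a single step, instead of creating it first and tightening its permissions afterwards. The name create_output_safely and the scratch paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create "$1" only if it does not already exist,
# with owner-only permissions from the moment it appears.
create_output_safely() {
    (
        umask 077      # files created below get mode 0600 immediately
        set -C         # noclobber: '>' refuses to overwrite an existing file
        : > "$1"
    ) 2>/dev/null
}

demo_dir=`mktemp -d` || exit 1
out="$demo_dir/out.bz2"

# First attempt creates the file; second attempt must be refused.
if create_output_safely "$out"; then first=ok; else first=refused; fi
if create_output_safely "$out"; then second=ok; else second=refused; fi

# First ten characters of 'ls -l' are the file type and permission bits.
mode=`ls -l "$out" | cut -c1-10`

rm -rf "$demo_dir"
```

The subshell keeps the umask and noclobber settings from leaking into the caller.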
4
LICENSE
@ -1,6 +1,6 @@
|
|||||||
|
|
||||||
This program, "bzip2" and associated library "libbzip2", are
|
This program, "bzip2" and associated library "libbzip2", are
|
||||||
copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -35,5 +35,5 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|||||||
|
|
||||||
Julian Seward, Cambridge, UK.
|
Julian Seward, Cambridge, UK.
|
||||||
jseward@acm.org
|
jseward@acm.org
|
||||||
bzip2/libbzip2 version 1.0 of 21 March 2000
|
bzip2/libbzip2 version 1.0.2 of 30 December 2001
|
||||||
|
|
||||||
|
81
Makefile
@ -1,9 +1,20 @@
|
|||||||
|
|
||||||
SHELL=/bin/sh
|
SHELL=/bin/sh
|
||||||
|
|
||||||
|
# To assist in cross-compiling
|
||||||
CC=gcc
|
CC=gcc
|
||||||
|
AR=ar
|
||||||
|
RANLIB=ranlib
|
||||||
|
LDFLAGS=
|
||||||
|
|
||||||
|
# Suitably paranoid flags to avoid bugs in gcc-2.7
|
||||||
BIGFILES=-D_FILE_OFFSET_BITS=64
|
BIGFILES=-D_FILE_OFFSET_BITS=64
|
||||||
CFLAGS=-Wall -Winline -O2 -fomit-frame-pointer -fno-strength-reduce $(BIGFILES)
|
CFLAGS=-Wall -Winline -O2 -fomit-frame-pointer -fno-strength-reduce $(BIGFILES)
|
||||||
|
|
||||||
|
# Where you want it installed when you do 'make install'
|
||||||
|
PREFIX=/usr
|
||||||
|
|
||||||
|
|
||||||
OBJS= blocksort.o \
|
OBJS= blocksort.o \
|
||||||
huffman.o \
|
huffman.o \
|
||||||
crctable.o \
|
crctable.o \
|
||||||
@ -15,20 +26,21 @@ OBJS= blocksort.o \
|
|||||||
all: libbz2.a bzip2 bzip2recover test
|
all: libbz2.a bzip2 bzip2recover test
|
||||||
|
|
||||||
bzip2: libbz2.a bzip2.o
|
bzip2: libbz2.a bzip2.o
|
||||||
$(CC) $(CFLAGS) -o bzip2 bzip2.o -L. -lbz2
|
$(CC) $(CFLAGS) $(LDFLAGS) -o bzip2 bzip2.o -L. -lbz2
|
||||||
|
|
||||||
bzip2recover: bzip2recover.o
|
bzip2recover: bzip2recover.o
|
||||||
$(CC) $(CFLAGS) -o bzip2recover bzip2recover.o
|
$(CC) $(CFLAGS) $(LDFLAGS) -o bzip2recover bzip2recover.o
|
||||||
|
|
||||||
libbz2.a: $(OBJS)
|
libbz2.a: $(OBJS)
|
||||||
rm -f libbz2.a
|
rm -f libbz2.a
|
||||||
ar cq libbz2.a $(OBJS)
|
$(AR) cq libbz2.a $(OBJS)
|
||||||
@if ( test -f /usr/bin/ranlib -o -f /bin/ranlib -o \
|
@if ( test -f $(RANLIB) -o -f /usr/bin/ranlib -o \
|
||||||
-f /usr/ccs/bin/ranlib ) ; then \
|
-f /bin/ranlib -o -f /usr/ccs/bin/ranlib ) ; then \
|
||||||
echo ranlib libbz2.a ; \
|
echo $(RANLIB) libbz2.a ; \
|
||||||
ranlib libbz2.a ; \
|
$(RANLIB) libbz2.a ; \
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
check: test
|
||||||
test: bzip2
|
test: bzip2
|
||||||
@cat words1
|
@cat words1
|
||||||
./bzip2 -1 < sample1.ref > sample1.rb2
|
./bzip2 -1 < sample1.ref > sample1.rb2
|
||||||
@ -45,14 +57,12 @@ test: bzip2
|
|||||||
cmp sample3.tst sample3.ref
|
cmp sample3.tst sample3.ref
|
||||||
@cat words3
|
@cat words3
|
||||||
|
|
||||||
PREFIX=/usr
|
|
||||||
|
|
||||||
install: bzip2 bzip2recover
|
install: bzip2 bzip2recover
|
||||||
if ( test ! -d $(PREFIX)/bin ) ; then mkdir $(PREFIX)/bin ; fi
|
if ( test ! -d $(PREFIX)/bin ) ; then mkdir -p $(PREFIX)/bin ; fi
|
||||||
if ( test ! -d $(PREFIX)/lib ) ; then mkdir $(PREFIX)/lib ; fi
|
if ( test ! -d $(PREFIX)/lib ) ; then mkdir -p $(PREFIX)/lib ; fi
|
||||||
if ( test ! -d $(PREFIX)/man ) ; then mkdir $(PREFIX)/man ; fi
|
if ( test ! -d $(PREFIX)/man ) ; then mkdir -p $(PREFIX)/man ; fi
|
||||||
if ( test ! -d $(PREFIX)/man/man1 ) ; then mkdir $(PREFIX)/man/man1 ; fi
|
if ( test ! -d $(PREFIX)/man/man1 ) ; then mkdir -p $(PREFIX)/man/man1 ; fi
|
||||||
if ( test ! -d $(PREFIX)/include ) ; then mkdir $(PREFIX)/include ; fi
|
if ( test ! -d $(PREFIX)/include ) ; then mkdir -p $(PREFIX)/include ; fi
|
||||||
cp -f bzip2 $(PREFIX)/bin/bzip2
|
cp -f bzip2 $(PREFIX)/bin/bzip2
|
||||||
cp -f bzip2 $(PREFIX)/bin/bunzip2
|
cp -f bzip2 $(PREFIX)/bin/bunzip2
|
||||||
cp -f bzip2 $(PREFIX)/bin/bzcat
|
cp -f bzip2 $(PREFIX)/bin/bzcat
|
||||||
@ -67,7 +77,26 @@ install: bzip2 bzip2recover
|
|||||||
chmod a+r $(PREFIX)/include/bzlib.h
|
chmod a+r $(PREFIX)/include/bzlib.h
|
||||||
cp -f libbz2.a $(PREFIX)/lib
|
cp -f libbz2.a $(PREFIX)/lib
|
||||||
chmod a+r $(PREFIX)/lib/libbz2.a
|
chmod a+r $(PREFIX)/lib/libbz2.a
|
||||||
|
cp -f bzgrep $(PREFIX)/bin/bzgrep
|
||||||
|
ln $(PREFIX)/bin/bzgrep $(PREFIX)/bin/bzegrep
|
||||||
|
ln $(PREFIX)/bin/bzgrep $(PREFIX)/bin/bzfgrep
|
||||||
|
chmod a+x $(PREFIX)/bin/bzgrep
|
||||||
|
cp -f bzmore $(PREFIX)/bin/bzmore
|
||||||
|
ln $(PREFIX)/bin/bzmore $(PREFIX)/bin/bzless
|
||||||
|
chmod a+x $(PREFIX)/bin/bzmore
|
||||||
|
cp -f bzdiff $(PREFIX)/bin/bzdiff
|
||||||
|
ln $(PREFIX)/bin/bzdiff $(PREFIX)/bin/bzcmp
|
||||||
|
chmod a+x $(PREFIX)/bin/bzdiff
|
||||||
|
cp -f bzgrep.1 bzmore.1 bzdiff.1 $(PREFIX)/man/man1
|
||||||
|
chmod a+r $(PREFIX)/man/man1/bzgrep.1
|
||||||
|
chmod a+r $(PREFIX)/man/man1/bzmore.1
|
||||||
|
chmod a+r $(PREFIX)/man/man1/bzdiff.1
|
||||||
|
echo ".so man1/bzgrep.1" > $(PREFIX)/man/man1/bzegrep.1
|
||||||
|
echo ".so man1/bzgrep.1" > $(PREFIX)/man/man1/bzfgrep.1
|
||||||
|
echo ".so man1/bzmore.1" > $(PREFIX)/man/man1/bzless.1
|
||||||
|
echo ".so man1/bzdiff.1" > $(PREFIX)/man/man1/bzcmp.1
|
||||||
|
|
||||||
|
distclean: clean
|
||||||
clean:
|
clean:
|
||||||
rm -f *.o libbz2.a bzip2 bzip2recover \
|
rm -f *.o libbz2.a bzip2 bzip2recover \
|
||||||
sample1.rb2 sample2.rb2 sample3.rb2 \
|
sample1.rb2 sample2.rb2 sample3.rb2 \
|
||||||
@ -93,7 +122,7 @@ bzip2.o: bzip2.c
|
|||||||
bzip2recover.o: bzip2recover.c
|
bzip2recover.o: bzip2recover.c
|
||||||
$(CC) $(CFLAGS) -c bzip2recover.c
|
$(CC) $(CFLAGS) -c bzip2recover.c
|
||||||
|
|
||||||
DISTNAME=bzip2-1.0.1
|
DISTNAME=bzip2-1.0.2
|
||||||
tarfile:
|
tarfile:
|
||||||
rm -f $(DISTNAME)
|
rm -f $(DISTNAME)
|
||||||
ln -sf . $(DISTNAME)
|
ln -sf . $(DISTNAME)
|
||||||
@ -112,6 +141,7 @@ tarfile:
|
|||||||
$(DISTNAME)/Makefile \
|
$(DISTNAME)/Makefile \
|
||||||
$(DISTNAME)/manual.texi \
|
$(DISTNAME)/manual.texi \
|
||||||
$(DISTNAME)/manual.ps \
|
$(DISTNAME)/manual.ps \
|
||||||
|
$(DISTNAME)/manual.pdf \
|
||||||
$(DISTNAME)/LICENSE \
|
$(DISTNAME)/LICENSE \
|
||||||
$(DISTNAME)/bzip2.1 \
|
$(DISTNAME)/bzip2.1 \
|
||||||
$(DISTNAME)/bzip2.1.preformatted \
|
$(DISTNAME)/bzip2.1.preformatted \
|
||||||
@ -138,4 +168,25 @@ tarfile:
|
|||||||
$(DISTNAME)/Y2K_INFO \
|
$(DISTNAME)/Y2K_INFO \
|
||||||
$(DISTNAME)/unzcrash.c \
|
$(DISTNAME)/unzcrash.c \
|
||||||
$(DISTNAME)/spewG.c \
|
$(DISTNAME)/spewG.c \
|
||||||
|
$(DISTNAME)/mk251.c \
|
||||||
|
$(DISTNAME)/bzdiff \
|
||||||
|
$(DISTNAME)/bzdiff.1 \
|
||||||
|
$(DISTNAME)/bzmore \
|
||||||
|
$(DISTNAME)/bzmore.1 \
|
||||||
|
$(DISTNAME)/bzgrep \
|
||||||
|
$(DISTNAME)/bzgrep.1 \
|
||||||
$(DISTNAME)/Makefile-libbz2_so
|
$(DISTNAME)/Makefile-libbz2_so
|
||||||
|
gzip -v $(DISTNAME).tar
|
||||||
|
|
||||||
|
# For rebuilding the manual from sources on my RedHat 7.2 box
|
||||||
|
manual: manual.ps manual.pdf manual.html
|
||||||
|
|
||||||
|
manual.ps: manual.texi
|
||||||
|
tex manual.texi
|
||||||
|
dvips -o manual.ps manual.dvi
|
||||||
|
|
||||||
|
manual.pdf: manual.ps
|
||||||
|
ps2pdf manual.ps
|
||||||
|
|
||||||
|
manual.html: manual.texi
|
||||||
|
texi2html -split_chapter manual.texi
|
||||||
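Note on the install rule above, which switches from mkdir to mkdir -p: a minimal sketch of the difference, assuming a scratch directory stands in for $(PREFIX). All paths here are made up for the demonstration.

```shell
#!/bin/sh
# A scratch directory stands in for $(PREFIX); this is an assumption
# made only for the demo.
scratch=`mktemp -d` || exit 1
PREFIX="$scratch/opt/bzip2"      # neither 'opt' nor 'bzip2' exists yet

# Old rule: plain mkdir cannot create a directory whose parent is missing.
if mkdir "$PREFIX/bin" 2>/dev/null; then plain=ok; else plain=failed; fi

# New rule: -p creates the missing parents as well.
if test ! -d "$PREFIX/man/man1"; then mkdir -p "$PREFIX/man/man1"; fi

# Running it again on an existing directory is not an error.
if mkdir -p "$PREFIX/man/man1"; then rerun=ok; else rerun=failed; fi

if test -d "$PREFIX/man/man1"; then created=yes; else created=no; fi
rm -rf "$scratch"
```

With -p the "test ! -d" guard in the rule is no longer strictly needed, though the Makefile keeps it.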
|
Makefile-libbz2_so
@ -1,8 +1,9 @@
|
|||||||
|
|
||||||
# This Makefile builds a shared version of the library,
|
# This Makefile builds a shared version of the library,
|
||||||
# libbz2.so.1.0.1, with soname libbz2.so.1.0,
|
# libbz2.so.1.0.2, with soname libbz2.so.1.0,
|
||||||
# at least on x86-Linux (RedHat 5.2),
|
# at least on x86-Linux (RedHat 7.2),
|
||||||
# with gcc-2.7.2.3. Please see the README file for some
|
# with gcc-2.96 20000731 (Red Hat Linux 7.1 2.96-98).
|
||||||
|
# Please see the README file for some
|
||||||
# important info about building the library like this.
|
# important info about building the library like this.
|
||||||
|
|
||||||
SHELL=/bin/sh
|
SHELL=/bin/sh
|
||||||
@ -19,13 +20,13 @@ OBJS= blocksort.o \
|
|||||||
bzlib.o
|
bzlib.o
|
||||||
|
|
||||||
all: $(OBJS)
|
all: $(OBJS)
|
||||||
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
|
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.2 $(OBJS)
|
||||||
$(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.1
|
$(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.2
|
||||||
rm -f libbz2.so.1.0
|
rm -f libbz2.so.1.0
|
||||||
ln -s libbz2.so.1.0.1 libbz2.so.1.0
|
ln -s libbz2.so.1.0.2 libbz2.so.1.0
|
||||||
|
|
||||||
clean:
|
clean:
|
||||||
rm -f $(OBJS) bzip2.o libbz2.so.1.0.1 libbz2.so.1.0 bzip2-shared
|
rm -f $(OBJS) bzip2.o libbz2.so.1.0.2 libbz2.so.1.0 bzip2-shared
|
||||||
|
|
||||||
blocksort.o: blocksort.c
|
blocksort.o: blocksort.c
|
||||||
$(CC) $(CFLAGS) -c blocksort.c
|
$(CC) $(CFLAGS) -c blocksort.c
|
||||||
|
89
README
@ -1,15 +1,15 @@
|
|||||||
|
|
||||||
This is the README for bzip2, a block-sorting file compressor, version
|
This is the README for bzip2, a block-sorting file compressor, version
|
||||||
1.0. This version is fully compatible with the previous public
|
1.0.2. This version is fully compatible with the previous public
|
||||||
releases, bzip2-0.1pl2, bzip2-0.9.0 and bzip2-0.9.5.
|
releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1.
|
||||||
|
|
||||||
bzip2-1.0 is distributed under a BSD-style license. For details,
|
bzip2-1.0.2 is distributed under a BSD-style license. For details,
|
||||||
see the file LICENSE.
|
see the file LICENSE.
|
||||||
|
|
||||||
Complete documentation is available in Postscript form (manual.ps) or
|
Complete documentation is available in Postscript form (manual.ps),
|
||||||
html (manual_toc.html). A plain-text version of the manual page is
|
PDF (manual.pdf, amazingly enough) or html (manual_toc.html). A
|
||||||
available as bzip2.txt. A statement about Y2K issues is now included
|
plain-text version of the manual page is available as bzip2.txt.
|
||||||
in the file Y2K_INFO.
|
A statement about Y2K issues is now included in the file Y2K_INFO.
|
||||||
|
|
||||||
|
|
||||||
HOW TO BUILD -- UNIX
|
HOW TO BUILD -- UNIX
|
||||||
@ -33,34 +33,41 @@ not actually execute them.
|
|||||||
HOW TO BUILD -- UNIX, shared library libbz2.so.
|
HOW TO BUILD -- UNIX, shared library libbz2.so.
|
||||||
|
|
||||||
Do 'make -f Makefile-libbz2_so'. This Makefile seems to work for
|
Do 'make -f Makefile-libbz2_so'. This Makefile seems to work for
|
||||||
Linux-ELF (RedHat 5.2 on an x86 box), with gcc. I make no claims
|
Linux-ELF (RedHat 7.2 on an x86 box), with gcc. I make no claims
|
||||||
that it works for any other platform, though I suspect it probably
|
that it works for any other platform, though I suspect it probably
|
||||||
will work for most platforms employing both ELF and gcc.
|
will work for most platforms employing both ELF and gcc.
|
||||||
|
|
||||||
bzip2-shared, a client of the shared library, is also build, but
|
bzip2-shared, a client of the shared library, is also built, but not
|
||||||
not self-tested. So I suggest you also build using the normal
|
self-tested. So I suggest you also build using the normal Makefile,
|
||||||
Makefile, since that conducts a self-test.
|
since that conducts a self-test. A second reason to prefer the
|
||||||
|
version statically linked to the library is that, on x86 platforms,
|
||||||
|
building shared objects makes a valuable register (%ebx) unavailable
|
||||||
|
to gcc, resulting in a slowdown of 10%-20%, at least for bzip2.
|
||||||
|
|
||||||
Important note for people upgrading .so's from 0.9.0/0.9.5 to
|
Important note for people upgrading .so's from 0.9.0/0.9.5 to version
|
||||||
version 1.0. All the functions in the library have been renamed,
|
1.0.X. All the functions in the library have been renamed, from (eg)
|
||||||
from (eg) bzCompress to BZ2_bzCompress, to avoid namespace pollution.
|
bzCompress to BZ2_bzCompress, to avoid namespace pollution.
|
||||||
Unfortunately this means that the libbz2.so created by
|
Unfortunately this means that the libbz2.so created by
|
||||||
Makefile-libbz2_so will not work with any program which used an
|
Makefile-libbz2_so will not work with any program which used an older
|
||||||
older version of the library. Sorry. I do encourage library
|
version of the library. Sorry. I do encourage library clients to
|
||||||
clients to make the effort to upgrade to use version 1.0, since
|
make the effort to upgrade to use version 1.0, since it is both faster
|
||||||
it is both faster and more robust than previous versions.
|
and more robust than previous versions.
|
||||||
|
|
||||||
|
|
||||||
HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
|
HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
|
||||||
|
|
||||||
It's difficult for me to support compilation on all these platforms.
|
It's difficult for me to support compilation on all these platforms.
|
||||||
My approach is to collect binaries for these platforms, and put them
|
My approach is to collect binaries for these platforms, and put them
|
||||||
on the master web page (http://sourceware.cygnus.com/bzip2). Look
|
on the master web page (http://sources.redhat.com/bzip2). Look there.
|
||||||
there. However (FWIW), bzip2-1.0 is very standard ANSI C and should
|
However (FWIW), bzip2-1.0.X is very standard ANSI C and should compile
|
||||||
compile unmodified with MS Visual C. For Win32, there is one
|
unmodified with MS Visual C. If you have difficulties building, you
|
||||||
important caveat: in bzip2.c, you must set BZ_UNIX to 0 and
|
might want to read README.COMPILATION.PROBLEMS.
|
||||||
BZ_LCCWIN32 to 1 before building. If you have difficulties building,
|
|
||||||
you might want to read README.COMPILATION.PROBLEMS.
|
At least using MS Visual C++ 6, you can build from the unmodified
|
||||||
|
sources by issuing, in a command shell:
|
||||||
|
nmake -f makefile.msc
|
||||||
|
(you may need to first run the MSVC-provided script VCVARS32.BAT
|
||||||
|
so as to set up paths to the MSVC tools correctly).
|
||||||
|
|
||||||
|
|
||||||
VALIDATION
|
VALIDATION
|
||||||
@ -138,29 +145,37 @@ WHAT'S NEW IN 0.9.5 ?
|
|||||||
* Many small improvements in file and flag handling.
|
* Many small improvements in file and flag handling.
|
||||||
* A Y2K statement.
|
* A Y2K statement.
|
||||||
|
|
||||||
WHAT'S NEW IN 1.0
|
WHAT'S NEW IN 1.0.0 ?
|
||||||
|
|
||||||
See the CHANGES file.
|
See the CHANGES file.
|
||||||
|
|
||||||
|
WHAT'S NEW IN 1.0.2 ?
|
||||||
|
|
||||||
|
See the CHANGES file.
|
||||||
|
|
||||||
|
|
||||||
I hope you find bzip2 useful. Feel free to contact me at
|
I hope you find bzip2 useful. Feel free to contact me at
|
||||||
jseward@acm.org
|
jseward@acm.org
|
||||||
if you have any suggestions or queries. Many people mailed me with
|
if you have any suggestions or queries. Many people mailed me with
|
||||||
comments, suggestions and patches after the releases of bzip-0.15,
|
comments, suggestions and patches after the releases of bzip-0.15,
|
||||||
bzip-0.21, bzip2-0.1pl2 and bzip2-0.9.0, and the changes in bzip2 are
|
bzip-0.21, and bzip2 versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
|
||||||
largely a result of this feedback. I thank you for your comments.
|
and the changes in bzip2 are largely a result of this feedback.
|
||||||
|
I thank you for your comments.
|
||||||
|
|
||||||
At least for the time being, bzip2's "home" is (or can be reached via)
|
At least for the time being, bzip2's "home" is (or can be reached via)
|
||||||
http://www.muraroa.demon.co.uk.
|
http://sources.redhat.com/bzip2.
|
||||||
|
|
||||||
Julian Seward
|
Julian Seward
|
||||||
jseward@acm.org
|
jseward@acm.org
|
||||||
|
|
||||||
Cambridge, UK
|
Cambridge, UK (and what a great town this is!)
|
||||||
18 July 1996 (version 0.15)
|
|
||||||
25 August 1996 (version 0.21)
|
18 July 1996 (version 0.15)
|
||||||
7 August 1997 (bzip2, version 0.1)
|
25 August 1996 (version 0.21)
|
||||||
29 August 1997 (bzip2, version 0.1pl2)
|
7 August 1997 (bzip2, version 0.1)
|
||||||
23 August 1998 (bzip2, version 0.9.0)
|
29 August 1997 (bzip2, version 0.1pl2)
|
||||||
8 June 1999 (bzip2, version 0.9.5)
|
23 August 1998 (bzip2, version 0.9.0)
|
||||||
4 Sept 1999 (bzip2, version 0.9.5d)
|
8 June 1999 (bzip2, version 0.9.5)
|
||||||
5 May 2000 (bzip2, version 1.0pre8)
|
4 Sept 1999 (bzip2, version 0.9.5d)
|
||||||
|
5 May 2000 (bzip2, version 1.0pre8)
|
||||||
|
30 December 2001 (bzip2, version 1.0.2pre1)
|
README.COMPILATION.PROBLEMS
@ -117,11 +117,11 @@ Known problems as of 1.0pre8:
|
|||||||
All that said: you might be able to get somewhere
|
All that said: you might be able to get somewhere
|
||||||
by finding the line in Makefile-libbz2_so which says
|
by finding the line in Makefile-libbz2_so which says
|
||||||
|
|
||||||
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
|
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.2 $(OBJS)
|
||||||
|
|
||||||
and replacing with
|
and replacing with
|
||||||
|
|
||||||
($CC) -G -shared -o libbz2.so.1.0.1 -h libbz2.so.1.0 $(OBJS)
|
$(CC) -G -shared -o libbz2.so.1.0.2 -h libbz2.so.1.0 $(OBJS)
|
||||||
|
|
||||||
If gcc objects to the combination -fpic -fPIC, get rid of
|
If gcc objects to the combination -fpic -fPIC, get rid of
|
||||||
the second one, leaving just "-fpic".
|
the second one, leaving just "-fpic".
|
||||||
|
11
blocksort.c
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -981,7 +981,14 @@ void mainSort ( UInt32* ptr,
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
AssertH ( copyStart[ss]-1 == copyEnd[ss], 1007 );
|
AssertH ( (copyStart[ss]-1 == copyEnd[ss])
|
||||||
|
||
|
||||||
|
/* Extremely rare case missing in bzip2-1.0.0 and 1.0.1.
|
||||||
|
Necessity for this case is demonstrated by compressing
|
||||||
|
a sequence of approximately 48.5 million of character
|
||||||
|
251; 1.0.0/1.0.1 will then die here. */
|
||||||
|
(copyStart[ss] == 0 && copyEnd[ss] == nblock-1),
|
||||||
|
1007 )
|
||||||
|
|
||||||
for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
|
for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
|
||||||
|
|
||||||
|
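Note on the new AssertH case above: its comment says the extra condition is needed when compressing a sequence of approximately 48.5 million copies of character 251. The tarball list in the Makefile gains mk251.c, but its source is not shown in this diff, so the following is a separate shell sketch of how such an input could be produced, not that program. The helper name gen251 and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write ($1 blocks) x ($2 bytes per block, default 1)
# copies of byte 251 (octal 373) to standard output.
gen251() {
    dd if=/dev/zero bs="${2:-1}" count="$1" 2>/dev/null |
        LC_ALL=C tr '\000' '\373'
}

# Small self-check on a few bytes only.
count=`gen251 16 | wc -c | tr -d ' '`
values=`gen251 4 | od -An -tu1 | tr -d ' \n'`

# A full-size run would be along the lines of
#   gen251 48500 1000 | ./bzip2 > /dev/null
# which is left as a comment because it needs a built bzip2.
```

Using a larger block size keeps dd fast for the full-size stream; the exact length needed to reach the assertion in 1.0.0/1.0.1 is only given approximately in the source.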
76
bzdiff
Normal file
@ -0,0 +1,76 @@
#!/bin/sh
# sh is buggy on RS/6000 AIX 3.2. Replace above line with #!/bin/ksh

# Bzcmp/diff wrapped for bzip2,
# adapted from zdiff by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.

# Bzcmp and bzdiff are used to invoke the cmp or the diff
# program on compressed files. All options specified are passed
# directly to cmp or diff. If only 1 file is specified, then
# the files compared are file1 and an uncompressed file1.bz2.
# If two files are specified, then they are uncompressed (if
# necessary) and fed to cmp or diff. The exit status from cmp
# or diff is preserved.

PATH="/usr/bin:$PATH"; export PATH
prog=`echo $0 | sed 's|.*/||'`
case "$prog" in
  *cmp) comp=${CMP-cmp}   ;;
  *)    comp=${DIFF-diff} ;;
esac

OPTIONS=
FILES=
for ARG
do
    case "$ARG" in
    -*) OPTIONS="$OPTIONS $ARG";;
     *) if test -f "$ARG"; then
            FILES="$FILES $ARG"
        else
            echo "${prog}: $ARG not found or not a regular file"
            exit 1
        fi ;;
    esac
done
if test -z "$FILES"; then
        echo "Usage: $prog [${comp}_options] file [file]"
        exit 1
fi
tmp=`tempfile -d /tmp -p bz` || {
      echo 'cannot create a temporary file' >&2
      exit 1
}
set $FILES
if test $# -eq 1; then
        FILE=`echo "$1" | sed 's/.bz2$//'`
        bzip2 -cd "$FILE.bz2" | $comp $OPTIONS - "$FILE"
        STAT="$?"

elif test $# -eq 2; then
        case "$1" in
        *.bz2)
                case "$2" in
                *.bz2)
                        F=`echo "$2" | sed 's|.*/||;s|.bz2$||'`
                        bzip2 -cdfq "$2" > $tmp
                        bzip2 -cdfq "$1" | $comp $OPTIONS - $tmp
                        STAT="$?"
                        /bin/rm -f $tmp;;

                *)      bzip2 -cdfq "$1" | $comp $OPTIONS - "$2"
                        STAT="$?";;
                esac;;
        *)      case "$2" in
                *.bz2)
                        bzip2 -cdfq "$2" | $comp $OPTIONS "$1" -
                        STAT="$?";;
                *)      $comp $OPTIONS "$1" "$2"
                        STAT="$?";;
                esac;;
        esac
        exit "$STAT"
else
        echo "Usage: $prog [${comp}_options] file [file]"
        exit 1
fi
|
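Note on the bzdiff script above: it picks cmp or diff from the name it was invoked under (the Makefile installs bzcmp as a hard link to bzdiff), and it strips a trailing .bz2 with sed. A minimal sketch of both steps follows; pick_tool and strip_bz2 are hypothetical helper names, and the escaped dot in strip_bz2 is a deliberate difference from the script's 's/.bz2$//', where the dot matches any character.

```shell
#!/bin/sh
# Hypothetical helper mirroring bzdiff's dispatch: print the comparison
# tool chosen for invocation path "$1", honouring $CMP and $DIFF overrides.
pick_tool() {
    prog=`printf '%s\n' "$1" | sed 's|.*/||'`     # strip the directory part
    case "$prog" in
      *cmp) printf '%s\n' "${CMP-cmp}"   ;;
      *)    printf '%s\n' "${DIFF-diff}" ;;
    esac
}

# Hypothetical helper: drop a trailing ".bz2", with the dot escaped so
# that only a literal dot is matched.
strip_bz2() {
    printf '%s\n' "$1" | sed 's/\.bz2$//'
}

unset CMP DIFF
a=`pick_tool /usr/bin/bzcmp`          # name ends in "cmp"
b=`pick_tool ./bzdiff`                # anything else falls back to diff
c=`DIFF=gdiff pick_tool bzdiff`       # environment override
d=`strip_bz2 archive.tar.bz2`
e=`strip_bz2 mybz2`                   # no literal ".bz2", left unchanged
```

Dispatching on the invocation name is what lets one script serve as both bzcmp and bzdiff.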
47
bzdiff.1
Normal file
@ -0,0 +1,47 @@
\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
\"for Debian GNU/Linux
.TH BZDIFF 1
.SH NAME
bzcmp, bzdiff \- compare bzip2 compressed files
.SH SYNOPSIS
.B bzcmp
[ cmp_options ] file1
[ file2 ]
.br
.B bzdiff
[ diff_options ] file1
[ file2 ]
.SH DESCRIPTION
.I Bzcmp
and
.I bzdiff
are used to invoke the
.I cmp
or the
.I diff
program on bzip2 compressed files. All options specified are passed
directly to
.I cmp
or
.IR diff "."
If only 1 file is specified, then the files compared are
.I file1
and an uncompressed
.IR file1 ".bz2."
If two files are specified, then they are uncompressed if necessary and fed to
.I cmp
or
.IR diff "."
The exit status from
.I cmp
or
.I diff
is preserved.
.SH "SEE ALSO"
cmp(1), diff(1), bzmore(1), bzless(1), bzgrep(1), bzip2(1)
.SH BUGS
Messages from the
.I cmp
or
.I diff
programs refer to temporary filenames instead of those specified.
|
71
bzgrep
Normal file
@ -0,0 +1,71 @@
#!/bin/sh

# Bzgrep wrapped for bzip2,
# adapted from zgrep by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
## zgrep notice:
## zgrep -- a wrapper around a grep program that decompresses files as needed
## Adapted from a version sent by Charles Levert <charles@comm.polymtl.ca>

PATH="/usr/bin:$PATH"; export PATH

prog=`echo $0 | sed 's|.*/||'`
case "$prog" in
        *egrep) grep=${EGREP-egrep}     ;;
        *fgrep) grep=${FGREP-fgrep}     ;;
        *)      grep=${GREP-grep}       ;;
esac
pat=""
while test $# -ne 0; do
  case "$1" in
  -e | -f) opt="$opt $1"; shift; pat="$1"
           if test "$grep" = grep; then  # grep is buggy with -e on SVR4
             grep=egrep
           fi;;
  -A | -B) opt="$opt $1 $2"; shift;;
  -*)      opt="$opt $1";;
   *)      if test -z "$pat"; then
             pat="$1"
           else
             break;
           fi;;
  esac
  shift
done

if test -z "$pat"; then
  echo "grep through bzip2 files"
  echo "usage: $prog [grep_options] pattern [files]"
  exit 1
fi

list=0
silent=0
op=`echo "$opt" | sed -e 's/ //g' -e 's/-//g'`
case "$op" in
  *l*) list=1
esac
case "$op" in
  *h*) silent=1
esac

if test $# -eq 0; then
  bzip2 -cdfq | $grep $opt "$pat"
  exit $?
fi

res=0
for i do
  if test -f "$i"; then :; else if test -f "$i.bz2"; then i="$i.bz2"; fi; fi
  if test $list -eq 1; then
    bzip2 -cdfq "$i" | $grep $opt "$pat" 2>&1 > /dev/null && echo $i
    r=$?
  elif test $# -eq 1 -o $silent -eq 1; then
    bzip2 -cdfq "$i" | $grep $opt "$pat"
    r=$?
  else
    bzip2 -cdfq "$i" | $grep $opt "$pat" | sed "s|^|${i}:|"
    r=$?
  fi
  test "$r" -ne 0 && res="$r"
done
exit $res
|
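Note on the bzgrep script above: for each file argument it falls back to the name with .bz2 appended when the plain name is not a regular file, and in list mode it runs grep with the redirections "2>&1 > /dev/null". A minimal sketch of both points follows, using empty placeholder files so that bzip2 itself is not needed. The helper name resolve_input and the scratch paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper mirroring bzgrep's per-file fallback: use "$1" if it
# is a regular file, otherwise try "$1.bz2", otherwise leave it unchanged.
resolve_input() {
    if test -f "$1"; then
        printf '%s\n' "$1"
    elif test -f "$1.bz2"; then
        printf '%s\n' "$1.bz2"
    else
        printf '%s\n' "$1"
    fi
}

scratch=`mktemp -d` || exit 1
: > "$scratch/plain"          # empty placeholder files for the demo
: > "$scratch/log.bz2"

r1=`resolve_input "$scratch/plain"`
r2=`resolve_input "$scratch/log"`
r3=`resolve_input "$scratch/missing"`

# Redirections apply left to right. Inside the command substitution,
# '2>&1' first points stderr at the capture, then '>/dev/null' moves
# only stdout away, so the stderr text is what gets captured.
list_order=`{ echo match; echo warning >&2; } 2>&1 >/dev/null`

# With the order swapped, both streams end up in /dev/null.
both_quiet=`{ echo match; echo warning >&2; } >/dev/null 2>&1`

rm -rf "$scratch"
```

In other words, the list-mode line hides grep's matches but leaves its diagnostics visible on standard output.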
56
bzgrep.1
Normal file
@ -0,0 +1,56 @@
|
|||||||
|
\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
|
||||||
|
\"for Debian GNU/Linux
|
||||||
|
.TH BZGREP 1
|
||||||
|
.SH NAME
|
||||||
|
bzgrep, bzfgrep, bzegrep \- search possibly bzip2 compressed files for a regular expression
|
||||||
|
.SH SYNOPSIS
|
||||||
|
.B bzgrep
|
||||||
|
[ grep_options ]
|
||||||
|
.BI [\ -e\ ] " pattern"
|
||||||
|
.IR filename ".\|.\|."
|
||||||
|
.br
|
||||||
|
.B bzegrep
|
||||||
|
[ egrep_options ]
|
||||||
|
.BI [\ -e\ ] " pattern"
|
||||||
|
.IR filename ".\|.\|."
|
||||||
|
.br
|
||||||
|
.B bzfgrep
|
||||||
|
[ fgrep_options ]
|
||||||
|
.BI [\ -e\ ] " pattern"
|
||||||
|
.IR filename ".\|.\|."
|
||||||
|
.SH DESCRIPTION
|
||||||
|
.IR Bzgrep
|
||||||
|
is used to invoke the
|
||||||
|
.I grep
|
||||||
|
on bzip2-compressed files. All options specified are passed directly to
|
||||||
|
.I grep.
|
||||||
|
If no file is specified, then the standard input is decompressed
|
||||||
|
if necessary and fed to grep.
|
||||||
|
Otherwise the given files are uncompressed if necessary and fed to
|
||||||
|
.I grep.
|
||||||
|
.PP
|
||||||
|
If
|
||||||
|
.I bzgrep
|
||||||
|
is invoked as
|
||||||
|
.I bzegrep
|
||||||
|
or
|
||||||
|
.I bzfgrep
|
||||||
|
then
|
||||||
|
.I egrep
|
||||||
|
or
|
||||||
|
.I fgrep
|
||||||
|
is used instead of
|
||||||
|
.I grep.
|
||||||
|
If the GREP environment variable is set,
|
||||||
|
.I bzgrep
|
||||||
|
uses it as the
|
||||||
|
.I grep
|
||||||
|
program to be invoked. For example:
|
||||||
|
|
||||||
|
for sh: GREP=fgrep bzgrep string files
|
||||||
|
for csh: (setenv GREP fgrep; bzgrep string files)
|
||||||
|
.SH AUTHOR
|
||||||
|
Charles Levert (charles@comm.polymtl.ca). Adapted to bzip2 by Philippe
|
||||||
|
Troin <phil@fifi.org> for Debian GNU/Linux.
|
||||||
|
.SH "SEE ALSO"
|
||||||
|
grep(1), egrep(1), fgrep(1), bzdiff(1), bzmore(1), bzless(1), bzip2(1)
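The selection rule in the DESCRIPTION above can be sketched in shell. This is a minimal illustration, not the bzgrep source: the function name `pick_grep` is hypothetical, and it assumes that GREP overrides only the plain grep default, which the man page does not state explicitly.

```shell
# pick_grep is a hypothetical helper, not part of bzgrep itself.
# $1 is the name the script was invoked as (for example "bzegrep").
pick_grep() {
    case "$1" in
        *egrep) echo "egrep" ;;
        *fgrep) echo "fgrep" ;;
        # Assumption: GREP, when set and non-empty, replaces plain grep only.
        *)      echo "${GREP:-grep}" ;;
    esac
}

pick_grep bzfgrep
```

Matching on the tail of the invocation name means the helper gives the same answer whether the script is run by bare name or by full path.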
|
56
bzip2.1
@ -1,7 +1,7 @@
|
|||||||
.PU
|
.PU
|
||||||
.TH bzip2 1
|
.TH bzip2 1
|
||||||
.SH NAME
|
.SH NAME
|
||||||
bzip2, bunzip2 \- a block-sorting file compressor, v1.0
|
bzip2, bunzip2 \- a block-sorting file compressor, v1.0.2
|
||||||
.br
|
.br
|
||||||
bzcat \- decompresses files to stdout
|
bzcat \- decompresses files to stdout
|
||||||
.br
|
.br
|
||||||
@ -197,7 +197,7 @@ to decompress.
|
|||||||
.TP
|
.TP
|
||||||
.B \-z --compress
|
.B \-z --compress
|
||||||
The complement to \-d: forces compression, regardless of the
|
The complement to \-d: forces compression, regardless of the
|
||||||
invokation name.
|
invocation name.
|
||||||
.TP
|
.TP
|
||||||
.B \-t --test
|
.B \-t --test
|
||||||
Check integrity of the specified file(s), but don't decompress them.
|
Check integrity of the specified file(s), but don't decompress them.
|
||||||
@ -211,6 +211,10 @@ existing output files. Also forces
|
|||||||
.I bzip2
|
.I bzip2
|
||||||
to break hard links
|
to break hard links
|
||||||
to files, which it otherwise wouldn't do.
|
to files, which it otherwise wouldn't do.
|
||||||
|
|
||||||
|
bzip2 normally declines to decompress files which don't have the
|
||||||
|
correct magic header bytes. If forced (-f), however, it will pass
|
||||||
|
such files through unmodified. This is how GNU gzip behaves.
|
||||||
.TP
|
.TP
|
||||||
.B \-k --keep
|
.B \-k --keep
|
||||||
Keep (don't delete) input files during compression
|
Keep (don't delete) input files during compression
|
||||||
@ -239,9 +243,13 @@ information which is primarily of interest for diagnostic purposes.
|
|||||||
.B \-L --license -V --version
|
.B \-L --license -V --version
|
||||||
Display the software version, license terms and conditions.
|
Display the software version, license terms and conditions.
|
||||||
.TP
|
.TP
|
||||||
.B \-1 to \-9
|
.B \-1 (or \-\-fast) to \-9 (or \-\-best)
|
||||||
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
|
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
|
||||||
effect when decompressing. See MEMORY MANAGEMENT below.
|
effect when decompressing. See MEMORY MANAGEMENT below.
|
||||||
|
The \-\-fast and \-\-best aliases are primarily for GNU gzip
|
||||||
|
compatibility. In particular, \-\-fast doesn't make things
|
||||||
|
significantly faster.
|
||||||
|
And \-\-best merely selects the default behaviour.
|
||||||
.TP
|
.TP
|
||||||
.B \--
|
.B \--
|
||||||
Treats all subsequent arguments as file names, even if they start
|
Treats all subsequent arguments as file names, even if they start
|
||||||
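The flag-to-block-size mapping described in the option text above (-1 through -9 for 100 k through 900 k, with --fast and --best as aliases for the two ends) can be sketched as follows. The helper name `block_size_k` is hypothetical and exists only for illustration; it is not how bzip2 parses its arguments.

```shell
# block_size_k is a hypothetical helper, not a bzip2 function.
# It maps a compression flag to the block size in k, per the option text:
# -1 .. -9 select 100 k .. 900 k; --fast and --best alias -1 and -9.
block_size_k() {
    case "$1" in
        --fast) level=1 ;;
        --best) level=9 ;;
        -[1-9]) level=${1#-} ;;
        *)      echo "unknown flag: $1" >&2; return 1 ;;
    esac
    echo $(( level * 100 ))
}

block_size_k --fast
block_size_k -5
block_size_k --best
```

Since --best maps to the same value as -9, which the man page gives as the default, the sketch agrees with the remark that --best merely selects the default behaviour.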
@ -352,11 +360,11 @@ undamaged.
|
|||||||
|
|
||||||
.I bzip2recover
|
.I bzip2recover
|
||||||
takes a single argument, the name of the damaged file,
|
takes a single argument, the name of the damaged file,
|
||||||
and writes a number of files "rec0001file.bz2",
|
and writes a number of files "rec00001file.bz2",
|
||||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
"rec00002file.bz2", etc, containing the extracted blocks.
|
||||||
The output filenames are designed so that the use of
|
The output filenames are designed so that the use of
|
||||||
wildcards in subsequent processing -- for example,
|
wildcards in subsequent processing -- for example,
|
||||||
"bzip2 -dc rec*file.bz2 > recovered_data" -- lists the files in
|
"bzip2 -dc rec*file.bz2 > recovered_data" -- processes the files in
|
||||||
the correct order.
|
the correct order.
|
||||||
|
|
||||||
.I bzip2recover
|
.I bzip2recover
|
||||||
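The remark above about wildcards processing the recovered files in the correct order presumably relies on the zero-padded block numbers in the quoted names. A small sketch of why padding matters, using empty placeholder files in a scratch directory; the `part*` names are hypothetical and are included only for contrast, and the C locale is forced so that the sort order is predictable.

```shell
# Scratch directory with empty placeholder files; nothing here is real output
# from bzip2recover.
LC_ALL=C; export LC_ALL
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Zero-padded block numbers, as in the names the man page quotes.
touch rec00001file.bz2 rec00002file.bz2 rec00010file.bz2
padded_order=$(echo rec*file.bz2)

# Hypothetical unpadded names, for contrast only.
touch part1 part2 part10
unpadded_order=$(echo part*)

echo "padded:   $padded_order"
echo "unpadded: $unpadded_order"

cd / && rm -rf "$workdir"
```

The shell sorts glob matches as strings, so the padded names come out in block order while the unpadded set lists part10 ahead of part2.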
@ -397,27 +405,31 @@ I/O error messages are not as helpful as they could be.
|
|||||||
tries hard to detect I/O errors and exit cleanly, but the details of
|
tries hard to detect I/O errors and exit cleanly, but the details of
|
||||||
what the problem is sometimes seem rather misleading.
|
what the problem is sometimes seem rather misleading.
|
||||||
|
|
||||||
This manual page pertains to version 1.0 of
|
This manual page pertains to version 1.0.2 of
|
||||||
.I bzip2.
|
.I bzip2.
|
||||||
Compressed
|
Compressed data created by this version is entirely forwards and
|
||||||
data created by this version is entirely forwards and backwards
|
backwards compatible with the previous public releases, versions
|
||||||
compatible with the previous public releases, versions 0.1pl2, 0.9.0
|
0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1, but with the following
|
||||||
and 0.9.5,
|
exception: 0.9.0 and above can correctly decompress multiple
|
||||||
but with the following exception: 0.9.0 and above can correctly
|
concatenated compressed files. 0.1pl2 cannot do this; it will stop
|
||||||
decompress multiple concatenated compressed files. 0.1pl2 cannot do
|
after decompressing just the first file in the stream.
|
||||||
this; it will stop after decompressing just the first file in the
|
|
||||||
stream.
|
|
||||||
|
|
||||||
.I bzip2recover
|
.I bzip2recover
|
||||||
uses 32-bit integers to represent bit positions in
|
versions prior to this one, 1.0.2, used 32-bit integers to represent
|
||||||
compressed files, so it cannot handle compressed files more than 512
|
bit positions in compressed files, so it could not handle compressed
|
||||||
megabytes long. This could easily be fixed.
|
files more than 512 megabytes long. Version 1.0.2 and above uses
|
||||||
|
64-bit ints on some platforms which support them (GNU supported
|
||||||
|
targets, and Windows). To establish whether or not bzip2recover was
|
||||||
|
built with such a limitation, run it without arguments. In any event
|
||||||
|
you can build yourself an unlimited version if you can recompile it
|
||||||
|
with MaybeUInt64 set to be an unsigned 64-bit integer.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
.SH AUTHOR
|
.SH AUTHOR
|
||||||
Julian Seward, jseward@acm.org.
|
Julian Seward, jseward@acm.org.
|
||||||
|
|
||||||
http://sourceware.cygnus.com/bzip2
|
http://sources.redhat.com/bzip2
|
||||||
http://www.muraroa.demon.co.uk
|
|
||||||
|
|
||||||
The ideas embodied in
|
The ideas embodied in
|
||||||
.I bzip2
|
.I bzip2
|
||||||
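The 512 megabyte figure in the bzip2recover note above follows from counting bit positions with a 32-bit integer. A quick arithmetic check, assuming "megabyte" here means 2^20 bytes:

```shell
# 2^32 bit positions, at 8 = 2^3 bits per byte, cover 2^(32-3) bytes.
# Shifting by 29 rather than 32 keeps the arithmetic within 32-bit shells.
limit_bytes=$(( 1 << (32 - 3) ))
limit_megabytes=$(( limit_bytes / 1024 / 1024 ))
echo "$limit_megabytes"   # prints 512
```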
@ -434,6 +446,8 @@ indebted for their help, support and advice. See the manual in the
|
|||||||
source distribution for pointers to sources of documentation. Christian
|
source distribution for pointers to sources of documentation. Christian
|
||||||
von Roques encouraged me to look for faster sorting algorithms, so as to
|
von Roques encouraged me to look for faster sorting algorithms, so as to
|
||||||
speed up compression. Bela Lubkin encouraged me to improve the
|
speed up compression. Bela Lubkin encouraged me to improve the
|
||||||
worst-case compression performance. Many people sent patches, helped
|
worst-case compression performance.
|
||||||
|
The bz* scripts are derived from those of GNU gzip.
|
||||||
|
Many people sent patches, helped
|
||||||
with portability problems, lent machines, gave advice and were generally
|
with portability problems, lent machines, gave advice and were generally
|
||||||
helpful.
|
helpful.
|
||||||
|
@ -1,11 +1,9 @@
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
bzip2(1) bzip2(1)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
NAME
|
NAME
|
||||||
bzip2, bunzip2 - a block-sorting file compressor, v1.0
|
bzip2, bunzip2 - a block-sorting file compressor, v1.0.2
|
||||||
bzcat - decompresses files to stdout
|
bzcat - decompresses files to stdout
|
||||||
bzip2recover - recovers data from damaged bzip2 files
|
bzip2recover - recovers data from damaged bzip2 files
|
||||||
|
|
||||||
@ -22,20 +20,20 @@ DESCRIPTION
|
|||||||
sorting text compression algorithm, and Huffman coding.
|
sorting text compression algorithm, and Huffman coding.
|
||||||
Compression is generally considerably better than that
|
Compression is generally considerably better than that
|
||||||
achieved by more conventional LZ77/LZ78-based compressors,
|
achieved by more conventional LZ77/LZ78-based compressors,
|
||||||
and approaches the performance of the PPM family of sta-
|
and approaches the performance of the PPM family of sta
|
||||||
tistical compressors.
|
tistical compressors.
|
||||||
|
|
||||||
The command-line options are deliberately very similar to
|
The command-line options are deliberately very similar to
|
||||||
those of GNU gzip, but they are not identical.
|
those of GNU gzip, but they are not identical.
|
||||||
|
|
||||||
_b_z_i_p_2 expects a list of file names to accompany the com-
|
_b_z_i_p_2 expects a list of file names to accompany the com
|
||||||
mand-line flags. Each file is replaced by a compressed
|
mand-line flags. Each file is replaced by a compressed
|
||||||
version of itself, with the name "original_name.bz2".
|
version of itself, with the name "original_name.bz2".
|
||||||
Each compressed file has the same modification date, per-
|
Each compressed file has the same modification date, per
|
||||||
missions, and, when possible, ownership as the correspond-
|
missions, and, when possible, ownership as the correspond
|
||||||
ing original, so that these properties can be correctly
|
ing original, so that these properties can be correctly
|
||||||
restored at decompression time. File name handling is
|
restored at decompression time. File name handling is
|
||||||
naive in the sense that there is no mechanism for preserv-
|
naive in the sense that there is no mechanism for preserv
|
||||||
ing original file names, permissions, ownerships or dates
|
ing original file names, permissions, ownerships or dates
|
||||||
in filesystems which lack these concepts, or have serious
|
in filesystems which lack these concepts, or have serious
|
||||||
file name length restrictions, such as MS-DOS.
|
file name length restrictions, such as MS-DOS.
|
||||||
@ -58,18 +56,6 @@ DESCRIPTION
|
|||||||
filename.bz2 becomes filename
|
filename.bz2 becomes filename
|
||||||
filename.bz becomes filename
|
filename.bz becomes filename
|
||||||
filename.tbz2 becomes filename.tar
|
filename.tbz2 becomes filename.tar
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
1
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
|
||||||
|
|
||||||
|
|
||||||
filename.tbz becomes filename.tar
|
filename.tbz becomes filename.tar
|
||||||
anyothername becomes anyothername.out
|
anyothername becomes anyothername.out
|
||||||
|
|
||||||
@ -78,23 +64,23 @@ bzip2(1) bzip2(1)
|
|||||||
guess the name of the original file, and uses the original
|
guess the name of the original file, and uses the original
|
||||||
name with .out appended.
|
name with .out appended.
|
||||||
|
|
||||||
As with compression, supplying no filenames causes decom-
|
As with compression, supplying no filenames causes decom
|
||||||
pression from standard input to standard output.
|
pression from standard input to standard output.
|
||||||
|
|
||||||
_b_u_n_z_i_p_2 will correctly decompress a file which is the con-
|
_b_u_n_z_i_p_2 will correctly decompress a file which is the con
|
||||||
catenation of two or more compressed files. The result is
|
catenation of two or more compressed files. The result is
|
||||||
the concatenation of the corresponding uncompressed files.
|
the concatenation of the corresponding uncompressed files.
|
||||||
Integrity testing (-t) of concatenated compressed files is
|
Integrity testing (-t) of concatenated compressed files is
|
||||||
also supported.
|
also supported.
|
||||||
|
|
||||||
You can also compress or decompress files to the standard
|
You can also compress or decompress files to the standard
|
||||||
output by giving the -c flag. Multiple files may be com-
|
output by giving the -c flag. Multiple files may be com
|
||||||
pressed and decompressed like this. The resulting outputs
|
pressed and decompressed like this. The resulting outputs
|
||||||
are fed sequentially to stdout. Compression of multiple
|
are fed sequentially to stdout. Compression of multiple
|
||||||
files in this manner generates a stream containing multi-
|
files in this manner generates a stream containing multi
|
||||||
ple compressed file representations. Such a stream can be
|
ple compressed file representations. Such a stream can be
|
||||||
decompressed correctly only by bzip2 version 0.9.0 or
|
decompressed correctly only by bzip2 version 0.9.0 or
|
||||||
later. Earlier versions of _b_z_i_p_2 will stop after decom-
|
later. Earlier versions of _b_z_i_p_2 will stop after decom
|
||||||
pressing the first file in the stream.
|
pressing the first file in the stream.
|
||||||
|
|
||||||
bzcat (or bzip2 -dc) decompresses all specified files to
|
bzcat (or bzip2 -dc) decompresses all specified files to
|
||||||
@ -115,7 +101,7 @@ bzip2(1) bzip2(1)
|
|||||||
|
|
||||||
As a self-check for your protection, bzip2 uses 32-bit
|
As a self-check for your protection, bzip2 uses 32-bit
|
||||||
CRCs to make sure that the decompressed version of a file
|
CRCs to make sure that the decompressed version of a file
|
||||||
is identical to the original. This guards against corrup-
|
is identical to the original. This guards against corrup
|
||||||
tion of the compressed data, and against undetected bugs
|
tion of the compressed data, and against undetected bugs
|
||||||
in bzip2 (hopefully very unlikely). The chances of data
|
in bzip2 (hopefully very unlikely). The chances of data
|
||||||
corruption going undetected is microscopic, about one
|
corruption going undetected is microscopic, about one
|
||||||
@ -125,17 +111,6 @@ bzip2(1) bzip2(1)
|
|||||||
you recover the original uncompressed data. You can use
|
you recover the original uncompressed data. You can use
|
||||||
bzip2recover to try to recover data from damaged files.
|
bzip2recover to try to recover data from damaged files.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
2
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
|
||||||
|
|
||||||
|
|
||||||
Return values: 0 for a normal exit, 1 for environmental
|
Return values: 0 for a normal exit, 1 for environmental
|
||||||
problems (file not found, invalid flags, I/O errors, &c),
|
problems (file not found, invalid flags, I/O errors, &c),
|
||||||
2 to indicate a corrupt compressed file, 3 for an internal
|
2 to indicate a corrupt compressed file, 3 for an internal
|
||||||
@ -154,8 +129,8 @@ OPTIONS
|
|||||||
and forces bzip2 to decompress.
|
and forces bzip2 to decompress.
|
||||||
|
|
||||||
-z --compress
|
-z --compress
|
||||||
The complement to -d: forces compression, regard-
|
The complement to -d: forces compression,
|
||||||
less of the invokation name.
|
regardless of the invocation name.
|
||||||
|
|
||||||
-t --test
|
-t --test
|
||||||
Check integrity of the specified file(s), but don't
|
Check integrity of the specified file(s), but don't
|
||||||
@ -168,6 +143,11 @@ OPTIONS
|
|||||||
forces bzip2 to break hard links to files, which it
|
forces bzip2 to break hard links to files, which it
|
||||||
otherwise wouldn't do.
|
otherwise wouldn't do.
|
||||||
|
|
||||||
|
bzip2 normally declines to decompress files which
|
||||||
|
don't have the correct magic header bytes. If
|
||||||
|
forced (-f), however, it will pass such files
|
||||||
|
through unmodified. This is how GNU gzip behaves.
|
||||||
|
|
||||||
-k --keep
|
-k --keep
|
||||||
Keep (don't delete) input files during compression
|
Keep (don't delete) input files during compression
|
||||||
or decompression.
|
or decompression.
|
||||||
@ -190,23 +170,11 @@ OPTIONS
|
|||||||
-q --quiet
|
-q --quiet
|
||||||
Suppress non-essential warning messages. Messages
|
Suppress non-essential warning messages. Messages
|
||||||
pertaining to I/O errors and other critical events
|
pertaining to I/O errors and other critical events
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
3
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
|
||||||
|
|
||||||
|
|
||||||
will not be suppressed.
|
will not be suppressed.
|
||||||
|
|
||||||
-v --verbose
|
-v --verbose
|
||||||
Verbose mode -- show the compression ratio for each
|
Verbose mode -- show the compression ratio for each
|
||||||
file processed. Further -v's increase the ver-
|
file processed. Further -v's increase the ver
|
||||||
bosity level, spewing out lots of information which
|
bosity level, spewing out lots of information which
|
||||||
is primarily of interest for diagnostic purposes.
|
is primarily of interest for diagnostic purposes.
|
||||||
|
|
||||||
@ -214,20 +182,24 @@ bzip2(1) bzip2(1)
|
|||||||
Display the software version, license terms and
|
Display the software version, license terms and
|
||||||
conditions.
|
conditions.
|
||||||
|
|
||||||
-1 to -9
|
-1 (or --fast) to -9 (or --best)
|
||||||
Set the block size to 100 k, 200 k .. 900 k when
|
Set the block size to 100 k, 200 k .. 900 k when
|
||||||
compressing. Has no effect when decompressing.
|
compressing. Has no effect when decompressing.
|
||||||
See MEMORY MANAGEMENT below.
|
See MEMORY MANAGEMENT below. The --fast and --best
|
||||||
|
aliases are primarily for GNU gzip compatibility.
|
||||||
|
In particular, --fast doesn't make things signifi
|
||||||
|
cantly faster. And --best merely selects the
|
||||||
|
default behaviour.
|
||||||
|
|
||||||
-- Treats all subsequent arguments as file names, even
|
-- Treats all subsequent arguments as file names, even
|
||||||
if they start with a dash. This is so you can han-
|
if they start with a dash. This is so you can han
|
||||||
dle files with names beginning with a dash, for
|
dle files with names beginning with a dash, for
|
||||||
example: bzip2 -- -myfilename.
|
example: bzip2 -- -myfilename.
|
||||||
|
|
||||||
--repetitive-fast --repetitive-best
|
--repetitive-fast --repetitive-best
|
||||||
These flags are redundant in versions 0.9.5 and
|
These flags are redundant in versions 0.9.5 and
|
||||||
above. They provided some coarse control over the
|
above. They provided some coarse control over the
|
||||||
behaviour of the sorting algorithm in earlier ver-
|
behaviour of the sorting algorithm in earlier ver
|
||||||
sions, which was sometimes useful. 0.9.5 and above
|
sions, which was sometimes useful. 0.9.5 and above
|
||||||
have an improved algorithm which renders these
|
have an improved algorithm which renders these
|
||||||
flags irrelevant.
|
flags irrelevant.
|
||||||
@ -238,7 +210,7 @@ MEMORY MANAGEMENT
|
|||||||
affects both the compression ratio achieved, and the
|
affects both the compression ratio achieved, and the
|
||||||
amount of memory needed for compression and decompression.
|
amount of memory needed for compression and decompression.
|
||||||
The flags -1 through -9 specify the block size to be
|
The flags -1 through -9 specify the block size to be
|
||||||
100,000 bytes through 900,000 bytes (the default) respec-
|
100,000 bytes through 900,000 bytes (the default) respec
|
||||||
tively. At decompression time, the block size used for
|
tively. At decompression time, the block size used for
|
||||||
compression is read from the header of the compressed
|
compression is read from the header of the compressed
|
||||||
file, and bunzip2 then allocates itself just enough memory
|
file, and bunzip2 then allocates itself just enough memory
|
||||||
@ -256,18 +228,6 @@ MEMORY MANAGEMENT
|
|||||||
|
|
||||||
Larger block sizes give rapidly diminishing marginal
|
Larger block sizes give rapidly diminishing marginal
|
||||||
returns. Most of the compression comes from the first two
|
returns. Most of the compression comes from the first two
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
4
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
|
||||||
|
|
||||||
|
|
||||||
or three hundred k of block size, a fact worth bearing in
|
or three hundred k of block size, a fact worth bearing in
|
||||||
mind when using bzip2 on small machines. It is also
|
mind when using bzip2 on small machines. It is also
|
||||||
important to appreciate that the decompression memory
|
important to appreciate that the decompression memory
|
||||||
@ -278,13 +238,13 @@ bzip2(1) bzip2(1)
|
|||||||
bunzip2 will require about 3700 kbytes to decompress. To
|
bunzip2 will require about 3700 kbytes to decompress. To
|
||||||
support decompression of any file on a 4 megabyte machine,
|
support decompression of any file on a 4 megabyte machine,
|
||||||
bunzip2 has an option to decompress using approximately
|
bunzip2 has an option to decompress using approximately
|
||||||
half this amount of memory, about 2300 kbytes. Decompres-
|
half this amount of memory, about 2300 kbytes. Decompres
|
||||||
sion speed is also halved, so you should use this option
|
sion speed is also halved, so you should use this option
|
||||||
only where necessary. The relevant flag is -s.
|
only where necessary. The relevant flag is -s.
|
||||||
|
|
||||||
In general, try and use the largest block size memory con-
|
In general, try and use the largest block size memory con
|
||||||
straints allow, since that maximises the compression
|
straints allow, since that maximises the compression
|
||||||
achieved. Compression and decompression speed are virtu-
|
achieved. Compression and decompression speed are virtu
|
||||||
ally unaffected by block size.
|
ally unaffected by block size.
|
||||||
|
|
||||||
Another significant point applies to files which fit in a
|
Another significant point applies to files which fit in a
|
||||||
@ -300,11 +260,11 @@ bzip2(1) bzip2(1)
|
|||||||
|
|
||||||
Here is a table which summarises the maximum memory usage
|
Here is a table which summarises the maximum memory usage
|
||||||
for different block sizes. Also recorded is the total
|
for different block sizes. Also recorded is the total
|
||||||
compressed size for 14 files of the Calgary Text Compres-
|
compressed size for 14 files of the Calgary Text Compres
|
||||||
sion Corpus totalling 3,141,622 bytes. This column gives
|
sion Corpus totalling 3,141,622 bytes. This column gives
|
||||||
some feel for how compression varies with block size.
|
some feel for how compression varies with block size.
|
||||||
These figures tend to understate the advantage of larger
|
These figures tend to understate the advantage of larger
|
||||||
block sizes for larger files, since the Corpus is domi-
|
block sizes for larger files, since the Corpus is domi
|
||||||
nated by smaller files.
|
nated by smaller files.
|
||||||
|
|
||||||
Compress Decompress Decompress Corpus
|
Compress Decompress Decompress Corpus
|
||||||
@ -321,22 +281,9 @@ bzip2(1) bzip2(1)
|
|||||||
-9 7600k 3700k 2350k 828642
|
-9 7600k 3700k 2350k 828642
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
5
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
|
||||||
|
|
||||||
|
|
||||||
RECOVERING DATA FROM DAMAGED FILES
|
RECOVERING DATA FROM DAMAGED FILES
|
||||||
bzip2 compresses files in blocks, usually 900kbytes long.
|
bzip2 compresses files in blocks, usually 900kbytes long.
|
||||||
Each block is handled independently. If a media or trans-
|
Each block is handled independently. If a media or trans
|
||||||
mission error causes a multi-block .bz2 file to become
|
mission error causes a multi-block .bz2 file to become
|
||||||
damaged, it may be possible to recover data from the
|
damaged, it may be possible to recover data from the
|
||||||
undamaged blocks in the file.
|
undamaged blocks in the file.
|
||||||
@ -353,19 +300,19 @@ RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD F
|
|||||||
the integrity of the resulting files, and decompress those
|
the integrity of the resulting files, and decompress those
|
||||||
which are undamaged.
|
which are undamaged.
|
||||||
|
|
||||||
_b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam-
|
_b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam
|
||||||
aged file, and writes a number of files "rec0001file.bz2",
|
aged file, and writes a number of files
|
||||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
"rec00001file.bz2", "rec00002file.bz2", etc, containing
|
||||||
The output filenames are designed so that the use of
|
the extracted blocks. The output filenames are
|
||||||
wildcards in subsequent processing -- for example, "bzip2
|
designed so that the use of wildcards in subsequent pro
|
||||||
-dc rec*file.bz2 > recovered_data" -- lists the files in
|
cessing -- for example, "bzip2 -dc rec*file.bz2 > recov
|
||||||
the correct order.
|
ered_data" -- processes the files in the correct order.
|
||||||
|
|
||||||
bzip2recover should be of most use dealing with large .bz2
|
bzip2recover should be of most use dealing with large .bz2
|
||||||
files, as these will contain many blocks. It is clearly
|
files, as these will contain many blocks. It is clearly
|
||||||
futile to use it on damaged single-block files, since a
|
futile to use it on damaged single-block files, since a
|
||||||
damaged block cannot be recovered. If you wish to min-
|
damaged block cannot be recovered. If you wish to min
|
||||||
imise any potential data loss through media or transmis-
|
imise any potential data loss through media or transmis
|
||||||
sion errors, you might consider compressing with a smaller
|
sion errors, you might consider compressing with a smaller
|
||||||
block size.
|
block size.
|
||||||
|
|
||||||
@ -379,31 +326,19 @@ PERFORMANCE NOTES
|
|||||||
better than previous versions in this respect. The ratio
|
better than previous versions in this respect. The ratio
|
||||||
between worst-case and average-case compression time is in
|
between worst-case and average-case compression time is in
|
||||||
the region of 10:1. For previous versions, this figure
|
the region of 10:1. For previous versions, this figure
|
||||||
was more like 100:1. You can use the -vvvv option to mon-
|
was more like 100:1. You can use the -vvvv option to mon
|
||||||
itor progress in great detail, if you want.
|
itor progress in great detail, if you want.
|
||||||
|
|
||||||
Decompression speed is unaffected by these phenomena.
|
Decompression speed is unaffected by these phenomena.
|
||||||
|
|
||||||
bzip2 usually allocates several megabytes of memory to
|
bzip2 usually allocates several megabytes of memory to
|
||||||
operate in, and then charges all over it in a fairly ran-
|
operate in, and then charges all over it in a fairly ran
|
||||||
dom fashion. This means that performance, both for com-
|
dom fashion. This means that performance, both for com
|
||||||
pressing and decompressing, is largely determined by the
|
pressing and decompressing, is largely determined by the
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
6
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
bzip2(1) bzip2(1)
|
|
||||||
|
|
||||||
|
|
||||||
speed at which your machine can service cache misses.
|
speed at which your machine can service cache misses.
|
||||||
Because of this, small changes to the code to reduce the
|
Because of this, small changes to the code to reduce the
|
||||||
miss rate have been observed to give disproportionately
|
miss rate have been observed to give disproportionately
|
||||||
large performance improvements. I imagine _b_z_i_p_2 will per-
|
large performance improvements. I imagine _b_z_i_p_2 will per
|
||||||
form best on machines with very large caches.
|
form best on machines with very large caches.
|
||||||
|
|
||||||
|
|
||||||
@ -413,50 +348,51 @@ CAVEATS
|
|||||||
but the details of what the problem is sometimes seem
|
but the details of what the problem is sometimes seem
|
||||||
rather misleading.
|
rather misleading.
|
||||||
|
|
||||||
This manual page pertains to version 1.0 of _b_z_i_p_2_. Com-
|
This manual page pertains to version 1.0.2 of _b_z_i_p_2_. Com
|
||||||
pressed data created by this version is entirely forwards
|
pressed data created by this version is entirely forwards
|
||||||
and backwards compatible with the previous public
|
and backwards compatible with the previous public
|
||||||
releases, versions 0.1pl2, 0.9.0 and 0.9.5, but with the
|
releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
|
||||||
following exception: 0.9.0 and above can correctly decom-
|
but with the following exception: 0.9.0 and above can cor
|
||||||
press multiple concatenated compressed files. 0.1pl2 can-
|
rectly decompress multiple concatenated compressed files.
|
||||||
not do this; it will stop after decompressing just the
|
0.1pl2 cannot do this; it will stop after decompressing
|
||||||
first file in the stream.
|
just the first file in the stream.
|
||||||
|
|
||||||
|
bzip2recover versions prior to this one, 1.0.2, used
|
||||||
|
32-bit integers to represent bit positions in compressed
|
||||||
|
files, so it could not handle compressed files more than
|
||||||
|
512 megabytes long. Version 1.0.2 and above uses 64-bit
|
||||||
|
ints on some platforms which support them (GNU supported
|
||||||
|
targets, and Windows). To establish whether or not
|
||||||
|
bzip2recover was built with such a limitation, run it
|
||||||
|
without arguments. In any event you can build yourself an
|
||||||
|
unlimited version if you can recompile it with MaybeUInt64
|
||||||
|
set to be an unsigned 64-bit integer.
|
||||||
|
|
||||||
|
|
||||||
_b_z_i_p_2_r_e_c_o_v_e_r uses 32-bit integers to represent bit posi-
|
|
||||||
tions in compressed files, so it cannot handle compressed
|
|
||||||
files more than 512 megabytes long. This could easily be
|
|
||||||
fixed.
|
|
||||||
|
|
||||||
|
|
||||||
AUTHOR
|
AUTHOR
|
||||||
Julian Seward, jseward@acm.org.
|
Julian Seward, jseward@acm.org.
|
||||||
|
|
||||||
http://sourceware.cygnus.com/bzip2
|
http://sources.redhat.com/bzip2
|
||||||
http://www.muraroa.demon.co.uk
|
|
||||||
|
|
||||||
The ideas embodied in _b_z_i_p_2 are due to (at least) the fol-
|
The ideas embodied in _b_z_i_p_2 are due to (at least) the fol
|
||||||
lowing people: Michael Burrows and David Wheeler (for the
|
lowing people: Michael Burrows and David Wheeler (for the
|
||||||
block sorting transformation), David Wheeler (again, for
|
block sorting transformation), David Wheeler (again, for
|
||||||
the Huffman coder), Peter Fenwick (for the structured cod-
|
the Huffman coder), Peter Fenwick (for the structured cod
|
||||||
ing model in the original _b_z_i_p_, and many refinements), and
|
ing model in the original _b_z_i_p_, and many refinements), and
|
||||||
Alistair Moffat, Radford Neal and Ian Witten (for the
|
Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||||
arithmetic coder in the original bzip). I am much
|
arithmetic coder in the original bzip). I am much
|
||||||
indebted for their help, support and advice. See the man-
|
indebted for their help, support and advice. See the man
|
||||||
ual in the source distribution for pointers to sources of
|
ual in the source distribution for pointers to sources of
|
||||||
documentation. Christian von Roques encouraged me to look
|
documentation. Christian von Roques encouraged me to look
|
||||||
for faster sorting algorithms, so as to speed up compres-
|
for faster sorting algorithms, so as to speed up compres
|
||||||
sion. Bela Lubkin encouraged me to improve the worst-case
|
sion. Bela Lubkin encouraged me to improve the worst-case
|
||||||
compression performance. Many people sent patches, helped
|
compression performance. The bz* scripts are derived from
|
||||||
with portability problems, lent machines, gave advice and
|
those of GNU gzip. Many people sent patches, helped with
|
||||||
were generally helpful.
|
portability problems, lent machines, gave advice and were
|
||||||
|
generally helpful.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
bzip2(1)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
7
|
|
||||||
|
|
||||||
|
|
||||||
|
535
bzip2.c
@ -7,7 +7,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -113,13 +113,16 @@
|
|||||||
/*--
|
/*--
|
||||||
Generic 32-bit Unix.
|
Generic 32-bit Unix.
|
||||||
Also works on 64-bit Unix boxes.
|
Also works on 64-bit Unix boxes.
|
||||||
|
This is the default.
|
||||||
--*/
|
--*/
|
||||||
#define BZ_UNIX 1
|
#define BZ_UNIX 1
|
||||||
|
|
||||||
/*--
|
/*--
|
||||||
Win32, as seen by Jacob Navia's excellent
|
Win32, as seen by Jacob Navia's excellent
|
||||||
port of (Chris Fraser & David Hanson)'s excellent
|
port of (Chris Fraser & David Hanson)'s excellent
|
||||||
lcc compiler.
|
lcc compiler. Or with MS Visual C.
|
||||||
|
This is selected automatically if compiled by a compiler which
|
||||||
|
defines _WIN32, not including the Cygwin GCC.
|
||||||
--*/
|
--*/
|
||||||
#define BZ_LCCWIN32 0
|
#define BZ_LCCWIN32 0
|
||||||
|
|
||||||
@ -156,6 +159,7 @@
|
|||||||
--*/
|
--*/
|
||||||
|
|
||||||
#if BZ_UNIX
|
#if BZ_UNIX
|
||||||
|
# include <fcntl.h>
|
||||||
# include <sys/types.h>
|
# include <sys/types.h>
|
||||||
# include <utime.h>
|
# include <utime.h>
|
||||||
# include <unistd.h>
|
# include <unistd.h>
|
||||||
@ -164,8 +168,9 @@
|
|||||||
|
|
||||||
# define PATH_SEP '/'
|
# define PATH_SEP '/'
|
||||||
# define MY_LSTAT lstat
|
# define MY_LSTAT lstat
|
||||||
# define MY_S_IFREG S_ISREG
|
|
||||||
# define MY_STAT stat
|
# define MY_STAT stat
|
||||||
|
# define MY_S_ISREG S_ISREG
|
||||||
|
# define MY_S_ISDIR S_ISDIR
|
||||||
|
|
||||||
# define APPEND_FILESPEC(root, name) \
|
# define APPEND_FILESPEC(root, name) \
|
||||||
root=snocString((root), (name))
|
root=snocString((root), (name))
|
||||||
@ -180,19 +185,23 @@
|
|||||||
# else
|
# else
|
||||||
# define NORETURN /**/
|
# define NORETURN /**/
|
||||||
# endif
|
# endif
|
||||||
|
|
||||||
# ifdef __DJGPP__
|
# ifdef __DJGPP__
|
||||||
# include <io.h>
|
# include <io.h>
|
||||||
# include <fcntl.h>
|
# include <fcntl.h>
|
||||||
# undef MY_LSTAT
|
# undef MY_LSTAT
|
||||||
|
# undef MY_STAT
|
||||||
# define MY_LSTAT stat
|
# define MY_LSTAT stat
|
||||||
|
# define MY_STAT stat
|
||||||
# undef SET_BINARY_MODE
|
# undef SET_BINARY_MODE
|
||||||
# define SET_BINARY_MODE(fd) \
|
# define SET_BINARY_MODE(fd) \
|
||||||
do { \
|
do { \
|
||||||
int retVal = setmode ( fileno ( fd ), \
|
int retVal = setmode ( fileno ( fd ), \
|
||||||
O_BINARY ); \
|
O_BINARY ); \
|
||||||
ERROR_IF_MINUS_ONE ( retVal ); \
|
ERROR_IF_MINUS_ONE ( retVal ); \
|
||||||
} while ( 0 )
|
} while ( 0 )
|
||||||
# endif
|
# endif
|
||||||
|
|
||||||
# ifdef __CYGWIN__
|
# ifdef __CYGWIN__
|
||||||
# include <io.h>
|
# include <io.h>
|
||||||
# include <fcntl.h>
|
# include <fcntl.h>
|
||||||
@ -200,11 +209,11 @@
|
|||||||
# define SET_BINARY_MODE(fd) \
|
# define SET_BINARY_MODE(fd) \
|
||||||
do { \
|
do { \
|
||||||
int retVal = setmode ( fileno ( fd ), \
|
int retVal = setmode ( fileno ( fd ), \
|
||||||
O_BINARY ); \
|
O_BINARY ); \
|
||||||
ERROR_IF_MINUS_ONE ( retVal ); \
|
ERROR_IF_MINUS_ONE ( retVal ); \
|
||||||
} while ( 0 )
|
} while ( 0 )
|
||||||
# endif
|
# endif
|
||||||
#endif
|
#endif /* BZ_UNIX */
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -217,46 +226,23 @@
|
|||||||
# define PATH_SEP '\\'
|
# define PATH_SEP '\\'
|
||||||
# define MY_LSTAT _stat
|
# define MY_LSTAT _stat
|
||||||
# define MY_STAT _stat
|
# define MY_STAT _stat
|
||||||
# define MY_S_IFREG(x) ((x) & _S_IFREG)
|
# define MY_S_ISREG(x) ((x) & _S_IFREG)
|
||||||
|
# define MY_S_ISDIR(x) ((x) & _S_IFDIR)
|
||||||
|
|
||||||
# define APPEND_FLAG(root, name) \
|
# define APPEND_FLAG(root, name) \
|
||||||
root=snocString((root), (name))
|
root=snocString((root), (name))
|
||||||
|
|
||||||
# if 0
|
|
||||||
/*-- lcc-win32 seems to expand wildcards itself --*/
|
|
||||||
# define APPEND_FILESPEC(root, spec) \
|
|
||||||
do { \
|
|
||||||
if ((spec)[0] == '-') { \
|
|
||||||
root = snocString((root), (spec)); \
|
|
||||||
} else { \
|
|
||||||
struct _finddata_t c_file; \
|
|
||||||
long hFile; \
|
|
||||||
hFile = _findfirst((spec), &c_file); \
|
|
||||||
if ( hFile == -1L ) { \
|
|
||||||
root = snocString ((root), (spec)); \
|
|
||||||
} else { \
|
|
||||||
int anInt = 0; \
|
|
||||||
while ( anInt == 0 ) { \
|
|
||||||
root = snocString((root), \
|
|
||||||
&c_file.name[0]); \
|
|
||||||
anInt = _findnext(hFile, &c_file); \
|
|
||||||
} \
|
|
||||||
} \
|
|
||||||
} \
|
|
||||||
} while ( 0 )
|
|
||||||
# else
|
|
||||||
# define APPEND_FILESPEC(root, name) \
|
# define APPEND_FILESPEC(root, name) \
|
||||||
root = snocString ((root), (name))
|
root = snocString ((root), (name))
|
||||||
# endif
|
|
||||||
|
|
||||||
# define SET_BINARY_MODE(fd) \
|
# define SET_BINARY_MODE(fd) \
|
||||||
do { \
|
do { \
|
||||||
int retVal = setmode ( fileno ( fd ), \
|
int retVal = setmode ( fileno ( fd ), \
|
||||||
O_BINARY ); \
|
O_BINARY ); \
|
||||||
ERROR_IF_MINUS_ONE ( retVal ); \
|
ERROR_IF_MINUS_ONE ( retVal ); \
|
||||||
} while ( 0 )
|
} while ( 0 )
|
||||||
|
|
||||||
#endif
|
#endif /* BZ_LCCWIN32 */
|
||||||
|
|
||||||
|
|
||||||
/*---------------------------------------------*/
|
/*---------------------------------------------*/
|
||||||
@ -338,6 +324,7 @@ typedef
|
|||||||
struct { UChar b[8]; }
|
struct { UChar b[8]; }
|
||||||
UInt64;
|
UInt64;
|
||||||
|
|
||||||
|
|
||||||
static
|
static
|
||||||
void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
|
void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
|
||||||
{
|
{
|
||||||
@ -351,6 +338,7 @@ void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
|
|||||||
n->b[0] = (UChar) (lo32 & 0xFF);
|
n->b[0] = (UChar) (lo32 & 0xFF);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
static
|
static
|
||||||
double uInt64_to_double ( UInt64* n )
|
double uInt64_to_double ( UInt64* n )
|
||||||
{
|
{
|
||||||
@ -364,77 +352,6 @@ double uInt64_to_double ( UInt64* n )
|
|||||||
return sum;
|
return sum;
|
||||||
}
|
}
|
||||||
|
|
||||||
static
|
|
||||||
void uInt64_add ( UInt64* src, UInt64* dst )
|
|
||||||
{
|
|
||||||
Int32 i;
|
|
||||||
Int32 carry = 0;
|
|
||||||
for (i = 0; i < 8; i++) {
|
|
||||||
carry += ( ((Int32)src->b[i]) + ((Int32)dst->b[i]) );
|
|
||||||
dst->b[i] = (UChar)(carry & 0xFF);
|
|
||||||
carry >>= 8;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
static
|
|
||||||
void uInt64_sub ( UInt64* src, UInt64* dst )
|
|
||||||
{
|
|
||||||
Int32 t, i;
|
|
||||||
Int32 borrow = 0;
|
|
||||||
for (i = 0; i < 8; i++) {
|
|
||||||
t = ((Int32)dst->b[i]) - ((Int32)src->b[i]) - borrow;
|
|
||||||
if (t < 0) {
|
|
||||||
dst->b[i] = (UChar)(t + 256);
|
|
||||||
borrow = 1;
|
|
||||||
} else {
|
|
||||||
dst->b[i] = (UChar)t;
|
|
||||||
borrow = 0;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
static
|
|
||||||
void uInt64_mul ( UInt64* a, UInt64* b, UInt64* r_hi, UInt64* r_lo )
|
|
||||||
{
|
|
||||||
UChar sum[16];
|
|
||||||
Int32 ia, ib, carry;
|
|
||||||
for (ia = 0; ia < 16; ia++) sum[ia] = 0;
|
|
||||||
for (ia = 0; ia < 8; ia++) {
|
|
||||||
carry = 0;
|
|
||||||
for (ib = 0; ib < 8; ib++) {
|
|
||||||
carry += ( ((Int32)sum[ia+ib])
|
|
||||||
+ ((Int32)a->b[ia]) * ((Int32)b->b[ib]) );
|
|
||||||
sum[ia+ib] = (UChar)(carry & 0xFF);
|
|
||||||
carry >>= 8;
|
|
||||||
}
|
|
||||||
sum[ia+8] = (UChar)(carry & 0xFF);
|
|
||||||
if ((carry >>= 8) != 0) panic ( "uInt64_mul" );
|
|
||||||
}
|
|
||||||
|
|
||||||
for (ia = 0; ia < 8; ia++) r_hi->b[ia] = sum[ia+8];
|
|
||||||
for (ia = 0; ia < 8; ia++) r_lo->b[ia] = sum[ia];
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
static
|
|
||||||
void uInt64_shr1 ( UInt64* n )
|
|
||||||
{
|
|
||||||
Int32 i;
|
|
||||||
for (i = 0; i < 8; i++) {
|
|
||||||
n->b[i] >>= 1;
|
|
||||||
if (i < 7 && (n->b[i+1] & 1)) n->b[i] |= 0x80;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
static
|
|
||||||
void uInt64_shl1 ( UInt64* n )
|
|
||||||
{
|
|
||||||
Int32 i;
|
|
||||||
for (i = 7; i >= 0; i--) {
|
|
||||||
n->b[i] <<= 1;
|
|
||||||
if (i > 0 && (n->b[i-1] & 0x80)) n->b[i]++;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
static
|
static
|
||||||
Bool uInt64_isZero ( UInt64* n )
|
Bool uInt64_isZero ( UInt64* n )
|
||||||
@ -445,49 +362,23 @@ Bool uInt64_isZero ( UInt64* n )
|
|||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
static
|
|
||||||
|
/* Divide *n by 10, and return the remainder. */
|
||||||
|
static
|
||||||
Int32 uInt64_qrm10 ( UInt64* n )
|
Int32 uInt64_qrm10 ( UInt64* n )
|
||||||
{
|
{
|
||||||
/* Divide *n by 10, and return the remainder. Long division
|
UInt32 rem, tmp;
|
||||||
is difficult, so we cheat and instead multiply by
|
|
||||||
0xCCCC CCCC CCCC CCCD, which is 0.8 (viz, 0.1 << 3).
|
|
||||||
*/
|
|
||||||
Int32 i;
|
Int32 i;
|
||||||
UInt64 tmp1, tmp2, n_orig, zero_point_eight;
|
rem = 0;
|
||||||
|
for (i = 7; i >= 0; i--) {
|
||||||
zero_point_eight.b[1] = zero_point_eight.b[2] =
|
tmp = rem * 256 + n->b[i];
|
||||||
zero_point_eight.b[3] = zero_point_eight.b[4] =
|
n->b[i] = tmp / 10;
|
||||||
zero_point_eight.b[5] = zero_point_eight.b[6] =
|
rem = tmp % 10;
|
||||||
zero_point_eight.b[7] = 0xCC;
|
}
|
||||||
zero_point_eight.b[0] = 0xCD;
|
return rem;
|
||||||
|
|
||||||
n_orig = *n;
|
|
||||||
|
|
||||||
/* divide n by 10,
|
|
||||||
by multiplying by 0.8 and then shifting right 3 times */
|
|
||||||
uInt64_mul ( n, &zero_point_eight, &tmp1, &tmp2 );
|
|
||||||
uInt64_shr1(&tmp1); uInt64_shr1(&tmp1); uInt64_shr1(&tmp1);
|
|
||||||
*n = tmp1;
|
|
||||||
|
|
||||||
/* tmp1 = 8*n, tmp2 = 2*n */
|
|
||||||
uInt64_shl1(&tmp1); uInt64_shl1(&tmp1); uInt64_shl1(&tmp1);
|
|
||||||
tmp2 = *n; uInt64_shl1(&tmp2);
|
|
||||||
|
|
||||||
/* tmp1 = 10*n */
|
|
||||||
uInt64_add ( &tmp2, &tmp1 );
|
|
||||||
|
|
||||||
/* n_orig = n_orig - 10*n */
|
|
||||||
uInt64_sub ( &tmp1, &n_orig );
|
|
||||||
|
|
||||||
/* n_orig should now hold quotient, in range 0 .. 9 */
|
|
||||||
for (i = 7; i >= 1; i--)
|
|
||||||
if (n_orig.b[i] != 0) panic ( "uInt64_qrm10(1)" );
|
|
||||||
if (n_orig.b[0] > 9)
|
|
||||||
panic ( "uInt64_qrm10(2)" );
|
|
||||||
|
|
||||||
return (int)n_orig.b[0];
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* ... and the Whole Entire Point of all this UInt64 stuff is
|
/* ... and the Whole Entire Point of all this UInt64 stuff is
|
||||||
so that we can supply the following function.
|
so that we can supply the following function.
|
||||||
*/
|
*/
|
||||||
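The replacement uInt64_qrm10 above does schoolbook long division one byte at a time, most significant byte first, carrying the remainder down. A minimal standalone sketch of that idea follows. The names U64Bytes, div10_bytes, to_decimal and from_u64 are hypothetical stand-ins chosen for illustration and are not identifiers from bzip2.c; the byte order (least significant byte in b[0]) is assumed from uInt64_from_UInt32s.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the UInt64 struct in bzip2.c:
   8 bytes, least significant byte in b[0]. */
typedef struct { unsigned char b[8]; } U64Bytes;

/* Divide *n by 10 in place and return the remainder (0..9).
   Works from the most significant byte down, like long division. */
static unsigned div10_bytes(U64Bytes *n)
{
    unsigned rem = 0;
    int i;
    assert(n != NULL);
    for (i = 7; i >= 0; i--) {
        /* rem is at most 9, so tmp is at most 9*256 + 255 = 2559
           and tmp / 10 always fits in one byte. */
        unsigned tmp = rem * 256u + n->b[i];
        n->b[i] = (unsigned char)(tmp / 10u);
        rem = tmp % 10u;
    }
    return rem;
}

static int is_zero(const U64Bytes *n)
{
    int i;
    for (i = 0; i < 8; i++)
        if (n->b[i] != 0) return 0;
    return 1;
}

/* Build the byte-array form from a native 64-bit value (test helper). */
static U64Bytes from_u64(uint64_t v)
{
    U64Bytes n;
    int i;
    for (i = 0; i < 8; i++) {
        n.b[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    return n;
}

/* Render n as decimal text; out needs room for 20 digits plus NUL. */
static void to_decimal(char *out, U64Bytes n)
{
    char buf[32];
    int nbuf = 0, i;
    do {
        buf[nbuf++] = (char)('0' + div10_bytes(&n));
    } while (!is_zero(&n));
    /* Digits were produced least significant first, so reverse them. */
    for (i = 0; i < nbuf; i++) out[i] = buf[nbuf - i - 1];
    out[nbuf] = '\0';
}
```

Each step only needs ordinary unsigned arithmetic on small values, which is presumably why the multiply, shift, add and subtract helpers could be deleted in this commit.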
@ -504,7 +395,8 @@ void uInt64_toAscii ( char* outbuf, UInt64* n )
|
|||||||
nBuf++;
|
nBuf++;
|
||||||
} while (!uInt64_isZero(&n_copy));
|
} while (!uInt64_isZero(&n_copy));
|
||||||
outbuf[nBuf] = 0;
|
outbuf[nBuf] = 0;
|
||||||
for (i = 0; i < nBuf; i++) outbuf[i] = buf[nBuf-i-1];
|
for (i = 0; i < nBuf; i++)
|
||||||
|
outbuf[i] = buf[nBuf-i-1];
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@ -566,35 +458,38 @@ void compressStream ( FILE *stream, FILE *zStream )
|
|||||||
if (ret == EOF) goto errhandler_io;
|
if (ret == EOF) goto errhandler_io;
|
||||||
if (zStream != stdout) {
|
if (zStream != stdout) {
|
||||||
ret = fclose ( zStream );
|
ret = fclose ( zStream );
|
||||||
|
outputHandleJustInCase = NULL;
|
||||||
if (ret == EOF) goto errhandler_io;
|
if (ret == EOF) goto errhandler_io;
|
||||||
}
|
}
|
||||||
|
outputHandleJustInCase = NULL;
|
||||||
if (ferror(stream)) goto errhandler_io;
|
if (ferror(stream)) goto errhandler_io;
|
||||||
ret = fclose ( stream );
|
ret = fclose ( stream );
|
||||||
if (ret == EOF) goto errhandler_io;
|
if (ret == EOF) goto errhandler_io;
|
||||||
|
|
||||||
if (nbytes_in_lo32 == 0 && nbytes_in_hi32 == 0)
|
|
||||||
nbytes_in_lo32 = 1;
|
|
||||||
|
|
||||||
if (verbosity >= 1) {
|
if (verbosity >= 1) {
|
||||||
Char buf_nin[32], buf_nout[32];
|
if (nbytes_in_lo32 == 0 && nbytes_in_hi32 == 0) {
|
||||||
UInt64 nbytes_in, nbytes_out;
|
fprintf ( stderr, " no data compressed.\n");
|
||||||
double nbytes_in_d, nbytes_out_d;
|
} else {
|
||||||
uInt64_from_UInt32s ( &nbytes_in,
|
Char buf_nin[32], buf_nout[32];
|
||||||
nbytes_in_lo32, nbytes_in_hi32 );
|
UInt64 nbytes_in, nbytes_out;
|
||||||
uInt64_from_UInt32s ( &nbytes_out,
|
double nbytes_in_d, nbytes_out_d;
|
||||||
nbytes_out_lo32, nbytes_out_hi32 );
|
uInt64_from_UInt32s ( &nbytes_in,
|
||||||
nbytes_in_d = uInt64_to_double ( &nbytes_in );
|
nbytes_in_lo32, nbytes_in_hi32 );
|
||||||
nbytes_out_d = uInt64_to_double ( &nbytes_out );
|
uInt64_from_UInt32s ( &nbytes_out,
|
||||||
uInt64_toAscii ( buf_nin, &nbytes_in );
|
nbytes_out_lo32, nbytes_out_hi32 );
|
||||||
uInt64_toAscii ( buf_nout, &nbytes_out );
|
nbytes_in_d = uInt64_to_double ( &nbytes_in );
|
||||||
fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
|
nbytes_out_d = uInt64_to_double ( &nbytes_out );
|
||||||
"%5.2f%% saved, %s in, %s out.\n",
|
uInt64_toAscii ( buf_nin, &nbytes_in );
|
||||||
nbytes_in_d / nbytes_out_d,
|
uInt64_toAscii ( buf_nout, &nbytes_out );
|
||||||
(8.0 * nbytes_out_d) / nbytes_in_d,
|
fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
|
||||||
100.0 * (1.0 - nbytes_out_d / nbytes_in_d),
|
"%5.2f%% saved, %s in, %s out.\n",
|
||||||
buf_nin,
|
nbytes_in_d / nbytes_out_d,
|
||||||
buf_nout
|
(8.0 * nbytes_out_d) / nbytes_in_d,
|
||||||
);
|
100.0 * (1.0 - nbytes_out_d / nbytes_in_d),
|
||||||
|
buf_nin,
|
||||||
|
buf_nout
|
||||||
|
);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return;
|
return;
|
||||||
@ -652,7 +547,7 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
|
|||||||
|
|
||||||
while (bzerr == BZ_OK) {
|
while (bzerr == BZ_OK) {
|
||||||
nread = BZ2_bzRead ( &bzerr, bzf, obuf, 5000 );
|
nread = BZ2_bzRead ( &bzerr, bzf, obuf, 5000 );
|
||||||
if (bzerr == BZ_DATA_ERROR_MAGIC) goto errhandler;
|
if (bzerr == BZ_DATA_ERROR_MAGIC) goto trycat;
|
||||||
if ((bzerr == BZ_OK || bzerr == BZ_STREAM_END) && nread > 0)
|
if ((bzerr == BZ_OK || bzerr == BZ_STREAM_END) && nread > 0)
|
||||||
fwrite ( obuf, sizeof(UChar), nread, stream );
|
fwrite ( obuf, sizeof(UChar), nread, stream );
|
||||||
if (ferror(stream)) goto errhandler_io;
|
if (ferror(stream)) goto errhandler_io;
|
||||||
@ -668,9 +563,9 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
|
|||||||
if (bzerr != BZ_OK) panic ( "decompress:bzReadGetUnused" );
|
if (bzerr != BZ_OK) panic ( "decompress:bzReadGetUnused" );
|
||||||
|
|
||||||
if (nUnused == 0 && myfeof(zStream)) break;
|
if (nUnused == 0 && myfeof(zStream)) break;
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
closeok:
|
||||||
if (ferror(zStream)) goto errhandler_io;
|
if (ferror(zStream)) goto errhandler_io;
|
||||||
ret = fclose ( zStream );
|
ret = fclose ( zStream );
|
||||||
if (ret == EOF) goto errhandler_io;
|
if (ret == EOF) goto errhandler_io;
|
||||||
@ -680,11 +575,26 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
|
|||||||
if (ret != 0) goto errhandler_io;
|
if (ret != 0) goto errhandler_io;
|
||||||
if (stream != stdout) {
|
if (stream != stdout) {
|
||||||
ret = fclose ( stream );
|
ret = fclose ( stream );
|
||||||
|
outputHandleJustInCase = NULL;
|
||||||
if (ret == EOF) goto errhandler_io;
|
if (ret == EOF) goto errhandler_io;
|
||||||
}
|
}
|
||||||
|
outputHandleJustInCase = NULL;
|
||||||
if (verbosity >= 2) fprintf ( stderr, "\n " );
|
if (verbosity >= 2) fprintf ( stderr, "\n " );
|
||||||
return True;
|
return True;
|
||||||
|
|
||||||
|
trycat:
|
||||||
|
if (forceOverwrite) {
|
||||||
|
rewind(zStream);
|
||||||
|
while (True) {
|
||||||
|
if (myfeof(zStream)) break;
|
||||||
|
nread = fread ( obuf, sizeof(UChar), 5000, zStream );
|
||||||
|
if (ferror(zStream)) goto errhandler_io;
|
||||||
|
if (nread > 0) fwrite ( obuf, sizeof(UChar), nread, stream );
|
||||||
|
if (ferror(stream)) goto errhandler_io;
|
||||||
|
}
|
||||||
|
goto closeok;
|
||||||
|
}
|
||||||
|
|
||||||
errhandler:
|
errhandler:
|
||||||
BZ2_bzReadClose ( &bzerr_dummy, bzf );
|
BZ2_bzReadClose ( &bzerr_dummy, bzf );
|
||||||
switch (bzerr) {
|
switch (bzerr) {
|
||||||
@ -832,7 +742,7 @@ void cadvise ( void )
|
|||||||
stderr,
|
stderr,
|
||||||
"\nIt is possible that the compressed file(s) have become corrupted.\n"
|
"\nIt is possible that the compressed file(s) have become corrupted.\n"
|
||||||
"You can use the -tvv option to test integrity of such files.\n\n"
|
"You can use the -tvv option to test integrity of such files.\n\n"
|
||||||
"You can use the `bzip2recover' program to *attempt* to recover\n"
|
"You can use the `bzip2recover' program to attempt to recover\n"
|
||||||
"data from undamaged sections of corrupted files.\n\n"
|
"data from undamaged sections of corrupted files.\n\n"
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
@ -855,28 +765,55 @@ void showFileNames ( void )
|
|||||||
static
|
static
|
||||||
void cleanUpAndFail ( Int32 ec )
|
void cleanUpAndFail ( Int32 ec )
|
||||||
{
|
{
|
||||||
IntNative retVal;
|
IntNative retVal;
|
||||||
|
struct MY_STAT statBuf;
|
||||||
|
|
||||||
if ( srcMode == SM_F2F
|
if ( srcMode == SM_F2F
|
||||||
&& opMode != OM_TEST
|
&& opMode != OM_TEST
|
||||||
&& deleteOutputOnInterrupt ) {
|
&& deleteOutputOnInterrupt ) {
|
||||||
if (noisy)
|
|
||||||
fprintf ( stderr, "%s: Deleting output file %s, if it exists.\n",
|
/* Check whether input file still exists. Delete output file
|
||||||
progName, outName );
|
only if input exists to avoid loss of data. Joerg Prante, 5
|
||||||
if (outputHandleJustInCase != NULL)
|
January 2002. (JRS 06-Jan-2002: other changes in 1.0.2 mean
|
||||||
fclose ( outputHandleJustInCase );
|
this is less likely to happen. But to be ultra-paranoid, we
|
||||||
retVal = remove ( outName );
|
do the check anyway.) */
|
||||||
if (retVal != 0)
|
retVal = MY_STAT ( inName, &statBuf );
|
||||||
|
if (retVal == 0) {
|
||||||
|
if (noisy)
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: Deleting output file %s, if it exists.\n",
|
||||||
|
progName, outName );
|
||||||
|
if (outputHandleJustInCase != NULL)
|
||||||
|
fclose ( outputHandleJustInCase );
|
||||||
|
retVal = remove ( outName );
|
||||||
|
if (retVal != 0)
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: WARNING: deletion of output file "
|
||||||
|
"(apparently) failed.\n",
|
||||||
|
progName );
|
||||||
|
} else {
|
||||||
fprintf ( stderr,
|
fprintf ( stderr,
|
||||||
"%s: WARNING: deletion of output file (apparently) failed.\n",
|
"%s: WARNING: deletion of output file suppressed\n",
|
||||||
|
progName );
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: since input file no longer exists. Output file\n",
|
||||||
progName );
|
progName );
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: `%s' may be incomplete.\n",
|
||||||
|
progName, outName );
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: I suggest doing an integrity test (bzip2 -tv)"
|
||||||
|
" of it.\n",
|
||||||
|
progName );
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (noisy && numFileNames > 0 && numFilesProcessed < numFileNames) {
|
if (noisy && numFileNames > 0 && numFilesProcessed < numFileNames) {
|
||||||
fprintf ( stderr,
|
fprintf ( stderr,
|
||||||
"%s: WARNING: some files have not been processed:\n"
|
"%s: WARNING: some files have not been processed:\n"
|
||||||
"\t%d specified on command line, %d not processed yet.\n\n",
|
"%s: %d specified on command line, %d not processed yet.\n\n",
|
||||||
progName, numFileNames,
|
progName, progName,
|
||||||
numFileNames - numFilesProcessed );
|
numFileNames, numFileNames - numFilesProcessed );
|
||||||
}
|
}
|
||||||
setExit(ec);
|
setExit(ec);
|
||||||
exit(exitValue);
|
exit(exitValue);
|
||||||
@ -915,14 +852,16 @@ void crcError ( void )
|
|||||||
static
|
static
|
||||||
void compressedStreamEOF ( void )
|
void compressedStreamEOF ( void )
|
||||||
{
|
{
|
||||||
fprintf ( stderr,
|
if (noisy) {
|
||||||
"\n%s: Compressed file ends unexpectedly;\n\t"
|
fprintf ( stderr,
|
||||||
"perhaps it is corrupted? *Possible* reason follows.\n",
|
"\n%s: Compressed file ends unexpectedly;\n\t"
|
||||||
progName );
|
"perhaps it is corrupted? *Possible* reason follows.\n",
|
||||||
perror ( progName );
|
progName );
|
||||||
showFileNames();
|
perror ( progName );
|
||||||
cadvise();
|
showFileNames();
|
||||||
cleanUpAndFail( 2 );
|
cadvise();
|
||||||
|
}
|
||||||
|
cleanUpAndFail( 2 );
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@ -1038,6 +977,11 @@ void configError ( void )
|
|||||||
/*--- The main driver machinery ---*/
|
/*--- The main driver machinery ---*/
|
||||||
/*---------------------------------------------------*/
|
/*---------------------------------------------------*/
|
||||||
|
|
||||||
|
/* All rather crufty. The main problem is that input files
|
||||||
|
are stat()d multiple times before use. This should be
|
||||||
|
cleaned up.
|
||||||
|
*/
|
||||||
|
|
||||||
/*---------------------------------------------*/
|
/*---------------------------------------------*/
|
||||||
static
|
static
|
||||||
void pad ( Char *s )
|
void pad ( Char *s )
|
||||||
@ -1081,6 +1025,32 @@ Bool fileExists ( Char* name )
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/*---------------------------------------------*/
|
||||||
|
/* Open an output file safely with O_EXCL and good permissions.
|
||||||
|
This avoids a race condition in versions < 1.0.2, in which
|
||||||
|
the file was first opened and then had its interim permissions
|
||||||
|
set safely. We instead use open() to create the file with
|
||||||
|
the interim permissions required. (--- --- rw-).
|
||||||
|
|
||||||
|
For non-Unix platforms, if we are not worrying about
|
||||||
|
security issues, this simply behaves like fopen.
|
||||||
|
*/
|
||||||
|
FILE* fopen_output_safely ( Char* name, const char* mode )
|
||||||
|
{
|
||||||
|
# if BZ_UNIX
|
||||||
|
FILE* fp;
|
||||||
|
IntNative fh;
|
||||||
|
fh = open(name, O_WRONLY|O_CREAT|O_EXCL, S_IWUSR|S_IRUSR);
|
||||||
|
if (fh == -1) return NULL;
|
||||||
|
fp = fdopen(fh, mode);
|
||||||
|
if (fp == NULL) close(fh);
|
||||||
|
return fp;
|
||||||
|
# else
|
||||||
|
return fopen(name, mode);
|
||||||
|
# endif
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
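The exclusive-create pattern used by fopen_output_safely can be tried in isolation. The sketch below assumes a POSIX system; open_new_output is a hypothetical name, and preserving errno across close() is an addition of this sketch rather than something the commit does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create path for writing only if it does not already exist, with
   owner-only read/write permissions set at creation time.
   Returns NULL on failure (errno is EEXIST if the file was present). */
static FILE *open_new_output(const char *path)
{
    int fd;
    FILE *fp;
    assert(path != NULL);
    /* O_EXCL together with O_CREAT makes "check and create" one atomic
       step, so there is no window in which the file exists with
       default permissions. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1) return NULL;
    fp = fdopen(fd, "wb");
    if (fp == NULL) {
        int saved = errno;   /* keep the fdopen error for the caller */
        close(fd);
        errno = saved;
    }
    return fp;
}
```

Because the open fails when the file already exists, a caller that wants to overwrite has to remove the old file first, which matches the remove(outName) calls added to compress() and uncompress() in this commit.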
/*---------------------------------------------*/
|
/*---------------------------------------------*/
|
||||||
/*--
|
/*--
|
||||||
if in doubt, return True
|
if in doubt, return True
|
||||||
@ -1093,7 +1063,7 @@ Bool notAStandardFile ( Char* name )
|
|||||||
|
|
||||||
i = MY_LSTAT ( name, &statBuf );
|
i = MY_LSTAT ( name, &statBuf );
|
||||||
if (i != 0) return True;
|
if (i != 0) return True;
|
||||||
if (MY_S_IFREG(statBuf.st_mode)) return False;
|
if (MY_S_ISREG(statBuf.st_mode)) return False;
|
||||||
return True;
|
return True;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -1115,42 +1085,66 @@ Int32 countHardLinks ( Char* name )
|
|||||||
|
|
||||||
|
|
||||||
/*---------------------------------------------*/
|
/*---------------------------------------------*/
|
||||||
static
|
/* Copy modification date, access date, permissions and owner from the
|
||||||
void copyDatePermissionsAndOwner ( Char *srcName, Char *dstName )
|
source to destination file. We have to copy this meta-info off
|
||||||
{
|
into fileMetaInfo before starting to compress / decompress it,
|
||||||
|
because doing it afterwards means we get the wrong access time.
|
||||||
|
|
||||||
|
To complicate matters, in compress() and decompress() below, the
|
||||||
|
sequence of tests preceding the call to saveInputFileMetaInfo()
|
||||||
|
involves calling fileExists(), which in turn establishes its result
|
||||||
|
by attempting to fopen() the file, and if successful, immediately
|
||||||
|
fclose()ing it again. So we have to assume that the fopen() call
|
||||||
|
does not cause the access time field to be updated.
|
||||||
|
|
||||||
|
Reading of the man page for stat() (man 2 stat) on RedHat 7.2 seems
|
||||||
|
to imply that merely doing open() will not affect the access time.
|
||||||
|
Therefore we merely need to hope that the C library only does
|
||||||
|
open() as a result of fopen(), and not any kind of read()-ahead
|
||||||
|
cleverness.
|
||||||
|
|
||||||
|
It sounds pretty fragile to me. Whether this carries across
|
||||||
|
robustly to arbitrary Unix-like platforms (or even works robustly
|
||||||
|
on this one, RedHat 7.2) is unknown to me. Nevertheless ...
|
||||||
|
*/
|
||||||
#if BZ_UNIX
|
#if BZ_UNIX
|
||||||
|
static
|
||||||
|
struct MY_STAT fileMetaInfo;
|
||||||
|
#endif
|
||||||
|
|
||||||
|
static
|
||||||
|
void saveInputFileMetaInfo ( Char *srcName )
|
||||||
|
{
|
||||||
|
# if BZ_UNIX
|
||||||
|
IntNative retVal;
|
||||||
|
/* Note use of stat here, not lstat. */
|
||||||
|
retVal = MY_STAT( srcName, &fileMetaInfo );
|
||||||
|
ERROR_IF_NOT_ZERO ( retVal );
|
||||||
|
# endif
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
static
|
||||||
|
void applySavedMetaInfoToOutputFile ( Char *dstName )
|
||||||
|
{
|
||||||
|
# if BZ_UNIX
|
||||||
IntNative retVal;
|
IntNative retVal;
|
||||||
struct MY_STAT statBuf;
|
|
||||||
struct utimbuf uTimBuf;
|
struct utimbuf uTimBuf;
|
||||||
|
|
||||||
retVal = MY_LSTAT ( srcName, &statBuf );
|
uTimBuf.actime = fileMetaInfo.st_atime;
|
||||||
ERROR_IF_NOT_ZERO ( retVal );
|
uTimBuf.modtime = fileMetaInfo.st_mtime;
|
||||||
uTimBuf.actime = statBuf.st_atime;
|
|
||||||
uTimBuf.modtime = statBuf.st_mtime;
|
|
||||||
|
|
||||||
retVal = chmod ( dstName, statBuf.st_mode );
|
retVal = chmod ( dstName, fileMetaInfo.st_mode );
|
||||||
ERROR_IF_NOT_ZERO ( retVal );
|
ERROR_IF_NOT_ZERO ( retVal );
|
||||||
|
|
||||||
retVal = utime ( dstName, &uTimBuf );
|
retVal = utime ( dstName, &uTimBuf );
|
||||||
ERROR_IF_NOT_ZERO ( retVal );
|
ERROR_IF_NOT_ZERO ( retVal );
|
||||||
|
|
||||||
retVal = chown ( dstName, statBuf.st_uid, statBuf.st_gid );
|
retVal = chown ( dstName, fileMetaInfo.st_uid, fileMetaInfo.st_gid );
|
||||||
/* chown() will in many cases return with EPERM, which can
|
/* chown() will in many cases return with EPERM, which can
|
||||||
be safely ignored.
|
be safely ignored.
|
||||||
*/
|
*/
|
||||||
#endif
|
# endif
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/*---------------------------------------------*/
|
|
||||||
static
|
|
||||||
void setInterimPermissions ( Char *dstName )
|
|
||||||
{
|
|
||||||
#if BZ_UNIX
|
|
||||||
IntNative retVal;
|
|
||||||
retVal = chmod ( dstName, S_IRUSR | S_IWUSR );
|
|
||||||
ERROR_IF_NOT_ZERO ( retVal );
|
|
||||||
#endif
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
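The save-then-apply split introduced above (stat the input before it is opened, restore mode and times on the output afterwards) can be sketched on its own. This assumes a POSIX system; save_meta and apply_meta are hypothetical names, and masking st_mode with 07777 is a precaution added in this sketch, not something the commit does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

/* Metadata captured from the input file before it is read. */
static struct stat saved_meta;

/* Record mode, times and ownership of src. Returns 0 on success. */
static int save_meta(const char *src)
{
    assert(src != NULL);
    return stat(src, &saved_meta);
}

/* Apply the recorded metadata to dst. Returns 0 on success.
   A chown() failure is ignored, since it commonly fails with EPERM
   for unprivileged users. */
static int apply_meta(const char *dst)
{
    struct utimbuf times;
    assert(dst != NULL);
    times.actime  = saved_meta.st_atime;
    times.modtime = saved_meta.st_mtime;
    /* Pass only the permission bits, not the file-type bits. */
    if (chmod(dst, saved_meta.st_mode & 07777) != 0) return -1;
    if (utime(dst, &times) != 0) return -1;
    if (chown(dst, saved_meta.st_uid, saved_meta.st_gid) != 0) {
        /* deliberately ignored */
    }
    return 0;
}
```

Calling save_meta before the input is opened for reading is the point of the split: the access time it records is the one from before compression touched the file.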
@ -1158,10 +1152,19 @@ void setInterimPermissions ( Char *dstName )
|
|||||||
static
|
static
|
||||||
Bool containsDubiousChars ( Char* name )
|
Bool containsDubiousChars ( Char* name )
|
||||||
{
|
{
|
||||||
Bool cdc = False;
|
# if BZ_UNIX
|
||||||
|
/* On unix, files can contain any characters and the file expansion
|
||||||
|
* is performed by the shell.
|
||||||
|
*/
|
||||||
|
return False;
|
||||||
|
# else /* ! BZ_UNIX */
|
||||||
|
/* On non-unix (Win* platforms), wildcard characters are not allowed in
|
||||||
|
* filenames.
|
||||||
|
*/
|
||||||
for (; *name != '\0'; name++)
|
for (; *name != '\0'; name++)
|
||||||
if (*name == '?' || *name == '*') cdc = True;
|
if (*name == '?' || *name == '*') return True;
|
||||||
return cdc;
|
return False;
|
||||||
|
# endif /* BZ_UNIX */
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@ -1201,6 +1204,7 @@ void compress ( Char *name )
|
|||||||
FILE *inStr;
|
FILE *inStr;
|
||||||
FILE *outStr;
|
FILE *outStr;
|
||||||
Int32 n, i;
|
Int32 n, i;
|
||||||
|
struct MY_STAT statBuf;
|
||||||
|
|
||||||
deleteOutputOnInterrupt = False;
|
deleteOutputOnInterrupt = False;
|
||||||
|
|
||||||
@ -1246,6 +1250,16 @@ void compress ( Char *name )
|
|||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
if ( srcMode == SM_F2F || srcMode == SM_F2O ) {
|
||||||
|
MY_STAT(inName, &statBuf);
|
||||||
|
if ( MY_S_ISDIR(statBuf.st_mode) ) {
|
||||||
|
fprintf( stderr,
|
||||||
|
"%s: Input file %s is a directory.\n",
|
||||||
|
progName,inName);
|
||||||
|
setExit(1);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
}
|
||||||
if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
|
if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
|
||||||
if (noisy)
|
if (noisy)
|
||||||
fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
|
fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
|
||||||
@ -1253,11 +1267,15 @@ void compress ( Char *name )
|
|||||||
setExit(1);
|
setExit(1);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
if ( srcMode == SM_F2F && !forceOverwrite && fileExists ( outName ) ) {
|
if ( srcMode == SM_F2F && fileExists ( outName ) ) {
|
||||||
fprintf ( stderr, "%s: Output file %s already exists.\n",
|
if (forceOverwrite) {
|
||||||
progName, outName );
|
remove(outName);
|
||||||
setExit(1);
|
} else {
|
||||||
return;
|
fprintf ( stderr, "%s: Output file %s already exists.\n",
|
||||||
|
progName, outName );
|
||||||
|
setExit(1);
|
||||||
|
return;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
if ( srcMode == SM_F2F && !forceOverwrite &&
|
if ( srcMode == SM_F2F && !forceOverwrite &&
|
||||||
(n=countHardLinks ( inName )) > 0) {
|
(n=countHardLinks ( inName )) > 0) {
|
||||||
@ -1267,6 +1285,12 @@ void compress ( Char *name )
|
|||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if ( srcMode == SM_F2F ) {
|
||||||
|
/* Save the file's meta-info before we open it. Doing it later
|
||||||
|
means we mess up the access times. */
|
||||||
|
saveInputFileMetaInfo ( inName );
|
||||||
|
}
|
||||||
|
|
||||||
switch ( srcMode ) {
|
switch ( srcMode ) {
|
||||||
|
|
||||||
case SM_I2O:
|
case SM_I2O:
|
||||||
@ -1306,7 +1330,7 @@ void compress ( Char *name )
|
|||||||
|
|
||||||
case SM_F2F:
|
case SM_F2F:
|
||||||
inStr = fopen ( inName, "rb" );
|
inStr = fopen ( inName, "rb" );
|
||||||
outStr = fopen ( outName, "wb" );
|
outStr = fopen_output_safely ( outName, "wb" );
|
||||||
if ( outStr == NULL) {
|
if ( outStr == NULL) {
|
||||||
fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
|
fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
|
||||||
progName, outName, strerror(errno) );
|
progName, outName, strerror(errno) );
|
||||||
@ -1321,7 +1345,6 @@ void compress ( Char *name )
|
|||||||
setExit(1);
|
setExit(1);
|
||||||
return;
|
return;
|
||||||
};
|
};
|
||||||
setInterimPermissions ( outName );
|
|
||||||
break;
|
break;
|
||||||
|
|
||||||
default:
|
default:
|
||||||
@ -1343,7 +1366,7 @@ void compress ( Char *name )
|
|||||||
|
|
||||||
/*--- If there was an I/O error, we won't get here. ---*/
|
/*--- If there was an I/O error, we won't get here. ---*/
|
||||||
if ( srcMode == SM_F2F ) {
|
if ( srcMode == SM_F2F ) {
|
||||||
copyDatePermissionsAndOwner ( inName, outName );
|
applySavedMetaInfoToOutputFile ( outName );
|
||||||
deleteOutputOnInterrupt = False;
|
deleteOutputOnInterrupt = False;
|
||||||
if ( !keepInputFiles ) {
|
if ( !keepInputFiles ) {
|
||||||
IntNative retVal = remove ( inName );
|
IntNative retVal = remove ( inName );
|
||||||
@@ -1364,6 +1387,7 @@ void uncompress ( Char *name )
|
|||||||
Int32 n, i;
|
Int32 n, i;
|
||||||
Bool magicNumberOK;
|
Bool magicNumberOK;
|
||||||
Bool cantGuess;
|
Bool cantGuess;
|
||||||
|
struct MY_STAT statBuf;
|
||||||
|
|
||||||
deleteOutputOnInterrupt = False;
|
deleteOutputOnInterrupt = False;
|
||||||
|
|
||||||
@@ -1405,6 +1429,16 @@ void uncompress ( Char *name )
|
|||||||
setExit(1);
|
setExit(1);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
if ( srcMode == SM_F2F || srcMode == SM_F2O ) {
|
||||||
|
MY_STAT(inName, &statBuf);
|
||||||
|
if ( MY_S_ISDIR(statBuf.st_mode) ) {
|
||||||
|
fprintf( stderr,
|
||||||
|
"%s: Input file %s is a directory.\n",
|
||||||
|
progName,inName);
|
||||||
|
setExit(1);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
}
|
||||||
if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
|
if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
|
||||||
if (noisy)
|
if (noisy)
|
||||||
fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
|
fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
|
||||||
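Note on the hunk above: the new check stats the input and rejects it if it is a directory. A minimal standalone sketch of the same test using plain POSIX stat and S_ISDIR rather than the MY_STAT and MY_S_ISDIR macros. The function name is_directory is an assumption for illustration, and unlike the hunk this version also reports a failed stat call.

```c
#include <sys/stat.h>

/* Returns 1 if `path` names a directory, 0 if it names something else,
   and -1 if stat() itself fails (for example, the path does not exist). */
static int is_directory(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) != 0) return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

A caller would print the "is a directory" message and return only when the result is 1.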
@@ -1419,11 +1453,15 @@ void uncompress ( Char *name )
|
|||||||
progName, inName, outName );
|
progName, inName, outName );
|
||||||
/* just a warning, no return */
|
/* just a warning, no return */
|
||||||
}
|
}
|
||||||
if ( srcMode == SM_F2F && !forceOverwrite && fileExists ( outName ) ) {
|
if ( srcMode == SM_F2F && fileExists ( outName ) ) {
|
||||||
fprintf ( stderr, "%s: Output file %s already exists.\n",
|
if (forceOverwrite) {
|
||||||
progName, outName );
|
remove(outName);
|
||||||
setExit(1);
|
} else {
|
||||||
return;
|
fprintf ( stderr, "%s: Output file %s already exists.\n",
|
||||||
|
progName, outName );
|
||||||
|
setExit(1);
|
||||||
|
return;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
if ( srcMode == SM_F2F && !forceOverwrite &&
|
if ( srcMode == SM_F2F && !forceOverwrite &&
|
||||||
(n=countHardLinks ( inName ) ) > 0) {
|
(n=countHardLinks ( inName ) ) > 0) {
|
||||||
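Note on the hunk above: an existing output file is now removed when -f is given, and reported as an error otherwise. A minimal sketch of that decision as a standalone function. The name prepare_output_path and the use of fopen as an existence probe are assumptions for illustration, and this version checks the result of remove, which the hunk does not.

```c
#include <stdio.h>

/* Decide what to do when the output path may already exist.
   Returns 1 if the caller may go on to create the output file,
   0 if it should give up. `force` stands in for forceOverwrite. */
static int prepare_output_path(const char *out_name, int force)
{
    FILE *probe = fopen(out_name, "rb");
    if (probe == NULL) return 1;      /* nothing in the way */
    fclose(probe);

    if (!force) return 0;             /* caller reports "already exists" */
    return remove(out_name) == 0;     /* clear the way for a fresh file */
}
```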
@@ -1433,6 +1471,12 @@ void uncompress ( Char *name )
|
|||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if ( srcMode == SM_F2F ) {
|
||||||
|
/* Save the file's meta-info before we open it. Doing it later
|
||||||
|
means we mess up the access times. */
|
||||||
|
saveInputFileMetaInfo ( inName );
|
||||||
|
}
|
||||||
|
|
||||||
switch ( srcMode ) {
|
switch ( srcMode ) {
|
||||||
|
|
||||||
case SM_I2O:
|
case SM_I2O:
|
||||||
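Note on the hunk above: the input file's meta-info is now saved before the file is opened, because reading the file afterwards would disturb its access time. A minimal sketch of the save-then-apply pattern on a POSIX system. The names save_meta and apply_meta and the choice of utime, chmod and chown are assumptions for illustration, since the bodies of saveInputFileMetaInfo and applySavedMetaInfoToOutputFile are not shown here.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

static struct stat saved_meta;   /* stands in for the saved meta-info */

/* Record the input file's metadata before the file is opened for reading. */
static int save_meta(const char *in_name)
{
    return stat(in_name, &saved_meta);
}

/* Copy the recorded times, mode and (where permitted) owner to the output. */
static int apply_meta(const char *out_name)
{
    struct utimbuf times;
    times.actime  = saved_meta.st_atime;
    times.modtime = saved_meta.st_mtime;

    if (chmod(out_name, saved_meta.st_mode & 07777) != 0) return -1;
    if (utime(out_name, &times) != 0) return -1;
    if (chown(out_name, saved_meta.st_uid, saved_meta.st_gid) != 0) {
        /* Changing ownership usually needs privileges; failure is ignored. */
    }
    return 0;
}
```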
@@ -1463,7 +1507,7 @@ void uncompress ( Char *name )
|
|||||||
|
|
||||||
case SM_F2F:
|
case SM_F2F:
|
||||||
inStr = fopen ( inName, "rb" );
|
inStr = fopen ( inName, "rb" );
|
||||||
outStr = fopen ( outName, "wb" );
|
outStr = fopen_output_safely ( outName, "wb" );
|
||||||
if ( outStr == NULL) {
|
if ( outStr == NULL) {
|
||||||
fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
|
fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
|
||||||
progName, outName, strerror(errno) );
|
progName, outName, strerror(errno) );
|
||||||
@@ -1478,7 +1522,6 @@ void uncompress ( Char *name )
|
|||||||
setExit(1);
|
setExit(1);
|
||||||
return;
|
return;
|
||||||
};
|
};
|
||||||
setInterimPermissions ( outName );
|
|
||||||
break;
|
break;
|
||||||
|
|
||||||
default:
|
default:
|
||||||
@@ -1501,7 +1544,7 @@ void uncompress ( Char *name )
|
|||||||
/*--- If there was an I/O error, we won't get here. ---*/
|
/*--- If there was an I/O error, we won't get here. ---*/
|
||||||
if ( magicNumberOK ) {
|
if ( magicNumberOK ) {
|
||||||
if ( srcMode == SM_F2F ) {
|
if ( srcMode == SM_F2F ) {
|
||||||
copyDatePermissionsAndOwner ( inName, outName );
|
applySavedMetaInfoToOutputFile ( outName );
|
||||||
deleteOutputOnInterrupt = False;
|
deleteOutputOnInterrupt = False;
|
||||||
if ( !keepInputFiles ) {
|
if ( !keepInputFiles ) {
|
||||||
IntNative retVal = remove ( inName );
|
IntNative retVal = remove ( inName );
|
||||||
@@ -1539,6 +1582,7 @@ void testf ( Char *name )
|
|||||||
{
|
{
|
||||||
FILE *inStr;
|
FILE *inStr;
|
||||||
Bool allOK;
|
Bool allOK;
|
||||||
|
struct MY_STAT statBuf;
|
||||||
|
|
||||||
deleteOutputOnInterrupt = False;
|
deleteOutputOnInterrupt = False;
|
||||||
|
|
||||||
@@ -1565,6 +1609,16 @@ void testf ( Char *name )
|
|||||||
setExit(1);
|
setExit(1);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
if ( srcMode != SM_I2O ) {
|
||||||
|
MY_STAT(inName, &statBuf);
|
||||||
|
if ( MY_S_ISDIR(statBuf.st_mode) ) {
|
||||||
|
fprintf( stderr,
|
||||||
|
"%s: Input file %s is a directory.\n",
|
||||||
|
progName,inName);
|
||||||
|
setExit(1);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
switch ( srcMode ) {
|
switch ( srcMode ) {
|
||||||
|
|
||||||
@@ -1603,6 +1657,7 @@ void testf ( Char *name )
|
|||||||
}
|
}
|
||||||
|
|
||||||
/*--- Now the input handle is sane. Do the Biz. ---*/
|
/*--- Now the input handle is sane. Do the Biz. ---*/
|
||||||
|
outputHandleJustInCase = NULL;
|
||||||
allOK = testStream ( inStr );
|
allOK = testStream ( inStr );
|
||||||
|
|
||||||
if (allOK && verbosity >= 1) fprintf ( stderr, "ok\n" );
|
if (allOK && verbosity >= 1) fprintf ( stderr, "ok\n" );
|
||||||
@@ -1619,7 +1674,7 @@ void license ( void )
|
|||||||
"bzip2, a block-sorting file compressor. "
|
"bzip2, a block-sorting file compressor. "
|
||||||
"Version %s.\n"
|
"Version %s.\n"
|
||||||
" \n"
|
" \n"
|
||||||
" Copyright (C) 1996-2000 by Julian Seward.\n"
|
" Copyright (C) 1996-2002 by Julian Seward.\n"
|
||||||
" \n"
|
" \n"
|
||||||
" This program is free software; you can redistribute it and/or modify\n"
|
" This program is free software; you can redistribute it and/or modify\n"
|
||||||
" it under the terms set out in the LICENSE file, which is included\n"
|
" it under the terms set out in the LICENSE file, which is included\n"
|
||||||
@@ -1658,6 +1713,8 @@ void usage ( Char *fullProgName )
|
|||||||
" -V --version display software version & license\n"
|
" -V --version display software version & license\n"
|
||||||
" -s --small use less memory (at most 2500k)\n"
|
" -s --small use less memory (at most 2500k)\n"
|
||||||
" -1 .. -9 set block size to 100k .. 900k\n"
|
" -1 .. -9 set block size to 100k .. 900k\n"
|
||||||
|
" --fast alias for -1\n"
|
||||||
|
" --best alias for -9\n"
|
||||||
"\n"
|
"\n"
|
||||||
" If invoked as `bzip2', default action is to compress.\n"
|
" If invoked as `bzip2', default action is to compress.\n"
|
||||||
" as `bunzip2', default action is to decompress.\n"
|
" as `bunzip2', default action is to decompress.\n"
|
||||||
@@ -1666,9 +1723,9 @@ void usage ( Char *fullProgName )
|
|||||||
" If no file names are given, bzip2 compresses or decompresses\n"
|
" If no file names are given, bzip2 compresses or decompresses\n"
|
||||||
" from standard input to standard output. You can combine\n"
|
" from standard input to standard output. You can combine\n"
|
||||||
" short flags, so `-v -4' means the same as -v4 or -4v, &c.\n"
|
" short flags, so `-v -4' means the same as -v4 or -4v, &c.\n"
|
||||||
#if BZ_UNIX
|
# if BZ_UNIX
|
||||||
"\n"
|
"\n"
|
||||||
#endif
|
# endif
|
||||||
,
|
,
|
||||||
|
|
||||||
BZ2_bzlibVersion(),
|
BZ2_bzlibVersion(),
|
||||||
@@ -1818,11 +1875,11 @@ IntNative main ( IntNative argc, Char *argv[] )
|
|||||||
|
|
||||||
/*-- Set up signal handlers for mem access errors --*/
|
/*-- Set up signal handlers for mem access errors --*/
|
||||||
signal (SIGSEGV, mySIGSEGVorSIGBUScatcher);
|
signal (SIGSEGV, mySIGSEGVorSIGBUScatcher);
|
||||||
#if BZ_UNIX
|
# if BZ_UNIX
|
||||||
#ifndef __DJGPP__
|
# ifndef __DJGPP__
|
||||||
signal (SIGBUS, mySIGSEGVorSIGBUScatcher);
|
signal (SIGBUS, mySIGSEGVorSIGBUScatcher);
|
||||||
#endif
|
# endif
|
||||||
#endif
|
# endif
|
||||||
|
|
||||||
copyFileName ( inName, "(none)" );
|
copyFileName ( inName, "(none)" );
|
||||||
copyFileName ( outName, "(none)" );
|
copyFileName ( outName, "(none)" );
|
||||||
@@ -1933,6 +1990,8 @@ IntNative main ( IntNative argc, Char *argv[] )
|
|||||||
if (ISFLAG("--exponential")) workFactor = 1; else
|
if (ISFLAG("--exponential")) workFactor = 1; else
|
||||||
if (ISFLAG("--repetitive-best")) redundant(aa->name); else
|
if (ISFLAG("--repetitive-best")) redundant(aa->name); else
|
||||||
if (ISFLAG("--repetitive-fast")) redundant(aa->name); else
|
if (ISFLAG("--repetitive-fast")) redundant(aa->name); else
|
||||||
|
if (ISFLAG("--fast")) blockSize100k = 1; else
|
||||||
|
if (ISFLAG("--best")) blockSize100k = 9; else
|
||||||
if (ISFLAG("--verbose")) verbosity++; else
|
if (ISFLAG("--verbose")) verbosity++; else
|
||||||
if (ISFLAG("--help")) { usage ( progName ); exit ( 0 ); }
|
if (ISFLAG("--help")) { usage ( progName ); exit ( 0 ); }
|
||||||
else
|
else
|
||||||
|
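Note on the bzip2.c changes above: the last hunk adds --fast and --best as long-option aliases that set blockSize100k to 1 and 9. A minimal sketch of that mapping as a standalone function. The name level_from_long_flag is an assumption for illustration, and plain strcmp stands in for the ISFLAG macro.

```c
#include <string.h>

/* Map a long option to a block-size level (in units of 100k).
   Returns the new level, or `current` if the flag is not a level alias. */
static int level_from_long_flag(const char *flag, int current)
{
    if (strcmp(flag, "--fast") == 0) return 1;   /* alias for -1 */
    if (strcmp(flag, "--best") == 0) return 9;   /* alias for -9 */
    return current;
}
```

For example, level_from_long_flag("--fast", 9) yields 1, while an unrelated flag such as "--verbose" leaves the level unchanged.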
132
bzip2.txt
@@ -1,7 +1,6 @@
|
|||||||
|
|
||||||
|
|
||||||
NAME
|
NAME
|
||||||
bzip2, bunzip2 - a block-sorting file compressor, v1.0
|
bzip2, bunzip2 - a block-sorting file compressor, v1.0.2
|
||||||
bzcat - decompresses files to stdout
|
bzcat - decompresses files to stdout
|
||||||
bzip2recover - recovers data from damaged bzip2 files
|
bzip2recover - recovers data from damaged bzip2 files
|
||||||
|
|
||||||
@@ -18,20 +17,20 @@ DESCRIPTION
|
|||||||
sorting text compression algorithm, and Huffman coding.
|
sorting text compression algorithm, and Huffman coding.
|
||||||
Compression is generally considerably better than that
|
Compression is generally considerably better than that
|
||||||
achieved by more conventional LZ77/LZ78-based compressors,
|
achieved by more conventional LZ77/LZ78-based compressors,
|
||||||
and approaches the performance of the PPM family of sta-
|
and approaches the performance of the PPM family of sta
|
||||||
tistical compressors.
|
tistical compressors.
|
||||||
|
|
||||||
The command-line options are deliberately very similar to
|
The command-line options are deliberately very similar to
|
||||||
those of GNU gzip, but they are not identical.
|
those of GNU gzip, but they are not identical.
|
||||||
|
|
||||||
bzip2 expects a list of file names to accompany the com-
|
bzip2 expects a list of file names to accompany the com
|
||||||
mand-line flags. Each file is replaced by a compressed
|
mand-line flags. Each file is replaced by a compressed
|
||||||
version of itself, with the name "original_name.bz2".
|
version of itself, with the name "original_name.bz2".
|
||||||
Each compressed file has the same modification date, per-
|
Each compressed file has the same modification date, per
|
||||||
missions, and, when possible, ownership as the correspond-
|
missions, and, when possible, ownership as the correspond
|
||||||
ing original, so that these properties can be correctly
|
ing original, so that these properties can be correctly
|
||||||
restored at decompression time. File name handling is
|
restored at decompression time. File name handling is
|
||||||
naive in the sense that there is no mechanism for preserv-
|
naive in the sense that there is no mechanism for preserv
|
||||||
ing original file names, permissions, ownerships or dates
|
ing original file names, permissions, ownerships or dates
|
||||||
in filesystems which lack these concepts, or have serious
|
in filesystems which lack these concepts, or have serious
|
||||||
file name length restrictions, such as MS-DOS.
|
file name length restrictions, such as MS-DOS.
|
||||||
@@ -62,23 +61,23 @@ DESCRIPTION
|
|||||||
guess the name of the original file, and uses the original
|
guess the name of the original file, and uses the original
|
||||||
name with .out appended.
|
name with .out appended.
|
||||||
|
|
||||||
As with compression, supplying no filenames causes decom-
|
As with compression, supplying no filenames causes decom
|
||||||
pression from standard input to standard output.
|
pression from standard input to standard output.
|
||||||
|
|
||||||
bunzip2 will correctly decompress a file which is the con-
|
bunzip2 will correctly decompress a file which is the con
|
||||||
catenation of two or more compressed files. The result is
|
catenation of two or more compressed files. The result is
|
||||||
the concatenation of the corresponding uncompressed files.
|
the concatenation of the corresponding uncompressed files.
|
||||||
Integrity testing (-t) of concatenated compressed files is
|
Integrity testing (-t) of concatenated compressed files is
|
||||||
also supported.
|
also supported.
|
||||||
|
|
||||||
You can also compress or decompress files to the standard
|
You can also compress or decompress files to the standard
|
||||||
output by giving the -c flag. Multiple files may be com-
|
output by giving the -c flag. Multiple files may be com
|
||||||
pressed and decompressed like this. The resulting outputs
|
pressed and decompressed like this. The resulting outputs
|
||||||
are fed sequentially to stdout. Compression of multiple
|
are fed sequentially to stdout. Compression of multiple
|
||||||
files in this manner generates a stream containing multi-
|
files in this manner generates a stream containing multi
|
||||||
ple compressed file representations. Such a stream can be
|
ple compressed file representations. Such a stream can be
|
||||||
decompressed correctly only by bzip2 version 0.9.0 or
|
decompressed correctly only by bzip2 version 0.9.0 or
|
||||||
later. Earlier versions of bzip2 will stop after decom-
|
later. Earlier versions of bzip2 will stop after decom
|
||||||
pressing the first file in the stream.
|
pressing the first file in the stream.
|
||||||
|
|
||||||
bzcat (or bzip2 -dc) decompresses all specified files to
|
bzcat (or bzip2 -dc) decompresses all specified files to
|
||||||
@@ -99,7 +98,7 @@ DESCRIPTION
|
|||||||
|
|
||||||
As a self-check for your protection, bzip2 uses 32-bit
|
As a self-check for your protection, bzip2 uses 32-bit
|
||||||
CRCs to make sure that the decompressed version of a file
|
CRCs to make sure that the decompressed version of a file
|
||||||
is identical to the original. This guards against corrup-
|
is identical to the original. This guards against corrup
|
||||||
tion of the compressed data, and against undetected bugs
|
tion of the compressed data, and against undetected bugs
|
||||||
in bzip2 (hopefully very unlikely). The chances of data
|
in bzip2 (hopefully very unlikely). The chances of data
|
||||||
corruption going undetected is microscopic, about one
|
corruption going undetected is microscopic, about one
|
||||||
@@ -127,8 +126,8 @@ OPTIONS
|
|||||||
and forces bzip2 to decompress.
|
and forces bzip2 to decompress.
|
||||||
|
|
||||||
-z --compress
|
-z --compress
|
||||||
The complement to -d: forces compression, regard-
|
The complement to -d: forces compression,
|
||||||
less of the invokation name.
|
regardless of the invocation name.
|
||||||
|
|
||||||
-t --test
|
-t --test
|
||||||
Check integrity of the specified file(s), but don't
|
Check integrity of the specified file(s), but don't
|
||||||
@@ -141,6 +140,11 @@ OPTIONS
|
|||||||
forces bzip2 to break hard links to files, which it
|
forces bzip2 to break hard links to files, which it
|
||||||
otherwise wouldn't do.
|
otherwise wouldn't do.
|
||||||
|
|
||||||
|
bzip2 normally declines to decompress files which
|
||||||
|
don't have the correct magic header bytes. If
|
||||||
|
forced (-f), however, it will pass such files
|
||||||
|
through unmodified. This is how GNU gzip behaves.
|
||||||
|
|
||||||
-k --keep
|
-k --keep
|
||||||
Keep (don't delete) input files during compression
|
Keep (don't delete) input files during compression
|
||||||
or decompression.
|
or decompression.
|
||||||
@@ -167,7 +171,7 @@ OPTIONS
|
|||||||
|
|
||||||
-v --verbose
|
-v --verbose
|
||||||
Verbose mode -- show the compression ratio for each
|
Verbose mode -- show the compression ratio for each
|
||||||
file processed. Further -v's increase the ver-
|
file processed. Further -v's increase the ver
|
||||||
bosity level, spewing out lots of information which
|
bosity level, spewing out lots of information which
|
||||||
is primarily of interest for diagnostic purposes.
|
is primarily of interest for diagnostic purposes.
|
||||||
|
|
||||||
@@ -175,20 +179,24 @@ OPTIONS
|
|||||||
Display the software version, license terms and
|
Display the software version, license terms and
|
||||||
conditions.
|
conditions.
|
||||||
|
|
||||||
-1 to -9
|
-1 (or --fast) to -9 (or --best)
|
||||||
Set the block size to 100 k, 200 k .. 900 k when
|
Set the block size to 100 k, 200 k .. 900 k when
|
||||||
compressing. Has no effect when decompressing.
|
compressing. Has no effect when decompressing.
|
||||||
See MEMORY MANAGEMENT below.
|
See MEMORY MANAGEMENT below. The --fast and --best
|
||||||
|
aliases are primarily for GNU gzip compatibility.
|
||||||
|
In particular, --fast doesn't make things signifi
|
||||||
|
cantly faster. And --best merely selects the
|
||||||
|
default behaviour.
|
||||||
|
|
||||||
-- Treats all subsequent arguments as file names, even
|
-- Treats all subsequent arguments as file names, even
|
||||||
if they start with a dash. This is so you can han-
|
if they start with a dash. This is so you can han
|
||||||
dle files with names beginning with a dash, for
|
dle files with names beginning with a dash, for
|
||||||
example: bzip2 -- -myfilename.
|
example: bzip2 -- -myfilename.
|
||||||
|
|
||||||
--repetitive-fast --repetitive-best
|
--repetitive-fast --repetitive-best
|
||||||
These flags are redundant in versions 0.9.5 and
|
These flags are redundant in versions 0.9.5 and
|
||||||
above. They provided some coarse control over the
|
above. They provided some coarse control over the
|
||||||
behaviour of the sorting algorithm in earlier ver-
|
behaviour of the sorting algorithm in earlier ver
|
||||||
sions, which was sometimes useful. 0.9.5 and above
|
sions, which was sometimes useful. 0.9.5 and above
|
||||||
have an improved algorithm which renders these
|
have an improved algorithm which renders these
|
||||||
flags irrelevant.
|
flags irrelevant.
|
||||||
@@ -199,7 +207,7 @@ MEMORY MANAGEMENT
|
|||||||
affects both the compression ratio achieved, and the
|
affects both the compression ratio achieved, and the
|
||||||
amount of memory needed for compression and decompression.
|
amount of memory needed for compression and decompression.
|
||||||
The flags -1 through -9 specify the block size to be
|
The flags -1 through -9 specify the block size to be
|
||||||
100,000 bytes through 900,000 bytes (the default) respec-
|
100,000 bytes through 900,000 bytes (the default) respec
|
||||||
tively. At decompression time, the block size used for
|
tively. At decompression time, the block size used for
|
||||||
compression is read from the header of the compressed
|
compression is read from the header of the compressed
|
||||||
file, and bunzip2 then allocates itself just enough memory
|
file, and bunzip2 then allocates itself just enough memory
|
||||||
@@ -227,13 +235,13 @@ MEMORY MANAGEMENT
|
|||||||
bunzip2 will require about 3700 kbytes to decompress. To
|
bunzip2 will require about 3700 kbytes to decompress. To
|
||||||
support decompression of any file on a 4 megabyte machine,
|
support decompression of any file on a 4 megabyte machine,
|
||||||
bunzip2 has an option to decompress using approximately
|
bunzip2 has an option to decompress using approximately
|
||||||
half this amount of memory, about 2300 kbytes. Decompres-
|
half this amount of memory, about 2300 kbytes. Decompres
|
||||||
sion speed is also halved, so you should use this option
|
sion speed is also halved, so you should use this option
|
||||||
only where necessary. The relevant flag is -s.
|
only where necessary. The relevant flag is -s.
|
||||||
|
|
||||||
In general, try and use the largest block size memory con-
|
In general, try and use the largest block size memory con
|
||||||
straints allow, since that maximises the compression
|
straints allow, since that maximises the compression
|
||||||
achieved. Compression and decompression speed are virtu-
|
achieved. Compression and decompression speed are virtu
|
||||||
ally unaffected by block size.
|
ally unaffected by block size.
|
||||||
|
|
||||||
Another significant point applies to files which fit in a
|
Another significant point applies to files which fit in a
|
||||||
@@ -249,11 +257,11 @@ MEMORY MANAGEMENT
|
|||||||
|
|
||||||
Here is a table which summarises the maximum memory usage
|
Here is a table which summarises the maximum memory usage
|
||||||
for different block sizes. Also recorded is the total
|
for different block sizes. Also recorded is the total
|
||||||
compressed size for 14 files of the Calgary Text Compres-
|
compressed size for 14 files of the Calgary Text Compres
|
||||||
sion Corpus totalling 3,141,622 bytes. This column gives
|
sion Corpus totalling 3,141,622 bytes. This column gives
|
||||||
some feel for how compression varies with block size.
|
some feel for how compression varies with block size.
|
||||||
These figures tend to understate the advantage of larger
|
These figures tend to understate the advantage of larger
|
||||||
block sizes for larger files, since the Corpus is domi-
|
block sizes for larger files, since the Corpus is domi
|
||||||
nated by smaller files.
|
nated by smaller files.
|
||||||
|
|
||||||
Compress Decompress Decompress Corpus
|
Compress Decompress Decompress Corpus
|
||||||
@@ -272,7 +280,7 @@ MEMORY MANAGEMENT
|
|||||||
|
|
||||||
RECOVERING DATA FROM DAMAGED FILES
|
RECOVERING DATA FROM DAMAGED FILES
|
||||||
bzip2 compresses files in blocks, usually 900kbytes long.
|
bzip2 compresses files in blocks, usually 900kbytes long.
|
||||||
Each block is handled independently. If a media or trans-
|
Each block is handled independently. If a media or trans
|
||||||
mission error causes a multi-block .bz2 file to become
|
mission error causes a multi-block .bz2 file to become
|
||||||
damaged, it may be possible to recover data from the
|
damaged, it may be possible to recover data from the
|
||||||
undamaged blocks in the file.
|
undamaged blocks in the file.
|
||||||
@@ -289,19 +297,19 @@ RECOVERING DATA FROM DAMAGED FILES
|
|||||||
the integrity of the resulting files, and decompress those
|
the integrity of the resulting files, and decompress those
|
||||||
which are undamaged.
|
which are undamaged.
|
||||||
|
|
||||||
bzip2recover takes a single argument, the name of the dam-
|
bzip2recover takes a single argument, the name of the dam
|
||||||
aged file, and writes a number of files "rec0001file.bz2",
|
aged file, and writes a number of files
|
||||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
"rec00001file.bz2", "rec00002file.bz2", etc, containing
|
||||||
The output filenames are designed so that the use of
|
the extracted blocks. The output filenames are
|
||||||
wildcards in subsequent processing -- for example, "bzip2
|
designed so that the use of wildcards in subsequent pro
|
||||||
-dc rec*file.bz2 > recovered_data" -- lists the files in
|
cessing -- for example, "bzip2 -dc rec*file.bz2 > recov
|
||||||
the correct order.
|
ered_data" -- processes the files in the correct order.
|
||||||
|
|
||||||
bzip2recover should be of most use dealing with large .bz2
|
bzip2recover should be of most use dealing with large .bz2
|
||||||
files, as these will contain many blocks. It is clearly
|
files, as these will contain many blocks. It is clearly
|
||||||
futile to use it on damaged single-block files, since a
|
futile to use it on damaged single-block files, since a
|
||||||
damaged block cannot be recovered. If you wish to min-
|
damaged block cannot be recovered. If you wish to min
|
||||||
imise any potential data loss through media or transmis-
|
imise any potential data loss through media or transmis
|
||||||
sion errors, you might consider compressing with a smaller
|
sion errors, you might consider compressing with a smaller
|
||||||
block size.
|
block size.
|
||||||
|
|
||||||
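Note on the hunk above: the revised text names the recovered files rec00001file.bz2, rec00002file.bz2 and so on, and says a wildcard such as rec*file.bz2 then processes them in the correct order. The reason is that zero-padded counters of equal width sort the same way as text and as numbers. A minimal sketch, assuming a five-digit counter as in the quoted names. The function recovered_name is hypothetical and is not taken from bzip2recover.c.

```c
#include <stdio.h>

/* Build the name of the n-th recovered block file. The five-digit,
   zero-padded counter is an assumption based on the names quoted in
   the man page text; `base` is the name of the damaged input file. */
static void recovered_name(char *buf, size_t len, int n, const char *base)
{
    snprintf(buf, len, "rec%05d%s", n, base);
}
```

Without the padding, a name built from block 10 would sort before the one built from block 2, and decompressing the wildcard expansion would concatenate the blocks in the wrong order.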
@@ -315,19 +323,19 @@ PERFORMANCE NOTES
|
|||||||
better than previous versions in this respect. The ratio
|
better than previous versions in this respect. The ratio
|
||||||
between worst-case and average-case compression time is in
|
between worst-case and average-case compression time is in
|
||||||
the region of 10:1. For previous versions, this figure
|
the region of 10:1. For previous versions, this figure
|
||||||
was more like 100:1. You can use the -vvvv option to mon-
|
was more like 100:1. You can use the -vvvv option to mon
|
||||||
itor progress in great detail, if you want.
|
itor progress in great detail, if you want.
|
||||||
|
|
||||||
Decompression speed is unaffected by these phenomena.
|
Decompression speed is unaffected by these phenomena.
|
||||||
|
|
||||||
bzip2 usually allocates several megabytes of memory to
|
bzip2 usually allocates several megabytes of memory to
|
||||||
operate in, and then charges all over it in a fairly ran-
|
operate in, and then charges all over it in a fairly ran
|
||||||
dom fashion. This means that performance, both for com-
|
dom fashion. This means that performance, both for com
|
||||||
pressing and decompressing, is largely determined by the
|
pressing and decompressing, is largely determined by the
|
||||||
speed at which your machine can service cache misses.
|
speed at which your machine can service cache misses.
|
||||||
Because of this, small changes to the code to reduce the
|
Because of this, small changes to the code to reduce the
|
||||||
miss rate have been observed to give disproportionately
|
miss rate have been observed to give disproportionately
|
||||||
large performance improvements. I imagine bzip2 will per-
|
large performance improvements. I imagine bzip2 will per
|
||||||
form best on machines with very large caches.
|
form best on machines with very large caches.
|
||||||
|
|
||||||
|
|
||||||
@@ -337,40 +345,46 @@ CAVEATS
|
|||||||
but the details of what the problem is sometimes seem
|
but the details of what the problem is sometimes seem
|
||||||
rather misleading.
|
rather misleading.
|
||||||
|
|
||||||
This manual page pertains to version 1.0 of bzip2. Com-
|
This manual page pertains to version 1.0.2 of bzip2. Com
|
||||||
pressed data created by this version is entirely forwards
|
pressed data created by this version is entirely forwards
|
||||||
and backwards compatible with the previous public
|
and backwards compatible with the previous public
|
||||||
releases, versions 0.1pl2, 0.9.0 and 0.9.5, but with the
|
releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
|
||||||
following exception: 0.9.0 and above can correctly decom-
|
but with the following exception: 0.9.0 and above can cor
|
||||||
press multiple concatenated compressed files. 0.1pl2 can-
|
rectly decompress multiple concatenated compressed files.
|
||||||
not do this; it will stop after decompressing just the
|
0.1pl2 cannot do this; it will stop after decompressing
|
||||||
first file in the stream.
|
just the first file in the stream.
|
||||||
|
|
||||||
bzip2recover uses 32-bit integers to represent bit posi-
|
bzip2recover versions prior to this one, 1.0.2, used
|
||||||
tions in compressed files, so it cannot handle compressed
|
32-bit integers to represent bit positions in compressed
|
||||||
files more than 512 megabytes long. This could easily be
|
files, so it could not handle compressed files more than
|
||||||
fixed.
|
512 megabytes long. Version 1.0.2 and above uses 64-bit
|
||||||
|
ints on some platforms which support them (GNU supported
|
||||||
|
targets, and Windows). To establish whether or not
|
||||||
|
bzip2recover was built with such a limitation, run it
|
||||||
|
without arguments. In any event you can build yourself an
|
||||||
|
unlimited version if you can recompile it with MaybeUInt64
|
||||||
|
set to be an unsigned 64-bit integer.
|
||||||
|
|
||||||
|
|
||||||
AUTHOR
|
AUTHOR
|
||||||
Julian Seward, jseward@acm.org.
|
Julian Seward, jseward@acm.org.
|
||||||
|
|
||||||
http://sourceware.cygnus.com/bzip2
|
http://sources.redhat.com/bzip2
|
||||||
http://www.muraroa.demon.co.uk
|
|
||||||
|
|
||||||
The ideas embodied in bzip2 are due to (at least) the fol-
|
The ideas embodied in bzip2 are due to (at least) the fol
|
||||||
lowing people: Michael Burrows and David Wheeler (for the
|
lowing people: Michael Burrows and David Wheeler (for the
|
||||||
block sorting transformation), David Wheeler (again, for
|
block sorting transformation), David Wheeler (again, for
|
||||||
the Huffman coder), Peter Fenwick (for the structured cod-
|
the Huffman coder), Peter Fenwick (for the structured cod
|
||||||
ing model in the original bzip, and many refinements), and
|
ing model in the original bzip, and many refinements), and
|
||||||
Alistair Moffat, Radford Neal and Ian Witten (for the
|
Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||||
arithmetic coder in the original bzip). I am much
|
arithmetic coder in the original bzip). I am much
|
||||||
indebted for their help, support and advice. See the man-
|
indebted for their help, support and advice. See the man
|
||||||
ual in the source distribution for pointers to sources of
|
ual in the source distribution for pointers to sources of
|
||||||
documentation. Christian von Roques encouraged me to look
|
documentation. Christian von Roques encouraged me to look
|
||||||
for faster sorting algorithms, so as to speed up compres-
|
for faster sorting algorithms, so as to speed up compres
|
||||||
sion. Bela Lubkin encouraged me to improve the worst-case
|
sion. Bela Lubkin encouraged me to improve the worst-case
|
||||||
compression performance. Many people sent patches, helped
|
compression performance. The bz* scripts are derived from
|
||||||
with portability problems, lent machines, gave advice and
|
those of GNU gzip. Many people sent patches, helped with
|
||||||
were generally helpful.
|
portability problems, lent machines, gave advice and were
|
||||||
|
generally helpful.
|
||||||
|
|
||||||
|
161
bzip2recover.c
@@ -9,7 +9,7 @@
|
|||||||
salvage from damaged files created by the accompanying
|
salvage from damaged files created by the accompanying
|
||||||
bzip2-1.0 program.
|
bzip2-1.0 program.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@@ -57,6 +57,29 @@
|
|||||||
#include <stdlib.h>
|
#include <stdlib.h>
|
||||||
#include <string.h>
|
#include <string.h>
|
||||||
|
|
||||||
|
|
||||||
|
/* This program records bit locations in the file to be recovered.
|
||||||
|
That means that if 64-bit ints are not supported, we will not
|
||||||
|
be able to recover .bz2 files over 512MB (2^32 bits) long.
|
||||||
|
On GNU supported platforms, we take advantage of the 64-bit
|
||||||
|
int support to circumvent this problem. Ditto MSVC.
|
||||||
|
|
||||||
|
This change occurred in version 1.0.2; all prior versions have
|
||||||
|
the 512MB limitation.
|
||||||
|
*/
|
||||||
|
#ifdef __GNUC__
|
||||||
|
typedef unsigned long long int MaybeUInt64;
|
||||||
|
# define MaybeUInt64_FMT "%Lu"
|
||||||
|
#else
|
||||||
|
#ifdef _MSC_VER
|
||||||
|
typedef unsigned __int64 MaybeUInt64;
|
||||||
|
# define MaybeUInt64_FMT "%I64u"
|
||||||
|
#else
|
||||||
|
typedef unsigned int MaybeUInt64;
|
||||||
|
# define MaybeUInt64_FMT "%u"
|
||||||
|
#endif
|
||||||
|
#endif
|
||||||
|
|
||||||
typedef unsigned int UInt32;
|
typedef unsigned int UInt32;
|
||||||
typedef int Int32;
|
typedef int Int32;
|
||||||
typedef unsigned char UChar;
|
typedef unsigned char UChar;
|
||||||
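The 512MB figure in the comment above follows from simple arithmetic: a 32-bit counter can address 2^32 bit positions, which is 2^29 bytes. The sketch below is not part of the commit. The helper name `max_recoverable_bytes` is hypothetical, and it uses C99 `<stdint.h>` types rather than the `MaybeUInt64` typedef, purely to illustrate the calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in bzip2recover.c): the largest file size,
   in bytes, whose bit positions all fit in a counter that is
   `counter_bytes` bytes wide.  Only 4- and 8-byte counters are
   handled, mirroring the sizeof(MaybeUInt64) cases in the diff. */
uint64_t max_recoverable_bytes(unsigned counter_bytes)
{
    assert(counter_bytes == 4 || counter_bytes == 8);
    if (counter_bytes == 4) {
        /* 2^32 addressable bits / 8 bits per byte = 2^29 bytes. */
        return (UINT64_C(1) << 32) / 8;
    }
    /* 2^64 bits / 8 = 2^61 bytes; in practice no restriction. */
    return UINT64_C(1) << 61;
}
```

With a 4-byte counter this returns 536870912, i.e. 512 * 1024 * 1024 bytes, which matches the limit stated in the comment.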
@ -66,13 +89,25 @@ typedef unsigned char Bool;
|
|||||||
#define False ((Bool)0)
|
#define False ((Bool)0)
|
||||||
|
|
||||||
|
|
||||||
Char inFileName[2000];
|
#define BZ_MAX_FILENAME 2000
|
||||||
Char outFileName[2000];
|
|
||||||
Char progName[2000];
|
|
||||||
|
|
||||||
UInt32 bytesOut = 0;
|
Char inFileName[BZ_MAX_FILENAME];
|
||||||
UInt32 bytesIn = 0;
|
Char outFileName[BZ_MAX_FILENAME];
|
||||||
|
Char progName[BZ_MAX_FILENAME];
|
||||||
|
|
||||||
|
MaybeUInt64 bytesOut = 0;
|
||||||
|
MaybeUInt64 bytesIn = 0;
|
||||||
|
|
||||||
|
|
||||||
|
/*---------------------------------------------------*/
|
||||||
|
/*--- Header bytes ---*/
|
||||||
|
/*---------------------------------------------------*/
|
||||||
|
|
||||||
|
#define BZ_HDR_B 0x42 /* 'B' */
|
||||||
|
#define BZ_HDR_Z 0x5a /* 'Z' */
|
||||||
|
#define BZ_HDR_h 0x68 /* 'h' */
|
||||||
|
#define BZ_HDR_0 0x30 /* '0' */
|
||||||
|
|
||||||
|
|
||||||
/*---------------------------------------------------*/
|
/*---------------------------------------------------*/
|
||||||
/*--- I/O errors ---*/
|
/*--- I/O errors ---*/
|
||||||
@ -116,6 +151,23 @@ void mallocFail ( Int32 n )
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/*---------------------------------------------*/
|
||||||
|
void tooManyBlocks ( Int32 max_handled_blocks )
|
||||||
|
{
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: `%s' appears to contain more than %d blocks\n",
|
||||||
|
progName, inFileName, max_handled_blocks );
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: and cannot be handled. To fix, increase\n",
|
||||||
|
progName );
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: BZ_MAX_HANDLED_BLOCKS in bzip2recover.c, and recompile.\n",
|
||||||
|
progName );
|
||||||
|
exit ( 1 );
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
/*---------------------------------------------------*/
|
/*---------------------------------------------------*/
|
||||||
/*--- Bit stream I/O ---*/
|
/*--- Bit stream I/O ---*/
|
||||||
/*---------------------------------------------------*/
|
/*---------------------------------------------------*/
|
||||||
@ -254,27 +306,37 @@ Bool endsInBz2 ( Char* name )
|
|||||||
/*--- ---*/
|
/*--- ---*/
|
||||||
/*---------------------------------------------------*/
|
/*---------------------------------------------------*/
|
||||||
|
|
||||||
|
/* This logic isn't really right when it comes to Cygwin. */
|
||||||
|
#ifdef _WIN32
|
||||||
|
# define BZ_SPLIT_SYM '\\' /* path splitter on Windows platform */
|
||||||
|
#else
|
||||||
|
# define BZ_SPLIT_SYM '/' /* path splitter on Unix platform */
|
||||||
|
#endif
|
||||||
|
|
||||||
#define BLOCK_HEADER_HI 0x00003141UL
|
#define BLOCK_HEADER_HI 0x00003141UL
|
||||||
#define BLOCK_HEADER_LO 0x59265359UL
|
#define BLOCK_HEADER_LO 0x59265359UL
|
||||||
|
|
||||||
#define BLOCK_ENDMARK_HI 0x00001772UL
|
#define BLOCK_ENDMARK_HI 0x00001772UL
|
||||||
#define BLOCK_ENDMARK_LO 0x45385090UL
|
#define BLOCK_ENDMARK_LO 0x45385090UL
|
||||||
|
|
||||||
|
/* Increase if necessary. However, a .bz2 file with > 50000 blocks
|
||||||
|
would have an uncompressed size of at least 40GB, so the chances
|
||||||
|
are low you'll need to up this.
|
||||||
|
*/
|
||||||
|
#define BZ_MAX_HANDLED_BLOCKS 50000
|
||||||
|
|
||||||
UInt32 bStart[20000];
|
MaybeUInt64 bStart [BZ_MAX_HANDLED_BLOCKS];
|
||||||
UInt32 bEnd[20000];
|
MaybeUInt64 bEnd [BZ_MAX_HANDLED_BLOCKS];
|
||||||
UInt32 rbStart[20000];
|
MaybeUInt64 rbStart[BZ_MAX_HANDLED_BLOCKS];
|
||||||
UInt32 rbEnd[20000];
|
MaybeUInt64 rbEnd [BZ_MAX_HANDLED_BLOCKS];
|
||||||
|
|
||||||
Int32 main ( Int32 argc, Char** argv )
|
Int32 main ( Int32 argc, Char** argv )
|
||||||
{
|
{
|
||||||
FILE* inFile;
|
FILE* inFile;
|
||||||
FILE* outFile;
|
FILE* outFile;
|
||||||
BitStream* bsIn, *bsWr;
|
BitStream* bsIn, *bsWr;
|
||||||
Int32 currBlock, b, wrBlock;
|
Int32 b, wrBlock, currBlock, rbCtr;
|
||||||
UInt32 bitsRead;
|
MaybeUInt64 bitsRead;
|
||||||
Int32 rbCtr;
|
|
||||||
|
|
||||||
|
|
||||||
UInt32 buffHi, buffLo, blockCRC;
|
UInt32 buffHi, buffLo, blockCRC;
|
||||||
Char* p;
|
Char* p;
|
||||||
@ -282,11 +344,37 @@ Int32 main ( Int32 argc, Char** argv )
|
|||||||
strcpy ( progName, argv[0] );
|
strcpy ( progName, argv[0] );
|
||||||
inFileName[0] = outFileName[0] = 0;
|
inFileName[0] = outFileName[0] = 0;
|
||||||
|
|
||||||
fprintf ( stderr, "bzip2recover 1.0: extracts blocks from damaged .bz2 files.\n" );
|
fprintf ( stderr,
|
||||||
|
"bzip2recover 1.0.2: extracts blocks from damaged .bz2 files.\n" );
|
||||||
|
|
||||||
if (argc != 2) {
|
if (argc != 2) {
|
||||||
fprintf ( stderr, "%s: usage is `%s damaged_file_name'.\n",
|
fprintf ( stderr, "%s: usage is `%s damaged_file_name'.\n",
|
||||||
progName, progName );
|
progName, progName );
|
||||||
|
switch (sizeof(MaybeUInt64)) {
|
||||||
|
case 8:
|
||||||
|
fprintf(stderr,
|
||||||
|
"\trestrictions on size of recovered file: None\n");
|
||||||
|
break;
|
||||||
|
case 4:
|
||||||
|
fprintf(stderr,
|
||||||
|
"\trestrictions on size of recovered file: 512 MB\n");
|
||||||
|
fprintf(stderr,
|
||||||
|
"\tto circumvent, recompile with MaybeUInt64 as an\n"
|
||||||
|
"\tunsigned 64-bit int.\n");
|
||||||
|
break;
|
||||||
|
default:
|
||||||
|
fprintf(stderr,
|
||||||
|
"\tsizeof(MaybeUInt64) is not 4 or 8 -- "
|
||||||
|
"configuration error.\n");
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (strlen(argv[1]) >= BZ_MAX_FILENAME-20) {
|
||||||
|
fprintf ( stderr,
|
||||||
|
"%s: supplied filename is suspiciously (>= %d chars) long. Bye!\n",
|
||||||
|
progName, (int)strlen(argv[1]) );
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -316,7 +404,8 @@ Int32 main ( Int32 argc, Char** argv )
|
|||||||
(bitsRead - bStart[currBlock]) >= 40) {
|
(bitsRead - bStart[currBlock]) >= 40) {
|
||||||
bEnd[currBlock] = bitsRead-1;
|
bEnd[currBlock] = bitsRead-1;
|
||||||
if (currBlock > 0)
|
if (currBlock > 0)
|
||||||
fprintf ( stderr, " block %d runs from %d to %d (incomplete)\n",
|
fprintf ( stderr, " block %d runs from " MaybeUInt64_FMT
|
||||||
|
" to " MaybeUInt64_FMT " (incomplete)\n",
|
||||||
currBlock, bStart[currBlock], bEnd[currBlock] );
|
currBlock, bStart[currBlock], bEnd[currBlock] );
|
||||||
} else
|
} else
|
||||||
currBlock--;
|
currBlock--;
|
||||||
@ -330,17 +419,22 @@ Int32 main ( Int32 argc, Char** argv )
|
|||||||
( (buffHi & 0x0000ffff) == BLOCK_ENDMARK_HI
|
( (buffHi & 0x0000ffff) == BLOCK_ENDMARK_HI
|
||||||
&& buffLo == BLOCK_ENDMARK_LO)
|
&& buffLo == BLOCK_ENDMARK_LO)
|
||||||
) {
|
) {
|
||||||
if (bitsRead > 49)
|
if (bitsRead > 49) {
|
||||||
bEnd[currBlock] = bitsRead-49; else
|
bEnd[currBlock] = bitsRead-49;
|
||||||
|
} else {
|
||||||
bEnd[currBlock] = 0;
|
bEnd[currBlock] = 0;
|
||||||
|
}
|
||||||
if (currBlock > 0 &&
|
if (currBlock > 0 &&
|
||||||
(bEnd[currBlock] - bStart[currBlock]) >= 130) {
|
(bEnd[currBlock] - bStart[currBlock]) >= 130) {
|
||||||
fprintf ( stderr, " block %d runs from %d to %d\n",
|
fprintf ( stderr, " block %d runs from " MaybeUInt64_FMT
|
||||||
|
" to " MaybeUInt64_FMT "\n",
|
||||||
rbCtr+1, bStart[currBlock], bEnd[currBlock] );
|
rbCtr+1, bStart[currBlock], bEnd[currBlock] );
|
||||||
rbStart[rbCtr] = bStart[currBlock];
|
rbStart[rbCtr] = bStart[currBlock];
|
||||||
rbEnd[rbCtr] = bEnd[currBlock];
|
rbEnd[rbCtr] = bEnd[currBlock];
|
||||||
rbCtr++;
|
rbCtr++;
|
||||||
}
|
}
|
||||||
|
if (currBlock >= BZ_MAX_HANDLED_BLOCKS)
|
||||||
|
tooManyBlocks(BZ_MAX_HANDLED_BLOCKS);
|
||||||
currBlock++;
|
currBlock++;
|
||||||
|
|
||||||
bStart[currBlock] = bitsRead;
|
bStart[currBlock] = bitsRead;
|
||||||
@ -400,10 +494,25 @@ Int32 main ( Int32 argc, Char** argv )
|
|||||||
wrBlock++;
|
wrBlock++;
|
||||||
} else
|
} else
|
||||||
if (bitsRead == rbStart[wrBlock]) {
|
if (bitsRead == rbStart[wrBlock]) {
|
||||||
outFileName[0] = 0;
|
/* Create the output file name, correctly handling leading paths.
|
||||||
sprintf ( outFileName, "rec%4d", wrBlock+1 );
|
(31.10.2001 by Sergey E. Kusikov) */
|
||||||
for (p = outFileName; *p != 0; p++) if (*p == ' ') *p = '0';
|
Char* split;
|
||||||
strcat ( outFileName, inFileName );
|
Int32 ofs, k;
|
||||||
|
for (k = 0; k < BZ_MAX_FILENAME; k++)
|
||||||
|
outFileName[k] = 0;
|
||||||
|
strcpy (outFileName, inFileName);
|
||||||
|
split = strrchr (outFileName, BZ_SPLIT_SYM);
|
||||||
|
if (split == NULL) {
|
||||||
|
split = outFileName;
|
||||||
|
} else {
|
||||||
|
++split;
|
||||||
|
}
|
||||||
|
/* Now split points to the start of the basename. */
|
||||||
|
ofs = split - outFileName;
|
||||||
|
sprintf (split, "rec%5d", wrBlock+1);
|
||||||
|
for (p = split; *p != 0; p++) if (*p == ' ') *p = '0';
|
||||||
|
strcat (outFileName, inFileName + ofs);
|
||||||
|
|
||||||
if ( !endsInBz2(outFileName)) strcat ( outFileName, ".bz2" );
|
if ( !endsInBz2(outFileName)) strcat ( outFileName, ".bz2" );
|
||||||
|
|
||||||
fprintf ( stderr, " writing block %d to `%s' ...\n",
|
fprintf ( stderr, " writing block %d to `%s' ...\n",
|
||||||
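The new output-name logic in the hunk above can be sketched in isolation as follows. This is an illustrative reconstruction, not the commit's code: `make_rec_name`, `ends_in_bz2` and `SKETCH_MAX_FILENAME` are hypothetical names, and the path separator is passed as a parameter instead of coming from `BZ_SPLIT_SYM`. As in the diff, the caller is assumed to have rejected over-long input names beforehand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_FILENAME 2000  /* mirrors BZ_MAX_FILENAME in the diff */

/* Hypothetical stand-in for endsInBz2(): true if name ends in ".bz2". */
static int ends_in_bz2(const char *name)
{
    size_t n = strlen(name);
    return n > 4 && strcmp(name + n - 4, ".bz2") == 0;
}

/* Hypothetical helper: write "<dir><sep>recNNNNN<basename>[.bz2]" into
   `out`, which must have room for SKETCH_MAX_FILENAME bytes. */
void make_rec_name(char *out, const char *in, int block_no, char split_sym)
{
    char *split, *p;
    size_t ofs;

    /* Same guard as the diff: leave headroom for "recNNNNN" and ".bz2". */
    assert(strlen(in) < SKETCH_MAX_FILENAME - 20);

    memset(out, 0, SKETCH_MAX_FILENAME);
    strcpy(out, in);

    /* Find the start of the basename (just past the last separator). */
    split = strrchr(out, split_sym);
    split = (split == NULL) ? out : split + 1;
    ofs = (size_t)(split - out);

    /* Overwrite the basename with a zero-padded block number ... */
    sprintf(split, "rec%5d", block_no);
    for (p = split; *p != 0; p++) if (*p == ' ') *p = '0';

    /* ... then re-append the original basename after it. */
    strcat(out, in + ofs);
    if (!ends_in_bz2(out)) strcat(out, ".bz2");
}
```

For example, calling `make_rec_name(buf, "dir/data.bz2", 3, '/')` leaves `dir/rec00003data.bz2` in `buf`, so the recovered block lands next to the damaged file. The old code prepended `rec0003` to the whole path, which gave a name such as `rec0003dir/data.bz2`.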
@ -416,8 +525,10 @@ Int32 main ( Int32 argc, Char** argv )
|
|||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
bsWr = bsOpenWriteStream ( outFile );
|
bsWr = bsOpenWriteStream ( outFile );
|
||||||
bsPutUChar ( bsWr, 'B' ); bsPutUChar ( bsWr, 'Z' );
|
bsPutUChar ( bsWr, BZ_HDR_B );
|
||||||
bsPutUChar ( bsWr, 'h' ); bsPutUChar ( bsWr, '9' );
|
bsPutUChar ( bsWr, BZ_HDR_Z );
|
||||||
|
bsPutUChar ( bsWr, BZ_HDR_h );
|
||||||
|
bsPutUChar ( bsWr, BZ_HDR_0 + 9 );
|
||||||
bsPutUChar ( bsWr, 0x31 ); bsPutUChar ( bsWr, 0x41 );
|
bsPutUChar ( bsWr, 0x31 ); bsPutUChar ( bsWr, 0x41 );
|
||||||
bsPutUChar ( bsWr, 0x59 ); bsPutUChar ( bsWr, 0x26 );
|
bsPutUChar ( bsWr, 0x59 ); bsPutUChar ( bsWr, 0x26 );
|
||||||
bsPutUChar ( bsWr, 0x53 ); bsPutUChar ( bsWr, 0x59 );
|
bsPutUChar ( bsWr, 0x53 ); bsPutUChar ( bsWr, 0x59 );
|
||||||
|
35
bzlib.c
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -93,10 +93,39 @@ void BZ2_bz__AssertH__fail ( int errcode )
|
|||||||
"component, you should also report this bug to the author(s)\n"
|
"component, you should also report this bug to the author(s)\n"
|
||||||
"of that program. Please make an effort to report this bug;\n"
|
"of that program. Please make an effort to report this bug;\n"
|
||||||
"timely and accurate bug reports eventually lead to higher\n"
|
"timely and accurate bug reports eventually lead to higher\n"
|
||||||
"quality software. Thanks. Julian Seward, 21 March 2000.\n\n",
|
"quality software. Thanks. Julian Seward, 30 December 2001.\n\n",
|
||||||
errcode,
|
errcode,
|
||||||
BZ2_bzlibVersion()
|
BZ2_bzlibVersion()
|
||||||
);
|
);
|
||||||
|
|
||||||
|
if (errcode == 1007) {
|
||||||
|
fprintf(stderr,
|
||||||
|
"\n*** A special note about internal error number 1007 ***\n"
|
||||||
|
"\n"
|
||||||
|
"Experience suggests that a common cause of i.e. 1007\n"
|
||||||
|
"is unreliable memory or other hardware. The 1007 assertion\n"
|
||||||
|
"just happens to cross-check the results of huge numbers of\n"
|
||||||
|
"memory reads/writes, and so acts (unintendedly) as a stress\n"
|
||||||
|
"test of your memory system.\n"
|
||||||
|
"\n"
|
||||||
|
"I suggest the following: try compressing the file again,\n"
|
||||||
|
"possibly monitoring progress in detail with the -vv flag.\n"
|
||||||
|
"\n"
|
||||||
|
"* If the error cannot be reproduced, and/or happens at different\n"
|
||||||
|
" points in compression, you may have a flaky memory system.\n"
|
||||||
|
" Try a memory-test program. I have used Memtest86\n"
|
||||||
|
" (www.memtest86.com). At the time of writing it is free (GPLd).\n"
|
||||||
|
" Memtest86 tests memory much more thoroughly than your BIOS's\n"
|
||||||
|
" power-on test, and may find failures that the BIOS doesn't.\n"
|
||||||
|
"\n"
|
||||||
|
"* If the error can be repeatably reproduced, this is a bug in\n"
|
||||||
|
" bzip2, and I would very much like to hear about it. Please\n"
|
||||||
|
" let me know, and, ideally, save a copy of the file causing the\n"
|
||||||
|
" problem -- without which I will be unable to investigate it.\n"
|
||||||
|
"\n"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
exit(3);
|
exit(3);
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
@ -1402,7 +1431,7 @@ BZFILE * bzopen_or_bzdopen
|
|||||||
smallMode = 1; break;
|
smallMode = 1; break;
|
||||||
default:
|
default:
|
||||||
if (isdigit((int)(*mode))) {
|
if (isdigit((int)(*mode))) {
|
||||||
blockSize100k = *mode-'0';
|
blockSize100k = *mode-BZ_HDR_0;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
mode++;
|
mode++;
|
||||||
|
6
bzlib.h
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -110,8 +110,10 @@ typedef
|
|||||||
#define BZ_EXPORT
|
#define BZ_EXPORT
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
/* Need a definition for FILE */
|
||||||
|
#include <stdio.h>
|
||||||
|
|
||||||
#ifdef _WIN32
|
#ifdef _WIN32
|
||||||
# include <stdio.h>
|
|
||||||
# include <windows.h>
|
# include <windows.h>
|
||||||
# ifdef small
|
# ifdef small
|
||||||
/* windows.h define small to char */
|
/* windows.h define small to char */
|
||||||
|
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -76,7 +76,7 @@
|
|||||||
|
|
||||||
/*-- General stuff. --*/
|
/*-- General stuff. --*/
|
||||||
|
|
||||||
#define BZ_VERSION "1.0.1, 23-June-2000"
|
#define BZ_VERSION "1.0.2, 30-Dec-2001"
|
||||||
|
|
||||||
typedef char Char;
|
typedef char Char;
|
||||||
typedef unsigned char Bool;
|
typedef unsigned char Bool;
|
||||||
@ -137,6 +137,13 @@ extern void bz_internal_error ( int errcode );
|
|||||||
#define BZFREE(ppp) (strm->bzfree)(strm->opaque,(ppp))
|
#define BZFREE(ppp) (strm->bzfree)(strm->opaque,(ppp))
|
||||||
|
|
||||||
|
|
||||||
|
/*-- Header bytes. --*/
|
||||||
|
|
||||||
|
#define BZ_HDR_B 0x42 /* 'B' */
|
||||||
|
#define BZ_HDR_Z 0x5a /* 'Z' */
|
||||||
|
#define BZ_HDR_h 0x68 /* 'h' */
|
||||||
|
#define BZ_HDR_0 0x30 /* '0' */
|
||||||
|
|
||||||
/*-- Constants for the back end. --*/
|
/*-- Constants for the back end. --*/
|
||||||
|
|
||||||
#define BZ_MAX_ALPHA_SIZE 258
|
#define BZ_MAX_ALPHA_SIZE 258
|
||||||
|
61
bzmore
Normal file
@ -0,0 +1,61 @@
|
|||||||
|
#!/bin/sh
|
||||||
|
|
||||||
|
# Bzmore wrapped for bzip2,
|
||||||
|
# adapted from zmore by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
|
||||||
|
|
||||||
|
PATH="/usr/bin:$PATH"; export PATH
|
||||||
|
|
||||||
|
prog=`echo $0 | sed 's|.*/||'`
|
||||||
|
case "$prog" in
|
||||||
|
*less) more=less ;;
|
||||||
|
*) more=more ;;
|
||||||
|
esac
|
||||||
|
|
||||||
|
if test "`echo -n a`" = "-n a"; then
|
||||||
|
# looks like a SysV system:
|
||||||
|
n1=''; n2='\c'
|
||||||
|
else
|
||||||
|
n1='-n'; n2=''
|
||||||
|
fi
|
||||||
|
oldtty=`stty -g 2>/dev/null`
|
||||||
|
if stty -cbreak 2>/dev/null; then
|
||||||
|
cb='cbreak'; ncb='-cbreak'
|
||||||
|
else
|
||||||
|
# 'stty min 1' resets eof to ^a on both SunOS and SysV!
|
||||||
|
cb='min 1 -icanon'; ncb='icanon eof ^d'
|
||||||
|
fi
|
||||||
|
if test $? -eq 0 -a -n "$oldtty"; then
|
||||||
|
trap 'stty $oldtty 2>/dev/null; exit' 0 2 3 5 10 13 15
|
||||||
|
else
|
||||||
|
trap 'stty $ncb echo 2>/dev/null; exit' 0 2 3 5 10 13 15
|
||||||
|
fi
|
||||||
|
|
||||||
|
if test $# = 0; then
|
||||||
|
if test -t 0; then
|
||||||
|
echo usage: $prog files...
|
||||||
|
else
|
||||||
|
bzip2 -cdfq | eval $more
|
||||||
|
fi
|
||||||
|
else
|
||||||
|
FIRST=1
|
||||||
|
for FILE
|
||||||
|
do
|
||||||
|
if test $FIRST -eq 0; then
|
||||||
|
echo $n1 "--More--(Next file: $FILE)$n2"
|
||||||
|
stty $cb -echo 2>/dev/null
|
||||||
|
ANS=`dd bs=1 count=1 2>/dev/null`
|
||||||
|
stty $ncb echo 2>/dev/null
|
||||||
|
echo " "
|
||||||
|
if test "$ANS" = 'e' -o "$ANS" = 'q'; then
|
||||||
|
exit
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
if test "$ANS" != 's'; then
|
||||||
|
echo "------> $FILE <------"
|
||||||
|
bzip2 -cdfq "$FILE" | eval $more
|
||||||
|
fi
|
||||||
|
if test -t 1; then
|
||||||
|
FIRST=0
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
fi
|
152
bzmore.1
Normal file
@ -0,0 +1,152 @@
|
|||||||
|
.\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
|
||||||
|
.\"for Debian GNU/Linux
|
||||||
|
.TH BZMORE 1
|
||||||
|
.SH NAME
|
||||||
|
bzmore, bzless \- file perusal filter for crt viewing of bzip2 compressed text
|
||||||
|
.SH SYNOPSIS
|
||||||
|
.B bzmore
|
||||||
|
[ name ... ]
|
||||||
|
.br
|
||||||
|
.B bzless
|
||||||
|
[ name ... ]
|
||||||
|
.SH NOTE
|
||||||
|
In the following description,
|
||||||
|
.I bzless
|
||||||
|
and
|
||||||
|
.I less
|
||||||
|
can be used interchangeably with
|
||||||
|
.I bzmore
|
||||||
|
and
|
||||||
|
.I more.
|
||||||
|
.SH DESCRIPTION
|
||||||
|
.I Bzmore
|
||||||
|
is a filter which allows examination of compressed or plain text files
|
||||||
|
one screenful at a time on a soft-copy terminal.
|
||||||
|
.I bzmore
|
||||||
|
works on files compressed with
|
||||||
|
.I bzip2
|
||||||
|
and also on uncompressed files.
|
||||||
|
If a file does not exist,
|
||||||
|
.I bzmore
|
||||||
|
looks for a file of the same name with the addition of a .bz2 suffix.
|
||||||
|
.PP
|
||||||
|
.I Bzmore
|
||||||
|
normally pauses after each screenful, printing --More--
|
||||||
|
at the bottom of the screen.
|
||||||
|
If the user then types a carriage return, one more line is displayed.
|
||||||
|
If the user hits a space,
|
||||||
|
another screenful is displayed. Other possibilities are enumerated later.
|
||||||
|
.PP
|
||||||
|
.I Bzmore
|
||||||
|
looks in the file
|
||||||
|
.I /etc/termcap
|
||||||
|
to determine terminal characteristics,
|
||||||
|
and to determine the default window size.
|
||||||
|
On a terminal capable of displaying 24 lines,
|
||||||
|
the default window size is 22 lines.
|
||||||
|
Other sequences which may be typed when
|
||||||
|
.I bzmore
|
||||||
|
pauses, and their effects, are as follows (\fIi\fP is an optional integer
|
||||||
|
argument, defaulting to 1) :
|
||||||
|
.PP
|
||||||
|
.IP \fIi\|\fP<space>
|
||||||
|
display
|
||||||
|
.I i
|
||||||
|
more lines, (or another screenful if no argument is given)
|
||||||
|
.PP
|
||||||
|
.IP ^D
|
||||||
|
display 11 more lines (a ``scroll'').
|
||||||
|
If
|
||||||
|
.I i
|
||||||
|
is given, then the scroll size is set to \fIi\|\fP.
|
||||||
|
.PP
|
||||||
|
.IP d
|
||||||
|
same as ^D (control-D)
|
||||||
|
.PP
|
||||||
|
.IP \fIi\|\fPz
|
||||||
|
same as typing a space except that \fIi\|\fP, if present, becomes the new
|
||||||
|
window size. Note that the window size reverts back to the default at the
|
||||||
|
end of the current file.
|
||||||
|
.PP
|
||||||
|
.IP \fIi\|\fPs
|
||||||
|
skip \fIi\|\fP lines and print a screenful of lines
|
||||||
|
.PP
|
||||||
|
.IP \fIi\|\fPf
|
||||||
|
skip \fIi\fP screenfuls and print a screenful of lines
|
||||||
|
.PP
|
||||||
|
.IP "q or Q"
|
||||||
|
quit reading the current file; go on to the next (if any)
|
||||||
|
.PP
|
||||||
|
.IP "e or q"
|
||||||
|
When the prompt --More--(Next file:
|
||||||
|
.IR file )
|
||||||
|
is printed, this command causes bzmore to exit.
|
||||||
|
.PP
|
||||||
|
.IP s
|
||||||
|
When the prompt --More--(Next file:
|
||||||
|
.IR file )
|
||||||
|
is printed, this command causes bzmore to skip the next file and continue.
|
||||||
|
.PP
|
||||||
|
.IP =
|
||||||
|
Display the current line number.
|
||||||
|
.PP
|
||||||
|
.IP \fIi\|\fP/expr
|
||||||
|
search for the \fIi\|\fP-th occurrence of the regular expression \fIexpr.\fP
|
||||||
|
If the pattern is not found,
|
||||||
|
.I bzmore
|
||||||
|
goes on to the next file (if any).
|
||||||
|
Otherwise, a screenful is displayed, starting two lines before the place
|
||||||
|
where the expression was found.
|
||||||
|
The user's erase and kill characters may be used to edit the regular
|
||||||
|
expression.
|
||||||
|
Erasing back past the first column cancels the search command.
|
||||||
|
.PP
|
||||||
|
.IP \fIi\|\fPn
|
||||||
|
search for the \fIi\|\fP-th occurrence of the last regular expression entered.
|
||||||
|
.PP
|
||||||
|
.IP !command
|
||||||
|
invoke a shell with \fIcommand\|\fP.
|
||||||
|
Each `!' character in "command" is replaced with the
|
||||||
|
previous shell command. The sequence "\\!" is replaced by "!".
|
||||||
|
.PP
|
||||||
|
.IP ":q or :Q"
|
||||||
|
quit reading the current file; go on to the next (if any)
|
||||||
|
(same as q or Q).
|
||||||
|
.PP
|
||||||
|
.IP .
|
||||||
|
(dot) repeat the previous command.
|
||||||
|
.PP
|
||||||
|
The commands take effect immediately, i.e., it is not necessary to
|
||||||
|
type a carriage return.
|
||||||
|
Up to the time when the command character itself is given,
|
||||||
|
the user may hit the line kill character to cancel the numerical
|
||||||
|
argument being formed.
|
||||||
|
In addition, the user may hit the erase character to redisplay the
|
||||||
|
--More-- message.
|
||||||
|
.PP
|
||||||
|
At any time when output is being sent to the terminal, the user can
|
||||||
|
hit the quit key (normally control\-\\).
|
||||||
|
.I Bzmore
|
||||||
|
will stop sending output, and will display the usual --More--
|
||||||
|
prompt.
|
||||||
|
The user may then enter one of the above commands in the normal manner.
|
||||||
|
Unfortunately, some output is lost when this is done, due to the
|
||||||
|
fact that any characters waiting in the terminal's output queue
|
||||||
|
are flushed when the quit signal occurs.
|
||||||
|
.PP
|
||||||
|
The terminal is set to
|
||||||
|
.I noecho
|
||||||
|
mode by this program so that the output can be continuous.
|
||||||
|
What you type will thus not show on your terminal, except for the / and !
|
||||||
|
commands.
|
||||||
|
.PP
|
||||||
|
If the standard output is not a teletype, then
|
||||||
|
.I bzmore
|
||||||
|
acts just like
|
||||||
|
.I bzcat,
|
||||||
|
except that a header is printed before each file.
|
||||||
|
.SH FILES
|
||||||
|
.DT
|
||||||
|
/etc/termcap Terminal data base
|
||||||
|
.SH "SEE ALSO"
|
||||||
|
more(1), less(1), bzip2(1), bzdiff(1), bzgrep(1)
|
10
compress.c
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -663,10 +663,10 @@ void BZ2_compressBlock ( EState* s, Bool is_last_block )
|
|||||||
/*-- If this is the first block, create the stream header. --*/
|
/*-- If this is the first block, create the stream header. --*/
|
||||||
if (s->blockNo == 1) {
|
if (s->blockNo == 1) {
|
||||||
BZ2_bsInitWrite ( s );
|
BZ2_bsInitWrite ( s );
|
||||||
bsPutUChar ( s, 'B' );
|
bsPutUChar ( s, BZ_HDR_B );
|
||||||
bsPutUChar ( s, 'Z' );
|
bsPutUChar ( s, BZ_HDR_Z );
|
||||||
bsPutUChar ( s, 'h' );
|
bsPutUChar ( s, BZ_HDR_h );
|
||||||
bsPutUChar ( s, (UChar)('0' + s->blockSize100k) );
|
bsPutUChar ( s, (UChar)(BZ_HDR_0 + s->blockSize100k) );
|
||||||
}
|
}
|
||||||
|
|
||||||
if (s->nblock > 0) {
|
if (s->nblock > 0) {
|
||||||
|
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
|
14
decompress.c
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -235,18 +235,18 @@ Int32 BZ2_decompress ( DState* s )
|
|||||||
switch (s->state) {
|
switch (s->state) {
|
||||||
|
|
||||||
GET_UCHAR(BZ_X_MAGIC_1, uc);
|
GET_UCHAR(BZ_X_MAGIC_1, uc);
|
||||||
if (uc != 'B') RETURN(BZ_DATA_ERROR_MAGIC);
|
if (uc != BZ_HDR_B) RETURN(BZ_DATA_ERROR_MAGIC);
|
||||||
|
|
||||||
GET_UCHAR(BZ_X_MAGIC_2, uc);
|
GET_UCHAR(BZ_X_MAGIC_2, uc);
|
||||||
if (uc != 'Z') RETURN(BZ_DATA_ERROR_MAGIC);
|
if (uc != BZ_HDR_Z) RETURN(BZ_DATA_ERROR_MAGIC);
|
||||||
|
|
||||||
GET_UCHAR(BZ_X_MAGIC_3, uc)
|
GET_UCHAR(BZ_X_MAGIC_3, uc)
|
||||||
if (uc != 'h') RETURN(BZ_DATA_ERROR_MAGIC);
|
if (uc != BZ_HDR_h) RETURN(BZ_DATA_ERROR_MAGIC);
|
||||||
|
|
||||||
GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8)
|
GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8)
|
||||||
if (s->blockSize100k < '1' ||
|
if (s->blockSize100k < (BZ_HDR_0 + 1) ||
|
||||||
s->blockSize100k > '9') RETURN(BZ_DATA_ERROR_MAGIC);
|
s->blockSize100k > (BZ_HDR_0 + 9)) RETURN(BZ_DATA_ERROR_MAGIC);
|
||||||
s->blockSize100k -= '0';
|
s->blockSize100k -= BZ_HDR_0;
|
||||||
|
|
||||||
if (s->smallDecompress) {
|
if (s->smallDecompress) {
|
||||||
s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
|
s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
|
||||||
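The hunk above replaces character literals such as 'B' and '0' with the numeric `BZ_HDR_*` constants. One plausible motivation, which the diff itself does not state, is that the stream header then carries the same byte values regardless of the compiler's execution character set. A minimal sketch of the same check outside the decompressor's state machine, with the hypothetical helper name `parse_bz2_header`:

```c
#include <assert.h>
#include <stddef.h>

/* Same values as the BZ_HDR_* definitions added in this commit. */
#define BZ_HDR_B 0x42   /* 'B' */
#define BZ_HDR_Z 0x5a   /* 'Z' */
#define BZ_HDR_h 0x68   /* 'h' */
#define BZ_HDR_0 0x30   /* '0' */

/* Hypothetical helper (not in libbzip2): returns the block size
   multiplier 1..9 encoded in a 4-byte stream header, or 0 if the
   header is too short or malformed. */
int parse_bz2_header(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 4) return 0;
    if (buf[0] != BZ_HDR_B) return 0;
    if (buf[1] != BZ_HDR_Z) return 0;
    if (buf[2] != BZ_HDR_h) return 0;
    /* Fourth byte must be '1'..'9', expressed relative to BZ_HDR_0. */
    if (buf[3] < BZ_HDR_0 + 1 || buf[3] > BZ_HDR_0 + 9) return 0;
    return buf[3] - BZ_HDR_0;
}
```

A header of bytes 0x42 0x5a 0x68 0x39 yields 9, whereas a fourth byte of 0x30 is rejected, just as `BZ2_decompress` returns `BZ_DATA_ERROR_MAGIC` for it.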
|
@ -19,7 +19,7 @@
|
|||||||
|
|
||||||
#ifdef _WIN32
|
#ifdef _WIN32
|
||||||
|
|
||||||
#define BZ2_LIBNAME "libbz2-1.0.0.DLL"
|
#define BZ2_LIBNAME "libbz2-1.0.2.DLL"
|
||||||
|
|
||||||
#include <windows.h>
|
#include <windows.h>
|
||||||
static int BZ2DLLLoaded = 0;
|
static int BZ2DLLLoaded = 0;
|
||||||
@ -130,8 +130,8 @@ int main(int argc,char *argv[])
|
|||||||
}else{
|
}else{
|
||||||
fp_w = stdout;
|
fp_w = stdout;
|
||||||
}
|
}
|
||||||
if((BZ2fp_r == NULL && (BZ2fp_r = BZ2_bzdopen(fileno(stdin),"rb"))==NULL)
|
if((fn_r == NULL && (BZ2fp_r = BZ2_bzdopen(fileno(stdin),"rb"))==NULL)
|
||||||
|| (BZ2fp_r != NULL && (BZ2fp_r = BZ2_bzopen(fn_r,"rb"))==NULL)){
|
|| (fn_r != NULL && (BZ2fp_r = BZ2_bzopen(fn_r,"rb"))==NULL)){
|
||||||
printf("can't bz2openstream\n");
|
printf("can't bz2openstream\n");
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
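The fix in the hunk above changes the condition from testing the stream handle (`BZ2fp_r`) to testing the file name (`fn_r`): the choice between reading stdin and opening a named file has to be made from whether a name was supplied, not from a handle that has not been opened yet. The sketch below shows the same decision using plain stdio as a stand-in for `BZ2_bzdopen`/`BZ2_bzopen`. The function name `open_input` is hypothetical and nothing here calls libbzip2.

```c
#include <stdio.h>

/* Hypothetical stand-in for the dlltest logic: choose the input
   source from the file name, not from the (still unopened) handle. */
FILE *open_input(const char *fn_r)
{
    if (fn_r == NULL) {
        /* No name given: read from standard input. */
        return stdin;
    }
    /* Name given: open it; returns NULL on failure. */
    return fopen(fn_r, "rb");
}
```

A caller would treat a NULL return as the "can't open" case, as the `printf`/`exit(1)` branch in the diff does.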
|
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
|
@ -4,7 +4,7 @@
|
|||||||
# Fixed up by JRS for bzip2-0.9.5d release.
|
# Fixed up by JRS for bzip2-0.9.5d release.
|
||||||
|
|
||||||
CC=cl
|
CC=cl
|
||||||
CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64
|
CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64 -nologo
|
||||||
|
|
||||||
OBJS= blocksort.obj \
|
OBJS= blocksort.obj \
|
||||||
huffman.obj \
|
huffman.obj \
|
||||||
|
116
manual.texi
@ -2,10 +2,10 @@
|
|||||||
@setfilename bzip2.info
|
@setfilename bzip2.info
|
||||||
|
|
||||||
@ignore
|
@ignore
|
||||||
This file documents bzip2 version 1.0, and associated library
|
This file documents bzip2 version 1.0.2, and associated library
|
||||||
libbzip2, written by Julian Seward (jseward@acm.org).
|
libbzip2, written by Julian Seward (jseward@acm.org).
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward
|
Copyright (C) 1996-2002 Julian R Seward
|
||||||
|
|
||||||
Permission is granted to make and distribute verbatim copies of
|
Permission is granted to make and distribute verbatim copies of
|
||||||
this manual provided the copyright notice and this permission notice
|
this manual provided the copyright notice and this permission notice
|
||||||
@ -30,8 +30,8 @@ END-INFO-DIR-ENTRY
|
|||||||
@titlepage
|
@titlepage
|
||||||
@title bzip2 and libbzip2
|
@title bzip2 and libbzip2
|
||||||
@subtitle a program and library for data compression
|
@subtitle a program and library for data compression
|
||||||
@subtitle copyright (C) 1996-2000 Julian Seward
|
@subtitle copyright (C) 1996-2002 Julian Seward
|
||||||
@subtitle version 1.0 of 21 March 2000
|
@subtitle version 1.0.2 of 30 December 2001
|
||||||
@author Julian Seward
|
@author Julian Seward
|
||||||
|
|
||||||
@end titlepage
|
@end titlepage
|
||||||
@ -40,11 +40,17 @@ END-INFO-DIR-ENTRY
|
|||||||
@parskip 2mm
|
@parskip 2mm
|
||||||
|
|
||||||
@end iftex
|
@end iftex
|
||||||
@node Top, Overview, (dir), (dir)
|
@node Top,,, (dir)
|
||||||
|
|
||||||
|
The following text is the License for this software. You should
|
||||||
|
find it identical to that contained in the file LICENSE in the
|
||||||
|
source distribution.
|
||||||
|
|
||||||
|
@bf{------------------ START OF THE LICENSE ------------------}
|
||||||
|
|
||||||
This program, @code{bzip2},
|
This program, @code{bzip2},
|
||||||
and associated library @code{libbzip2}, are
|
and associated library @code{libbzip2}, are
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
@ -82,14 +88,16 @@ Julian Seward, Cambridge, UK.
|
|||||||
|
|
||||||
@code{jseward@@acm.org}
|
@code{jseward@@acm.org}
|
||||||
|
|
||||||
@code{http://sourceware.cygnus.com/bzip2}
|
@code{bzip2}/@code{libbzip2} version 1.0.2 of 30 December 2001.
|
||||||
|
|
||||||
|
@bf{------------------ END OF THE LICENSE ------------------}
|
||||||
|
|
||||||
|
Web sites:
|
||||||
|
|
||||||
|
@code{http://sources.redhat.com/bzip2}
|
||||||
|
|
||||||
@code{http://www.cacheprof.org}
|
@code{http://www.cacheprof.org}
|
||||||
|
|
||||||
@code{http://www.muraroa.demon.co.uk}
|
|
||||||
|
|
||||||
@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000.
|
|
||||||
|
|
||||||
PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented
|
PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented
|
||||||
algorithms. However, I do not have the resources available to carry out
|
algorithms. However, I do not have the resources available to carry out
|
||||||
a full patent search. Therefore I cannot give any guarantee of the
|
a full patent search. Therefore I cannot give any guarantee of the
|
||||||
@ -101,7 +109,6 @@ above statement.
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
@node Overview, Implementation, Top, Top
|
|
||||||
@chapter Introduction
|
@chapter Introduction
|
||||||
|
|
||||||
@code{bzip2} compresses files using the Burrows-Wheeler
|
@code{bzip2} compresses files using the Burrows-Wheeler
|
||||||
@ -134,7 +141,7 @@ and nothing else.
|
|||||||
@unnumberedsubsubsec NAME
|
@unnumberedsubsubsec NAME
|
||||||
@itemize
|
@itemize
|
||||||
@item @code{bzip2}, @code{bunzip2}
|
@item @code{bzip2}, @code{bunzip2}
|
||||||
- a block-sorting file compressor, v1.0
|
- a block-sorting file compressor, v1.0.2
|
||||||
@item @code{bzcat}
|
@item @code{bzcat}
|
||||||
- decompresses files to stdout
|
- decompresses files to stdout
|
||||||
@item @code{bzip2recover}
|
@item @code{bzip2recover}
|
||||||
@ -264,6 +271,11 @@ This really performs a trial decompression and throws away the result.
|
|||||||
Force overwrite of output files. Normally, @code{bzip2} will not overwrite
|
Force overwrite of output files. Normally, @code{bzip2} will not overwrite
|
||||||
existing output files. Also forces @code{bzip2} to break hard links
|
existing output files. Also forces @code{bzip2} to break hard links
|
||||||
to files, which it otherwise wouldn't do.
|
to files, which it otherwise wouldn't do.
|
||||||
|
|
||||||
|
@code{bzip2} normally declines to decompress files which don't have the
|
||||||
|
correct magic header bytes. If forced (@code{-f}), however, it will
|
||||||
|
pass such files through unmodified. This is how GNU @code{gzip}
|
||||||
|
behaves.
|
||||||
@item -k --keep
|
@item -k --keep
|
||||||
Keep (don't delete) input files during compression
|
Keep (don't delete) input files during compression
|
||||||
or decompression.
|
or decompression.
|
||||||
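Note on the hunk above: the new `-f` paragraph depends on bzip2 recognising the "correct magic header bytes". A bzip2 stream begins with the three bytes `BZh` followed by a block-size digit `1` to `9`. The following is a minimal sketch of such a check, not the code bzip2 itself uses; the function name `looks_like_bzip2` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of bzip2: returns 1 if the first
   `len` bytes at `buf` start with a bzip2 stream header, i.e. the
   bytes 'B', 'Z', 'h' followed by a block-size digit '1'..'9'.
   Returns 0 otherwise, including when fewer than 4 bytes are given. */
int looks_like_bzip2(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);   /* caller must pass a valid buffer */
    if (len < 4)
        return 0;
    return buf[0] == 'B' && buf[1] == 'Z' && buf[2] == 'h'
        && buf[3] >= '1' && buf[3] <= '9';
}
```

A caller following the behaviour the manual describes would decompress when this returns 1, refuse when it returns 0, and copy the input through unchanged only when it returns 0 and `-f` was given.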
@ -286,9 +298,13 @@ Further @code{-v}'s increase the verbosity level, spewing out lots of
|
|||||||
information which is primarily of interest for diagnostic purposes.
|
information which is primarily of interest for diagnostic purposes.
|
||||||
@item -L --license -V --version
|
@item -L --license -V --version
|
||||||
Display the software version, license terms and conditions.
|
Display the software version, license terms and conditions.
|
||||||
@item -1 to -9
|
@item -1 (or --fast) to -9 (or --best)
|
||||||
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
|
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
|
||||||
effect when decompressing. See MEMORY MANAGEMENT below.
|
effect when decompressing. See MEMORY MANAGEMENT below.
|
||||||
|
The @code{--fast} and @code{--best} aliases are primarily for GNU
|
||||||
|
@code{gzip} compatibility. In particular, @code{--fast} doesn't make
|
||||||
|
things significantly faster. And @code{--best} merely selects the
|
||||||
|
default behaviour.
|
||||||
@item --
|
@item --
|
||||||
Treats all subsequent arguments as file names, even if they start
|
Treats all subsequent arguments as file names, even if they start
|
||||||
with a dash. This is so you can handle files with names beginning
|
with a dash. This is so you can handle files with names beginning
|
||||||
@ -389,21 +405,19 @@ integrity of the resulting files, and decompress those which are
|
|||||||
undamaged.
|
undamaged.
|
||||||
|
|
||||||
@code{bzip2recover}
|
@code{bzip2recover}
|
||||||
takes a single argument, the name of the damaged file,
|
takes a single argument, the name of the damaged file, and writes a
|
||||||
and writes a number of files @code{rec0001file.bz2},
|
number of files @code{rec00001file.bz2}, @code{rec00002file.bz2}, etc,
|
||||||
@code{rec0002file.bz2}, etc, containing the extracted blocks.
|
containing the extracted blocks. The output filenames are designed so
|
||||||
The output filenames are designed so that the use of
|
that the use of wildcards in subsequent processing -- for example,
|
||||||
wildcards in subsequent processing -- for example,
|
@code{bzip2 -dc rec*file.bz2 > recovered_data} -- processes the files in
|
||||||
@code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in
|
the correct order.
|
||||||
the correct order.
|
|
||||||
|
|
||||||
@code{bzip2recover} should be of most use dealing with large @code{.bz2}
|
@code{bzip2recover} should be of most use dealing with large @code{.bz2}
|
||||||
files, as these will contain many blocks. It is clearly
|
files, as these will contain many blocks. It is clearly futile to use
|
||||||
futile to use it on damaged single-block files, since a
|
it on damaged single-block files, since a damaged block cannot be
|
||||||
damaged block cannot be recovered. If you wish to minimise
|
recovered. If you wish to minimise any potential data loss through
|
||||||
any potential data loss through media or transmission errors,
|
media or transmission errors, you might consider compressing with a
|
||||||
you might consider compressing with a smaller
|
smaller block size.
|
||||||
block size.
|
|
||||||
|
|
||||||
|
|
||||||
@unnumberedsubsubsec PERFORMANCE NOTES
|
@unnumberedsubsubsec PERFORMANCE NOTES
|
||||||
@ -435,22 +449,31 @@ I/O error messages are not as helpful as they could be. @code{bzip2}
|
|||||||
tries hard to detect I/O errors and exit cleanly, but the details of
|
tries hard to detect I/O errors and exit cleanly, but the details of
|
||||||
what the problem is sometimes seem rather misleading.
|
what the problem is sometimes seem rather misleading.
|
||||||
|
|
||||||
This manual page pertains to version 1.0 of @code{bzip2}. Compressed
|
This manual page pertains to version 1.0.2 of @code{bzip2}. Compressed
|
||||||
data created by this version is entirely forwards and backwards
|
data created by this version is entirely forwards and backwards
|
||||||
compatible with the previous public releases, versions 0.1pl2, 0.9.0 and
|
compatible with the previous public releases, versions 0.1pl2, 0.9.0,
|
||||||
0.9.5, but with the following exception: 0.9.0 and above can correctly
|
0.9.5, 1.0.0 and 1.0.1, but with the following exception: 0.9.0 and
|
||||||
decompress multiple concatenated compressed files. 0.1pl2 cannot do
|
above can correctly decompress multiple concatenated compressed files.
|
||||||
this; it will stop after decompressing just the first file in the
|
0.1pl2 cannot do this; it will stop after decompressing just the first
|
||||||
stream.
|
file in the stream.
|
||||||
|
|
||||||
|
@code{bzip2recover} versions prior to this one, 1.0.2, used 32-bit
|
||||||
|
integers to represent bit positions in compressed files, so it could not
|
||||||
|
handle compressed files more than 512 megabytes long. Version 1.0.2 and
|
||||||
|
above uses 64-bit ints on some platforms which support them (GNU
|
||||||
|
supported targets, and Windows). To establish whether or not
|
||||||
|
@code{bzip2recover} was built with such a limitation, run it without
|
||||||
|
arguments. In any event you can build yourself an unlimited version if
|
||||||
|
you can recompile it with @code{MaybeUInt64} set to be an unsigned
|
||||||
|
64-bit integer.
|
||||||
|
|
||||||
@code{bzip2recover} uses 32-bit integers to represent bit positions in
|
|
||||||
compressed files, so it cannot handle compressed files more than 512
|
|
||||||
megabytes long. This could easily be fixed.
|
|
||||||
|
|
||||||
|
|
||||||
@unnumberedsubsubsec AUTHOR
|
@unnumberedsubsubsec AUTHOR
|
||||||
Julian Seward, @code{jseward@@acm.org}.
|
Julian Seward, @code{jseward@@acm.org}.
|
||||||
|
|
||||||
|
@code{http://sources.redhat.com/bzip2}
|
||||||
|
|
||||||
The ideas embodied in @code{bzip2} are due to (at least) the following
|
The ideas embodied in @code{bzip2} are due to (at least) the following
|
||||||
people: Michael Burrows and David Wheeler (for the block sorting
|
people: Michael Burrows and David Wheeler (for the block sorting
|
||||||
transformation), David Wheeler (again, for the Huffman coder), Peter
|
transformation), David Wheeler (again, for the Huffman coder), Peter
|
||||||
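Note on the hunk above: the 512-megabyte limit quoted for older `bzip2recover` builds follows from the counter width. A 32-bit counter distinguishes 2^32 bit positions, which is 2^32 / 8 = 2^29 bytes, or 512 megabytes. The sketch below only illustrates that arithmetic; the helper name `max_bytes_for_bit_counter` is hypothetical and does not appear in bzip2recover.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of bzip2recover: largest file size,
   in bytes, whose bit positions all fit in an unsigned counter that
   is `bits` wide.  Such a counter distinguishes 2^bits bit positions,
   and 8 bits make a byte, so the result is 2^(bits - 3). */
uint64_t max_bytes_for_bit_counter(unsigned bits)
{
    assert(bits >= 3 && bits <= 64);   /* keep the shift well-defined */
    return (uint64_t)1 << (bits - 3);
}
```

With `bits` set to 32 this gives 536870912 bytes, the 512-megabyte figure in the manual text; with 64 it gives 2^61 bytes, which is why a 64-bit `MaybeUInt64` removes the limit in practice.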
@ -461,8 +484,9 @@ indebted for their help, support and advice. See the manual in the
|
|||||||
source distribution for pointers to sources of documentation. Christian
|
source distribution for pointers to sources of documentation. Christian
|
||||||
von Roques encouraged me to look for faster sorting algorithms, so as to
|
von Roques encouraged me to look for faster sorting algorithms, so as to
|
||||||
speed up compression. Bela Lubkin encouraged me to improve the
|
speed up compression. Bela Lubkin encouraged me to improve the
|
||||||
worst-case compression performance. Many people sent patches, helped
|
worst-case compression performance. The @code{bz*} scripts are derived
|
||||||
with portability problems, lent machines, gave advice and were generally
|
from those of GNU @code{gzip}. Many people sent patches, helped with
|
||||||
|
portability problems, lent machines, gave advice and were generally
|
||||||
helpful.
|
helpful.
|
||||||
|
|
||||||
@end quotation
|
@end quotation
|
||||||
@ -1769,16 +1793,20 @@ was compiled with @code{BZ_NO_STDIO} set.
|
|||||||
For a normal compile, an assertion failure yields the message
|
For a normal compile, an assertion failure yields the message
|
||||||
@example
|
@example
|
||||||
bzip2/libbzip2: internal error number N.
|
bzip2/libbzip2: internal error number N.
|
||||||
This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000.
|
This is a bug in bzip2/libbzip2, 1.0.2, 30-Dec-2001.
|
||||||
Please report it to me at: jseward@@acm.org. If this happened
|
Please report it to me at: jseward@@acm.org. If this happened
|
||||||
when you were using some program which uses libbzip2 as a
|
when you were using some program which uses libbzip2 as a
|
||||||
component, you should also report this bug to the author(s)
|
component, you should also report this bug to the author(s)
|
||||||
of that program. Please make an effort to report this bug;
|
of that program. Please make an effort to report this bug;
|
||||||
timely and accurate bug reports eventually lead to higher
|
timely and accurate bug reports eventually lead to higher
|
||||||
quality software. Thanks. Julian Seward, 21 March 2000.
|
quality software. Thanks. Julian Seward, 30 December 2001.
|
||||||
@end example
|
@end example
|
||||||
where @code{N} is some error code number. @code{exit(3)}
|
where @code{N} is some error code number. If @code{N == 1007}, it also
|
||||||
is then called.
|
prints some extra text advising the reader that unreliable memory is
|
||||||
|
often associated with internal error 1007. (This is a
|
||||||
|
frequently-observed-phenomenon with versions 1.0.0/1.0.1).
|
||||||
|
|
||||||
|
@code{exit(3)} is then called.
|
||||||
|
|
||||||
For a @code{stdio}-free library, assertion failures result
|
For a @code{stdio}-free library, assertion failures result
|
||||||
in a call to a function declared as:
|
in a call to a function declared as:
|
||||||
@ -2056,10 +2084,10 @@ Maybe this isn't what you want.
|
|||||||
If you want a compressor and/or library which is faster, uses less
|
If you want a compressor and/or library which is faster, uses less
|
||||||
memory but gets pretty good compression, and has minimal latency,
|
memory but gets pretty good compression, and has minimal latency,
|
||||||
consider Jean-loup
|
consider Jean-loup
|
||||||
Gailly's and Mark Adler's work, @code{zlib-1.1.2} and
|
Gailly's and Mark Adler's work, @code{zlib-1.1.3} and
|
||||||
@code{gzip-1.2.4}. Look for them at
|
@code{gzip-1.2.4}. Look for them at
|
||||||
|
|
||||||
@code{http://www.cdrom.com/pub/infozip/zlib} and
|
@code{http://www.zlib.org} and
|
||||||
@code{http://www.gzip.org} respectively.
|
@code{http://www.gzip.org} respectively.
|
||||||
|
|
||||||
For something faster and lighter still, you might try Markus F X J
|
For something faster and lighter still, you might try Markus F X J
|
||||||
|
16
mk251.c
Normal file
@ -0,0 +1,16 @@
|
|||||||
|
|
||||||
|
/* Spew out a long sequence of the byte 251. When fed to bzip2
|
||||||
|
versions 1.0.0 or 1.0.1, causes it to die with internal error
|
||||||
|
1007 in blocksort.c. This assertion misses an extremely rare
|
||||||
|
case, which is fixed in this version (1.0.2) and above.
|
||||||
|
*/
|
||||||
|
|
||||||
|
#include <stdio.h>
|
||||||
|
|
||||||
|
int main ()
|
||||||
|
{
|
||||||
|
int i;
|
||||||
|
for (i = 0; i < 48500000 ; i++)
|
||||||
|
putchar(251);
|
||||||
|
return 0;
|
||||||
|
}
|
@ -8,7 +8,7 @@
|
|||||||
This file is a part of bzip2 and/or libbzip2, a program and
|
This file is a part of bzip2 and/or libbzip2, a program and
|
||||||
library for lossless, block-sorting data compression.
|
library for lossless, block-sorting data compression.
|
||||||
|
|
||||||
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
|
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
|
||||||
|
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
modification, are permitted provided that the following conditions
|
modification, are permitted provided that the following conditions
|
||||||
|
4
words3
@ -15,8 +15,8 @@ not actually execute them.
|
|||||||
|
|
||||||
Instructions for use are in the preformatted manual page, in the file
|
Instructions for use are in the preformatted manual page, in the file
|
||||||
bzip2.txt. For more detailed documentation, read the full manual.
|
bzip2.txt. For more detailed documentation, read the full manual.
|
||||||
It is available in Postscript form (manual.ps) and HTML form
|
It is available in Postscript form (manual.ps), PDF form (manual.pdf),
|
||||||
(manual_toc.html).
|
and HTML form (manual_toc.html).
|
||||||
|
|
||||||
You can also do "bzip2 --help" to see some helpful information.
|
You can also do "bzip2 --help" to see some helpful information.
|
||||||
"bzip2 -L" displays the software license.
|
"bzip2 -L" displays the software license.
|
||||||
|