Data Compression Pointers

This page is partly written in Japanese.


[2005-03] Some Recent News:

[2003-07] LZW Patent and Software Information: The U.S. LZW patent expires June 20, 2003, the counterpart Canadian patent expires July 7, 2004, the counterpart patents in the United Kingdom, France, Germany and Italy expire June 18, 2004, and the Japanese counterpart patents expire June 20, 2004.

[2003-02-2X] buffer overrun in zlib 1.1.4

Has the U.S. LZW patent already expired? (Re: LZW Patent Expiry) [It appears to be correct that it remains valid until June 2003.] Japanese patents 2123602 and 2610084 were filed on 1984-06-20, so are they valid until 2004-06-19?

[2003-01-30] bwtzip: A Linear-Time Portable Research-Grade Universal Data Compressor (by Stephan T. Lavavej)

[2002-09-30] UnZip versions < 5.50 security vulnerability / 5.50 DOS version textmode corruption bug

[2002-07-20] special page on JPEG Patent

[2002-04-18] JPEG-LS dll and plug-in available at HP Labs.

[2002-04-15] Mark Nelson's Data Compression Library renamed:

[2002-04-07] ZeoSync Patent: Relational Differentiation Encoding (local mirror: zeosync.pdf) (See also my ZeoSync page in Japanese)

[2002-03-29] Open-source ARJ started (Project Info)

[2002-03-13] zlib Compression Library Corrupts malloc Data Structures via Double Free

Some of my writings


Among my writings on data compression, the only one written in English is History of Data Compression in Japan, which is rather outdated.


Introductory books in English


comp.compression FAQ is archived at:

Links to Links

Mark Nelson's Data Compression Library (old / new) also has a lot of links.

Test Data Sets

The Calgary Corpus used to be widely used, but the newer Canterbury Corpus is recommended.

Info-ZIP, gzip, zlib, ...


Haruyasu Yoshizaki (Yoshi)'s main email address is/was, but I've been unable to get in touch with him lately.

Compression using Burrows-Wheeler Transform

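The Burrows-Wheeler transform (block sorting) is unlike any earlier method. When I asked Mr. Burrows, he said he has no intention of patenting it, so it can be used freely. Several experimental implementations have been written in Japan as well, but the best-known implementation so far is bzip2. bzip2 is not as fast as gzip but is reasonably fast, and on files above a certain size it compresses better than gzip. The standard extension for compressed files produced by bzip2 is .bz2. Some Linux distributions have also started using bzip2 alongside gzip. Note that the older bzip used arithmetic coding and therefore has patent problems.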

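szip is similar software. Schindler formerly worked at CERN on things such as compressing data from high-energy physics experiments, and I believe I once received email from him. He now seems to have become a software developer.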
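To make the block-sorting idea concrete, here is a minimal Python sketch of the transform and its inverse. It is an illustration only: it sorts all rotations naively, uses no end-of-string sentinel, and is not how bzip2 or szip are implemented. The function names `bwt` and `inverse_bwt` and the "primary index" return value are choices made for this sketch.

```python
from typing import Tuple


def bwt(data: bytes) -> Tuple[bytes, int]:
    """Naive Burrows-Wheeler transform.

    Sorts every rotation of `data` and returns the last column together
    with the row index of the original string (needed for inversion).
    """
    n = len(data)
    if n == 0:
        return b"", 0
    # Sort rotation start offsets by the rotation each one denotes.
    # This is far slower than a real implementation; fine for a demo.
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    # The last character of the rotation starting at i is data[i - 1].
    last = bytes(data[(i - 1) % n] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert bwt() given the last column and the primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value yields the first column;
    # perm[j] is the last-column position of the j-th first-column byte.
    perm = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = perm[row]
        out.append(last[row])
    return bytes(out)


if __name__ == "__main__":
    transformed, index = bwt(b"banana")
    print(transformed, index)  # b'nnbaaa' 3
    print(inverse_bwt(transformed, index))  # b'banana'
```

The transform itself does not compress anything. It only groups bytes that share a following context, so that a later stage (move-to-front and entropy coding in typical block-sorting compressors) has long runs of similar symbols to work with.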

Range Coder

Almost as good as an arithmetic coder, and patent-free!



JPEG-LS, a proposed JPEG lossless/near-lossless image compression mode, is faster and in many cases compresses tighter than zlib-based PNG. Its core algorithm is LOCO-I.

``As part of the JPEG-LS work, HP and Mitsubishi have graciously agreed that the patents needed for implementation of the standard may be used without payment of license or royalty fees, and IBM have offered their QM coder patents on a similar basis for JPEG and JBIG work'' -- JPEG.ORG, but you must fill out this form.

(US Patent No. 5,680,129, "System and Method for Lossless Image Compression"; US Patent No. 5,764,374, "System and Method for Lossless Image Compression Having Improved Sequential Determination of Golomb Parameter")
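As a rough illustration of the LOCO-I core mentioned above, the sketch below shows only its fixed prediction step, the median edge detector (MED), which predicts each pixel from its left, upper, and upper-left neighbours. Context modelling, bias correction, and the Golomb coding of the residuals are all omitted. The border handling (out-of-image neighbours treated as 0) is a simplifying assumption of this sketch and not the rule in the standard, and the function names are hypothetical.

```python
def med_predict(a: int, b: int, c: int) -> int:
    """Median edge detector: a = left, b = above, c = upper-left."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def _neighbours(rows, current, x, y):
    """Return (a, b, c) for pixel (x, y); missing neighbours count as 0.

    Assumption of this sketch only; JPEG-LS defines its own border rules.
    """
    a = current[x - 1] if x > 0 else 0
    b = rows[y - 1][x] if y > 0 else 0
    c = rows[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def residuals(image):
    """Map a list of pixel rows to prediction residuals."""
    out = []
    for y, row in enumerate(image):
        out.append([
            pixel - med_predict(*_neighbours(image, row, x, y))
            for x, pixel in enumerate(row)
        ])
    return out


def reconstruct(res):
    """Invert residuals(): rebuild the pixels in raster order."""
    image = []
    for y, row in enumerate(res):
        rec = []
        for x, r in enumerate(row):
            rec.append(r + med_predict(*_neighbours(image, rec, x, y)))
        image.append(rec)
    return image


if __name__ == "__main__":
    img = [[10, 10, 12], [10, 11, 13], [11, 12, 14]]
    res = residuals(img)
    print(res)
    print(reconstruct(res) == img)  # True
```

On smooth image regions the residuals cluster near zero, which is what makes a simple Golomb-type code effective as the following stage.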



Lossless Video


GIF, LZW, Unisys


See also Astronomy/FITS below.

Astronomy and FITS

FITS = Flexible Image Transport System

Seismic Data

Other Formats


FLDC (Fujitsu Lossless Data Compression)

3D Compression

Other (Japan)

Other (Overseas)

Compression/Decompression Tools on Windows

奥村晴彦 (Haruhiko Okumura)

Last modified: 2005-03-21 11:52:32