Snappy compressor/decompressor for Java

snappy-java

snappy-java is a Java port of Snappy (http://code.google.com/p/snappy/), a fast C++ compressor/decompressor developed by Google.

Features

  • Fast compression/decompression, at around 200-400 MB/sec.
  • Low memory usage. SnappyOutputStream uses only 32KB+ by default.
  • JNI-based implementation to achieve comparable performance to the native C++ version.
    • Although snappy-java uses JNI, it can be used safely with multiple class loaders (e.g. Tomcat, etc.).
  • Compression/decompression of Java primitive arrays (float[], double[], int[], short[], long[], etc.)
    • To improve the compression ratios of these arrays, you can use a fast data-rearrangement implementation (BitShuffle) before compression
  • Portable across various operating systems. snappy-java contains native libraries built for Windows/Mac/Linux (64-bit) and loads one of them according to your machine environment (it checks the system properties os.name and os.arch).
  • Simple usage. Add the snappy-java-(version).jar file to your classpath. Then call compression/decompression methods in org.xerial.snappy.Snappy.
  • Framing-format support (since version 1.1.0)
  • OSGi support
  • Apache License Version 2.0. Free for both commercial and non-commercial use.
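
The platform-detection bullet above can be illustrated with a small, standard-library-only sketch. It only shows how os.name and os.arch might be turned into a lookup key; the class name PlatformKeySketch, the key format, and the normalization rules are assumptions made for illustration, not snappy-java's actual loading logic.

```java
import java.util.Locale;

public class PlatformKeySketch {
    // Collapse the free-form os.name value into a coarse OS family (hypothetical rules).
    static String osFamily(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.contains("windows")) return "Windows";
        if (n.contains("mac")) return "Mac";
        if (n.contains("linux")) return "Linux";
        // Unknown OS: keep the name but drop characters unsuitable for a path segment.
        return osName.replaceAll("\\W", "");
    }

    // Combine OS family and CPU architecture into a key such as "Linux/amd64".
    static String platformKey(String osName, String osArch) {
        return osFamily(osName) + "/" + osArch;
    }

    public static void main(String[] args) {
        // The same two system properties the feature list mentions.
        System.out.println(platformKey(System.getProperty("os.name"),
                                       System.getProperty("os.arch")));
    }
}
```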

Performance

Download

The current stable version is available from here:

Using with Maven

Add the following dependency to your pom.xml:

<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>(version)</version>
  <type>jar</type>
  <scope>compile</scope>
</dependency>

Using with sbt

libraryDependencies += "org.xerial.snappy" % "snappy-java" % "(version)"

Usage

First, import org.xerial.snappy.Snappy in your Java code:

import org.xerial.snappy.Snappy;

Then use Snappy.compress(byte[]) and Snappy.uncompress(byte[]):

String input = "Hello snappy-java! Snappy-java is a JNI-based wrapper of "
     + "Snappy, a fast compressor/decompressor.";
byte[] compressed = Snappy.compress(input.getBytes("UTF-8"));
byte[] uncompressed = Snappy.uncompress(compressed);

String result = new String(uncompressed, "UTF-8");
System.out.println(result);

In addition, you can use high-level methods (Snappy.compress(String), Snappy.compress(float[]), etc.) and low-level ones that minimize memory copies (e.g., Snappy.rawCompress(..), Snappy.rawUncompress(..)).
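
As a rough sketch of these two styles, the following assumes snappy-java is on the classpath. The class name SnappyApiSketch and its helper methods are illustrative only. For the lower-level half it uses the buffer-based Snappy.compress/Snappy.uncompress overloads together with Snappy.maxCompressedLength rather than rawCompress, simply to keep the example short.

```java
import org.xerial.snappy.Snappy;

import java.io.IOException;

public class SnappyApiSketch {
    // High-level: compress a String and restore it.
    static String roundTripString(String text) throws IOException {
        byte[] compressed = Snappy.compress(text);
        return Snappy.uncompressString(compressed);
    }

    // High-level: compress a primitive array and restore it.
    static float[] roundTripFloats(float[] values) throws IOException {
        byte[] compressed = Snappy.compress(values);
        return Snappy.uncompressFloatArray(compressed);
    }

    // Lower-level: the caller supplies the output buffers, avoiding extra copies.
    static byte[] roundTripWithBuffers(byte[] input) throws IOException {
        // Worst-case size of the compressed form for this input length.
        byte[] compressedBuf = new byte[Snappy.maxCompressedLength(input.length)];
        int compressedLen = Snappy.compress(input, 0, input.length, compressedBuf, 0);

        // The compressed data records the original length, so the target can be sized exactly.
        byte[] restored = new byte[Snappy.uncompressedLength(compressedBuf, 0, compressedLen)];
        Snappy.uncompress(compressedBuf, 0, compressedLen, restored, 0);
        return restored;
    }
}
```

Reusing compressedBuf across calls is where the buffer-based form tends to pay off, since no new array is allocated per message.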

Stream-based API

The stream-based compressor/decompressor classes SnappyOutputStream and SnappyInputStream are also available for reading and writing large data sets. SnappyFramedOutputStream and SnappyFramedInputStream can be used for the framing format.
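
A minimal sketch of a stream round trip, assuming snappy-java is on the classpath. It uses in-memory byte arrays in place of files, and the class and method names (SnappyStreamSketch, compressToStream, decompressFromStream) are made up for this example. The framed classes take the same kind of wrapped stream in their constructors, so they should slot in the same way.

```java
import org.xerial.snappy.SnappyInputStream;
import org.xerial.snappy.SnappyOutputStream;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class SnappyStreamSketch {
    // Write data through a SnappyOutputStream into an in-memory sink.
    static byte[] compressToStream(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new SnappyOutputStream(sink)) {
            out.write(data);
        } // closing flushes the last block
        return sink.toByteArray();
    }

    // Read the stream format back with SnappyInputStream.
    static byte[] decompressFromStream(byte[] streamBytes) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new SnappyInputStream(new ByteArrayInputStream(streamBytes))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                result.write(buf, 0, n);
            }
        }
        return result.toByteArray();
    }
}
```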

Compatibility Notes

The original Snappy format definition did not define a file format. A "framing" format was added later to define one, but by that point major software was already using an industry standard instead, represented in this library by the SnappyOutputStream and SnappyInputStream classes.

For interoperability with other libraries, check that compatible formats are used. Note that not all libraries support all variants.

  • SnappyOutputStream and SnappyInputStream use the format [magic header:16 bytes]([block size:int32][compressed data:byte array])*. You can read the result of Snappy.compress with SnappyInputStream, but you cannot read the compressed data generated by SnappyOutputStream with Snappy.uncompress.
  • SnappyHadoopCompatibleOutputStream does not emit a file header, but writes out the current block size as a preamble to each block.
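
To make the SnappyOutputStream layout above concrete, here is a standard-library-only sketch that walks a byte array shaped like [magic header:16 bytes]([block size:int32][compressed data])* and lists the block sizes. It does not decompress anything. The class name BlockLayoutSketch is hypothetical, and the big-endian byte order of the int32 field is an assumption made for this example, not something the text above states.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class BlockLayoutSketch {
    static final int HEADER_BYTES = 16; // magic header length from the format description

    // Returns the size of each compressed block found after the header.
    static List<Integer> blockSizes(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream); // big-endian by default (assumed byte order)
        if (buf.remaining() < HEADER_BYTES) {
            throw new IllegalArgumentException("stream is shorter than the 16-byte header");
        }
        buf.position(HEADER_BYTES); // skip the magic header without interpreting it

        List<Integer> sizes = new ArrayList<>();
        while (buf.remaining() >= 4) {
            int size = buf.getInt(); // [block size:int32]
            if (size < 0 || size > buf.remaining()) {
                throw new IllegalArgumentException("truncated or malformed block");
            }
            sizes.add(size);
            buf.position(buf.position() + size); // skip [compressed data]
        }
        if (buf.hasRemaining()) {
            throw new IllegalArgumentException("trailing bytes after the last block");
        }
        return sizes;
    }
}
```

This also shows why Snappy.uncompress cannot read SnappyOutputStream output directly: the header and per-block size fields are not part of a raw compressed buffer.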

Data format compatibility matrix:

| Write\Read | Snappy.uncompress | SnappyInputStream | SnappyFramedInputStream | org.apache.hadoop.io.compress.SnappyCodec |
|---|---|---|---|---|
| Snappy.compress | ok | ok | x | x |
| SnappyOutputStream | x | ok | x | x |
| SnappyFramedOutputStream | x | x | ok | x |
| SnappyHadoopCompatibleOutputStream | x | x | x | ok |

BitShuffle API (Since 1.1.3-M2)

BitShuffle is an algorithm that reorders (shuffles) data bits so that typed data such as a sequence of integers or float values compresses more efficiently. To use the BitShuffle routines, import org.xerial.snappy.BitShuffle:

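import org.xerial.snappy.BitShuffle;
import org.xerial.snappy.Snappy;
import java.util.Arrays;

int[] data = new int[] {1, 3, 34, 43, 34};
// Shuffle the bits first, then compress the shuffled bytes
byte[] shuffledByteArray = BitShuffle.shuffle(data);
byte[] compressed = Snappy.compress(shuffledByteArray);
// Reverse the steps: uncompress, then unshuffle back to int[]
byte[] uncompressed = Snappy.uncompress(compressed);
int[] result = BitShuffle.unshuffleIntArray(uncompressed);

// Print the array contents rather than the array reference
System.out.println(Arrays.toString(result));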

Shuffling and unshuffling of primitive arrays (e.g., short[], long[], float[], double[], etc.) are supported. See Javadoc for the details.
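
For another primitive type the pattern is the same. The following sketch round-trips a double[] and assumes snappy-java is on the classpath; the class name BitShuffleDoubleSketch and the roundTrip helper are illustrative only.

```java
import org.xerial.snappy.BitShuffle;
import org.xerial.snappy.Snappy;

import java.io.IOException;

public class BitShuffleDoubleSketch {
    // Shuffle, compress, uncompress, and unshuffle a double[] in one pass.
    static double[] roundTrip(double[] values) throws IOException {
        byte[] shuffled = BitShuffle.shuffle(values);
        byte[] compressed = Snappy.compress(shuffled);
        byte[] restoredBytes = Snappy.uncompress(compressed);
        // Use the unshuffle method that matches the original element type.
        return BitShuffle.unshuffleDoubleArray(restoredBytes);
    }
}
```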

Setting classpath

If you have snappy-java-(VERSION).jar in the current directory, use the -classpath option as follows:

$ javac -classpath ".;snappy-java-(VERSION).jar" Sample.java  # in Windows
or
$ javac -classpath ".:snappy-java-(VERSION).jar" Sample.java  # in Mac or Linux
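
The commands above compile a file named Sample.java, whose contents are not given here. A hypothetical minimal version, assuming only the Snappy.compress(byte[]) and Snappy.uncompress(byte[]) calls from the Usage section, might look like this:

```java
import org.xerial.snappy.Snappy;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class Sample {
    // Compress the text as UTF-8 bytes and decode the uncompressed bytes again.
    static String roundTrip(String text) throws IOException {
        byte[] compressed = Snappy.compress(text.getBytes(StandardCharsets.UTF_8));
        return new String(Snappy.uncompress(compressed), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("Hello snappy-java!"));
    }
}
```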

Public discussion group

Post bug reports or feature requests to the Issue Tracker: https://github.com/xerial/snappy-java/issues

The public discussion forum is the Xerial Public Discussion Group.

For developers

snappy-java is built with sbt (simple build tool for Scala). Basic usage:

$ ./sbt            # enter sbt console
> ~test            # run tests upon source code change
> ~test-only *     # run tests that match a given name pattern
> publishM2        # publish jar to $HOME/.m2/repository
> package          # create jar file
> findbugs         # Produce findbugs report in target/findbugs
> jacoco:cover     # Report the code coverage of tests to target/jacoco folder    

If you need to see detailed debug messages, launch sbt with the -Dloglevel=debug option:

$ ./sbt -Dloglevel=debug

For the details of sbt usage, see my blog post: Building Java Projects with sbt

Building from the source code

See the build instructions. Building from the source code is an option when your OS platform and CPU architecture are not supported. To build snappy-java, you need Git, JDK (1.6 or higher), a g++ compiler (mingw on Windows), etc.

$ git clone https://github.com/xerial/snappy-java.git
$ cd snappy-java
$ make

When building on Solaris, use gmake:

$ gmake

The build produces target/snappy-java-$(version).jar, which additionally contains the native library built for your platform.

Miscellaneous Notes

Using pure-java Snappy implementation

snappy-java can optionally use a pure-java implementation of Snappy based on aircompressor. This implementation is selected when no native Snappy library for your platform is found. You can also force the use of this pure-java implementation by setting the JVM property org.xerial.snappy.purejava=true before loading any Snappy class (e.g., with the -Dorg.xerial.snappy.purejava=true option when launching the JVM).
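
When changing the JVM launch options is not practical, the same property can be set from code, as long as this happens before any Snappy class is loaded. A minimal sketch using only the standard library (the class name PureJavaSwitchSketch and the forcePureJava helper are hypothetical):

```java
public class PureJavaSwitchSketch {
    // Property name taken from the paragraph above.
    static final String PURE_JAVA_KEY = "org.xerial.snappy.purejava";

    // Equivalent to passing -Dorg.xerial.snappy.purejava=true on the command line.
    static void forcePureJava() {
        System.setProperty(PURE_JAVA_KEY, "true");
    }

    public static void main(String[] args) {
        // Call this first, before the first reference to any org.xerial.snappy class.
        forcePureJava();
        System.out.println(PURE_JAVA_KEY + "=" + System.getProperty(PURE_JAVA_KEY));
    }
}
```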

Using snappy-java with Tomcat 6 (or higher) Web Server

Simply put the snappy-java jar in the WEB-INF/lib folder of your web application. The usual JNI-library problem no longer exists, since snappy-java version 1.0.3 or higher can be loaded by multiple class loaders.

Configure snappy-java using property file

Prepare an org-xerial-snappy.properties file (under the root path of your library) in Java's property file format. Here is a list of the available properties:

  • org.xerial.snappy.lib.path (directory containing the snappyjava native library)
  • org.xerial.snappy.lib.name (library file name)
  • org.xerial.snappy.tempdir (temporary directory used to extract the native library bundled in snappy-java)
  • org.xerial.snappy.use.systemlib (if this value is true, use the system-installed libsnappyjava.so, looked up on the path specified by java.library.path)
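
As an illustration of the property file format, the sketch below embeds a sample org-xerial-snappy.properties as a string and parses it with java.util.Properties. The values (/var/tmp/snappy, false) are placeholders chosen for this example, and the class name SnappyPropertiesSketch is hypothetical. It only demonstrates the file syntax, not how snappy-java itself reads the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class SnappyPropertiesSketch {
    // Sample file contents; the keys come from the list above, the values are placeholders.
    static final String SAMPLE =
          "# org-xerial-snappy.properties (placeholder values)\n"
        + "org.xerial.snappy.tempdir=/var/tmp/snappy\n"
        + "org.xerial.snappy.use.systemlib=false\n";

    // Parse text in Java's property file format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println(props.getProperty("org.xerial.snappy.tempdir"));
    }
}
```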

Snappy-java is developed by Taro L. Saito. Twitter @taroleo