Snappy compressor/decompressor for Java
Go to file
xerial c597ac2d45 Update OSInfo 2017-01-19 17:57:42 -08:00
docker Add linu32 linu64 targets 2017-01-19 16:48:28 -08:00
lib Update OSInfo 2017-01-19 17:57:42 -08:00
project Fix build scripts 2017-01-19 10:51:35 -08:00
src Add linux x86 binaries 2017-01-19 17:33:12 -08:00
.gitignore Removed unnecessary code 2014-06-26 14:33:54 +09:00
.travis.yml Fix Travis config 2017-01-19 17:52:15 -08:00
INSTALL Fixes issue 26. Add note on using configuration file and license of the embedded libstdc++ library (GCC Runtime Library Exception) 2011-08-23 12:05:19 +09:00
LICENSE add license notes 2011-03-30 18:17:16 +09:00
Makefile Add linu32 linu64 targets 2017-01-19 16:48:28 -08:00
Makefile.common Add linux x86 binaries 2017-01-19 17:33:12 -08:00
Makefile.package to version 1.1.0-M1 2013-03-27 17:24:01 +09:00
Milestone.md Fix release note 2016-06-02 10:43:01 -07:00
NOTICE add support for x-snappy-framed streams 2013-04-15 11:15:02 -05:00
README.md Merge pull request #137 from maropu/SupportBitShuffle 2017-01-19 10:02:07 -08:00
build.sbt Use Java 1.7 2017-01-19 17:55:31 -08:00
googlecode_upload.py Add google code upload script 2011-10-05 11:05:21 +09:00
sbt Upgrade sbt script; the old version could no longer download SBT properly 2015-08-01 12:43:07 -04:00
stylesheet.css Fixes issue 29 Javadoc 2011-09-22 16:39:51 +09:00
version.sbt Fix release note 2016-06-02 10:43:01 -07:00

README.md

The snappy-java is a Java port of the snappy http://code.google.com/p/snappy/, a fast C++ compresser/decompresser developed by Google.

Features

  • Fast compression/decompression around 200~400MB/sec.
  • Less memory usage. SnappyOutputStream uses only 32KB+ in default.
  • JNI-based implementation to achieve comparable performance to the native C++ version.
    • Although snappy-java uses JNI, it can be used safely with multiple class loaders (e.g. Tomcat, etc.).
  • Compression/decompression of Java primitive arrays (float[], double[], int[], short[], long[], etc.)
  • Portable across various operating systems; Snappy-java contains native libraries built for Window/Mac/Linux (64-bit). snappy-java loads one of these libraries according to your machine environment (It looks system properties, os.name and os.arch).
  • Simple usage. Add the snappy-java-(version).jar file to your classpath. Then call compression/decompression methods in org.xerial.snappy.Snappy.
  • Framing-format support (Since 1.1.0 version)
  • OSGi support
  • Apache License Version 2.0. Free for both commercial and non-commercial use.

Performance

Download

The current stable version is available from here:

Using with Maven

Add the following dependency to your pom.xml:

<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>1.1.2.1</version>
  <type>jar</type>
  <scope>compile</scope>
</dependency>

Using with sbt

libraryDependencies += "org.xerial.snappy" % "snappy-java" % "1.1.2.1"

Usage

First, import org.xerial.snapy.Snappy in your Java code:

import org.xerial.snappy.Snappy;

Then use Snappy.compress(byte[]) and Snappy.uncompress(byte[]):

String input = "Hello snappy-java! Snappy-java is a JNI-based wrapper of "
     + "Snappy, a fast compresser/decompresser.";
byte[] compressed = Snappy.compress(input.getBytes("UTF-8"));
byte[] uncompressed = Snappy.uncompress(compressed);

String result = new String(uncompressed, "UTF-8");
System.out.println(result);

In addition, high-level methods (Snappy.compress(String), Snappy.compress(float[] ..) etc. ) and low-level ones (e.g. Snappy.rawCompress(.. ), Snappy.rawUncompress(..), etc.), which minimize memory copies, can be used.

Stream-based API

Stream-based compressor/decompressor SnappyOutputStream/SnappyInputStream are also available for reading/writing large data sets. SnappyFramedOutputStream/SnappyFramedInputStream can be used for the framing format.

Compatibility Notes

  • SnappyOutputStream and SnappyInputStream use [magic header:16 bytes]([block size:int32][compressed data:byte array])* format. You can read the result of Snappy.compress with SnappyInputStream, but you cannot read the compressed data generated by SnappyOutputStream with Snappy.uncompress. Here is the data format compatibility matrix:
Write\Read Snappy.uncompress SnappyInputStream SnappyFramedInputStream
Snappy.compress ok ok x
SnappyOutputStream x ok x
SnappyFramedOutputStream x x ok

Setting classpath

If you have snappy-java-(VERSION).jar in the current directory, use -classpath option as follows:

$ javac -classpath ".;snappy-java-(VERSION).jar" Sample.java  # in Windows
or
$ javac -classpath ".:snappy-java-(VERSION).jar" Sample.java  # in Mac or Linux

Public discussion group

Post bug reports or feature request to the Issue Tracker: https://github.com/xerial/snappy-java/issues

Public discussion forum is here: Xerial Public Discussion Group

Building from the source code

See the installation instruction. Building from the source code is an option when your OS platform and CPU architecture is not supported. To build snappy-java, you need Git, JDK (1.6 or higher), g++ compiler (mingw in Windows) etc.

$ git clone https://github.com/xerial/snappy-java.git
$ cd snappy-java
$ make

When building on Solaris use

$ gmake

A file target/snappy-java-$(version).jar is the product additionally containing the native library built for your platform.

Building Linux x86_64 binary

snappy-java tries to static link libstdc++ to increase the availability for various Linux versions. However, standard distributions of 64-bit Linux OS rarely provide libstdc++ compiled with -fPIC option. I currently uses custom g++, compiled as follows:

$ cd work
$ wget (gcc-4.8.3 source)
$ tar xvfz (gcc-4.8.3.tar.gz)
$ cd gcc-4.8.3
$ ./contrib/download_prerequisites
$ cd ..
$ mkdir objdir
$ cd objdir
$ ../gcc-4.8.3/configure --prefix=$HOME/local/gcc-4.8.3 CXXFLAGS=-fPIC CFLAGS=-fPIC --enable-languages=c,c++
$ make
$ make install

This g++ build enables static linking of libstdc++. For more infomation on building GCC, see GCC's home page.

Building Linux s390/s390x binaries

Older snapshots of snappy contain a buggy config.h.in that does not work properly on some big-endian platforms like Linux on IBM z (s390/s390x). Building snappy-java on s390/s390x requires fetching the snappy source from GitHub, and processing the source with autoconf to obtain a usable config.h. On a RHEL s390x system, these steps produced a working 64-bit snappy-java build (the process should be similar for other distributions):

$ sudo yum install java-1.7.1-ibm-devel libstdc++-static-devel
$ export JAVA_HOME=/usr/lib/jvm/java-1.7.1-ibm-1.7.1.2.10-1jpp.3.el7_0.s390x
$ make USE_GIT=1 GIT_REPO_URL=https://github.com/google/snappy.git GIT_SNAPPY_BRANCH=master IBM_JDK_7=1

Activating SSE2/AVX2 instructions in BitShuffle

The most of the native libraries that snappy-java contains disable SSE2/AVX2 instructions in terms of portability (SSE2 is enabled only in Linux/x86_64 platforms). To enable AVX2 instructions, you need to compile as follows:

$ make CXXFLAGS_BITSHUFFLE=-mavx2  # -msse2 for SSE2 instructions

Cross-compiling for other platforms

The Makefile contains rules for cross-compiling the native library for other platforms so that the snappy-java JAR can support multiple platforms. For example, to build the native libraries for x86 Linux, x86 and x86-64 Windows, and soft- and hard-float ARM:

$ make linux32 win32 win64 linux-arm linux-armhf linux-aarch64

If you append snappy to the line above, it will also build the native library for the current platform and then build the snappy-java JAR (containing all native libraries built so far).

Of course, you must first have the necessary cross-compilers and development libraries installed for each target CPU and OS. For example, on Ubuntu 12.04 for x86-64, install the following packages for each target:

  • linux32: sudo apt-get install g++-multilib libc6-dev-i386 lib32stdc++6
  • win32: sudo apt-get install g++-mingw-w64-i686
  • win64: sudo apt-get install g++-mingw-w64-x86-64
  • arm: sudo apt-get install g++-arm-linux-gnueabi
  • armhf: sudo apt-get install g++-arm-linux-gnueabihf
  • aarch64: sudo apt-get install g++-aarch64-linux

Unfortunately, cross-compiling for Mac OS X is not currently possible; you must compile within OS X.

If you are using Mac and openjdk7 (or higher), use the following option:

$ make native LIBNAME=libsnappyjava.dylib

For developers

snappy-java uses sbt (simple build tool for Scala) as a build tool. Here is a simple usage

$ ./sbt            # enter sbt console
> ~test            # run tests upon source code change
> ~test-only *     # run tests that matches a given name pattern  
> publishM2        # publish jar to $HOME/.m2/repository
> package          # create jar file
> findbugs         # Produce findbugs report in target/findbugs
> jacoco:cover     # Report the code coverage of tests to target/jacoco folder    

If you need to see detailed debug messages, launch sbt with -Dloglevel=debug option:

$ ./sbt -Dloglevel=debug

For the details of sbt usage, see my blog post: Building Java Projects with sbt

Miscellaneous Notes

Using snappy-java with Tomcat 6 (or higher) Web Server

Simply put the snappy-java's jar to WEB-INF/lib folder of your web application. Usual JNI-library specific problem no longer exists since snappy-java version 1.0.3 or higher can be loaded by multiple class loaders.


Snappy-java is developed by Taro L. Saito. Twitter @taroleo