datasketches-java 7.0.0 API

Sketching Core Library

Overview

The Sketching Core Library provides a range of stochastic streaming algorithms and closely related Java technologies that are particularly useful when integrating these algorithms into systems that must process massive data.

This library is divided into packages that constitute distinct groups of functionality:

Note: In general, if the requirements or promises of any method's contract are not fulfilled (that is, if there is a bug in either the method or its caller), then an unchecked exception will be thrown. The precise type of such an unchecked exception does not form part of any method's contract.
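One practical consequence of the note above is that calling code should not depend on the exact exception subtype. The following is a minimal sketch of that idea, not library code: ContractGuard, checkedLog2, and orElse are hypothetical names invented for illustration, and checkedLog2 merely stands in for any method with a precondition.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of datasketches-java. */
final class ContractGuard {
    private ContractGuard() {}

    /**
     * Stand-in for any method with a precondition (hypothetical).
     * Requires k to be a positive power of two and returns log2(k).
     */
    static int checkedLog2(int k) {
        if (k <= 0 || (k & (k - 1)) != 0) {
            throw new IllegalArgumentException("k must be a positive power of two: " + k);
        }
        return Integer.numberOfTrailingZeros(k);
    }

    /**
     * Runs the call and returns the fallback if any unchecked exception is thrown.
     * The catch clause names RuntimeException on purpose: the precise subtype
     * is assumed not to be part of the callee's contract.
     */
    static <T> T orElse(Supplier<T> call, T fallback) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            return fallback;
        }
    }
}
```

For example, ContractGuard.orElse(() -> ContractGuard.checkedLog2(12), -1) yields the fallback because 12 is not a power of two. Since such an exception signals a bug, a wrapper like this probably belongs only at a system boundary where the failure is also logged.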
Packages
Package
Description
This package is the parent package for all sketch families and common code areas.
This package is for common classes that may be used across all the sketch families.
Compressed Probabilistic Counting sketch family
Frequent Distinct Tuples Sketch
The filters package contains data structures used to determine approximate set-membership.
BloomFilter package
This package is dedicated to streaming algorithms that enable estimation of the frequency of occurrence of items in a weighted multiset stream of items.
The hash package contains high-performing, extended Java implementations of Austin Appleby's 128-bit MurmurHash3 hash function, originally coded in C.
The DataSketches™ HLL sketch family package
The hllmap package contains a space-efficient HLL mapping sketch that maps keys to approximate unique counts of identifiers.
This package is for the implementations of the sketch algorithm developed by Zohar Karnin, Kevin Lang, and Edo Liberty that is commonly referred to as the "KLL" sketch after the authors' last names.
 
The quantiles package contains stochastic streaming algorithms that enable single-pass analysis of the distribution of a stream of quantiles.
This package contains common tools and methods for the quantiles, kll and req packages.
This package is for the implementation of the Relative Error Quantiles sketch algorithm.
This package is dedicated to streaming algorithms that enable fixed-size, uniform sampling of weighted and unweighted items from a stream.
t-Digest for estimating quantiles and ranks.
The theta package contains the basic sketch classes that are members of the Theta Sketch Framework.
This package contains common tools and methods for the theta, tuple, tuple/* and fdt packages.
The tuple package contains a number of sketches that are based on the same fundamental algorithms as the Theta Sketch Framework and extend these concepts to whole new families of sketches.
This package is for a generic implementation of the Tuple sketch for a single Double value.
This package is for a generic implementation of the Tuple sketch for a single Integer value.
This package is for a concrete implementation of the Tuple sketch for an array of double values.
This package is for a generic implementation of the Tuple sketch for a single String value.
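To make "fixed-size, uniform sampling from a stream" concrete, here is a minimal, self-contained sketch of classic reservoir sampling (Algorithm R) for unweighted items. It is an illustration under stated assumptions, not the library's implementation: the class ReservoirSampleDemo and its methods are hypothetical names, and it uses only java.util.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of reservoir sampling; not part of datasketches-java. */
final class ReservoirSampleDemo {
    private final int k;             // maximum number of retained items
    private final List<Long> sample; // the current reservoir
    private final Random rng;
    private long n;                  // number of items seen so far

    ReservoirSampleDemo(int k, long seed) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        this.k = k;
        this.sample = new ArrayList<>(k);
        this.rng = new Random(seed);
    }

    /** Offers one stream item in a single pass. */
    void update(long item) {
        n++;
        if (sample.size() < k) {
            sample.add(item); // the first k items always fill the reservoir
            return;
        }
        // Draw j approximately uniformly from [0, n); the new item replaces
        // slot j when j < k, so it is kept with probability k / n.
        long j = (long) (rng.nextDouble() * n);
        if (j < k) {
            sample.set((int) j, item);
        }
    }

    /** Returns a copy of the current sample. */
    List<Long> sample() {
        return new ArrayList<>(sample);
    }

    /** Returns the number of items offered so far. */
    long itemsSeen() {
        return n;
    }
}
```

Memory stays bounded by k regardless of stream length, which is the property that makes this style of algorithm suitable for massive data. Weighted sampling needs a different acceptance rule and is not covered by this sketch.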