All Classes and Interfaces (datasketches-java 9.0.0 API)

This class enables the estimation of error bounds given a sample set size, the sampling probability theta, the number of standard deviations and a simple noDataSeen flag.

BloomFilter

A Bloom filter is a data structure that can be used for probabilistic set membership.
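
As a rough illustration of probabilistic set membership, the following is a minimal, self-contained sketch of a Bloom filter. It is not the library's BloomFilter: the class name TinyBloomFilter, the double-hashing scheme, and the SplitMix64-style mixer are assumptions chosen for brevity.

```java
import java.util.BitSet;

/** Minimal Bloom filter; illustrative only, NOT the library's BloomFilter. */
final class TinyBloomFilter {
  private final BitSet bits;
  private final int numBits;
  private final int numHashes;

  TinyBloomFilter(int numBits, int numHashes) {
    this.bits = new BitSet(numBits);
    this.numBits = numBits;
    this.numHashes = numHashes;
  }

  // SplitMix64-style 64-bit mixer (an assumption; any good hash would do).
  private static long mix(long z) {
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    return z ^ (z >>> 31);
  }

  // Derive the i-th bit index from two base hashes (double hashing).
  private int index(long item, int i) {
    long h1 = mix(item);
    long h2 = mix(h1 ^ 0x9E3779B97F4A7C15L) | 1L; // force an odd step
    return (int) Long.remainderUnsigned(h1 + i * h2, numBits);
  }

  /** Sets the numHashes bits associated with the item. */
  void add(long item) {
    for (int i = 0; i < numHashes; i++) { bits.set(index(item, i)); }
  }

  /** False means definitely absent; true means possibly present. */
  boolean mightContain(long item) {
    for (int i = 0; i < numHashes; i++) {
      if (!bits.get(index(item, i))) { return false; }
    }
    return true;
  }
}
```

An item that was added always answers true; an item that was never added usually answers false, but may answer true with a probability that depends on the number of bits, hashes, and inserted items.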

BloomFilterBuilder

This class provides methods to help estimate the correct parameters when creating a Bloom filter, and methods to create the filter using those values.
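
The textbook sizing formulas for a Bloom filter can be sketched as follows. This is an assumption about the kind of estimate involved, not a copy of the builder's code: the names BloomSizing, bitsFor and hashesFor are hypothetical, and the library may round differently.

```java
/** Textbook Bloom filter sizing; hypothetical helper, NOT the BloomFilterBuilder API. */
final class BloomSizing {
  /** Bits m for n items at false-positive probability p: m = -n ln(p) / (ln 2)^2. */
  static long bitsFor(long maxItems, double targetFpp) {
    double ln2 = Math.log(2.0);
    return (long) Math.ceil(-maxItems * Math.log(targetFpp) / (ln2 * ln2));
  }

  /** Hash count k for m bits and n items: k = (m / n) ln 2, at least 1. */
  static int hashesFor(long maxItems, long numBits) {
    double k = Math.ceil((double) numBits / maxItems * Math.log(2.0));
    return (int) Math.max(1.0, k);
  }
}
```

For one million items and a 1% target false-positive probability, these formulas suggest roughly 9.6 million bits and 7 hash functions.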

BoundsOnBinomialProportions

Confidence intervals for binomial proportions.
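
To make the idea of a confidence interval for a binomial proportion concrete, here is a small sketch using the Wilson score interval. The Wilson interval is a swapped-in, well-known technique used for illustration; it is not claimed to be the method this class implements, and the name WilsonInterval is hypothetical.

```java
/** Wilson score interval; a stand-in technique, NOT the library's algorithm. */
final class WilsonInterval {
  /**
   * Returns {lower, upper} for k successes in n trials,
   * where z is the number of standard deviations (e.g. 2.0).
   */
  static double[] bounds(long n, long k, double z) {
    if (n == 0) { return new double[] {0.0, 1.0}; } // no data: the trivial interval
    double p = (double) k / n;          // observed proportion
    double z2 = z * z;
    double denom = 1.0 + z2 / n;
    double center = (p + z2 / (2.0 * n)) / denom;
    double half = z * Math.sqrt(p * (1.0 - p) / n + z2 / (4.0 * n * n)) / denom;
    return new double[] {Math.max(0.0, center - half), Math.min(1.0, center + half)};
  }
}
```

With 50 successes in 100 trials and z = 2, the interval is symmetric around 0.5 and spans roughly 0.40 to 0.60.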

BoundsOnRatiosInSampledSets

This class is used to compute the bounds on the estimate of the ratio |B| / |A|, where |A| is the unknown size of a set A of unique identifiers; |B| is the unknown size of a subset B of A; a = |S_A| is the observed size of a sample of A obtained by Bernoulli sampling with a known inclusion probability f; and b = |S_A ∩ B| is the observed size of a subset of S_A.

BoundsOnRatiosInThetaSketchedSets

This class is used to compute the bounds on the estimate of the ratio B / A, where A is a Theta Sketch of population PopA, and B is a Theta Sketch of population PopB that is a subset of A, obtained by an intersection of A with some other Theta Sketch C, which acts like a predicate or selection clause. The estimate of the ratio PopB/PopA is BoundsOnRatiosInThetaSketchedSets.getEstimateOfBoverA(A, B). The upper bound on the ratio PopB/PopA is BoundsOnRatiosInThetaSketchedSets.getUpperBoundForBoverA(A, B). The lower bound on the ratio PopB/PopA is BoundsOnRatiosInThetaSketchedSets.getLowerBoundForBoverA(A, B). Note: The theta of B cannot be greater than the theta of A; this holds automatically when B is formed by intersecting A with another sketch.

BoundsOnRatiosInTupleSketchedSets

This class is used to compute the bounds on the estimate of the ratio B / A, where A is a Tuple Sketch of population PopA, and B is a Tuple or Theta Sketch of population PopB that is a subset of A, obtained by an intersection of A with some other Tuple or Theta Sketch C, which acts like a predicate or selection clause. The estimate of the ratio PopB/PopA is BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(A, B). The upper bound on the ratio PopB/PopA is BoundsOnRatiosInTupleSketchedSets.getUpperBoundForBoverA(A, B). The lower bound on the ratio PopB/PopA is BoundsOnRatiosInTupleSketchedSets.getLowerBoundForBoverA(A, B). Note: The theta of B cannot be greater than the theta of A; this holds automatically when B is formed by intersecting A with another sketch.

BoundsRule

This instructs the user about which of the upper and lower bounds of a partition definition row should be included with the returned data.

ByteArrayUtil

Useful methods for byte arrays.

ClassicUtil

Utilities for the classic quantiles sketches that are independent of the item type.

CompactQuantilesDoublesSketch

Compact sketches are inherently read only.

CompactThetaSketch

The parent class of all the CompactThetaSketches.

CompactTupleSketch<S>

CompactTupleSketches are never created directly.

CompressionCharacterization

This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.

CountMinSketch

Java implementation of the CountMin sketch data structure of Cormode and Muthukrishnan.
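
The core idea of a Count-Min sketch (several rows of counters, one hash per row, and a minimum over rows at query time) can be sketched as below. This is a simplified illustration, not the library's CountMinSketch: the class name TinyCountMin, the per-row seeding, and the mixer are assumptions, and depth is assumed to be at least 1.

```java
/** Minimal Count-Min sketch; illustrative only, NOT the library's CountMinSketch. */
final class TinyCountMin {
  private final long[][] table; // one row of counters per hash function
  private final int width;

  TinyCountMin(int depth, int width) {
    this.table = new long[depth][width];
    this.width = width;
  }

  // Row-specific bucket: mix the item with a per-row seed (SplitMix64-style mixer).
  private int bucket(long item, int row) {
    long z = item + (row + 1) * 0x9E3779B97F4A7C15L;
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    z ^= (z >>> 31);
    return (int) Long.remainderUnsigned(z, width);
  }

  /** Adds a non-negative weight to one counter in every row. */
  void update(long item, long weight) {
    for (int r = 0; r < table.length; r++) { table[r][bucket(item, r)] += weight; }
  }

  /** The minimum over rows never underestimates the true total weight. */
  long estimate(long item) {
    long min = Long.MAX_VALUE;
    for (int r = 0; r < table.length; r++) {
      min = Math.min(min, table[r][bucket(item, r)]);
    }
    return min;
  }
}
```

Collisions can only inflate a counter, so with non-negative weights the estimate is an upper bound on the true frequency; taking the minimum across rows keeps that overestimate small.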

CpcSketch

This is a unique-counting sketch that implements the Compressed Probabilistic Counting (CPC, a.k.a. FM85) algorithms developed by Kevin Lang in his paper "Back to the Future: an Even More Nearly Optimal Cardinality Estimation Algorithm".

CpcUnion

The union (merge) operation for the CPC sketches.

CpcWrapper

This provides a read-only view of a serialized image of a CpcSketch, which can be on-heap or off-heap represented as a MemorySegment object, or on-heap represented as a byte array.

DeserializeResult<T>

Returns an object and its size in bytes as a result of a deserialize operation

DirectBitArrayR

This class can maintain the BitArray object off-heap.

DoublesSketchSortedView

The SortedView of the Quantiles Classic QuantilesDoublesSketch and the KllDoublesSketch.

DoublesSortedView

The Sorted View for quantile sketches of primitive type double.

DoublesSortedViewIterator

Iterator over quantile sketches of primitive type double.

DoubleSummary

Summary for generic tuple sketches of type Double.

DoubleSummary.Mode

The aggregation modes for this Summary

DoubleSummaryDeserializer

Implements SummaryDeserializer<DoubleSummary>

DoubleSummaryFactory

Factory for DoubleSummary.

DoubleSummarySetOperations

Methods for defining how unions and intersections of two objects of type DoubleSummary are performed.

DoubleTupleSketch

Extends UpdatableTupleSketch<Double, DoubleSummary>

EbppsItemsSketch<T>

An implementation of an Exact and Bounded Sampling Proportional to Size sketch.

ErrorType

Specifies one of two types of error regions of the statistical classification Confusion Matrix that can be excluded from a returned sample of Frequent Items.

Family

Defines the various families of sketch and set operation classes.

FdtSketch

A Frequent Distinct Tuples sketch.

Filter<T>

Class for filtering entries from a TupleSketch given a Summary

FloatsSketchSortedView

The SortedView for the KllFloatsSketch and the ReqSketch.

FloatsSortedView

The Sorted View for quantiles of primitive type float.

FloatsSortedViewIterator

Iterator over quantile sketches of primitive type float.

FrequentItemsSketch<T>

This sketch is based on the paper https://arxiv.org/abs/1705.07001 ("A High-Performance Algorithm for Identifying Frequent Items in Data Streams" by Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler) and is useful for tracking approximate frequencies of items of type <T> with optional associated counts (<T> item, long count) that are members of a multiset of such items.

FrequentItemsSketch.Row<T>

Row class that defines the return values from a getFrequentItems query.

FrequentLongsSketch

This sketch is based on the paper https://arxiv.org/abs/1705.07001 ("A High-Performance Algorithm for Identifying Frequent Items in Data Streams" by Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler) and is useful for tracking approximate frequencies of long items with optional associated counts (long item, long count) that are members of a multiset of such items.

FrequentLongsSketch.Row

Row class that defines the return values from a getFrequentItems query.

GenericInequalitySearch

This provides efficient, unique and unambiguous binary searching for inequality comparison criteria for ordered arrays of values that may include duplicate values.

GenericInequalitySearch.Inequality

The enumerator of inequalities

GenericPartitionBoundaries<T>

This defines the returned results of the getPartitionBoundaries() function and includes the basic methods needed to construct actual partitions.

GenericSortedView<T>

The Sorted View for quantiles of generic type.

GenericSortedViewIterator<T>

Iterator over quantile sketches of generic type.

Group

Defines a Group from a Frequent Distinct Tuple query.

HashIterator

This is used to iterate over the retained hash values of the Theta sketch.

HashOperations

Helper class for the common hash table methods.

HllSketch

The HllSketch is actually a collection of compact implementations of Philippe Flajolet’s HyperLogLog (HLL) sketch but with significantly improved error behavior and excellent speed performance.

HllUnion

This performs union operations for all HllSketches.

IncludeMinMax

This class reinserts the min and max values into the sorted view arrays as required.

IncludeMinMax.DoublesPair

A simple structure to hold a pair of arrays

IncludeMinMax.FloatsPair

A simple structure to hold a pair of arrays

IncludeMinMax.ItemsPair<T>

A simple structure to hold a pair of arrays

IncludeMinMax.LongsPair

A simple structure to hold a pair of arrays

InequalitySearch

This provides efficient, unique and unambiguous binary searching for inequality comparison criteria for ordered arrays of values that may include duplicate values.
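
The point of an inequality search over an array with duplicates is that each criterion has exactly one well-defined answer. The sketch below shows two such criteria on a sorted double array; the class and method names (InequalityDemo, findLE, findGE) are hypothetical and do not reproduce the library's API.

```java
/** Binary search for inequality criteria; hypothetical helper, NOT the library API. */
final class InequalityDemo {
  /** Index of the LAST element <= v in an ascending array, or -1 if none. */
  static int findLE(double[] arr, double v) {
    int lo = 0, hi = arr.length - 1, ans = -1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (arr[mid] <= v) { ans = mid; lo = mid + 1; } // candidate; look further right
      else { hi = mid - 1; }
    }
    return ans;
  }

  /** Index of the FIRST element >= v in an ascending array, or -1 if none. */
  static int findGE(double[] arr, double v) {
    int lo = 0, hi = arr.length - 1, ans = -1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (arr[mid] >= v) { ans = mid; hi = mid - 1; } // candidate; look further left
      else { lo = mid + 1; }
    }
    return ans;
  }
}
```

On the array {1, 2, 2, 2, 5}, searching for 2 with the "less than or equal" criterion resolves to the highest matching index (3), while "greater than or equal" resolves to the lowest (1), so the duplicates never make the result ambiguous.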

IntegerSummary

Summary for generic tuple sketches of type Integer.

IntegerSummary.Mode

The aggregation modes for this Summary

IntegerSummaryDeserializer

Implements SummaryDeserializer<IntegerSummary>

IntegerSummaryFactory

Factory for IntegerSummary.

IntegerSummarySetOperations

Methods for defining how unions and intersections of two objects of type IntegerSummary are performed.

IntegerTupleSketch

Extends UpdatableTupleSketch<Integer, IntegerSummary>

ItemsSketchSortedView<T>

The SortedView for the KllItemsSketch and the classic QuantilesItemsSketch.

JaccardSimilarity

Jaccard similarity of two ThetaSketches.
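
For reference, the Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The sketch below computes it exactly on ordinary Java sets; the library works on sketches and returns estimates with bounds, so this hypothetical JaccardDemo helper only illustrates the quantity being estimated.

```java
import java.util.HashSet;
import java.util.Set;

/** Exact Jaccard similarity on plain sets; illustrative only, NOT the library API. */
final class JaccardDemo {
  static <T> double jaccard(Set<T> a, Set<T> b) {
    if (a.isEmpty() && b.isEmpty()) { return 1.0; } // convention chosen for this sketch
    Set<T> intersection = new HashSet<>(a);
    intersection.retainAll(b);
    int unionSize = a.size() + b.size() - intersection.size();
    return (double) intersection.size() / unionSize;
  }
}
```

For the sets {1, 2, 3} and {2, 3, 4}, the intersection has 2 elements and the union has 4, giving a similarity of 0.5.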

JaccardSimilarity

Jaccard similarity of two TupleSketches, or alternatively, of a TupleSketch and a ThetaSketch.

KllDoublesSketch

This variation of the KllSketch implements primitive doubles.

KllDoublesSketchIterator

Iterator over KllDoublesSketch.

KllFloatsSketch

This variation of the KllSketch implements primitive floats.

KllFloatsSketchIterator

Iterator over KllFloatsSketch.

KllItemsSketch<T>

This variation of the KllSketch implements generic data types.

KllItemsSketchIterator<T>

Iterator over KllItemsSketch.

KllLongsSketch

This variation of the KllSketch implements primitive longs.

KllLongsSketchIterator

Iterator over KllLongsSketch.

KllSketch

This class is the root of the KLL sketch class hierarchy.

KllSketch.SketchStructure

Used primarily to define the structure of the serialized sketch.

KllSketch.SketchType

Used to define the variable type of the current instance of this class.

KllSketchIterator

The base implementation for the KLL sketch iterator hierarchy used for viewing the non-ordered quantiles retained by a sketch.

KolmogorovSmirnov

Kolmogorov-Smirnov Test.
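
The two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between two empirical cumulative distribution functions. The sketch below computes it exactly from raw, non-empty samples; the library class operates on sketches rather than raw arrays, so KsDemo and its method are hypothetical and serve only to show the statistic itself.

```java
import java.util.Arrays;

/** Exact two-sample KS statistic on raw arrays; illustrative, NOT the library API. */
final class KsDemo {
  /** Returns D = max |F_x(v) - F_y(v)| over all v; inputs are assumed non-empty. */
  static double statistic(double[] x, double[] y) {
    double[] a = x.clone();
    double[] b = y.clone();
    Arrays.sort(a);
    Arrays.sort(b);
    int i = 0, j = 0;
    double d = 0.0;
    while (i < a.length && j < b.length) {
      double v = Math.min(a[i], b[j]);
      // Advance both cursors past every value <= v so ties are handled together.
      while (i < a.length && a[i] <= v) { i++; }
      while (j < b.length && b[j] <= v) { j++; }
      double gap = Math.abs((double) i / a.length - (double) j / b.length);
      d = Math.max(d, gap);
    }
    return d;
  }
}
```

Identical samples give D = 0, and samples with no overlap in range give D = 1.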

LongsSketchSortedView

The SortedView of the KllLongsSketch.

LongsSortedView

The Sorted View for quantile sketches of primitive type long.

LongsSortedViewIterator

Iterator over quantile sketches of primitive type long.

MemorySegmentRequest

This is a callback interface to provide a means to request a new MemorySegment of a specified size.

MemorySegmentRequest.Default

A convenience class that provides a default implementation.

MemorySegmentRequestExample

This is an example of a possible implementation of the MemorySegmentRequest interface where all requested segments are allocated off-heap.

MemorySegmentStatus

Methods for inquiring the status of a backing MemorySegment.

MergingValidation

This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.

MurmurHash3

The MurmurHash3 is a fast, non-cryptographic, 128-bit hash function that has excellent avalanche and 2-way bit independence properties.

MurmurHash3FFM

The MurmurHash3 is a fast, non-cryptographic, 128-bit hash function that has excellent avalanche and 2-way bit independence properties.

Partitioner<T,S>

A partitioning process that can partition very large data sets into thousands of partitions of approximately the same size.

Partitioner.PartitionBoundsRow<T>

Defines a row for List of PartitionBounds.

Partitioner.StackElement<T>

Holds data for a Stack element

PartitioningFeature<T>

This enables the special functions for performing efficient partitioning of massive data.

Positional

Defines the relative positional API.

PositionalSegment

Defines the API for relative positional access to a MemorySegment.

PositionInvariantsException

Position operation violation.

PostProcessor

This processes the contents of a FDT sketch to extract the primary keys with the most frequent unique combinations of the non-primary dimensions.

QuantilesAPI

This is a stochastic streaming sketch that enables near-real-time analysis of the approximate distribution of items from a very large stream in a single pass, requiring only that the items are comparable.

QuantilesDoublesAPI

The Quantiles API for item type double.

QuantilesDoublesSketch

This is an implementation of the Low Discrepancy Mergeable Quantiles Sketch, using doubles, described in section 3.2 of the journal version of the paper "Mergeable Summaries" by Agarwal, Cormode, Huang, Phillips, Wei, and Yi.

QuantilesDoublesSketchBuilder

For building a new quantiles QuantilesDoublesSketch.

QuantilesDoublesSketchIterator

Iterator over QuantilesDoublesSketch.

QuantilesDoublesSketchIteratorAPI

The quantiles sketch iterator for primitive type double.

QuantilesDoublesUnion

The API for Union operations for QuantilesDoublesSketches

QuantilesDoublesUnionBuilder

For building a new QuantilesDoublesSketch Union operation.

QuantileSearchCriteria

These search criteria are used by the KLL, REQ and Classic Quantiles sketches in the DataSketches library.

QuantilesFloatsAPI

The Quantiles API for item type float.

QuantilesFloatsSketchIterator

The quantiles sketch iterator for primitive type float.

QuantilesGenericAPI<T>

The Quantiles API for item type generic.

QuantilesGenericSketchIteratorAPI<T>

The quantiles sketch iterator for generic types.

QuantilesItemsSketch<T>

This is an implementation of the Low Discrepancy Mergeable Quantiles Sketch, using generic items, described in section 3.2 of the journal version of the paper "Mergeable Summaries" by Agarwal, Cormode, Huang, Phillips, Wei, and Yi.

QuantilesItemsSketchIterator<T>

Iterator over QuantilesItemsSketch.

QuantilesItemsUnion<T>

The API for Union operations for generic QuantilesItemsSketches

QuantilesLongsAPI

The Quantiles API for item type long.

QuantilesLongsSketchIterator

The quantiles sketch iterator for primitive type long.

QuantilesSketchIteratorAPI

This is the base API for the iterator hierarchy used for viewing the non-ordered quantiles retained by the classic Quantiles* sketches and KLL sketches, for example.

QuantilesUtil

Utilities for the quantiles sketches.

QuickMergingValidation

This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.

QuickSelect

QuickSelect algorithm improved from Sedgewick.
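
For orientation, the sketch below shows a classic Hoare-style selection that finds the k-th smallest value without fully sorting the array. It is a generic textbook variant written for this page, not the library's QuickSelect code, and the name QuickSelectDemo is hypothetical.

```java
/** Hoare-style selection; a textbook variant, NOT the library's QuickSelect. */
final class QuickSelectDemo {
  /**
   * Returns the k-th smallest value (0-based) of arr.
   * The array is partially reordered in place; 0 <= k < arr.length is assumed.
   */
  static long select(long[] arr, int k) {
    int lo = 0, hi = arr.length - 1;
    while (lo < hi) {
      long pivot = arr[k];
      int i = lo, j = hi;
      while (i <= j) {
        while (arr[i] < pivot) { i++; }
        while (arr[j] > pivot) { j--; }
        if (i <= j) {
          long t = arr[i]; arr[i] = arr[j]; arr[j] = t;
          i++;
          j--;
        }
      }
      // Keep only the side of the partition that still contains index k.
      if (j < k) { lo = i; }
      if (k < i) { hi = j; }
    }
    return arr[k];
  }
}
```

Selecting k = 2 from {5, 3, 9, 1, 7} returns 5, the median, in expected linear time.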

ReqDebug

The signaling interface that allows comprehensive analysis of the ReqSketch and ReqCompactor while eliminating code clutter in the main classes.

ReqSketch

This Relative Error Quantiles Sketch is the Java implementation based on the paper "Relative Error Streaming Quantiles" by Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, and Pavel Veselý, and is loosely derived from a Python prototype written by Pavel Veselý.

ReqSketchBuilder

For building a new ReqSketch

ReqSketchIterator

Iterator over all retained items of the ReqSketch.

ReservoirItemsSketch<T>

This sketch provides a reservoir sample over an input stream of items.
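
The basic reservoir sampling idea (often called Algorithm R) can be sketched in a few lines. This is a simplified illustration rather than the library's ReservoirItemsSketch: the class TinyReservoir, its constructor arguments, and its accessor names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Minimal reservoir sampler (Algorithm R); NOT the library's ReservoirItemsSketch. */
final class TinyReservoir<T> {
  private final int k;                          // maximum number of retained samples
  private final List<T> samples = new ArrayList<>();
  private final Random rnd;
  private long n;                               // number of items seen so far

  TinyReservoir(int k, long seed) {
    this.k = k;
    this.rnd = new Random(seed);
  }

  void update(T item) {
    n++;
    if (samples.size() < k) {                   // fill phase: keep everything
      samples.add(item);
      return;
    }
    long j = (long) (rnd.nextDouble() * n);     // approximately uniform in [0, n)
    if (j < k) { samples.set((int) j, item); }  // replace with probability k / n
  }

  List<T> getSamples() { return new ArrayList<>(samples); }

  long getN() { return n; }
}
```

After the fill phase, each new item replaces a random slot with probability k / n, which keeps every item seen so far equally likely to be in the sample.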

ReservoirItemsUnion<T>

Class to union reservoir samples of generic items.

ReservoirLongsSketch

This sketch provides a reservoir sample over an input stream of longs.

ReservoirLongsUnion

Class to union reservoir samples of longs.

ResizeFactor

For the Families that accept this configuration parameter, it controls the size multiple that affects how fast the internal cache grows, when more space is required.

SampleSubsetSummary

A simple object to capture the results of a subset sum query on a sampling sketch.

SerializerDeserializer

Multipurpose serializer-deserializer for a collection of sketches defined by the enum.

SerializerDeserializer.SketchType

Defines the sketch classes that this SerializerDeserializer can handle.

SetOperationCornerCases

Simplifies and speeds up set operations by resolving specific corner cases.

SetOperationCornerCases.AnotbAction

A not B actions

SetOperationCornerCases.CornerCase

List of corner cases

SetOperationCornerCases.IntersectAction

Intersection actions

SetOperationCornerCases.UnionAction

List of union actions

SketchesArgumentException

Illegal Arguments Exception class for the library

SketchesException

Exception class for the library

SketchesNotSupportedException

This operation or mode is not supported.

SketchesReadOnlyException

Write operation attempted on a read-only class.

SketchesStateException

Illegal State Exception class for the library

SketchFillRequest<T,S>

This is a callback request to the data source to fill a quantiles sketch, which is returned to the caller.

SketchPartitionLimits

This defines the methods required to compute the partition limits.

Sort

Specialized sorting algorithm that can sort one array and permute another array the same way.
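
Sorting one array while permuting a second array the same way can be illustrated with an index sort. The sketch below favors clarity over speed and is not the library's specialized algorithm; PairedSortDemo and sortByKeys are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Index-sort illustration of paired sorting; NOT the library's Sort class. */
final class PairedSortDemo {
  /** Sorts keys ascending and applies the same permutation to values. */
  static void sortByKeys(double[] keys, long[] values) {
    if (keys.length != values.length) {
      throw new IllegalArgumentException("Arrays must have the same length");
    }
    // Sort the positions 0..n-1 by the key stored at each position.
    Integer[] order = new Integer[keys.length];
    for (int i = 0; i < order.length; i++) { order[i] = i; }
    Arrays.sort(order, Comparator.comparingDouble((Integer i) -> keys[i]));

    // Apply the resulting permutation to both arrays from unmodified copies.
    double[] oldKeys = keys.clone();
    long[] oldValues = values.clone();
    for (int i = 0; i < order.length; i++) {
      keys[i] = oldKeys[order[i]];
      values[i] = oldValues[order[i]];
    }
  }
}
```

With keys {3.0, 1.0, 2.0} and values {30, 10, 20}, the call leaves keys as {1.0, 2.0, 3.0} and values as {10, 20, 30}, so each value stays attached to its original key.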

SortedView

This is the base interface for the Sorted View interface hierarchy and defines the methods that are type independent.

SortedViewIterator

This is the base interface for the SortedViewIterator hierarchy used with a SortedView obtained from a quantile-type sketch.

SpecialValueLayouts

Value Layouts for Non-native Endianness

StreamingValidation

This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.

Summary

Interface for user-defined Summary, which is associated with every hash in a tuple sketch

SummaryDeserializer<S>

Interface for deserializing user-defined Summary

SummaryFactory<S>

Interface for user-defined SummaryFactory

SummarySetOperations<S>

This is to provide methods of producing unions and intersections of two Summary objects.

SuppressFBWarnings

Used to suppress SpotBugs warnings.

TDigestDouble

t-Digest for estimating quantiles and ranks.

TestUtil

Utility methods for Test

TgtHllType

Specifies the target type of HLL sketch to be created.

ThetaAnotB

Computes a set difference, A-AND-NOT-B, of two ThetaSketches.

ThetaIntersection

The API for intersection operations

ThetaSetOperation

The parent API for all Set Operations

ThetaSetOperationBuilder

For building a new ThetaSetOperation.

ThetaSketch

The top-level class for all theta sketches.

ThetaUnion

Compute the union of two or more theta sketches.

ThetaUtil

Utility methods for the Theta Family of sketches

TupleAnotB<S>

Computes a set difference, A-AND-NOT-B, of two generic TupleSketches.

TupleIntersection<S>

Computes an intersection of two or more generic TupleSketches or generic TupleSketches combined with ThetaSketches.

TupleSketch<S>

The top-level class for all Tuple sketches.

TupleSketchIterator<S>

Iterator over a generic tuple sketch

TupleUnion<S>

Compute the union of two or more generic tuple sketches or generic TupleSketches combined with ThetaSketches.

UniqueCountMap

This is a real-time, key-value HLL mapping sketch that tracks approximate unique counts of identifiers (the values) associated with each key.

UpdatableQuantilesDoublesSketch

Extends QuantilesDoublesSketch

UpdatableSummary<U>

Interface for updating user-defined Summary

UpdatableThetaSketch

The parent class for the UpdatableThetaSketch families, such as QuickSelectThetaSketch and AlphaSketch.

UpdatableThetaSketchBuilder

For building a new UpdatableThetaSketch.

UpdatableTupleSketch<U,S>

An extension of QuickSelectSketch<S>, which can be updated with many types of keys.

UpdatableTupleSketchBuilder<U,S>

For building a new generic tuple UpdatableTupleSketch

UpdateReturnState

See Update Return State

Util

Common utility functions.

Util

Common utility functions for Tuples

VarOptItemsSamples<T>

This class provides access to the samples contained in a VarOptItemsSketch.

VarOptItemsSketch<T>

This sketch provides a variance optimal sample over an input stream of weighted items.

VarOptItemsUnion<T>

Provides a unioning operation over varopt sketches.

XxHash

The XxHash is a fast, non-cryptographic, 64-bit hash function that has excellent avalanche and 2-way bit independence properties.

XxHash64

The XxHash is a fast, non-cryptographic, 64-bit hash function that has excellent avalanche and 2-way bit independence properties.