All Classes and Interfaces
Class
Description
Methods of serializing and deserializing arrays of Boolean as a bit array.
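As a rough illustration of what storing Boolean values as a bit array means, the following minimal sketch packs eight values per byte. The class name BooleanBitPacking, the method names, and the least-significant-bit-first layout are assumptions made for this example; they are not the library's serialization format.

```java
import java.util.Arrays;

final class BooleanBitPacking {
    // Packs booleans eight per byte, least significant bit first (an assumed layout).
    static byte[] pack(boolean[] values) {
        byte[] out = new byte[(values.length + 7) / 8];
        for (int i = 0; i < values.length; i++) {
            if (values[i]) {
                out[i >>> 3] |= (byte) (1 << (i & 7));
            }
        }
        return out;
    }

    // Reverses pack(); the caller supplies the original element count.
    static boolean[] unpack(byte[] bytes, int count) {
        boolean[] out = new boolean[count];
        for (int i = 0; i < count; i++) {
            out[i] = (bytes[i >>> 3] & (1 << (i & 7))) != 0;
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] input = {true, false, true, true, false, false, false, false, true};
        byte[] packed = pack(input);
        System.out.println("bytes used: " + packed.length);
        System.out.println("round trip ok: " + Arrays.equals(input, unpack(packed, input.length)));
    }
}
```

Because the byte array alone does not record how many bits of the last byte are meaningful, a real format would also need to store the element count.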
Computes a set difference of two tuple sketches of type ArrayOfDoubles
Computes a set difference, A-AND-NOT-B, of two ArrayOfDoublesSketches.
Combines two arrays of double values for use with ArrayOfDoubles tuple sketches
Top level compact tuple sketch of type ArrayOfDoubles.
Computes the intersection of two or more tuple sketches of type ArrayOfDoubles.
Methods of serializing and deserializing arrays of Double.
Builds set operations object for tuple sketches of type ArrayOfDoubles.
The base class for the tuple sketch of type ArrayOfDoubles, where an array of double values
is associated with each key.
Interface for iterating over tuple sketches of type ArrayOfDoubles
The base class for unions of tuple sketches of type ArrayOfDoubles.
The top level for updatable tuple sketches of type ArrayOfDoubles.
For building a new ArrayOfDoublesUpdatableSketch
Base class for serializing and deserializing custom types.
Methods of serializing and deserializing arrays of Long.
Methods of serializing and deserializing arrays of the object version of primitive types of
Number.
Methods of serializing and deserializing arrays of String.
Implements UpdatableSummary<String[]>
Implements SummaryDeserializer<ArrayOfStringsSummary>
Implements SummaryFactory<ArrayOfStringsSummary>
Implements SummarySetOperations<ArrayOfStringsSummary>
Extends UpdatableTupleSketch<String[], ArrayOfStringsSummary>
Methods of serializing and deserializing arrays of String.
Contains common equality binary search algorithms.
Algorithms with logarithmic complexity for searching in an array.
This class enables the estimation of error bounds given a sample set size, the sampling
probability theta, the number of standard deviations and a simple noDataSeen flag.
A Bloom filter is a data structure that can be used for probabilistic
set membership.
This class provides methods to help estimate the correct parameters when
creating a Bloom filter, and methods to create the filter using those values.
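To make the idea of estimating Bloom filter parameters concrete, here is a minimal sketch based on the standard textbook formulas for the number of bits and the number of hash functions. The class BloomFilterSizing and its method names are hypothetical, and the library's own methods may round or cap these values differently.

```java
final class BloomFilterSizing {
    // Bits needed for n distinct items at a target false positive probability p:
    // m = ceil(-n * ln(p) / (ln 2)^2)
    static long optimalNumBits(long numItems, double targetFpp) {
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-numItems * Math.log(targetFpp) / (ln2 * ln2));
    }

    // Hash count that approximately minimizes the false positive rate:
    // k = ceil((m / n) * ln 2), and never fewer than one hash function.
    static int optimalNumHashes(long numItems, long numBits) {
        double k = Math.ceil((double) numBits / numItems * Math.log(2.0));
        return (int) Math.max(1.0, k);
    }

    public static void main(String[] args) {
        long n = 1_000L;
        long m = optimalNumBits(n, 0.01);
        System.out.println("bits: " + m + ", hashes: " + optimalNumHashes(n, m));
    }
}
```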
Confidence intervals for binomial proportions.
This class is used to compute the bounds on the estimate of the ratio |B| / |A|, where:
|A| is the unknown size of a set A of unique identifiers.
|B| is the unknown size of a subset B of A.
a = |SA| is the observed size of a sample of A
that was obtained by Bernoulli sampling with a known inclusion probability f.
b = |SA ∩ B| is the observed size of a subset
of SA.
This class is used to compute the bounds on the estimate of the ratio B / A, where:
A is a Theta Sketch of population PopA.
B is a Theta Sketch of population PopB that is a subset of A,
obtained by an intersection of A with some other Theta Sketch C,
which acts like a predicate or selection clause.
The estimate of the ratio PopB/PopA is
BoundsOnRatiosInThetaSketchedSets.getEstimateOfBoverA(A, B).
The Upper Bound estimate on the ratio PopB/PopA is
BoundsOnRatiosInThetaSketchedSets.getUpperBoundForBoverA(A, B).
The Lower Bound estimate on the ratio PopB/PopA is
BoundsOnRatiosInThetaSketchedSets.getLowerBoundForBoverA(A, B).
Note: The theta of A cannot be greater than the theta of B.
This class is used to compute the bounds on the estimate of the ratio B / A, where:
A is a Tuple Sketch of population PopA.
B is a Tuple or Theta Sketch of population PopB that is a subset of A,
obtained by an intersection of A with some other Tuple or Theta Sketch C,
which acts like a predicate or selection clause.
The estimate of the ratio PopB/PopA is
BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(A, B).
The Upper Bound estimate on the ratio PopB/PopA is
BoundsOnRatiosInTupleSketchedSets.getUpperBoundForBoverA(A, B).
The Lower Bound estimate on the ratio PopB/PopA is
BoundsOnRatiosInTupleSketchedSets.getLowerBoundForBoverA(A, B).
Note: The theta of A cannot be greater than the theta of B.
This instructs the user about which of the upper and lower bounds of a partition definition row
should be included with the returned data.
Useful methods for byte arrays.
Utilities for the classic quantiles sketches that are independent of the item type.
Compact sketches are inherently read only.
The parent class of all the CompactThetaSketches.
CompactTupleSketches are never created directly.
This code is used both by unit tests, for short running tests,
and by the characterization repository for longer running, more exhaustive testing.
Java implementation of the CountMin sketch data structure of Cormode and Muthukrishnan.
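The CountMin idea can be sketched in a few lines: keep several rows of counters, hash each item to one counter per row, and report the minimum counter as the frequency estimate. The class TinyCountMin, its methods, and the hash mixing below are illustrative assumptions, not the library's implementation.

```java
final class TinyCountMin {
    private final long[][] counts;
    private final int width;
    private final int depth;

    TinyCountMin(int depth, int width) {
        this.depth = depth;
        this.width = width;
        this.counts = new long[depth][width];
    }

    // Row-specific bucket from a simple mixing hash (an illustrative choice only).
    private int bucket(long item, int row) {
        long h = (item + 1) * 0x9E3779B97F4A7C15L + row * 0xC2B2AE3D27D4EB4FL;
        h ^= h >>> 32;
        h *= 0xD6E8FEB86659FD93L;
        h ^= h >>> 32;
        return (int) Long.remainderUnsigned(h, width);
    }

    // Adds weight to one counter in every row.
    void update(long item, long weight) {
        for (int r = 0; r < depth; r++) {
            counts[r][bucket(item, r)] += weight;
        }
    }

    // With non-negative weights, the minimum over rows never underestimates the true count.
    long estimate(long item) {
        long min = Long.MAX_VALUE;
        for (int r = 0; r < depth; r++) {
            min = Math.min(min, counts[r][bucket(item, r)]);
        }
        return min;
    }

    public static void main(String[] args) {
        TinyCountMin sketch = new TinyCountMin(4, 1024);
        sketch.update(42L, 3);
        sketch.update(42L, 2);
        sketch.update(7L, 1);
        System.out.println("estimate for 42: " + sketch.estimate(42L));
    }
}
```

Collisions can only inflate a counter, which is why the estimate is an upper bound on the true frequency and is bounded above by the total weight inserted.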
This is a unique-counting sketch that implements the
Compressed Probabilistic Counting (CPC, a.k.a. FM85) algorithms developed by Kevin Lang in
his paper "Back to the Future: an Even More Nearly Optimal Cardinality Estimation Algorithm".
The union (merge) operation for the CPC sketches.
This provides a read-only view of a serialized image of a CpcSketch, which can be
on-heap or off-heap represented as a MemorySegment object, or on-heap represented as a byte array.
Returns an object and its size in bytes as a result of a deserialize operation
This class can maintain the BitArray object off-heap.
The SortedView of the Quantiles Classic QuantilesDoublesSketch and the KllDoublesSketch.
The Sorted View for quantile sketches of primitive type double.
Iterator over quantile sketches of primitive type double.
Summary for generic tuple sketches of type Double.
The aggregation modes for this Summary
Implements SummaryDeserializer<DoubleSummary>
Factory for DoubleSummary.
Methods for defining how unions and intersections of two objects of type DoubleSummary
are performed.
Extends UpdatableTupleSketch<Double, DoubleSummary>
An implementation of an Exact and Bounded Sampling Proportional to Size sketch.
Specifies one of two types of error regions of the statistical classification Confusion Matrix
that can be excluded from a returned sample of Frequent Items.
Defines the various families of sketch and set operation classes.
A Frequent Distinct Tuples sketch.
Class for filtering entries from a TupleSketch given a Summary
The SortedView for the KllFloatsSketch and the ReqSketch.
The Sorted View for quantiles of primitive type float.
Iterator over quantile sketches of primitive type float.
This sketch is based on the paper https://arxiv.org/abs/1705.07001
("A High-Performance Algorithm for Identifying Frequent Items in Data Streams"
by Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler)
and is useful for tracking approximate frequencies of items of type <T>
with optional associated counts (<T> item, long count) that are members of a
multiset of such items.
Row class that defines the return values from a getFrequentItems query.
This sketch is based on the paper https://arxiv.org/abs/1705.07001
("A High-Performance Algorithm for Identifying Frequent Items in Data Streams"
by Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler)
and is useful for tracking approximate frequencies of long items with optional
associated counts (long item, long count) that are members of a multiset of
such items.
Row class that defines the return values from a getFrequentItems query.
This provides efficient, unique and unambiguous binary searching for inequality comparison criteria
for ordered arrays of values that may include duplicate values.
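A minimal sketch of what an unambiguous inequality search looks like when the array contains duplicates: for "less than or equal" the highest qualifying index is returned, and for "greater than or equal" the lowest. The class and method names are hypothetical, and returning -1 for "no such element" is an assumption of this example.

```java
final class InequalitySearchSketch {
    // Highest index i with arr[i] <= v, or -1 if no element qualifies.
    static int findLE(double[] arr, double v) {
        int lo = 0;
        int hi = arr.length - 1;
        int ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (arr[mid] <= v) {
                ans = mid;      // candidate; look for a higher one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return ans;
    }

    // Lowest index i with arr[i] >= v, or -1 if no element qualifies.
    static int findGE(double[] arr, double v) {
        int lo = 0;
        int hi = arr.length - 1;
        int ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (arr[mid] >= v) {
                ans = mid;      // candidate; look for a lower one
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        return ans;
    }

    public static void main(String[] args) {
        double[] sorted = {1, 2, 2, 2, 5};
        System.out.println("LE 2 -> " + findLE(sorted, 2));
        System.out.println("GE 2 -> " + findGE(sorted, 2));
    }
}
```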
The enumerator of inequalities
This defines the returned results of the getPartitionBoundaries() function and
includes the basic methods needed to construct actual partitions.
The Sorted View for quantiles of generic type.
Iterator over quantile sketches of generic type.
Defines a Group from a Frequent Distinct Tuple query.
This is used to iterate over the retained hash values of the Theta sketch.
Helper class for the common hash table methods.
The HllSketch is actually a collection of compact implementations of Philippe Flajolet’s HyperLogLog (HLL)
sketch but with significantly improved error behavior and excellent speed performance.
This performs union operations for all HllSketches.
This class reinserts the min and max values into the sorted view arrays as required.
A simple structure to hold a pair of arrays
A simple structure to hold a pair of arrays
A simple structure to hold a pair of arrays
A simple structure to hold a pair of arrays
This provides efficient, unique and unambiguous binary searching for inequality comparison criteria
for ordered arrays of values that may include duplicate values.
Summary for generic tuple sketches of type Integer.
The aggregation modes for this Summary
Implements SummaryDeserializer<IntegerSummary>
Factory for IntegerSummary.
Methods for defining how unions and intersections of two objects of type IntegerSummary
are performed.
Extends UpdatableTupleSketch<Integer, IntegerSummary>
The SortedView for the KllItemsSketch and the classic QuantilesItemsSketch.
Jaccard similarity of two ThetaSketches.
Jaccard similarity of two TupleSketches, or alternatively, of a TupleSketch and a ThetaSketch.
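For reference, the quantity being estimated is the Jaccard index |A ∩ B| / |A ∪ B|. The following minimal sketch computes it exactly on ordinary Java sets; the sketch-based classes instead estimate it from sketches, typically with error bounds. The class name ExactJaccard and the convention of returning 1.0 for two empty sets are assumptions of this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class ExactJaccard {
    // |A ∩ B| / |A ∪ B|; defined here as 1.0 when both sets are empty (a convention).
    static <T> double jaccard(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // Inclusion-exclusion gives the union size without building the union.
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new HashSet<>(Arrays.asList(2, 3, 4));
        System.out.println("Jaccard index: " + jaccard(a, b));
    }
}
```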
This variation of the KllSketch implements primitive doubles.
Iterator over KllDoublesSketch.
This variation of the KllSketch implements primitive floats.
Iterator over KllFloatsSketch.
This variation of the KllSketch implements generic data types.
Iterator over KllItemsSketch.
This variation of the KllSketch implements primitive longs.
Iterator over KllLongsSketch.
This class is the root of the KLL sketch class hierarchy.
Used primarily to define the structure of the serialized sketch.
Used to define the variable type of the current instance of this class.
The base implementation for the KLL sketch iterator hierarchy used for viewing the
non-ordered quantiles retained by a sketch.
Kolmogorov-Smirnov Test
See Kolmogorov–Smirnov Test
The SortedView of the KllLongsSketch.
The Sorted View for quantile sketches of primitive type long.
Iterator over quantile sketches of primitive type long.
This is a callback interface to provide a means to request a new MemorySegment of a specified size.
A convenience class that provides a default implementation.
This is an example of a possible implementation of the MemorySegmentRequest interface
where all requested segments are allocated off-heap.
Methods for inquiring the status of a backing MemorySegment.
This code is used both by unit tests, for short running tests,
and by the characterization repository for longer running, more exhaustive testing.
The MurmurHash3 is a fast, non-cryptographic, 128-bit hash function that has
excellent avalanche and 2-way bit independence properties.
The MurmurHash3 is a fast, non-cryptographic, 128-bit hash function that has
excellent avalanche and 2-way bit independence properties.
A partitioning process that can partition very large data sets into thousands
of partitions of approximately the same size.
Defines a row for List of PartitionBounds.
Holds data for a Stack element
This enables the special functions for performing efficient partitioning of massive data.
Defines the relative positional API.
Defines the API for relative positional access to a MemorySegment.
Position operation violation.
This processes the contents of a FDT sketch to extract the
primary keys with the most frequent unique combinations of the non-primary dimensions.
This is a stochastic streaming sketch that enables near-real time analysis of the
approximate distribution of items from a very large stream in a single pass, requiring only
that the items are comparable.
The Quantiles API for item type double.
This is an implementation of the Low Discrepancy Mergeable Quantiles Sketch, using doubles,
described in section 3.2 of the journal version of the paper "Mergeable Summaries"
by Agarwal, Cormode, Huang, Phillips, Wei, and Yi:
For building a new quantiles QuantilesDoublesSketch.
Iterator over QuantilesDoublesSketch.
The quantiles sketch iterator for primitive type double.
The API for Union operations for QuantilesDoublesSketches
For building a new QuantilesDoublesSketch Union operation.
These search criteria are used by the KLL, REQ and Classic Quantiles sketches in the DataSketches library.
The Quantiles API for item type float.
The quantiles sketch iterator for primitive type float.
The Quantiles API for item type generic.
The quantiles sketch iterator for generic types.
This is an implementation of the Low Discrepancy Mergeable Quantiles Sketch, using generic items,
described in section 3.2 of the journal version of the paper "Mergeable Summaries"
by Agarwal, Cormode, Huang, Phillips, Wei, and Yi:
Iterator over QuantilesItemsSketch.
The API for Union operations for generic QuantilesItemsSketches
The Quantiles API for item type long.
The quantiles sketch iterator for primitive type long.
This is the base API for the iterator hierarchy used for viewing the
non-ordered quantiles retained by the classic Quantiles* sketches and KLL sketches, for example.
Utilities for the quantiles sketches.
This code is used both by unit tests, for short running tests,
and by the characterization repository for longer running, more exhaustive testing.
QuickSelect algorithm improved from Sedgewick.
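As an illustration of the general technique, here is a minimal quickselect in the style of Sedgewick's partitioning scheme. It rearranges the array in place and returns the value that would sit at the requested index if the range were sorted. The class SelectDemo and its signatures are hypothetical and are not the library's API.

```java
final class SelectDemo {
    // Returns the value that would be at index k if arr[lo..hi] were sorted.
    // Rearranges arr in place; requires lo <= k <= hi.
    static double select(double[] arr, int lo, int hi, int k) {
        while (hi > lo) {
            int j = partition(arr, lo, hi);
            if (j == k) {
                return arr[k];
            }
            if (j > k) {
                hi = j - 1;
            } else {
                lo = j + 1;
            }
        }
        return arr[k];
    }

    // Partitions arr[lo..hi] around arr[lo] and returns the pivot's final index.
    private static int partition(double[] arr, int lo, int hi) {
        int i = lo;
        int j = hi + 1;
        double v = arr[lo];
        while (true) {
            while (arr[++i] < v) { if (i == hi) { break; } }
            while (v < arr[--j]) { if (j == lo) { break; } }
            if (i >= j) { break; }
            swap(arr, i, j);
        }
        swap(arr, lo, j);
        return j;
    }

    private static void swap(double[] arr, int a, int b) {
        double t = arr[a];
        arr[a] = arr[b];
        arr[b] = t;
    }

    public static void main(String[] args) {
        double[] data = {5, 1, 4, 2, 3};
        System.out.println("median: " + select(data, 0, data.length - 1, 2));
    }
}
```

This version uses the first element as the pivot, so already-sorted input degrades to quadratic time; shuffling the input first is a common remedy.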
The signaling interface that allows comprehensive analysis of the ReqSketch and ReqCompactor
while eliminating code clutter in the main classes.
This Relative Error Quantiles Sketch is the Java implementation based on the paper
"Relative Error Streaming Quantiles" by Graham Cormode, Zohar Karnin, Edo Liberty,
Justin Thaler, Pavel Veselý, and loosely derived from a Python prototype written by Pavel Veselý.
For building a new ReqSketch
Iterator over all retained items of the ReqSketch.
This sketch provides a reservoir sample over an input stream of items.
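The classic way to maintain a uniform sample of fixed size k over a stream of unknown length is reservoir sampling (Algorithm R). The sketch below is a minimal version of that textbook algorithm; the class SimpleReservoir and its method names are hypothetical and do not reflect the library's API or its internal optimizations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SimpleReservoir<T> {
    private final int k;
    private final Random random;
    private final List<T> samples = new ArrayList<>();
    private long seen = 0;

    SimpleReservoir(int k, long seed) {
        this.k = k;
        this.random = new Random(seed);
    }

    void update(T item) {
        seen++;
        if (samples.size() < k) {
            // The first k items always fill the reservoir.
            samples.add(item);
        } else {
            // Keep the new item with probability k / seen, replacing a uniformly chosen slot.
            long slot = (long) (random.nextDouble() * seen);
            if (slot < k) {
                samples.set((int) slot, item);
            }
        }
    }

    List<T> getSamples() {
        return new ArrayList<>(samples);
    }

    long getN() {
        return seen;
    }

    public static void main(String[] args) {
        SimpleReservoir<Integer> reservoir = new SimpleReservoir<>(10, 42L);
        for (int i = 0; i < 1000; i++) {
            reservoir.update(i);
        }
        System.out.println(reservoir.getSamples());
    }
}
```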
Class to union reservoir samples of generic items.
This sketch provides a reservoir sample over an input stream of longs.
Class to union reservoir samples of longs.
For the Families that accept this configuration parameter, it controls the size multiple that
affects how fast the internal cache grows, when more space is required.
A simple object to capture the results of a subset sum query on a sampling sketch.
Multipurpose serializer-deserializer for a collection of sketches defined by the enum.
Defines the sketch classes that this SerializerDeserializer can handle.
Simplifies and speeds up set operations by resolving specific corner cases.
A not B actions
List of corner cases
Intersection actions
List of union actions
Illegal Arguments Exception class for the library
Exception class for the library
This operation or mode is not supported.
Write operation attempted on a read-only class.
Illegal State Exception class for the library
This is a callback request to the data source to fill a quantiles sketch,
which is returned to the caller.
This defines the methods required to compute the partition limits.
Specialized sorting algorithm that can sort one array and permute another array the same way.
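One simple way to sort one array and permute a second array the same way is to sort an index array by key and then apply the resulting order to both arrays. The following minimal sketch shows that approach; the class PairedSort is hypothetical, and the library's specialized algorithm may work quite differently (for example, without the temporary copies used here).

```java
import java.util.Arrays;

final class PairedSort {
    // Sorts keys ascending in place and applies the same permutation to values.
    static void sortTogether(double[] keys, long[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        // Sort indices by their key; Arrays.sort on objects is stable.
        Integer[] order = new Integer[keys.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(keys[a], keys[b]));

        // Apply the permutation using copies of the original contents.
        double[] keyCopy = keys.clone();
        long[] valueCopy = values.clone();
        for (int i = 0; i < order.length; i++) {
            keys[i] = keyCopy[order[i]];
            values[i] = valueCopy[order[i]];
        }
    }

    public static void main(String[] args) {
        double[] keys = {3.0, 1.0, 2.0};
        long[] values = {30L, 10L, 20L};
        sortTogether(keys, values);
        System.out.println(Arrays.toString(keys) + " " + Arrays.toString(values));
    }
}
```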
This is the base interface for the Sorted View interface hierarchy and defines the methods that are type independent.
This is the base interface for the SortedViewIterator hierarchy used with a SortedView obtained
from a quantile-type sketch.
Value Layouts for Non-native Endianness
This code is used both by unit tests, for short running tests,
and by the characterization repository for longer running, more exhaustive testing.
Interface for user-defined Summary, which is associated with every hash in a tuple sketch
Interface for deserializing user-defined Summary
Interface for user-defined SummaryFactory
This is to provide methods of producing unions and intersections of two Summary objects.
Used to suppress SpotBugs warnings.
t-Digest for estimating quantiles and ranks.
Utility methods for Test
Specifies the target type of HLL sketch to be created.
Computes a set difference, A-AND-NOT-B, of two ThetaSketches.
The API for intersection operations
The parent API for all Set Operations
For building a new ThetaSetOperation.
The top-level class for all theta sketches.
Compute the union of two or more theta sketches.
Utility methods for the Theta Family of sketches
Computes a set difference, A-AND-NOT-B, of two generic TupleSketches.
Computes an intersection of two or more generic TupleSketches or generic TupleSketches
combined with ThetaSketches.
The top-level class for all Tuple sketches.
Iterator over a generic tuple sketch
Compute the union of two or more generic tuple sketches or generic TupleSketches combined with
ThetaSketches.
This is a real-time, key-value HLL mapping sketch that tracks approximate unique counts of
identifiers (the values) associated with each key.
Extends QuantilesDoublesSketch
Interface for updating user-defined Summary
The parent class for the UpdatableThetaSketch families, such as QuickSelectThetaSketch and AlphaSketch.
For building a new UpdatableThetaSketch.
An extension of QuickSelectSketch<S>, which can be updated with many types of keys.
For building a new generic tuple UpdatableTupleSketch
Common utility functions.
Common utility functions for Tuples
This class provides access to the samples contained in a VarOptItemsSketch.
This sketch provides a variance optimal sample over an input stream of weighted items.
Provides a unioning operation over varopt sketches.
The XxHash is a fast, non-cryptographic, 64-bit hash function that has
excellent avalanche and 2-way bit independence properties.
The XxHash is a fast, non-cryptographic, 64-bit hash function that has
excellent avalanche and 2-way bit independence properties.