Class ItemsSketch<T>
- java.lang.Object
-
- org.apache.datasketches.quantiles.ItemsSketch<T>
-
- Type Parameters:
T
- The sketch data type
- All Implemented Interfaces:
QuantilesAPI, QuantilesGenericAPI<T>
public final class ItemsSketch<T> extends Object implements QuantilesGenericAPI<T>
This is an implementation of the Low Discrepancy Mergeable Quantiles Sketch, using generic items, described in section 3.2 of the journal version of the paper "Mergeable Summaries" by Agarwal, Cormode, Huang, Phillips, Wei, and Yi. A k of 128 produces a normalized rank error of about 1.7%. For example, the median returned from getQuantile(0.5) will be between the actual quantiles from the hypothetically sorted array of input quantiles at normalized ranks of 0.483 and 0.517, with a confidence of about 99%.
The size of an ItemsSketch is very dependent on the size of the generic Items input into the sketch, so there is no comparable size table as there is for the DoublesSketch.
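The arithmetic behind the median example above can be sketched in a few lines. This is only an illustration of how a normalized rank error widens a query rank into an interval; the class and method names (RankInterval, lower, upper) are hypothetical, and clamping the result to [0.0, 1.0] is an assumption made here, not a statement about the library's internals.

```java
/** Illustrative only: widens a normalized rank by a rank error epsilon. */
final class RankInterval {

  /** Lower edge of the rank interval, clamped at 0.0 (the clamping is an assumption). */
  static double lower(final double rank, final double epsilon) {
    return Math.max(0.0, rank - epsilon);
  }

  /** Upper edge of the rank interval, clamped at 1.0 (the clamping is an assumption). */
  static double upper(final double rank, final double epsilon) {
    return Math.min(1.0, rank + epsilon);
  }

  public static void main(final String[] args) {
    final double epsilon = 0.017; // "about 1.7%" for k = 128, taken from the text above
    System.out.printf("median rank interval: [%.3f, %.3f]%n",
        lower(0.5, epsilon), upper(0.5, epsilon));
  }
}
```

With epsilon = 0.017 and a query rank of 0.5, the two edges land at roughly 0.483 and 0.517, matching the figures quoted in the description.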
- See Also:
QuantilesAPI
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.datasketches.quantilescommon.QuantilesGenericAPI
QuantilesGenericAPI.GenericPartitionBoundaries<T>
-
-
Field Summary
Modifier and Type Field Description
static Random
rand
Setting the seed makes the results of the sketch deterministic if the input items are received in exactly the same order.
-
Fields inherited from interface org.apache.datasketches.quantilescommon.QuantilesAPI
EMPTY_MSG, MEM_REQ_SVR_NULL_MSG, NOT_SINGLE_ITEM_MSG, TGT_IS_READ_ONLY_MSG, UNSUPPORTED_MSG
-
-
Method Summary
Modifier and Type Method Description
ItemsSketch<T>
downSample(int newK)
From an existing sketch, this creates a new sketch that can have a smaller K.
double[]
getCDF(T[] splitPoints)
This is equivalent to getCDF(splitPoints, INCLUSIVE).
double[]
getCDF(T[] splitPoints, QuantileSearchCriteria searchCrit)
Returns an approximation to the Cumulative Distribution Function (CDF) of the input stream as a monotonically increasing array of double ranks (or cumulative probabilities) on the interval [0.0, 1.0], given a set of splitPoints.
static <T> ItemsSketch<T>
getInstance(Class<T> clazz, int k, Comparator<? super T> comparator)
Obtains a new instance of an ItemsSketch.
static <T> ItemsSketch<T>
getInstance(Class<T> clazz, Comparator<? super T> comparator)
Obtains a new instance of an ItemsSketch using the DEFAULT_K.
static <T> ItemsSketch<T>
getInstance(Class<T> clazz, org.apache.datasketches.memory.Memory srcMem, Comparator<? super T> comparator, ArrayOfItemsSerDe<T> serDe)
Heapifies the given srcMem, which must be a Memory image of an ItemsSketch.
int
getK()
Gets the user configured parameter k, which controls the accuracy of the sketch and its memory space usage.
static int
getKFromEpsilon(double epsilon, boolean pmf)
Gets the approximate k to use given epsilon, the normalized rank error.
T
getMaxItem()
Returns the maximum item of the stream.
T
getMinItem()
Returns the minimum item of the stream.
long
getN()
Gets the length of the input stream.
double
getNormalizedRankError(boolean pmf)
Gets the approximate rank error of this sketch normalized as a fraction between zero and one.
static double
getNormalizedRankError(int k, boolean pmf)
Gets the normalized rank error given k and pmf.
int
getNumRetained()
Gets the number of quantiles retained by the sketch.
QuantilesGenericAPI.GenericPartitionBoundaries<T>
getPartitionBoundaries(int numEquallyWeighted, QuantileSearchCriteria searchCrit)
This method returns an instance of GenericPartitionBoundaries, which provides sufficient information for the user to create the given number of equally weighted partitions.
double[]
getPMF(T[] splitPoints)
This is equivalent to getPMF(splitPoints, INCLUSIVE).
double[]
getPMF(T[] splitPoints, QuantileSearchCriteria searchCrit)
Returns an approximation to the Probability Mass Function (PMF) of the input stream as an array of probability masses as doubles on the interval [0.0, 1.0], given a set of splitPoints.
T
getQuantile(double rank)
This is equivalent to getQuantile(rank, INCLUSIVE).
T
getQuantile(double rank, QuantileSearchCriteria searchCrit)
Gets the approximate quantile of the given normalized rank and the given search criterion.
T
getQuantileLowerBound(double rank)
Gets the lower bound of the quantile confidence interval in which the quantile of the given rank exists.
T[]
getQuantiles(double[] ranks)
This is equivalent to getQuantiles(ranks, INCLUSIVE).
T[]
getQuantiles(double[] ranks, QuantileSearchCriteria searchCrit)
Gets an array of quantiles from the given array of normalized ranks.
T
getQuantileUpperBound(double rank)
Gets the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.
double
getRank(T quantile)
This is equivalent to getRank(quantile, INCLUSIVE).
double
getRank(T quantile, QuantileSearchCriteria searchCrit)
Gets the normalized rank corresponding to the given quantile.
double
getRankLowerBound(double rank)
Gets the lower bound of the rank confidence interval in which the true rank of the given rank exists.
double[]
getRanks(T[] quantiles)
This is equivalent to getRanks(quantiles, INCLUSIVE).
double[]
getRanks(T[] quantiles, QuantileSearchCriteria searchCrit)
Gets an array of normalized ranks corresponding to the given array of quantiles and the given search criterion.
double
getRankUpperBound(double rank)
Gets the upper bound of the rank confidence interval in which the true rank of the given rank exists.
Class<T>
getSketchType()
GenericSortedView<T>
getSortedView()
Gets the sorted view of this sketch.
boolean
hasMemory()
Returns true if this sketch's data structure is backed by Memory or WritableMemory.
boolean
isDirect()
Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory).
boolean
isEmpty()
Returns true if this sketch is empty.
boolean
isEstimationMode()
Returns true if this sketch is in estimation mode.
boolean
isReadOnly()
Returns true if this sketch is read only.
QuantilesGenericSketchIterator<T>
iterator()
Gets the iterator for this sketch, which is not sorted.
void
putMemory(org.apache.datasketches.memory.WritableMemory dstMem, ArrayOfItemsSerDe<T> serDe)
Puts the current sketch into the given Memory if there is sufficient space.
void
reset()
Resets this sketch to the empty state.
byte[]
toByteArray(boolean ordered, ArrayOfItemsSerDe<T> serDe)
Serialize this sketch to a byte array form.
byte[]
toByteArray(ArrayOfItemsSerDe<T> serDe)
Serialize this sketch to a byte array form.
String
toString()
Returns a summary of the key parameters of the sketch.
String
toString(boolean sketchSummary, boolean dataDetail)
Returns summary information about this sketch.
static String
toString(byte[] byteArr)
Returns a human readable string of the preamble of a byte array image of an ItemsSketch.
static String
toString(org.apache.datasketches.memory.Memory mem)
Returns a human readable string of the preamble of a Memory image of an ItemsSketch.
void
update(T item)
Updates this sketch with the given item.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.apache.datasketches.quantilescommon.QuantilesGenericAPI
getPartitionBoundaries
-
-
-
-
Field Detail
-
rand
public static final Random rand
Setting the seed makes the results of the sketch deterministic if the input items are received in exactly the same order. This is only useful when performing test comparisons; otherwise it is not recommended.
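As background for the note above, the following minimal sketch shows the general java.util.Random behavior it relies on: two generators created with the same seed produce the same sequence. It does not touch the sketch itself; the class name SeedDemo and the seed value are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: same seed, same pseudo-random sequence. */
final class SeedDemo {

  /** Draws count ints from a java.util.Random created with the given seed. */
  static int[] draw(final long seed, final int count) {
    final Random rnd = new Random(seed);
    final int[] out = new int[count];
    for (int i = 0; i < count; i++) {
      out[i] = rnd.nextInt();
    }
    return out;
  }

  public static void main(final String[] args) {
    // Two independent generators with an identical (arbitrary) seed agree on every draw.
    System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5)));
  }
}
```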
-
-
Method Detail
-
getInstance
public static <T> ItemsSketch<T> getInstance(Class<T> clazz, Comparator<? super T> comparator)
Obtains a new instance of an ItemsSketch using the DEFAULT_K.
- Type Parameters:
T
- The sketch data type
- Parameters:
clazz
- the given class of T
comparator
- to compare items
- Returns:
- an ItemsSketch<T>.
-
getInstance
public static <T> ItemsSketch<T> getInstance(Class<T> clazz, int k, Comparator<? super T> comparator)
Obtains a new instance of an ItemsSketch.
- Type Parameters:
T
- The sketch data type
- Parameters:
clazz
- the given class of T
k
- Parameter that controls the space usage of the sketch and the accuracy of its estimates. Must be a power of 2, greater than 2 and less than 65536.
comparator
- to compare items
- Returns:
- an ItemsSketch<T>.
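A caller who wants to fail fast before constructing a sketch could pre-check k along these lines. This is a minimal sketch that takes the constraint on k literally as worded in the parameter description; the library's own check may treat the edges differently, and the names KCheck and isValidK are hypothetical.

```java
/** Illustrative only: checks k against the constraint as worded in the parameter description. */
final class KCheck {

  /** Returns true if k is a power of 2, greater than 2 and less than 65536. */
  static boolean isValidK(final int k) {
    // For a positive int, exactly one set bit means a power of 2.
    return k > 2 && k < 65536 && Integer.bitCount(k) == 1;
  }

  public static void main(final String[] args) {
    System.out.println("128 valid? " + isValidK(128));
    System.out.println("100 valid? " + isValidK(100));
  }
}
```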
-
getInstance
public static <T> ItemsSketch<T> getInstance(Class<T> clazz, org.apache.datasketches.memory.Memory srcMem, Comparator<? super T> comparator, ArrayOfItemsSerDe<T> serDe)
Heapifies the given srcMem, which must be a Memory image of an ItemsSketch.
- Type Parameters:
T
- The sketch data type
- Parameters:
clazz
- the given class of T
srcMem
- a Memory image of a sketch. See Memory
comparator
- to compare items
serDe
- an instance of ArrayOfItemsSerDe
- Returns:
- an ItemsSketch<T> on the Java heap.
-
getCDF
public double[] getCDF(T[] splitPoints)
Description copied from interface:QuantilesGenericAPI
This is equivalent to getCDF(splitPoints, INCLUSIVE).
- Specified by:
getCDF
in interfaceQuantilesGenericAPI<T>
- Parameters:
splitPoints
- an array of m unique, monotonically increasing items.
- Returns:
- a discrete CDF array of m+1 double ranks (or cumulative probabilities) on the interval [0.0, 1.0].
-
getCDF
public double[] getCDF(T[] splitPoints, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
Returns an approximation to the Cumulative Distribution Function (CDF) of the input stream as a monotonically increasing array of double ranks (or cumulative probabilities) on the interval [0.0, 1.0], given a set of splitPoints. The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(false) function.
- Specified by:
getCDF
in interfaceQuantilesGenericAPI<T>
- Parameters:
splitPoints
- an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 overlapping intervals. The start of each interval is below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and the end of the interval is the rank or cumulative probability corresponding to the split point.
The (m+1)th interval represents 100% of the distribution represented by the sketch and consistent with the definition of a cumulative probability distribution, thus the (m+1)th rank or probability in the returned array is always 1.0.
If a split point exactly equals a retained item of the sketch and the search criterion is:
- INCLUSIVE, the resulting cumulative probability will include that item.
- EXCLUSIVE, the resulting cumulative probability will not include the weight of that split point.
It is not recommended to include either the minimum or maximum items of the input stream.
searchCrit
- the desired search criteria.
- Returns:
- a discrete CDF array of m+1 double ranks (or cumulative probabilities) on the interval [0.0, 1.0].
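To make the split-point semantics concrete, here is a minimal sketch that computes the exact CDF over a small, fully known data set. It is a reference computation only, not the sketch's approximate algorithm; the class ExactCdf, its method, and the boolean flag standing in for the INCLUSIVE and EXCLUSIVE criteria are all hypothetical, and the data array is assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative only: exact CDF over known data, mirroring the m+1 output layout. */
final class ExactCdf {

  /**
   * Returns splitPoints.length + 1 cumulative fractions; the last entry is always 1.0.
   * When inclusive is true, items equal to a split point count toward that split point.
   * Assumes data is non-empty.
   */
  static <T> double[] cdf(final T[] data, final T[] splitPoints,
      final Comparator<? super T> cmp, final boolean inclusive) {
    final double[] out = new double[splitPoints.length + 1];
    for (int i = 0; i < splitPoints.length; i++) {
      long count = 0;
      for (final T item : data) {
        final int c = cmp.compare(item, splitPoints[i]);
        if (c < 0 || (inclusive && c == 0)) {
          count++;
        }
      }
      out[i] = (double) count / data.length;
    }
    out[splitPoints.length] = 1.0; // the (m+1)th interval covers the whole distribution
    return out;
  }

  public static void main(final String[] args) {
    final Integer[] data = {1, 2, 3, 4, 5};
    final Integer[] splits = {2, 4};
    // prints [0.4, 0.8, 1.0]
    System.out.println(Arrays.toString(cdf(data, splits, Comparator.naturalOrder(), true)));
    // prints [0.2, 0.6, 1.0]
    System.out.println(Arrays.toString(cdf(data, splits, Comparator.naturalOrder(), false)));
  }
}
```

The only difference between the two calls is whether the items equal to 2 and 4 are counted, which is the distinction the INCLUSIVE and EXCLUSIVE bullets describe.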
-
getMaxItem
public T getMaxItem()
Description copied from interface:QuantilesGenericAPI
Returns the maximum item of the stream. This may be distinct from the largest item retained by the sketch algorithm.
- Specified by:
getMaxItem
in interfaceQuantilesGenericAPI<T>
- Returns:
- the maximum item of the stream
-
getMinItem
public T getMinItem()
Description copied from interface:QuantilesGenericAPI
Returns the minimum item of the stream. This may be distinct from the smallest item retained by the sketch algorithm.
- Specified by:
getMinItem
in interfaceQuantilesGenericAPI<T>
- Returns:
- the minimum item of the stream
-
getPartitionBoundaries
public QuantilesGenericAPI.GenericPartitionBoundaries<T> getPartitionBoundaries(int numEquallyWeighted, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
This method returns an instance of GenericPartitionBoundaries, which provides sufficient information for the user to create the given number of equally weighted partitions.
- Specified by:
getPartitionBoundaries
in interfaceQuantilesGenericAPI<T>
- Parameters:
numEquallyWeighted
- an integer that specifies the number of equally weighted partitions between getMinItem() and getMaxItem(). This must be an integer greater than zero.
- A 1 will return: minItem, maxItem.
- A 2 will return: minItem, median quantile, maxItem.
- Etc.
searchCrit
- If INCLUSIVE, all the returned quantiles are the upper boundaries of the equally weighted partitions with the exception of the lowest returned quantile, which is the lowest boundary of the lowest ranked partition. If EXCLUSIVE, all the returned quantiles are the lower boundaries of the equally weighted partitions with the exception of the highest returned quantile, which is the upper boundary of the highest ranked partition.
- Returns:
- an instance of GenericPartitionBoundaries.
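Read literally, the examples for numEquallyWeighted suggest that the boundaries sit at evenly spaced normalized ranks: 0 and 1 for one partition; 0, 0.5, and 1 for two. The sketch below computes such ranks. The even spacing is an assumption drawn from those examples, and the names PartitionRanks and boundaryRanks are hypothetical.

```java
/** Illustrative only: evenly spaced normalized ranks for n equally weighted partitions. */
final class PartitionRanks {

  /** Returns numEquallyWeighted + 1 normalized ranks running from 0.0 to 1.0. */
  static double[] boundaryRanks(final int numEquallyWeighted) {
    if (numEquallyWeighted < 1) {
      throw new IllegalArgumentException("numEquallyWeighted must be greater than zero");
    }
    final double[] ranks = new double[numEquallyWeighted + 1];
    for (int i = 0; i <= numEquallyWeighted; i++) {
      ranks[i] = (double) i / numEquallyWeighted;
    }
    return ranks;
  }

  public static void main(final String[] args) {
    // Two partitions: the boundary ranks correspond to min, median, and max.
    System.out.println(java.util.Arrays.toString(boundaryRanks(2)));
  }
}
```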
-
getPMF
public double[] getPMF(T[] splitPoints)
Description copied from interface:QuantilesGenericAPI
This is equivalent to getPMF(splitPoints, INCLUSIVE).
- Specified by:
getPMF
in interfaceQuantilesGenericAPI<T>
- Parameters:
splitPoints
- an array of m unique, monotonically increasing items.
- Returns:
- a PMF array of m+1 probability masses as doubles on the interval [0.0, 1.0].
-
getPMF
public double[] getPMF(T[] splitPoints, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
Returns an approximation to the Probability Mass Function (PMF) of the input stream as an array of probability masses as doubles on the interval [0.0, 1.0], given a set of splitPoints. The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(true) function.
- Specified by:
getPMF
in interfaceQuantilesGenericAPI<T>
- Parameters:
splitPoints
- an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 consecutive, non-overlapping intervals. Each interval except for the end intervals starts with a split point and ends with the next split point in sequence.
The first interval starts below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and ends with the first split point.
The last (m+1)th interval starts with the last split point and ends after the last item retained by the sketch corresponding to a rank or probability of 1.0.
The sum of the probability masses of all (m+1) intervals is 1.0.
If the search criterion is:
- INCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will include that item. If the lower split point equals an item retained by the sketch, the interval will exclude that item.
- EXCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will exclude that item. If the lower split point equals an item retained by the sketch, the interval will include that item.
It is not recommended to include either the minimum or maximum items of the input stream.
searchCrit
- the desired search criteria.
- Returns:
- a PMF array of m+1 probability masses as doubles on the interval [0.0, 1.0].
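The interval rules above can be checked against an exact computation on a small data set. The following is a minimal sketch, not the sketch's approximate algorithm: each item is dropped into one of the m+1 intervals, with a boolean flag standing in for the INCLUSIVE and EXCLUSIVE criteria. The class ExactPmf and its method are hypothetical, and the split points are assumed to be unique and sorted and the data non-empty.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative only: exact PMF over known data, mirroring the m+1 output layout. */
final class ExactPmf {

  /**
   * Returns splitPoints.length + 1 probability masses that sum to 1.0.
   * inclusive = true: an item equal to a split point falls in the interval ending there.
   * inclusive = false: an item equal to a split point falls in the interval starting there.
   */
  static <T> double[] pmf(final T[] data, final T[] splitPoints,
      final Comparator<? super T> cmp, final boolean inclusive) {
    final long[] counts = new long[splitPoints.length + 1];
    for (final T item : data) {
      int bucket = 0;
      // Walk right until the item fits under the current interval's upper split point.
      while (bucket < splitPoints.length) {
        final int c = cmp.compare(item, splitPoints[bucket]);
        if (c < 0 || (inclusive && c == 0)) {
          break;
        }
        bucket++;
      }
      counts[bucket]++;
    }
    final double[] out = new double[counts.length];
    for (int i = 0; i < counts.length; i++) {
      out[i] = (double) counts[i] / data.length;
    }
    return out;
  }

  public static void main(final String[] args) {
    final Integer[] data = {1, 2, 3, 4, 5};
    final Integer[] splits = {2, 4};
    // prints [0.4, 0.4, 0.2]
    System.out.println(Arrays.toString(pmf(data, splits, Comparator.naturalOrder(), true)));
    // prints [0.2, 0.4, 0.4]
    System.out.println(Arrays.toString(pmf(data, splits, Comparator.naturalOrder(), false)));
  }
}
```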
-
getQuantile
public T getQuantile(double rank)
Description copied from interface:QuantilesGenericAPI
This is equivalent to getQuantile(rank, INCLUSIVE).
- Specified by:
getQuantile
in interfaceQuantilesGenericAPI<T>
- Parameters:
rank
- the given normalized rank, a double in the range [0.0, 1.0].
- Returns:
- the approximate quantile given the normalized rank.
-
getQuantile
public T getQuantile(double rank, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
Gets the approximate quantile of the given normalized rank and the given search criterion.
- Specified by:
getQuantile
in interfaceQuantilesGenericAPI<T>
- Parameters:
rank
- the given normalized rank, a double in the range [0.0, 1.0].
searchCrit
- If INCLUSIVE, the given rank includes all quantiles ≤ the quantile directly corresponding to the given rank. If EXCLUSIVE, the given rank includes all quantiles < the quantile directly corresponding to the given rank.
- Returns:
- the approximate quantile given the normalized rank.
- See Also:
QuantileSearchCriteria
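One way to build intuition for the two criteria is an exact lookup on a small sorted list. The sketch below assumes a particular convention: with n items, the item at position i (0-based) has cumulative rank (i+1)/n, and the lookup returns the first item whose cumulative rank is ≥ the query rank (inclusive) or > the query rank (exclusive), falling back to the last item. That convention, the class ExactQuantile, and the boolean flag are assumptions for illustration, not a description of the library's implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: exact quantile lookup on a sorted list under an assumed convention. */
final class ExactQuantile {

  /** Returns the first item whose cumulative rank meets the query rank; sorted must be ascending. */
  static <T> T quantile(final List<T> sorted, final double rank, final boolean inclusive) {
    if (sorted.isEmpty()) {
      throw new IllegalArgumentException("no items");
    }
    if (rank < 0.0 || rank > 1.0) {
      throw new IllegalArgumentException("rank must be in [0.0, 1.0]");
    }
    final int n = sorted.size();
    for (int i = 0; i < n; i++) {
      final double cumRank = (double) (i + 1) / n;
      if (inclusive ? cumRank >= rank : cumRank > rank) {
        return sorted.get(i);
      }
    }
    return sorted.get(n - 1); // no cumulative rank exceeds the query: use the largest item
  }

  public static void main(final String[] args) {
    final List<Integer> sorted = Arrays.asList(10, 20, 30, 40, 50);
    System.out.println(quantile(sorted, 0.4, true));  // prints 20
    System.out.println(quantile(sorted, 0.4, false)); // prints 30
  }
}
```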
-
getQuantileLowerBound
public T getQuantileLowerBound(double rank)
Description copied from interface:QuantilesGenericAPI
Gets the lower bound of the quantile confidence interval in which the quantile of the given rank exists. Although it is possible to estimate the probability that the true quantile exists within the quantile confidence interval specified by the upper and lower quantile bounds, it is not possible to guarantee the width of the quantile confidence interval as an additive or multiplicative percent of the true quantile.
- Specified by:
getQuantileLowerBound
in interfaceQuantilesGenericAPI<T>
- Parameters:
rank
- the given normalized rank
- Returns:
- the lower bound of the quantile confidence interval in which the quantile of the given rank exists.
-
getQuantileUpperBound
public T getQuantileUpperBound(double rank)
Description copied from interface:QuantilesGenericAPI
Gets the upper bound of the quantile confidence interval in which the true quantile of the given rank exists. Although it is possible to estimate the probability that the true quantile exists within the quantile confidence interval specified by the upper and lower quantile bounds, it is not possible to guarantee the width of the quantile interval as an additive or multiplicative percent of the true quantile.
- Specified by:
getQuantileUpperBound
in interfaceQuantilesGenericAPI<T>
- Parameters:
rank
- the given normalized rank
- Returns:
- the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.
-
getQuantiles
public T[] getQuantiles(double[] ranks)
Description copied from interface:QuantilesGenericAPI
This is equivalent to getQuantiles(ranks, INCLUSIVE).
- Specified by:
getQuantiles
in interfaceQuantilesGenericAPI<T>
- Parameters:
ranks
- the given array of normalized ranks, each of which must be in the interval [0.0, 1.0].
- Returns:
- an array of quantiles corresponding to the given array of normalized ranks.
-
getQuantiles
public T[] getQuantiles(double[] ranks, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
Gets an array of quantiles from the given array of normalized ranks.
- Specified by:
getQuantiles
in interfaceQuantilesGenericAPI<T>
- Parameters:
ranks
- the given array of normalized ranks, each of which must be in the interval [0.0, 1.0].
searchCrit
- if INCLUSIVE, the given ranks include all quantiles ≤ the quantile directly corresponding to each rank.
- Returns:
- an array of quantiles corresponding to the given array of normalized ranks.
- See Also:
QuantileSearchCriteria
-
getRank
public double getRank(T quantile)
Description copied from interface:QuantilesGenericAPI
This is equivalent to getRank(quantile, INCLUSIVE).
- Specified by:
getRank
in interfaceQuantilesGenericAPI<T>
- Parameters:
quantile
- the given quantile
- Returns:
- the normalized rank corresponding to the given quantile.
-
getRank
public double getRank(T quantile, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
Gets the normalized rank corresponding to the given quantile.
- Specified by:
getRank
in interfaceQuantilesGenericAPI<T>
- Parameters:
quantile
- the given quantile
searchCrit
- if INCLUSIVE, the given quantile is included in the rank.
- Returns:
- the normalized rank corresponding to the given quantile.
- See Also:
QuantileSearchCriteria
-
getRankLowerBound
public double getRankLowerBound(double rank)
Description copied from interface:QuantilesAPI
Gets the lower bound of the rank confidence interval in which the true rank of the given rank exists.
- Specified by:
getRankLowerBound
in interfaceQuantilesAPI
- Parameters:
rank
- the given normalized rank.
- Returns:
- the lower bound of the rank confidence interval in which the true rank of the given rank exists.
-
getRankUpperBound
public double getRankUpperBound(double rank)
Description copied from interface:QuantilesAPI
Gets the upper bound of the rank confidence interval in which the true rank of the given rank exists.
- Specified by:
getRankUpperBound
in interfaceQuantilesAPI
- Parameters:
rank
- the given normalized rank.
- Returns:
- the upper bound of the rank confidence interval in which the true rank of the given rank exists.
-
getRanks
public double[] getRanks(T[] quantiles)
Description copied from interface:QuantilesGenericAPI
This is equivalent to getRanks(quantiles, INCLUSIVE).
- Specified by:
getRanks
in interfaceQuantilesGenericAPI<T>
- Parameters:
quantiles
- the given array of quantiles
- Returns:
- an array of normalized ranks corresponding to the given array of quantiles.
-
getRanks
public double[] getRanks(T[] quantiles, QuantileSearchCriteria searchCrit)
Description copied from interface:QuantilesGenericAPI
Gets an array of normalized ranks corresponding to the given array of quantiles and the given search criterion.
- Specified by:
getRanks
in interfaceQuantilesGenericAPI<T>
- Parameters:
quantiles
- the given array of quantiles
searchCrit
- if INCLUSIVE, the given quantiles include the rank directly corresponding to each quantile.
- Returns:
- an array of normalized ranks corresponding to the given array of quantiles.
- See Also:
QuantileSearchCriteria
-
iterator
public QuantilesGenericSketchIterator<T> iterator()
Description copied from interface:QuantilesGenericAPI
Gets the iterator for this sketch, which is not sorted.
- Specified by:
iterator
in interfaceQuantilesGenericAPI<T>
- Returns:
- the iterator for this sketch
-
getK
public int getK()
Description copied from interface:QuantilesAPI
Gets the user configured parameter k, which controls the accuracy of the sketch and its memory space usage.
- Specified by:
getK
in interfaceQuantilesAPI
- Returns:
- the user configured parameter k, which controls the accuracy of the sketch and its memory space usage.
-
getN
public long getN()
Description copied from interface:QuantilesAPI
Gets the length of the input stream.
- Specified by:
getN
in interfaceQuantilesAPI
- Returns:
- the length of the input stream.
-
getNormalizedRankError
public double getNormalizedRankError(boolean pmf)
Gets the approximate rank error of this sketch normalized as a fraction between zero and one.
- Parameters:
pmf
- if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
- Returns:
- if pmf is true, returns the normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
-
getNormalizedRankError
public static double getNormalizedRankError(int k, boolean pmf)
Gets the normalized rank error given k and pmf. Static method version of getNormalizedRankError(boolean).
- Parameters:
k
- the configuration parameter
pmf
- if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
- Returns:
- if pmf is true, the normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
-
getKFromEpsilon
public static int getKFromEpsilon(double epsilon, boolean pmf)
Gets the approximate k to use given epsilon, the normalized rank error.
- Parameters:
epsilon
- the normalized rank error between zero and one.
pmf
- if true, this function returns k assuming the input epsilon is the desired "double-sided" epsilon for the getPMF() function. Otherwise, this function returns k assuming the input epsilon is the desired "single-sided" epsilon for all the other queries.
- Returns:
- k given epsilon.
-
hasMemory
public boolean hasMemory()
Description copied from interface:QuantilesAPI
Returns true if this sketch's data structure is backed by Memory or WritableMemory.
- Specified by:
hasMemory
in interfaceQuantilesAPI
- Returns:
- true if this sketch's data structure is backed by Memory or WritableMemory.
-
isEmpty
public boolean isEmpty()
Description copied from interface:QuantilesAPI
Returns true if this sketch is empty.
- Specified by:
isEmpty
in interfaceQuantilesAPI
- Returns:
- true if this sketch is empty.
-
isDirect
public boolean isDirect()
Description copied from interface:QuantilesAPI
Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory).
- Specified by:
isDirect
in interfaceQuantilesAPI
- Returns:
- true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory).
-
isEstimationMode
public boolean isEstimationMode()
Description copied from interface:QuantilesAPI
Returns true if this sketch is in estimation mode.
- Specified by:
isEstimationMode
in interfaceQuantilesAPI
- Returns:
- true if this sketch is in estimation mode.
-
isReadOnly
public boolean isReadOnly()
Description copied from interface:QuantilesAPI
Returns true if this sketch is read only.
- Specified by:
isReadOnly
in interfaceQuantilesAPI
- Returns:
- true if this sketch is read only.
-
reset
public void reset()
Description copied from interface:QuantilesAPI
Resets this sketch to the empty state. If the sketch is read only this does nothing. The parameter k will not change.
- Specified by:
reset
in interfaceQuantilesAPI
-
toByteArray
public byte[] toByteArray(ArrayOfItemsSerDe<T> serDe)
Serialize this sketch to a byte array form.
- Parameters:
serDe
- an instance of ArrayOfItemsSerDe
- Returns:
- byte array of this sketch
-
toByteArray
public byte[] toByteArray(boolean ordered, ArrayOfItemsSerDe<T> serDe)
Serialize this sketch to a byte array form.
- Parameters:
ordered
- if true the base buffer will be ordered (default == false).
serDe
- an instance of ArrayOfItemsSerDe
- Returns:
- this sketch in a byte array form.
-
toString
public String toString()
Description copied from interface:QuantilesAPI
Returns a summary of the key parameters of the sketch.
- Specified by:
toString
in interfaceQuantilesAPI
- Overrides:
toString
in classObject
- Returns:
- a summary of the key parameters of the sketch.
-
toString
public String toString(boolean sketchSummary, boolean dataDetail)
Returns summary information about this sketch. Used for debugging.
- Parameters:
sketchSummary
- if true includes sketch summary
dataDetail
- if true includes data detail
- Returns:
- summary information about the sketch.
-
toString
public static String toString(byte[] byteArr)
Returns a human readable string of the preamble of a byte array image of an ItemsSketch.
- Parameters:
byteArr
- the given byte array
- Returns:
- a human readable string of the preamble of a byte array image of an ItemsSketch.
-
toString
public static String toString(org.apache.datasketches.memory.Memory mem)
Returns a human readable string of the preamble of a Memory image of an ItemsSketch.
- Parameters:
mem
- the given Memory
- Returns:
- a human readable string of the preamble of a Memory image of an ItemsSketch.
-
downSample
public ItemsSketch<T> downSample(int newK)
From an existing sketch, this creates a new sketch that can have a smaller K. The original sketch is not modified.
- Parameters:
newK
- the new K that must be smaller than current K. It is required that this.getK() = newK * 2^(nonnegative integer).
- Returns:
- the new sketch.
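The requirement on newK can be pre-checked by a caller before invoking downSample. The following minimal sketch tests only the stated formula, this.getK() = newK * 2^(nonnegative integer); because an exponent of zero satisfies that formula, the check accepts newK equal to the current K, which the library itself may or may not allow. The names DownSampleCheck and isAllowed are hypothetical.

```java
/** Illustrative only: tests whether currentK = newK * 2^j for some integer j >= 0. */
final class DownSampleCheck {

  static boolean isAllowed(final int currentK, final int newK) {
    if (newK < 1 || newK > currentK || currentK % newK != 0) {
      return false;
    }
    final int ratio = currentK / newK;
    // A positive int with exactly one set bit is a power of 2 (2^0 = 1 included).
    return Integer.bitCount(ratio) == 1;
  }

  public static void main(final String[] args) {
    System.out.println("128 -> 32: " + isAllowed(128, 32));
    System.out.println("128 -> 48: " + isAllowed(128, 48));
  }
}
```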
-
getNumRetained
public int getNumRetained()
Description copied from interface:QuantilesAPI
Gets the number of quantiles retained by the sketch.
- Specified by:
getNumRetained
in interfaceQuantilesAPI
- Returns:
- the number of quantiles retained by the sketch
-
putMemory
public void putMemory(org.apache.datasketches.memory.WritableMemory dstMem, ArrayOfItemsSerDe<T> serDe)
Puts the current sketch into the given Memory if there is sufficient space. Otherwise, throws an error.
- Parameters:
dstMem
- the given memory.
serDe
- an instance of ArrayOfItemsSerDe
-
getSortedView
public GenericSortedView<T> getSortedView()
Description copied from interface:QuantilesGenericAPI
Gets the sorted view of this sketch.
- Specified by:
getSortedView
in interfaceQuantilesGenericAPI<T>
- Returns:
- the sorted view of this sketch
-
update
public void update(T item)
Description copied from interface:QuantilesGenericAPI
Updates this sketch with the given item.
- Specified by:
update
in interfaceQuantilesGenericAPI<T>
- Parameters:
item
- from a stream of items. Nulls are ignored.
-
-