Class DoublesSketch
- java.lang.Object
  - org.apache.datasketches.quantiles.DoublesSketch
- All Implemented Interfaces:
QuantilesAPI, QuantilesDoublesAPI
- Direct Known Subclasses:
CompactDoublesSketch, UpdateDoublesSketch
public abstract class DoublesSketch extends Object implements QuantilesDoublesAPI
This is an implementation of the Low Discrepancy Mergeable Quantiles Sketch, using doubles, described in section 3.2 of the journal version of the paper "Mergeable Summaries" by Agarwal, Cormode, Huang, Phillips, Wei, and Yi.
A k of 128 produces a normalized rank error of about 1.7%. For example, the median returned from getQuantile(0.5) will be between the actual quantiles from the hypothetically sorted array of input quantiles at normalized ranks of 0.483 and 0.517, with a confidence of about 99%.
Table Guide for DoublesSketch Size in Bytes and Approximate Error:
         K => |      16      32      64     128     256     512   1,024
   ~ Error => | 12.145%  6.359%  3.317%  1.725%  0.894%  0.463%  0.239%
            N | Size in Bytes ->
-----------------------------------------------------------------------
            0 |       8       8       8       8       8       8       8
            1 |      72      72      72      72      72      72      72
            3 |      72      72      72      72      72      72      72
            7 |     104     104     104     104     104     104     104
           15 |     168     168     168     168     168     168     168
           31 |     296     296     296     296     296     296     296
           63 |     424     552     552     552     552     552     552
          127 |     552     808   1,064   1,064   1,064   1,064   1,064
          255 |     680   1,064   1,576   2,088   2,088   2,088   2,088
          511 |     808   1,320   2,088   3,112   4,136   4,136   4,136
        1,023 |     936   1,576   2,600   4,136   6,184   8,232   8,232
        2,047 |   1,064   1,832   3,112   5,160   8,232  12,328  16,424
        4,095 |   1,192   2,088   3,624   6,184  10,280  16,424  24,616
        8,191 |   1,320   2,344   4,136   7,208  12,328  20,520  32,808
       16,383 |   1,448   2,600   4,648   8,232  14,376  24,616  41,000
       32,767 |   1,576   2,856   5,160   9,256  16,424  28,712  49,192
       65,535 |   1,704   3,112   5,672  10,280  18,472  32,808  57,384
      131,071 |   1,832   3,368   6,184  11,304  20,520  36,904  65,576
      262,143 |   1,960   3,624   6,696  12,328  22,568  41,000  73,768
      524,287 |   2,088   3,880   7,208  13,352  24,616  45,096  81,960
    1,048,575 |   2,216   4,136   7,720  14,376  26,664  49,192  90,152
    2,097,151 |   2,344   4,392   8,232  15,400  28,712  53,288  98,344
    4,194,303 |   2,472   4,648   8,744  16,424  30,760  57,384 106,536
    8,388,607 |   2,600   4,904   9,256  17,448  32,808  61,480 114,728
   16,777,215 |   2,728   5,160   9,768  18,472  34,856  65,576 122,920
   33,554,431 |   2,856   5,416  10,280  19,496  36,904  69,672 131,112
   67,108,863 |   2,984   5,672  10,792  20,520  38,952  73,768 139,304
  134,217,727 |   3,112   5,928  11,304  21,544  41,000  77,864 147,496
  268,435,455 |   3,240   6,184  11,816  22,568  43,048  81,960 155,688
  536,870,911 |   3,368   6,440  12,328  23,592  45,096  86,056 163,880
1,073,741,823 |   3,496   6,696  12,840  24,616  47,144  90,152 172,072
2,147,483,647 |   3,624   6,952  13,352  25,640  49,192  94,248 180,264
4,294,967,295 |   3,752   7,208  13,864  26,664  51,240  98,344 188,456
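The median example in the class description can be reproduced with plain arithmetic. The following is a minimal sketch, not library code: the class name RankInterval and the helper rankBounds are hypothetical, the epsilon of 0.017 is the approximate error quoted above for k = 128, and clamping the interval to [0.0, 1.0] is an assumption made here so that the bounds remain valid normalized ranks.

```java
final class RankInterval {
    /**
     * Returns {lower, upper} bounds on the normalized rank, assuming a
     * symmetric additive rank error epsilon. Clamping to [0.0, 1.0] is an
     * assumption of this illustration.
     */
    static double[] rankBounds(double rank, double epsilon) {
        double lower = Math.max(0.0, rank - epsilon);
        double upper = Math.min(1.0, rank + epsilon);
        return new double[] {lower, upper};
    }

    public static void main(String[] args) {
        double epsilon = 0.017; // approximate normalized rank error quoted for k = 128
        double[] bounds = rankBounds(0.5, epsilon);
        // The median query corresponds to ranks of roughly 0.483 and 0.517.
        System.out.println("lower=" + bounds[0] + " upper=" + bounds[1]);
    }
}
```

The same arithmetic applies to any queried rank; near 0.0 or 1.0 the interval is simply cut off at the end of the rank domain.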
- See Also:
QuantilesAPI
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.datasketches.quantilescommon.QuantilesDoublesAPI
QuantilesDoublesAPI.DoublesPartitionBoundaries
-
-
Field Summary
-
Fields inherited from interface org.apache.datasketches.quantilescommon.QuantilesAPI
EMPTY_MSG, MEM_REQ_SVR_NULL_MSG, NOT_SINGLE_ITEM_MSG, TGT_IS_READ_ONLY_MSG, UNSUPPORTED_MSG
-
-
Method Summary
Modifier and Type   Method
Description
static DoublesSketchBuilder builder()
Returns a new builder.
DoublesSketch downSample(DoublesSketch srcSketch, int smallerK, org.apache.datasketches.memory.WritableMemory dstMem)
From a source sketch, create a new sketch that must have a smaller K.
double[] getCDF(double[] splitPoints, QuantileSearchCriteria searchCrit)
Returns an approximation to the Cumulative Distribution Function (CDF) of the input stream as a monotonically increasing array of double ranks (or cumulative probabilities) on the interval [0.0, 1.0], given a set of splitPoints.
static int getCompactSerialiedSizeBytes(int k, long n)
Returns the number of bytes a DoublesSketch would require to store in compact form given k and n.
int getCurrentCompactSerializedSizeBytes()
Returns the current number of bytes this sketch would require to store in the compact Memory Format.
int getCurrentUpdatableSerializedSizeBytes()
Returns the current number of bytes this sketch would require to store in the updatable Memory Format.
int getK()
Gets the user configured parameter k, which controls the accuracy of the sketch and its memory space usage.
static int getKFromEpsilon(double epsilon, boolean pmf)
Gets the approximate k to use given epsilon, the normalized rank error.
abstract double getMaxItem()
Returns the maximum item of the stream.
abstract double getMinItem()
Returns the minimum item of the stream.
abstract long getN()
Gets the length of the input stream.
double getNormalizedRankError(boolean pmf)
Gets the approximate rank error of this sketch normalized as a fraction between zero and one.
static double getNormalizedRankError(int k, boolean pmf)
Gets the normalized rank error given k and pmf.
int getNumRetained()
Gets the number of quantiles retained by the sketch.
QuantilesDoublesAPI.DoublesPartitionBoundaries getPartitionBoundaries(int numEquallyWeighted, QuantileSearchCriteria searchCrit)
This method returns an instance of DoublesPartitionBoundaries which provides sufficient information for the user to create the given number of equally weighted partitions.
double[] getPMF(double[] splitPoints, QuantileSearchCriteria searchCrit)
Returns an approximation to the Probability Mass Function (PMF) of the input stream as an array of probability masses as doubles on the interval [0.0, 1.0], given a set of splitPoints.
double getQuantile(double rank, QuantileSearchCriteria searchCrit)
Gets the approximate quantile of the given normalized rank and the given search criterion.
double getQuantileLowerBound(double rank)
Gets the lower bound of the quantile confidence interval in which the quantile of the given rank exists.
double[] getQuantiles(double[] ranks, QuantileSearchCriteria searchCrit)
Gets an array of quantiles from the given array of normalized ranks.
double getQuantileUpperBound(double rank)
Gets the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.
double getRank(double quantile, QuantileSearchCriteria searchCrit)
Gets the normalized rank corresponding to the given quantile.
double getRankLowerBound(double rank)
Gets the lower bound of the rank confidence interval in which the true rank of the given rank exists.
double[] getRanks(double[] quantiles, QuantileSearchCriteria searchCrit)
Gets an array of normalized ranks corresponding to the given array of quantiles and the given search criterion.
double getRankUpperBound(double rank)
Gets the upper bound of the rank confidence interval in which the true rank of the given rank exists.
int getSerializedSizeBytes()
Returns the current number of bytes this Sketch would require if serialized.
DoublesSortedView getSortedView()
Gets the sorted view of this sketch.
static int getUpdatableStorageBytes(int k, long n)
Returns the number of bytes a sketch would require to store in updatable form.
abstract boolean hasMemory()
Returns true if this sketch's data structure is backed by Memory or WritableMemory.
static DoublesSketch heapify(org.apache.datasketches.memory.Memory srcMem)
Heapify takes the sketch image in Memory and instantiates an on-heap Sketch.
abstract boolean isDirect()
Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory).
boolean isEmpty()
Returns true if this sketch is empty.
boolean isEstimationMode()
Returns true if this sketch is in estimation mode.
abstract boolean isReadOnly()
Returns true if this sketch is read only.
boolean isSameResource(org.apache.datasketches.memory.Memory that)
Returns true if the backing resource of this is identical with the backing resource of that.
QuantilesDoublesSketchIterator iterator()
Gets the iterator for this sketch, which is not sorted.
void putMemory(org.apache.datasketches.memory.WritableMemory dstMem)
Puts the current sketch into the given Memory in compact form if there is sufficient space; otherwise, it throws an error.
void putMemory(org.apache.datasketches.memory.WritableMemory dstMem, boolean compact)
Puts the current sketch into the given Memory if there is sufficient space; otherwise, it throws an error.
abstract void reset()
Resets this sketch to the empty state.
byte[] toByteArray()
Returns a byte array representation of this sketch.
byte[] toByteArray(boolean compact)
Serialize this sketch in a byte array form.
String toString()
Returns a summary of the key parameters of the sketch.
String toString(boolean sketchSummary, boolean dataDetail)
Returns summary information about this sketch.
static String toString(byte[] byteArr)
Returns a human readable string of the preamble of a byte array image of a DoublesSketch.
static String toString(org.apache.datasketches.memory.Memory mem)
Returns a human readable string of the preamble of a Memory image of a DoublesSketch.
static DoublesSketch wrap(org.apache.datasketches.memory.Memory srcMem)
Wrap this sketch around the given Memory image of a DoublesSketch, compact or updatable.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.apache.datasketches.quantilescommon.QuantilesDoublesAPI
getCDF, getPartitionBoundaries, getPMF, getQuantile, getQuantiles, getRank, getRanks, update
-
-
-
-
Method Detail
-
builder
public static final DoublesSketchBuilder builder()
Returns a new builder.
- Returns:
- a new builder
-
heapify
public static DoublesSketch heapify(org.apache.datasketches.memory.Memory srcMem)
Heapify takes the sketch image in Memory and instantiates an on-heap Sketch. The resulting sketch will not retain any link to the source Memory.
- Parameters:
srcMem
- a Memory image of a Sketch. See Memory.
- Returns:
- a heap-based Sketch based on the given Memory
-
wrap
public static DoublesSketch wrap(org.apache.datasketches.memory.Memory srcMem)
Wrap this sketch around the given Memory image of a DoublesSketch, compact or updatable. A DirectUpdateDoublesSketch can only wrap an updatable array, and a DirectCompactDoublesSketch can only wrap a compact array.
- Parameters:
srcMem
- the given Memory image of a DoublesSketch that may have data.
- Returns:
- a sketch that wraps the given srcMem
-
getCDF
public double[] getCDF(double[] splitPoints, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
Returns an approximation to the Cumulative Distribution Function (CDF) of the input stream as a monotonically increasing array of double ranks (or cumulative probabilities) on the interval [0.0, 1.0], given a set of splitPoints.
The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(false) function.
- Specified by:
getCDF
in interface QuantilesDoublesAPI
- Parameters:
splitPoints
- an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 overlapping intervals.
The start of each interval is below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and the end of the interval is the rank or cumulative probability corresponding to the split point.
The (m+1)th interval represents 100% of the distribution represented by the sketch and consistent with the definition of a cumulative probability distribution, thus the (m+1)th rank or probability in the returned array is always 1.0.
If a split point exactly equals a retained item of the sketch and the search criterion is:
- INCLUSIVE, the resulting cumulative probability will include that item.
- EXCLUSIVE, the resulting cumulative probability will not include the weight of that split point.
It is not recommended to include either the minimum or maximum items of the input stream.
searchCrit
- the desired search criteria.
- Returns:
- a discrete CDF array of m+1 double ranks (or cumulative probabilities) on the interval [0.0, 1.0].
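The INCLUSIVE and EXCLUSIVE rules above can be illustrated with an exact computation over a small sorted array. This is a minimal sketch of the semantics only, not the sketch's approximation algorithm: the class ExactCdfSketch and its methods are hypothetical, and a boolean stands in for QuantileSearchCriteria.

```java
import java.util.Arrays;

final class ExactCdfSketch {
    /** Fraction of items that are <= q (inclusive) or < q (exclusive). */
    static double rank(double[] sorted, double q, boolean inclusive) {
        int count = 0;
        for (double v : sorted) {
            if (inclusive ? v <= q : v < q) {
                count++;
            }
        }
        return (double) count / sorted.length;
    }

    /** Returns m+1 cumulative ranks for m split points; the last entry is always 1.0. */
    static double[] cdf(double[] sorted, double[] splitPoints, boolean inclusive) {
        double[] out = new double[splitPoints.length + 1];
        for (int i = 0; i < splitPoints.length; i++) {
            out[i] = rank(sorted, splitPoints[i], inclusive);
        }
        out[splitPoints.length] = 1.0;
        return out;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 2, 3, 4};
        double[] splits = {2, 3};
        System.out.println(Arrays.toString(cdf(data, splits, true)));  // [0.6, 0.8, 1.0]
        System.out.println(Arrays.toString(cdf(data, splits, false))); // [0.2, 0.6, 1.0]
    }
}
```

With the split point 2 equal to two of the items, the inclusive result counts both of them and the exclusive result counts neither, which is the difference the parameter description draws.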
-
getMaxItem
public abstract double getMaxItem()
Description copied from interface: QuantilesDoublesAPI
Returns the maximum item of the stream. This is provided for convenience, but may be different from the largest item retained by the sketch algorithm.
- Specified by:
getMaxItem
in interface QuantilesDoublesAPI
- Returns:
- the maximum item of the stream
-
getMinItem
public abstract double getMinItem()
Description copied from interface: QuantilesDoublesAPI
Returns the minimum item of the stream. This is provided for convenience, but is distinct from the smallest item retained by the sketch algorithm.
- Specified by:
getMinItem
in interface QuantilesDoublesAPI
- Returns:
- the minimum item of the stream
-
getPartitionBoundaries
public QuantilesDoublesAPI.DoublesPartitionBoundaries getPartitionBoundaries(int numEquallyWeighted, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
This method returns an instance of DoublesPartitionBoundaries which provides sufficient information for the user to create the given number of equally weighted partitions.
- Specified by:
getPartitionBoundaries
in interface QuantilesDoublesAPI
- Parameters:
numEquallyWeighted
- an integer that specifies the number of equally weighted partitions between getMinItem() and getMaxItem(). This must be a positive integer greater than zero.
- A 1 will return: minItem, maxItem.
- A 2 will return: minItem, median quantile, maxItem.
- Etc.
searchCrit
- If INCLUSIVE, all the returned quantiles are the upper boundaries of the equally weighted partitions with the exception of the lowest returned quantile, which is the lowest boundary of the lowest ranked partition. If EXCLUSIVE, all the returned quantiles are the lower boundaries of the equally weighted partitions with the exception of the highest returned quantile, which is the upper boundary of the highest ranked partition.
- Returns:
- an instance of DoublesPartitionBoundaries.
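The two cases listed for numEquallyWeighted (1 gives min and max; 2 gives min, median, max) are consistent with querying quantiles at evenly spaced normalized ranks. The following minimal sketch shows only that rank arithmetic. The class PartitionRanks and the method evenlySpacedRanks are hypothetical, and the mapping from partitions to ranks is an assumption drawn from those two cases rather than a statement of how the library computes the boundaries.

```java
import java.util.Arrays;

final class PartitionRanks {
    /** Returns numEquallyWeighted + 1 normalized ranks: 0.0, 1/n, 2/n, ..., 1.0. */
    static double[] evenlySpacedRanks(int numEquallyWeighted) {
        if (numEquallyWeighted < 1) {
            throw new IllegalArgumentException("numEquallyWeighted must be greater than zero");
        }
        double[] ranks = new double[numEquallyWeighted + 1];
        for (int i = 0; i <= numEquallyWeighted; i++) {
            ranks[i] = (double) i / numEquallyWeighted;
        }
        return ranks;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(evenlySpacedRanks(1))); // [0.0, 1.0]
        System.out.println(Arrays.toString(evenlySpacedRanks(2))); // [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(evenlySpacedRanks(4))); // [0.0, 0.25, 0.5, 0.75, 1.0]
    }
}
```

Under that assumption, rank 0.0 corresponds to minItem, rank 1.0 to maxItem, and rank 0.5 to the median quantile.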
-
getPMF
public double[] getPMF(double[] splitPoints, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
Returns an approximation to the Probability Mass Function (PMF) of the input stream as an array of probability masses as doubles on the interval [0.0, 1.0], given a set of splitPoints.
The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(true) function.
- Specified by:
getPMF
in interface QuantilesDoublesAPI
- Parameters:
splitPoints
- an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 consecutive, non-overlapping intervals.
Each interval except for the end intervals starts with a split point and ends with the next split point in sequence.
The first interval starts below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and ends with the first split point.
The last (m+1)th interval starts with the last split point and ends after the last item retained by the sketch corresponding to a rank or probability of 1.0.
The sum of the probability masses of all (m+1) intervals is 1.0.
If the search criterion is:
- INCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will include that item. If the lower split point equals an item retained by the sketch, the interval will exclude that item.
- EXCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will exclude that item. If the lower split point equals an item retained by the sketch, the interval will include that item.
It is not recommended to include either the minimum or maximum items of the input stream.
searchCrit
- the desired search criteria.
- Returns:
- a PMF array of m+1 probability masses as doubles on the interval [0.0, 1.0].
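The interval rules above can be checked with an exact computation over a small sorted array. This is a minimal sketch of the interval semantics only, not the sketch's approximation: the class ExactPmfSketch and its methods are hypothetical, and a boolean stands in for QuantileSearchCriteria. Masses are computed from integer counts so that the m+1 results sum to 1.0 without accumulated rounding error.

```java
import java.util.Arrays;

final class ExactPmfSketch {
    /** Number of items that are <= q (inclusive) or < q (exclusive). */
    static int countUpTo(double[] sorted, double q, boolean inclusive) {
        int count = 0;
        for (double v : sorted) {
            if (inclusive ? v <= q : v < q) {
                count++;
            }
        }
        return count;
    }

    /** Returns m+1 probability masses for the intervals defined by m split points. */
    static double[] pmf(double[] sorted, double[] splitPoints, boolean inclusive) {
        int m = splitPoints.length;
        double[] masses = new double[m + 1];
        int previous = 0;
        for (int i = 0; i < m; i++) {
            int current = countUpTo(sorted, splitPoints[i], inclusive);
            masses[i] = (double) (current - previous) / sorted.length;
            previous = current;
        }
        // The last interval holds whatever remains above the last split point.
        masses[m] = (double) (sorted.length - previous) / sorted.length;
        return masses;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 2, 3, 4};
        double[] splits = {2, 3};
        System.out.println(Arrays.toString(pmf(data, splits, true)));  // [0.6, 0.2, 0.2]
        System.out.println(Arrays.toString(pmf(data, splits, false))); // [0.2, 0.4, 0.4]
    }
}
```

In the inclusive case the two items equal to the split point 2 fall in the first interval; in the exclusive case they fall in the second.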
-
getQuantile
public double getQuantile(double rank, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
Gets the approximate quantile of the given normalized rank and the given search criterion.
- Specified by:
getQuantile
in interface QuantilesDoublesAPI
- Parameters:
rank
- the given normalized rank, a double in the range [0.0, 1.0].
searchCrit
- If INCLUSIVE, the given rank includes all quantiles ≤ the quantile directly corresponding to the given rank. If EXCLUSIVE, the given rank includes all quantiles < the quantile directly corresponding to the given rank.
- Returns:
- the approximate quantile given the normalized rank.
- See Also:
QuantileSearchCriteria
-
getQuantiles
public double[] getQuantiles(double[] ranks, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
Gets an array of quantiles from the given array of normalized ranks.
- Specified by:
getQuantiles
in interface QuantilesDoublesAPI
- Parameters:
ranks
- the given array of normalized ranks, each of which must be in the interval [0.0, 1.0].
searchCrit
- if INCLUSIVE, the given ranks include all quantiles ≤ the quantile directly corresponding to each rank.
- Returns:
- an array of quantiles corresponding to the given array of normalized ranks.
- See Also:
QuantileSearchCriteria
-
getQuantileLowerBound
public double getQuantileLowerBound(double rank)
Gets the lower bound of the quantile confidence interval in which the quantile of the given rank exists.
Although it is possible to estimate the probability that the true quantile exists within the quantile confidence interval specified by the upper and lower quantile bounds, it is not possible to guarantee the width of the quantile confidence interval as an additive or multiplicative percent of the true quantile.
The approximate probability that the true quantile is within the confidence interval specified by the upper and lower quantile bounds for this sketch is 0.99.
- Specified by:
getQuantileLowerBound
in interface QuantilesDoublesAPI
- Parameters:
rank
- the given normalized rank
- Returns:
- the lower bound of the quantile confidence interval in which the quantile of the given rank exists.
-
getQuantileUpperBound
public double getQuantileUpperBound(double rank)
Gets the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.
Although it is possible to estimate the probability that the true quantile exists within the quantile confidence interval specified by the upper and lower quantile bounds, it is not possible to guarantee the width of the quantile interval as an additive or multiplicative percent of the true quantile.
The approximate probability that the true quantile is within the confidence interval specified by the upper and lower quantile bounds for this sketch is 0.99.
- Specified by:
getQuantileUpperBound
in interface QuantilesDoublesAPI
- Parameters:
rank
- the given normalized rank
- Returns:
- the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.
-
getRank
public double getRank(double quantile, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
Gets the normalized rank corresponding to the given quantile.
- Specified by:
getRank
in interface QuantilesDoublesAPI
- Parameters:
quantile
- the given quantile
searchCrit
- if INCLUSIVE, the given quantile is included in the rank.
- Returns:
- the normalized rank corresponding to the given quantile
- See Also:
QuantileSearchCriteria
-
getRankLowerBound
public double getRankLowerBound(double rank)
Gets the lower bound of the rank confidence interval in which the true rank of the given rank exists. The approximate probability that the true rank is within the confidence interval specified by the upper and lower rank bounds for this sketch is 0.99.
- Specified by:
getRankLowerBound
in interface QuantilesAPI
- Parameters:
rank
- the given normalized rank.
- Returns:
- the lower bound of the rank confidence interval in which the true rank of the given rank exists.
-
getRankUpperBound
public double getRankUpperBound(double rank)
Gets the upper bound of the rank confidence interval in which the true rank of the given rank exists. The approximate probability that the true rank is within the confidence interval specified by the upper and lower rank bounds for this sketch is 0.99.
- Specified by:
getRankUpperBound
in interface QuantilesAPI
- Parameters:
rank
- the given normalized rank.
- Returns:
- the upper bound of the rank confidence interval in which the true rank of the given rank exists.
-
getRanks
public double[] getRanks(double[] quantiles, QuantileSearchCriteria searchCrit)
Description copied from interface: QuantilesDoublesAPI
Gets an array of normalized ranks corresponding to the given array of quantiles and the given search criterion.
- Specified by:
getRanks
in interface QuantilesDoublesAPI
- Parameters:
quantiles
- the given array of quantiles
searchCrit
- if INCLUSIVE, the given quantiles include the rank directly corresponding to each quantile.
- Returns:
- an array of normalized ranks corresponding to the given array of quantiles.
- See Also:
QuantileSearchCriteria
-
getK
public int getK()
Description copied from interface: QuantilesAPI
Gets the user configured parameter k, which controls the accuracy of the sketch and its memory space usage.
- Specified by:
getK
in interface QuantilesAPI
- Returns:
- the user configured parameter k, which controls the accuracy of the sketch and its memory space usage.
-
getN
public abstract long getN()
Description copied from interface: QuantilesAPI
Gets the length of the input stream.
- Specified by:
getN
in interface QuantilesAPI
- Returns:
- the length of the input stream.
-
getNormalizedRankError
public double getNormalizedRankError(boolean pmf)
Gets the approximate rank error of this sketch normalized as a fraction between zero and one. The epsilon returned is a best fit to 99 percent confidence empirically measured max error in thousands of trials.
- Parameters:
pmf
- if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
- Returns:
- if pmf is true, returns the normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
-
getNormalizedRankError
public static double getNormalizedRankError(int k, boolean pmf)
Gets the normalized rank error given k and pmf. This is the static version of getNormalizedRankError(boolean). The epsilon returned is a best fit to 99 percent confidence empirically measured max error in thousands of trials.
- Parameters:
k
- the configuration parameter
pmf
- if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
- Returns:
- if pmf is true, the normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
-
getKFromEpsilon
public static int getKFromEpsilon(double epsilon, boolean pmf)
Gets the approximate k to use given epsilon, the normalized rank error.
- Parameters:
epsilon
- the normalized rank error between zero and one.
pmf
- if true, this function returns k assuming the input epsilon is the desired "double-sided" epsilon for the getPMF() function. Otherwise, this function returns k assuming the input epsilon is the desired "single-sided" epsilon for all the other queries.
- Returns:
- k given epsilon.
-
hasMemory
public abstract boolean hasMemory()
Description copied from interface: QuantilesAPI
Returns true if this sketch's data structure is backed by Memory or WritableMemory.
- Specified by:
hasMemory
in interface QuantilesAPI
- Returns:
- true if this sketch's data structure is backed by Memory or WritableMemory.
-
isDirect
public abstract boolean isDirect()
Description copied from interface: QuantilesAPI
Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory).
- Specified by:
isDirect
in interface QuantilesAPI
- Returns:
- true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory).
-
isEmpty
public boolean isEmpty()
Description copied from interface: QuantilesAPI
Returns true if this sketch is empty.
- Specified by:
isEmpty
in interface QuantilesAPI
- Returns:
- true if this sketch is empty.
-
isEstimationMode
public boolean isEstimationMode()
Description copied from interface: QuantilesAPI
Returns true if this sketch is in estimation mode.
- Specified by:
isEstimationMode
in interface QuantilesAPI
- Returns:
- true if this sketch is in estimation mode.
-
isReadOnly
public abstract boolean isReadOnly()
Description copied from interface: QuantilesAPI
Returns true if this sketch is read only.
- Specified by:
isReadOnly
in interface QuantilesAPI
- Returns:
- true if this sketch is read only.
-
isSameResource
public boolean isSameResource(org.apache.datasketches.memory.Memory that)
Returns true if the backing resource of this is identical with the backing resource of that. The capacities must be the same. If this is a region, the region offset must also be the same.
- Parameters:
that
- A different non-null object
- Returns:
- true if the backing resource of this is the same as the backing resource of that.
-
toByteArray
public byte[] toByteArray()
Description copied from interface: QuantilesDoublesAPI
Returns a byte array representation of this sketch.
- Specified by:
toByteArray
in interface QuantilesDoublesAPI
- Returns:
- a byte array representation of this sketch.
-
toByteArray
public byte[] toByteArray(boolean compact)
Serialize this sketch in a byte array form.
- Parameters:
compact
- if true the sketch will be serialized in compact form. DirectCompactDoublesSketch can wrap() only a compact byte array; DirectUpdateDoublesSketch can wrap() only an updatable byte array.
- Returns:
- this sketch in a byte array form.
-
toString
public String toString()
Description copied from interface: QuantilesAPI
Returns a summary of the key parameters of the sketch.
- Specified by:
toString
in interface QuantilesAPI
- Overrides:
toString
in class Object
- Returns:
- a summary of the key parameters of the sketch.
-
toString
public String toString(boolean sketchSummary, boolean dataDetail)
Returns summary information about this sketch. Used for debugging.
- Parameters:
sketchSummary
- if true includes sketch summary
dataDetail
- if true includes data detail
- Returns:
- summary information about the sketch.
-
toString
public static String toString(byte[] byteArr)
Returns a human readable string of the preamble of a byte array image of a DoublesSketch.
- Parameters:
byteArr
- the given byte array
- Returns:
- a human readable string of the preamble of a byte array image of a DoublesSketch.
-
toString
public static String toString(org.apache.datasketches.memory.Memory mem)
Returns a human readable string of the preamble of a Memory image of a DoublesSketch.
- Parameters:
mem
- the given Memory
- Returns:
- a human readable string of the preamble of a Memory image of a DoublesSketch.
-
downSample
public DoublesSketch downSample(DoublesSketch srcSketch, int smallerK, org.apache.datasketches.memory.WritableMemory dstMem)
From a source sketch, create a new sketch that must have a smaller K. The original sketch is not modified.
- Parameters:
srcSketch
- the sourcing sketch
smallerK
- the new sketch's K that must be smaller than this K. It is required that this.getK() = smallerK * 2^(nonnegative integer).
dstMem
- the destination Memory. It must not overlap the Memory of this sketch. If null, a heap sketch will be returned, otherwise it will be off-heap.
- Returns:
- the new sketch.
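The constraint on smallerK can be checked before calling downSample. The following is a minimal sketch of that arithmetic check only: the class DownSampleCheck and the method isValidSmallerK are hypothetical and are not part of the library. It follows the stated formula this.getK() = smallerK * 2^(nonnegative integer) literally, so a ratio of 2^0 = 1 passes this check even though the prose asks for a smaller K.

```java
final class DownSampleCheck {
    /** Returns true if k = smallerK * 2^j for some nonnegative integer j. */
    static boolean isValidSmallerK(int k, int smallerK) {
        if (smallerK < 1 || smallerK > k || k % smallerK != 0) {
            return false;
        }
        int ratio = k / smallerK;
        // A positive integer is a power of two exactly when it has a single set bit.
        return (ratio & (ratio - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidSmallerK(128, 32));  // true: 128 = 32 * 2^2
        System.out.println(isValidSmallerK(128, 48));  // false: 48 does not divide 128
        System.out.println(isValidSmallerK(96, 32));   // false: the ratio 3 is not a power of two
        System.out.println(isValidSmallerK(128, 256)); // false: 256 is larger than 128
    }
}
```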
-
getNumRetained
public int getNumRetained()
Description copied from interface: QuantilesAPI
Gets the number of quantiles retained by the sketch.
- Specified by:
getNumRetained
in interface QuantilesAPI
- Returns:
- the number of quantiles retained by the sketch
-
getCurrentCompactSerializedSizeBytes
public int getCurrentCompactSerializedSizeBytes()
Returns the current number of bytes this sketch would require to store in the compact Memory Format.
- Returns:
- the current number of bytes this sketch would require to store in the compact Memory Format.
-
getCompactSerialiedSizeBytes
public static int getCompactSerialiedSizeBytes(int k, long n)
Returns the number of bytes a DoublesSketch would require to store in compact form given k and n. The compact form is not updatable.
- Parameters:
k
- the size configuration parameter for the sketch
n
- the number of quantiles input into the sketch
- Returns:
- the number of bytes required to store this sketch in compact form.
-
getSerializedSizeBytes
public int getSerializedSizeBytes()
Description copied from interface: QuantilesDoublesAPI
Returns the current number of bytes this Sketch would require if serialized.
- Specified by:
getSerializedSizeBytes
in interface QuantilesDoublesAPI
- Returns:
- the number of bytes this sketch would require if serialized.
-
getCurrentUpdatableSerializedSizeBytes
public int getCurrentUpdatableSerializedSizeBytes()
Returns the current number of bytes this sketch would require to store in the updatable Memory Format.
- Returns:
- the current number of bytes this sketch would require to store in the updatable Memory Format.
-
getUpdatableStorageBytes
public static int getUpdatableStorageBytes(int k, long n)
Returns the number of bytes a sketch would require to store in updatable form. This uses roughly 2X the storage of the compact form given k and n.
- Parameters:
k
- the size configuration parameter for the sketch
n
- the number of quantiles input into the sketch
- Returns:
- the number of bytes this sketch would require to store in updatable form.
-
putMemory
public void putMemory(org.apache.datasketches.memory.WritableMemory dstMem)
Puts the current sketch into the given Memory in compact form if there is sufficient space; otherwise, it throws an error.
- Parameters:
dstMem
- the given memory.
-
putMemory
public void putMemory(org.apache.datasketches.memory.WritableMemory dstMem, boolean compact)
Puts the current sketch into the given Memory if there is sufficient space; otherwise, it throws an error.
- Parameters:
dstMem
- the given memory.
compact
- if true, compacts and sorts the base buffer, which optimizes merge performance at the cost of slightly increased serialization time.
-
iterator
public QuantilesDoublesSketchIterator iterator()
Description copied from interface: QuantilesDoublesAPI
Gets the iterator for this sketch, which is not sorted.
- Specified by:
iterator
in interface QuantilesDoublesAPI
- Returns:
- the iterator for this sketch
-
getSortedView
public DoublesSortedView getSortedView()
Description copied from interface: QuantilesDoublesAPI
Gets the sorted view of this sketch.
- Specified by:
getSortedView
in interface QuantilesDoublesAPI
- Returns:
- the sorted view of this sketch
-
reset
public abstract void reset()
Resets this sketch to the empty state. If the sketch is read only this does nothing.
The parameter k will not change.
- Specified by:
reset
in interface QuantilesAPI
-
-