Class KllFloatsSketch

java.lang.Object
org.apache.datasketches.kll.KllSketch
org.apache.datasketches.kll.KllFloatsSketch
All Implemented Interfaces:
QuantilesAPI, QuantilesFloatsAPI

public abstract class KllFloatsSketch extends KllSketch implements QuantilesFloatsAPI
This variation of the KllSketch implements primitive floats.
  • Method Details

    • newHeapInstance

      public static KllFloatsSketch newHeapInstance()
      Create a new heap instance of this sketch with the default k = 200. The default k = 200 results in a normalized rank error of about 1.65%. A larger k yields a smaller error, but the sketch will be larger (and slower).
      Returns:
      new KllFloatsSketch on the Java heap.
    • newHeapInstance

      public static KllFloatsSketch newHeapInstance(int k)
      Create a new heap instance of this sketch with a given parameter k, which must be between 8 and 65535, both inclusive. The default k = 200 results in a normalized rank error of about 1.65%. A larger k yields a smaller error, but the sketch will be larger (and slower).
      Parameters:
      k - parameter that controls size of the sketch and accuracy of estimates.
      Returns:
      new KllFloatsSketch on the Java heap.
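As a rough illustration of how k trades size for accuracy, the sketch below evaluates an empirical error curve for a few values of k. The class name KllErrorEstimate and the fit constants are assumptions (recalled from the library's helper code, not stated on this page); in real code, prefer the sketch's own getNormalizedRankError method. Under these assumed constants, k = 200 gives about 1.65% for the PMF (double-sided) case, which agrees with the figure quoted above.

```java
final class KllErrorEstimate {

  // ASSUMPTION: these empirical-fit constants are not given on this page.
  // They are recalled from the library's helper code and may differ by version.
  static double normalizedRankError(int k, boolean pmf) {
    if (k < 8 || k > 65535) {
      throw new IllegalArgumentException("k must be in [8, 65535]: " + k);
    }
    return pmf
        ? 2.446 / Math.pow(k, 0.9433)   // double-sided (PMF) error
        : 2.296 / Math.pow(k, 0.9723);  // single-sided (rank, CDF) error
  }

  public static void main(String[] args) {
    for (int k : new int[] {100, 200, 400, 800}) {
      System.out.printf("k=%d pmfError=%.4f rankError=%.4f%n",
          k, normalizedRankError(k, true), normalizedRankError(k, false));
    }
  }
}
```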
    • newDirectInstance

      public static KllFloatsSketch newDirectInstance(org.apache.datasketches.memory.WritableMemory dstMem, org.apache.datasketches.memory.MemoryRequestServer memReqSvr)
      Create a new direct updatable instance of this sketch with the default k. The default k = 200 results in a normalized rank error of about 1.65%. Larger k will have smaller error but the sketch will be larger (and slower).
      Parameters:
      dstMem - the given destination WritableMemory object for use by the sketch
      memReqSvr - the given MemoryRequestServer to request a larger WritableMemory
      Returns:
      a new direct instance of this sketch
    • newDirectInstance

      public static KllFloatsSketch newDirectInstance(int k, org.apache.datasketches.memory.WritableMemory dstMem, org.apache.datasketches.memory.MemoryRequestServer memReqSvr)
      Create a new direct updatable instance of this sketch with a given k.
      Parameters:
      k - parameter that controls size of the sketch and accuracy of estimates.
      dstMem - the given destination WritableMemory object for use by the sketch
      memReqSvr - the given MemoryRequestServer to request a larger WritableMemory
      Returns:
      a new direct instance of this sketch
    • heapify

      public static KllFloatsSketch heapify(org.apache.datasketches.memory.Memory srcMem)
      Factory heapify takes a compact sketch image in Memory and instantiates an on-heap sketch. The resulting sketch will not retain any link to the source Memory.
      Parameters:
      srcMem - a compact Memory image of a sketch serialized by this sketch. See Memory
      Returns:
      a heap-based sketch based on the given Memory.
    • wrap

      public static KllFloatsSketch wrap(org.apache.datasketches.memory.Memory srcMem)
      Wrap a sketch around the given read only compact source Memory containing sketch data that originated from this sketch.
      Parameters:
      srcMem - the read only source Memory
      Returns:
      instance of this sketch
    • writableWrap

      public static KllFloatsSketch writableWrap(org.apache.datasketches.memory.WritableMemory srcMem, org.apache.datasketches.memory.MemoryRequestServer memReqSvr)
      Wrap a sketch around the given source Writable Memory containing sketch data that originated from this sketch.
      Parameters:
      srcMem - a WritableMemory that contains data.
      memReqSvr - the given MemoryRequestServer to request a larger WritableMemory
      Returns:
      instance of this sketch
    • getCDF

      public double[] getCDF(float[] splitPoints, QuantileSearchCriteria searchCrit)
      Description copied from interface: QuantilesFloatsAPI
      Returns an approximation to the Cumulative Distribution Function (CDF) of the input stream as a monotonically increasing array of double ranks (or cumulative probabilities) on the interval [0.0, 1.0], given a set of splitPoints.

      The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(false) function.

      Specified by:
      getCDF in interface QuantilesFloatsAPI
      Parameters:
      splitPoints - an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 overlapping intervals.

      The start of each interval is below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and the end of the interval is the rank or cumulative probability corresponding to the split point.

      The (m+1)th interval represents 100% of the distribution represented by the sketch and consistent with the definition of a cumulative probability distribution, thus the (m+1)th rank or probability in the returned array is always 1.0.

      If a split point exactly equals a retained item of the sketch and the search criterion is:

      • INCLUSIVE, the resulting cumulative probability will include that item.
      • EXCLUSIVE, the resulting cumulative probability will not include the weight of that split point.

      It is not recommended to include either the minimum or maximum items of the input stream.

      searchCrit - the desired search criteria.
      Returns:
      a discrete CDF array of m+1 double ranks (or cumulative probabilities) on the interval [0.0, 1.0].
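To make the shape of the returned array concrete, here is a minimal plain-Java sketch that computes the exact CDF of a tiny data set for a set of split points. It does not use the library and is not the sketch's approximate algorithm. The class ExactCdfDemo and its methods are hypothetical names, and a boolean stands in for the INCLUSIVE and EXCLUSIVE criteria.

```java
import java.util.Arrays;

final class ExactCdfDemo {

  /** Fraction of items that are <= q (inclusive) or < q (exclusive). */
  static double rank(float[] items, float q, boolean inclusive) {
    int count = 0;
    for (float v : items) {
      if (inclusive ? v <= q : v < q) { count++; }
    }
    return (double) count / items.length;
  }

  /** Exact CDF: one rank per split point, plus a final entry that is always 1.0. */
  static double[] cdf(float[] items, float[] splitPoints, boolean inclusive) {
    double[] out = new double[splitPoints.length + 1];
    for (int i = 0; i < splitPoints.length; i++) {
      out[i] = rank(items, splitPoints[i], inclusive);
    }
    out[splitPoints.length] = 1.0;
    return out;
  }

  public static void main(String[] args) {
    float[] items = {1f, 2f, 2f, 3f, 4f};
    float[] splits = {2f, 3f};
    System.out.println(Arrays.toString(cdf(items, splits, true)));  // [0.6, 0.8, 1.0]
    System.out.println(Arrays.toString(cdf(items, splits, false))); // [0.2, 0.6, 1.0]
  }
}
```

With m = 2 split points the result has m+1 = 3 entries. The two criteria differ only where a split point equals an item in the data.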
    • getPMF

      public double[] getPMF(float[] splitPoints, QuantileSearchCriteria searchCrit)
      Description copied from interface: QuantilesFloatsAPI
      Returns an approximation to the Probability Mass Function (PMF) of the input stream as an array of probability masses as doubles on the interval [0.0, 1.0], given a set of splitPoints.

      The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(true) function.

      Specified by:
      getPMF in interface QuantilesFloatsAPI
      Parameters:
      splitPoints - an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 consecutive, non-overlapping intervals.

      Each interval except for the end intervals starts with a split point and ends with the next split point in sequence.

      The first interval starts below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and ends with the first split point.

      The last (m+1)th interval starts with the last split point and ends after the last item retained by the sketch corresponding to a rank or probability of 1.0.

      The sum of the probability masses of all (m+1) intervals is 1.0.

      If the search criterion is:

      • INCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will include that item. If the lower split point equals an item retained by the sketch, the interval will exclude that item.
      • EXCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will exclude that item. If the lower split point equals an item retained by the sketch, the interval will include that item.

      It is not recommended to include either the minimum or maximum items of the input stream.

      searchCrit - the desired search criteria.
      Returns:
      a PMF array of m+1 probability masses as doubles on the interval [0.0, 1.0].
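The interval rules above can be illustrated with an exact computation on a small array. This is a plain-Java sketch, not the library's approximate algorithm. ExactPmfDemo and its methods are hypothetical names, and a boolean stands in for the INCLUSIVE and EXCLUSIVE criteria.

```java
import java.util.Arrays;

final class ExactPmfDemo {

  /** Number of items that are <= q (inclusive) or < q (exclusive). */
  static int countBelow(float[] items, float q, boolean inclusive) {
    int count = 0;
    for (float v : items) {
      if (inclusive ? v <= q : v < q) { count++; }
    }
    return count;
  }

  /** Exact PMF: the mass of each of the m+1 intervals defined by m split points. */
  static double[] pmf(float[] items, float[] splitPoints, boolean inclusive) {
    int m = splitPoints.length;
    double[] out = new double[m + 1];
    int prev = 0;
    for (int i = 0; i < m; i++) {
      int c = countBelow(items, splitPoints[i], inclusive);
      out[i] = (double) (c - prev) / items.length; // mass between consecutive split points
      prev = c;
    }
    out[m] = (double) (items.length - prev) / items.length; // mass above the last split point
    return out;
  }

  public static void main(String[] args) {
    float[] items = {1f, 2f, 2f, 3f, 4f};
    float[] splits = {2f, 3f};
    System.out.println(Arrays.toString(pmf(items, splits, true)));  // [0.6, 0.2, 0.2]
    System.out.println(Arrays.toString(pmf(items, splits, false))); // [0.2, 0.4, 0.4]
  }
}
```

In the inclusive case the middle interval holds only the item 3 (the upper split point is included, the lower one excluded). In the exclusive case it holds the two items equal to 2.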
    • getQuantile

      public float getQuantile(double rank, QuantileSearchCriteria searchCrit)
      Description copied from interface: QuantilesFloatsAPI
      Gets the approximate quantile of the given normalized rank and the given search criterion.
      Specified by:
      getQuantile in interface QuantilesFloatsAPI
      Parameters:
      rank - the given normalized rank, a double in the range [0.0, 1.0].
      searchCrit - If INCLUSIVE, the given rank includes all quantiles ≤ the quantile directly corresponding to the given rank. If EXCLUSIVE, the given rank includes all quantiles < the quantile directly corresponding to the given rank.
      Returns:
      the approximate quantile given the normalized rank.
    • getQuantiles

      public float[] getQuantiles(double[] ranks, QuantileSearchCriteria searchCrit)
      Description copied from interface: QuantilesFloatsAPI
      Gets an array of quantiles from the given array of normalized ranks.
      Specified by:
      getQuantiles in interface QuantilesFloatsAPI
      Parameters:
      ranks - the given array of normalized ranks, each of which must be in the interval [0.0,1.0].
      searchCrit - if INCLUSIVE, the given ranks include all quantiles ≤ the quantile directly corresponding to each rank.
      Returns:
      an array of quantiles corresponding to the given array of normalized ranks.
    • getQuantileLowerBound

      public float getQuantileLowerBound(double rank)
      Gets the lower bound of the quantile confidence interval in which the true quantile of the given rank exists.

      Although it is possible to estimate the probability that the true quantile exists within the quantile confidence interval specified by the upper and lower quantile bounds, it is not possible to guarantee the width of the quantile confidence interval as an additive or multiplicative percent of the true quantile.

      The approximate probability that the true quantile is within the confidence interval specified by the upper and lower quantile bounds for this sketch is 0.99.
      Specified by:
      getQuantileLowerBound in interface QuantilesFloatsAPI
      Parameters:
      rank - the given normalized rank
      Returns:
      the lower bound of the quantile confidence interval in which the true quantile of the given rank exists.
    • getQuantileUpperBound

      public float getQuantileUpperBound(double rank)
      Gets the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.

      Although it is possible to estimate the probability that the true quantile exists within the quantile confidence interval specified by the upper and lower quantile bounds, it is not possible to guarantee the width of the quantile confidence interval as an additive or multiplicative percent of the true quantile.

      The approximate probability that the true quantile is within the confidence interval specified by the upper and lower quantile bounds for this sketch is 0.99.
      Specified by:
      getQuantileUpperBound in interface QuantilesFloatsAPI
      Parameters:
      rank - the given normalized rank
      Returns:
      the upper bound of the quantile confidence interval in which the true quantile of the given rank exists.
    • getRank

      public double getRank(float quantile, QuantileSearchCriteria searchCrit)
      Description copied from interface: QuantilesFloatsAPI
      Gets the normalized rank corresponding to the given quantile.
      Specified by:
      getRank in interface QuantilesFloatsAPI
      Parameters:
      quantile - the given quantile
      searchCrit - if INCLUSIVE the given quantile is included into the rank.
      Returns:
      the normalized rank corresponding to the given quantile.
    • getRankLowerBound

      public double getRankLowerBound(double rank)
      Gets the lower bound of the rank confidence interval in which the true rank of the given rank exists. The approximate probability that the true rank is within the confidence interval specified by the upper and lower rank bounds for this sketch is 0.99.
      Specified by:
      getRankLowerBound in interface QuantilesAPI
      Parameters:
      rank - the given normalized rank.
      Returns:
      the lower bound of the rank confidence interval in which the true rank of the given rank exists.
    • getRankUpperBound

      public double getRankUpperBound(double rank)
      Gets the upper bound of the rank confidence interval in which the true rank of the given rank exists. The approximate probability that the true rank is within the confidence interval specified by the upper and lower rank bounds for this sketch is 0.99.
      Specified by:
      getRankUpperBound in interface QuantilesAPI
      Parameters:
      rank - the given normalized rank.
      Returns:
      the upper bound of the rank confidence interval in which the true rank of the given rank exists.
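One plausible reading of the rank confidence interval is the given rank offset by the sketch's normalized rank error and clamped to [0.0, 1.0]. This page does not state how the bounds are computed, so the sketch below is an assumption. RankBoundsDemo and the eps parameter are hypothetical, with eps standing in for a value such as the one returned by getNormalizedRankError(false).

```java
final class RankBoundsDemo {

  // ASSUMPTION: bounds are rank -/+ eps, clamped to the valid rank range.
  static double lowerBound(double rank, double eps) {
    return Math.max(0.0, rank - eps);
  }

  static double upperBound(double rank, double eps) {
    return Math.min(1.0, rank + eps);
  }

  public static void main(String[] args) {
    double eps = 0.0133; // illustrative error value only
    System.out.println(lowerBound(0.5, eps) + " .. " + upperBound(0.5, eps));
    System.out.println(lowerBound(0.005, eps) + " .. " + upperBound(0.995, eps)); // clamps to 0.0 .. 1.0
  }
}
```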
    • getRanks

      public double[] getRanks(float[] quantiles, QuantileSearchCriteria searchCrit)
      Description copied from interface: QuantilesFloatsAPI
      Gets an array of normalized ranks corresponding to the given array of quantiles and the given search criterion.
      Specified by:
      getRanks in interface QuantilesFloatsAPI
      Parameters:
      quantiles - the given array of quantiles
      searchCrit - if INCLUSIVE, the given quantiles include the rank directly corresponding to each quantile.
      Returns:
      an array of normalized ranks corresponding to the given array of quantiles.
    • iterator

      public QuantilesFloatsSketchIterator iterator()
      Description copied from interface: QuantilesFloatsAPI
      Gets the iterator for this sketch, which is not sorted.
      Specified by:
      iterator in interface QuantilesFloatsAPI
      Returns:
      the iterator for this sketch
    • merge

      public final void merge(KllSketch other)
      Description copied from class: KllSketch
      Merges another sketch into this one. Attempting to merge a sketch of the wrong type will throw an exception.
      Specified by:
      merge in class KllSketch
      Parameters:
      other - sketch to merge into this one
    • reset

      public final void reset()
      Resets this sketch to the empty state. If the sketch is read only this does nothing.

      The parameter k will not change.

      Specified by:
      reset in interface QuantilesAPI
    • toByteArray

      public byte[] toByteArray()
      Description copied from interface: QuantilesFloatsAPI
      Returns a byte array representation of this sketch.
      Specified by:
      toByteArray in interface QuantilesFloatsAPI
      Returns:
      a byte array representation of this sketch.
    • toString

      public String toString(boolean withLevels, boolean withLevelsAndItems)
      Description copied from class: KllSketch
      Returns human readable summary information about this sketch. Used for debugging.
      Specified by:
      toString in class KllSketch
      Parameters:
      withLevels - if true includes sketch levels array summary information
      withLevelsAndItems - if true include detail of levels array and items array together
      Returns:
      human readable summary information about this sketch.
    • update

      public void update(float item)
      Description copied from interface: QuantilesFloatsAPI
      Updates this sketch with the given item.
      Specified by:
      update in interface QuantilesFloatsAPI
      Parameters:
      item - from a stream of quantiles. NaNs are ignored.
    • update

      public void update(float item, long weight)
      Weighted update. Updates this sketch with the given item the number of times specified by the given integer weight.
      Parameters:
      item - the item to be repeated. NaNs are ignored.
      weight - the number of times the update of item is to be repeated. It must be ≥ one.
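The meaning of a weighted update can be shown with an exact rank computed over (item, weight) pairs: an item with weight w counts as w separate items, and NaN items contribute nothing. This is a plain-Java illustration, not the sketch's algorithm. WeightedRankDemo and its method are hypothetical names.

```java
final class WeightedRankDemo {

  /** Exact inclusive normalized rank of q over weighted items; NaN items are skipped. */
  static double inclusiveRank(float[] items, long[] weights, float q) {
    long total = 0;
    long atOrBelow = 0;
    for (int i = 0; i < items.length; i++) {
      if (Float.isNaN(items[i])) { continue; } // NaNs are ignored
      if (weights[i] < 1) {
        throw new IllegalArgumentException("weight must be >= 1");
      }
      total += weights[i];
      if (items[i] <= q) { atOrBelow += weights[i]; }
    }
    return (double) atOrBelow / total; // NaN if nothing was counted (demo choice)
  }

  public static void main(String[] args) {
    float[] items = {1f, 5f, Float.NaN};
    long[] weights = {3L, 1L, 10L};
    // Same as the unweighted stream {1, 1, 1, 5}: three of four items are <= 1.
    System.out.println(inclusiveRank(items, weights, 1f)); // 0.75
  }
}
```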
    • update

      public void update(float[] items, int offset, int length)
      Vector update. Updates this sketch with the given array (vector) of items, starting at the items offset for a length number of items. This is not supported for direct sketches.
      Parameters:
      items - the vector of items
      offset - the starting index of the items[] array
      length - the number of items
    • getSortedView

      public FloatsSketchSortedView getSortedView()
      Description copied from interface: QuantilesFloatsAPI
      Gets the sorted view of this sketch.
      Specified by:
      getSortedView in interface QuantilesFloatsAPI
      Returns:
      the sorted view of this sketch