Class KllItemsSketchSortedView<T>

    • Method Detail

      • getCDF

        public double[] getCDF(T[] splitPoints,
                               QuantileSearchCriteria searchCrit)
        Description copied from interface: GenericSortedView
        Returns an approximation to the Cumulative Distribution Function (CDF) of the input stream as a monotonically increasing array of double ranks (or cumulative probabilities) on the interval [0.0, 1.0], given a set of splitPoints.

        If the sketch is empty this returns null.

        The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(false) function.

        Specified by:
        getCDF in interface GenericSortedView<T>
        Parameters:
        splitPoints - an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 overlapping intervals.

        The start of each interval is below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and the end of the interval is the rank or cumulative probability corresponding to the split point.

        The (m+1)th interval represents 100% of the distribution represented by the sketch. Consistent with the definition of a cumulative probability distribution, the (m+1)th rank or probability in the returned array is therefore always 1.0.

        If a split point exactly equals a retained item of the sketch and the search criterion is:

        • INCLUSIVE, the resulting cumulative probability will include the weight of that item.
        • EXCLUSIVE, the resulting cumulative probability will not include the weight of that item.

        Including either the minimum or maximum item of the input stream among the split points is not recommended.

        searchCrit - the desired search criteria.
        Returns:
        a discrete CDF array of m+1 double ranks (or cumulative probabilities) on the interval [0.0, 1.0].
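        The split-point semantics described above can be sketched with a small stand-in for the sorted view. This is an illustration under stated assumptions, not the library's implementation: the class CdfSketchExample, its rank and cdf helpers, and the boolean inclusive flag (standing in for QuantileSearchCriteria) are all hypothetical, and the retained items and cumulative weights are made-up sample data.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative stand-in for a sorted view; NOT the DataSketches implementation.
final class CdfSketchExample {

  // Normalized rank of 'item': cumulative weight of the retained items below it
  // (or at it, when inclusive is true), divided by the total weight n.
  static <T> double rank(T[] items, long[] cumWeights, T item,
                         boolean inclusive, Comparator<? super T> cmp) {
    long n = cumWeights[cumWeights.length - 1];
    long weight = 0;
    for (int i = 0; i < items.length; i++) {
      int c = cmp.compare(items[i], item);
      if (c < 0 || (inclusive && c == 0)) {
        weight = cumWeights[i];
      } else {
        break; // items are sorted, so nothing further can qualify
      }
    }
    return (double) weight / n;
  }

  // One rank per split point, plus a final entry that is always 1.0.
  static <T> double[] cdf(T[] items, long[] cumWeights, T[] splitPoints,
                          boolean inclusive, Comparator<? super T> cmp) {
    double[] out = new double[splitPoints.length + 1];
    for (int j = 0; j < splitPoints.length; j++) {
      out[j] = rank(items, cumWeights, splitPoints[j], inclusive, cmp);
    }
    out[splitPoints.length] = 1.0;
    return out;
  }

  public static void main(String[] args) {
    String[] items = {"a", "b", "c", "d"};  // hypothetical retained items, sorted
    long[] cumWeights = {1, 2, 3, 4};       // each item has weight 1, so n = 4
    String[] splitPoints = {"b", "c"};
    Comparator<String> cmp = Comparator.naturalOrder();
    // inclusive: [0.5, 0.75, 1.0]
    System.out.println(Arrays.toString(cdf(items, cumWeights, splitPoints, true, cmp)));
    // exclusive: [0.25, 0.5, 1.0]
    System.out.println(Arrays.toString(cdf(items, cumWeights, splitPoints, false, cmp)));
  }
}
```

        Both split points equal retained items here, so the inclusive and exclusive results differ by exactly the weight of the matching item; the final entry is 1.0 in both cases.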
      • getCumulativeWeights

        public long[] getCumulativeWeights()
        Description copied from interface: SortedView
        Returns the array of cumulative weights from the sketch. These are also known as the natural ranks, which are natural numbers on the interval [1, N].
        Specified by:
        getCumulativeWeights in interface SortedView
        Returns:
        the array of cumulative weights (or natural ranks).
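        As a rough illustration of what a cumulative-weights array looks like, the sketch below takes a running sum over per-item weights. The class CumulativeWeightsExample, the cumulate helper, and the sample weights are hypothetical; how the library actually assigns weights to retained items is not shown here.

```java
import java.util.Arrays;

// Illustrative only; NOT the DataSketches implementation.
final class CumulativeWeightsExample {

  // Running sum of the per-item weights of the sorted retained items.
  // The last entry equals N, the total number of items presented.
  static long[] cumulate(long[] weights) {
    long[] out = new long[weights.length];
    long sum = 0;
    for (int i = 0; i < weights.length; i++) {
      sum += weights[i];
      out[i] = sum;
    }
    return out;
  }

  public static void main(String[] args) {
    long[] weights = {1, 1, 2, 4}; // hypothetical weights of four retained items
    System.out.println(Arrays.toString(cumulate(weights))); // [1, 2, 4, 8]
  }
}
```

        In this example N is 8, and every entry lies on the interval [1, N].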
      • getMaxItem

        public T getMaxItem()
        Description copied from interface: GenericSortedView
        Returns the maximum item of the stream. This may be distinct from the largest item retained by the sketch algorithm.
        Specified by:
        getMaxItem in interface GenericSortedView<T>
        Returns:
        the maximum item of the stream
      • getMinItem

        public T getMinItem()
        Description copied from interface: GenericSortedView
        Returns the minimum item of the stream. This may be distinct from the smallest item retained by the sketch algorithm.
        Specified by:
        getMinItem in interface GenericSortedView<T>
        Returns:
        the minimum item of the stream
      • getN

        public long getN()
        Description copied from interface: SortedView
        Returns the total number of items presented to the sourcing sketch.
        Specified by:
        getN in interface SortedView
        Returns:
        the total number of items presented to the sourcing sketch.
      • getPartitionBoundaries

        public GenericPartitionBoundaries<T> getPartitionBoundaries(int numEquallySized,
                                                                    QuantileSearchCriteria searchCrit)
        Description copied from interface: PartitioningFeature
        This method returns an instance of GenericPartitionBoundaries which provides sufficient information for the user to create the given number of equally sized partitions, where "equally sized" refers to an approximately equal number of items per partition.
        Specified by:
        getPartitionBoundaries in interface PartitioningFeature<T>
        Parameters:
        numEquallySized - an integer that specifies the number of equally sized partitions between getMinItem() and getMaxItem(). This must be greater than zero.
        • A 1 will return: minItem, maxItem.
        • A 2 will return: minItem, median quantile, maxItem.
        • Etc.
        searchCrit - If INCLUSIVE, all the returned quantiles are the upper boundaries of the equally sized partitions with the exception of the lowest returned quantile, which is the lower boundary of the lowest ranked partition. If EXCLUSIVE, all the returned quantiles are the lower boundaries of the equally sized partitions with the exception of the highest returned quantile, which is the upper boundary of the highest ranked partition.
        Returns:
        an instance of GenericPartitionBoundaries.
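        The examples for numEquallySized (1 gives minItem and maxItem, 2 adds the median) suggest that the boundaries sit at evenly spaced normalized ranks. The sketch below computes such ranks under that assumption; the class PartitionRanksExample and the evenlySpacedRanks helper are hypothetical and do not reproduce GenericPartitionBoundaries.

```java
import java.util.Arrays;

// Illustrative only; assumes boundaries fall at evenly spaced normalized ranks.
final class PartitionRanksExample {

  // Returns numEquallySized + 1 normalized ranks from 0.0 to 1.0 inclusive.
  static double[] evenlySpacedRanks(int numEquallySized) {
    if (numEquallySized < 1) {
      throw new IllegalArgumentException("numEquallySized must be greater than zero");
    }
    double[] ranks = new double[numEquallySized + 1];
    for (int i = 0; i <= numEquallySized; i++) {
      ranks[i] = (double) i / numEquallySized;
    }
    return ranks;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(evenlySpacedRanks(1))); // [0.0, 1.0]
    System.out.println(Arrays.toString(evenlySpacedRanks(2))); // [0.0, 0.5, 1.0]
  }
}
```

        Looking up the quantile at each of these ranks would then give one boundary per rank, with rank 0.0 corresponding to the minimum item and rank 1.0 to the maximum item.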
      • getPMF

        public double[] getPMF(T[] splitPoints,
                               QuantileSearchCriteria searchCrit)
        Description copied from interface: GenericSortedView
        Returns an approximation to the Probability Mass Function (PMF) of the input stream as an array of probability masses as doubles on the interval [0.0, 1.0], given a set of splitPoints.

        The resulting approximations have a probabilistic guarantee that can be obtained from the getNormalizedRankError(true) function.

        Specified by:
        getPMF in interface GenericSortedView<T>
        Parameters:
        splitPoints - an array of m unique, monotonically increasing items (of the same type as the input items) that divide the item input domain into m+1 consecutive, non-overlapping intervals.

        Each interval except for the end intervals starts with a split point and ends with the next split point in sequence.

        The first interval starts below the lowest item retained by the sketch corresponding to a zero rank or zero probability, and ends with the first split point.

        The last (m+1)th interval starts with the last split point and ends after the last item retained by the sketch corresponding to a rank or probability of 1.0.

        The sum of the probability masses of all (m+1) intervals is 1.0.

        If the search criterion is:

        • INCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will include that item. If the lower split point equals an item retained by the sketch, the interval will exclude that item.
        • EXCLUSIVE, and the upper split point of an interval equals an item retained by the sketch, the interval will exclude that item. If the lower split point equals an item retained by the sketch, the interval will include that item.

        Including either the minimum or maximum item of the input stream among the split points is not recommended.

        searchCrit - the desired search criteria.
        Returns:
        a PMF array of m+1 probability masses as doubles on the interval [0.0, 1.0].
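        One way to read the interval description above is that each probability mass is the difference between consecutive CDF values for the same split points. The sketch below shows that relationship only; the class PmfFromCdfExample, the pmfFromCdf helper, and the sample CDF values are hypothetical, and whether the library computes the PMF this way is an assumption.

```java
import java.util.Arrays;

// Illustrative only; shows the CDF-to-PMF relationship, not library internals.
final class PmfFromCdfExample {

  // Given a CDF array of m+1 values ending in 1.0, returns the m+1 interval masses.
  static double[] pmfFromCdf(double[] cdf) {
    double[] pmf = new double[cdf.length];
    double previous = 0.0; // the first interval starts at rank zero
    for (int i = 0; i < cdf.length; i++) {
      pmf[i] = cdf[i] - previous;
      previous = cdf[i];
    }
    return pmf;
  }

  public static void main(String[] args) {
    double[] cdf = {0.5, 0.75, 1.0}; // hypothetical CDF for two split points
    System.out.println(Arrays.toString(pmfFromCdf(cdf))); // [0.5, 0.25, 0.25]
  }
}
```

        The masses sum to 1.0 because the last CDF entry is 1.0.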
      • getQuantile

        public T getQuantile(double rank,
                             QuantileSearchCriteria searchCrit)
        Description copied from interface: GenericSortedView
        Gets the approximate quantile of the given normalized rank and the given search criterion.
        Specified by:
        getQuantile in interface GenericSortedView<T>
        Parameters:
        rank - the given normalized rank, a double in the range [0.0, 1.0].
        searchCrit - If INCLUSIVE, the given rank includes all quantiles ≤ the quantile directly corresponding to the given rank. If EXCLUSIVE, the given rank includes all quantiles < the quantile directly corresponding to the given rank.
        Returns:
        the approximate quantile given the normalized rank.
        See Also:
        QuantileSearchCriteria
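        A minimal sketch of a rank-to-quantile lookup over sorted items and their cumulative weights follows. It is a simplified model, not the library's implementation: the class QuantileLookupExample, the quantile helper, the boolean inclusive flag (standing in for QuantileSearchCriteria), and the sample data are hypothetical, and the exact comparison rules used by the library are an assumption here.

```java
// Illustrative only; a simplified model of a rank-to-quantile lookup.
final class QuantileLookupExample {

  // Returns the first retained item whose cumulative weight reaches the target
  // weight rank * n (inclusive) or exceeds it (exclusive). Falls back to the
  // last retained item when no cumulative weight qualifies.
  static <T> T quantile(T[] items, long[] cumWeights, double rank, boolean inclusive) {
    if (rank < 0.0 || rank > 1.0) {
      throw new IllegalArgumentException("rank must be in [0.0, 1.0]");
    }
    long n = cumWeights[cumWeights.length - 1];
    double target = rank * n;
    for (int i = 0; i < items.length; i++) {
      boolean hit = inclusive ? cumWeights[i] >= target : cumWeights[i] > target;
      if (hit) {
        return items[i];
      }
    }
    return items[items.length - 1];
  }

  public static void main(String[] args) {
    String[] items = {"a", "b", "c", "d"}; // hypothetical retained items, sorted
    long[] cumWeights = {1, 2, 3, 4};      // each item has weight 1, so n = 4
    System.out.println(quantile(items, cumWeights, 0.5, true));  // b
    System.out.println(quantile(items, cumWeights, 0.5, false)); // c
  }
}
```

        With rank 0.5 the target weight is 2: the inclusive lookup stops at the item whose cumulative weight equals 2, while the exclusive lookup moves on to the next item.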
      • getRank

        public double getRank(T quantile,
                              QuantileSearchCriteria searchCrit)
        Description copied from interface: GenericSortedView
        Gets the normalized rank corresponding to the given quantile.
        Specified by:
        getRank in interface GenericSortedView<T>
        Parameters:
        quantile - the given quantile
        searchCrit - If INCLUSIVE, the given quantile is included in the rank. If EXCLUSIVE, it is not.
        Returns:
        the normalized rank corresponding to the given quantile.
        See Also:
        QuantileSearchCriteria
      • isEmpty

        public boolean isEmpty()
        Description copied from interface: SortedView
        Returns true if this sorted view is empty.
        Specified by:
        isEmpty in interface SortedView
        Returns:
        true if this sorted view is empty.