Interface PartitioningFeature<T>

Type Parameters:
T - the item class type
All Known Subinterfaces:
GenericSortedView<T>, QuantilesGenericAPI<T>
All Known Implementing Classes:
ItemsSketch, ItemsSketchSortedView, KllItemsSketch

public interface PartitioningFeature<T>
This interface provides methods for efficiently partitioning massive data sets into approximately equally sized parts.
  • Method Details

    • getPartitionBoundariesFromNumParts

      default GenericPartitionBoundaries<T> getPartitionBoundariesFromNumParts(int numEquallySizedParts)
      This method returns an instance of GenericPartitionBoundaries which provides sufficient information for the user to create the given number of equally sized partitions, where "equally sized" refers to an approximately equal number of items per partition.

      This method is equivalent to getPartitionBoundariesFromNumParts(numEquallySizedParts, INCLUSIVE).

      The sketch must not be empty.

      Parameters:
      numEquallySizedParts - an integer that specifies the number of equally sized partitions between getMinItem() and getMaxItem(). This must be a positive integer less than getMaxPartitions().
      • A 1 will return: minItem, maxItem.
      • A 2 will return: minItem, median quantile, maxItem.
      • Etc.
      Returns:
      an instance of GenericPartitionBoundaries.
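
      The relationship between numEquallySizedParts and the returned boundaries can be illustrated on exact data. The following is a minimal sketch in plain Java, not the library's implementation: the class NumPartsBoundariesSketch and the helper boundariesFromNumParts are hypothetical names, and the helper works on a fully sorted array with inclusive ranks rather than on a sketch.

```java
import java.util.Arrays;

public class NumPartsBoundariesSketch {

    /**
     * Hypothetical helper (not a library method): exact partition boundaries
     * for a fully sorted, non-empty array, using inclusive ranks.
     */
    public static int[] boundariesFromNumParts(int[] sorted, int numParts) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("input must not be empty");
        }
        if (numParts < 1 || numParts > sorted.length) {
            throw new IllegalArgumentException("numParts must be in [1, n]");
        }
        int n = sorted.length;
        int[] bounds = new int[numParts + 1];
        bounds[0] = sorted[0]; // the lowest boundary is the minimum item
        for (int i = 1; i <= numParts; i++) {
            // Natural (1-based) rank of the i-th boundary: ceil(i * n / numParts).
            long natRank = ((long) i * n + numParts - 1) / numParts;
            bounds[i] = sorted[(int) natRank - 1];
        }
        return bounds;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(Arrays.toString(boundariesFromNumParts(data, 1))); // [1, 10]
        System.out.println(Arrays.toString(boundariesFromNumParts(data, 2))); // [1, 5, 10]
        System.out.println(Arrays.toString(boundariesFromNumParts(data, 5))); // [1, 2, 4, 6, 8, 10]
    }
}
```

      With a real sketch the boundaries are approximate quantiles, so the resulting partition sizes are only approximately equal.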
    • getPartitionBoundariesFromNumParts

      GenericPartitionBoundaries<T> getPartitionBoundariesFromNumParts(int numEquallySizedParts, QuantileSearchCriteria searchCrit)
      This method returns an instance of GenericPartitionBoundaries which provides sufficient information for the user to create the given number of equally sized partitions, where "equally sized" refers to an approximately equal number of items per partition.

      The sketch must not be empty.

      Parameters:
      numEquallySizedParts - an integer that specifies the number of equally sized partitions between getMinItem() and getMaxItem(). This must be a positive integer less than getMaxPartitions().
      • A 1 will return: minItem, maxItem.
      • A 2 will return: minItem, median quantile, maxItem.
      • Etc.
      searchCrit - If INCLUSIVE, all the returned quantiles are the upper boundaries of the equally sized partitions with the exception of the lowest returned quantile, which is the lower boundary of the lowest ranked partition. If EXCLUSIVE, all the returned quantiles are the lower boundaries of the equally sized partitions with the exception of the highest returned quantile, which is the upper boundary of the highest ranked partition.
      Returns:
      an instance of GenericPartitionBoundaries.
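
      The searchCrit description determines which partition an item equal to a boundary belongs to. The following is a minimal sketch of one reading of that rule, not the library's code: the class BoundaryAssignmentSketch, the local enum Criteria (a stand-in for QuantileSearchCriteria), and the helper partitionOf are all hypothetical. The library may return different boundary values for each criterion; the same array is reused here only to show how boundary items are assigned.

```java
public class BoundaryAssignmentSketch {

    /** Local stand-in for the library's QuantileSearchCriteria, so the example is self-contained. */
    public enum Criteria { INCLUSIVE, EXCLUSIVE }

    /**
     * Hypothetical helper: returns the 0-based partition index of an item,
     * given numParts + 1 ascending boundaries (bounds[0] = min, last = max).
     */
    public static int partitionOf(int[] bounds, int item, Criteria crit) {
        int numParts = bounds.length - 1;
        if (item < bounds[0] || item > bounds[numParts]) {
            throw new IllegalArgumentException("item outside [min, max]");
        }
        if (crit == Criteria.INCLUSIVE) {
            // Boundaries are upper bounds: partition k holds bounds[k] < item <= bounds[k + 1].
            // The lowest partition also holds the minimum item, bounds[0].
            for (int k = 0; k < numParts; k++) {
                if (item <= bounds[k + 1]) {
                    return k;
                }
            }
        } else {
            // Boundaries are lower bounds: partition k holds bounds[k] <= item < bounds[k + 1].
            // The highest partition also holds the maximum item, bounds[numParts].
            for (int k = numParts - 1; k >= 0; k--) {
                if (item >= bounds[k]) {
                    return k;
                }
            }
        }
        throw new IllegalStateException("unreachable for ascending bounds");
    }

    public static void main(String[] args) {
        int[] bounds = {1, 5, 10};
        System.out.println(partitionOf(bounds, 5, Criteria.INCLUSIVE)); // 0
        System.out.println(partitionOf(bounds, 5, Criteria.EXCLUSIVE)); // 1
    }
}
```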
    • getPartitionBoundariesFromPartSize

      default GenericPartitionBoundaries<T> getPartitionBoundariesFromPartSize(long nominalPartSizeItems)
      This method returns an instance of GenericPartitionBoundaries which provides sufficient information for the user to create equally sized partitions of approximately the given nominal size, where "equally sized" refers to an approximately equal number of items per partition.

      This method is equivalent to getPartitionBoundariesFromPartSize(nominalPartSizeItems, INCLUSIVE).

      The sketch must not be empty.

      Parameters:
      nominalPartSizeItems - an integer that specifies the nominal size, in items, of each target partition. This must be a positive integer greater than getMinPartitionSizeItems().
      Returns:
      an instance of GenericPartitionBoundaries.
    • getPartitionBoundariesFromPartSize

      GenericPartitionBoundaries<T> getPartitionBoundariesFromPartSize(long nominalPartSizeItems, QuantileSearchCriteria searchCrit)
      This method returns an instance of GenericPartitionBoundaries which provides sufficient information for the user to create equally sized partitions of approximately the given nominal size, where "equally sized" refers to an approximately equal number of items per partition.

      The sketch must not be empty.

      Parameters:
      nominalPartSizeItems - an integer that specifies the nominal size, in items, of each target partition. This must be a positive integer greater than getMinPartitionSizeItems().
      searchCrit - If INCLUSIVE, all the returned quantiles are the upper boundaries of the equally sized partitions with the exception of the lowest returned quantile, which is the lower boundary of the lowest ranked partition. If EXCLUSIVE, all the returned quantiles are the lower boundaries of the equally sized partitions with the exception of the highest returned quantile, which is the upper boundary of the highest ranked partition.
      Returns:
      an instance of GenericPartitionBoundaries.
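
      The partition size is called nominal because the total item count rarely divides evenly by it. The following is a minimal sketch of one plausible way to relate a nominal size to a partition count and to the resulting actual sizes. It is an assumption for illustration only: the class PartSizeSketch and both helpers are hypothetical, and the library's own rounding and limits (getMinPartitionSizeItems(), getMaxPartitions()) are not modeled.

```java
import java.util.Arrays;

public class PartSizeSketch {

    /**
     * Hypothetical helper: converts a nominal partition size into a partition
     * count by integer division, never returning fewer than one partition.
     */
    public static int numPartsFromPartSize(long totalItems, long nominalPartSizeItems) {
        if (totalItems <= 0) {
            throw new IllegalArgumentException("totalItems must be positive");
        }
        if (nominalPartSizeItems <= 0) {
            throw new IllegalArgumentException("nominalPartSizeItems must be positive");
        }
        long parts = totalItems / nominalPartSizeItems;
        return (int) Math.max(1L, Math.min(parts, (long) Integer.MAX_VALUE));
    }

    /** Hypothetical helper: actual sizes when totalItems are split as evenly as possible. */
    public static long[] partitionSizes(long totalItems, int numParts) {
        long[] sizes = new long[numParts];
        long base = totalItems / numParts;
        long extra = totalItems % numParts; // the first 'extra' partitions get one more item
        for (int k = 0; k < numParts; k++) {
            sizes[k] = base + (k < extra ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        int parts = numPartsFromPartSize(1000, 300);
        System.out.println(parts);                                         // 3
        System.out.println(Arrays.toString(partitionSizes(1000, parts))); // [334, 333, 333]
    }
}
```

      In this example a nominal size of 300 items over 1000 items yields three partitions of about 333 items each, so the actual size differs from the requested one.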