Class ReservoirLongsUnion

java.lang.Object
org.apache.datasketches.sampling.ReservoirLongsUnion

public final class ReservoirLongsUnion extends Object
Class to union reservoir samples of longs.

For efficiency reasons, the unioning process picks one of the two sketches to use as the base. As a result, we provide only a stateful union. Using the same approach for a merge would result in unpredictable side effects on the underlying sketches.

A union object is created with a maximum value of k, represented using the ReservoirSize class. The unioning process may cause the actual number of samples to fall below that maximum value, but never to exceed it. The result of a union will be a reservoir in which each item from the global input has a uniform probability of selection, but there are no claims about higher-order statistics. For instance, in general, not all possible permutations of the global input are equally likely.
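
The uniform-selection property described above can be illustrated with a small, self-contained sketch. It is not the DataSketches implementation: the class UniformSampleMergeDemo, its Sample type, and the merge method are hypothetical names introduced here. The sketch assumes both inputs were built with the same capacity k, so each holds min(k, n) items, and it draws each output item from one input with probability proportional to how much of that input's original stream is still unaccounted for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; this is NOT the DataSketches implementation.
final class UniformSampleMergeDemo {

  /** A uniform sample: items retained from a stream of n longs. */
  static final class Sample {
    final List<Long> items;
    final long n;

    Sample(List<Long> items, long n) {
      this.items = items;
      this.n = n;
    }
  }

  /**
   * Merges two uniform samples into one of at most k items.
   * Assumes each input holds min(k, n) items of its own stream.
   */
  static Sample merge(Sample a, Sample b, int k, Random rnd) {
    List<Long> poolA = new ArrayList<>(a.items);
    List<Long> poolB = new ArrayList<>(b.items);
    long remA = a.n; // items of stream A not yet accounted for
    long remB = b.n; // items of stream B not yet accounted for
    List<Long> out = new ArrayList<>();
    while (out.size() < k && remA + remB > 0) {
      // Choose the source in proportion to the remaining stream sizes.
      boolean fromA = rnd.nextDouble() * (remA + remB) < remA;
      if (fromA) {
        out.add(poolA.remove(rnd.nextInt(poolA.size())));
        remA--;
      } else {
        out.add(poolB.remove(rnd.nextInt(poolB.size())));
        remB--;
      }
    }
    return new Sample(out, a.n + b.n);
  }

  public static void main(String[] args) {
    Random rnd = new Random(7);
    // Sample a kept 3 of 300 items; sample b kept its entire 2-item stream.
    Sample a = new Sample(Arrays.asList(11L, 150L, 287L), 300);
    Sample b = new Sample(Arrays.asList(1000L, 1001L), 2);
    Sample merged = merge(a, b, 3, rnd);
    System.out.println(merged.items + " sampled from n = " + merged.n);
  }
}
```

The result never holds more than k items and records the combined stream length. Because only the counts and the retained items are used, nothing can be said about the ordering of the original inputs, which is consistent with the caveat about higher-order statistics.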

Author:
Jon Malkin, Kevin Lang
  • Method Details

    • newInstance

      public static ReservoirLongsUnion newInstance(int maxK)
      Creates an empty Union with a maximum reservoir capacity of maxK.
      Parameters:
      maxK - The maximum allowed reservoir capacity for any sketches in the union
      Returns:
      A new ReservoirLongsUnion
    • heapify

      public static ReservoirLongsUnion heapify(org.apache.datasketches.memory.Memory srcMem)
      Instantiates a Union from Memory.
      Parameters:
      srcMem - Memory object containing a serialized union
      Returns:
      A ReservoirLongsUnion created from the provided Memory
    • getMaxK

      public int getMaxK()
      Returns the maximum allowed reservoir capacity in this union. The current reservoir capacity may be lower.
      Returns:
      The maximum allowed reservoir capacity in this union.
    • update

      public void update(ReservoirLongsSketch sketchIn)
      Union the given sketch.

      This method can be called repeatedly. If the given sketch is null, it is interpreted as an empty sketch.

      Parameters:
      sketchIn - The incoming sketch.
    • update

      public void update(org.apache.datasketches.memory.Memory mem)
      Union the given Memory image of the sketch.

      This method can be called repeatedly. If the given Memory is null, it is interpreted as an empty sketch.

      Parameters:
      mem - Memory image of sketch to be merged
    • update

      public void update(long datum)
      Present this union with a long.
      Parameters:
      datum - The given long datum.
    • getResult

      public ReservoirLongsSketch getResult()
      Returns a sketch representing the current state of the union.
      Returns:
      The result of any unions already processed.
    • toString

      public String toString()
      Returns a human-readable summary of the sketch, without items.
      Overrides:
      toString in class Object
      Returns:
      A string version of the sketch summary
    • toByteArray

      public byte[] toByteArray()
      Returns a byte array representation of this union.
      Returns:
      A byte array representation of this union
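
Presenting a single long, as update(long datum) does in the method list above, can be thought of as one step of reservoir sampling. The following is a minimal sketch assuming the classic Algorithm R. The class LongReservoirDemo and all of its members are hypothetical and are not taken from the library, whose internal update logic may differ.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; this is NOT the DataSketches implementation.
final class LongReservoirDemo {
  private final long[] samples; // capacity k
  private final Random rnd;
  private long n;               // total number of longs presented so far

  LongReservoirDemo(int k, long seed) {
    this.samples = new long[k];
    this.rnd = new Random(seed);
  }

  /** Presents one long; after n updates each input is retained with probability min(1, k/n). */
  void update(long datum) {
    if (n < samples.length) {
      // Reservoir not yet full: keep everything.
      samples[(int) n] = datum;
    } else {
      // Pick a slot index uniformly from [0, n]; replace only if it falls inside the reservoir.
      long j = (long) (rnd.nextDouble() * (n + 1));
      if (j < samples.length) {
        samples[(int) j] = datum;
      }
    }
    n++;
  }

  long getN() {
    return n;
  }

  int getNumSamples() {
    return (int) Math.min(n, samples.length);
  }

  /** Returns a copy of the retained items. */
  long[] getSamples() {
    return Arrays.copyOf(samples, getNumSamples());
  }

  public static void main(String[] args) {
    LongReservoirDemo reservoir = new LongReservoirDemo(5, 1L);
    for (long i = 0; i < 100; i++) {
      reservoir.update(i);
    }
    System.out.println(Arrays.toString(reservoir.getSamples())
        + " retained out of n = " + reservoir.getN());
  }
}
```

The fixed seed is only there to make the demo repeatable. Once n exceeds the capacity, the reservoir stays at exactly k items.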