Class Union

All Implemented Interfaces:
MemoryStatus

public abstract class Union extends SetOperation
Compute the union of two or more theta sketches. A new instance represents an empty set.
Author:
Lee Rhodes
  • Constructor Summary

    Constructors
    Constructor
    Description
    Union()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    abstract int
    getCurrentBytes()
    Returns the number of storage bytes required for this union in its current state.
    Family
    getFamily()
    Gets the Family of this SetOperation
    abstract int
    getMaxUnionBytes()
    Returns the maximum required storage bytes for this union.
    abstract CompactSketch
    getResult()
    Gets the result of this operation as an ordered CompactSketch on the Java heap.
    abstract CompactSketch
    getResult(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
    Gets the result of this operation as a CompactSketch of the chosen form.
    abstract void
    reset()
    Resets this Union.
    abstract byte[]
    toByteArray()
    Returns a byte array image of this Union object
    abstract void
    union(org.apache.datasketches.memory.Memory mem)
    Perform a Union operation with this union and the given Memory image of any sketch of the Theta Family.
    abstract void
    union(Sketch sketchIn)
    Perform a Union operation with this union and the given on-heap sketch of the Theta Family.
    CompactSketch
    union(Sketch sketchA, Sketch sketchB)
    This implements a stateless, pair-wise union operation.
    abstract CompactSketch
    union(Sketch sketchA, Sketch sketchB, boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
    This implements a stateless, pair-wise union operation.
    abstract void
    update(byte[] data)
    Update this union with the given byte array item.
    abstract void
    update(char[] data)
    Update this union with the given char array item.
    abstract void
    update(double datum)
    Update this union with the given double (or float) data item.
    abstract void
    update(int[] data)
    Update this union with the given integer array item.
    abstract void
    update(long datum)
    Update this union with the given long data item.
    abstract void
    update(long[] data)
    Update this union with the given long array item.
    abstract void
    update(String datum)
    Update this union with the given String data item.
    abstract void
    update(ByteBuffer data)
    Update this union with the given ByteBuffer item.

    Methods inherited from class org.apache.datasketches.theta.SetOperation

    builder, getMaxAnotBResultBytes, getMaxIntersectionBytes, getMaxUnionBytes, heapify, heapify, wrap, wrap, wrap, wrap

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.datasketches.common.MemoryStatus

    hasMemory, isDirect, isSameResource
  • Constructor Details

    • Union

      public Union()
  • Method Details

    • getCurrentBytes

      public abstract int getCurrentBytes()
      Returns the number of storage bytes required for this union in its current state.
      Returns:
      the number of storage bytes required for this union in its current state.
    • getFamily

      public Family getFamily()
      Description copied from class: SetOperation
      Gets the Family of this SetOperation
      Specified by:
      getFamily in class SetOperation
      Returns:
      the Family of this SetOperation
    • getMaxUnionBytes

      public abstract int getMaxUnionBytes()
      Returns the maximum required storage bytes for this union.
      Returns:
      the maximum required storage bytes for this union.
    • getResult

      public abstract CompactSketch getResult()
      Gets the result of this operation as an ordered CompactSketch on the Java heap. This does not disturb the underlying data structure of the union. Therefore, it is OK to continue updating the union after this operation.
      Returns:
      the result of this operation as an ordered CompactSketch on the Java heap
    • getResult

      public abstract CompactSketch getResult(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
      Gets the result of this operation as a CompactSketch of the chosen form. This does not disturb the underlying data structure of the union. Therefore, it is OK to continue updating the union after this operation.
      Parameters:
      dstOrdered - See Destination Ordered
      dstMem - See Destination Memory.
      Returns:
      the result of this operation as a CompactSketch of the chosen form
    • reset

      public abstract void reset()
      Resets this Union. The seed remains intact; everything else reverts to its initial, empty state.
    • toByteArray

      public abstract byte[] toByteArray()
      Returns a byte array image of this Union object
      Returns:
      a byte array image of this Union object
    • union

      public CompactSketch union(Sketch sketchA, Sketch sketchB)
      This implements a stateless, pair-wise union operation. The returned sketch will be cut back to the smaller of the two k values if required.

      Nulls and empty sketches are ignored.

      Parameters:
      sketchA - The first argument
      sketchB - The second argument
      Returns:
      the result ordered CompactSketch on the heap.
    • union

      public abstract CompactSketch union(Sketch sketchA, Sketch sketchB, boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
      This implements a stateless, pair-wise union operation. The returned sketch will be cut back to k if required, similar to the regular Union operation.

      Nulls and empty sketches are ignored.

      Parameters:
      sketchA - The first argument
      sketchB - The second argument
      dstOrdered - If true, the returned CompactSketch will be ordered.
      dstMem - If not null, the returned CompactSketch will be placed in this WritableMemory.
      Returns:
      the result CompactSketch.
    • union

      public abstract void union(Sketch sketchIn)
      Perform a Union operation with this union and the given on-heap sketch of the Theta Family. This method is not valid for the older SetSketch, which predates the open-source release (August 2015).

      This method can be repeatedly called.

      Nulls and empty sketches are ignored.

      Parameters:
      sketchIn - The incoming sketch.
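
A minimal usage sketch of the stateful union described above, assuming the datasketches-java library is on the classpath and that default builder settings are acceptable. The class name `ThetaUnionDemo` and the helper `sketchOfRange` are illustrative only and are not part of the library. It merges two overlapping sketches, adds one raw data item through `update(long)`, and then reads the result.

```java
import org.apache.datasketches.theta.CompactSketch;
import org.apache.datasketches.theta.SetOperation;
import org.apache.datasketches.theta.Union;
import org.apache.datasketches.theta.UpdateSketch;

final class ThetaUnionDemo {

  // Illustrative helper: sketches the longs in [start, end) with default settings.
  static UpdateSketch sketchOfRange(long start, long end) {
    UpdateSketch sketch = UpdateSketch.builder().build();
    for (long i = start; i < end; i++) {
      sketch.update(i);
    }
    return sketch;
  }

  static double unionEstimate() {
    // A new union represents the empty set.
    Union union = SetOperation.builder().buildUnion();

    // union(Sketch) can be called repeatedly; the two ranges overlap on [500, 1000).
    union.union(sketchOfRange(0, 1000));
    union.union(sketchOfRange(500, 1500));

    // update(long) treats its argument as a data item, not as a sketch.
    union.update(1500L);

    // getResult() does not disturb the union, so more updates could follow.
    CompactSketch result = union.getResult();
    return result.getEstimate();
  }

  public static void main(String[] args) {
    System.out.println("Estimated distinct count: " + unionEstimate());
  }
}
```

The inputs contain 1,501 distinct values, which is well below the default nominal size, so the estimate is expected to be exact or very close to it.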
    • union

      public abstract void union(org.apache.datasketches.memory.Memory mem)
      Perform a Union operation with this union and the given Memory image of any sketch of the Theta Family. The input image may be from an earlier version of the Theta Compact Sketch, called the SetSketch (circa 2014), which predates the open-source release. Such images are compact and ordered.

      This method can be repeatedly called.

      Nulls and empty sketches are ignored.

      Parameters:
      mem - Memory image of sketch to be merged
    • update

      public abstract void update(long datum)
      Update this union with the given long data item.
      Parameters:
      datum - The given long datum.
    • update

      public abstract void update(double datum)
      Update this union with the given double (or float) data item. The double will be converted to a long using Double.doubleToLongBits(datum), which normalizes all NaN values to a single NaN representation. Plus and minus zero will be normalized to plus zero. The special floating-point values NaN, +Infinity, and -Infinity are each treated as distinct.
      Parameters:
      datum - The given double datum.
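
The normalization described above can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the library's code: `canonicalBits` is a hypothetical helper that folds minus zero into plus zero and then relies on `Double.doubleToLongBits` to collapse every NaN bit pattern into one.

```java
final class DoubleCanonicalizationDemo {

  // Hypothetical helper, not part of the Union API.
  static long canonicalBits(double datum) {
    // -0.0 == 0.0 evaluates to true, so both zeros are replaced by +0.0.
    double d = (datum == 0.0) ? 0.0 : datum;
    // doubleToLongBits returns the same canonical bits for every NaN.
    return Double.doubleToLongBits(d);
  }

  public static void main(String[] args) {
    // A NaN with a non-default payload.
    double oddNaN = Double.longBitsToDouble(0x7ff0000000000001L);

    // Plus and minus zero map to the same bits.
    System.out.println(canonicalBits(0.0) == canonicalBits(-0.0)); // true

    // All NaN values map to the same bits.
    System.out.println(canonicalBits(Double.NaN) == canonicalBits(oddNaN)); // true

    // The two infinities stay distinct from each other.
    System.out.println(canonicalBits(Double.POSITIVE_INFINITY)
        == canonicalBits(Double.NEGATIVE_INFINITY)); // false

    // A float is widened to double first; 1.5f widens exactly to 1.5.
    System.out.println(canonicalBits(1.5f) == canonicalBits(1.5)); // true
  }
}
```

Note that `Double.doubleToLongBits` on its own keeps plus and minus zero distinct, which is why the helper handles the zero case separately.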
    • update

      public abstract void update(String datum)
      Update this union with the given String data item. The string is converted to a byte array using UTF-8 encoding. If the string is null or empty, no update attempt is made and the method returns.

      Note: this will not produce the same output hash values as the update(char[]) method and will generally be a little slower depending on the complexity of the UTF8 encoding.

      Note: this is not a Sketch Union operation. This treats the given string as a data item.

      Parameters:
      datum - The given String.
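
One way to see why `update(String)` and `update(char[])` are not interchangeable is to compare the inputs each one starts from. The sketch below uses only the standard library; `utf8Bytes` is a hypothetical helper, and the example makes no claim about how the library hashes a char array beyond the documented fact that it skips UTF-8 encoding.

```java
import java.nio.charset.StandardCharsets;

final class StringVersusCharArrayDemo {

  // Hypothetical helper: the UTF-8 bytes that update(String) is documented to derive.
  static byte[] utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String text = "h\u00e9llo"; // "héllo", written with an escape to avoid source-encoding issues

    char[] chars = text.toCharArray(); // UTF-16 code units
    byte[] bytes = utf8Bytes(text);    // UTF-8 encoded bytes

    System.out.println(chars.length); // 5
    System.out.println(bytes.length); // 6, because 'é' needs two bytes in UTF-8
  }
}
```

Because the two forms are different sequences of values, the same text fed through both methods should be expected to count as two distinct items. Picking one method per data source avoids that double counting.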
    • update

      public abstract void update(byte[] data)
      Update this union with the given byte array item. If the byte array is null or empty no update attempt is made and the method returns.

      Note: this is not a Sketch Union operation. This treats the given byte array as a data item.

      Parameters:
      data - The given byte array.
    • update

      public abstract void update(ByteBuffer data)
      Update this union with the given ByteBuffer item. If the ByteBuffer is null or empty no update attempt is made and the method returns.

      Note: this is not a Sketch Union operation. This treats the given ByteBuffer as a data item.

      Parameters:
      data - The given ByteBuffer.
    • update

      public abstract void update(int[] data)
      Update this union with the given integer array item. If the integer array is null or empty no update attempt is made and the method returns.

      Note: this is not a Sketch Union operation. This treats the given integer array as a data item.

      Parameters:
      data - The given int array.
    • update

      public abstract void update(char[] data)
      Update this union with the given char array item. If the char array is null or empty no update attempt is made and the method returns.

      Note: this will not produce the same output hash values as the update(String) method but will be a little faster as it avoids the complexity of the UTF8 encoding.

      Note: this is not a Sketch Union operation. This treats the given char array as a data item.

      Parameters:
      data - The given char array.
    • update

      public abstract void update(long[] data)
      Update this union with the given long array item. If the long array is null or empty no update attempt is made and the method returns.

      Note: this is not a Sketch Union operation. This treats the given long array as a data item.

      Parameters:
      data - The given long array.