Class Union


  • public abstract class Union
    extends SetOperation
    Compute the union of two or more theta sketches. A new instance represents an empty set.
    Author:
    Lee Rhodes
    • Constructor Summary

      Constructors 
      Constructor Description
      Union()  
    • Constructor Detail

      • Union

        public Union()
    • Method Detail

      • getCurrentBytes

        public abstract int getCurrentBytes()
        Returns the number of storage bytes required for this union in its current state.
        Returns:
        the number of storage bytes required for this union in its current state.
      • getFamily

        public Family getFamily()
        Description copied from class: SetOperation
        Gets the Family of this SetOperation
        Specified by:
        getFamily in class SetOperation
        Returns:
        the Family of this SetOperation
      • getMaxUnionBytes

        public abstract int getMaxUnionBytes()
        Returns the maximum required storage bytes for this union.
        Returns:
        the maximum required storage bytes for this union.
      • getResult

        public abstract CompactSketch getResult()
        Gets the result of this operation as an ordered CompactSketch on the Java heap. This does not disturb the underlying data structure of the union. Therefore, it is OK to continue updating the union after this operation.
        Returns:
        the result of this operation as an ordered CompactSketch on the Java heap
      • getResult

        public abstract CompactSketch getResult​(boolean dstOrdered,
                                                org.apache.datasketches.memory.WritableMemory dstMem)
        Gets the result of this operation as a CompactSketch of the chosen form. This does not disturb the underlying data structure of the union. Therefore, it is OK to continue updating the union after this operation.
        Parameters:
        dstOrdered - See Destination Ordered
        dstMem - See Destination Memory.
        Returns:
        the result of this operation as a CompactSketch of the chosen form
      • reset

        public abstract void reset()
        Resets this Union. The seed remains intact; everything else reverts to its initial state.
      • toByteArray

        public abstract byte[] toByteArray()
        Returns a byte array image of this Union object
        Returns:
        a byte array image of this Union object
      • union

        public CompactSketch union​(Sketch sketchA,
                                   Sketch sketchB)
        This implements a stateless, pair-wise union operation. The returned sketch will be cut back to the smaller of the two k values if required.

        Nulls and empty sketches are ignored.

        Parameters:
        sketchA - The first argument
        sketchB - The second argument
        Returns:
        the result ordered CompactSketch on the heap.
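
        As a rough illustration of what "cut back to the smaller of the two k values" can mean, the toy model below merges two sets of hash values and keeps at most min(kA, kB) of the smallest ones, leaving both inputs untouched. This is only a sketch under stated assumptions: ToySketch, ToyPairwiseUnion and of are hypothetical names, a real Theta sketch also tracks a theta threshold, null handling is omitted, and the library's actual algorithm is not shown here.

```java
import java.util.TreeSet;

// Toy model only: NOT the DataSketches implementation.
final class ToyPairwiseUnion {

    // Hypothetical stand-in for a sketch: a nominal size k plus a sorted set of hash values.
    static final class ToySketch {
        final int k;
        final TreeSet<Long> hashes = new TreeSet<>();

        ToySketch(final int k) {
            this.k = k;
        }
    }

    // Hypothetical convenience factory for the example.
    static ToySketch of(final int k, final long... values) {
        final ToySketch s = new ToySketch(k);
        for (final long v : values) {
            s.hashes.add(v);
        }
        return s;
    }

    // Stateless pair-wise union: builds a new result and does not modify a or b.
    static ToySketch union(final ToySketch a, final ToySketch b) {
        final ToySketch result = new ToySketch(Math.min(a.k, b.k));
        result.hashes.addAll(a.hashes);
        result.hashes.addAll(b.hashes);
        // Cut back: drop the largest values until at most min(kA, kB) remain.
        while (result.hashes.size() > result.k) {
            result.hashes.pollLast();
        }
        return result;
    }

    public static void main(final String[] args) {
        final ToySketch a = of(4, 1L, 2L, 3L, 4L);
        final ToySketch b = of(2, 3L, 5L);
        final ToySketch r = union(a, b);
        System.out.println("k=" + r.k + " hashes=" + r.hashes);
    }
}
```

        An empty input contributes nothing to the merged set, which matches the note that empty sketches are ignored.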
      • union

        public abstract CompactSketch union​(Sketch sketchA,
                                            Sketch sketchB,
                                            boolean dstOrdered,
                                            org.apache.datasketches.memory.WritableMemory dstMem)
        This implements a stateless, pair-wise union operation. The returned sketch will be cut back to k if required, similar to the regular Union operation.

        Nulls and empty sketches are ignored.

        Parameters:
        sketchA - The first argument
        sketchB - The second argument
        dstOrdered - If true, the returned CompactSketch will be ordered.
        dstMem - If not null, the returned CompactSketch will be placed in this WritableMemory.
        Returns:
        the result CompactSketch.
      • union

        public abstract void union​(Sketch sketchIn)
        Perform a Union operation with this union and the given on-heap sketch of the Theta Family. This method is not valid for the older SetSketch, which predates the open-source release (August 2015).

        This method can be called repeatedly.

        Nulls and empty sketches are ignored.

        Parameters:
        sketchIn - The incoming sketch.
      • union

        public abstract void union​(org.apache.datasketches.memory.Memory mem)
        Perform a Union operation with this union and the given Memory image of any sketch of the Theta Family. The input image may be from an earlier version of the Theta Compact Sketch, called the SetSketch (circa 2014), which predates the open-source release; such images are compact and ordered.

        This method can be called repeatedly.

        Nulls and empty sketches are ignored.

        Parameters:
        mem - Memory image of sketch to be merged
      • update

        public abstract void update​(long datum)
        Update this union with the given long data item.
        Parameters:
        datum - The given long datum.
      • update

        public abstract void update​(double datum)
        Update this union with the given double (or float) data item. The double will be converted to a long using Double.doubleToLongBits(datum), which normalizes all NaN values to a single NaN representation. Plus and minus zero will be normalized to plus zero. Each of the special floating-point values NaN, +Infinity and -Infinity is treated as distinct.
        Parameters:
        datum - The given double datum.
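
        The normalization described above can be checked with the JDK alone. The helper below is a minimal sketch, not the library's code: canonicalBits and DoubleDatumBits are hypothetical names, and the zero check is one assumed way to map minus zero to plus zero before calling Double.doubleToLongBits, which by itself keeps the two zeros distinct.

```java
// Minimal sketch of the double normalization, using only the JDK.
final class DoubleDatumBits {

    // Hypothetical helper, not part of the Union API.
    static long canonicalBits(final double datum) {
        // -0.0 == 0.0 evaluates to true, so both zeros are replaced by +0.0 (assumed approach).
        final double d = (datum == 0.0) ? 0.0 : datum;
        // doubleToLongBits collapses every NaN to the canonical value 0x7ff8000000000000L.
        return Double.doubleToLongBits(d);
    }

    public static void main(final String[] args) {
        // A NaN with a non-canonical payload.
        final double otherNaN = Double.longBitsToDouble(0x7ff8000000000001L);
        System.out.println(canonicalBits(0.0) == canonicalBits(-0.0));
        System.out.println(canonicalBits(Double.NaN) == canonicalBits(otherNaN));
        System.out.println(canonicalBits(Double.POSITIVE_INFINITY)
            != canonicalBits(Double.NEGATIVE_INFINITY));
    }
}
```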
      • update

        public abstract void update​(String datum)
        Update this union with the given String data item. The string is converted to a byte array using UTF-8 encoding. If the string is null or empty, no update attempt is made and the method returns.

        Note: this will not produce the same output hash values as the update(char[]) method and will generally be a little slower, depending on the complexity of the UTF-8 encoding.

        Note: this is not a Sketch Union operation. This treats the given string as a data item.

        Parameters:
        datum - The given String.
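
        To see why update(String) and update(char[]) cannot be expected to hash to the same value, it helps to compare the bytes each form presents. The sketch below uses only the JDK and does not call the library: StringVsCharArray, utf8Bytes and charBytes are hypothetical names, and the big-endian layout of the 16-bit chars is an assumption made purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Compares the UTF-8 bytes of a String with the raw 16-bit code units of its char[].
final class StringVsCharArray {

    // The bytes a UTF-8 conversion of the string produces.
    static byte[] utf8Bytes(final String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Raw 16-bit code units laid out as bytes (byte order assumed for illustration).
    static byte[] charBytes(final char[] chars) {
        final byte[] out = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            out[2 * i] = (byte) (chars[i] >>> 8);
            out[2 * i + 1] = (byte) chars[i];
        }
        return out;
    }

    public static void main(final String[] args) {
        final String s = "caf\u00e9"; // "café"
        System.out.println(Arrays.toString(utf8Bytes(s)));
        System.out.println(Arrays.toString(charBytes(s.toCharArray())));
    }
}
```

        Because the two byte sequences differ in both length and content, any hash computed over them would generally differ as well.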
      • update

        public abstract void update​(byte[] data)
        Update this union with the given byte array item. If the byte array is null or empty no update attempt is made and the method returns.

        Note: this is not a Sketch Union operation. This treats the given byte array as a data item.

        Parameters:
        data - The given byte array.
      • update

        public abstract void update​(int[] data)
        Update this union with the given integer array item. If the integer array is null or empty no update attempt is made and the method returns.

        Note: this is not a Sketch Union operation. This treats the given integer array as a data item.

        Parameters:
        data - The given int array.
      • update

        public abstract void update​(char[] data)
        Update this union with the given char array item. If the char array is null or empty no update attempt is made and the method returns.

        Note: this will not produce the same output hash values as the update(String) method but will be a little faster, as it avoids the complexity of the UTF-8 encoding.

        Note: this is not a Sketch Union operation. This treats the given char array as a data item.

        Parameters:
        data - The given char array.
      • update

        public abstract void update​(long[] data)
        Update this union with the given long array item. If the long array is null or empty no update attempt is made and the method returns.

        Note: this is not a Sketch Union operation. This treats the given long array as a data item.

        Parameters:
        data - The given long array.