Class Union
- java.lang.Object
-
- org.apache.datasketches.theta.SetOperation
-
- org.apache.datasketches.theta.Union
-
public abstract class Union extends SetOperation
Compute the union of two or more theta sketches. A new instance represents an empty set.
- Author:
- Lee Rhodes
-
-
Constructor Summary
Constructor Description
Union()
-
Method Summary
Modifier and Type Method Description
abstract int getCurrentBytes()
Returns the number of storage bytes required for this union in its current state.
Family getFamily()
Gets the Family of this SetOperation.
abstract int getMaxUnionBytes()
Returns the maximum required storage bytes for this union.
abstract CompactSketch getResult()
Gets the result of this operation as an ordered CompactSketch on the Java heap.
abstract CompactSketch getResult(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
Gets the result of this operation as a CompactSketch of the chosen form.
abstract void reset()
Resets this Union.
abstract byte[] toByteArray()
Returns a byte array image of this Union object.
abstract void union(org.apache.datasketches.memory.Memory mem)
Perform a Union operation with this union and the given Memory image of any sketch of the Theta Family.
abstract void union(Sketch sketchIn)
Perform a Union operation with this union and the given on-heap sketch of the Theta Family.
CompactSketch union(Sketch sketchA, Sketch sketchB)
This implements a stateless, pair-wise union operation.
abstract CompactSketch union(Sketch sketchA, Sketch sketchB, boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
This implements a stateless, pair-wise union operation.
abstract void update(byte[] data)
Update this union with the given byte array item.
abstract void update(char[] data)
Update this union with the given char array item.
abstract void update(double datum)
Update this union with the given double (or float) data item.
abstract void update(int[] data)
Update this union with the given integer array item.
abstract void update(long datum)
Update this union with the given long data item.
abstract void update(long[] data)
Update this union with the given long array item.
abstract void update(String datum)
Update this union with the given String data item.
-
Methods inherited from class org.apache.datasketches.theta.SetOperation
builder, getMaxAnotBResultBytes, getMaxIntersectionBytes, getMaxUnionBytes, heapify, heapify, isSameResource, wrap, wrap, wrap, wrap
-
Method Detail
-
getCurrentBytes
public abstract int getCurrentBytes()
Returns the number of storage bytes required for this union in its current state.
- Returns:
- the number of storage bytes required for this union in its current state.
-
getFamily
public Family getFamily()
Description copied from class: SetOperation
Gets the Family of this SetOperation.
- Specified by:
getFamily in class SetOperation
- Returns:
- the Family of this SetOperation
-
getMaxUnionBytes
public abstract int getMaxUnionBytes()
Returns the maximum required storage bytes for this union.
- Returns:
- the maximum required storage bytes for this union.
-
getResult
public abstract CompactSketch getResult()
Gets the result of this operation as an ordered CompactSketch on the Java heap. This does not disturb the underlying data structure of the union. Therefore, it is OK to continue updating the union after this operation.
- Returns:
- the result of this operation as an ordered CompactSketch on the Java heap
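The following is a minimal sketch of taking a result and then continuing to update the same union. It assumes the datasketches-java artifact matching this Javadoc is on the classpath. The class name `UnionResultExample` and the helper `sketchOf` are hypothetical. `buildUnion()`, `UpdateSketch.builder().build()` and `getEstimate()` belong to the same package but are not described on this page. The inputs are kept small so the sketches stay in exact mode.

```java
import org.apache.datasketches.theta.CompactSketch;
import org.apache.datasketches.theta.SetOperation;
import org.apache.datasketches.theta.Union;
import org.apache.datasketches.theta.UpdateSketch;

public class UnionResultExample {

  // Hypothetical helper: builds a small update sketch from long values.
  static UpdateSketch sketchOf(long... values) {
    UpdateSketch sk = UpdateSketch.builder().build();
    for (long v : values) {
      sk.update(v);
    }
    return sk;
  }

  // Returns the estimate taken after two merges, then the estimate
  // taken after one further update of the same union.
  static double[] run() {
    Union union = SetOperation.builder().buildUnion();
    union.union(sketchOf(1, 2, 3));
    union.union(sketchOf(3, 4, 5));

    // Snapshot of the current state; the union itself is left usable.
    CompactSketch first = union.getResult();
    double before = first.getEstimate();

    // Keep updating the union after the result was taken.
    union.update(6L);
    double after = union.getResult().getEstimate();

    return new double[] {before, after};
  }

  public static void main(String[] args) {
    double[] r = run();
    System.out.println("before=" + r[0] + " after=" + r[1]);
  }
}
```

The first result is an independent snapshot, so a caller can publish it while the union keeps accumulating input.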
-
getResult
public abstract CompactSketch getResult(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
Gets the result of this operation as a CompactSketch of the chosen form. This does not disturb the underlying data structure of the union. Therefore, it is OK to continue updating the union after this operation.
- Parameters:
dstOrdered
- See Destination Ordered.
dstMem
- See Destination Memory.
- Returns:
- the result of this operation as a CompactSketch of the chosen form
-
reset
public abstract void reset()
Resets this Union. The seed remains intact; everything else reverts to its initial state.
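As an illustration of reset, the sketch below merges a few items and then resets the union. It assumes the datasketches-java artifact matching this Javadoc is on the classpath. `UnionResetExample` is a hypothetical class name, and `buildUnion()` and `isEmpty()` come from the same package rather than from this page.

```java
import org.apache.datasketches.theta.SetOperation;
import org.apache.datasketches.theta.Union;

public class UnionResetExample {

  // Returns {result empty before reset, result empty after reset}.
  static boolean[] run() {
    Union union = SetOperation.builder().buildUnion();
    union.update(1L);
    union.update(2L);
    boolean emptyBefore = union.getResult().isEmpty();

    // Discard the accumulated state so the instance can be reused.
    union.reset();
    boolean emptyAfter = union.getResult().isEmpty();

    return new boolean[] {emptyBefore, emptyAfter};
  }

  public static void main(String[] args) {
    boolean[] r = run();
    System.out.println("emptyBefore=" + r[0] + " emptyAfter=" + r[1]);
  }
}
```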
-
toByteArray
public abstract byte[] toByteArray()
Returns a byte array image of this Union object.
- Returns:
- a byte array image of this Union object
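One possible use of the byte array image is to persist a union and restore it later. The sketch below is an assumption-laden illustration: it relies on the `heapify` method listed among the methods inherited from SetOperation, on `Memory.wrap(byte[])` from datasketches-memory, on the default seed being used on both sides, and on the restored SetOperation being castable to Union. `UnionRoundTripExample` is a hypothetical class name.

```java
import org.apache.datasketches.memory.Memory;
import org.apache.datasketches.theta.SetOperation;
import org.apache.datasketches.theta.Union;

public class UnionRoundTripExample {

  // Serializes a union, restores it on the heap, and keeps updating it.
  static double run() {
    Union union = SetOperation.builder().buildUnion();
    union.update(1L);
    union.update(2L);
    union.update(3L);

    // Image of the union in its current state.
    byte[] image = union.toByteArray();

    // Assumption: heapify recognizes the image as a Union.
    Union restored = (Union) SetOperation.heapify(Memory.wrap(image));
    restored.update(4L);

    return restored.getResult().getEstimate();
  }

  public static void main(String[] args) {
    System.out.println("estimate=" + run());
  }
}
```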
-
union
public CompactSketch union(Sketch sketchA, Sketch sketchB)
This implements a stateless, pair-wise union operation. The returned sketch will be cut back to the smaller of the two k values if required.
Nulls and empty sketches are ignored.
- Parameters:
sketchA
- The first argument
sketchB
- The second argument
- Returns:
- the result ordered CompactSketch on the heap.
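A minimal sketch of the pair-wise form, assuming the datasketches-java artifact matching this Javadoc is on the classpath. `PairwiseUnionExample` and `sketchOf` are hypothetical names, and the two inputs overlap in one value so the merged count is smaller than the sum of the parts.

```java
import org.apache.datasketches.theta.CompactSketch;
import org.apache.datasketches.theta.SetOperation;
import org.apache.datasketches.theta.Union;
import org.apache.datasketches.theta.UpdateSketch;

public class PairwiseUnionExample {

  // Hypothetical helper: builds a small update sketch from long values.
  static UpdateSketch sketchOf(long... values) {
    UpdateSketch sk = UpdateSketch.builder().build();
    for (long v : values) {
      sk.update(v);
    }
    return sk;
  }

  // Merges exactly two sketches in one call and returns the estimate.
  static double run() {
    Union union = SetOperation.builder().buildUnion();
    CompactSketch result = union.union(sketchOf(1, 2, 3), sketchOf(3, 4, 5));
    return result.getEstimate();
  }

  public static void main(String[] args) {
    System.out.println("estimate=" + run());
  }
}
```

This form suits one-off merges; for merging many sketches, the single-argument `union(Sketch)` called repeatedly avoids building intermediate results.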
-
union
public abstract CompactSketch union(Sketch sketchA, Sketch sketchB, boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
This implements a stateless, pair-wise union operation. The returned sketch will be cut back to k if required, similar to the regular Union operation.
Nulls and empty sketches are ignored.
- Parameters:
sketchA
- The first argument
sketchB
- The second argument
dstOrdered
- If true, the returned CompactSketch will be ordered.
dstMem
- If not null, the returned CompactSketch will be placed in this WritableMemory.
- Returns:
- the result CompactSketch.
-
union
public abstract void union(Sketch sketchIn)
Perform a Union operation with this union and the given on-heap sketch of the Theta Family. This method is not valid for the older SetSketch, which predates the open-source release (August 2015).
This method can be called repeatedly.
Nulls and empty sketches are ignored.
- Parameters:
sketchIn
- The incoming sketch.
-
union
public abstract void union(org.apache.datasketches.memory.Memory mem)
Perform a Union operation with this union and the given Memory image of any sketch of the Theta Family. The input image may be from earlier versions of the Theta Compact Sketch, called the SetSketch (circa 2014), which predate the open-source release and are compact and ordered.
This method can be called repeatedly.
Nulls and empty sketches are ignored.
- Parameters:
mem
- Memory image of sketch to be merged
-
update
public abstract void update(long datum)
Update this union with the given long data item.
- Parameters:
datum
- The given long datum.
-
update
public abstract void update(double datum)
Update this union with the given double (or float) data item. The double will be converted to a long using Double.doubleToLongBits(datum), which normalizes all NaN values to a single NaN representation. Plus and minus zero will be normalized to plus zero. Each of the special floating-point values NaN and +/- Infinity is treated as distinct.
- Parameters:
datum
- The given double datum.
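The normalization described above can be reproduced with the standard library alone. The sketch below is an illustration, not the library's code: `canonicalBits` is a hypothetical helper. Note that `Double.doubleToLongBits` by itself keeps the sign bit of minus zero, so mapping both zeros to plus zero needs a separate step.

```java
public class DoubleNormalization {

  // Hypothetical helper mirroring the normalization in the description.
  static long canonicalBits(double datum) {
    // -0.0 == 0.0 evaluates to true, so both zeros become +0.0 here.
    double d = (datum == 0.0) ? 0.0 : datum;
    // Collapses every NaN bit pattern to the single canonical NaN.
    return Double.doubleToLongBits(d);
  }

  public static void main(String[] args) {
    // A NaN with a non-canonical payload, built from raw bits.
    double oddNaN = Double.longBitsToDouble(0x7ff8000000000123L);

    System.out.println(canonicalBits(0.0) == canonicalBits(-0.0));
    System.out.println(canonicalBits(Double.NaN) == canonicalBits(oddNaN));
    System.out.println(canonicalBits(Double.POSITIVE_INFINITY)
        == canonicalBits(Double.NEGATIVE_INFINITY));
    // A float argument is widened to double before conversion.
    System.out.println(canonicalBits(1.0f) == canonicalBits(1.0));
  }
}
```

Under this normalization, 0.0 and -0.0 count as one distinct item, every NaN counts as one distinct item, and the two infinities count as two.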
-
update
public abstract void update(String datum)
Update this union with the given String data item. The string is converted to a byte array using UTF8 encoding. If the string is null or empty no update attempt is made and the method returns.
Note: this will not produce the same output hash values as the update(char[]) method and will generally be a little slower depending on the complexity of the UTF8 encoding.
Note: this is not a Sketch Union operation. This treats the given string as a data item.
- Parameters:
datum
- The given String.
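To see why the String and char[] forms can hash differently, it helps to compare the bytes each one presents. The sketch below uses only the standard library. `StringVsCharArray` and its helpers are hypothetical, and the use of UTF-16LE to stand in for the raw 16-bit code units of a char[] is an assumption made for illustration, not a statement about the library's internals.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringVsCharArray {

  // Bytes seen when a String is encoded as UTF-8, as the description states.
  static byte[] utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Assumption: a char[] item contributes two bytes per 16-bit code unit.
  static byte[] codeUnitBytes(String s) {
    return s.getBytes(StandardCharsets.UTF_16LE);
  }

  public static void main(String[] args) {
    for (String s : new String[] {"abc", "\u00e9"}) {
      byte[] a = utf8Bytes(s);
      byte[] b = codeUnitBytes(s);
      System.out.println(a.length + " UTF-8 bytes vs " + b.length
          + " code-unit bytes, equal=" + Arrays.equals(a, b));
    }
  }
}
```

Because the byte sequences differ even for plain ASCII text, the same text should be fed through one form consistently; mixing the two would count it as two distinct items.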
-
update
public abstract void update(byte[] data)
Update this union with the given byte array item. If the byte array is null or empty no update attempt is made and the method returns.
Note: this is not a Sketch Union operation. This treats the given byte array as a data item.
- Parameters:
data
- The given byte array.
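The difference between a data item and a sketch image can be shown by passing the same bytes both ways. The sketch below assumes the datasketches-java and datasketches-memory artifacts matching this Javadoc are on the classpath. `ItemVersusImageExample` is a hypothetical class name, and `compact()` and `Memory.wrap(byte[])` are library calls not described on this page.

```java
import org.apache.datasketches.memory.Memory;
import org.apache.datasketches.theta.SetOperation;
import org.apache.datasketches.theta.Union;
import org.apache.datasketches.theta.UpdateSketch;

public class ItemVersusImageExample {

  // Returns {estimate when bytes are an item, estimate when merged as a sketch}.
  static double[] run() {
    UpdateSketch sk = UpdateSketch.builder().build();
    sk.update(10L);
    sk.update(20L);
    sk.update(30L);

    // Serialized image of a sketch that holds three distinct values.
    byte[] image = sk.compact().toByteArray();

    // update(byte[]) hashes the whole array as a single data item.
    Union asItem = SetOperation.builder().buildUnion();
    asItem.update(image);

    // union(Memory) interprets the same bytes as a sketch and merges it.
    Union asSketch = SetOperation.builder().buildUnion();
    asSketch.union(Memory.wrap(image));

    return new double[] {
        asItem.getResult().getEstimate(),
        asSketch.getResult().getEstimate()
    };
  }

  public static void main(String[] args) {
    double[] r = run();
    System.out.println("asItem=" + r[0] + " asSketch=" + r[1]);
  }
}
```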
-
update
public abstract void update(int[] data)
Update this union with the given integer array item. If the integer array is null or empty no update attempt is made and the method returns.
Note: this is not a Sketch Union operation. This treats the given integer array as a data item.
- Parameters:
data
- The given int array.
-
update
public abstract void update(char[] data)
Update this union with the given char array item. If the char array is null or empty no update attempt is made and the method returns.
Note: this will not produce the same output hash values as the update(String) method but will be a little faster as it avoids the complexity of the UTF8 encoding.
Note: this is not a Sketch Union operation. This treats the given char array as a data item.
- Parameters:
data
- The given char array.
-
update
public abstract void update(long[] data)
Update this union with the given long array item. If the long array is null or empty no update attempt is made and the method returns.
Note: this is not a Sketch Union operation. This treats the given long array as a data item.
- Parameters:
data
- The given long array.
-
-