Class UpdateSketch
- java.lang.Object
  - org.apache.datasketches.theta.Sketch
    - org.apache.datasketches.theta.UpdateSketch
public abstract class UpdateSketch extends Sketch
The parent class for the Update Sketch families, such as QuickSelect and Alpha. The primary task of an Update Sketch is to consider datums presented via the update() methods for inclusion in its internal cache. This is the sketch building process.
- Author:
- Lee Rhodes
-
-
Method Summary
Modifier and Type | Method | Description
static UpdateSketchBuilder | builder() | Returns a new builder.
CompactSketch | compact(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem) | Convert this sketch to a CompactSketch.
int | getCompactBytes() | Returns the number of storage bytes required for this Sketch if its current state were compacted.
abstract int | getLgNomLongs() | Gets the log base 2 of the configured nominal entries.
abstract ResizeFactor | getResizeFactor() | Returns the configured ResizeFactor.
static UpdateSketch | heapify(org.apache.datasketches.memory.Memory srcMem) | Instantiates an on-heap UpdateSketch from Memory.
static UpdateSketch | heapify(org.apache.datasketches.memory.Memory srcMem, long expectedSeed) | Instantiates an on-heap UpdateSketch from Memory.
boolean | isCompact() | Returns true if this sketch is in compact form.
boolean | isOrdered() | Returns true if internal cache is ordered.
abstract UpdateSketch | rebuild() | Rebuilds the hash table to remove dirty values or to reduce the size to nominal entries.
abstract void | reset() | Resets this sketch back to a virgin empty state.
UpdateReturnState | update(byte[] data) | Present this sketch with the given byte array.
UpdateReturnState | update(char[] data) | Present this sketch with the given char array.
UpdateReturnState | update(double datum) | Present this sketch with the given double (or float) datum.
UpdateReturnState | update(int[] data) | Present this sketch with the given integer array.
UpdateReturnState | update(long datum) | Present this sketch with a long.
UpdateReturnState | update(long[] data) | Present this sketch with the given long array.
UpdateReturnState | update(String datum) | Present this sketch with the given String.
static UpdateSketch | wrap(org.apache.datasketches.memory.WritableMemory srcMem) | Wrap takes the sketch image in Memory and refers to it directly.
static UpdateSketch | wrap(org.apache.datasketches.memory.WritableMemory srcMem, long expectedSeed) | Wrap takes the sketch image in Memory and refers to it directly.
Methods inherited from class org.apache.datasketches.theta.Sketch
compact, getCountLessThanThetaLong, getCurrentBytes, getEstimate, getFamily, getLowerBound, getMaxCompactSketchBytes, getMaxUpdateSketchBytes, getRetainedEntries, getRetainedEntries, getSerializationVersion, getTheta, getThetaLong, getUpperBound, hasMemory, isDirect, isEmpty, isEstimationMode, isSameResource, iterator, toByteArray, toString, toString, toString, toString, wrap, wrap
-
-
-
-
Method Detail
-
wrap
public static UpdateSketch wrap(org.apache.datasketches.memory.WritableMemory srcMem)
Wrap takes the sketch image in Memory and refers to it directly. There is no data copying onto the Java heap. Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have been explicitly stored as direct objects can be wrapped. This method assumes the ThetaUtil.DEFAULT_UPDATE_SEED. See Default Update Seed.
- Parameters:
srcMem
- an image of a Sketch where the image seed hash matches the default seed hash. It must have a size of at least 24 bytes. See Memory.
- Returns:
- a Sketch backed by the given Memory
-
wrap
public static UpdateSketch wrap(org.apache.datasketches.memory.WritableMemory srcMem, long expectedSeed)
Wrap takes the sketch image in Memory and refers to it directly. There is no data copying onto the Java heap. Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have been explicitly stored as direct objects can be wrapped. An attempt to "wrap" earlier version sketches will result in a "heapified", normal Java heap version of the sketch where all data will be copied to the heap.
- Parameters:
srcMem
- an image of a Sketch where the image seed hash matches the given seed hash. It must have a size of at least 24 bytes. See Memory.
expectedSeed
- the seed used to validate the given Memory image. See Update Hash Seed. Compact sketches store a 16-bit hash of the seed, but not the seed itself.
- Returns:
- an UpdateSketch backed by the given Memory
-
heapify
public static UpdateSketch heapify(org.apache.datasketches.memory.Memory srcMem)
Instantiates an on-heap UpdateSketch from Memory. This method assumes the ThetaUtil.DEFAULT_UPDATE_SEED.
- Parameters:
srcMem
- See Memory. It must have a size of at least 24 bytes.
- Returns:
- an UpdateSketch
-
heapify
public static UpdateSketch heapify(org.apache.datasketches.memory.Memory srcMem, long expectedSeed)
Instantiates an on-heap UpdateSketch from Memory.
- Parameters:
srcMem
- See Memory. It must have a size of at least 24 bytes.
expectedSeed
- the seed used to validate the given Memory image. See Update Hash Seed.
- Returns:
- an UpdateSketch
-
compact
public CompactSketch compact(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
Description copied from class: Sketch
Convert this sketch to a CompactSketch.
If this sketch is a type of UpdateSketch, the compacting process converts the hash table of the UpdateSketch to a simple list of the valid hash values. Any hash values of zero, or equal to or greater than theta, will be discarded. The number of valid values remaining in the CompactSketch depends on a number of factors, but may be larger or smaller than Nominal Entries (or k). It will never exceed 2k. If it is critical to always limit the size to no more than k, then rebuild() should be called on the UpdateSketch prior to calling this method.
A CompactSketch is always immutable.
A new CompactSketch object is created:
- if dstMem != null
- if dstMem == null and this.hasMemory() == true
- if dstMem == null and this has more than 1 item and this.isOrdered() == false and dstOrdered == true.
Otherwise, this operation returns this.
- Specified by:
compact in class Sketch
- Parameters:
dstOrdered
- assumed true if this sketch is empty or has only one value. See Destination Ordered.
dstMem
- See Destination Memory.
- Returns:
- this sketch as a CompactSketch.
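The filtering rule described above (drop hash values that are zero or at least theta, and optionally order the rest) can be pictured with plain arrays. The following is a minimal sketch under stated assumptions, not the library's implementation: the class `CompactionSketchExample` and the method `compactHashes` are hypothetical names, the hash table is modeled as a `long[]` in which zero marks an empty slot, and theta is modeled as a `long` threshold.

```java
import java.util.Arrays;

/** Illustrative only: a toy model of compaction, not the library's code. */
final class CompactionSketchExample {

  /**
   * Copies the valid hashes out of a hash-table array. Entries that are
   * zero (modeled here as empty slots) or greater than or equal to
   * thetaLong are dropped. The result is sorted only when ordered is true.
   */
  static long[] compactHashes(long[] table, long thetaLong, boolean ordered) {
    long[] out = Arrays.stream(table)
        .filter(h -> h > 0 && h < thetaLong)
        .toArray();
    if (ordered) {
      Arrays.sort(out);
    }
    return out;
  }

  public static void main(String[] args) {
    long[] table = {0L, 42L, 0L, 7L, 900L, 15L, 0L, 500L};
    long[] compact = compactHashes(table, 500L, true);
    System.out.println(Arrays.toString(compact)); // prints [7, 15, 42]
  }
}
```

Note that 500 is discarded in the example because the rule excludes values equal to theta as well as values above it.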
-
getCompactBytes
public int getCompactBytes()
Description copied from class: Sketch
Returns the number of storage bytes required for this Sketch if its current state were compacted. If this sketch is already in the compact form this is equivalent to calling Sketch.getCurrentBytes().
- Specified by:
getCompactBytes in class Sketch
- Returns:
- number of compact bytes
-
isCompact
public boolean isCompact()
Description copied from class: Sketch
Returns true if this sketch is in compact form.
-
isOrdered
public boolean isOrdered()
Description copied from class: Sketch
Returns true if internal cache is ordered.
-
builder
public static final UpdateSketchBuilder builder()
Returns a new builder.
- Returns:
- a new builder
-
getResizeFactor
public abstract ResizeFactor getResizeFactor()
Returns the configured ResizeFactor.
- Returns:
- the configured ResizeFactor
-
reset
public abstract void reset()
Resets this sketch back to a virgin empty state.
-
rebuild
public abstract UpdateSketch rebuild()
Rebuilds the hash table to remove dirty values or to reduce the size to nominal entries.
- Returns:
- this sketch
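To make "reduce the size to nominal entries" concrete, here is a simplified model of a rebuild. It assumes a QuickSelect-style policy in which theta is lowered to the (k+1)-th smallest valid hash so that exactly k hashes remain below it; that policy is an assumption of this sketch rather than something stated on this page. A full sort stands in for a selection algorithm, the hashes are assumed to be distinct, and the names `RebuildSketchExample`, `Rebuilt`, and `rebuild` are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: a toy model of rebuilding down to k entries. */
final class RebuildSketchExample {

  /** Result of the simplified rebuild: retained hashes plus the new theta. */
  static final class Rebuilt {
    final long[] hashes;
    final long thetaLong;

    Rebuilt(long[] hashes, long thetaLong) {
      this.hashes = hashes;
      this.thetaLong = thetaLong;
    }
  }

  /**
   * Drops dirty values (zero or >= thetaLong). If more than k valid hashes
   * remain, lowers theta to the (k+1)-th smallest and keeps the k below it.
   * Assumes the valid hashes are distinct.
   */
  static Rebuilt rebuild(long[] table, long thetaLong, int k) {
    long[] valid = Arrays.stream(table)
        .filter(h -> h > 0 && h < thetaLong)
        .sorted()
        .toArray();
    if (valid.length <= k) {
      return new Rebuilt(valid, thetaLong); // nothing to trim
    }
    long newTheta = valid[k]; // (k+1)-th smallest valid hash
    return new Rebuilt(Arrays.copyOf(valid, k), newTheta);
  }

  public static void main(String[] args) {
    long[] table = {0L, 50L, 10L, 0L, 30L, 70L, 20L, 60L};
    Rebuilt r = rebuild(table, Long.MAX_VALUE, 4);
    System.out.println(Arrays.toString(r.hashes) + " theta=" + r.thetaLong);
    // prints [10, 20, 30, 50] theta=60
  }
}
```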
-
update
public UpdateReturnState update(long datum)
Present this sketch with a long.
- Parameters:
datum
- The given long datum.
- Returns:
- See Update Return State
-
update
public UpdateReturnState update(double datum)
Present this sketch with the given double (or float) datum. The double will be converted to a long using Double.doubleToLongBits(datum), which normalizes all NaN values to a single NaN representation. Plus and minus zero will be normalized to plus zero. The special floating-point values NaN and +/- Infinity are treated as distinct.
- Parameters:
datum
- The given double datum.
- Returns:
- See Update Return State
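The normalization rules for double datums can be checked with the standard library alone. The helper below is a hypothetical illustration of the documented behavior (`canonicalBits` and `DoubleDatumExample` are not library names): it maps minus zero to plus zero and then relies on `Double.doubleToLongBits`, which collapses every NaN bit pattern to one canonical value.

```java
/** Illustrative only: mirrors the documented double-to-long conversion. */
final class DoubleDatumExample {

  /** Hypothetical helper: normalizes -0.0 to +0.0, then canonicalizes NaN. */
  static long canonicalBits(double datum) {
    // -0.0 == 0.0 evaluates to true, so both zeros become +0.0 here.
    double d = (datum == 0.0) ? 0.0 : datum;
    return Double.doubleToLongBits(d);
  }

  public static void main(String[] args) {
    // A NaN with a non-default payload, built from raw bits.
    double oddNaN = Double.longBitsToDouble(0x7ff8000000000123L);

    System.out.println(canonicalBits(-0.0) == canonicalBits(0.0));          // true
    System.out.println(canonicalBits(oddNaN) == canonicalBits(Double.NaN)); // true
    System.out.println(canonicalBits(Double.POSITIVE_INFINITY)
        == canonicalBits(Double.NEGATIVE_INFINITY));                        // false
  }
}
```

A practical consequence is that 0.0 and -0.0 count as the same item, while NaN, positive infinity, and negative infinity each count as separate items.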
-
update
public UpdateReturnState update(String datum)
Present this sketch with the given String. The string is converted to a byte array using UTF-8 encoding. If the string is null or empty, no update attempt is made and the method returns.
Note: this will not produce the same output hash values as the update(char[]) method and will generally be a little slower, depending on the complexity of the UTF-8 encoding.
- Parameters:
datum
- The given String.
- Returns:
- See Update Return State
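The reason update(String) and update(char[]) disagree is that the bytes being hashed differ: UTF-8 uses one byte for each ASCII character, while a char is always a 16-bit unit. The sketch below only compares the two byte views and does not call the library. The byte order used for the char[] view is an arbitrary choice made for illustration, and `StringVsCharArrayExample`, `utf8Bytes`, and `rawCharBytes` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: shows why the two inputs hash differently. */
final class StringVsCharArrayExample {

  /** The byte view of a String after UTF-8 encoding. */
  static byte[] utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  /** A raw two-bytes-per-char view (little-endian chosen for illustration). */
  static byte[] rawCharBytes(char[] chars) {
    ByteBuffer buf = ByteBuffer.allocate(chars.length * 2)
        .order(ByteOrder.LITTLE_ENDIAN);
    for (char c : chars) {
      buf.putChar(c);
    }
    return buf.array();
  }

  public static void main(String[] args) {
    String s = "abc";
    System.out.println(Arrays.toString(utf8Bytes(s)));
    // prints [97, 98, 99]
    System.out.println(Arrays.toString(rawCharBytes(s.toCharArray())));
    // prints [97, 0, 98, 0, 99, 0]
  }
}
```

Because the hashed input differs, the same text fed through both methods would be counted as two distinct items, so a single stream should use one method consistently.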
-
update
public UpdateReturnState update(byte[] data)
Present this sketch with the given byte array. If the byte array is null or empty, no update attempt is made and the method returns.
- Parameters:
data
- The given byte array.
- Returns:
- See Update Return State
-
update
public UpdateReturnState update(char[] data)
Present this sketch with the given char array. If the char array is null or empty, no update attempt is made and the method returns.
Note: this will not produce the same output hash values as the update(String) method but will be a little faster, as it avoids the complexity of the UTF-8 encoding.
- Parameters:
data
- The given char array.
- Returns:
- See Update Return State
-
update
public UpdateReturnState update(int[] data)
Present this sketch with the given integer array. If the integer array is null or empty, no update attempt is made and the method returns.
- Parameters:
data
- The given int array.
- Returns:
- See Update Return State
-
update
public UpdateReturnState update(long[] data)
Present this sketch with the given long array. If the long array is null or empty, no update attempt is made and the method returns.
- Parameters:
data
- The given long array.
- Returns:
- See Update Return State
-
getLgNomLongs
public abstract int getLgNomLongs()
Gets the log base 2 of the configured nominal entries.
- Returns:
- the Log base 2 of the configured nominal entries
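As a small worked example of the relationship between nominal entries and the value this method reports, the helper below computes the base-2 logarithm of a power-of-two entry count. It assumes the configured nominal entries value is an exact power of two. `LgNomLongsExample` and `lgNomLongs` here are hypothetical standalone names, not the library method itself.

```java
/** Illustrative only: log base 2 of a power-of-two nominal entries value. */
final class LgNomLongsExample {

  /** Returns lg such that (1 << lg) == nominalEntries. */
  static int lgNomLongs(int nominalEntries) {
    if (nominalEntries <= 0 || Integer.bitCount(nominalEntries) != 1) {
      throw new IllegalArgumentException(
          "nominalEntries must be a positive power of two: " + nominalEntries);
    }
    // For a power of two, the count of trailing zero bits is the exponent.
    return Integer.numberOfTrailingZeros(nominalEntries);
  }

  public static void main(String[] args) {
    System.out.println(lgNomLongs(4096)); // prints 12
    System.out.println(1 << lgNomLongs(4096)); // prints 4096
  }
}
```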
-
-