Class UpdatableThetaSketch
- All Implemented Interfaces:
MemorySegmentStatus
- Author:
- Lee Rhodes
-
Method Summary
Modifier and Type, Method: Description
- static final UpdatableThetaSketchBuilder builder(): Returns a new builder.
- compact(boolean dstOrdered, MemorySegment dstWSeg): Convert this sketch to a CompactThetaSketch.
- int getCompactBytes(): Returns the number of storage bytes required for this ThetaSketch if its current state were compacted.
- abstract int getLgNomLongs(): Gets the log base 2 of the configured nominal entries.
- abstract ResizeFactor getResizeFactor(): Returns the configured ResizeFactor.
- long getSeed(): Gets the configured seed.
- boolean hasMemorySegment(): Returns true if this object's internal data is backed by a MemorySegment, which may be on-heap or off-heap.
- static UpdatableThetaSketch heapify(MemorySegment srcSeg): Instantiates an on-heap UpdatableThetaSketch from a MemorySegment.
- static UpdatableThetaSketch heapify(MemorySegment srcSeg, long expectedSeed): Instantiates an on-heap UpdatableThetaSketch from a MemorySegment.
- boolean isCompact(): Returns true if this sketch is in compact form.
- boolean isOffHeap(): Returns true if this object's internal data is backed by an off-heap (direct or native) MemorySegment.
- boolean isOrdered(): Returns true if internal cache is ordered.
- boolean isSameResource(MemorySegment that): Returns true if an internally referenced MemorySegment has the same backing resource as that, or equivalently, if their two memory regions overlap.
- abstract UpdatableThetaSketch rebuild(): Rebuilds the hash table to remove dirty values or to reduce the size to nominal entries.
- abstract void reset(): Resets this sketch back to a virgin empty state.
- update(byte[] data): Present this sketch with the given byte array.
- update(char[] data): Present this sketch with the given char array.
- update(double datum): Present this sketch with the given double (or float) datum.
- update(int[] data): Present this sketch with the given integer array.
- update(long datum): Present this sketch with a long.
- update(long[] data): Present this sketch with the given long array.
- update(String datum): Present this sketch with the given String.
- update(ByteBuffer buffer): Present this sketch with the given ByteBuffer. If the ByteBuffer is null or empty, no update attempt is made and the method returns.
- static UpdatableThetaSketch wrap(MemorySegment srcWSeg): Wrap takes the writable sketch image in MemorySegment and refers to it directly.
- static UpdatableThetaSketch wrap(MemorySegment srcWSeg, MemorySegmentRequest mSegReq, long expectedSeed): Wrap takes the sketch image in MemorySegment and refers to it directly.
Methods inherited from class ThetaSketch
compact, getCompactSketchMaxBytes, getCountLessThanThetaLong, getCurrentBytes, getEstimate, getEstimate, getFamily, getLowerBound, getLowerBound, getMaxCompactSketchBytes, getMaxUpdateSketchBytes, getRetainedEntries, getRetainedEntries, getRetainedEntries, getSerializationVersion, getTheta, getThetaLong, getUpdateSketchMaxBytes, getUpperBound, getUpperBound, isEmpty, isEstimationMode, iterator, toByteArray, toString, toString, toString, toString, wrap
-
Method Details
-
wrap
Wrap takes the writable sketch image in MemorySegment and refers to it directly. There is no data copying onto the Java heap. Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have been explicitly stored as writable, direct objects can be wrapped. This method assumes the Util.DEFAULT_UPDATE_SEED. See Default Update Seed.
- Parameters:
srcWSeg - an image of a writable sketch where the image seed hash matches the default seed hash. It must have a size of at least 24 bytes.
- Returns:
- an UpdatableThetaSketch backed by the given MemorySegment
- Throws:
SketchesArgumentException - if the provided MemorySegment is invalid, corrupted, or incompatible with this sketch type. Callers must treat this as a fatal error for that segment.
-
wrap
public static UpdatableThetaSketch wrap(MemorySegment srcWSeg, MemorySegmentRequest mSegReq, long expectedSeed)
Wrap takes the sketch image in MemorySegment and refers to it directly. There is no data copying onto the Java heap. Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have been explicitly stored as writable, direct objects can be wrapped. An attempt to "wrap" earlier version sketches will result in a "heapified", normal Java heap version of the sketch where all data will be copied to the heap.
- Parameters:
srcWSeg - an image of a writable sketch where the image seed hash matches the given seed hash. It must have a size of at least 24 bytes.
mSegReq - an implementation of the MemorySegmentRequest interface or null.
expectedSeed - the seed used to validate the given MemorySegment image. See Update Hash Seed. Compact sketches store a 16-bit hash of the seed, but not the seed itself.
- Returns:
- an UpdatableThetaSketch backed by the given MemorySegment
- Throws:
SketchesArgumentException - if the provided MemorySegment is invalid, corrupted, or incompatible with this sketch type. Callers must treat this as a fatal error for that segment.
-
heapify
Instantiates an on-heap UpdatableThetaSketch from a MemorySegment. This method assumes the Util.DEFAULT_UPDATE_SEED.
- Parameters:
srcSeg - the given MemorySegment with a sketch image. It must have a size of at least 24 bytes.
- Returns:
- an UpdatableThetaSketch
- Throws:
SketchesArgumentException - if the provided MemorySegment is invalid, corrupted, or incompatible with this sketch type. Callers must treat this as a fatal error for that segment.
-
heapify
Instantiates an on-heap UpdatableThetaSketch from a MemorySegment.
- Parameters:
srcSeg - the given MemorySegment. It must have a size of at least 24 bytes.
expectedSeed - the seed used to validate the given MemorySegment image. See Update Hash Seed.
- Returns:
- an UpdatableThetaSketch
- Throws:
SketchesArgumentException - if the provided MemorySegment is invalid, corrupted, or incompatible with this sketch type. Callers must treat this as a fatal error for that segment.
-
compact
Description copied from class: ThetaSketch
Convert this sketch to a CompactThetaSketch.
If this sketch is a type of UpdatableThetaSketch, the compacting process converts the hash table of the UpdatableThetaSketch to a simple list of the valid hash values. Any hash value that is zero, or that is equal to or greater than theta, will be discarded. The number of valid values remaining in the CompactThetaSketch depends on a number of factors, but may be larger or smaller than Nominal Entries (or k). It will never exceed 2k. If it is critical to always limit the size to no more than k, then rebuild() should be called on the UpdatableThetaSketch prior to calling this method.
A CompactThetaSketch is always immutable.
A new CompactThetaSketch object is created:
- if dstWSeg != null
- if dstWSeg == null and this.hasMemorySegment() == true
- if dstWSeg == null and this has more than 1 item and this.isOrdered() == false and dstOrdered == true.
Otherwise, this operation returns this.
- Specified by:
compact in class ThetaSketch
- Parameters:
dstOrdered - assumed true if this sketch is empty or has only one value. See Destination Ordered.
dstWSeg - See Destination MemorySegment.
- Returns:
- this sketch as a CompactThetaSketch.
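The discard rule described above can be illustrated with a minimal sketch that uses only the JDK. It is not the library's implementation: the class name CompactionSketch, the helper compactHashes, and the plain long[] standing in for the hash table are all hypothetical, and the code assumes hash values are non-negative longs with zero marking an empty slot.

```java
import java.util.Arrays;

final class CompactionSketch {
    // Hypothetical helper, not part of the library API.
    // Keeps only hashes that are non-zero and strictly less than thetaLong,
    // then sorts them when an ordered result is requested.
    static long[] compactHashes(long[] table, long thetaLong, boolean ordered) {
        long[] kept = Arrays.stream(table)
                .filter(h -> h != 0L && h < thetaLong)
                .toArray();
        if (ordered) {
            Arrays.sort(kept);
        }
        return kept;
    }

    public static void main(String[] args) {
        long[] table = {0L, 5L, 9L, 3L, 0L, 12L}; // zeros model empty slots
        // With theta = 9: zeros, 9 (equal to theta) and 12 (above theta) are dropped.
        System.out.println(Arrays.toString(compactHashes(table, 9L, true))); // prints [3, 5]
    }
}
```

The number of retained values depends on the data and on theta, which is why the count can land above or below k before rebuild() is called.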
-
getCompactBytes
public int getCompactBytes()
Description copied from class: ThetaSketch
Returns the number of storage bytes required for this ThetaSketch if its current state were compacted. If this sketch is already in compact form, this is equivalent to calling ThetaSketch.getCurrentBytes().
- Specified by:
getCompactBytes in class ThetaSketch
- Returns:
- number of compact bytes
-
hasMemorySegment
public boolean hasMemorySegment()
Description copied from interface: MemorySegmentStatus
Returns true if this object's internal data is backed by a MemorySegment, which may be on-heap or off-heap.
- Returns:
- true if this object's internal data is backed by a MemorySegment.
-
isCompact
public boolean isCompact()
Description copied from class: ThetaSketch
Returns true if this sketch is in compact form.
- Specified by:
isCompact in class ThetaSketch
- Returns:
- true if this sketch is in compact form.
-
isOffHeap
public boolean isOffHeap()
Description copied from interface: MemorySegmentStatus
Returns true if this object's internal data is backed by an off-heap (direct or native) MemorySegment.
- Returns:
- true if this object's internal data is backed by an off-heap (direct or native) MemorySegment.
-
isOrdered
public boolean isOrdered()
Description copied from class: ThetaSketch
Returns true if internal cache is ordered.
- Specified by:
isOrdered in class ThetaSketch
- Returns:
- true if internal cache is ordered
-
isSameResource
Description copied from interface: MemorySegmentStatus
Returns true if an internally referenced MemorySegment has the same backing resource as that, or equivalently, if their two memory regions overlap. This applies to both on-heap and off-heap MemorySegments.
Note: If both segments are on-heap and not read-only, it can be determined if they were derived from the same backing memory (array). However, this is not always possible off-heap. Because of this asymmetry, this definition of "isSameResource" is confined to the existence of an overlap.
- Parameters:
that - The given MemorySegment.
- Returns:
- true if an internally referenced MemorySegment has the same backing resource as that.
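Because this definition reduces to the existence of an overlap, the underlying test can be sketched as a check on two half-open address ranges. This is an illustration only, not how the library inspects a MemorySegment: the class RegionOverlap, the method overlaps, and the start and length parameters are hypothetical, and the sketch assumes the sums do not overflow a long.

```java
final class RegionOverlap {
    // Hypothetical helper, not part of the library API.
    // Treats each region as the half-open range [start, start + length).
    // Two non-empty ranges overlap when each one starts before the other ends.
    static boolean overlaps(long startA, long lengthA, long startB, long lengthB) {
        if (lengthA <= 0 || lengthB <= 0) {
            return false; // an empty region cannot overlap anything
        }
        return startA < startB + lengthB && startB < startA + lengthA;
    }

    public static void main(String[] args) {
        System.out.println(overlaps(0, 16, 8, 16)); // prints true: bytes 8..15 are shared
        System.out.println(overlaps(0, 8, 8, 8));   // prints false: the ranges only touch
    }
}
```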
-
builder
Returns a new builder.
- Returns:
- a new builder
-
getResizeFactor
Returns the configured ResizeFactor.
- Returns:
- the configured ResizeFactor
-
getSeed
public long getSeed()
Gets the configured seed.
- Returns:
- the configured seed
-
reset
public abstract void reset()
Resets this sketch back to a virgin empty state.
-
rebuild
Rebuilds the hash table to remove dirty values or to reduce the size to nominal entries.
- Returns:
- this sketch
-
update
Present this sketch with a long.
- Parameters:
datum - The given long datum.
- Returns:
- See Update Return State
-
update
Present this sketch with the given double (or float) datum. The double will be converted to a long using Double.doubleToLongBits(datum), which normalizes all NaN values to a single NaN representation. Plus and minus zero will be normalized to plus zero. The special floating-point values NaN and +/- Infinity are treated as distinct.
- Parameters:
datum - The given double datum.
- Returns:
- See Update Return State
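The normalization rules above can be checked with a short JDK-only sketch. The class DoubleNormalization and the helper canonicalBits are hypothetical names for illustration. How the library maps minus zero to plus zero is an assumption here; Double.doubleToLongBits on its own keeps the two zeros distinct, so the sketch normalizes zero explicitly before the conversion.

```java
final class DoubleNormalization {
    // Hypothetical helper, not part of the library API.
    static long canonicalBits(double datum) {
        // -0.0 == 0.0 is true in Java, so both zeros become +0.0 here (assumed step).
        double d = (datum == 0.0) ? 0.0 : datum;
        // doubleToLongBits collapses every NaN bit pattern into one canonical NaN.
        return Double.doubleToLongBits(d);
    }

    public static void main(String[] args) {
        double oddNaN = Double.longBitsToDouble(0x7ff8000000000001L); // non-canonical NaN payload
        System.out.println(canonicalBits(-0.0) == canonicalBits(0.0));          // prints true
        System.out.println(canonicalBits(oddNaN) == canonicalBits(Double.NaN)); // prints true
        System.out.println(canonicalBits(Double.POSITIVE_INFINITY)
                == canonicalBits(Double.NEGATIVE_INFINITY));                    // prints false
    }
}
```

Values that map to the same long count as one distinct item, so every NaN is a single item while the two infinities remain separate items.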
-
update
Present this sketch with the given String. The string is converted to a byte array using UTF-8 encoding. If the string is null or empty, no update attempt is made and the method returns.
Note: this will not produce the same output hash values as the update(char[]) method and will generally be a little slower, depending on the complexity of the UTF-8 encoding.
- Parameters:
datum - The given String.
- Returns:
- See Update Return State
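One way to see why update(String) and update(char[]) yield different hash values is to compare the raw input each one presents for the same text. The sketch below uses only the JDK. The class StringVsCharArrayInput and its helpers are hypothetical, and the idea that a char array is consumed as 16-bit values, two bytes per char, is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;

final class StringVsCharArrayInput {
    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the same text occupies as a char[] (assumes 2 bytes per char).
    static int charArrayLength(String s) {
        return s.toCharArray().length * Character.BYTES;
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String kanji = "\u65e5\u672c"; // two CJK characters
        System.out.println(utf8Length(ascii) + " vs " + charArrayLength(ascii)); // prints 3 vs 6
        System.out.println(utf8Length(kanji) + " vs " + charArrayLength(kanji)); // prints 6 vs 4
    }
}
```

Since the two inputs differ in content and often in length, a caller should pick one form per data set and use it consistently.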
-
update
Present this sketch with the given byte array. If the byte array is null or empty, no update attempt is made and the method returns.
- Parameters:
data - The given byte array.
- Returns:
- See Update Return State
-
update
Present this sketch with the given ByteBuffer. If the ByteBuffer is null or empty, no update attempt is made and the method returns.
- Parameters:
buffer - the input ByteBuffer
- Returns:
- See Update Return State
-
update
Present this sketch with the given char array. If the char array is null or empty, no update attempt is made and the method returns.
Note: this will not produce the same output hash values as the update(String) method but will be a little faster, as it avoids the complexity of the UTF-8 encoding.
- Parameters:
data - The given char array.
- Returns:
- See Update Return State
-
update
Present this sketch with the given integer array. If the integer array is null or empty, no update attempt is made and the method returns.
- Parameters:
data - The given int array.
- Returns:
- See Update Return State
-
update
Present this sketch with the given long array. If the long array is null or empty, no update attempt is made and the method returns.
- Parameters:
data - The given long array.
- Returns:
- See Update Return State
-
getLgNomLongs
public abstract int getLgNomLongs()
Gets the log base 2 of the configured nominal entries.
- Returns:
- the Log base 2 of the configured nominal entries
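The relationship between this value and the nominal entries k is simple arithmetic, sketched below with JDK calls only. The class NominalEntries and both helpers are hypothetical, and the sketch assumes that the configured nominal entries value is a power of two.

```java
final class NominalEntries {
    // Hypothetical helper: recovers k from its base-2 logarithm.
    static int nominalEntries(int lgNomLongs) {
        return 1 << lgNomLongs;
    }

    // Hypothetical helper: computes the base-2 logarithm of a power-of-two k.
    static int lgNomLongs(int k) {
        if (k <= 0 || Integer.bitCount(k) != 1) {
            throw new IllegalArgumentException("k must be a positive power of two: " + k);
        }
        return Integer.numberOfTrailingZeros(k);
    }

    public static void main(String[] args) {
        System.out.println(nominalEntries(12)); // prints 4096
        System.out.println(lgNomLongs(4096));   // prints 12
    }
}
```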
-