Class UpdatableTupleSketch<U, S extends UpdatableSummary<U>>

java.lang.Object
org.apache.datasketches.tuple.TupleSketch<S>
org.apache.datasketches.tuple.UpdatableTupleSketch<U,S>
Type Parameters:
U - Type of the value that is passed to the update method of a Summary
S - Type of the UpdatableSummary<U>
Direct Known Subclasses:
ArrayOfStringsTupleSketch, DoubleTupleSketch, IntegerTupleSketch

public class UpdatableTupleSketch<U, S extends UpdatableSummary<U>> extends TupleSketch<S>
An extension of TupleSketch<S> that can be updated with many types of keys. Summary objects are created by a user-defined SummaryFactory class, which allows flexible parameterization if needed. Keys are presented to a sketch along with values of a user-defined update type U. When an entry is inserted into a sketch, or a duplicate key is presented to it, the summary.update(U value) method is called, so any kind of user-defined accumulation is possible. Summaries must also know how to copy themselves. Union and intersection of summaries can be implemented in a subclass of SummarySetOperations, which is used when a TupleUnion or TupleIntersection of two TupleSketch instances is needed.
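The update flow described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToySummary`, `SumSummary`, and `ToyUpdatableSketch` are hypothetical names invented for illustration, a `java.util.function.Supplier` stands in for the SummaryFactory, and a plain `HashMap` replaces the real hash table. Unlike a real tuple sketch, the toy keeps every key and does no sampling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for an updatable summary; not the library interface. */
interface ToySummary<U> {
  void update(U value);
}

/** Example summary that accumulates a running sum of Double values. */
final class SumSummary implements ToySummary<Double> {
  private double sum;

  @Override
  public void update(Double value) {
    sum += value;
  }

  double getSum() {
    return sum;
  }
}

/** Toy model of the update flow only; it keeps every key and never samples. */
final class ToyUpdatableSketch<U, S extends ToySummary<U>> {
  private final Map<Long, S> entries = new HashMap<>();
  private final Supplier<S> factory; // plays the role of a summary factory

  ToyUpdatableSketch(Supplier<S> factory) {
    this.factory = factory;
  }

  /** New key: create a summary, then update it. Duplicate key: update the existing summary. */
  void update(long key, U value) {
    entries.computeIfAbsent(key, k -> factory.get()).update(value);
  }

  S get(long key) {
    return entries.get(key);
  }

  int size() {
    return entries.size();
  }
}
```

In this model, presenting key 1 with values 2.0 and 3.0 leaves a single entry whose summary holds 5.0, which is the accumulation behavior the description refers to.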
  • Constructor Details

    • UpdatableTupleSketch

      public UpdatableTupleSketch(int nomEntries, int lgResizeFactor, float samplingProbability, SummaryFactory<S> summaryFactory)
      Creates a new instance of an UpdatableTupleSketch.
      Parameters:
      nomEntries - Nominal number of entries. Forced to the nearest power of 2 greater than or equal to the given value.
      lgResizeFactor - log2(resizeFactor), a value from 0 to 3:
      0 - no resizing (max size allocated)
      1 - double the internal hash table each time it reaches a threshold
      2 - grow four times
      3 - grow eight times (default)
      
      samplingProbability - See Sampling Probability
      summaryFactory - An instance of a SummaryFactory.
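The arithmetic behind nomEntries and lgResizeFactor can be sketched as follows. The class and method names here are hypothetical helpers, not part of the library, and the library may apply additional minimum or maximum bounds that this sketch does not model.

```java
/** Hypothetical helpers illustrating the parameter arithmetic; not library code. */
final class ConstructorParams {

  /** Smallest power of 2 that is greater than or equal to n, for 1 <= n <= 2^30. */
  static int ceilingPowerOf2(int n) {
    if (n <= 1) {
      return 1;
    }
    return Integer.highestOneBit(n - 1) << 1;
  }

  /** Growth multiplier implied by lgResizeFactor: 0 -> 1, 1 -> 2, 2 -> 4, 3 -> 8. */
  static int resizeMultiplier(int lgResizeFactor) {
    if (lgResizeFactor < 0 || lgResizeFactor > 3) {
      throw new IllegalArgumentException("lgResizeFactor must be in [0, 3]");
    }
    return 1 << lgResizeFactor;
  }
}
```

For example, a requested nomEntries of 1000 rounds up to 1024, while 4096 is already a power of 2 and stays unchanged.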
    • UpdatableTupleSketch

      @Deprecated public UpdatableTupleSketch(MemorySegment srcSeg, SummaryDeserializer<S> deserializer, SummaryFactory<S> summaryFactory)
      Deprecated.
      As of 3.0.0, heapifying an UpdatableTupleSketch is deprecated. This capability will be removed in a future release. Heapifying a CompactTupleSketch is not deprecated.
      Creates an instance of an UpdatableTupleSketch from its serialized form.
      Parameters:
      srcSeg - MemorySegment object with data of a serialized UpdatableTupleSketch
      deserializer - instance of SummaryDeserializer
      summaryFactory - instance of SummaryFactory
    • UpdatableTupleSketch

      public UpdatableTupleSketch(UpdatableTupleSketch<U,S> sketch)
      Copy Constructor
      Parameters:
      sketch - the sketch to copy
  • Method Details

    • copy

      public UpdatableTupleSketch<U,S> copy()
      Returns:
      a deep copy of this sketch
    • update

      public void update(long key, U value)
      Updates this sketch with a long key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      key - The given long key
      value - The given U value
    • update

      public void update(double key, U value)
      Updates this sketch with a double key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      key - The given double key
      value - The given U value
    • update

      public void update(String key, U value)
      Updates this sketch with a String key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      key - The given String key
      value - The given U value
    • update

      public void update(byte[] key, U value)
      Updates this sketch with a byte[] key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      key - The given byte[] key
      value - The given U value
    • update

      public void update(ByteBuffer buffer, U value)
      Updates this sketch with a ByteBuffer key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      buffer - The given ByteBuffer key
      value - The given U value
    • update

      public void update(int[] key, U value)
      Updates this sketch with an int[] key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      key - The given int[] key
      value - The given U value
    • update

      public void update(long[] key, U value)
      Updates this sketch with a long[] key and a U value. The value is passed to the update() method of the Summary object associated with the key.
      Parameters:
      key - The given long[] key
      value - The given U value
    • getRetainedEntries

      public int getRetainedEntries()
      Description copied from class: TupleSketch
      Returns the number of retained entries.
      Specified by:
      getRetainedEntries in class TupleSketch<S extends Summary>
      Returns:
      number of retained entries
    • getCountLessThanThetaLong

      public int getCountLessThanThetaLong(long thetaLong)
      Description copied from class: TupleSketch
      Gets the number of hash values less than the given theta expressed as a long.
      Specified by:
      getCountLessThanThetaLong in class TupleSketch<S extends Summary>
      Parameters:
      thetaLong - the given theta as a long in the range (zero, Long.MAX_VALUE].
      Returns:
      the number of hash values less than the given thetaLong.
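The counting rule can be sketched as below. This is an illustration only: `ThetaCount` and its methods are hypothetical names, the retained hashes are modeled as a plain array of positive longs, and the fraction conversion assumes theta is the ratio of thetaLong to Long.MAX_VALUE.

```java
/** Hypothetical illustration of counting hashes below a theta threshold; not library code. */
final class ThetaCount {

  /** Counts the hashes strictly below thetaLong. Hashes are assumed to be positive. */
  static int countLessThanThetaLong(long[] hashes, long thetaLong) {
    int count = 0;
    for (long hash : hashes) {
      if (hash < thetaLong) {
        count++;
      }
    }
    return count;
  }

  /** Theta as a fraction in (0, 1], assuming theta = thetaLong / Long.MAX_VALUE. */
  static double thetaAsFraction(long thetaLong) {
    return (double) thetaLong / Long.MAX_VALUE;
  }
}
```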
    • getNominalEntries

      public int getNominalEntries()
      Get configured nominal number of entries
      Returns:
      nominal number of entries
    • getLgK

      public int getLgK()
      Get log_base2 of Nominal Entries
      Returns:
      log_base2 of Nominal Entries
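Because the nominal number of entries is a power of 2, its base-2 logarithm is an exact integer. A minimal sketch of that relationship, using a hypothetical helper class rather than library code:

```java
/** Hypothetical helper relating nominal entries to lgK; not library code. */
final class LgK {

  /** Returns log2(nominalEntries), assuming nominalEntries is a positive power of 2. */
  static int lgK(int nominalEntries) {
    if (nominalEntries <= 0 || Integer.bitCount(nominalEntries) != 1) {
      throw new IllegalArgumentException("nominalEntries must be a positive power of 2");
    }
    return Integer.numberOfTrailingZeros(nominalEntries);
  }
}
```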
    • getSamplingProbability

      public float getSamplingProbability()
      Get configured sampling probability
      Returns:
      sampling probability
    • getCurrentCapacity

      public int getCurrentCapacity()
      Get current capacity
      Returns:
      current capacity
    • getResizeFactor

      public ResizeFactor getResizeFactor()
      Get configured resize factor
      Returns:
      resize factor
    • trim

      public void trim()
      Rebuilds the sketch, reducing the actual number of entries to the nominal number of entries if needed.
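The effect of trimming can be pictured with a toy model. This sketch assumes the entries kept are those with the smallest hash values, which is the usual behavior of theta-style sketches. It sorts an array for clarity, whereas a real implementation would more likely use a selection algorithm and also lower theta. `ToyTrim` is a hypothetical name.

```java
import java.util.Arrays;

/** Toy model of trimming retained hashes down to the nominal count; not library code. */
final class ToyTrim {

  /** Returns at most nominalEntries hashes, keeping the smallest ones in ascending order. */
  static long[] trim(long[] hashes, int nominalEntries) {
    long[] sorted = hashes.clone();
    Arrays.sort(sorted);
    if (sorted.length <= nominalEntries) {
      return sorted; // nothing to discard
    }
    return Arrays.copyOf(sorted, nominalEntries);
  }
}
```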
    • reset

      public void reset()
      Resets this sketch to an empty state.
    • compact

      public CompactTupleSketch<S> compact()
      Converts the current state of the sketch into a compact sketch
      Specified by:
      compact in class TupleSketch<S extends Summary>
      Returns:
      compact sketch
    • toByteArray

      @Deprecated public byte[] toByteArray()
      Deprecated.
      As of 3.0.0, serializing an UpdatableTupleSketch is deprecated. This capability will be removed in a future release. Serializing a CompactTupleSketch is not deprecated.
      This serializes an UpdatableTupleSketch (QuickSelectSketch).
      Specified by:
      toByteArray in class TupleSketch<S extends Summary>
      Returns:
      serialized representation of an UpdatableTupleSketch (QuickSelectSketch).
    • iterator

      public TupleSketchIterator<S> iterator()
      Description copied from class: TupleSketch
      Returns a TupleSketchIterator
      Specified by:
      iterator in class TupleSketch<S extends Summary>
      Returns:
      a TupleSketchIterator