Class UpdatableThetaSketchBuilder

java.lang.Object
org.apache.datasketches.theta.UpdatableThetaSketchBuilder

public final class UpdatableThetaSketchBuilder extends Object
For building a new UpdatableThetaSketch.
Author:
Lee Rhodes
  • Constructor Details

    • UpdatableThetaSketchBuilder

      public UpdatableThetaSketchBuilder()
      Constructor for building a new UpdatableThetaSketch with the default configuration. The defaults for the parameters unique to the concurrent sketches are:
      • Concurrent NumPoolThreads: 3
      • Local Log Nominal Entries: 4 (that is, 2^4 = 16 local Nominal Entries)
      • Concurrent PropagateOrderedCompact: true
      • Concurrent MaxConcurrencyError: 0
      • Concurrent MaxNumLocalThreads: 1
  • Method Details

    • setNominalEntries

      public UpdatableThetaSketchBuilder setNominalEntries(int nomEntries)
      Sets the local Nominal Entries for this builder. This value is also used for building a shared concurrent sketch. The minimum value is 16 (2^4) and the maximum value is 67,108,864 (2^26). Be aware that sketches as large as this maximum value may not have been thoroughly tested or characterized for performance.
      Parameters:
      nomEntries - the Nominal Entries. If the given value is not a power of 2, it is rounded up to the next power of 2.
      Returns:
      this UpdatableThetaSketchBuilder
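
      The rounding and range rules described above can be sketched in plain Java. This is a minimal illustration and not the library's own code: the class NominalEntriesSketch, its method names, and the choice to reject out-of-range values with IllegalArgumentException are all assumptions made for the example.

```java
/** Illustrative stand-in, not part of the DataSketches library. */
final class NominalEntriesSketch {
  static final int MIN_NOM_ENTRIES = 1 << 4;   // 16
  static final int MAX_NOM_ENTRIES = 1 << 26;  // 67,108,864

  /** Returns the smallest power of 2 that is >= n, for 1 <= n <= 2^30. */
  static int ceilingPowerOf2(final int n) {
    if (n <= 1) { return 1; }
    return Integer.highestOneBit(n - 1) << 1;
  }

  /**
   * Rounds nomEntries up to a power of 2 and returns its log-base-2.
   * Assumption: values outside [16, 2^26] are rejected with an exception.
   */
  static int toLgNominalEntries(final int nomEntries) {
    if (nomEntries < MIN_NOM_ENTRIES || nomEntries > MAX_NOM_ENTRIES) {
      throw new IllegalArgumentException("Nominal Entries out of range: " + nomEntries);
    }
    return Integer.numberOfTrailingZeros(ceilingPowerOf2(nomEntries));
  }

  public static void main(final String[] args) {
    // 1000 is not a power of 2, so it rounds up to 1024 = 2^10.
    System.out.println(toLgNominalEntries(1000));
  }
}
```

      Under these assumptions a request for 1000 Nominal Entries behaves like a request for 1024, and a request for 4096 is kept as given.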
    • setLogNominalEntries

      public UpdatableThetaSketchBuilder setLogNominalEntries(int lgNomEntries)
      Alternative method of setting the local Nominal Entries for this builder from the log_base2 value. This value is also used for building a shared concurrent sketch. The minimum value is 4 and the maximum value is 26. Be aware that sketches as large as this maximum value may not have been thoroughly characterized for performance.
      Parameters:
      lgNomEntries - the Log Nominal Entries. Also used for the concurrent shared sketch.
      Returns:
      this UpdatableThetaSketchBuilder
    • setLgK

      public UpdatableThetaSketchBuilder setLgK(int lgK)
      Alternative method of setting the Nominal Entries for this builder from the log_base2 value, commonly called LgK. This value is also used for building a shared concurrent sketch. The minimum value is 4 and the maximum value is 26. Be aware that sketches as large as 26 may not have been thoroughly characterized for performance.
      Parameters:
      lgK - the Log Nominal Entries. Also used for the concurrent shared sketch.
      Returns:
      this UpdatableThetaSketchBuilder
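
      The relationship between LgK and Nominal Entries is a power of 2, which the following sketch makes explicit. The class LgKSketch and its method are hypothetical helpers written for this illustration, and rejecting out-of-range values with IllegalArgumentException is an assumption.

```java
/** Illustrative stand-in, not part of the DataSketches library. */
final class LgKSketch {
  static final int MIN_LG_K = 4;
  static final int MAX_LG_K = 26;

  /** Converts a log-base-2 value (LgK) to the Nominal Entries it represents. */
  static int nominalEntriesFromLgK(final int lgK) {
    if (lgK < MIN_LG_K || lgK > MAX_LG_K) {
      throw new IllegalArgumentException("LgK out of range [4, 26]: " + lgK);
    }
    return 1 << lgK; // 2^lgK
  }

  public static void main(final String[] args) {
    for (final int lgK : new int[] {4, 12, 26}) {
      System.out.println("lgK=" + lgK + " -> nominal entries=" + nominalEntriesFromLgK(lgK));
    }
  }
}
```

      By this arithmetic, a LgK of 12 and a Nominal Entries value of 4096 describe the same size.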
    • getLgNominalEntries

      public int getLgNominalEntries()
      Returns the local Log-base 2 Nominal Entries
      Returns:
      Log-base 2 Nominal Entries
    • setConCurNominalEntries

      public UpdatableThetaSketchBuilder setConCurNominalEntries(int nomEntries)
      Sets the local (default) Concurrent Nominal Entries for the concurrent local sketch. The minimum value is 16 and the maximum value is 67,108,864, which is 2^26. Be aware that sketches as large as this maximum value have not been thoroughly tested or characterized for performance.
      Parameters:
      nomEntries - the Nominal Entries. If the given value is not a power of 2, it is rounded up to the next power of 2.
      Returns:
      this UpdatableThetaSketchBuilder
    • setConCurLogNominalEntries

      public UpdatableThetaSketchBuilder setConCurLogNominalEntries(int lgNomEntries)
      Alternative method of setting the local (default) Nominal Entries for a local concurrent sketch from the log_base2 value. The minimum value is 4 and the maximum value is 26. Be aware that sketches as large as this maximum value have not been thoroughly tested or characterized for performance.
      Parameters:
      lgNomEntries - the Log Nominal Entries for a concurrent local sketch
      Returns:
      this UpdatableThetaSketchBuilder
    • getConCurLgNominalEntries

      public int getConCurLgNominalEntries()
      Returns the Log-base 2 Nominal Entries for the concurrent local sketch.
      Returns:
      Log-base 2 Nominal Entries for the concurrent local sketch
    • setSeed

      public UpdatableThetaSketchBuilder setSeed(long seed)
      Sets the local long seed value that is required by the hashing function.
      Parameters:
      seed - the long seed value for the hashing function. See seed
      Returns:
      this UpdatableThetaSketchBuilder
    • getSeed

      public long getSeed()
      Returns the local long seed value that is required by the hashing function.
      Returns:
      the seed
    • setP

      public UpdatableThetaSketchBuilder setP(float p)
      Sets the local upfront uniform pre-sampling probability, p
      Parameters:
      p - See Sampling Probability, p
      Returns:
      this UpdatableThetaSketchBuilder
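
      One way to picture upfront uniform pre-sampling is as a threshold on the hash of each item: only items whose hash falls below a fraction p of the hash range are considered at all. The sketch below illustrates that idea under stated assumptions. The class PreSamplingSketch, its methods, and the use of positive 63-bit hash values are illustrative choices, not a description of the library's confirmed internals.

```java
/** Illustrative stand-in, not part of the DataSketches library. */
final class PreSamplingSketch {

  /**
   * Maps a sampling probability p in (0, 1] to a threshold on positive
   * 63-bit hash values. Assumption: hashes are uniform over (0, Long.MAX_VALUE).
   */
  static long thresholdFor(final float p) {
    if (!(p > 0.0f && p <= 1.0f)) { // also rejects NaN
      throw new IllegalArgumentException("p must be in (0, 1]: " + p);
    }
    return (long) (p * (double) Long.MAX_VALUE);
  }

  /** An item is considered only if its hash is positive and below the threshold. */
  static boolean accepts(final long hash, final long threshold) {
    return hash > 0 && hash < threshold;
  }

  public static void main(final String[] args) {
    final long threshold = thresholdFor(0.5f);
    System.out.println(accepts(1L << 61, threshold)); // hash in the lower half of the range
    System.out.println(accepts(Long.MAX_VALUE - 1, threshold)); // hash in the upper half
  }
}
```

      With p = 1.0 the threshold covers the whole hash range, so no items are sampled out; smaller values of p discard a proportionally larger share of items before any other processing.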
    • getP

      public float getP()
      Returns the local upfront uniform pre-sampling probability p
      Returns:
      the pre-sampling probability p
    • setResizeFactor

      public UpdatableThetaSketchBuilder setResizeFactor(ResizeFactor rf)
      Sets the local cache Resize Factor.
      Parameters:
      rf - See Resize Factor
      Returns:
      this UpdatableThetaSketchBuilder
    • getResizeFactor

      public ResizeFactor getResizeFactor()
      Returns the local Resize Factor
      Returns:
      the Resize Factor
    • setFamily

      public UpdatableThetaSketchBuilder setFamily(Family family)
      Sets the local Family. Choose either Family.ALPHA or Family.QUICKSELECT.
      Parameters:
      family - the family for this builder
      Returns:
      this UpdatableThetaSketchBuilder
    • getFamily

      public Family getFamily()
      Returns the local Family
      Returns:
      the Family
    • setMemorySegmentRequest

      public UpdatableThetaSketchBuilder setMemorySegmentRequest(MemorySegmentRequest mSegReq)
      Sets the local MemorySegmentRequest
      Parameters:
      mSegReq - the given MemorySegmentRequest
      Returns:
      this UpdatableThetaSketchBuilder
    • getMemorySegmentRequest

      public MemorySegmentRequest getMemorySegmentRequest()
      Returns the local MemorySegmentRequest
      Returns:
      the local MemorySegmentRequest
    • setNumPoolThreads

      public void setNumPoolThreads(int numPoolThreads)
      Sets the local number of pool threads used for background propagation in the concurrent sketches.
      Parameters:
      numPoolThreads - the given number of pool threads
    • getNumPoolThreads

      public int getNumPoolThreads()
      Gets the local number of background pool threads used for propagation in the concurrent sketches.
      Returns:
      the number of background pool threads
    • setPropagateOrderedCompact

      public UpdatableThetaSketchBuilder setPropagateOrderedCompact(boolean prop)
      Sets the local Propagate Ordered Compact flag to the given value. Used with concurrent sketches.
      Parameters:
      prop - the given value
      Returns:
      this UpdatableThetaSketchBuilder
    • getPropagateOrderedCompact

      public boolean getPropagateOrderedCompact()
      Gets the local Propagate Ordered Compact flag used with concurrent sketches.
      Returns:
      the Propagate Ordered Compact flag
    • setMaxConcurrencyError

      public void setMaxConcurrencyError(double maxConcurrencyError)
      Sets the local Maximum Concurrency Error.
      Parameters:
      maxConcurrencyError - the given Maximum Concurrency Error.
    • getMaxConcurrencyError

      public double getMaxConcurrencyError()
      Gets the local Maximum Concurrency Error
      Returns:
      the Maximum Concurrency Error
    • setMaxNumLocalThreads

      public void setMaxNumLocalThreads(int maxNumLocalThreads)
      Sets the local Maximum Number of Local Threads. This is used to set the size of the local concurrent buffers.
      Parameters:
      maxNumLocalThreads - the given Maximum Number of Local Threads
    • getMaxNumLocalThreads

      public int getMaxNumLocalThreads()
      Gets the local Maximum Number of Local Threads.
      Returns:
      the Maximum Number of Local Threads.
    • build

      public UpdatableThetaSketch build()
      Returns an UpdatableThetaSketch with the current configuration of this Builder.
      Returns:
      an UpdatableThetaSketch
    • build

      public UpdatableThetaSketch build(MemorySegment dstSeg)
      Returns an UpdatableThetaSketch with the current configuration of this Builder, backed by the specified destination MemorySegment. Note: this can only be used with the QUICKSELECT Family of sketches and cannot be used with the ALPHA Family of sketches.
      Parameters:
      dstSeg - The destination MemorySegment.
      Returns:
      an UpdatableThetaSketch
    • buildShared

      public UpdatableThetaSketch buildShared()
      Returns an on-heap concurrent shared UpdatableThetaSketch with the current configuration of the Builder.

      The parameters unique to the shared concurrent sketch are:

      • Number of Pool Threads (default is 3)
      • Maximum Concurrency Error

      Key parameters that are in common with other ThetaSketches:

      • Nominal Entries or Log Nominal Entries (for the shared concurrent sketch)
      Returns:
      an on-heap concurrent UpdatableThetaSketch with the current configuration of the Builder.
    • buildShared

      public UpdatableThetaSketch buildShared(MemorySegment dstSeg)
      Returns a concurrent shared UpdatableThetaSketch with the current configuration of the Builder and the given destination MemorySegment. If the destination MemorySegment is null, this defaults to an on-heap concurrent shared UpdatableThetaSketch.

      The parameters unique to the shared concurrent sketch are:

      • Number of Pool Threads (default is 3)
      • Maximum Concurrency Error

      Key parameters that are in common with other Theta sketches:

      • Nominal Entries or Log Nominal Entries (for the shared concurrent sketch)
      • Destination MemorySegment (if not null, returned sketch is Direct. Default is null.)
      Parameters:
      dstSeg - the given MemorySegment for Direct, otherwise null.
      Returns:
      a concurrent UpdatableThetaSketch with the current configuration of the Builder and the given destination MemorySegment.
    • buildSharedFromSketch

      public UpdatableThetaSketch buildSharedFromSketch(UpdatableThetaSketch sketch, MemorySegment dstSeg)
      Returns a concurrent shared UpdatableThetaSketch with the current configuration of the Builder, the data from the given sketch, and the given destination MemorySegment. If the destination MemorySegment is not null, the returned sketch is direct (potentially off-heap). If the destination MemorySegment is null, this defaults to an on-heap concurrent shared UpdatableThetaSketch.

      The parameters unique to the shared concurrent sketch are:

      • Number of Pool Threads (default is 3)
      • Maximum Concurrency Error

      Key parameters that are in common with other Theta sketches:

      • Nominal Entries or Log Nominal Entries (for the shared concurrent sketch)
      • Destination MemorySegment (if not null, returned sketch is Direct. Default is null.)
      Parameters:
      sketch - a given UpdatableThetaSketch from which the data is used to initialize the returned shared sketch.
      dstSeg - the given MemorySegment for Direct, otherwise null.
      Returns:
      a concurrent UpdatableThetaSketch with the current configuration of the Builder and the given destination MemorySegment.
    • buildLocal

      public UpdatableThetaSketch buildLocal(UpdatableThetaSketch shared)
      Returns a local, on-heap, concurrent UpdatableThetaSketch to be used as a per-thread local buffer along with the given concurrent shared UpdatableThetaSketch and the current configuration of this Builder.

      The parameters unique to the local concurrent sketch are:

      • Local Nominal Entries or Local Log Nominal Entries
      • Propagate Ordered Compact flag
      Parameters:
      shared - the concurrent shared sketch to be accessed via the concurrent local sketch.
      Returns:
      an UpdatableThetaSketch to be used as a per-thread local buffer.
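
      The shared and local roles described above follow a common pattern: each writer thread updates its own small buffer and only periodically propagates the buffered items to the shared structure. The following plain-Java sketch illustrates that pattern only. It uses an exact concurrent set in place of a sketch, and the class LocalBufferSketch, the nested LocalBuffer, and the buffer capacity of 16 are hypothetical names and values chosen for the illustration, not the library's implementation.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative stand-in for the shared/local pattern; not the DataSketches implementation. */
final class LocalBufferSketch {

  /** Hypothetical per-thread buffer that batches updates into a shared set. */
  static final class LocalBuffer {
    private final Set<Long> shared;
    private final Set<Long> buffer = new HashSet<>();
    private final int capacity;

    LocalBuffer(final Set<Long> shared, final int capacity) {
      this.shared = shared;
      this.capacity = capacity;
    }

    /** Buffers the item locally and propagates once the buffer is full. */
    void update(final long item) {
      buffer.add(item);
      if (buffer.size() >= capacity) { flush(); }
    }

    /** Propagates all buffered items to the shared set and empties the buffer. */
    void flush() {
      shared.addAll(buffer);
      buffer.clear();
    }
  }

  /** Runs numThreads writers, each with its own LocalBuffer, and returns the shared distinct count. */
  static int countDistinct(final int numThreads, final int itemsPerThread) {
    final Set<Long> shared = ConcurrentHashMap.newKeySet();
    final Thread[] threads = new Thread[numThreads];
    for (int t = 0; t < numThreads; t++) {
      final long base = (long) t * itemsPerThread; // disjoint item range per thread
      threads[t] = new Thread(() -> {
        final LocalBuffer local = new LocalBuffer(shared, 16);
        for (long i = 0; i < itemsPerThread; i++) { local.update(base + i); }
        local.flush(); // propagate whatever is still buffered
      });
      threads[t].start();
    }
    try {
      for (final Thread th : threads) { th.join(); }
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for writers", e);
    }
    return shared.size();
  }

  public static void main(final String[] args) {
    System.out.println(countDistinct(4, 1000)); // 4 threads x 1000 distinct items each
  }
}
```

      The design point the example is meant to show is that writers rarely touch the shared structure, which keeps contention low at the cost of the shared view lagging slightly behind the most recent updates.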
    • toString

      public String toString()
      Overrides:
      toString in class Object