Class UpdateSketchBuilder

java.lang.Object
org.apache.datasketches.theta.UpdateSketchBuilder

public class UpdateSketchBuilder extends Object
For building a new UpdateSketch.
Author:
Lee Rhodes
  • Constructor Details

    • UpdateSketchBuilder

      public UpdateSketchBuilder()
      Constructor for building a new UpdateSketch. The default configuration is
      • Nominal Entries: 4096
      • Seed: 9001L
      • Input Sampling Probability: 1.0
      • Family: Family.QUICKSELECT
      • Resize Factor: The default for sketches on the Java heap is ResizeFactor.X8. For direct sketches, which are targeted for native memory off the Java heap, this value will be fixed at either ResizeFactor.X1 or ResizeFactor.X2.
      • MemoryRequestServer (Direct only): DefaultMemoryRequestServer.
      Parameters unique to the concurrent sketches only:
      • Local Log Nominal Entries: 4 (that is, 16 local Nominal Entries, the minimum)
      • Concurrent NumPoolThreads: 3
      • Concurrent PropagateOrderedCompact: true
      • Concurrent MaxConcurrencyError: 0
  • Method Details

    • setNominalEntries

      public UpdateSketchBuilder setNominalEntries(int nomEntries)
      Sets the Nominal Entries for this sketch. This value is also used for building a shared concurrent sketch. The minimum value is 16 (2^4) and the maximum value is 67,108,864 (2^26). Be aware that sketches as large as this maximum value may not have been thoroughly tested or characterized for performance.
      Parameters:
      nomEntries - the Nominal Entries. If the given value is not a power of 2, it is rounded up to the next power of 2.
      Returns:
      this UpdateSketchBuilder
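      The rounding rule described for nomEntries can be sketched with plain JDK calls. This is an illustrative stand-in, not the library's implementation: the class name NominalEntriesSketch and the choice to throw IllegalArgumentException for out-of-range input are assumptions made for the example.

```java
// Illustrative stand-in only; names and error handling are assumptions, not library code.
final class NominalEntriesSketch {
  static final int MIN_NOM_ENTRIES = 1 << 4;   // 16, the documented minimum
  static final int MAX_NOM_ENTRIES = 1 << 26;  // 67,108,864, the documented maximum

  /** Rounds n up to the next power of 2, or returns n if it already is one. */
  static int ceilingPowerOf2(int n) {
    if (n < MIN_NOM_ENTRIES || n > MAX_NOM_ENTRIES) {
      // Assumption: reject values outside the documented range.
      throw new IllegalArgumentException("nomEntries out of range: " + n);
    }
    int floor = Integer.highestOneBit(n);      // largest power of 2 that is <= n
    return (floor == n) ? n : floor << 1;      // step up only when n is not a power of 2
  }

  public static void main(String[] args) {
    System.out.println(ceilingPowerOf2(4096)); // already a power of 2
    System.out.println(ceilingPowerOf2(5000)); // rounded up to 8192
  }
}
```

      Under this rule, a request for 5000 Nominal Entries behaves like a request for 8192, so a power of 2 is the least surprising value to pass.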
    • setLogNominalEntries

      public UpdateSketchBuilder setLogNominalEntries(int lgNomEntries)
      Alternative method of setting the Nominal Entries for this sketch from the log_base2 value. This value is also used for building a shared concurrent sketch. The minimum value is 4 and the maximum value is 26. Be aware that sketches as large as this maximum value may not have been thoroughly characterized for performance.
      Parameters:
      lgNomEntries - the Log Nominal Entries. This value also applies to the concurrent shared sketch.
      Returns:
      this UpdateSketchBuilder
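      The log_base2 form and the plain form describe the same size. A minimal sketch of the conversion in both directions follows; the class and method names are hypothetical, and the range check is an assumption based on the documented limits of 4 and 26.

```java
// Illustrative stand-in only; class and method names are hypothetical.
final class LogNominalEntriesSketch {
  /** Converts a log_base2 value into the Nominal Entries it denotes. */
  static int toNominalEntries(int lgNomEntries) {
    if (lgNomEntries < 4 || lgNomEntries > 26) {
      // Assumption: reject values outside the documented range of 4 to 26.
      throw new IllegalArgumentException("lgNomEntries out of range: " + lgNomEntries);
    }
    return 1 << lgNomEntries;
  }

  /** Converts Nominal Entries (which must be a power of 2) back to log_base2. */
  static int toLgNominalEntries(int nomEntries) {
    return Integer.numberOfTrailingZeros(nomEntries);
  }

  public static void main(String[] args) {
    System.out.println(toNominalEntries(12));     // the default of 4096 Nominal Entries
    System.out.println(toLgNominalEntries(4096)); // back to 12
  }
}
```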
    • getLgNominalEntries

      public int getLgNominalEntries()
      Returns Log-base 2 Nominal Entries
      Returns:
      Log-base 2 Nominal Entries
    • setLocalNominalEntries

      public UpdateSketchBuilder setLocalNominalEntries(int nomEntries)
      Sets the Nominal Entries for the concurrent local sketch. The minimum value is 16 and the maximum value is 67,108,864, which is 2^26. Be aware that sketches as large as this maximum value have not been thoroughly tested or characterized for performance.
      Parameters:
      nomEntries - the Nominal Entries for the concurrent local sketch. If the given value is not a power of 2, it is rounded up to the next power of 2.
      Returns:
      this UpdateSketchBuilder
    • setLocalLogNominalEntries

      public UpdateSketchBuilder setLocalLogNominalEntries(int lgNomEntries)
      Alternative method of setting the Nominal Entries for a local concurrent sketch from the log_base2 value. The minimum value is 4 and the maximum value is 26. Be aware that sketches as large as this maximum value have not been thoroughly tested or characterized for performance.
      Parameters:
      lgNomEntries - the Log Nominal Entries for a concurrent local sketch
      Returns:
      this UpdateSketchBuilder
    • getLocalLgNominalEntries

      public int getLocalLgNominalEntries()
      Returns Log-base 2 Nominal Entries for the concurrent local sketch
      Returns:
      Log-base 2 Nominal Entries for the concurrent local sketch
    • setSeed

      public UpdateSketchBuilder setSeed(long seed)
      Sets the long seed value that is required by the hashing function.
      Parameters:
      seed - the seed value for the hashing function. The default is 9001L.
      Returns:
      this UpdateSketchBuilder
    • getSeed

      public long getSeed()
      Returns the seed
      Returns:
      the seed
    • setP

      public UpdateSketchBuilder setP(float p)
      Sets the upfront uniform sampling probability, p
      Parameters:
      p - the sampling probability, greater than 0.0 and at most 1.0. The default is 1.0 (no sampling).
      Returns:
      this UpdateSketchBuilder
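      One way to picture the effect of p, assuming the usual theta-sketch convention in which p sets the initial threshold theta as a fraction of the hash range (this page does not spell that out). The class and method names below are hypothetical, and the code is a sketch of the arithmetic rather than the library's implementation.

```java
// Illustrative stand-in only; assumes p sets the initial theta as a fraction of the hash range.
final class SamplingProbabilitySketch {
  /** Initial theta threshold, expressed on the positive long scale. */
  static long initialThetaLong(float p) {
    if (!(p > 0.0f && p <= 1.0f)) {
      throw new IllegalArgumentException("p must be > 0 and <= 1.0: " + p);
    }
    return (long) (p * (double) Long.MAX_VALUE);
  }

  /** Estimated distinct count: retained entries scaled up by 1/theta. */
  static double estimate(int retainedEntries, long thetaLong) {
    double theta = thetaLong / (double) Long.MAX_VALUE;
    return retainedEntries / theta;
  }

  public static void main(String[] args) {
    long thetaLong = initialThetaLong(0.5f);
    // With p = 0.5, about half of the hashed inputs fall below theta and are retained,
    // and each retained entry stands for two distinct inputs.
    System.out.println(estimate(100, thetaLong));
  }
}
```

      With p = 1.0 the threshold covers the whole hash range, so nothing is sampled away up front and each retained entry counts once.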
    • getP

      public float getP()
      Returns the pre-sampling probability p
      Returns:
      the pre-sampling probability p
    • setResizeFactor

      public UpdateSketchBuilder setResizeFactor(ResizeFactor rf)
      Sets the cache Resize Factor.
      Parameters:
      rf - the Resize Factor. The default for sketches on the Java heap is ResizeFactor.X8.
      Returns:
      this UpdateSketchBuilder
    • getResizeFactor

      public ResizeFactor getResizeFactor()
      Returns the Resize Factor
      Returns:
      the Resize Factor
    • setFamily

      public UpdateSketchBuilder setFamily(Family family)
      Sets the Family.
      Parameters:
      family - the family for this builder
      Returns:
      this UpdateSketchBuilder
    • getFamily

      public Family getFamily()
      Returns the Family
      Returns:
      the Family
    • setMemoryRequestServer

      public UpdateSketchBuilder setMemoryRequestServer(org.apache.datasketches.memory.MemoryRequestServer memReqSvr)
      Sets the MemoryRequestServer.
      Parameters:
      memReqSvr - the given MemoryRequestServer
      Returns:
      this UpdateSketchBuilder
    • getMemoryRequestServer

      public org.apache.datasketches.memory.MemoryRequestServer getMemoryRequestServer()
      Returns the MemoryRequestServer
      Returns:
      the MemoryRequestServer
    • setNumPoolThreads

      public void setNumPoolThreads(int numPoolThreads)
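      Sets the number of pool threads used for background propagation in the concurrent sketches. Unlike most setters in this class, this method returns void, so it cannot be used in a chain of builder calls.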
      Parameters:
      numPoolThreads - the given number of pool threads
    • getNumPoolThreads

      public int getNumPoolThreads()
      Gets the number of background pool threads used for propagation in the concurrent sketches.
      Returns:
      the number of background pool threads
    • setPropagateOrderedCompact

      public UpdateSketchBuilder setPropagateOrderedCompact(boolean prop)
      Sets the Propagate Ordered Compact flag to the given value. Used with concurrent sketches.
      Parameters:
      prop - the given value
      Returns:
      this UpdateSketchBuilder
    • getPropagateOrderedCompact

      public boolean getPropagateOrderedCompact()
      Gets the Propagate Ordered Compact flag used with concurrent sketches.
      Returns:
      the Propagate Ordered Compact flag
    • setMaxConcurrencyError

      public void setMaxConcurrencyError(double maxConcurrencyError)
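      Sets the Maximum Concurrency Error. Unlike most setters in this class, this method returns void, so it cannot be used in a chain of builder calls.
      Parameters:
      maxConcurrencyError - the given Maximum Concurrency Error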
    • getMaxConcurrencyError

      public double getMaxConcurrencyError()
      Gets the Maximum Concurrency Error
      Returns:
      the Maximum Concurrency Error
    • setMaxNumLocalThreads

      public void setMaxNumLocalThreads(int maxNumLocalThreads)
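      Sets the Maximum Number of Local Threads. This is used to set the size of the local concurrent buffers. Unlike most setters in this class, this method returns void, so it cannot be used in a chain of builder calls.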
      Parameters:
      maxNumLocalThreads - the given Maximum Number of Local Threads
    • getMaxNumLocalThreads

      public int getMaxNumLocalThreads()
      Gets the Maximum Number of Local Threads.
      Returns:
      the Maximum Number of Local Threads
    • build

      public UpdateSketch build()
      Returns an UpdateSketch with the current configuration of this Builder.
      Returns:
      an UpdateSketch
    • build

      public UpdateSketch build(org.apache.datasketches.memory.WritableMemory dstMem)
      Returns an UpdateSketch with the current configuration of this Builder, backed by the specified destination Memory. Note: this cannot be used with the Alpha Family of sketches.
      Parameters:
      dstMem - The destination Memory.
      Returns:
      an UpdateSketch
    • buildShared

      public UpdateSketch buildShared()
      Returns an on-heap concurrent shared UpdateSketch with the current configuration of the Builder.

      The parameters unique to the shared concurrent sketch are:

      • Number of Pool Threads (default is 3)
      • Maximum Concurrency Error

      Key parameters that are in common with other Theta sketches:

      • Nominal Entries or Log Nominal Entries (for the shared concurrent sketch)
      Returns:
      an on-heap concurrent UpdateSketch with the current configuration of the Builder.
    • buildShared

      public UpdateSketch buildShared(org.apache.datasketches.memory.WritableMemory dstMem)
      Returns a direct (potentially off-heap) concurrent shared UpdateSketch with the current configuration of the Builder and the given destination WritableMemory. If the destination WritableMemory is null, this defaults to an on-heap concurrent shared UpdateSketch.

      The parameters unique to the shared concurrent sketch are:

      • Number of Pool Threads (default is 3)
      • Maximum Concurrency Error

      Key parameters that are in common with other Theta sketches:

      • Nominal Entries or Log Nominal Entries (for the shared concurrent sketch)
      • Destination Writable Memory (if not null, returned sketch is Direct. Default is null.)
      Parameters:
      dstMem - the given WritableMemory for Direct, otherwise null.
      Returns:
      a concurrent UpdateSketch with the current configuration of the Builder and the given destination WritableMemory.
    • buildSharedFromSketch

      public UpdateSketch buildSharedFromSketch(UpdateSketch sketch, org.apache.datasketches.memory.WritableMemory dstMem)
      Returns a direct (potentially off-heap) concurrent shared UpdateSketch with the current configuration of the Builder, the data from the given sketch, and the given destination WritableMemory. If the destination WritableMemory is null, this defaults to an on-heap concurrent shared UpdateSketch.

      The parameters unique to the shared concurrent sketch are:

      • Number of Pool Threads (default is 3)
      • Maximum Concurrency Error

      Key parameters that are in common with other Theta sketches:

      • Nominal Entries or Log Nominal Entries (for the shared concurrent sketch)
      • Destination Writable Memory (if not null, returned sketch is Direct. Default is null.)
      Parameters:
      sketch - a given UpdateSketch from which the data is used to initialize the returned shared sketch.
      dstMem - the given WritableMemory for Direct, otherwise null.
      Returns:
      a concurrent UpdateSketch with the current configuration of the Builder and the given destination WritableMemory.
    • buildLocal

      public UpdateSketch buildLocal(UpdateSketch shared)
      Returns a local, on-heap, concurrent UpdateSketch to be used as a per-thread local buffer along with the given concurrent shared UpdateSketch and the current configuration of this Builder.

      The parameters unique to the local concurrent sketch are:

      • Local Nominal Entries or Local Log Nominal Entries
      • Propagate Ordered Compact flag
      Parameters:
      shared - the concurrent shared sketch to be accessed via the concurrent local sketch.
      Returns:
      an UpdateSketch to be used as a per-thread local buffer.
    • toString

      public String toString()
      Overrides:
      toString in class Object