Class BinomialBoundsN

java.lang.Object
org.apache.datasketches.thetacommon.BinomialBoundsN

public final class BinomialBoundsN extends Object
This class estimates error bounds given a sample set size, the sampling probability theta, the number of standard deviations, and a noDataSeen flag. It can be used to estimate error bounds for fixed-threshold sampling as well as for sketches.
Author:
Kevin Lang
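
To make the relationship between the inputs concrete, here is a minimal, self-contained sketch that derives bounds from the same three quantities (sample count, theta, number of standard deviations) using the textbook normal approximation to the binomial distribution. It is an illustration only: the class NormalApproxBounds and its methods are hypothetical, are not part of the library, and should not be expected to reproduce the values returned by BinomialBoundsN, whose computation may differ.

```java
/** Hypothetical illustration; NOT the library's algorithm. */
final class NormalApproxBounds {

    private NormalApproxBounds() {}

    /** Point estimate of the population size: numSamples / theta. */
    static double estimate(long numSamples, double theta) {
        return numSamples / theta;
    }

    /** Approximate standard deviation of the estimate: sqrt(n * (1 - theta)) / theta. */
    static double stdDev(long numSamples, double theta) {
        return Math.sqrt(numSamples * (1.0 - theta)) / theta;
    }

    static double lowerBound(long numSamples, double theta, int numSDev) {
        check(numSamples, theta, numSDev);
        double lb = estimate(numSamples, theta) - numSDev * stdDev(numSamples, theta);
        // The population cannot be smaller than the number of samples retained.
        return Math.max((double) numSamples, lb);
    }

    static double upperBound(long numSamples, double theta, int numSDev) {
        check(numSamples, theta, numSDev);
        return estimate(numSamples, theta) + numSDev * stdDev(numSamples, theta);
    }

    // Assumed argument ranges, mirroring the parameter descriptions on this page.
    private static void check(long numSamples, double theta, int numSDev) {
        if (numSamples < 0) {
            throw new IllegalArgumentException("numSamples must be >= 0");
        }
        if (!(theta > 0.0 && theta <= 1.0)) {
            throw new IllegalArgumentException("theta must be in (0, 1]");
        }
        if (numSDev < 1 || numSDev > 3) {
            throw new IllegalArgumentException("numSDev must be 1, 2 or 3");
        }
    }

    public static void main(String[] args) {
        long n = 1000;
        double theta = 0.25;
        System.out.println("estimate = " + estimate(n, theta));
        System.out.println("lower(2) = " + lowerBound(n, theta, 2));
        System.out.println("upper(2) = " + upperBound(n, theta, 2));
    }
}
```

With theta = 1.0 nothing has been sampled away, so in this sketch both bounds collapse to the sample count itself.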
  • Method Summary

    Modifier and Type   Method                                                                          Description
    static double       getLowerBound(long numSamples, double theta, int numSDev, boolean noDataSeen)   Returns the approximate lower bound value
    static double       getUpperBound(long numSamples, double theta, int numSDev, boolean noDataSeen)   Returns the approximate upper bound value

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • getLowerBound

      public static double getLowerBound(long numSamples, double theta, int numSDev, boolean noDataSeen)
      Returns the approximate lower bound value
      Parameters:
      numSamples - the number of samples in the sample set
      theta - the sampling probability
      numSDev - the number of "standard deviations" from the mean for the tail bounds. This must be an integer value of 1, 2 or 3.
      noDataSeen - normally false. When there are zero samples and theta < 1.0, this flag distinguishes the initial case, in which no data has been seen at all, from the case in which the estimate may be zero but an upper error bound may still exist.
      Returns:
      the approximate lower bound value
    • getUpperBound

      public static double getUpperBound(long numSamples, double theta, int numSDev, boolean noDataSeen)
      Returns the approximate upper bound value
      Parameters:
      numSamples - the number of samples in the sample set
      theta - the sampling probability
      numSDev - the number of "standard deviations" from the mean for the tail bounds. This must be an integer value of 1, 2 or 3.
      noDataSeen - normally false. When there are zero samples and theta < 1.0, this flag distinguishes the initial case, in which no data has been seen at all, from the case in which the estimate may be zero but an upper error bound may still exist.
      Returns:
      the approximate upper bound value
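
The noDataSeen description above says that an upper bound can exist even when the estimate is zero. A small sketch can show why: if N items were offered and each was retained with probability theta, the chance of retaining none is (1 - theta)^N, so any N for which that chance is still at least some tail probability delta cannot be ruled out. The class ZeroSampleUpperBound below is hypothetical and is not the library's implementation; the mapping from numSDev to a one-sided normal tail probability and the choice to return 0.0 when noDataSeen is true are assumptions made for this illustration.

```java
/** Hypothetical illustration of a zero-sample upper bound; NOT the library's code. */
final class ZeroSampleUpperBound {

    // Assumed one-sided normal tail probabilities for 1, 2 and 3 standard deviations.
    private static final double[] TAIL = {0.158655, 0.022750, 0.001350};

    private ZeroSampleUpperBound() {}

    /**
     * Upper bound on the population size when zero samples were retained.
     * Solves (1 - theta)^N >= delta for the largest N.
     */
    static double upperBound(double theta, int numSDev, boolean noDataSeen) {
        if (noDataSeen) {
            return 0.0; // assumption: nothing was offered, so there is nothing to bound
        }
        if (!(theta > 0.0 && theta <= 1.0)) {
            throw new IllegalArgumentException("theta must be in (0, 1]");
        }
        if (numSDev < 1 || numSDev > 3) {
            throw new IllegalArgumentException("numSDev must be 1, 2 or 3");
        }
        if (theta == 1.0) {
            return 0.0; // every offered item would have been retained
        }
        double delta = TAIL[numSDev - 1];
        // log1p(-theta) computes log(1 - theta) accurately for small theta.
        return Math.log(delta) / Math.log1p(-theta);
    }

    public static void main(String[] args) {
        System.out.println(upperBound(0.5, 2, false)); // data seen, none retained
        System.out.println(upperBound(0.5, 2, true));  // no data seen at all
    }
}
```

In this sketch the two calls in main differ only in the flag: with data seen but nothing retained the bound is positive, while with no data seen it is zero.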