distinctCount
static int distinctCount(long numVals,
int[] freqCounts,
long nRows,
long sampleSize)
Estimates the number of distinct values of an attribute from a sample, following
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, and Lynne Stokes. Sampling-Based Estimation of the Number of
Distinct Values of an Attribute. VLDB'95, Section 3.2.
- Parameters:
numVals
- The number of unique values in the sample
freqCounts
- The inverse histogram of frequency counts extracted from the sample
nRows
- The original number of rows in the entire input
sampleSize
- The number of rows in the sample
- Returns:
- An estimate of the number of distinct values.
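
The parameter descriptions above can be made concrete with a small sketch of how the sample-side inputs might be prepared. It is not taken from the library: the class `FreqCountsSketch` and its helper methods are hypothetical, and the layout of `freqCounts` (entry `i - 1` holds the number of distinct values that occur exactly `i` times in the sample) is an assumption, since the documentation does not state the indexing convention.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class FreqCountsSketch {

    // Counts how often each distinct value occurs in the sample.
    static Map<Long, Integer> occurrences(long[] sample) {
        Map<Long, Integer> occ = new HashMap<>();
        for (long v : sample)
            occ.merge(v, 1, Integer::sum);
        return occ;
    }

    // Builds the inverse histogram.
    // ASSUMED layout: result[i - 1] = number of distinct values occurring exactly i times.
    static int[] freqCounts(long[] sample) {
        Map<Long, Integer> occ = occurrences(sample);
        int maxFreq = 0;
        for (int f : occ.values())
            maxFreq = Math.max(maxFreq, f);
        int[] counts = new int[maxFreq];
        for (int f : occ.values())
            counts[f - 1]++;
        return counts;
    }

    public static void main(String[] args) {
        long[] sample = {7, 7, 7, 3, 3, 9, 4};
        long numVals = occurrences(sample).size(); // distinct values in the sample
        int[] fc = freqCounts(sample);             // inverse histogram of the sample
        long sampleSize = sample.length;           // rows in the sample
        // Prints: 4 [2, 1, 1] 7
        System.out.println(numVals + " " + Arrays.toString(fc) + " " + sampleSize);
    }
}
```

Under this assumed layout, the entries of `freqCounts` sum to `numVals`, and the sum of `(i + 1) * freqCounts[i]` over all indices equals `sampleSize`. Both identities are useful sanity checks before calling the estimator; `nRows` comes from the full input rather than from the sample.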