Interface ComEstFactory
-
public interface ComEstFactory
-
-
Field Summary
Fields
Modifier and Type: static org.apache.commons.logging.Log
Field: LOG
-
Method Summary
Static Methods
static AComEst createEstimator(MatrixBlock data, CompressionSettings cs, int k)
Create an estimator for the input data with the given settings and parallelization degree.
static AComEst createEstimator(MatrixBlock data, CompressionSettings cs, int sampleSize, int k)
Create an estimator for the input data with the given settings and parallelization degree.
static int getSampleSize(double samplePower, int nRows, int nCols, double sparsity, int minSampleSize, int maxSampleSize)
This function returns the sample size to use.
-
-
-
Method Detail
-
createEstimator
static AComEst createEstimator(MatrixBlock data, CompressionSettings cs, int k)
Create an estimator for the input data with the given settings and parallelization degree.
Parameters:
data - The matrix to extract compression information from.
cs - The settings for the compression.
k - The parallelization degree.
Returns:
A new compression size estimator (an AComEst) used to extract information about column groups.
-
createEstimator
static AComEst createEstimator(MatrixBlock data, CompressionSettings cs, int sampleSize, int k)
Create an estimator for the input data with the given settings and parallelization degree.
Parameters:
data - The matrix to extract compression information from.
cs - The settings for the compression.
sampleSize - The number of rows to sample from the input data when extracting information.
k - The parallelization degree.
Returns:
A new compression size estimator (an AComEst) used to extract information about column groups.
-
getSampleSize
static int getSampleSize(double samplePower, int nRows, int nCols, double sparsity, int minSampleSize, int maxSampleSize)
This function returns the sample size to use. The sample size is bounded by the minimum and maximum sample sizes. It is calculated from a power of the number of rows and a sampling fraction.
Parameters:
samplePower - The sample power.
nRows - The number of rows.
nCols - The number of columns.
sparsity - The sparsity of the input.
minSampleSize - The minimum sample size.
maxSampleSize - The maximum sample size.
Returns:
The sample size to use.
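The description above can be illustrated with a minimal sketch of a power-based, clamped sample size. This is not the library's implementation: the class name SampleSizeSketch, the method sampleSize, and the exact formula (the ceiling of nRows raised to samplePower) are assumptions made for illustration. The sketch leaves out the nCols and sparsity parameters and the sampling fraction, because the documentation does not say how they enter the calculation.

```java
// Hypothetical illustration only; not the ComEstFactory.getSampleSize implementation.
final class SampleSizeSketch {

    // Assumed formula: ceil(nRows^samplePower), clamped to [minSampleSize, maxSampleSize],
    // and never larger than the number of rows that actually exist.
    static int sampleSize(double samplePower, int nRows, int minSampleSize, int maxSampleSize) {
        // A power below 1.0 makes the sample grow sub-linearly with the row count.
        int fromPower = (int) Math.ceil(Math.pow(nRows, samplePower));
        // Apply the minimum and maximum bounds.
        int clamped = Math.max(minSampleSize, Math.min(maxSampleSize, fromPower));
        // A sample cannot contain more rows than the input has.
        return Math.min(clamped, nRows);
    }

    public static void main(String[] args) {
        // With these bounds the power-based value falls below the minimum,
        // so the minimum sample size is used.
        System.out.println(sampleSize(0.5, 1_000_000, 2000, 10_000));
    }
}
```

Under these assumptions, a small input (fewer rows than the minimum sample size) is sampled in full, and a very large input is capped at the maximum sample size.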
-
-