Class EstimatorSample


  • public class EstimatorSample
    extends SparsityEstimator
    This estimator implements an approach based on row/column sampling, as described in:

    Yongyang Yu, MingJie Tang, Walid G. Aref, Qutaibah M. Malluhi, Mostafa M. Abbas, Mourad Ouzzani: In-Memory Distributed Matrix Computation Processing and Optimization. ICDE 2017: 1047-1058

    The basic idea is to draw random samples of aligned columns SA and rows SB, and to compute the output nnz (number of non-zeros) as max(nnz(SA_i)*nnz(SB_i)). However, this estimator is biased toward underestimation because the maximum is unlikely to be sampled and collisions are not accounted for. Accordingly, an extended estimator is also supported; it relies on ideas similar to those that the other estimators use for element-wise addition.
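
    The basic (non-extended) idea above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the class name SampleNnzSketch, the method estimateNnz, the dense double[][] inputs, and the seed parameter are all assumptions made for illustration only.

```java
import java.util.Random;

// Hypothetical sketch of the sampling idea; not part of the documented API.
class SampleNnzSketch {

    /** Counts the non-zeros in column j of a. */
    static int nnzCol(double[][] a, int j) {
        int count = 0;
        for (double[] row : a) {
            if (row[j] != 0) count++;
        }
        return count;
    }

    /** Counts the non-zeros in row i of b. */
    static int nnzRow(double[][] b, int i) {
        int count = 0;
        for (double v : b[i]) {
            if (v != 0) count++;
        }
        return count;
    }

    /**
     * Estimates nnz(A %*% B) as the maximum of nnz(A[:,i]) * nnz(B[i,:])
     * over a random sample of aligned indices i (assumed sampling scheme:
     * without replacement, at least one index when k > 0).
     */
    static long estimateNnz(double[][] a, double[][] b, double sampleFrac, long seed) {
        int k = b.length; // common dimension: columns of A, rows of B
        int sampleSize = Math.min(k, Math.max(1, (int) Math.ceil(sampleFrac * k)));

        int[] idx = new int[k];
        for (int i = 0; i < k; i++) idx[i] = i;

        Random rnd = new Random(seed);
        long max = 0;
        // Partial Fisher-Yates shuffle: position s receives a uniformly
        // chosen index that has not been sampled yet.
        for (int s = 0; s < sampleSize; s++) {
            int r = s + rnd.nextInt(k - s);
            int tmp = idx[s]; idx[s] = idx[r]; idx[r] = tmp;
            int i = idx[s];
            max = Math.max(max, (long) nnzCol(a, i) * nnzRow(b, i));
        }
        return max;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 0}, {1, 1}, {0, 0}}; // 3 x 2
        double[][] b = {{1, 1, 0}, {0, 1, 0}};   // 2 x 3
        // With sampleFrac = 1.0 every aligned pair is inspected.
        System.out.println(estimateNnz(a, b, 1.0, 42L));
    }
}
```

    With a sample fraction below 1.0, the sketch may miss the index pair that yields the maximum, which is the source of the underestimation bias noted above.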
    • Constructor Detail

      • EstimatorSample

        public EstimatorSample()
      • EstimatorSample

        public EstimatorSample(double sampleFrac)
      • EstimatorSample

        public EstimatorSample(double sampleFrac,
                               boolean extended)