Class ColGroupUncompressed

  • All Implemented Interfaces:
    Serializable

    public class ColGroupUncompressed
    extends AColGroup
    Column group type for columns that are stored as dense arrays of doubles. Uses a MatrixBlock internally to store the column contents.
    See Also:
    Serialized Form
    • Method Detail

      • create

        public static AColGroup create​(MatrixBlock mb,
                                       IColIndex colIndexes)
        Create an uncompressed column group from a MatrixBlock, where the columns are offset by the column indexes. The size of colIndexes is assumed to match the number of columns in mb.
        Parameters:
        mb - The MatrixBlock holding the data to store in the uncompressed column group
        colIndexes - The column indexes for the group
        Returns:
        An Uncompressed Column group
      • create

        public static AColGroup create​(IColIndex colIndexes,
                                       MatrixBlock rawBlock,
                                       boolean transposed)
        Main constructor for Uncompressed ColGroup.
        Parameters:
        colIndexes - Indices (relative to the current block) of the columns that this column group represents.
        rawBlock - The uncompressed block; uncompressed data must be present at the time that the constructor is called
        transposed - Whether the input raw block has been transposed.
        Returns:
        AColGroup.
      • getColGroupType

        public org.apache.sysds.runtime.compress.colgroup.AColGroup.ColGroupType getColGroupType()
      • getData

        public MatrixBlock getData()
        Access for superclass
        Returns:
        direct pointer to the internal representation of the columns
      • estimateInMemorySize

        public long estimateInMemorySize()
        Description copied from class: AColGroup
        Get an upper-bound estimate of the in-memory allocation for the column group.
        Overrides:
        estimateInMemorySize in class AColGroup
        Returns:
        an upper bound on the number of bytes used to store this ColGroup in memory.
      • decompressToDenseBlock

        public void decompressToDenseBlock​(DenseBlock db,
                                           int rl,
                                           int ru,
                                           int offR,
                                           int offC)
        Description copied from class: AColGroup
        Decompress into the DenseBlock. (no NNZ handling)
        Specified by:
        decompressToDenseBlock in class AColGroup
        Parameters:
        db - Target DenseBlock
        rl - Row to start decompression from
        ru - Row to end decompression at
        offR - Row offset into the target to decompress
        offC - Column offset into the target to decompress
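        The role of rl, ru, offR and offC can be illustrated with plain arrays. This is a minimal sketch, not the SystemDS implementation: the class DenseDecompressSketch and its method are hypothetical, double[][] stands in for DenseBlock and for the group's internal values, and it assumes that ru is exclusive and that values are accumulated into the target.

```java
import java.util.Arrays;

final class DenseDecompressSketch {

    /**
     * Copies rows [rl, ru) of a dense column group into a target array.
     * groupValues[r][j] is the value of row r in the group's j-th column;
     * colIndexes[j] is that column's position in the full matrix.
     */
    static void decompressToDense(double[][] target, double[][] groupValues, int[] colIndexes,
            int rl, int ru, int offR, int offC) {
        for (int r = rl; r < ru; r++) {
            for (int j = 0; j < colIndexes.length; j++) {
                // Assumption: values are added to the target rather than overwriting it.
                target[r + offR][colIndexes[j] + offC] += groupValues[r][j];
            }
        }
    }

    public static void main(String[] args) {
        double[][] target = new double[3][4];
        double[][] group = {{1, 2}, {3, 4}, {5, 6}};
        // Decompress rows 1 and 2, shifting every column one position to the right.
        decompressToDense(target, group, new int[] {0, 2}, 1, 3, 0, 1);
        System.out.println(Arrays.deepToString(target));
    }
}
```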
      • decompressToSparseBlock

        public void decompressToSparseBlock​(SparseBlock ret,
                                            int rl,
                                            int ru,
                                            int offR,
                                            int offC)
        Description copied from class: AColGroup
        Decompress into the SparseBlock. (no NNZ handling) Note that this method is allowed to call append, since it is assumed that the sparse column indexes are sorted afterwards.
        Specified by:
        decompressToSparseBlock in class AColGroup
        Parameters:
        ret - Target SparseBlock
        rl - Row to start decompression from
        ru - Row to end decompression at
        offR - Row offset into the target to decompress
        offC - Column offset into the target to decompress
      • getIdx

        public double getIdx​(int r,
                             int colIdx)
        Description copied from class: AColGroup
        Get the value at a colGroup specific row/column index position.
        Specified by:
        getIdx in class AColGroup
        Parameters:
        r - row
        colIdx - column index in the _colIndexes.
        Returns:
        value at the row/column index position
      • leftMultByMatrixNoPreAgg

        public void leftMultByMatrixNoPreAgg​(MatrixBlock matrix,
                                             MatrixBlock result,
                                             int rl,
                                             int ru,
                                             int cl,
                                             int cu)
        Description copied from class: AColGroup
        Left multiply with this column group.
        Specified by:
        leftMultByMatrixNoPreAgg in class AColGroup
        Parameters:
        matrix - The matrix to multiply with on the left
        result - The result to output the values into; always dense so that the column groups can be processed in parallel
        rl - The row to begin the multiplication from on the lhs matrix
        ru - The row to end the multiplication at on the lhs matrix
        cl - The column to begin the multiplication from on the lhs matrix
        cu - The column to end the multiplication at on the lhs matrix
      • scalarOperation

        public AColGroup scalarOperation​(ScalarOperator op)
        Description copied from class: AColGroup
        Perform the specified scalar operation directly on the compressed column group, without decompressing individual cells if possible.
        Specified by:
        scalarOperation in class AColGroup
        Parameters:
        op - operation to perform
        Returns:
        version of this column group with the operation applied
      • unaryOperation

        public AColGroup unaryOperation​(UnaryOperator op)
        Description copied from class: AColGroup
        Perform unary operation on the column group and return a new column group
        Specified by:
        unaryOperation in class AColGroup
        Parameters:
        op - The operation to perform
        Returns:
        The new column group
      • binaryRowOpLeft

        public AColGroup binaryRowOpLeft​(BinaryOperator op,
                                         double[] v,
                                         boolean isRowSafe)
        Description copied from class: AColGroup
        Perform a binary row operation.
        Specified by:
        binaryRowOpLeft in class AColGroup
        Parameters:
        op - The operation to execute
        v - The vector of values to apply; its length should be at least the highest value in the column index
        isRowSafe - True if applying the binary op to an entirely zero row yields all-zero results
        Returns:
        An updated column group with the new values.
      • binaryRowOpRight

        public AColGroup binaryRowOpRight​(BinaryOperator op,
                                          double[] v,
                                          boolean isRowSafe)
        Description copied from class: AColGroup
        Perform a binary row operation.
        Specified by:
        binaryRowOpRight in class AColGroup
        Parameters:
        op - The operation to execute
        v - The vector of values to apply; its length should be at least the highest value in the column index
        isRowSafe - True if applying the binary op to an entirely zero row yields all-zero results
        Returns:
        An updated column group with the new values.
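        The two binary row operations share one description, so the sketch below illustrates one plausible reading of the difference. It is an assumption, not something stated on this page, that "Left" places the vector v on the left of the operator and "Right" places it on the right. BinaryRowOpSketch and its methods are hypothetical, and a DoubleBinaryOperator stands in for BinaryOperator.

```java
import java.util.Arrays;
import java.util.function.DoubleBinaryOperator;

final class BinaryRowOpSketch {

    /** Assumed semantics of binaryRowOpLeft: result = op(v[col], value). */
    static double[][] rowOpLeft(DoubleBinaryOperator op, double[] v, double[][] values, int[] colIndexes) {
        double[][] out = new double[values.length][colIndexes.length];
        for (int r = 0; r < values.length; r++)
            for (int j = 0; j < colIndexes.length; j++)
                out[r][j] = op.applyAsDouble(v[colIndexes[j]], values[r][j]);
        return out;
    }

    /** Assumed semantics of binaryRowOpRight: result = op(value, v[col]). */
    static double[][] rowOpRight(DoubleBinaryOperator op, double[] v, double[][] values, int[] colIndexes) {
        double[][] out = new double[values.length][colIndexes.length];
        for (int r = 0; r < values.length; r++)
            for (int j = 0; j < colIndexes.length; j++)
                out[r][j] = op.applyAsDouble(values[r][j], v[colIndexes[j]]);
        return out;
    }

    public static void main(String[] args) {
        double[][] values = {{1, 2}, {3, 4}};
        int[] colIndexes = {1, 3};          // the group covers columns 1 and 3 of the full matrix
        double[] v = {10, 20, 30, 40};      // v is indexed by full-matrix column
        DoubleBinaryOperator minus = (a, b) -> a - b;
        System.out.println(Arrays.deepToString(rowOpLeft(minus, v, values, colIndexes)));
        System.out.println(Arrays.deepToString(rowOpRight(minus, v, values, colIndexes)));
    }
}
```

        Because v is indexed through colIndexes, it must be long enough to cover the largest column index in the group, which matches the requirement on v above.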
      • unaryAggregateOperations

        public void unaryAggregateOperations​(AggregateUnaryOperator op,
                                             double[] result,
                                             int nRows,
                                             int rl,
                                             int ru)
        Description copied from class: AColGroup
        Unary aggregate operator. Since aggregate operators require a new output object, the output becomes an uncompressed matrix. The range rl to ru only applies to row aggregates (ReduceCol).
        Specified by:
        unaryAggregateOperations in class AColGroup
        Parameters:
        op - The operator used
        result - The output matrix block
        nRows - The total number of rows in the Column Group
        rl - The starting row to do aggregation from
        ru - The last row to do aggregation to (not included)
      • getExactSizeOnDisk

        public long getExactSizeOnDisk()
        Description copied from class: AColGroup
        Returns the exact serialized size of the column group. This can be used, for example, for buffer preallocation.
        Overrides:
        getExactSizeOnDisk in class AColGroup
        Returns:
        exact serialized size for column group
      • getMin

        public double getMin()
        Description copied from class: AColGroup
        Shorthand method for getting the minimum value contained in this column group.
        Specified by:
        getMin in class AColGroup
        Returns:
        The minimum value contained in this ColumnGroup
      • getMax

        public double getMax()
        Description copied from class: AColGroup
        Shorthand method for getting the maximum value contained in this column group.
        Specified by:
        getMax in class AColGroup
        Returns:
        The maximum value contained in this ColumnGroup
      • getSum

        public double getSum​(int nRows)
        Description copied from class: AColGroup
        Shorthand method for getting the sum of this column group.
        Specified by:
        getSum in class AColGroup
        Parameters:
        nRows - The number of rows in the column group
        Returns:
        The sum of this column group
      • tsmm

        public final void tsmm​(MatrixBlock ret,
                               int nRows)
        Description copied from class: AColGroup
        Perform a transposed self matrix multiplication, t(x) %*% x, using only this column group. This gives better performance because there is no need to iterate through all rows of the matrix; execution can be limited to the number of distinct values. Note that only the upper triangle is calculated.
        Specified by:
        tsmm in class AColGroup
        Parameters:
        ret - The return matrix block [numColumns x numColumns]
        nRows - The number of rows in the column group
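        The upper-triangle restriction can be made concrete with a small dense example. This is a sketch under stated assumptions: TsmmSketch is a hypothetical class, double[][] stands in for MatrixBlock, colIndexes is assumed to be sorted in ascending order, and results are assumed to be added to ret. Since the data here is dense, every row is visited; the shortcut over distinct values mentioned above applies to compressed groups only.

```java
import java.util.Arrays;

final class TsmmSketch {

    /**
     * Adds the upper triangle of t(x) %*% x to ret, where x holds the
     * group's columns and colIndexes maps them to positions in ret.
     */
    static void tsmmUpper(double[][] ret, double[][] x, int[] colIndexes, int nRows) {
        int nCols = colIndexes.length;
        for (int i = 0; i < nCols; i++) {
            // Start at j = i so that only cells on or above the diagonal are written.
            for (int j = i; j < nCols; j++) {
                double sum = 0;
                for (int r = 0; r < nRows; r++)
                    sum += x[r][i] * x[r][j];
                ret[colIndexes[i]][colIndexes[j]] += sum;
            }
        }
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}};
        double[][] ret = new double[3][3];
        tsmmUpper(ret, x, new int[] {0, 2}, 2);
        System.out.println(Arrays.deepToString(ret));
    }
}
```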
      • containsValue

        public boolean containsValue​(double pattern)
        Description copied from class: AColGroup
        Detect if the column group contains a specific value.
        Specified by:
        containsValue in class AColGroup
        Parameters:
        pattern - The value to look for.
        Returns:
        true if the value is contained.
      • getNumberNonZeros

        public long getNumberNonZeros​(int nRows)
        Description copied from class: AColGroup
        Get the number of nonZeros contained in this column group.
        Specified by:
        getNumberNonZeros in class AColGroup
        Parameters:
        nRows - The number of rows in the column group; this is used for groups that do not contain information about how many rows they have.
        Returns:
        The number of nonzeros.
      • leftMultByAColGroup

        public void leftMultByAColGroup​(AColGroup lhs,
                                        MatrixBlock result,
                                        int nRows)
        Description copied from class: AColGroup
        Left side matrix multiplication with a column group that is transposed.
        Specified by:
        leftMultByAColGroup in class AColGroup
        Parameters:
        lhs - The left-hand-side column group to multiply with; it should be considered transposed. It should also be guaranteed that this column group is not empty.
        result - The result matrix to insert the result of the multiplication into
        nRows - Number of rows in the lhs colGroup
      • tsmmAColGroup

        public void tsmmAColGroup​(AColGroup lhs,
                                  MatrixBlock result)
        Description copied from class: AColGroup
        Matrix multiply with this other column group, but: 1. Only output upper triangle values. 2. Multiply both ways with "this" being on the left and on the right. It should be guaranteed that the input is not the same as the caller of the method. The second step is achievable by treating the initial multiplied matrix, and adding its values to the correct locations in the output.
        Specified by:
        tsmmAColGroup in class AColGroup
        Parameters:
        lhs - The other Column group to multiply with
        result - The result matrix to put the results into
      • rightMultByMatrix

        public AColGroup rightMultByMatrix​(MatrixBlock right,
                                           IColIndex allCols)
        Description copied from class: AColGroup
        Right matrix multiplication with this column group. This method can return null, meaning that the output overlapping group would have been empty.
        Specified by:
        rightMultByMatrix in class AColGroup
        Parameters:
        right - The MatrixBlock on the right of this matrix multiplication
        allCols - A pre-materialized list of all column indexes that can be shared across all column groups if useful; can be set to null.
        Returns:
        The new Column Group or null that is the result of the matrix multiplication.
      • getNumValues

        public int getNumValues()
        Description copied from class: AColGroup
        Obtain the number of distinct tuples in the contained sets of values associated with this column group. If the column group is uncompressed, the number of rows is returned.
        Specified by:
        getNumValues in class AColGroup
        Returns:
        the number of distinct sets of values associated with the bitmaps in this column group
      • replace

        public AColGroup replace​(double pattern,
                                 double replace)
        Description copied from class: AColGroup
        Make a copy of the column group values, and replace all values that match pattern with replacement value.
        Specified by:
        replace in class AColGroup
        Parameters:
        pattern - The value to look for
        replace - The value to replace the other value with
        Returns:
        A new Column Group, reusing the index structure but with new values.
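        The copy-then-replace behavior can be sketched on a plain array. ReplaceSketch is a hypothetical class and double[][] stands in for the group's values. Matching a NaN pattern through Double.isNaN is an assumption made here, because NaN never compares equal with ==.

```java
import java.util.Arrays;

final class ReplaceSketch {

    /** Returns a copy of values in which every cell equal to pattern becomes replacement. */
    static double[][] replace(double[][] values, double pattern, double replacement) {
        boolean nanPattern = Double.isNaN(pattern);
        double[][] out = new double[values.length][];
        for (int r = 0; r < values.length; r++) {
            out[r] = values[r].clone(); // copy, so the input stays untouched
            for (int j = 0; j < out[r].length; j++) {
                double v = out[r][j];
                // Assumption: a NaN pattern matches NaN cells.
                boolean match = nanPattern ? Double.isNaN(v) : v == pattern;
                if (match)
                    out[r][j] = replacement;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] values = {{1, 0}, {0, 2}};
        System.out.println(Arrays.deepToString(replace(values, 0, 9)));
    }
}
```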
      • computeColSums

        public void computeColSums​(double[] c,
                                   int nRows)
        Description copied from class: AColGroup
        Compute the column sum
        Specified by:
        computeColSums in class AColGroup
        Parameters:
        c - The array to add the column sum to.
        nRows - The number of rows in the column group.
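        Since c is an array that the sums are added to, it is presumably indexed by full-matrix column. The following sketch shows that accumulation under this assumption; ColSumsSketch is a hypothetical class and double[][] stands in for the group's values.

```java
import java.util.Arrays;

final class ColSumsSketch {

    /** Adds the sum of each group column to c at that column's full-matrix index. */
    static void computeColSums(double[] c, double[][] values, int[] colIndexes, int nRows) {
        for (int r = 0; r < nRows; r++)
            for (int j = 0; j < colIndexes.length; j++)
                c[colIndexes[j]] += values[r][j]; // accumulate, do not overwrite
    }

    public static void main(String[] args) {
        double[] c = {100, 0, 0, 0};
        double[][] values = {{1, 2}, {3, 4}};
        computeColSums(c, values, new int[] {0, 3}, 2);
        System.out.println(Arrays.toString(c));
    }
}
```

        Accumulating rather than overwriting lets several column groups write into the same output array.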
      • centralMoment

        public CM_COV_Object centralMoment​(CMOperator op,
                                           int nRows)
        Description copied from class: AColGroup
        Central Moment instruction executed on a column group.
        Specified by:
        centralMoment in class AColGroup
        Parameters:
        op - The Operator to use.
        nRows - The number of rows contained in the ColumnGroup.
        Returns:
        A Central Moment object.
      • rexpandCols

        public AColGroup rexpandCols​(int max,
                                     boolean ignore,
                                     boolean cast,
                                     int nRows)
        Description copied from class: AColGroup
        Expand the column group to multiple columns. (one hot encode the column group)
        Specified by:
        rexpandCols in class AColGroup
        Parameters:
        max - The number of columns to expand to and cutoff values at.
        ignore - Whether zero and negative values should be ignored.
        cast - Whether the double values contained should be cast to whole numbers.
        nRows - The number of rows in the column group.
        Returns:
        A new column group containing max number of columns.
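        A one-hot expansion of a single column might look like the sketch below. RexpandSketch is a hypothetical class, and several details are assumptions rather than documented behavior: values are treated as 1-based column positions, values above max yield an all-zero row, cast rounds down with Math.floor, and invalid inputs raise an exception.

```java
import java.util.Arrays;

final class RexpandSketch {

    /** One-hot encodes values into max columns; value k sets column k - 1 to 1. */
    static double[][] rexpand(double[] values, int max, boolean ignore, boolean cast) {
        double[][] out = new double[values.length][max];
        for (int r = 0; r < values.length; r++) {
            // Assumption: casting means rounding down to a whole number.
            double v = cast ? Math.floor(values[r]) : values[r];
            if (v != Math.floor(v))
                throw new IllegalArgumentException("Non-whole value without cast: " + values[r]);
            if (v <= 0) {
                if (!ignore)
                    throw new IllegalArgumentException("Zero or negative value: " + values[r]);
                continue; // ignored: the row stays all zero
            }
            if (v > max)
                continue; // cut off: beyond the last output column
            out[r][(int) v - 1] = 1.0;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 3, 2.7, 5, 0};
        System.out.println(Arrays.deepToString(rexpand(values, 3, true, true)));
    }
}
```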
      • getCost

        public double getCost​(ComputationCostEstimator e,
                              int nRows)
        Description copied from class: AColGroup
        Get the computation cost associated with this column group.
        Specified by:
        getCost in class AColGroup
        Parameters:
        e - The computation cost estimator
        nRows - the number of rows in the column group
        Returns:
        The cost of this column group
      • isEmpty

        public boolean isEmpty()
        Description copied from class: AColGroup
        Check whether the group contains only zeros.
        Specified by:
        isEmpty in class AColGroup
        Returns:
        true if empty
      • sliceRows

        public AColGroup sliceRows​(int rl,
                                   int ru)
        Description copied from class: AColGroup
        Slice range of rows out of the column group and return a new column group only containing the row segment. Note that this slice should maintain pointers back to the original dictionaries and only modify index structures.
        Specified by:
        sliceRows in class AColGroup
        Parameters:
        rl - The row to start at
        ru - The row to end at (not included)
        Returns:
        A new column group containing the specified row range.
      • append

        public AColGroup append​(AColGroup g)
        Description copied from class: AColGroup
        Append the other column group to this column group. This method tries to combine them and return a new column group containing both. In some cases this is possible in reasonable time; in others it is not. The result contains this column group first, followed by the other column group at higher row indexes. If combining is not possible or would be very inefficient, null is returned.
        Specified by:
        append in class AColGroup
        Parameters:
        g - The other column group
        Returns:
        A combined column group or null
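        For dense data, appending amounts to stacking rows. The sketch below shows the "this first, then the other at higher rows" ordering and the null return. AppendSketch is a hypothetical class, and treating a column-count mismatch as the "not possible" case is an assumption for illustration.

```java
import java.util.Arrays;

final class AppendSketch {

    /** Stacks second below first; returns null if the column counts differ. */
    static double[][] append(double[][] first, double[][] second) {
        if (first.length > 0 && second.length > 0 && first[0].length != second[0].length)
            return null; // assumption: incompatible groups cannot be combined
        double[][] out = new double[first.length + second.length][];
        for (int r = 0; r < first.length; r++)
            out[r] = first[r].clone();
        for (int r = 0; r < second.length; r++)
            out[first.length + r] = second[r].clone(); // other group lands at higher rows
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}};
        double[][] b = {{3, 4}, {5, 6}};
        System.out.println(Arrays.deepToString(append(a, b)));
    }
}
```

        Callers need to check for null before using the result, as the return description states.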
      • appendNInternal

        public AColGroup appendNInternal​(AColGroup[] g,
                                         int blen,
                                         int rlen)
      • getCompressionScheme

        public ICLAScheme getCompressionScheme()
        Description copied from class: AColGroup
        Get the compression scheme for this column group to enable compression of other data.
        Specified by:
        getCompressionScheme in class AColGroup
        Returns:
        The compression scheme of this column group
      • recompress

        public AColGroup recompress()
        Description copied from class: AColGroup
        Recompress this column group into a new column group.
        Specified by:
        recompress in class AColGroup
        Returns:
        A new or the same column group depending on optimization goal.
      • getCompressionInfo

        public CompressedSizeInfoColGroup getCompressionInfo​(int nRow)
        Description copied from class: AColGroup
        Get the compression info for this column group.
        Specified by:
        getCompressionInfo in class AColGroup
        Parameters:
        nRow - The number of rows in this column group.
        Returns:
        The compression info for this group.
      • copyAndSet

        public AColGroup copyAndSet​(IColIndex colIndexes)
        Description copied from class: AColGroup
        Copy the content of the column group, keeping pointers to the previous content but using the given new column indexes. Note that this method does not verify whether the specified colIndexes are valid and have the correct dimensions for the underlying column groups.
        Specified by:
        copyAndSet in class AColGroup
        Parameters:
        colIndexes - the new indexes to use in the copy
        Returns:
        a new object with pointers to underlying data.