Class MatrixBlock

    • Field Detail

      • ULTRA_SPARSITY_TURN_POINT

        public static final double ULTRA_SPARSITY_TURN_POINT
        See Also:
        Constant Field Values
      • ULTRA_SPARSITY_TURN_POINT2

        public static final double ULTRA_SPARSITY_TURN_POINT2
        See Also:
        Constant Field Values
      • DEFAULT_INPLACE_SPARSEBLOCK

        public static final SparseBlock.Type DEFAULT_INPLACE_SPARSEBLOCK
      • MAX_SHALLOW_SERIALIZE_OVERHEAD

        public static final double MAX_SHALLOW_SERIALIZE_OVERHEAD
        See Also:
        Constant Field Values
      • CONVERT_MCSR_TO_CSR_ON_DEEP_SERIALIZE

        public static final boolean CONVERT_MCSR_TO_CSR_ON_DEEP_SERIALIZE
        See Also:
        Constant Field Values
    • Constructor Detail

      • MatrixBlock

        public MatrixBlock()
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           boolean sp)
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           long estnnz)
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           boolean sp,
                           long estnnz)
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           boolean sp,
                           long estnnz,
                           boolean dedup)
      • MatrixBlock

        public MatrixBlock​(MatrixBlock that)
      • MatrixBlock

        public MatrixBlock​(MatrixBlock that,
                           boolean sp)
      • MatrixBlock

        public MatrixBlock​(double val)
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           double val)
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           long nnz,
                           SparseBlock sblock)
        Constructs a sparse MatrixBlock with a given instance of a SparseBlock
        Parameters:
        rl - number of rows
        cl - number of columns
        nnz - number of non zeroes
        sblock - sparse block
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           DenseBlock dBlock)
      • MatrixBlock

        public MatrixBlock​(int rl,
                           int cl,
                           double[] vals)
    • Method Detail

      • reset

        public final void reset​(int rl,
                                int cl)
        Specified by:
        reset in class MatrixValue
      • reset

        public final void reset​(int rl,
                                int cl,
                                long estnnz)
      • reset

        public final void reset​(int rl,
                                int cl,
                                boolean sp)
        Specified by:
        reset in class MatrixValue
      • reset

        public final void reset​(int rl,
                                int cl,
                                boolean sp,
                                long estnnz)
        Specified by:
        reset in class MatrixValue
      • reset

        public final void reset​(int rl,
                                int cl,
                                double val)
        Specified by:
        reset in class MatrixValue
      • reset

        public void reset​(int rl,
                          int cl,
                          boolean sp,
                          long estnnz,
                          double val)
        Internal canonical reset of dense and sparse matrix blocks.
        Parameters:
        rl - number of rows
        cl - number of columns
        sp - sparse representation
        estnnz - estimated number of non-zeros
        val - initialization value
      • reset

        public void reset​(int rl,
                          int cl,
                          boolean sp,
                          long estnnz,
                          double val,
                          boolean dedup)
      • init

        public void init​(double[][] arr,
                         int r,
                         int c)
        NOTE: This method is designed only for dense representation.
        Parameters:
        arr - 2d double array matrix
        r - number of rows
        c - number of columns
      • init

        public void init​(double[] arr,
                         int r,
                         int c)
        NOTE: This method is designed only for dense representation.
        Parameters:
        arr - double array matrix
        r - number of rows
        c - number of columns
      • isAllocated

        public boolean isAllocated()
      • allocateDenseBlock

        public final MatrixBlock allocateDenseBlock()
      • allocateBlock

        public final MatrixBlock allocateBlock()
      • allocateDenseBlock

        public boolean allocateDenseBlock​(boolean clearNNZ)
      • allocateDenseBlock

        public boolean allocateDenseBlock​(boolean clearNNZ,
                                          boolean containsDuplicates)
      • allocateSparseRowsBlock

        public final boolean allocateSparseRowsBlock()
      • allocateSparseRowsBlock

        public boolean allocateSparseRowsBlock​(boolean clearNNZ)
      • allocateAndResetSparseBlock

        public void allocateAndResetSparseBlock​(boolean clearNNZ,
                                                SparseBlock.Type stype)
      • allocateDenseBlockUnsafe

        public final void allocateDenseBlockUnsafe​(int rl,
                                                   int cl)
        This should be called only in the read and write functions for CP. It must be called before any call to setValueDenseUnsafe().
        Parameters:
        rl - number of rows
        cl - number of columns
      • cleanupBlock

        public final void cleanupBlock​(boolean dense,
                                       boolean sparse)
        Cleans up all previously allocated sparse rows or dense blocks. This is required, for example, when reading a matrix with many empty blocks via distributed cache into an in-memory list of blocks: without cleanup, the buffers retained from non-empty blocks would significantly increase the total memory consumption.
        Parameters:
        dense - if true, set dense block to null
        sparse - if true, set sparse block to null
      • setNumRows

        public final void setNumRows​(int r)
        NOTE: setNumRows() and setNumColumns() are used only in ternaryInstruction (for contingency tables) and pmm for meta corrections.
        Parameters:
        r - number of rows
      • setNumColumns

        public final void setNumColumns​(int c)
      • setNonZeros

        public final long setNonZeros​(long nnz)
      • setAllNonZeros

        public final long setAllNonZeros()
      • getSparsity

        public final double getSparsity()
      • getLength

        public final long getLength()
        Get the total number of cells in this MatrixBlock. This value can be misused (intentionally) for parallelization.
        Returns:
        The number of cells in this matrix.
      • isEmptyBlock

        public final boolean isEmptyBlock()
      • isEmptyBlock

        public boolean isEmptyBlock​(boolean safe)
        Checks whether this MatrixBlock is an empty block. The call can potentially trigger a recomputation of the non-zero count if that count is unknown.
        Parameters:
        safe - true to ensure that the non-zeros are counted if the nnz is unknown.
        Returns:
        If the block is empty.
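
        The interplay between an unknown non-zero count and the safe flag can be illustrated with a small stand-alone sketch. The class EmptyCheckSketch, its fields, and the use of -1 as an "unknown" marker are assumptions made for illustration; they are not taken from the MatrixBlock implementation.

```java
/** Hypothetical stand-in for a dense block with a cached non-zero count. */
final class EmptyCheckSketch {
    private final double[] values; // row-major cell values
    private long nnz = -1;         // assumption: -1 marks an unknown count

    EmptyCheckSketch(double[] values) {
        this.values = values;
    }

    /** Scans all cells and caches the number of non-zero values. */
    long recomputeNonZeros() {
        long count = 0;
        for (double v : values)
            if (v != 0)
                count++;
        nnz = count;
        return nnz;
    }

    /** With safe=true, an unknown count is recomputed before deciding. */
    boolean isEmpty(boolean safe) {
        if (safe && nnz < 0)
            recomputeNonZeros();
        return nnz == 0;
    }

    public static void main(String[] args) {
        EmptyCheckSketch b = new EmptyCheckSketch(new double[] {0, 0, 0});
        System.out.println(b.isEmpty(false)); // count unknown, so not reported as empty
        System.out.println(b.isEmpty(true));  // count recomputed first
    }
}
```

        In this sketch the unsafe check is cheap but may report a block as non-empty although it holds only zeros, whereas the safe check pays for one full scan.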
      • getDenseBlock

        public DenseBlock getDenseBlock()
      • setDenseBlock

        public void setDenseBlock​(DenseBlock dblock)
      • getDenseBlockValues

        public double[] getDenseBlockValues()
      • setSparseBlock

        public void setSparseBlock​(SparseBlock sblock)
      • getSparseBlockIterator

        public Iterator<IJV> getSparseBlockIterator()
      • getSparseBlockIterator

        public Iterator<IJV> getSparseBlockIterator​(int rl,
                                                    int ru)
      • get

        public double get​(int r,
                          int c)
        Specified by:
        get in class MatrixValue
      • set

        public void set​(int r,
                        int c,
                        double v)
        Specified by:
        set in class MatrixValue
      • setRow

        public void setRow​(int r,
                           double[] values)
      • containsValue

        public boolean containsValue​(double pattern)
      • containsValue

        public boolean containsValue​(double pattern,
                                     int k)
      • appendValue

        public void appendValue​(int r,
                                int c,
                                double v)

        Appends a value. This is only used when values are appended at the end of each row of the sparse representation.

        This can only be called when the caller knows the access pattern of the block.
        Parameters:
        r - row
        c - column
        v - value
      • appendValuePlain

        public void appendValuePlain​(int r,
                                     int c,
                                     double v)
      • appendRow

        public void appendRow​(int r,
                              SparseRow row)
      • appendRow

        public void appendRow​(int r,
                              SparseRow row,
                              boolean deep)
      • appendToSparse

        public void appendToSparse​(MatrixBlock that,
                                   int rowoffset,
                                   int coloffset)
      • appendRowToSparse

        public void appendRowToSparse​(SparseBlock dest,
                                      MatrixBlock src,
                                      int i,
                                      int rowoffset,
                                      int coloffset,
                                      boolean deep)
      • sortSparseRows

        public void sortSparseRows()
        Sorts all existing sparse rows by column indexes.
      • sortSparseRows

        public void sortSparseRows​(int rl,
                                   int ru)
        Sorts all existing sparse rows in range [rl,ru) by column indexes.
        Parameters:
        rl - row lower bound, inclusive
        ru - row upper bound, exclusive
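
        As a rough illustration of sorting sparse rows by column index over a half-open row range [rl,ru), the following sketch uses a hypothetical Row type with parallel index and value arrays. The type, the insertion sort, and the use of null for empty rows are assumptions for this example and do not describe how SparseBlock stores or sorts rows.

```java
import java.util.Arrays;

final class SortSparseRowsSketch {
    /** Hypothetical sparse row: parallel arrays of column indexes and values. */
    static final class Row {
        final int[] cols;
        final double[] vals;

        Row(int[] cols, double[] vals) {
            this.cols = cols;
            this.vals = vals;
        }
    }

    /** Sorts one row by column index with an insertion sort, moving values along. */
    static void sortRow(Row row) {
        for (int i = 1; i < row.cols.length; i++) {
            int c = row.cols[i];
            double v = row.vals[i];
            int j = i - 1;
            while (j >= 0 && row.cols[j] > c) {
                row.cols[j + 1] = row.cols[j];
                row.vals[j + 1] = row.vals[j];
                j--;
            }
            row.cols[j + 1] = c;
            row.vals[j + 1] = v;
        }
    }

    /** Sorts rows in [rl, ru); null entries stand for empty rows and are skipped. */
    static void sortRows(Row[] rows, int rl, int ru) {
        for (int r = rl; r < ru; r++)
            if (rows[r] != null)
                sortRow(rows[r]);
    }

    public static void main(String[] args) {
        Row[] rows = {
            new Row(new int[] {2, 0, 1}, new double[] {30, 10, 20}),
            null
        };
        sortRows(rows, 0, rows.length);
        System.out.println(Arrays.toString(rows[0].cols));
        System.out.println(Arrays.toString(rows[0].vals));
    }
}
```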
      • minNonZero

        public double minNonZero()
        Utility function for computing the min non-zero value.
        Returns:
        minimum non-zero value
      • prod

        public double prod()
        Wrapper method for reduceall-product of a matrix.
        Returns:
        the product of all values in the matrix
      • mean

        public double mean()
        Wrapper method for reduceall-mean of a matrix.
        Returns:
        the mean value of all values in the matrix
      • mean

        public double mean​(int k)
      • min

        public double min()
        Wrapper method for reduceall-min of a matrix.
        Returns:
        the minimum value of all values in the matrix
      • min

        public double min​(int k)
      • colMin

        public final MatrixBlock colMin()
        Wrapper method for reduceall-colMin of a matrix.
        Returns:
        A new MatrixBlock containing the column mins of this matrix
      • colMax

        public final MatrixBlock colMax()
        Wrapper method for reduceall-colMax of a matrix.
        Returns:
        A new MatrixBlock containing the column maximums of this matrix
      • max

        public double max()
        Wrapper method for reduceall-max of a matrix.
        Returns:
        the maximum value of all values in the matrix
      • max

        public MatrixBlock max​(int k)
        Wrapper method for reduceall-max of a matrix.
        Parameters:
        k - the parallelization degree
        Returns:
        the maximum value of all values in the matrix
      • sum

        public double sum()
        Wrapper method for reduceall-sum of a matrix.
        Returns:
        Sum of the values in the matrix.
      • sum

        public MatrixBlock sum​(int k)
        Wrapper method for parallel reduceall-sum of a matrix.
        Parameters:
        k - parallelization degree
        Returns:
        Sum of the values in the matrix.
      • colSum

        public MatrixBlock colSum()
        Wrapper method for single threaded reduceall-colSum of a matrix.
        Returns:
        A new MatrixBlock containing the column sums of this matrix.
      • rowSum

        public final MatrixBlock rowSum()
        Wrapper method for single threaded reduceall-rowSum of a matrix.
        Returns:
        A new MatrixBlock containing the row sums of this matrix.
      • rowSum

        public final MatrixBlock rowSum​(int k)
        Wrapper method for multi threaded reduceall-rowSum of a matrix.
        Parameters:
        k - the number of threads allowed to be used.
        Returns:
        A new MatrixBlock containing the row sums of this matrix.
      • sumSq

        public double sumSq()
        Wrapper method for reduceall-sumSq of a matrix.
        Returns:
        Sum of the squared values in the matrix.
      • isInSparseFormat

        public boolean isInSparseFormat()
        Returns the current representation (true for sparse).
        Specified by:
        isInSparseFormat in class MatrixValue
        Returns:
        true if sparse
      • isUltraSparse

        public boolean isUltraSparse()
      • isUltraSparse

        public boolean isUltraSparse​(boolean checkNnz)
      • isSparsePermutationMatrix

        public boolean isSparsePermutationMatrix()
      • evalSparseFormatInMemory

        public boolean evalSparseFormatInMemory()
        Evaluates if this matrix block should be in sparse format in memory. Note that this call does not change the representation; to change it, call examSparsity.
        Returns:
        true if matrix block should be in sparse format in memory
      • evalSparseFormatInMemory

        public boolean evalSparseFormatInMemory​(boolean allowCSR)
      • evalSparseFormatOnDisk

        public boolean evalSparseFormatOnDisk()
        Evaluates if this matrix block should be in sparse format on disk. This applies to any serialized matrix representation, i.e., when writing to in-memory buffer pool pages or writing to local fs or hdfs.
        Returns:
        true if matrix block should be in sparse format on disk
      • examSparsity

        public final void examSparsity()
        Evaluates if this matrix block should be in sparse format in memory. Depending on the current representation, the matrix block is converted to the appropriate representation if necessary. Note that, for the duration of the call, this consumes memory for both representations. CSR format is allowed by default for this operation.
      • examSparsity

        public final void examSparsity​(int k)
        Evaluates if this matrix block should be in sparse format in memory. Depending on the current representation, the matrix block is converted to the appropriate representation if necessary. Note that, for the duration of the call, this consumes memory for both representations. CSR format is allowed by default for this operation.
        Parameters:
        k - parallelization degree
      • examSparsity

        public final void examSparsity​(boolean allowCSR)
        Evaluates if this matrix block should be in sparse format in memory. Depending on the current representation, the matrix block is converted to the appropriate representation if necessary. Note that, for the duration of the call, this consumes memory for both representations.
        Parameters:
        allowCSR - allow CSR format on dense to sparse conversion
      • examSparsity

        public void examSparsity​(boolean allowCSR,
                                 int k)
        Evaluates if this matrix block should be in sparse format in memory. Depending on the current representation, the matrix block is converted to the appropriate representation if necessary. Note that, for the duration of the call, this consumes memory for both representations.
        Parameters:
        allowCSR - allow CSR format on dense to sparse conversion
        k - parallelization degree
      • evalSparseFormatInMemory

        public static boolean evalSparseFormatInMemory​(DataCharacteristics dc)
      • evalSparseFormatInMemory

        public static boolean evalSparseFormatInMemory​(long nrows,
                                                       long ncols,
                                                       long nnz)
        Evaluates if a matrix block with the given characteristics should be in sparse format in memory.
        Parameters:
        nrows - number of rows
        ncols - number of columns
        nnz - number of non-zeros
        Returns:
        true if matrix block should be in sparse format in memory
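
        The page does not state the decision rule, so the following is only a sketch of one plausible shape for such a check: compare the sparsity nnz / (nrows * ncols) against a turn point and require that a sparse layout is estimated to be smaller than the dense one. The threshold value, the byte estimates, and all names below are assumptions for illustration, not values or logic taken from MatrixBlock.

```java
final class SparseFormatSketch {
    /** Assumed threshold for illustration; not read from MatrixBlock. */
    static final double HYPOTHETICAL_TURN_POINT = 0.4;

    /** Dense estimate: one 8-byte double per cell. */
    static long denseBytes(long nrows, long ncols) {
        return 8L * nrows * ncols;
    }

    /** CSR-like estimate: 8-byte value plus 4-byte column index per non-zero, plus row pointers. */
    static long csrBytes(long nrows, long nnz) {
        return 12L * nnz + 4L * (nrows + 1);
    }

    /** Prefers sparse only if the matrix is sparse enough and the sparse layout is smaller. */
    static boolean preferSparse(long nrows, long ncols, long nnz) {
        if (nrows <= 0 || ncols <= 0)
            return false;
        double sparsity = (double) nnz / nrows / ncols;
        return sparsity < HYPOTHETICAL_TURN_POINT
            && csrBytes(nrows, nnz) < denseBytes(nrows, ncols);
    }

    public static void main(String[] args) {
        System.out.println(preferSparse(1000, 1000, 1000));   // very few non-zeros
        System.out.println(preferSparse(1000, 1000, 900000)); // mostly non-zero
    }
}
```

        The second condition matters for narrow matrices: with a single column, the per-row overhead of the sparse layout can outweigh the savings even when the sparsity is below the turn point.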
      • evalSparseFormatInMemory

        public static boolean evalSparseFormatInMemory​(long nrows,
                                                       long ncols,
                                                       long nnz,
                                                       boolean allowCSR)
      • evalSparseFormatOnDisk

        public static boolean evalSparseFormatOnDisk​(long nrows,
                                                     long ncols,
                                                     long nnz)
        Evaluates if a matrix block with the given characteristics should be in sparse format on disk (or in any other serialized representation).
        Parameters:
        nrows - number of rows
        ncols - number of columns
        nnz - number of non-zeros
        Returns:
        true if matrix block should be in sparse format on disk
      • denseToSparse

        public final void denseToSparse()
      • denseToSparse

        public final void denseToSparse​(boolean allowCSR)
      • denseToSparse

        public void denseToSparse​(boolean allowCSR,
                                  int k)
      • sparseToDense

        public final void sparseToDense()
      • sparseToDense

        public void sparseToDense​(int k)
      • recomputeNonZeros

        public long recomputeNonZeros()
        Recomputes and materializes the number of non-zero values of the entire matrix block.
        Returns:
        number of non-zeros
      • recomputeNonZeros

        public long recomputeNonZeros​(int k)
        Recomputes the number of non-zero values in parallel.
        Parameters:
        k - the parallelization degree
        Returns:
        the number of non zeros
      • recomputeNonZeros

        public long recomputeNonZeros​(int rl,
                                      int ru)
        Recomputes the number of non-zero values of a specified range of the matrix block. NOTE: This call does not materialize the computed result in any form.
        Parameters:
        rl - row lower index, 0-based, inclusive
        ru - row upper index, 0-based, inclusive
        Returns:
        the number of non-zero values
      • recomputeNonZeros

        public long recomputeNonZeros​(int rl,
                                      int ru,
                                      int cl,
                                      int cu)
        Recomputes the number of non-zero values of a specified range of the matrix block. NOTE: This call does not materialize the computed result in any form.
        Parameters:
        rl - row lower index, 0-based, inclusive
        ru - row upper index, 0-based, inclusive
        cl - column lower index, 0-based, inclusive
        cu - column upper index, 0-based, inclusive
        Returns:
        the number of non-zero values
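
        To make the inclusive bounds concrete, here is a minimal sketch that counts non-zeros in a rectangular range of a row-major array. The class and method names and the flat double[] layout are assumptions for illustration; like the documented call, the sketch only returns the count and stores nothing.

```java
final class CountNonZerosSketch {
    /**
     * Counts non-zero cells in rows rl..ru and columns cl..cu (all 0-based, inclusive)
     * of a row-major array with ncols columns. The result is returned, not cached.
     */
    static long countNonZeros(double[] data, int ncols, int rl, int ru, int cl, int cu) {
        long nnz = 0;
        for (int r = rl; r <= ru; r++)
            for (int c = cl; c <= cu; c++)
                if (data[r * ncols + c] != 0)
                    nnz++;
        return nnz;
    }

    public static void main(String[] args) {
        double[] data = {
            1, 0, 2,
            0, 0, 3,
            4, 5, 0
        };
        // Whole 3x3 matrix, then the upper-right 2x2 corner.
        System.out.println(countNonZeros(data, 3, 0, 2, 0, 2));
        System.out.println(countNonZeros(data, 3, 0, 1, 1, 2));
    }
}
```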
      • checkNonZeros

        public void checkNonZeros()
        Basic debugging primitive to check correctness of nnz. This method is not intended for production use.
      • checkSparseRows

        public void checkSparseRows()
      • checkSparseRows

        public void checkSparseRows​(int rl,
                                    int ru)
        Basic debugging primitive to check sparse block column ordering. This method is not intended for production use.
        Parameters:
        rl - row lower bound (inclusive)
        ru - row upper bound (exclusive)
      • copy

        public void copy​(MatrixValue thatValue)
        Description copied from class: MatrixValue
        Copy that MatrixValue into this MatrixValue. If the MatrixValue is a MatrixBlock evaluate the sparsity of the original matrix, and copy into either a sparse or a dense matrix.
        Specified by:
        copy in class MatrixValue
        Parameters:
        thatValue - object to copy the values from.
      • copy

        public void copy​(MatrixValue thatValue,
                         boolean sp)
        Description copied from class: MatrixValue
        Copy that MatrixValue into this MatrixValue. But select sparse destination block depending on boolean parameter.
        Specified by:
        copy in class MatrixValue
        Parameters:
        thatValue - object to copy the values from.
        sp - boolean specifying if the output should be forced sparse or dense (only applicable if 'thatValue' is a MatrixBlock).
      • putInto

        public void putInto​(MatrixBlock target,
                            int rowOffset,
                            int colOffset,
                            boolean sparseCopyShallow)
        Copies this matrix into a target matrix. Note that this method does not maintain the number of non-zero values. The method writes into the already allocated block type of the target, so an appropriate block must be allocated before the call. CSR sparse format is not supported. When writing into a sparse MCSR block, the rows have to be sorted afterwards with a call to target.sortSparseRows().
        Parameters:
        target - Target MatrixBlock, that can be allocated dense or sparse
        rowOffset - The row offset to copy into.
        colOffset - The column offset to copy into.
        sparseCopyShallow - If the output is sparse, whether a shallow copy of rows from this block is allowed
      • copy

        public void copy​(int rl,
                         int ru,
                         int cl,
                         int cu,
                         MatrixBlock src,
                         boolean awareDestNZ)
        In-place copy of matrix src into the index range of the existing current matrix. Note that removal of existing nnz in the index range and nnz maintenance is only done if 'awareDestNZ=true'.
        Parameters:
        rl - row lower index, 0-based
        ru - row upper index, 0-based, inclusive
        cl - column lower index, 0-based
        cu - column upper index, 0-based, inclusive
        src - matrix block
        awareDestNZ - if true, (1) remove existing non-zeros in the index range of the destination if not present in src and (2) internally maintain nnz; if false, assume an empty index range in the destination and do not maintain nnz (the invoker is responsible for recomputing nnz after all copies are done)
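
        The difference between the two awareDestNZ modes can be sketched on a dense double[][] with a cached non-zero count. Everything in this example (class name, fields, dense-only storage) is a simplifying assumption; sparse destinations, where removing stale entries actually matters, are not modelled.

```java
final class RangeCopySketch {
    final double[][] cells;
    long nnz; // cached non-zero count, only maintained when asked to

    RangeCopySketch(int rows, int cols) {
        cells = new double[rows][cols];
    }

    /** Copies src into rows rl..ru and columns cl..cu (0-based, inclusive). */
    void copy(int rl, int ru, int cl, int cu, double[][] src, boolean awareDestNZ) {
        for (int r = rl; r <= ru; r++) {
            for (int c = cl; c <= cu; c++) {
                double v = src[r - rl][c - cl];
                if (awareDestNZ) {
                    if (cells[r][c] != 0) nnz--; // overwritten non-zero disappears
                    if (v != 0) nnz++;           // incoming non-zero is counted
                }
                cells[r][c] = v;
            }
        }
    }

    /** Full scan that the caller must run after unaware copies. */
    long recomputeNonZeros() {
        long count = 0;
        for (double[] row : cells)
            for (double v : row)
                if (v != 0) count++;
        nnz = count;
        return nnz;
    }

    public static void main(String[] args) {
        RangeCopySketch t = new RangeCopySketch(3, 3);
        t.copy(0, 1, 0, 1, new double[][] {{1, 0}, {2, 3}}, true);
        t.copy(1, 2, 1, 2, new double[][] {{0, 0}, {0, 7}}, true);
        System.out.println(t.nnz);
    }
}
```

        With awareDestNZ=false the sketch skips all bookkeeping, which is cheaper per cell but leaves the cached count stale until recomputeNonZeros() is called once at the end.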
      • merge

        public MatrixBlock merge​(MatrixBlock that,
                                 boolean appendOnly)
        Description copied from interface: CacheBlock
        Merge disjoint: merges all non-zero values of the given input into the current block. Note that this method does NOT check for overlapping entries; it is the caller's responsibility to ensure disjoint blocks. The appendOnly parameter is only relevant for sparse target blocks; if true, values are only appended and sparse rows are not sorted on each call, which is useful whenever iterators of matrix blocks are merged into one target block.
        Specified by:
        merge in interface CacheBlock<MatrixBlock>
        Parameters:
        that - cache block
        appendOnly - Indicate if the merger can be append only on sparse rows.
        Returns:
        the merged block; in most implementations 'this' is modified.
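
        A disjoint merge can be pictured as writing every non-zero cell of the input into the target without looking at what is already there. The sketch below does this for dense double[][] arrays; the class and method names are hypothetical, and because it is dense the appendOnly behaviour for sparse rows is not modelled.

```java
import java.util.Arrays;

final class MergeDisjointSketch {
    /**
     * Writes all non-zero cells of 'that' into 'target' and returns how many were written.
     * No overlap check is done: an overlapping cell in 'target' is silently overwritten.
     */
    static int mergeInto(double[][] target, double[][] that) {
        int written = 0;
        for (int r = 0; r < that.length; r++) {
            for (int c = 0; c < that[r].length; c++) {
                if (that[r][c] != 0) {
                    target[r][c] = that[r][c];
                    written++;
                }
            }
        }
        return written;
    }

    public static void main(String[] args) {
        double[][] target = {{1, 0}, {0, 0}};
        double[][] that = {{0, 2}, {3, 0}};
        mergeInto(target, that);
        System.out.println(Arrays.deepToString(target));
    }
}
```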
      • readFields

        public void readFields​(DataInput in)
                        throws IOException
        Specified by:
        readFields in interface org.apache.hadoop.io.Writable
        Throws:
        IOException
      • readExternal

        public void readExternal​(ObjectInput is)
                          throws IOException
        Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
        Specified by:
        readExternal in interface Externalizable
        Parameters:
        is - object input
        Throws:
        IOException - if IOException occurs
      • writeExternal

        public void writeExternal​(ObjectOutput os)
                           throws IOException
        Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
        Specified by:
        writeExternal in interface Externalizable
        Parameters:
        os - object output
        Throws:
        IOException - if IOException occurs
      • getExactSizeOnDisk

        public long getExactSizeOnDisk()
        NOTE: The used estimates must be kept consistent with the respective write functions.
        Returns:
        exact size on disk
      • getHeaderSize

        public static long getHeaderSize()
      • estimateSizeInMemory

        public long estimateSizeInMemory()
      • estimateSizeInMemory

        public static long estimateSizeInMemory​(long nrows,
                                                long ncols,
                                                double sparsity)
      • estimateSizeInMemory

        public static long estimateSizeInMemory​(long nrows,
                                                long ncols,
                                                long nnz)
      • estimateSizeDenseInMemory

        public long estimateSizeDenseInMemory()
      • estimateSizeDenseInMemory

        public static long estimateSizeDenseInMemory​(long nrows,
                                                     long ncols)
      • estimateSizeSparseInMemory

        public long estimateSizeSparseInMemory()
      • estimateSizeSparseInMemory

        public static long estimateSizeSparseInMemory​(long nrows,
                                                      long ncols,
                                                      double sparsity)
      • estimateSizeSparseInMemory

        public static long estimateSizeSparseInMemory​(long nrows,
                                                      long ncols,
                                                      double sparsity,
                                                      boolean allowCSR)
      • estimateSizeSparseInMemory

        public long estimateSizeSparseInMemory​(SparseBlock.Type stype)
      • estimateSizeSparseInMemory

        public static long estimateSizeSparseInMemory​(long nrows,
                                                      long ncols,
                                                      double sparsity,
                                                      SparseBlock.Type stype)
      • estimateSizeOnDisk

        public long estimateSizeOnDisk()
      • estimateSizeOnDisk

        public static long estimateSizeOnDisk​(long nrows,
                                              long ncols,
                                              long nnz)
      • getInMemorySize

        public long getInMemorySize()
        Description copied from interface: CacheBlock
        Get the in-memory size in bytes of the cache block.
        Specified by:
        getInMemorySize in interface CacheBlock<MatrixBlock>
        Returns:
        in-memory size in bytes of cache block
      • getExactSerializedSize

        public long getExactSerializedSize()
        Description copied from interface: CacheBlock
        Get the exact serialized size in bytes of the cache block.
        Specified by:
        getExactSerializedSize in interface CacheBlock<MatrixBlock>
        Returns:
        exact serialized size in bytes of cache block
      • isShallowSerialize

        public boolean isShallowSerialize()
        Description copied from interface: CacheBlock
        Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
        Specified by:
        isShallowSerialize in interface CacheBlock<MatrixBlock>
        Returns:
        true if shallow serialized
      • isShallowSerialize

        public boolean isShallowSerialize​(boolean inclConvert)
        Description copied from interface: CacheBlock
        Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
        Specified by:
        isShallowSerialize in interface CacheBlock<MatrixBlock>
        Parameters:
        inclConvert - if true, also report blocks as shallow serializable that are currently not amenable but can be brought into an amenable form via toShallowSerializeBlock.
        Returns:
        true if shallow serialized
      • toShallowSerializeBlock

        public void toShallowSerializeBlock()
        Description copied from interface: CacheBlock
        Converts a cache block that is not shallow serializable into a form that is shallow serializable. This method has no effect if the given cache block is not amenable.
        Specified by:
        toShallowSerializeBlock in interface CacheBlock<MatrixBlock>
      • ternaryOperationCheck

        public static void ternaryOperationCheck​(boolean s1,
                                                 boolean s2,
                                                 boolean s3,
                                                 int m,
                                                 int r1,
                                                 int r2,
                                                 int r3,
                                                 int n,
                                                 int c1,
                                                 int c2,
                                                 int c3)
      • append

        public final MatrixBlock append​(MatrixBlock that)
        Append that matrix to this matrix, while allocating a new matrix. The default is cbind, making the matrix "wider".
        Parameters:
        that - the other matrix to append
        Returns:
        A new MatrixBlock object with the appended result
      • append

        public final MatrixBlock append​(MatrixBlock that,
                                        boolean cbind)
        Append that matrix to this matrix, while allocating a new matrix. cbind true makes the matrix "wider" while cbind false makes it "taller".
        Parameters:
        that - the other matrix to append
        cbind - if binding on columns or rows
        Returns:
        a new MatrixBlock object with the appended result
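
        The "wider" versus "taller" distinction can be shown with plain double[][] arrays. This sketch is not the MatrixBlock implementation; the class name, the dimension checks, and the choice to always return a fresh array are assumptions for illustration.

```java
import java.util.Arrays;

final class AppendSketch {
    /** Returns a new array: columns of b appended if cbind is true, rows of b appended otherwise. */
    static double[][] append(double[][] a, double[][] b, boolean cbind) {
        if (cbind) {
            if (a.length != b.length)
                throw new IllegalArgumentException("cbind requires equal row counts");
            double[][] out = new double[a.length][];
            for (int r = 0; r < a.length; r++) {
                // Copy row of a, padded to the combined width, then fill in row of b.
                out[r] = Arrays.copyOf(a[r], a[r].length + b[r].length);
                System.arraycopy(b[r], 0, out[r], a[r].length, b[r].length);
            }
            return out;
        }
        if (a.length > 0 && b.length > 0 && a[0].length != b[0].length)
            throw new IllegalArgumentException("rbind requires equal column counts");
        double[][] out = new double[a.length + b.length][];
        for (int r = 0; r < a.length; r++)
            out[r] = a[r].clone();
        for (int r = 0; r < b.length; r++)
            out[a.length + r] = b[r].clone();
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}};
        double[][] b = {{3, 4}};
        System.out.println(Arrays.deepToString(append(a, b, true)));  // one row, four columns
        System.out.println(Arrays.deepToString(append(a, b, false))); // two rows, two columns
    }
}
```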
      • append

        public final MatrixBlock append​(MatrixBlock that,
                                        MatrixBlock ret)
        Append that matrix to this matrix. The default is cbind, making the matrix "wider".
        Parameters:
        that - the other matrix to append
        ret - the output matrix to modify (also returned)
        Returns:
        the ret MatrixBlock object with the appended result
      • append

        public final MatrixBlock append​(MatrixBlock that,
                                        MatrixBlock ret,
                                        boolean cbind)
        Appends that matrix to this matrix. cbind true makes the matrix "wider", while cbind false makes it "taller".
        Parameters:
        that - the other matrix to append
        ret - the output matrix to modify (also returned)
        cbind - if binding on columns or rows
        Returns:
        the ret MatrixBlock object with the appended result
      • append

        public MatrixBlock append​(MatrixBlock[] that,
                                  MatrixBlock result,
                                  boolean cbind)
        Appends that list of matrices to this matrix. cbind true makes the matrix "wider", while cbind false makes it "taller".
        Parameters:
        that - a list of matrices to append in order
        result - the output matrix to modify (also returned)
        cbind - if binding on columns or rows
        Returns:
        the result MatrixBlock object with the appended result
      • checkDimensionsForAppend

        public void checkDimensionsForAppend​(MatrixBlock[] in,
                                             boolean cbind)
      • leftIndexingOperations

        public MatrixBlock leftIndexingOperations​(ScalarObject scalar,
                                                  int rl,
                                                  int cl,
                                                  MatrixBlock ret,
                                                  MatrixObject.UpdateType update)
        Explicitly allows left indexing for scalars. Note: this operation is now 0-based. Operations to be performed: 1) result = this; 2) result[row,column] = scalar.getDoubleValue().
        Parameters:
        scalar - scalar object
        rl - row lower
        cl - column lower
        ret - ?
        update - ?
        Returns:
        matrix block
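        The two steps listed above (copy, then assign one cell with 0-based indices) can be sketched on a plain array. This is an illustration under assumptions: the class LeftIndexSketch is hypothetical, and it always copies, whereas the role of the real ret and update arguments is not documented here.

```java
// Illustration only: double[][] stands in for MatrixBlock.
class LeftIndexSketch {
    // 1) result = copy of the input; 2) result[row][col] = scalar (0-based indices).
    static double[][] leftIndexScalar(double[][] in, double scalar, int row, int col) {
        double[][] result = new double[in.length][];
        for (int i = 0; i < in.length; i++)
            result[i] = in[i].clone();
        result[row][col] = scalar;
        return result;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}};
        double[][] r = leftIndexScalar(m, 9.0, 0, 1);
        System.out.println(r[0][1] + " " + m[0][1]);
    }
}
```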
      • slice

        public final MatrixBlock slice​(IndexRange ixrange,
                                       MatrixBlock ret)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        ixrange - index range inclusive
        ret - outputBlock
        Returns:
        sub-block of cache block
      • slice

        public final MatrixBlock slice​(int rl,
                                       int ru)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        rl - row lower
        ru - row upper inclusive
        Returns:
        sub-block of cache block
      • slice

        public final MatrixBlock slice​(int rl,
                                       int ru,
                                       boolean deep)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        rl - row lower
        ru - row upper inclusive
        deep - enforce deep-copy
        Returns:
        sub-block of cache block
      • slice

        public final MatrixBlock slice​(int rl,
                                       int ru,
                                       int cl,
                                       int cu)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        rl - row lower
        ru - row upper inclusive
        cl - column lower
        cu - column upper inclusive
        Returns:
        sub-block of cache block
      • slice

        public final MatrixBlock slice​(int rl,
                                       int ru,
                                       int cl,
                                       int cu,
                                       MatrixBlock ret)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        rl - row lower
        ru - row upper inclusive
        cl - column lower
        cu - column upper inclusive
        ret - cache block
        Returns:
        sub-block of cache block
      • slice

        public final MatrixBlock slice​(int rl,
                                       int ru,
                                       int cl,
                                       int cu,
                                       boolean deep)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        rl - row lower
        ru - row upper inclusive
        cl - column lower
        cu - column upper inclusive
        deep - enforce deep-copy
        Returns:
        sub-block of cache block
      • slice

        public MatrixBlock slice​(int rl,
                                 int ru,
                                 int cl,
                                 int cu,
                                 boolean deep,
                                 MatrixBlock ret)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Parameters:
        rl - row lower
        ru - row upper inclusive
        cl - column lower
        cu - column upper inclusive
        deep - enforce deep-copy
        ret - cache block
        Returns:
        sub-block of cache block
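        All slice overloads above take inclusive upper bounds (ru, cu). A minimal sketch of what that means for the result shape, using a plain array instead of MatrixBlock; the class name SliceSketch is hypothetical and the sketch always performs a deep copy.

```java
// Illustration only: inclusive-bounds slicing on a dense double[][].
class SliceSketch {
    // Returns rows rl..ru and columns cl..cu, both ends inclusive.
    static double[][] slice(double[][] in, int rl, int ru, int cl, int cu) {
        if (rl < 0 || ru >= in.length || rl > ru
                || cl < 0 || cu >= in[0].length || cl > cu)
            throw new IndexOutOfBoundsException("invalid slice range");
        int ncols = cu - cl + 1;
        double[][] out = new double[ru - rl + 1][ncols];
        for (int i = rl; i <= ru; i++)
            System.arraycopy(in[i], cl, out[i - rl], 0, ncols);
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        double[][] s = slice(m, 0, 1, 1, 2);
        System.out.println(s.length + "x" + s[0].length);
    }
}
```

        Because both ends are inclusive, slicing rows 0..1 and columns 1..2 of a 3x3 input gives a 2x2 result.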
      • slice

        public void slice​(ArrayList<IndexedMatrixValue> outlist,
                          IndexRange range,
                          int rowCut,
                          int colCut,
                          int blen,
                          int boundaryRlen,
                          int boundaryClen)
        Description copied from class: MatrixValue
        Slices out up to four matrix blocks that are separated by the row and column cuts. This is used in the context of Spark execution to distribute sliced-out matrix blocks of the correct block size.
        Specified by:
        slice in class MatrixValue
        Parameters:
        outlist - The output matrix blocks that are extracted from the matrix.
        range - An index range containing overlapping information.
        rowCut - The row at which to cut and split the matrix.
        colCut - The column at which to cut and split the matrix.
        blen - The block size of the output matrices.
        boundaryRlen - The row length of the edge-case matrix block, used for the final blocks that do not have enough rows to construct a full block.
        boundaryClen - The column length of the edge-case matrix block, used for the final blocks that do not have enough columns to construct a full block.
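        One way to picture "up to four blocks separated by the row and column cuts" is to split an inclusive index range at each cut. The sketch below is an assumption-laden illustration, not the method's actual logic: CutSketch is hypothetical, and it treats each cut as the first index of the second part.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: splitting an inclusive 2D index range at a row and a column cut.
class CutSketch {
    // Splits [lo, hi] at 'cut' (assumed to be the first index of the second part).
    static List<int[]> split(int lo, int hi, int cut) {
        List<int[]> parts = new ArrayList<>();
        if (cut > lo && cut <= hi) {
            parts.add(new int[] {lo, cut - 1});
            parts.add(new int[] {cut, hi});
        } else {
            parts.add(new int[] {lo, hi}); // cut falls outside the range: no split
        }
        return parts;
    }

    // Returns {rl, ru, cl, cu} for each piece; there are 1, 2, or 4 pieces.
    static List<int[]> cutRange(int rl, int ru, int cl, int cu, int rowCut, int colCut) {
        List<int[]> out = new ArrayList<>();
        for (int[] r : split(rl, ru, rowCut))
            for (int[] c : split(cl, cu, colCut))
                out.add(new int[] {r[0], r[1], c[0], c[1]});
        return out;
    }

    public static void main(String[] args) {
        System.out.println(cutRange(0, 5, 0, 5, 4, 4).size());
    }
}
```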
      • sortOperations

        public final MatrixBlock sortOperations()
      • interQuartileMean

        public double interQuartileMean()
      • computeIQMCorrection

        public static double computeIQMCorrection​(double sum,
                                                  double sum_wt,
                                                  double q25Part,
                                                  double q25Val,
                                                  double q75Part,
                                                  double q75Val)
      • median

        public double median()
      • pickValue

        public final double pickValue​(double quantile)
      • pickValue

        public double pickValue​(double quantile,
                                boolean average)
      • sumWeightForQuantile

        public double sumWeightForQuantile()
        In a given two-column matrix, the second column denotes weights. This function computes the total weight.
        Returns:
        sum weight for quantile
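        As a minimal sketch of the computation described above, assuming a dense two-column layout of {value, weight} rows (the class WeightSketch is hypothetical and not part of the library):

```java
// Illustration only: each row is {value, weight}; the total weight is the sum of column 2.
class WeightSketch {
    static double sumWeights(double[][] valueWeight) {
        double sum = 0;
        for (double[] row : valueWeight) {
            if (row.length != 2)
                throw new IllegalArgumentException("expected a two-column matrix");
            sum += row[1];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWeights(new double[][] {{10, 1}, {20, 2}, {30, 3}}));
    }
}
```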
      • groupedAggOperations

        public final MatrixBlock groupedAggOperations​(MatrixValue tgt,
                                                      MatrixValue wghts,
                                                      MatrixValue ret,
                                                      int ngroups,
                                                      Operator op)
        Invoked from CP instructions. The aggregate is computed on the groups object against the target and weights. Notes: (1) the computed number of groups is reused across multiple invocations with different targets; (2) the target may be passed as a column or row vector; for row vectors, sparse-safe implementations are also used for sparse-safe aggregation operators.
        Parameters:
        tgt - ?
        wghts - ?
        ret - ?
        ngroups - ?
        op - operator
        Returns:
        matrix block
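        To illustrate the idea of aggregating a target against groups and weights, here is a minimal sketch of a weighted grouped sum. Everything in it is an assumption for illustration: the class GroupedAggSketch is hypothetical, sum is just one possible aggregate (the real method takes an Operator), group ids are assumed to be 1-based, and a null weights array is treated as all ones.

```java
// Illustration only: weighted sum per group over plain arrays.
class GroupedAggSketch {
    static double[] groupedSum(int[] groups, double[] target, double[] weights, int ngroups) {
        double[] out = new double[ngroups];
        for (int i = 0; i < groups.length; i++) {
            int g = groups[i];
            if (g < 1 || g > ngroups)
                continue; // assumption: ids outside 1..ngroups are ignored
            double w = (weights == null) ? 1.0 : weights[i];
            out[g - 1] += w * target[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] r = groupedSum(new int[] {1, 2, 1, 2}, new double[] {1, 2, 3, 4}, null, 2);
        System.out.println(r[0] + " " + r[1]);
    }
}
```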
      • removeEmptyOperations

        public final MatrixBlock removeEmptyOperations​(MatrixBlock ret,
                                                       boolean rows,
                                                       boolean emptyReturn)
      • rexpandOperations

        public MatrixBlock rexpandOperations​(MatrixBlock ret,
                                             double max,
                                             boolean rows,
                                             boolean cast,
                                             boolean ignore,
                                             int k)
      • extractTriangular

        public MatrixBlock extractTriangular​(MatrixBlock ret,
                                             boolean lower,
                                             boolean diag,
                                             boolean values)
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     double scalarThat,
                                     MatrixValue that2Val,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        D = ctable(A,v2,W), where this <- A; scalarThat <- v2; that2 <- W; result <- D. Inputs: (i1,j1,v1) from input1 (this); (v2) from scalar_input2 (scalarThat); (i3,j3,w) from input3 (that2).
        Specified by:
        ctableOperations in class MatrixValue
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     double scalarThat,
                                     double scalarThat2,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        D = ctable(A,v2,w), where this <- A; scalarThat <- v2; scalarThat2 <- w; result <- D. Inputs: (i1,j1,v1) from input1 (this); (v2) from scalar_input2 (scalarThat); (w) from scalar_input3 (scalarThat2).
        Specified by:
        ctableOperations in class MatrixValue
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     MatrixIndexes ix1,
                                     double scalarThat,
                                     boolean left,
                                     int blen,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        Specific ctable case of ctable(seq(...),X), where X is the only matrix input. The 'left' input parameter specifies if the seq appeared on the left, otherwise it appeared on the right.
        Specified by:
        ctableOperations in class MatrixValue
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     MatrixValue thatVal,
                                     double scalarThat2,
                                     boolean ignoreZeros,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        D = ctable(A,B,w), where this <- A; that <- B; scalarThat2 <- w; result <- D. Inputs: (i1,j1,v1) from input1 (this); (i1,j1,v2) from input2 (that); (w) from scalar_input3 (scalarThat2). NOTE: This method supports both vectors and matrices. For matrices with ignoreZeros=true, a sparse-safe implementation can also be used.
        Specified by:
        ctableOperations in class MatrixValue
      • ctableSeqOperations

        public MatrixBlock ctableSeqOperations​(MatrixValue thatMatrix,
                                               double thatScalar,
                                               MatrixBlock ret,
                                               boolean updateClen)
        Parameters:
        thatMatrix - matrix value
        thatScalar - scalar double
        ret - result matrix block
        updateClen - when this matrix already has the desired number of columns updateClen can be set to false
        Returns:
        result matrix block
      • ctableSeqOperations

        public final MatrixBlock ctableSeqOperations​(MatrixValue thatMatrix,
                                                     double thatScalar,
                                                     MatrixBlock resultBlock)
        D = ctable(seq,A,w), where this <- seq; thatMatrix <- A; thatScalar <- w; result <- D. Inputs: (i1,j1,v1) from input1 (this); (i1,j1,v2) from input2 (thatMatrix); (w) from scalar_input3 (thatScalar).
        Parameters:
        thatMatrix - matrix value
        thatScalar - scalar double
        resultBlock - result matrix block
        Returns:
        resultBlock
      • ctableOperations

        public final void ctableOperations​(Operator op,
                                           MatrixValue thatVal,
                                           MatrixValue that2Val,
                                           CTableMap resultMap)
        D = ctable(A,B,W), where this <- A; that <- B; that2 <- W; result <- D. Inputs: (i1,j1,v1) from input1 (this); (i1,j1,v2) from input2 (that); (i1,j1,w) from input3 (that2).
        Parameters:
        op - operator
        thatVal - matrix value 1
        that2Val - matrix value 2
        resultMap - table map
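        The ctable variants above all build a contingency table: for each aligned cell, the value from A selects the output row, the value from B the output column, and w is added to that output cell. The sketch below shows this for vector inputs. It is an illustration under assumptions: CTableSketch is hypothetical, the inputs are positive 1-based integer categories, and a dense output is used in place of CTableMap or a result block.

```java
// Illustration only: D[a[i], b[i]] += w[i] with 1-based category values.
class CTableSketch {
    static double[][] ctable(int[] a, int[] b, double[] w) {
        int rows = 0, cols = 0;
        for (int i = 0; i < a.length; i++) {
            rows = Math.max(rows, a[i]);
            cols = Math.max(cols, b[i]);
        }
        double[][] d = new double[rows][cols];
        for (int i = 0; i < a.length; i++)
            d[a[i] - 1][b[i] - 1] += w[i];
        return d;
    }

    public static void main(String[] args) {
        double[][] d = ctable(new int[] {1, 2, 1}, new int[] {1, 3, 1}, new double[] {1, 1, 1});
        System.out.println(d.length + "x" + d[0].length + " d[0][0]=" + d[0][0]);
    }
}
```

        Passing a = 1..n in this sketch corresponds to the ctable(seq(...),X) case, where row i of the output has its weight in the column given by X[i]. The scalar variants correspond to using the same v2 or w for every i.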
      • randOperations

        public static MatrixBlock randOperations​(int rows,
                                                 int cols,
                                                 double sparsity)
      • randOperations

        public static MatrixBlock randOperations​(int rows,
                                                 int cols,
                                                 double sparsity,
                                                 double min,
                                                 double max,
                                                 String pdf,
                                                 long seed)
        Function to generate the random matrix with specified dimensions (block sizes are not specified).
        Parameters:
        rows - number of rows
        cols - number of columns
        sparsity - sparsity as a percentage
        min - minimum value
        max - maximum value
        pdf - pdf
        seed - random seed
        Returns:
        matrix block
      • randOperations

        public static MatrixBlock randOperations​(int rows,
                                                 int cols,
                                                 double sparsity,
                                                 double min,
                                                 double max,
                                                 String pdf,
                                                 long seed,
                                                 int k)
        Function to generate the random matrix with specified dimensions (block sizes are not specified).
        Parameters:
        rows - number of rows
        cols - number of columns
        sparsity - sparsity as a percentage
        min - minimum value
        max - maximum value
        pdf - pdf
        seed - random seed
        k - The number of threads in the operation
        Returns:
        matrix block
      • randOperations

        public static MatrixBlock randOperations​(RandomMatrixGenerator rgen,
                                                 long seed)
        Function to generate the random matrix with specified dimensions and block dimensions.
        Parameters:
        rgen - random matrix generator
        seed - seed value
        Returns:
        matrix block
      • randOperations

        public static MatrixBlock randOperations​(RandomMatrixGenerator rgen,
                                                 long seed,
                                                 int k)
        Function to generate the random matrix with specified dimensions and block dimensions.
        Parameters:
        rgen - random matrix generator
        seed - seed value
        k - The number of threads to use in the operation
        Returns:
        matrix block
      • randOperationsInPlace

        public MatrixBlock randOperationsInPlace​(RandomMatrixGenerator rgen,
                                                 org.apache.commons.math3.random.Well1024a bigrand,
                                                 long bSeed)
        Function to generate a matrix of random numbers. This is invoked from both CP and MR. In the case of CP, it generates an entire matrix block by block; a bigrand is passed so that block-level seeds are generated internally. In the case of MR, it generates a single block for the given block-level seed bSeed. When pdf="uniform", cell values are drawn from a uniform distribution in the range [min,max]. When pdf="normal", cell values are drawn from the standard normal distribution N(0,1); the range of generated values is then always (-Inf,+Inf).
        Parameters:
        rgen - random matrix generator
        bigrand - ?
        bSeed - seed value
        Returns:
        matrix block
      • randOperationsInPlace

        public MatrixBlock randOperationsInPlace​(RandomMatrixGenerator rgen,
                                                 org.apache.commons.math3.random.Well1024a bigrand,
                                                 long bSeed,
                                                 int k)
        Function to generate a matrix of random numbers. This is invoked from both CP and MR. In the case of CP, it generates an entire matrix block by block; a bigrand is passed so that block-level seeds are generated internally. In the case of MR, it generates a single block for the given block-level seed bSeed. When pdf="uniform", cell values are drawn from a uniform distribution in the range [min,max]. When pdf="normal", cell values are drawn from the standard normal distribution N(0,1); the range of generated values is then always (-Inf,+Inf).
        Parameters:
        rgen - random matrix generator
        bigrand - ?
        bSeed - seed value
        k - ?
        Returns:
        matrix block
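        The block-by-block scheme described above, with one seed drawn per block from a shared generator, can be sketched as follows. This is not the library's generator: RandSketch is hypothetical, java.util.Random stands in for Well1024a, blen is an assumed block size parameter, and sparsity is assumed to be a fraction in [0,1] giving the probability that a cell is non-zero.

```java
import java.util.Random;

// Illustration only: block-wise random matrix generation with per-block seeds.
class RandSketch {
    static double[][] rand(int rows, int cols, int blen, double sparsity,
                           double min, double max, String pdf, Random bigrand) {
        if (blen <= 0)
            throw new IllegalArgumentException("block size must be positive");
        double[][] out = new double[rows][cols];
        for (int rb = 0; rb < rows; rb += blen) {
            for (int cb = 0; cb < cols; cb += blen) {
                // One block-level seed per block, drawn from the shared generator.
                Random block = new Random(bigrand.nextLong());
                int rEnd = Math.min(rb + blen, rows);
                int cEnd = Math.min(cb + blen, cols);
                for (int r = rb; r < rEnd; r++) {
                    for (int c = cb; c < cEnd; c++) {
                        if (block.nextDouble() >= sparsity)
                            continue; // cell stays zero
                        out[r][c] = "normal".equals(pdf)
                                ? block.nextGaussian()                    // N(0,1), min/max ignored
                                : min + (max - min) * block.nextDouble(); // uniform between min and max
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = rand(4, 4, 2, 1.0, 2.0, 5.0, "uniform", new Random(7));
        System.out.println(m.length + "x" + m[0].length);
    }
}
```

        Because each block owns its seed, a single block can be regenerated from that seed alone, which is the situation the description attributes to the bSeed argument.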
      • seqOperations

        public static MatrixBlock seqOperations​(double from,
                                                double to,
                                                double incr)
        Method to generate a sequence according to the given parameters. The generated sequence is always in dense format. The sequence covers the closed interval [from,to], but to itself is included only if (to-from) is exactly divisible by incr. For example, seq(0,1,0.5) generates (0.0 0.5 1.0), whereas seq(0,1,0.6) generates (0.0 0.6) and not (0.0 0.6 1.0).
        Parameters:
        from - ?
        to - ?
        incr - ?
        Returns:
        matrix block
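        The inclusion rule for to can be sketched with a length computation of 1 + floor((to - from) / incr). This is a minimal illustration, not the library's code: SeqSketch is hypothetical, and it makes no attempt to correct floating-point rounding in the division, which a real implementation may handle differently.

```java
// Illustration only: sequence generation where 'to' is hit only if incr divides (to - from).
class SeqSketch {
    static double[] seq(double from, double to, double incr) {
        if (incr == 0 || (to - from) * incr < 0)
            throw new IllegalArgumentException("incr must be non-zero and point from 'from' towards 'to'");
        int n = 1 + (int) Math.floor((to - from) / incr);
        double[] out = new double[n];
        for (int i = 0; i < n; i++)
            out[i] = from + i * incr;
        return out;
    }

    public static void main(String[] args) {
        System.out.println(seq(0, 1, 0.5).length);
        System.out.println(seq(0, 1, 0.6).length);
    }
}
```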
      • seqOperationsInPlace

        public MatrixBlock seqOperationsInPlace​(double from,
                                                double to,
                                                double incr)
      • sampleOperations

        public static MatrixBlock sampleOperations​(long range,
                                                   int size,
                                                   boolean replace,
                                                   long seed)
      • isThreadSafe

        public boolean isThreadSafe()
        Indicates if concurrent modifications of disjoint rows are thread-safe.
        Returns:
        true if thread-safe
      • isThreadSafe

        public static boolean isThreadSafe​(boolean sparse)
        Indicates if concurrent modifications of disjoint rows are thread-safe.
        Parameters:
        sparse - true if sparse
        Returns:
        true if thread-safe
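        "Concurrent modifications of disjoint rows" means that each thread writes only to rows no other thread touches. A minimal sketch of that access pattern on a dense array, where such writes need no locking; DisjointRowsSketch is hypothetical, and whether a particular sparse format gives the same guarantee is exactly what isThreadSafe reports, which this sketch does not model.

```java
import java.util.stream.IntStream;

// Illustration only: each parallel task writes to exactly one row of a dense array.
class DisjointRowsSketch {
    static double[][] fillRowsInParallel(int rows, int cols) {
        double[][] m = new double[rows][cols];
        IntStream.range(0, rows).parallel().forEach(r -> {
            for (int c = 0; c < cols; c++)
                m[r][c] = r * cols + c; // row r is owned by this task only
        });
        return m;
    }

    public static void main(String[] args) {
        double[][] m = fillRowsInParallel(5, 4);
        System.out.println(m[3][2]);
    }
}
```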
      • equals

        public final boolean equals​(Object arg0)
        Overrides:
        equals in class Object
      • equals

        public final boolean equals​(MatrixBlock arg0)
        Checks whether the matrix blocks are equivalent. The comparison also works when the two sides are allocated differently, for example one sparse and one dense.

        The implementation adheres to the following properties of equals:

        • Reflexive
        • Symmetric
        • Transitive
        • Consistent
        Parameters:
        arg0 - MatrixBlock to compare
        Returns:
        If the matrices are equivalent
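        Comparing a dense and a sparse allocation amounts to comparing cell values while treating absent sparse entries as zero. A minimal sketch under assumptions: EqualsSketch is hypothetical, the sparse side is a map from linear index (r * cols + c) to value with all keys in range, and NaN cells never compare equal here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: value-based equality across a dense and a sparse representation.
class EqualsSketch {
    static boolean denseEqualsSparse(double[][] dense, Map<Integer, Double> sparse,
                                     int rows, int cols) {
        if (dense.length != rows)
            return false;
        for (int r = 0; r < rows; r++) {
            if (dense[r].length != cols)
                return false;
            for (int c = 0; c < cols; c++) {
                double s = sparse.getOrDefault(r * cols + c, 0.0); // absent entry = 0
                if (dense[r][c] != s)
                    return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, Double> sparse = new HashMap<>();
        sparse.put(1, 1.0);
        sparse.put(2, 2.0);
        System.out.println(denseEqualsSparse(new double[][] {{0, 1}, {2, 0}}, sparse, 2, 2));
    }
}
```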
      • hashCode

        public final int hashCode()
        Overrides:
        hashCode in class Object
      • getDouble

        public double getDouble​(int r,
                                int c)
        Description copied from interface: CacheBlock
        Returns the double value at the passed row and column. If the value is missing 0 is returned.
        Specified by:
        getDouble in interface CacheBlock<MatrixBlock>
        Parameters:
        r - row of the value
        c - column of the value
        Returns:
        double value at the passed row and column
      • getDoubleNaN

        public double getDoubleNaN​(int r,
                                   int c)
        Description copied from interface: CacheBlock
        Returns the double value at the passed row and column. If the value is missing NaN is returned.
        Specified by:
        getDoubleNaN in interface CacheBlock<MatrixBlock>
        Parameters:
        r - row of the value
        c - column of the value
        Returns:
        double value at the passed row and column
      • getString

        public String getString​(int r,
                                int c)
        Description copied from interface: CacheBlock
        Returns the string of the value at the passed row and column. If the value is missing or NaN, null is returned.
        Specified by:
        getString in interface CacheBlock<MatrixBlock>
        Parameters:
        r - row of the value
        c - column of the value
        Returns:
        string of the value at the passed row and column
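        The three accessors above differ only in what they return for a missing value: 0, NaN, or null. The sketch below shows that contract on a small cell store. It is an illustration, not the MatrixBlock implementation: CellAccessSketch is hypothetical, and treating an unset cell as "missing" is an assumption, since the source does not define the term for matrix blocks.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: three accessors with different defaults for missing cells.
class CellAccessSketch {
    private final Map<Long, Double> cells = new HashMap<>();
    private final int cols;

    CellAccessSketch(int cols) { this.cols = cols; }

    private long key(int r, int c) { return (long) r * cols + c; }

    void set(int r, int c, double v) { cells.put(key(r, c), v); }

    // Missing value -> 0.
    double getDouble(int r, int c) {
        Double v = cells.get(key(r, c));
        return v == null ? 0.0 : v;
    }

    // Missing value -> NaN.
    double getDoubleNaN(int r, int c) {
        Double v = cells.get(key(r, c));
        return v == null ? Double.NaN : v;
    }

    // Missing value or NaN -> null.
    String getString(int r, int c) {
        Double v = cells.get(key(r, c));
        return (v == null || v.isNaN()) ? null : Double.toString(v);
    }

    public static void main(String[] args) {
        CellAccessSketch m = new CellAccessSketch(2);
        m.set(0, 1, 2.5);
        System.out.println(m.getDouble(0, 0) + " " + m.getString(0, 1));
    }
}
```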