Class SparseBlockMCSC

  • All Implemented Interfaces:
    Serializable, Block

    public class SparseBlockMCSC
    extends SparseBlock
    SparseBlock implementation that realizes a 'modified compressed sparse column' representation, where each compressed column is stored as a separate SparseRow object which provides flexibility for unsorted column appends without the need for global reshifting of values/indexes but it incurs additional memory overhead per column for object/array headers per column which also slows down memory-bound operations due to higher memory bandwidth requirements. TODO implement row interface of sparse blocks (can be slow but must be correct; additionally, we can expose the column API for efficient use in specific operations)
    See Also:
    Serialized Form
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      boolean add​(int r, int c, double v)
      Add a value to a matrix cell (r,c).
      void allocate​(int r)
      Allocate the underlying data structure holding non-zero values of row r if necessary.
      void allocate​(int r, int nnz)
      Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.
      void allocate​(int r, int ennz, int maxnnz)
      Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.
      void allocateCol​(int c)  
      void allocateCol​(int c, int nnz)  
      void allocateCol​(int c, int ennz, int maxnnz)  
      void append​(int r, int c, double v)
      Append a value to the end of the physical representation.
      boolean checkValidity​(int rlen, int clen, long nnz, boolean strict)
      Validate the correctness of the internal data structures of the different sparse block implementations.
      void compact​(int r)
      Re-allocate physical row if physical size exceeds logical size plus resize factor.
      void compactCol​(int c)  
      void deleteIndexRange​(int r, int cl, int cu)
      Deletes all non-zero values of the given column range [cl,cu) in row r.
      void deleteIndexRangeCol​(int c, int rl, int ru)  
      static long estimateSizeInMemory​(long nrows, long ncols, double sparsity)
      Get the estimated in-memory size of the sparse block in MCSC with the given dimensions w/o accounting for overallocation.
      SparseRow get​(int r)
      Get values of row r in the format of a sparse row.
      double get​(int r, int c)
      Get value of matrix cell (r,c).
      SparseRow getCol​(int c)  
      SparseRow[] getCols()
      Helper function for MCSC
      long getExactSizeInMemory()
      Computes the exact size in memory of the materialized block
      Iterator<Integer> getNonEmptyColumnsIterator​(int cl, int cu)  
      Iterator<Integer> getNonEmptyRowsIterator​(int rl, int ru)  
      SparseRow[] getRows()
      Helper function for MCSC
      int[] indexes​(int r)
      Get the sorted array of column indexes of all non-zero entries in row r.
      int[] indexesCol​(int c)  
      boolean isAllocated​(int r)
      Indicates if the underlying data structure for a given row is already allocated.
      boolean isAllocatedCol​(int c)  
      boolean isContiguous()
      Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.
      boolean isEmpty​(int r)
      Get information if row r is empty, i.e., does not contain non-zero values.
      boolean isEmptyCol​(int c)  
      boolean isThreadSafe()
      Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.
      int numCols()  
      int numRows()
      Get the number of rows in the sparse block.
      int pos​(int r)
      Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).
      int posFIndexGT​(int r, int c)
      Get position of first column index greater than column c in row r.
      int posFIndexGTCol​(int r, int c)  
      int posFIndexGTE​(int r, int c)
      Get position of first column index greater than or equal column c in row r.
      int posFIndexGTECol​(int r, int c)  
      int posFIndexLTE​(int r, int c)
      Get position of first column index lower than or equal column c in row r.
      int posFIndexLTECol​(int r, int c)  
      void reset()
      Clears the sparse block by deleting non-zero values.
      void reset​(int ennz, int maxnnz)
      Clears the sparse block by deleting non-zero values.
      void reset​(int r, int ennz, int maxnnz)
      Clears row r of the sparse block by deleting non-zero values.
      void resetCol​(int c, int ennz, int maxnnz)  
      boolean set​(int r, int c, double v)
      Set the value of a matrix cell (r,c).
      void set​(int r, SparseRow row, boolean deep)
      Set the values of row r to the given sparse row.
      void setCol​(int c, SparseRow col, boolean deep)  
      void setIndexRange​(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)
      Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r.
      void setIndexRange​(int r, int cl, int cu, double[] v, int vix, int vlen)
      Sets a dense array of non-zeros values into the column range [cl,cu) in row r.
      void setIndexRangeCol​(int c, int rl, int ru, double[] v, int[] vix, int vpos, int vlen)  
      void setIndexRangeCol​(int c, int rl, int ru, double[] v, int vix, int vlen)  
      long size()
      Get the number of non-zero values in the sparse block.
      int size​(int r)
      Get the number of non-zero values in row r.
      long size​(int rl, int ru)
      Get the number of non-zeros values in the row range of [rl, ru).
      long size​(int rl, int ru, int cl, int cu)
      Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);
      int sizeCol​(int c)  
      long sizeCol​(int cl, int cu)  
      void sort()
      Sort all non-zero value/index pairs of the sparse block by row and column index.
      void sort​(int r)
      Sort all non-zero value/index pairs of row r column index.
      void sortCol​(int c)  
      String toString()  
      double[] values​(int r)
      Get the array of all non-zero entries in row r, sorted by their column indexes.
      double[] valuesCol​(int c)  
    • Constructor Detail

      • SparseBlockMCSC

        public SparseBlockMCSC​(SparseBlock sblock,
                               int clen)
      • SparseBlockMCSC

        public SparseBlockMCSC​(SparseBlock sblock)
      • SparseBlockMCSC

        public SparseBlockMCSC​(SparseRow[] cols,
                               boolean deep,
                               int rlen)
      • SparseBlockMCSC

        public SparseBlockMCSC​(int clen)
      • SparseBlockMCSC

        public SparseBlockMCSC​(int rlen,
                               int clen)
    • Method Detail

      • estimateSizeInMemory

        public static long estimateSizeInMemory​(long nrows,
                                                long ncols,
                                                double sparsity)
        Get the estimated in-memory size of the sparse block in MCSC with the given dimensions w/o accounting for overallocation.
        Parameters:
        nrows - number of rows
        ncols - number of columns
        sparsity - sparsity ratio
        Returns:
        memory estimate
      • getExactSizeInMemory

        public long getExactSizeInMemory()
        Description copied from class: SparseBlock
        Computes the exact size in memory of the materialized block
        Specified by:
        getExactSizeInMemory in class SparseBlock
        Returns:
        the exact size in memory
      • allocate

        public void allocate​(int r)
        Description copied from class: SparseBlock
        Allocate the underlying data structure holding non-zero values of row r if necessary.
        Specified by:
        allocate in class SparseBlock
        Parameters:
        r - row index
      • allocateCol

        public void allocateCol​(int c)
      • allocate

        public void allocate​(int r,
                             int nnz)
        Description copied from class: SparseBlock
        Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.
        Specified by:
        allocate in class SparseBlock
        Parameters:
        r - row index
        nnz - number of non-zeros
      • allocateCol

        public void allocateCol​(int c,
                                int nnz)
      • allocate

        public void allocate​(int r,
                             int ennz,
                             int maxnnz)
        Description copied from class: SparseBlock
        Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.
        Specified by:
        allocate in class SparseBlock
        Parameters:
        r - row index
        ennz - estimated non-zeros
        maxnnz - max non-zeros
      • allocateCol

        public void allocateCol​(int c,
                                int ennz,
                                int maxnnz)
      • compact

        public void compact​(int r)
        Description copied from class: SparseBlock
        Re-allocate physical row if physical size exceeds logical size plus resize factor.
        Specified by:
        compact in class SparseBlock
        Parameters:
        r - row index
      • compactCol

        public void compactCol​(int c)
      • numRows

        public int numRows()
        Description copied from class: SparseBlock
        Get the number of rows in the sparse block.
        Specified by:
        numRows in class SparseBlock
        Returns:
        number of rows
      • numCols

        public int numCols()
      • isThreadSafe

        public boolean isThreadSafe()
        Description copied from class: SparseBlock
        Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.
        Specified by:
        isThreadSafe in class SparseBlock
        Returns:
        true if thread-safe row updates
      • isContiguous

        public boolean isContiguous()
        Description copied from class: SparseBlock
        Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.
        Specified by:
        isContiguous in class SparseBlock
        Returns:
        true if underlying data structures are contiguous arrays
      • isAllocated

        public boolean isAllocated​(int r)
        Description copied from class: SparseBlock
        Indicates if the underlying data structure for a given row is already allocated.
        Specified by:
        isAllocated in class SparseBlock
        Parameters:
        r - row index
        Returns:
        true if already allocated
      • isAllocatedCol

        public boolean isAllocatedCol​(int c)
      • reset

        public void reset()
        Description copied from class: SparseBlock
        Clears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.
        Specified by:
        reset in class SparseBlock
      • reset

        public void reset​(int ennz,
                          int maxnnz)
        Description copied from class: SparseBlock
        Clears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.
        Specified by:
        reset in class SparseBlock
        Parameters:
        ennz - estimated non-zeros
        maxnnz - max non-zeros
      • reset

        public void reset​(int r,
                          int ennz,
                          int maxnnz)
        Description copied from class: SparseBlock
        Clears row r of the sparse block by deleting non-zero values. After this call size(r) is guaranteed to return 0.
        Specified by:
        reset in class SparseBlock
        Parameters:
        r - row index
        ennz - estimated non-zeros
        maxnnz - max non-zeros
      • resetCol

        public void resetCol​(int c,
                             int ennz,
                             int maxnnz)
      • size

        public long size()
        Description copied from class: SparseBlock
        Get the number of non-zero values in the sparse block.
        Specified by:
        size in class SparseBlock
        Returns:
        number of non-zero values in sparse block
      • size

        public int size​(int r)
        Description copied from class: SparseBlock
        Get the number of non-zero values in row r.
        Specified by:
        size in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        number of non-zero values in row r
      • sizeCol

        public int sizeCol​(int c)
      • size

        public long size​(int rl,
                         int ru)
        Description copied from class: SparseBlock
        Get the number of non-zeros values in the row range of [rl, ru).
        Specified by:
        size in class SparseBlock
        Parameters:
        rl - row lower index
        ru - row upper index
        Returns:
        number of non-zero values in the row range
      • sizeCol

        public long sizeCol​(int cl,
                            int cu)
      • size

        public long size​(int rl,
                         int ru,
                         int cl,
                         int cu)
        Description copied from class: SparseBlock
        Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);
        Specified by:
        size in class SparseBlock
        Parameters:
        rl - row lower index
        ru - row upper index
        cl - column lower index
        cu - column upper index
        Returns:
        number of non-zero values in the row and column range
      • isEmpty

        public boolean isEmpty​(int r)
        Description copied from class: SparseBlock
        Get information if row r is empty, i.e., does not contain non-zero values. Equivalent to size(r)==0. Users should do this check if it is unknown if the underlying row data structure is allocated.
        Specified by:
        isEmpty in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        true if row does not contain non-zero values
      • isEmptyCol

        public boolean isEmptyCol​(int c)
      • checkValidity

        public boolean checkValidity​(int rlen,
                                     int clen,
                                     long nnz,
                                     boolean strict)
        Description copied from class: SparseBlock
        Validate the correctness of the internal data structures of the different sparse block implementations.
        Specified by:
        checkValidity in class SparseBlock
        Parameters:
        rlen - number of rows
        clen - number of columns
        nnz - number of non zeros
        strict - enforce optional properties
        Returns:
        true if the sparse block is valid wrt the corresponding format such as COO, CSR, MCSR.
      • indexes

        public int[] indexes​(int r)
        Description copied from class: SparseBlock
        Get the sorted array of column indexes of all non-zero entries in row r. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).
        Specified by:
        indexes in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        sorted array of column indexes
      • indexesCol

        public int[] indexesCol​(int c)
      • values

        public double[] values​(int r)
        Description copied from class: SparseBlock
        Get the array of all non-zero entries in row r, sorted by their column indexes. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).
        Specified by:
        values in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        array of all non-zero entries in row r sorted by column indexes
      • valuesCol

        public double[] valuesCol​(int c)
      • pos

        public int pos​(int r)
        Description copied from class: SparseBlock
        Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).
        Specified by:
        pos in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        starting position of row r
      • set

        public boolean set​(int r,
                           int c,
                           double v)
        Description copied from class: SparseBlock
        Set the value of a matrix cell (r,c). This might update an existing non-zero value, insert a new non-zero value, or delete a non-zero value.
        Specified by:
        set in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        v - zero or non-zero value
        Returns:
        true, if number of non-zeros changed
      • set

        public void set​(int r,
                        SparseRow row,
                        boolean deep)
        Description copied from class: SparseBlock
        Set the values of row r to the given sparse row. This might update existing non-zero values, insert a new row, or delete a row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.
        Specified by:
        set in class SparseBlock
        Parameters:
        r - row index starting at 0
        row - sparse row
        deep - indicator to create deep copy of sparse row
      • setCol

        public void setCol​(int c,
                           SparseRow col,
                           boolean deep)
      • add

        public boolean add​(int r,
                           int c,
                           double v)
        Description copied from class: SparseBlock
        Add a value to a matrix cell (r,c). This might update an existing non-zero value, or insert a new non-zero value.
        Specified by:
        add in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        v - zero or non-zero value
        Returns:
        true, if number of non-zeros changed
      • append

        public void append​(int r,
                           int c,
                           double v)
        Description copied from class: SparseBlock
        Append a value to the end of the physical representation. This should only be used for operations with sequential write pattern or if followed by a sort() operation. Note that this operation does not perform any matrix cell updates.
        Specified by:
        append in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        v - zero or non-zero value
      • setIndexRange

        public void setIndexRange​(int r,
                                  int cl,
                                  int cu,
                                  double[] v,
                                  int vix,
                                  int vlen)
        Description copied from class: SparseBlock
        Sets a dense array of non-zeros values into the column range [cl,cu) in row r. The passed value array may be larger and the relevant range is given by [vix,vix+len).
        Specified by:
        setIndexRange in class SparseBlock
        Parameters:
        r - row index starting at 0
        cl - lower column index starting at 0
        cu - upper column index starting at 0
        v - value array
        vix - start index in value array
        vlen - number of relevant values
      • setIndexRangeCol

        public void setIndexRangeCol​(int c,
                                     int rl,
                                     int ru,
                                     double[] v,
                                     int vix,
                                     int vlen)
      • setIndexRange

        public void setIndexRange​(int r,
                                  int cl,
                                  int cu,
                                  double[] v,
                                  int[] vix,
                                  int vpos,
                                  int vlen)
        Description copied from class: SparseBlock
        Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r. The passed value array may be larger.
        Specified by:
        setIndexRange in class SparseBlock
        Parameters:
        r - row index starting at 0
        cl - lower column index starting at 0
        cu - upper column index starting at 0
        v - value array
        vix - column index array
        vpos - start index in value and index arrays
        vlen - number of relevant values
      • setIndexRangeCol

        public void setIndexRangeCol​(int c,
                                     int rl,
                                     int ru,
                                     double[] v,
                                     int[] vix,
                                     int vpos,
                                     int vlen)
      • deleteIndexRange

        public void deleteIndexRange​(int r,
                                     int cl,
                                     int cu)
        Description copied from class: SparseBlock
        Deletes all non-zero values of the given column range [cl,cu) in row r.
        Specified by:
        deleteIndexRange in class SparseBlock
        Parameters:
        r - row index starting at 0
        cl - lower column index starting at 0
        cu - upper column index starting at 0
      • deleteIndexRangeCol

        public void deleteIndexRangeCol​(int c,
                                        int rl,
                                        int ru)
      • sort

        public void sort()
        Description copied from class: SparseBlock
        Sort all non-zero value/index pairs of the sparse block by row and column index.
        Specified by:
        sort in class SparseBlock
      • sort

        public void sort​(int r)
        Description copied from class: SparseBlock
        Sort all non-zero value/index pairs of row r column index.
        Specified by:
        sort in class SparseBlock
        Parameters:
        r - row index starting at 0
      • sortCol

        public void sortCol​(int c)
      • get

        public double get​(int r,
                          int c)
        Description copied from class: SparseBlock
        Get value of matrix cell (r,c). In case of non existing values this call returns 0.
        Specified by:
        get in interface Block
        Specified by:
        get in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        value of cell at position (r,c)
      • get

        public SparseRow get​(int r)
        Description copied from class: SparseBlock
        Get values of row r in the format of a sparse row.
        Specified by:
        get in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        values of row r as a sparse row
      • getCol

        public SparseRow getCol​(int c)
      • posFIndexLTE

        public int posFIndexLTE​(int r,
                                int c)
        Description copied from class: SparseBlock
        Get position of first column index lower than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.
        Specified by:
        posFIndexLTE in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        position of the first column index lower than or equal to column c in row r
      • posFIndexLTECol

        public int posFIndexLTECol​(int r,
                                   int c)
      • posFIndexGTE

        public int posFIndexGTE​(int r,
                                int c)
        Description copied from class: SparseBlock
        Get position of first column index greater than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1. Note if CSR the pos(r) is subtracted from the result.
        Specified by:
        posFIndexGTE in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        position of the first column index greater than or equal to column c in row r
      • posFIndexGTECol

        public int posFIndexGTECol​(int r,
                                   int c)
      • posFIndexGT

        public int posFIndexGT​(int r,
                               int c)
        Description copied from class: SparseBlock
        Get position of first column index greater than column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.
        Specified by:
        posFIndexGT in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        position of the first column index greater than column c in row r
      • posFIndexGTCol

        public int posFIndexGTCol​(int r,
                                  int c)
      • getCols

        public SparseRow[] getCols()
        Helper function for MCSC
        Returns:
        the underlying array of columns SparseRow
      • getRows

        public SparseRow[] getRows()
        Helper function for MCSC
        Returns:
        the corresponding array of rows SparseRow
      • getNonEmptyColumnsIterator

        public Iterator<Integer> getNonEmptyColumnsIterator​(int cl,
                                                            int cu)