Class ReaderTextCellParallel


  • public class ReaderTextCellParallel
    extends ReaderTextCell
    Parallel version of ReaderTextCell.java. To summarize, we create read tasks per split and use a fixed-size thread pool to execute these tasks. If the target matrix is dense, the inserts are done lock-free. If the matrix is sparse, we use a buffer to collect unordered input cells, lock the target sparse matrix once, and append all buffered values. Note MatrixMarket: 1) For matrix market files, each read task probes for comments until it finds data because, for very small tasks or large comments, any split might encounter % or %%. Hence, the parallel reader does not perform the validity check. 2) In extreme scenarios, the last comment might be in one split and the following meta data in the subsequent split. This would create incorrect results or errors. However, this scenario is extremely unlikely (it requires num threads > num lines in the case of 1 comment line) and is hence ignored, similar to our parallel MR setting (but there we have a 128MB guarantee). 3) However, we use MIN_FILESIZE_MM (8KB) to give guarantees for the common case of small headers in order to avoid the issue described in (2).
    • Constructor Detail

      • ReaderTextCellParallel

        public ReaderTextCellParallel(Types.FileFormat fmt)
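
The strategy in the class description (one read task per split, a fixed-size thread pool, lock-free dense inserts, and a single locked append of buffered cells for sparse targets) can be illustrated with a minimal, self-contained sketch. This is not the SystemDS implementation: the class ParallelCellReadSketch, its methods, the in-memory "splits" (lists of lines), the double[][] dense target, and the map-based sparse target are all hypothetical stand-ins chosen for illustration. The sketch also assumes that every cell appears at most once in the input and that no MatrixMarket size line is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names and data structures are assumptions.
public class ParallelCellReadSketch {

    /** One parsed input cell with 0-based indices. */
    static final class Cell {
        final int row; final int col; final double val;
        Cell(int row, int col, double val) { this.row = row; this.col = col; this.val = val; }
    }

    /** Parses "i j v" lines (1-based indices) of one split, skipping '%' comment lines. */
    static List<Cell> parseSplit(List<String> lines) {
        List<Cell> buffer = new ArrayList<>();
        for (String line : lines) {
            String s = line.trim();
            if (s.isEmpty() || s.startsWith("%"))
                continue; // every task probes for comments, since any split may contain them
            String[] p = s.split("\\s+");
            buffer.add(new Cell(Integer.parseInt(p[0]) - 1,
                Integer.parseInt(p[1]) - 1, Double.parseDouble(p[2])));
        }
        return buffer;
    }

    /** Dense target: tasks write directly, without locks (assumes distinct cells). */
    static double[][] readDense(List<List<String>> splits, int rows, int cols, int threads)
            throws Exception {
        double[][] dest = new double[rows][cols];
        List<Callable<Void>> tasks = new ArrayList<>();
        for (List<String> split : splits) {
            tasks.add(() -> {
                for (Cell c : parseSplit(split))
                    dest[c.row][c.col] = c.val;
                return null;
            });
        }
        runAll(tasks, threads);
        return dest;
    }

    /** Sparse target: each task buffers its cells, locks the target once, and appends. */
    static Map<Long, Double> readSparse(List<List<String>> splits, int cols, int threads)
            throws Exception {
        Map<Long, Double> dest = new TreeMap<>(); // key = row * cols + col
        List<Callable<Void>> tasks = new ArrayList<>();
        for (List<String> split : splits) {
            tasks.add(() -> {
                List<Cell> buffer = parseSplit(split); // parsing happens outside the lock
                synchronized (dest) {
                    for (Cell c : buffer)
                        dest.put((long) c.row * cols + c.col, c.val);
                }
                return null;
            });
        }
        runAll(tasks, threads);
        return dest;
    }

    /** Runs all tasks on a fixed-size pool and propagates any task failure. */
    static void runAll(List<Callable<Void>> tasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (Future<Void> f : pool.invokeAll(tasks))
                f.get();
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the sparse path holds the lock only while appending, so parsing still runs in parallel across splits; the dense path needs no lock because distinct cells map to distinct array elements.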