Class LibMatrixReorg
- java.lang.Object
  - org.apache.sysds.runtime.matrix.data.LibMatrixReorg
-
public class LibMatrixReorg extends Object
MB: Library for selected matrix reorg operations, including special cases and all combinations of dense and sparse representations. Currently supported operations:
- reshape
- r' (transpose)
- rdiag (diagV2M/diagM2V)
- rsort (sorting data/indexes)
- rmempty (remove empty)
- rexpand (outer/table-seq expansion)
-
-
Field Summary
Modifier and Type, Field:
static long PAR_NUMCELL_THRESHOLD
static int PAR_NUMCELL_THRESHOLD_SORT
static boolean SHALLOW_COPY_REORG
static boolean SPARSE_OUTPUTS_IN_CSR
-
Method Summary
Modifier and Type, Method, followed by the description where one exists:
static void checkRexpand(MatrixBlock in, boolean ignore)
    Quick check of whether the input is valid for rexpand; passing this check does not guarantee that the input is valid for rexpand.
static Future<int[]> countNNZColumns(MatrixBlock in, int k, ExecutorService pool)
static List<Future<int[]>> countNNZColumnsFuture(MatrixBlock in, int k, ExecutorService pool)
static int[] countNnzPerColumn(MatrixBlock in)
static int[] countNnzPerColumn(MatrixBlock in, int rl, int ru)
static MatrixBlock diag(MatrixBlock in, MatrixBlock out)
static boolean isSupportedReorgOperator(ReorgOperator op)
static int[] mergeNnzCounts(int[] cnt, int[] cnt2)
static MatrixBlock reorg(MatrixBlock in, MatrixBlock out, ReorgOperator op)
static MatrixBlock reorgInPlace(MatrixBlock in, ReorgOperator op)
static List<IndexedMatrixValue> reshape(IndexedMatrixValue in, DataCharacteristics mcIn, DataCharacteristics mcOut, boolean rowwise, boolean outputEmptyBlocks)
    MR/Spark reshape interface. For reshape we cannot view blocks independently, and hence there are different CP and MR interfaces.
static MatrixBlock reshape(MatrixBlock in, int rows, int cols, boolean rowwise)
    CP reshape operation (single input, single output matrix). NOTE: In contrast to R, the rowwise parameter specifies both the read and write order, with row-wise being the default, while R always reads column-wise, with rowwise specifying only the write order and column-wise being the default.
static MatrixBlock reshape(MatrixBlock in, MatrixBlock out, int rows, int cols, boolean rowwise)
    CP reshape operation (single input, single output matrix). NOTE: In contrast to R, the rowwise parameter specifies both the read and write order, with row-wise being the default, while R always reads column-wise, with rowwise specifying only the write order and column-wise being the default.
static MatrixBlock reshape(MatrixBlock in, MatrixBlock out, int rows, int cols, boolean rowwise, int k)
    CP reshape operation (single input, single output matrix). NOTE: In contrast to R, the rowwise parameter specifies both the read and write order, with row-wise being the default, while R always reads column-wise, with rowwise specifying only the write order and column-wise being the default.
static void rev(IndexedMatrixValue in, long rlen, int blen, ArrayList<IndexedMatrixValue> out)
static MatrixBlock rev(MatrixBlock in, MatrixBlock out)
static void rexpand(IndexedMatrixValue data, double max, boolean rows, boolean cast, boolean ignore, long blen, ArrayList<IndexedMatrixValue> outList)
    MR/Spark rexpand operation (single input, multiple outputs incl. empty blocks).
static MatrixBlock rexpand(MatrixBlock in, MatrixBlock ret, double max, boolean rows, boolean cast, boolean ignore, int k)
    CP rexpand operation (single input, single output). The classic example of this operation is one-hot encoding of a column to multiple columns.
static MatrixBlock rexpand(MatrixBlock in, MatrixBlock ret, int max, boolean rows, boolean cast, boolean ignore, int k)
    CP rexpand operation (single input, single output). The classic example of this operation is one-hot encoding of a column to multiple columns.
static void rmempty(IndexedMatrixValue data, IndexedMatrixValue offset, boolean rmRows, long len, long blen, ArrayList<IndexedMatrixValue> outList)
    MR rmempty interface. For rmempty we cannot view blocks independently, and hence there are different CP and MR interfaces.
static MatrixBlock rmempty(MatrixBlock in, MatrixBlock ret, boolean rows, boolean emptyReturn, MatrixBlock select)
    CP rmempty operation (single input, single output matrix).
static MatrixBlock sort(MatrixBlock in, MatrixBlock out, int[] by, boolean desc, boolean ixret)
static MatrixBlock sort(MatrixBlock in, MatrixBlock out, int[] by, boolean desc, boolean ixret, int k)
static MatrixBlock transpose(MatrixBlock in)
static MatrixBlock transpose(MatrixBlock in, int k)
static MatrixBlock transpose(MatrixBlock in, int k, boolean allowCSR)
static MatrixBlock transpose(MatrixBlock in, MatrixBlock out)
static MatrixBlock transpose(MatrixBlock in, MatrixBlock out, int k)
static MatrixBlock transpose(MatrixBlock in, MatrixBlock out, int k, boolean allowCSR)
static MatrixBlock transposeInPlace(MatrixBlock in, int k)
-
-
-
Field Detail
-
PAR_NUMCELL_THRESHOLD
public static long PAR_NUMCELL_THRESHOLD
-
PAR_NUMCELL_THRESHOLD_SORT
public static final int PAR_NUMCELL_THRESHOLD_SORT
- See Also:
- Constant Field Values
-
SHALLOW_COPY_REORG
public static final boolean SHALLOW_COPY_REORG
- See Also:
- Constant Field Values
-
SPARSE_OUTPUTS_IN_CSR
public static final boolean SPARSE_OUTPUTS_IN_CSR
- See Also:
- Constant Field Values
-
-
Method Detail
-
isSupportedReorgOperator
public static boolean isSupportedReorgOperator(ReorgOperator op)
-
reorg
public static MatrixBlock reorg(MatrixBlock in, MatrixBlock out, ReorgOperator op)
-
reorgInPlace
public static MatrixBlock reorgInPlace(MatrixBlock in, ReorgOperator op)
-
transpose
public static MatrixBlock transpose(MatrixBlock in)
-
transpose
public static MatrixBlock transpose(MatrixBlock in, MatrixBlock out)
-
transpose
public static MatrixBlock transpose(MatrixBlock in, int k)
-
transpose
public static MatrixBlock transpose(MatrixBlock in, int k, boolean allowCSR)
-
transpose
public static MatrixBlock transpose(MatrixBlock in, MatrixBlock out, int k)
-
transpose
public static MatrixBlock transpose(MatrixBlock in, MatrixBlock out, int k, boolean allowCSR)
-
countNNZColumns
public static Future<int[]> countNNZColumns(MatrixBlock in, int k, ExecutorService pool) throws InterruptedException, ExecutionException
-
countNNZColumnsFuture
public static List<Future<int[]>> countNNZColumnsFuture(MatrixBlock in, int k, ExecutorService pool) throws InterruptedException
- Throws:
InterruptedException
-
transposeInPlace
public static MatrixBlock transposeInPlace(MatrixBlock in, int k)
-
rev
public static MatrixBlock rev(MatrixBlock in, MatrixBlock out)
-
rev
public static void rev(IndexedMatrixValue in, long rlen, int blen, ArrayList<IndexedMatrixValue> out)
-
diag
public static MatrixBlock diag(MatrixBlock in, MatrixBlock out)
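The class description names the two directions of rdiag as diagV2M (vector to matrix) and diagM2V (matrix to vector). The following is a minimal sketch of those two directions on plain `double[][]` arrays rather than `MatrixBlock`. The class `DiagSketch` is hypothetical, and the rule that a single-column input selects the vector-to-matrix direction is an assumption of this sketch, not something this page states.

```java
// Illustrative sketch only; not the LibMatrixReorg implementation.
final class DiagSketch {
    // Assumption: an n x 1 input is expanded to an n x n diagonal matrix (diagV2M),
    // any other (square) input is reduced to its n x 1 diagonal (diagM2V).
    static double[][] diag(double[][] in) {
        int n = in.length;
        boolean vector = in[0].length == 1;
        if (!vector && in[0].length != n)
            throw new IllegalArgumentException("matrix input must be square");
        double[][] out = new double[n][vector ? n : 1];
        for (int i = 0; i < n; i++) {
            if (vector)
                out[i][i] = in[i][0]; // place the vector entry on the diagonal
            else
                out[i][0] = in[i][i]; // extract the diagonal entry
        }
        return out;
    }
}
```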
-
sort
public static MatrixBlock sort(MatrixBlock in, MatrixBlock out, int[] by, boolean desc, boolean ixret)
-
sort
public static MatrixBlock sort(MatrixBlock in, MatrixBlock out, int[] by, boolean desc, boolean ixret, int k)
- Parameters:
in - Input matrix to sort
out - Output matrix into which the sorted input is written
by - The ordering parameter
desc - If true, sort in descending order
ixret - If true, return the sorted indexes
k - Number of parallel threads
- Returns:
- The sorted output matrix.
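To make the roles of `by`, `desc`, and `ixret` concrete, here is a small sketch on plain arrays. `SortSketch` and its methods are hypothetical names. The sketch uses 0-based column and row indexes for simplicity, which is an assumption and may differ from the indexing the library uses.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only; not the LibMatrixReorg implementation.
final class SortSketch {
    // Row positions of 'in', ordered by the columns listed in 'by'
    // (0-based in this sketch); later columns break ties of earlier ones.
    static Integer[] sortedRowIndexes(double[][] in, int[] by, boolean desc) {
        Integer[] ix = new Integer[in.length];
        for (int i = 0; i < ix.length; i++)
            ix[i] = i;
        Comparator<Integer> cmp = (a, b) -> {
            for (int c : by) {
                int r = Double.compare(in[a][c], in[b][c]);
                if (r != 0)
                    return desc ? -r : r;
            }
            return 0;
        };
        Arrays.sort(ix, cmp); // stable sort for object arrays
        return ix;
    }

    // ixret = true: return the permutation as a column vector;
    // ixret = false: return the reordered rows.
    static double[][] sort(double[][] in, int[] by, boolean desc, boolean ixret) {
        Integer[] ix = sortedRowIndexes(in, by, desc);
        double[][] out = new double[in.length][];
        for (int i = 0; i < ix.length; i++)
            out[i] = ixret ? new double[] { ix[i] } : in[ix[i]].clone();
        return out;
    }
}
```

Returning indexes instead of data is useful when the same permutation has to be applied to other matrices afterwards.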
-
reshape
public static MatrixBlock reshape(MatrixBlock in, int rows, int cols, boolean rowwise)
CP reshape operation (single input, single output matrix). NOTE: In contrast to R, the rowwise parameter specifies both the read and write order, with row-wise being the default, while R always reads column-wise, with rowwise specifying only the write order and column-wise being the default.
- Parameters:
in - input matrix
rows - number of rows
cols - number of columns
rowwise - if true, reshape by row
- Returns:
- output matrix
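The note on read and write order can be illustrated with a small sketch on plain arrays. `ReshapeSketch` is a hypothetical class, not part of the library; it only shows that `rowwise` governs both how cells are read from the input and how they are written to the output.

```java
// Illustrative sketch only; not the LibMatrixReorg implementation.
final class ReshapeSketch {
    static double[][] reshape(double[][] in, int rows, int cols, boolean rowwise) {
        int m = in.length, n = in[0].length;
        if ((long) m * n != (long) rows * cols)
            throw new IllegalArgumentException("cell count must be preserved");
        double[][] out = new double[rows][cols];
        for (int p = 0; p < m * n; p++) {
            if (rowwise)
                out[p / cols][p % cols] = in[p / n][p % n]; // read and write row by row
            else
                out[p % rows][p / rows] = in[p % m][p / m]; // read and write column by column
        }
        return out;
    }
}
```

For the 2 x 3 input `{{1, 2, 3}, {4, 5, 6}}` reshaped to 3 x 2, the row-wise result is `{{1, 2}, {3, 4}, {5, 6}}`, while the column-wise result is `{{1, 5}, {4, 3}, {2, 6}}`.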
-
reshape
public static MatrixBlock reshape(MatrixBlock in, MatrixBlock out, int rows, int cols, boolean rowwise)
CP reshape operation (single input, single output matrix). NOTE: In contrast to R, the rowwise parameter specifies both the read and write order, with row-wise being the default, while R always reads column-wise, with rowwise specifying only the write order and column-wise being the default.
- Parameters:
in - input matrix
out - output matrix
rows - number of rows
cols - number of columns
rowwise - if true, reshape by row
- Returns:
- output matrix
-
reshape
public static MatrixBlock reshape(MatrixBlock in, MatrixBlock out, int rows, int cols, boolean rowwise, int k)
CP reshape operation (single input, single output matrix). NOTE: In contrast to R, the rowwise parameter specifies both the read and write order, with row-wise being the default, while R always reads column-wise, with rowwise specifying only the write order and column-wise being the default.
- Parameters:
in - input matrix
out - output matrix
rows - number of rows
cols - number of columns
rowwise - if true, reshape by row
k - the degree of parallelism
- Returns:
- output matrix
-
reshape
public static List<IndexedMatrixValue> reshape(IndexedMatrixValue in, DataCharacteristics mcIn, DataCharacteristics mcOut, boolean rowwise, boolean outputEmptyBlocks)
MR/Spark reshape interface. For reshape we cannot view blocks independently, and hence there are different CP and MR interfaces.
- Parameters:
in - indexed matrix value
mcIn - input matrix characteristics
mcOut - output matrix characteristics
rowwise - if true, reshape by row
outputEmptyBlocks - output blocks with nnz=0
- Returns:
- list of indexed matrix values
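The remark that blocks cannot be viewed independently can be illustrated with a little index arithmetic: under a row-wise reshape, the cells of one input block may land in several output blocks. The sketch below is hypothetical (`BlockReshapeSketch`, 0-based block indexes, square blocks of side `blen`) and only enumerates target blocks; it does not reflect how the library actually computes or materializes them.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; not the LibMatrixReorg implementation.
final class BlockReshapeSketch {
    // Output blocks ("rowBlock,colBlock", 0-based) that receive at least one cell
    // of input block (bi, bj) under a row-wise reshape.
    static Set<String> targetBlocks(int bi, int bj, long inRows, long inCols,
                                    long outCols, int blen) {
        Set<String> targets = new TreeSet<>();
        long r0 = (long) bi * blen, c0 = (long) bj * blen;
        long r1 = Math.min(r0 + blen, inRows), c1 = Math.min(c0 + blen, inCols);
        for (long i = r0; i < r1; i++) {
            for (long j = c0; j < c1; j++) {
                long p = i * inCols + j;               // row-major linear position
                long oi = p / outCols, oj = p % outCols; // cell position in the output
                targets.add((oi / blen) + "," + (oj / blen));
            }
        }
        return targets;
    }
}
```

For example, reshaping a 4 x 4 matrix to 2 x 8 with `blen = 2` sends the cells of input block (0, 0) to output blocks (0, 0) and (0, 2), so a single input block contributes partial content to more than one output block.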
-
rmempty
public static MatrixBlock rmempty(MatrixBlock in, MatrixBlock ret, boolean rows, boolean emptyReturn, MatrixBlock select)
CP rmempty operation (single input, single output matrix).
- Parameters:
in - input matrix
ret - output matrix
rows - ?
emptyReturn - return row/column of zeros for empty input
select - ?
- Returns:
- matrix block
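As a rough illustration of removing empty rows and of the `emptyReturn` flag, consider the sketch below on plain arrays. `RmEmptySketch` is hypothetical. It covers only the row direction, omits the `select` argument (which this page leaves undocumented), and assumes that `emptyReturn` yields a single row of zeros when no non-empty row remains.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the LibMatrixReorg implementation.
final class RmEmptySketch {
    static double[][] removeEmptyRows(double[][] in, boolean emptyReturn) {
        List<double[]> kept = new ArrayList<>();
        for (double[] row : in) {
            boolean nonZero = false;
            for (double v : row) {
                if (v != 0) {
                    nonZero = true;
                    break;
                }
            }
            if (nonZero)
                kept.add(row.clone()); // keep rows with at least one non-zero cell
        }
        // Assumption: with emptyReturn, an all-empty input yields one row of zeros.
        if (kept.isEmpty() && emptyReturn)
            kept.add(new double[in.length == 0 ? 0 : in[0].length]);
        return kept.toArray(new double[0][]);
    }
}
```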
-
rmempty
public static void rmempty(IndexedMatrixValue data, IndexedMatrixValue offset, boolean rmRows, long len, long blen, ArrayList<IndexedMatrixValue> outList)
MR rmempty interface. For rmempty we cannot view blocks independently, and hence there are different CP and MR interfaces.
- Parameters:
data - ?
offset - ?
rmRows - ?
len - ?
blen - block length
outList - list of indexed matrix values
-
rexpand
public static MatrixBlock rexpand(MatrixBlock in, MatrixBlock ret, double max, boolean rows, boolean cast, boolean ignore, int k)
CP rexpand operation (single input, single output). The classic example of this operation is one-hot encoding of a column to multiple columns.
- Parameters:
in - Input matrix
ret - Output matrix
max - Number of rows/cols of the output
rows - Whether the expansion is in the row direction
cast - Whether the contained values should be cast to double (rounded up and down)
ignore - Whether to ignore input values below zero, which are technically invalid input
k - Degree of parallelism
- Returns:
- The rexpanded output matrix
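The one-hot encoding case mentioned above can be sketched on plain arrays as follows. `RexpandSketch` is hypothetical. The sketch assumes 1-based integer category codes and expansion into columns, omits the `cast` flag, and treats `ignore` as "leave the row empty for out-of-range values", all of which are assumptions rather than documented library behavior.

```java
// Illustrative sketch only; not the LibMatrixReorg implementation.
final class RexpandSketch {
    // Expand a column of category codes 1..max into an in.length x max indicator matrix.
    static double[][] oneHot(double[] in, int max, boolean ignore) {
        double[][] out = new double[in.length][max];
        for (int i = 0; i < in.length; i++) {
            long code = (long) in[i];
            if (code < 1 || code > max) {
                if (ignore)
                    continue; // assumption: invalid codes produce an empty row
                throw new IllegalArgumentException(
                    "value out of range at row " + i + ": " + in[i]);
            }
            out[i][(int) (code - 1)] = 1; // single indicator per row
        }
        return out;
    }
}
```

Each output row has at most one non-zero cell, so results of this kind are typically very sparse.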
-
rexpand
public static MatrixBlock rexpand(MatrixBlock in, MatrixBlock ret, int max, boolean rows, boolean cast, boolean ignore, int k)
CP rexpand operation (single input, single output). The classic example of this operation is one-hot encoding of a column to multiple columns.
- Parameters:
in - Input matrix
ret - Output matrix
max - Number of rows/cols of the output
rows - Whether the expansion is in the row direction
cast - Whether the contained values should be cast to double (rounded up and down)
ignore - Whether to ignore input values below zero, which are technically invalid input
k - Degree of parallelism
- Returns:
- The rexpanded output matrix
-
checkRexpand
public static void checkRexpand(MatrixBlock in, boolean ignore)
Quick check of whether the input is valid for rexpand; passing this check does not guarantee that the input is valid for rexpand.
- Parameters:
in - Input matrix block
ignore - Whether zero-valued cells should be ignored
-
rexpand
public static void rexpand(IndexedMatrixValue data, double max, boolean rows, boolean cast, boolean ignore, long blen, ArrayList<IndexedMatrixValue> outList)
MR/Spark rexpand operation (single input, multiple outputs incl. empty blocks).
- Parameters:
data - Input indexed matrix block
max - Total number of rows/cols of the output
rows - Whether the expansion is in the row direction
cast - Whether the contained values should be cast to double (rounded up and down)
ignore - Whether to ignore input values below zero, which are technically invalid input
blen - The block size into which the output is sliced
outList - The output IndexedMatrixValues (a list to which all output blocks are added)
-
countNnzPerColumn
public static int[] countNnzPerColumn(MatrixBlock in)
-
countNnzPerColumn
public static int[] countNnzPerColumn(MatrixBlock in, int rl, int ru)
-
mergeNnzCounts
public static int[] mergeNnzCounts(int[] cnt, int[] cnt2)
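The counting methods above, together with the `Future`-based variants, suggest a split, count, and merge pattern. The sketch below shows that pattern on plain arrays with a standard `ExecutorService`. `NnzCountSketch` is hypothetical, and two details are assumptions: that `ru` is an exclusive upper row bound and that merging means element-wise addition into the first array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Illustrative sketch only; not the LibMatrixReorg implementation.
final class NnzCountSketch {
    // Count non-zeros per column over rows [rl, ru) (exclusive upper bound: an assumption).
    static int[] countNnzPerColumn(double[][] in, int rl, int ru) {
        int[] cnt = new int[in[0].length];
        for (int i = rl; i < ru; i++)
            for (int j = 0; j < cnt.length; j++)
                if (in[i][j] != 0)
                    cnt[j]++;
        return cnt;
    }

    // Add cnt2 into cnt element-wise and return cnt (an assumption about the semantics).
    static int[] mergeNnzCounts(int[] cnt, int[] cnt2) {
        for (int j = 0; j < cnt.length; j++)
            cnt[j] += cnt2[j];
        return cnt;
    }

    // Split the rows into at most k chunks, count each chunk as a task, then merge.
    static int[] countParallel(double[][] in, int k, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        int blk = (in.length + k - 1) / k; // rows per task, rounded up
        List<Future<int[]>> parts = new ArrayList<>();
        for (int rl = 0; rl < in.length; rl += blk) {
            final int lo = rl, hi = Math.min(rl + blk, in.length);
            Callable<int[]> task = () -> countNnzPerColumn(in, lo, hi);
            parts.add(pool.submit(task));
        }
        int[] total = new int[in[0].length];
        for (Future<int[]> f : parts)
            mergeNnzCounts(total, f.get()); // blocks until the partial count is ready
        return total;
    }
}
```

Per-column non-zero counts of the input are the per-row counts of its transpose, which is one plausible reason a reorg library would expose them, for instance to size sparse output rows up front.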
-
-