Matrix

A matrix is represented either by an OperationNode or by the derived class Matrix. Matrices are the most fundamental objects SystemDS operates on.

Although it is possible to generate matrices with the function calls or object construction specified below, the recommended way is to use the methods defined on SystemDSContext.

Create general OperationNode

Parameters:

sds_context – The SystemDS context for performing the operations
operation – The name of the DML function to execute
unnamed_input_nodes – inputs identified by their position, not name
named_input_nodes – inputs with their respective parameter name
output_type – type of the output in DML (double, matrix etc.)
is_python_local_data – whether the data is local in Python, e.g. NumPy arrays
number_of_outputs – If set to a value other than 1, this operation node is expected to return multiple values. In that case, remember to set output_types as well.
output_types – The types of the outputs in a multi-output scenario. The default, None, means every output is a matrix.

abs() → Matrix

Calculate the element-wise absolute value.

Returns: Matrix representing operation

acos() → Matrix

Calculate arccos.

Returns: Matrix representing operation

asin() → Matrix

Calculate arcsin.

Returns: Matrix representing operation

atan() → Matrix

Calculate arctan.

Returns: Matrix representing operation

cbind(other) → Matrix

Column-wise matrix concatenation, by concatenating the second matrix as additional columns to the first matrix.

Parameters: other – The other matrix to bind to the right hand side
Returns: The OperationNode containing the concatenated matrices/frames

cholesky(safe: bool = False) → Matrix

Computes the Cholesky decomposition of a symmetric, positive definite matrix

Parameters: safe – Defaults to False. If True, additional checks are applied to ensure that the matrix is symmetric positive definite; if False, these checks are skipped.
Returns: the OperationNode representing this operation
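
To make the role of the safe flag concrete, here is a minimal plain-Python sketch of a Cholesky factorization. It does not use SystemDS; the helper names cholesky_sketch and is_symmetric are hypothetical, and the checks shown are only an assumption about what "additional checks" could look like. It produces a lower-triangular factor L with A = L %*% t(L).

```python
import math


def is_symmetric(a, tol=1e-12):
    """Return True if the square list-of-lists matrix `a` is symmetric."""
    n = len(a)
    return all(len(row) == n for row in a) and all(
        abs(a[i][j] - a[j][i]) <= tol for i in range(n) for j in range(i)
    )


def cholesky_sketch(a, safe=False):
    """Hypothetical helper: lower-triangular L with a == L * L^T.

    With safe=True, symmetry and positive definiteness are checked
    explicitly (an illustrative assumption, not the SystemDS checks).
    """
    n = len(a)
    if safe and not is_symmetric(a):
        raise ValueError("matrix is not symmetric")
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if safe and d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


L = cholesky_sketch([[4.0, 2.0], [2.0, 3.0]], safe=True)
```

For this input the factor is [[2, 0], [1, sqrt(2)]]; multiplying it by its transpose gives back the original matrix.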

code_line(var_name: str, unnamed_input_vars: Sequence[str], named_input_vars: Dict[str, str]) → str

Generates the DML code line equal to the intended action of this node.

Parameters:

var_name – Name of the DML variable this node's result should be saved in
unnamed_input_vars – all strings representing the unnamed parameters
named_input_vars – all strings representing the named parameters (name value pairs)

Returns:

the DML code line that is equal to this operation
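
As a rough illustration of what such a code line could look like, the sketch below assembles an assignment from an operation name and its inputs. The function code_line_sketch and the exact output format are assumptions made for this example; the real method lives on the node and may format its output differently.

```python
from typing import Dict, Sequence


def code_line_sketch(operation: str, var_name: str,
                     unnamed_input_vars: Sequence[str],
                     named_input_vars: Dict[str, str]) -> str:
    """Hypothetical helper: build `var=operation(args);` as a string."""
    # Positional arguments come first, in the order given.
    args = list(unnamed_input_vars)
    # Named arguments are rendered as name=value pairs.
    args += [f"{name}={value}" for name, value in named_input_vars.items()]
    return f"{var_name}={operation}({','.join(args)});"


line_unnamed = code_line_sketch("abs", "V2", ["V1"], {})
line_named = code_line_sketch("rand", "V1", [], {"rows": "2", "cols": "3"})
```

Here line_unnamed is the string V2=abs(V1); and line_named is V1=rand(rows=2,cols=3);.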

compute(verbose: bool = False, lineage: bool = False) → array

Get the result of this operation. Builds the DML script and executes it in SystemDS. Until this method is called, all operations only build the DAG without executing anything (lazy evaluation).

Parameters:

verbose – Can be activated to print additional information such as the created DML script
lineage – Can be activated to print the lineage trace up to this node

Returns:

the output as a Python built-in data type or NumPy array
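
The lazy-evaluation idea can be sketched in plain Python. The LazyNode class below is a hypothetical stand-in, not the SystemDS implementation: chaining methods only records the DAG, and work happens when compute() is called.

```python
class LazyNode:
    """Hypothetical stand-in for a DAG node; not the SystemDS class."""

    evaluations = 0  # counts how many nodes have actually been evaluated

    def __init__(self, func, *inputs):
        self.func = func      # what to do once inputs are available
        self.inputs = inputs  # upstream nodes in the DAG

    def abs(self):
        # Only records the operation; nothing is computed here.
        return LazyNode(lambda rows: [[abs(v) for v in r] for r in rows], self)

    def t(self):
        # Transpose, again recorded lazily.
        return LazyNode(lambda rows: [list(c) for c in zip(*rows)], self)

    def compute(self):
        # Evaluate upstream nodes first, then this node.
        args = [node.compute() for node in self.inputs]
        LazyNode.evaluations += 1
        return self.func(*args)


def constant(rows):
    """Wrap local data as a leaf node of the DAG."""
    return LazyNode(lambda: rows)


dag = constant([[1, -2], [-3, 4]]).abs().t()
before = LazyNode.evaluations  # still 0: the DAG has only been built
result = dag.compute()         # now all three nodes are evaluated
```

After compute(), result holds [[1, 3], [2, 4]] and the evaluation counter has advanced from 0 to 3, one step per node in the chain.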

cos() → Matrix

Calculate cos.

Returns: Matrix representing operation

cosh() → Matrix

Calculate cosh.

Returns: Matrix representing operation

eigen() → Matrix

Computes the eigendecomposition of the input matrix A. The decomposition consists of two matrices V and w such that A = V %*% diag(w) %*% t(V). The columns of V are the eigenvectors of A, and the eigenvalues are given by w. Note that this function can only operate on small-to-medium sized input matrices that fit in main memory. For larger matrices, an out-of-memory exception is raised.

This function returns two matrices w and V, where w is (m x 1) and V is of size (m x m).

Returns: The MultiReturn node containing the two Matrices w and V
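
The identity A = V %*% diag(w) %*% t(V) can be checked by hand on a small symmetric matrix. The sketch below uses plain Python lists and a known decomposition of a 2 x 2 matrix; the helpers matmul, transpose, and diag are written for this example and are not SystemDS functions.

```python
import math


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def diag(w):
    """Build a diagonal matrix from a list of values."""
    n = len(w)
    return [[w[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
w = [1.0, 3.0]          # eigenvalues of A, conceptually an (m x 1) column
V = [[s, s], [-s, s]]   # eigenvectors of A stored as columns, (m x m)

# Rebuild A from its decomposition: V * diag(w) * V^T
reconstructed = matmul(matmul(V, diag(w)), transpose(V))
```

Up to floating-point rounding, reconstructed equals A.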

max(axis: int | None = None) → OperationNode

Calculate max of matrix.

Parameters: axis – can be 0 or 1 to do either row or column aggregation
Returns: Matrix representing operation

mean(axis: int | None = None) → OperationNode

Calculate mean of matrix.

Parameters: axis – can be 0 or 1 to do either row or column means
Returns: Matrix representing operation

min(axis: int | None = None) → OperationNode

Calculate min of matrix.

Parameters: axis – can be 0 or 1 to do either row or column aggregation
Returns: Matrix representing operation

order(by: int = 1, decreasing: bool = False, index_return: bool = False) → Matrix

Sorts the matrix by one of its columns in increasing or decreasing order and returns either the sorted data or the index.

Parameters:

by – sort matrix by this column number
decreasing – If true the matrix will be sorted in decreasing order
index_return – If true, the index numbers will be returned

Returns:

the OperationNode representing this operation
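
The interplay of the three parameters can be sketched in plain Python. The function order_sketch is hypothetical, and it assumes that by is a 1-based column number and that returned indices are 1-based as well, which matches DML conventions but is not stated above.

```python
def order_sketch(rows, by=1, decreasing=False, index_return=False):
    """Hypothetical helper mimicking the parameters described above."""
    col = by - 1  # assumption: `by` is a 1-based column number
    # Positions of the rows in sorted order (stable for equal keys).
    idx = sorted(range(len(rows)), key=lambda i: rows[i][col],
                 reverse=decreasing)
    if index_return:
        # Assumption: indices are reported 1-based, as a single column.
        return [[i + 1] for i in idx]
    return [rows[i] for i in idx]


X = [[3, 10], [1, 30], [2, 20]]
ascending = order_sketch(X)
descending = order_sketch(X, by=1, decreasing=True)
positions = order_sketch(X, by=1, index_return=True)
```

With this input, ascending is [[1, 30], [2, 20], [3, 10]], descending is [[3, 10], [2, 20], [1, 30]], and positions is [[2], [3], [1]].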

pass_python_data_to_prepared_script(sds, var_name: str, prepared_script: JavaObject) → None

Passes data from python to the prepared script object.

Parameters:

sds – the SystemDS context, which provides access to the Java virtual machine
var_name – the variable name the data should get in Java
prepared_script – the prepared script

rbind(other) → Matrix

Row-wise matrix concatenation, by concatenating the second matrix as additional rows to the first matrix.

Parameters: other – The other matrix to bind to the bottom
Returns: The OperationNode containing the concatenated matrices/frames
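
The difference between rbind and the cbind method documented earlier is easiest to see on small inputs. The following plain-Python sketch uses lists of rows; cbind_sketch and rbind_sketch are hypothetical helpers and only illustrate the resulting shapes.

```python
def cbind_sketch(left, right):
    """Append the columns of `right` to the right of `left`."""
    if len(left) != len(right):
        raise ValueError("cbind needs the same number of rows")
    return [l + r for l, r in zip(left, right)]


def rbind_sketch(top, bottom):
    """Append the rows of `bottom` below `top`."""
    if len(top[0]) != len(bottom[0]):
        raise ValueError("rbind needs the same number of columns")
    return [list(r) for r in top] + [list(r) for r in bottom]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
wide = cbind_sketch(A, B)  # 2 x 4
tall = rbind_sketch(A, B)  # 4 x 2
```

Here wide is [[1, 2, 5, 6], [3, 4, 7, 8]] and tall is [[1, 2], [3, 4], [5, 6], [7, 8]].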

replace(pattern: DAGNode | str | int | float | bool, replacement: DAGNode | str | int | float | bool) → Matrix

Replace all values that match pattern with the replacement value.

rev() → Matrix

Reverses the rows

Returns: the OperationNode representing this operation

round() → Matrix

Round all values to the nearest integer.

Returns: The Matrix representing the result of this operation

sin() → Matrix

Calculate sin.

Returns: Matrix representing operation

sinh() → Matrix

Calculate sinh.

Returns: Matrix representing operation

sum(axis: int | None = None) → OperationNode

Calculate sum of matrix.

Parameters: axis – can be 0 or 1 to do either row or column sums
Returns: Matrix representing operation
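
The axis parameter of the aggregation methods (max, mean, min, sum, var) can be illustrated with a plain-Python sum. The helper sum_sketch is hypothetical, and the mapping shown, where axis=0 yields one value per column and axis=1 one value per row as in NumPy, is an assumption rather than something stated above.

```python
def sum_sketch(rows, axis=None):
    """Hypothetical helper illustrating the axis argument."""
    if axis is None:
        # No axis: aggregate over every cell and return a single value.
        return sum(sum(r) for r in rows)
    if axis == 0:
        # Assumption: axis=0 collapses the rows, giving a 1 x n result.
        return [[sum(col) for col in zip(*rows)]]
    if axis == 1:
        # Assumption: axis=1 collapses the columns, giving an m x 1 result.
        return [[sum(r)] for r in rows]
    raise ValueError("axis must be 0, 1 or None")


X = [[1, 2, 3], [4, 5, 6]]
total = sum_sketch(X)
per_column = sum_sketch(X, axis=0)
per_row = sum_sketch(X, axis=1)
```

For this input, total is 21, per_column is [[5, 7, 9]], and per_row is [[6], [15]].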

svd() → Matrix

Singular Value Decomposition of a matrix A (of size m x n), which decomposes into three matrices U, V, and S as A = U %*% S %*% t(V), where U is an m x m unitary matrix (i.e., orthogonal), V is an n x n unitary matrix (also orthogonal), and S is an m x n matrix with non-negative real numbers on the diagonal.

This function returns three matrices U, S, and V, where U is (m x m), S is (m x n), and V is (n x n).

Returns: The MultiReturn node containing the three Matrices U, S, and V

t() → Matrix

Transposes the input

Returns: the OperationNode representing this operation

tan() → Matrix

Calculate tan.

Returns: Matrix representing operation

tanh() → Matrix

Calculate tanh.

Returns: Matrix representing operation

to_one_hot(num_classes: int) → Matrix

One-hot encode the matrix.

It is assumed that there is only one column to encode, and that all values are whole numbers > 0.

Parameters: num_classes – The number of classes to encode into. The maximum value contained in the matrix must be <= num_classes.
Returns: The OperationNode containing the one-hot encoded values
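
The constraints on the input (a single column of whole numbers > 0, none larger than num_classes) can be sketched in plain Python. The helper to_one_hot_sketch is hypothetical and assumes that class k is encoded as a 1 in the k-th column, counting from 1.

```python
def to_one_hot_sketch(column, num_classes):
    """Hypothetical helper: one-hot encode a single column of class ids."""
    encoded = []
    for value in column:
        # Values must be whole numbers in the range 1..num_classes.
        if value != int(value) or not 1 <= value <= num_classes:
            raise ValueError(f"cannot encode value {value!r}")
        row = [0.0] * num_classes
        # Assumption: class k maps to the k-th column (1-based).
        row[int(value) - 1] = 1.0
        encoded.append(row)
    return encoded


one_hot = to_one_hot_sketch([1, 3, 2], num_classes=3)
```

The result has one row per input value and num_classes columns, here [[1, 0, 0], [0, 0, 1], [0, 1, 0]] as floats. A value of 0, or one above num_classes, raises a ValueError in this sketch.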

to_string(**kwargs: Dict[str, DAGNode | str | int | float | bool]) → Scalar

Converts the input to a string representation.

Returns: Scalar containing the string

var(axis: int | None = None) → OperationNode

Calculate variance of matrix.

Parameters: axis – can be 0 or 1 to do either row or column vars
Returns: Matrix representing operation