Matrix

A Matrix is represented either by an OperationNode, or the derived class Matrix. Matrices are the most fundamental objects SystemDS operates on.

Although it is possible to generate matrices with the function calls or object construction specified below, the recommended way is to use the methods defined on SystemDSContext.

class systemds.operator.Matrix(sds_context, operation: str, unnamed_input_nodes: str | Iterable[DAGNode | str | int | float | bool] = None, named_input_nodes: Dict[str, DAGNode | str | int | float | bool] = None, local_data: array = None, brackets: bool = False)
__init__(sds_context, operation: str, unnamed_input_nodes: str | Iterable[DAGNode | str | int | float | bool] = None, named_input_nodes: Dict[str, DAGNode | str | int | float | bool] = None, local_data: array = None, brackets: bool = False) Matrix

Create general OperationNode

Parameters:
  • sds_context – The SystemDS context for performing the operations

  • operation – The name of the DML function to execute

  • unnamed_input_nodes – inputs identified by their position, not name

  • named_input_nodes – inputs with their respective parameter name

  • local_data – the data, if it is local in Python, e.g. a NumPy array

abs() Matrix

Calculate absolute.

Returns:

Matrix representing operation

acos() Matrix

Calculate arccos.

Returns:

Matrix representing operation

argmax(axis: int | None = None) OperationNode

Return the index of the maximum if axis is None or a column vector for row-wise / column-wise maxima computation.

Parameters:

axis – can be 0 or 1 to compute the index of the maximum for each row or each column

Returns:

Matrix representing the operation for rows / columns, or Scalar representing the operation for the complete matrix

argmin(axis: int | None = None) OperationNode

Return the index of the minimum if axis is None or a column vector for row-wise / column-wise minima computation.

Parameters:

axis – can be 0 or 1 to compute the index of the minimum for each row or each column

Returns:

Matrix representing the operation for rows / columns, or Scalar representing the operation for the complete matrix

asin() Matrix

Calculate arcsin.

Returns:

Matrix representing operation

atan() Matrix

Calculate arctan.

Returns:

Matrix representing operation

cbind(other) Matrix

Column-wise matrix concatenation, by concatenating the second matrix as additional columns to the first matrix.

Parameters:

other – The other matrix to bind to the right hand side.

Returns:

The OperationNode containing the concatenated matrices/frames.

ceil() Matrix

Return the ceiling of the input, element-wise.

Returns:

Matrix representing operation

cholesky(safe: bool = False) Matrix

Computes the Cholesky decomposition of a symmetric, positive definite matrix

Parameters:

safe – default value is False, if flag is True additional checks to ensure that the matrix is symmetric positive definite are applied, if False, checks will be skipped

Returns:

the OperationNode representing this operation

code_line(var_name: str, unnamed_input_vars: Sequence[str], named_input_vars: Dict[str, str]) str

Generates the DML code line equal to the intended action of this node.

Parameters:
  • var_name – Name of DML-variable this nodes result should be saved in

  • unnamed_input_vars – all strings representing the unnamed parameters

  • named_input_vars – all strings representing the named parameters (name value pairs)

Returns:

the DML code line that is equal to this operation

compute(verbose: bool = False, lineage: bool = False) array

Get the result of this operation. This builds the DML script and executes it in SystemDS. Until this method is called, all operations only build the DAG without executing anything (lazy evaluation).

Parameters:
  • verbose – Can be activated to print additional information such as created DML-Script

  • lineage – Can be activated to print lineage trace till this node

Returns:

the output as a Python built-in data type or NumPy array
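The lazy-evaluation pattern described for compute() can be illustrated with a toy sketch in plain Python. This is not the SystemDS implementation: the ToyNode class, the format of its generated script lines, and the two supported operations are all hypothetical, and only mimic the idea that method calls build a DAG while nothing is evaluated until compute() is called.

```python
from typing import List, Sequence


class ToyNode:
    """Hypothetical stand-in for an operation node; NOT the SystemDS class."""

    def __init__(self, operation: str, inputs: Sequence["ToyNode"] = (),
                 value: float = 0.0):
        self.operation = operation
        self.inputs = list(inputs)
        self.value = value  # only used by "literal" leaf nodes

    def abs(self) -> "ToyNode":
        # Only records the operation; nothing is evaluated here.
        return ToyNode("abs", [self])

    def sqrt(self) -> "ToyNode":
        return ToyNode("sqrt", [self])

    def script(self) -> List[str]:
        """Walk the DAG and emit one pseudo-script line per operation."""
        lines: List[str] = []
        self._emit(lines)
        return lines

    def _emit(self, lines: List[str]) -> str:
        if self.operation == "literal":
            return repr(self.value)
        args = [node._emit(lines) for node in self.inputs]
        var_name = f"V{len(lines)}"
        lines.append(f"{var_name} = {self.operation}({', '.join(args)});")
        return var_name

    def compute(self) -> float:
        """Evaluate the whole DAG only now (lazy evaluation)."""
        if self.operation == "literal":
            return self.value
        operations = {"abs": abs, "sqrt": lambda v: v ** 0.5}
        arguments = [node.compute() for node in self.inputs]
        return operations[self.operation](*arguments)


# Chaining calls builds the DAG; no arithmetic has happened yet.
node = ToyNode("literal", value=-16.0).abs().sqrt()
```

Calling node.script() lists the recorded operations in execution order, and node.compute() triggers the actual evaluation.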

cos() Matrix

Calculate cos.

Returns:

Matrix representing operation

cosh() Matrix

Calculate cosh.

Returns:

Matrix representing operation

countDistinct(axis: int | None = None) OperationNode

Calculate the number of distinct values of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column aggregation

Returns:

OperationNode representing operation

countDistinctApprox(axis: int | None = None) OperationNode

Calculate the approximate number of distinct values of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column aggregation

Returns:

OperationNode representing operation

cummax() Matrix

Column prefix-max. (For row-prefix max, use X.t().cummax().t())

Returns:

The Matrix representing the result of this operation

cummin() Matrix

Column prefix-min. (For row-prefix min, use X.t().cummin().t())

Returns:

The Matrix representing the result of this operation

cumprod() Matrix

Column prefix-product. (For row-prefix prod, use X.t().cumprod().t())

Returns:

The Matrix representing the result of this operation

cumsum() Matrix

Column prefix-sum. (For row-prefix sum, use X.t().cumsum().t())

Returns:

The Matrix representing the result of this operation
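The column prefix operations and the transpose trick for the row-wise variant can be sketched in plain Python on nested lists. This is only an illustration of the semantics, not SystemDS code; the helper names transpose, column_prefix_sum and row_prefix_sum are hypothetical.

```python
from typing import List

Matrix2D = List[List[float]]


def transpose(x: Matrix2D) -> Matrix2D:
    """Swap rows and columns."""
    return [list(column) for column in zip(*x)]


def column_prefix_sum(x: Matrix2D) -> Matrix2D:
    """Running sum down each column, one output row per input row."""
    result: Matrix2D = []
    running = [0.0] * len(x[0])
    for row in x:
        running = [acc + value for acc, value in zip(running, row)]
        result.append(running)
    return result


def row_prefix_sum(x: Matrix2D) -> Matrix2D:
    """Row prefix sum: transpose, column prefix sum, transpose back."""
    return transpose(column_prefix_sum(transpose(x)))


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
by_column = column_prefix_sum(X)  # [[1, 2], [4, 6], [9, 12]]
by_row = row_prefix_sum(X)        # [[1, 3], [3, 7], [5, 11]]
```

The same transpose, apply, transpose pattern carries over to the max, min and product variants by swapping the accumulation step.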

cumsumprod() Matrix

Column prefix-sumprod of a 2-column matrix: Y = X.cumsumprod(), where Y[i,1] = X[i,1] + X[i,2]*Y[i-1,1] for i in [1, 2, ..., nrow(X)]. The aggregator is initialized with 0 (Y[0,1] = 0).

Returns:

The Matrix representing the result of this operation
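The recurrence can be traced with a small plain-Python sketch. It is not the SystemDS implementation; the function below is a hypothetical helper that takes the 2-column matrix as a list of pairs and returns the result column as a flat list.

```python
from typing import List


def cumsumprod(x: List[List[float]]) -> List[float]:
    """Y[i] = X[i][0] + X[i][1] * Y[i-1], with the aggregator starting at 0."""
    result: List[float] = []
    previous = 0.0  # corresponds to Y[0,1] = 0
    for first, second in x:
        previous = first + second * previous
        result.append(previous)
    return result


X = [[1.0, 0.5], [2.0, 0.5], [3.0, 2.0]]
Y = cumsumprod(X)
# Step by step: 1 + 0.5*0 = 1, then 2 + 0.5*1 = 2.5, then 3 + 2*2.5 = 8
```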

diag() Matrix

Create diagonal matrix from (n x 1) matrix, or take diagonal from square matrix

Returns:

the OperationNode representing this operation

eigen() Matrix

Computes the eigendecomposition of input matrix A. The eigendecomposition consists of two matrices V and w such that A = V %*% diag(w) %*% t(V). The columns of V are the eigenvectors of the original matrix A, and the eigenvalues are given by w. Note that this function can operate only on small-to-medium sized input matrices that fit in main memory. For larger matrices, an out-of-memory exception is raised.

This function returns two matrices w and V, where w is (m x 1) and V is of size (m x m).

Returns:

The MultiReturn node containing the two Matrices w and V
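The identity A = V %*% diag(w) %*% t(V) can be checked by hand on a small symmetric matrix. The sketch below uses plain Python rather than SystemDS; the matrix A, its eigenvalues w and eigenvectors V are hand-chosen for illustration, and the order in which SystemDS returns eigenvalues is not implied by this example.

```python
import math
from typing import List

Matrix2D = List[List[float]]


def matmul(a: Matrix2D, b: Matrix2D) -> Matrix2D:
    """Plain triple-loop matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(x: Matrix2D) -> Matrix2D:
    return [list(column) for column in zip(*x)]


def diag(w: List[float]) -> Matrix2D:
    """Build a diagonal matrix from a vector."""
    n = len(w)
    return [[w[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


A = [[2.0, 1.0], [1.0, 2.0]]
w = [1.0, 3.0]              # eigenvalues of A
s = 1.0 / math.sqrt(2.0)
V = [[s, s], [-s, s]]       # columns are the matching unit eigenvectors

# Rebuild A from its eigendecomposition.
reconstructed = matmul(matmul(V, diag(w)), transpose(V))
```

Up to floating-point rounding, reconstructed equals A.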

exp() Matrix

Calculate exponential.

Returns:

Matrix representing operation

fft() MultiReturn

Performs the Fast Fourier Transform (FFT) on the matrix.

Returns:

A MultiReturn object representing the real and imaginary parts of the FFT output.

floor() Matrix

Return the floor of the input, element-wise.

Returns:

Matrix representing operation

ifft(imag_input: Matrix | None = None) MultiReturn

Performs the Inverse Fast Fourier Transform (IFFT) on a complex matrix.

Parameters:

imag_input – The imaginary part of the input matrix (optional).

Returns:

A MultiReturn object representing the real and imaginary parts of the IFFT output.

inv() Matrix

Computes the inverse of a square matrix.

Returns:

The Matrix representing the result of this operation

isInf() Matrix

Computes a boolean indicator matrix of the same shape as the input, indicating where Inf (positive or negative infinity) values are located.

Returns:

the OperationNode representing this operation

isNA() Matrix

Computes a boolean indicator matrix of the same shape as the input, indicating where NA (not available) values are located. Currently, NA is only capturing NaN values.

Returns:

the OperationNode representing this operation

isNaN() Matrix

Computes a boolean indicator matrix of the same shape as the input, indicating where NaN (not a number) values are located.

Returns:

the OperationNode representing this operation

log() Matrix

Calculate logarithm.

Returns:

Matrix representing operation

lu() MultiReturn

Computes the pivoted LU decomposition of a square matrix A. The LU decomposition consists of three matrices P, L, and U such that P %*% A = L %*% U, where P is a permutation matrix that is used to rearrange the rows in A before the decomposition can be computed. L is a lower-triangular matrix whereas U is an upper-triangular matrix.

Returns:

The MultiReturn node containing the three Matrices p, l and u
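The relation P %*% A = L %*% U can be made concrete with a minimal plain-Python sketch using partial (row) pivoting. This is a textbook variant written for illustration, not the routine SystemDS uses, and the helper names matmul and pivoted_lu are hypothetical.

```python
from typing import List, Tuple

Matrix2D = List[List[float]]


def matmul(a: Matrix2D, b: Matrix2D) -> Matrix2D:
    """Plain triple-loop matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def pivoted_lu(a: Matrix2D) -> Tuple[Matrix2D, Matrix2D, Matrix2D]:
    """Return (P, L, U) with P @ A == L @ U, using partial pivoting."""
    n = len(a)
    u = [row[:] for row in a]
    l = [[0.0] * n for _ in range(n)]
    perm = list(range(n))  # perm[i] = original row now sitting in row i
    for k in range(n):
        # Pick the row with the largest absolute entry in column k.
        pivot = max(range(k, n), key=lambda r: abs(u[r][k]))
        if u[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        u[k], u[pivot] = u[pivot], u[k]
        l[k], l[pivot] = l[pivot], l[k]
        perm[k], perm[pivot] = perm[pivot], perm[k]
        # Eliminate the entries below the pivot.
        for r in range(k + 1, n):
            factor = u[r][k] / u[k][k]
            l[r][k] = factor
            u[r] = [ur - factor * uk for ur, uk in zip(u[r], u[k])]
    for i in range(n):
        l[i][i] = 1.0  # unit diagonal of L
    p = [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]
    return p, l, u


A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
P, L, U = pivoted_lu(A)
PA = matmul(P, A)
LU = matmul(L, U)
```

Here PA and LU agree up to floating-point rounding, L has zeros above its diagonal, and U has (numerically) zeros below its diagonal.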

max(axis: int | None = None) OperationNode

Calculate max of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column aggregation

Returns:

Matrix representing operation

mean(axis: int | None = None) OperationNode

Calculate mean of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column means

Returns:

Matrix representing operation

median(weights: Matrix | None = None) Scalar

Calculate median of a column matrix.

Returns:

Scalar representing operation

min(axis: int | None = None) OperationNode

Calculate min of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column aggregation

Returns:

Matrix representing operation

order(by: int = 1, decreasing: bool = False, index_return: bool = False) Matrix

Sort the matrix by one of its columns in increasing or decreasing order, and return either the sorted data or the row indices.

Parameters:
  • by – sort matrix by this column number

  • decreasing – If true the matrix will be sorted in decreasing order

  • index_return – If true, the index numbers will be returned

Returns:

the OperationNode representing this operation
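The effect of the three parameters can be sketched in plain Python. This is an illustration only, not SystemDS code, and it assumes DML-style 1-based numbering for both the column number by and the returned row indices.

```python
from typing import List, Union

Matrix2D = List[List[float]]


def order(x: Matrix2D, by: int = 1, decreasing: bool = False,
          index_return: bool = False) -> Union[Matrix2D, List[List[int]]]:
    """Sort rows by column `by` (assumed 1-based); return data or indices."""
    row_order = sorted(range(len(x)), key=lambda i: x[i][by - 1],
                       reverse=decreasing)
    if index_return:
        # Column vector of (assumed 1-based) original row numbers.
        return [[i + 1] for i in row_order]
    return [x[i][:] for i in row_order]


X = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]
sorted_rows = order(X)                        # sort by first column
sorted_desc = order(X, by=2, decreasing=True)  # sort by second column, descending
indices = order(X, index_return=True)          # which rows end up where
```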

pass_python_data_to_prepared_script(sds, var_name: str, prepared_script: JavaObject) None

Passes data from python to the prepared script object.

Parameters:
  • sds – the SystemDS context, which provides access to the Java virtual machine

  • var_name – the variable name the data should get in java

  • prepared_script – the prepared script

prod(axis: int | None = None) OperationNode

Calculate product of cells in matrix.

Parameters:

axis – can be 0 or 1 to do either row or column products

Returns:

Matrix representing operation

qr() MultiReturn

Computes QR decomposition of a matrix A using Householder reflectors. The QR decomposition of A consists of two matrices Q and R such that A = Q%*%R where Q is an orthogonal matrix (i.e., Q%*%t(Q) = t(Q)%*%Q = I, identity matrix) and R is an upper triangular matrix. For efficiency purposes, this function returns the matrix of Householder reflector vectors H instead of Q (which is a large m x m potentially dense matrix). The Q matrix can be explicitly computed from H, if needed. In most applications of QR, one is interested in calculating Q %*% B or t(Q) %*% B – and, both can be computed directly using H instead of explicitly constructing the large Q matrix.

Returns:

The MultiReturn node containing the two Matrices h and r

quantile(p, weights: Matrix | None = None) OperationNode

Returns a column matrix with all quantiles requested in p.

Parameters:
  • p – float for a single quantile or column matrix of requested quantiles

  • weights – (optional) weights matrix of the same shape as self

Returns:

Matrix or ‘Scalar’ representing operation

rbind(other) Matrix

Row-wise matrix concatenation, by concatenating the second matrix as additional rows to the first matrix.

Parameters:

other – The other matrix to append as additional rows.

Returns:

The OperationNode containing the concatenated matrices/frames.
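The difference between the two concatenations, rbind here and cbind earlier in this class, can be sketched in plain Python on nested lists. These helpers are illustrative stand-ins, not SystemDS code, and the dimension checks are an assumption about what a sensible implementation would enforce.

```python
from typing import List

Matrix2D = List[List[float]]


def cbind(a: Matrix2D, b: Matrix2D) -> Matrix2D:
    """Append the columns of b to the right of a (row counts must match)."""
    if len(a) != len(b):
        raise ValueError("cbind needs the same number of rows")
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def rbind(a: Matrix2D, b: Matrix2D) -> Matrix2D:
    """Append the rows of b below a (column counts must match)."""
    if len(a[0]) != len(b[0]):
        raise ValueError("rbind needs the same number of columns")
    return [row[:] for row in a] + [row[:] for row in b]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
side_by_side = cbind(A, B)  # 2 x 4
stacked = rbind(A, B)       # 4 x 2
```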

replace(pattern: DAGNode | str | int | float | bool, replacement: DAGNode | str | int | float | bool) Matrix

Replace all values matching pattern with the replacement value.

reshape(rows, cols=1)

Gives a new shape to a matrix without changing its data.

Parameters:
  • rows – number of rows

  • cols – number of columns, defaults to 1

Returns:

Matrix representing operation

rev() Matrix

Reverses the rows

Returns:

the OperationNode representing this operation

roll(shift: int) Matrix

Rolls the rows of the matrix, shifting them circularly by shift positions.

Returns:

the OperationNode representing this operation

round() Matrix

Round all values to the nearest integer.

Returns:

The Matrix representing the result of this operation

sd() Scalar

Calculate standard deviation of matrix.

Returns:

Scalar representing operation

sign() Matrix

Returns a matrix representing the signs of the input matrix elements, where 1 represents positive, 0 represents zero, and -1 represents negative.

Returns:

Matrix representing operation

sin() Matrix

Calculate sin.

Returns:

Matrix representing operation

sinh() Matrix

Calculate sinh.

Returns:

Matrix representing operation

sqrt() Matrix

Calculate square root.

Returns:

Matrix representing operation

sum(axis: int | None = None) OperationNode

Calculate sum of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column sums

Returns:

Matrix representing operation

svd() Matrix

Singular Value Decomposition of a matrix A (of size m x n), which decomposes into three matrices U, V, and S as A = U %*% S %*% t(V), where U is an m x m unitary matrix (i.e., orthogonal), V is an n x n unitary matrix (also orthogonal), and S is an m x n matrix with non-negative real numbers on the diagonal.

This function returns the matrices U (m x m), S (m x n), and V (n x n).

Returns:

The MultiReturn node containing the three Matrices U, S, and V

t() Matrix

Transposes the input

Returns:

the OperationNode representing this operation

tan() Matrix

Calculate tan.

Returns:

Matrix representing operation

tanh() Matrix

Calculate tanh.

Returns:

Matrix representing operation

to_one_hot(num_classes: int) Matrix

OneHot encode the matrix.

It is assumed that there is only one column to encode, and all values are whole numbers > 0

Parameters:

num_classes – The number of classes to encode into. The maximum value contained in the matrix must be <= num_classes

Returns:

The OperationNode containing the oneHotEncoded values
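The encoding and its constraints can be sketched in plain Python. This is not the SystemDS implementation; it assumes that class value k is mapped to the k-th output column (1-based), which matches the requirement that all values are whole numbers > 0 and at most num_classes.

```python
from typing import List


def to_one_hot(column: List[int], num_classes: int) -> List[List[int]]:
    """One-hot encode a single column of class labels in 1..num_classes."""
    encoded: List[List[int]] = []
    for value in column:
        if not 1 <= value <= num_classes:
            raise ValueError(f"label {value} is outside 1..{num_classes}")
        row = [0] * num_classes
        row[value - 1] = 1  # assumed mapping: class k -> k-th column
        encoded.append(row)
    return encoded


labels = [1, 3, 2]
one_hot = to_one_hot(labels, num_classes=3)
```

Each output row contains exactly one 1, so the result has one row per input value and num_classes columns.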

to_string(**kwargs: Dict[str, DAGNode | str | int | float | bool]) Scalar

Converts the input to a string representation.

Returns:

Scalar containing the string.

trace() Scalar

Calculate trace.

Returns:

Scalar representing operation

tril(include_diagonal=True, return_values=True) Matrix

Selects the lower triangular part of a matrix, configurable to include the diagonal and return values or ones

Parameters:
  • include_diagonal – boolean, default True

  • return_values – boolean, default True, if set to False returns ones

Returns:

Matrix

triu(include_diagonal=True, return_values=True) Matrix

Selects the upper triangular part of a matrix, configurable to include the diagonal and return values or ones

Parameters:
  • include_diagonal – boolean, default True

  • return_values – boolean, default True, if set to False returns ones

Returns:

Matrix
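How include_diagonal and return_values interact for tril and triu can be sketched in plain Python. The helpers below are illustrative only, not SystemDS code, and they assume that cells outside the selected triangle are set to zero.

```python
from typing import List

Matrix2D = List[List[float]]


def _triangle(x: Matrix2D, lower: bool, include_diagonal: bool,
              return_values: bool) -> Matrix2D:
    """Shared helper: keep one triangle, zero out everything else."""
    result: Matrix2D = []
    for i, row in enumerate(x):
        new_row = []
        for j, value in enumerate(row):
            in_triangle = (j < i) if lower else (j > i)
            selected = in_triangle or (include_diagonal and i == j)
            if not selected:
                new_row.append(0.0)  # assumed fill value outside the triangle
            else:
                new_row.append(value if return_values else 1.0)
        result.append(new_row)
    return result


def tril(x: Matrix2D, include_diagonal: bool = True,
         return_values: bool = True) -> Matrix2D:
    return _triangle(x, True, include_diagonal, return_values)


def triu(x: Matrix2D, include_diagonal: bool = True,
         return_values: bool = True) -> Matrix2D:
    return _triangle(x, False, include_diagonal, return_values)


X = [[1.0, 2.0], [3.0, 4.0]]
lower = tril(X)                               # [[1, 0], [3, 4]]
strict_upper = triu(X, include_diagonal=False)  # [[0, 2], [0, 0]]
lower_mask = tril(X, return_values=False)     # [[1, 0], [1, 1]]
```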

unique(axis: int | None = None) Matrix

Returns the unique values for the complete matrix, for each row or for each column.

Parameters:

axis – can be 0 or 1 to do either row or column uniques

Returns:

Matrix representing operation

var(axis: int | None = None) OperationNode

Calculate variance of matrix.

Parameters:

axis – can be 0 or 1 to do either row or column vars

Returns:

OperationNode representing operation