Class Matrix


  • public class Matrix
    extends Object
    Matrix encapsulates a SystemDS matrix and converts it to other formats, such as RDDs, JavaRDDs, DataFrames, and two-dimensional double arrays. After script execution, it provides a convenient way to obtain SystemDS matrix data in Scala tuples.
    • Constructor Summary

      Constructors 
      Constructor Description
      Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes, MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
      Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.
      Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
      Convert a Spark DataFrame to a SystemDS binary-block representation.
      Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, long numRows, long numCols)
      Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.
      Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
      Convert a Spark DataFrame to a SystemDS binary-block representation.
      Matrix(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
    • Constructor Detail

      • Matrix

        public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame,
                      MatrixMetadata matrixMetadata)
        Convert a Spark DataFrame to a SystemDS binary-block representation.
        Parameters:
        dataFrame - the Spark DataFrame
        matrixMetadata - matrix metadata, such as number of rows and columns
      • Matrix

        public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame,
                      long numRows,
                      long numCols)
        Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.
        Parameters:
        dataFrame - the Spark DataFrame
        numRows - the number of rows
        numCols - the number of columns
      • Matrix

        public Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes, MatrixBlock> binaryBlocks,
                      MatrixMetadata matrixMetadata)
        Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.
        Parameters:
        binaryBlocks - the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix
        matrixMetadata - matrix metadata, such as number of rows and columns
      • Matrix

        public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
        Convert a Spark DataFrame to a SystemDS binary-block representation.
        Parameters:
        dataFrame - the Spark DataFrame
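The constructors above all end with the matrix held in a binary-block representation: a collection of tiles, each keyed by its block position. The following plain-Java sketch illustrates the idea only. It does not use the SystemDS `MatrixIndexes` or `MatrixBlock` classes; the class name `BlockPartitionSketch`, the string keys, the 1-based block indexes, and the tiny block size are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in only; not the SystemDS MatrixIndexes/MatrixBlock classes.
class BlockPartitionSketch {

    /**
     * Split a dense matrix into tiles of at most blen x blen cells.
     * Keys are "rowBlock,colBlock" strings with 1-based block indexes
     * (an assumption of this sketch).
     */
    static Map<String, double[][]> toBlocks(double[][] m, int blen) {
        Map<String, double[][]> blocks = new LinkedHashMap<>();
        int rows = m.length;
        int cols = rows == 0 ? 0 : m[0].length;
        for (int r0 = 0; r0 < rows; r0 += blen) {
            for (int c0 = 0; c0 < cols; c0 += blen) {
                // Tiles on the bottom and right edges may be smaller than blen.
                int h = Math.min(blen, rows - r0);
                int w = Math.min(blen, cols - c0);
                double[][] tile = new double[h][w];
                for (int i = 0; i < h; i++) {
                    System.arraycopy(m[r0 + i], c0, tile[i], 0, w);
                }
                blocks.put((r0 / blen + 1) + "," + (c0 / blen + 1), tile);
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        double[][] m = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
        // Print the block keys produced for a 3x3 matrix with block size 2.
        System.out.println(toBlocks(m, 2).keySet());
    }
}
```

The row and column counts carried by the metadata are what let a consumer know the size of the edge tiles without scanning the data.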
    • Method Detail

      • toMatrixObject

        public MatrixObject toMatrixObject()
        Obtain the matrix as a SystemDS MatrixObject.
        Returns:
        the matrix as a SystemDS MatrixObject
      • to2DDoubleArray

        public double[][] to2DDoubleArray()
        Obtain the matrix as a two-dimensional double array
        Returns:
        the matrix as a two-dimensional double array
      • toJavaRDDStringIJV

        public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringIJV()
        Obtain the matrix as a JavaRDD<String> in IJV format
        Returns:
        the matrix as a JavaRDD<String> in IJV format
      • toJavaRDDStringCSV

        public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringCSV()
        Obtain the matrix as a JavaRDD<String> in CSV format
        Returns:
        the matrix as a JavaRDD<String> in CSV format
      • toRDDStringCSV

        public org.apache.spark.rdd.RDD<String> toRDDStringCSV()
        Obtain the matrix as an RDD<String> in CSV format
        Returns:
        the matrix as an RDD<String> in CSV format
      • toRDDStringIJV

        public org.apache.spark.rdd.RDD<String> toRDDStringIJV()
        Obtain the matrix as an RDD<String> in IJV format
        Returns:
        the matrix as an RDD<String> in IJV format
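The four string-based methods above differ only in the RDD flavor and in the text layout. As a rough illustration of the two layouts, the sketch below renders a small `double[][]` both ways in plain Java. It assumes that IJV means one `row column value` line per non-zero cell with 1-based indexes, and that CSV means one comma-separated line per row; the class `TextFormatSketch`, the separators, and the number formatting are assumptions of this sketch, not taken from the SystemDS implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; separators and number formatting are assumptions.
class TextFormatSketch {

    /** One "row col value" line per non-zero cell, using 1-based indexes. */
    static List<String> toIJV(double[][] m) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                if (m[i][j] != 0.0) {
                    lines.add((i + 1) + " " + (j + 1) + " " + m[i][j]);
                }
            }
        }
        return lines;
    }

    /** One comma-separated line per matrix row, zeros included. */
    static List<String> toCSV(double[][] m) {
        List<String> lines = new ArrayList<>();
        for (double[] row : m) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    sb.append(',');
                }
                sb.append(row[j]);
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        double[][] m = { {1, 0}, {0, 4} };
        System.out.println(toIJV(m));
        System.out.println(toCSV(m));
    }
}
```

The cell-oriented layout tends to suit sparse matrices, since zero cells produce no output, while the row-oriented layout keeps every cell.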
      • toDF

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
        Obtain the matrix as a DataFrame of doubles with an ID column
        Returns:
        the matrix as a DataFrame of doubles with an ID column
      • toDFDoubleWithIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleWithIDColumn()
        Obtain the matrix as a DataFrame of doubles with an ID column
        Returns:
        the matrix as a DataFrame of doubles with an ID column
      • toDFDoubleNoIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleNoIDColumn()
        Obtain the matrix as a DataFrame of doubles with no ID column
        Returns:
        the matrix as a DataFrame of doubles with no ID column
      • toDFVectorWithIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorWithIDColumn()
        Obtain the matrix as a DataFrame of vectors with an ID column
        Returns:
        the matrix as a DataFrame of vectors with an ID column
      • toDFVectorNoIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorNoIDColumn()
        Obtain the matrix as a DataFrame of vectors with no ID column
        Returns:
        the matrix as a DataFrame of vectors with no ID column
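The DataFrame methods above vary along two axes: whether each matrix row is spread over one double column per matrix column or packed into a single vector column, and whether a row ID column is included. The sketch below enumerates the resulting column layouts in plain Java. The column names `__INDEX`, `C1`, `C2`, and so on are assumptions chosen for illustration, as is the class `DataFrameLayoutSketch`; check the schema of the returned DataFrame for the actual names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; the column names are assumptions, not a documented schema.
class DataFrameLayoutSketch {

    /**
     * List the columns a result would have for a matrix with numCols columns.
     *
     * @param asVector true to pack each row into one vector column
     * @param withId   true to prepend a row ID column
     */
    static List<String> columns(int numCols, boolean asVector, boolean withId) {
        List<String> cols = new ArrayList<>();
        if (withId) {
            cols.add("__INDEX"); // assumed name of the ID column
        }
        if (asVector) {
            cols.add("C1"); // a single column holding the whole row as a vector
        } else {
            for (int j = 1; j <= numCols; j++) {
                cols.add("C" + j); // one double column per matrix column
            }
        }
        return cols;
    }

    public static void main(String[] args) {
        System.out.println(columns(3, false, true));  // doubles, with ID
        System.out.println(columns(3, false, false)); // doubles, no ID
        System.out.println(columns(3, true, true));   // vector, with ID
        System.out.println(columns(3, true, false));  // vector, no ID
    }
}
```

An ID column matters when row order must survive a distributed round trip, because DataFrame rows carry no inherent order.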
      • toBinaryBlocks

        public org.apache.spark.api.java.JavaPairRDD<MatrixIndexes, MatrixBlock> toBinaryBlocks()
        Obtain the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
        Returns:
        the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
      • toMatrixBlock

        public MatrixBlock toMatrixBlock()
        Obtain the matrix as a MatrixBlock
        Returns:
        the matrix as a MatrixBlock
      • getMatrixMetadata

        public MatrixMetadata getMatrixMetadata()
        Obtain the matrix metadata
        Returns:
        the matrix metadata
      • toString

        public String toString()
        If MatrixObject is available, output MatrixObject.toString(). If MatrixObject is not available but MatrixMetadata is available, output MatrixMetadata.toString(). Otherwise output Object.toString().
        Overrides:
        toString in class Object
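The fallback order described for `toString` can be sketched in plain Java as follows. The fields here are generic `Object` stand-ins for `MatrixObject` and `MatrixMetadata`, and the class `ToStringFallbackSketch` is hypothetical; this shows the documented order of checks, not the actual SystemDS source.

```java
// Illustrative only; Object fields stand in for MatrixObject and MatrixMetadata.
class ToStringFallbackSketch {
    private final Object matrixObject;   // may be null
    private final Object matrixMetadata; // may be null

    ToStringFallbackSketch(Object matrixObject, Object matrixMetadata) {
        this.matrixObject = matrixObject;
        this.matrixMetadata = matrixMetadata;
    }

    @Override
    public String toString() {
        if (matrixObject != null) {
            return matrixObject.toString();   // first choice
        }
        if (matrixMetadata != null) {
            return matrixMetadata.toString(); // second choice
        }
        return super.toString();              // Object.toString() as the last resort
    }

    public static void main(String[] args) {
        System.out.println(new ToStringFallbackSketch("matrix object", "metadata"));
        System.out.println(new ToStringFallbackSketch(null, "metadata"));
        System.out.println(new ToStringFallbackSketch(null, null));
    }
}
```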
      • hasBinaryBlocks

        public boolean hasBinaryBlocks()
        Whether or not this matrix contains data as binary blocks
        Returns:
        true if data as binary blocks are present, false otherwise.
      • hasMatrixObject

        public boolean hasMatrixObject()
        Whether or not this matrix contains data as a MatrixObject
        Returns:
        true if data as a MatrixObject is present, false otherwise.
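Taken together, `hasBinaryBlocks` and `hasMatrixObject` suggest that a Matrix may hold its data in either representation, or in both. One plausible way to structure such a holder is sketched below in plain Java: each representation is a nullable field, and a missing one is derived from the other on first request. The class `DualRepresentationSketch`, its type parameters, the converter function, and the caching behavior are assumptions for illustration and are not confirmed by this page.

```java
import java.util.function.Function;

// Illustrative only; B and M stand in for the binary-block and MatrixObject types.
class DualRepresentationSketch<B, M> {
    private B binaryBlocks;                // may be null until first requested
    private final M matrixObject;          // may be null
    private final Function<M, B> toBlocks; // hypothetical converter

    DualRepresentationSketch(B binaryBlocks, M matrixObject, Function<M, B> toBlocks) {
        this.binaryBlocks = binaryBlocks;
        this.matrixObject = matrixObject;
        this.toBlocks = toBlocks;
    }

    boolean hasBinaryBlocks() {
        return binaryBlocks != null;
    }

    boolean hasMatrixObject() {
        return matrixObject != null;
    }

    /** Return the binary blocks, converting from the matrix object once if needed. */
    B toBinaryBlocks() {
        if (binaryBlocks == null) {
            if (matrixObject == null) {
                throw new IllegalStateException("no binary blocks or matrix object available");
            }
            binaryBlocks = toBlocks.apply(matrixObject); // convert once, then reuse
        }
        return binaryBlocks;
    }

    public static void main(String[] args) {
        DualRepresentationSketch<String, Integer> s =
            new DualRepresentationSketch<String, Integer>(null, 42, i -> "blocks:" + i);
        System.out.println(s.hasBinaryBlocks());
        System.out.println(s.toBinaryBlocks());
        System.out.println(s.hasBinaryBlocks());
    }
}
```

Callers can use the two `has` checks to pick the accessor that avoids a conversion.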