Package org.apache.sysds.api.mlcontext
Class MLContextConversionUtil
java.lang.Object
    org.apache.sysds.api.mlcontext.MLContextConversionUtil
public class MLContextConversionUtil extends Object
Utility class containing methods to perform data conversions.
-
-
Constructor Summary
Constructor                  Description
MLContextConversionUtil()
-
Method Summary
Modifier and Type  Method  Description
static FrameObject binaryBlocksToFrameObject(org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> binaryBlocks)
    Convert a JavaPairRDD<Long, FrameBlock> to a FrameObject.
static FrameObject binaryBlocksToFrameObject(org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> binaryBlocks, FrameMetadata frameMetadata)
    Convert a JavaPairRDD<Long, FrameBlock> to a FrameObject.
static MatrixBlock binaryBlocksToMatrixBlock(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
    Convert a JavaPairRDD<MatrixIndexes, MatrixBlock> to a MatrixBlock.
static MatrixObject binaryBlocksToMatrixObject(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks)
    Convert a JavaPairRDD<MatrixIndexes, MatrixBlock> to a MatrixObject.
static MatrixObject binaryBlocksToMatrixObject(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
    Convert a JavaPairRDD<MatrixIndexes, MatrixBlock> to a MatrixObject.
static org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> dataFrameToFrameBinaryBlocks(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, FrameMetadata frameMetadata)
    Convert a DataFrame to a JavaPairRDD<Long, FrameBlock> binary-block frame.
static FrameObject dataFrameToFrameObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
    Convert a DataFrame to a FrameObject.
static FrameObject dataFrameToFrameObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, FrameMetadata frameMetadata)
    Convert a DataFrame to a FrameObject.
static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> dataFrameToMatrixBinaryBlocks(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
    Convert a DataFrame to a JavaPairRDD<MatrixIndexes, MatrixBlock> binary-block matrix.
static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> dataFrameToMatrixBinaryBlocks(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
    Convert a DataFrame to a JavaPairRDD<MatrixIndexes, MatrixBlock> binary-block matrix.
static MatrixObject dataFrameToMatrixObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
    Convert a DataFrame to a MatrixObject.
static MatrixObject dataFrameToMatrixObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
    Convert a DataFrame to a MatrixObject.
static void determineFrameFormatIfNeeded(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, FrameMetadata frameMetadata)
    If the FrameFormat of the DataFrame has not been explicitly specified, attempt to determine the proper FrameFormat.
static void determineMatrixFormatIfNeeded(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
    If the MatrixFormat of the DataFrame has not been explicitly specified, attempt to determine the proper MatrixFormat.
static MatrixObject doubleMatrixToMatrixObject(String variableName, double[][] doubleMatrix)
    Convert a two-dimensional double array to a MatrixObject.
static MatrixObject doubleMatrixToMatrixObject(String variableName, double[][] doubleMatrix, MatrixMetadata matrixMetadata)
    Convert a two-dimensional double array to a MatrixObject.
static FrameObject frameBlockToFrameObject(String variableName, FrameBlock frameBlock, FrameMetadata frameMetadata)
    Convert a FrameBlock to a FrameObject.
static String[][] frameObjectTo2DStringArray(FrameObject frameObject)
    Convert a FrameObject to a two-dimensional string array.
static org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> frameObjectToBinaryBlocks(FrameObject frameObject, SparkExecutionContext sparkExecutionContext)
    Convert a FrameObject to a JavaPairRDD<Long, FrameBlock>.
static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> frameObjectToDataFrame(FrameObject frameObject, SparkExecutionContext sparkExecutionContext)
    Convert a FrameObject to a DataFrame.
static org.apache.spark.api.java.JavaRDD<String> frameObjectToJavaRDDStringCSV(FrameObject frameObject, String delimiter)
    Convert a FrameObject to a JavaRDD<String> in CSV format.
static org.apache.spark.api.java.JavaRDD<String> frameObjectToJavaRDDStringIJV(FrameObject frameObject)
    Convert a FrameObject to a JavaRDD<String> in IJV format.
static List<String> frameObjectToListStringCSV(FrameObject frameObject, String delimiter)
    Convert a FrameObject to a List<String> in CSV format.
static List<String> frameObjectToListStringIJV(FrameObject frameObject)
    Convert a FrameObject to a List<String> in IJV format.
static org.apache.spark.rdd.RDD<String> frameObjectToRDDStringCSV(FrameObject frameObject, String delimiter)
    Convert a FrameObject to an RDD<String> in CSV format.
static org.apache.spark.rdd.RDD<String> frameObjectToRDDStringIJV(FrameObject frameObject)
    Convert a FrameObject to an RDD<String> in IJV format.
static boolean isDataFrameWithIDColumn(FrameMetadata frameMetadata)
    Return whether or not the DataFrame has an ID column.
static boolean isDataFrameWithIDColumn(MatrixMetadata matrixMetadata)
    Return whether or not the DataFrame has an ID column.
static boolean isVectorBasedDataFrame(MatrixMetadata matrixMetadata)
    Return whether or not the DataFrame is vector-based.
static FrameObject javaRDDStringCSVToFrameObject(org.apache.spark.api.java.JavaRDD<String> javaRDD)
    Convert a JavaRDD<String> in CSV format to a FrameObject.
static FrameObject javaRDDStringCSVToFrameObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, FrameMetadata frameMetadata)
    Convert a JavaRDD<String> in CSV format to a FrameObject.
static MatrixObject javaRDDStringCSVToMatrixObject(org.apache.spark.api.java.JavaRDD<String> javaRDD)
    Convert a JavaRDD<String> in CSV format to a MatrixObject.
static MatrixObject javaRDDStringCSVToMatrixObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, MatrixMetadata matrixMetadata)
    Convert a JavaRDD<String> in CSV format to a MatrixObject.
static FrameObject javaRDDStringIJVToFrameObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, FrameMetadata frameMetadata)
    Convert a JavaRDD<String> in IJV format to a FrameObject.
static MatrixObject javaRDDStringIJVToMatrixObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, MatrixMetadata matrixMetadata)
    Convert a JavaRDD<String> in IJV format to a MatrixObject.
static org.apache.spark.api.java.JavaSparkContext jsc()
    Obtain JavaSparkContext from MLContextProxy.
static MatrixObject matrixBlockToMatrixObject(String variableName, MatrixBlock matrixBlock, MatrixMetadata matrixMetadata)
    Convert a MatrixBlock to a MatrixObject.
static double[][] matrixObjectTo2DDoubleArray(MatrixObject matrixObject)
    Convert a MatrixObject to a two-dimensional double array.
static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> matrixObjectToBinaryBlocks(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
    Convert a MatrixObject to a JavaPairRDD<MatrixIndexes, MatrixBlock>.
static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> matrixObjectToDataFrame(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext, boolean isVectorDF)
    Convert a MatrixObject to a DataFrame.
static org.apache.spark.api.java.JavaRDD<String> matrixObjectToJavaRDDStringCSV(MatrixObject matrixObject)
    Convert a MatrixObject to a JavaRDD<String> in CSV format.
static org.apache.spark.api.java.JavaRDD<String> matrixObjectToJavaRDDStringIJV(MatrixObject matrixObject)
    Convert a MatrixObject to a JavaRDD<String> in IJV format.
static List<String> matrixObjectToListStringCSV(MatrixObject matrixObject)
    Convert a MatrixObject to a List<String> in CSV format.
static List<String> matrixObjectToListStringIJV(MatrixObject matrixObject)
    Convert a MatrixObject to a List<String> in IJV format.
static org.apache.spark.rdd.RDD<String> matrixObjectToRDDStringCSV(MatrixObject matrixObject)
    Convert a MatrixObject to an RDD<String> in CSV format.
static org.apache.spark.rdd.RDD<String> matrixObjectToRDDStringIJV(MatrixObject matrixObject)
    Convert a MatrixObject to an RDD<String> in IJV format.
static FrameObject rddStringCSVToFrameObject(org.apache.spark.rdd.RDD<String> rdd)
    Convert an RDD<String> in CSV format to a FrameObject.
static FrameObject rddStringCSVToFrameObject(org.apache.spark.rdd.RDD<String> rdd, FrameMetadata frameMetadata)
    Convert an RDD<String> in CSV format to a FrameObject.
static MatrixObject rddStringCSVToMatrixObject(org.apache.spark.rdd.RDD<String> rdd)
    Convert an RDD<String> in CSV format to a MatrixObject.
static MatrixObject rddStringCSVToMatrixObject(org.apache.spark.rdd.RDD<String> rdd, MatrixMetadata matrixMetadata)
    Convert an RDD<String> in CSV format to a MatrixObject.
static FrameObject rddStringIJVToFrameObject(org.apache.spark.rdd.RDD<String> rdd, FrameMetadata frameMetadata)
    Convert an RDD<String> in IJV format to a FrameObject.
static MatrixObject rddStringIJVToMatrixObject(org.apache.spark.rdd.RDD<String> rdd, MatrixMetadata matrixMetadata)
    Convert an RDD<String> in IJV format to a MatrixObject.
static org.apache.spark.SparkContext sc()
    Obtain SparkContext from MLContextProxy.
static org.apache.spark.sql.SparkSession spark()
    Obtain SparkSession from MLContextProxy.
static MatrixObject urlToMatrixObject(URL url, MatrixMetadata matrixMetadata)
    Convert a matrix at a URL to a MatrixObject.
-
Method Detail
-
doubleMatrixToMatrixObject
public static MatrixObject doubleMatrixToMatrixObject(String variableName, double[][] doubleMatrix)
Convert a two-dimensional double array to a MatrixObject.
Parameters:
variableName - name of the variable associated with the matrix
doubleMatrix - matrix of double values
Returns:
the two-dimensional double matrix converted to a MatrixObject
-
doubleMatrixToMatrixObject
public static MatrixObject doubleMatrixToMatrixObject(String variableName, double[][] doubleMatrix, MatrixMetadata matrixMetadata)
Convert a two-dimensional double array to a MatrixObject.
Parameters:
variableName - name of the variable associated with the matrix
doubleMatrix - matrix of double values
matrixMetadata - the matrix metadata
Returns:
the two-dimensional double matrix converted to a MatrixObject
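When no MatrixMetadata is passed, the only shape information available is the array itself. The standalone sketch below (plain Java, no SystemDS calls) shows what can be read off a double[][]: row count, column count, and number of non-zero cells. The class and method names are hypothetical, and whether SystemDS derives its metadata exactly this way is an assumption.

```java
// Hypothetical helper, not part of SystemDS: summarizes a dense double[][].
final class DenseMatrixShapeSketch {

    /** Returns {rows, columns, nonZeros} for a rectangular matrix. */
    static long[] describe(double[][] matrix) {
        int rows = matrix.length;
        int cols = rows == 0 ? 0 : matrix[0].length;
        long nonZeros = 0;
        for (double[] row : matrix) {
            if (row.length != cols) {
                throw new IllegalArgumentException("matrix must be rectangular");
            }
            for (double v : row) {
                if (v != 0.0) {
                    nonZeros++;
                }
            }
        }
        return new long[] {rows, cols, nonZeros};
    }

    public static void main(String[] args) {
        double[][] m = {{1.0, 0.0, 2.0}, {0.0, 3.0, 0.0}};
        long[] shape = describe(m);
        System.out.println(shape[0] + " x " + shape[1] + ", nnz=" + shape[2]);
    }
}
```

Supplying explicit metadata through the three-argument overload avoids relying on any such inference.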
-
urlToMatrixObject
public static MatrixObject urlToMatrixObject(URL url, MatrixMetadata matrixMetadata)
Convert a matrix at a URL to a MatrixObject.
Parameters:
url - the URL to a matrix (in CSV or IJV format)
matrixMetadata - the matrix metadata
Returns:
the matrix at the URL converted to a MatrixObject
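A matrix behind a URL is ultimately a sequence of text lines in CSV or IJV format. As a rough illustration, the following standalone sketch reads such lines with the JDK only. The helper name is hypothetical, and how urlToMatrixObject fetches and parses the data internally is not documented here, so treat this as an assumption rather than the actual implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of SystemDS.
final class UrlLinesSketch {

    /** Reads all non-blank lines of text behind the given URL. */
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A temporary local file stands in for a remote matrix.
        Path tmp = Files.createTempFile("matrix", ".csv");
        try {
            Files.write(tmp, Arrays.asList("1,2", "3,4"));
            System.out.println(readLines(tmp.toUri().toURL()));
        } finally {
            Files.delete(tmp);
        }
    }
}
```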
-
matrixBlockToMatrixObject
public static MatrixObject matrixBlockToMatrixObject(String variableName, MatrixBlock matrixBlock, MatrixMetadata matrixMetadata)
Convert a MatrixBlock to a MatrixObject.
Parameters:
variableName - name of the variable associated with the matrix
matrixBlock - matrix as a MatrixBlock
matrixMetadata - the matrix metadata
Returns:
the MatrixBlock converted to a MatrixObject
-
frameBlockToFrameObject
public static FrameObject frameBlockToFrameObject(String variableName, FrameBlock frameBlock, FrameMetadata frameMetadata)
Convert a FrameBlock to a FrameObject.
Parameters:
variableName - name of the variable associated with the frame
frameBlock - frame as a FrameBlock
frameMetadata - the frame metadata
Returns:
the FrameBlock converted to a FrameObject
-
binaryBlocksToMatrixObject
public static MatrixObject binaryBlocksToMatrixObject(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks)
Convert a JavaPairRDD<MatrixIndexes, MatrixBlock> to a MatrixObject.
Parameters:
binaryBlocks - JavaPairRDD<MatrixIndexes, MatrixBlock> representation of a binary-block matrix
Returns:
the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix converted to a MatrixObject
-
binaryBlocksToMatrixObject
public static MatrixObject binaryBlocksToMatrixObject(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
Convert a JavaPairRDD<MatrixIndexes, MatrixBlock> to a MatrixObject.
Parameters:
binaryBlocks - JavaPairRDD<MatrixIndexes, MatrixBlock> representation of a binary-block matrix
matrixMetadata - the matrix metadata
Returns:
the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix converted to a MatrixObject
-
binaryBlocksToMatrixBlock
public static MatrixBlock binaryBlocksToMatrixBlock(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
Convert a JavaPairRDD<MatrixIndexes, MatrixBlock> to a MatrixBlock.
Parameters:
binaryBlocks - JavaPairRDD<MatrixIndexes, MatrixBlock> representation of a binary-block matrix
matrixMetadata - the matrix metadata
Returns:
the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix converted to a MatrixBlock
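To make the binary-block idea concrete, the sketch below splits a dense matrix into fixed-size tiles keyed by a (block row, block column) pair and then merges them back into one matrix, which is conceptually what collecting binary blocks into a single MatrixBlock involves. It is plain Java and does not use Spark or SystemDS; the class name, the use of List<Integer> as a stand-in for MatrixIndexes, the 1-based block indexes, and the block size of 2 are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not part of SystemDS.
final class BinaryBlockSketch {

    /** Splits a dense matrix into blocks of at most blockSize x blockSize cells. */
    static Map<List<Integer>, double[][]> toBlocks(double[][] matrix, int blockSize) {
        int rows = matrix.length;
        int cols = rows == 0 ? 0 : matrix[0].length;
        Map<List<Integer>, double[][]> blocks = new LinkedHashMap<>();
        for (int r0 = 0; r0 < rows; r0 += blockSize) {
            for (int c0 = 0; c0 < cols; c0 += blockSize) {
                int h = Math.min(blockSize, rows - r0);
                int w = Math.min(blockSize, cols - c0);
                double[][] block = new double[h][w];
                for (int i = 0; i < h; i++) {
                    System.arraycopy(matrix[r0 + i], c0, block[i], 0, w);
                }
                // 1-based block indexes (an assumption of this sketch).
                blocks.put(Arrays.asList(r0 / blockSize + 1, c0 / blockSize + 1), block);
            }
        }
        return blocks;
    }

    /** Reassembles the blocks; rows, cols and blockSize play the role of metadata. */
    static double[][] fromBlocks(Map<List<Integer>, double[][]> blocks,
                                 int rows, int cols, int blockSize) {
        double[][] matrix = new double[rows][cols];
        for (Map.Entry<List<Integer>, double[][]> e : blocks.entrySet()) {
            int r0 = (e.getKey().get(0) - 1) * blockSize;
            int c0 = (e.getKey().get(1) - 1) * blockSize;
            double[][] block = e.getValue();
            for (int i = 0; i < block.length; i++) {
                System.arraycopy(block[i], 0, matrix[r0 + i], c0, block[i].length);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        Map<List<Integer>, double[][]> blocks = toBlocks(m, 2);
        System.out.println(blocks.size() + " blocks");
        System.out.println(Arrays.deepToString(fromBlocks(blocks, 3, 3, 2)));
    }
}
```

Reassembly needs the overall dimensions and the block size, which is one plausible reason binaryBlocksToMatrixBlock takes a MatrixMetadata argument.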
-
binaryBlocksToFrameObject
public static FrameObject binaryBlocksToFrameObject(org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> binaryBlocks)
Convert a JavaPairRDD<Long, FrameBlock> to a FrameObject.
Parameters:
binaryBlocks - JavaPairRDD<Long, FrameBlock> representation of a binary-block frame
Returns:
the JavaPairRDD<Long, FrameBlock> frame converted to a FrameObject
-
binaryBlocksToFrameObject
public static FrameObject binaryBlocksToFrameObject(org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> binaryBlocks, FrameMetadata frameMetadata)
Convert a JavaPairRDD<Long, FrameBlock> to a FrameObject.
Parameters:
binaryBlocks - JavaPairRDD<Long, FrameBlock> representation of a binary-block frame
frameMetadata - the frame metadata
Returns:
the JavaPairRDD<Long, FrameBlock> frame converted to a FrameObject
-
dataFrameToMatrixObject
public static MatrixObject dataFrameToMatrixObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a DataFrame to a MatrixObject.
Parameters:
dataFrame - the Spark DataFrame
Returns:
the DataFrame matrix converted to a MatrixObject
-
dataFrameToMatrixObject
public static MatrixObject dataFrameToMatrixObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
Convert a DataFrame to a MatrixObject.
Parameters:
dataFrame - the Spark DataFrame
matrixMetadata - the matrix metadata
Returns:
the DataFrame matrix converted to a MatrixObject
-
dataFrameToFrameObject
public static FrameObject dataFrameToFrameObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a DataFrame to a FrameObject.
Parameters:
dataFrame - the Spark DataFrame
Returns:
the DataFrame frame converted to a FrameObject
-
dataFrameToFrameObject
public static FrameObject dataFrameToFrameObject(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, FrameMetadata frameMetadata)
Convert a DataFrame to a FrameObject.
Parameters:
dataFrame - the Spark DataFrame
frameMetadata - the frame metadata
Returns:
the DataFrame frame converted to a FrameObject
-
dataFrameToMatrixBinaryBlocks
public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> dataFrameToMatrixBinaryBlocks(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a DataFrame to a JavaPairRDD<MatrixIndexes, MatrixBlock> binary-block matrix.
Parameters:
dataFrame - the Spark DataFrame
Returns:
the DataFrame matrix converted to a JavaPairRDD<MatrixIndexes, MatrixBlock> binary-block matrix
-
dataFrameToMatrixBinaryBlocks
public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> dataFrameToMatrixBinaryBlocks(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
Convert a DataFrame to a JavaPairRDD<MatrixIndexes, MatrixBlock> binary-block matrix.
Parameters:
dataFrame - the Spark DataFrame
matrixMetadata - the matrix metadata
Returns:
the DataFrame matrix converted to a JavaPairRDD<MatrixIndexes, MatrixBlock> binary-block matrix
-
dataFrameToFrameBinaryBlocks
public static org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> dataFrameToFrameBinaryBlocks(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, FrameMetadata frameMetadata)
Convert a DataFrame to a JavaPairRDD<Long, FrameBlock> binary-block frame.
Parameters:
dataFrame - the Spark DataFrame
frameMetadata - the frame metadata
Returns:
the DataFrame frame converted to a JavaPairRDD<Long, FrameBlock> binary-block frame
-
determineMatrixFormatIfNeeded
public static void determineMatrixFormatIfNeeded(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
If the MatrixFormat of the DataFrame has not been explicitly specified, attempt to determine the proper MatrixFormat.
Parameters:
dataFrame - the Spark DataFrame
matrixMetadata - the matrix metadata, if available
-
determineFrameFormatIfNeeded
public static void determineFrameFormatIfNeeded(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, FrameMetadata frameMetadata)
If the FrameFormat of the DataFrame has not been explicitly specified, attempt to determine the proper FrameFormat.
Parameters:
dataFrame - the Spark DataFrame
frameMetadata - the frame metadata, if available
-
isDataFrameWithIDColumn
public static boolean isDataFrameWithIDColumn(MatrixMetadata matrixMetadata)
Return whether or not the DataFrame has an ID column.
Parameters:
matrixMetadata - the matrix metadata
Returns:
true if the DataFrame has an ID column, false otherwise.
-
isDataFrameWithIDColumn
public static boolean isDataFrameWithIDColumn(FrameMetadata frameMetadata)
Return whether or not the DataFrame has an ID column.
Parameters:
frameMetadata - the frame metadata
Returns:
true if the DataFrame has an ID column, false otherwise.
-
isVectorBasedDataFrame
public static boolean isVectorBasedDataFrame(MatrixMetadata matrixMetadata)
Return whether or not the DataFrame is vector-based.
Parameters:
matrixMetadata - the matrix metadata
Returns:
true if the DataFrame is vector-based, false otherwise.
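The format detection and the two predicates above can be pictured as a small decision over the DataFrame schema: does the first column carry row IDs, and is the first data column a vector or a plain double? The sketch below models this in plain Java with a hypothetical Format enum and a schema given as two lists of strings. None of these names come from SystemDS, the ID column name is passed in rather than assumed, and the real detection logic may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of SystemDS or Spark.
final class DataFrameFormatSketch {

    /** Four layouts: doubles or vector, each with or without an ID column. */
    enum Format {
        DOUBLES(false, false),
        DOUBLES_WITH_ID(true, false),
        VECTOR(false, true),
        VECTOR_WITH_ID(true, true);

        final boolean hasIdColumn;
        final boolean vectorBased;

        Format(boolean hasIdColumn, boolean vectorBased) {
            this.hasIdColumn = hasIdColumn;
            this.vectorBased = vectorBased;
        }
    }

    /**
     * Picks a format from column names and type names ("double" or "vector").
     * Assumes an ID column, when present, is the first column.
     */
    static Format detect(List<String> columnNames, List<String> columnTypes,
                         String idColumnName) {
        boolean hasId = !columnNames.isEmpty() && columnNames.get(0).equals(idColumnName);
        int firstData = hasId ? 1 : 0;
        if (columnTypes.size() <= firstData) {
            throw new IllegalArgumentException("schema has no data column");
        }
        boolean vector = "vector".equals(columnTypes.get(firstData));
        if (hasId) {
            return vector ? Format.VECTOR_WITH_ID : Format.DOUBLES_WITH_ID;
        }
        return vector ? Format.VECTOR : Format.DOUBLES;
    }

    public static void main(String[] args) {
        Format f = detect(Arrays.asList("id", "features"),
                          Arrays.asList("double", "vector"), "id");
        System.out.println(f + " hasId=" + f.hasIdColumn + " vector=" + f.vectorBased);
    }
}
```

Setting the format explicitly in the metadata skips this kind of inference altogether, which is what the "if needed" in the method names suggests.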
-
javaRDDStringCSVToMatrixObject
public static MatrixObject javaRDDStringCSVToMatrixObject(org.apache.spark.api.java.JavaRDD<String> javaRDD)
Convert a JavaRDD<String> in CSV format to a MatrixObject.
Parameters:
javaRDD - the Java RDD of strings
Returns:
the JavaRDD<String> converted to a MatrixObject
-
javaRDDStringCSVToMatrixObject
public static MatrixObject javaRDDStringCSVToMatrixObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, MatrixMetadata matrixMetadata)
Convert a JavaRDD<String> in CSV format to a MatrixObject.
Parameters:
javaRDD - the Java RDD of strings
matrixMetadata - matrix metadata
Returns:
the JavaRDD<String> converted to a MatrixObject
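CSV rows carry the matrix shape implicitly: the number of lines gives the row count and the number of cells per line gives the column count, which is presumably why a metadata-free overload can exist for CSV. The standalone sketch below parses such lines into a double[][] in plain Java. The class name is hypothetical, and the comma delimiter and absence of a header row are assumptions of the sketch, not statements about SystemDS defaults.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of SystemDS.
final class CsvMatrixSketch {

    /** Parses comma-separated numeric lines into a dense matrix, one row per line. */
    static double[][] parse(List<String> lines) {
        double[][] matrix = new double[lines.size()][];
        for (int i = 0; i < lines.size(); i++) {
            String[] cells = lines.get(i).split(",");
            matrix[i] = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                matrix[i][j] = Double.parseDouble(cells[j].trim());
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] m = parse(Arrays.asList("1,2,3", "4,5,6"));
        System.out.println(m.length + " x " + m[0].length);
        System.out.println(Arrays.deepToString(m));
    }
}
```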
-
javaRDDStringCSVToFrameObject
public static FrameObject javaRDDStringCSVToFrameObject(org.apache.spark.api.java.JavaRDD<String> javaRDD)
Convert a JavaRDD<String> in CSV format to a FrameObject.
Parameters:
javaRDD - the Java RDD of strings
Returns:
the JavaRDD<String> converted to a FrameObject
-
javaRDDStringCSVToFrameObject
public static FrameObject javaRDDStringCSVToFrameObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, FrameMetadata frameMetadata)
Convert a JavaRDD<String> in CSV format to a FrameObject.
Parameters:
javaRDD - the Java RDD of strings
frameMetadata - frame metadata
Returns:
the JavaRDD<String> converted to a FrameObject
-
javaRDDStringIJVToMatrixObject
public static MatrixObject javaRDDStringIJVToMatrixObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, MatrixMetadata matrixMetadata)
Convert a JavaRDD<String> in IJV format to a MatrixObject. Note that metadata is required for IJV format.
Parameters:
javaRDD - the Java RDD of strings
matrixMetadata - matrix metadata
Returns:
the JavaRDD<String> converted to a MatrixObject
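The metadata requirement for IJV follows from the format itself: each line names one cell as a row index, a column index, and a value, so rows or columns that contain no listed cells cannot be recovered from the lines alone. The sketch below parses IJV lines into a dense array whose dimensions must be supplied by the caller. It is plain Java with a hypothetical class name, and it assumes 1-based indexes and whitespace-separated fields.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of SystemDS.
final class IjvMatrixSketch {

    /**
     * Fills a rows x cols matrix from "i j v" lines.
     * Cells that are not listed stay at 0.0.
     */
    static double[][] parse(List<String> lines, int rows, int cols) {
        double[][] matrix = new double[rows][cols];
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            int i = Integer.parseInt(parts[0]);   // 1-based row index (assumption)
            int j = Integer.parseInt(parts[1]);   // 1-based column index (assumption)
            matrix[i - 1][j - 1] = Double.parseDouble(parts[2]);
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Two listed cells, but the caller states the matrix is 3 x 3.
        double[][] m = parse(Arrays.asList("1 1 1.0", "2 3 5.0"), 3, 3);
        System.out.println(Arrays.deepToString(m));
    }
}
```

Without the explicit 3 x 3 in the usage above, the same two lines would be equally consistent with a 2 x 3 matrix.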
-
javaRDDStringIJVToFrameObject
public static FrameObject javaRDDStringIJVToFrameObject(org.apache.spark.api.java.JavaRDD<String> javaRDD, FrameMetadata frameMetadata)
Convert a JavaRDD<String> in IJV format to a FrameObject. Note that metadata is required for IJV format.
Parameters:
javaRDD - the Java RDD of strings
frameMetadata - frame metadata
Returns:
the JavaRDD<String> converted to a FrameObject
-
rddStringCSVToMatrixObject
public static MatrixObject rddStringCSVToMatrixObject(org.apache.spark.rdd.RDD<String> rdd)
Convert an RDD<String> in CSV format to a MatrixObject.
Parameters:
rdd - the RDD of strings
Returns:
the RDD<String> converted to a MatrixObject
-
rddStringCSVToMatrixObject
public static MatrixObject rddStringCSVToMatrixObject(org.apache.spark.rdd.RDD<String> rdd, MatrixMetadata matrixMetadata)
Convert an RDD<String> in CSV format to a MatrixObject.
Parameters:
rdd - the RDD of strings
matrixMetadata - matrix metadata
Returns:
the RDD<String> converted to a MatrixObject
-
rddStringCSVToFrameObject
public static FrameObject rddStringCSVToFrameObject(org.apache.spark.rdd.RDD<String> rdd)
Convert an RDD<String> in CSV format to a FrameObject.
Parameters:
rdd - the RDD of strings
Returns:
the RDD<String> converted to a FrameObject
-
rddStringCSVToFrameObject
public static FrameObject rddStringCSVToFrameObject(org.apache.spark.rdd.RDD<String> rdd, FrameMetadata frameMetadata)
Convert an RDD<String> in CSV format to a FrameObject.
Parameters:
rdd - the RDD of strings
frameMetadata - frame metadata
Returns:
the RDD<String> converted to a FrameObject
-
rddStringIJVToMatrixObject
public static MatrixObject rddStringIJVToMatrixObject(org.apache.spark.rdd.RDD<String> rdd, MatrixMetadata matrixMetadata)
Convert an RDD<String> in IJV format to a MatrixObject. Note that metadata is required for IJV format.
Parameters:
rdd - the RDD of strings
matrixMetadata - matrix metadata
Returns:
the RDD<String> converted to a MatrixObject
-
rddStringIJVToFrameObject
public static FrameObject rddStringIJVToFrameObject(org.apache.spark.rdd.RDD<String> rdd, FrameMetadata frameMetadata)
Convert an RDD<String> in IJV format to a FrameObject. Note that metadata is required for IJV format.
Parameters:
rdd - the RDD of strings
frameMetadata - frame metadata
Returns:
the RDD<String> converted to a FrameObject
-
matrixObjectToJavaRDDStringCSV
public static org.apache.spark.api.java.JavaRDD<String> matrixObjectToJavaRDDStringCSV(MatrixObject matrixObject)
Convert a MatrixObject to a JavaRDD<String> in CSV format.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to a JavaRDD<String>
-
frameObjectToJavaRDDStringCSV
public static org.apache.spark.api.java.JavaRDD<String> frameObjectToJavaRDDStringCSV(FrameObject frameObject, String delimiter)
Convert a FrameObject to a JavaRDD<String> in CSV format.
Parameters:
frameObject - the FrameObject
delimiter - the delimiter
Returns:
the FrameObject converted to a JavaRDD<String>
-
matrixObjectToJavaRDDStringIJV
public static org.apache.spark.api.java.JavaRDD<String> matrixObjectToJavaRDDStringIJV(MatrixObject matrixObject)
Convert a MatrixObject to a JavaRDD<String> in IJV format.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to a JavaRDD<String>
-
frameObjectToJavaRDDStringIJV
public static org.apache.spark.api.java.JavaRDD<String> frameObjectToJavaRDDStringIJV(FrameObject frameObject)
Convert a FrameObject to a JavaRDD<String> in IJV format.
Parameters:
frameObject - the FrameObject
Returns:
the FrameObject converted to a JavaRDD<String>
-
matrixObjectToRDDStringIJV
public static org.apache.spark.rdd.RDD<String> matrixObjectToRDDStringIJV(MatrixObject matrixObject)
Convert a MatrixObject to an RDD<String> in IJV format.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to an RDD<String>
-
frameObjectToRDDStringIJV
public static org.apache.spark.rdd.RDD<String> frameObjectToRDDStringIJV(FrameObject frameObject)
Convert a FrameObject to an RDD<String> in IJV format.
Parameters:
frameObject - the FrameObject
Returns:
the FrameObject converted to an RDD<String>
-
matrixObjectToRDDStringCSV
public static org.apache.spark.rdd.RDD<String> matrixObjectToRDDStringCSV(MatrixObject matrixObject)
Convert a MatrixObject to an RDD<String> in CSV format.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to an RDD<String>
-
frameObjectToRDDStringCSV
public static org.apache.spark.rdd.RDD<String> frameObjectToRDDStringCSV(FrameObject frameObject, String delimiter)
Convert a FrameObject to an RDD<String> in CSV format.
Parameters:
frameObject - the FrameObject
delimiter - the delimiter
Returns:
the FrameObject converted to an RDD<String>
-
matrixObjectToListStringCSV
public static List<String> matrixObjectToListStringCSV(MatrixObject matrixObject)
Convert a MatrixObject to a List<String> in CSV format.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to a List<String>
-
frameObjectToListStringCSV
public static List<String> frameObjectToListStringCSV(FrameObject frameObject, String delimiter)
Convert a FrameObject to a List<String> in CSV format.
Parameters:
frameObject - the FrameObject
delimiter - the delimiter
Returns:
the FrameObject converted to a List<String>
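The role of the delimiter argument can be illustrated without SystemDS: each frame row becomes one string, with its cells joined by the chosen delimiter. The sketch below does this for a String[][] in plain Java. The class name is hypothetical, and the sketch performs no quoting or escaping, so it is only a picture of the output shape, not a description of how SystemDS serializes frames.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of SystemDS.
final class FrameCsvSketch {

    /** Joins the cells of each row with the delimiter; one output line per row. */
    static List<String> toCsv(String[][] frame, String delimiter) {
        List<String> lines = new ArrayList<>();
        for (String[] row : frame) {
            // No quoting or escaping is applied (a simplification of this sketch).
            lines.add(String.join(delimiter, row));
        }
        return lines;
    }

    public static void main(String[] args) {
        String[][] frame = {{"alice", "30"}, {"bob", "25"}};
        System.out.println(toCsv(frame, ";"));
    }
}
```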
-
matrixObjectToListStringIJV
public static List<String> matrixObjectToListStringIJV(MatrixObject matrixObject)
Convert a MatrixObject to a List<String> in IJV format.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to a List<String>
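For orientation, the sketch below shows what an IJV listing of a small matrix might look like: one "row column value" line per cell. It is plain Java with a hypothetical class name. The 1-based indexes, the single-space separator, and the choice to emit only non-zero cells are assumptions of the sketch; the exact output of matrixObjectToListStringIJV is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of SystemDS.
final class IjvLinesSketch {

    /** Emits one "i j v" line per non-zero cell, in row-major order. */
    static List<String> toIjv(double[][] matrix) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                double v = matrix[i][j];
                if (v != 0.0) {
                    // 1-based indexes (assumption).
                    lines.add((i + 1) + " " + (j + 1) + " " + v);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        double[][] m = {{1.0, 0.0}, {0.0, 2.5}};
        for (String line : toIjv(m)) {
            System.out.println(line);
        }
    }
}
```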
-
frameObjectToListStringIJV
public static List<String> frameObjectToListStringIJV(FrameObject frameObject)
Convert a FrameObject to a List<String> in IJV format.
Parameters:
frameObject - the FrameObject
Returns:
the FrameObject converted to a List<String>
-
matrixObjectTo2DDoubleArray
public static double[][] matrixObjectTo2DDoubleArray(MatrixObject matrixObject)
Convert a MatrixObject to a two-dimensional double array.
Parameters:
matrixObject - the MatrixObject
Returns:
the MatrixObject converted to a double[][]
-
matrixObjectToDataFrame
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> matrixObjectToDataFrame(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext, boolean isVectorDF)
Convert a MatrixObject to a DataFrame.
Parameters:
matrixObject - the MatrixObject
sparkExecutionContext - the Spark execution context
isVectorDF - whether the DataFrame is a vector DataFrame
Returns:
the MatrixObject converted to a DataFrame
-
frameObjectToDataFrame
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> frameObjectToDataFrame(FrameObject frameObject, SparkExecutionContext sparkExecutionContext)
Convert a FrameObject to a DataFrame.
Parameters:
frameObject - the FrameObject
sparkExecutionContext - the Spark execution context
Returns:
the FrameObject converted to a DataFrame
-
matrixObjectToBinaryBlocks
public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> matrixObjectToBinaryBlocks(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
Convert a MatrixObject to a JavaPairRDD<MatrixIndexes, MatrixBlock>.
Parameters:
matrixObject - the MatrixObject
sparkExecutionContext - the Spark execution context
Returns:
the MatrixObject converted to a JavaPairRDD<MatrixIndexes, MatrixBlock>
-
frameObjectToBinaryBlocks
public static org.apache.spark.api.java.JavaPairRDD<Long,FrameBlock> frameObjectToBinaryBlocks(FrameObject frameObject, SparkExecutionContext sparkExecutionContext)
Convert a FrameObject to a JavaPairRDD<Long, FrameBlock>.
Parameters:
frameObject - the FrameObject
sparkExecutionContext - the Spark execution context
Returns:
the FrameObject converted to a JavaPairRDD<Long, FrameBlock>
-
frameObjectTo2DStringArray
public static String[][] frameObjectTo2DStringArray(FrameObject frameObject)
Convert a FrameObject to a two-dimensional string array.
Parameters:
frameObject - the FrameObject
Returns:
the FrameObject converted to a String[][]
-
jsc
public static org.apache.spark.api.java.JavaSparkContext jsc()
Obtain JavaSparkContext from MLContextProxy.
Returns:
the Java Spark Context
-
sc
public static org.apache.spark.SparkContext sc()
Obtain SparkContext from MLContextProxy.
Returns:
the Spark Context
-
spark
public static org.apache.spark.sql.SparkSession spark()
Obtain SparkSession from MLContextProxy.
Returns:
the Spark Session
-
-