Class Dictionary
- java.lang.Object
  - org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
    - org.apache.sysds.runtime.compress.colgroup.dictionary.Dictionary
- All Implemented Interfaces: Serializable, IDictionary
- Direct Known Subclasses: DeltaDictionary
public class Dictionary extends ADictionary
This dictionary class aims to encapsulate the storage and operations over unique floating point values of a column group. The primary reason for its introduction was to provide an entry point for specialization such as shared dictionaries, which require additional information.
- See Also: Serialized Form
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.sysds.runtime.compress.colgroup.dictionary.IDictionary
IDictionary.DictType
-
Field Summary
Fields inherited from interface org.apache.sysds.runtime.compress.colgroup.dictionary.IDictionary
LOG
-
Method Summary
Modifier and Type / Method
    Description
void addToEntry(double[] v, int fr, int to, int nCol)
    Adds the dictionary entry from this dictionary to the v dictionary.
void addToEntry(double[] v, int fr, int to, int nCol, int rep)
    Adds the dictionary entry from this dictionary to the v dictionary rep times.
void addToEntryVectorized(double[] v, int f1, int f2, int f3, int f4, int f5, int f6, int f7, int f8, int t1, int t2, int t3, int t4, int t5, int t6, int t7, int t8, int nCol)
    Vectorized add to entry; this call helps with cache locality.
double aggregate(double init, Builtin fn)
    Aggregate all the contained values; useful in value-only computations where the operation iterates through all values contained in the dictionary.
void aggregateCols(double[] c, Builtin fn, IColIndex colIndexes)
    Aggregates the columns into the target double array provided.
void aggregateColsWithReference(double[] c, Builtin fn, IColIndex colIndexes, double[] reference, boolean def)
    Aggregates the columns into the target double array provided.
double[] aggregateRows(Builtin fn, int nCol)
    Aggregate all entries in the rows.
double[] aggregateRowsWithDefault(Builtin fn, double[] defaultTuple)
    Aggregate all entries in the rows of the dictionary, with an extra cell at the end that contains the aggregate of the given defaultTuple.
double[] aggregateRowsWithReference(Builtin fn, double[] reference)
    Aggregate all entries in the rows with an offset value reference added.
double aggregateWithReference(double init, Builtin fn, double[] reference, boolean def)
    Aggregate all the contained values, with a reference offset.
Dictionary applyScalarOp(ScalarOperator op)
    Allocate a new dictionary, apply the scalar operation to each cell, and return the new dictionary.
IDictionary applyScalarOpAndAppend(ScalarOperator op, double v0, int nCol)
    Allocate a new dictionary with one extra row, apply the scalar operation to each cell, and return the new dictionary.
Dictionary applyScalarOpWithReference(ScalarOperator op, double[] reference, double[] newReference)
    Allocate a new dictionary, apply the scalar operation to each cell, and return the new dictionary.
Dictionary applyUnaryOp(UnaryOperator op)
    Allocate a new dictionary and apply the unary operator on each cell.
IDictionary applyUnaryOpAndAppend(UnaryOperator op, double v0, int nCol)
    Allocate a new dictionary with one extra row and apply the unary operator on each cell.
IDictionary applyUnaryOpWithReference(UnaryOperator op, double[] reference, double[] newReference)
    Allocate a new dictionary, apply the unary operation to each cell, and return the new dictionary.
Dictionary binOpLeft(BinaryOperator op, double[] v, IColIndex colIndexes)
    Apply binary row operation on the left side.
IDictionary binOpLeftAndAppend(BinaryOperator op, double[] v, IColIndex colIndexes)
    Apply binary row operation on the left side with one extra row evaluating with zeros.
Dictionary binOpLeftWithReference(BinaryOperator op, double[] v, IColIndex colIndexes, double[] reference, double[] newReference)
    Apply the binary operator such that each value is offset by the reference before application.
Dictionary binOpRight(BinaryOperator op, double[] v)
    Apply binary row operation on the right side, with no column indexes to extract from v.
Dictionary binOpRight(BinaryOperator op, double[] v, IColIndex colIndexes)
    Apply binary row operation on the right side.
IDictionary binOpRightAndAppend(BinaryOperator op, double[] v, IColIndex colIndexes)
    Apply binary row operation on the right side with one extra row evaluating with zeros.
Dictionary binOpRightWithReference(BinaryOperator op, double[] v, IColIndex colIndexes, double[] reference, double[] newReference)
    Apply the binary operator such that each value is offset by the reference before application.
IDictionary cbind(IDictionary that, int nCol)
    Cbind this dictionary with that dictionary.
CM_COV_Object centralMoment(CM_COV_Object ret, ValueFunction fn, int[] counts, int nRows)
    Central moment function to calculate the central moment of this column group.
CM_COV_Object centralMomentWithDefault(CM_COV_Object ret, ValueFunction fn, int[] counts, double def, int nRows)
    Central moment function to calculate the central moment of this column group with a default offset on all missing tuples.
CM_COV_Object centralMomentWithReference(CM_COV_Object ret, ValueFunction fn, int[] counts, double reference, int nRows)
    Central moment function to calculate the central moment of this column group with a reference offset on each tuple.
Dictionary clone()
    Returns a deep clone of the dictionary.
void colProduct(double[] res, int[] counts, IColIndex colIndexes)
    Calculate the column product of the dictionary weighted by counts.
void colProductWithReference(double[] res, int[] counts, IColIndex colIndexes, double[] reference)
    Calculate the column product of the dictionary weighted by counts.
void colSum(double[] c, int[] counts, IColIndex colIndexes)
    Get the column sum of the values contained in the dictionary.
void colSumSq(double[] c, int[] counts, IColIndex colIndexes)
    Get the column sum of squares of the values contained in the dictionary.
void colSumSqWithReference(double[] c, int[] counts, IColIndex colIndexes, double[] reference)
    Get the column sum of squares of the values contained in the dictionary with an offset reference value added to each cell.
boolean containsValue(double pattern)
    Detect if the dictionary contains a specific value.
boolean containsValueWithReference(double pattern, double[] reference)
    Detect if the dictionary contains a specific value with reference offset.
static Dictionary create(double[] values)
static Dictionary createNoCheck(double[] values)
boolean equals(IDictionary o)
    Indicate if the other dictionary is equal to this.
IDictionary.DictType getDictType()
    Get the type of this dictionary.
long getExactSizeOnDisk()
    Calculate the space consumption if the dictionary is stored on disk.
long getInMemorySize()
    Returns the memory usage of the dictionary.
MatrixBlockDictionary getMBDict(int nCol)
    Get this dictionary as a MatrixBlock dictionary.
long getNumberNonZeros(int[] counts, int nCol)
    Calculate the number of non zeros in the dictionary.
long getNumberNonZerosWithReference(int[] counts, double[] reference, int nRows)
    Calculate the number of non zeros in the dictionary.
int getNumberOfValues(int nCol)
    Get the number of distinct tuples given that the column group has n columns.
double getSparsity()
    Get the sparsity of the dictionary.
String getString(int colIndexes)
    Get a string representation of the dictionary that considers the layout of the data.
double getValue(int i)
    Get the specific value contained in the dictionary at the given index.
double getValue(int r, int c, int nCol)
    Get the specific value contained in the dictionary at the given row and column.
double[] getValues()
    Get all the values contained in the dictionary as a linearized double array.
void MMDict(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
    Matrix multiplication of dictionaries. Note that the left side is this dictionary, and it is transposed.
void MMDictDense(double[] left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
    Matrix multiplication of dictionaries: the left side is dense and transposed, the right side is this dictionary.
void MMDictScaling(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result, int[] scaling)
    Matrix multiplication of dictionaries. Note that the left side is this dictionary, and it is transposed.
void MMDictScalingDense(double[] left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result, int[] scaling)
    Matrix multiplication of dictionaries: the left side is dense and transposed, the right side is this dictionary.
void MMDictScalingSparse(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result, int[] scaling)
    Matrix multiplication of dictionaries: the left side is sparse and transposed, the right side is this dictionary.
void MMDictSparse(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
    Matrix multiplication of dictionaries: the left side is sparse and transposed, the right side is this dictionary.
void multiplyScalar(double v, double[] ret, int off, int dictIdx, IColIndex cols)
    Multiply the v value with the dictionary entry at dictIdx and add it to the ret matrix at the columns specified by cols.
Dictionary preaggValuesFromDense(int numVals, IColIndex colIndexes, IColIndex aggregateColumns, double[] b, int cut)
    Pre-aggregate values for right matrix multiplication.
void product(double[] ret, int[] counts, int nCol)
    Calculate the product of the dictionary weighted by counts.
double[] productAllRowsToDouble(int nCol)
    Computes the product of the values in each row, producing a column vector.
double[] productAllRowsToDoubleWithDefault(double[] defaultTuple)
    Computes the product of the values in each row, producing a column vector with a default value added at the end.
double[] productAllRowsToDoubleWithReference(double[] reference)
    Computes the product of the values in each row, producing a column vector, with the reference values added to all cells and a reference product at the end.
void productWithDefault(double[] ret, int[] counts, double[] def, int defCount)
    Calculate the product of the dictionary weighted by counts with a default value added.
void productWithReference(double[] ret, int[] counts, double[] reference, int refCount)
    Calculate the product of the dictionary weighted by counts and offset by reference.
static Dictionary read(DataInput in)
IDictionary reorder(int[] reorder)
    Reorder the elements in the dictionary based on the reorder specification given.
IDictionary replace(double pattern, double replace, int nCol)
    Make a copy of the values, and replace all values that match pattern with the replacement value.
IDictionary replaceWithReference(double pattern, double replace, double[] reference)
    Make a copy of the values, and replace all values that match pattern with the replacement value.
IDictionary rexpandCols(int max, boolean ignore, boolean cast, int nCol)
    Rexpand the dictionary (one hot encode).
IDictionary rexpandColsWithReference(int max, boolean ignore, boolean cast, int reference)
    Rexpand the dictionary (one hot encode).
IDictionary scaleTuples(int[] scaling, int nCol)
    Scale all tuples contained in the dictionary by the scaling factor given in the int list.
IDictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
    Modify the dictionary by removing columns not within the index range.
IDictionary subtractTuple(double[] tuple)
    Allocate a new dictionary where the tuple given is subtracted from all tuples in the previous dictionary.
double sum(int[] counts, int nCol)
    Get the sum of the values contained in the dictionary.
double[] sumAllRowsToDouble(int nrColumns)
    Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
double[] sumAllRowsToDoubleSq(int nrColumns)
    Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
double[] sumAllRowsToDoubleSqWithDefault(double[] defaultTuple)
    Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
double[] sumAllRowsToDoubleSqWithReference(double[] reference)
    Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
double[] sumAllRowsToDoubleWithDefault(double[] defaultTuple)
    Does exactly the same as sumAllRowsToDouble, but also sums the given array into an extra index at the end of the returned array.
double[] sumAllRowsToDoubleWithReference(double[] reference)
    Method used as a pre-aggregate of each tuple in the dictionary, to single double values with a reference.
double sumRowWithReference(int k, int nrColumns, double[] reference)
double sumSq(int[] counts, int nCol)
    Get the square sum of the values contained in the dictionary.
double sumSqWithReference(int[] counts, double[] reference)
    Get the square sum of the values contained in the dictionary with a reference offset on each value.
String toString()
void TSMMToUpperTriangle(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
    Matrix multiplication, but allocate the output in the upper triangle, and twice if on the diagonal. Note that this dictionary is the left side.
void TSMMToUpperTriangleDense(double[] left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
    Matrix multiplication, but allocate the output in the upper triangle, and twice if on the diagonal. Note that this dictionary is the right side.
void TSMMToUpperTriangleDenseScaling(double[] left, IColIndex rowsLeft, IColIndex colsRight, int[] scale, MatrixBlock result)
    Matrix multiplication, but allocate the output in the upper triangle, and twice if on the diagonal. Note that this dictionary is the right side.
void TSMMToUpperTriangleScaling(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, int[] scale, MatrixBlock result)
    Matrix multiplication, but allocate the output in the upper triangle, and twice if on the diagonal. Note that this dictionary is the left side.
void TSMMToUpperTriangleSparse(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
    Matrix multiplication, but allocate the output in the upper triangle, and twice if on the diagonal. Note that this dictionary is the right side.
void TSMMToUpperTriangleSparseScaling(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, int[] scale, MatrixBlock result)
    Matrix multiplication, but allocate the output in the upper triangle, and twice if on the diagonal. Note that this dictionary is the right side.
void TSMMWithScaling(int[] counts, IColIndex rows, IColIndex cols, MatrixBlock ret)
    Transpose self matrix multiplication with a scaling factor on each pair of values.
void write(DataOutput out)
    Write the dictionary to a DataOutput.
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
centralMoment, centralMomentWithDefault, centralMomentWithReference, correctNan, doubleToString, equals, equals
-
Method Detail
-
create
public static Dictionary create(double[] values)
-
createNoCheck
public static Dictionary createNoCheck(double[] values)
-
getValues
public double[] getValues()
Description copied from interface: IDictionary
Get all the values contained in the dictionary as a linearized double array.
- Returns:
  linearized double array
-
getValue
public double getValue(int i)
Description copied from interface: IDictionary
Get the specific value contained in the dictionary at the given index.
- Parameters:
  i - The index to extract the value from
- Returns:
  The value contained at the index
-
getValue
public final double getValue(int r, int c, int nCol)
Description copied from interface: IDictionary
Get the specific value contained in the dictionary at the given row and column.
- Parameters:
  r - Row target
  c - Col target
  nCol - nCol in dictionary
- Returns:
  value
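The three-argument lookup can be pictured as index arithmetic over the linearized array returned by getValues. The sketch below is a minimal stand-in that assumes the tuples are stored row-major, one tuple after another; the class `LinearizedLookupSketch` and its static helper are hypothetical names for illustration and are not part of SystemDS.

```java
/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class LinearizedLookupSketch {

    // Assumption: tuples are laid out back to back, so cell (r, c) of a
    // dictionary with nCol columns sits at offset r * nCol + c.
    static double getValue(double[] values, int r, int c, int nCol) {
        return values[r * nCol + c];
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6}; // two tuples, three columns each
        // Look up the first cell of the second tuple.
        System.out.println(getValue(values, 1, 0, 3));
    }
}
```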
-
getInMemorySize
public long getInMemorySize()
Description copied from interface: IDictionary
Returns the memory usage of the dictionary.
- Returns:
  a long value in number of bytes for the dictionary.
-
aggregate
public double aggregate(double init, Builtin fn)
Description copied from interface: IDictionary
Aggregate all the contained values; useful in value-only computations where the operation iterates through all values contained in the dictionary.
- Parameters:
  init - The initial value; for an aggregate such as max, this could be -infinity
  fn - The function to apply to the values
- Returns:
  The aggregated value as a double.
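The role of init is easiest to see when the aggregate is written as a fold. The following is a minimal sketch, not the library's implementation: a `DoubleBinaryOperator` stands in for the `Builtin` function, and `AggregateSketch` is a hypothetical class name.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class AggregateSketch {

    // Fold every stored value into one result, starting from init.
    // DoubleBinaryOperator plays the role of the Builtin function here.
    static double aggregate(double[] values, double init, DoubleBinaryOperator fn) {
        double acc = init;
        for (double v : values)
            acc = fn.applyAsDouble(acc, v);
        return acc;
    }

    public static void main(String[] args) {
        double[] values = {3, -1, 7, 2};
        // For a max aggregate, -infinity is a neutral starting value.
        System.out.println(aggregate(values, Double.NEGATIVE_INFINITY, Math::max));
    }
}
```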
-
aggregateWithReference
public double aggregateWithReference(double init, Builtin fn, double[] reference, boolean def)
Description copied from interface: IDictionary
Aggregate all the contained values, with a reference offset.
- Parameters:
  init - The initial value; for an aggregate such as max, this could be -infinity.
  fn - The function to apply to the values
  reference - The reference offset to each value in the dictionary
  def - Whether the reference should be treated as an instance or only as a reference
- Returns:
  The aggregated value as a double.
-
aggregateRows
public double[] aggregateRows(Builtin fn, int nCol)
Description copied from interface: IDictionary
Aggregate all entries in the rows.
- Parameters:
  fn - The aggregate function
  nCol - The number of columns contained in the dictionary.
- Returns:
  Aggregates for this dictionary tuples.
-
aggregateRowsWithDefault
public double[] aggregateRowsWithDefault(Builtin fn, double[] defaultTuple)
Description copied from interface: IDictionary
Aggregate all entries in the rows of the dictionary, with an extra cell at the end that contains the aggregate of the given defaultTuple.
- Parameters:
  fn - The aggregate function
  defaultTuple - The default tuple to aggregate in the last cell
- Returns:
  Aggregates for this dictionary tuples.
-
aggregateRowsWithReference
public double[] aggregateRowsWithReference(Builtin fn, double[] reference)
Description copied from interface: IDictionary
Aggregate all entries in the rows with an offset value reference added.
- Parameters:
  fn - The aggregate function
  reference - The reference offset to each value in the dictionary
- Returns:
  Aggregates for this dictionary tuples.
-
applyScalarOp
public Dictionary applyScalarOp(ScalarOperator op)
Description copied from interface: IDictionary
Allocate a new dictionary, apply the scalar operation to each cell, and return the new dictionary.
- Parameters:
  op - The operator.
- Returns:
  The new dictionary to return.
-
applyScalarOpAndAppend
public IDictionary applyScalarOpAndAppend(ScalarOperator op, double v0, int nCol)
Description copied from interface: IDictionary
Allocate a new dictionary with one extra row, apply the scalar operation to each cell, and return the new dictionary.
- Parameters:
  op - The operator
  v0 - The new value to put into each cell in the new row
  nCol - The number of columns in the dictionary
- Returns:
  The new dictionary to return.
-
applyUnaryOp
public Dictionary applyUnaryOp(UnaryOperator op)
Description copied from interface: IDictionary
Allocate a new dictionary and apply the unary operator on each cell.
- Parameters:
  op - The operator.
- Returns:
  The new dictionary to return.
-
applyUnaryOpAndAppend
public IDictionary applyUnaryOpAndAppend(UnaryOperator op, double v0, int nCol)
Description copied from interface: IDictionary
Allocate a new dictionary with one extra row and apply the unary operator on each cell.
- Parameters:
  op - The operator.
  v0 - The new value to put into each cell in the new row
  nCol - The number of columns in the dictionary
- Returns:
  The new dictionary to return.
-
applyScalarOpWithReference
public Dictionary applyScalarOpWithReference(ScalarOperator op, double[] reference, double[] newReference)
Description copied from interface: IDictionary
Allocate a new dictionary, apply the scalar operation to each cell, and return the new dictionary.
outValues[j] = op(this.values[j] + reference[i]) - newReference[i]
- Parameters:
  op - The operator to apply to each cell.
  reference - The reference value to add before the operator.
  newReference - The reference value to subtract after the operator.
- Returns:
  A new dictionary.
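The formula above can be traced with a small stand-alone sketch. It assumes a row-major layout in which the column index i of linear cell j is j modulo the number of columns, which the source does not state explicitly; `DoubleUnaryOperator` stands in for `ScalarOperator`, and `ScalarOpWithReferenceSketch` is a hypothetical name.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class ScalarOpWithReferenceSketch {

    // out[j] = op(values[j] + reference[i]) - newReference[i]
    static double[] apply(double[] values, DoubleUnaryOperator op,
            double[] reference, double[] newReference) {
        final int nCol = reference.length;
        final double[] out = new double[values.length];
        for (int j = 0; j < values.length; j++) {
            final int i = j % nCol; // assumed column of linear cell j
            out[j] = op.applyAsDouble(values[j] + reference[i]) - newReference[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4};  // two tuples, two columns
        double[] reference = {10, 20};
        double[] newReference = {20, 40}; // here chosen as op(reference)
        double[] out = apply(values, x -> x * 2, reference, newReference);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

Choosing newReference as the operator applied to the old reference keeps the stored values as offsets from the transformed reference, which is one plausible reason for passing both arrays.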
-
applyUnaryOpWithReference
public IDictionary applyUnaryOpWithReference(UnaryOperator op, double[] reference, double[] newReference)
Description copied from interface: IDictionary
Allocate a new dictionary, apply the unary operation to each cell, and return the new dictionary.
outValues[j] = op(this.values[j] + reference[i]) - newReference[i]
- Parameters:
  op - The unary operator to apply to each cell.
  reference - The reference value to add before the operator.
  newReference - The reference value to subtract after the operator.
- Returns:
  A new dictionary.
-
binOpRight
public Dictionary binOpRight(BinaryOperator op, double[] v, IColIndex colIndexes)
Description copied from interface: IDictionary
Apply binary row operation on the right side.
- Parameters:
  op - The operation to this dictionary
  v - The values to use on the right hand side.
  colIndexes - The column indexes to consider inside v.
- Returns:
  A new dictionary containing the updated values.
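A right-side row operation pairs each dictionary cell with the entry of v selected by that cell's column index. The sketch below illustrates this under stated assumptions: an `int[]` stands in for `IColIndex`, a `DoubleBinaryOperator` stands in for `BinaryOperator`, the layout is row-major, and `BinOpRightSketch` is a hypothetical name.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class BinOpRightSketch {

    // out[j] = op(values[j], v[colIndexes[i]]), with i the column of cell j.
    static double[] binOpRight(double[] values, DoubleBinaryOperator op,
            double[] v, int[] colIndexes) {
        final int nCol = colIndexes.length;
        final double[] out = new double[values.length];
        for (int j = 0; j < values.length; j++)
            out[j] = op.applyAsDouble(values[j], v[colIndexes[j % nCol]]);
        return out;
    }

    public static void main(String[] args) {
        double[] values = {10, 20, 30, 40}; // two tuples, two columns
        double[] v = {1, 2, 3};             // row vector over the full matrix width
        int[] colIndexes = {0, 2};          // this group covers columns 0 and 2
        double[] out = binOpRight(values, (a, b) -> a - b, v, colIndexes);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```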
-
binOpRight
public Dictionary binOpRight(BinaryOperator op, double[] v)
Description copied from interface: IDictionary
Apply binary row operation on the right side, with no column indexes to extract from v.
- Parameters:
  op - The operation to this dictionary
  v - The values to apply on the dictionary (same number of cols as the dictionary)
- Returns:
  A new dictionary containing the updated values.
-
binOpRightAndAppend
public IDictionary binOpRightAndAppend(BinaryOperator op, double[] v, IColIndex colIndexes)
Description copied from interface: IDictionary
Apply binary row operation on the right side with one extra row evaluating with zeros.
- Parameters:
  op - The operation to this dictionary
  v - The values to use on the right hand side.
  colIndexes - The column indexes to consider inside v.
- Returns:
  A new dictionary containing the updated values.
-
binOpRightWithReference
public Dictionary binOpRightWithReference(BinaryOperator op, double[] v, IColIndex colIndexes, double[] reference, double[] newReference)
Description copied from interface: IDictionary
Apply the binary operator such that each value is offset by the reference before application. Then put the result into the new dictionary, but offset it by the new reference.
outValues[j] = op(this.values[j] + reference[i], v[colIndexes[i]]) - newReference[i]
- Parameters:
  op - The operation to apply on the dictionary values.
  v - The values to use on the right side of the operator.
  colIndexes - The column indexes to use.
  reference - The reference value to add before operator.
  newReference - The reference value to subtract after operator.
- Returns:
  A new dictionary.
-
binOpLeft
public final Dictionary binOpLeft(BinaryOperator op, double[] v, IColIndex colIndexes)
Description copied from interface: IDictionary
Apply binary row operation on the left side.
- Parameters:
  op - The operation to this dictionary
  v - The values to use on the left hand side.
  colIndexes - The column indexes to consider inside v.
- Returns:
  A new dictionary containing the updated values.
-
binOpLeftAndAppend
public IDictionary binOpLeftAndAppend(BinaryOperator op, double[] v, IColIndex colIndexes)
Description copied from interface: IDictionary
Apply binary row operation on the left side with one extra row evaluating with zeros.
- Parameters:
  op - The operation to this dictionary
  v - The values to use on the left hand side.
  colIndexes - The column indexes to consider inside v.
- Returns:
  A new dictionary containing the updated values.
-
binOpLeftWithReference
public Dictionary binOpLeftWithReference(BinaryOperator op, double[] v, IColIndex colIndexes, double[] reference, double[] newReference)
Description copied from interface: IDictionary
Apply the binary operator such that each value is offset by the reference before application. Then put the result into the new dictionary, but offset it by the new reference.
outValues[j] = op(v[colIndexes[i]], this.values[j] + reference[i]) - newReference[i]
- Parameters:
  op - The operation to apply on the dictionary values.
  v - The values to use on the left side of the operator.
  colIndexes - The column indexes to use.
  reference - The reference value to add before operator.
  newReference - The reference value to subtract after operator.
- Returns:
  A new dictionary.
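For a non-commutative operator the operand order in the left-side formula matters: the entry of v comes first, the offset dictionary value second. The sketch below traces that formula under the same assumptions as before (row-major layout, `int[]` in place of `IColIndex`, `DoubleBinaryOperator` in place of `BinaryOperator`); `BinOpLeftWithReferenceSketch` is a hypothetical name.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class BinOpLeftWithReferenceSketch {

    // out[j] = op(v[colIndexes[i]], values[j] + reference[i]) - newReference[i]
    static double[] apply(double[] values, DoubleBinaryOperator op, double[] v,
            int[] colIndexes, double[] reference, double[] newReference) {
        final int nCol = colIndexes.length;
        final double[] out = new double[values.length];
        for (int j = 0; j < values.length; j++) {
            final int i = j % nCol; // assumed column of linear cell j
            out[j] = op.applyAsDouble(v[colIndexes[i]], values[j] + reference[i])
                - newReference[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] out = apply(new double[] {1, 2, 3, 4}, (a, b) -> a - b,
            new double[] {100, 0, 200}, new int[] {0, 2},
            new double[] {10, 20}, new double[] {1, 2});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```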
-
clone
public Dictionary clone()
Description copied from interface: IDictionary
Returns a deep clone of the dictionary.
- Specified by:
  clone in interface IDictionary
- Specified by:
  clone in class ADictionary
- Returns:
  A deep clone
-
read
public static Dictionary read(DataInput in) throws IOException
- Throws:
IOException
-
write
public void write(DataOutput out) throws IOException
Description copied from interface: IDictionary
Write the dictionary to a DataOutput.
- Parameters:
  out - the output sink to write the dictionary to.
- Throws:
  IOException - if the sink fails.
-
getExactSizeOnDisk
public long getExactSizeOnDisk()
Description copied from interface: IDictionary
Calculate the space consumption if the dictionary is stored on disk.
- Returns:
  the long count of bytes to store the dictionary.
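The relationship between write, read, and getExactSizeOnDisk can be illustrated with a round trip through an in-memory buffer. The byte layout used here (an int length followed by the values as doubles) is an assumption made purely for this sketch; the page does not document the actual on-disk format, and `DictIoSketch` with all of its methods is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class DictIoSketch {

    // Assumed layout (not confirmed by the source): int length, then each value.
    static void write(double[] values, DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (double v : values)
            out.writeDouble(v);
    }

    static double[] read(DataInput in) throws IOException {
        final double[] values = new double[in.readInt()];
        for (int i = 0; i < values.length; i++)
            values[i] = in.readDouble();
        return values;
    }

    // Byte count of the assumed layout: 4 bytes for the length plus 8 per value.
    static long exactSizeOnDisk(double[] values) {
        return 4L + 8L * values.length;
    }

    // Convenience wrappers that serialize to and from an in-memory buffer.
    static byte[] toBytes(double[] values) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(values, new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static double[] fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whatever the real layout is, the useful invariant is the same: the number of bytes produced by write should match the size reported up front, so callers can size buffers before serializing.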
-
getNumberOfValues
public int getNumberOfValues(int nCol)
Description copied from interface: IDictionary
Get the number of distinct tuples given that the column group has n columns.
- Parameters:
  nCol - The number of columns in the ColumnGroup.
- Returns:
  the number of value tuples contained in the dictionary.
-
sumAllRowsToDouble
public double[] sumAllRowsToDouble(int nrColumns)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values. Note that if the number of columns is one, the dictionary's actual values are simply returned.
- Parameters:
  nrColumns - The number of columns in the ColGroup to know how to get the values from the dictionary.
- Returns:
  a double array containing the row sums from this dictionary.
-
sumAllRowsToDoubleWithDefault
public double[] sumAllRowsToDoubleWithDefault(double[] defaultTuple)
Description copied from interface: IDictionary
Does exactly the same as sumAllRowsToDouble, but also sums the given array into an extra index at the end of the returned array.
- Parameters:
  defaultTuple - The default row to sum in the end index returned.
- Returns:
  a double array containing the row sums from this dictionary.
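The shape of the returned array, one sum per tuple plus one trailing cell for the default tuple, can be seen in a few lines. This is a minimal sketch assuming a row-major layout and taking the column count from the length of defaultTuple; `RowSumSketch` is a hypothetical name.

```java
/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class RowSumSketch {

    // Returns one sum per tuple, followed by the sum of defaultTuple.
    static double[] sumAllRowsToDoubleWithDefault(double[] values, double[] defaultTuple) {
        final int nCol = defaultTuple.length;
        final int nRows = values.length / nCol;
        final double[] ret = new double[nRows + 1];
        for (int r = 0; r < nRows; r++)
            for (int c = 0; c < nCol; c++)
                ret[r] += values[r * nCol + c];
        for (double d : defaultTuple)
            ret[nRows] += d; // extra cell at the end
        return ret;
    }

    public static void main(String[] args) {
        double[] sums = sumAllRowsToDoubleWithDefault(
            new double[] {1, 2, 3, 4}, new double[] {10, 20});
        System.out.println(java.util.Arrays.toString(sums));
    }
}
```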
-
sumAllRowsToDoubleWithReference
public double[] sumAllRowsToDoubleWithReference(double[] reference)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values with a reference.
- Parameters:
  reference - The reference values to add to each cell.
- Returns:
  a double array containing the row sums from this dictionary.
-
sumAllRowsToDoubleSq
public double[] sumAllRowsToDoubleSq(int nrColumns)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values. Note that if the number of columns is one, the dictionary's actual values are simply returned.
- Parameters:
  nrColumns - The number of columns in the ColGroup to know how to get the values from the dictionary.
- Returns:
  a double array containing the row sums from this dictionary.
-
sumAllRowsToDoubleSqWithDefault
public double[] sumAllRowsToDoubleSqWithDefault(double[] defaultTuple)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values. Adds another cell to the returned array with an extra value that is the sum of the given defaultTuple.
- Parameters:
  defaultTuple - The default row to sum in the end index returned.
- Returns:
  a double array containing the row sums from this dictionary.
-
productAllRowsToDouble
public double[] productAllRowsToDouble(int nCol)
Description copied from interface: IDictionary
Computes the product of the values in each row, producing a column vector.
- Parameters:
  nCol - The number of columns in the ColGroup to know how to get the values from the dictionary.
- Returns:
  A row product
-
productAllRowsToDoubleWithDefault
public double[] productAllRowsToDoubleWithDefault(double[] defaultTuple)
Description copied from interface: IDictionary
Computes the product of the values in each row, producing a column vector with a default value added at the end.
- Parameters:
  defaultTuple - The default row that is aggregated into the last cell
- Returns:
  A row product
-
productAllRowsToDoubleWithReference
public double[] productAllRowsToDoubleWithReference(double[] reference)
Description copied from interface: IDictionary
Computes the product of the values in each row, producing a column vector, with the reference values added to all cells and a reference product at the end.
- Parameters:
  reference - The reference row
- Returns:
  A row product
-
sumAllRowsToDoubleSqWithReference
public double[] sumAllRowsToDoubleSqWithReference(double[] reference)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
- Parameters:
  reference - The reference values to add to each cell.
- Returns:
  a double array containing the row sums from this dictionary.
-
sumRowWithReference
public double sumRowWithReference(int k, int nrColumns, double[] reference)
-
colSum
public void colSum(double[] c, int[] counts, IColIndex colIndexes)
Description copied from interface: IDictionary
Get the column sum of the values contained in the dictionary.
- Parameters:
  c - The output array allocated to contain the output of all column groups.
  counts - The counts of the individual tuples.
  colIndexes - The column indexes of the parent column group; these indicate where to put the column sums in the c output.
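Because each distinct tuple is stored once, the column sum has to weight every tuple by how often it occurs and then scatter the result into the shared output. The sketch below shows one way to read the parameters, assuming a row-major layout and an `int[]` in place of `IColIndex`; `ColSumSketch` is a hypothetical name.

```java
/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class ColSumSketch {

    // Adds value * count for every cell into the output slot of its column.
    static void colSum(double[] c, double[] values, int[] counts, int[] colIndexes) {
        final int nCol = colIndexes.length;
        for (int r = 0; r < counts.length; r++)
            for (int j = 0; j < nCol; j++)
                c[colIndexes[j]] += values[r * nCol + j] * counts[r];
    }

    public static void main(String[] args) {
        double[] c = new double[3];      // output spanning three matrix columns
        double[] values = {1, 2, 3, 4};  // two tuples, two columns
        int[] counts = {2, 3};           // first tuple occurs twice, second three times
        colSum(c, values, counts, new int[] {0, 2});
        System.out.println(java.util.Arrays.toString(c));
    }
}
```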
-
colSumSq
public void colSumSq(double[] c, int[] counts, IColIndex colIndexes)
Description copied from interface: IDictionary
Get the column sum of squares of the values contained in the dictionary.
- Parameters:
  c - The output array allocated to contain the output of all column groups.
  counts - The counts of the individual tuples.
  colIndexes - The column indexes of the parent column group; these indicate where to put the column sums in the c output.
-
colProduct
public void colProduct(double[] res, int[] counts, IColIndex colIndexes)
Description copied from interface: IDictionary
Calculate the column product of the dictionary weighted by counts.
- Parameters:
  res - The result vector to put the result into
  counts - The weighted count of individual tuples
  colIndexes - The column indexes.
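For a product, weighting by counts plausibly means that a value occurring count times contributes the value raised to that count. The sketch below follows that reading, which is an interpretation rather than something the page states; it also assumes the caller initializes res to the multiplicative identity, uses an `int[]` in place of `IColIndex`, and uses the hypothetical name `ColProductSketch`.

```java
/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class ColProductSketch {

    // Multiplies value^count for every cell into the output slot of its column.
    static void colProduct(double[] res, double[] values, int[] counts, int[] colIndexes) {
        final int nCol = colIndexes.length;
        for (int r = 0; r < counts.length; r++)
            for (int j = 0; j < nCol; j++)
                res[colIndexes[j]] *= Math.pow(values[r * nCol + j], counts[r]);
    }

    public static void main(String[] args) {
        double[] res = {1, 1};           // assumed to start at the identity
        double[] values = {2, 3, 5, 1};  // two tuples, two columns
        int[] counts = {2, 1};
        colProduct(res, values, counts, new int[] {0, 1});
        System.out.println(java.util.Arrays.toString(res));
    }
}
```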
-
colProductWithReference
public void colProductWithReference(double[] res, int[] counts, IColIndex colIndexes, double[] reference)
Description copied from interface: IDictionary
Calculate the column product of the dictionary weighted by counts.
- Parameters:
  res - The result vector to put the result into
  counts - The weighted count of individual tuples
  colIndexes - The column indexes.
  reference - The reference value.
-
colSumSqWithReference
public void colSumSqWithReference(double[] c, int[] counts, IColIndex colIndexes, double[] reference)
Description copied from interface: IDictionary
Get the column sum of squares of the values contained in the dictionary with an offset reference value added to each cell.
- Parameters:
  c - The output array allocated to contain the output of all column groups.
  counts - The counts of the individual tuples.
  colIndexes - The column indexes of the parent column group; these indicate where to put the column sums in the c output.
  reference - The reference values to add to each cell.
-
sum
public double sum(int[] counts, int nCol)
Description copied from interface: IDictionary
Get the sum of the values contained in the dictionary.
- Parameters:
  counts - The counts of the individual tuples
  nCol - The number of columns contained
- Returns:
  The sum scaled by the counts provided.
-
sumSq
public double sumSq(int[] counts, int nCol)
Description copied from interface: IDictionary
Get the square sum of the values contained in the dictionary.
- Parameters:
  counts - The counts of the individual tuples
  nCol - The number of columns contained
- Returns:
  The square sum scaled by the counts provided.
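Both sum and sumSq scale each tuple's contribution by its count, so the result matches what a pass over the uncompressed column group would give. A minimal sketch, assuming a row-major layout and using the hypothetical name `WeightedSumSketch`:

```java
/** Hypothetical stand-in for illustration; not the SystemDS Dictionary class. */
final class WeightedSumSketch {

    // Sum of all cells, each tuple weighted by its occurrence count.
    static double sum(double[] values, int[] counts, int nCol) {
        double total = 0;
        for (int r = 0; r < counts.length; r++)
            for (int c = 0; c < nCol; c++)
                total += values[r * nCol + c] * counts[r];
        return total;
    }

    // Sum of squared cells, each tuple weighted by its occurrence count.
    static double sumSq(double[] values, int[] counts, int nCol) {
        double total = 0;
        for (int r = 0; r < counts.length; r++)
            for (int c = 0; c < nCol; c++) {
                final double v = values[r * nCol + c];
                total += v * v * counts[r];
            }
        return total;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4}; // two tuples, two columns
        int[] counts = {2, 1};
        System.out.println(sum(values, counts, 2));
        System.out.println(sumSq(values, counts, 2));
    }
}
```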
-
sumSqWithReference
public double sumSqWithReference(int[] counts, double[] reference)
Description copied from interface: IDictionary
Get the square sum of the values contained in the dictionary with a reference offset on each value.
- Parameters:
  counts - The counts of the individual tuples
  reference - The reference value
- Returns:
  The square sum scaled by the counts and reference.
-
getString
public String getString(int colIndexes)
Description copied from interface: IDictionary
Get a string representation of the dictionary that considers the layout of the data.
- Parameters:
  colIndexes - The number of columns in the dictionary.
- Returns:
  A string that is nicer to print.
-
sliceOutColumnRange
public IDictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
Description copied from interface: IDictionary
Modify the dictionary by removing columns not within the index range.
- Parameters:
idxStart
- The column index to start at.
idxEnd
- The column index to end at (not inclusive)
previousNumberOfColumns
- The number of columns contained in the dictionary.
- Returns:
- A dictionary containing the sliced out columns values only.
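As a rough illustration of the half-open column range, the following sketch slices columns out of a dense row-major array. It assumes the number of tuples is values.length / previousNumberOfColumns. DictSliceSketch and the values parameter are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictSliceSketch {

    /** Keep only columns idxStart (inclusive) to idxEnd (exclusive) of every tuple. */
    static double[] sliceOutColumnRange(double[] values, int idxStart, int idxEnd,
            int previousNumberOfColumns) {
        final int nRows = values.length / previousNumberOfColumns;
        final int nColOut = idxEnd - idxStart;
        final double[] out = new double[nRows * nColOut];
        for (int i = 0; i < nRows; i++)
            System.arraycopy(values, i * previousNumberOfColumns + idxStart, out, i * nColOut, nColOut);
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6}; // tuples (1, 2, 3) and (4, 5, 6)
        System.out.println(Arrays.toString(sliceOutColumnRange(values, 1, 3, 3)));
    }
}
```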
-
containsValue
public boolean containsValue(double pattern)
Description copied from interface: IDictionary
Detect if the dictionary contains a specific value.
- Parameters:
pattern
- The value to search for
- Returns:
- true if the value is contained else false.
-
containsValueWithReference
public boolean containsValueWithReference(double pattern, double[] reference)
Description copied from interface: IDictionary
Detect if the dictionary contains a specific value with reference offset.
- Parameters:
pattern
- The pattern/value to search for
reference
- The reference double array.
- Returns:
- true if the value is contained else false.
-
getNumberNonZeros
public long getNumberNonZeros(int[] counts, int nCol)
Description copied from interface: IDictionary
Calculate the number of non zeros in the dictionary. The number of non zeros should be scaled with the counts given. This gives the exact number of non zero values in the parent column group.
- Parameters:
counts
- The counts of each dictionary entry
nCol
- The number of columns in this dictionary
- Returns:
- The nonZero count
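The scaling by counts can be sketched as follows, assuming a dense row-major value array. DictNnzSketch and the values parameter are hypothetical.

```java
/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictNnzSketch {

    /** Count non-zero cells per tuple and multiply by how often the tuple occurs. */
    static long getNumberNonZeros(double[] values, int[] counts, int nCol) {
        long nnz = 0;
        for (int i = 0; i < counts.length; i++) {
            long tupleNnz = 0;
            for (int j = 0; j < nCol; j++)
                if (values[i * nCol + j] != 0)
                    tupleNnz++;
            nnz += tupleNnz * counts[i];
        }
        return nnz;
    }

    public static void main(String[] args) {
        double[] values = {0, 2, 3, 0, 5, 6}; // tuples (0, 2), (3, 0), (5, 6)
        System.out.println(getNumberNonZeros(values, new int[] {4, 1, 2}, 2));
    }
}
```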
-
getNumberNonZerosWithReference
public long getNumberNonZerosWithReference(int[] counts, double[] reference, int nRows)
Description copied from interface: IDictionary
Calculate the number of non zeros in the dictionary. Each value in the dictionary should be added to the reference value. The number of non zeros should be scaled with the given counts.
- Parameters:
counts
- The Counts of each dict entry.
reference
- The reference vector.
nRows
- The number of rows in the input.
- Returns:
- The NonZero Count.
-
addToEntry
public final void addToEntry(double[] v, int fr, int to, int nCol)
Description copied from interface: IDictionary
Adds the dictionary entry from this dictionary to the v dictionary.
- Parameters:
v
- The target dictionary (dense double array)
fr
- The from index is the tuple index to copy from.
to
- The to index is the row index to copy into.
nCol
- The number of columns in both cases
-
addToEntry
public final void addToEntry(double[] v, int fr, int to, int nCol, int rep)
Description copied from interface: IDictionary
Adds the dictionary entry from this dictionary to the v dictionary rep times.
- Parameters:
v
- The target dictionary (dense double array)
fr
- The from index is the tuple index to copy from.
to
- The to index is the row index to copy into.
nCol
- The number of columns in both cases
rep
- The number of repetitions to apply (applied as a single multiplication, not a loop)
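A minimal sketch of the repeated add, assuming both the dictionary and the target are dense row-major arrays with nCol values per row. DictAddToEntrySketch and the dict parameter are hypothetical names standing in for the dictionary's own values.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictAddToEntrySketch {

    /** Add tuple fr of dict, multiplied by rep, onto row to of the target v. */
    static void addToEntry(double[] dict, double[] v, int fr, int to, int nCol, int rep) {
        final int src = fr * nCol; // start of the source tuple
        final int dst = to * nCol; // start of the target row
        for (int j = 0; j < nCol; j++)
            v[dst + j] += dict[src + j] * rep;
    }

    public static void main(String[] args) {
        double[] dict = {1, 2, 3, 4}; // tuples (1, 2) and (3, 4)
        double[] v = new double[4];   // two target rows, initially zero
        addToEntry(dict, v, 1, 0, 2, 3);
        System.out.println(Arrays.toString(v));
    }
}
```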
-
addToEntryVectorized
public void addToEntryVectorized(double[] v, int f1, int f2, int f3, int f4, int f5, int f6, int f7, int f8, int t1, int t2, int t3, int t4, int t5, int t6, int t7, int t8, int nCol)
Description copied from interface: IDictionary
Vectorized add to entry, this call helps with a bit of locality for the cache.
- Parameters:
v
- The target dictionary (dense double array)
f1
- From index 1
f2
- From index 2
f3
- From index 3
f4
- From index 4
f5
- From index 5
f6
- From index 6
f7
- From index 7
f8
- From index 8
t1
- To index 1
t2
- To index 2
t3
- To index 3
t4
- To index 4
t5
- To index 5
t6
- To index 6
t7
- To index 7
t8
- To index 8
nCol
- Number of columns in the dictionary
-
getDictType
public IDictionary.DictType getDictType()
Description copied from interface: IDictionary
Get the type of this dictionary.
- Returns:
- The type of this dictionary.
-
subtractTuple
public IDictionary subtractTuple(double[] tuple)
Description copied from interface: IDictionary
Allocate a new dictionary where the tuple given is subtracted from all tuples in the previous dictionary.
- Parameters:
tuple
- A double array representing a tuple; the tuple width is assumed to be the same as this dictionary's.
- Returns:
- a new instance of dictionary with the tuple subtracted.
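The subtraction can be sketched as below, assuming dense row-major values whose tuple width equals tuple.length. DictSubtractSketch and the values parameter are hypothetical. The input array is left untouched, matching the "allocate a new dictionary" wording.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictSubtractSketch {

    /** Return a new array in which tuple[j] is subtracted from column j of every tuple. */
    static double[] subtractTuple(double[] values, double[] tuple) {
        final int nCol = tuple.length;
        final double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++)
            out[i] = values[i] - tuple[i % nCol]; // i % nCol is the column of cell i
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4}; // tuples (1, 2) and (3, 4)
        System.out.println(Arrays.toString(subtractTuple(values, new double[] {1, 1})));
    }
}
```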
-
getMBDict
public MatrixBlockDictionary getMBDict(int nCol)
Description copied from interface: IDictionary
Get this dictionary as a MatrixBlock dictionary. This allows us to use optimized kernels coded elsewhere in the system, such as matrix multiplication. Return null if the matrix is empty.
- Parameters:
nCol
- The number of columns contained in this column group.
- Returns:
- A Dictionary containing a MatrixBlock.
-
aggregateCols
public void aggregateCols(double[] c, Builtin fn, IColIndex colIndexes)
Description copied from interface: IDictionary
Aggregates the columns into the target double array provided.
- Parameters:
c
- The target double array. It contains the full number of columns, so the colIndexes for this specific dictionary are needed.
fn
- The function to apply to individual columns
colIndexes
- The mapping to the target columns from the individual columns
-
aggregateColsWithReference
public void aggregateColsWithReference(double[] c, Builtin fn, IColIndex colIndexes, double[] reference, boolean def)
Description copied from interface: IDictionary
Aggregates the columns into the target double array provided.
- Parameters:
c
- The target double array. It contains the full number of columns, so the colIndexes for this specific dictionary are needed.
fn
- The function to apply to individual columns
colIndexes
- The mapping to the target columns from the individual columns
reference
- The reference offset values to add to each cell.
def
- If the reference should be treated as a tuple as well
-
scaleTuples
public IDictionary scaleTuples(int[] scaling, int nCol)
Description copied from interface: IDictionary
Scale all tuples contained in the dictionary by the scaling factor given in the int list.
- Parameters:
scaling
- The amount to multiply the given tuples with
nCol
- The number of columns contained in this column group.
- Returns:
- A New dictionary (since we don't want to modify the underlying dictionary)
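A small sketch of per-tuple scaling, assuming dense row-major values and one scaling factor per tuple. DictScaleSketch and the values parameter are hypothetical. A fresh array is returned so the input stays unmodified.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictScaleSketch {

    /** Multiply every cell of tuple i by scaling[i] and return the result as a new array. */
    static double[] scaleTuples(double[] values, int[] scaling, int nCol) {
        final double[] out = new double[values.length];
        for (int i = 0; i < scaling.length; i++)
            for (int j = 0; j < nCol; j++)
                out[i * nCol + j] = values[i * nCol + j] * scaling[i];
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4}; // tuples (1, 2) and (3, 4)
        System.out.println(Arrays.toString(scaleTuples(values, new int[] {2, 3}, 2)));
    }
}
```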
-
preaggValuesFromDense
public Dictionary preaggValuesFromDense(int numVals, IColIndex colIndexes, IColIndex aggregateColumns, double[] b, int cut)
Description copied from interface: IDictionary
Pre Aggregate values for Right Matrix Multiplication.
- Parameters:
numVals
- The number of values contained in this dictionary
colIndexes
- The column indexes that are associated with the parent column group
aggregateColumns
- The columns to aggregate; these are preprocessed to remove empty columns from consideration
b
- The values in the right hand side matrix
cut
- The number of columns in b.
- Returns:
- A new dictionary with the pre aggregated values.
-
replace
public IDictionary replace(double pattern, double replace, int nCol)
Description copied from interface: IDictionary
Make a copy of the values, and replace all values that match pattern with replacement value. If needed add a new column index.
- Parameters:
pattern
- The value to look for
replace
- The value to replace the other value with
nCol
- The number of columns contained in the dictionary.
- Returns:
- A new Column Group, reusing the index structure but with new values.
-
replaceWithReference
public IDictionary replaceWithReference(double pattern, double replace, double[] reference)
Description copied from interface: IDictionary
Make a copy of the values, and replace all values that match pattern with replacement value. If needed add a new column index. Each value in the dictionary is considered offset by the values contained in the reference.
- Parameters:
pattern
- The value to look for
replace
- The value to replace the other value with
reference
- The reference tuple to add to all entries when replacing
- Returns:
- A new Column Group, reusing the index structure but with new values.
-
product
public void product(double[] ret, int[] counts, int nCol)
Description copied from interface: IDictionary
Calculate the product of the dictionary weighted by counts.
- Parameters:
ret
- The result dense double array (containing one value)
counts
- The count of individual tuples
nCol
- Number of columns in the dictionary.
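One plausible reading of "weighted by counts" is that a tuple occurring counts[i] times contributes each of its values raised to the power counts[i]. The sketch below shows that reading only. It assumes dense row-major values and that ret[0] holds the running product on entry. DictProductSketch and the values parameter are hypothetical.

```java
/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictProductSketch {

    /** Multiply ret[0] by every cell value raised to the count of its tuple. */
    static void product(double[] values, double[] ret, int[] counts, int nCol) {
        for (int i = 0; i < counts.length; i++)
            for (int j = 0; j < nCol; j++)
                ret[0] *= Math.pow(values[i * nCol + j], counts[i]);
    }

    public static void main(String[] args) {
        double[] ret = {1};             // running product, starting at the identity
        double[] values = {2, 3};       // two single-column tuples
        product(values, ret, new int[] {3, 1}, 1);
        System.out.println(ret[0]);
    }
}
```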
-
productWithDefault
public void productWithDefault(double[] ret, int[] counts, double[] def, int defCount)
Description copied from interface: IDictionary
Calculate the product of the dictionary weighted by counts with a default value added.
- Parameters:
ret
- The result dense double array (containing one value)
counts
- The count of individual tuples
def
- The default tuple
defCount
- The count of the default tuple
-
productWithReference
public void productWithReference(double[] ret, int[] counts, double[] reference, int refCount)
Description copied from interface: IDictionary
Calculate the product of the dictionary weighted by counts and offset by reference.
- Parameters:
ret
- The result dense double array (containing one value)
counts
- The counts of each entry in the dictionary
reference
- The reference value.
refCount
- The number of occurrences of the ref value.
-
centralMoment
public CM_COV_Object centralMoment(CM_COV_Object ret, ValueFunction fn, int[] counts, int nRows)
Description copied from interface: IDictionary
Central moment function to calculate the central moment of this column group. MUST be on a single column dictionary.
- Parameters:
ret
- The Central Moment object to be modified and returned
fn
- The value function to apply
counts
- The weight of individual tuples
nRows
- The number of rows in total of the column group
- Returns:
- The central moment Object
-
centralMomentWithDefault
public CM_COV_Object centralMomentWithDefault(CM_COV_Object ret, ValueFunction fn, int[] counts, double def, int nRows)
Description copied from interface: IDictionary
Central moment function to calculate the central moment of this column group with a default offset on all missing tuples. MUST be on a single column dictionary.
- Parameters:
ret
- The Central Moment object to be modified and returned
fn
- The value function to apply
counts
- The weight of individual tuples
def
- The default values to offset the tuples with
nRows
- The number of rows in total of the column group
- Returns:
- The central moment Object
-
centralMomentWithReference
public CM_COV_Object centralMomentWithReference(CM_COV_Object ret, ValueFunction fn, int[] counts, double reference, int nRows)
Description copied from interface: IDictionary
Central moment function to calculate the central moment of this column group with a reference offset on each tuple. MUST be on a single column dictionary.
- Parameters:
ret
- The Central Moment object to be modified and returned
fn
- The value function to apply
counts
- The weight of individual tuples
reference
- The reference values to offset the tuples with
nRows
- The number of rows in total of the column group
- Returns:
- The central moment Object
-
rexpandCols
public IDictionary rexpandCols(int max, boolean ignore, boolean cast, int nCol)
Description copied from interface: IDictionary
Rexpand the dictionary (one hot encode)
- Parameters:
max
- the tuple width of the output
ignore
- If we should ignore zero and negative values
cast
- If we should cast all double values to whole integer values
nCol
- The number of columns in the dictionary already (should be 1)
- Returns:
- A new dictionary
-
rexpandColsWithReference
public IDictionary rexpandColsWithReference(int max, boolean ignore, boolean cast, int reference)
Description copied from interface: IDictionary
Rexpand the dictionary (one hot encode)
- Parameters:
max
- the tuple width of the output
ignore
- If we should ignore zero and negative values
cast
- If we should cast all double values to whole integer values
reference
- A reference value to add to all tuples before expanding
- Returns:
- A new dictionary
-
getSparsity
public double getSparsity()
Description copied from interface: IDictionary
Get the sparsity of the dictionary.
- Returns:
- a sparsity between 0 and 1
-
multiplyScalar
public void multiplyScalar(double v, double[] ret, int off, int dictIdx, IColIndex cols)
Description copied from interface: IDictionary
Multiply the v value with the dictionary entry at dictIdx and add it to the ret matrix at the columns specified in cols.
- Parameters:
v
- Value to multiply
ret
- Output dense double array location
off
- Offset into the ret array that the "row" output starts at
dictIdx
- The dictionary entry to multiply.
cols
- The columns of the output to multiply into.
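The indexing involved can be sketched as follows. The sketch assumes dense row-major dictionary values whose tuple width equals the number of column indexes, and it uses a plain int[] in place of IColIndex. DictMultiplyScalarSketch and the dict parameter are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictMultiplyScalarSketch {

    /** Add v * (tuple dictIdx of dict) into ret, at offset off plus each target column. */
    static void multiplyScalar(double[] dict, double v, double[] ret, int off, int dictIdx, int[] cols) {
        final int src = dictIdx * cols.length; // start of the dictionary tuple
        for (int j = 0; j < cols.length; j++)
            ret[off + cols[j]] += v * dict[src + j];
    }

    public static void main(String[] args) {
        double[] dict = {1, 2, 3, 4}; // tuples (1, 2) and (3, 4)
        double[] ret = new double[8]; // output with 2 rows of 4 columns
        // Write into the second output row (off = 4), columns 0 and 3.
        multiplyScalar(dict, 2.0, ret, 4, 1, new int[] {0, 3});
        System.out.println(Arrays.toString(ret));
    }
}
```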
-
TSMMWithScaling
public void TSMMWithScaling(int[] counts, IColIndex rows, IColIndex cols, MatrixBlock ret)
Description copied from interface: IDictionary
Transpose self matrix multiplication with a scaling factor on each pair of values.
- Parameters:
counts
- The scaling factor
rows
- The row indexes
cols
- The col indexes
ret
- The output matrix block
-
MMDict
public void MMDict(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication of dictionaries. Note that the left side is this dictionary, and it is transposed.
- Parameters:
right
- Right hand side of multiplication
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
-
MMDictScaling
public void MMDictScaling(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result, int[] scaling)
Description copied from interface: IDictionary
Matrix multiplication of dictionaries. Note that the left side is this dictionary, and it is transposed.
- Parameters:
right
- Right hand side of multiplication
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
scaling
- The scaling
-
MMDictDense
public void MMDictDense(double[] left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication of dictionaries, where the left side is dense and transposed and the right side is this dictionary.
- Parameters:
left
- Dense left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
-
MMDictScalingDense
public void MMDictScalingDense(double[] left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result, int[] scaling)
Description copied from interface: IDictionary
Matrix multiplication of dictionaries, where the left side is dense and transposed and the right side is this dictionary.
- Parameters:
left
- Dense left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
scaling
- The scaling
-
MMDictSparse
public void MMDictSparse(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication of dictionaries, where the left side is sparse and transposed and the right side is this dictionary.
- Parameters:
left
- Sparse left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
-
MMDictScalingSparse
public void MMDictScalingSparse(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result, int[] scaling)
Description copied from interface: IDictionary
Matrix multiplication of dictionaries, where the left side is sparse and transposed and the right side is this dictionary.
- Parameters:
left
- Sparse left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
scaling
- The scaling
-
TSMMToUpperTriangle
public void TSMMToUpperTriangle(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication, but allocate output in the upper triangle and twice if on the diagonal. Note that this dictionary is the left side.
- Parameters:
right
- Right side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
-
TSMMToUpperTriangleDense
public void TSMMToUpperTriangleDense(double[] left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication, but allocate output in the upper triangle and twice if on the diagonal. Note that this dictionary is the right side.
- Parameters:
left
- Dense left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
-
TSMMToUpperTriangleSparse
public void TSMMToUpperTriangleSparse(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication, but allocate output in the upper triangle and twice if on the diagonal. Note that this dictionary is the right side.
- Parameters:
left
- Sparse left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
result
- The output matrix block
-
TSMMToUpperTriangleScaling
public void TSMMToUpperTriangleScaling(IDictionary right, IColIndex rowsLeft, IColIndex colsRight, int[] scale, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication, but allocate output in the upper triangle and twice if on the diagonal. Note that this dictionary is the left side.
- Parameters:
right
- Right side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
scale
- Scale factor
result
- The output matrix block
-
TSMMToUpperTriangleDenseScaling
public void TSMMToUpperTriangleDenseScaling(double[] left, IColIndex rowsLeft, IColIndex colsRight, int[] scale, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication, but allocate output in the upper triangle and twice if on the diagonal. Note that this dictionary is the right side.
- Parameters:
left
- Dense left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
scale
- Scale factor
result
- The output matrix block
-
TSMMToUpperTriangleSparseScaling
public void TSMMToUpperTriangleSparseScaling(SparseBlock left, IColIndex rowsLeft, IColIndex colsRight, int[] scale, MatrixBlock result)
Description copied from interface: IDictionary
Matrix multiplication, but allocate output in the upper triangle and twice if on the diagonal. Note that this dictionary is the right side.
- Parameters:
left
- Sparse left side
rowsLeft
- Offset rows on the left
colsRight
- Offset cols on the right
scale
- Scale factor
result
- The output matrix block
-
equals
public boolean equals(IDictionary o)
Description copied from interface: IDictionary
Indicate if the other dictionary is equal to this.
- Parameters:
o
- The other object
- Returns:
- If it is equal
-
cbind
public IDictionary cbind(IDictionary that, int nCol)
Description copied from interface: IDictionary
Cbind this dictionary with that dictionary
- Parameters:
that
- the right hand side dictionary to cbind
nCol
- the right hand side number of columns
- Returns:
- The combined dictionary
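Column binding of two dictionaries can be sketched as an interleaved copy of their tuples. The sketch assumes both sides are dense row-major arrays with the same number of tuples, so the left width can be derived from the right one. DictCbindSketch and its left and right parameters are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the SystemDS implementation. */
final class DictCbindSketch {

    /** Append tuple i of right after tuple i of left, for every tuple. */
    static double[] cbind(double[] left, double[] right, int nColRight) {
        final int nRows = right.length / nColRight;
        final int nColLeft = left.length / nRows; // assumes equal tuple counts on both sides
        final int nColOut = nColLeft + nColRight;
        final double[] out = new double[nRows * nColOut];
        for (int i = 0; i < nRows; i++) {
            System.arraycopy(left, i * nColLeft, out, i * nColOut, nColLeft);
            System.arraycopy(right, i * nColRight, out, i * nColOut + nColLeft, nColRight);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] left = {1, 2};        // tuples (1) and (2)
        double[] right = {3, 4, 5, 6}; // tuples (3, 4) and (5, 6)
        System.out.println(Arrays.toString(cbind(left, right, 2)));
    }
}
```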
-
reorder
public IDictionary reorder(int[] reorder)
Description copied from interface: IDictionary
Reorder the elements in the dictionary based on the reorder specification given.
- Parameters:
reorder
- The order to move to.
- Returns:
- A new Dictionary that is reordered.
-
-