Class FrameBlock
- java.lang.Object
-
- org.apache.sysds.runtime.frame.data.FrameBlock
-
- All Implemented Interfaces:
Externalizable, Serializable, org.apache.hadoop.io.Writable, CacheBlock<FrameBlock>
public class FrameBlock extends Object implements CacheBlock<FrameBlock>, Externalizable
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class
FrameBlock.FrameMapFunction
-
Field Summary
Fields
Modifier and Type | Field | Description
static int
BUFFER_SIZE
Buffer size variable: 1M elements, size of default matrix block
static boolean
debug
If debugging is enabled for the FrameBlocks in stable state
-
Constructor Summary
Constructors
Constructor | Description
FrameBlock()
FrameBlock(int ncols, Types.ValueType vt)
FrameBlock(Types.ValueType[] schema)
FrameBlock(Types.ValueType[] schema, int rlen)
FrameBlock(Types.ValueType[] schema, String[] names)
FrameBlock(Types.ValueType[] schema, String[][] data)
FrameBlock(Types.ValueType[] schema, String[] names, int rlen)
FrameBlock(Types.ValueType[] schema, String[] names, String[][] data)
Allocate a FrameBlock with the given data arrays.
FrameBlock(Types.ValueType[] schema, String[] colNames, ColumnMetadata[] meta, Array<?>[] data)
FrameBlock(Types.ValueType[] schema, String constant, int nRow)
FrameBlock constructor with constant
FrameBlock(Array<?>[] data)
Create a FrameBlock containing columns of the specified arrays
FrameBlock(Array<?>[] data, String[] colnames)
Create a FrameBlock containing columns of the specified arrays and names
FrameBlock(FrameBlock that)
Copy constructor for frame blocks, which uses a shallow copy for the schema (column types and names) but a deep copy for meta data and actual column data.
-
Method Summary
Modifier and Type | Method | Description
FrameBlock
append(FrameBlock that, boolean cbind)
Appends the given argument FrameBlock 'that' to this FrameBlock by creating a deep copy to prevent side effects.
void
appendColumn(boolean[] col)
Append a column of value type BOOLEAN as the last column of the data frame.
void
appendColumn(double[] col)
Append a column of value type DOUBLE as the last column of the data frame.
void
appendColumn(float[] col)
Append a column of value type float as the last column of the data frame.
void
appendColumn(int[] col)
Append a column of value type INT as the last column of the data frame.
void
appendColumn(long[] col)
Append a column of value type LONG as the last column of the data frame.
void
appendColumn(String[] col)
Append a column of value type STRING as the last column of the data frame.
void
appendColumn(Array col)
Add a column of already allocated Array type.
void
appendColumns(double[][] cols)
Append a set of columns of value type DOUBLE at the end of the frame in order to avoid repeated allocation with appendColumns.
void
appendRow(Object[] row)
Append a row to the end of the data frame, where all row fields are boxed objects according to the schema.
void
appendRow(String[] row)
Append a row to the end of the data frame, where all row fields are string encoded.
FrameBlock
applySchema(FrameBlock schema)
FrameBlock
applySchema(FrameBlock schema, int k)
FrameBlock
binaryOperations(BinaryOperator bop, FrameBlock that, FrameBlock out)
Performs an element-wise value comparison of two frames (equal, not equal, less than, greater than, less than or equal, greater than or equal); the output frame stores a boolean value for each comparison.
void
compactEmptyBlock()
Free unnecessarily allocated empty block.
static FrameBlock
convertToFrameBlock(MatrixBlock mb, Types.ValueType[] schema, int k)
FrameBlock
copy()
void
copy(int rl, int ru, int cl, int cu, FrameBlock src)
Copy the src frame block into the index range of the current frame block.
void
copy(FrameBlock src)
FrameBlock
copyShallow()
static String
createColName(int i)
static String[]
createColNames(int size)
static String[]
createColNames(int off, int size)
FrameBlock
detectSchema(double sampleFraction, int k)
FrameBlock
detectSchema(int k)
FrameBlock
dropInvalidType(FrameBlock schema)
Drop cell values that do not conform to the data type of their column.
void
ensureAllocatedColumns(int numRows)
Allocate column data structures if necessary, i.e., if schema specified but not all column data structures created yet.
void
ensureColumnCompatibility(int newLen)
Checks for matching column sizes in case of existing columns.
FrameBlock
frameRowReplication(FrameBlock rowToreplicate)
Object
get(int r, int c)
Gets a boxed object of the value in position (r,c).
Array<?>
getColumn(int c)
Object
getColumnData(int c)
ColumnMetadata[]
getColumnMetadata()
ColumnMetadata
getColumnMetadata(int c)
String
getColumnName(int c)
Returns the column name for the requested column.
Map<String,Integer>
getColumnNameIDMap()
Creates a mapping from column names to column IDs, i.e., 1-based column indexes
String[]
getColumnNames()
Returns the column names of the frame block.
String[]
getColumnNames(boolean alloc)
Returns the column names of the frame block.
FrameBlock
getColumnNamesAsFrame()
Array<?>[]
getColumns()
Types.ValueType
getColumnType(int c)
static FrameBlock.FrameMapFunction
getCompiledFunction(String lambdaExpr, long margin)
DataCharacteristics
getDataCharacteristics()
double
getDouble(int r, int c)
Returns the double value at the passed row and column.
double
getDoubleNaN(int r, int c)
Returns the double value at the passed row and column.
long
getExactSerializedSize()
Get the exact serialized size in bytes of the cache block.
long
getInMemorySize()
Get the in-memory size in bytes of the cache block.
int
getNumColumns()
Get the number of columns of the frame block, that is the number of columns defined in the schema.
int
getNumRows()
Get the number of rows of the frame block.
Map<Object,Long>
getRecodeMap(int col)
Splits every recode map entry in the column at the delimiter Lop.DATATYPE_PREFIX (entries were generated earlier in the form Code+Lop.DATATYPE_PREFIX+Token) and stores the result in a map that holds the token and code of every unique token.
Types.ValueType[]
getSchema()
Returns the schema of the frame block.
FrameBlock
getSchemaTypeOf()
String
getString(int r, int c)
Returns the string of the value at the passed row and column.
FrameBlock
invalidByLength(MatrixBlock feaLen)
Validates the frame data against an attribute length constraint: if the length of the value in any cell exceeds the specified threshold of that attribute, the output frame stores null at that cell position, thus removing the length-violating values.
boolean
isColNameDefault(int i)
boolean
isColNamesDefault()
boolean
isColumnMetadataDefault()
boolean
isColumnMetadataDefault(int c)
boolean
isShallowSerialize()
Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
boolean
isShallowSerialize(boolean inclConvert)
Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
FrameBlock
leftIndexingOperations(FrameBlock rhsFrame, int rl, int ru, int cl, int cu, FrameBlock ret)
FrameBlock
leftIndexingOperations(FrameBlock rhsFrame, IndexRange ixrange, FrameBlock ret)
FrameBlock
map(String lambdaExpr, long margin)
FrameBlock
map(FrameBlock.FrameMapFunction lambdaExpr, long margin)
FrameBlock
mapDist(FrameBlock.FrameMapFunction lambdaExpr)
void
mapInplace(Function<String,String> fun)
FrameBlock
merge(FrameBlock that)
FrameBlock
merge(FrameBlock that, boolean appendOnly)
Merge disjoint: merges all non-zero values of the given input into the current block.
void
readExternal(ObjectInput in)
void
readFields(DataInput in)
void
recomputeColumnCardinality()
FrameBlock
removeEmptyOperations(boolean rows, boolean emptyReturn, MatrixBlock select)
<T> FrameBlock
replaceOperations(String pattern, String replacement)
void
reset()
void
reset(int nrow, boolean clearMeta)
void
set(int r, int c, Object val)
Sets the value in position (r,c), where the input is assumed to be a boxed object consistent with the schema definition.
void
set(int r, int c, String val)
Sets the value in position (r,c) to the input string value; the individual column arrays convert it to the correct type.
void
setColumn(int c, Array<?> column)
void
setColumnMetadata(int c, ColumnMetadata colmeta)
void
setColumnMetadata(ColumnMetadata[] colmeta)
void
setColumnName(int index, String name)
void
setColumnNames(String[] colnames)
void
setSchema(Types.ValueType[] schema)
Sets the schema of the frame block.
FrameBlock
slice(int rl, int ru)
Slice a sub block out of the current block and write into the given output block.
FrameBlock
slice(int rl, int ru, boolean deep)
Slice a sub block out of the current block and write into the given output block.
FrameBlock
slice(int rl, int ru, int cl, int cu)
Slice a sub block out of the current block and write into the given output block.
FrameBlock
slice(int rl, int ru, int cl, int cu, boolean deep)
Slice a sub block out of the current block and write into the given output block.
FrameBlock
slice(int rl, int ru, int cl, int cu, boolean deep, FrameBlock ret)
Slice a sub block out of the current block and write into the given output block.
FrameBlock
slice(int rl, int ru, int cl, int cu, FrameBlock ret)
Slice a sub block out of the current block and write into the given output block.
void
slice(ArrayList<Pair<Long,FrameBlock>> outList, IndexRange range, int rowCut)
FrameBlock
slice(IndexRange ixrange, FrameBlock ret)
Slice a sub block out of the current block and write into the given output block.
void
toShallowSerializeBlock()
Converts a cache block that is not shallow serializable into a form that is shallow serializable.
String
toString()
FrameBlock
valueSwap(FrameBlock schema)
void
write(DataOutput out)
void
writeExternal(ObjectOutput out)
FrameBlock
zeroOutOperations(FrameBlock result, IndexRange range, boolean complementary, int iRowStartSrc, int iRowStartDest, int blen, int iMaxRowsToCopy)
Zeroes out the data in the slicing window applicable to this block.
-
-
-
Field Detail
-
BUFFER_SIZE
public static final int BUFFER_SIZE
Buffer size variable: 1M elements, size of default matrix block
- See Also:
- Constant Field Values
-
debug
public static boolean debug
If debugging is enabled for the FrameBlocks in stable state
-
-
Constructor Detail
-
FrameBlock
public FrameBlock()
-
FrameBlock
public FrameBlock(FrameBlock that)
Copy constructor for frame blocks, which uses a shallow copy for the schema (column types and names) but a deep copy for meta data and actual column data.
- Parameters:
that
- frame block
-
FrameBlock
public FrameBlock(int ncols, Types.ValueType vt)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, int rlen)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[] names)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[] names, int rlen)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[][] data)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String constant, int nRow)
FrameBlock constructor with constant
- Parameters:
schema
- The schema to allocate (also specifying number of columns)
constant
- The constant to allocate in all cells
nRow
- the number of rows
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[] names, String[][] data)
Allocate a FrameBlock with the given data arrays. The data is in row-major order: the first dimension is the number of rows, the second the number of columns.
- Parameters:
schema
- the schema to allocate
names
- The names of the columns
data
- The data.
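
To make the row-major layout concrete, the following is a minimal stand-alone sketch that regroups a row-major String[][] into one array per column. It does not call the SystemDS API; the class RowMajorSketch and the helper toColumns are hypothetical names introduced only for this illustration, and how the constructor stores columns internally is not specified on this page.

```java
// Stand-alone sketch; it does not call the SystemDS API.
final class RowMajorSketch {
    // Regroup row-major cells (data[row][col]) into one array per column.
    // Assumes a rectangular input, i.e., every row has the same length.
    static String[][] toColumns(String[][] data) {
        int nrows = data.length;
        int ncols = nrows == 0 ? 0 : data[0].length;
        String[][] cols = new String[ncols][nrows];
        for (int r = 0; r < nrows; r++)
            for (int c = 0; c < ncols; c++)
                cols[c][r] = data[r][c];
        return cols;
    }

    public static void main(String[] args) {
        // First dimension: 3 rows; second dimension: 2 columns.
        String[][] data = { {"1", "a"}, {"2", "b"}, {"3", "c"} };
        String[][] cols = toColumns(data);
        System.out.println(cols.length + " columns with " + cols[0].length + " rows each");
    }
}
```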
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[] colNames, ColumnMetadata[] meta, Array<?>[] data)
-
FrameBlock
public FrameBlock(Array<?>[] data)
Create a FrameBlock containing columns of the specified arrays
- Parameters:
data
- The column data contained
-
-
Method Detail
-
getNumRows
public int getNumRows()
Get the number of rows of the frame block.
- Specified by:
getNumRows
in interface CacheBlock<FrameBlock>
- Returns:
- number of rows
-
getDouble
public double getDouble(int r, int c)
Description copied from interface: CacheBlock
Returns the double value at the passed row and column. If the value is missing, 0 is returned.
- Specified by:
getDouble
in interface CacheBlock<FrameBlock>
- Parameters:
r
- row of the value
c
- column of the value
- Returns:
- double value at the passed row and column
-
getDoubleNaN
public double getDoubleNaN(int r, int c)
Description copied from interface: CacheBlock
Returns the double value at the passed row and column. If the value is missing, NaN is returned.
- Specified by:
getDoubleNaN
in interface CacheBlock<FrameBlock>
- Parameters:
r
- row of the value
c
- column of the value
- Returns:
- double value at the passed row and column
-
getString
public String getString(int r, int c)
Description copied from interface: CacheBlock
Returns the string of the value at the passed row and column. If the value is missing or NaN, null is returned.
- Specified by:
getString
in interface CacheBlock<FrameBlock>
- Parameters:
r
- row of the value
c
- column of the value
- Returns:
- string of the value at the passed row and column
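
The three accessors getDouble, getDoubleNaN and getString differ only in what they report for a missing cell (0, NaN and null, respectively). The stand-alone sketch below mirrors that contract on a plain Double[] column, where a null entry stands in for a missing value. This is an assumption made for the illustration; the class MissingValueSketch is hypothetical and does not reflect how FrameBlock stores data.

```java
// Stand-alone sketch; it does not call the SystemDS API.
final class MissingValueSketch {
    // Missing value (modelled as null here) is reported as 0.
    static double getDouble(Double[] col, int r) {
        return col[r] == null ? 0.0 : col[r];
    }

    // Missing value is reported as NaN.
    static double getDoubleNaN(Double[] col, int r) {
        return col[r] == null ? Double.NaN : col[r];
    }

    // Missing or NaN value is reported as null.
    static String getString(Double[] col, int r) {
        return (col[r] == null || col[r].isNaN()) ? null : col[r].toString();
    }

    public static void main(String[] args) {
        Double[] col = { 1.5, null };
        System.out.println(getDouble(col, 1));
        System.out.println(getDoubleNaN(col, 1));
        System.out.println(getString(col, 1));
    }
}
```

Callers that need to distinguish a stored 0 from a missing cell would use the NaN or null variant rather than getDouble.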
-
getNumColumns
public int getNumColumns()
Get the number of columns of the frame block, that is the number of columns defined in the schema.
- Specified by:
getNumColumns
in interface CacheBlock<FrameBlock>
- Returns:
- number of columns
-
getDataCharacteristics
public DataCharacteristics getDataCharacteristics()
- Specified by:
getDataCharacteristics
in interface CacheBlock<FrameBlock>
-
getSchema
public Types.ValueType[] getSchema()
Returns the schema of the frame block.
- Returns:
- schema as array of ValueTypes
-
setSchema
public void setSchema(Types.ValueType[] schema)
Sets the schema of the frame block.
- Parameters:
schema
- schema as array of ValueTypes
-
getColumnNames
public String[] getColumnNames()
Returns the column names of the frame block. This method allocates default column names if required.
- Returns:
- column names
-
getColumnNamesAsFrame
public FrameBlock getColumnNamesAsFrame()
-
getColumnNames
public String[] getColumnNames(boolean alloc)
Returns the column names of the frame block. This method allocates default column names if required.
- Parameters:
alloc
- if true, create column names
- Returns:
- array of column names
-
getColumnName
public String getColumnName(int c)
Returns the column name for the requested column. This method allocates default column names if required.
- Parameters:
c
- column index
- Returns:
- column name
-
setColumnNames
public void setColumnNames(String[] colnames)
-
setColumnName
public void setColumnName(int index, String name)
-
getColumnMetadata
public ColumnMetadata[] getColumnMetadata()
-
getColumnMetadata
public ColumnMetadata getColumnMetadata(int c)
-
getColumns
public Array<?>[] getColumns()
-
isColumnMetadataDefault
public boolean isColumnMetadataDefault()
-
isColumnMetadataDefault
public boolean isColumnMetadataDefault(int c)
-
setColumnMetadata
public void setColumnMetadata(ColumnMetadata[] colmeta)
-
setColumnMetadata
public void setColumnMetadata(int c, ColumnMetadata colmeta)
-
getColumnNameIDMap
public Map<String,Integer> getColumnNameIDMap()
Creates a mapping from column names to column IDs, i.e., 1-based column indexes
- Returns:
- map of column name keys and id values
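
Because the IDs in this map are 1-based while get(int r, int c) documents 0-based indexes, a caller has to subtract one before using an ID as a column index. The stand-alone sketch below builds such a map from a plain array of names. ColumnNameIdMapSketch and nameToId are hypothetical names for this illustration and are not part of the SystemDS API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone sketch; it does not call the SystemDS API.
final class ColumnNameIdMapSketch {
    // Map each column name to its 1-based position.
    static Map<String, Integer> nameToId(String[] colNames) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        for (int i = 0; i < colNames.length; i++)
            ids.put(colNames[i], i + 1); // 1-based column ID
        return ids;
    }

    public static void main(String[] args) {
        String[] names = { "id", "name", "score" };
        Map<String, Integer> ids = nameToId(names);
        // Convert the 1-based ID back to a 0-based index before array access.
        int idx = ids.get("score") - 1;
        System.out.println(ids + " -> " + names[idx]);
    }
}
```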
-
ensureAllocatedColumns
public void ensureAllocatedColumns(int numRows)
Allocate column data structures if necessary, i.e., if schema specified but not all column data structures created yet.
- Parameters:
numRows
- number of rows
-
ensureColumnCompatibility
public void ensureColumnCompatibility(int newLen)
Checks for matching column sizes in case of existing columns. If the check passes, the number of rows is reassigned to the given newLen.
- Parameters:
newLen
- number of rows to compare with existing number of rows
-
createColNames
public static String[] createColNames(int size)
-
createColNames
public static String[] createColNames(int off, int size)
-
createColName
public static String createColName(int i)
-
isColNamesDefault
public boolean isColNamesDefault()
-
isColNameDefault
public boolean isColNameDefault(int i)
-
recomputeColumnCardinality
public void recomputeColumnCardinality()
-
get
public Object get(int r, int c)
Gets a boxed object of the value in position (r,c).
- Parameters:
r
- row index, 0-based
c
- column index, 0-based
- Returns:
- object of the value at specified position
-
set
public void set(int r, int c, Object val)
Sets the value in position (r,c), where the input is assumed to be a boxed object consistent with the schema definition.
- Parameters:
r
- row index
c
- column index
val
- value to set at specified position
-
set
public void set(int r, int c, String val)
Sets the value in position (r,c) to the input string value; the individual column arrays convert it to the correct type.
- Parameters:
r
- row index
c
- column index
val
- value to set at specified position
-
reset
public void reset(int nrow, boolean clearMeta)
-
reset
public void reset()
-
appendRow
public void appendRow(Object[] row)
Append a row to the end of the data frame, where all row fields are boxed objects according to the schema. Append row should be avoided if possible.
- Parameters:
row
- array of objects
-
appendRow
public void appendRow(String[] row)
Append a row to the end of the data frame, where all row fields are string encoded. Append row should be avoided if possible.
- Parameters:
row
- array of strings
-
appendColumn
public void appendColumn(String[] col)
Append a column of value type STRING as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col
- array of strings
-
appendColumn
public void appendColumn(boolean[] col)
Append a column of value type BOOLEAN as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col
- array of booleans
-
appendColumn
public void appendColumn(int[] col)
Append a column of value type INT as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col
- array of ints
-
appendColumn
public void appendColumn(long[] col)
Append a column of value type LONG as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col
- array of longs
-
appendColumn
public void appendColumn(float[] col)
Append a column of value type float as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col
- array of floats
-
appendColumn
public void appendColumn(double[] col)
Append a column of value type DOUBLE as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col
- array of doubles
-
appendColumns
public void appendColumns(double[][] cols)
Append a set of columns of value type DOUBLE at the end of the frame in order to avoid repeated allocation with appendColumns. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
cols
- 2d array of doubles
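
The note "wrapped but not copied" shared by the appendColumn and appendColumns variants means that later writes to the caller's array can show through the appended column. The stand-alone sketch below shows that aliasing effect with a tiny wrapper class. WrappedColumnSketch and its copyOf helper are hypothetical and only model the behaviour described above; they are not the SystemDS Array implementation.

```java
// Stand-alone sketch; it does not call the SystemDS API.
final class WrappedColumnSketch {
    private final String[] data; // keeps the caller's array, no copy is made

    WrappedColumnSketch(String[] col) {
        this.data = col;
    }

    String get(int r) {
        return data[r];
    }

    // Defensive alternative: clone first so later writes by the caller are not visible.
    static WrappedColumnSketch copyOf(String[] col) {
        return new WrappedColumnSketch(col.clone());
    }

    public static void main(String[] args) {
        String[] raw = { "a", "b" };
        WrappedColumnSketch wrapped = new WrappedColumnSketch(raw);
        WrappedColumnSketch copied = WrappedColumnSketch.copyOf(raw);
        raw[0] = "changed"; // the caller mutates the array after handing it over
        System.out.println(wrapped.get(0) + " vs " + copied.get(0));
    }
}
```

If the source array is reused after appending, cloning it before the call is one way to avoid such side effects, at the cost of an extra allocation.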
-
convertToFrameBlock
public static FrameBlock convertToFrameBlock(MatrixBlock mb, Types.ValueType[] schema, int k)
-
appendColumn
public void appendColumn(Array col)
Add a column of already allocated Array type.
- Parameters:
col
- column to add.
-
getColumnData
public Object getColumnData(int c)
-
getColumnType
public Types.ValueType getColumnType(int c)
-
getColumn
public Array<?> getColumn(int c)
-
setColumn
public void setColumn(int c, Array<?> column)
-
write
public void write(DataOutput out) throws IOException
- Specified by:
write
in interface org.apache.hadoop.io.Writable
- Throws:
IOException
-
readFields
public void readFields(DataInput in) throws IOException
- Specified by:
readFields
in interface org.apache.hadoop.io.Writable
- Throws:
IOException
-
writeExternal
public void writeExternal(ObjectOutput out) throws IOException
- Specified by:
writeExternal
in interface Externalizable
- Throws:
IOException
-
readExternal
public void readExternal(ObjectInput in) throws IOException
- Specified by:
readExternal
in interface Externalizable
- Throws:
IOException
-
getInMemorySize
public long getInMemorySize()
Description copied from interface: CacheBlock
Get the in-memory size in bytes of the cache block.
- Specified by:
getInMemorySize
in interface CacheBlock<FrameBlock>
- Returns:
- in-memory size in bytes of cache block
-
getExactSerializedSize
public long getExactSerializedSize()
Description copied from interface: CacheBlock
Get the exact serialized size in bytes of the cache block.
- Specified by:
getExactSerializedSize
in interface CacheBlock<FrameBlock>
- Returns:
- exact serialized size in bytes of cache block
-
isShallowSerialize
public boolean isShallowSerialize()
Description copied from interface: CacheBlock
Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
- Specified by:
isShallowSerialize
in interface CacheBlock<FrameBlock>
- Returns:
- true if shallow serialized
-
isShallowSerialize
public boolean isShallowSerialize(boolean inclConvert)
Description copied from interface: CacheBlock
Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
- Specified by:
isShallowSerialize
in interface CacheBlock<FrameBlock>
- Parameters:
inclConvert
- if true, report blocks as shallow serialize that are currently not amenable but can be brought into an amenable form via toShallowSerializeBlock.
- Returns:
- true if shallow serialized
-
toShallowSerializeBlock
public void toShallowSerializeBlock()
Description copied from interface: CacheBlock
Converts a cache block that is not shallow serializable into a form that is shallow serializable. This method has no effect if the given cache block is not amenable.
- Specified by:
toShallowSerializeBlock
in interface CacheBlock<FrameBlock>
-
compactEmptyBlock
public void compactEmptyBlock()
Description copied from interface: CacheBlock
Free unnecessarily allocated empty block.
- Specified by:
compactEmptyBlock
in interface CacheBlock<FrameBlock>
-
binaryOperations
public FrameBlock binaryOperations(BinaryOperator bop, FrameBlock that, FrameBlock out)
Performs an element-wise value comparison of two frames (equal, not equal, less than, greater than, less than or equal, greater than or equal); the output frame stores a boolean value for each comparison.
- Parameters:
bop
- binary operator
that
- frame block of rhs of m * n dimensions
out
- output frame block
- Returns:
- a boolean frameBlock
-
leftIndexingOperations
public FrameBlock leftIndexingOperations(FrameBlock rhsFrame, IndexRange ixrange, FrameBlock ret)
-
leftIndexingOperations
public FrameBlock leftIndexingOperations(FrameBlock rhsFrame, int rl, int ru, int cl, int cu, FrameBlock ret)
-
slice
public final FrameBlock slice(IndexRange ixrange, FrameBlock ret)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
ixrange
- index range inclusive
ret
- outputBlock
- Returns:
- sub-block of cache block
-
slice
public final FrameBlock slice(int rl, int ru)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
rl
- row lower
ru
- row upper inclusive
- Returns:
- sub-block of cache block
-
slice
public final FrameBlock slice(int rl, int ru, boolean deep)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
rl
- row lower
ru
- row upper inclusive
deep
- enforce deep-copy
- Returns:
- sub-block of cache block
-
slice
public final FrameBlock slice(int rl, int ru, int cl, int cu)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
rl
- row lower
ru
- row upper inclusive
cl
- column lower
cu
- column upper inclusive
- Returns:
- sub-block of cache block
-
slice
public final FrameBlock slice(int rl, int ru, int cl, int cu, FrameBlock ret)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
rl
- row lower
ru
- row upper inclusive
cl
- column lower
cu
- column upper inclusive
ret
- cache block
- Returns:
- sub-block of cache block
-
slice
public final FrameBlock slice(int rl, int ru, int cl, int cu, boolean deep)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
rl
- row lower
ru
- row upper inclusive
cl
- column lower
cu
- column upper inclusive
deep
- enforce deep-copy
- Returns:
- sub-block of cache block
-
slice
public FrameBlock slice(int rl, int ru, int cl, int cu, boolean deep, FrameBlock ret)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice
in interface CacheBlock<FrameBlock>
- Parameters:
rl
- row lower
ru
- row upper inclusive
cl
- column lower
cu
- column upper inclusive
deep
- enforce deep-copy
ret
- cache block
- Returns:
- sub-block of cache block
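
All slice overloads take inclusive upper bounds, which differs from the exclusive end index used by most Java range utilities. The stand-alone sketch below applies inclusive bounds to a plain String[][]. It assumes 0-based indexes, which this page does not state for slice, and the class InclusiveSliceSketch is a hypothetical name that does not call the SystemDS API.

```java
import java.util.Arrays;

// Stand-alone sketch; it does not call the SystemDS API.
final class InclusiveSliceSketch {
    // Returns rows rl..ru and columns cl..cu; all bounds inclusive, 0-based (assumed).
    static String[][] slice(String[][] data, int rl, int ru, int cl, int cu) {
        String[][] out = new String[ru - rl + 1][];
        for (int r = rl; r <= ru; r++)
            out[r - rl] = Arrays.copyOfRange(data[r], cl, cu + 1); // copyOfRange end is exclusive
        return out;
    }

    public static void main(String[] args) {
        String[][] data = {
            { "a", "b", "c" },
            { "d", "e", "f" },
            { "g", "h", "i" }
        };
        // Rows 1..2 and columns 0..1 give a 2 x 2 block.
        System.out.println(Arrays.deepToString(slice(data, 1, 2, 0, 1)));
    }
}
```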
-
slice
public void slice(ArrayList<Pair<Long,FrameBlock>> outList, IndexRange range, int rowCut)
-
append
public FrameBlock append(FrameBlock that, boolean cbind)
Appends the given argument FrameBlock 'that' to this FrameBlock by creating a deep copy to prevent side effects. For cbind, the frames are appended column-wise (same number of rows), while for rbind the frames are appended row-wise (same number of columns).
- Parameters:
that
- frame block to append to current frame block
cbind
- if true, column append
- Returns:
- frame block
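
The difference between column-wise and row-wise append, and the shape requirement of each, can be sketched on plain row-major arrays. The example below is a stand-alone illustration that does not call the SystemDS API. AppendSketch is a hypothetical name, and throwing IllegalArgumentException on a shape mismatch is an assumption of this sketch rather than documented FrameBlock behaviour.

```java
import java.util.Arrays;

// Stand-alone sketch; it does not call the SystemDS API.
final class AppendSketch {
    // Append b to a on row-major data; the result shares no arrays with the inputs.
    static String[][] append(String[][] a, String[][] b, boolean cbind) {
        if (cbind) {
            if (a.length != b.length)
                throw new IllegalArgumentException("cbind requires the same number of rows");
            String[][] out = new String[a.length][];
            for (int r = 0; r < a.length; r++) {
                out[r] = new String[a[r].length + b[r].length];
                System.arraycopy(a[r], 0, out[r], 0, a[r].length);
                System.arraycopy(b[r], 0, out[r], a[r].length, b[r].length);
            }
            return out;
        }
        if (a.length > 0 && b.length > 0 && a[0].length != b[0].length)
            throw new IllegalArgumentException("rbind requires the same number of columns");
        String[][] out = new String[a.length + b.length][];
        for (int r = 0; r < a.length; r++)
            out[r] = a[r].clone();
        for (int r = 0; r < b.length; r++)
            out[a.length + r] = b[r].clone();
        return out;
    }

    public static void main(String[] args) {
        String[][] left = { { "1" }, { "2" } };
        String[][] right = { { "a", "b" }, { "c", "d" } };
        System.out.println(Arrays.deepToString(append(left, right, true)));
        System.out.println(Arrays.deepToString(append(right, new String[][] { { "e", "f" } }, false)));
    }
}
```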
-
copy
public FrameBlock copy()
-
copy
public void copy(FrameBlock src)
-
copyShallow
public FrameBlock copyShallow()
-
copy
public void copy(int rl, int ru, int cl, int cu, FrameBlock src)
Copy the src frame block into the index range of the current frame block. This is used to copy smaller blocks into a larger block, for instance in binary reading.
- Parameters:
rl
- row start
ru
- row end inclusive
cl
- col start
cu
- col end inclusive
src
- source FrameBlock typically a smaller block.
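
Copying a smaller block into an inclusive index range of a larger one can be sketched on plain arrays as follows. This is a stand-alone illustration under the assumption of 0-based indexes; CopyRangeSketch is a hypothetical name and the code does not call the SystemDS API.

```java
import java.util.Arrays;

// Stand-alone sketch; it does not call the SystemDS API.
final class CopyRangeSketch {
    // Write src into dest rows rl..ru and columns cl..cu (inclusive, 0-based assumed).
    // src is expected to have (ru - rl + 1) rows and (cu - cl + 1) columns.
    static void copy(String[][] dest, int rl, int ru, int cl, int cu, String[][] src) {
        for (int r = rl; r <= ru; r++)
            System.arraycopy(src[r - rl], 0, dest[r], cl, cu - cl + 1);
    }

    public static void main(String[] args) {
        String[][] dest = new String[3][3];
        for (String[] row : dest)
            Arrays.fill(row, ".");
        String[][] src = { { "x", "y" } }; // a 1 x 2 block
        copy(dest, 1, 1, 1, 2, src);
        System.out.println(Arrays.deepToString(dest));
    }
}
```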
-
getRecodeMap
public Map<Object,Long> getRecodeMap(int col)
Splits every recode map entry in the column at the delimiter Lop.DATATYPE_PREFIX (entries were generated earlier in the form Code+Lop.DATATYPE_PREFIX+Token) and stores the result in a map that holds the token and code of every unique token.
- Parameters:
col
- the column number in the frame data that contains the recode map generated earlier.
- Returns:
- map of token and code for every element in the input column of a frame containing Recode map
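
The splitting step can be sketched as follows for entries in the Code+delimiter+Token form described above. This is a stand-alone illustration: the DELIM constant is a hypothetical stand-in for Lop.DATATYPE_PREFIX (its actual value is not shown on this page), and RecodeMapSketch does not call the SystemDS API.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone sketch; it does not call the SystemDS API.
final class RecodeMapSketch {
    // Hypothetical stand-in for Lop.DATATYPE_PREFIX; the real value is an assumption here.
    static final String DELIM = "\u00b7";

    // Parse entries of the form code + DELIM + token into a token -> code map.
    static Map<String, Long> parse(String[] entries) {
        Map<String, Long> map = new HashMap<>();
        for (String e : entries) {
            if (e == null)
                continue; // skip empty cells
            int pos = e.indexOf(DELIM); // first delimiter, so tokens may contain DELIM
            if (pos < 0)
                throw new IllegalArgumentException("Missing delimiter in entry: " + e);
            long code = Long.parseLong(e.substring(0, pos));
            String token = e.substring(pos + DELIM.length());
            map.put(token, code);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] column = { "1" + DELIM + "red", "2" + DELIM + "blue", null };
        System.out.println(parse(column));
    }
}
```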
-
merge
public FrameBlock merge(FrameBlock that, boolean appendOnly)
Description copied from interface: CacheBlock
Merge disjoint: merges all non-zero values of the given input into the current block. Note that this method does NOT check for overlapping entries; it is the caller's responsibility to ensure disjoint blocks. The appendOnly parameter is only relevant for sparse target blocks; if true, we only append values and do not sort sparse rows for each call; this is useful whenever we merge iterators of matrix blocks into one target block.
- Specified by:
merge
in interface CacheBlock<FrameBlock>
- Parameters:
that
- cache block
appendOnly
- Indicate if the merger can be append only on sparse rows.
- Returns:
- the merged group, in most implementations 'this' is modified.
-
merge
public FrameBlock merge(FrameBlock that)
-
zeroOutOperations
public FrameBlock zeroOutOperations(FrameBlock result, IndexRange range, boolean complementary, int iRowStartSrc, int iRowStartDest, int blen, int iMaxRowsToCopy)
Zeroes out the data in the slicing window applicable to this block.
- Parameters:
result
- frame block
range
- index range
complementary
- ?
iRowStartSrc
- ?
iRowStartDest
- ?
blen
- ?
iMaxRowsToCopy
- ?
- Returns:
- frame block
-
getSchemaTypeOf
public FrameBlock getSchemaTypeOf()
-
detectSchema
public final FrameBlock detectSchema(int k)
-
detectSchema
public final FrameBlock detectSchema(double sampleFraction, int k)
-
applySchema
public final FrameBlock applySchema(FrameBlock schema)
-
applySchema
public final FrameBlock applySchema(FrameBlock schema, int k)
-
dropInvalidType
public FrameBlock dropInvalidType(FrameBlock schema)
Drop cell values that do not conform to the data type of their column.
- Parameters:
schema
- of the frame
- Returns:
- original frame where invalid values are replaced with null
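
As a rough illustration of replacing non-conforming values with null, the stand-alone sketch below checks string cells of one column against a target type by attempting to parse them. The Kind enum is a hypothetical stand-in for Types.ValueType, and parse-based checking is an assumption of this sketch; the actual validation rules of dropInvalidType are not described on this page.

```java
import java.util.Arrays;

// Stand-alone sketch; it does not call the SystemDS API.
final class DropInvalidTypeSketch {
    // Hypothetical stand-in for Types.ValueType.
    enum Kind { INT, DOUBLE, STRING }

    // A value conforms if it is null, the column is STRING, or it parses as the numeric kind.
    static boolean conforms(String v, Kind kind) {
        if (v == null || kind == Kind.STRING)
            return true;
        try {
            if (kind == Kind.INT)
                Long.parseLong(v);
            else
                Double.parseDouble(v);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns a new column where non-conforming values are replaced with null.
    static String[] dropInvalid(String[] col, Kind kind) {
        String[] out = new String[col.length];
        for (int r = 0; r < col.length; r++)
            out[r] = conforms(col[r], kind) ? col[r] : null;
        return out;
    }

    public static void main(String[] args) {
        String[] col = { "1", "x", null, "42" };
        System.out.println(Arrays.toString(dropInvalid(col, Kind.INT)));
    }
}
```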
-
invalidByLength
public FrameBlock invalidByLength(MatrixBlock feaLen)
Validates the frame data against an attribute length constraint: if the length of the value in any cell exceeds the specified threshold of that attribute, the output frame stores null at that cell position, thus removing the length-violating values.
- Parameters:
feaLen
- vector of valid lengths
- Returns:
- FrameBlock with invalid values converted into missing values (null)
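
The length check can be sketched on plain arrays as follows. In this stand-alone illustration an int[] of per-column maximum lengths stands in for the feaLen vector, and "length" is taken to be the string length of the cell. Both are assumptions of the sketch, and InvalidByLengthSketch does not call the SystemDS API.

```java
import java.util.Arrays;

// Stand-alone sketch; it does not call the SystemDS API.
final class InvalidByLengthSketch {
    // maxLen[c] is the maximum allowed string length for column c (stand-in for feaLen).
    // Returns a new row-major array; cells longer than the limit become null.
    static String[][] invalidByLength(String[][] data, int[] maxLen) {
        String[][] out = new String[data.length][];
        for (int r = 0; r < data.length; r++) {
            out[r] = new String[data[r].length];
            for (int c = 0; c < data[r].length; c++) {
                String v = data[r][c];
                out[r][c] = (v != null && v.length() > maxLen[c]) ? null : v;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] data = { { "abc", "de" }, { "a", "defg" } };
        System.out.println(Arrays.deepToString(invalidByLength(data, new int[] { 2, 3 })));
    }
}
```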
-
map
public FrameBlock map(String lambdaExpr, long margin)
-
frameRowReplication
public FrameBlock frameRowReplication(FrameBlock rowToreplicate)
-
valueSwap
public FrameBlock valueSwap(FrameBlock schema)
-
map
public FrameBlock map(FrameBlock.FrameMapFunction lambdaExpr, long margin)
-
mapDist
public FrameBlock mapDist(FrameBlock.FrameMapFunction lambdaExpr)
-
getCompiledFunction
public static FrameBlock.FrameMapFunction getCompiledFunction(String lambdaExpr, long margin)
-
replaceOperations
public <T> FrameBlock replaceOperations(String pattern, String replacement)
-
removeEmptyOperations
public FrameBlock removeEmptyOperations(boolean rows, boolean emptyReturn, MatrixBlock select)
-
-