Class ColumnEncoder

    • Field Detail

      • APPLY_ROW_BLOCKS_PER_COLUMN

        public static int APPLY_ROW_BLOCKS_PER_COLUMN
      • BUILD_ROW_BLOCKS_PER_COLUMN

        public static int BUILD_ROW_BLOCKS_PER_COLUMN
    • Method Detail

      • initEmbeddings

        public void initEmbeddings(MatrixBlock embeddings)
      • apply

        public MatrixBlock apply(CacheBlock<?> in,
                                 MatrixBlock out,
                                 int outputCol)
        Apply functions are used only in single-threaded or multi-threaded dense contexts, which is why the multi-threaded sparse case is not considered here.
        Specified by:
        apply in interface Encoder
        Parameters:
        in - Input Block
        out - Output Matrix
        outputCol - The output column for the given column
        Returns:
        same as out
      • isApplicable

        public boolean isApplicable()
        Indicates if this encoder is applicable, i.e., if there is a column to encode.
        Returns:
        true if a colID is set
      • isApplicable

        public boolean isApplicable(int colID)
        Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
        Parameters:
        colID - column ID
        Returns:
        true if encoder is applicable for given column
      • prepareBuildPartial

        public void prepareBuildPartial()
        Allocates internal data structures for partial build.
        Specified by:
        prepareBuildPartial in interface Encoder
      • getDomainSize

        public int getDomainSize()
      • buildPartial

        public void buildPartial(FrameBlock in)
        Partial build of internal data structures (e.g., in distributed Spark operations).
        Specified by:
        buildPartial in interface Encoder
        Parameters:
        in - input frame block
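
        The prepare-then-build pattern described by prepareBuildPartial and buildPartial can be illustrated with a minimal sketch. Everything below is hypothetical and is not the SystemDS implementation: the class PartialDistinctBuilder, the use of a String[] in place of a FrameBlock, and the merge helper are assumptions chosen only to show how per-partition state might be allocated, filled block by block, and combined afterwards.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an encoder that collects distinct values per partition.
// It is NOT the SystemDS ColumnEncoder; it only mirrors the two-step protocol.
class PartialDistinctBuilder {
    private Set<String> partial; // allocated by prepareBuildPartial()

    // Allocate the internal structure once, before any partial build.
    void prepareBuildPartial() {
        if (partial == null)
            partial = new HashSet<>();
    }

    // Fold one block of column values into the partial state, skipping nulls.
    void buildPartial(String[] block) {
        if (partial == null)
            throw new IllegalStateException("call prepareBuildPartial() first");
        for (String value : block)
            if (value != null)
                partial.add(value);
    }

    Set<String> getPartial() {
        return partial;
    }

    // Combine the partial results of two partitions into a new set.
    static Set<String> merge(Set<String> a, Set<String> b) {
        Set<String> merged = new HashSet<>(a);
        merged.addAll(b);
        return merged;
    }
}
```

        In this sketch each partition owns one builder, and the driver combines the partial sets once all blocks have been processed.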
      • build

        public void build(CacheBlock<?> in,
                          double[] equiHeightMaxs)
      • mergeAt

        public void mergeAt(ColumnEncoder other)
        Merges another encoder of a compatible type into this one after a certain position, resizing as necessary. ColumnEncoders are compatible with themselves, EncoderComposite is compatible with every other ColumnEncoder, and MultiColumnEncoders are compatible with every encoder.
        Parameters:
        other - the encoder that should be merged in
      • updateIndexRanges

        public void updateIndexRanges(long[] beginDims,
                                      long[] endDims,
                                      int colOffset)
        Updates the index ranges to their values after encoding. Note that only dummy coding changes the ranges.
        Specified by:
        updateIndexRanges in interface Encoder
        Parameters:
        beginDims - begin dimensions of range
        endDims - end dimensions of range
        colOffset - offset applied to beginDims and endDims
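
        To see why dummy coding changes an index range, here is a small sketch. It is an illustration under stated assumptions, not the library's code: it assumes that index 1 of each array holds the column dimension, that a dummy-coded column with domainSize distinct values expands into domainSize output columns, and that colOffset counts the columns already inserted to the left. The class IndexRangeSketch and the extra domainSize parameter are hypothetical.

```java
// Hypothetical helper; the real method takes no domainSize argument.
class IndexRangeSketch {
    // Shift a [begin, end] column range for one dummy-coded column.
    // Assumption: dims are laid out as {rows, cols}.
    static void updateIndexRanges(long[] beginDims, long[] endDims, int colOffset, int domainSize) {
        // Columns inserted earlier push the whole range to the right.
        beginDims[1] += colOffset;
        // One input column becomes domainSize columns, adding domainSize - 1.
        endDims[1] += colOffset + domainSize - 1;
    }
}
```

        Under these assumptions, an encoder that does not change the column count would leave both arrays untouched.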
      • getColMapping

        public MatrixBlock getColMapping(FrameBlock meta)
        Obtain the column mapping of encoded frames based on the passed meta data frame.
        Parameters:
        meta - meta data frame block
        Returns:
        matrix with column mapping (one row per attribute)
      • writeExternal

        public void writeExternal(ObjectOutput os)
                           throws IOException
        Redirects the default Java serialization, via Externalizable, to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
        Specified by:
        writeExternal in interface Externalizable
        Parameters:
        os - object output
        Throws:
        IOException - if IOException occurs
      • readExternal

        public void readExternal(ObjectInput in)
                          throws IOException
        Redirects the default Java serialization, via Externalizable, to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
        Specified by:
        readExternal in interface Externalizable
        Parameters:
        in - object input
        Throws:
        IOException - if IOException occurs
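
        The redirection described for writeExternal and readExternal relies on ObjectOutput extending DataOutput and ObjectInput extending DataInput, so the Externalizable hooks can delegate to Writable-style methods. The sketch below uses only java.io. The class ExternalizableSketch, its single colID field, and the write/readFields methods are hypothetical stand-ins that mirror the shape of the Hadoop Writable contract without depending on Hadoop; the actual fields serialized by ColumnEncoder are not shown in this document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class; not the SystemDS ColumnEncoder.
class ExternalizableSketch implements Externalizable {
    private int colID = -1;

    // Externalizable requires a public no-arg constructor for deserialization.
    public ExternalizableSketch() {}

    ExternalizableSketch(int colID) {
        this.colID = colID;
    }

    int getColID() {
        return colID;
    }

    // Writable-style serialization: compact, field by field.
    void write(DataOutput out) throws IOException {
        out.writeInt(colID);
    }

    // Writable-style deserialization, reading fields in the same order.
    void readFields(DataInput in) throws IOException {
        colID = in.readInt();
    }

    // Java serialization hook: ObjectOutput is a DataOutput, so delegate.
    @Override
    public void writeExternal(ObjectOutput os) throws IOException {
        write(os);
    }

    // Java deserialization hook: ObjectInput is a DataInput, so delegate.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        readFields(in);
    }

    // Serialize and deserialize through standard Java object streams.
    static int roundTrip(int colID) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(new ExternalizableSketch(colID));
            }
            try (ObjectInputStream ois =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((ExternalizableSketch) ois.readObject()).getColID();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

        With this delegation, any framework that falls back to Java serialization still goes through the same compact field-by-field format.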
      • getColID

        public int getColID()
      • setColID

        public void setColID(int colID)
      • shiftCol

        public void shiftCol(int columnOffset)
      • setEstMetaSize

        public void setEstMetaSize(long estSize)
      • getEstMetaSize

        public long getEstMetaSize()
      • setEstNumDistincts

        public void setEstNumDistincts(int numDistincts)
      • getEstNumDistincts

        public int getEstNumDistincts()
      • getSparseRowsWZeros

        public Set<Integer> getSparseRowsWZeros()