Class LegacyEncoder

    • Method Detail

      • getColList

        public int[] getColList()
      • setColList

        public void setColList(int[] colList)
      • initColList

        public int initColList(org.apache.wink.json4j.JSONArray attrs)
      • initColList

        public int initColList(int[] colList)
      • isApplicable

        public boolean isApplicable()
        Indicates if this encoder is applicable, i.e., if there is at least one column to encode.
        Returns:
        true if there is at least one column to encode
      • isApplicable

        public int isApplicable(int colID)
        Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
        Parameters:
        colID - column ID
        Returns:
        index of the column in the column list if the encoder is applicable for the given column, -1 otherwise
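
        Because the declared return type is int, a per-column check can report a position rather than a plain yes/no. The following is a minimal sketch of such a lookup, not the actual implementation. It assumes the column list holds sorted, 1-based column IDs and that -1 signals "not applicable"; the class ColumnLookup is hypothetical.

```java
import java.util.Arrays;

class ColumnLookup {
    // Hypothetical helper: returns the position of colID within a sorted
    // column list, or -1 if the column is not handled by the encoder.
    static int isApplicable(int[] colList, int colID) {
        if (colList == null)
            return -1;
        int idx = Arrays.binarySearch(colList, colID);
        return idx >= 0 ? idx : -1;
    }

    public static void main(String[] args) {
        int[] colList = {2, 5, 7}; // sorted, 1-based column IDs (assumed)
        System.out.println(isApplicable(colList, 5)); // prints 1
        System.out.println(isApplicable(colList, 3)); // prints -1
    }
}
```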
      • encode

        public abstract MatrixBlock encode(FrameBlock in,
                                           MatrixBlock out)
        Block encode: build and apply (transform encode).
        Parameters:
        in - input frame block
        out - output matrix block
        Returns:
        output matrix block
      • build

        public abstract void build(FrameBlock in)
        Build the transform meta data for the given block input. This call modifies and keeps meta data as encoder state.
        Parameters:
        in - input frame block
      • prepareBuildPartial

        public void prepareBuildPartial()
        Allocates internal data structures for partial build.
      • buildPartial

        public void buildPartial(FrameBlock in)
        Partial build of internal data structures (e.g., in distributed Spark operations).
        Parameters:
        in - input frame block
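
        The prepareBuildPartial/buildPartial pair can be pictured as "allocate once, then fold in one block at a time". The sketch below illustrates that pattern for a single column of strings. The class PartialBuildSketch, its distinct-value set, and the use of String[] in place of FrameBlock are assumptions for illustration only.

```java
import java.util.HashSet;
import java.util.Set;

class PartialBuildSketch {
    // Hypothetical encoder state: distinct values seen so far in one column.
    private Set<String> distinct;

    // Counterpart of prepareBuildPartial: allocate the structure once.
    void prepareBuildPartial() {
        distinct = new HashSet<>();
    }

    // Counterpart of buildPartial: fold one block of values into the state.
    void buildPartial(String[] columnBlock) {
        for (String v : columnBlock)
            if (v != null)
                distinct.add(v);
    }

    int numDistinct() {
        return distinct.size();
    }

    public static void main(String[] args) {
        PartialBuildSketch b = new PartialBuildSketch();
        b.prepareBuildPartial();
        b.buildPartial(new String[] {"a", "b"});       // first block
        b.buildPartial(new String[] {"b", "c", null}); // second block
        System.out.println(b.numDistinct()); // prints 3
    }
}
```

        Each partition can be processed independently this way, which is presumably why the description mentions distributed operations.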
      • apply

        public abstract MatrixBlock apply(FrameBlock in,
                                          MatrixBlock out)
        Encode input data blockwise according to existing transform meta data (transform apply).
        Parameters:
        in - input frame block
        out - output matrix block
        Returns:
        output matrix block
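
        The relationship between build, apply, and encode (encode = build followed by apply) can be sketched with a toy recode encoder. This is an illustration under stated assumptions, not the library's code: RecodeSketch is hypothetical, String[] and double[] stand in for FrameBlock and MatrixBlock, and mapping unseen values to NaN is an arbitrary choice.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class RecodeSketch {
    // Encoder state: value -> 1-based code, filled by build.
    private final Map<String, Integer> codes = new HashMap<>();

    // build: collect transform metadata from the input and keep it as state.
    void build(String[] in) {
        for (String v : in)
            codes.putIfAbsent(v, codes.size() + 1);
    }

    // apply: encode the input using only the existing metadata.
    double[] apply(String[] in, double[] out) {
        for (int i = 0; i < in.length; i++) {
            Integer c = codes.get(in[i]);
            out[i] = (c != null) ? c : Double.NaN; // unseen value (assumed policy)
        }
        return out;
    }

    // encode: build and apply in one call.
    double[] encode(String[] in, double[] out) {
        build(in);
        return apply(in, out);
    }

    public static void main(String[] args) {
        RecodeSketch enc = new RecodeSketch();
        String[] train = {"x", "y", "x"};
        System.out.println(Arrays.toString(enc.encode(train, new double[3])));
        // prints [1.0, 2.0, 1.0]
    }
}
```

        Calling apply on new data after encode reuses the codes assigned during build, which is the point of keeping the metadata as encoder state.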
      • subRangeEncoder

        public LegacyEncoder subRangeEncoder(IndexRange ixRange)
        Returns a new encoder that only handles a sub-range of columns.
        Parameters:
        ixRange - the range (1-based, begin inclusive, end exclusive)
        Returns:
        an encoder of the same type, just for the sub-range
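
        The range convention (1-based, begin inclusive, end exclusive) is easy to get wrong by one. The sketch below filters a column list with that convention. Re-basing the surviving IDs so that column begin becomes column 1 is an assumption about what a sub-range encoder would need; SubRangeSketch and subRangeColList are hypothetical names.

```java
import java.util.Arrays;

class SubRangeSketch {
    // Keep only column IDs in [begin, end) and re-base them so that
    // column 'begin' becomes column 1 of the sub-range (assumed behavior).
    static int[] subRangeColList(int[] colList, int begin, int end) {
        return Arrays.stream(colList)
            .filter(c -> c >= begin && c < end)
            .map(c -> c - begin + 1)
            .toArray();
    }

    public static void main(String[] args) {
        int[] colList = {1, 3, 4, 6};
        // Columns 3 and 4 fall in [3, 6); column 6 is excluded.
        System.out.println(Arrays.toString(subRangeColList(colList, 3, 6)));
        // prints [1, 2]
    }
}
```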
      • mergeAt

        public void mergeAt(LegacyEncoder other,
                            int row,
                            int col)
        Merges another encoder of a compatible type into this encoder at the given position, resizing as necessary. An encoder is compatible with encoders of its own type, and EncoderComposite is compatible with every other encoder.
        Parameters:
        other - the encoder that should be merged in
        row - the row where it should be placed (1-based)
        col - the col where it should be placed (1-based)
      • updateIndexRanges

        public void updateIndexRanges(long[] beginDims,
                                      long[] endDims)
        Updates the index ranges to reflect the dimensions after encoding. Note that only dummy coding changes the ranges.
        Parameters:
        beginDims - begin dimensions of range
        endDims - end dimensions of range
      • getMetaData

        public abstract FrameBlock getMetaData(FrameBlock out)
        Construct a frame block out of the transform meta data.
        Parameters:
        out - output frame block
        Returns:
        output frame block
      • initMetaData

        public abstract void initMetaData(FrameBlock meta)
        Sets up the required meta data for a subsequent call to apply.
        Parameters:
        meta - frame block
      • getColMapping

        public MatrixBlock getColMapping(FrameBlock meta,
                                         MatrixBlock out)
        Obtain the column mapping of encoded frames based on the passed meta data frame.
        Parameters:
        meta - meta data frame block
        out - output matrix
        Returns:
        matrix with column mapping (one row per attribute)
      • writeExternal

        public void writeExternal(ObjectOutput os)
                           throws IOException
        Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
        Specified by:
        writeExternal in interface Externalizable
        Parameters:
        os - object output
        Throws:
        IOException - if IOException occurs
      • readExternal

        public void readExternal(ObjectInput in)
                          throws IOException
        Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
        Specified by:
        readExternal in interface Externalizable
        Parameters:
        in - object input
        Throws:
        IOException - if IOException occurs
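
        The redirection described for writeExternal and readExternal can be sketched with plain java.io types: ObjectOutput extends DataOutput and ObjectInput extends DataInput, so Writable-style write(DataOutput) and readFields(DataInput) methods can be called directly. The class ExternalizableSketch, its single colList field, and the roundTrip helper are hypothetical. The sketch only mirrors the shape of Hadoop's Writable methods and does not depend on Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableSketch implements Externalizable {
    private int[] colList = new int[0];

    public ExternalizableSketch() {} // public no-arg constructor required by Externalizable

    ExternalizableSketch(int[] colList) { this.colList = colList; }

    int[] getColList() { return colList; }

    // Writable-style serialization: compact, hand-written field layout.
    public void write(DataOutput out) throws IOException {
        out.writeInt(colList.length);
        for (int c : colList)
            out.writeInt(c);
    }

    public void readFields(DataInput in) throws IOException {
        colList = new int[in.readInt()];
        for (int i = 0; i < colList.length; i++)
            colList[i] = in.readInt();
    }

    // Externalizable hooks simply delegate to the Writable-style methods.
    @Override
    public void writeExternal(ObjectOutput os) throws IOException { write(os); }

    @Override
    public void readExternal(ObjectInput in) throws IOException { readFields(in); }

    // Hypothetical helper: serialize and deserialize through Java object streams.
    static ExternalizableSketch roundTrip(ExternalizableSketch e) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(e);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                return (ExternalizableSketch) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

        With this delegation, frameworks that rely on standard Java serialization still go through the hand-written format instead of reflective field-by-field serialization.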
      • shiftCols

        public void shiftCols(int offset)