Class MultiReturnParameterizedBuiltinSPInstruction.TransformEncodeGroupFunction

  • All Implemented Interfaces:
    Serializable, org.apache.spark.api.java.function.FlatMapFunction<scala.Tuple2<Integer,Iterable<Object>>,String>
    Enclosing class:
    MultiReturnParameterizedBuiltinSPInstruction

    public static class MultiReturnParameterizedBuiltinSPInstruction.TransformEncodeGroupFunction
    extends Object
    implements org.apache.spark.api.java.function.FlatMapFunction<scala.Tuple2<Integer,Iterable<Object>>,String>
    This function assigns codes to the globally distinct values of recoded columns and writes the resulting column map to the output in textcell (IJV) format. It is part of distributed recode map construction, which is used for recoding, binning, and dummy coding. It operates directly on schema-specific objects to avoid unnecessary string conversion and to reduce memory overhead and shuffle.
    See Also:
    Serialized Form
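
    The code assignment described above can be illustrated with a minimal, Spark-free sketch. It is not the SystemDS implementation: the class RecodeMapSketch, the method encodeGroup, and the exact cell layout ("code columnID value", separated by spaces) are assumptions made for illustration only. The sketch takes the values grouped under one column ID, assigns consecutive 1-based codes to the distinct values in order of first appearance, and renders each entry as one IJV-style text line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration class; not part of SystemDS.
final class RecodeMapSketch {

    // Assigns 1-based codes to the distinct values of one column and returns
    // one "i j v" line per distinct value, where i is the assigned code,
    // j is the column ID, and v is the value itself (assumed layout).
    static List<String> encodeGroup(int colId, Iterable<Object> values) {
        // Deduplicate defensively while preserving first-appearance order.
        // Values stay as Objects until the final line is rendered.
        LinkedHashSet<Object> distinct = new LinkedHashSet<>();
        for (Object v : values) {
            distinct.add(v);
        }

        List<String> cells = new ArrayList<>();
        long code = 1;
        for (Object v : distinct) {
            cells.add(code + " " + colId + " " + v);
            code++;
        }
        return cells;
    }

    public static void main(String[] args) {
        List<Object> column = Arrays.<Object>asList("red", "green", "red", "blue");
        // Prints:
        // 1 3 red
        // 2 3 green
        // 3 3 blue
        for (String cell : encodeGroup(3, column)) {
            System.out.println(cell);
        }
    }
}
```

    In the actual class, the grouped input arrives as a scala.Tuple2<Integer, Iterable<Object>> and the lines are returned through the FlatMapFunction interface. The sketch replaces both with plain Java types so that it runs without Spark on the classpath.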
    • Constructor Detail

      • TransformEncodeGroupFunction

        public TransformEncodeGroupFunction(MultiColumnEncoder encoder,
                                            org.apache.sysds.runtime.instructions.spark.MultiReturnParameterizedBuiltinSPInstruction.MaxLongAccumulator accMax)