Class Recompiler


  • public class Recompiler
    extends Object
    Dynamically recompiles hop DAGs into runtime instructions in six substeps: (1) deep copy the hop DAG, (2) refresh matrix characteristics, (3) apply dynamic rewrites, (4) refresh memory estimates, (5) construct lops (including operator selection), and (6) generate the runtime program (including piggybacking).
    • Constructor Detail

      • Recompiler

        public Recompiler()
    • Method Detail

      • reinitRecompiler

        public static void reinitRecompiler()
        Re-initializes the recompiler according to the current optimizer flags.
      • recompileProgramBlockHierarchy2Forced

        public static void recompileProgramBlockHierarchy2Forced​(ArrayList<ProgramBlock> pbs,
                                                                 long tid,
                                                                 Set<String> fnStack,
                                                                 Types.ExecType et)
        Recompiles a program block hierarchy to a forced execution type. This also affects referenced functions and chains of functions. Pass et==null to release the forced execution type.
        Parameters:
        pbs - list of program blocks
        tid - thread id
        fnStack - function stack
        et - execution type
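
        The role of fnStack can be illustrated with a minimal sketch. It assumes a simplified block model: Block, ExecType, the functions map, and force are hypothetical stand-ins, not SystemDS classes or the actual implementation. The set of visited function names keeps the traversal from looping on recursive or repeatedly called functions, and passing null releases the forced type again.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in model; not the SystemDS ProgramBlock hierarchy.
final class ForcedExecTypeSketch {
    enum ExecType { CP, SPARK }

    static final class Block {
        ExecType forced;                                // null means "not forced"
        final List<Block> children = new ArrayList<>(); // nested blocks
        final List<String> calls = new ArrayList<>();   // names of called functions
    }

    // Sets the forced exec type on all blocks, their children, and the bodies
    // of all (transitively) called functions. fnStack records the functions
    // already handled so that each function body is processed only once.
    static void force(List<Block> blocks, Map<String, List<Block>> functions,
                      Set<String> fnStack, ExecType et) {
        for (Block b : blocks) {
            b.forced = et;
            force(b.children, functions, fnStack, et);
            for (String fn : b.calls) {
                List<Block> body = functions.get(fn);
                if (body != null && fnStack.add(fn))
                    force(body, functions, fnStack, et);
            }
        }
    }

    public static void main(String[] args) {
        Block fnBody = new Block();
        fnBody.calls.add("f"); // f calls itself
        Block top = new Block();
        top.calls.add("f");
        Map<String, List<Block>> functions = Map.of("f", List.of(fnBody));

        force(List.of(top), functions, new HashSet<>(), ExecType.CP);
        System.out.println(top.forced + " " + fnBody.forced);

        // Passing null releases the forced exec type again.
        force(List.of(top), functions, new HashSet<>(), null);
        System.out.println(top.forced + " " + fnBody.forced);
    }
}
```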
      • recompileProgramBlockInstructions

        public static void recompileProgramBlockInstructions​(ProgramBlock pb)
                                                      throws IOException
        This method does NOT perform a full program block recompile (no stats update, no rewrites, no recursion); it only regenerates lops and instructions. The primary use case is recompilation after hop configuration changes, which preserves statistics (e.g., propagated worst-case stats from other program blocks) and gives better performance when recompiling individual program blocks.
        Parameters:
        pb - program block
        Throws:
        IOException - if IOException occurs
      • requiresRecompilation

        public static boolean requiresRecompilation​(ArrayList<Hop> hops)
      • requiresRecompilation

        public static boolean requiresRecompilation​(Hop hop)
      • deepCopyHopsDag

        public static ArrayList<Hop> deepCopyHopsDag​(List<Hop> hops)
        Deep copy of hops dags for parallel recompilation.
        Parameters:
        hops - list of high-level operators
        Returns:
        list of high-level operators
      • deepCopyHopsDag

        public static Hop deepCopyHopsDag​(Hop hops)
        Deep copy of hops dags for parallel recompilation.
        Parameters:
        hops - high-level operator
        Returns:
        high-level operator
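
        A deep copy of a DAG, as opposed to a tree, has to preserve shared inputs. The following minimal sketch shows one common way to do that with a memo table. Node and deepCopy are hypothetical stand-ins for illustration; this is not the SystemDS Hop class or its actual copy logic.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a hop DAG; not the SystemDS Hop class.
final class DagCopySketch {
    static final class Node {
        final String name;
        final List<Node> inputs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Copies all nodes reachable from the roots. The memo table maps each
    // original node to its copy, so a shared input is copied exactly once
    // and stays shared in the copied DAG.
    static List<Node> deepCopy(List<Node> roots) {
        Map<Node, Node> memo = new IdentityHashMap<>();
        List<Node> copies = new ArrayList<>();
        for (Node root : roots)
            copies.add(copy(root, memo));
        return copies;
    }

    private static Node copy(Node n, Map<Node, Node> memo) {
        Node existing = memo.get(n);
        if (existing != null)
            return existing;
        Node c = new Node(n.name);
        memo.put(n, c);
        for (Node in : n.inputs)
            c.inputs.add(copy(in, memo));
        return c;
    }

    public static void main(String[] args) {
        Node x = new Node("X");
        Node tx = new Node("t(X)");
        tx.inputs.add(x);
        Node mm = new Node("t(X) %*% X");
        mm.inputs.add(tx);
        mm.inputs.add(x); // X is shared by both inputs

        Node copy = deepCopy(List.of(mm)).get(0);
        System.out.println("root is a new object: " + (copy != mm));
        System.out.println("X still shared: "
            + (copy.inputs.get(0).inputs.get(0) == copy.inputs.get(1)));
    }
}
```

        Because the copy shares no node objects with the original, a recompiling thread can modify it without affecting other threads that hold the original DAG.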
      • updateFunctionNames

        public static void updateFunctionNames​(List<Hop> hops,
                                               long pid)
      • rUpdateFunctionNames

        public static void rUpdateFunctionNames​(Hop hop,
                                                long pid)
      • removeUpdatedScalars

        public static void removeUpdatedScalars​(LocalVariableMap callVars,
                                                StatementBlock sb)
        Remove any scalar variables from the variable map if the variable is updated in this block.
        Parameters:
        callVars - Map of variables eligible for propagation.
        sb - DML statement block.
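
        The filtering rule described above can be sketched on plain Java collections. This is a simplified model under stated assumptions: the variable map is a Map<String, Object>, the statement block is reduced to the set of variable names it updates, and isScalar is a hypothetical helper. None of these are the SystemDS LocalVariableMap or StatementBlock types.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in model; not the SystemDS LocalVariableMap API.
final class UpdatedScalarsSketch {
    // Assumption: boxed numbers, booleans, and strings model scalars;
    // anything else (e.g. double[][]) models a matrix.
    static boolean isScalar(Object value) {
        return value instanceof Number || value instanceof Boolean || value instanceof String;
    }

    // Removes every scalar entry whose name is in the updated set. Such a
    // value may change inside the block, so it is no longer safe to
    // propagate it as a constant.
    static void removeUpdatedScalars(Map<String, Object> callVars, Set<String> updated) {
        callVars.entrySet().removeIf(
            e -> updated.contains(e.getKey()) && isScalar(e.getValue()));
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new LinkedHashMap<>();
        vars.put("n", 10);                      // scalar, not updated
        vars.put("i", 1);                       // scalar, updated in the block
        vars.put("X", new double[][] {{1.0}});  // matrix, updated in the block

        removeUpdatedScalars(vars, Set.of("i", "X"));
        System.out.println(vars.keySet()); // "i" is gone; "n" and "X" remain
    }
}
```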
      • extractDAGOutputStatistics

        public static void extractDAGOutputStatistics​(List<Hop> hops,
                                                      LocalVariableMap vars)
      • extractDAGOutputStatistics

        public static void extractDAGOutputStatistics​(List<Hop> hops,
                                                      LocalVariableMap vars,
                                                      boolean overwrite)
      • extractDAGOutputStatistics

        public static void extractDAGOutputStatistics​(Hop hop,
                                                      LocalVariableMap vars,
                                                      boolean overwrite)
      • rClearLops

        public static void rClearLops​(Hop hop)
        Clearing lops for a given hop involves (1) removing the reference to the constructed lops and (2) clearing the exec type (for consistency). The latter is important for advanced optimizers like parfor; otherwise, subtle side effects of program recompilation and hop-lop rewrites are possible (e.g., the indexingop hop-lop rewrite, in combination with the parfor rewrite that sets the exec type, might eventually lead to unnecessary remote_parfor jobs).
        Parameters:
        hop - high-level operator
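
        The two clearing steps can be sketched as a recursive traversal. Op, its fields, and clear are hypothetical stand-ins, and the visited set is an assumption made for this sketch to handle shared inputs; the SystemDS implementation may track visited hops differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in model; not the SystemDS Hop or Lop classes.
final class ClearLopsSketch {
    static final class Op {
        Object lops;      // stand-in for the constructed low-level operator
        String execType;  // stand-in for the selected exec type
        final List<Op> inputs = new ArrayList<>();
    }

    // Clears the lops reference and the exec type of op and all its inputs.
    static void clear(Op op) {
        Set<Op> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        clear(op, visited);
    }

    private static void clear(Op op, Set<Op> visited) {
        if (!visited.add(op))
            return; // already cleared via another path in the DAG
        for (Op in : op.inputs)
            clear(in, visited);
        op.lops = null;     // (1) drop the reference to the constructed lops
        op.execType = null; // (2) reset the exec type for consistency
    }

    public static void main(String[] args) {
        Op leaf = new Op();
        leaf.lops = new Object();
        leaf.execType = "SPARK";
        Op root = new Op();
        root.lops = new Object();
        root.execType = "CP";
        root.inputs.add(leaf);

        clear(root);
        System.out.println(root.lops + " " + root.execType);
        System.out.println(leaf.lops + " " + leaf.execType);
    }
}
```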
      • rUpdateStatistics

        public static void rUpdateStatistics​(Hop hop,
                                             LocalVariableMap vars)
      • rReplaceLiterals

        public static void rReplaceLiterals​(Hop hop,
                                            ExecutionContext ec,
                                            boolean scalarsOnly)
        Public interface to the package-local literal replacement.
        Parameters:
        hop - high-level operator
        ec - Execution context
        scalarsOnly - if true, replace only scalar variables but no matrix operations; if false, apply full literal replacement
      • rReplaceLiterals

        public static void rReplaceLiterals​(Hop hop,
                                            LocalVariableMap vars,
                                            boolean scalarsOnly)
      • rGetMaxParallelism

        public static int rGetMaxParallelism​(List<Hop> hops)
      • rGetMaxParallelism

        public static int rGetMaxParallelism​(Hop hop)
      • rSetMaxParallelism

        public static void rSetMaxParallelism​(List<Hop> hops,
                                              int k)
      • rSetMaxParallelism

        public static void rSetMaxParallelism​(Hop hop,
                                              int k)
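
        The get/set pair above is undocumented, so the following is only a sketch of one plausible reading: the getter returns the largest thread count among multi-threaded operators in the DAG, and the setter assigns k to each of them. Op, its fields, the -1 default, and both methods are hypothetical stand-ins, not the SystemDS implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in model; not the SystemDS Hop classes.
final class ParallelismSketch {
    static final class Op {
        final boolean multiThreaded; // whether the operator has a thread count
        int k;                       // degree of parallelism
        final List<Op> inputs = new ArrayList<>();
        Op(boolean multiThreaded, int k) {
            this.multiThreaded = multiThreaded;
            this.k = k;
        }
    }

    // Returns the largest k of any multi-threaded operator, or -1 if none.
    static int getMax(List<Op> roots) {
        Set<Op> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        int max = -1;
        for (Op root : roots)
            max = Math.max(max, getMax(root, visited));
        return max;
    }

    private static int getMax(Op op, Set<Op> visited) {
        if (!visited.add(op))
            return -1; // already accounted for via another path
        int max = op.multiThreaded ? op.k : -1;
        for (Op in : op.inputs)
            max = Math.max(max, getMax(in, visited));
        return max;
    }

    // Sets k on every multi-threaded operator reachable from the roots.
    static void setMax(List<Op> roots, int k) {
        Set<Op> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Op root : roots)
            setMax(root, k, visited);
    }

    private static void setMax(Op op, int k, Set<Op> visited) {
        if (!visited.add(op))
            return;
        if (op.multiThreaded)
            op.k = k;
        for (Op in : op.inputs)
            setMax(in, k, visited);
    }

    public static void main(String[] args) {
        Op leaf = new Op(true, 4);
        Op root = new Op(true, 8);
        root.inputs.add(leaf);
        List<Op> roots = List.of(root);

        System.out.println(getMax(roots));
        setMax(roots, 2);
        System.out.println(getMax(roots));
    }
}
```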
      • checkCPReblock

        public static boolean checkCPReblock​(ExecutionContext ec,
                                             String varin)
        CP reblock check for Spark instructions. In contrast to MR, we cannot rely on the input file sizes because inputs might be passed via RDDs.
        Parameters:
        ec - execution context
        varin - variable
        Returns:
        true if the reblock should be executed in CP