Class InfrastructureAnalyzer

  • public class InfrastructureAnalyzer
    extends Object
    Central place for analyzing and obtaining static infrastructure properties such as memory and number of logical processors.
    • Constructor Detail

      • InfrastructureAnalyzer

        public InfrastructureAnalyzer()
    • Method Detail

      • getLocalParallelism

        public static int getLocalParallelism()
        Gets the number of logical processors of the current node, including hyper-threading if enabled.
        number of logical processors of the current node
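The page does not show how getLocalParallelism obtains this value. A minimal sketch, assuming the count comes from the standard JDK call Runtime.availableProcessors(); the class name LocalParallelismSketch is hypothetical and not part of this API:

```java
// Sketch only: a hypothetical stand-in, not the InfrastructureAnalyzer source.
class LocalParallelismSketch {

    /** Number of processors the JVM reports as available (logical cores). */
    static int localParallelism() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("logical processors: " + localParallelism());
    }
}
```

Note that the JVM may report fewer processors than the hardware provides when the process is restricted, for example by container CPU limits.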
      • getRemoteParallelNodes

        public static int getRemoteParallelNodes()
        Gets the number of cluster nodes (number of tasktrackers). If multiple tasktrackers are started per node, each tasktracker is viewed as an individual node.
        number of cluster nodes
      • getRemoteParallelMapTasks

        public static int getRemoteParallelMapTasks()
        Gets the number of remote parallel map slots.
        number of remote parallel map tasks
      • setRemoteParallelMapTasks

        public static void setRemoteParallelMapTasks(int pmap)
      • getRemoteParallelReduceTasks

        public static int getRemoteParallelReduceTasks()
        Gets the total number of remote parallel reduce slots.
        number of remote parallel reduce tasks
      • setRemoteParallelReduceTasks

        public static void setRemoteParallelReduceTasks(int preduce)
      • getLocalMaxMemory

        public static long getLocalMaxMemory()
        Gets the maximum memory [in bytes] of the current JVM.
        maximum memory of the current JVM
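How getLocalMaxMemory determines the limit is not documented here. A minimal sketch, assuming the value corresponds to the JDK's Runtime.maxMemory(); the class name LocalMemorySketch is hypothetical:

```java
// Sketch only: a hypothetical stand-in, not the InfrastructureAnalyzer source.
class LocalMemorySketch {

    /** Maximum heap size in bytes that this JVM will attempt to use. */
    static long localMaxMemory() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long bytes = localMaxMemory();
        System.out.println("max memory: " + bytes + " bytes ("
                + bytes / (1024 * 1024) + " MB)");
    }
}
```

Runtime.maxMemory() returns Long.MAX_VALUE when the JVM has no inherent limit, so callers that size buffers from this value may want to cap it.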
      • setLocalMaxMemory

        public static void setLocalMaxMemory(long localMem)
      • getLocalMaxMemoryFraction

        public static double getLocalMaxMemoryFraction()
      • isLocalMode

        public static boolean isLocalMode()
      • isLocalMode

        public static boolean isLocalMode(org.apache.hadoop.mapred.JobConf job)
      • getCkMaxCP

        public static int getCkMaxCP()
        Gets the maximum local parallelism constraint.
        maximum local parallelism constraint
      • getCkMaxMR

        public static int getCkMaxMR()
        Gets the maximum remote parallelism constraint.
        maximum remote parallelism constraint
      • getCmMax

        public static long getCmMax()
        Gets the maximum memory constraint [in bytes].
        maximum memory constraint
      • getBlockSize

        public static long getBlockSize(org.apache.hadoop.fs.FileSystem fs)
      • getHDFSBlockSize

        public static long getHDFSBlockSize()
        Gets the HDFS block size of the cluster in use, in bytes.
        HDFS block size
      • extractMaxMemoryOpt

        public static long extractMaxMemoryOpt(String javaOpts)
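extractMaxMemoryOpt carries no description on this page. Judging by its name and signature, it plausibly reads a maximum-heap setting from a JVM options string. The sketch below illustrates that idea under stated assumptions: it looks for the standard -Xmx flag, accepts the k, m, and g suffixes, uses the last -Xmx token if several appear, and returns -1 when none is present. The class name MaxMemoryOptSketch and the -1 convention are hypothetical, and the real method may behave differently.

```java
// Sketch only: illustrates parsing -Xmx from an options string.
// The behavior here is an assumption, not the documented API contract.
class MaxMemoryOptSketch {

    /** Returns the -Xmx value in bytes, or -1 if the string has no -Xmx token. */
    static long extractMaxMemoryOpt(String javaOpts) {
        long result = -1;
        if (javaOpts == null) {
            return result;
        }
        for (String tok : javaOpts.trim().split("\\s+")) {
            // Skip anything that is not "-Xmx<value>".
            if (!tok.startsWith("-Xmx") || tok.length() <= 4) {
                continue;
            }
            String value = tok.substring(4);
            char unit = Character.toLowerCase(value.charAt(value.length() - 1));
            long factor = 1; // no suffix means plain bytes
            if (unit == 'k') {
                factor = 1024L;
            } else if (unit == 'm') {
                factor = 1024L * 1024;
            } else if (unit == 'g') {
                factor = 1024L * 1024 * 1024;
            }
            String digits = (factor == 1)
                    ? value
                    : value.substring(0, value.length() - 1);
            result = Long.parseLong(digits) * factor; // later tokens override earlier ones
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractMaxMemoryOpt("-Xms512m -Xmx2g")); // prints 2147483648
    }
}
```

A malformed value such as -Xmxabc makes Long.parseLong throw NumberFormatException in this sketch; production code would need to decide how to handle that case.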
      • setMaxMemoryOpt

        public static void setMaxMemoryOpt(org.apache.hadoop.mapred.JobConf job,
                                           String key,
                                           long bytes)
      • getClusterUtilization

        public static double getClusterUtilization(boolean mapOnly)
                                            throws IOException
        Gets the ratio of running map/reduce tasks to existing map/reduce task slots. NOTE: on YARN the number of slots is an unreliable indicator because containers are scheduled purely based on memory.
        mapOnly - if true, only look at map tasks
        cluster utilization (current / capacity)
        IOException - if IOException occurs
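The "current / capacity" ratio described for getClusterUtilization can be illustrated with plain arithmetic. This is a sketch under assumptions: the task and slot counts are passed in as parameters instead of being queried from the cluster, map and reduce figures are summed when mapOnly is false, and an empty cluster yields 0.0. The class name ClusterUtilizationSketch and all parameter names are hypothetical.

```java
// Sketch only: the counts would come from the cluster in the real method;
// here they are plain arguments so the arithmetic can be run in isolation.
class ClusterUtilizationSketch {

    /** Running tasks divided by available slots; maps only if mapOnly is true. */
    static double utilization(int runningMaps, int mapSlots,
                              int runningReduces, int reduceSlots,
                              boolean mapOnly) {
        int running = runningMaps;
        int capacity = mapSlots;
        if (!mapOnly) {
            running += runningReduces;
            capacity += reduceSlots;
        }
        // Guard against division by zero when no slots are reported.
        return capacity > 0 ? (double) running / capacity : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(utilization(30, 40, 10, 40, true));  // prints 0.75
        System.out.println(utilization(30, 40, 10, 40, false)); // prints 0.5
    }
}
```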