Class InfrastructureAnalyzer


  • public class InfrastructureAnalyzer
    extends Object
    Central place for analyzing and obtaining static infrastructure properties such as memory and number of logical processors.
    • Constructor Detail

      • InfrastructureAnalyzer

        public InfrastructureAnalyzer()
    • Method Detail

      • getLocalParallelism

        public static int getLocalParallelism()
        Gets the number of logical processors of the current node, including hyper-threading if enabled.
        Returns:
        number of logical processors of the current node
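A minimal sketch of how such a value can be obtained, assuming the count comes from the standard JVM runtime call. `LocalParallelismSketch` is a hypothetical stand-in for illustration, not the library class, and the real implementation may cache or override the value.

```java
// Hypothetical stand-in class; not part of the documented API.
final class LocalParallelismSketch {

    // Assumption: the logical processor count (hyper-threads included)
    // is taken from the JVM runtime.
    static int getLocalParallelism() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        int k = getLocalParallelism();
        System.out.println("logical processors: " + k);
    }
}
```

A caller would typically use the returned value as an upper bound when sizing a local thread pool.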
      • getRemoteParallelNodes

        public static int getRemoteParallelNodes()
        Gets the number of cluster nodes (number of tasktrackers). If multiple tasktrackers are started per node, each tasktracker is counted as an individual node.
        Returns:
        number of cluster nodes
      • getRemoteParallelMapTasks

        public static int getRemoteParallelMapTasks()
        Gets the total number of remote parallel map slots.
        Returns:
        number of remote parallel map tasks
      • setRemoteParallelMapTasks

        public static void setRemoteParallelMapTasks(int pmap)
      • getRemoteParallelReduceTasks

        public static int getRemoteParallelReduceTasks()
        Gets the total number of remote parallel reduce slots.
        Returns:
        number of remote parallel reduce tasks
      • setRemoteParallelReduceTasks

        public static void setRemoteParallelReduceTasks(int preduce)
      • getLocalMaxMemory

        public static long getLocalMaxMemory()
        Gets the maximum memory [in bytes] of the current JVM.
        Returns:
        maximum memory of the current JVM
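The following sketch shows one way to read the JVM's maximum memory in bytes and derive a working budget from it. It assumes the value comes from `Runtime.maxMemory()`. The class `LocalMemorySketch`, the helper `budget`, and the fraction 0.5 are hypothetical and chosen only for illustration.

```java
// Hypothetical stand-in class; not part of the documented API.
final class LocalMemorySketch {

    // Assumption: the maximum memory is the JVM's max heap size in bytes.
    static long getLocalMaxMemory() {
        return Runtime.getRuntime().maxMemory();
    }

    // Hypothetical helper: portion of the maximum memory usable as a budget.
    static long budget(long maxBytes, double fraction) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        return (long) (maxBytes * fraction);
    }

    public static void main(String[] args) {
        long max = getLocalMaxMemory();
        System.out.println("max memory [bytes]: " + max);
        System.out.println("budget at 0.5 [bytes]: " + budget(max, 0.5));
    }
}
```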
      • setLocalMaxMemory

        public static void setLocalMaxMemory(long localMem)
      • getLocalMaxMemoryFraction

        public static double getLocalMaxMemoryFraction()
      • isLocalMode

        public static boolean isLocalMode()
      • isLocalMode

        public static boolean isLocalMode(org.apache.hadoop.mapred.JobConf job)
      • getCkMaxCP

        public static int getCkMaxCP()
        Gets the maximum local parallelism constraint.
        Returns:
        maximum local parallelism constraint
      • getCkMaxMR

        public static int getCkMaxMR()
        Gets the maximum remote parallelism constraint.
        Returns:
        maximum remote parallelism constraint
      • getCmMax

        public static long getCmMax()
        Gets the maximum memory constraint [in bytes].
        Returns:
        maximum memory constraint
      • getBlockSize

        public static long getBlockSize(org.apache.hadoop.fs.FileSystem fs)
      • getHDFSBlockSize

        public static long getHDFSBlockSize()
        Gets the HDFS blocksize of the used cluster in bytes.
        Returns:
        HDFS block size
      • extractMaxMemoryOpt

        public static long extractMaxMemoryOpt(String javaOpts)
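Assuming, from its name and signature alone, that this method reads the `-Xmx` heap setting out of a JVM options string and returns it in bytes, the parsing could look roughly like the sketch below. `MaxMemoryOptSketch`, the use of the first match, and the return value of -1 when no `-Xmx` option is present are all assumptions, not documented behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in class; not part of the documented API.
final class MaxMemoryOptSketch {

    // Matches e.g. -Xmx2g, -Xmx512M, -Xmx1048576 (no suffix means bytes).
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    // Returns the first -Xmx value in bytes, or -1 if none is found (assumption).
    static long extractMaxMemoryOpt(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1L;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "k": case "K": return value * 1024L;
            case "m": case "M": return value * 1024L * 1024L;
            case "g": case "G": return value * 1024L * 1024L * 1024L;
            default:            return value; // plain byte count
        }
    }

    public static void main(String[] args) {
        System.out.println(extractMaxMemoryOpt("-server -Xmx2g -Xms512m"));
    }
}
```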
      • setMaxMemoryOpt

        public static void setMaxMemoryOpt(org.apache.hadoop.mapred.JobConf job,
                                           String key,
                                           long bytes)
      • getClusterUtilization

        public static double getClusterUtilization(boolean mapOnly)
                                            throws IOException
        Gets the ratio of running map/reduce tasks to available map/reduce task slots. NOTE: on YARN the number of slots is an unreliable indicator because containers are scheduled purely based on memory.
        Parameters:
        mapOnly - if true, only look at map tasks
        Returns:
        cluster utilization (current / capacity)
        Throws:
        IOException - if IOException occurs
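The utilization described for getClusterUtilization is a simple ratio of current load to capacity. The sketch below illustrates the arithmetic with plain integers instead of live cluster status. `ClusterUtilizationSketch` and its parameters are hypothetical, and summing map and reduce counts when `mapOnly` is false, as well as returning 0.0 for zero capacity, are assumptions about the intended behavior.

```java
// Hypothetical stand-in class; not part of the documented API.
final class ClusterUtilizationSketch {

    // Computes running tasks divided by available slots (current / capacity).
    static double utilization(int runningMaps, int runningReduces,
                              int mapSlots, int reduceSlots, boolean mapOnly) {
        int running = mapOnly ? runningMaps : runningMaps + runningReduces;
        int capacity = mapOnly ? mapSlots : mapSlots + reduceSlots;
        if (capacity <= 0) {
            return 0.0; // assumption: treat a cluster without slots as idle
        }
        return (double) running / capacity;
    }

    public static void main(String[] args) {
        // 5 of 10 map slots busy, 3 of 10 reduce slots busy.
        System.out.println(utilization(5, 3, 10, 10, true));
        System.out.println(utilization(5, 3, 10, 10, false));
    }
}
```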