Class PathOutputCommitter

java.lang.Object
org.apache.hadoop.mapreduce.OutputCommitter
org.apache.hadoop.mapreduce.lib.output.PathOutputCommitter
Direct Known Subclasses:
org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter, BindingPathOutputCommitter, FileOutputCommitter, ManifestCommitter

@Public @Evolving public abstract class PathOutputCommitter extends OutputCommitter
A committer which commits data written to a working directory to the final directory during the commit process; how it does so is left to the implementation. The reference implementation of this is the FileOutputCommitter. There are two constructors, both of which do nothing but log and validate their arguments.
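The working-directory-then-commit idea described above can be illustrated with a minimal, self-contained sketch. It does not use Hadoop at all: `RenameCommitSketch`, its fields, and the `demo()` helper are hypothetical names, `java.nio.file.Path` stands in for `org.apache.hadoop.fs.Path`, and a plain file move is assumed as the commit mechanism. Real committers such as FileOutputCommitter are considerably more involved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in, not a Hadoop class: work is written to workDir
// and only becomes visible in finalDir when commit() is called.
final class RenameCommitSketch {
    private final Path workDir;
    private final Path finalDir;

    RenameCommitSketch(Path workDir, Path finalDir) {
        this.workDir = workDir;
        this.finalDir = finalDir;
    }

    /** Moves every regular file directly under workDir into finalDir; returns the number moved. */
    int commit() {
        try {
            Files.createDirectories(finalDir);
            List<Path> files;
            try (Stream<Path> listing = Files.list(workDir)) {
                files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            for (Path file : files) {
                Files.move(file, finalDir.resolve(file.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
            return files.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes one file to a temporary work directory, commits it, and counts the files in the final directory. */
    static int demo() {
        try {
            Path work = Files.createTempDirectory("work");
            Path dest = Files.createTempDirectory("final");
            Files.write(work.resolve("part-00000"), "data".getBytes(StandardCharsets.UTF_8));
            new RenameCommitSketch(work, dest).commit();
            try (Stream<Path> listing = Files.list(dest)) {
                return (int) listing.count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the pattern is that readers of the final directory never see partial output: nothing appears there until the commit step runs.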
  • Constructor Details

    • PathOutputCommitter

      protected PathOutputCommitter(Path outputPath, TaskAttemptContext context) throws IOException
      Constructor for a task attempt. Subclasses should provide a public constructor with this signature.
      Parameters:
      outputPath - output path: may be null
      context - task context
      Throws:
      IOException - IO problem
    • PathOutputCommitter

      protected PathOutputCommitter(Path outputPath, JobContext context) throws IOException
      Constructor for a job attempt. Subclasses should provide a public constructor with this signature.
      Parameters:
      outputPath - output path: may be null
      context - job context
      Throws:
      IOException - IO problem
  • Method Details

    • getOutputPath

      public abstract Path getOutputPath()
      Get the final directory where work will be placed once the job is committed. This may be null, in which case there is no output path to write data to.
      Returns:
      the path where final output of the job should be placed.
    • hasOutputPath

      public boolean hasOutputPath()
      Predicate: is there an output path?
      Returns:
      true if we have an output path set, else false.
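Because getOutputPath() may return null, callers are expected to test for an output path before using it. The sketch below shows that relationship with stand-in types: `OutputPathHolder` and its `of` factory are hypothetical, `java.nio.file.Path` replaces Hadoop's `Path`, and the body of `hasOutputPath()` is an assumption that simply follows the documented contract (true when a path is set).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for the relevant part of the committer API.
abstract class OutputPathHolder {
    /** May return null when there is no output path. */
    abstract Path getOutputPath();

    /** Assumed implementation: an output path exists exactly when getOutputPath() is non-null. */
    boolean hasOutputPath() {
        return getOutputPath() != null;
    }

    /** Convenience factory for the example: wraps a fixed (possibly null) path. */
    static OutputPathHolder of(final Path path) {
        return new OutputPathHolder() {
            @Override
            Path getOutputPath() {
                return path;
            }
        };
    }

    /** Caller-side pattern: guard every use of the output path with the predicate. */
    static String describe(OutputPathHolder holder) {
        return holder.hasOutputPath()
                ? "output goes to " + holder.getOutputPath()
                : "no output path configured";
    }

    public static void main(String[] args) {
        System.out.println(describe(of(Paths.get("out"))));
        System.out.println(describe(of(null)));
    }
}
```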
    • getWorkPath

      public abstract Path getWorkPath() throws IOException
      Get the directory that the task should write results into. Warning: there's no guarantee that this work path is on the same FS as the final output, or that it's visible across machines. The returned path may be null.
      Returns:
      the work directory
      Throws:
      IOException - IO problem
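Since the work path may be null or live on a different filesystem than the final output, code that plans to rename from one to the other may want to check first. The following is a rough heuristic only, under stated assumptions: `WorkPathCheck` and `sameFileSystem` are hypothetical names, the inputs are plain `java.net.URI` values (in Hadoop these could be obtained with `Path.toUri()`), and matching scheme and authority is taken as a hint, not proof, that two paths share a filesystem.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Objects;

// Hypothetical helper, not part of Hadoop.
final class WorkPathCheck {
    private WorkPathCheck() {
    }

    /**
     * Returns true only when both URIs are non-null and have the same
     * scheme (case-insensitive) and the same authority. A null work path,
     * which getWorkPath() is allowed to return, yields false.
     */
    static boolean sameFileSystem(URI workPath, URI outputPath) {
        if (workPath == null || outputPath == null) {
            return false;
        }
        return Objects.equals(lower(workPath.getScheme()), lower(outputPath.getScheme()))
                && Objects.equals(workPath.getAuthority(), outputPath.getAuthority());
    }

    private static String lower(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }
}
```

A caller might use the result to choose between a rename and a copy; when the check returns false, assuming a rename will work is unsafe.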
    • toString

      public String toString()
      Overrides:
      toString in class Object