Class CompositeInputFormat<K extends WritableComparable>

java.lang.Object
org.apache.hadoop.mapred.join.CompositeInputFormat<K>
All Implemented Interfaces:
InputFormat<K,TupleWritable>, ComposableInputFormat<K,TupleWritable>

@Public @Stable public class CompositeInputFormat<K extends WritableComparable> extends Object implements ComposableInputFormat<K,TupleWritable>
An InputFormat capable of performing joins over a set of data sources sorted and partitioned the same way. A user may define new join types by setting the property mapred.join.define.<ident> to a classname. In the expression mapred.join.expr, the identifier will be assumed to be a ComposableRecordReader. mapred.join.keycomparator can be a classname used to compare keys in the join.
  • Constructor Details

    • CompositeInputFormat

      public CompositeInputFormat()
  • Method Details

    • setFormat

      public void setFormat(JobConf job) throws IOException
      Interpret a given string as a composite expression.
        func  ::= <ident>([<func>,]*<func>)
        func  ::= tbl(<class>,"<path>")
        class ::= @see java.lang.Class#forName(java.lang.String)
        path  ::= @see org.apache.hadoop.fs.Path#Path(java.lang.String)
      Reads the expression from the mapred.join.expr property and user-supplied join types from the mapred.join.define.<ident> properties. Paths supplied to tbl are given as input paths to the InputFormat class listed.
      Throws:
      IOException
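      The grammar above allows join functions to nest. The following is a minimal sketch in plain Java, not Hadoop code, that renders strings following that grammar. The class JoinExprGrammarSketch, its methods, the identifiers inner and outer, the class name org.example.MyInputFormat, and the paths are all illustrative assumptions; which identifiers are actually accepted depends on addDefaults and on any mapred.join.define.<ident> settings.

```java
// Hypothetical helper, not part of Hadoop: renders strings that follow the
// func/tbl grammar documented for mapred.join.expr.
class JoinExprGrammarSketch {

    // func ::= tbl(<class>,"<path>")
    static String tbl(String className, String path) {
        return "tbl(" + className + ",\"" + path + "\")";
    }

    // func ::= <ident>([<func>,]*<func>)
    // The grammar requires at least one child expression.
    static String func(String ident, String... children) {
        if (children.length == 0) {
            throw new IllegalArgumentException("a func needs at least one child");
        }
        return ident + "(" + String.join(",", children) + ")";
    }

    public static void main(String[] args) {
        // Placeholder class name and paths, chosen only for illustration.
        String fmt = "org.example.MyInputFormat";
        String expr = func("outer",
                func("inner", tbl(fmt, "/data/a"), tbl(fmt, "/data/b")),
                tbl(fmt, "/data/c"));
        System.out.println(expr);
    }
}
```

      A string built this way would be the kind of value placed in the mapred.join.expr property before setFormat reads it.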
    • addDefaults

      protected void addDefaults()
      Adds the default set of identifiers to the parser.
    • getSplits

      public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException
      Build a CompositeInputSplit from the child InputFormats by assigning the ith split from each child to the ith composite split.
      Specified by:
      getSplits in interface InputFormat<K extends WritableComparable,TupleWritable>
      Parameters:
      job - job configuration.
      numSplits - the desired number of splits, a hint.
      Returns:
      an array of InputSplits for the job.
      Throws:
      IOException
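      The split pairing described for getSplits can be pictured with a small stand-alone sketch. It uses strings in place of InputSplit objects and is not the Hadoop implementation; the class CompositeSplitSketch and its zip method are hypothetical. Rejecting children that produce different numbers of splits is an assumption of this sketch, made because the ith-to-ith pairing only works when the counts match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Hadoop code: strings stand in for InputSplits.
class CompositeSplitSketch {

    // children.get(c) holds the splits produced by child c.
    // The result's ith entry groups the ith split of every child.
    static List<List<String>> zip(List<List<String>> children) {
        List<List<String>> composite = new ArrayList<>();
        if (children.isEmpty()) {
            return composite;
        }
        int n = children.get(0).size();
        for (List<String> child : children) {
            // Assumption of this sketch: mismatched counts are an error.
            if (child.size() != n) {
                throw new IllegalArgumentException(
                        "children must produce the same number of splits");
            }
        }
        for (int i = 0; i < n; i++) {
            List<String> ith = new ArrayList<>();
            for (List<String> child : children) {
                ith.add(child.get(i));
            }
            composite.add(ith);
        }
        return composite;
    }

    public static void main(String[] args) {
        List<List<String>> children = Arrays.asList(
                Arrays.asList("a-0", "a-1"),
                Arrays.asList("b-0", "b-1"));
        System.out.println(zip(children));
    }
}
```

      This pairing is why the data sources need to be sorted and partitioned the same way: each composite split reads matching partitions side by side.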
    • getRecordReader

      public ComposableRecordReader<K,TupleWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException
      Construct a CompositeRecordReader for the children of this InputFormat as defined in the init expression. The outermost join need only be composable, not necessarily a composite. Mandating TupleWritable isn't strictly correct.
      Specified by:
      getRecordReader in interface ComposableInputFormat<K extends WritableComparable,TupleWritable>
      Specified by:
      getRecordReader in interface InputFormat<K extends WritableComparable,TupleWritable>
      Parameters:
      split - the InputSplit
      job - the job that this split belongs to
      Returns:
      a RecordReader
      Throws:
      IOException
    • compose

      public static String compose(Class<? extends InputFormat> inf, String path)
      Convenience method for constructing composite formats. Given an InputFormat class (inf) and a path (p), returns: tbl(<inf>, <p>)
    • compose

      public static String compose(String op, Class<? extends InputFormat> inf, String... path)
      Convenience method for constructing composite formats. Given an operation (op), an InputFormat class (inf), and a set of paths (p), returns: <op>(tbl(<inf>,<p1>),tbl(<inf>,<p2>),...,tbl(<inf>,<pn>))
    • compose

      public static String compose(String op, Class<? extends InputFormat> inf, Path... path)
      Convenience method for constructing composite formats. Given an operation (op), an InputFormat class (inf), and a set of paths (p), returns: <op>(tbl(<inf>,<p1>),tbl(<inf>,<p2>),...,tbl(<inf>,<pn>))
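      To show how a class and a list of paths expand into the documented output shape, here is a stand-alone sketch that mirrors the compose overloads without depending on Hadoop. ComposeSketch and PlaceholderFormat are hypothetical names, Class<?> replaces Class<? extends InputFormat>, and quoting each path follows the tbl(<class>,"<path>") grammar given under setFormat. It is an approximation of the documented format, not the library's code.

```java
// Hypothetical stand-in for the compose overloads; not the Hadoop implementation.
class ComposeSketch {

    // Placeholder standing in for a real InputFormat implementation.
    static class PlaceholderFormat {}

    // Mirrors compose(inf, path): a single tbl(<inf>,"<p>") entry.
    static String compose(Class<?> inf, String path) {
        return "tbl(" + inf.getName() + ",\"" + path + "\")";
    }

    // Mirrors compose(op, inf, path...): one tbl entry per path, wrapped in <op>(...).
    static String compose(String op, Class<?> inf, String... paths) {
        StringBuilder sb = new StringBuilder(op).append('(');
        for (int i = 0; i < paths.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(compose(inf, paths[i]));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // "inner" and the paths are example values only.
        System.out.println(
                compose("inner", PlaceholderFormat.class, "/data/a", "/data/b"));
    }
}
```

      In a real job, the string returned by CompositeInputFormat.compose would typically be stored under mapred.join.expr so that setFormat can parse it.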