Class InputStreamInputFormat

  • All Implemented Interfaces:
    org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>

    public class InputStreamInputFormat
    extends Object
    implements org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
    Custom input format and record reader that let the common CSV read implementation, which is written against record readers (as required by the parallel readers), consume a plain input stream.
    • Constructor Detail

      • InputStreamInputFormat

        public InputStreamInputFormat(InputStream is)
    • Method Detail

      • getSplits

        public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                               int numSplits)
                                                        throws IOException
        Specified by:
        getSplits in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        IOException
      • getRecordReader

        public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                                                                                                                                        org.apache.hadoop.mapred.JobConf job,
                                                                                                                                        org.apache.hadoop.mapred.Reporter reporter)
                                                                                                                                 throws IOException
        Specified by:
        getRecordReader in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        IOException
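
The class description says the input format exposes an input stream through the record-reader interface that the CSV read code expects. The sketch below illustrates that general idea only and is not this class's implementation: `StreamLineReaderSketch`, `next()`, `getKey()`, `getValue()` and `readAll()` are hypothetical names, plain `long` and `String` stand in for `LongWritable` and `Text` so the example runs without Hadoop on the classpath, and using the zero-based line index as the key is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a record reader backed by an InputStream.
// It is NOT the reader returned by InputStreamInputFormat.getRecordReader.
public class StreamLineReaderSketch implements AutoCloseable {
    private final BufferedReader reader;
    private long nextIndex = 0;   // index assigned to the next line read
    private long key = -1;        // key of the current record (assumed: line index)
    private String value = null;  // text of the current record

    public StreamLineReaderSketch(InputStream is) {
        this.reader = new BufferedReader(
            new InputStreamReader(is, StandardCharsets.UTF_8));
    }

    // Advances to the next line; returns false once the stream is exhausted.
    public boolean next() throws IOException {
        String line = reader.readLine();
        if (line == null)
            return false;
        key = nextIndex++;
        value = line;
        return true;
    }

    public long getKey() { return key; }

    public String getValue() { return value; }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    // Drains the stream the way record-reader-based read code would:
    // loop on next() and consume one (key, value) pair per iteration.
    public static List<String> readAll(InputStream is) throws IOException {
        List<String> rows = new ArrayList<>();
        try (StreamLineReaderSketch rr = new StreamLineReaderSketch(is)) {
            while (rr.next())
                rows.add(rr.getKey() + ":" + rr.getValue());
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        InputStream is = new ByteArrayInputStream(
            "1,2,3\n4,5,6\n".getBytes(StandardCharsets.UTF_8));
        for (String row : readAll(is))
            System.out.println(row); // prints "0:1,2,3" then "1:4,5,6"
    }
}
```

Because a stream can only be consumed sequentially, a reader like this serves one consumer at a time. How the actual class reports splits through `getSplits` is not described on this page.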