Class FixedLengthInputFormat

All Implemented Interfaces:
InputFormat<LongWritable,BytesWritable>, JobConfigurable

@Public @Stable public class FixedLengthInputFormat extends FileInputFormat<LongWritable,BytesWritable> implements JobConfigurable
FixedLengthInputFormat is an input format used to read input files that contain fixed-length records. The content of a record need not be text; it can be arbitrary binary data. Users must configure the record length property by calling:

FixedLengthInputFormat.setRecordLength(conf, recordLength);

or

conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
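To make the fixed-length record model concrete, here is a minimal plain-Java sketch that cuts a byte array into records of equal size. It does not use the Hadoop classes. The names FixedLengthSketch, Record, and split are hypothetical. Pairing each record with its starting byte offset (loosely mirroring the LongWritable key and BytesWritable value) and rejecting a trailing partial record are assumptions of this sketch, not behavior stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; it does not use or reproduce the Hadoop classes.
class FixedLengthSketch {

    /** One record: the byte offset where it starts, plus its raw bytes. */
    static final class Record {
        final long offset;
        final byte[] value;

        Record(long offset, byte[] value) {
            this.offset = offset;
            this.value = value;
        }
    }

    /** Cuts data into consecutive records of exactly recordLength bytes. */
    static List<Record> split(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be > 0, got " + recordLength);
        }
        // Assumption of this sketch: a trailing partial record is treated as an error.
        if (data.length % recordLength != 0) {
            throw new IllegalArgumentException(
                "input length " + data.length + " is not a multiple of " + recordLength);
        }
        List<Record> records = new ArrayList<>();
        for (int pos = 0; pos < data.length; pos += recordLength) {
            // The record content is copied as raw bytes; it need not be text.
            records.add(new Record(pos, Arrays.copyOfRange(data, pos, pos + recordLength)));
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06};
        for (Record r : split(data, 2)) {
            System.out.println(r.offset + " -> " + Arrays.toString(r.value));
        }
    }
}
```

Because every record has the same size, the start of record i is simply i * recordLength, which is what makes this kind of file easy to process in independent chunks.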

See Also:
  • FixedLengthRecordReader
  • Field Details

  • Constructor Details

    • FixedLengthInputFormat

      public FixedLengthInputFormat()
  • Method Details

    • setRecordLength

      public static void setRecordLength(Configuration conf, int recordLength)
      Set the length of each record, in bytes.
      Parameters:
      conf - configuration
      recordLength - the length of a record
    • getRecordLength

      public static int getRecordLength(Configuration conf)
      Get the configured record length.
      Parameters:
      conf - configuration
      Returns:
      the record length; zero means that none was set
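Since a return value of zero means that no record length was set, callers may want to check for that case before reading. The sketch below imitates the setter/getter pair with java.util.Properties standing in for Configuration. The class name, the property key, and the requireRecordLength helper are all hypothetical and exist only for illustration.

```java
import java.util.Properties;

// Plain-Java illustration; Properties is a stand-in for Hadoop's Configuration.
class RecordLengthConfigSketch {

    // Hypothetical key used only by this sketch, not the real property name.
    static final String KEY = "sketch.fixed.record.length";

    static void setRecordLength(Properties conf, int recordLength) {
        conf.setProperty(KEY, Integer.toString(recordLength));
    }

    /** Returns the stored length, or 0 when nothing was set. */
    static int getRecordLength(Properties conf) {
        return Integer.parseInt(conf.getProperty(KEY, "0"));
    }

    /** Hypothetical helper: fails fast when the length is missing or not positive. */
    static int requireRecordLength(Properties conf) {
        int length = getRecordLength(conf);
        if (length <= 0) {
            throw new IllegalStateException(
                "record length must be set to a value greater than zero, got " + length);
        }
        return length;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("before set: " + getRecordLength(conf));
        setRecordLength(conf, 128);
        System.out.println("after set: " + requireRecordLength(conf));
    }
}
```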
    • configure

      public void configure(JobConf conf)
      Description copied from interface: JobConfigurable
      Initializes a new instance from a JobConf.
      Specified by:
      configure in interface JobConfigurable
      Parameters:
      conf - the configuration
    • getRecordReader

      public RecordReader<LongWritable,BytesWritable> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
      Description copied from interface: InputFormat
      Get the RecordReader for the given InputSplit.

      It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.

      Specified by:
      getRecordReader in interface InputFormat<LongWritable,BytesWritable>
      Specified by:
      getRecordReader in class FileInputFormat<LongWritable,BytesWritable>
      Parameters:
      genericSplit - the InputSplit
      job - the job that this split belongs to
      Returns:
      a RecordReader
      Throws:
      IOException
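Respecting record boundaries matters because a logical split may begin in the middle of a record. The following plain-Java sketch shows one way to align a split to fixed-length boundaries. It assumes that a record belongs to the split containing its first byte; that convention, along with the class and method names, is an assumption of this sketch rather than something this page states.

```java
// Plain-Java illustration of aligning a byte-range split to record boundaries.
class SplitAlignmentSketch {

    /** Offset of the first record that starts at or after splitStart. */
    static long firstRecordStart(long splitStart, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be > 0, got " + recordLength);
        }
        long partial = splitStart % recordLength;
        // If the split begins mid-record, skip the remainder of that record.
        return partial == 0 ? splitStart : splitStart + (recordLength - partial);
    }

    /** Number of records whose first byte lies in [splitStart, splitEnd). */
    static long recordsInSplit(long splitStart, long splitEnd, int recordLength) {
        long first = firstRecordStart(splitStart, recordLength);
        if (first >= splitEnd) {
            return 0;
        }
        // Ceiling division: a record that starts inside the split is counted
        // even if its last bytes extend past splitEnd.
        return (splitEnd - first + recordLength - 1) / recordLength;
    }

    public static void main(String[] args) {
        // A 100-byte file of 10-byte records, cut into three uneven splits.
        long[][] splits = {{0, 35}, {35, 70}, {70, 100}};
        for (long[] s : splits) {
            System.out.println("[" + s[0] + ", " + s[1] + ") first record at "
                + firstRecordStart(s[0], 10) + ", records: " + recordsInSplit(s[0], s[1], 10));
        }
    }
}
```

Under this convention every record is counted by exactly one split, so no record is read twice or dropped, regardless of where the split boundaries fall.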
    • isSplitable

      protected boolean isSplitable(FileSystem fs, Path file)
      Description copied from class: FileInputFormat
      Is the given filename splittable? Usually true, but not if the file is stream compressed. The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method. FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.
      Overrides:
      isSplitable in class FileInputFormat<LongWritable,BytesWritable>
      Parameters:
      fs - the file system that the file is on
      file - the file name to check
      Returns:
      true if this file is splittable, false otherwise
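The rule that stream-compressed files are not split can be illustrated with a small plain-Java sketch. It decides by file-name suffix alone, which is a stand-in chosen for illustration and not how Hadoop detects compression. The class name and the suffix list are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration; the suffix check is a simplification for this sketch.
class SplitabilitySketch {

    // Hypothetical list of suffixes treated as "stream compressed" here.
    static final List<String> COMPRESSED_SUFFIXES = Arrays.asList(".gz", ".bz2", ".deflate");

    /** Returns false for names that look compressed, true otherwise. */
    static boolean isSplitable(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : COMPRESSED_SUFFIXES) {
            if (lower.endsWith(suffix)) {
                // A compressed stream cannot be entered at an arbitrary byte
                // offset, so the whole file has to go to a single reader.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("records.dat: " + isSplitable("records.dat"));
        System.out.println("records.dat.gz: " + isSplitable("records.dat.gz"));
    }
}
```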