Package org.apache.hadoop.mapred
Class FixedLengthInputFormat
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<LongWritable,BytesWritable>
org.apache.hadoop.mapred.FixedLengthInputFormat
- All Implemented Interfaces:
InputFormat<LongWritable,BytesWritable>, JobConfigurable
@Public
@Stable
public class FixedLengthInputFormat
extends FileInputFormat<LongWritable,BytesWritable>
implements JobConfigurable
FixedLengthInputFormat is an input format for reading input files
that contain fixed-length records. The content of a record need not be
text; it can be arbitrary binary data. Users must configure the record
length property by calling:
FixedLengthInputFormat.setRecordLength(conf, recordLength);
or conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
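To make the idea of fixed-length binary records concrete, the following is a minimal, standard-library-only sketch of cutting a byte array into records of a configured length. It is not Hadoop's implementation: the class FixedLengthRecordsSketch, its Record type, and the choices to pair each record with its starting byte offset and to reject a trailing partial record are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only (hypothetical class): cuts a byte array into fixed-length records. */
class FixedLengthRecordsSketch {

    /** One record: its starting byte offset (an assumed key) and its raw bytes. */
    static final class Record {
        final long offset;
        final byte[] bytes;

        Record(long offset, byte[] bytes) {
            this.offset = offset;
            this.bytes = bytes;
        }
    }

    /** Returns every record in data; the content may be arbitrary binary data. */
    static List<Record> readRecords(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("record length must be positive: " + recordLength);
        }
        // Assumption of this sketch: a trailing partial record is treated as an error.
        if (data.length % recordLength != 0) {
            throw new IllegalArgumentException(
                "trailing partial record of " + (data.length % recordLength) + " bytes");
        }
        List<Record> records = new ArrayList<>();
        for (int pos = 0; pos < data.length; pos += recordLength) {
            records.add(new Record(pos, Arrays.copyOfRange(data, pos, pos + recordLength)));
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x01, (byte) 0xFF, 0x10, 0x20, 0x30};
        for (Record r : readRecords(data, 3)) {
            System.out.println(r.offset + " -> " + Arrays.toString(r.bytes));
        }
    }
}
```

No delimiter is ever searched for in the data, which is why the record length has to be configured before the job runs.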
- See Also:
-
FixedLengthRecordReader
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
FileInputFormat.Counter -
Field Summary
Fields
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES -
Constructor Summary
Constructors -
Method Summary
Modifier and Type  Method  Description
void  configure(JobConf conf)  Initializes a new instance from a JobConf.
static int  getRecordLength(Configuration conf)  Get record length value
RecordReader<LongWritable,BytesWritable>  getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter)  Get the RecordReader for the given InputSplit.
protected boolean  isSplitable(FileSystem fs, Path file)  Is the given filename splittable?
static void  setRecordLength(Configuration conf, int recordLength)  Set the length of each record
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
-
Field Details
-
FIXED_RECORD_LENGTH
- See Also:
-
-
Constructor Details
-
FixedLengthInputFormat
public FixedLengthInputFormat()
-
-
Method Details
-
setRecordLength
Set the length of each record.
- Parameters:
conf - configuration
recordLength - the length of a record
-
getRecordLength
Get record length value.
- Parameters:
conf - configuration
- Returns:
the record length, zero means none was set
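Because zero signals that no record length was set, callers generally need to check the returned value before using it. The sketch below mimics that contract with a plain Map standing in for the configuration. The class RecordLengthCheckSketch, the key name "example.record.length", and the helper requireRecordLength are hypothetical and are not part of Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only (hypothetical class): a lookup where zero means "not set". */
class RecordLengthCheckSketch {

    // Hypothetical key for this sketch; the real key is whatever FIXED_RECORD_LENGTH holds.
    static final String RECORD_LENGTH_KEY = "example.record.length";

    /** Returns the configured record length, or zero if none was set. */
    static int getRecordLength(Map<String, String> conf) {
        String raw = conf.get(RECORD_LENGTH_KEY);
        return raw == null ? 0 : Integer.parseInt(raw.trim());
    }

    /** Fails fast when the length is missing or not positive (a choice of this sketch). */
    static int requireRecordLength(Map<String, String> conf) {
        int length = getRecordLength(conf);
        if (length <= 0) {
            throw new IllegalStateException(
                "record length is not set or invalid: " + length);
        }
        return length;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("unset: " + getRecordLength(conf));
        conf.put(RECORD_LENGTH_KEY, "128");
        System.out.println("set: " + requireRecordLength(conf));
    }
}
```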
-
configure
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.
- Specified by:
configure in interface JobConfigurable
- Parameters:
conf - the configuration
-
getRecordReader
public RecordReader<LongWritable,BytesWritable> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
- Specified by:
getRecordReader in interface InputFormat<LongWritable,BytesWritable>
- Specified by:
getRecordReader in class FileInputFormat<LongWritable,BytesWritable>
- Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
- Returns:
a RecordReader
- Throws:
IOException
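For fixed-length records, respecting record boundaries can be reduced to simple arithmetic on byte offsets. The sketch below shows one possible approach, under the assumption that each record belongs to the split in which its first byte lies. It is not taken from Hadoop's source, and the class SplitAlignmentSketch and its methods are hypothetical.

```java
/** Illustrative only (hypothetical class): aligning a byte-range split to record boundaries. */
class SplitAlignmentSketch {

    /** Offset of the first record that starts at or after splitStart. */
    static long firstRecordOffset(long splitStart, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("record length must be positive: " + recordLength);
        }
        long partial = splitStart % recordLength;
        // Skip the tail of a record that began in the previous split.
        return partial == 0 ? splitStart : splitStart + (recordLength - partial);
    }

    /**
     * Number of records whose first byte lies inside the split, capped at the end of the file.
     * A trailing partial record would still be counted; detecting it is out of scope here.
     */
    static long recordsStartingInSplit(long splitStart, long splitLength,
                                       int recordLength, long fileLength) {
        long first = firstRecordOffset(splitStart, recordLength);
        long splitEnd = Math.min(splitStart + splitLength, fileLength);
        if (first >= splitEnd) {
            return 0;
        }
        // Ceiling division: a record that starts before splitEnd is read in full.
        return (splitEnd - first + recordLength - 1) / recordLength;
    }

    public static void main(String[] args) {
        int recordLength = 10;
        long fileLength = 100;
        long splitLength = 25; // deliberately not a multiple of the record length
        long total = 0;
        for (long start = 0; start < fileLength; start += splitLength) {
            long n = recordsStartingInSplit(start, splitLength, recordLength, fileLength);
            System.out.println("split at " + start + ": first record at "
                + firstRecordOffset(start, recordLength) + ", records = " + n);
            total += n;
        }
        System.out.println("total records = " + total);
    }
}
```

With this rule every record is assigned to exactly one split, so splits whose lengths are not multiples of the record length neither drop nor duplicate records.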
-
isSplitable
Description copied from class: FileInputFormat
Is the given filename splittable? Usually true, but if the file is stream compressed, it will not be. The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method. FileInputFormat implementations can override this and return false to ensure that individual input files are never split up so that Mappers process entire files.
- Overrides:
isSplitable in class FileInputFormat<LongWritable,BytesWritable>
- Parameters:
fs - the file system that the file is on
file - the file name to check
- Returns:
is this file splitable?
-