Package org.apache.hadoop.mapred
Class MultiFileInputFormat<K,V>
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<K,V>
org.apache.hadoop.mapred.MultiFileInputFormat<K,V>
- All Implemented Interfaces:
InputFormat<K,V>
An abstract InputFormat that returns MultiFileSplits from its getSplits(JobConf, int) method. Splits are constructed from the files under the input paths, and each returned split contains nearly equal content length. Subclasses implement getRecordReader(InputSplit, JobConf, Reporter) to construct RecordReaders for MultiFileSplits.
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
FileInputFormat.Counter -
Field Summary
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
abstract RecordReader<K,V> | getRecordReader(InputSplit split, JobConf job, Reporter reporter) | Get the RecordReader for the given InputSplit.
InputSplit[] | getSplits(JobConf job, int numSplits) | Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big.
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
-
Constructor Details
-
MultiFileInputFormat
public MultiFileInputFormat()
-
-
Method Details
-
getSplits
Description copied from class: FileInputFormat
Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big.
- Specified by:
getSplits in interface InputFormat<K,V>
- Overrides:
getSplits in class FileInputFormat<K,V>
- Parameters:
job - job configuration.
numSplits - the desired number of splits, a hint.
- Returns:
an array of InputSplits for the job.
- Throws:
IOException
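The class description says each returned split contains nearly equal content length. The following is a minimal plain-Java sketch of one way to group whole files so that per-group byte totals come out roughly even. It does not use Hadoop types and is not the algorithm Hadoop itself runs; the class name EqualLengthSplitSketch and the method group are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hadoop's MultiFileInputFormat code.
final class EqualLengthSplitSketch {

    /**
     * Groups consecutive files into at most numSplits groups whose total
     * lengths are close to total / numSplits. Each group is a list of
     * indexes into fileLengths. Files are never cut; only whole files move.
     */
    static List<List<Integer>> group(long[] fileLengths, int numSplits) {
        List<List<Integer>> groups = new ArrayList<>();
        if (fileLengths.length == 0 || numSplits <= 0) {
            return groups;
        }
        long total = 0;
        for (long len : fileLengths) {
            total += len;
        }
        // Never create more groups than there are files.
        int splits = Math.min(numSplits, fileLengths.length);
        long consumed = 0;
        int next = 0;
        for (int s = 0; s < splits; s++) {
            // Cumulative byte count that groups 0..s should ideally cover.
            long target = total * (s + 1) / splits;
            int remainingSplits = splits - s - 1;
            boolean lastSplit = remainingSplits == 0;
            List<Integer> current = new ArrayList<>();
            // Take at least one file, and leave one file for every later group.
            while (next < fileLengths.length - remainingSplits
                    && (current.isEmpty() || lastSplit
                        || consumed + fileLengths[next] <= target)) {
                consumed += fileLengths[next];
                current.add(next++);
            }
            groups.add(current);
        }
        return groups;
    }
}
```

In this sketch numSplits is treated as an upper bound, which matches its documented role as a hint: when there are fewer files than requested splits, each file simply becomes its own group.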
-
getRecordReader
public abstract RecordReader<K,V> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
- Specified by:
getRecordReader in interface InputFormat<K,V>
- Specified by:
getRecordReader in class FileInputFormat<K,V>
- Parameters:
split - the InputSplit
job - the job that this split belongs to
- Returns:
a RecordReader
- Throws:
IOException
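To illustrate what respecting record boundaries can mean for a reader, here is a minimal plain-Java sketch over an in-memory byte array. It assumes newline-terminated records and the convention that a record belongs to the byte range containing its first byte. Neither assumption comes from this page, and the names BoundaryAwareReaderSketch and readRecords are hypothetical. It does not implement the Hadoop RecordReader interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not a Hadoop RecordReader implementation.
final class BoundaryAwareReaderSketch {

    /**
     * Returns the newline-delimited records owned by the byte range
     * [start, end) of data. A record is owned by the range that contains
     * its first byte, so a reader may read past end to finish its last
     * record and must skip a record that began before start.
     */
    static List<String> readRecords(byte[] data, int start, int end) {
        List<String> records = new ArrayList<>();
        int pos = start;
        // If the range begins mid-record, advance to the next record start.
        // The partial record belongs to the preceding range.
        if (start > 0) {
            while (pos < data.length && data[pos - 1] != '\n') {
                pos++;
            }
        }
        // Emit every record whose first byte lies inside [start, end).
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1; // step over the newline
        }
        return records;
    }
}
```

With the input "alpha\nbeta\ngamma\n", the ranges [0, 8) and [8, 17) cut the second record in half, yet the first call returns alpha and beta while the second returns only gamma, so every record is presented exactly once.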
-