Package org.apache.hadoop.mapred
Class TextInputFormat
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<LongWritable,Text>
org.apache.hadoop.mapred.TextInputFormat
- All Implemented Interfaces:
InputFormat<LongWritable,Text>, JobConfigurable
@Public
@Stable
public class TextInputFormat
extends FileInputFormat<LongWritable,Text>
implements JobConfigurable
An InputFormat for plain text files. Files are broken into lines.
Either linefeed or carriage-return is used to signal end of line. Keys are
the position (byte offset) of each line in the file, and values are the text of the line.
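The key/value convention described above can be illustrated without Hadoop on the classpath. The following is a minimal sketch in plain Java, not the Hadoop implementation: the class name LineOffsetSketch and the method records are hypothetical, and the map of Long to String stands in for the LongWritable/Text pairs a mapper would receive. It assumes that a carriage-return immediately followed by a linefeed counts as a single line terminator.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; it does not use or mirror Hadoop's own classes.
final class LineOffsetSketch {

    /** Maps the byte offset of each line start to the line text (terminator removed). */
    static Map<Long, String> records(byte[] data) {
        Map<Long, String> out = new LinkedHashMap<>();
        int pos = 0;
        while (pos < data.length) {
            int start = pos;
            // Scan to the next linefeed or carriage-return, or to end of input.
            while (pos < data.length && data[pos] != '\n' && data[pos] != '\r') {
                pos++;
            }
            out.put((long) start, new String(data, start, pos - start, StandardCharsets.UTF_8));
            if (pos < data.length) {
                // Assumption: "\r\n" is consumed as one terminator.
                if (data[pos] == '\r' && pos + 1 < data.length && data[pos + 1] == '\n') {
                    pos++;
                }
                pos++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8);
        // Prints one "offset<TAB>line" pair per record.
        records(data).forEach((offset, line) -> System.out.println(offset + "\t" + line));
    }
}
```

For the sample input, the keys are 0, 6 and 12: each key is the number of bytes that precede the line, including earlier line terminators.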
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
FileInputFormat.Counter
Field Summary
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
Constructor Summary
Constructors
TextInputFormat()
Method Summary
Modifier and Type, Method, Description
void configure(JobConf conf) - Initializes a new instance from a JobConf.
RecordReader<LongWritable,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) - Get the RecordReader for the given InputSplit.
protected boolean isSplitable(FileSystem fs, Path file) - Is the given filename splittable?
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
-
Constructor Details
-
TextInputFormat
public TextInputFormat()
-
-
Method Details
-
configure
public void configure(JobConf conf)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.
- Specified by:
configure in interface JobConfigurable
- Parameters:
conf - the configuration
-
isSplitable
protected boolean isSplitable(FileSystem fs, Path file)
Description copied from class: FileInputFormat
Is the given filename splittable? Usually true, but if the file is stream compressed, it will not be. The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method.
FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.
- Overrides:
isSplitable in class FileInputFormat<LongWritable,Text>
- Parameters:
fs - the file system that the file is on
file - the file name to check
- Returns:
whether this file is splittable
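The rule described above (plain files can be split, stream-compressed files cannot) can be sketched as a simple lookup. This is an illustration only, written in plain Java so that it runs without Hadoop: the class SplittabilitySketch, the extension table, and the String-based signature are all hypothetical, and the table is deliberately incomplete. The sketch assumes that gzip output must be read from the start of the stream, while bzip2 output can be entered at block boundaries.

```java
import java.util.Map;

// Hypothetical illustration class; a real InputFormat would consult its compression codecs instead.
final class SplittabilitySketch {

    // Assumed, incomplete table: file extension -> whether that compressed format can be split.
    private static final Map<String, Boolean> CODEC_SPLITTABLE = Map.of(
            ".gz", false,  // stream compressed: must be read from the beginning
            ".bz2", true   // block oriented: a reader can start at a block boundary
    );

    /** Returns true when a file with this name could be divided into several splits. */
    static boolean isSplitable(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return true; // no extension: treated as uncompressed text
        }
        Boolean splittable = CODEC_SPLITTABLE.get(fileName.substring(dot));
        // Unknown extensions are treated as uncompressed and therefore splittable.
        return splittable == null || splittable;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"part-00000.txt", "logs.gz", "logs.bz2"}) {
            System.out.println(name + " -> " + isSplitable(name));
        }
    }
}
```

Returning false for a file means one Mapper processes that whole file, which trades parallelism for correctness when the format cannot be read from an arbitrary offset.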
-
getRecordReader
public RecordReader<LongWritable,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
- Specified by:
getRecordReader in interface InputFormat<LongWritable,Text>
- Specified by:
getRecordReader in class FileInputFormat<LongWritable,Text>
- Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
- Returns:
a RecordReader
- Throws:
IOException
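Respecting record boundaries matters because a split is a byte range that can begin or end in the middle of a line. The following plain-Java sketch shows one common way a line-oriented reader can handle this, under stated assumptions: every split except the first discards the line in progress at its start offset, and every split keeps reading until it has finished any line that begins at or before its end offset. The class SplitBoundarySketch and its methods are hypothetical, only linefeed terminators are handled, and this is not presented as the reader that getRecordReader returns.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; it simulates splits over an in-memory byte array.
final class SplitBoundarySketch {

    /** Returns the lines owned by the split [start, start + length) of data. */
    static List<String> readSplit(byte[] data, int start, int length) {
        List<String> lines = new ArrayList<>();
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // Not the first split: the line in progress here belongs to the previous split.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step over the linefeed
        }
        // "<=" so that a line starting exactly at `end` is read here;
        // the next split always discards its first line, so nothing is read twice.
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            lines.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step over the linefeed
        }
        return lines;
    }

    /** Cuts data into fixed-size splits and concatenates what each split reads. */
    static List<String> readAll(byte[] data, int splitSize) {
        List<String> all = new ArrayList<>();
        for (int start = 0; start < data.length; start += splitSize) {
            int length = Math.min(splitSize, data.length - start);
            all.addAll(readSplit(data, start, length));
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbb\ncc\n".getBytes(StandardCharsets.UTF_8);
        for (int start = 0; start < data.length; start += 4) {
            int length = Math.min(4, data.length - start);
            System.out.println("split@" + start + " -> " + readSplit(data, start, length));
        }
    }
}
```

Whatever split size is chosen, concatenating the output of all splits yields each line exactly once, which is the record-oriented view that the individual tasks rely on.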
-