Class ValueAggregatorBaseDescriptor
java.lang.Object
org.apache.hadoop.mapreduce.lib.aggregate.ValueAggregatorBaseDescriptor
- All Implemented Interfaces:
ValueAggregatorDescriptor
- Direct Known Subclasses:
ValueAggregatorBaseDescriptor
@Public
@Stable
public class ValueAggregatorBaseDescriptor
extends Object
implements ValueAggregatorDescriptor
This class implements the functionality common to
implementations of the ValueAggregatorDescriptor interface.
-
Field Summary
Fields
Modifier and Type  Field
static final String  UNIQ_VALUE_COUNT
static final String  LONG_VALUE_SUM
static final String  DOUBLE_VALUE_SUM
static final String  VALUE_HISTOGRAM
static final String  LONG_VALUE_MAX
static final String  LONG_VALUE_MIN
static final String  STRING_VALUE_MAX
static final String  STRING_VALUE_MIN
Fields inherited from interface org.apache.hadoop.mapreduce.lib.aggregate.ValueAggregatorDescriptor
ONE, TYPE_SEPARATOR
-
Constructor Summary
Constructors
ValueAggregatorBaseDescriptor()
-
Method Summary
Modifier and Type  Method  Description
void  configure(Configuration conf)
    Get the input file name.
generateEntry(String type, String id, Text val)
generateKeyValPairs(Object key, Object val)
    Generate 1 or 2 aggregation-id/value pairs for the given key/value pair.
static ValueAggregator  generateValueAggregator(String type, long uniqCount)
-
Field Details
-
UNIQ_VALUE_COUNT
-
LONG_VALUE_SUM
-
DOUBLE_VALUE_SUM
-
VALUE_HISTOGRAM
-
LONG_VALUE_MAX
-
LONG_VALUE_MIN
-
STRING_VALUE_MAX
-
STRING_VALUE_MIN
-
inputFile
-
-
Constructor Details
-
ValueAggregatorBaseDescriptor
public ValueAggregatorBaseDescriptor()
-
-
Method Details
-
generateEntry
- Parameters:
type - the aggregation type
id - the aggregation id
val - the value associated with the id to be aggregated
- Returns:
- an Entry whose key is the aggregation id prefixed with the aggregation type.
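As a rough illustration of the key layout described above, the following stand-alone sketch builds an entry whose key is the aggregation type joined to the aggregation id. It does not use the Hadoop classes: plain String stands in for Text, the class name AggregationEntrySketch is hypothetical, and both the ":" separator and the "LongValueSum" type name are assumptions about the values of TYPE_SEPARATOR and LONG_VALUE_SUM.

```java
import java.util.AbstractMap;
import java.util.Map;

// Stand-alone sketch; not the Hadoop implementation.
final class AggregationEntrySketch {
    // Assumed value of ValueAggregatorDescriptor.TYPE_SEPARATOR.
    static final String TYPE_SEPARATOR = ":";

    // Builds an entry whose key is "<type><separator><id>" and whose value is val.
    static Map.Entry<String, String> generateEntry(String type, String id, String val) {
        return new AbstractMap.SimpleEntry<>(type + TYPE_SEPARATOR + id, val);
    }

    public static void main(String[] args) {
        // "LongValueSum" is an assumed aggregation type name.
        Map.Entry<String, String> e = generateEntry("LongValueSum", "record_count", "1");
        System.out.println(e.getKey() + " -> " + e.getValue());
    }
}
```

Keeping the type in the key is what would let a reducer or combiner pick the matching aggregator from the key alone.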
-
generateValueAggregator
- Parameters:
type - the aggregation type
uniqCount - the limit on the number of unique values to keep, if type is UNIQ_VALUE_COUNT
- Returns:
- a value aggregator of the given type.
-
generateKeyValPairs
Generate 1 or 2 aggregation-id/value pairs for the given key/value pair. The first id is of type LONG_VALUE_SUM, with "record_count" as its aggregation id. If the input is a file split, a second id of the same type is also generated, with the file name as its aggregation id. This counts both the total number of records in the input data and the number of records in each input file.
- Specified by:
generateKeyValPairs in interface ValueAggregatorDescriptor
- Parameters:
key - input key
val - input value
- Returns:
- a list of aggregation id/value pairs. An aggregation id encodes an aggregation type, which guides how the value is aggregated in the reduce/combiner phase of an Aggregate-based job.
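The record-counting behavior described above can be sketched without Hadoop as follows. This is a simplified stand-in, not the library code: String replaces Text, the class name RecordCountSketch is hypothetical, the input file name is passed as a parameter instead of being read from a field, and the "LongValueSum" type name and ":" separator are assumptions. A null file name models input that is not a file split.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-alone sketch of the 1-or-2 pair generation; not the Hadoop implementation.
final class RecordCountSketch {
    static final String LONG_VALUE_SUM = "LongValueSum"; // assumed type name
    static final String TYPE_SEPARATOR = ":";            // assumed separator
    static final String ONE = "1";                       // each record counts once

    // Returns one pair for the global record count, plus one per-file pair
    // when an input file name is available.
    static List<Map.Entry<String, String>> generateKeyValPairs(String inputFile) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        pairs.add(entry(LONG_VALUE_SUM, "record_count"));
        if (inputFile != null) {
            pairs.add(entry(LONG_VALUE_SUM, inputFile));
        }
        return pairs;
    }

    // Key is the aggregation id prefixed with the aggregation type.
    private static Map.Entry<String, String> entry(String type, String id) {
        return new AbstractMap.SimpleEntry<>(type + TYPE_SEPARATOR + id, ONE);
    }

    public static void main(String[] args) {
        // "part-00000" is a made-up file name for illustration.
        for (Map.Entry<String, String> e : generateKeyValPairs("part-00000")) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Because both pairs carry the long-sum type, summing the "1" values per key yields the total record count and the per-file record count.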
-
configure
Get the input file name.
- Specified by:
configure in interface ValueAggregatorDescriptor
- Parameters:
conf - a configuration object
-