Class BinaryPartitioner<V>

java.lang.Object
org.apache.hadoop.mapreduce.Partitioner<BinaryComparable,V>
org.apache.hadoop.mapreduce.lib.partition.BinaryPartitioner<V>
All Implemented Interfaces:
Configurable
Direct Known Subclasses:
BinaryPartitioner

@Public @Evolving public class BinaryPartitioner<V> extends Partitioner<BinaryComparable,V> implements Configurable

Partition BinaryComparable keys using a configurable part of the byte array returned by BinaryComparable.getBytes().

The subarray to be used for the partitioning can be defined by means of the following properties:

  • mapreduce.partition.binarypartitioner.left.offset: left offset in array (0 by default)
  • mapreduce.partition.binarypartitioner.right.offset: right offset in array (-1 by default)
As in Python, both negative and positive offsets are allowed, but the meaning is slightly different. For an array of length 5, for instance, the possible offsets are:

  +---+---+---+---+---+
  | B | B | B | B | B |
  +---+---+---+---+---+
    0   1   2   3   4
   -5  -4  -3  -2  -1
 
The first row of numbers gives the positive offsets 0...4 into the array; the second row gives the corresponding negative offsets. Contrary to Python, the specified subarray has bytes i and j as its first and last elements, respectively, when i and j are the left and right offsets. In other words, both ends of the range are inclusive.
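
The inclusive offset convention described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Hadoop implementation: the class OffsetSketch and its methods resolve and slice are hypothetical names, and a negative offset is assumed to count back from the end of the array, as in the diagram.

```java
import java.util.Arrays;

/** Hypothetical helper illustrating the inclusive offset convention; not part of Hadoop. */
final class OffsetSketch {

    /** Maps a possibly negative offset onto an index in [0, length). */
    static int resolve(int offset, int length) {
        // Assumption: a negative offset counts back from the end, so -1 is the last byte.
        return offset < 0 ? offset + length : offset;
    }

    /** Returns bytes[left..right], with both ends inclusive. */
    static byte[] slice(byte[] bytes, int left, int right) {
        int from = resolve(left, bytes.length);
        int to = resolve(right, bytes.length);
        // Arrays.copyOfRange excludes its upper bound, hence the + 1.
        return Arrays.copyOfRange(bytes, from, to + 1);
    }

    public static void main(String[] args) {
        byte[] key = {10, 20, 30, 40, 50};
        System.out.println(Arrays.toString(slice(key, 0, -1)));  // [10, 20, 30, 40, 50]
        System.out.println(Arrays.toString(slice(key, 1, 3)));   // [20, 30, 40]
        System.out.println(Arrays.toString(slice(key, -2, -1))); // [40, 50]
    }
}
```

With the default offsets (0 and -1) the whole array is selected, whereas the Python slice bytes[0:-1] would drop the last byte.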

For Hadoop programs written in Java, it is advisable to set the offsets with one of the static convenience methods setOffsets, setLeftOffset, or setRightOffset, documented under Method Details.
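
A minimal driver sketch showing how the offsets might be set from Java. It assumes the Hadoop MapReduce client libraries are on the classpath; the class name BinaryPartitionerSetup, the job name, and the chosen offsets are illustrative only, and the rest of the job setup is omitted.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.partition.BinaryPartitioner;

public class BinaryPartitionerSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Partition on bytes 0 through 3 of each key (both ends inclusive).
        // The offsets are set before the Job is created, because the Job
        // works on its own copy of the configuration.
        BinaryPartitioner.setOffsets(conf, 0, 3);

        Job job = Job.getInstance(conf, "binary-partitioner-example"); // illustrative job name
        job.setPartitionerClass(BinaryPartitioner.class);

        // Mapper, reducer, key/value classes, and input/output paths are omitted here.
    }
}
```

The map output key type must be a BinaryComparable (for example Text or BytesWritable) for this partitioner to apply.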

  • Field Details

  • Constructor Details

    • BinaryPartitioner

      public BinaryPartitioner()
  • Method Details

    • setOffsets

      public static void setOffsets(Configuration conf, int left, int right)
      Set the subarray to be used for partitioning to bytes[left:(right+1)] in Python syntax.
      Parameters:
      conf - configuration object
      left - left Python-style offset
      right - right Python-style offset
    • setLeftOffset

      public static void setLeftOffset(Configuration conf, int offset)
      Set the subarray to be used for partitioning to bytes[offset:] in Python syntax.
      Parameters:
      conf - configuration object
      offset - left Python-style offset
    • setRightOffset

      public static void setRightOffset(Configuration conf, int offset)
      Set the subarray to be used for partitioning to bytes[:(offset+1)] in Python syntax.
      Parameters:
      conf - configuration object
      offset - right Python-style offset
    • setConf

      public void setConf(Configuration conf)
      Description copied from interface: Configurable
      Set the configuration to be used by this object.
      Specified by:
      setConf in interface Configurable
      Parameters:
      conf - configuration to be used
    • getConf

      public Configuration getConf()
      Description copied from interface: Configurable
      Return the configuration used by this object.
      Specified by:
      getConf in interface Configurable
      Returns:
      Configuration
    • getPartition

      public int getPartition(BinaryComparable key, V value, int numPartitions)
      Use (the specified slice of the array returned by) BinaryComparable.getBytes() to partition.
      Specified by:
      getPartition in class Partitioner<BinaryComparable,V>
      Parameters:
      key - the key to be partitioned.
      value - the entry value.
      numPartitions - the total number of partitions.
      Returns:
      the partition number for the key.
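
The general shape of slice-based partitioning can be sketched as follows. This is an assumption-laden illustration rather than the Hadoop code: PartitionSketch, hash, and partitionFor are hypothetical names, and the hash function is a simple stand-in that may differ from the one Hadoop uses. What the sketch shows is that only the selected bytes influence the result, so keys that agree on the slice land in the same partition.

```java
/** Hypothetical sketch of partitioning on an inclusive byte range; not part of Hadoop. */
final class PartitionSketch {

    /** Stand-in hash over bytes[offset .. offset + length - 1]. */
    static int hash(byte[] bytes, int offset, int length) {
        int h = 1;
        for (int i = offset; i < offset + length; i++) {
            h = 31 * h + bytes[i];
        }
        return h;
    }

    /** Picks a partition in [0, numPartitions) from bytes[left..right], both ends inclusive. */
    static int partitionFor(byte[] bytes, int left, int right, int numPartitions) {
        int length = bytes.length;
        // Assumption: negative offsets count back from the end of the array.
        int from = left < 0 ? left + length : left;
        int to = right < 0 ? right + length : right;
        int h = hash(bytes, from, to - from + 1);
        // Masking with Integer.MAX_VALUE clears the sign bit so the remainder is non-negative.
        return (h & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4};
        byte[] b = {1, 2, 9, 9};
        // Both keys share their first two bytes, so they map to the same partition.
        System.out.println(partitionFor(a, 0, 1, 7) == partitionFor(b, 0, 1, 7)); // true
    }
}
```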