Class KeyFieldBasedComparator<K,V>

java.lang.Object
org.apache.hadoop.io.WritableComparator
org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedComparator<K,V>
All Implemented Interfaces:
Comparator, Configurable, RawComparator
Direct Known Subclasses:
KeyFieldBasedComparator

@Public @Stable public class KeyFieldBasedComparator<K,V> extends WritableComparator implements Configurable
This comparator implementation provides a subset of the features provided by Unix/GNU sort. The supported features are:
    -n (sort numerically)
    -r (reverse the result of the comparison)
    -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the field to use, and c is the number of the first character from the beginning of the field.
Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field). opts are ordering options (any of 'nr' as described above). The fields in the key are assumed to be separated by MRJobConfig.MAP_OUTPUT_KEY_FIELD_SEPARATOR.
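The key specification grammar described above can be illustrated with a small parser. This is only a sketch in plain Java, not Hadoop's own parsing code: the class name KeySpecSketch and its fields are hypothetical, an absent pos2 is represented as end field 0 by assumption, and options found on either position are merged by assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: parses one "-k pos1[,pos2]" spec.
final class KeySpecSketch {
    final int startField;   // f of pos1 (1-based)
    final int startChar;    // c of pos1, defaults to 1 (beginning of the field)
    final int endField;     // f of pos2, or 0 when pos2 is absent (sketch assumption)
    final int endChar;      // c of pos2, defaults to 0 (end of the field)
    final boolean numeric;  // 'n' option present
    final boolean reverse;  // 'r' option present

    // pos = f[.c][opts], where opts is any combination of 'n' and 'r'.
    private static final Pattern SPEC = Pattern.compile(
        "-k\\s*(\\d+)(?:\\.(\\d+))?([nr]*)(?:,(\\d+)(?:\\.(\\d+))?([nr]*))?");

    private KeySpecSketch(int startField, int startChar, int endField, int endChar,
                          boolean numeric, boolean reverse) {
        this.startField = startField;
        this.startChar = startChar;
        this.endField = endField;
        this.endChar = endChar;
        this.numeric = numeric;
        this.reverse = reverse;
    }

    static KeySpecSketch parse(String spec) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad key spec: " + spec);
        }
        int startField = Integer.parseInt(m.group(1));
        int startChar = m.group(2) == null ? 1 : Integer.parseInt(m.group(2));
        int endField = m.group(4) == null ? 0 : Integer.parseInt(m.group(4));
        int endChar = m.group(5) == null ? 0 : Integer.parseInt(m.group(5));
        // Merge the option letters attached to pos1 and pos2 (sketch assumption).
        String opts = m.group(3) + (m.group(6) == null ? "" : m.group(6));
        return new KeySpecSketch(startField, startChar, endField, endChar,
                                 opts.contains("n"), opts.contains("r"));
    }

    public static void main(String[] args) {
        KeySpecSketch k = parse("-k2,2nr");
        System.out.println(k.startField + "." + k.startChar + " to "
            + k.endField + "." + k.endChar
            + " numeric=" + k.numeric + " reverse=" + k.reverse);
    }
}
```

For example, "-k1.2,3" would be read as: start at character 2 of field 1 and end at the last character of field 3, with no ordering options.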
  • Field Details

    • COMPARATOR_OPTIONS

      public static String COMPARATOR_OPTIONS
  • Constructor Details

    • KeyFieldBasedComparator

      public KeyFieldBasedComparator()
  • Method Details

    • setConf

      public void setConf(Configuration conf)
      Description copied from interface: Configurable
      Set the configuration to be used by this object.
      Specified by:
      setConf in interface Configurable
      Overrides:
      setConf in class WritableComparator
      Parameters:
      conf - configuration to be used
    • getConf

      public Configuration getConf()
      Description copied from interface: Configurable
      Return the configuration used by this object.
      Specified by:
      getConf in interface Configurable
      Overrides:
      getConf in class WritableComparator
      Returns:
      Configuration
    • compare

      public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)
      Description copied from class: WritableComparator
      Optimization hook. Override this to make SequenceFile.Sorter run faster.

      The default implementation reads the data into two WritableComparables (using Writable.readFields(DataInput)), then calls WritableComparator.compare(WritableComparable,WritableComparable).

      Specified by:
      compare in interface RawComparator<K>
      Overrides:
      compare in class WritableComparator
      Parameters:
      b1 - The first byte array.
      s1 - The starting index in b1 of the object under comparison.
      l1 - The length of the object under comparison in b1.
      b2 - The second byte array.
      s2 - The starting index in b2 of the object under comparison.
      l2 - The length of the object under comparison in b2.
      Returns:
      An integer result of the comparison.
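To make the (array, start, length) parameter convention concrete, the following sketch compares two byte ranges lexicographically without deserializing them. It is plain Java and does not reproduce this class's field-aware logic; the class name RawCompareSketch is hypothetical, and treating bytes as unsigned values is an assumption of the sketch.

```java
// Hypothetical illustration of a raw, range-based comparison.
final class RawCompareSketch {

    // Compares b1[s1 .. s1+l1) with b2[s2 .. s2+l2) byte by byte,
    // treating each byte as an unsigned value (0..255).
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int end1 = s1 + l1;
        int end2 = s2 + l2;
        for (int i = s1, j = s2; i < end1 && j < end2; i++, j++) {
            int a = b1[i] & 0xff;
            int b = b2[j] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return l1 - l2;
    }

    public static void main(String[] args) {
        byte[] left = {'x', 'x', 'a', 'b'};
        byte[] right = {'a', 'c'};
        // Compare "ab" (offset 2, length 2 in left) with "ac" (offset 0, length 2).
        int result = compareBytes(left, 2, 2, right, 0, 2);
        System.out.println(Integer.signum(result));
    }
}
```

A negative result means the first range sorts before the second, zero means the ranges are equal, and a positive result means the first range sorts after the second.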
    • setKeyFieldComparatorOptions

      public static void setKeyFieldComparatorOptions(Job job, String keySpec)
      Set the KeyFieldBasedComparator options used to compare keys.
      Parameters:
      keySpec - the key specification of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field). opts are ordering options. The supported options are: -n (sort numerically) and -r (reverse the result of the comparison).
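As an illustration of what a key specification such as "-k2,2nr" asks for, the sketch below orders tab-separated keys by their second field, numerically and in reverse. It is plain Java that mimics the intended ordering rather than calling Hadoop; the class FieldSortSketch, the tab separator, and the sample keys are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in that mimics the ordering requested by "-k2,2nr".
final class FieldSortSketch {

    // Assumed separator; a real job reads it from the configuration.
    static final String SEPARATOR = "\t";

    // Returns field f (numbered from 1), or "" when the key has fewer fields.
    static String field(String key, int f) {
        String[] parts = key.split(Pattern.quote(SEPARATOR), -1);
        return f <= parts.length ? parts[f - 1] : "";
    }

    // Field 2 compared numerically ('n'), with the result reversed ('r').
    // The sketch assumes field 2 always holds a parseable number.
    static final Comparator<String> BY_FIELD2_NUMERIC_REVERSED =
        Comparator.comparingDouble((String key) -> Double.parseDouble(field(key, 2)))
                  .reversed();

    static List<String> sorted(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(BY_FIELD2_NUMERIC_REVERSED);
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a\t10", "b\t9", "c\t100");
        for (String key : sorted(keys)) {
            System.out.println(key);
        }
    }
}
```

With the sample keys, numeric comparison places 100 ahead of 10 and 9, whereas a plain textual comparison of the same field would rank "9" highest. In a job, the equivalent request would be the string "-k2,2nr" passed as keySpec.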
    • getKeyFieldComparatorOption

      public static String getKeyFieldComparatorOption(JobContext job)
      Get the KeyFieldBasedComparator options.