Package org.apache.datasketches.frequencies

This package contains implementations of the frequent-items algorithm described in the paper https://arxiv.org/abs/1705.07001.
  • Class
    Description
  • ErrorType
    Specifies one of two types of error regions of the statistical classification Confusion Matrix that can be excluded from a returned sample of Frequent Items.
  • ItemsSketch<T>
    This sketch is based on the paper https://arxiv.org/abs/1705.07001 ("A High-Performance Algorithm for Identifying Frequent Items in Data Streams" by Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler) and is useful for tracking approximate frequencies of items of type <T> with optional associated counts (<T> item, long count) that are members of a multiset of such items.
  • ItemsSketch.Row<T>
    Row class that defines the return values from a getFrequentItems query.
  • LongsSketch
    This sketch is based on the paper https://arxiv.org/abs/1705.07001 ("A High-Performance Algorithm for Identifying Frequent Items in Data Streams" by Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler) and is useful for tracking approximate frequencies of long items with optional associated counts (long item, long count) that are members of a multiset of such items.
  • LongsSketch.Row
    Row class that defines the return values from a getFrequentItems query.
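
The descriptions above mention approximate frequencies, (item, count) updates, and two error regions that can be excluded from a query result. The following is a minimal sketch of how those ideas fit together, using a textbook weighted Misra-Gries summary. It is an assumption-laden illustration, not the algorithm from the paper and not this package's API: the class SimpleFrequentLongs, the enum ErrorRegion, and every method name are hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: a textbook weighted Misra-Gries summary,
// NOT the DataSketches implementation. All names here are hypothetical.
final class SimpleFrequentLongs {

    // Local stand-in for the two error regions a query can exclude.
    enum ErrorRegion { NO_FALSE_POSITIVES, NO_FALSE_NEGATIVES }

    private final int maxCounters;
    private final Map<Long, Long> counters = new HashMap<>();
    // Total amount subtracted from every counter so far; this bounds the error.
    private long offset = 0L;

    SimpleFrequentLongs(int maxCounters) {
        if (maxCounters < 1) {
            throw new IllegalArgumentException("maxCounters must be >= 1");
        }
        this.maxCounters = maxCounters;
    }

    // Adds `count` occurrences of `item` to the summary.
    void update(long item, long count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        counters.merge(item, count, Long::sum);
        if (counters.size() > maxCounters) {
            // Subtract the smallest counter from all counters and drop the zeros.
            long min = Collections.min(counters.values());
            offset += min;
            Iterator<Map.Entry<Long, Long>> it = counters.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Long, Long> e = it.next();
                long reduced = e.getValue() - min;
                if (reduced == 0) {
                    it.remove();
                } else {
                    e.setValue(reduced);
                }
            }
        }
    }

    // The true frequency is never below this value.
    long lowerBound(long item) {
        return counters.getOrDefault(item, 0L);
    }

    // The true frequency is never above this value.
    long upperBound(long item) {
        return lowerBound(item) + offset;
    }

    // Returns tracked items, in ascending order, whose bound exceeds the threshold.
    // NO_FALSE_POSITIVES tests the lower bound; NO_FALSE_NEGATIVES tests the upper bound.
    List<Long> frequentItems(long threshold, ErrorRegion region) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, Long> e : counters.entrySet()) {
            long lower = e.getValue();
            long bound = (region == ErrorRegion.NO_FALSE_POSITIVES) ? lower : lower + offset;
            if (bound > threshold) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

As a usage example, with maxCounters = 2 and the updates (1, 5), (2, 3), (3, 1), (1, 2), item 1 has a true frequency of 7 and the summary brackets it between 6 and 7. A query with threshold 2 returns only item 1 under NO_FALSE_POSITIVES, and items 1 and 2 under NO_FALSE_NEGATIVES, because item 2's upper bound of 3 exceeds the threshold while its lower bound of 2 does not.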