|
| | req_sketch (uint16_t k, bool hra=true, const Comparator &comparator=Comparator(), const Allocator &allocator=Allocator()) |
| | Constructor. More...
|
| |
| | req_sketch (const req_sketch &other) |
| | Copy constructor. More...
|
| |
| | req_sketch (req_sketch &&other) noexcept |
| | Move constructor. More...
|
| |
| req_sketch & | operator= (const req_sketch &other) |
| | Copy assignment. More...
|
| |
| req_sketch & | operator= (req_sketch &&other) |
| | Move assignment. More...
|
| |
| template<typename TT , typename CC , typename AA > |
| | req_sketch (const req_sketch< TT, CC, AA > &other, const Comparator &comparator=Comparator(), const Allocator &allocator=Allocator()) |
| | Type converting constructor. More...
|
| |
| uint16_t | get_k () const |
| | Returns configured parameter K. More...
|
| |
| bool | is_HRA () const |
| | Returns configured parameter High Rank Accuracy. More...
|
| |
| bool | is_empty () const |
| | Returns true if this sketch is empty. More...
|
| |
| uint64_t | get_n () const |
| | Returns the length of the input stream. More...
|
| |
| uint32_t | get_num_retained () const |
| | Returns the number of retained items in the sketch. More...
|
| |
| bool | is_estimation_mode () const |
| | Returns true if this sketch is in estimation mode. More...
|
| |
| template<typename FwdT > |
| void | update (FwdT &&item) |
| | Updates this sketch with the given data item. More...
|
| |
| template<typename FwdSk > |
| void | merge (FwdSk &&other) |
| | Merges another sketch into this one. More...
|
| |
| const T & | get_min_item () const |
| | Returns the min item of the stream. More...
|
| |
| const T & | get_max_item () const |
| | Returns the max item of the stream. More...
|
| |
| Comparator | get_comparator () const |
| | Returns an instance of the comparator for this sketch. More...
|
| |
| Allocator | get_allocator () const |
| | Returns an instance of the allocator for this sketch. More...
|
| |
| double | get_rank (const T &item, bool inclusive=true) const |
| | Returns an approximation to the normalized rank of the given item from 0 to 1 inclusive. More...
|
| |
| vector_double | get_PMF (const T *split_points, uint32_t size, bool inclusive=true) const |
| | Returns an approximation to the Probability Mass Function (PMF) of the input stream given a set of split points (items). More...
|
| |
| vector_double | get_CDF (const T *split_points, uint32_t size, bool inclusive=true) const |
| | Returns an approximation to the Cumulative Distribution Function (CDF), which is the cumulative analog of the PMF, of the input stream given a set of split points (items). More...
|
| |
| quantile_return_type | get_quantile (double rank, bool inclusive=true) const |
| | Returns an approximate quantile of the given normalized rank. More...
|
| |
| double | get_rank_lower_bound (double rank, uint8_t num_std_dev) const |
| | Returns an approximate lower bound of the given normalized rank. More...
|
| |
| double | get_rank_upper_bound (double rank, uint8_t num_std_dev) const |
| | Returns an approximate upper bound of the given normalized rank. More...
|
| |
| template<typename TT = T, typename SerDe = serde<T>, typename std::enable_if< std::is_arithmetic< TT >::value, int >::type = 0> |
| size_t | get_serialized_size_bytes (const SerDe &sd=SerDe()) const |
| | Computes size needed to serialize the current state of the sketch. More...
|
| |
| template<typename TT = T, typename SerDe = serde<T>, typename std::enable_if<!std::is_arithmetic< TT >::value, int >::type = 0> |
| size_t | get_serialized_size_bytes (const SerDe &sd=SerDe()) const |
| | Computes size needed to serialize the current state of the sketch. More...
|
| |
| template<typename SerDe = serde<T>> |
| void | serialize (std::ostream &os, const SerDe &sd=SerDe()) const |
| | This method serializes the sketch into a given stream in a binary form. More...
|
| |
| template<typename SerDe = serde<T>> |
| vector_bytes | serialize (unsigned header_size_bytes=0, const SerDe &sd=SerDe()) const |
| | This method serializes the sketch as a vector of bytes. More...
|
| |
| string< Allocator > | to_string (bool print_levels=false, bool print_items=false) const |
| | Prints a summary of the sketch. More...
|
| |
| const_iterator | begin () const |
| | Iterator pointing to the first item in the sketch. More...
|
| |
| const_iterator | end () const |
| | Iterator pointing to the past-the-end item in the sketch. More...
|
| |
| quantiles_sorted_view< T, Comparator, Allocator > | get_sorted_view () const |
| | Gets the sorted view of this sketch. More...
|
| |
|
| static double | get_RSE (uint16_t k, double rank, bool hra, uint64_t n) |
| | Returns an a priori estimate of relative standard error (RSE, expressed as a number in [0,1]). More...
|
| |
| template<typename SerDe = serde<T>> |
| static req_sketch | deserialize (std::istream &is, const SerDe &sd=SerDe(), const Comparator &comparator=Comparator(), const Allocator &allocator=Allocator()) |
| | This method deserializes a sketch from a given stream. More...
|
| |
| template<typename SerDe = serde<T>> |
| static req_sketch | deserialize (const void *bytes, size_t size, const SerDe &sd=SerDe(), const Comparator &comparator=Comparator(), const Allocator &allocator=Allocator()) |
| | This method deserializes a sketch from a given array of bytes. More...
|
| |
template<typename T, typename Comparator = std::less<T>, typename Allocator = std::allocator<T>>
class datasketches::req_sketch< T, Comparator, Allocator >
Relative Error Quantiles Sketch.
This is an implementation based on the paper "Relative Error Streaming Quantiles" by Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, Pavel VeselĂ˝, and loosely derived from a Python prototype written by Pavel VeselĂ˝.
Reference: https://arxiv.org/abs/2004.01668
This implementation differs from the algorithm described in the paper in the following:
-
The algorithm requires no upper bound on the stream length. Instead, each relative-compactor counts the number of compaction operations performed so far (via variable state). Initially, the relative-compactor starts with INIT_NUMBER_OF_SECTIONS. Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double numSections. Note that after merging the sketch with another one variable state may not correspond to the number of compactions performed at a particular level, however, since the state variable never exceeds the number of compactions, the guarantees of the sketch remain valid.
-
The size of each section (variable k and section_size in the code and parameter k in the paper) is initialized with a number set by the user via variable k. When the number of sections doubles, we decrease section_size by a factor of sqrt(2). This is applied at each level separately. Thus, when we double the number of sections, the nominal compactor size increases by a factor of approx. sqrt(2) (+/- rounding).
-
The merge operation here does not perform "special compactions", which are used in the paper to allow for a tight mathematical analysis of the sketch.
This implementation provides a number of capabilities not discussed in the paper or provided in the Python prototype.
-
The Python prototype only implemented high accuracy for low ranks. This implementation provides the user with the ability to choose either high rank accuracy or low rank accuracy at the time of sketch construction.
-
The Python prototype only implemented a comparison criterion of "INCLUSIVE". This implementation allows the user to use both the "INCLUSIVE" criterion and the "EXCLUSIVE" criterion.
-
This implementation provides extensive debug visibility into the operation of the sketch with two levels of detail output. This is not only useful for debugging, but is a powerful tool to help users understand how the sketch works.