datasketches-cpp
|
DataSketches namespace. More...
Namespaces | |
namespace | cpc_constants |
CPC constants. | |
namespace | ebpps_constants |
EBPPS sketch constants. | |
namespace | kll_constants |
KLL sketch constants. | |
namespace | quantiles_constants |
Constants for Quantiles sketch. | |
namespace | req_constants |
REQ sketch constants. | |
namespace | theta_constants |
Theta constants. | |
namespace | var_opt_constants |
VarOpt sketch constants. | |
Classes | |
class | array_tuple_a_not_b |
array tuple A-not-B More... | |
class | array_tuple_intersection |
array tuple intersection More... | |
class | array_tuple_union |
array tuple union More... | |
class | base_theta_sketch_alloc |
Abstract base class for Theta sketch. More... | |
class | bounds_binomial_proportions |
Confidence intervals for binomial proportions. More... | |
class | bounds_on_ratios_in_sampled_sets |
Bounds on ratios in sampled sets. More... | |
class | bounds_on_ratios_in_theta_sketched_sets |
Bounds on ratios in Theta sketched sets. More... | |
class | compact_array_tuple_sketch |
Compact array tuple sketch. More... | |
class | compact_theta_sketch_alloc |
Compact Theta sketch. More... | |
class | compact_tuple_sketch |
Compact Tuple sketch. More... | |
class | count_min_sketch |
C++ implementation of the CountMin sketch data structure of Cormode and Muthukrishnan. More... | |
class | cpc_sketch_alloc |
High performance C++ implementation of Compressed Probabilistic Counting (CPC) Sketch. More... | |
class | cpc_union_alloc |
High performance C++ implementation of Compressed Probabilistic Counting (CPC) Union. More... | |
struct | default_array_tuple_union_policy |
default array tuple union policy More... | |
class | default_array_tuple_update_policy |
default array tuple update policy More... | |
class | density_sketch |
Density sketch. More... | |
class | ebpps_sketch |
An implementation of an Exact and Bounded Sampling Proportional to Size sketch. More... | |
class | frequent_items_sketch |
Frequent Items sketch. More... | |
class | hll_union_alloc |
This performs union operations for HLL sketches. More... | |
class | HllSketchImpl |
This is a high performance implementation of Phillipe Flajolet's HLL sketch but with significantly improved error behavior. More... | |
class | jaccard_similarity_base |
Base class for Jaccard similarity. More... | |
class | kll_sketch |
Implementation of a very compact quantiles sketch with lazy compaction scheme and nearly optimal accuracy per retained item. More... | |
class | kolmogorov_smirnov |
Kolmogorov-Smirnov test for KLL or Quantiles sketches. More... | |
class | quantiles_sketch |
This is a stochastic streaming sketch that enables near-real time analysis of the approximate distribution from a very large stream in a single pass. More... | |
class | quantiles_sorted_view |
Sorted view for quantiles sketches (REQ, KLL and Quantiles) More... | |
class | req_sketch |
Relative Error Quantiles Sketch. More... | |
struct | serde |
Interface for serializing and deserializing items. More... | |
struct | serde< std::string > |
serde for std::string items. More... | |
struct | serde< T, typename std::enable_if< std::is_arithmetic< T >::value >::type > |
serde for all fixed-size arithmetic types (int and float of different sizes). More... | |
class | theta_a_not_b_alloc |
Theta A-not-B (set difference). More... | |
class | theta_base_builder |
Theta base builder. More... | |
class | theta_intersection_alloc |
Theta intersection. More... | |
class | theta_sketch_alloc |
Base class for the Theta Sketch, a generalization of the Kth Minimum Value (KMV) sketch. More... | |
class | theta_union_alloc |
Theta Union. More... | |
class | tuple_a_not_b |
tuple A-not-B More... | |
class | tuple_base_builder |
Tuple base builder. More... | |
class | tuple_intersection |
Tuple intersection. More... | |
class | tuple_sketch |
Base class for Tuple sketch. More... | |
class | tuple_union |
Tuple Union. More... | |
class | update_array_tuple_sketch |
Update array tuple sketch. More... | |
class | update_theta_sketch_alloc |
Update Theta sketch. More... | |
class | update_tuple_sketch |
Update Tuple sketch. More... | |
class | var_opt_sketch |
This sketch samples data from a stream of items. More... | |
class | var_opt_union |
Provides a unioning operation over var_opt_sketch objects. More... | |
class | wrapped_compact_theta_sketch_alloc |
Wrapped Compact Theta sketch. More... | |
Typedefs | |
using | cpc_sketch = cpc_sketch_alloc< std::allocator< uint8_t > > |
CPC sketch alias with default allocator. | |
using | cpc_union = cpc_union_alloc< std::allocator< uint8_t > > |
CPC union alias with default allocator. | |
using | hll_sketch = hll_sketch_alloc< std::allocator< uint8_t > > |
HLL sketch alias with default allocator. | |
using | hll_union = hll_union_alloc< std::allocator< uint8_t > > |
HLL union alias with default allocator. | |
template<typename Allocator = std::allocator<uint64_t>> | |
using | theta_jaccard_similarity_alloc = jaccard_similarity_base< theta_union_alloc< Allocator >, theta_intersection_alloc< Allocator >, trivial_extract_key > |
Theta Jaccard similarity alias. | |
using | theta_jaccard_similarity = theta_jaccard_similarity_alloc< std::allocator< uint64_t > > |
Theta Jaccard similarity alias with default allocator. | |
using | theta_sketch = theta_sketch_alloc< std::allocator< uint64_t > > |
Theta sketch alias with default allocator. | |
using | update_theta_sketch = update_theta_sketch_alloc< std::allocator< uint64_t > > |
Update Theta sketch alias with default allocator. | |
using | compact_theta_sketch = compact_theta_sketch_alloc< std::allocator< uint64_t > > |
Compact Theta sketch alias with default allocator. | |
using | wrapped_compact_theta_sketch = wrapped_compact_theta_sketch_alloc< std::allocator< uint64_t > > |
Wrapped Compact Theta sketch alias with default allocator. | |
using | default_array_of_doubles_update_policy = default_array_tuple_update_policy< array< double > > |
convenience alias with default allocator, default policy for update_array_of_doubles_sketch | |
using | update_array_of_doubles_sketch = update_array_tuple_sketch< array< double > > |
convenience alias with default allocator, equivalent to ArrayOfDoublesUpdatableSketch in Java | |
using | compact_array_of_doubles_sketch = compact_array_tuple_sketch< array< double > > |
convenience alias with default allocator, equivalent to ArrayOfDoublesCompactSketch in Java | |
using | default_array_of_doubles_union_policy = default_array_tuple_union_policy< array< double > > |
convenience alias, default policy for array_of_doubles_union | |
using | array_of_doubles_union = array_tuple_union< array< double > > |
convenience alias with default allocator, equivalent to ArrayOfDoublesUnion in Java | |
template<typename Policy > | |
using | array_of_doubles_intersection = array_tuple_intersection< array< double >, Policy > |
convenience alias with default allocator, equivalent to ArrayOfDoublesIntersection in Java no default policy since it is not clear in general | |
using | array_of_doubles_a_not_b = array_tuple_a_not_b< array< double > > |
convenience alias with default allocator, equivalent to ArrayOfDoublesAnotB in Java | |
template<typename Summary , typename IntersectionPolicy , typename UnionPolicy = default_tuple_union_policy<Summary>, typename Allocator = std::allocator<Summary>> | |
using | tuple_jaccard_similarity = jaccard_similarity_base< tuple_union< Summary, UnionPolicy, Allocator >, tuple_intersection< Summary, IntersectionPolicy, Allocator >, pair_extract_key< uint64_t, Summary > > |
Tuple Jaccard similarity alias. | |
Enumerations | |
enum | target_hll_type { HLL_4 , HLL_6 , HLL_8 } |
Specifies the target type of HLL sketch to be created. More... | |
enum | frequent_items_error_type { NO_FALSE_POSITIVES , NO_FALSE_NEGATIVES } |
Frequent items error type. More... | |
Functions | |
template<typename A > | |
void | cpc_init () |
Allocation and initialization of global decompression (decoding) tables. | |
DataSketches namespace.
enum target_hll_type |
Specifies the target type of HLL sketch to be created.
It is a target in that the actual allocation of the HLL array is deferred until sufficient number of items have been received by the warm-up phases.
These three target types are isomorphic representations of the same underlying HLL algorithm. Thus, given the same value of lg_config_k and the same input, all three HLL target types will produce identical estimates and have identical error distributions.
The memory (and also the serialization) of the sketch during this early warmup phase starts out very small (8 bytes, when empty) and then grows in increments of 4 bytes as required until the full HLL array is allocated. This transition point occurs at about 10% of K for sketches where lg_config_k is > 8.
HLL_8 This uses an 8-bit byte per HLL bucket. It is generally the fastest in terms of update time, but has the largest storage footprint of about K bytes.
HLL_6 This uses a 6-bit field per HLL bucket. It is the generally the next fastest in terms of update time with a storage footprint of about 3/4 * K bytes.
Enumerator | |
---|---|
HLL_4 | 4 bits per entry (most compact, size may vary) |
HLL_6 | 6 bits per entry (fixed size) |
HLL_8 | 8 bits per entry (fastest, fixed size) |
void cpc_init | ( | ) |
Allocation and initialization of global decompression (decoding) tables.
Call this before anything else if you want to control the initialization time. For instance, to have this happen outside of a transaction context. Otherwise initialization happens on the first use (serialization or deserialization). It is safe to call more than once assuming no race conditions. This is not thread safe! Neither is the rest of the library.