datasketches-cpp
This performs union operations for HLL sketches.
#include <hll.hpp>
Public Member Functions
hll_union_alloc (uint8_t lg_max_k, const A &allocator=A())
Construct an hll_union operator with the given maximum log2 of k.
double get_estimate () const
Returns the current cardinality estimate.
double get_composite_estimate () const
Returns the composite estimate. This is less accurate than get_estimate() and is used automatically when the union has gone through union operations where the more accurate HIP estimator cannot be used.
double get_lower_bound (uint8_t num_std_dev) const
Returns the approximate lower error bound given the specified number of standard deviations.
double get_upper_bound (uint8_t num_std_dev) const
Returns the approximate upper error bound given the specified number of standard deviations.
uint8_t get_lg_config_k () const
Returns the union's configured lg_k value.
target_hll_type get_target_type () const
Returns the union's target HLL mode (from target_hll_type).
bool is_empty () const
Indicates if the union is currently empty.
void reset ()
Resets the union to an empty state in coupon collection mode.
hll_sketch_alloc< A > get_result (target_hll_type tgt_type=HLL_4) const
Returns the result of this union operator as a sketch of the specified target_hll_type.
void update (const hll_sketch_alloc< A > &sketch)
Update this union operator with the given sketch.
void update (hll_sketch_alloc< A > &&sketch)
Update this union operator with the given temporary sketch.
void update (const std::string &datum)
Present the given std::string as a potential unique item.
void update (uint64_t datum)
Present the given unsigned 64-bit integer as a potential unique item.
void update (uint32_t datum)
Present the given unsigned 32-bit integer as a potential unique item.
void update (uint16_t datum)
Present the given unsigned 16-bit integer as a potential unique item.
void update (uint8_t datum)
Present the given unsigned 8-bit integer as a potential unique item.
void update (int64_t datum)
Present the given signed 64-bit integer as a potential unique item.
void update (int32_t datum)
Present the given signed 32-bit integer as a potential unique item.
void update (int16_t datum)
Present the given signed 16-bit integer as a potential unique item.
void update (int8_t datum)
Present the given signed 8-bit integer as a potential unique item.
void update (double datum)
Present the given 64-bit floating point value as a potential unique item.
void update (float datum)
Present the given 32-bit floating point value as a potential unique item.
void update (const void *data, size_t length_bytes)
Present the given data array as a potential unique item.
Static Public Member Functions
static double get_rel_err (bool upper_bound, bool unioned, uint8_t lg_config_k, uint8_t num_std_dev)
Gets the current (approximate) Relative Error (RE) asymptotic values given several parameters.
This performs union operations for HLL sketches.
This union operator is configured with lg_max_k instead of the usual lg_config_k.
It permits the unioning of sketches with different values of lg_config_k. Be aware that the accuracy of the sketch returned at the end of the unioning process is determined by the smaller of lg_max_k and the smallest lg_config_k among the sketches that the union operator has seen.
It also permits unioning of any of the three target hll_sketch types (HLL_4, HLL_6, and HLL_8).
Although the API for this union operator parallels many of the methods of hll_sketch, its behavior differs in two fundamental ways.
First, the user cannot specify the target_hll_type as an input parameter when constructing the union. Instead, it is specified for the sketch returned by get_result().
Second, the internal effective value of log-base-2 of k for the union operation can change dynamically, based on the smallest lg_config_k that the union operation has seen.
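The second point can be illustrated with a small standalone model. This is only a sketch of the bookkeeping described above, not the library's implementation: effective_lg_k is a hypothetical helper that does not exist in datasketches-cpp, and it assumes that every input sketch contributes its lg_config_k to the minimum.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>

// Hypothetical helper, NOT part of datasketches-cpp. It models only the rule
// stated in the text: the effective lg_k is the smaller of lg_max_k and the
// smallest lg_config_k among the sketches presented to the union.
inline std::uint8_t effective_lg_k(std::uint8_t lg_max_k,
                                   std::initializer_list<std::uint8_t> seen_lg_config_k) {
  std::uint8_t lg_k = lg_max_k;  // the union never exceeds its configured maximum
  for (std::uint8_t lg : seen_lg_config_k) {
    lg_k = std::min(lg_k, lg);   // a smaller input sketch lowers the effective value
  }
  return lg_k;
}
```

For example, under this model a union built with lg_max_k = 12 that has seen sketches with lg_config_k of 14, 10, and 11 ends up with an effective lg_k of 10, while a union that has only seen larger sketches stays at 12.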
Authors: Jon Malkin, Lee Rhodes, Kevin Lang
explicit hll_union_alloc (uint8_t lg_max_k, const A &allocator=A())
Construct an hll_union operator with the given maximum log2 of k.
lg_max_k | The maximum size, in log2, of k. The value must be between 7 and 21, inclusive. |
allocator | instance of an Allocator |
double get_estimate () const
Returns the current cardinality estimate.
double get_composite_estimate () const
Returns the composite estimate. This is less accurate than get_estimate() and is used automatically when the union has gone through union operations where the more accurate HIP estimator cannot be used.
This method is public only for error characterization software that exists in separate packages; it is not intended for normal use.
double get_lower_bound (uint8_t num_std_dev) const
Returns the approximate lower error bound given the specified number of standard deviations.
num_std_dev | Number of standard deviations, an integer from the set {1, 2, 3}. |
double get_upper_bound (uint8_t num_std_dev) const
Returns the approximate upper error bound given the specified number of standard deviations.
num_std_dev | Number of standard deviations, an integer from the set {1, 2, 3}. |
uint8_t get_lg_config_k () const
Returns the union's configured lg_k value.
target_hll_type get_target_type () const
Returns the union's target HLL mode (from target_hll_type).
bool is_empty () const
Indicates if the union is currently empty.
void reset ()
Resets the union to an empty state in coupon collection mode.
Does not re-use existing internal objects.
hll_sketch_alloc< A > get_result (target_hll_type tgt_type=HLL_4) const
Returns the result of this union operator as a sketch of the specified target_hll_type.
tgt_type | The target_hll_type enum value of the desired result (default: HLL_4). |
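To give a feel for what the choice of tgt_type trades off, the sketch below estimates the size of the dense register array for each target type, using the fact that HLL_4, HLL_6, and HLL_8 store 4, 6, and 8 bits per slot. The enum target_type and the function approx_register_bytes are local stand-ins introduced here for illustration; they are not library names, and the figure ignores the sketch header and any auxiliary exception storage.

```cpp
#include <cstddef>
#include <cstdint>

// Local stand-in for the library's target_hll_type, declared here only so
// that this sketch compiles on its own. The enumerator values are the
// bits per slot, which is an encoding chosen for this example.
enum class target_type { HLL_4 = 4, HLL_6 = 6, HLL_8 = 8 };

// Approximate size in bytes of the dense register array for k = 2^lg_k slots.
// Ignores the header and any auxiliary exception storage.
inline std::size_t approx_register_bytes(std::uint8_t lg_k, target_type type) {
  const std::size_t k = std::size_t{1} << lg_k;
  const std::size_t bits_per_slot = static_cast<std::size_t>(type);
  return (k * bits_per_slot) / 8;
}
```

With lg_k = 12 this gives roughly 2048, 3072, and 4096 bytes for HLL_4, HLL_6, and HLL_8 respectively, which is why the most compact type is a natural default for a result that will be stored or transmitted.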
void update (const hll_sketch_alloc< A > &sketch)
Update this union operator with the given sketch.
sketch | The given sketch. |
void update (hll_sketch_alloc< A > &&sketch)
Update this union operator with the given temporary sketch.
sketch | The given sketch. |
void update (const std::string &datum)
Present the given std::string as a potential unique item.
The string is converted to a byte array using UTF-8 encoding. If the string is empty, no update attempt is made and the method returns.
datum | The given string. |
void update (uint64_t datum)
Present the given unsigned 64-bit integer as a potential unique item.
datum | The given integer. |
void update (uint32_t datum)
Present the given unsigned 32-bit integer as a potential unique item.
datum | The given integer. |
void update (uint16_t datum)
Present the given unsigned 16-bit integer as a potential unique item.
datum | The given integer. |
void update (uint8_t datum)
Present the given unsigned 8-bit integer as a potential unique item.
datum | The given integer. |
void update (int64_t datum)
Present the given signed 64-bit integer as a potential unique item.
datum | The given integer. |
void update (int32_t datum)
Present the given signed 32-bit integer as a potential unique item.
datum | The given integer. |
void update (int16_t datum)
Present the given signed 16-bit integer as a potential unique item.
datum | The given integer. |
void update (int8_t datum)
Present the given signed 8-bit integer as a potential unique item.
datum | The given integer. |
void update (double datum)
Present the given 64-bit floating point value as a potential unique item.
datum | The given double. |
void update (float datum)
Present the given 32-bit floating point value as a potential unique item.
datum | The given float. |
void update (const void *data, size_t length_bytes)
Present the given data array as a potential unique item.
data | The given array. |
length_bytes | The array length in bytes. |
static double get_rel_err (bool upper_bound, bool unioned, uint8_t lg_config_k, uint8_t num_std_dev)
Gets the current (approximate) Relative Error (RE) asymptotic values given several parameters.
This is used primarily for testing.
upper_bound | If true, returns the RE for the upper bound; otherwise, returns the RE for the lower bound. |
unioned | Set to true if the sketch is the result of a union operation. |
lg_config_k | The configured lg_k value of the sketch. |
num_std_dev | The number of standard deviations. This must be an integer between 1 and 3, inclusive. |
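As a rough guide to the magnitudes involved, the sketch below computes a textbook asymptotic approximation of the relative error. It is not the library's get_rel_err(): approx_rel_err is a hypothetical helper, it returns an unsigned magnitude and therefore has no upper_bound parameter, and it assumes the commonly quoted relative standard error factors of sqrt(ln 2) for the HIP estimator and sqrt(3 ln 2 - 1) for the non-HIP estimator that applies after a union. The library's actual values may differ, particularly for small lg_config_k.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, NOT the library's get_rel_err(). Textbook asymptotic
// approximation: num_std_dev * factor / sqrt(k), with k = 2^lg_config_k.
inline double approx_rel_err(bool unioned, std::uint8_t lg_config_k,
                             std::uint8_t num_std_dev) {
  const double hip_factor = std::sqrt(std::log(2.0));                  // ~0.8326
  const double non_hip_factor = std::sqrt(3.0 * std::log(2.0) - 1.0);  // ~1.0390
  const double k = std::ldexp(1.0, lg_config_k);                       // 2^lg_config_k
  // Assumption: a unioned sketch cannot use HIP, so the larger factor applies.
  const double factor = unioned ? non_hip_factor : hip_factor;
  return num_std_dev * factor / std::sqrt(k);
}
```

Under these assumptions, lg_config_k = 12 gives about 1.3% per standard deviation for a sketch that was never unioned and about 1.6% for the result of a union, and each additional unit of lg_config_k shrinks the error by a factor of sqrt(2).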