datasketches-cpp
hll_union_alloc< A > Class Template Reference

This performs union operations for HLL sketches.

#include <hll.hpp>

Public Member Functions

 hll_union_alloc (uint8_t lg_max_k, const A &allocator=A())
 Construct an hll_union operator with the given maximum log2 of k.
 
double get_estimate () const
 Returns the current cardinality estimate.
 
double get_composite_estimate () const
 This is less accurate than the get_estimate() method and is automatically used when the union has gone through union operations where the more accurate HIP estimator cannot be used.
 
double get_lower_bound (uint8_t num_std_dev) const
 Returns the approximate lower error bound given the specified number of standard deviations.
 
double get_upper_bound (uint8_t num_std_dev) const
 Returns the approximate upper error bound given the specified number of standard deviations.
 
uint8_t get_lg_config_k () const
 Returns union's configured lg_k value.
 
target_hll_type get_target_type () const
 Returns the union's target HLL mode (from target_hll_type).
 
bool is_empty () const
 Indicates if the union is currently empty.
 
void reset ()
 Resets the union to an empty state in coupon collection mode.
 
hll_sketch_alloc< A > get_result (target_hll_type tgt_type=HLL_4) const
 Returns the result of this union operator with the specified target_hll_type.
 
void update (const hll_sketch_alloc< A > &sketch)
 Update this union operator with the given sketch.
 
void update (hll_sketch_alloc< A > &&sketch)
 Update this union operator with the given temporary sketch.
 
void update (const std::string &datum)
 Present the given std::string as a potential unique item.
 
void update (uint64_t datum)
 Present the given unsigned 64-bit integer as a potential unique item.
 
void update (uint32_t datum)
 Present the given unsigned 32-bit integer as a potential unique item.
 
void update (uint16_t datum)
 Present the given unsigned 16-bit integer as a potential unique item.
 
void update (uint8_t datum)
 Present the given unsigned 8-bit integer as a potential unique item.
 
void update (int64_t datum)
 Present the given signed 64-bit integer as a potential unique item.
 
void update (int32_t datum)
 Present the given signed 32-bit integer as a potential unique item.
 
void update (int16_t datum)
 Present the given signed 16-bit integer as a potential unique item.
 
void update (int8_t datum)
 Present the given signed 8-bit integer as a potential unique item.
 
void update (double datum)
 Present the given 64-bit floating point value as a potential unique item.
 
void update (float datum)
 Present the given 32-bit floating point value as a potential unique item.
 
void update (const void *data, size_t length_bytes)
 Present the given data array as a potential unique item.
 

Static Public Member Functions

static double get_rel_err (bool upper_bound, bool unioned, uint8_t lg_config_k, uint8_t num_std_dev)
 Gets the current (approximate) Relative Error (RE) asymptotic values given several parameters.
 

Detailed Description

template<typename A = std::allocator<uint8_t>>
class datasketches::hll_union_alloc< A >

This performs union operations for HLL sketches.

This union operator is configured with lg_max_k instead of the usual lg_config_k.

This union operator permits unioning sketches with different values of lg_config_k. Be aware that the accuracy of the sketch returned at the end of the unioning process is a function of the smallest of lg_max_k and the lg_config_k values that the union operator has seen.
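
The following is a minimal sketch of why the smallest lg_k governs accuracy. It is a toy model and not the library's internal layout: it assumes one register per slot, a slot chosen by the low lg_k bits of a hash, and a register value that does not depend on k. The names fold_registers and union_registers are hypothetical. Under those assumptions, a larger register array can be folded down to a smaller one, but the lost resolution cannot be recovered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (assumption): fold a register array down to 2^tgt_lg_k slots by
// masking the slot index and keeping the maximum value per target slot.
std::vector<std::uint8_t> fold_registers(const std::vector<std::uint8_t>& src,
                                         std::uint8_t tgt_lg_k) {
  const std::size_t tgt_k = std::size_t{1} << tgt_lg_k;
  assert(tgt_k <= src.size());  // folding can only reduce resolution
  std::vector<std::uint8_t> out(tgt_k, 0);
  for (std::size_t slot = 0; slot < src.size(); ++slot) {
    const std::size_t tgt_slot = slot & (tgt_k - 1);
    out[tgt_slot] = std::max(out[tgt_slot], src[slot]);
  }
  return out;
}

// Union two register arrays whose sizes are powers of two: fold both to the
// smaller size, then take the slot-wise maximum.
std::vector<std::uint8_t> union_registers(const std::vector<std::uint8_t>& a,
                                          const std::vector<std::uint8_t>& b) {
  const std::size_t k = std::min(a.size(), b.size());
  std::uint8_t lg_k = 0;
  while ((std::size_t{1} << lg_k) < k) ++lg_k;
  std::vector<std::uint8_t> fa = fold_registers(a, lg_k);
  const std::vector<std::uint8_t> fb = fold_registers(b, lg_k);
  for (std::size_t i = 0; i < k; ++i) fa[i] = std::max(fa[i], fb[i]);
  return fa;
}
```

In this model the result always has the size of the smaller input, which mirrors the statement above that the returned sketch's accuracy follows the smallest lg_k involved.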

This union operator also permits unioning of any of the three different target hll_sketch types (HLL_4, HLL_6 and HLL_8).

Although the API for this union operator parallels many of the methods of hll_sketch, the behavior of the union operator has some fundamental differences.

First, the user cannot specify the target_hll_type as an input parameter. Instead, it is specified for the sketch returned with get_result.

Second, the internal effective value of log-base-2 of k for the union operation can change dynamically based on the smallest lg_config_k that the union operation has seen.

Authors: Jon Malkin, Lee Rhodes, Kevin Lang

Constructor & Destructor Documentation

◆ hll_union_alloc()

hll_union_alloc ( uint8_t  lg_max_k,
const A &  allocator = A() 
)
explicit

Construct an hll_union operator with the given maximum log2 of k.

Parameters
lg_max_k  The maximum size, in log2, of k. The value must be between 7 and 21, inclusive.
allocator  Instance of an Allocator.

Member Function Documentation

◆ get_estimate()

double get_estimate ( ) const

Returns the current cardinality estimate.

Returns
the cardinality estimate

◆ get_composite_estimate()

double get_composite_estimate ( ) const

This is less accurate than the get_estimate() method and is automatically used when the union has gone through union operations where the more accurate HIP estimator cannot be used.

This is made public only for error characterization software that exists in separate packages and is not intended for normal use.

Returns
the composite cardinality estimate

◆ get_lower_bound()

double get_lower_bound ( uint8_t  num_std_dev) const

Returns the approximate lower error bound given the specified number of standard deviations.

Parameters
num_std_dev  Number of standard deviations, an integer from the set {1, 2, 3}.
Returns
The approximate lower bound.

◆ get_upper_bound()

double get_upper_bound ( uint8_t  num_std_dev) const

Returns the approximate upper error bound given the specified number of standard deviations.

Parameters
num_std_dev  Number of standard deviations, an integer from the set {1, 2, 3}.
Returns
The approximate upper bound.
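
As a small illustration of how the estimate and its bounds are typically read together, the sketch below collects them into one value. The names estimate_interval and interval_of are hypothetical helpers, not part of the library; the template only relies on the get_estimate(), get_lower_bound() and get_upper_bound() signatures documented on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: an estimate with its approximate error bounds.
struct estimate_interval {
  double lower;
  double estimate;
  double upper;
};

// Works with any type exposing get_estimate(), get_lower_bound(uint8_t) and
// get_upper_bound(uint8_t) as documented above, for example an hll_union_alloc<A>.
template <typename Union>
estimate_interval interval_of(const Union& u, std::uint8_t num_std_dev) {
  assert(num_std_dev >= 1 && num_std_dev <= 3);  // documented valid set {1, 2, 3}
  return {u.get_lower_bound(num_std_dev), u.get_estimate(),
          u.get_upper_bound(num_std_dev)};
}
```

Passing a union object and num_std_dev = 2 would give an interval of roughly two standard deviations around the estimate.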

◆ get_lg_config_k()

uint8_t get_lg_config_k ( ) const

Returns union's configured lg_k value.

Returns
Configured lg_k value.

◆ get_target_type()

target_hll_type get_target_type ( ) const

Returns the union's target HLL mode (from target_hll_type).

Returns
The union's target HLL mode.

◆ is_empty()

bool is_empty ( ) const

Indicates if the union is currently empty.

Returns
True if the union is empty.

◆ reset()

void reset ( )

Resets the union to an empty state in coupon collection mode.

Does not re-use existing internal objects.

◆ get_result()

hll_sketch_alloc< A > get_result ( target_hll_type  tgt_type = HLL_4) const

Returns the result of this union operator with the specified target_hll_type.

Parameters
tgt_type  The target_hll_type enum value of the desired result (default: HLL_4).
Returns
The result of this union with the specified target_hll_type.

◆ update() [1/14]

void update ( const hll_sketch_alloc< A > &  sketch)

Update this union operator with the given sketch.

Parameters
sketch  The given sketch.

◆ update() [2/14]

void update ( hll_sketch_alloc< A > &&  sketch)

Update this union operator with the given temporary sketch.

Parameters
sketch  The given sketch.
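
The two sketch overloads differ only in how the argument binds. The sketch below shows which call selects which overload; merge_keep and merge_consume are hypothetical helper names, and the templates assume only the two update() signatures documented above. Whether the rvalue overload avoids an internal copy is not stated on this page.

```cpp
#include <utility>
#include <vector>

// Merge sketches the caller still needs: binds to update(const Sketch&).
template <typename Union, typename Sketch>
void merge_keep(Union& u, const std::vector<Sketch>& sketches) {
  for (const Sketch& s : sketches) u.update(s);
}

// Merge sketches the caller is done with: std::move selects update(Sketch&&).
// The moved-from sketches should not be read afterwards, so they are dropped.
template <typename Union, typename Sketch>
void merge_consume(Union& u, std::vector<Sketch>&& sketches) {
  for (Sketch& s : sketches) u.update(std::move(s));
  sketches.clear();
}
```

A caller would write merge_consume(u, std::move(sketches)) to make the hand-off explicit at the call site.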

◆ update() [3/14]

void update ( const std::string &  datum)

Present the given std::string as a potential unique item.

The string is converted to a byte array using UTF8 encoding. If the string is empty, no update attempt is made and the method returns.

Parameters
datum  The given string.

◆ update() [4/14]

void update ( uint64_t  datum)

Present the given unsigned 64-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [5/14]

void update ( uint32_t  datum)

Present the given unsigned 32-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [6/14]

void update ( uint16_t  datum)

Present the given unsigned 16-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [7/14]

void update ( uint8_t  datum)

Present the given unsigned 8-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [8/14]

void update ( int64_t  datum)

Present the given signed 64-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [9/14]

void update ( int32_t  datum)

Present the given signed 32-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [10/14]

void update ( int16_t  datum)

Present the given signed 16-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [11/14]

void update ( int8_t  datum)

Present the given signed 8-bit integer as a potential unique item.

Parameters
datum  The given integer.

◆ update() [12/14]

void update ( double  datum)

Present the given 64-bit floating point value as a potential unique item.

Parameters
datum  The given double.

◆ update() [13/14]

void update ( float  datum)

Present the given 32-bit floating point value as a potential unique item.

Parameters
datum  The given float.

◆ update() [14/14]

void update ( const void *  data,
size_t  length_bytes 
)

Present the given data array as a potential unique item.

Parameters
data  The given array.
length_bytes  The array length in bytes.
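
One way to use the raw-bytes overload for a composite key is sketched below. The key layout, make_key and present_key are hypothetical and only for illustration; the single library call assumed is update(const void*, size_t) as documented above. The fields are packed into a contiguous buffer so that struct padding bytes never take part in the hash.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical composite key: copy each field into a tightly packed buffer.
// Bytes are in host order, so keys built on machines with a different
// endianness would not be byte-identical.
std::array<std::uint8_t, 6> make_key(std::uint32_t user_id, std::uint16_t region) {
  std::array<std::uint8_t, 6> key{};
  std::memcpy(key.data(), &user_id, sizeof(user_id));
  std::memcpy(key.data() + sizeof(user_id), &region, sizeof(region));
  return key;
}

// Present the packed key to any object exposing update(const void*, size_t).
template <typename Union>
void present_key(Union& u, std::uint32_t user_id, std::uint16_t region) {
  const std::array<std::uint8_t, 6> key = make_key(user_id, region);
  u.update(key.data(), key.size());
}
```

Passing the address of a struct directly would also compile, but any padding inside the struct would then be hashed along with the fields.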

◆ get_rel_err()

double get_rel_err ( bool  upper_bound,
bool  unioned,
uint8_t  lg_config_k,
uint8_t  num_std_dev 
)
static

Gets the current (approximate) Relative Error (RE) asymptotic values given several parameters.

This is used primarily for testing.

Parameters
upper_bound  If true, return the RE for the upper bound; otherwise return the RE for the lower bound.
unioned  Set to true if the sketch is the result of a union operation.
lg_config_k  The configured lg_k value for the sketch.
num_std_dev  The given number of standard deviations. This must be an integer between 1 and 3, inclusive.
Returns
the current (approximate) RelativeError
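
To give a feel for the magnitudes involved, the sketch below uses the textbook HyperLogLog approximation RSE ≈ 1.04 / sqrt(k). This is an assumption for illustration and not the table or constants used by get_rel_err(); approx_rel_err is a hypothetical name. It only shows the 1/sqrt(k) scaling with lg_config_k and the linear scaling with num_std_dev.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Textbook asymptotic approximation (assumption, not the library's values):
// relative error ~= num_std_dev * 1.04 / sqrt(2^lg_config_k).
double approx_rel_err(std::uint8_t lg_config_k, std::uint8_t num_std_dev) {
  assert(num_std_dev >= 1 && num_std_dev <= 3);
  const double k = std::ldexp(1.0, lg_config_k);  // 2^lg_config_k
  return num_std_dev * 1.04 / std::sqrt(k);
}
```

Under this approximation, lg_config_k = 12 gives about 1.6% at one standard deviation, and each increase of lg_config_k by 2 halves the error.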
