Exact and Bounded, Probabilitiy Proportional to Size (EBPPS) Sampling --------------------------------------------------------------------- .. currentmodule:: datasketches An EBPPS sketch produces a randome sample of data from a stream of items, ensuring that the probability of including an item is always exactly equal to the item's size. The size of an item is defined as its weight relative to the total weight of all items seen so far by the sketch. In contrast to VarOpt sampling, this sketch may return fewer than `k` items in order to keep the probability of including an item strictly proportional to its size. This sketch is based on: B. Hentschel, P. J. Haas, Y. Tian "Exact PPS Sampling with Bounded Sample Size", Information Processing Letters, 2023. EBPPS sampling is related to reservoir sampling, but handles unequal item weights. Feeding the sketch items with a uniform weight value will produce a sample equivalent to reservoir sampling. .. note:: Serializing and deserializing this sketch requires the use of a :class:`PyObjectSerDe`. .. autoclass:: ebpps_sketch :members: :undoc-members: :exclude-members: deserialize .. rubric:: Static Methods: .. automethod:: deserialize .. rubric:: Non-static Methods: .. automethod:: __init__