Class TupleAnotB<S extends Summary>

java.lang.Object
org.apache.datasketches.tuple.TupleAnotB<S>
Type Parameters:
S - Type of Summary

public final class TupleAnotB<S extends Summary> extends Object
Computes a set difference, A-AND-NOT-B, of two generic TupleSketches. This class includes both stateful and stateless operations.

The stateful operation is as in the following example:

TupleAnotB<S> anotb = new TupleAnotB<>();

anotb.setA(skA);        // The first argument, a TupleSketch<S>.
anotb.notB(skB);        // The second (subtraction) argument.
anotb.notB(skC);        // ...any number of additional subtractions...
anotb.getResult(false); // Get an interim result.
anotb.notB(skD);        // Additional subtractions.
anotb.getResult(true);  // Get the final result and reset the TupleAnotB operator.

The stateless operation is as in the following example:

// aNotB is static, so no TupleAnotB instance is needed.
CompactTupleSketch<S> csk = TupleAnotB.aNotB(skA, skB);

Calling the setA operation a second time essentially clears the internal state and loads the new sketch.

The stateless and stateful operations are independent of each other.
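
The stateful sequence above can be modeled with plain java.util collections. This is only an illustrative sketch: ToyStatefulAnotB is a hypothetical class, an exact key-to-summary map stands in for a TupleSketch, and none of it is the library's implementation (real sketches are approximate).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the stateful operator: an exact key -> summary
// map plays the role of a TupleSketch. Real sketches are approximate.
final class ToyStatefulAnotB {
  private final Map<Long, String> state = new HashMap<>();

  // Overwrites the internal state with the contents of A.
  void setA(Map<Long, String> a) {
    state.clear();
    state.putAll(a);
  }

  // Removes every key of B from the internal state.
  void notB(Map<Long, String> b) {
    state.keySet().removeAll(b.keySet());
  }

  // Returns a copy of the current state; clears the operator if reset is true.
  Map<Long, String> getResult(boolean reset) {
    Map<Long, String> result = new HashMap<>(state);
    if (reset) {
      state.clear();
    }
    return result;
  }

  public static void main(String[] args) {
    ToyStatefulAnotB anotb = new ToyStatefulAnotB();
    anotb.setA(Map.of(1L, "a", 2L, "b", 3L, "c", 4L, "d"));
    anotb.notB(Map.of(1L, "x"));
    anotb.notB(Map.of(2L, "y"));
    System.out.println(anotb.getResult(false).keySet()); // interim: keys 3 and 4
    anotb.notB(Map.of(3L, "z"));
    System.out.println(anotb.getResult(true).keySet());  // final: key 4 only
    System.out.println(anotb.getResult(false).isEmpty()); // operator was reset
  }
}
```

In this model the interim result does not disturb the accumulated state, while the final call with reset set to true leaves the operator empty.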

Author:
Lee Rhodes
  • Constructor Details

    • TupleAnotB

      public TupleAnotB()
      No argument constructor.
  • Method Details

    • setA

      public void setA(TupleSketch<S> skA)
      This is part of a multistep, stateful TupleAnotB operation and sets the given TupleSketch as the first argument A of A-AND-NOT-B. This overwrites the internal state of this TupleAnotB operator with the contents of the given sketch. This sets the stage for multiple following notB steps.

      An input argument of null will throw an exception.

      Rationale: In mathematics a "null set" is a set with no members, which we call an empty set. That is distinctly different from the Java null, which represents a nonexistent object. In most cases a null is a programming error caused by an object that was not properly initialized. With a null as the first argument, we cannot know what the user's intent is. Since it is very likely that a null is a programming error, we throw an exception.

      An empty input argument will set the internal state to empty.

      Rationale: An empty set is a mathematically legal concept. Although it makes any subsequent, valid argument for B irrelevant, we must allow this and assume the user knows what they are doing.

      Performing getResult(boolean) just after this step will return a compact form of the given argument.

      Parameters:
      skA - The incoming sketch for the first argument, A.
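
      The null-versus-empty rule for the first argument can be illustrated with a small model. ToySetA is hypothetical, a Set<Long> of keys stands in for the sketch, and IllegalArgumentException is an assumption, since the text above only says that "an exception" is thrown.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model: a Set<Long> of retained keys stands in for sketch A.
final class ToySetA {
  private final Set<Long> state = new HashSet<>();

  // null is rejected as a probable programming error; empty is a legal set.
  // IllegalArgumentException is an assumption, not the library's documented type.
  void setA(Set<Long> a) {
    if (a == null) {
      throw new IllegalArgumentException("The input argument A must not be null");
    }
    state.clear();   // a second call overwrites whatever was loaded before
    state.addAll(a); // an empty A simply leaves the state empty
  }

  // Returns a copy of the current state.
  Set<Long> snapshot() {
    return new HashSet<>(state);
  }
}
```

      With this model, calling setA with an empty set succeeds and leaves nothing for later subtractions to remove, while calling it with null fails immediately.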
    • notB

      public void notB(TupleSketch<S> skB)
      This is part of a multistep, stateful TupleAnotB operation and sets the given TupleSketch as the second (or n+1th) argument B of A-AND-NOT-B. Performs an AND NOT operation with the existing internal state of this TupleAnotB operator.

      An input argument of null or empty is ignored.

      Rationale: A null for the second or following arguments is more tolerable because A NOT null is still A even if we don't know exactly what the null represents. It clearly does not have any content that overlaps with A. Also, because this can be part of a multistep operation with multiple notB steps, other following steps can still produce a valid result.

      Use getResult(boolean) to obtain the result.

      Parameters:
      skB - The incoming Tuple sketch for the second (or following) argument B.
    • notB

      public void notB(ThetaSketch skB)
      This is part of a multistep, stateful TupleAnotB operation and sets the given ThetaSketch as the second (or n+1th) argument B of A-AND-NOT-B. Performs an AND NOT operation with the existing internal state of this TupleAnotB operator. Calls to this method can be intermingled with calls to notB(TupleSketch).

      An input argument of null or empty is ignored.

      Rationale: A null for the second or following arguments is more tolerable because A NOT null is still A even if we don't know exactly what the null represents. It clearly does not have any content that overlaps with A. Also, because this can be part of a multistep operation with multiple notB steps, other following steps can still produce a valid result.

      Use getResult(boolean) to obtain the result.

      Parameters:
      skB - The incoming ThetaSketch for the second (or following) argument B.
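
      The two notB overloads can be modeled together to show how intermingled calls and ignored inputs behave. ToyNotB is a hypothetical class: a Map<Long, String> stands in for a TupleSketch (keys with summaries) and a Set<Long> for a ThetaSketch (keys only). It assumes exact sets and is not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model: Map<Long, String> stands in for a TupleSketch (keys with
// summaries) and Set<Long> for a ThetaSketch (keys only).
final class ToyNotB {
  private final Map<Long, String> state = new HashMap<>();

  // Loads the first argument A.
  ToyNotB(Map<Long, String> a) {
    state.putAll(a);
  }

  // Tuple-style subtrahend: its keys are removed; null or empty is a no-op.
  void notB(Map<Long, String> b) {
    if (b == null || b.isEmpty()) {
      return;
    }
    state.keySet().removeAll(b.keySet());
  }

  // Theta-style subtrahend: keys only; null or empty is a no-op.
  void notB(Set<Long> b) {
    if (b == null || b.isEmpty()) {
      return;
    }
    state.keySet().removeAll(b);
  }

  // Returns a copy of the remaining entries of A.
  Map<Long, String> result() {
    return new HashMap<>(state);
  }
}
```

      In this model a bare notB(null) does not compile because the call is ambiguous between the two overloads, so a null has to be passed through a typed variable or a cast such as (Set<Long>) null.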
    • getResult

      public CompactTupleSketch<S> getResult(boolean reset)
      Gets the result of the multistep, stateful TupleAnotB operation that has been executed with calls to setA(TupleSketch) and either notB(TupleSketch) or notB(ThetaSketch).
      Parameters:
      reset - If true, clears this operator to the empty state after this result is returned. Set this to false if you wish to obtain an intermediate result.
      Returns:
      the result of this operation as an unordered CompactTupleSketch.
    • aNotB

      public static <S extends Summary> CompactTupleSketch<S> aNotB(TupleSketch<S> skA, TupleSketch<S> skB)
      Returns the A-and-not-B set operation on the two given TupleSketches.

      This is a stateless operation and has no impact on the internal state of this operator. Thus, this is not an accumulating update and is independent of the setA(TupleSketch), notB(TupleSketch), notB(ThetaSketch), and getResult(boolean) methods.

      If either argument is null an exception is thrown.

      Rationale: In mathematics a "null set" is a set with no members, which we call an empty set. That is distinctly different from the Java null, which represents a nonexistent object. In most cases a null is a programming error caused by an object that was not properly initialized. With a null as the first argument, we cannot know what the user's intent is. With a null as the second argument, we can't ignore it, as we must return a result and there are no following calls that could supply a viable second argument. Since it is very likely that a null is a programming error, we throw an exception.

      Type Parameters:
      S - Type of Summary
      Parameters:
      skA - The incoming TupleSketch for the first argument
      skB - The incoming TupleSketch for the second argument
      Returns:
      the result as an unordered CompactTupleSketch
    • aNotB

      public static <S extends Summary> CompactTupleSketch<S> aNotB(TupleSketch<S> skA, ThetaSketch skB)
      Returns the A-and-not-B set operation on a TupleSketch and a ThetaSketch.

      This is a stateless operation and has no impact on the internal state of this operator. Thus, this is not an accumulating update and is independent of the setA(TupleSketch), notB(TupleSketch), notB(ThetaSketch), and getResult(boolean) methods.

      If either argument is null an exception is thrown.

      Rationale: In mathematics a "null set" is a set with no members, which we call an empty set. That is distinctly different from the Java null, which represents a nonexistent object. In most cases a null is a programming error caused by an object that was not properly initialized. With a null as the first argument, we cannot know what the user's intent is. With a null as the second argument, we can't ignore it, as we must return a result and there are no following calls that could supply a viable second argument. Since it is very likely that a null is a programming error for either argument, we throw an exception.

      Type Parameters:
      S - Type of Summary
      Parameters:
      skA - The incoming TupleSketch for the first argument
      skB - The incoming ThetaSketch for the second argument
      Returns:
      the result as an unordered CompactTupleSketch
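
      The independence of the stateless operation from the stateful one can be sketched with a small model. ToyStatelessAnotB is a hypothetical class using exact maps in place of sketches, and IllegalArgumentException is an assumption, since the text above only says that "an exception" is thrown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a stateless A-and-not-B next to an operator with state.
final class ToyStatelessAnotB {
  private final Map<Long, String> state = new HashMap<>();

  // Stateful side: loads A into this operator.
  void setA(Map<Long, String> a) {
    state.clear();
    state.putAll(a);
  }

  // Stateful side: returns a copy of the current state.
  Map<Long, String> getResult() {
    return new HashMap<>(state);
  }

  // Static, so it cannot read or modify any instance state.
  // Both arguments are required; IllegalArgumentException is an assumption.
  static Map<Long, String> aNotB(Map<Long, String> a, Map<Long, String> b) {
    if (a == null || b == null) {
      throw new IllegalArgumentException("Neither argument may be null");
    }
    Map<Long, String> result = new HashMap<>(a);
    result.keySet().removeAll(b.keySet());
    return result;
  }
}
```

      In this model, a call to the static aNotB made between setA and getResult on some operator leaves that operator's accumulated state unchanged.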
    • reset

      public void reset()
      Resets this operation back to the empty state.