datasketches-cpp
This is the core C++ component of the Apache DataSketches library. It contains all of the key sketching algorithms that are in the Java component and can be accessed directly from user applications.
This component is also a dependency of other components of the library that create adaptors for target systems, such as PostgreSQL.
Note that we have parallel core components for [Java](https://github.com/apache/datasketches-java) and [Python](https://github.com/apache/datasketches-python) implementations of the same sketch algorithms.
Please visit the main Apache DataSketches website for more information.
If you are interested in making contributions to this site please see our Community page for how to contact us.
This code requires C++11.
This library is header-only. The build process provided is only for building unit tests.
Building the unit tests requires cmake 3.12.0 or higher.
Installing the latest cmake on OSX: brew install cmake
Building and running unit tests using cmake for OSX and Linux:
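A minimal sketch of that sequence, assuming it is run from the repository root, that the default generator is Makefiles or Ninja, and that the project registers its unit tests with CTest so that a test target exists. The build/Release directory name is only a convention chosen for this example.

```shell
# Configure an out-of-source Release build (directory name is an assumption).
mkdir -p build/Release
cd build/Release
cmake ../.. -DCMAKE_BUILD_TYPE=Release

# Build everything, then run the unit tests through the generated test target.
cmake --build . --target all
cmake --build . --target test
```

The out-of-source form with an explicit cd is used here because it works with cmake 3.12; newer cmake releases also accept -S and -B to name the source and build directories directly.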
Building and running unit tests using cmake for Windows from the command line:
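A comparable sketch for Windows, assuming a Visual Studio generator is the default. With a multi-configuration generator the build type is chosen at build time with --config, and RUN_TESTS is the target that such generators create for CTest-registered tests.

```shell
:: Configure into a build directory (name is an assumption).
mkdir build
cd build
cmake ..
cd ..

:: Build the Release configuration, then run the unit tests.
cmake --build build --config Release
cmake --build build --config Release --target RUN_TESTS
```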
To install a local distribution (OSX and Linux), use the following command. The CMAKE_INSTALL_PREFIX variable controls the destination. If not specified, it defaults to installing in /usr (/usr/include, /usr/lib, etc.). In the command below, the installation will be in /tmp/install/DataSketches (/tmp/install/DataSketches/include, /tmp/install/DataSketches/lib, etc.).
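As an illustration, an install into the prefix named above could look like the following. This is a sketch that assumes the same build/Release layout as a plain Release build and that the project defines the standard install target.

```shell
# Configure with the install destination set explicitly.
mkdir -p build/Release
cd build/Release
cmake ../.. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/install/DataSketches

# Copy the headers and CMake package files under the chosen prefix.
cmake --build . --target install
```

Installing into a system location such as /usr typically requires elevated privileges, which is one reason to set CMAKE_INSTALL_PREFIX to a user-writable directory while experimenting.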
To generate an installable package using cmake's built-in cpack packaging tool, use the following command. The type of packaging is controlled by the CPACK_GENERATOR variable (a semicolon-separated list). CMake usually supports packaging types such as RPM, DEB, STGZ, TGZ, TZ, ZIP, etc.
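A sketch of a packaging run, assuming the build/Release layout used for a plain Release build and that the project exposes the usual package target. The generator list here is just an example selection; the value is quoted so that the shell does not interpret the semicolons.

```shell
# Configure with the desired package formats.
mkdir -p build/Release
cd build/Release
cmake ../.. -DCMAKE_BUILD_TYPE=Release -DCPACK_GENERATOR="RPM;STGZ;TGZ"

# Produce one package file per requested generator in the build directory.
cmake --build . --target package
```

Some generators depend on external tools being installed on the host (for example, RPM needs rpmbuild), so it may be necessary to trim the list to the formats available locally.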
The DataSketches project can be included in other projects' CMakeLists.txt files in one of two ways. If DataSketches has been installed on the host (using an RPM, DEB, "make install" into /usr/local, or some other way), then CMake's find_package command can be used like this:
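A minimal sketch of such a consumer, where my_dependent_target and main.cpp are hypothetical names standing in for the dependent project's own target and source file:

```cmake
cmake_minimum_required(VERSION 3.12.0)
project(MyProject CXX)

# Locate an installed DataSketches package; fail at configure time if absent.
find_package(DataSketches REQUIRED)

# Hypothetical consumer target.
add_executable(my_dependent_target main.cpp)

# Linking against the exported target also propagates its include paths.
target_link_libraries(my_dependent_target PUBLIC ${DATASKETCHES_LIB})
```

If the package was installed under a non-standard prefix, passing -DCMAKE_PREFIX_PATH with that prefix when configuring the dependent project is the usual way to help find_package locate it.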
When used with find_package, DataSketches exports several variables, including

- DATASKETCHES_VERSION: The version number of the datasketches package that was imported.
- DATASKETCHES_INCLUDE_DIR: The directory that should be added to access DataSketches include files. Because cmake automatically includes the interface directories for included target libraries when using target_link_libraries, under normal circumstances there will be no need to include this directly.
- DATASKETCHES_LIB: The name of the DataSketches target to include as a dependency. Projects pulling in DataSketches should reference this with target_link_libraries in order to set up all the correct dependencies and include paths.

If you don't have DataSketches installed locally, dependent projects can pull it directly from GitHub using CMake's ExternalProject module. The code would look something like this:
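The following is a hedged sketch rather than the project's official recipe. The names my_dependent_target and main.cpp are hypothetical, the GIT_TAG value is a placeholder that should be replaced with a release tag you have verified, and the include/DataSketches subdirectory is an assumption about where the headers land under the install prefix.

```cmake
cmake_minimum_required(VERSION 3.12.0)
project(MyProject CXX)

include(ExternalProject)

# Download, configure, and install DataSketches into a private prefix
# inside this project's build tree.
ExternalProject_Add(datasketches
    GIT_REPOSITORY https://github.com/apache/datasketches-cpp.git
    GIT_TAG master            # placeholder: pin a specific release tag instead
    GIT_SHALLOW true
    INSTALL_DIR ${CMAKE_BINARY_DIR}/datasketches-prefix
    CMAKE_ARGS -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>
)

# Read back the install prefix chosen above into the INSTALL_DIR variable.
ExternalProject_Get_Property(datasketches INSTALL_DIR)

# Hypothetical consumer target.
add_executable(my_dependent_target main.cpp)
target_compile_features(my_dependent_target PRIVATE cxx_std_11)

# Make sure DataSketches is fetched and installed before the consumer builds.
add_dependencies(my_dependent_target datasketches)

# Assumed header location under the install prefix.
target_include_directories(my_dependent_target
    PRIVATE ${INSTALL_DIR}/include/DataSketches)
```

Because ExternalProject runs at build time rather than configure time, find_package cannot see the result in the same configure pass, which is why this sketch wires up the include directory and the build-order dependency by hand.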