cassandra-easy-stress

cassandra-easy-stress is a workload-centric stress tool for Apache Cassandra, written in Kotlin. Workloads are easy to write and because they are written in code. You have the ultimate flexibility to build whatever you want without having to learn and operate around the restrictions of a configuration DSL. Workloads can be tweaked via command line parameters to make them fit your environment more closely.

One of the goals of cassandra-easy-stress is to provide enough pre-designed workloads out of the box, so it’s unnecessary to code up a workload for most use cases. For instance, it’s very common to have a key value workload, and want to test that. cassandra-easy-stress allows you to customize a pre-configured key-value workload, using simple parameters to modify the workload to fit your needs. Several workloads are included, such as:

Time Series
Key / Value
Materialized Views
Collections (maps)
Counters

The tool is flexible enough to design workloads which leverage multiple (thousands) of tables, hitting them as needed. Statistics are automatically captured by the Dropwizard metrics library.

Quickstart Example

The goal of this project is to be testing common workloads (time series, key value) against a Cassandra cluster in 15 minutes or less.

Installation

Installing a Package

Note: Installing from packages has not been migrated since I (Jon Haddad) began maintaining my own fork. I’ll be updating this section soon.

Building / Using the Stress Tool from Source

This is currently the only way to use the latest version of cassandra-easy-stress. I’m working on getting the packages updated.

First you’ll need to clone and build the repo. You can grab the source here and build via the included gradle script:

$ git clone https://github.com/apache/cassandra-easy-stress.git
$ cd cassandra-easy-stress
$ ./gradlew shadowJar

You can now run the stress tool via the bin/cassandra-easy-stress script. This is not the same script you’ll be running if you’ve installed from a package or the tarball.

You can also create a zip, tar, or deb package by doing the following:

$ ./gradlew distZip
$ ./gradlew distTar
$ ./gradlew deb

Run Your First Stress Workload

Assuming you have either a CCM cluster or are running a single node locally, you can run this quickstart.

Either add the bin directory to your PATH or from within cassandra-easy-stress run the following command to execute 10,000 queries:

$ bin/cassandra-easy-stress run KeyValue -n 10000

You’ll see the output of the keyspaces and tables that are created as well as some statistical information regarding the workload:

Creating cassandra_easy_stress:
CREATE KEYSPACE
 IF NOT EXISTS cassandra_easy_stress
 WITH replication = {'class': 'SimpleStrategy', 'replication_factor':3 }

Executing 10000 operations with consistency level LOCAL_ONE and serial consistency level LOCAL_SERIAL
Connected to Cassandra cluster.
Creating Tables
CREATE TABLE IF NOT EXISTS keyvalue (
key text PRIMARY KEY,
value text
) WITH caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND default_time_to_live = 0
Preparing queries
Initializing metrics
Stepping rate limiter by 500.0 to 5000.0
Connecting to Cassandra cluster ...
Creating generator random
1 threads prepared.
Prepopulating data with 0 records per thread (1)
Prometheus metrics are available at http://localhost:9500/
Starting main runner
[Thread 0]: Running the profile for 10000 iterations...
                 Writes                                  Reads                                  Deletes                       Errors
  Count  Latency (p99)  1min (req/s) |   Count  Latency (p99)  1min (req/s) |   Count  Latency (p99)  1min (req/s) |   Count  1min (errors/s)
    865          51.99             0 |     800          51.71             0 |       0              0             0 |       0                0
   1614          50.96         273.6 |    1550          50.98           257 |       0              0             0 |       0                0
   2345          50.48         273.6 |    2319          49.42           257 |       0              0             0 |       0                0
   3632          50.17        271.55 |    3532          45.52        256.62 |       0              0             0 |       0                0
   5050          37.23        288.89 |    4950          31.06        274.54 |       0              0             0 |       0                0
   5050          37.23        288.89 |    4950          31.06        274.54 |       0              0             0 |       0                0
   5050          37.23        288.89 |    4950          31.06        274.54 |       0              0             0 |       0                0
Stress complete, 1.

If you’ve made it this far, congrats! You’ve run your first workload.

Usage

You’ll probably want to do a bit more than simply run a few thousand queries against a KeyValue table with default settings. The nice part about cassandra-easy-stress is that it not only comes with a variety of workloads that you can run to test your cluster, but that it allows you to change many of the parameters. In the quickstart example we used the -n flag to change the total number of operations cassandra-easy-stress will execute against the database. There are many more options available, this section will cover some of them.

General Help

cassandra-easy-stress will display the help if the cassandra-easy-stress command is run without any arguments or if the --help flag is passed:

$ bin/cassandra-easy-stress
Usage: cassandra-easy-stress [options] [command] [command options]
  Options:
    --help, -h
      Shows this help.
      Default: false
  Commands:
    run      Run a cassandra-easy-stress profile
      Usage: run [options]
        Options:
          --cl
            Consistency level for reads/writes (Defaults to LOCAL_ONE, set
            custom default with CASSANDRA_EASY_STRESS_CONSISTENCY_LEVEL).
            Default: LOCAL_ONE
          --compaction
            Compaction option to use.  Double quotes will auto convert to
            single for convenience.  A shorthand is also available: stcs, lcs,
            twcs.  See the full documentation for all possibilities.
            Default: <empty string>
          --compression
            Compression option to pass to CREATE TABLE. Example:
            {'class':'ZstdCompressor'}
            Default: <empty string>
          --concurrency, -c
            DEPRECATED.  Concurrent queries allowed.  Increase for larger
            clusters.  This flag is deprecated and does nothing.
            Default: 100
          --coordinatoronly, --co
            Coordinator only mode.  This will cause cassandra-easy-stress to
            round robin between nodes without tokens.  Requires using
            -Djoin_ring=false in cassandra-env.sh.  When using this option you
            must only provide a coordinator to --host.
            Default: false
          --cql
            Additional CQL to run after the schema is created.  Use for DDL
            modifications such as creating indexes.
            Default: []
          --csv
            Write metrics to this file in CSV format.
            Default: <empty string>
          --dc
            The data center to which requests should be sent
            Default: datacenter1
          --deleterate, --deletes
            Deletion Rate, 0-1.  Workloads may have their own defaults.
            Default is dependent on workload.
          --drop
            Drop the keyspace before starting.
            Default: false
          --duration, -d
            Duration of the stress test.  Expressed in format 1d 3h 15m
            Default: 0
          --field.
            Override a field's data generator.  Example usage:
            --field.tablename.fieldname='book(100,200)'
            Syntax: --field.key=value
            Default: {}
          --hdr
            Print HDR Histograms using this prefix
            Default: <empty string>
          -h, --help
            Show this help
          --host
            Default: 127.0.0.1
          --id
            Identifier for this run, will be used in partition keys.  Make
            unique for when starting concurrent runners.
            Default: 001
          --iterations, -i, -n
            Number of operations to run.
            Default: 0
          --keycache
            Key cache setting
            Default: ALL
          --keyspace
            Keyspace to use.  Automatically created.  Use replication option
            to adjust options.
            Default: cassandra_easy_stress
          --max-connections
            Sets the number of max connections per host
            Default: 8
          --max-requests
            Sets the max requests per connection
            Default: 32768
          --maxrlat
            Max Read Latency
          --maxwlat

          --no-schema
            Skips schema creation
            Default: false
          --paginate
            Paginate through the entire partition before completing
            Default: false
          --paging
            Override the driver's default page size.
          --parquet
            Capture client latency metrics to a Apache Parquet file at the
            specified path.  If the file is a directory, the file will be
            named rawlog.parquet within that directory
            Default: <empty string>
          --partitiongenerator, --pg
            Method of generating partition keys.  Supports random, normal
            (gaussian), and sequence.
            Default: random
          --partitions, -p
            Max value of integer component of first partition key.
            Default: 1000000
          --password, -P
            Default: cassandra
          --populate
            Pre-population the DB with N rows before starting load test.
            Default: 0
          --port
            Override the cql port. Defaults to 9042.
            Default: 9042
          --prometheusport
            Override the default prometheus port.
            Set the
            default with CASSANDRA_EASY_STRESS_PROM_PORT, or set to 0 to
            disable.
            Default: 9500
          --queue
            Queue Depth.  2x the rate by default.
            Default: 10000
          --rate
            Throughput rate, accepts human numbers
            Default: 5000
          --readrate, --reads, -r
            Read Rate, 0-1.  Workloads may have their own defaults.  Default
            is dependent on workload.
          --replication
            Replication options
            Default: {'class': 'SimpleStrategy', 'replication_factor':3 }
          --rowcache
            Row cache setting
            Default: NONE
          --scl
            Serial consistency level
            Default: LOCAL_SERIAL
          --ssl
            Enable SSL
            Default: false
          --threads, -t
            Threads to run
            Default: 1
          --ttl
            Table level TTL, 0 to disable.
            Default: 0
          --username, -U
            Default: cassandra
          --workload., -w.
            Override workload specific parameters.
            Syntax: --workload.key=value
            Default: {}

    info      Get details of a specific workload.
      Usage: info

    list      List all workloads.
      Usage: list

    fields      null
      Usage: fields

    server      Start MCP server for tool integration
      Usage: server [options]
        Options:
          -p, --port
            Port to listen on
            Default: 9000

Listing All Workloads

$ bin/cassandra-easy-stress list
Available Workloads:

AllowFiltering
BasicTimeSeries
CountersWide
CreateDrop
DSESearch
KeyValue
LWT
Locking
Maps
MaterializedViews
RandomPartitionAccess
RangeScan
SAI
Sets
TxnCounter
UdtTimeSeries

You can run any of these workloads by running cassandra-easy-stress run WORKLOAD.

Getting information about a workload

It’s possible to get (some) information about a workload by using the info command. This area is a bit lacking at the moment. It currently only provides the schema and default read rate.

$ bin/cassandra-easy-stress info KeyValue
CREATE TABLE IF NOT EXISTS keyvalue (
key text PRIMARY KEY,
value text
)
Default read rate: 0.5 (override with -r)

No dynamic workload parameters.

Running a Customized Stress Workload

Whenever possible we try to use human friendly numbers. Typing out -n 1000000000 is error prone and hard to read, -n 1B is much easier.

Table 1. Table Human Friendly Values
Suffix	Implication	Example	Equivalent
k	Thousand	1k	1,000
m	Million	1m	1,000,000
b	Billion	1b	1,000,000,000

Running a stress test for a given duration instead of a number of operations

You might need to run a stress test for a given duration instead of providing a number of operations, especially in case of multithreaded stress runs. This is done by providing the duration in a human readable format with the -d argument. The minimum duration is 1 minute. For example, running a test for 1 hours and 30 minutes will be done as follows:

$ cassandra-easy-stress run KeyValue -d "1h 30m"

To run a test for 1 days, 3 hours and 15 minutes (why not?), run cassandra-easy-stress as follows:

$ cassandra-easy-stress run KeyValue -d "1d 3h 15m"

Partition Keys

A very useful feature is controlling how many partitions are read and written to for a given stress test. Doing a billion operations across a billion partitions is going to have a much different performance profile than writing to one hundred partitions, especially when mixed with different compaction settings. Using -p we can control how many partition keys a stress test will leverage. The keys are randomly chosen at the moment.

Read Rate

It’s possible to specify the read rate of a test as a double. For example, if you want to use 1% reads, you’d specify -r .01. The sum of the read rate and delete rate must be less than or equal to 1.0.

Delete Rate

It’s possible to specify the delete rate of a test as a double. For example, if you want to use 1% deletes, you’d specify --deleterate .01. The sum of the read rate and delete rate must be less than or equal to 1.0.

Compaction

It’s possible to change the compaction strategy used with the --compaction flag. At the moment this changes the compaction strategy of every table in the test. This will be addressed in the future to be more flexible.

The --compaction flag can accept a raw string along these lines:

--compaction "{'class':'LeveledCompactionStrategy'}"

Alternatively, a shortcut format exists as of version 2.0:

--compaction lcs

The following shorthand formats are available:

Table 2. Table Compaction ShortHand
Syntax	Expansion
stcs	`{'class':'SizeTieredCompactionStrategy'}`
stcs,4,32	`{'class':'SizeTieredCompactionStrategy', 'min_threshold':4, 'max_threshold':32}`
lcs	`{'class':'LeveledCompactionStrategy'}`
lcs,160	`{'class':'LeveledCompactionStrategy', 'sstable_size_in_mb':'160'}`
lcs,160,10	`{'class':'LeveledCompactionStrategy', 'sstable_size_in_mb':'160', 'fanout_size':10}`
twcs	`{'class':'TimeWindowCompactionStrategy'}`
twcs,1,days	`{'class':'TimeWindowCompactionStrategy', 'compaction_window_size':'1', 'compaction_window_unit':'DAYS'}`
ucs	`{'class':'UnifiedCompactionStrategy'}`
ucs,L4	`{'class':'UnifiedCompactionStrategy', 'scaling_parameters':'L4'}`
ucs,L4,L10	`{'class':'UnifiedCompactionStrategy', 'scaling_parameters':'L4,L10'}`

Compression

It’s possible to change the compression options used. At the moment this changes the compression options of every table in the test. This will be addressed in the future to be more flexible.

Customizing Fields

To some extent, workloads can be customized by leveraging the --fields flag. For instance, if we look at the KeyValue workload, we have a table called keyvalue which has a value field.

To customize the data we use for this field, we provide a generator at the command line. By default, the value field will use 100-200 characters of random text. What if we’re storing blobs of text instead? Ideally we’d like to tweak this workload to be closer to our production use case. Let’s use random sections from various books:

$ cassandra-easy-stress run KeyValue --field.keyvalue.value='book(20,40)`

Instead of using random strings of garbage, the KeyValue workload will now use 20-40 words extracted from books.

There are other generators available, such as names, gaussian numbers, and cities. Not every generator applies to every type. It’s up to the workload to specify which fields can be used this way.

Logging

cassandra-easy-stress uses the Log4J 2 logging framework.

You can find the default log4j config in https://github.com/apache/cassandra-easy-stress/blob/main/src/main/resources/log4j2.yaml [conf, window="_blank"]. This should be suitable for most use cases.

To use your own logging configuration, simply set the shell variable CASSANDRA_EASY_STRESS_LOG4J to the path of the new logging configuration before running cassandra-easy-stress to point to the config file of your choice.

For more information on how to configure Log4J 2 please see the configuration documentation.

Debian Package

The Debian package installs a basic configuration file to /etc/cassandra-easy-stress/log4j2.yaml.

Exporting Metrics

cassandra-easy-stress automatically runs an HTTP server exporting metrics in Prometheus format on port 9501.

Capturing Client Latencies to Apache Parquet

You can capture detailed client latency metrics to an Apache Parquet file using the --parquet flag followed by a path to a file or directory:

$ bin/cassandra-easy-stress run KeyValue --duration 5m --parquet rawlog.parquet

This writes operation metrics including operation type, success/failure status, start time, and duration to a Parquet file that can be analyzed later with data analysis tools like Pandas, Spark, DuckDB, or visualization tools that support the Parquet format.

If a directory is provided instead of a file, cassandra-easy-stress will automatically create an appropriately named file in that directory.

Parquet Schema Reference

Each row in the Parquet file represents a single operation with the following columns:

Table 3. Table Parquet Schema
Column	Type	Description
`operation`	`binary (UTF8)`	Operation type (e.g. `Insert`, `Select`) — derived from the statement class name
`success`	`boolean`	Whether the operation completed successfully
`failure_reason`	`binary (UTF8)`	Exception class name if the operation failed (empty string on success)
`failure_stacktrace`	`binary (UTF8)`	Full stacktrace if the operation failed (empty string on success)
`request_start_time_ms`	`int64`	Epoch milliseconds when the client intended to send the request
`request_duration_ns`	`int64`	Total elapsed time in nanoseconds from request creation to response (includes queue time)
`service_start_time_ms`	`int64`	Epoch milliseconds when the request was actually sent to the database
`service_duration_ns`	`int64`	Actual database processing time in nanoseconds (excludes queue time)

Note	Every operation is captured (not sampled), so file sizes grow proportionally with the number of operations.

Analyzing Parquet Files with DuckDB

The Parquet files created by cassandra-easy-stress can be easily analyzed using DuckDB, a lightweight analytical database engine. Here are some example queries to get you started:

-- Show summary statistics for operation latencies for every minute
SELECT date_trunc('minute', epoch_ms(request_start_time_ms)) as minute,
       COUNT(*) as count,
       AVG(request_duration_ns / 1000 / 1000) as avg,
       MIN(request_duration_ns / 1000 / 1000) as min,
       MAX(request_duration_ns / 1000 / 1000) as max,
       APPROX_QUANTILE(request_duration_ns / 1000 / 1000, .5) as p50,
       APPROX_QUANTILE(request_duration_ns / 1000 / 1000, .9) as p90,
       APPROX_QUANTILE(request_duration_ns / 1000 / 1000, .99) as p99,
       COUNT(CASE WHEN failure_reason != '' THEN 1 END) AS errors,
       COUNT(CASE WHEN failure_reason = 'ReadTimeoutException' THEN 1 END) AS read_timeouts,
       COUNT(CASE WHEN failure_reason = 'WriteTimeoutException' THEN 1 END) AS write_timeouts,
FROM read_parquet('rawlog.parquet')
GROUP BY minute
ORDER BY minute;

┌─────────────────────┬────────┬─────────────────────┬──────────┬────────────────────┬─────────────────────┬─────────────────────┬────────────────────┬────────┬───────────────┬────────────────┐
│       minute        │ count  │         avg         │   min    │        max         │         p50         │         p90         │        p99         │ errors │ read_timeouts │ write_timeouts │
│      timestamp      │ int64  │       double        │  double  │       double       │       double        │       double        │       double       │ int64  │     int64     │     int64      │
├─────────────────────┼────────┼─────────────────────┼──────────┼────────────────────┼─────────────────────┼─────────────────────┼────────────────────┼────────┼───────────────┼────────────────┤
│ 2025-05-23 22:45:00 │ 141911 │  18.891404855317813 │ 0.088042 │        1305.993875 │ 0.23617621307864622 │  0.5454138286498701 │   755.160273634975 │      0 │             0 │              0 │
│ 2025-05-23 22:46:00 │ 300081 │ 0.26154326542833034 │ 0.091458 │          16.620042 │  0.2198726495794396 │ 0.28866495065114356 │ 1.0759477146627234 │      0 │             0 │              0 │
│ 2025-05-23 22:47:00 │ 300075 │ 0.29655502679997087 │ 0.089208 │          19.247875 │  0.2241928371244093 │  0.3096208807374364 │ 1.8582042492087465 │      0 │             0 │              0 │
│ 2025-05-23 22:48:00 │ 298543 │  0.6524298801211285 │ 0.093666 │ 198.99904199999997 │ 0.22677466153454573 │ 0.33573102120265713 │  9.839418314581492 │      0 │             0 │              0 │
│ 2025-05-23 22:49:00 │ 300053 │ 0.30696147925533085 │ 0.100167 │          64.216666 │  0.2249157848195121 │  0.3072763658664282 │ 1.6342296967730887 │      0 │             0 │              0 │
│ 2025-05-23 22:50:00 │  24765 │  0.4530204902079576 │  0.12675 │          39.548167 │  0.2259263537252715 │ 0.30608390597020046 │ 7.7139616210838575 │      0 │             0 │              0 │
└─────────────────────┴────────┴─────────────────────┴──────────┴────────────────────┴─────────────────────┴─────────────────────┴────────────────────┴────────┴───────────────┴────────────────┘


-- Show error counts
SELECT failure_reason, count(*)
FROM read_parquet('rawlog.parquet')
WHERE failure_reason != ''
GROUP BY failure_reason;

You can also use DuckDB through its various clients including Python, R, Java, and JDBC.

Understanding Request Time vs Service Time

When analyzing latency data from the Parquet files, it’s important to understand the distinction between two key metrics:

Service Time: This is the actual time it takes for the database to process a request and return a response once the request is received by the database. It measures only the execution time of the operation.
Request Time: This is the total time from when the client intended to make the request until receiving the response. It includes the service time plus any queue time or delays that might have occurred before the request was actually sent to the database.

The difference between these metrics is critical for understanding coordinated omission, a common problem in performance testing where the test client doesn’t accurately capture the full latency that would be experienced by real users when the system is under load.

For example, if your database is overloaded and can only process 100 operations per second, but your test is trying to send 200 operations per second:

A naïve benchmark would only measure the service time of the operations that actually got processed, missing the fact that half the operations were delayed.
A properly instrumented benchmark (like cassandra-easy-stress) captures the request time, which includes how long operations had to wait in a queue.

When using the Parquet files for analysis, you can examine both metrics to get a more complete picture of your system’s performance under load:

-- Compare average service time vs request time by operation type
SELECT
    operation,
    AVG(service_duration_ns / 1000 / 1000) as avg_service_time_ms,
    AVG(request_duration_ns / 1000 / 1000) as avg_request_time_ms,
    AVG(request_duration_ns - service_duration_ns) / 1000 / 1000 as avg_queue_time_ms
FROM read_parquet('rawlog.parquet')
GROUP BY operation;

A significant difference between average request time and average service time indicates queuing or scheduling delays in your system, which can be an early warning sign of performance bottlenecks.

Workload Restrictions

The BasicTimeSeries workload only supports Cassandra versions 3.0 and above. This is because range deletes are used by this workload during runtime. Range deletes are only support in Cassandra versions 3.0. An exception will is thrown if this workload is used and a Cassandra version less than 3.0 is detected during runtime.

Developer Docs

Building the documentation

First generate the command examples with the following shell script:

$ manual/generate_examples.sh

There’s a docker service to build the HTML manual:

$ docker-compose up docs

Writing a Custom Workload

cassandra-easy-stress is a work in progress. Writing a stress workload isn’t documented yet as it is still changing.