Kudu is primarily designed for analytic use cases. You are likely to encounter issues if a single row contains multiple kilobytes of data.
The columns which make up the primary key must be listed first in the schema.
Columns that are part of the primary key cannot be renamed. The primary key may not be changed after the table is created. You must drop and recreate a table to select a new primary key or rename key columns.
The primary key of a row may not be modified using the UPDATE
functionality.
To modify a row’s primary key, the row must be deleted and re-inserted with
the modified key. Such a modification is non-atomic.
Columns with DOUBLE
, FLOAT
, or BOOL
types are not allowed as part of a
primary key definition. Additionally, all columns that are part of a primary
key definition must be NOT NULL
.
Type and nullability of existing columns cannot be changed by altering the table.
Dropping a column does not immediately reclaim space. Compaction must run first. There is no way to run compaction manually, but dropping the table will reclaim the space immediately.
Tables must be manually pre-split into tablets using simple or compound primary keys. Automatic splitting is not yet possible. Range partitions may be added or dropped after a table has been created. See Schema Design for more information.
Data in existing tables cannot currently be automatically repartitioned. As a workaround, create a new table with the new partitioning and insert the contents of the old table.
Kudu does not currently include any built-in features for backup and restore. Users are encouraged to use tools such as Spark or Impala to export or import tables as necessary.
Updates, inserts, and deletes via Impala are non-transactional. If a query fails part of the way through, its partial effects will not be rolled back.
No timestamp and decimal type support.
The maximum parallelism of a single query is limited to the number of tablets in a table. For good analytic performance, aim for 10 or more tablets per host or use large tables.
Authentication and authorization features are not implemented.
Data encryption is not built in. Kudu has been reported to run correctly
on systems using local block device encryption (e.g. dmcrypt
).
The following are known bugs and issues with the current release of Kudu. They will be addressed in later releases. Note that this list is not exhaustive, and is meant to communicate only the most important known issues.
If the Kudu master is configured with the -log_force_fsync_all
option, tablet servers
and clients will experience frequent timeouts, and the cluster may become unusable.
If a tablet server has a very large number of tablets, it may take several minutes to start up. It is recommended to limit the number of tablets per server to 100 or fewer. Consider this limitation when pre-splitting your tables. If you notice slow start-up times, you can monitor the number of tablets per server in the web UI.