Kudu can be configured to enforce secure authentication among servers, and
between clients and servers. Authentication prevents untrusted actors from
gaining access to Kudu, and securely identifies the connecting user or services
for authorization checks. Authentication in Kudu is designed to interoperate
with other secure Hadoop components by utilizing Kerberos.
Authentication can be configured on Kudu servers using the
--rpc-authentication
flag, which can be set to required
, optional
, or
disabled
. By default, the flag is set to optional
. When required
, Kudu
will reject connections from clients and servers who lack authentication
credentials. When optional
, Kudu will attempt to use strong authentication.
When disabled
or strong authentication fails for 'optional', by default Kudu
will only allow unauthenticated connections from trusted subnets, which are
private networks (127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,
169.254.0.0/16) and local subnets of all local network interfaces. Unauthenticated
connections from publicly routable IPs will be rejected.
The trusted subnets can be configured using the --trusted_subnets
flag,
which can be set to IP blocks in CIDR notation separated by comma. Set it to
'0.0.0.0/0' to allow unauthenticated connections from all remote IP addresses.
However, if network access is not otherwise restricted by a firewall,
malicious users may be able to gain unauthorized access. This can be mitigated
if authentication is configured to be required.
|
When the --rpc-authentication flag is set to optional ,
the cluster does not prevent access from unauthenticated users. To secure a
cluster, use --rpc-authentication=required .
|
Kudu uses an internal PKI system to issue X.509 certificates to servers in
the cluster. Connections between peers who have both obtained certificates will
use TLS for authentication, which doesn’t require contacting the Kerberos KDC.
These certificates are only used for internal communication among Kudu
servers, and between Kudu clients and servers. The certificates are never
presented in a public facing protocol.
By using internally-issued certificates, Kudu offers strong authentication which
scales to huge clusters, and allows TLS encryption to be used without requiring
you to manually deploy certificates on every node.
After authenticating to a secure cluster, the Kudu client will automatically
request an authentication token from the Kudu master. An authentication token
encapsulates the identity of the authenticated user and carries the master’s
RSA signature so that its authenticity can be verified.
This token will be used to authenticate subsequent connections. By default,
authentication tokens are only valid for seven days, so that even if a token
were compromised, it could not be used indefinitely. For the most part,
authentication tokens should be completely transparent to users. By using
authentication tokens, Kudu takes advantage of strong authentication without
paying the scalability cost of communicating with a central authority for every
connection.
When used with distributed compute frameworks such as Spark, authentication
tokens can simplify configuration and improve security. For example, the Kudu
Spark connector will automatically retrieve an authentication token during the
planning stage, and distribute the token to tasks. This allows Spark to work
against a secured Kudu cluster where only the planner node has Kerberos
credentials.