AWS provides storage, compute and other services around the world, in regions.
Data in S3 is stored in buckets; each bucket is stored in a single region.
There are some “special” regions: China, AWS GovCloud. It is believed that the S3A connector works in these places, at least to the extent that nobody has complained about it not working.
The S3A connector connects to Amazon S3 storage over HTTPS connections, either directly or through an HTTP proxy. HTTP HEAD and GET, PUT, POST and DELETE requests are invoked to perform different read/write operations against the store.
There are multiple ways to connect to an S3 bucket.
The S3A connector supports all these; S3 Endpoints are the primary mechanism used, either explicitly declared or automatically determined from the declared region of the bucket.
The S3A connector supports S3 cross-region access via the AWS SDK; this is enabled by default. It allows users to access S3 buckets in a different region than the one defined in the S3 endpoint/region configuration, as long as they are within the same AWS partition. S3 cross-region access can be disabled by setting:
<property>
  <name>fs.s3a.cross.region.access.enabled</name>
  <value>false</value>
  <description>S3 cross region access</description>
</property>
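As elsewhere in S3A, this can be set on a per-bucket basis. A minimal sketch, assuming the usual fs.s3a.bucket.BUCKET. override pattern applies to this option, with a purely illustrative bucket name:

<property>
  <name>fs.s3a.bucket.example-bucket.cross.region.access.enabled</name>
  <value>false</value>
  <description>Disable cross region access for example-bucket only
    (illustrative bucket name)</description>
</property>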
Not supported:
* AWS Snowball.
As of December 2023, AWS S3 uses Transport Layer Security (TLS) version 1.2 to secure the communications channel; the S3A client does this through the Apache HttpClient library.
Third-party stores implementing the S3 API are also supported. These often only implement a subset of the S3 API; not all features are available. If TLS authentication is used, then the HTTPS certificates for the private stores MUST be installed on the JVMs on hosts within the Hadoop cluster.
See Working with Third-party S3 Stores after reading this document.
There are three core settings to connect to an S3 store: endpoint, region, and whether or not to use path style access.
<property>
  <name>fs.s3a.endpoint</name>
  <description>AWS S3 endpoint to connect to.
    An up-to-date list is provided in the AWS Documentation: regions and endpoints.
    Without this property, the endpoint/hostname of the S3 Store is inferred from the
    value of fs.s3a.endpoint.region, fs.s3a.endpoint.fips and more.
  </description>
</property>

<property>
  <name>fs.s3a.endpoint.region</name>
  <value>REGION</value>
  <description>AWS Region of the data</description>
</property>

<property>
  <name>fs.s3a.path.style.access</name>
  <value>false</value>
  <description>Enable S3 path style access by disabling the default virtual hosting behaviour.
    Needed for AWS PrivateLink, S3 AccessPoints, and, generally, third party stores.
    Default: false.
  </description>
</property>
Historically the S3A connector has preferred the endpoint as defined by the option fs.s3a.endpoint. With the move to the AWS V2 SDK, there is more emphasis on the region, set by the fs.s3a.endpoint.region option.

Normally, declaring the region in fs.s3a.endpoint.region should be sufficient to set up the network connection to correctly connect to an AWS-hosted S3 store.
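As a minimal sketch of such a configuration, assuming a bucket hosted in the (purely illustrative) eu-west-1 region:

<property>
  <name>fs.s3a.endpoint.region</name>
  <value>eu-west-1</value>
  <description>Region of the bucket; eu-west-1 is only an example value</description>
</property>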
* fs.s3a.endpoint and fs.s3a.endpoint.region are used to set values for S3 endpoint and region respectively.
* If fs.s3a.endpoint.region is configured with a valid AWS region value, S3A will configure the S3 client to use this value. If this is set to a region that does not match your bucket, you will receive a 301 redirect response.
* If fs.s3a.endpoint.region is not set and fs.s3a.endpoint is set with a valid endpoint value, S3A will attempt to parse the region from the endpoint and configure the S3 client to use the region value.
* If fs.s3a.endpoint and fs.s3a.endpoint.region are not set, S3A will use us-east-2 as the default region and enable cross region access. In this case, S3A does not attempt to override the endpoint while configuring the S3 client.
* If fs.s3a.endpoint is not set and fs.s3a.endpoint.region is set to an empty string, S3A will configure the S3 client without any region or endpoint override. This allows fallback to the S3 SDK region resolution chain. More details here. (A sketch of this configuration follows this list.)
* If fs.s3a.endpoint is set to the central endpoint s3.amazonaws.com and fs.s3a.endpoint.region is not set, S3A will use us-east-2 as the default region and enable cross region access. In this case, S3A does not attempt to override the endpoint while configuring the S3 client.
* If fs.s3a.endpoint is set to the central endpoint s3.amazonaws.com and fs.s3a.endpoint.region is also set to some region, S3A will use that region value and enable cross region access. In this case, S3A does not attempt to override the endpoint while configuring the S3 client.
* When cross region access is enabled while configuring the S3 client, even if the region set is incorrect, the S3 SDK determines the region. This is done by making the request; if the SDK receives a 301 redirect response, it determines the region at the cost of a HEAD request, and caches it.
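For example, to fall back to the SDK's own region resolution chain (environment variables, profile files, instance metadata), the region option can be declared as an empty string while the endpoint is left unset. A sketch of that case:

<property>
  <name>fs.s3a.endpoint.region</name>
  <value></value>
  <description>Empty region: delegate region resolution to the AWS SDK</description>
</property>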
Please note that some endpoint and region settings that require cross region access are complex and improving over time. Hence, they may be considered unstable.
If you are working with third party stores, please check third party stores in detail.
See Timeouts.
The S3A connector uses Apache HttpClient to connect to S3 Stores. The client is configured to create a pool of HTTP connections with S3, so that once the initial set of connections has been made they can be re-used for follow-up operations.
Core aspects of pool settings are:
* The pool size is set by fs.s3a.connection.maximum; if a process asks for more connections than this then threads will be blocked until they are available.
* The time blocked before an exception is raised is set in fs.s3a.connection.acquisition.timeout.
* The time an idle connection will be kept in the pool is set by fs.s3a.connection.idle.time.
* The time limit for even a non-idle connection to be kept open is set in fs.s3a.connection.ttl.
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>200</value>
  <description>Controls the maximum number of simultaneous connections to S3.
    This must be bigger than the value of fs.s3a.threads.max so as to stop
    threads being blocked waiting for new HTTPS connections.
  </description>
</property>

<property>
  <name>fs.s3a.connection.acquisition.timeout</name>
  <value>60s</value>
  <description>
    Time to wait for an HTTP connection from the pool.
    Too low: operations fail on a busy process.
    When high, it isn't obvious that the connection pool is overloaded,
    simply that jobs are slow.
  </description>
</property>

<property>
  <name>fs.s3a.connection.request.timeout</name>
  <value>60s</value>
  <description>
    Total time for a single request to take from the HTTP verb to the
    response from the server.
    0 means "no limit"
  </description>
</property>

<property>
  <name>fs.s3a.connection.part.upload.timeout</name>
  <value>15m</value>
  <description>
    Timeout for uploading all of a small object or a single part
    of a larger one.
  </description>
</property>

<property>
  <name>fs.s3a.connection.ttl</name>
  <value>5m</value>
  <description>
    Expiration time of an HTTP connection from the connection pool.
  </description>
</property>

<property>
  <name>fs.s3a.connection.idle.time</name>
  <value>60s</value>
  <description>
    Time for an idle HTTP connection to be kept in the HTTP connection pool
    before being closed.
    Too low: overhead of creating connections.
    Too high: risk of stale connections and inability to use the adaptive
    load balancing of the S3 front end.
  </description>
</property>

<property>
  <name>fs.s3a.connection.expect.continue</name>
  <value>true</value>
  <description>
    Should PUT requests await a 100 CONTINUE response before uploading data?
    This should normally be left alone unless a third party store which does
    not support it is encountered, or file uploads over long distance networks
    time out.
    (see HADOOP-19317 as an example)
  </description>
</property>

<property>
  <name>fs.s3a.connection.ssl.enabled</name>
  <value>true</value>
  <description>
    Enables or disables SSL connections to AWS services.
  </description>
</property>

<property>
  <name>fs.s3a.ssl.channel.mode</name>
  <value>Default_JSSE</value>
  <description>
    TLS implementation and cipher options.
    Values: OpenSSL, Default, Default_JSSE, Default_JSSE_with_GCM

    Default_JSSE is not truly the default JSSE implementation because the
    GCM cipher is disabled when running on Java 8. However, the name was not
    changed in order to preserve backwards compatibility. Instead, a new mode
    called Default_JSSE_with_GCM delegates to the default JSSE implementation
    with no changes to the list of enabled ciphers.

    OpenSSL requires the wildfly JAR on the classpath and a compatible
    installation of the openssl binaries. It is often faster than the JVM
    libraries, but also trickier to use.
  </description>
</property>

<property>
  <name>fs.s3a.socket.send.buffer</name>
  <value>8192</value>
  <description>Socket send buffer hint to amazon connector. Represented in bytes.</description>
</property>

<property>
  <name>fs.s3a.socket.recv.buffer</name>
  <value>8192</value>
  <description>Socket receive buffer hint to amazon connector. Represented in bytes.</description>
</property>
Connections to S3A stores can be made through an HTTP or HTTPS proxy.
<property>
  <name>fs.s3a.proxy.host</name>
  <description>Hostname of the (optional) proxy server for S3 connections.</description>
</property>

<property>
  <name>fs.s3a.proxy.ssl.enabled</name>
  <value>false</value>
  <description>Does the proxy use a TLS connection?</description>
</property>

<property>
  <name>fs.s3a.proxy.port</name>
  <description>Proxy server port. If this property is not set
    but fs.s3a.proxy.host is, port 80 or 443 is assumed (consistent with
    the value of fs.s3a.connection.ssl.enabled).
  </description>
</property>

<property>
  <name>fs.s3a.proxy.username</name>
  <description>Username for authenticating with proxy server.</description>
</property>

<property>
  <name>fs.s3a.proxy.password</name>
  <description>Password for authenticating with proxy server.</description>
</property>

<property>
  <name>fs.s3a.proxy.domain</name>
  <description>Domain for authenticating with proxy server.</description>
</property>

<property>
  <name>fs.s3a.proxy.workstation</name>
  <description>Workstation for authenticating with proxy server.</description>
</property>
Sometimes the proxy can be a source of problems, especially if HTTP connections are kept in the connection pool for some time. Experiment with the values of fs.s3a.connection.ttl and fs.s3a.connection.request.timeout if long-lived connections have problems.
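A sketch of a proxy configuration; the hostname and port here are purely illustrative:

<property>
  <name>fs.s3a.proxy.host</name>
  <value>proxy.example.com</value>
</property>

<property>
  <name>fs.s3a.proxy.port</name>
  <value>8443</value>
</property>

<property>
  <name>fs.s3a.proxy.ssl.enabled</name>
  <value>true</value>
</property>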
S3 Buckets are hosted in different “regions”, the default being “US-East-1”. The S3A client talks to this region by default, issuing HTTP requests to the server s3.amazonaws.com.
S3A can work with buckets from any region. Each region has its own S3 endpoint, documented by Amazon.
The region of a bucket can be declared in fs.s3a.endpoint.region, or by explicitly setting fs.s3a.endpoint and fs.s3a.endpoint.region.

Standard S3 buckets support cross-region access: the us-east-1 endpoint allows access to the data, but newer storage types, particularly S3 Express, are not supported.

If the wrong endpoint is used, the request will fail. This may be reported as a 301/redirect error, or as a 400 Bad Request: take these as cues to check the endpoint setting of a bucket.
The up-to-date list of regions is available online.
This list can be used to specify the endpoint of individual buckets, for example for buckets in the us-west-2 and EU/Ireland endpoints.
<property>
  <name>fs.s3a.bucket.us-west-2-dataset.endpoint.region</name>
  <value>us-west-2</value>
</property>

<property>
  <name>fs.s3a.bucket.eu-dataset.endpoint.region</name>
  <value>eu-west-1</value>
</property>
AWS PrivateLink for Amazon S3 allows for a private connection to a bucket to be defined, with network access rules managing how a bucket can be accessed.
Given a PrivateLink DNS name such as vpce-f264a96c-6d27bfa7c85e.s3.us-west-2.vpce.amazonaws.com, declare the bucket endpoint as that hostname prefixed with "https://bucket.", enable path style access, and set the region explicitly: there is no automated determination of the region from the vpce URL.

<property>
  <name>fs.s3a.bucket.example-usw2.endpoint</name>
  <value>https://bucket.vpce-f264a96c-6d27bfa7c85e.s3.us-west-2.vpce.amazonaws.com/</value>
</property>

<property>
  <name>fs.s3a.bucket.example-usw2.path.style.access</name>
  <value>true</value>
</property>

<property>
  <name>fs.s3a.bucket.example-usw2.endpoint.region</name>
  <value>us-west-2</value>
</property>
It is possible to use FIPS-compliant endpoints which support a restricted subset of TLS algorithms.
Amazon provide a specific set of FIPS endpoints to use so callers can be confident that the network communication is compliant with the standard: non-compliant algorithms are unavailable.
The boolean option fs.s3a.endpoint.fips (default false) switches the S3A connector to using the FIPS endpoint of a region.
<property>
  <name>fs.s3a.endpoint.fips</name>
  <value>true</value>
  <description>Use the FIPS endpoint</description>
</property>
For a single bucket:
<property>
  <name>fs.s3a.bucket.noaa-isd-pds.endpoint.fips</name>
  <value>true</value>
  <description>Use the FIPS endpoint for the NOAA dataset</description>
</property>
If fs.s3a.endpoint.fips is true, the endpoint option fs.s3a.endpoint MUST NOT be set to any non-central endpoint value; the only optionally allowed value for fs.s3a.endpoint is the central endpoint s3.amazonaws.com.
S3A error message if the s3.eu-west-2.amazonaws.com endpoint is used with FIPS:
Non central endpoint cannot be set when fs.s3a.endpoint.fips is true : https://s3.eu-west-2.amazonaws.com
S3A validation is used to fail fast before the SDK returns an error.
AWS SDK error message if S3A does not fail-fast:
A custom endpoint cannot be combined with FIPS: https://s3.eu-west-2.amazonaws.com
The SDK calculates the FIPS-specific endpoint without any awareness as to whether FIPS is supported by a region. The first attempt to interact with the service will fail:
java.net.UnknownHostException: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: example-london-1.s3-fips.eu-west-2.amazonaws.com
For more details on endpoint and region settings, please check S3 endpoint and region settings in detail.
Important: OpenSSL and FIPS endpoints

Linux distributions with a FIPS-compliant SSL library may not be compatible with wildfly. Always use the JDK SSL implementation unless you are confident that the library is compatible, or wish to experiment with the settings outside of production deployments.
<property>
  <name>fs.s3a.ssl.channel.mode</name>
  <value>Default_JSSE</value>
</property>
S3A supports S3 Access Point usage, which improves VPC integration with S3 and simplifies your data’s permission model, because different policies can now be applied at the Access Point level. For more information about why to use them and how to create them, make sure to read the official documentation.

Accessing data through an access point is done by using its ARN, as opposed to just the bucket name. You can set the Access Point ARN property using the following per bucket configuration property:
<property>
  <name>fs.s3a.bucket.sample-bucket.accesspoint.arn</name>
  <value>{ACCESSPOINT_ARN_HERE}</value>
  <description>Configure S3a traffic to use this AccessPoint</description>
</property>
This configures access to the sample-bucket bucket for S3A to go through the new Access Point ARN. So, for example, s3a://sample-bucket/key will now use your configured ARN when getting data from S3 instead of your bucket.

Note: the name of the bucket used in the s3a:// URLs is irrelevant; it is not used when connecting with the store.
Example
<property>
  <name>fs.s3a.bucket.example-ap.accesspoint.arn</name>
  <value>arn:aws:s3:eu-west-2:152813717728:accesspoint/ap-example-london</value>
  <description>AccessPoint bound to bucket name example-ap</description>
</property>
The fs.s3a.accesspoint.required property can also be used to require all access to S3 to go through Access Points. This has the advantage of increasing security inside a VPN / VPC as you only allow access to known sources of data defined through Access Points. If there is a need to access a bucket directly (without Access Points) then you can use per-bucket overrides to disable this setting on a bucket by bucket basis, i.e. fs.s3a.bucket.{YOUR-BUCKET}.accesspoint.required.
<!-- Require access point only access -->
<property>
  <name>fs.s3a.accesspoint.required</name>
  <value>true</value>
</property>

<!-- Disable it on a per-bucket basis if needed -->
<property>
  <name>fs.s3a.bucket.example-bucket.accesspoint.required</name>
  <value>false</value>
</property>
Before using Access Points make sure you’re not impacted by the following:
* The endpoint for S3 requests will automatically change to use s3-accesspoint.REGION.amazonaws.{com | com.cn} depending on the Access Point ARN. While considering endpoints, if you have any custom signers that use the host endpoint property, make sure to update them if needed.
The storediag command within the utility cloudstore JAR is recommended as the way to view and print settings.

If storediag doesn’t connect to your S3 store, nothing else will.
Based on the experience of people who field support calls, here are some of the main connectivity issues which cause problems.
If more connections are needed than the HTTP connection pool has, then worker threads will block until one is freed.
If the wait exceeds the time set in fs.s3a.connection.acquisition.timeout, the operation will fail with "Timeout waiting for connection from pool".
This may be retried, but time has been lost, which results in slower operations. If queries suddenly get slower as the number of active operations increases, then this is a possible cause.
Fixes:
Increase the value of fs.s3a.connection.maximum. This is the general fix on query engines such as Apache Spark and Apache Impala, which run many worker threads simultaneously and do not keep files open past the duration of a single task within a larger query (see the example below).
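The values in this sketch are purely illustrative and should be matched to the application's thread count; keep fs.s3a.connection.maximum above fs.s3a.threads.max:

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>500</value>
  <description>Illustrative value: must stay above fs.s3a.threads.max</description>
</property>

<property>
  <name>fs.s3a.threads.max</name>
  <value>256</value>
  <description>Illustrative value for the S3A thread pool</description>
</property>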
It can also surface with applications which deliberately keep files open for extended periods. These should ideally call unbuffer() on the input streams. This will free up the connection until another read operation is invoked, yet still re-open faster than if open(Path) were invoked.
Applications may also be “leaking” HTTP connections by failing to close() them. This is potentially fatal, as eventually the connection pool can get exhausted, at which point the program will no longer work.
This can only be fixed in the application code: it is not a bug in the S3A filesystem.
* Applications MUST call close() on an input stream when the contents of the file are no longer needed.
* If long-lived applications eventually fail with unrecoverable ApiCallTimeout exceptions, they are not doing so.

To aid in identifying the location of these leaks, when a JVM garbage collection releases an unreferenced S3AInputStream instance, it will log at WARN level that it has not been closed, listing the file URL and the name + ID of the thread which created the file. The stack trace of the open() call will be logged at INFO:
2024-11-13 12:48:24,537 [Finalizer] WARN resource.leaks (LeakReporter.java:close(114)) - Stream not closed while reading s3a://bucket/test/testFinalizer; thread: JUnit-testFinalizer; id: 11
2024-11-13 12:48:24,537 [Finalizer] INFO resource.leaks (LeakReporter.java:close(120)) - stack
java.io.IOException: Stream not closed while reading s3a://bucket/test/testFinalizer; thread: JUnit-testFinalizer; id: 11
  at org.apache.hadoop.fs.impl.LeakReporter.<init>(LeakReporter.java:101)
  at org.apache.hadoop.fs.s3a.S3AInputStream.<init>(S3AInputStream.java:257)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1891)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:1841)
  at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:997)
  at org.apache.hadoop.fs.s3a.ITestS3AInputStreamLeakage.testFinalizer(ITestS3AInputStreamLeakage.java:99)
It will also abort() the HTTP connection, freeing up space in the connection pool. This automated cleanup is not a substitute for applications correctly closing input streams; it only happens during garbage collection, and may not be rapid enough to prevent an application running out of connections.
It is possible to stop these warning messages from being logged by restricting the log org.apache.hadoop.fs.resource.leaks to only log at ERROR or above. This will also disable error logging for all other resources whose leaks are detected.
log4j.logger.org.apache.hadoop.fs.s3a.connection.leaks=ERROR
To disable the stack traces while retaining the URI/thread information, set the log level to WARN:
log4j.logger.org.apache.hadoop.fs.s3a.connection.leaks=WARN
This is better for production deployments: leakages are reported, but stack traces, which are only of relevance to the application developers, are omitted.
Finally, note that the filesystem and thread context IOStatistic stream_leaks is updated; if these statistics are collected then the existence of leakages can be detected.
All hosts in the cluster need to have the configuration secrets; local environment variables are not enough.
If HTTPS/TLS is used for a private store, the relevant certificates MUST be installed everywhere.
For applications such as distcp, the options need to be passed with the job.
If your cluster is configured to use a private store, AWS-hosted buckets are not visible. If you wish to access data in a private store, you need to change the endpoint.
Private S3 stores generally expect path style access.
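As a sketch, a per-bucket configuration for a private store combines an explicit endpoint with path style access; the bucket name and endpoint here are purely illustrative:

<property>
  <name>fs.s3a.bucket.private-dataset.endpoint</name>
  <value>https://store.example.internal</value>
</property>

<property>
  <name>fs.s3a.bucket.private-dataset.path.style.access</name>
  <value>true</value>
</property>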
These usually surface rapidly and with meaningful messages.
Region errors generally surface as:
* UnknownHostException
* AWSRedirectException “Received permanent redirect response to region”
Endpoint configuration problems can be more varied, as they are just HTTPS URLs.
When it works, it is fast. But it is fussy about OpenSSL implementations, TLS protocols and more. Because it uses the native OpenSSL binaries, operating system updates can trigger regressions.
Disabling it should be the first step to troubleshooting any TLS problems.
If there is a proxy, set it up correctly.