Class FileSystem

java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.hadoop.fs.FileSystem
All Implemented Interfaces:
Closeable, AutoCloseable, Configurable, BulkDeleteSource, org.apache.hadoop.fs.PathCapabilities, org.apache.hadoop.security.token.DelegationTokenIssuer
Direct Known Subclasses:
AdlFileSystem, FilterFileSystem, FTPFileSystem, NativeAzureFileSystem, RawLocalFileSystem, ViewFileSystem

@Public @Stable public abstract class FileSystem extends Configured implements Closeable, org.apache.hadoop.security.token.DelegationTokenIssuer, org.apache.hadoop.fs.PathCapabilities, BulkDeleteSource
An abstract base class for a fairly generic filesystem. It may be implemented as a distributed filesystem, or as a "local" one that reflects the locally-connected disk. The local version exists for small Hadoop instances and for testing.

All user code that may potentially use the Hadoop Distributed File System should be written to use a FileSystem object or its successor, FileContext.

The local implementation is LocalFileSystem and the distributed implementation is DistributedFileSystem. There are other implementations for object stores and, outside the Apache Hadoop codebase, third-party filesystems.

Notes
  1. The behaviour of the filesystem is specified in the Hadoop documentation. However, the normative specification of the behavior of this class is actually HDFS: if HDFS does not behave the way these Javadocs or the specification in the Hadoop documentation define, assume that the documentation is incorrect.
  2. The term FileSystem refers to an instance of this class.
  3. The acronym "FS" is used as an abbreviation of FileSystem.
  4. The term filesystem refers to the distributed/local filesystem itself, rather than the class used to interact with it.
  5. The term "file" refers to a file in the remote filesystem, rather than instances of java.io.File.
This is a carefully evolving class. New methods may be marked as Unstable or Evolving for their initial release, as a warning that they are new and may change based on the experience of use in applications.

Important note for developers

If you are making changes here to the public API or protected methods, you must review the following subclasses and make sure that they are filtering/passing through new methods as appropriate. FilterFileSystem: methods are passed through. If not, then TestFilterFileSystem.MustNotImplement must be updated with the unsupported interface. Furthermore, if the new API's support is probed for via hasPathCapability(Path, String) then FilterFileSystem.hasPathCapability(Path, String) must return false, always.

ChecksumFileSystem: checksums are created and verified.

TestHarFileSystem will need its MustNotImplement interface updated.

There are some external places your changes will break things. Do co-ordinate changes here.

HBase: HBoss

Hive: HiveShim23

shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
  • Field Details

    • FS_DEFAULT_NAME_KEY

      public static final String FS_DEFAULT_NAME_KEY
    • DEFAULT_FS

      public static final String DEFAULT_FS
    • LOG

      @Private public static final org.slf4j.Logger LOG
      This log is widely used in the org.apache.hadoop.fs code and tests, so must be considered something to only be changed with care.
    • SHUTDOWN_HOOK_PRIORITY

      public static final int SHUTDOWN_HOOK_PRIORITY
      Priority of the FileSystem shutdown hook: 10.
    • TRASH_PREFIX

      public static final String TRASH_PREFIX
      Prefix for trash directory: ".Trash".
    • USER_HOME_PREFIX

      public static final String USER_HOME_PREFIX
    • statistics

      protected org.apache.hadoop.fs.FileSystem.Statistics statistics
      The statistics for this file system.
  • Constructor Details

    • FileSystem

      protected FileSystem()
  • Method Details

    • get

      public static FileSystem get(URI uri, Configuration conf, String user) throws IOException, InterruptedException
      Get a FileSystem instance based on the uri, the passed in configuration and the user.
      Parameters:
      uri - of the filesystem
      conf - the configuration to use
      user - to perform the get as
      Returns:
      the filesystem instance
      Throws:
      IOException - failure to load
      InterruptedException - If the UGI.doAs() call was somehow interrupted.
    • get

      public static FileSystem get(Configuration conf) throws IOException
      Returns the configured FileSystem implementation.
      Parameters:
      conf - the configuration to use
      Returns:
      FileSystem.
      Throws:
      IOException - If an I/O error occurred.
    • getDefaultUri

      public static URI getDefaultUri(Configuration conf)
      Get the default FileSystem URI from a configuration.
      Parameters:
      conf - the configuration to use
      Returns:
      the uri of the default filesystem
    • setDefaultUri

      public static void setDefaultUri(Configuration conf, URI uri)
      Set the default FileSystem URI in a configuration.
      Parameters:
      conf - the configuration to alter
      uri - the new default filesystem uri
    • setDefaultUri

      public static void setDefaultUri(Configuration conf, String uri)
      Set the default FileSystem URI in a configuration.
      Parameters:
      conf - the configuration to alter
      uri - the new default filesystem uri
    • initialize

      public void initialize(URI name, Configuration conf) throws IOException
      Initialize a FileSystem. Called after the new FileSystem instance is constructed, and before it is ready for use. FileSystem implementations overriding this method MUST forward it to their superclass, though the order in which it is done, and whether to alter the configuration before the invocation are options of the subclass.
      Parameters:
      name - a URI whose authority section names the host, port, etc. for this FileSystem
      conf - the configuration
      Throws:
      IOException - on any failure to initialize this instance.
      IllegalArgumentException - if the URI is considered invalid.
    • getScheme

      public String getScheme()
      Return the protocol scheme for this FileSystem.

      This implementation throws an UnsupportedOperationException.

      Returns:
      the protocol scheme for this FileSystem.
      Throws:
      UnsupportedOperationException - if the operation is unsupported (default).
    • getUri

      public abstract URI getUri()
      Returns a URI which identifies this FileSystem.
      Returns:
      the URI of this filesystem.
    • getCanonicalUri

      protected URI getCanonicalUri()
      Return a canonicalized form of this FileSystem's URI. The default implementation simply calls canonicalizeUri(URI) on the filesystem's own URI, so subclasses typically only need to implement that method.
      Returns:
      the URI of this filesystem.
    • canonicalizeUri

      protected URI canonicalizeUri(URI uri)
      Canonicalize the given URI. This is implementation-dependent, and may for example consist of canonicalizing the hostname using DNS and adding the default port if not specified. The default implementation simply fills in the default port if not specified and if getDefaultPort() returns a default port.
      Parameters:
      uri - url.
      Returns:
      URI
      See Also:
      • NetUtils.getCanonicalUri(URI, int)
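      The default canonicalization described above (fill in the default port only when the URI has none and a default port exists) can be sketched with plain java.net.URI. This is an illustrative model, not the Hadoop implementation; the class name DefaultPortSketch and the method withDefaultPort are hypothetical, and the port 8020 in the usage is only an example value.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative model only; not the Hadoop implementation. */
final class DefaultPortSketch {

    /**
     * Returns the URI unchanged if it has no host, already carries a port,
     * or no default port is known (defaultPort <= 0). Otherwise returns a
     * copy of the URI with the default port filled in.
     */
    static URI withDefaultPort(URI uri, int defaultPort) {
        if (uri.getHost() == null || uri.getPort() != -1 || defaultPort <= 0) {
            return uri;
        }
        try {
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                    defaultPort, uri.getPath(), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // No port given: the default is added.
        System.out.println(withDefaultPort(URI.create("hdfs://namenode/data"), 8020));
        // Explicit port: left untouched.
        System.out.println(withDefaultPort(URI.create("hdfs://namenode:9000/data"), 8020));
        // No default port (0): left untouched.
        System.out.println(withDefaultPort(URI.create("file:///tmp"), 0));
    }
}
```

      A default port of 0 is treated as "no default", matching the contract of getDefaultPort() documented below.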
    • getDefaultPort

      protected int getDefaultPort()
      Get the default port for this FileSystem.
      Returns:
      the default port or 0 if there isn't one
    • getFSofPath

      protected static FileSystem getFSofPath(Path absOrFqPath, Configuration conf) throws UnsupportedFileSystemException, IOException
      Throws:
      UnsupportedFileSystemException
      IOException
    • getCanonicalServiceName

      @Public @Evolving public String getCanonicalServiceName()
      Get a canonical service name for this FileSystem. The token cache is the only user of the canonical service name, and uses it to lookup this FileSystem's service tokens. If the file system provides a token of its own then it must have a canonical name, otherwise the canonical name can be null. Default implementation: If the FileSystem has child file systems (such as an embedded file system) then it is assumed that the FS has no tokens of its own and hence returns a null name; otherwise a service name is built using Uri and port.
      Specified by:
      getCanonicalServiceName in interface org.apache.hadoop.security.token.DelegationTokenIssuer
      Returns:
      a service string that uniquely identifies this file system, null if the filesystem does not implement tokens
    • getName

      @Deprecated public String getName()
      Deprecated.
      call getUri() instead.
      Returns:
      uri to string.
    • getNamed

      @Deprecated public static FileSystem getNamed(String name, Configuration conf) throws IOException
      Deprecated.
      Parameters:
      name - name.
      conf - configuration.
      Returns:
      file system.
      Throws:
      IOException - If an I/O error occurred.
    • getLocal

      public static LocalFileSystem getLocal(Configuration conf) throws IOException
      Get the local FileSystem.
      Parameters:
      conf - the configuration to configure the FileSystem with if it is newly instantiated.
      Returns:
      a LocalFileSystem
      Throws:
      IOException - if somehow the local FS cannot be instantiated.
    • get

      public static FileSystem get(URI uri, Configuration conf) throws IOException
      Get a FileSystem for this URI's scheme and authority.
      1. If the configuration has the property "fs.$SCHEME.impl.disable.cache" set to true, a new instance will be created, initialized with the supplied URI and configuration, then returned without being cached.
      2. If there is a cached FS instance matching the same URI, it will be returned.
      3. Otherwise: a new FS instance will be created, initialized with the configuration and URI, cached and returned to the caller.
      Parameters:
      uri - uri of the filesystem.
      conf - configuration.
      Returns:
      filesystem instance.
      Throws:
      IOException - if the FileSystem cannot be instantiated.
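      The three-step lookup above can be modelled in a few lines of plain Java. This is a toy model under stated assumptions, not Hadoop's cache: FsCacheSketch, key and the factory parameter are hypothetical names, the key uses only the lower-cased scheme and authority, and the real cache also takes the current user into account, which this sketch omits.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Toy model of the lookup order described above; not Hadoop's cache. */
final class FsCacheSketch<T> {
    private final Map<String, T> cache = new HashMap<>();

    /** Cache key: lower-cased scheme and authority; the path is ignored. */
    static String key(URI uri) {
        String scheme = uri.getScheme() == null
                ? "" : uri.getScheme().toLowerCase(Locale.ROOT);
        String authority = uri.getAuthority() == null
                ? "" : uri.getAuthority().toLowerCase(Locale.ROOT);
        return scheme + "://" + authority;
    }

    /**
     * @param cacheDisabled models "fs.$SCHEME.impl.disable.cache" being true
     * @param factory       creates and initializes a new instance for the URI
     */
    synchronized T get(URI uri, boolean cacheDisabled, Function<URI, T> factory) {
        if (cacheDisabled) {
            return factory.apply(uri);          // step 1: new instance, never cached
        }
        T cached = cache.get(key(uri));
        if (cached != null) {
            return cached;                      // step 2: reuse the cached instance
        }
        T created = factory.apply(uri);         // step 3: create, cache, return
        cache.put(key(uri), created);
        return created;
    }

    public static void main(String[] args) {
        FsCacheSketch<Object> cache = new FsCacheSketch<>();
        Object a = cache.get(URI.create("hdfs://nn:8020/a"), false, u -> new Object());
        Object b = cache.get(URI.create("hdfs://nn:8020/b"), false, u -> new Object());
        System.out.println(a == b);             // same scheme and authority
    }
}
```

      One practical consequence of sharing: an instance returned from the cache may be in use elsewhere in the same process, so closing it affects every other holder. Callers that need an instance of their own can use newInstance(URI, Configuration) instead.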
    • newInstance

      public static FileSystem newInstance(URI uri, Configuration conf, String user) throws IOException, InterruptedException
      Returns the FileSystem for this URI's scheme and authority and the given user. Internally invokes newInstance(URI, Configuration).
      Parameters:
      uri - uri of the filesystem.
      conf - the configuration to use
      user - to perform the get as
      Returns:
      filesystem instance
      Throws:
      IOException - if the FileSystem cannot be instantiated.
      InterruptedException - If the UGI.doAs() call was somehow interrupted.
    • newInstance

      public static FileSystem newInstance(URI uri, Configuration config) throws IOException
      Returns the FileSystem for this URI's scheme and authority. The entire URI is passed to the FileSystem instance's initialize method. This always returns a new FileSystem object.
      Parameters:
      uri - FS URI
      config - configuration to use
      Returns:
      the new FS instance
      Throws:
      IOException - FS creation or initialization failure.
    • newInstance

      public static FileSystem newInstance(Configuration conf) throws IOException
      Returns a unique configured FileSystem implementation for the default filesystem of the supplied configuration. This always returns a new FileSystem object.
      Parameters:
      conf - the configuration to use
      Returns:
      the new FS instance
      Throws:
      IOException - FS creation or initialization failure.
    • newInstanceLocal

      public static LocalFileSystem newInstanceLocal(Configuration conf) throws IOException
      Get a unique local FileSystem object.
      Parameters:
      conf - the configuration to configure the FileSystem with
      Returns:
      a new LocalFileSystem object.
      Throws:
      IOException - FS creation or initialization failure.
    • closeAll

      public static void closeAll() throws IOException
      Close all cached FileSystem instances. After this operation, they may not be used in any operations.
      Throws:
      IOException - a problem arose closing one or more filesystems.
    • closeAllForUGI

      public static void closeAllForUGI(UserGroupInformation ugi) throws IOException
      Close all cached FileSystem instances for a given UGI. Be sure those filesystems are not used anymore.
      Parameters:
      ugi - user group info to close
      Throws:
      IOException - a problem arose closing one or more filesystems.
    • makeQualified

      public Path makeQualified(Path path)
      Qualify a path to one which uses this FileSystem and, if relative, made absolute.
      Parameters:
      path - to qualify.
      Returns:
      this path if it contains a scheme and authority and is absolute, or a new path that includes a scheme and authority and is fully qualified
      Throws:
      IllegalArgumentException - if the path has a scheme/URI different from this FileSystem.
      See Also:
      • Path.makeQualified(URI, Path)
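      As a rough illustration of what "qualifying" means, the sketch below uses java.net.URI resolution as a stand-in for Path handling. It is an analogy under stated assumptions, not the Hadoop code path: QualifySketch and qualify are hypothetical names, and the working directory URI is assumed to be fully qualified and to end with a slash.

```java
import java.net.URI;

/** Analogy using java.net.URI; not how Path.makeQualified is implemented. */
final class QualifySketch {

    /**
     * Resolves a path against a fully qualified working directory.
     * Relative paths are made absolute under the working directory,
     * absolute paths gain the scheme and authority, and paths that
     * already carry a scheme are returned as they are.
     */
    static URI qualify(URI workingDir, String path) {
        return workingDir.resolve(URI.create(path));
    }

    public static void main(String[] args) {
        URI wd = URI.create("hdfs://nn:8020/user/alice/");
        System.out.println(qualify(wd, "data/part-0"));      // relative path
        System.out.println(qualify(wd, "/tmp/part-0"));      // absolute, unqualified
        System.out.println(qualify(wd, "hdfs://nn:8020/x")); // already qualified
    }
}
```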
    • getDelegationToken

      @Private public Token<?> getDelegationToken(String renewer) throws IOException
      Get a new delegation token for this FileSystem. This is an internal method that should have been declared protected but wasn't historically. Callers should use DelegationTokenIssuer.addDelegationTokens(String, Credentials).
      Specified by:
      getDelegationToken in interface org.apache.hadoop.security.token.DelegationTokenIssuer
      Parameters:
      renewer - the account name that is allowed to renew the token.
      Returns:
      a new delegation token or null if the FS does not support tokens.
      Throws:
      IOException - on any problem obtaining a token
    • getChildFileSystems

      @LimitedPrivate("HDFS") @VisibleForTesting public FileSystem[] getChildFileSystems()
      Get all the immediate child FileSystems embedded in this FileSystem. It does not recurse and get grand children. If a FileSystem has multiple child FileSystems, then it must return a unique list of those FileSystems. Default is to return null to signify no children.
      Returns:
      FileSystems that are direct children of this FileSystem, or null for "no children"
    • getAdditionalTokenIssuers

      @Private public org.apache.hadoop.security.token.DelegationTokenIssuer[] getAdditionalTokenIssuers() throws IOException
      Description copied from interface: org.apache.hadoop.security.token.DelegationTokenIssuer
      Issuers may need tokens from additional services.
      Specified by:
      getAdditionalTokenIssuers in interface org.apache.hadoop.security.token.DelegationTokenIssuer
      Returns:
      delegation token issuer.
      Throws:
      IOException - raised on errors performing I/O.
    • create

      public static FSDataOutputStream create(FileSystem fs, Path file, FsPermission permission) throws IOException
      Create a file with the provided permission. The permission of the file is set to be the provided permission as in setPermission, not permission&~umask. The HDFS implementation uses two RPCs. This is known to be inefficient, but the implementation is thread-safe. The alternative is to set the umask in the configuration to 0, but that is not thread-safe.
      Parameters:
      fs - FileSystem
      file - the name of the file to be created
      permission - the permission of the file
      Returns:
      an output stream
      Throws:
      IOException - IO failure
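      The difference between "the provided permission" and permission&~umask is plain bit arithmetic. The sketch below uses ints in place of FsPermission and assumes a umask of 022 purely for illustration; UmaskSketch, masked and octal are hypothetical names.

```java
/** Bit arithmetic only; plain ints stand in for FsPermission. */
final class UmaskSketch {

    /** The mode a umask-respecting create would produce: permission & ~umask. */
    static int masked(int permission, int umask) {
        return permission & ~umask & 0777;
    }

    /** Formats a mode as three octal digits. */
    static String octal(int mode) {
        return String.format("%03o", mode);
    }

    public static void main(String[] args) {
        int requested = 0777;
        int umask = 0022; // assumed value for this example
        // With the umask applied, group and other lose write access.
        System.out.println(octal(masked(requested, umask))); // prints 755
        // The static create/mkdirs helpers set the requested mode as given.
        System.out.println(octal(requested));                // prints 777
    }
}
```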
    • mkdirs

      public static boolean mkdirs(FileSystem fs, Path dir, FsPermission permission) throws IOException
      Create a directory with the provided permission. The permission of the directory is set to be the provided permission as in setPermission, not permission&~umask.
      Parameters:
      fs - FileSystem handle
      dir - the name of the directory to be created
      permission - the permission of the directory
      Returns:
      true if the directory creation succeeds; false otherwise
      Throws:
      IOException - A problem creating the directories.
    • checkPath

      protected void checkPath(Path path)
      Check that a Path belongs to this FileSystem. The base implementation performs case insensitive equality checks of the URIs' schemes and authorities. Subclasses may implement slightly different checks.
      Parameters:
      path - to check
      Throws:
      IllegalArgumentException - if the path is not considered to be part of this FileSystem.
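      The case-insensitive scheme and authority comparison can be sketched with java.net.URI. This is a simplified model, not the base implementation: CheckPathSketch, belongs and check are hypothetical names, and treating a path with no scheme as belonging to the filesystem is an assumption of this sketch.

```java
import java.net.URI;

/** Simplified model of the scheme/authority check; not the Hadoop code. */
final class CheckPathSketch {

    /** True if the path's scheme and authority match the filesystem's, ignoring case. */
    static boolean belongs(URI fsUri, URI pathUri) {
        String scheme = pathUri.getScheme();
        if (scheme == null) {
            return true; // assumption: unqualified paths are relative to this FS
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            return false;
        }
        String pathAuthority = pathUri.getAuthority();
        String fsAuthority = fsUri.getAuthority();
        return pathAuthority == null
                ? fsAuthority == null
                : pathAuthority.equalsIgnoreCase(fsAuthority);
    }

    /** Throws IllegalArgumentException when the path is from another filesystem. */
    static void check(URI fsUri, URI pathUri) {
        if (!belongs(fsUri, pathUri)) {
            throw new IllegalArgumentException(
                    "Wrong FS: " + pathUri + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://nn:8020");
        check(fs, URI.create("HDFS://NN:8020/data")); // passes: case is ignored
        try {
            check(fs, URI.create("s3a://bucket/data"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```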
    • getFileBlockLocations

      public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len) throws IOException
      Return an array containing hostnames, offset and size of portions of the given file. For a nonexistent file or region, null is returned.
         if file == null :
           result = null
         elif file.getLen() <= start:
           result = []
         else result = [ locations(FS, b) for b in blocks(FS, file, start, start + len)]
       
      This call is most helpful with a distributed filesystem where the hostnames of machines that contain blocks of the given file can be determined. The default implementation returns an array containing one element:
       BlockLocation( { "localhost:9866" },  { "localhost" }, 0, file.getLen())
       
      In HDFS, if the file is three-replicated, the returned array contains elements like:
       BlockLocation(offset: 0, length: BLOCK_SIZE,
         hosts: {"host1:9866", "host2:9866", "host3:9866"})
       BlockLocation(offset: BLOCK_SIZE, length: BLOCK_SIZE,
         hosts: {"host2:9866", "host3:9866", "host4:9866"})
       
      And if a file is erasure-coded, the returned BlockLocations are logical block groups. Suppose we have an RS_3_2 coded file (3 data units and 2 parity units).
      1. If the file size is less than one stripe size, say 2 * CELL_SIZE, then there will be one BlockLocation returned, with 0 offset, actual file size and 4 hosts (2 data blocks and 2 parity blocks) hosting the actual blocks.
      2. If the file size is less than one group size but greater than one stripe size, then there will be one BlockLocation returned, with 0 offset, actual file size and 5 hosts (3 data blocks and 2 parity blocks) hosting the actual blocks.
      3. If the file size is greater than one group size, 3 * BLOCK_SIZE + 123 for example, then the result will be like:
       BlockLocation(offset: 0, length: 3 * BLOCK_SIZE, hosts: {"host1:9866",
         "host2:9866","host3:9866","host4:9866","host5:9866"})
       BlockLocation(offset: 3 * BLOCK_SIZE, length: 123, hosts: {"host1:9866",
         "host4:9866", "host5:9866"})
       
      Parameters:
      file - FileStatus to get data from
      start - offset into the given file
      len - length for which to get locations
      Returns:
      block location array.
      Throws:
      IOException - IO failure
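      The three branches of the pseudocode above can be modelled for a file stored in fixed-size, non-erasure-coded blocks. This is a toy model, not the HDFS logic: BlockRangeSketch and blocks are hypothetical names, {offset, length} pairs stand in for BlockLocation, and the sketch assumes blockSize > 0, start >= 0 and len > 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the pseudocode; {offset, length} pairs stand in for BlockLocation. */
final class BlockRangeSketch {

    /**
     * @param fileLen   file length in bytes, or null to model a missing file
     * @param blockSize hypothetical fixed block size (> 0)
     * @param start     offset into the file (>= 0)
     * @param len       number of bytes of interest (> 0)
     * @return null for a missing file, an empty list if start is at or past
     *         the end of the file, else every block overlapping the range
     */
    static List<long[]> blocks(Long fileLen, long blockSize, long start, long len) {
        if (fileLen == null) {
            return null;
        }
        List<long[]> result = new ArrayList<>();
        if (fileLen <= start) {
            return result;
        }
        long end = Math.min(fileLen, start + len);
        long first = (start / blockSize) * blockSize; // start of the block holding 'start'
        for (long offset = first; offset < end; offset += blockSize) {
            // The last block of the file may be shorter than blockSize.
            result.add(new long[] {offset, Math.min(blockSize, fileLen - offset)});
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] b : blocks(200L, 128, 0, 200)) {
            System.out.println("offset=" + b[0] + " length=" + b[1]);
        }
    }
}
```

      For a 200-byte file with a 128-byte block size, the usage above prints two entries: offset=0 length=128 and offset=128 length=72.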
    • getFileBlockLocations

      public BlockLocation[] getFileBlockLocations(Path p, long start, long len) throws IOException
      Return an array containing hostnames, offset and size of portions of the given file. For a nonexistent file or region, null is returned. This call is most helpful with location-aware distributed filesystems, where it returns hostnames of machines that contain the given file. A FileSystem will normally return the equivalent result of passing the FileStatus of the path to getFileBlockLocations(FileStatus, long, long).
      Parameters:
      p - path is used to identify an FS since an FS could have another FS that it could be delegating the call to
      start - offset into the given file
      len - length for which to get locations
      Returns:
      block location array.
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - IO failure
    • getServerDefaults

      @Deprecated public FsServerDefaults getServerDefaults() throws IOException
      Deprecated.
      Return a set of server default configuration values.
      Returns:
      server default configuration values
      Throws:
      IOException - IO failure
    • getServerDefaults

      public FsServerDefaults getServerDefaults(Path p) throws IOException
      Return a set of server default configuration values.
      Parameters:
      p - path is used to identify an FS since an FS could have another FS that it could be delegating the call to
      Returns:
      server default configuration values
      Throws:
      IOException - IO failure
    • resolvePath

      public Path resolvePath(Path p) throws IOException
      Return the fully-qualified path of path, resolving the path through any symlinks or mount points.
      Parameters:
      p - path to be resolved
      Returns:
      fully qualified path
      Throws:
      FileNotFoundException - if the path is not present
      IOException - for any other error
    • open

      public abstract FSDataInputStream open(Path f, int bufferSize) throws IOException
      Opens an FSDataInputStream at the indicated Path.
      Parameters:
      f - the file name to open
      bufferSize - the size of the buffer to be used.
      Returns:
      input stream.
      Throws:
      IOException - IO failure
    • open

      public FSDataInputStream open(Path f) throws IOException
      Opens an FSDataInputStream at the indicated Path.
      Parameters:
      f - the file to open
      Returns:
      input stream.
      Throws:
      IOException - IO failure
    • open

      public FSDataInputStream open(PathHandle fd) throws IOException
      Open an FSDataInputStream matching the PathHandle instance. The implementation may encode metadata in PathHandle to address the resource directly and verify that the resource referenced satisfies constraints specified at its construction.
      Parameters:
      fd - PathHandle object returned by the FS authority.
      Returns:
      input stream.
      Throws:
      InvalidPathHandleException - If PathHandle constraints are not satisfied
      IOException - IO failure
      UnsupportedOperationException - If open(PathHandle, int) not overridden by subclass
    • open

      public FSDataInputStream open(PathHandle fd, int bufferSize) throws IOException
      Open an FSDataInputStream matching the PathHandle instance. The implementation may encode metadata in PathHandle to address the resource directly and verify that the resource referenced satisfies constraints specified at its construction.
      Parameters:
      fd - PathHandle object returned by the FS authority.
      bufferSize - the size of the buffer to use
      Returns:
      input stream.
      Throws:
      InvalidPathHandleException - If PathHandle constraints are not satisfied
      IOException - IO failure
      UnsupportedOperationException - If not overridden by subclass
    • getPathHandle

      public final PathHandle getPathHandle(FileStatus stat, org.apache.hadoop.fs.Options.HandleOpt... opt)
      Create a durable, serializable handle to the referent of the given entity.
      Parameters:
      stat - Referent in the target FileSystem
      opt - If absent, assume Options.HandleOpt.path().
      Returns:
      path handle.
      Throws:
      IllegalArgumentException - If the FileStatus does not belong to this FileSystem
      UnsupportedOperationException - If createPathHandle(org.apache.hadoop.fs.FileStatus, org.apache.hadoop.fs.Options.HandleOpt...) not overridden by subclass.
      UnsupportedOperationException - If this FileSystem cannot enforce the specified constraints.
    • createPathHandle

      protected PathHandle createPathHandle(FileStatus stat, org.apache.hadoop.fs.Options.HandleOpt... opt)
      Hook to implement support for PathHandle operations.
      Parameters:
      stat - Referent in the target FileSystem
      opt - Constraints that determine the validity of the PathHandle reference.
      Returns:
      path handle.
    • create

      public FSDataOutputStream create(Path f) throws IOException
      Create an FSDataOutputStream at the indicated Path. Files are overwritten by default.
      Parameters:
      f - the file to create
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, boolean overwrite) throws IOException
      Create an FSDataOutputStream at the indicated Path.
      Parameters:
      f - the file to create
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an exception will be thrown.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, Progressable progress) throws IOException
      Create an FSDataOutputStream at the indicated Path with write-progress reporting. Files are overwritten by default.
      Parameters:
      f - the file to create
      progress - to report progress
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, short replication) throws IOException
      Create an FSDataOutputStream at the indicated Path. Files are overwritten by default.
      Parameters:
      f - the file to create
      replication - the replication factor
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, short replication, Progressable progress) throws IOException
      Create an FSDataOutputStream at the indicated Path with write-progress reporting. Files are overwritten by default.
      Parameters:
      f - the file to create
      replication - the replication factor
      progress - to report progress
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize) throws IOException
      Create an FSDataOutputStream at the indicated Path.
      Parameters:
      f - the file to create
      overwrite - if a path with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, Progressable progress) throws IOException
      Create an FSDataOutputStream at the indicated Path with write-progress reporting. The frequency of callbacks is implementation-specific; it may be "none".
      Parameters:
      f - the path of the file to open
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      progress - to report progress.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize) throws IOException
      Create an FSDataOutputStream at the indicated Path.
      Parameters:
      f - the file name to open
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException
      Create an FSDataOutputStream at the indicated Path with write-progress reporting.
      Parameters:
      f - the file name to open
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - to report progress.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public abstract FSDataOutputStream create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException
      Create an FSDataOutputStream at the indicated Path with write-progress reporting.
      Parameters:
      f - the file name to open
      permission - file permission
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - the progress reporter
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException
      Create an FSDataOutputStream at the indicated Path with write-progress reporting.
      Parameters:
      f - the file name to open
      permission - file permission
      flags - CreateFlags to use for this stream.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - the progress reporter
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • create

      public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress, org.apache.hadoop.fs.Options.ChecksumOpt checksumOpt) throws IOException
      Create an FSDataOutputStream at the indicated Path with a custom checksum option.
      Parameters:
      f - the file name to open
      permission - file permission
      flags - CreateFlags to use for this stream.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - the progress reporter
      checksumOpt - checksum parameter. If null, the values found in conf will be used.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • primitiveCreate

      @Deprecated protected FSDataOutputStream primitiveCreate(Path f, FsPermission absolutePermission, EnumSet<CreateFlag> flag, int bufferSize, short replication, long blockSize, Progressable progress, org.apache.hadoop.fs.Options.ChecksumOpt checksumOpt) throws IOException
      Deprecated.
      This create has been added to support the FileContext that processes the permission with umask before calling this method. This is a temporary method added to support the transition from FileSystem to FileContext for user applications.
      Parameters:
      f - path.
      absolutePermission - permission.
      flag - create flag.
      bufferSize - buffer size.
      replication - replication.
      blockSize - block size.
      progress - progress.
      checksumOpt - check sum opt.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • primitiveMkdir

      @Deprecated protected boolean primitiveMkdir(Path f, FsPermission absolutePermission) throws IOException
      Deprecated.
      This version of the mkdirs method assumes that the permission is absolute. It has been added to support FileContext, which processes the permission with umask before calling this method. This is a temporary method added to support the transition from FileSystem to FileContext for user applications.
      Parameters:
      f - path
      absolutePermission - permissions
      Returns:
      true if the directory was actually created.
      Throws:
      IOException - IO failure
    • primitiveMkdir

      @Deprecated protected void primitiveMkdir(Path f, FsPermission absolutePermission, boolean createParent) throws IOException
      Deprecated.
      This version of the mkdirs method assumes that the permission is absolute. It has been added to support FileContext, which processes the permission with umask before calling this method. This is a temporary method added to support the transition from FileSystem to FileContext for user applications.
      Parameters:
      f - the path.
      absolutePermission - permission.
      createParent - create parent.
      Throws:
      IOException - IO failure.
    • createNonRecursive

      public FSDataOutputStream createNonRecursive(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException
      Opens an FSDataOutputStream at the indicated Path with write-progress reporting. Same as create(), except fails if parent directory doesn't already exist.
      Parameters:
      f - the file name to open
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - the progress reporter
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • createNonRecursive

      public FSDataOutputStream createNonRecursive(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException
      Opens an FSDataOutputStream at the indicated Path with write-progress reporting. Same as create(), except fails if parent directory doesn't already exist.
      Parameters:
      f - the file name to open
      permission - file permission
      overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - the progress reporter
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • createNonRecursive

      public FSDataOutputStream createNonRecursive(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException
      Opens an FSDataOutputStream at the indicated Path with write-progress reporting. Same as create(), except fails if parent directory doesn't already exist.
      Parameters:
      f - the file name to open
      permission - file permission
      flags - CreateFlags to use for this stream.
      bufferSize - the size of the buffer to be used.
      replication - required block replication for the file.
      blockSize - block size
      progress - the progress reporter
      Returns:
      output stream.
      Throws:
      IOException - IO failure
    • createNewFile

      public boolean createNewFile(Path f) throws IOException
      Creates the given Path as a brand-new zero-length file. If the create fails, or if the file already exists, returns false. Important: the default implementation is not atomic.
      Parameters:
      f - path to use for create
      Returns:
      true if the new file was created; false otherwise.
      Throws:
      IOException - IO failure
    • append

      public FSDataOutputStream append(Path f) throws IOException
      Append to an existing file (optional operation). Same as append(f, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), null)
      Parameters:
      f - the existing file to be appended.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • append

      public FSDataOutputStream append(Path f, int bufferSize) throws IOException
      Append to an existing file (optional operation). Same as append(f, bufferSize, null).
      Parameters:
      f - the existing file to be appended.
      bufferSize - the size of the buffer to be used.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • append

      public abstract FSDataOutputStream append(Path f, int bufferSize, Progressable progress) throws IOException
      Append to an existing file (optional operation).
      Parameters:
      f - the existing file to be appended.
      bufferSize - the size of the buffer to be used.
      progress - for reporting progress if it is not null.
      Returns:
      output stream.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • append

      public FSDataOutputStream append(Path f, boolean appendToNewBlock) throws IOException
      Append to an existing file (optional operation).
      Parameters:
      f - the existing file to be appended.
      appendToNewBlock - whether to append data to a new block instead of the end of the last partial block
      Returns:
      output stream.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • append

      public FSDataOutputStream append(Path f, int bufferSize, Progressable progress, boolean appendToNewBlock) throws IOException
      Append to an existing file (optional operation). This method is intended to be overridden by FileSystem implementations such as DistributedFileSystem.
      Parameters:
      f - the existing file to be appended.
      bufferSize - the size of the buffer to be used.
      progress - for reporting progress if it is not null.
      appendToNewBlock - whether to append data to a new block instead of the end of the last partial block
      Returns:
      output stream.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • concat

      public void concat(Path trg, Path[] psrcs) throws IOException
      Concatenate existing files together.
      Parameters:
      trg - the path to the target destination.
      psrcs - the paths to the sources to use for the concatenation.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • getReplication

      @Deprecated public short getReplication(Path src) throws IOException
      Deprecated.
      Get the replication factor.
      Parameters:
      src - file name
      Returns:
      file replication
      Throws:
      FileNotFoundException - if the path does not resolve.
      IOException - an IO failure
    • setReplication

      public boolean setReplication(Path src, short replication) throws IOException
      Set the replication for an existing file. If a filesystem does not support replication, it will always return true: the check for a file existing may be bypassed. This is the default behavior.
      Parameters:
      src - file name
      replication - new replication
      Returns:
      true if successful, or if the feature is unsupported; false if replication is supported but the file does not exist, or is a directory
      Throws:
      IOException - an IO failure.
    • rename

      public abstract boolean rename(Path src, Path dst) throws IOException
      Renames Path src to Path dst.
      Parameters:
      src - path to be renamed
      dst - new path after rename
      Returns:
      true if rename is successful
      Throws:
      IOException - on failure
    • rename

      @Deprecated protected void rename(Path src, Path dst, Options.Rename... options) throws IOException
      Deprecated.
      Renames Path src to Path dst
      • Fails if src is a file and dst is a directory.
      • Fails if src is a directory and dst is a file.
      • Fails if the parent of dst does not exist or is a file.

      If OVERWRITE option is not passed as an argument, rename fails if the dst already exists.

      If OVERWRITE option is passed as an argument, rename overwrites the dst if it is a file or an empty directory. Rename fails if dst is a non-empty directory.

      Note that the atomicity of rename depends on the file system implementation. Please refer to the file system documentation for details. This default implementation is non-atomic.

      This method is deprecated since it is a temporary method added to support the transition from FileSystem to FileContext for user applications.

      Parameters:
      src - path to be renamed
      dst - new path after rename
      options - rename options.
      Throws:
      FileNotFoundException - src path does not exist, or the parent path of dst does not exist.
      FileAlreadyExistsException - dest path exists and is a file
      ParentNotDirectoryException - if the parent path of dest is not a directory
      IOException - on failure
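      The rename rules listed above can be read as a decision procedure. The sketch below models them in plain Java with no Hadoop dependency. The class RenameRules, the Kind enum and the checkRename helper are hypothetical names invented for this illustration, and the order in which the checks run is an assumption: the rules above do not say which failure wins when several apply.

```java
/**
 * Toy model of the documented rename-with-options rules.
 * Not Hadoop code: every name here is hypothetical.
 */
class RenameRules {
    /** What occupies a path in this toy model. */
    public enum Kind { MISSING, FILE, EMPTY_DIR, NON_EMPTY_DIR }

    private static boolean isDir(Kind k) {
        return k == Kind.EMPTY_DIR || k == Kind.NON_EMPTY_DIR;
    }

    /**
     * Returns null if the rename would be allowed, otherwise a short
     * description of the rule that rejects it.
     * The ordering of the checks is an assumption of this sketch.
     */
    public static String checkRename(Kind src, Kind dst, Kind dstParent, boolean overwrite) {
        if (src == Kind.MISSING) {
            return "src does not exist";
        }
        if (!isDir(dstParent)) {
            return "parent of dst does not exist or is a file";
        }
        if (dst == Kind.MISSING) {
            return null; // nothing in the way
        }
        if (src == Kind.FILE && isDir(dst)) {
            return "src is a file and dst is a directory";
        }
        if (isDir(src) && dst == Kind.FILE) {
            return "src is a directory and dst is a file";
        }
        if (!overwrite) {
            return "dst already exists and OVERWRITE was not passed";
        }
        if (dst == Kind.NON_EMPTY_DIR) {
            return "dst is a non-empty directory";
        }
        return null; // OVERWRITE onto a file or an empty directory
    }

    public static void main(String[] args) {
        // File onto an existing file, without and then with OVERWRITE.
        System.out.println(checkRename(Kind.FILE, Kind.FILE, Kind.NON_EMPTY_DIR, false));
        System.out.println(checkRename(Kind.FILE, Kind.FILE, Kind.NON_EMPTY_DIR, true));
    }
}
```

      The model only decides whether a rename is permitted. It says nothing about atomicity, which remains implementation dependent.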
    • truncate

      public boolean truncate(Path f, long newLength) throws IOException
      Truncate the file in the indicated path to the indicated size.
      • Fails if path is a directory.
      • Fails if path does not exist.
      • Fails if path is not closed.
      • Fails if new size is greater than current size.
      Parameters:
      f - The path to the file to be truncated
      newLength - The size the file is to be truncated to
      Returns:
      true if the file has been truncated to the desired newLength and is immediately available to be reused for write operations such as append, or false if a background process of adjusting the length of the last block has been started, and clients should wait for it to complete before proceeding with further file updates.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default).
    • delete

      @Deprecated public boolean delete(Path f) throws IOException
      Deprecated.
      Delete a file/directory.
      Parameters:
      f - the path.
      Returns:
      true if the delete succeeded; false otherwise.
      Throws:
      IOException - IO failure.
    • delete

      public abstract boolean delete(Path f, boolean recursive) throws IOException
      Delete a file.
      Parameters:
      f - the path to delete.
      recursive - if the path is a directory and this is set to true, the directory is deleted; otherwise an exception is thrown. For a file, recursive may be set to either true or false.
      Returns:
      true if delete is successful else false.
      Throws:
      IOException - IO failure
    • deleteOnExit

      public boolean deleteOnExit(Path f) throws IOException
      Mark a path to be deleted when its FileSystem is closed. When the JVM shuts down cleanly, all cached FileSystem objects will be closed automatically. The marked paths will be deleted as a result. If a FileSystem instance is not cached, i.e. has been created with createFileSystem(URI, Configuration), then the paths will be deleted when close() is called on that instance. The path must exist in the filesystem at the time of the method call; it does not have to exist at the time of JVM shutdown.
      Notes
      1. Clean shutdown of the JVM cannot be guaranteed.
      2. The time to shut down a FileSystem will depend on the number of files to delete. For filesystems where the cost of checking for the existence of a file/directory and the actual delete operation (for example: object stores) is high, the time to shut down the JVM can be significantly extended by over-use of this feature.
      3. Connectivity problems with a remote filesystem may delay shutdown further, and may cause the files to not be deleted.
      Parameters:
      f - the path to delete.
      Returns:
      true if deleteOnExit is successful, otherwise false.
      Throws:
      IOException - IO failure
    • cancelDeleteOnExit

      public boolean cancelDeleteOnExit(Path f)
      Cancel the scheduled deletion of the path when the FileSystem is closed.
      Parameters:
      f - the path to cancel deletion
      Returns:
      true if the path was found in the delete-on-exit list.
    • processDeleteOnExit

      protected void processDeleteOnExit()
      Delete all paths that were marked as delete-on-exit. This recursively deletes all files and directories in the specified paths. The time to process this operation is O(paths), with the actual time dependent on the time for existence and deletion operations to complete, successfully or not.
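      The delete-on-exit lifecycle described in the three entries above (mark, optionally cancel, then delete everything still marked at close) can be sketched with a small registry. This is a plain-Java model over the local disk using java.nio.file, not Hadoop's implementation. DeleteOnExitModel is a hypothetical name, Path here is java.nio.file.Path rather than org.apache.hadoop.fs.Path, and returning false for a path that does not exist at marking time is this sketch's reading of the "must exist at the time of the method call" requirement.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Toy delete-on-exit registry over the local disk (hypothetical, not Hadoop code). */
class DeleteOnExitModel implements AutoCloseable {
    private final Set<Path> marked = new TreeSet<>();

    /** Marks a path for deletion at close(); the path must exist now. */
    public synchronized boolean deleteOnExit(Path f) {
        if (!Files.exists(f)) {
            return false;
        }
        marked.add(f);
        return true;
    }

    /** Returns true if the path was in the delete-on-exit set. */
    public synchronized boolean cancelDeleteOnExit(Path f) {
        return marked.remove(f);
    }

    /** Deletes every path still marked; paths that have vanished are skipped. */
    @Override
    public synchronized void close() {
        for (Path p : marked) {
            deleteRecursively(p);
        }
        marked.clear();
    }

    private static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return; // it does not have to exist any more at close time
        }
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    // Best effort in this sketch: a failed delete is ignored.
                }
            });
        } catch (IOException e) {
            // Best effort in this sketch: a failed walk is ignored.
        }
    }

    public static void main(String[] args) throws IOException {
        Path scratch = Files.createTempFile("doe-demo", ".tmp");
        try (DeleteOnExitModel fs = new DeleteOnExitModel()) {
            fs.deleteOnExit(scratch);
        }
        System.out.println("still exists after close: " + Files.exists(scratch));
    }
}
```

      The work done in close() grows with the number of marked paths, which is the cost the notes on deleteOnExit warn about.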
    • exists

      public boolean exists(Path f) throws IOException
      Check if a path exists. It is highly discouraged to call this method back to back with other getFileStatus(Path) calls, as this will involve multiple redundant RPC calls in HDFS.
      Parameters:
      f - source path
      Returns:
      true if the path exists
      Throws:
      IOException - IO failure
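      The advice against pairing exists() with a status lookup amounts to "probe once and treat not-found as a result". The sketch below shows that shape using java.nio.file on the local disk as a stand-in. It is an analogy rather than Hadoop code: SingleLookup and sizeOrMinusOne are hypothetical names, and with a FileSystem the corresponding pattern would be a single getFileStatus(Path) call with FileNotFoundException handled by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Local-disk analogy of "one lookup instead of exists() plus status" (hypothetical). */
class SingleLookup {
    /**
     * Returns the size of the file at p, or -1 if nothing exists there.
     * Only one metadata lookup is made; there is no separate existence check.
     */
    public static long sizeOrMinusOne(Path p) throws IOException {
        try {
            BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
            return attrs.size();
        } catch (NoSuchFileException e) {
            return -1L; // "does not exist" is an answer, not an error
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("single-lookup", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3});
        System.out.println(sizeOrMinusOne(tmp));
        System.out.println(sizeOrMinusOne(tmp.resolveSibling("no-such-file-here")));
    }
}
```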
    • isDirectory

      @Deprecated public boolean isDirectory(Path f) throws IOException
      Deprecated.
      True iff the named path is a directory. Note: Avoid using this method. Instead reuse the FileStatus returned by getFileStatus() or listStatus() methods.
      Parameters:
      f - path to check
      Returns:
      true if f is a directory; false otherwise.
      Throws:
      IOException - IO failure
    • isFile

      @Deprecated public boolean isFile(Path f) throws IOException
      Deprecated.
      True iff the named path is a regular file. Note: Avoid using this method. Instead reuse the FileStatus returned by getFileStatus(Path) or listStatus() methods.
      Parameters:
      f - path to check
      Returns:
      true if f is a regular file; false otherwise.
      Throws:
      IOException - IO failure
    • getLength

      @Deprecated public long getLength(Path f) throws IOException
      Deprecated.
      Use getFileStatus(Path) instead.
      The number of bytes in a file.
      Parameters:
      f - the path.
      Returns:
      the number of bytes; 0 for a directory
      Throws:
      FileNotFoundException - if the path does not resolve
      IOException - IO failure
    • getContentSummary

      public ContentSummary getContentSummary(Path f) throws IOException
      Return the ContentSummary of a given Path.
      Parameters:
      f - path to use
      Returns:
      content summary.
      Throws:
      FileNotFoundException - if the path does not resolve
      IOException - IO failure
    • getQuotaUsage

      public QuotaUsage getQuotaUsage(Path f) throws IOException
      Return the QuotaUsage of a given Path.
      Parameters:
      f - path to use
      Returns:
      the quota usage
      Throws:
      IOException - IO failure
    • setQuota

      public void setQuota(Path src, long namespaceQuota, long storagespaceQuota) throws IOException
      Set quota for the given Path.
      Parameters:
      src - the target path to set quota for
      namespaceQuota - the namespace quota (i.e., # of files/directories) to set
      storagespaceQuota - the storage space quota to set
      Throws:
      IOException - IO failure
    • setQuotaByStorageType

      public void setQuotaByStorageType(Path src, StorageType type, long quota) throws IOException
      Set per storage type quota for the given Path.
      Parameters:
      src - the target path to set storage type quota for
      type - the storage type to set
      quota - the quota to set for the given storage type
      Throws:
      IOException - IO failure
    • listStatus

      public abstract FileStatus[] listStatus(Path f) throws FileNotFoundException, IOException
      List the statuses of the files/directories in the given path if the path is a directory.

      The returned statuses are not guaranteed to be in sorted order.

      Will not return null. Expect IOException upon access error.

      Parameters:
      f - given path
      Returns:
      the statuses of the files/directories in the given path
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - see specific implementation
    • listStatusBatch

      @Private protected org.apache.hadoop.fs.FileSystem.DirectoryEntries listStatusBatch(Path f, byte[] token) throws FileNotFoundException, IOException
      Given an opaque iteration token, return the next batch of entries in a directory. This is a private API not meant for use by end users.

      This method should be overridden by FileSystem subclasses that want to use the generic listStatusIterator(Path) implementation.

      Parameters:
      f - Path to list
      token - opaque iteration token returned by previous call, or null if this is the first call.
      Returns:
      directory entries.
      Throws:
      FileNotFoundException - when the path does not exist.
      IOException - If an I/O error occurred.
    • listCorruptFileBlocks

      public org.apache.hadoop.fs.RemoteIterator<Path> listCorruptFileBlocks(Path path) throws IOException
      List corrupted file blocks.
      Parameters:
      path - the path.
      Returns:
      an iterator over the corrupt files under the given path (may contain duplicates if a file has more than one corrupt block)
      Throws:
      UnsupportedOperationException - if the operation is unsupported (default).
      IOException - IO failure
    • listStatus

      public FileStatus[] listStatus(Path f, PathFilter filter) throws FileNotFoundException, IOException
      Filter files/directories in the given path using the user-supplied path filter.

      The returned statuses are not guaranteed to be in sorted order.

      Parameters:
      f - a path name
      filter - the user-supplied path filter
      Returns:
      an array of FileStatus objects for the files under the given path after applying the filter
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - see specific implementation
    • listStatus

      public FileStatus[] listStatus(Path[] files) throws FileNotFoundException, IOException
      Filter files/directories in the given list of paths using default path filter.

      The returned statuses are not guaranteed to be in sorted order.

      Parameters:
      files - a list of paths
      Returns:
      a list of statuses for the files under the given paths after applying the default path filter
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - see specific implementation
    • listStatus

      public FileStatus[] listStatus(Path[] files, PathFilter filter) throws FileNotFoundException, IOException
      Filter files/directories in the given list of paths using user-supplied path filter.

      The returned statuses are not guaranteed to be in sorted order.

      Parameters:
      files - a list of paths
      filter - the user-supplied path filter
      Returns:
      a list of statuses for the files under the given paths after applying the filter
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - see specific implementation
    • globStatus

      public FileStatus[] globStatus(Path pathPattern) throws IOException

      Return all the files that match pathPattern and are not checksum files. Results are sorted by their names.

      A filename pattern is composed of regular characters and special pattern matching characters, which are:

      ?
      Matches any single character.
      *
      Matches zero or more characters.
      [abc]
      Matches a single character from character set {a,b,c}.
      [a-b]
      Matches a single character from the character range {a...b}. Note that character a must be lexicographically less than or equal to character b.
      [^a]
      Matches a single character that is not from character set or range {a}. Note that the ^ character must occur immediately to the right of the opening bracket.
      \c
      Removes (escapes) any special meaning of character c.
      {ab,cd}
      Matches a string from the string set {ab, cd}
      {ab,c{de,fh}}
      Matches a string from the string set {ab, cde, cfh}
      Parameters:
      pathPattern - a glob specifying a path pattern
      Returns:
      an array of FileStatus objects for the paths that match the path pattern
      Throws:
      IOException - IO failure
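      The brace rows in the pattern table above, in particular the nested case {ab,c{de,fh}}, can be checked with a short expansion routine. The sketch below handles brace alternatives only: it ignores the \c escape and does not interpret ?, *, or bracket sets. BraceExpansion and expand are hypothetical names, and this is not the code Hadoop uses to evaluate globs.

```java
import java.util.ArrayList;
import java.util.List;

/** Brace-only expansion sketch (hypothetical helper, not Hadoop's glob code). */
class BraceExpansion {
    /** Expands the first brace group, then recurses until no '{' is left. */
    public static List<String> expand(String pattern) {
        List<String> results = new ArrayList<>();
        int open = pattern.indexOf('{');
        if (open < 0) {
            results.add(pattern); // nothing to expand
            return results;
        }

        // Find the '}' matching the brace at 'open' and remember the commas
        // that separate its top-level alternatives.
        int depth = 0;
        int close = -1;
        List<Integer> separators = new ArrayList<>();
        for (int i = open; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    close = i;
                    break;
                }
            } else if (c == ',' && depth == 1) {
                separators.add(i);
            }
        }
        if (close < 0) {
            throw new IllegalArgumentException("Unbalanced '{' in: " + pattern);
        }

        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        separators.add(close); // sentinel so the last alternative is included
        int start = open + 1;
        for (int end : separators) {
            String alternative = pattern.substring(start, end);
            // Recursing handles nested groups and further groups in the suffix.
            results.addAll(expand(prefix + alternative + suffix));
            start = end + 1;
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(expand("{ab,cd}"));
        System.out.println(expand("{ab,c{de,fh}}"));
    }
}
```

      For the two brace patterns in the table, the routine yields the string sets {ab, cd} and {ab, cde, cfh} listed there.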
    • globStatus

      public FileStatus[] globStatus(Path pathPattern, PathFilter filter) throws IOException
      Return an array of FileStatus objects whose path names match pathPattern and are accepted by the user-supplied path filter. Results are sorted by their path names.
      Parameters:
      pathPattern - a glob specifying the path pattern
      filter - a user-supplied path filter
      Returns:
      null if pathPattern has no glob and the path does not exist; an empty array if pathPattern has a glob and no path matches it; otherwise an array of FileStatus objects matching the pattern.
      Throws:
      IOException - if any I/O error occurs when fetching file status
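      Because the documented result is three-way (null, an empty array, or the matches), callers that iterate over it need a null guard. A minimal sketch of such a guard follows. GlobResults and matchesOrEmpty are hypothetical names, and the helper is generic so it can be shown without Hadoop on the classpath; in real code the array type would be FileStatus[].

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Null-safe wrapper for an array result that may be null (hypothetical helper). */
class GlobResults {
    /** Maps a null result to an empty list so callers can always iterate. */
    public static <T> List<T> matchesOrEmpty(T[] globResult) {
        if (globResult == null) {
            return Collections.<T>emptyList();
        }
        return Arrays.asList(globResult);
    }

    public static void main(String[] args) {
        String[] noSuchPath = null; // stands in for "no glob and the path does not exist"
        for (String match : matchesOrEmpty(noSuchPath)) {
            System.out.println(match); // never reached for a null result
        }
        System.out.println(matchesOrEmpty(new String[] {"part-0", "part-1"}).size());
    }
}
```

      Collapsing null into an empty list discards the distinction between "path does not exist" and "glob matched nothing", so it only suits callers that treat both cases the same way.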
    • listLocatedStatus

      public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f) throws FileNotFoundException, IOException
      List the statuses of the files/directories in the given path if the path is a directory. Return the file's status and block locations if the path is a file. If a returned status is a file, it contains the file's block locations.
      Parameters:
      f - is the path
      Returns:
      an iterator that traverses statuses of the files/directories in the given path
      Throws:
      FileNotFoundException - If f does not exist
      IOException - If an I/O error occurred
    • listLocatedStatus

      protected org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f, PathFilter filter) throws FileNotFoundException, IOException
      List a directory. Each returned result includes block locations if it is a file. The results are filtered by the given path filter.
      Parameters:
      f - a path
      filter - a path filter
      Returns:
      an iterator that traverses statuses of the files/directories in the given path
      Throws:
      FileNotFoundException - if f does not exist
      IOException - if any I/O error occurred
    • listStatusIterator

      public org.apache.hadoop.fs.RemoteIterator<FileStatus> listStatusIterator(Path p) throws FileNotFoundException, IOException
      Returns a remote iterator so that follow-up calls are made on demand while consuming the entries. Each FileSystem implementation should override this method and provide a more efficient implementation, if possible. The iterator is not guaranteed to return statuses in sorted order.
      Parameters:
      p - target path
      Returns:
      remote iterator
      Throws:
      FileNotFoundException - if p does not exist
      IOException - if any I/O error occurred
    • listFiles

      public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive) throws FileNotFoundException, IOException
      List the statuses and block locations of the files in the given path. Does not guarantee to return the iterator that traverses statuses of the files in a sorted order.
       If the path is a directory,
         if recursive is false, returns files in the directory;
         if recursive is true, return files in the subtree rooted at the path.
       If the path is a file, return the file's status and block locations.
       
      Parameters:
      f - is the path
      recursive - if the subdirectories need to be traversed recursively
      Returns:
      an iterator that traverses statuses of the files
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - see specific implementation
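      The three cases described for listFiles (a file, a directory listed non-recursively, a directory listed recursively) can be modelled on the local disk with java.nio.file. This is a sketch of the documented semantics, not Hadoop code: ListFilesModel is a hypothetical name, Path is java.nio.file.Path, block locations are not modelled, and the result is sorted only to make the demo deterministic, whereas the real method makes no ordering guarantee.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Local-disk model of the listFiles(path, recursive) cases (hypothetical). */
class ListFilesModel {
    public static List<Path> listFiles(Path f, boolean recursive) throws IOException {
        if (!Files.exists(f)) {
            throw new FileNotFoundException("No such path: " + f);
        }
        if (!Files.isDirectory(f)) {
            // A file path yields just that file.
            return Collections.singletonList(f);
        }
        // Depth 1 visits only direct children; unlimited depth visits the subtree.
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        try (Stream<Path> walk = Files.walk(f, maxDepth)) {
            return walk.filter(Files::isRegularFile) // directories are never returned
                       .sorted()                     // demo only; no order is promised
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("listfiles-demo");
        Files.createFile(root.resolve("a.txt"));
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.createFile(sub.resolve("b.txt"));
        System.out.println(listFiles(root, false).size() + " file(s) without recursion");
        System.out.println(listFiles(root, true).size() + " file(s) with recursion");
    }
}
```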
    • getHomeDirectory

      public Path getHomeDirectory()
      Return the current user's home directory in this FileSystem. The default implementation returns "/user/$USER/".
      Returns:
      the path.
    • setWorkingDirectory

      public abstract void setWorkingDirectory(Path new_dir)
      Set the current working directory for the given FileSystem. All relative paths will be resolved relative to it.
      Parameters:
      new_dir - Path of new working directory
    • getWorkingDirectory

      public abstract Path getWorkingDirectory()
      Get the current working directory for the given FileSystem
      Returns:
      the directory pathname
    • getInitialWorkingDirectory

      protected Path getInitialWorkingDirectory()
      Note: with the new FileContext class, getWorkingDirectory() will be removed. The working directory is implemented in FileContext. Some FileSystems like LocalFileSystem have an initial workingDir that we use as the starting workingDir. For other file systems like HDFS there is no built in notion of an initial workingDir.
      Returns:
      the initial working directory if the filesystem has a built-in notion of one; otherwise null.
    • mkdirs

      public boolean mkdirs(Path f) throws IOException
      Call mkdirs(Path, FsPermission) with default permission.
      Parameters:
      f - path
      Returns:
      true if the directory was created
      Throws:
      IOException - IO failure
    • mkdirs

      public abstract boolean mkdirs(Path f, FsPermission permission) throws IOException
      Make the given file and all non-existent parents into directories. Has roughly the semantics of Unix mkdir -p. Existence of the directory hierarchy is not an error.
      Parameters:
      f - path to create
      permission - to apply to f
      Returns:
      true if the mkdir succeeded; false otherwise.
      Throws:
      IOException - IO failure
    • copyFromLocalFile

      public void copyFromLocalFile(Path src, Path dst) throws IOException
      The src file is on the local disk. Add it to the filesystem at the given dst name. The source is kept intact afterwards.
      Parameters:
      src - path
      dst - path
      Throws:
      IOException - IO failure
    • moveFromLocalFile

      public void moveFromLocalFile(Path[] srcs, Path dst) throws IOException
      The src files are on the local disk. Add them to the filesystem at the given dst name, removing the sources afterwards.
      Parameters:
      srcs - source paths
      dst - path
      Throws:
      IOException - IO failure
    • moveFromLocalFile

      public void moveFromLocalFile(Path src, Path dst) throws IOException
      The src file is on the local disk. Add it to the filesystem at the given dst name, removing the source afterwards.
      Parameters:
      src - local path
      dst - path
      Throws:
      IOException - IO failure
    • copyFromLocalFile

      public void copyFromLocalFile(boolean delSrc, Path src, Path dst) throws IOException
      The src file is on the local disk. Add it to the filesystem at the given dst name. delSrc indicates whether the source should be removed.
      Parameters:
      delSrc - whether to delete the src
      src - path
      dst - path
      Throws:
      IOException - IO failure.
    • copyFromLocalFile

      public void copyFromLocalFile(boolean delSrc, boolean overwrite, Path[] srcs, Path dst) throws IOException
      The src files are on the local disk. Add them to the filesystem at the given dst name. delSrc indicates whether the sources should be removed.
      Parameters:
      delSrc - whether to delete the src
      overwrite - whether to overwrite an existing file
      srcs - array of paths which are source
      dst - path
      Throws:
      IOException - IO failure
    • copyFromLocalFile

      public void copyFromLocalFile(boolean delSrc, boolean overwrite, Path src, Path dst) throws IOException
      The src file is on the local disk. Add it to the filesystem at the given dst name. delSrc indicates whether the source should be removed.
      Parameters:
      delSrc - whether to delete the src
      overwrite - whether to overwrite an existing file
      src - path
      dst - path
      Throws:
      IOException - IO failure
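      The effect of the delSrc and overwrite flags described above can be sketched as a copy followed by an optional delete. The model below uses java.nio.file with both paths on the local disk. CopyFlagsModel and copyFromLocal are hypothetical names, only single files are handled (directory destinations and multi-source copies are out of scope), and failing with FileAlreadyExistsException when overwrite is false is the behaviour of Files.copy, assumed here as a stand-in.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Local-disk model of the delSrc/overwrite flags (hypothetical, files only). */
class CopyFlagsModel {
    public static void copyFromLocal(boolean delSrc, boolean overwrite, Path src, Path dst)
            throws IOException {
        if (overwrite) {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        } else {
            // Files.copy throws FileAlreadyExistsException if dst is present.
            Files.copy(src, dst);
        }
        if (delSrc) {
            // Deleting only after a successful copy turns the copy into a move.
            Files.delete(src);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("copyflags-demo");
        Path src = Files.write(dir.resolve("src.txt"), "hello".getBytes(StandardCharsets.UTF_8));
        Path dst = dir.resolve("dst.txt");
        copyFromLocal(true, true, src, dst);
        System.out.println("src exists: " + Files.exists(src) + ", dst exists: " + Files.exists(dst));
    }
}
```

      In this reading, the moveFromLocalFile variants correspond to delSrc set to true.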
    • copyToLocalFile

      public void copyToLocalFile(Path src, Path dst) throws IOException
      Copy a file from the remote filesystem to the local one.
      Parameters:
      src - path src file in the remote filesystem
      dst - path local destination
      Throws:
      IOException - IO failure
    • moveToLocalFile

      public void moveToLocalFile(Path src, Path dst) throws IOException
      Copy a file to the local filesystem, then delete it from the remote filesystem (if successfully copied).
      Parameters:
      src - path src file in the remote filesystem
      dst - path local destination
      Throws:
      IOException - IO failure
    • copyToLocalFile

      public void copyToLocalFile(boolean delSrc, Path src, Path dst) throws IOException
      Copy a file from a remote filesystem to the local one. delSrc indicates if the src will be removed or not.
      Parameters:
      delSrc - whether to delete the src
      src - path src file in the remote filesystem
      dst - path local destination
      Throws:
      IOException - IO failure
    • copyToLocalFile

      public void copyToLocalFile(boolean delSrc, Path src, Path dst, boolean useRawLocalFileSystem) throws IOException
      The src file is under this filesystem, and the dst is on the local disk. Copy it from the remote filesystem to the local dst name. delSrc indicates if the src will be removed or not. useRawLocalFileSystem indicates whether to use RawLocalFileSystem as the local file system or not. RawLocalFileSystem does not checksum, so it will not create any crc files locally.
      Parameters:
      delSrc - whether to delete the src
      src - path
      dst - path
      useRawLocalFileSystem - whether to use RawLocalFileSystem as local file system or not.
      Throws:
      IOException - for any IO error
    • startLocalOutput

      public Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile) throws IOException
      Returns a local file that the user can write output to. The caller provides both the eventual target name in this FileSystem and the local working file path. If this FileSystem is local, we write directly into the target. If the FileSystem is not local, we write into the tmp local area.
      Parameters:
      fsOutputFile - path of output file
      tmpLocalFile - path of local tmp file
      Returns:
      the path.
      Throws:
      IOException - IO failure
    • completeLocalOutput

      public void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile) throws IOException
      Called when we're all done writing to the target. A local FS will do nothing, because we've written to exactly the right place. A remote FS will copy the contents of tmpLocalFile to the correct target at fsOutputFile.
      Parameters:
      fsOutputFile - path of output file
      tmpLocalFile - path to local tmp file
      Throws:
      IOException - IO failure
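      The startLocalOutput / completeLocalOutput pair described above follows a "write locally, publish at the end" protocol. The sketch below models it with java.nio.file, with a second local directory standing in for the remote filesystem. LocalOutputModel and its fsIsLocal flag are hypothetical, and publishing with Files.copy plus REPLACE_EXISTING is an assumption of this sketch, not a statement about how any FileSystem implementation transfers the data.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Model of the local-output protocol (hypothetical, local disk only). */
class LocalOutputModel {
    private final boolean fsIsLocal;

    public LocalOutputModel(boolean fsIsLocal) {
        this.fsIsLocal = fsIsLocal;
    }

    /** Returns the path the caller should write to. */
    public Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile) {
        // A local filesystem is written directly; otherwise use the temp file.
        return fsIsLocal ? fsOutputFile : tmpLocalFile;
    }

    /** Publishes the written data to its final target if that is still needed. */
    public void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile) throws IOException {
        if (fsIsLocal) {
            return; // already written to exactly the right place
        }
        Files.copy(tmpLocalFile, fsOutputFile, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("localoutput-demo");
        Path target = dir.resolve("result.txt");   // stands in for the remote target
        Path tmp = dir.resolve("result.txt.tmp");  // local working file

        LocalOutputModel fs = new LocalOutputModel(false);
        Path writeTo = fs.startLocalOutput(target, tmp);
        Files.write(writeTo, "done".getBytes(StandardCharsets.UTF_8));
        fs.completeLocalOutput(target, tmp);
        System.out.println("target exists: " + Files.exists(target));
    }
}
```

      The caller always passes the same two paths to both methods; only the filesystem decides which one is actually written.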
    • close

      public void close() throws IOException
      Close this FileSystem instance. Will release any held locks, delete all files queued for deletion through calls to deleteOnExit(Path), and remove this FS instance from the cache, if cached. After this operation, the outcome of any method call on this FileSystem instance, or any input/output stream created by it is undefined.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException - IO failure
    • getUsed

      public long getUsed() throws IOException
      Return the total size of all files in the filesystem.
      Returns:
      the total size of all files in the filesystem.
      Throws:
      IOException - IO failure
    • getUsed

      public long getUsed(Path path) throws IOException
      Return the total size of all files from a specified path.
      Parameters:
      path - the path.
      Returns:
      the total size of all files under the specified path.
      Throws:
      IOException - IO failure
    • getBlockSize

      @Deprecated public long getBlockSize(Path f) throws IOException
      Deprecated.
      Get the block size for a particular file.
      Parameters:
      f - the filename
      Returns:
      the number of bytes in a block
      Throws:
      FileNotFoundException - if the path is not present
      IOException - IO failure
    • getDefaultBlockSize

      @Deprecated public long getDefaultBlockSize()
      Deprecated.
      Return the number of bytes that large input files should optimally be split into to minimize I/O time.
      Returns:
      default block size.
    • getDefaultBlockSize

      public long getDefaultBlockSize(Path f)
      Return the number of bytes that large input files should optimally be split into to minimize I/O time. The given path will be used to locate the actual filesystem. The full path does not have to exist.
      Parameters:
      f - path of file
      Returns:
      the default block size for the path's filesystem
    • getDefaultReplication

      @Deprecated public short getDefaultReplication()
      Deprecated.
      Get the default replication.
      Returns:
      the replication; the default value is "1".
    • getDefaultReplication

      public short getDefaultReplication(Path path)
      Get the default replication for a path. The given path will be used to locate the actual FileSystem to query. The full path does not have to exist.
      Parameters:
      path - of the file
      Returns:
      default replication for the path's filesystem
    • getFileStatus

      public abstract FileStatus getFileStatus(Path f) throws IOException
      Return a file status object that represents the path.
      Parameters:
      f - The path we want information from
      Returns:
      a FileStatus object
      Throws:
      FileNotFoundException - when the path does not exist
      IOException - see specific implementation
    • msync

      public void msync() throws IOException, UnsupportedOperationException
      Synchronize client metadata state.

      In some FileSystem implementations, such as HDFS, metadata synchronization is essential to guarantee the consistency of read requests, particularly in an HA setting.

      Throws:
      IOException - If an I/O error occurred.
      UnsupportedOperationException - if the operation is unsupported.
    • access

      @LimitedPrivate({"HDFS","Hive"}) public void access(Path path, FsAction mode) throws AccessControlException, FileNotFoundException, IOException
      Checks if the user can access a path. The mode specifies which access checks to perform. If the requested permissions are granted, then the method returns normally. If access is denied, then the method throws an AccessControlException.

      The default implementation calls getFileStatus(Path) and checks the returned permissions against the requested permissions. Note that the getFileStatus(Path) call will be subject to authorization checks. Typically, this requires search (execute) permissions on each directory in the path's prefix, but this is implementation-defined. Any file system that provides a richer authorization model (such as ACLs) may override the default implementation so that it checks against that model instead.

      In general, applications should avoid using this method, due to the risk of time-of-check/time-of-use race conditions. The permissions on a file may change immediately after the access call returns. Most applications should prefer running specific file system actions as the desired user represented by a UserGroupInformation.

      Parameters:
      path - Path to check
      mode - type of access to check
      Throws:
      AccessControlException - if access is denied
      FileNotFoundException - if the path does not exist
      IOException - see specific implementation
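      The advice to avoid check-then-act can be illustrated by analogy with plain JDK file I/O. The sketch below is not Hadoop code, and the class AttemptInsteadOfCheck and its methods are hypothetical. It attempts the read and classifies the failure, rather than probing permissions first and acting on an answer that may already be stale.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical JDK-only analogy: act first, then interpret the failure.
final class AttemptInsteadOfCheck {
    // Attempts the read directly instead of checking access beforehand.
    static String readOrExplain(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            return "<missing>";
        } catch (AccessDeniedException e) {
            return "<denied>";
        } catch (IOException e) {
            return "<io-error>";
        }
    }

    // Reads one file that exists and one that does not, in a scratch directory.
    static String[] demo() {
        try {
            Path dir = Files.createTempDirectory("attempt-sketch");
            Path present = Files.write(dir.resolve("present.txt"),
                    "ok".getBytes(StandardCharsets.UTF_8));
            Path absent = dir.resolve("absent.txt");
            return new String[] { readOrExplain(present), readOrExplain(absent) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String outcome : demo()) {
            System.out.println(outcome);
        }
    }
}
```

      With a FileSystem, the corresponding pattern would be to call the operation itself and handle AccessControlException or FileNotFoundException from that call.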
    • fixRelativePart

      protected Path fixRelativePart(Path p)
      See FileContext.fixRelativePart(org.apache.hadoop.fs.Path).
      Parameters:
      p - the path.
      Returns:
      the path, with any relative part resolved.
    • createSymlink

      Parameters:
      target - target path.
      link - link.
      createParent - create parent.
      Throws:
      AccessControlException - if access is denied.
      FileAlreadyExistsException - when the path already exists.
      FileNotFoundException - when the path does not exist.
      ParentNotDirectoryException - if the parent path of link is not a directory.
      UnsupportedFileSystemException - if there was no known implementation for the scheme.
      IOException - raised on errors performing I/O.
    • getFileLinkStatus

      Parameters:
      f - the path.
      Returns:
      file status
      Throws:
      AccessControlException - if access is denied.
      FileNotFoundException - when the path does not exist.
      IOException - raised on errors performing I/O.
      UnsupportedFileSystemException - if there was no known implementation for the scheme.
    • supportsSymlinks

      public boolean supportsSymlinks()
      Returns:
      true if symlinks are supported, false otherwise.
    • getLinkTarget

      public Path getLinkTarget(Path f) throws IOException
      Parameters:
      f - the path.
      Returns:
      the path.
      Throws:
      UnsupportedOperationException - if the operation is unsupported (default outcome).
      IOException - IO failure.
    • resolveLink

      protected Path resolveLink(Path f) throws IOException
      Parameters:
      f - the path.
      Returns:
      the path.
      Throws:
      UnsupportedOperationException - if the operation is unsupported (default outcome).
      IOException - IO failure.
    • getFileChecksum

      public FileChecksum getFileChecksum(Path f) throws IOException
      Get the checksum of a file, if the FS supports checksums.
      Parameters:
      f - The file path
      Returns:
      The file checksum. The default return value is null, which indicates that no checksum algorithm is implemented in the corresponding FileSystem.
      Throws:
      IOException - IO failure
    • getFileChecksum

      public FileChecksum getFileChecksum(Path f, long length) throws IOException
      Get the checksum of a file, from the beginning of the file up to the specified length.
      Parameters:
      f - The file path
      length - The length of the file range for checksum calculation
      Returns:
      The file checksum or null if checksums are not supported.
      Throws:
      IOException - IO failure
    • setVerifyChecksum

      public void setVerifyChecksum(boolean verifyChecksum)
      Set the verify checksum flag. This is only applicable if the corresponding filesystem supports checksums. By default, this does nothing.
      Parameters:
      verifyChecksum - Verify checksum flag
    • setWriteChecksum

      public void setWriteChecksum(boolean writeChecksum)
      Set the write checksum flag. This is only applicable if the corresponding filesystem supports checksums. By default, this does nothing.
      Parameters:
      writeChecksum - Write checksum flag
    • getStatus

      public FsStatus getStatus() throws IOException
      Returns a status object describing the use and capacity of the filesystem. If the filesystem has multiple partitions, the use and capacity of the root partition are reflected.
      Returns:
      a FsStatus object
      Throws:
      IOException - see specific implementation
    • getStatus

      public FsStatus getStatus(Path p) throws IOException
      Returns a status object describing the use and capacity of the filesystem. If the filesystem has multiple partitions, the use and capacity of the partition pointed to by the specified path are reflected.
      Parameters:
      p - Path for which status should be obtained. null means the default partition.
      Returns:
      a FsStatus object
      Throws:
      IOException - see specific implementation
    • setPermission

      public void setPermission(Path p, FsPermission permission) throws IOException
      Set permission of a path.
      Parameters:
      p - The path
      permission - permission
      Throws:
      IOException - IO failure
    • setOwner

      public void setOwner(Path p, String username, String groupname) throws IOException
      Set owner of a path (i.e. a file or a directory). The parameters username and groupname cannot both be null.
      Parameters:
      p - The path
      username - If it is null, the original username remains unchanged.
      groupname - If it is null, the original groupname remains unchanged.
      Throws:
      IOException - IO failure
    • setTimes

      public void setTimes(Path p, long mtime, long atime) throws IOException
      Set the modification time and access time of a file.
      Parameters:
      p - The path
      mtime - Set the modification time of this file. The number of milliseconds since Jan 1, 1970. A value of -1 means that this call should not set modification time.
      atime - Set the access time of this file. The number of milliseconds since Jan 1, 1970. A value of -1 means that this call should not set access time.
      Throws:
      IOException - IO failure
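      The -1 sentinel described for mtime and atime can be sketched as follows. This is a JDK-only illustration, not Hadoop code: SetTimesSketch, LEAVE_UNCHANGED, and resolve are hypothetical names that model how one timestamp argument is interpreted.

```java
import java.time.Instant;

// Hypothetical model of how a setTimes argument is interpreted.
final class SetTimesSketch {
    // Sentinel from the parameter description: do not set this timestamp.
    static final long LEAVE_UNCHANGED = -1L;

    // Returns the timestamp that would be stored: -1 keeps the current value.
    static long resolve(long currentMillis, long requestedMillis) {
        return requestedMillis == LEAVE_UNCHANGED ? currentMillis : requestedMillis;
    }

    public static void main(String[] args) {
        // Timestamps are milliseconds since Jan 1, 1970 (UTC).
        long mtime = Instant.parse("2024-01-01T00:00:00Z").toEpochMilli();
        long atime = LEAVE_UNCHANGED;

        // With a FileSystem instance fs and a Hadoop Path p, the call would be:
        //   fs.setTimes(p, mtime, atime);
        System.out.println("stored mtime: " + resolve(1_000L, mtime));
        System.out.println("stored atime: " + resolve(2_000L, atime));
    }
}
```

      Passing -1 for one argument updates only the other timestamp.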
    • createSnapshot

      public final Path createSnapshot(Path path) throws IOException
      Create a snapshot with a default name.
      Parameters:
      path - The directory where snapshots will be taken.
      Returns:
      the snapshot path.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported
    • createSnapshot

      public Path createSnapshot(Path path, String snapshotName) throws IOException
      Create a snapshot.
      Parameters:
      path - The directory where snapshots will be taken.
      snapshotName - The name of the snapshot
      Returns:
      the snapshot path.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported
    • renameSnapshot

      public void renameSnapshot(Path path, String snapshotOldName, String snapshotNewName) throws IOException
      Rename a snapshot.
      Parameters:
      path - The directory path where the snapshot was taken
      snapshotOldName - Old name of the snapshot
      snapshotNewName - New name of the snapshot
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • deleteSnapshot

      public void deleteSnapshot(Path path, String snapshotName) throws IOException
      Delete a snapshot of a directory.
      Parameters:
      path - The directory that the to-be-deleted snapshot belongs to
      snapshotName - The name of the snapshot
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • modifyAclEntries

      public void modifyAclEntries(Path path, List<AclEntry> aclSpec) throws IOException
      Modifies ACL entries of files and directories. This method can add new ACL entries or modify the permissions on existing ACL entries. All existing ACL entries that are not specified in this call are retained without changes. (Modifications are merged into the current ACL.)
      Parameters:
      path - Path to modify
      aclSpec - List<AclEntry> describing modifications
      Throws:
      IOException - if an ACL could not be modified
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • removeAclEntries

      public void removeAclEntries(Path path, List<AclEntry> aclSpec) throws IOException
      Removes ACL entries from files and directories. Other ACL entries are retained.
      Parameters:
      path - Path to modify
      aclSpec - List describing entries to remove
      Throws:
      IOException - if an ACL could not be modified
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • removeDefaultAcl

      public void removeDefaultAcl(Path path) throws IOException
      Removes all default ACL entries from files and directories.
      Parameters:
      path - Path to modify
      Throws:
      IOException - if an ACL could not be modified
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • removeAcl

      public void removeAcl(Path path) throws IOException
      Removes all but the base ACL entries of files and directories. The entries for user, group, and others are retained for compatibility with permission bits.
      Parameters:
      path - Path to modify
      Throws:
      IOException - if an ACL could not be removed
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • setAcl

      public void setAcl(Path path, List<AclEntry> aclSpec) throws IOException
      Fully replaces ACL of files and directories, discarding all existing entries.
      Parameters:
      path - Path to modify
      aclSpec - List describing modifications, which must include entries for user, group, and others for compatibility with permission bits.
      Throws:
      IOException - if an ACL could not be modified
      UnsupportedOperationException - if the operation is unsupported (default outcome).
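      The merge, remove, and replace semantics of modifyAclEntries, removeAclEntries, and setAcl can be modelled with plain maps. The sketch below is an assumption-laden illustration, not Hadoop code: it represents an ACL as a map from a hypothetical key such as "user:alice" to a permission string, it assumes the keys "user::", "group::", and "other::" for the base entries, and rejecting a setAcl spec that lacks them is an assumed way of enforcing the requirement stated above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: an ACL is a map from an entry key such as "user:alice"
// to a permission string such as "rw-". This is not Hadoop's AclEntry type.
final class AclSemanticsSketch {
    // Assumed keys for the base entries (user, group, others).
    static final List<String> BASE = Arrays.asList("user::", "group::", "other::");

    // Builds a map from alternating key/value arguments.
    static Map<String, String> acl(String... keyValues) {
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            out.put(keyValues[i], keyValues[i + 1]);
        }
        return out;
    }

    // modifyAclEntries: merge the spec into the current ACL; other entries are kept.
    static Map<String, String> modify(Map<String, String> current, Map<String, String> spec) {
        Map<String, String> out = new LinkedHashMap<>(current);
        out.putAll(spec);
        return out;
    }

    // removeAclEntries: drop the named entries; other entries are kept.
    static Map<String, String> remove(Map<String, String> current, List<String> keys) {
        Map<String, String> out = new LinkedHashMap<>(current);
        out.keySet().removeAll(keys);
        return out;
    }

    // setAcl: the current ACL is discarded; the spec must carry the base entries.
    static Map<String, String> set(Map<String, String> spec) {
        for (String base : BASE) {
            if (!spec.containsKey(base)) {
                throw new IllegalArgumentException("missing base entry: " + base);
            }
        }
        return new LinkedHashMap<>(spec);
    }

    public static void main(String[] args) {
        Map<String, String> current = acl(
                "user::", "rwx", "group::", "r-x", "other::", "---", "user:alice", "r--");
        System.out.println(modify(current, acl("user:bob", "rw-")));
        System.out.println(remove(current, Arrays.asList("user:alice")));
        System.out.println(set(acl("user::", "rw-", "group::", "r--", "other::", "---")));
    }
}
```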
    • getAclStatus

      public AclStatus getAclStatus(Path path) throws IOException
      Gets the ACL of a file or directory.
      Parameters:
      path - Path to get
      Returns:
      AclStatus describing the ACL of the file or directory
      Throws:
      IOException - if an ACL could not be read
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • setXAttr

      public void setXAttr(Path path, String name, byte[] value) throws IOException
      Set an xattr of a file or directory. The name must be prefixed with the namespace followed by ".". For example, "user.attr".

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to modify
      name - xattr name.
      value - xattr value.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
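      The naming rule for xattrs (a namespace, then ".", then the attribute name) can be checked before calling the filesystem. The helper below is a hypothetical JDK-only sketch: XAttrNameSketch and split are not part of Hadoop, and a real filesystem may additionally restrict which namespaces are accepted.

```java
// Hypothetical helper that checks the "<namespace>.<name>" shape of an xattr name.
final class XAttrNameSketch {
    // Splits "user.attr" into {"user", "attr"} at the first dot.
    // Rejects names with no namespace prefix or no name after the dot.
    static String[] split(String name) {
        int dot = name.indexOf('.');
        if (dot <= 0 || dot == name.length() - 1) {
            throw new IllegalArgumentException(
                    "xattr name must look like <namespace>.<name>: " + name);
        }
        return new String[] { name.substring(0, dot), name.substring(dot + 1) };
    }

    public static void main(String[] args) {
        String[] parts = split("user.attr");
        System.out.println("namespace=" + parts[0] + ", name=" + parts[1]);

        // With a FileSystem instance fs and a Hadoop Path p, the call would be:
        //   fs.setXAttr(p, "user.attr", "value".getBytes(StandardCharsets.UTF_8));
    }
}
```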
    • setXAttr

      public void setXAttr(Path path, String name, byte[] value, EnumSet<XAttrSetFlag> flag) throws IOException
      Set an xattr of a file or directory. The name must be prefixed with the namespace followed by ".". For example, "user.attr".

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to modify
      name - xattr name.
      value - xattr value.
      flag - xattr set flag
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • getXAttr

      public byte[] getXAttr(Path path, String name) throws IOException
      Get the value of an xattr for a file or directory. The name must be prefixed with the namespace followed by ".". For example, "user.attr".

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to get extended attribute
      name - xattr name.
      Returns:
      byte[] xattr value.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • getXAttrs

      public Map<String,byte[]> getXAttrs(Path path) throws IOException
      Get all of the xattr name/value pairs for a file or directory. Only those xattrs which the logged-in user has permissions to view are returned.

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to get extended attributes
      Returns:
      Map describing the XAttrs of the file or directory
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • getXAttrs

      public Map<String,byte[]> getXAttrs(Path path, List<String> names) throws IOException
      Get the xattr name/value pairs for the given names for a file or directory. Only those xattrs which the logged-in user has permissions to view are returned.

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to get extended attributes
      names - XAttr names.
      Returns:
      Map describing the XAttrs of the file or directory
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • listXAttrs

      public List<String> listXAttrs(Path path) throws IOException
      Get all of the xattr names for a file or directory. Only those xattr names which the logged-in user has permissions to view are returned.

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to get extended attributes
      Returns:
      List<String> of the XAttr names of the file or directory
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • removeXAttr

      public void removeXAttr(Path path, String name) throws IOException
      Remove an xattr of a file or directory. The name must be prefixed with the namespace followed by ".". For example, "user.attr".

      Refer to the HDFS extended attributes user documentation for details.

      Parameters:
      path - Path to remove extended attribute
      name - xattr name
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • satisfyStoragePolicy

      public void satisfyStoragePolicy(Path path) throws IOException
      Set the source path to satisfy storage policy.
      Parameters:
      path - The source path referring to either a directory or a file.
      Throws:
      IOException - If an I/O error occurred.
    • setStoragePolicy

      public void setStoragePolicy(Path src, String policyName) throws IOException
      Set the storage policy for a given file or directory.
      Parameters:
      src - file or directory path.
      policyName - the name of the target storage policy. The list of supported Storage policies can be retrieved via getAllStoragePolicies().
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • unsetStoragePolicy

      public void unsetStoragePolicy(Path src) throws IOException
      Unset the storage policy set for a given file or directory.
      Parameters:
      src - file or directory path.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • getStoragePolicy

      public BlockStoragePolicySpi getStoragePolicy(Path src) throws IOException
      Query the effective storage policy ID for the given file or directory.
      Parameters:
      src - file or directory path.
      Returns:
      storage policy for the given file or directory.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • getAllStoragePolicies

      public Collection<? extends BlockStoragePolicySpi> getAllStoragePolicies() throws IOException
      Retrieve all the storage policies supported by this file system.
      Returns:
      all storage policies supported by this filesystem.
      Throws:
      IOException - IO failure
      UnsupportedOperationException - if the operation is unsupported (default outcome).
    • getTrashRoot

      public Path getTrashRoot(Path path)
      Get the root directory of Trash for current user when the path specified is deleted.
      Parameters:
      path - the path whose trash root is to be determined.
      Returns:
      the default implementation returns /user/$USER/.Trash
    • getTrashRoots

      public Collection<FileStatus> getTrashRoots(boolean allUsers)
      Get all the trash roots for current user or all users.
      Parameters:
      allUsers - return trash roots for all users if true.
      Returns:
      all the trash root directories. Default FileSystem returns .Trash under users' home directories if /user/$USER/.Trash exists.
    • hasPathCapability

      public boolean hasPathCapability(Path path, String capability) throws IOException
      Probe for a specific capability under the given path. The base FileSystem implementation generally has no knowledge of the capabilities of actual implementations; unless it has a way to explicitly determine the capabilities, this method returns false. If the function returns true, this instance is explicitly declaring that the capability is available. If the function returns false, it can mean one of:
      • The capability is not known.
      • The capability is known but it is not supported.
      • The capability is known but the filesystem does not know if it is supported under the supplied path.
      The core guarantee which a caller can rely on is: if the predicate returns true, then the specific operation/behavior can be expected to be supported. However a specific call may be rejected for permission reasons, the actual file/directory not being present, or some other failure during the attempted execution of the operation.

      Implementors: PathCapabilitiesSupport can be used to help implement this method.

      Specified by:
      hasPathCapability in interface org.apache.hadoop.fs.PathCapabilities
      Parameters:
      path - path to query the capability of.
      capability - non-null, non-empty string to query the path for support.
      Returns:
      true if the capability is supported under that part of the FS.
      Throws:
      IOException - this should not be raised, except on problems resolving paths or relaying the call.
    • getFileSystemClass

      public static Class<? extends FileSystem> getFileSystemClass(String scheme, Configuration conf) throws IOException
      Get the FileSystem implementation class of a filesystem. This triggers a scan and load of all FileSystem implementations listed as services and discovered via the ServiceLoader.
      Parameters:
      scheme - URL scheme of FS
      conf - configuration: can be null, in which case the check for a filesystem binding declaration in the configuration is skipped.
      Returns:
      the filesystem implementation class
      Throws:
      UnsupportedFileSystemException - if there was no known implementation for the scheme.
      IOException - if the filesystem could not be loaded
    • getStatistics

      @Deprecated public static Map<String,org.apache.hadoop.fs.FileSystem.Statistics> getStatistics()
      Get the Map of Statistics object indexed by URI Scheme.
      Returns:
      a Map having a key as URI scheme and value as Statistics object
    • getAllStatistics

      @Deprecated public static List<org.apache.hadoop.fs.FileSystem.Statistics> getAllStatistics()
      Return the FileSystem classes that have Statistics.
      Returns:
      statistics lists.
    • getStatistics

      @Deprecated public static org.apache.hadoop.fs.FileSystem.Statistics getStatistics(String scheme, Class<? extends FileSystem> cls)
      Get the statistics for a particular file system.
      Parameters:
      scheme - scheme.
      cls - the class to lookup
      Returns:
      a statistics object
    • clearStatistics

      public static void clearStatistics()
      Reset all statistics for all file systems.
    • printStatistics

      public static void printStatistics() throws IOException
      Print all statistics for all file systems to System.out
      Throws:
      IOException - If an I/O error occurred.
    • areSymlinksEnabled

      @VisibleForTesting public static boolean areSymlinksEnabled()
    • enableSymlinks

      @VisibleForTesting public static void enableSymlinks()
    • getStorageStatistics

      public StorageStatistics getStorageStatistics()
      Get the StorageStatistics for this FileSystem object. These statistics are per-instance. They are not shared with any other FileSystem object.

      This is a default method which is intended to be overridden by subclasses. The default implementation returns an empty storage statistics object.

      Returns:
      The StorageStatistics for this FileSystem instance. Will never be null.
    • getGlobalStorageStatistics

      public static GlobalStorageStatistics getGlobalStorageStatistics()
      Get the global storage statistics.
      Returns:
      global storage statistics.
    • createDataOutputStreamBuilder

      @Unstable protected static FSDataOutputStreamBuilder createDataOutputStreamBuilder(@Nonnull FileSystem fileSystem, @Nonnull Path path)
      Create instance of the standard FSDataOutputStreamBuilder for the given filesystem and path.
      Parameters:
      fileSystem - owner
      path - path to create
      Returns:
      a builder.
    • createFile

      public FSDataOutputStreamBuilder createFile(Path path)
      Create a new FSDataOutputStreamBuilder for the file with path. Files are overwritten by default.
      Parameters:
      path - file path
      Returns:
      a FSDataOutputStreamBuilder object to build the file. HADOOP-14384: temporarily reduce the visibility of this method before the builder interface becomes stable.
    • appendFile

      public FSDataOutputStreamBuilder appendFile(Path path)
      Create a Builder to append a file.
      Parameters:
      path - file path.
      Returns:
      a FSDataOutputStreamBuilder to build file append request.
    • openFile

      Open a file for reading through a builder API. Ultimately calls open(Path, int) unless a subclass executes the open command differently. The semantics of this call are therefore the same as those of open(Path, int), with one difference: the open operation takes place in FSDataInputStreamBuilder.build(), and that is where all preconditions of the operation are checked.
      Parameters:
      path - file path
      Returns:
      a FSDataInputStreamBuilder object to build the input stream
      Throws:
      IOException - if some early checks cause IO failures.
      UnsupportedOperationException - if support is checked early.
    • openFile

      @Unstable public FutureDataInputStreamBuilder openFile(PathHandle pathHandle) throws IOException, UnsupportedOperationException
      Open a file for reading through a builder API. Ultimately calls open(PathHandle, int) unless a subclass executes the open command differently. If PathHandles are unsupported, this may fail in the FSDataInputStreamBuilder.build() command, rather than in this openFile() operation.
      Parameters:
      pathHandle - path handle.
      Returns:
      a FSDataInputStreamBuilder object to build the input stream
      Throws:
      IOException - if some early checks cause IO failures.
      UnsupportedOperationException - if support is checked early.
    • openFileWithOptions

      protected CompletableFuture<FSDataInputStream> openFileWithOptions(Path path, org.apache.hadoop.fs.impl.OpenFileParameters parameters) throws IOException
      Execute the actual open file operation. This is invoked from FSDataInputStreamBuilder.build() and from DelegateToFileSystem and is where the action of opening the file should begin. The base implementation performs a blocking call to open(Path, int) in this call; the actual outcome is in the returned CompletableFuture. This avoids having to create some thread pool, while still setting up the expectation that the get() call is needed to evaluate the result.
      Parameters:
      path - path to the file
      parameters - open file parameters from the builder.
      Returns:
      a future which will evaluate to the opened file.
      Throws:
      IOException - failure to resolve the link.
      IllegalArgumentException - unknown mandatory key
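      The pattern described above (a blocking call whose outcome is delivered through a CompletableFuture, with no thread pool involved) can be sketched with the JDK alone. The class EagerFutureSketch and its methods are hypothetical and only model the idea; this is not the Hadoop implementation.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Hypothetical model of "blocking call now, outcome in the future".
final class EagerFutureSketch {
    // Runs the blocking call on the caller's thread and stores either the
    // result or the failure in the returned, already completed future.
    static <T> CompletableFuture<T> evaluateNow(Callable<T> blockingOpen) {
        CompletableFuture<T> result = new CompletableFuture<>();
        try {
            result.complete(blockingOpen.call());
        } catch (Exception e) {
            result.completeExceptionally(e);
        }
        return result;
    }

    // Demo helper: the caller still has to call get() to see how the open went.
    static String outcome(CompletableFuture<String> future) {
        try {
            return "opened: " + future.get();
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(evaluateNow(() -> "stream for /data/file")));
        System.out.println(outcome(EagerFutureSketch.<String>evaluateNow(() -> {
            throw new FileNotFoundException("/data/missing");
        })));
    }
}
```

      Because failures are stored in the future rather than thrown, callers written against this shape keep working if a subclass later opens the file asynchronously.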
    • openFileWithOptions

      protected CompletableFuture<FSDataInputStream> openFileWithOptions(PathHandle pathHandle, org.apache.hadoop.fs.impl.OpenFileParameters parameters) throws IOException
      Execute the actual open file operation. The base implementation performs a blocking call to open(PathHandle, int) in this call; the actual outcome is in the returned CompletableFuture. This avoids having to create some thread pool, while still setting up the expectation that the get() call is needed to evaluate the result.
      Parameters:
      pathHandle - path handle of the file
      parameters - open file parameters from the builder.
      Returns:
      a future which will evaluate to the opened file.
      Throws:
      IOException - failure to resolve the link.
      IllegalArgumentException - unknown mandatory key
      UnsupportedOperationException - PathHandles are not supported. This may be deferred until the future is evaluated.
    • createDataInputStreamBuilder

      @LimitedPrivate("Filesystems") @Unstable protected static org.apache.hadoop.fs.FileSystem.FSDataInputStreamBuilder createDataInputStreamBuilder(@Nonnull FileSystem fileSystem, @Nonnull Path path)
      Create instance of the standard FileSystem.FSDataInputStreamBuilder for the given filesystem and path.
      Parameters:
      fileSystem - owner
      path - path to read
      Returns:
      a builder.
    • createDataInputStreamBuilder

      @LimitedPrivate("Filesystems") @Unstable protected static org.apache.hadoop.fs.FileSystem.FSDataInputStreamBuilder createDataInputStreamBuilder(@Nonnull FileSystem fileSystem, @Nonnull PathHandle pathHandle)
      Create instance of the standard FileSystem.FSDataInputStreamBuilder for the given filesystem and path handle.
      Parameters:
      fileSystem - owner
      pathHandle - path handle of file to open.
      Returns:
      a builder.
    • getEnclosingRoot

      @Public @Unstable public Path getEnclosingRoot(Path path) throws IOException
      Return the path of the enclosing root for a given path. The enclosing root path is a common ancestor that should be used for temp and staging dirs as well as within encryption zones and other restricted directories. Call makeQualified on the path parameter to ensure it is part of the correct filesystem.
      Parameters:
      path - file path to find the enclosing root path for
      Returns:
      a path to the enclosing root
      Throws:
      IOException - early checks like failure to resolve path cause IO failures
    • createMultipartUploader

      @Unstable public org.apache.hadoop.fs.MultipartUploaderBuilder createMultipartUploader(Path basePath) throws IOException
      Create a multipart uploader.
      Parameters:
      basePath - file path under which all files are uploaded
      Returns:
      a MultipartUploaderBuilder object to build the uploader
      Throws:
      IOException - if some early checks cause IO failures.
      UnsupportedOperationException - if support is checked early.
    • createBulkDelete

      public BulkDelete createBulkDelete(Path path) throws IllegalArgumentException, IOException
      Create a bulk delete operation. The default implementation returns an instance of DefaultBulkDeleteOperation.
      Specified by:
      createBulkDelete in interface BulkDeleteSource
      Parameters:
      path - base path for the operation.
      Returns:
      an instance of the bulk delete.
      Throws:
      IllegalArgumentException - any argument is invalid.
      IOException - if there is an IO problem.