Class AzureBlobFileSystemStore

java.lang.Object
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.fs.azurebfs.services.ListingSupport

@Public @Evolving public class AzureBlobFileSystemStore extends Object implements Closeable, org.apache.hadoop.fs.azurebfs.services.ListingSupport
Provides the bridging logic between Hadoop's abstract filesystem and Azure Storage.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static final class 
    org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.AzureBlobFileSystemStoreBuilder
    A builder class for AzureBlobFileSystemStore.
    static final class 
    org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.Permissions
    Permissions class containing the provided permission and umask in octal notation.
  • Constructor Summary

    Constructors
    Constructor
    Description
    AzureBlobFileSystemStore(org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.AzureBlobFileSystemStoreBuilder abfsStoreBuilder)
    FileSystem Store for AzureBlobFileSystem for Abfs operations.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    access(Path path, FsAction mode, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    breakLease(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
    Break any current lease on an ABFS file.
    void
    close()
     
    void
    createDirectory(Path path, FsPermission permission, FsPermission umask, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
    Creates a directory.
    OutputStream
    createFile(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, boolean overwrite, FsPermission permission, FsPermission umask, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    createFilesystem(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    OutputStream
    createNonRecursive(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, boolean overwrite, FsPermission permission, FsPermission umask, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
    Checks existence of parent of the given path.
    void
    delete(Path path, boolean recursive, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    deleteFilesystem(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    static String
    extractEtagHeader(org.apache.hadoop.fs.azurebfs.services.AbfsHttpOperation result)
    Get the etag header from a response, stripping any quotations.
    org.apache.hadoop.fs.azurebfs.AbfsConfiguration
    getAbfsConfiguration()
     
    AclStatus
    getAclStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    org.apache.hadoop.fs.azurebfs.services.AbfsClient
    getClient()
     
    org.apache.hadoop.fs.azurebfs.services.AbfsClient
    getClient(AbfsServiceType serviceType)
     
    org.apache.hadoop.fs.azurebfs.services.AbfsClientHandler
    getClientHandler()
     
    FileStatus
    getFileStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    Hashtable<String,String>
    getFilesystemProperties(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    boolean
    getIsNamespaceEnabled(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
    Resolves namespace information of the filesystem from the state of isNamespaceEnabled().
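    Hashtable<String,String>
    getPathStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    String
    getPrimaryGroup()
     
    String
    getRelativePath(Path path)
     
    URI
    getUri()
     
    String
    getUser()
     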
    boolean
    isAppendBlobKey(String key)
    Checks if the given key in Azure Storage should be stored as an append blob instead of a block blob.
    boolean
    isInfiniteLeaseKey(String key)
     
    String
    listStatus(Path path, String startFrom, List<FileStatus> fileStatuses, boolean fetchAll, String continuation, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    FileStatus[]
    listStatus(Path path, String startFrom, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    FileStatus[]
    listStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    modifyAclEntries(Path path, List<AclEntry> aclSpec, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    org.apache.hadoop.fs.azurebfs.services.AbfsInputStream
    openFileForRead(Path path, Optional<org.apache.hadoop.fs.impl.OpenFileParameters> parameters, org.apache.hadoop.fs.FileSystem.Statistics statistics, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    org.apache.hadoop.fs.azurebfs.services.AbfsInputStream
    openFileForRead(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    OutputStream
    openFileForWrite(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, boolean overwrite, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    removeAcl(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    removeAclEntries(Path path, List<AclEntry> aclSpec, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    removeDefaultAcl(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    boolean
    rename(Path source, Path destination, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext, String sourceEtag)
    Rename a file or directory.
    void
    restrictServiceTypeToBlob()
    Restricts all service types to BLOB when an FNS account is detected.
    void
    setAcl(Path path, List<AclEntry> aclSpec, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    setFilesystemProperties(Hashtable<String,String> properties, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    setOwner(Path path, String owner, String group, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    setPathProperties(Path path, Hashtable<String,String> properties, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     
    void
    setPermission(Path path, FsPermission permission, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext)
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • AzureBlobFileSystemStore

      public AzureBlobFileSystemStore(org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.AzureBlobFileSystemStoreBuilder abfsStoreBuilder) throws IOException
      FileSystem Store for AzureBlobFileSystem for Abfs operations. Built from an AzureBlobFileSystemStore.AzureBlobFileSystemStoreBuilder that supplies the required parameters.
      Parameters:
      abfsStoreBuilder - Builder for AzureBlobFileSystemStore.
      Throws:
      IOException - thrown if construction fails.
  • Method Details

    • isAppendBlobKey

      public boolean isAppendBlobKey(String key)
      Checks if the given key in Azure Storage should be stored as an append blob instead of a block blob.
      Parameters:
      key - The key to check.
      Returns:
      True if the key should be stored as an append blob, false otherwise.
    • getUser

      public String getUser()
      Returns:
      local user name.
    • getPrimaryGroup

      public String getPrimaryGroup()
      Returns:
      primary group that user belongs to.
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
    • getIsNamespaceEnabled

      public boolean getIsNamespaceEnabled(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Resolves namespace information of the filesystem from the state of isNamespaceEnabled(). If the state is UNKNOWN, it is determined by making a GET_ACL request to the root of the filesystem. The GET_ACL call is synchronized so that only a single call is made to determine the namespace information when multiple threads call this method at the same time. The resolved namespace information is stored back via setNamespaceEnabled(boolean).
      Parameters:
      tracingContext - tracing context
      Returns:
      true if namespace is enabled, false otherwise.
      Throws:
      AzureBlobFileSystemException - server errors.
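      The resolve-once behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop classes: NamespaceStateSketch, its State enum, and the probe supplier (standing in for the GET_ACL request on the filesystem root) are hypothetical names introduced only for this example, and the real store may be structured differently.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the store's cached namespace state; not a Hadoop class.
final class NamespaceStateSketch {
    enum State { UNKNOWN, ENABLED, DISABLED }

    private volatile State state = State.UNKNOWN;
    // Assumption: stands in for the GET_ACL request made against the filesystem root.
    private final BooleanSupplier probe;

    NamespaceStateSketch(BooleanSupplier probe) {
        this.probe = probe;
    }

    boolean isNamespaceEnabled() {
        State current = state;
        if (current != State.UNKNOWN) {
            // Already resolved: answer from the cached state without a remote call.
            return current == State.ENABLED;
        }
        synchronized (this) {
            // Re-check under the lock so concurrent callers trigger only one probe.
            if (state == State.UNKNOWN) {
                state = probe.getAsBoolean() ? State.ENABLED : State.DISABLED;
            }
            return state == State.ENABLED;
        }
    }
}
```

      In this sketch, repeated calls after the first resolution never invoke the probe again, which mirrors the documented intent of making a single request.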
    • getAbfsConfiguration

      public org.apache.hadoop.fs.azurebfs.AbfsConfiguration getAbfsConfiguration()
    • getFilesystemProperties

      public Hashtable<String,String> getFilesystemProperties(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • setFilesystemProperties

      public void setFilesystemProperties(Hashtable<String,String> properties, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • getPathStatus

      public Hashtable<String,String> getPathStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • setPathProperties

      public void setPathProperties(Path path, Hashtable<String,String> properties, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • createFilesystem

      public void createFilesystem(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • deleteFilesystem

      public void deleteFilesystem(org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • createNonRecursive

      public OutputStream createNonRecursive(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, boolean overwrite, FsPermission permission, FsPermission umask, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Checks that the parent of the given path exists, then creates the file.
      Parameters:
      path - Path to check.
      statistics - FileSystem statistics.
      overwrite - Overwrite flag.
      permission - Permission of the path.
      umask - Umask of the path.
      tracingContext - tracing context
      Returns:
      output stream of the created file.
      Throws:
      IOException - if there is an issue with the operation.
    • createFile

      public OutputStream createFile(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, boolean overwrite, FsPermission permission, FsPermission umask, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • createDirectory

      public void createDirectory(Path path, FsPermission permission, FsPermission umask, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Creates a directory.
      Parameters:
      path - Path of the directory to create.
      permission - Permission of the directory.
      umask - Umask of the directory.
      tracingContext - tracing context
      Throws:
      AzureBlobFileSystemException - server error.
      IOException
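      As an illustration of how a permission and a umask combine in octal notation, the sketch below applies the conventional POSIX rule (clear every permission bit that is set in the umask). This page does not state whether the store or the service performs this step, so treat it as an assumption; UmaskSketch and applyUmask are hypothetical names, not part of the ABFS API.

```java
// Hypothetical helper illustrating the conventional permission/umask rule.
final class UmaskSketch {
    /** Returns the permission bits that remain after clearing every bit set in the umask. */
    static int applyUmask(int permission, int umask) {
        // Mask with 0777 so only the nine rwx bits are kept.
        return permission & ~umask & 0777;
    }

    public static void main(String[] args) {
        // Java octal literals start with a leading 0.
        int effective = applyUmask(0777, 0022);
        System.out.println(Integer.toOctalString(effective)); // prints 755
    }
}
```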
    • openFileForRead

      public org.apache.hadoop.fs.azurebfs.services.AbfsInputStream openFileForRead(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • openFileForRead

      public org.apache.hadoop.fs.azurebfs.services.AbfsInputStream openFileForRead(Path path, Optional<org.apache.hadoop.fs.impl.OpenFileParameters> parameters, org.apache.hadoop.fs.FileSystem.Statistics statistics, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • openFileForWrite

      public OutputStream openFileForWrite(Path path, org.apache.hadoop.fs.FileSystem.Statistics statistics, boolean overwrite, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • breakLease

      public void breakLease(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Break any current lease on an ABFS file.
      Parameters:
      path - file name
      tracingContext - TracingContext instance to track correlation IDs
      Throws:
      AzureBlobFileSystemException - on any exception while breaking the lease
    • rename

      public boolean rename(Path source, Path destination, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext, String sourceEtag) throws IOException
      Rename a file or directory. If a source etag is passed in, the operation will attempt to recover from a missing source file by probing the destination for existence and comparing etags.
      Parameters:
      source - path to source file
      destination - destination of rename.
      tracingContext - trace context
      sourceEtag - etag of the source file. May be null or empty.
      Returns:
      true if recovery was needed and succeeded.
      Throws:
      AzureBlobFileSystemException - failure, excluding any recovery from overload failures.
      IOException
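      The etag-based recovery described for rename can be sketched against a toy in-memory store. This is only an illustration of the documented idea (a missing source plus a destination whose etag matches the supplied source etag is treated as an already-completed rename); RenameRecoverySketch and its map are hypothetical, and the sketch throws an unchecked exception where the real method throws IOException.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory store (path -> etag); a hypothetical stand-in for the remote service.
final class RenameRecoverySketch {
    private final Map<String, String> etags = new HashMap<>();

    void put(String path, String etag) {
        etags.put(path, etag);
    }

    /** Returns true only if the source was missing and recovery via etag comparison succeeded. */
    boolean rename(String source, String destination, String sourceEtag) {
        String current = etags.remove(source);
        if (current != null) {
            // Normal case: the source exists, so move it. No recovery was needed.
            etags.put(destination, current);
            return false;
        }
        // Source is missing. Without an etag there is nothing to compare against.
        if (sourceEtag == null || sourceEtag.isEmpty()) {
            throw new IllegalStateException("source not found: " + source);
        }
        // Probe the destination: a matching etag means an earlier attempt already succeeded.
        if (sourceEtag.equals(etags.get(destination))) {
            return true;
        }
        throw new IllegalStateException("source not found and destination etag differs: " + source);
    }
}
```

      A caller that retries after a lost response would therefore see true on the retry rather than a failure, provided it passed the source etag.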
    • delete

      public void delete(Path path, boolean recursive, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • getFileStatus

      public FileStatus getFileStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • listStatus

      public FileStatus[] listStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Specified by:
      listStatus in interface org.apache.hadoop.fs.azurebfs.services.ListingSupport
      Parameters:
      path - The list path.
      tracingContext - Tracks identifiers for request header
      Returns:
      the entries in the path.
      Throws:
      IOException - in case of error
    • listStatus

      @Unstable public FileStatus[] listStatus(Path path, String startFrom, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Specified by:
      listStatus in interface org.apache.hadoop.fs.azurebfs.services.ListingSupport
      Parameters:
      path - the list path.
      startFrom - the entry name that list results should start with. For example, if folder "/folder" contains four files, "afile", "bfile", "hfile", and "ifile", then listStatus(Path("/folder"), "hfile") will return "/folder/hfile" and "/folder/ifile". If startFrom is a non-existent entry name, the list response contains all entries after this non-existent entry in lexical order: listStatus(Path("/folder"), "cfile") will return "/folder/hfile" and "/folder/ifile".
      tracingContext - Tracks identifiers for request header
      Returns:
      the entries in the path, starting from startFrom, in lexical order.
      Throws:
      IOException - in case of error
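      The startFrom semantics from the parameter description can be reproduced with a sorted set. The sketch below is not the ABFS implementation; StartFromSketch and listFrom are hypothetical names, and it works on bare entry names rather than FileStatus objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration of "start listing at startFrom, in lexical order".
final class StartFromSketch {
    static List<String> listFrom(List<String> entryNames, String startFrom) {
        // TreeSet orders strings lexically; tailSet(x, true) keeps entries >= x.
        TreeSet<String> sorted = new TreeSet<>(entryNames);
        return new ArrayList<>(sorted.tailSet(startFrom, true));
    }
}
```

      With the entries "afile", "bfile", "hfile", and "ifile", both listFrom(names, "hfile") and listFrom(names, "cfile") yield "hfile" and "ifile", matching the example in the parameter description.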
    • listStatus

      public String listStatus(Path path, String startFrom, List<FileStatus> fileStatuses, boolean fetchAll, String continuation, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Specified by:
      listStatus in interface org.apache.hadoop.fs.azurebfs.services.ListingSupport
      Parameters:
      path - The list path
      startFrom - The entry name that list results should start with. For example, if folder "/folder" contains four files, "afile", "bfile", "hfile", and "ifile", then listStatus(Path("/folder"), "hfile") will return "/folder/hfile" and "/folder/ifile". If startFrom is a non-existent entry name, the list response contains all entries after this non-existent entry in lexical order: listStatus(Path("/folder"), "cfile") will return "/folder/hfile" and "/folder/ifile".
      fileStatuses - The list to be filled with the FileStatus objects.
      fetchAll - Flag indicating whether the above list should be filled with just one page of results or with the entire result.
      continuation - Continuation token. null means start from the beginning.
      tracingContext - TracingContext instance to track identifiers
      Returns:
      Continuation token
      Throws:
      IOException - in case of error
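      The interplay of fetchAll and the continuation token can be sketched with a toy pager. This is an assumption-laden illustration, not the ABFS implementation: ContinuationSketch is hypothetical, its token is simply the next offset rendered as a string (the real token is opaque and issued by the service), and returning null at the end of the listing is a choice made for this sketch.

```java
import java.util.List;

// Hypothetical pager: fills a caller-supplied list and returns a continuation token.
final class ContinuationSketch {
    private final List<String> all;
    private final int pageSize; // assumed to be greater than zero

    ContinuationSketch(List<String> all, int pageSize) {
        this.all = all;
        this.pageSize = pageSize;
    }

    /** Appends one page (or every remaining page when fetchAll is true) and returns the next token. */
    String list(List<String> results, boolean fetchAll, String continuation) {
        // A null or empty token means "start from the beginning".
        int offset = (continuation == null || continuation.isEmpty())
                ? 0 : Integer.parseInt(continuation);
        do {
            int end = Math.min(offset + pageSize, all.size());
            results.addAll(all.subList(offset, end));
            offset = end;
        } while (fetchAll && offset < all.size());
        // In this sketch, null signals that nothing is left to list.
        return offset < all.size() ? Integer.toString(offset) : null;
    }
}
```

      A caller that wants page-by-page control passes fetchAll as false and feeds each returned token into the next call until the token signals the end of the listing.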
    • setOwner

      public void setOwner(Path path, String owner, String group, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • setPermission

      public void setPermission(Path path, FsPermission permission, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • modifyAclEntries

      public void modifyAclEntries(Path path, List<AclEntry> aclSpec, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • removeAclEntries

      public void removeAclEntries(Path path, List<AclEntry> aclSpec, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • removeDefaultAcl

      public void removeDefaultAcl(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • removeAcl

      public void removeAcl(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • setAcl

      public void setAcl(Path path, List<AclEntry> aclSpec, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • getAclStatus

      public AclStatus getAclStatus(Path path, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws IOException
      Throws:
      IOException
    • access

      public void access(Path path, FsAction mode, org.apache.hadoop.fs.azurebfs.utils.TracingContext tracingContext) throws AzureBlobFileSystemException
      Throws:
      AzureBlobFileSystemException
    • isInfiniteLeaseKey

      public boolean isInfiniteLeaseKey(String key)
    • getRelativePath

      public String getRelativePath(Path path)
    • restrictServiceTypeToBlob

      public void restrictServiceTypeToBlob()
      Restricts all service types to BLOB when an FNS account is detected. Updates the client to reflect the new default service type.
    • getClient

      @VisibleForTesting public org.apache.hadoop.fs.azurebfs.services.AbfsClient getClient()
    • getClient

      @VisibleForTesting public org.apache.hadoop.fs.azurebfs.services.AbfsClient getClient(AbfsServiceType serviceType)
    • getClientHandler

      @VisibleForTesting public org.apache.hadoop.fs.azurebfs.services.AbfsClientHandler getClientHandler()
    • getUri

      @VisibleForTesting public URI getUri()
    • extractEtagHeader

      public static String extractEtagHeader(org.apache.hadoop.fs.azurebfs.services.AbfsHttpOperation result)
      Get the etag header from a response, stripping any quotation marks. See: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag
      Parameters:
      result - response to process.
      Returns:
      the quote-unwrapped etag.
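      Quote stripping of this kind can be sketched on a plain header string. The sketch assumes the value follows the HTTP ETag syntax (an optional W/ weak-validator prefix followed by a double-quoted string); EtagSketch and unquote are hypothetical names, and the sketch takes a String rather than an AbfsHttpOperation.

```java
// Hypothetical helper: unwraps the quotes (and any weak prefix) around an ETag value.
final class EtagSketch {
    static String unquote(String etag) {
        if (etag == null) {
            return null; // no header present
        }
        String value = etag;
        if (value.startsWith("W/\"")) {
            // Weak etag: drop the W/ prefix and the opening quote.
            value = value.substring(3);
        } else if (value.startsWith("\"")) {
            // Strong etag: drop the opening quote.
            value = value.substring(1);
        }
        if (value.endsWith("\"")) {
            // Drop the closing quote.
            value = value.substring(0, value.length() - 1);
        }
        return value;
    }
}
```

      Values that arrive without quotes pass through unchanged, so the result can be compared consistently regardless of how the header was formatted.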