Interface TableProvider
- All Known Subinterfaces:
SessionConfigSupport, SupportsCatalogOptions
@Evolving
public interface TableProvider
The base interface for v2 data sources which don't have a real catalog. Implementations must
have a public, 0-arg constructor.
Note that TableProvider can only apply data operations to existing tables, like read, append, delete, and overwrite. It does not support operations that require metadata changes, like create/drop tables.
The major responsibility of this interface is to return a Table for read/write.
- Since:
- 3.0.0
-
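As an illustration, a minimal implementation might look like the following sketch. The class name, hard-coded schema, and "path" property are hypothetical; Table, TableCapability, TableProvider, Transform, StructType, and CaseInsensitiveStringMap are the real Spark SQL connector API types.

import java.util.Collections;
import java.util.Map;
import java.util.Set;
import org.apache.spark.sql.connector.catalog.Table;
import org.apache.spark.sql.connector.catalog.TableCapability;
import org.apache.spark.sql.connector.catalog.TableProvider;
import org.apache.spark.sql.connector.expressions.Transform;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.util.CaseInsensitiveStringMap;

// Hypothetical provider; a real source would derive metadata from the
// location given in the options (e.g. a file path) instead of hard-coding it.
public class ExampleTableProvider implements TableProvider {

  @Override
  public StructType inferSchema(CaseInsensitiveStringMap options) {
    // Placeholder inference: pretend every table has (id: long, value: string).
    return new StructType().add("id", "long").add("value", "string");
  }

  @Override
  public Table getTable(StructType schema, Transform[] partitioning,
                        Map<String, String> properties) {
    // Report exactly the schema and partitioning Spark passed in, as required.
    return new Table() {
      @Override
      public String name() {
        return properties.getOrDefault("path", "example");
      }

      @Override
      public StructType schema() {
        return schema;
      }

      @Override
      public Transform[] partitioning() {
        return partitioning;
      }

      @Override
      public Set<TableCapability> capabilities() {
        return Collections.emptySet(); // a real table would declare BATCH_READ etc.
      }
    };
  }
}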
Method Summary
- Table getTable(StructType schema, Transform[] partitioning, Map<String,String> properties)
  Return a Table instance with the specified table schema, partitioning and properties to do read/write.
- default Transform[] inferPartitioning(CaseInsensitiveStringMap options)
  Infer the partitioning of the table identified by the given options.
- StructType inferSchema(CaseInsensitiveStringMap options)
  Infer the schema of the table identified by the given options.
- default boolean supportsExternalMetadata()
  Returns true if the source has the ability of accepting external table metadata when getting tables.
-
Method Details
-
inferSchema
StructType inferSchema(CaseInsensitiveStringMap options)
Infer the schema of the table identified by the given options.
- Parameters:
options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
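For context, Spark calls this method when a user loads the source without supplying a schema. A hedged usage sketch, where spark is an existing SparkSession and the format name and path are placeholders:

// No schema given, so Spark asks the provider to infer one via inferSchema(options).
Dataset<Row> df = spark.read()
    .format("com.example.source")    // hypothetical provider name
    .option("path", "/tmp/example")  // surfaces as an entry in the options map
    .load();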
-
inferPartitioning
default Transform[] inferPartitioning(CaseInsensitiveStringMap options)
Infer the partitioning of the table identified by the given options. By default this method returns empty partitioning; please override it if this source supports partitioning.
- Parameters:
options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
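As an illustration, a source that always lays data out by a date column might override it like this sketch. The "date" column is hypothetical; Expressions.identity comes from org.apache.spark.sql.connector.expressions.

@Override
public Transform[] inferPartitioning(CaseInsensitiveStringMap options) {
  // Hypothetical: report identity partitioning on a "date" column.
  return new Transform[] { Expressions.identity("date") };
}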
-
getTable
Table getTable(StructType schema, Transform[] partitioning, Map<String,String> properties)
Return a Table instance with the specified table schema, partitioning and properties to do read/write. The returned table should report the same schema and partitioning as the specified ones, or Spark may fail the operation.
- Parameters:
schema - The specified table schema.
partitioning - The specified table partitioning.
properties - The specified table properties. It's case-preserving (contains exactly what users specified) and implementations are free to use it case-sensitively or case-insensitively. It should be able to identify a table, e.g. file path, Kafka topic name, etc.
-
supportsExternalMetadata
default boolean supportsExternalMetadata()
Returns true if the source has the ability of accepting external table metadata when getting tables. The external table metadata includes:
- For table reader: user-specified schema from DataFrameReader/DataStreamReader and schema/partitioning stored in Spark catalog.
- For table writer: the schema of the input Dataframe of DataframeWriter/DataStreamWriter.
By default this method returns false, which means the schema and partitioning passed to getTable(StructType, Transform[], Map) are from the infer methods. Please override it if this source has expensive schema/partitioning inference and wants external table metadata to avoid inference.
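A source whose inference is expensive might opt in with a minimal override like this sketch:

@Override
public boolean supportsExternalMetadata() {
  // Let Spark pass user-specified or catalog-stored schema/partitioning
  // directly to getTable, skipping inferSchema/inferPartitioning.
  return true;
}

With this override, a call such as spark.read().format(...).schema(userSchema).load() hands the user's schema straight to getTable(StructType, Transform[], Map) instead of invoking the infer methods.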
-