Interface TableCatalog
- All Superinterfaces:
CatalogPlugin
- All Known Subinterfaces:
CatalogExtension, StagingTableCatalog
- All Known Implementing Classes:
DelegatingCatalogExtension
TableCatalog implementations may be case-sensitive or case-insensitive. Spark will pass
table identifiers without modification. Field names passed to
alterTable(Identifier, TableChange...) will be normalized to match the case used in the
table schema when updating, renaming, or dropping existing columns, provided Catalyst
analysis is case-insensitive.
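Because Spark passes identifiers through unmodified, a case-insensitive catalog must do its own normalization. A minimal sketch of such an internal name index, in plain Java with no Spark dependencies (the class and method names are illustrative, not part of the API):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch: a case-insensitive name index such as a catalog
// might keep internally. Spark passes identifiers through unmodified,
// so the catalog itself decides how to compare them.
class CaseInsensitiveNames {
    // Maps lower-cased name -> name as originally registered.
    private final Map<String, String> names = new HashMap<>();

    void register(String name) {
        names.put(name.toLowerCase(Locale.ROOT), name);
    }

    boolean contains(String name) {
        return names.containsKey(name.toLowerCase(Locale.ROOT));
    }

    // Returns the case used at registration time, mirroring how field
    // names are normalized to the case used in the table schema.
    String canonical(String name) {
        return names.get(name.toLowerCase(Locale.ROOT));
    }
}
```

Lookups succeed regardless of the caller's casing, while the stored schema keeps its original case.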
- Since:
- 3.0.0
-
Field Summary
Fields
- static final String OPTION_PREFIX: A prefix used to pass OPTIONS in table properties.
- static final String PROP_COLLATION: A reserved property to specify the collation of the table.
- static final String PROP_COMMENT: A reserved property to specify the description of the table.
- static final String PROP_EXTERNAL: A reserved property to specify a table was created with EXTERNAL.
- static final String PROP_IS_MANAGED_LOCATION: A reserved property to indicate that the table location is managed, not user-specified.
- static final String PROP_LOCATION: A reserved property to specify the location of the table.
- static final String PROP_OWNER: A reserved property to specify the owner of the table.
- static final String PROP_PROVIDER: A reserved property to specify the provider of the table.
- static final String PROP_TABLE_TYPE: A reserved property that indicates table entity type (external, managed, view, etc.).
Method Summary
- Table alterTable(Identifier ident, TableChange... changes): Apply a set of changes to a table in the catalog.
- default Set<TableCatalogCapability> capabilities(): The set of capabilities for this TableCatalog.
- default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String, String> properties): Deprecated.
- default Table createTable(Identifier ident, TableInfo tableInfo): Create a table in the catalog.
- default Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String, String> properties): Deprecated.
- boolean dropTable(Identifier ident): Drop a table in the catalog.
- default void invalidateTable(Identifier ident): Invalidate cached table metadata for an identifier.
- Identifier[] listTables(String[] namespace): List the tables in a namespace from the catalog.
- default TableSummary[] listTableSummaries(String[] namespace): List the table summaries in a namespace from the catalog.
- Table loadTable(Identifier ident): Load table metadata by identifier from the catalog.
- default Table loadTable(Identifier ident, long timestamp): Load table metadata at a specific time by identifier from the catalog.
- default Table loadTable(Identifier ident, String version): Load table metadata of a specific version by identifier from the catalog.
- default Table loadTable(Identifier ident, Set<TableWritePrivilege> writePrivileges): Load table metadata by identifier from the catalog.
- default boolean purgeTable(Identifier ident): Drop a table in the catalog and completely remove its data by skipping a trash even if it is supported.
- void renameTable(Identifier oldIdent, Identifier newIdent): Renames a table in the catalog.
- default boolean tableExists(Identifier ident): Test whether a table exists using an identifier from the catalog.
- default boolean useNullableQuerySchema(): If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.

Methods inherited from interface org.apache.spark.sql.connector.catalog.CatalogPlugin:
defaultNamespace, initialize, name
-
Field Details
-
PROP_LOCATION
A reserved property to specify the location of the table. The files of the table should be under this location. The location is a Hadoop Path string.
-
PROP_IS_MANAGED_LOCATION
A reserved property to indicate that the table location is managed, not user-specified. If this property is "true", it means the table is a managed table even if it has a location. As an example, SHOW CREATE TABLE will not generate the LOCATION clause.
-
PROP_EXTERNAL
A reserved property to specify a table was created with EXTERNAL.
-
PROP_TABLE_TYPE
A reserved property that indicates table entity type (external, managed, view, etc.).
-
PROP_COMMENT
A reserved property to specify the description of the table.
-
PROP_COLLATION
A reserved property to specify the collation of the table.
-
PROP_PROVIDER
A reserved property to specify the provider of the table.
-
PROP_OWNER
A reserved property to specify the owner of the table.
-
OPTION_PREFIX
A prefix used to pass OPTIONS in table properties.
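Because OPTIONS travel inside the flat table-properties map under this prefix, a catalog can recover the original options by filtering on it. A hedged sketch in plain Java; the helper class is illustrative, and the prefix value "option." is stated here as an assumption about the constant:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative helper: recover OPTIONS entries from a flat table
// properties map. The "option." prefix value is an assumption about
// TableCatalog.OPTION_PREFIX, not taken from this page.
class TableOptions {
    static final String OPTION_PREFIX = "option.";

    static Map<String, String> extractOptions(Map<String, String> properties) {
        Map<String, String> options = new HashMap<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (e.getKey().startsWith(OPTION_PREFIX)) {
                // Strip the prefix to get back the original option key.
                options.put(e.getKey().substring(OPTION_PREFIX.length()), e.getValue());
            }
        }
        return options;
    }
}
```

Non-prefixed entries (reserved properties, plain properties) are left untouched by the filter.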
-
-
Method Details
-
capabilities
default Set<TableCatalogCapability> capabilities()
- Returns:
- the set of capabilities for this TableCatalog
-
listTables
Identifier[] listTables(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
List the tables in a namespace from the catalog. If the catalog supports views, this must return identifiers for only tables and not views.
- Parameters:
- namespace - a multi-part namespace
- Returns:
- an array of Identifiers for tables
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the namespace does not exist (optional).
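The contract above (tables only, never views) can be illustrated with a toy in-memory catalog; everything here is a stand-in for a real implementation, not the Spark API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory catalog illustrating the listTables contract:
// only table identifiers are returned, views are filtered out.
class ToyCatalog {
    enum Kind { TABLE, VIEW }

    // namespace (parts joined with '.') -> entity name -> kind
    private final Map<String, Map<String, Kind>> entities = new LinkedHashMap<>();

    void create(String[] namespace, String name, Kind kind) {
        entities.computeIfAbsent(String.join(".", namespace), k -> new LinkedHashMap<>())
                .put(name, kind);
    }

    String[] listTables(String[] namespace) {
        Map<String, Kind> ns = entities.get(String.join(".", namespace));
        if (ns == null) {
            // Stands in for NoSuchNamespaceException.
            throw new IllegalArgumentException("No such namespace: " + Arrays.toString(namespace));
        }
        List<String> tables = new ArrayList<>();
        for (Map.Entry<String, Kind> e : ns.entrySet()) {
            if (e.getValue() == Kind.TABLE) {
                tables.add(e.getKey());
            }
        }
        return tables.toArray(new String[0]);
    }
}
```

listTableSummaries, by contrast, would return every entity regardless of kind.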
-
listTableSummaries
default TableSummary[] listTableSummaries(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException, org.apache.spark.sql.catalyst.analysis.NoSuchTableException
List the table summaries in a namespace from the catalog. This method should return all table entities from a catalog regardless of type (i.e. views should be listed as well).
- Parameters:
- namespace - a multi-part namespace
- Returns:
- an array of table summaries
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the namespace does not exist (optional).
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If a table listed by the listTables API does not exist.
-
loadTable
Table loadTable(Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
- ident - a table identifier
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
loadTable
default Table loadTable(Identifier ident, Set<TableWritePrivilege> writePrivileges) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata by identifier from the catalog. Spark will write data into this table later. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
- ident - a table identifier
- writePrivileges -
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
- Since:
- 3.5.3
-
loadTable
default Table loadTable(Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata of a specific version by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
- ident - a table identifier
- version - version of the table
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
loadTable
default Table loadTable(Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata at a specific time by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
- ident - a table identifier
- timestamp - timestamp of the table, which is microseconds since 1970-01-01 00:00:00 UTC
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
invalidateTable
default void invalidateTable(Identifier ident)
Invalidate cached table metadata for an identifier. If the table is already loaded or cached, drop cached data. If the table does not exist or is not cached, do nothing. Calling this method should not query remote services.
- Parameters:
- ident - a table identifier
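Since invalidation is a purely local operation, a catalog that caches loaded metadata simply evicts the entry and must not contact remote services. A toy sketch (the remote-load counter stands in for an actual metadata fetch):

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of invalidateTable: evict locally cached metadata only;
// no remote call is made, and invalidating an absent entry is a no-op.
class CachingCatalogSketch {
    private final Map<String, Object> cache = new HashMap<>();
    int remoteLoads = 0;

    Object loadTable(String ident) {
        return cache.computeIfAbsent(ident, k -> {
            remoteLoads++;              // stands in for a remote metadata fetch
            return new Object();
        });
    }

    void invalidateTable(String ident) {
        cache.remove(ident);            // no-op if absent; never queries remote
    }
}
```

After invalidation, the next loadTable call fetches fresh metadata.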
-
tableExists
default boolean tableExists(Identifier ident)
Test whether a table exists using an identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must return false.
- Parameters:
- ident - a table identifier
- Returns:
- true if the table exists, false otherwise
-
createTable
@Deprecated(since="3.4.0") default Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Deprecated. Please override createTable(Identifier, Column[], Transform[], Map) instead.
Create a table in the catalog.
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createTable
@Deprecated(since="4.1.0") default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Deprecated. Please override createTable(Identifier, TableInfo) instead.
Create a table in the catalog.
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createTable
default Table createTable(Identifier ident, TableInfo tableInfo) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Create a table in the catalog.
- Parameters:
- ident - a table identifier
- tableInfo - information about the table
- Returns:
- metadata for the new table. This can be null if getting the metadata for the new table is expensive. Spark will call loadTable(Identifier) if needed (e.g. CTAS).
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If a table or view already exists for the identifier
UnsupportedOperationException - If a requested partition transform is not supported
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
- Since:
- 4.1.0
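The create contract (fail if the identifier is taken; returning null is allowed when producing metadata is expensive) can be sketched in plain Java. The TableInfo and exception types here are stand-ins, not Spark classes:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of the createTable contract: creation fails if the
// identifier is already taken, and the catalog may return null
// instead of freshly loaded metadata.
class ToyCreateCatalog {
    // Minimal stand-in for TableInfo (not the Spark class).
    static final class TableInfo {
        final String comment;
        TableInfo(String comment) { this.comment = comment; }
    }

    private final Map<String, TableInfo> tables = new HashMap<>();

    // Returns null, mirroring the documented option of skipping an
    // expensive metadata load; Spark would call loadTable later if
    // it needs the metadata (e.g. CTAS).
    Object createTable(String ident, TableInfo info) {
        if (tables.containsKey(ident)) {
            // Stands in for TableAlreadyExistsException.
            throw new IllegalStateException("Table already exists: " + ident);
        }
        tables.put(ident, info);
        return null;
    }

    boolean tableExists(String ident) {
        return tables.containsKey(ident);
    }
}
```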
-
useNullableQuerySchema
default boolean useNullableQuerySchema()
If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.
-
alterTable
Table alterTable(Identifier ident, TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Apply a set of changes to a table in the catalog. Implementations may reject the requested changes. If any change is rejected, none of the changes should be applied to the table.
The requested changes must be applied in the order given.
If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
- ident - a table identifier
- changes - changes to apply to the table
- Returns:
- updated metadata for the table. This can be null if getting the metadata for the updated table is expensive. Spark always discards the returned table here.
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
IllegalArgumentException - If any change is rejected by the implementation.
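One common way to satisfy the all-or-nothing requirement (if any change is rejected, none are applied) is to apply the changes in order to a copy and commit only on success. An illustrative sketch with a stand-in change type, not Spark's TableChange:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of all-or-nothing alterTable: changes are
// applied in the order given to a copy of the schema, and the copy
// replaces the original only if every change succeeds.
class ToyAlterCatalog {
    // column name -> type name; a stand-in for a table schema.
    private final Map<String, String> schema = new LinkedHashMap<>();

    ToyAlterCatalog(Map<String, String> initial) {
        schema.putAll(initial);
    }

    // Stand-in for TableChange: here, only add/drop column operations.
    interface Change { void apply(Map<String, String> s); }

    static Change addColumn(String name, String type) {
        return s -> {
            if (s.containsKey(name)) throw new IllegalArgumentException("Exists: " + name);
            s.put(name, type);
        };
    }

    static Change dropColumn(String name) {
        return s -> {
            if (s.remove(name) == null) throw new IllegalArgumentException("Missing: " + name);
        };
    }

    void alterTable(Change... changes) {
        Map<String, String> copy = new LinkedHashMap<>(schema);
        for (Change c : changes) {
            c.apply(copy);   // applied in the order given; may throw
        }
        schema.clear();      // commit only after all changes succeeded
        schema.putAll(copy);
    }

    Map<String, String> schema() { return schema; }
}
```

A rejected change leaves the table exactly as it was, even if earlier changes in the batch would have succeeded.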
-
dropTable
boolean dropTable(Identifier ident)
Drop a table in the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.
- Parameters:
- ident - a table identifier
- Returns:
- true if a table was deleted, false if no table exists for the identifier
-
purgeTable
default boolean purgeTable(Identifier ident)
Drop a table in the catalog and completely remove its data by skipping a trash even if it is supported. If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.
If the catalog supports purging a table, this method should be overridden. The default implementation throws UnsupportedOperationException.
- Parameters:
- ident - a table identifier
- Returns:
- true if a table was deleted, false if no table exists for the identifier
- Throws:
UnsupportedOperationException - If table purging is not supported
- Since:
- 3.1.0
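The default-throws pattern means a catalog without purge support inherits a failing implementation, while one with purge support overrides it. A minimal sketch of that shape; the interface here is a stand-in, not Spark's TableCatalog:

```java
// Sketch of the purgeTable default: unsupported unless overridden.
// ToyTableCatalog is a stand-in interface, not Spark's TableCatalog.
interface ToyTableCatalog {
    boolean dropTable(String ident);

    default boolean purgeTable(String ident) {
        throw new UnsupportedOperationException("Purge table is not supported");
    }
}

// A catalog that never overrides purgeTable keeps the failing default.
class DropOnlyCatalog implements ToyTableCatalog {
    @Override
    public boolean dropTable(String ident) { return true; }
}

// A catalog that supports purging overrides the default.
class PurgingCatalog implements ToyTableCatalog {
    @Override
    public boolean dropTable(String ident) { return true; }

    @Override
    public boolean purgeTable(String ident) {
        // A real implementation would delete the data files directly,
        // skipping any trash, before dropping the table entry.
        return dropTable(ident);
    }
}
```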
-
renameTable
void renameTable(Identifier oldIdent, Identifier newIdent) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Renames a table in the catalog. If the catalog supports views and contains a view for the old identifier and not a table, this throws NoSuchTableException. Additionally, if the new identifier is a table or a view, this throws TableAlreadyExistsException.
If the catalog does not support table renames between namespaces, it throws UnsupportedOperationException.
- Parameters:
- oldIdent - the table identifier of the existing table to rename
- newIdent - the new table identifier of the table
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table to rename doesn't exist or is a view
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If the new table name already exists or is a view
UnsupportedOperationException - If the namespaces of old and new identifiers do not match (optional)
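The three failure modes (missing source, taken destination, unsupported cross-namespace rename) translate directly into precondition checks before the move. A toy sketch using namespace and name strings in place of Spark's Identifier:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of renameTable's precondition checks. The runtime
// exceptions stand in for NoSuchTableException,
// TableAlreadyExistsException, and UnsupportedOperationException.
class ToyRenameCatalog {
    private final Map<String, String> tables = new HashMap<>(); // "ns.name" -> payload
    private final boolean supportsCrossNamespaceRename;

    ToyRenameCatalog(boolean supportsCrossNamespaceRename) {
        this.supportsCrossNamespaceRename = supportsCrossNamespaceRename;
    }

    void create(String ns, String name) { tables.put(ns + "." + name, ""); }

    boolean exists(String ns, String name) { return tables.containsKey(ns + "." + name); }

    void renameTable(String oldNs, String oldName, String newNs, String newName) {
        if (!exists(oldNs, oldName)) {
            throw new IllegalArgumentException("No such table: " + oldNs + "." + oldName);
        }
        if (exists(newNs, newName)) {
            throw new IllegalStateException("Already exists: " + newNs + "." + newName);
        }
        if (!oldNs.equals(newNs) && !supportsCrossNamespaceRename) {
            throw new UnsupportedOperationException("Cannot rename across namespaces");
        }
        tables.put(newNs + "." + newName, tables.remove(oldNs + "." + oldName));
    }
}
```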
-