Star (*) Clause
Description
A shorthand to name all the referencable columns in the FROM clause or a specific table reference’s columns or fields in the FROM clause. The star clause is most frequently used in the SELECT list. Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
Syntax
[ name . ] * [ except_clause ]
except_clause
EXCEPT ( { column_name | field_name } [, ...] )
Parameters
-
name
If present limits the columns or fields to be named to those in the specified referencable field, column, or table.
-
except_clause
Optionally prunes columns or fields from the referencable set of columns identified in the select_star clause.
-
column_name
A column that is part of the set of columns that you can reference.
-
field_name
A reference to a field in a column of the set of columns that you can reference. If you exclude all fields from a STRUCT, the result is an empty STRUCT. Each name must reference a column included in the set of columns that you can reference or their fields. Otherwise, Spark SQL raises a UNRESOLVED_COLUMN error. If names overlap or are not unique, Spark raises an EXCEPT_OVERLAPPING_COLUMNS error.
-
Examples
-- Return all columns in the FROM clause
SELECT * FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
1 2 a b
-- Return all columns from TA
SELECT TA.* FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
1 2
-- Return all columns except TA.c1 and TB.cb
SELECT * EXCEPT (c1, cb) FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
2 a
-- Return all columns, but strip the field x from the struct.
SELECT TA.* EXCEPT (c1.x) FROM VALUES(named_struct('x', x, 'y', 'y'), 2) AS (c1, c2), VALUES('a', 'b') AS TB(ca, cb);
{ y } 2 a b
-- Return the first not-NULL column in TA
SELECT coalesce(TA.*) FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
1
-- Return 1 if any column in TB contains a 'c'.
SELECT CASE WHEN 'c' IN (TB.*) THEN 1 END FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
NULL
-- Return all column as a single struct
SELECT (*) FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
{ c1: 1, c2: 2, ca: a, cb: b }
-- Flatten a struct into individual columns
SELECT c1.* FROM VALUES(named_struct('x', 1, 'y', 2)) AS TA(c1);
1 2