CREATE FUNCTION (SQL)
Description
The CREATE FUNCTION statement creates a SQL function that can be used in SQL statements. The function can be temporary or permanent, and can return either a scalar value or a table result. The function body can be defined either a SQL expression or a query.
When TEMPORARY is specified, the function is only available for the current session. Otherwise, it is persisted in the catalog and available across sessions. The OR REPLACE option allows updating an existing function definition, while IF NOT EXISTS prevents errors when creating a function that already exists.
The function parameters must be specified with their data types. The return type can be either a scalar data type or a table with an optional schema definition.
Syntax
CREATE [OR REPLACE] [TEMPORARY] FUNCTION [IF NOT EXISTS]
function_name ( [ function_parameter [, ...] ] )
{ [ RETURNS data_type ] |
RETURNS TABLE [ ( column_spec [, ...]) ] }
[ characteristic [...] ]
RETURN { expression | query }
function_parameter
parameter_name data_type [DEFAULT default_expression] [COMMENT parameter_comment]
column_spec
column_name data_type [COMMENT column_comment]
characteristic
{ LANGUAGE SQL |
[NOT] DETERMINISTIC |
COMMENT function_comment |
[CONTAINS SQL | READS SQL DATA] }
Parameters
-
OR REPLACE
If specified, the function with the same name and signature (number of parameters and parameter types) is replaced. You cannot replace an existing function with a different signature or a procedure. This is mainly useful to update the function body and the return type of the function. You cannot specify this parameter with
IF NOT EXISTS. -
TEMPORARY
The scope of the function being created. When you specify
TEMPORARY, the created function is valid and visible in the current session. Temporary functions live in the per-sessionsystem.sessionnamespace and are dropped when the session ends. No persistent entry is made in the catalog. -
IF NOT EXISTS
If specified, creates the function only when it does not exist. The creation of the function succeeds (no error is thrown) if the specified function already exists in the system. You cannot specify this parameter with
OR REPLACE. -
function_name
A name for the function.
-
For a permanent function, you can optionally qualify the function name with a database name (or a catalog and database). If the name is not qualified the permanent function is created in the current schema.
Syntax:
[ catalog_name. ] [ database_name. ] function_name -
For a temporary function, you can optionally qualify the function name with the session schema (
sessionorsystem.session). Any other qualifier — includingsystem.builtin, the current schema, or an arbitrary database name — is rejected withINVALID_TEMP_OBJ_QUALIFIER. For example,CREATE TEMPORARY FUNCTION session.f ...andCREATE TEMPORARY FUNCTION system.session.f ...are accepted.Syntax:
[ { session | system.session } . ] function_name
The function name must be unique among all routines (procedures and functions) in its schema.
-
-
function_parameter
Specifies a parameter of the function.
-
The parameter name must be unique within the function.
-
Any supported data type.
-
DEFAULT default_expression
An optional default to be used when a function invocation does not assign an argument to the parameter.
default_expressionmust be castable todata_type. The expression must not reference another parameter or contain a subquery.When you specify a default for one parameter, all following parameters must also have a default.
-
COMMENT comment
An optional description of the parameter.
commentmust be aSTRINGliteral.
-
-
RETURNS data_type
The return data type of the scalar function. This clause is optional. The data type will be derived from the SQL function body if it is not provided.
-
RETURNS TABLE [ (column_spec [,…] ) ]
This clause marks the function as a table function. Optionally it also specifies the signature of the result of the table function. If no column_spec is specified it will be derived from the body of the SQL UDF.
-
The column name must be unique within the signature.
-
Any supported data type.
-
COMMENT column_comment
An optional description of the column.
commentmust be aSTRINGliteral.
-
-
**RETURN { expression query }** The body of the function. For a scalar function, it can either be a query or an expression. For a table function, it can only be a query. The expression cannot contain:
- Aggregate functions
- Window functions
- Ranking functions
- Row producing functions such as
explode
A persistent SQL UDF cannot reference temporary views, temporary functions, or session variables.
The SQL Path in effect at
CREATE FUNCTIONtime is captured into the function’s metadata; the body resolves against that frozen path on every invocation, not the invoker’s current path.current_schema()andcurrent_path()inside the body still return the invoker’s context. Use DESCRIBE FUNCTION EXTENDED to inspect the captured path. See SET PATH.Within the body of the function you can refer to parameter by its unqualified name or by qualifying the parameter with the function name.
-
characteristic
All characteristic clauses are optional. You can specify any number of them in any order, but you can specify each clause only once.
-
LANGUAGE SQL
The language of the function implementation.
-
[NOT] DETERMINISTIC
Whether the function is deterministic. A function is deterministic when it returns only one result for a given set of arguments. You may mark a function as
DETERMINISTICwhen its body is not and vice versa. A reason for this may be to encourage or discourage query optimizations such as constant folding or query caching. If you do not specify ths option it is derived from the function body. -
COMMENT function_comment
A comment for the function.
function_commentmust be String literal. -
CONTAINS SQL or READS SQL DATA
Whether a function reads data directly or indirectly from a table or a view. When the function reads SQL data, you cannot specify
CONTAINS SQL. If you don’t specify either clause, the property is derived from the function body.
-
Examples
Create and use a SQL scalar function
> CREATE VIEW t(c1, c2) AS VALUES (0, 1), (1, 2);
-- Create a temporary function with no parameter.
> CREATE TEMPORARY FUNCTION hello() RETURNS STRING
RETURN 'Hello World!';
> SELECT hello();
Hello World!
-- Create a permanent function with parameters.
> CREATE FUNCTION area(x DOUBLE, y DOUBLE) RETURNS DOUBLE RETURN x * y;
-- Use a SQL function in the SELECT clause of a query.
> SELECT area(c1, c2) AS area FROM t;
1.0
1.0
-- Use a SQL function in the WHERE clause of a query.
> SELECT * FROM t WHERE area(c1, c2) > 0;
1 2
-- Compose SQL functions.
> CREATE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN area(x, x);
> SELECT c1, square(c1) AS square FROM t;
0 0.0
1 1.0
-- Create a non-deterministic function
> CREATE FUNCTION roll_dice()
RETURNS INT
NOT DETERMINISTIC
CONTAINS SQL
COMMENT 'Roll a single 6 sided die'
RETURN (rand() * 6)::INT + 1;
-- Roll a single 6-sided die
> SELECT roll_dice();
3
Create a SQL table function
-- Produce all weekdays between two dates
> CREATE FUNCTION weekdays(start DATE, end DATE)
RETURNS TABLE(day_of_week STRING, day DATE)
RETURN SELECT extract(DAYOFWEEK_ISO FROM day), day
FROM (SELECT sequence(weekdays.start, weekdays.end)) AS T(days)
LATERAL VIEW explode(days) AS day
WHERE extract(DAYOFWEEK_ISO FROM day) BETWEEN 1 AND 5;
-- Return all weekdays
> SELECT weekdays.day_of_week, day
FROM weekdays(DATE'2022-01-01', DATE'2022-01-14');
1 2022-01-03
2 2022-01-04
3 2022-01-05
4 2022-01-06
5 2022-01-07
1 2022-01-10
2 2022-01-11
3 2022-01-12
4 2022-01-13
5 2022-01-14
-- Return weekdays for date ranges originating from a LATERAL correlation
> SELECT weekdays.*
FROM VALUES (DATE'2020-01-01'),
(DATE'2021-01-01'),
(DATE'2022-01-01') AS starts(start),
LATERAL weekdays(start, start + INTERVAL '7' DAYS);
3 2020-01-01
4 2020-01-02
5 2020-01-03
1 2020-01-06
2 2020-01-07
3 2020-01-08
5 2021-01-01
1 2021-01-04
2 2021-01-05
3 2021-01-06
4 2021-01-07
5 2021-01-08
1 2022-01-03
2 2022-01-04
3 2022-01-05
4 2022-01-06
5 2022-01-07
Replace a SQL function
-- Replace a SQL scalar function.
> CREATE OR REPLACE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN x * x;
-- Replace a SQL table function.
> CREATE OR REPLACE FUNCTION getemps(deptno INT)
RETURNS TABLE (name STRING)
RETURN SELECT name FROM employee e WHERE e.deptno = getemps.deptno;
-- Describe a SQL table function.
> DESCRIBE FUNCTION getemps;
Function: default.getemps
Type: TABLE
Input: deptno INT
Returns: id INT
name STRING
Describe a SQL function
> DESCRIBE FUNCTION hello;
Function: hello
Type: SCALAR
Input: ()
Returns: STRING
> DESCRIBE FUNCTION area;
Function: default.area
Type: SCALAR
Input: x DOUBLE
y DOUBLE
Returns: DOUBLE
> DESCRIBE FUNCTION roll_dice;
Function: default.roll_dice
Type: SCALAR
Input: num_dice INT
num_sides INT
Returns: INT
Create a temporary SQL function with a session qualifier
-- Unqualified, `session`-qualified, and `system.session`-qualified names all create the same
-- temporary function in the per-session `system.session` namespace.
> CREATE TEMPORARY FUNCTION add_one(x INT) RETURNS INT RETURN x + 1;
> CREATE OR REPLACE TEMPORARY FUNCTION session.add_one(x INT) RETURNS INT
RETURN x + 1;
> CREATE OR REPLACE TEMPORARY FUNCTION system.session.add_one(x INT) RETURNS INT
RETURN x + 1;
-- All three names refer to the same temporary function:
> SELECT add_one(1), session.add_one(1), system.session.add_one(1);
2 2 2
-- DROP TEMPORARY FUNCTION accepts the same qualifiers:
> DROP TEMPORARY FUNCTION session.add_one;
-- Any other qualifier on a TEMPORARY function is rejected.
> CREATE TEMPORARY FUNCTION mydb.bad_temp() RETURNS INT RETURN 1;
[INVALID_TEMP_OBJ_QUALIFIER] qualifier `mydb` is not allowed for temporary FUNCTION ...
> CREATE TEMPORARY FUNCTION system.builtin.bad_temp() RETURNS INT RETURN 1;
[INVALID_TEMP_OBJ_QUALIFIER] qualifier `system`.`builtin` is not allowed for temporary FUNCTION ...
Frozen SQL Path
A SQL UDF captures the SQL Path that is in effect at CREATE FUNCTION time. The body resolves
against that frozen path on every invocation, even if the caller’s session has set a different
PATH. See SET PATH.
> CREATE SCHEMA path_a;
> CREATE SCHEMA path_b;
> CREATE TABLE path_a.t USING parquet AS SELECT 10 AS id;
> CREATE TABLE path_b.t USING parquet AS SELECT 20 AS id;
-- The PATH at CREATE FUNCTION time points at path_a, so unqualified `t` in the body binds to
-- path_a.t.
> SET PATH = spark_catalog.path_a, system.builtin;
> CREATE FUNCTION default.frozen_fn() RETURNS INT
RETURN (SELECT MAX(id) FROM t);
-- Flip the live PATH. The function body still resolves `t` against the frozen path.
> SET PATH = spark_catalog.path_b, system.builtin;
-- A bare query follows the LIVE path:
> SELECT MAX(id) FROM t;
20
-- The function body follows its FROZEN path:
> SELECT default.frozen_fn();
10
-- DESCRIBE FUNCTION EXTENDED shows the captured path:
> DESC FUNCTION EXTENDED default.frozen_fn;
Function: spark_catalog.default.frozen_fn
...
SQL Path: spark_catalog.path_a, system.builtin