CREATE FUNCTION (SQL)

Description

The CREATE FUNCTION statement creates a SQL function that can be used in SQL statements. The function can be temporary or permanent, and can return either a scalar value or a table result. The function body can be defined either a SQL expression or a query.

When TEMPORARY is specified, the function is only available for the current session. Otherwise, it is persisted in the catalog and available across sessions. The OR REPLACE option allows updating an existing function definition, while IF NOT EXISTS prevents errors when creating a function that already exists.

The function parameters must be specified with their data types. The return type can be either a scalar data type or a table with an optional schema definition.

Syntax

CREATE [OR REPLACE] [TEMPORARY] FUNCTION [IF NOT EXISTS]
    function_name ( [ function_parameter [, ...] ] )
    { [ RETURNS data_type ] |
      RETURNS TABLE [ ( column_spec [, ...]) ] }
    [ characteristic [...] ]
    RETURN { expression | query }

function_parameter
    parameter_name data_type [DEFAULT default_expression] [COMMENT parameter_comment]

column_spec
    column_name data_type [COMMENT column_comment]

characteristic
  { LANGUAGE SQL |
    [NOT] DETERMINISTIC |
    COMMENT function_comment |
    [CONTAINS SQL | READS SQL DATA] }

Parameters

Examples

Create and use a SQL scalar function

> CREATE VIEW t(c1, c2) AS VALUES (0, 1), (1, 2);

-- Create a temporary function with no parameter.
> CREATE TEMPORARY FUNCTION hello() RETURNS STRING
    RETURN 'Hello World!';

> SELECT hello();
  Hello World!

-- Create a permanent function with parameters.
> CREATE FUNCTION area(x DOUBLE, y DOUBLE) RETURNS DOUBLE RETURN x * y;

-- Use a SQL function in the SELECT clause of a query.
> SELECT area(c1, c2) AS area FROM t;
 1.0
 1.0

-- Use a SQL function in the WHERE clause of a query.
> SELECT * FROM t WHERE area(c1, c2) > 0;
 1  2

-- Compose SQL functions.
> CREATE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN area(x, x);

> SELECT c1, square(c1) AS square FROM t;
  0  0.0
  1  1.0

-- Create a non-deterministic function
> CREATE FUNCTION roll_dice()
    RETURNS INT
    NOT DETERMINISTIC
    CONTAINS SQL
    COMMENT 'Roll a single 6 sided die'
    RETURN (rand() * 6)::INT + 1;
-- Roll a single 6-sided die
> SELECT roll_dice();
 3

Create a SQL table function

-- Produce all weekdays between two dates
> CREATE FUNCTION weekdays(start DATE, end DATE)
    RETURNS TABLE(day_of_week STRING, day DATE)
    RETURN SELECT extract(DAYOFWEEK_ISO FROM day), day
             FROM (SELECT sequence(weekdays.start, weekdays.end)) AS T(days)
                  LATERAL VIEW explode(days) AS day
             WHERE extract(DAYOFWEEK_ISO FROM day) BETWEEN 1 AND 5;

-- Return all weekdays
> SELECT weekdays.day_of_week, day
    FROM weekdays(DATE'2022-01-01', DATE'2022-01-14');
  1     2022-01-03
  2     2022-01-04
  3     2022-01-05
  4     2022-01-06
  5     2022-01-07
  1     2022-01-10
  2     2022-01-11
  3     2022-01-12
  4     2022-01-13
  5     2022-01-14

-- Return weekdays for date ranges originating from a LATERAL correlation
> SELECT weekdays.*
    FROM VALUES (DATE'2020-01-01'),
                (DATE'2021-01-01'),
                (DATE'2022-01-01') AS starts(start),
         LATERAL weekdays(start, start + INTERVAL '7' DAYS);
  3     2020-01-01
  4     2020-01-02
  5     2020-01-03
  1     2020-01-06
  2     2020-01-07
  3     2020-01-08
  5     2021-01-01
  1     2021-01-04
  2     2021-01-05
  3     2021-01-06
  4     2021-01-07
  5     2021-01-08
  1     2022-01-03
  2     2022-01-04
  3     2022-01-05
  4     2022-01-06
  5     2022-01-07

Replace a SQL function

-- Replace a SQL scalar function.
> CREATE OR REPLACE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN x * x;

-- Replace a SQL table function.
> CREATE OR REPLACE FUNCTION getemps(deptno INT)
    RETURNS TABLE (name STRING)
    RETURN SELECT name FROM employee e WHERE e.deptno = getemps.deptno;

-- Describe a SQL table function.
> DESCRIBE FUNCTION getemps;
 Function: default.getemps
 Type:     TABLE
 Input:    deptno INT
 Returns:  id   INT
           name STRING

Describe a SQL function

> DESCRIBE FUNCTION hello;
 Function: hello
 Type:     SCALAR
 Input:    ()
 Returns:  STRING

> DESCRIBE FUNCTION area;
 Function: default.area
 Type:     SCALAR
 Input:    x DOUBLE
           y DOUBLE
 Returns:  DOUBLE

> DESCRIBE FUNCTION roll_dice;
 Function: default.roll_dice
 Type:     SCALAR
 Input:    num_dice  INT
           num_sides INT
 Returns:  INT

Create a temporary SQL function with a session qualifier

-- Unqualified, `session`-qualified, and `system.session`-qualified names all create the same
-- temporary function in the per-session `system.session` namespace.
> CREATE TEMPORARY FUNCTION add_one(x INT) RETURNS INT RETURN x + 1;

> CREATE OR REPLACE TEMPORARY FUNCTION session.add_one(x INT) RETURNS INT
    RETURN x + 1;

> CREATE OR REPLACE TEMPORARY FUNCTION system.session.add_one(x INT) RETURNS INT
    RETURN x + 1;

-- All three names refer to the same temporary function:
> SELECT add_one(1), session.add_one(1), system.session.add_one(1);
 2  2  2

-- DROP TEMPORARY FUNCTION accepts the same qualifiers:
> DROP TEMPORARY FUNCTION session.add_one;

-- Any other qualifier on a TEMPORARY function is rejected.
> CREATE TEMPORARY FUNCTION mydb.bad_temp() RETURNS INT RETURN 1;
  [INVALID_TEMP_OBJ_QUALIFIER] qualifier `mydb` is not allowed for temporary FUNCTION ...

> CREATE TEMPORARY FUNCTION system.builtin.bad_temp() RETURNS INT RETURN 1;
  [INVALID_TEMP_OBJ_QUALIFIER] qualifier `system`.`builtin` is not allowed for temporary FUNCTION ...

Frozen SQL Path

A SQL UDF captures the SQL Path that is in effect at CREATE FUNCTION time. The body resolves against that frozen path on every invocation, even if the caller’s session has set a different PATH. See SET PATH.

> CREATE SCHEMA path_a;
> CREATE SCHEMA path_b;
> CREATE TABLE path_a.t USING parquet AS SELECT 10 AS id;
> CREATE TABLE path_b.t USING parquet AS SELECT 20 AS id;

-- The PATH at CREATE FUNCTION time points at path_a, so unqualified `t` in the body binds to
-- path_a.t.
> SET PATH = spark_catalog.path_a, system.builtin;
> CREATE FUNCTION default.frozen_fn() RETURNS INT
    RETURN (SELECT MAX(id) FROM t);

-- Flip the live PATH. The function body still resolves `t` against the frozen path.
> SET PATH = spark_catalog.path_b, system.builtin;

-- A bare query follows the LIVE path:
> SELECT MAX(id) FROM t;
 20

-- The function body follows its FROZEN path:
> SELECT default.frozen_fn();
 10

-- DESCRIBE FUNCTION EXTENDED shows the captured path:
> DESC FUNCTION EXTENDED default.frozen_fn;
 Function:    spark_catalog.default.frozen_fn
 ...
 SQL Path:    spark_catalog.path_a, system.builtin