pyspark.sql.functions.dayofyear

pyspark.sql.functions.dayofyear(col)

Extract the day of the year of a given date/timestamp as an integer.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or column name

    target date/timestamp column to work on.

Returns
Column

    day of the year of the given date/timestamp as an integer.

Examples

Example 1: Extract the day of the year from a string column representing dates

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('2015-04-08',), ('2024-10-31',)], ['dt'])
>>> df.select("*", sf.typeof('dt'), sf.dayofyear('dt')).show()
+----------+----------+-------------+
|        dt|typeof(dt)|dayofyear(dt)|
+----------+----------+-------------+
|2015-04-08|    string|           98|
|2024-10-31|    string|          305|
+----------+----------+-------------+

Example 2: Extract the day of the year from a string column representing timestamps

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('2015-04-08 13:08:15',), ('2024-10-31 10:09:16',)], ['ts'])
>>> df.select("*", sf.typeof('ts'), sf.dayofyear('ts')).show()
+-------------------+----------+-------------+
|                 ts|typeof(ts)|dayofyear(ts)|
+-------------------+----------+-------------+
|2015-04-08 13:08:15|    string|           98|
|2024-10-31 10:09:16|    string|          305|
+-------------------+----------+-------------+

Example 3: Extract the day of the year from a date column

>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (datetime.date(2015, 4, 8),),
...     (datetime.date(2024, 10, 31),)], ['dt'])
>>> df.select("*", sf.typeof('dt'), sf.dayofyear('dt')).show()
+----------+----------+-------------+
|        dt|typeof(dt)|dayofyear(dt)|
+----------+----------+-------------+
|2015-04-08|      date|           98|
|2024-10-31|      date|          305|
+----------+----------+-------------+

Example 4: Extract the day of the year from a timestamp column

>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (datetime.datetime(2015, 4, 8, 13, 8, 15),),
...     (datetime.datetime(2024, 10, 31, 10, 9, 16),)], ['ts'])
>>> df.select("*", sf.typeof('ts'), sf.dayofyear('ts')).show()
+-------------------+----------+-------------+
|                 ts|typeof(ts)|dayofyear(ts)|
+-------------------+----------+-------------+
|2015-04-08 13:08:15| timestamp|           98|
|2024-10-31 10:09:16| timestamp|          305|
+-------------------+----------+-------------+
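As a quick sanity check outside Spark, the same day-of-year values can be reproduced with Python's standard library: `datetime.date.timetuple()` returns a `struct_time` whose `tm_yday` field is the 1-based day of the year, matching what `dayofyear` produces for the dates above.

```python
import datetime

# The same dates used in the examples above.
dates = [datetime.date(2015, 4, 8), datetime.date(2024, 10, 31)]

# tm_yday is the 1-based ordinal day of the year (Jan 1 -> 1).
day_of_year = [d.timetuple().tm_yday for d in dates]
print(day_of_year)  # [98, 305]
```

Note that 2024 is a leap year, which is why 2024-10-31 is day 305 rather than 304; both `dayofyear` and `tm_yday` account for this automatically.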