pyspark.sql.functions.schema_of_json

pyspark.sql.functions.schema_of_json(json, options=None)

Parses a JSON string and infers its schema in DDL format.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
json : Column or str

a JSON string, or a foldable string column (one that evaluates to a constant, such as a literal) containing a JSON string.

options : dict, optional

options to control parsing. Accepts the same options as the JSON data source. See Data Source Option for the version you use.

Changed in version 3.0.0: Accepts the options parameter to control schema inference.

Returns
Column

a string representation of the StructType inferred from the given JSON.

Examples

>>> import pyspark.sql.functions as sf
>>> parsed1 = sf.schema_of_json(sf.lit('{"a": 0}'))
>>> parsed2 = sf.schema_of_json('{a: 1}', {'allowUnquotedFieldNames':'true'})
>>> spark.range(1).select(parsed1, parsed2).show()
+------------------------+----------------------+
|schema_of_json({"a": 0})|schema_of_json({a: 1})|
+------------------------+----------------------+
|       STRUCT<a: BIGINT>|     STRUCT<a: BIGINT>|
+------------------------+----------------------+