Json Functions¶
This page lists all json functions available in Spark SQL.
from_json¶
from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema.
Examples:
> SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
{"a":1,"b":0.8}
> SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
{"time":2015-08-26 00:00:00}
> SELECT from_json('{"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
{"teacher":"Alice","student":[{"name":"Bob","rank":1},{"name":"Charlie","rank":2}]}
Since: 2.2.0
get_json_object¶
get_json_object(json_txt, path) - Extracts a json object from path.
Examples:
> SELECT get_json_object('{"a":"b"}', '$.a');
b
> SELECT get_json_object('[{"a":"b"},{"a":"c"}]', '$[0].a');
b
> SELECT get_json_object('[{"a":"b"},{"a":"c"}]', '$[*].a');
["b","c"]
Since: 1.5.0
json_array_length¶
json_array_length(jsonArray) - Returns the number of elements in the outermost JSON array.
Arguments:
- jsonArray - A JSON array.
NULLis returned in case of any other valid JSON string,NULLor an invalid JSON.
Examples:
> SELECT json_array_length('[1,2,3,4]');
4
> SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
5
> SELECT json_array_length('[1,2');
NULL
Since: 3.1.0
json_object_keys¶
json_object_keys(json_object) - Returns all the keys of the outermost JSON object as an array.
Arguments:
- json_object - A JSON object. If a valid JSON object is given, all the keys of the outermost object will be returned as an array. If it is any other valid JSON string, an invalid JSON string or an empty string, the function returns null.
Examples:
> SELECT json_object_keys('{}');
[]
> SELECT json_object_keys('{"key": "value"}');
["key"]
> SELECT json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
["f1","f2"]
Since: 3.1.0
json_tuple¶
json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.
Examples:
> SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
1 2
Since: 1.6.0
schema_of_json¶
schema_of_json(json[, options]) - Returns schema in the DDL format of JSON string.
Examples:
> SELECT schema_of_json('[{"col":0}]');
ARRAY<STRUCT<col: BIGINT>>
> SELECT schema_of_json('[{"col":01}]', map('allowNumericLeadingZeros', 'true'));
ARRAY<STRUCT<col: BIGINT>>
Since: 2.4.0
to_json¶
to_json(expr[, options]) - Returns a JSON string with a given struct value
Examples:
> SELECT to_json(named_struct('a', 1, 'b', 2));
{"a":1,"b":2}
> SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
{"time":"26/08/2015"}
> SELECT to_json(array(named_struct('a', 1, 'b', 2)));
[{"a":1,"b":2}]
> SELECT to_json(map('a', named_struct('b', 1)));
{"a":{"b":1}}
> SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));
{"[1]":{"b":2}}
> SELECT to_json(map('a', 1));
{"a":1}
> SELECT to_json(array(map('a', 1)));
[{"a":1}]
Since: 2.2.0