Xml Functions¶
This page lists all xml functions available in Spark SQL.
from_xml¶
from_xml(xmlStr, schema[, options]) - Returns a struct value with the given xmlStr and schema.
Examples:
> SELECT from_xml('<p><a>1</a><b>0.8</b></p>', 'a INT, b DOUBLE');
{"a":1,"b":0.8}
> SELECT from_xml('<p><time>26/08/2015</time></p>', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
{"time":2015-08-26 00:00:00}
> SELECT from_xml('<p><teacher>Alice</teacher><student><name>Bob</name><rank>1</rank></student><student><name>Charlie</name><rank>2</rank></student></p>', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
{"teacher":"Alice","student":[{"name":"Bob","rank":1},{"name":"Charlie","rank":2}]}
Since: 4.0.0
schema_of_xml¶
schema_of_xml(xml[, options]) - Returns schema in the DDL format of XML string.
Examples:
> SELECT schema_of_xml('<p><a>1</a></p>');
STRUCT<a: BIGINT>
> SELECT schema_of_xml('<p><a attr="2">1</a><a>3</a></p>', map('excludeAttribute', 'true'));
STRUCT<a: ARRAY<BIGINT>>
Since: 4.0.0
to_xml¶
to_xml(expr[, options]) - Returns a XML string with a given struct value
Examples:
> SELECT to_xml(named_struct('a', 1, 'b', 2));
<ROW>
<a>1</a>
<b>2</b>
</ROW>
> SELECT to_xml(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
<ROW>
<time>26/08/2015</time>
</ROW>
Since: 4.0.0
xpath¶
xpath(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression.
Examples:
> SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()');
["b1","b2","b3"]
> SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b');
[null,null,null]
Since: 2.0.0
xpath_boolean¶
xpath_boolean(xml, xpath) - Returns true if the XPath expression evaluates to true, or if a matching node is found.
Examples:
> SELECT xpath_boolean('<a><b>1</b></a>','a/b');
true
Since: 2.0.0
xpath_double¶
xpath_double(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples:
> SELECT xpath_double('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
3.0
Since: 2.0.0
xpath_float¶
xpath_float(xml, xpath) - Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples:
> SELECT xpath_float('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
3.0
Since: 2.0.0
xpath_int¶
xpath_int(xml, xpath) - Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples:
> SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
3
Since: 2.0.0
xpath_long¶
xpath_long(xml, xpath) - Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples:
> SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
3
Since: 2.0.0
xpath_number¶
xpath_number(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples:
> SELECT xpath_number('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
3.0
Since: 2.0.0
xpath_short¶
xpath_short(xml, xpath) - Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples:
> SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
3
Since: 2.0.0
xpath_string¶
xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the XPath expression.
Examples:
> SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c');
cc
Since: 2.0.0