pyspark.sql.functions.create_map
- pyspark.sql.functions.create_map(*cols)
Map function: Creates a new map column from an even number of input columns or column references. The inputs are read in order as key-value pairs: the input (key1, value1, key2, value2, …) produces a map that associates key1 with value1, key2 with value2, and so on. The columns can also be passed as a single list.
New in version 2.0.0.
Changed in version 3.4.0: Supports Spark Connect.
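- Parameters
cols : Column or str
The input column names or Column objects, given in key-value order. They can also be passed as a single list of columns.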
- Returns
Column
A new Column of Map type, where each value is a map formed from the corresponding key-value pairs provided in the input arguments.
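The key-value grouping described above can be sketched in plain Python. This is only an illustration of how a flat argument list is read as pairs, not Spark's implementation: `pair_up` is a hypothetical helper, and raising `ValueError` on an odd number of inputs is an assumption made for the sketch.

```python
from typing import Any, Dict


def pair_up(*cols: Any) -> Dict[Any, Any]:
    """Hypothetical helper: read a flat argument list as key-value pairs."""
    # A single list or tuple argument is unpacked, mirroring the list form.
    if len(cols) == 1 and isinstance(cols[0], (list, tuple)):
        cols = tuple(cols[0])
    # An odd number of inputs cannot be split into pairs (assumed error type).
    if len(cols) % 2 != 0:
        raise ValueError("expected an even number of inputs")
    # Even positions are keys, odd positions are the matching values.
    return dict(zip(cols[0::2], cols[1::2]))


print(pair_up("name", "Alice", "gender", "female"))
print(pair_up(["age", 2, "weight", 22.2]))
```

A plain dict does not reproduce Spark's type handling: in a real map column all values share one type, which this sketch makes no attempt to model.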
Examples
Example 1: Basic usage of create_map function.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([("Alice", 2), ("Bob", 5)], ("name", "age"))
>>> df.select(sf.create_map('name', 'age')).show()
+--------------+
|map(name, age)|
+--------------+
|  {Alice -> 2}|
|    {Bob -> 5}|
+--------------+
Example 2: Usage of create_map function with a list of columns.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([("Alice", 2), ("Bob", 5)], ("name", "age"))
>>> df.select(sf.create_map([df.name, df.age])).show()
+--------------+
|map(name, age)|
+--------------+
|  {Alice -> 2}|
|    {Bob -> 5}|
+--------------+
Example 3: Usage of create_map function with more than one key-value pair.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([("Alice", 2, "female"),
...     ("Bob", 5, "male")], ("name", "age", "gender"))
>>> df.select(sf.create_map(sf.lit('name'), df['name'],
...     sf.lit('gender'), df['gender'])).show(truncate=False)
+---------------------------------+
|map(name, name, gender, gender)  |
+---------------------------------+
|{name -> Alice, gender -> female}|
|{name -> Bob, gender -> male}    |
+---------------------------------+
Example 4: Usage of create_map function with values of different types. The integer age values appear as doubles in the output, matching the type of weight.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([("Alice", 2, 22.2),
...     ("Bob", 5, 36.1)], ("name", "age", "weight"))
>>> df.select(sf.create_map(sf.lit('age'), df['age'],
...     sf.lit('weight'), df['weight'])).show(truncate=False)
+-----------------------------+
|map(age, age, weight, weight)|
+-----------------------------+
|{age -> 2.0, weight -> 22.2} |
|{age -> 5.0, weight -> 36.1} |
+-----------------------------+
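Because create_map expects its inputs in alternating key, value order, a Python dict has to be flattened before it can be used as input. The following is a minimal sketch of that flattening step in plain Python; `flatten_pairs` and the `codes` data are hypothetical names introduced for illustration and are not part of PySpark.

```python
from itertools import chain
from typing import Any, Dict, List


def flatten_pairs(mapping: Dict[Any, Any]) -> List[Any]:
    """Hypothetical helper: turn {k1: v1, k2: v2} into [k1, v1, k2, v2]."""
    # dict.items() yields (key, value) tuples; chaining them interleaves
    # keys and values in the order create_map expects.
    return list(chain.from_iterable(mapping.items()))


codes = {"Alice": "A", "Bob": "B"}  # illustrative data, not from the source
flat = flatten_pairs(codes)
print(flat)  # ['Alice', 'A', 'Bob', 'B']
```

With a Spark session available, each element would typically be wrapped in sf.lit before being passed on, for example sf.create_map([sf.lit(x) for x in flat]), since bare strings are interpreted as column names, as in Example 1.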