pyspark.sql.GroupedData.count
GroupedData.count()
Counts the number of records for each group.
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
Examples
>>> df = spark.createDataFrame(
...     [(2, "Alice"), (3, "Alice"), (5, "Bob"), (10, "Bob")], ["age", "name"])
>>> df.show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
|  3|Alice|
|  5|  Bob|
| 10|  Bob|
+---+-----+
Group by name, and count the records in each group.
>>> df.groupBy(df.name).count().sort("name").show()
+-----+-----+
| name|count|
+-----+-----+
|Alice|    2|
|  Bob|    2|
+-----+-----+