pyspark.pandas.groupby.GroupBy.min

GroupBy.min(numeric_only=False, min_count=-1)

Compute min of group values.

Added in version 3.3.0.

Parameters:

numeric_only : bool, default False
    Include only float, int, and boolean columns. If None, will attempt to use everything, then use only numeric data.

    Added in version 3.4.0.

min_count : int, default -1
    The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA.

    Added in version 3.4.0.

See also

pyspark.pandas.Series.groupby
pyspark.pandas.DataFrame.groupby

Examples

>>> df = ps.DataFrame({"A": [1, 2, 1, 2], "B": [True, False, False, True],
...                    "C": [3, 4, 3, 4], "D": ["a", "a", "b", "a"]})
>>> df.groupby("A").min().sort_index()
       B  C  D
A
1  False  3  a
2  False  4  a

Only float, int, and boolean columns are included when numeric_only is set to True.

>>> df.groupby("A").min(numeric_only=True).sort_index()
       B  C
A
1  False  3
2  False  4

>>> df.groupby("D").min().sort_index()
   A      B  C
D
a  1  False  3
b  1  False  3

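With min_count=3, group "b" has only one row, which is fewer than min_count non-NA values, so its results are NA.

>>> df.groupby("D").min(min_count=3).sort_index()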
     A      B    C
D
a  1.0  False  3.0
b  NaN   None  NaN
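
The min_count rule shown in the last example can be sketched in plain Python. This is only an illustration of the documented semantics and not how pandas-on-Spark computes the result; `grouped_min` is a hypothetical helper, and `None` stands in for NA. It reuses columns "D" (keys) and "C" (values) from the DataFrame above.

```python
from collections import defaultdict


def grouped_min(keys, values, min_count=-1):
    """Hypothetical helper: per-group minimum that honors min_count.

    None stands in for NA. A group yields None when it has no valid
    values or fewer than min_count valid values.
    """
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key]  # register the group even if all its values are missing
        if value is not None:
            groups[key].append(value)

    result = {}
    for key, valid in groups.items():
        if not valid or len(valid) < min_count:
            result[key] = None
        else:
            result[key] = min(valid)
    return result


d = ["a", "a", "b", "a"]  # column "D" from the example DataFrame
c = [3, 4, 3, 4]          # column "C" from the example DataFrame

print(grouped_min(d, c))               # {'a': 3, 'b': 3}
print(grouped_min(d, c, min_count=3))  # {'a': 3, 'b': None}
```

With the default min_count=-1 the threshold never triggers, so every group that has at least one valid value reports its minimum.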