pyspark.pandas.groupby.GroupBy.prod
- GroupBy.prod(numeric_only=False, min_count=0)
Compute the product of the values in each group.
New in version 3.4.0.
- Parameters
- numeric_only : bool, default False
Include only float, int, boolean columns.
Changed in version 4.0.0.
- min_count : int, default 0
Minimum number of non-NA values required to perform the operation. If a group has fewer than min_count non-NA values, its result is NA.
- Returns
- Series or DataFrame
Product of the values within each group.
Examples
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame(
...     {
...         "A": [1, 1, 2, 1, 2],
...         "B": [np.nan, 2, 3, 4, 5],
...         "C": [1, 2, 1, 1, 2],
...         "D": [True, False, True, False, True],
...     }
... )
Group by one column and return the product of the remaining columns in each group.
>>> df.groupby('A').prod().sort_index()
      B  C  D
A
1   8.0  2  0
2  15.0  2  1
>>> df.groupby('A').prod(min_count=3).sort_index()
     B    C    D
A
1  NaN  2.0  0.0
2  NaN  NaN  NaN
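The effect of min_count in the examples above can be reproduced with a small pure-Python sketch. This is only an illustration of the documented semantics, not how pandas-on-Spark computes the result; group_prod is a hypothetical helper introduced here, and it assumes missing values are represented as float NaN.

```python
import math
from collections import defaultdict


def group_prod(keys, values, min_count=0):
    """Hypothetical helper: product of the non-NaN values for each key.

    Returns NaN for a key that has fewer than min_count non-NaN values.
    """
    valid = defaultdict(list)
    for key, value in zip(keys, values):
        bucket = valid[key]  # creates the group even if all its values are NaN
        if not (isinstance(value, float) and math.isnan(value)):
            bucket.append(value)
    return {
        key: math.prod(vals) if len(vals) >= min_count else math.nan
        for key, vals in sorted(valid.items())
    }


# Columns A, B and C from the example DataFrame above.
a = [1, 1, 2, 1, 2]
b = [math.nan, 2.0, 3.0, 4.0, 5.0]
c = [1, 2, 1, 1, 2]

print(group_prod(a, b))               # {1: 8.0, 2: 15.0}
print(group_prod(a, b, min_count=3))  # {1: nan, 2: nan}
print(group_prod(a, c, min_count=3))  # {1: 2, 2: nan}
```

Group 1 of column B holds only two non-NaN values (2.0 and 4.0), so it falls below min_count=3 and yields NaN, while group 1 of column C holds three valid values and keeps its product.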