pyspark.sql.plot.core.PySparkPlotAccessor.hist
- PySparkPlotAccessor.hist(column=None, bins=10, **kwargs)
Draw one histogram of the DataFrame’s columns.
A histogram is a representation of the distribution of data.
- Parameters
- column: str or list of str, optional
Column name or list of names to be used for creating the histogram plot. If None (default), all numeric columns will be used.
- bins: integer, default 10
Number of histogram bins to be used.
- **kwargs
Additional keyword arguments.
- Returns
plotly.graph_objs.Figure
Examples
>>> data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
>>> columns = ["length", "width", "species"]
>>> df = spark.createDataFrame(data, columns)
>>> df.plot.hist(bins=4)
>>> df.plot.hist(column=["length", "width"])
>>> df.plot.hist(column="length", bins=4)
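To make the effect of the bins parameter concrete, the following is a minimal pure-Python sketch of equal-width binning applied to the "length" values from the example data. The helper name equal_width_histogram is hypothetical and is not part of PySpark. The sketch assumes bins of equal width spanning the minimum to the maximum value, with the last bin closed on the right; the bin edges PySpark actually computes may differ.

```python
def equal_width_histogram(values, bins=10):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Hypothetical helper for illustration only; not a PySpark API.
    """
    if bins < 1:
        raise ValueError("bins must be a positive integer")
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    # bins + 1 edges: the left edge of every bin, plus the overall maximum.
    edges = [lo + i * width for i in range(bins)] + [hi]
    counts = [0] * bins
    for v in values:
        if width == 0:
            # All values are identical: place them in the first bin.
            idx = 0
        else:
            # Clamp so the maximum falls into the last (right-closed) bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


# The "length" column from the example above.
lengths = [5.1, 4.9, 7.0, 6.4, 5.9]
edges, counts = equal_width_histogram(lengths, bins=4)
print(counts)  # [2, 1, 1, 1]
```

With bins=4, the range from 4.9 to 7.0 is split into four intervals of width 0.525, so a larger bins value gives a finer-grained view of the same distribution.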