pyspark.SparkContext.range

SparkContext.range(start, end=None, step=1, numSlices=None)

Create a new RDD of int containing the elements from start to end (exclusive), with consecutive elements differing by step. It can be called the same way as Python’s built-in range() function. If called with a single argument, that argument is interpreted as end, and start is set to 0.

Added in version 1.5.0.

Parameters:

start (int): the start value
end (int, optional): the end value (exclusive)
step (int, optional, default 1): the incremental step
numSlices (int, optional): the number of partitions of the new RDD; if not given, the context's default parallelism is used

Returns:

RDD: An RDD of int

See also

pyspark.sql.SparkSession.range()

Examples

>>> sc.range(5).collect()
[0, 1, 2, 3, 4]
>>> sc.range(2, 4).collect()
[2, 3]
>>> sc.range(1, 7, 2).collect()
[1, 3, 5]

Generate an RDD with a negative step

>>> sc.range(5, 0, -1).collect()
[5, 4, 3, 2, 1]
>>> sc.range(0, 5, -1).collect()
[]
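
The argument handling in the examples above mirrors Python's built-in range(). As a rough sketch, the pure-Python helper below computes the list that collect() would be expected to return for a given call. The name expected_elements is hypothetical and not part of PySpark, and the sketch assumes the collected RDD keeps the order of the underlying range.

```python
from __future__ import annotations


def expected_elements(start: int, end: int | None = None, step: int = 1) -> list[int]:
    """Hypothetical helper (not a PySpark API): list the values that
    sc.range(start, end, step) is documented to contain."""
    # A single argument is interpreted as `end`, with `start` set to 0.
    if end is None:
        start, end = 0, start
    # Delegate to the built-in range(), which handles negative steps and
    # yields nothing when the step points away from `end`.
    return list(range(start, end, step))


print(expected_elements(5))         # single-argument form
print(expected_elements(5, 0, -1))  # counting down
print(expected_elements(0, 5, -1))  # step points away from end: empty
```

Like the built-in range(), a step whose sign does not lead from start towards end produces an empty result rather than an error.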

Control the number of partitions

>>> sc.range(5, numSlices=1).getNumPartitions()
1
>>> sc.range(5, numSlices=10).getNumPartitions()
10
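
The second call above asks for more partitions than there are elements, so some partitions have to be empty. The pure-Python sketch below illustrates one way a range could be cut into numSlices contiguous chunks. The function split_range and its chunk boundaries are assumptions made for illustration; they are not a statement of how Spark is guaranteed to assign elements to partitions.

```python
from __future__ import annotations


def split_range(r: range, num_slices: int) -> list[list[int]]:
    """Hypothetical helper (not a PySpark API): cut `r` into `num_slices`
    contiguous chunks whose sizes differ by at most one element."""
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    size = len(r)
    # Chunk i covers positions [bounds[i], bounds[i + 1]) of the range.
    bounds = [i * size // num_slices for i in range(num_slices + 1)]
    return [list(r[bounds[i]:bounds[i + 1]]) for i in range(num_slices)]


chunks = split_range(range(5), 10)
print(len(chunks))                    # one chunk per requested slice
print(sum(1 for c in chunks if not c))  # how many chunks are empty
```

With a real SparkContext, the per-partition contents of an RDD can be inspected with glom().collect(), which returns one list per partition.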