Class RandomRDDs

Object
org.apache.spark.mllib.random.RandomRDDs

public class RandomRDDs extends Object
Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
  • Constructor Details

    • RandomRDDs

      public RandomRDDs()
  • Method Details

    • uniformRDD

      public static RDD<Object> uniformRDD(SparkContext sc, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).

      To transform the distribution in the generated RDD from U(0.0, 1.0) to U(a, b), use RandomRDDs.uniformRDD(sc, n, p, seed).map(v => a + (b - a) * v).

      Parameters:
      sc - SparkContext used to create the RDD.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Double] comprised of i.i.d. samples ~ U(0.0, 1.0).
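      The rescaling step described above can be tried locally before it is applied to an RDD. The following is a minimal sketch that does not call Spark: java.util.Random stands in for the U(0.0, 1.0) samples, and the class name UniformRescale and its methods are hypothetical names chosen for illustration.

```java
import java.util.Random;

// Local stand-in for the map step; no Spark involved.
final class UniformRescale {
    // Maps a U(0.0, 1.0) sample v to U(a, b) using v -> a + (b - a) * v.
    static double rescale(double v, double a, double b) {
        return a + (b - a) * v;
    }

    // Draws n samples from U(a, b) with a fixed seed so runs are repeatable.
    static double[] sample(int n, double a, double b, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rescale(rng.nextDouble(), a, b);
        }
        return out;
    }

    public static void main(String[] args) {
        for (double x : sample(5, -2.0, 3.0, 42L)) {
            System.out.println(x);
        }
    }
}
```

      With an actual RDD, the same function body would be the argument passed to map.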
    • uniformJavaRDD

      public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.uniformRDD.
      Parameters:
      jsc - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • uniformJavaRDD

      public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions)
      RandomRDDs.uniformJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • uniformJavaRDD

      public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size)
      RandomRDDs.uniformJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • normalRDD

      public static RDD<Object> normalRDD(SparkContext sc, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples from the standard normal distribution.

      To transform the distribution in the generated RDD from standard normal to some other normal N(mean, sigma^2), use RandomRDDs.normalRDD(sc, n, p, seed).map(v => mean + sigma * v).

      Parameters:
      sc - SparkContext used to create the RDD.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Double] comprised of i.i.d. samples ~ N(0.0, 1.0).
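      The shift-and-scale transformation above can be checked numerically without a cluster. This is a sketch under the assumption that java.util.Random.nextGaussian is an acceptable local stand-in for standard normal samples; NormalShift and its methods are hypothetical names.

```java
import java.util.Random;

// Local check of v -> mean + sigma * v; no Spark involved.
final class NormalShift {
    // Maps a standard normal sample v to a sample from N(mean, sigma^2).
    static double shift(double v, double mean, double sigma) {
        return mean + sigma * v;
    }

    // Draws n samples from N(mean, sigma^2) with a fixed seed.
    static double[] sample(int n, double mean, double sigma, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = shift(rng.nextGaussian(), mean, sigma);
        }
        return out;
    }

    // Arithmetic mean of the samples.
    static double average(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Population standard deviation of the samples around avg.
    static double stdDev(double[] xs, double avg) {
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - avg) * (x - avg);
        }
        return Math.sqrt(sumSq / xs.length);
    }

    public static void main(String[] args) {
        double[] xs = sample(200_000, 10.0, 2.0, 7L);
        double avg = average(xs);
        System.out.println("sample mean = " + avg);
        System.out.println("sample std  = " + stdDev(xs, avg));
    }
}
```

      For a large sample, the printed mean and standard deviation should land close to the requested mean and sigma.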
    • normalJavaRDD

      public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.normalRDD.
      Parameters:
      jsc - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • normalJavaRDD

      public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions)
      RandomRDDs.normalJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • normalJavaRDD

      public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size)
      RandomRDDs.normalJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • poissonRDD

      public static RDD<Object> poissonRDD(SparkContext sc, double mean, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.

      Parameters:
      sc - SparkContext used to create the RDD.
      mean - Mean, or lambda, for the Poisson distribution.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Double] comprised of i.i.d. samples ~ Pois(mean).
    • poissonJavaRDD

      public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.poissonRDD.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • poissonJavaRDD

      public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions)
      RandomRDDs.poissonJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • poissonJavaRDD

      public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size)
      RandomRDDs.poissonJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • exponentialRDD

      public static RDD<Object> exponentialRDD(SparkContext sc, double mean, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.

      Parameters:
      sc - SparkContext used to create the RDD.
      mean - Mean, or 1 / lambda, for the exponential distribution.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Double] comprised of i.i.d. samples ~ Exp(mean).
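      The mean parameter is the reciprocal of the rate lambda, so a rate of 4 events per unit time corresponds to mean = 0.25. The sketch below illustrates that parameterization with inverse-transform sampling in plain Java. It is an assumption-laden illustration, not a description of how Spark's generator is implemented; ExponentialByMean and its methods are hypothetical names.

```java
import java.util.Random;

// Illustrates the mean = 1 / lambda parameterization; no Spark involved.
final class ExponentialByMean {
    // Inverse-transform sampling: if u ~ U(0, 1), then -mean * ln(1 - u) is
    // exponentially distributed with the given mean.
    static double fromUniform(double u, double mean) {
        return -mean * Math.log(1.0 - u);
    }

    // Draws n samples with a fixed seed.
    static double[] sample(int n, double mean, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = fromUniform(rng.nextDouble(), mean);
        }
        return out;
    }

    // Arithmetic mean of the samples.
    static double average(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        double lambda = 4.0;        // rate
        double mean = 1.0 / lambda; // value to pass as the mean parameter
        System.out.println("sample mean = " + average(sample(200_000, mean, 11L)));
    }
}
```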
    • exponentialJavaRDD

      public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.exponentialRDD.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • exponentialJavaRDD

      public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions)
      RandomRDDs.exponentialJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • exponentialJavaRDD

      public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size)
      RandomRDDs.exponentialJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • gammaRDD

      public static RDD<Object> gammaRDD(SparkContext sc, double shape, double scale, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale.

      Parameters:
      sc - SparkContext used to create the RDD.
      shape - Shape parameter (greater than 0) for the gamma distribution.
      scale - Scale parameter (greater than 0) for the gamma distribution.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Double] comprised of i.i.d. samples from the gamma distribution with the given shape and scale.
    • gammaJavaRDD

      public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.gammaRDD.
      Parameters:
      jsc - (undocumented)
      shape - (undocumented)
      scale - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • gammaJavaRDD

      public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions)
      RandomRDDs.gammaJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      shape - (undocumented)
      scale - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • gammaJavaRDD

      public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size)
      RandomRDDs.gammaJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      shape - (undocumented)
      scale - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • logNormalRDD

      public static RDD<Object> logNormalRDD(SparkContext sc, double mean, double std, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation.

      Parameters:
      sc - SparkContext used to create the RDD.
      mean - Mean for the log normal distribution.
      std - Standard deviation for the log normal distribution.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Double] comprised of i.i.d. samples from the log normal distribution with the given mean and standard deviation.
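      A log normal sample is the exponential of a normal sample. The sketch below assumes that mean and std parameterize the underlying normal distribution, that is, the distribution of the logarithm of each sample; this page does not state that explicitly, so treat it as an assumption to verify. It uses plain Java rather than Spark, and LogNormalSketch and its methods are hypothetical names.

```java
import java.util.Random;

// Illustrates exp(mean + std * z) for standard normal z; no Spark involved.
final class LogNormalSketch {
    // Maps a standard normal sample z to a log normal sample.
    // Assumption: mean and std describe the normal distribution of log(x).
    static double fromGaussian(double z, double mean, double std) {
        return Math.exp(mean + std * z);
    }

    // Draws n samples with a fixed seed.
    static double[] sample(int n, double mean, double std, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = fromGaussian(rng.nextGaussian(), mean, std);
        }
        return out;
    }

    // Average of log(x) over the samples; should be close to mean.
    static double averageLog(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += Math.log(x);
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        double[] xs = sample(200_000, 1.0, 0.5, 3L);
        System.out.println("mean of log(x) = " + averageLog(xs));
    }
}
```

      Under this assumption the arithmetic mean of the samples themselves is exp(mean + std^2 / 2), not mean.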
    • logNormalJavaRDD

      public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.logNormalRDD.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      std - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • logNormalJavaRDD

      public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions)
      RandomRDDs.logNormalJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      std - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • logNormalJavaRDD

      public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size)
      RandomRDDs.logNormalJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      std - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • randomRDD

      public static <T> RDD<T> randomRDD(SparkContext sc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed, scala.reflect.ClassTag<T> evidence$1)
      Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.

      Parameters:
      sc - SparkContext used to create the RDD.
      generator - RandomDataGenerator used to populate the RDD.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      evidence$1 - (undocumented)
      Returns:
      RDD[T] comprised of i.i.d. samples produced by generator.
    • randomJavaRDD

      public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed)
      Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.

      Parameters:
      jsc - JavaSparkContext used to create the RDD.
      generator - RandomDataGenerator used to populate the RDD.
      size - Size of the RDD.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[T] comprised of i.i.d. samples produced by generator.
    • randomJavaRDD

      public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions)
      RandomRDDs.randomJavaRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      generator - (undocumented)
      size - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • randomJavaRDD

      public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size)
      RandomRDDs.randomJavaRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      generator - (undocumented)
      size - (undocumented)
      Returns:
      (undocumented)
    • uniformVectorRDD

      public static RDD<Vector> uniformVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).

      Parameters:
      sc - SparkContext used to create the RDD.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD.
      seed - Seed for the RNG that generates the seed for the generator in each partition.
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples ~ U(0.0, 1.0).
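      The seed parameter above feeds an RNG whose outputs seed the generator in each partition. One way to picture that two-level scheme is the sketch below, which derives one seed per partition from a single master seed. It is an illustration in plain Java using java.util.Random, not Spark's actual implementation; PartitionSeeds and derive are hypothetical names.

```java
import java.util.Random;

// Derives per-partition seeds from one master seed; no Spark involved.
final class PartitionSeeds {
    // A single RNG, seeded with the user-supplied seed, draws one seed per partition.
    static long[] derive(long seed, int numPartitions) {
        Random master = new Random(seed);
        long[] seeds = new long[numPartitions];
        for (int i = 0; i < numPartitions; i++) {
            seeds[i] = master.nextLong();
        }
        return seeds;
    }

    public static void main(String[] args) {
        long[] seeds = derive(42L, 4);
        for (int i = 0; i < seeds.length; i++) {
            System.out.println("partition " + i + " seed = " + seeds[i]);
        }
    }
}
```

      Because the derivation is deterministic, the same master seed and partition count yield the same per-partition seeds on every run.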
    • uniformJavaVectorRDD

      public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.uniformVectorRDD.
      Parameters:
      jsc - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • uniformJavaVectorRDD

      public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions)
      RandomRDDs.uniformJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • uniformJavaVectorRDD

      public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols)
      RandomRDDs.uniformJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)
    • normalVectorRDD

      public static RDD<Vector> normalVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.

      Parameters:
      sc - SparkContext used to create the RDD.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples ~ N(0.0, 1.0).
    • normalJavaVectorRDD

      public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.normalVectorRDD.
      Parameters:
      jsc - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • normalJavaVectorRDD

      public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions)
      RandomRDDs.normalJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • normalJavaVectorRDD

      public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols)
      RandomRDDs.normalJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)
    • logNormalVectorRDD

      public static RDD<Vector> logNormalVectorRDD(SparkContext sc, double mean, double std, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.

      Parameters:
      sc - SparkContext used to create the RDD.
      mean - Mean of the log normal distribution.
      std - Standard deviation of the log normal distribution.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples.
    • logNormalJavaVectorRDD

      public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.logNormalVectorRDD.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      std - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • logNormalJavaVectorRDD

      public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions)
      RandomRDDs.logNormalJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      std - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • logNormalJavaVectorRDD

      public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols)
      RandomRDDs.logNormalJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      std - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)
    • poissonVectorRDD

      public static RDD<Vector> poissonVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.

      Parameters:
      sc - SparkContext used to create the RDD.
      mean - Mean, or lambda, for the Poisson distribution.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples ~ Pois(mean).
    • poissonJavaVectorRDD

      public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.poissonVectorRDD.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • poissonJavaVectorRDD

      public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions)
      RandomRDDs.poissonJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • poissonJavaVectorRDD

      public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols)
      RandomRDDs.poissonJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)
    • exponentialVectorRDD

      public static RDD<Vector> exponentialVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.

      Parameters:
      sc - SparkContext used to create the RDD.
      mean - Mean, or 1 / lambda, for the exponential distribution.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples ~ Exp(mean).
    • exponentialJavaVectorRDD

      public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.exponentialVectorRDD.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • exponentialJavaVectorRDD

      public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions)
      RandomRDDs.exponentialJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • exponentialJavaVectorRDD

      public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols)
      RandomRDDs.exponentialJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      mean - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)
    • gammaVectorRDD

      public static RDD<Vector> gammaVectorRDD(SparkContext sc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.

      Parameters:
      sc - SparkContext used to create the RDD.
      shape - Shape parameter (greater than 0) for the gamma distribution.
      scale - Scale parameter (greater than 0) for the gamma distribution.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples from the gamma distribution with the given shape and scale.
    • gammaJavaVectorRDD

      public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.gammaVectorRDD.
      Parameters:
      jsc - (undocumented)
      shape - (undocumented)
      scale - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • gammaJavaVectorRDD

      public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions)
      RandomRDDs.gammaJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      shape - (undocumented)
      scale - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • gammaJavaVectorRDD

      public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols)
      RandomRDDs.gammaJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      shape - (undocumented)
      scale - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)
    • randomVectorRDD

      public static RDD<Vector> randomVectorRDD(SparkContext sc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed)
      Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.

      Parameters:
      sc - SparkContext used to create the RDD.
      generator - RandomDataGenerator used to populate the RDD.
      numRows - Number of Vectors in the RDD.
      numCols - Number of elements in each Vector.
      numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
      seed - Random seed (default: a random long integer).
      Returns:
      RDD[Vector] with vectors containing i.i.d. samples produced by generator.
    • randomJavaVectorRDD

      public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed)
      Java-friendly version of RandomRDDs.randomVectorRDD.
      Parameters:
      jsc - (undocumented)
      generator - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      seed - (undocumented)
      Returns:
      (undocumented)
    • randomJavaVectorRDD

      public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions)
      RandomRDDs.randomJavaVectorRDD with the default seed.
      Parameters:
      jsc - (undocumented)
      generator - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • randomJavaVectorRDD

      public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols)
      RandomRDDs.randomJavaVectorRDD with the default number of partitions and the default seed.
      Parameters:
      jsc - (undocumented)
      generator - (undocumented)
      numRows - (undocumented)
      numCols - (undocumented)
      Returns:
      (undocumented)