public class FuzzyKMeansDriver extends AbstractJob
Modifier and Type | Field and Description |
---|---|
static String | M_OPTION |
Fields inherited from class AbstractJob: argMap, inputFile, inputPath, outputFile, outputPath, tempPath
Constructor and Description |
---|
FuzzyKMeansDriver() |
Modifier and Type | Method and Description |
---|---|
static org.apache.hadoop.fs.Path | buildClusters(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, int maxIterations, float m, boolean runSequential): Iterate over the input vectors to produce cluster directories for each iteration. |
static void | clusterData(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, float m, boolean emitMostLikely, double threshold, boolean runSequential): Run the job using supplied arguments. |
static void | main(String[] args) |
static void | run(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, int maxIterations, float m, boolean runClustering, boolean emitMostLikely, double threshold, boolean runSequential): Iterate over the input vectors to produce clusters and, if requested, use the results of the final iteration to cluster the input vectors. |
static void | run(org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, int maxIterations, float m, boolean runClustering, boolean emitMostLikely, double threshold, boolean runSequential): Iterate over the input vectors to produce clusters and, if requested, use the results of the final iteration to cluster the input vectors. |
int | run(String[] args) |
Methods inherited from class AbstractJob: addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, buildOption, buildOption, getAnalyzerClassFromOption, getCLIOption, getConf, getDimensions, getFloat, getFloat, getGroup, getInputFile, getInputPath, getInt, getInt, getOption, getOption, getOption, getOptions, getOutputFile, getOutputPath, getOutputPath, getTempPath, getTempPath, hasOption, keyFor, maybePut, parseArguments, parseArguments, parseDirectories, prepareJob, prepareJob, prepareJob, prepareJob, setConf, setS3SafeCombinedInputPath, shouldRunNextPhase
public static final String M_OPTION
public static void run(org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, int maxIterations, float m, boolean runClustering, boolean emitMostLikely, double threshold, boolean runSequential) throws IOException, ClassNotFoundException, InterruptedException
Parameters:
input - the directory pathname for input points
clustersIn - the directory pathname for initial and computed clusters
output - the directory pathname for output points
convergenceDelta - the convergence delta value
maxIterations - the maximum number of iterations
m - the fuzzification factor; see http://en.wikipedia.org/wiki/Data_clustering#Fuzzy_c-means_clustering
runClustering - true if points are to be clustered after the iterations complete
emitMostLikely - if true, emit only the most likely cluster for each point
threshold - when emitMostLikely is false, emit all clusters whose pdf is greater than this value
runSequential - if true, run in sequential execution mode
Throws:
IOException
ClassNotFoundException
InterruptedException
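To make the role of the fuzzification factor m concrete, the following is a minimal sketch of the textbook fuzzy c-means membership formula described in the linked Wikipedia article. It is not taken from the Mahout sources; the class name `FuzzyMembership` and the method `memberships` are hypothetical, and the handling of zero distances is an assumption made for this sketch.

```java
import java.util.Arrays;

/** Illustrative only: textbook fuzzy c-means memberships, not Mahout's implementation. */
final class FuzzyMembership {
    private FuzzyMembership() {}

    /**
     * Computes the membership of one point in each cluster from its distances
     * to the cluster centers, using u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)).
     */
    static double[] memberships(double[] distances, double m) {
        if (m <= 1.0) {
            throw new IllegalArgumentException("m must be greater than 1");
        }
        int n = distances.length;
        double[] u = new double[n];

        // Assumption: a point lying exactly on one or more centers is split
        // evenly among those centers and gets zero membership elsewhere.
        int zeros = 0;
        for (double d : distances) {
            if (d == 0.0) {
                zeros++;
            }
        }
        if (zeros > 0) {
            for (int i = 0; i < n; i++) {
                u[i] = distances[i] == 0.0 ? 1.0 / zeros : 0.0;
            }
            return u;
        }

        double exponent = 2.0 / (m - 1.0);
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++) {
                sum += Math.pow(distances[i] / distances[k], exponent);
            }
            u[i] = 1.0 / sum;
        }
        return u;
    }

    public static void main(String[] args) {
        double[] distances = {1.0, 3.0};
        // A larger m spreads membership more evenly across clusters.
        System.out.println(Arrays.toString(memberships(distances, 2.0)));
        System.out.println(Arrays.toString(memberships(distances, 4.0)));
    }
}
```

Values of m close to 1 push each point toward a single cluster, while larger values make the assignment softer, which is why m must be greater than 1 in this formulation.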
public static void run(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, int maxIterations, float m, boolean runClustering, boolean emitMostLikely, double threshold, boolean runSequential) throws IOException, ClassNotFoundException, InterruptedException
Parameters:
input - the directory pathname for input points
clustersIn - the directory pathname for initial and computed clusters
output - the directory pathname for output points
convergenceDelta - the convergence delta value
maxIterations - the maximum number of iterations
m - the fuzzification factor; see http://en.wikipedia.org/wiki/Data_clustering#Fuzzy_c-means_clustering
runClustering - true if points are to be clustered after the iterations complete
emitMostLikely - if true, emit only the most likely cluster for each point
threshold - when emitMostLikely is false, emit all clusters whose pdf is greater than this value
runSequential - if true, run in sequential execution mode
Throws:
IOException
ClassNotFoundException
InterruptedException
public static org.apache.hadoop.fs.Path buildClusters(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, int maxIterations, float m, boolean runSequential) throws IOException, InterruptedException, ClassNotFoundException
Parameters:
input - the directory pathname for input points
clustersIn - the file pathname for initial cluster centers
output - the directory pathname for output points
convergenceDelta - the convergence delta value
maxIterations - the maximum number of iterations
m - the fuzzification factor; see http://en.wikipedia.org/wiki/Data_clustering#Fuzzy_c-means_clustering
runSequential - if true, run in sequential execution mode
Throws:
IOException
InterruptedException
ClassNotFoundException
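The interplay between convergenceDelta and maxIterations can be sketched with a small, self-contained loop. This is an illustrative one-dimensional fuzzy c-means iteration, not Mahout's code: the class `FuzzyIterationSketch`, the method `iterate`, and the stopping rule (stop once no center moves by more than convergenceDelta) are assumptions chosen for the example.

```java
/** Illustrative only: a 1-D fuzzy c-means loop showing the two stopping criteria. */
final class FuzzyIterationSketch {
    private FuzzyIterationSketch() {}

    /**
     * Updates the centers in place and returns the number of iterations performed.
     * Assumption: iteration stops when every center moves by at most
     * convergenceDelta, or when maxIterations is reached, whichever comes first.
     */
    static int iterate(double[] points, double[] centers, double m,
                       double convergenceDelta, int maxIterations) {
        if (points.length == 0 || centers.length == 0) {
            return 0;
        }
        double exponent = 2.0 / (m - 1.0);
        int iteration = 0;
        boolean converged = false;

        while (!converged && iteration < maxIterations) {
            double[] weightedSum = new double[centers.length];
            double[] weightTotal = new double[centers.length];

            for (double x : points) {
                for (int i = 0; i < centers.length; i++) {
                    // Small floor avoids division by zero when a point sits on a center.
                    double di = Math.max(Math.abs(x - centers[i]), 1e-12);
                    double sum = 0.0;
                    for (double c : centers) {
                        double dk = Math.max(Math.abs(x - c), 1e-12);
                        sum += Math.pow(di / dk, exponent);
                    }
                    // Membership raised to the power m weights the center update.
                    double weight = Math.pow(1.0 / sum, m);
                    weightedSum[i] += weight * x;
                    weightTotal[i] += weight;
                }
            }

            converged = true;
            for (int i = 0; i < centers.length; i++) {
                double updated = weightedSum[i] / weightTotal[i];
                if (Math.abs(updated - centers[i]) > convergenceDelta) {
                    converged = false;
                }
                centers[i] = updated;
            }
            iteration++;
        }
        return iteration;
    }

    public static void main(String[] args) {
        double[] points = {0.8, 1.0, 1.2, 4.8, 5.0, 5.2};
        double[] centers = {0.0, 6.0};
        int used = iterate(points, centers, 2.0, 1e-3, 50);
        System.out.println(used + " iterations, centers = "
            + centers[0] + ", " + centers[1]);
    }
}
```

A small convergenceDelta trades extra iterations for more stable centers, and maxIterations bounds the cost when the centers keep drifting.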
public static void clusterData(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path clustersIn, org.apache.hadoop.fs.Path output, double convergenceDelta, float m, boolean emitMostLikely, double threshold, boolean runSequential) throws IOException, ClassNotFoundException, InterruptedException
Parameters:
input - the directory pathname for input points
clustersIn - the directory pathname for input clusters
output - the directory pathname for output points
convergenceDelta - the convergence delta value
emitMostLikely - if true, emit only the most likely cluster for each point
threshold - when emitMostLikely is false, emit all clusters whose pdf is greater than this value
runSequential - if true, run in sequential execution mode
Throws:
IOException
ClassNotFoundException
InterruptedException
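One reading of the emitMostLikely and threshold parameters is sketched below. It follows only the parameter descriptions on this page and is not taken from the Mahout sources; the class `ClusterEmission`, the method `clustersToEmit`, and the use of a plain array of per-cluster pdf values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: selecting which clusters to emit for a single point. */
final class ClusterEmission {
    private ClusterEmission() {}

    /**
     * Returns the indices of the clusters to emit for one point.
     * Assumption: with emitMostLikely set, only the cluster with the highest
     * pdf is emitted; otherwise every cluster whose pdf exceeds threshold is.
     */
    static List<Integer> clustersToEmit(double[] pdf, boolean emitMostLikely,
                                        double threshold) {
        List<Integer> selected = new ArrayList<>();
        if (pdf.length == 0) {
            return selected;
        }
        if (emitMostLikely) {
            int best = 0;
            for (int i = 1; i < pdf.length; i++) {
                if (pdf[i] > pdf[best]) {
                    best = i;
                }
            }
            selected.add(best);
        } else {
            for (int i = 0; i < pdf.length; i++) {
                if (pdf[i] > threshold) {
                    selected.add(i);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        double[] pdf = {0.1, 0.6, 0.3};
        System.out.println(clustersToEmit(pdf, true, 0.0));
        System.out.println(clustersToEmit(pdf, false, 0.2));
    }
}
```

Under this reading, threshold has no effect when emitMostLikely is true, which matches the "(emitMostLikely = false)" note in the original parameter description.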
Copyright © 2008–2015 The Apache Software Foundation. All rights reserved.