public class ClusterClassifier extends AbstractVectorClassifier implements OnlineLearner, org.apache.hadoop.io.Writable
Fields inherited from class AbstractVectorClassifier: MIN_LOG_LIKELIHOOD
Modifier | Constructor and Description |
---|---|
public | ClusterClassifier() |
protected | ClusterClassifier(ClusteringPolicy policy) |
public | ClusterClassifier(List<Cluster> models, ClusteringPolicy policy): The public constructor accepts a list of clusters to become the models. |
Modifier and Type | Method and Description |
---|---|
Vector | classify(Vector instance): Compute and return a vector containing n-1 scores, where n is equal to numCategories(), given an input vector instance. |
double | classifyScalar(Vector instance): Classifies a vector in the special case of a binary classifier where AbstractVectorClassifier.classify(Vector) would return a vector with only one element. |
void | close(): Prepares the classifier for classification and deallocates any temporary data structures. |
List<Cluster> | getModels() |
ClusteringPolicy | getPolicy() |
int | numCategories(): Returns the number of categories that a target variable can be assigned to. |
void | readFields(DataInput in) |
void | readFromSeqFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path) |
static ClusteringPolicy | readPolicy(org.apache.hadoop.fs.Path path) |
void | train(int actual, Vector instance): Updates the model using a particular target variable value and a feature vector. |
void | train(int actual, Vector data, double weight): Train the models given an additional weight. |
void | train(long trackingKey, int actual, Vector instance): Updates the model using a particular target variable value and a feature vector. |
void | train(long trackingKey, String groupKey, int actual, Vector instance): Updates the model using a particular target variable value and a feature vector. |
void | write(DataOutput out) |
static void | writePolicy(ClusteringPolicy policy, org.apache.hadoop.fs.Path path) |
void | writeToSeqFiles(org.apache.hadoop.fs.Path path) |
Methods inherited from class AbstractVectorClassifier: classify, classifyFull, classifyFull, classifyFull, classifyNoLink, classifyScalar, logLikelihood

public ClusterClassifier(List<Cluster> models, ClusteringPolicy policy)
The public constructor accepts a list of clusters to become the models.
Parameters:
models - a List<Cluster>
policy - a ClusteringPolicy

public ClusterClassifier()

protected ClusterClassifier(ClusteringPolicy policy)

public Vector classify(Vector instance)
Description copied from class: AbstractVectorClassifier
Compute and return a vector containing n-1 scores, where n is equal to numCategories(), given an input vector instance. Higher scores indicate that the input vector is more likely to belong to that category. The categories are denoted by the integers 0 through n-1 (inclusive), and the scores in the returned vector correspond to categories 1 through n-1 (leaving out category 0). It is assumed that the score for category 0 is one minus the sum of the scores in the returned vector.
Specified by:
classify in class AbstractVectorClassifier
Parameters:
instance - A feature vector to be classified.
Returns:
A vector of scores in n-1 encoding.

public double classifyScalar(Vector instance)
Description copied from class: AbstractVectorClassifier
Classifies a vector in the special case of a binary classifier where AbstractVectorClassifier.classify(Vector) would return a vector with only one element. As such, using this method can avoid the allocation of a vector.
Specified by:
classifyScalar in class AbstractVectorClassifier
Parameters:
instance - The feature vector to be classified.
See Also:
AbstractVectorClassifier.classify(Vector)
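The n-1 encoding described for classify(Vector) and classifyScalar(Vector) can be illustrated without any Mahout types. The sketch below is only an illustration under stated assumptions: it uses plain double[] arrays in place of Vector, and the names EncodingSketch and fullScores are hypothetical and not part of the Mahout API. It shows how the score for category 0 can be recovered as one minus the sum of the returned scores.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: expands n-1 scores into n scores.
final class EncodingSketch {
    private EncodingSketch() {}

    /** Takes scores for categories 1..n-1 and returns scores for categories 0..n-1. */
    static double[] fullScores(double[] nMinusOneScores) {
        double[] full = new double[nMinusOneScores.length + 1];
        double sum = 0.0;
        for (int i = 0; i < nMinusOneScores.length; i++) {
            full[i + 1] = nMinusOneScores[i]; // category i+1 keeps its score
            sum += nMinusOneScores[i];
        }
        full[0] = 1.0 - sum; // category 0 is implied by the remaining scores
        return full;
    }

    public static void main(String[] args) {
        // Three categories: only the scores for categories 1 and 2 are supplied.
        System.out.println(Arrays.toString(fullScores(new double[] {0.25, 0.5})));
        // Binary case: the single score plays the role of a scalar classification result.
        System.out.println(Arrays.toString(fullScores(new double[] {0.75})));
    }
}
```

In the binary case the input array has one element, which is why a scalar return value is enough and no vector needs to be allocated.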
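
public int numCategories()
Description copied from class: AbstractVectorClassifier
Returns the number of categories that a target variable can be assigned to. Categories are denoted by the integers 0 to numCategories()-1 (inclusive).
Specified by:
numCategories in class AbstractVectorClassifier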

public void write(DataOutput out) throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException

public void readFields(DataInput in) throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException
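The write(DataOutput) and readFields(DataInput) pair follows the usual symmetric pattern: readFields must read the fields in the same order and with the same types that write emitted them. The sketch below illustrates only that pattern with java.io streams. WritableSketch, its two fields, and roundTrip are hypothetical; the class deliberately does not implement org.apache.hadoop.io.Writable so that it compiles without Hadoop, and it does not reflect the actual serialized layout of ClusterClassifier.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that mirrors the write/readFields method pair.
final class WritableSketch {
    int modelCount;      // assumed field, for illustration only
    String policyName;   // assumed field, for illustration only

    void write(DataOutput out) throws IOException {
        out.writeInt(modelCount);
        out.writeUTF(policyName);
    }

    void readFields(DataInput in) throws IOException {
        // Read in exactly the order used by write().
        modelCount = in.readInt();
        policyName = in.readUTF();
    }

    /** Serializes source to a byte array and reads it back into a fresh instance. */
    static WritableSketch roundTrip(WritableSketch source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                source.write(out);
            }
            WritableSketch copy = new WritableSketch();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WritableSketch original = new WritableSketch();
        original.modelCount = 3;
        original.policyName = "example-policy";
        WritableSketch copy = roundTrip(original);
        System.out.println(copy.modelCount + " " + copy.policyName);
    }
}
```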

public void train(int actual, Vector instance)
Description copied from interface: OnlineLearner
Updates the model using a particular target variable value and a feature vector.
Specified by:
train in interface OnlineLearner
Parameters:
actual - The value of the target variable. This value should be in the half-open interval [0..n) where n is the number of target categories.
instance - The feature vector for this example.

public void train(int actual, Vector data, double weight)
Train the models given an additional weight.
Parameters:
actual - the int index of a model
data - a data Vector
weight - a double weighting factor

public void train(long trackingKey, String groupKey, int actual, Vector instance)
Description copied from interface: OnlineLearner
Updates the model using a particular target variable value and a feature vector.
Specified by:
train in interface OnlineLearner
Parameters:
trackingKey - The tracking key for this training example.
groupKey - An optional value that allows examples to be grouped in the computation of the update to the model.
actual - The value of the target variable. This value should be in the half-open interval [0..n) where n is the number of target categories.
instance - The feature vector for this example.

public void train(long trackingKey, int actual, Vector instance)
Description copied from interface: OnlineLearner
Updates the model using a particular target variable value and a feature vector.
Specified by:
train in interface OnlineLearner
Parameters:
trackingKey - The tracking key for this training example.
actual - The value of the target variable. This value should be in the half-open interval [0..n) where n is the number of target categories.
instance - The feature vector for this example.

public void close()
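Description copied from interface: OnlineLearner
Prepares the classifier for classification and deallocates any temporary data structures.
Specified by:
close in interface Closeable
Specified by:
close in interface AutoCloseable
Specified by:
close in interface OnlineLearner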
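The train overloads documented above take actual as an index in the half-open interval [0..n), and one overload adds a weight. As a minimal sketch of what an index-addressed, weighted update might look like, the class below keeps a running weighted mean per model. WeightedTrainSketch and all of its members are hypothetical, it uses double[] in place of Vector, and it is not Mahout's implementation.

```java
// Hypothetical illustration of index-addressed, weighted training; not Mahout code.
final class WeightedTrainSketch {
    private final double[][] weightedSums; // per model: sum of weight * data
    private final double[] totalWeights;   // per model: sum of weights seen so far

    WeightedTrainSketch(int numModels, int dimension) {
        weightedSums = new double[numModels][dimension];
        totalWeights = new double[numModels];
    }

    int numCategories() {
        return totalWeights.length;
    }

    /** Adds one weighted observation to the model selected by actual. */
    void train(int actual, double[] data, double weight) {
        if (actual < 0 || actual >= numCategories()) {
            throw new IllegalArgumentException(
                "actual must be in [0.." + numCategories() + "), got " + actual);
        }
        for (int i = 0; i < data.length; i++) {
            weightedSums[actual][i] += weight * data[i];
        }
        totalWeights[actual] += weight;
    }

    /** Unweighted training is treated here as training with weight 1. */
    void train(int actual, double[] data) {
        train(actual, data, 1.0);
    }

    /** Returns the weighted mean of everything the model has observed. */
    double[] center(int model) {
        double[] center = new double[weightedSums[model].length];
        if (totalWeights[model] == 0.0) {
            return center; // nothing observed yet: all zeros
        }
        for (int i = 0; i < center.length; i++) {
            center[i] = weightedSums[model][i] / totalWeights[model];
        }
        return center;
    }

    public static void main(String[] args) {
        WeightedTrainSketch sketch = new WeightedTrainSketch(2, 2);
        sketch.train(0, new double[] {1.0, 1.0});
        sketch.train(0, new double[] {4.0, 4.0}, 2.0);
        System.out.println(java.util.Arrays.toString(sketch.center(0)));
    }
}
```

Treating the unweighted overload as weight 1 is a design choice of this sketch, made so that both overloads share one code path.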

public ClusteringPolicy getPolicy()

public void writeToSeqFiles(org.apache.hadoop.fs.Path path) throws IOException
Throws:
IOException

public void readFromSeqFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path) throws IOException
Throws:
IOException

public static ClusteringPolicy readPolicy(org.apache.hadoop.fs.Path path) throws IOException
Throws:
IOException

public static void writePolicy(ClusteringPolicy policy, org.apache.hadoop.fs.Path path) throws IOException
Throws:
IOException
Copyright © 2008–2015 The Apache Software Foundation. All rights reserved.