public class LuceneIterator extends AbstractLuceneIterator
Modifier and Type | Field and Description |
---|---|
protected String | idField |
protected Set<String> | idFieldSelector |
Fields inherited from class AbstractLuceneIterator: bump, field, indexReader, maxErrorDocs, nextDocId, nextLogRecord, normPower, numErrorDocs, skippedErrorMessages, terminfo, weight
Constructor and Description |
---|
LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, TermInfo termInfo, Weight weight, double normPower) Produce a LuceneIterator that can create the Vector and normalize it. |
LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, TermInfo termInfo, Weight weight, double normPower, double maxPercentErrorDocs) |
Modifier and Type | Method and Description |
---|---|
protected String | getVectorName(int documentIndex) Given the document index, derive a name for the vector. |
Methods inherited from class AbstractLuceneIterator: computeNext
protected final String idField
public LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, TermInfo termInfo, Weight weight, double normPower)
Parameters:
indexReader - IndexReader to read the documents from.
idField - field containing the id. May be null.
field - field to use for the Vector
termInfo - termInfo
weight - weight
normPower - the normalization value. Must be non-negative, or LuceneIterable.NO_NORMALIZING
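To make the normPower parameter concrete, here is a minimal sketch of p-norm normalization in plain Java. It is not Mahout's implementation: the class NormPowerSketch, its normalize method, and the sentinel value -1.0 are all hypothetical stand-ins (the actual value of LuceneIterable.NO_NORMALIZING is not stated on this page), and the sketch only covers positive finite powers.

```java
import java.util.Arrays;

/** Illustrative stand-in; not part of Mahout or Lucene. */
final class NormPowerSketch {
    /** Hypothetical sentinel for "skip normalization"; the real constant's value is an assumption. */
    static final double NO_NORMALIZING = -1.0;

    /** Returns a copy of values scaled to unit p-norm, where p = normPower. */
    static double[] normalize(double[] values, double normPower) {
        if (normPower == NO_NORMALIZING) {
            return values.clone(); // sentinel: leave the weights untouched
        }
        if (!(normPower > 0.0) || Double.isInfinite(normPower)) {
            throw new IllegalArgumentException("sketch handles positive finite powers only");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.pow(Math.abs(v), normPower);
        }
        double norm = Math.pow(sum, 1.0 / normPower);
        double[] out = values.clone();
        if (norm == 0.0) {
            return out; // all-zero vector: nothing to scale
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= norm;
        }
        return out;
    }

    public static void main(String[] args) {
        // Scale the vector (3, 4) by its Euclidean norm (normPower = 2).
        System.out.println(Arrays.toString(normalize(new double[] {3.0, 4.0}, 2.0)));
    }
}
```

With normPower = 2 each vector is scaled to unit Euclidean length, while normPower = 1 makes the absolute weights sum to one.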
public LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, TermInfo termInfo, Weight weight, double normPower, double maxPercentErrorDocs)
Parameters:
indexReader - IndexReader to read the documents from.
idField - field containing the id. May be null.
field - field to use for the Vector
termInfo - termInfo
weight - weight
normPower - the normalization value. Must be non-negative, or LuceneIterable.NO_NORMALIZING
maxPercentErrorDocs - the maximum fraction of documents without a term frequency vector that will be tolerated. Must be in [0,1].
See Also:
LuceneIterator(org.apache.lucene.index.IndexReader, String, String, org.apache.mahout.utils.vectors.TermInfo, org.apache.mahout.vectorizer.Weight, double)
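One plausible reading of maxPercentErrorDocs is that the fraction is converted into an absolute document budget and compared against a running error count. The sketch below illustrates that idea in plain Java. The names maxErrorDocs and numErrorDocs mirror the inherited fields listed on this page, but the class ErrorBudgetSketch and the way the fields interact are assumptions, not the library's confirmed behavior.

```java
/** Illustrative stand-in; not part of Mahout or Lucene. */
final class ErrorBudgetSketch {
    private final int maxErrorDocs; // absolute budget derived from the fraction
    private int numErrorDocs;       // documents seen so far without a term vector

    ErrorBudgetSketch(int numDocs, double maxPercentErrorDocs) {
        if (!(maxPercentErrorDocs >= 0.0 && maxPercentErrorDocs <= 1.0)) {
            throw new IllegalArgumentException("maxPercentErrorDocs must be in [0,1]");
        }
        // Assumption: the fraction is applied to the total document count and truncated.
        this.maxErrorDocs = (int) (maxPercentErrorDocs * numDocs);
    }

    /** Records one document lacking a term vector; returns false once the budget is exceeded. */
    boolean recordErrorDoc() {
        numErrorDocs++;
        return numErrorDocs <= maxErrorDocs;
    }

    int maxErrorDocs() {
        return maxErrorDocs;
    }

    public static void main(String[] args) {
        ErrorBudgetSketch budget = new ErrorBudgetSketch(8, 0.25);
        System.out.println("budget = " + budget.maxErrorDocs());
        System.out.println("first error tolerated: " + budget.recordErrorDoc());
    }
}
```

Under this reading, a value of 0 tolerates no documents without term vectors and a value of 1 tolerates all of them.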
protected String getVectorName(int documentIndex) throws IOException
Specified by:
getVectorName in class AbstractLuceneIterator
Parameters:
documentIndex - the Lucene document index.
Throws:
IOException
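Since idField may be null, a natural way to derive the vector name is to fall back to the document index when no id field is configured. The following plain-Java sketch shows that fallback. It is an assumption about the behavior, not the library's confirmed implementation: VectorNameSketch and vectorName are hypothetical, and a nested Map stands in for the stored fields that a real IndexReader would provide.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-in; not part of Mahout or Lucene. */
final class VectorNameSketch {
    /**
     * Derives a vector name for a document.
     * storedFields maps a document index to that document's stored field values
     * (a hypothetical substitute for reading the document from a Lucene index).
     */
    static String vectorName(Map<Integer, Map<String, String>> storedFields,
                             String idField, int documentIndex) {
        if (idField == null) {
            // No id field configured: use the document index itself as the name.
            return String.valueOf(documentIndex);
        }
        Map<String, String> doc = storedFields.get(documentIndex);
        // A missing document or missing field yields null.
        return doc == null ? null : doc.get(idField);
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, String>> index = new HashMap<>();
        Map<String, String> doc0 = new HashMap<>();
        doc0.put("id", "article-17");
        index.put(0, doc0);

        System.out.println(vectorName(index, "id", 0));
        System.out.println(vectorName(index, null, 0));
    }
}
```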
Copyright © 2008–2015 The Apache Software Foundation. All rights reserved.