public final class MongoDBDataModel extends Object implements DataModel
A DataModel backed by a MongoDB database. This class expects a collection in the database which contains a user ID (long or ObjectId), item ID (long or ObjectId), preference value (optional) and timestamps ("created_at", "deleted_at").
An example of a document in MongoDB:
{ "_id" : ObjectId("4d7627bf6c7d47ade9fc7780"),
"user_id" : ObjectId("4c2209fef3924d31102bd84b"),
"item_id" : ObjectId("4c2209fef3924d31202bd853"),
"preference" : 0.5,
"created_at" : "Tue Mar 23 2010 20:48:43 GMT-0400 (EDT)" }
Preference value is optional to accommodate applications that have no notion of a preference value (that is, the user simply expresses a preference for an item, but no degree of preference).
The preference value is assumed to be parseable as a double.
The user IDs and item IDs are assumed to be parseable as longs or ObjectIds. In the case of ObjectIds, the model creates a Map<ObjectId, long> (collection "mongo_data_model_map") inside the MongoDB database. This conversion is needed since Mahout uses the long datatype to feed the recommender, while a MongoDB ObjectId is 12 bytes long and does not fit in an 8-byte long.
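The mapping idea described above can be sketched with an in-memory stand-in for the "mongo_data_model_map" collection. This is a minimal illustration, not the class's actual implementation: the class name ObjectIdLongMap, its method names, and the use of sequential surrogate keys are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the "mongo_data_model_map" collection. */
final class ObjectIdLongMap {
    private final Map<String, Long> idToLong = new HashMap<>();
    private final Map<Long, String> longToId = new HashMap<>();
    private long next = 0; // assumption: surrogate keys are handed out sequentially

    /** Returns the long for a MongoDB identifier, allocating one on first sight. */
    public long toLong(String id) {
        try {
            // Identifiers that already parse as a long are passed through unchanged.
            return Long.parseLong(id);
        } catch (NumberFormatException e) {
            // Not a long (for example a 24-character hex ObjectId): map it below.
        }
        Long existing = idToLong.get(id);
        if (existing != null) {
            return existing;
        }
        long assigned = next++;
        idToLong.put(id, assigned);
        longToId.put(assigned, id);
        return assigned;
    }

    /** Returns the original identifier, or the long as a string if it was never mapped. */
    public String toId(long value) {
        String id = longToId.get(value);
        return id != null ? id : Long.toString(value);
    }
}
```

The sketch assumes a collection uses either long IDs or ObjectIds, not both; mixing the two could make a passed-through long collide with a surrogate key.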
The timestamps ("created_at", "deleted_at"), if present, are assumed to be parseable as a long or Date. To express timestamps as Dates, a DateFormat must be provided in the class constructor. The default Date format is "EE MMM dd yyyy HH:mm:ss 'GMT'Z (zzz)". If this parameter is set to null, timestamps are assumed to be parseable as longs.
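The two timestamp modes can be illustrated with the JDK alone. The sketch below parses a value either with a DateFormat built from the default pattern quoted above or, when the format is null, as a plain long. The class TimestampParser and its method are hypothetical names for this example, and treating the long form as epoch milliseconds is an assumption.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

/** Hypothetical helper showing the two timestamp representations. */
final class TimestampParser {
    /** The default pattern quoted in the class description. */
    static final String DEFAULT_PATTERN = "EE MMM dd yyyy HH:mm:ss 'GMT'Z (zzz)";

    /**
     * Converts a stored timestamp to a long.
     * A null format means the value is already a long (assumed: epoch milliseconds).
     */
    static long toEpochMillis(String value, DateFormat format) {
        if (format == null) {
            return Long.parseLong(value);
        }
        try {
            return format.parse(value).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + value, e);
        }
    }

    public static void main(String[] args) {
        DateFormat format = new SimpleDateFormat(DEFAULT_PATTERN, Locale.ENGLISH);
        // Date form, as in the example document.
        System.out.println(toEpochMillis("Tue Mar 23 2010 20:48:43 GMT-0400 (EDT)", format));
        // Long form, used when no DateFormat is supplied.
        System.out.println(toEpochMillis("1269391723000", null));
    }
}
```

An explicit English locale is used so that day and month names such as "Tue" and "Mar" parse regardless of the JVM's default locale.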
It is also acceptable for the documents to contain additional fields. Those fields will be ignored.
This class will reload data from the MongoDB database when refresh(Collection) is called. MongoDBDataModel keeps the timestamp of the last update. This variable and the fields "created_at" and "deleted_at" help the model to determine if the triple (user, item, preference) must be added or deleted.
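One plausible reading of the rule above is sketched below: a document deleted since the last update is removed, a live document created since the last update is added, and everything else is left alone. The class RefreshDecision, the Action enum, and the exact comparisons are assumptions made for illustration; the actual class may decide differently.

```java
/** Hypothetical sketch of the add/delete decision made during a refresh. */
final class RefreshDecision {
    enum Action { ADD, REMOVE, IGNORE }

    /**
     * Decides what to do with one document, given the time of the previous update.
     * createdAt and deletedAt are epoch milliseconds, or null when the field is absent.
     */
    static Action decide(Long createdAt, Long deletedAt, long lastUpdate) {
        if (deletedAt != null && deletedAt > lastUpdate) {
            // Deleted since the last refresh: drop the triple from the model.
            return Action.REMOVE;
        }
        if (deletedAt == null && createdAt != null && createdAt > lastUpdate) {
            // Created since the last refresh and still live: add the triple.
            return Action.ADD;
        }
        // Already reflected in the model, or deleted before the last refresh.
        return Action.IGNORE;
    }
}
```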
Modifier and Type | Field and Description |
---|---|
static String | DEFAULT_MONGO_MAP_COLLECTION |
Constructor and Description |
---|
MongoDBDataModel() Creates a new MongoDBDataModel |
MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format) Creates a new MongoDBDataModel with MongoDB basic configuration (without authentication) |
MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format, String user, String password) Creates a new MongoDBDataModel with MongoDB basic configuration (with authentication) |
MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format, String userIDField, String itemIDField, String preferenceField, String mappingCollection) Creates a new MongoDBDataModel with MongoDB advanced configuration (without authentication) |
MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format, String user, String password, String userIDField, String itemIDField, String preferenceField, String mappingCollection) Creates a new MongoDBDataModel with MongoDB advanced configuration (with authentication) |
Modifier and Type | Method and Description |
---|---|
void | cleanupMappingCollection() Cleanup mapping collection. |
String | fromIdToLong(String id, boolean isUser) Translates the MongoDB identifier to Mahout/MongoDBDataModel's internal identifier, if required. |
String | fromLongToId(long id) Translates the Mahout/MongoDBDataModel's internal identifier to MongoDB identifier, if required. |
LongPrimitiveIterator | getItemIDs() |
FastIDSet | getItemIDsFromUser(long userID) |
float | getMaxPreference() |
float | getMinPreference() |
int | getNumItems() |
int | getNumUsers() |
int | getNumUsersWithPreferenceFor(long itemID) |
int | getNumUsersWithPreferenceFor(long itemID1, long itemID2) |
PreferenceArray | getPreferencesForItem(long itemID) |
PreferenceArray | getPreferencesFromUser(long id) |
Long | getPreferenceTime(long userID, long itemID) |
Float | getPreferenceValue(long userID, long itemID) |
LongPrimitiveIterator | getUserIDs() |
boolean | hasPreferenceValues() |
boolean | isIDInModel(String ID) Checks if an ID is currently in the model. |
Date | mongoUpdateDate() Date of the latest update of the model. |
void | refresh(Collection<Refreshable> alreadyRefreshed) Triggers "refresh" -- whatever that means -- of the implementation. |
void | refreshData(String userID, Iterable<List<String>> items, boolean add) Adds/removes (user, item) pairs to/from the model. |
void | removePreference(long userID, long itemID) |
void | setPreference(long userID, long itemID, float value) |
String | toString() |
public static final String DEFAULT_MONGO_MAP_COLLECTION
public MongoDBDataModel() throws UnknownHostException
Throws:
UnknownHostException
public MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format) throws UnknownHostException
Parameters:
host - MongoDB host.
port - MongoDB port. Default: 27017
database - MongoDB database
collection - MongoDB collection/table
manage - If true, the model adds and removes users and items from MongoDB database when the model is refreshed.
finalRemove - If true, the model removes the user/item completely from the MongoDB database. If false, the model adds the "deleted_at" field with the current date to the "deleted" user/item.
format - MongoDB date format. If null, the model uses timestamps.
Throws:
UnknownHostException - if the database host cannot be resolved
public MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format, String userIDField, String itemIDField, String preferenceField, String mappingCollection) throws UnknownHostException
Parameters:
userIDField - Mongo user ID field
itemIDField - Mongo item ID field
preferenceField - Mongo preference value field
Throws:
UnknownHostException - if the database host cannot be resolved
See Also:
MongoDBDataModel(String, int, String, String, boolean, boolean, DateFormat)
public MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format, String user, String password) throws UnknownHostException
Parameters:
user - Mongo username (authentication)
password - Mongo password (authentication)
Throws:
UnknownHostException - if the database host cannot be resolved
See Also:
MongoDBDataModel(String, int, String, String, boolean, boolean, DateFormat)
public MongoDBDataModel(String host, int port, String database, String collection, boolean manage, boolean finalRemove, DateFormat format, String user, String password, String userIDField, String itemIDField, String preferenceField, String mappingCollection) throws UnknownHostException
Throws:
UnknownHostException - if the database host cannot be resolved
See Also:
MongoDBDataModel(String, int, String, String, boolean, boolean, DateFormat, String, String)
public void refreshData(String userID, Iterable<List<String>> items, boolean add) throws NoSuchUserException, NoSuchItemException
Adds/removes (user, item) pairs to/from the model.
Parameters:
userID - MongoDB user identifier
items - List of (item, preference) pairs to be added or deleted
add - If true, the (user, item) pairs must be added to the model. If false, they must be deleted.
Throws:
NoSuchUserException
NoSuchItemException
See Also:
refresh(Collection)
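The shape of the items argument, an Iterable<List<String>> of (item, preference) pairs, can be built with plain collections. The sketch below assumes each pair is a two-element list with the item identifier first and the preference value second, both as strings. The class RefreshDataItems, the helper pair, and the second item identifier are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds the (item, preference) pairs for refreshData. */
final class RefreshDataItems {
    /** Builds one pair as a two-element list: item identifier, then preference. */
    static List<String> pair(String itemId, double preference) {
        return Arrays.asList(itemId, Double.toString(preference));
    }

    public static void main(String[] args) {
        List<List<String>> items = new ArrayList<>();
        items.add(pair("4c2209fef3924d31202bd853", 0.5));
        items.add(pair("4c2209fef3924d31202bd854", 1.0)); // made-up identifier
        // With Mahout on the classpath, the call would be along the lines of:
        // model.refreshData("4c2209fef3924d31102bd84b", items, true);
        System.out.println(items);
    }
}
```

Passing false as the last argument of refreshData would request deletion of the same pairs instead of addition.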
public void refresh(Collection<Refreshable> alreadyRefreshed)
Triggers "refresh" -- whatever that means -- of the implementation. The general contract is that any Refreshable should always leave itself in a consistent, operational state, and that the refresh atomically updates internal state from old to new.
Specified by:
refresh in interface Refreshable
Parameters:
alreadyRefreshed - Refreshables that are known to have already been refreshed as a result of an initial call to a refresh(Collection) method on some object. This ensures that objects in a refresh dependency graph aren't refreshed twice needlessly.
See Also:
refreshData(String, Iterable, boolean)
public String fromIdToLong(String id, boolean isUser)
Translates the MongoDB identifier to Mahout/MongoDBDataModel's internal identifier, if required.
If MongoDB identifiers are long datatypes, it returns the id.
This conversion is needed since Mahout uses the long datatype to feed the recommender, and MongoDB uses 12 bytes to create its identifiers.
Parameters:
id - MongoDB identifier
isUser -
See Also:
fromLongToId(long), Mongo Object IDs
public String fromLongToId(long id)
Translates the Mahout/MongoDBDataModel's internal identifier to MongoDB identifier, if required.
If MongoDB identifiers are long datatypes, it returns the id in String format.
This conversion is needed since Mahout uses the long datatype to feed the recommender, and MongoDB uses 12 bytes to create its identifiers.
Parameters:
id - Mahout's internal identifier
See Also:
fromIdToLong(String, boolean), Mongo Object IDs
public boolean isIDInModel(String ID)
Checks if an ID is currently in the model.
Parameters:
ID - user or item ID
public Date mongoUpdateDate()
Date of the latest update of the model.
public void cleanupMappingCollection()
public LongPrimitiveIterator getUserIDs() throws TasteException
Specified by:
getUserIDs in interface DataModel
Throws:
TasteException
public PreferenceArray getPreferencesFromUser(long id) throws TasteException
Specified by:
getPreferencesFromUser in interface DataModel
Throws:
TasteException
public FastIDSet getItemIDsFromUser(long userID) throws TasteException
Specified by:
getItemIDsFromUser in interface DataModel
Throws:
TasteException
public LongPrimitiveIterator getItemIDs() throws TasteException
Specified by:
getItemIDs in interface DataModel
Throws:
TasteException
public PreferenceArray getPreferencesForItem(long itemID) throws TasteException
Specified by:
getPreferencesForItem in interface DataModel
Throws:
TasteException
public Float getPreferenceValue(long userID, long itemID) throws TasteException
Specified by:
getPreferenceValue in interface DataModel
Throws:
TasteException
public Long getPreferenceTime(long userID, long itemID) throws TasteException
Specified by:
getPreferenceTime in interface DataModel
Throws:
TasteException
public int getNumItems() throws TasteException
Specified by:
getNumItems in interface DataModel
Throws:
TasteException
public int getNumUsers() throws TasteException
Specified by:
getNumUsers in interface DataModel
Throws:
TasteException
public int getNumUsersWithPreferenceFor(long itemID) throws TasteException
Specified by:
getNumUsersWithPreferenceFor in interface DataModel
Throws:
TasteException
public int getNumUsersWithPreferenceFor(long itemID1, long itemID2) throws TasteException
Specified by:
getNumUsersWithPreferenceFor in interface DataModel
Throws:
TasteException
public void setPreference(long userID, long itemID, float value)
Specified by:
setPreference in interface DataModel
public void removePreference(long userID, long itemID)
Specified by:
removePreference in interface DataModel
public boolean hasPreferenceValues()
Specified by:
hasPreferenceValues in interface DataModel
public float getMaxPreference()
Specified by:
getMaxPreference in interface DataModel
public float getMinPreference()
Specified by:
getMinPreference in interface DataModel
Copyright © 2008–2015 The Apache Software Foundation. All rights reserved.