Class | Description |
---|---|
InMemBuilder | MapReduce implementation where each mapper loads a full copy of the data in-memory. |
InMemInputFormat | Custom InputFormat that generates InputSplits given the desired number of trees. Each input split contains a subset of the trees. The number of splits is equal to the number of requested splits. |
InMemInputFormat.InMemInputSplit | Custom InputSplit that indicates how many trees are built by each mapper. |
InMemInputFormat.InMemRecordReader | |
InMemMapper | In-memory mapper that grows the trees using a full copy of the data loaded in-memory. |
Each mapper is responsible for growing a number of trees with a whole copy of the dataset loaded in memory. It uses the reference implementation's code to build each tree and estimate the out-of-bag (oob) error.
The dataset is distributed to the slave nodes using the DistributedCache.
A custom InputFormat (InMemInputFormat) is configured with the desired number of trees and generates a number of InputSplits equal to the configured number of maps.
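
As an illustration of how a requested number of trees could be divided among a fixed number of splits, here is a minimal sketch. The class `TreeSplitSketch`, its `Split` type, and the rule that the last split absorbs the remainder are assumptions made for this example. They are not taken from InMemInputFormat, and no Hadoop classes are involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of this package: divides nbTrees among numSplits splits.
final class TreeSplitSketch {

  /** One split: the id of its first tree and the number of trees it covers. */
  static final class Split {
    final int firstId;
    final int nbTrees;

    Split(int firstId, int nbTrees) {
      this.firstId = firstId;
      this.nbTrees = nbTrees;
    }
  }

  static List<Split> partition(int nbTrees, int numSplits) {
    if (nbTrees < 0 || numSplits <= 0) {
      throw new IllegalArgumentException("nbTrees must be >= 0 and numSplits > 0");
    }
    List<Split> splits = new ArrayList<>();
    int splitSize = nbTrees / numSplits;
    int firstId = 0;
    // The first numSplits - 1 splits each receive splitSize trees.
    for (int i = 0; i < numSplits - 1; i++) {
      splits.add(new Split(firstId, splitSize));
      firstId += splitSize;
    }
    // Assumption of this sketch: the last split takes whatever is left over.
    splits.add(new Split(firstId, nbTrees - firstId));
    return splits;
  }
}
```

With `partition(10, 3)` the sketch yields three splits covering 3, 3, and 4 trees, so every requested tree is assigned to exactly one mapper.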
There is no need for reducers: each map outputs the trees it built and, for each tree, the labels the tree predicted for each out-of-bag instance. This step has to be done in the mapper because only there do we know which instances are out-of-bag.
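
To show why the out-of-bag instances are only known where the tree is grown, the following sketch draws a bootstrap sample and masks the predictions of in-bag instances. It assumes the usual bagging scheme (sampling with replacement). The class `OobSketch`, its methods, and the `-1` marker are hypothetical names chosen for the example, not this package's API.

```java
import java.util.Random;

// Hypothetical illustration of bagging and out-of-bag (oob) bookkeeping.
final class OobSketch {

  /** Marker for instances that were in the bag (an assumption of this sketch). */
  static final int NOT_OOB = -1;

  /** Draws n instances with replacement; inBag[i] is true if instance i was drawn at least once. */
  static boolean[] drawBag(int n, Random rng) {
    boolean[] inBag = new boolean[n];
    for (int i = 0; i < n; i++) {
      inBag[rng.nextInt(n)] = true;
    }
    return inBag;
  }

  /** Keeps the predicted label only for oob instances; in-bag instances get NOT_OOB. */
  static int[] oobPredictions(boolean[] inBag, int[] predicted) {
    if (inBag.length != predicted.length) {
      throw new IllegalArgumentException("inBag and predicted must have the same length");
    }
    int[] out = new int[predicted.length];
    for (int i = 0; i < predicted.length; i++) {
      out[i] = inBag[i] ? NOT_OOB : predicted[i];
    }
    return out;
  }
}
```

The `inBag` array exists only at the point where the sample is drawn, which is why in this design the oob predictions are computed next to the tree building rather than in a later stage.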
The forest builder (InMemBuilder) is responsible for configuring and launching the job.
At the end of the job it parses the output files and builds the corresponding DecisionForest.
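
The description does not say how the per-tree out-of-bag predictions are consumed after the job. One common use is a majority-vote oob error estimate, sketched below under stated assumptions: `predictions[t][i]` holds the label tree `t` predicted for instance `i`, or `-1` when the instance was in that tree's bag. The class `OobErrorSketch` and this array layout are hypothetical and not taken from the package.

```java
// Hypothetical sketch: majority-vote oob error from per-tree oob predictions.
final class OobErrorSketch {

  /** Marker for "instance was in the bag for this tree" (an assumption of this sketch). */
  static final int NOT_OOB = -1;

  /**
   * Returns the fraction of instances whose majority oob vote differs from the true label.
   * Instances with no oob vote are skipped; returns NaN if no instance has a vote.
   * Ties are resolved in favour of the lowest label index.
   */
  static double oobError(int[][] predictions, int[] labels, int nbLabels) {
    int voted = 0;
    int errors = 0;
    for (int i = 0; i < labels.length; i++) {
      int[] votes = new int[nbLabels];
      int total = 0;
      for (int[] treePredictions : predictions) {
        int p = treePredictions[i];
        if (p != NOT_OOB) {
          votes[p]++;
          total++;
        }
      }
      if (total == 0) {
        continue; // no tree saw this instance as out-of-bag
      }
      int best = 0;
      for (int label = 1; label < nbLabels; label++) {
        if (votes[label] > votes[best]) {
          best = label;
        }
      }
      voted++;
      if (best != labels[i]) {
        errors++;
      }
    }
    return voted == 0 ? Double.NaN : (double) errors / voted;
  }
}
```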
Copyright © 2008–2015 The Apache Software Foundation. All rights reserved.