Package org.apache.hadoop.examples.pi
This package consists of a map/reduce application,
distbbp,
which computes exact binary digits of the mathematical constant π.
distbbp is designed for computing the nth bit of π,
for large n, say n > 100,000,000.
For computing the lower bits of π, consider using bbp.
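To illustrate what a BBP-type digit extraction looks like for small positions, here is a minimal sketch based on the classic Bailey-Borwein-Plouffe formula, π = Σ 16^(-k) (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)). It is not the code of bbp or distbbp, and the series those programs actually evaluate may differ. The class name BbpSketch and its methods are hypothetical, and double-precision arithmetic limits the sketch to small positions.

```java
final class BbpSketch {
    /** Computes 16^e mod m by binary exponentiation; m must stay below 2^31 to avoid overflow. */
    static long powMod16(long e, long m) {
        long result = 1 % m;
        long base = 16 % m;
        while (e > 0) {
            if ((e & 1) == 1) {
                result = result * base % m;
            }
            base = base * base % m;
            e >>= 1;
        }
        return result;
    }

    /** Fractional part of the sum over k >= 0 of 16^(d-k) / (8k + j). */
    static double series(int j, long d) {
        double s = 0.0;
        // Terms with k <= d: reduce 16^(d-k) modulo the denominator first.
        for (long k = 0; k <= d; k++) {
            long m = 8 * k + j;
            s += (double) powMod16(d - k, m) / m;
            s -= Math.floor(s);
        }
        // A few terms with k > d; they shrink by a factor of 16 each step.
        double t = 1.0 / 16;
        for (long k = d + 1; k <= d + 16; k++) {
            s += t / (8 * k + j);
            t /= 16;
        }
        return s - Math.floor(s);
    }

    /** Hexadecimal digit of pi at 0-based position d after the point. */
    static int hexDigit(long d) {
        double x = 4 * series(1, d) - 2 * series(4, d) - series(5, d) - series(6, d);
        x -= Math.floor(x);
        return (int) (16 * x);
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (long d = 0; d < 8; d++) {
            hex.append(Integer.toHexString(hexDigit(d)));
        }
        System.out.println(hex);
    }
}
```

Each hexadecimal digit corresponds to four binary digits, so a digit at hex position d covers bits 4d+1 through 4d+4 after the point. The key property is that the digit at position d is obtained without computing the digits before it.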
Note that it may take a long time to finish all the jobs when <b> is large.
If the program is killed in the middle of the execution, the same command with
a different <remoteDir> can be used to resume the execution. The example under
Command Line Usages below computes the (10^15+57)th bit of π and shows such a resume.
The distbbp Program
The main class is DistBbp and the actual computation is done by DistSum jobs. The steps for launching the jobs are:
- Initialize parameters.
- Create a list of sums.
- Read computed values from the given local directory.
- Remove the computed values from the sums.
- Partition the remaining sums into computation jobs.
- Submit the computation jobs to a cluster and then wait for the results.
- Write job outputs to the given local directory.
- Combine the job outputs and print the π bits.
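The middle steps above (create the sums, remove what has already been computed, partition the rest into jobs) can be sketched as follows. This is only an illustration under simplifying assumptions: the class LaunchPlanSketch, the Range type, and the idea of identifying a sum by a string name with a term count are all hypothetical, and the real DistBbp may track computed values at a finer granularity than whole sums.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LaunchPlanSketch {
    /** Half-open range [start, end) of term indices; a hypothetical stand-in for one job. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Splits [0, terms) into nJobs contiguous ranges whose sizes differ by at most one. */
    static List<Range> partition(long terms, int nJobs) {
        List<Range> jobs = new ArrayList<>();
        long base = terms / nJobs;
        long extra = terms % nJobs;
        long start = 0;
        for (int i = 0; i < nJobs; i++) {
            long size = base + (i < extra ? 1 : 0);
            jobs.add(new Range(start, start + size));
            start += size;
        }
        return jobs;
    }

    /** Drops sums that are already computed, then partitions whatever remains. */
    static Map<String, List<Range>> plan(Map<String, Long> sums, Set<String> computed, int nJobs) {
        Map<String, List<Range>> plan = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : sums.entrySet()) {
            if (!computed.contains(e.getKey())) {
                plan.put(e.getKey(), partition(e.getValue(), nJobs));
            }
        }
        return plan;
    }
}
```

Skipping already computed work before partitioning is what makes a restarted run cheaper than a fresh one: only the sums missing from the local directory are turned into jobs again.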
The Bits of π
The following table shows results computed by distbbp.
| 0010010000. |
Command Line Usages
The command line format is:
$ hadoop org.apache.hadoop.examples.pi.DistBbp \
<b> <nThreads> <nJobs> <type> <nPart> <remoteDir> <localDir>
And the parameters are:
| <b> | The number of bits to skip, i.e. compute the (b+1)th position. |
| <nThreads> | The number of working threads. |
| <nJobs> | The number of jobs per sum. |
| <type> | 'm' for map side job, 'r' for reduce side job, 'x' for mix type. |
| <nPart> | The number of parts per job. |
| <remoteDir> | Remote directory for submitting jobs. |
| <localDir> | Local directory for storing output files. |
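As a rough illustration of how the seven parameters fit together, the sketch below parses them into typed fields. It is not the parsing code used by DistBbp: the class DistBbpArgsSketch and its field names are hypothetical, and stripping the commas from <b> is an assumption made here so that a value written with digit grouping can be read as a long.

```java
final class DistBbpArgsSketch {
    final long b;          // number of bits to skip
    final int nThreads;    // number of working threads
    final int nJobs;       // number of jobs per sum
    final char type;       // 'm', 'r' or 'x'
    final int nPart;       // number of parts per job
    final String remoteDir;
    final String localDir;

    private DistBbpArgsSketch(long b, int nThreads, int nJobs, char type,
                              int nPart, String remoteDir, String localDir) {
        this.b = b;
        this.nThreads = nThreads;
        this.nJobs = nJobs;
        this.type = type;
        this.nPart = nPart;
        this.remoteDir = remoteDir;
        this.localDir = localDir;
    }

    /** Parses <b> <nThreads> <nJobs> <type> <nPart> <remoteDir> <localDir>. */
    static DistBbpArgsSketch parse(String[] args) {
        if (args.length != 7) {
            throw new IllegalArgumentException("expected 7 arguments, got " + args.length);
        }
        // Assumption: digit-grouping commas in <b> are simply removed.
        long b = Long.parseLong(args[0].trim().replace(",", ""));
        int nThreads = Integer.parseInt(args[1]);
        int nJobs = Integer.parseInt(args[2]);
        if (args[3].length() != 1 || "mrx".indexOf(args[3].charAt(0)) < 0) {
            throw new IllegalArgumentException("type must be m, r or x: " + args[3]);
        }
        int nPart = Integer.parseInt(args[4]);
        if (b < 0 || nThreads < 1 || nJobs < 1 || nPart < 1) {
            throw new IllegalArgumentException("numeric arguments out of range");
        }
        return new DistBbpArgsSketch(b, nThreads, nJobs, args[3].charAt(0),
                nPart, args[5], args[6]);
    }
}
```

Validating the job type and the numeric ranges up front means a malformed command fails before any job is submitted to the cluster.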
$ hadoop org.apache.hadoop.examples.pi.DistBbp \
1,000,000,000,000,056 20 1000 x 500 remote/a local/output
It uses 20 threads to submit jobs, so there are at most 20 concurrent jobs.
Each sum (there are 14 sums in total) is partitioned into 1000 jobs.
Because the type is x (mix), each job is executed on either the map side or the reduce side. Each job has 500 parts.
The remote directory for the jobs is remote/a and the local directory
for storing output is local/output. Depending on the cluster configuration,
it may take many days to finish the entire execution. If the execution is killed,
we can resume it with:
$ hadoop org.apache.hadoop.examples.pi.DistBbp \
1,000,000,000,000,056 20 1000 x 500 remote/b local/output
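The thread limit in the example above (20 submitting threads, hence at most 20 concurrent jobs) can be illustrated with a fixed-size thread pool. This is a sketch of the general technique only, not the way DistBbp is implemented: the class BoundedSubmitSketch is hypothetical, and a short sleep stands in for submitting a job and waiting for its result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedSubmitSketch {
    /** Runs nJobs placeholder jobs on nThreads workers; returns the peak number running at once. */
    static int runAll(int nThreads, int nJobs) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < nJobs; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(1); // placeholder for "submit a job and wait for it"
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            // Wait for every placeholder job to finish.
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException("job execution failed", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + runAll(20, 200));
    }
}
```

With the example parameters, 14 sums at 1000 jobs each give 14,000 jobs in total, so the pool size rather than the job count determines how many jobs are in flight at any moment.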
| Class | Description |
| org.apache.hadoop.examples.pi.Combinable<T> | A class is Combinable if its object can be combined with other objects. |
| org.apache.hadoop.examples.pi.Container<T> | A class is a Container if it contains an element. |
| org.apache.hadoop.examples.pi.DistBbp | A map/reduce program that uses a BBP-type method to compute exact binary digits of Pi. |
| org.apache.hadoop.examples.pi.DistSum | The main class for computing sums using map/reduce jobs. |
| org.apache.hadoop.examples.pi.DistSum.Machine | Abstract machine for job execution. |
| org.apache.hadoop.examples.pi.DistSum.Machine.AbstractInputFormat | An abstract InputFormat for the jobs |
| org.apache.hadoop.examples.pi.DistSum.Machine.SummationSplit | Split for the summations |
| org.apache.hadoop.examples.pi.DistSum.MapSide | A machine which does computation on the map side. |
| org.apache.hadoop.examples.pi.DistSum.MapSide.PartitionInputFormat | An InputFormat which partitions a summation |
| org.apache.hadoop.examples.pi.DistSum.MapSide.SummingMapper | A mapper which computes sums |
| org.apache.hadoop.examples.pi.DistSum.MixMachine | A machine which chooses Machine in runtime according to the cluster status |
| org.apache.hadoop.examples.pi.DistSum.ReduceSide | A machine which does computation on the reduce side. |
| org.apache.hadoop.examples.pi.DistSum.ReduceSide.IndexPartitioner | Use the index for partitioning. |
| org.apache.hadoop.examples.pi.DistSum.ReduceSide.PartitionMapper | A Mapper which partitions a summation |
| org.apache.hadoop.examples.pi.DistSum.ReduceSide.SummationInputFormat | An InputFormat which returns a single summation. |
| org.apache.hadoop.examples.pi.DistSum.ReduceSide.SummingReducer | A Reducer which computes sums |
| org.apache.hadoop.examples.pi.Parser | A class for parsing outputs |
| org.apache.hadoop.examples.pi.SummationWritable | A Writable class for Summation |
| org.apache.hadoop.examples.pi.TaskResult | A class for map task results or reduce task results. |
| org.apache.hadoop.examples.pi.Util | Utility methods |
| org.apache.hadoop.examples.pi.Util.Timer | Timer |