Title: bayesian-commandline
# Introduction
This quick start page describes how to run the naive Bayesian and
complementary naive Bayesian classification algorithms on a Hadoop cluster.
# Steps
## Testing it on a single machine without a cluster
In the examples directory type:

    mvn -q exec:java \
      -Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.bayes." \
      -Dexec.args=""

    mvn -q exec:java \
      -Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.cbayes." \
      -Dexec.args=""

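The two commands above differ only in the main class, so a small wrapper can save retyping. The following is a minimal sketch and not part of Mahout: the function name `local_cmd` is hypothetical, and the main class and argument string are parameters because this page does not spell them out. It only prints the command, so it can be checked before anything runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): print the Maven command for a
# local run. $1 is the fully qualified main class, $2 is the argument string.
# Assumes neither value contains a double quote.
local_cmd() {
  printf 'mvn -q exec:java -Dexec.mainClass="%s" -Dexec.args="%s"\n' "$1" "$2"
}
```

Once the printed command looks right, it can be piped to `sh` from the examples directory to execute it.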
## Running it on the cluster
* In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
will be generated in $MAHOUT_HOME/core/target/ and its name will contain
the Mahout version number. For example, when using the Mahout 0.1 release, the
job will be mahout-core-0.1.jar
* (Optional) Start up Hadoop: $HADOOP_HOME/bin/start-all.sh
* Put the data: $HADOOP_HOME/bin/hadoop fs -put testdata
* Run the Job: $HADOOP_HOME/bin/hadoop jar
$MAHOUT_HOME/core/target/mahout-core-.job
org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesDriver
* Get the data out of HDFS and have a look. Use bin/hadoop fs -lsr output
to view all outputs.
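The run and inspect steps above can be sketched as two shell functions. This is an illustration under stated assumptions, not a script shipped with Mahout: the function names `find_job` and `run_bayes` are hypothetical, `HADOOP_HOME` and `MAHOUT_HOME` are assumed to be set, and the glob `mahout-core-*.job` is an assumption about how the built file is named. The data is assumed to be in HDFS already.

```shell
#!/bin/sh
# Sketch of the cluster steps. Function names are hypothetical.

# Print the first job file found under core/target so the Mahout version
# does not have to be typed by hand. The file name pattern is an assumption.
find_job() {
  for f in "$1"/core/target/mahout-core-*.job; do
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  echo "no job file under $1/core/target; run mvn install first" >&2
  return 1
}

# Submit BayesDriver with whatever driver options are passed in, then list
# everything under the output directory in HDFS.
run_bayes() {
  job=$(find_job "$MAHOUT_HOME") || return 1
  "$HADOOP_HOME/bin/hadoop" jar "$job" \
    org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesDriver "$@" &&
  "$HADOOP_HOME/bin/hadoop" fs -lsr output
}
```

Call `run_bayes` with the driver options described under "Command line options". The listing step runs only if the job itself exits successfully.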
# Command line options
BayesDriver, BayesThetaNormalizerDriver, CBayesNormalizedWeightDriver,
CBayesDriver, CBayesThetaDriver, CBayesThetaNormalizerDriver,
BayesWeightSummerDriver, BayesFeatureDriver, BayesTfIdfDriver

Usage:
[--input --output