Mahout MapReduce Overview
Getting Mahout
Download the latest release
Download the latest release here.
Or check out the latest code from here.
Alternatively: Add Mahout 0.10.0 to a Maven project
Mahout is also available from a Maven repository under the group ID org.apache.mahout.
To import the latest release of Mahout into a Java project, add the following dependency to your pom.xml:
<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-mr</artifactId>
    <version>0.10.0</version>
</dependency>
Features
For a full list of Mahout’s features, see our Features by Engine page.
Using Mahout
The Mahout project provides a number of examples and tutorials to help users quickly learn how to use its machine learning algorithms.
Recommendations
Check the Recommender Quickstart or the tutorial on creating a user-based recommender in 5 minutes.
If you are building a recommender system for the first time, please also refer to a list of Dos and Don’ts that might be helpful.
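To give a feel for what a user-based recommender does before following the tutorials, here is a minimal sketch in plain Java. It does not use Mahout's API and is not how Mahout implements recommenders; the class name UserBasedSketch, its method names, and the choice of cosine similarity over co-rated items are all assumptions made for illustration. The idea it shows: a user's rating for an unseen item is estimated as the similarity-weighted average of the ratings other users gave that item.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Mahout.
final class UserBasedSketch {

    /** Cosine similarity computed over the items both users have rated. */
    static double similarity(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
                normA += e.getValue() * e.getValue();
                normB += other * other;
            }
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /**
     * Estimates the rating of `item` by `user` as the similarity-weighted
     * average of ratings from other users who rated that item.
     * Returns NaN when no similar user has rated the item.
     */
    static double predict(Map<Integer, Map<Integer, Double>> ratings, int user, int item) {
        Map<Integer, Double> own = ratings.get(user);
        if (own == null) {
            return Double.NaN;
        }
        double weighted = 0, totalWeight = 0;
        for (Map.Entry<Integer, Map<Integer, Double>> e : ratings.entrySet()) {
            Double rating = e.getValue().get(item);
            if (e.getKey() == user || rating == null) {
                continue; // skip the target user and users who did not rate the item
            }
            double w = similarity(own, e.getValue());
            if (w > 0) {
                weighted += w * rating;
                totalWeight += w;
            }
        }
        return totalWeight == 0 ? Double.NaN : weighted / totalWeight;
    }

    public static void main(String[] args) {
        // userID -> (itemID -> rating); toy data invented for this sketch
        Map<Integer, Map<Integer, Double>> ratings = new HashMap<>();
        ratings.put(1, Map.of(10, 5.0, 11, 3.0));
        ratings.put(2, Map.of(10, 5.0, 11, 3.0, 12, 4.0));
        ratings.put(3, Map.of(10, 1.0, 11, 5.0, 12, 2.0));
        System.out.println("Predicted rating: " + predict(ratings, 1, 12));
    }
}
```

In this toy data, user 2's ratings match user 1's exactly on the shared items, so user 2's opinion of item 12 carries more weight than user 3's. A production recommender would also need to handle neighborhood selection, rating normalization, and scale, which is what the linked tutorials cover.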
Clustering
Check the Synthetic data example.
Classification
If you are interested in how to train a Naive Bayes model, look at the 20 newsgroups example.
If you plan to build a Hidden Markov Model for speech recognition, the example here might be instructive.
Or you could build a Random Forest model by following this quick start page.
Working with Text
If you need to convert raw text into word vectors as input to clustering or classification algorithms, please refer to this page on how to create vectors from text.
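As a rough picture of what "word vectors" means here, the following plain-Java sketch turns raw text into term-frequency vectors. It is not Mahout's vectorization pipeline and does not use Mahout's classes; the class name TextVectorSketch, the simple regex tokenizer, and the raw-count weighting are assumptions chosen to keep the example short. It shows the two basic steps: build a dictionary that maps each distinct term to an index, then count term occurrences per document at those indices.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of Mahout.
final class TextVectorSketch {

    /** Splits text into lowercase alphanumeric tokens. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Assigns each distinct term an index in order of first appearance. */
    static Map<String, Integer> buildDictionary(List<String> documents) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (String doc : documents) {
            for (String token : tokenize(doc)) {
                dictionary.putIfAbsent(token, dictionary.size());
            }
        }
        return dictionary;
    }

    /** Builds a dense term-frequency vector; terms missing from the dictionary are ignored. */
    static double[] vectorize(String document, Map<String, Integer> dictionary) {
        double[] vector = new double[dictionary.size()];
        for (String token : tokenize(document)) {
            Integer index = dictionary.get(token);
            if (index != null) {
                vector[index] += 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("the cat sat", "The cat and the dog");
        Map<String, Integer> dictionary = buildDictionary(docs);
        for (String doc : docs) {
            System.out.println(java.util.Arrays.toString(vectorize(doc, dictionary)));
        }
    }
}
```

Real corpora produce very large, mostly empty vectors, so practical tools store them sparsely and usually apply a weighting such as TF-IDF rather than raw counts.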