=== Mahout Status Report: May 2010 ===

(This is the first report from Mahout as a top-level Apache project;
previously it was a subproject of Apache Lucene. Mahout most recently
reported its status in Lucene's special April report. We take this
opportunity to summarize Mahout's state and restate recent activity.)

ISSUES

There are no issues requiring board attention at this time.

OVERVIEW

Mahout's goal is to build scalable implementations of machine learning and
data mining algorithms. "Scalable" means designed for very large data sets,
with attention to efficiency and low memory consumption, and in many cases
means providing Hadoop-based implementations. The machine learning
implemented to date falls primarily into these broad areas:

- Collaborative filtering / recommender engines
- Clustering
- Classification
- Frequent item set mining
- Evolutionary algorithms

CURRENT ACTIVITY

Mahout has produced a release approximately every six months, most recently
version 0.3 in March 2010. The project remains in a state of rapid change
and evolution, and aims to release 0.4 in September 2010.
Recent activity in the project can be viewed here:

https://issues.apache.org/jira/secure/IssueNavigator.jspa?pid=12310751&fixfor=12314396&resolution=1

This month, Mahout will complete the migration of its website, mailing
lists, SVN, and other information to reflect its status as a top-level
project.

GOOGLE SUMMER OF CODE

Mahout will mentor five projects as part of Google's Summer of Code 
program. The projects will add or enhance capability in the specific 
areas of:

- Boltzmann Machines
- Support Vector Machines
- Singular Value Decomposition for recommendations
- Neural networks with backpropagation learning
- Eigencuts spectral clustering

MAHOUT IN ACTION

The book "Mahout in Action", published by Manning, is still being written
and is approximately half complete. It has received some favorable feedback
via Manning's early access program.