ApacheCon NA 2011

Grant Ingersoll

Check out Mahout: http://mahout.apache.org/

Grant Ingersoll is a longtime Lucene and Solr committer, as well as the co-founder of the Apache Mahout project and of Lucid Imagination, a company dedicated to supporting Lucene and Solr technologies. Grant has spent the better part of his career working on search and natural language processing problems.

One Day -- Mahout Boot Camp
November 7 10:00AM
Mahout Boot Camp is a one-day training designed to get newcomers up and running with Mahout's classification, clustering, and collaborative filtering tools. The class also introduces some of Mahout's other features, such as frequent patternset mining, and covers the basics of machine learning.

The class will be both lecture and labs, so students should be prepared to code. No prior machine learning experience is required. Experience with Java is helpful, but not required.

Course Outline:
1. Introduction
a. What is Mahout?
b. What is Machine Learning?
c. What can it solve?
d. What can’t it solve?
e. Which version, and why?
2. Getting Started
a. Installing Mahout
b. Validating Installation
3. The Three C’s of Mahout – Mahout Concepts
a. Classification
b. Clustering
c. Collaborative Filtering (Recommendation)
4. Lab 1: The C’s in Action
a. Run the Mahout examples
5. Classification In Depth
a. Concepts in Classification
i. Understanding your data
1. Feature Selection
b. Mahout’s classification algorithms
i. Naïve Bayes and Complementary Naïve Bayes
ii. Random Forests
iii. SGD
c. Lab: Classifying Wikipedia
d. Classification in Production
6. Clustering In Depth
a. Concepts in Clustering
i. Document
ii. Topic/Word
b. Mahout’s Clustering Algorithms
i. K-Means
ii. Mean-shift
iii. Canopy
iv. Latent Dirichlet Allocation
c. Lab: Clustering the News
d. Clustering in Production
7. Collaborative Filtering (CF) In Depth
a. Concepts in CF
i. Modeling data
ii. Measuring Affinity
b. Mahout’s CF Capabilities
i. User-Item
ii. Item-Item
iii. Scoring
1. Slope One
2. Other Distance Measures
iv. Online vs Offline
c. Lab: Recommending Movies
8. Mahout’s other features and functionalities
a. Frequent Patternset Mining
b. Primitive Collections
c. Utils


Bet You Didn't Know Lucene Can...
November 9 1:30PM
Lucene and Solr have always provided very capable text search, but did you know they are useful for many other things as well?

In this talk, we'll take a look at some of the myriad ways that Lucene and Solr can be used to solve real-world challenges, ranging from classifying content and recommending movies to taking your unit testing to the next level.

