Online Passive Aggressive

Implements the algorithm described in http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.61.5120&rep=rep1&type=pdf.

Use cases:

Use this algorithm when you have many classes that are linearly separable and want a fast online learner that gets results quickly.

Pre-requisites:

Data must be shuffled and normalized, either by scaling each feature to the range 0..1 or by standardizing it (subtracting the mean and dividing by the standard deviation).
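
The preparation steps above can be sketched in a few lines of plain Python. This is only an illustration: the helper names (min_max_scale, standardize, shuffle_examples) are hypothetical and not part of Mahout, and the data is assumed to fit in memory as a list of numeric feature rows.

```python
import random


def min_max_scale(rows):
    """Scale every feature column to the range 0..1."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        # A constant column has no spread, so map it to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def standardize(rows):
    """Subtract the column mean and divide by the (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
        for c, m in zip(cols, means)
    ]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def shuffle_examples(rows, labels, seed=42):
    """Shuffle rows and labels together so each row keeps its label."""
    paired = list(zip(rows, labels))
    random.Random(seed).shuffle(paired)
    return [r for r, _ in paired], [l for _, l in paired]
```

Shuffling matters for an online learner because examples are consumed in order; a file sorted by label would show the learner one class at a time.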

Technical details:

Training minimizes the ranking loss of the correct label against the incorrect ones. The loss is defined as hinge(1 - correct label score + wrong label score), where the wrong label score is the score of the highest-scoring label other than the correct one. The hinge function is hinge(x) = x if x > 0, and 0 otherwise.
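
As a minimal sketch of the loss just described, assuming the per-label scores are already available as a list indexed by label (the function names here are illustrative, not Mahout's):

```python
def hinge(x):
    """Return x when it is positive, 0 otherwise."""
    return x if x > 0 else 0.0


def ranking_loss(scores, correct):
    """Hinge loss between the correct label and its strongest competitor.

    scores  -- one score per label
    correct -- index of the correct label
    """
    correct_score = scores[correct]
    # Highest score among all labels except the correct one.
    wrong_score = max(s for i, s in enumerate(scores) if i != correct)
    return hinge(1.0 - correct_score + wrong_score)
```

The loss is zero only when the correct label beats every other label by a margin of at least 1. For example, with scores [2.0, 0.5, -1.0] and correct label 0 the margin is 1.5, so the loss is 0; with scores [0.25, 0.5, -1.0] the correct label is outscored and the loss is 1.25.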

Parameters:

There is only one: learningRate. Set it to a larger value to converge faster, or to a smaller value to update more cautiously. The usual way to choose it is by cross-validation. Good values to try are 0.1, 1.0, and 10.0.
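
The sketch below illustrates how learningRate might be chosen by cross-validation. It rests on an assumption: the update rule is the PA-I form from the passive-aggressive literature, where learningRate caps the step size tau. This page does not spell out the exact update Mahout uses, so treat pa_update, train, and cross_validate as hypothetical helpers rather than Mahout's API.

```python
def scores_for(weights, x):
    """Score of each label: dot product of its weight vector with x."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]


def pa_update(weights, x, correct, learning_rate):
    """One update step (ASSUMED PA-I form, not taken from Mahout)."""
    scores = scores_for(weights, x)
    # Highest-scoring label that is not the correct one.
    wrong = max((i for i in range(len(scores)) if i != correct),
                key=lambda i: scores[i])
    loss = max(0.0, 1.0 - scores[correct] + scores[wrong])
    norm_sq = sum(v * v for v in x)
    if loss == 0.0 or norm_sq == 0.0:
        return  # passive: the margin is already satisfied
    # Aggressive step, capped by learning_rate. Two weight vectors move,
    # hence the factor 2 in the denominator.
    tau = min(learning_rate, loss / (2.0 * norm_sq))
    for j, v in enumerate(x):
        weights[correct][j] += tau * v
        weights[wrong][j] -= tau * v


def train(rows, labels, num_classes, learning_rate):
    """Single online pass over the (already shuffled) examples."""
    weights = [[0.0] * len(rows[0]) for _ in range(num_classes)]
    for x, y in zip(rows, labels):
        pa_update(weights, x, y, learning_rate)
    return weights


def accuracy(weights, rows, labels):
    hits = 0
    for x, y in zip(rows, labels):
        s = scores_for(weights, x)
        hits += s.index(max(s)) == y
    return hits / len(rows)


def cross_validate(rows, labels, num_classes, learning_rate, folds=3):
    """Mean held-out accuracy over `folds` interleaved folds."""
    total = 0.0
    for k in range(folds):
        tr = [(x, y) for i, (x, y) in enumerate(zip(rows, labels)) if i % folds != k]
        ho = [(x, y) for i, (x, y) in enumerate(zip(rows, labels)) if i % folds == k]
        w = train([x for x, _ in tr], [y for _, y in tr], num_classes, learning_rate)
        total += accuracy(w, [x for x, _ in ho], [y for _, y in ho])
    return total / folds


def pick_learning_rate(rows, labels, num_classes, candidates=(0.1, 1.0, 10.0)):
    """Return the candidate with the best cross-validated accuracy."""
    return max(candidates,
               key=lambda c: cross_validate(rows, labels, num_classes, c))
```

Under this assumed rule, a large learningRate lets a single update fully correct the margin on the current example, while a small one limits each step to at most learningRate times the input, which is slower but less sensitive to noisy examples.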