ApacheCon NA 2011

Shevek

Shevek is an expert Java programmer and a member of the creative team behind Karmasphere, a San Francisco-based big data analytics company. He has worked on cutting-edge research in compilers and language design, algorithmic optimization, systems and security. He received a Doctorate in Computing from the University of Bath, England. He also holds a Master's in Pure Mathematics and an epee.

Instrumenting Hadoop Jobs for Fun and Profit
November 11, 1:30 PM
Instrumentation is a general-purpose technique for automatically gathering detailed information about the execution of a process.

The distributed nature of a Hadoop job makes both the engineering of the instrumentation and the presentation of the output harder.

However, instrumentation can also take advantage of a detailed knowledge of the code paths within Hadoop to build a much deeper insight into the behaviour of the user code.

We will present our approach to general-purpose instrumentation for Hadoop, which uses Hadoop-specific insights to profile and debug a job and to diagnose its faults.

We will describe techniques that use attempt success and failure, internal exception rates and differential analysis, among others, to help us localize poorly performing code or malformed input data without user intervention.


