Instrumenting Hadoop Jobs for Fun and Profit
Instrumentation is a general-purpose technique for automatically gathering detailed information about the execution of a process.
The distributed nature of a Hadoop job makes both the engineering of the instrumentation and the presentation of its output harder.
However, instrumentation can also exploit detailed knowledge of the code paths within Hadoop to build much deeper insight into the behaviour of the user code.
We will present our approach to general-purpose instrumentation for Hadoop, which uses Hadoop-specific insights to profile, debug and diagnose faults in a job.
We will describe techniques using attempt success/failure, internal exception rates and differential analysis, amongst others, to help us localize badly performing code or malformed input data without user intervention.
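To illustrate the differential-analysis idea, the following sketch compares per-site instrumentation counters between successful and failed task attempts and ranks code sites by how strongly they correlate with failure. All function names, counter keys and data below are hypothetical, not part of Hadoop's actual API:

```python
from collections import defaultdict

def differential_analysis(attempts):
    """Rank instrumentation sites by how much more often they fire
    in failed attempts than in successful ones.

    `attempts` is a list of (succeeded, counters) pairs, where
    `counters` maps a code site to its hit count for that attempt.
    """
    sums = {True: defaultdict(float), False: defaultdict(float)}
    totals = {True: 0, False: 0}
    for succeeded, counters in attempts:
        totals[succeeded] += 1
        for site, count in counters.items():
            sums[succeeded][site] += count
    scores = {}
    for site in set(sums[True]) | set(sums[False]):
        ok = sums[True][site] / max(totals[True], 1)
        bad = sums[False][site] / max(totals[False], 1)
        # Positive score: the site fires more in failing attempts.
        scores[site] = bad - ok
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical exception counters from three task attempts.
attempts = [
    (True,  {"Mapper.map": 1000, "parseRecord:throw": 0}),
    (True,  {"Mapper.map": 980,  "parseRecord:throw": 1}),
    (False, {"Mapper.map": 400,  "parseRecord:throw": 57}),
]
ranking = differential_analysis(attempts)
```

Here `parseRecord:throw` tops the ranking, pointing at the exception-throwing site as the likely culprit without any user intervention; the same comparison could be run over malformed-input counters to localize bad records.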