Dynamic Hadoop Clusters

This presentation looks at the challenges of bringing up Hadoop clusters on dynamically allocated servers, real or virtual. It shows how to bring up the cluster, verify its health, get data into the file system, submit work, and retrieve the results. Once you can bring up a cluster dynamically, you can start to use Hadoop in interesting ways: inside a unit test, as part of a workflow, or on spare machines at night. It also lets you keep an eye on the live machines after deployment. This talk will use Hadoop, SmartFrog, VMware, and perhaps Amazon EC2 by way of the Typica and Restlet libraries.