Hama WebBook
By The Apache Hama Team
Apache Hama is a distributed computing framework based on the BSP (Bulk Synchronous Parallel) computing technique for massive scientific computations (e.g., matrix, graph, and network algorithms), designed to run on massive data sets stored in HDFS (Hadoop Distributed File System). It is a TLP (Top-Level Project) of the Apache Software Foundation. Apache Hama leverages BSP to speed up iterative computations that require several rounds of message passing before the final output is available. It provides an easy and flexible programming model compared with traditional message-passing models. It is compatible with any distributed storage, so you can use Hama BSP on your existing Hadoop clusters. Problems tackled by Hama today include finding shortest paths and K-Means clustering, among others.
This WebBook is written by The Apache Hama Team.
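To make the BSP idea from the introduction concrete, here is a minimal sketch of two supersteps in plain Java. It does not use the Hama API: the class `BspSketch`, the method `globalSum`, and the thread-per-peer simulation with a `CyclicBarrier` are all assumptions chosen for illustration. Each simulated peer sends its local value to peer 0, every peer waits at a barrier, and peer 0 then combines the messages it received.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: a thread-based simulation of BSP supersteps, not Hama code.
public class BspSketch {

    /** Sums localValues using one simulated peer (thread) per value and two supersteps. */
    public static long globalSum(long[] localValues) {
        final int peers = localValues.length;
        if (peers == 0) {
            return 0L; // a barrier needs at least one party
        }
        // One inbox per peer; messages sent in one superstep are read in the next.
        final List<Queue<Long>> inboxes = new ArrayList<>();
        for (int i = 0; i < peers; i++) {
            inboxes.add(new ConcurrentLinkedQueue<>());
        }
        final CyclicBarrier barrier = new CyclicBarrier(peers);
        final AtomicLong result = new AtomicLong();

        Thread[] threads = new Thread[peers];
        for (int i = 0; i < peers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                try {
                    // Superstep 1: local work, then send a message to peer 0.
                    inboxes.get(0).add(localValues[id]);
                    // Barrier synchronization: nobody proceeds until all peers have sent.
                    barrier.await();
                    // Superstep 2: peer 0 consumes the messages delivered at the barrier.
                    if (id == 0) {
                        long sum = 0;
                        for (Long message : inboxes.get(0)) {
                            sum += message;
                        }
                        result.set(sum);
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for peers", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(globalSum(new long[] {1, 2, 3, 4})); // prints 10
    }
}
```

The barrier is the point of the model: because no peer reads messages until every peer has finished sending, each superstep sees a consistent set of inputs. An iterative algorithm repeats this compute, communicate, synchronize cycle until it converges.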
Table of Contents
- Getting Started
- Introduction
- Quick Start
- Hama Cluster Configuration
- Local or Pseudo Distributed Mode
- Fully Distributed Mode
- Java 7 and SDP Protocol
- Appendix 1. Running Hama on Hadoop Yarn
- Appendix 2. Running Hama on Clouds