
Welcome to Hive!

Hive is a data warehouse system for Hadoop that facilitates data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. HiveQL also lets traditional map/reduce programmers plug in custom mappers and reducers when it is inconvenient or inefficient to express that logic in HiveQL itself.
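As a rough illustration of the custom-mapper idea, here is a minimal sketch of a streaming script of the kind Hive can invoke through a TRANSFORM clause, which passes rows to the script as tab-separated lines on standard input and reads rows back from standard output. The script name, the two-column input layout (an id followed by free text), and the word-count logic are assumptions made for this example, not anything prescribed by Hive.

```python
import sys


def map_line(line):
    """Turn one tab-separated input row into zero or more "word<TAB>1" rows."""
    # Assumed (hypothetical) input layout: column 0 is an id, column 1 is text.
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        return []  # skip malformed rows rather than fail the whole job
    return [f"{word.lower()}\t1" for word in fields[1].split()]


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Read rows from stdin and emit one output row per word on stdout.
    for line in stdin:
        for row in map_line(line):
            stdout.write(row + "\n")


if __name__ == "__main__":
    main()
```

Saved as, say, word_mapper.py (a hypothetical name), a script like this would typically be registered with ADD FILE and then called from a query of the form SELECT TRANSFORM (id, body) USING 'python word_mapper.py' AS (word, cnt) FROM docs, where the table docs and its columns are likewise placeholders.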

Getting Started

Check out the Getting Started Guide on the Hive wiki.

Getting Involved

Hive is an open source volunteer project under the Apache Software Foundation. It was previously a subproject of Hadoop and has since graduated to become a top-level project of its own. We encourage you to learn about the project and contribute your expertise. Here are some starter links:

  1. Give us feedback: What can we do better?
  2. Join the mailing list: Meet the community.
  3. Become a Hive Fan on Facebook.