
Welcome to the Hadoop Distributed File System!

The Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop applications. HDFS splits files into large blocks, creates multiple replicas of each block, and distributes them across the compute nodes of a cluster, enabling reliable storage and extremely rapid computation close to the data.
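The replication scheme above can be illustrated with a small conceptual sketch. This is not the real HDFS implementation (which uses 128 MB blocks, a NameNode for metadata, and rack-aware placement); the tiny block size, node names, and round-robin placement below are illustrative assumptions only:

```python
# Conceptual sketch of HDFS-style block replication (illustrative only,
# NOT the real HDFS code): a file is split into fixed-size blocks, each
# block is copied to `REPLICATION` distinct nodes, and a read succeeds
# as long as at least one replica of every block is on a live node.

BLOCK_SIZE = 4    # bytes per block here; HDFS defaults to 128 MB
REPLICATION = 3   # replicas per block; matches the HDFS default of 3

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a byte string into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes, replication: int = REPLICATION):
    """Assign each block's replicas to `replication` distinct nodes
    (simple round-robin; real HDFS placement is rack-aware)."""
    return {
        i: [nodes[(i + r) % len(nodes)] for r in range(replication)]
        for i in range(len(blocks))
    }

def read_file(blocks, placement, live_nodes):
    """Reassemble the file, reading each block from any live replica."""
    out = b""
    for i, block in enumerate(blocks):
        if not any(node in live_nodes for node in placement[i]):
            raise IOError(f"block {i} lost: all replicas offline")
        out += block
    return out

nodes = ["node1", "node2", "node3", "node4", "node5"]
blocks = split_into_blocks(b"hello hdfs!")
placement = place_blocks(blocks, nodes)

# With 3 replicas per block, losing two of five nodes still leaves
# every block readable from a surviving replica.
data = read_file(blocks, placement, live_nodes={"node3", "node4", "node5"})
```

Running the sketch reassembles the original bytes even though `node1` and `node2` are offline, which is the property the replication design is meant to provide.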

Getting Started

To get started:

  1. Learn about HDFS by reading the documentation.
  2. Download Hadoop from the release page.
  3. Watch the HDFS training.
  4. Discuss HDFS on the mailing list.

Getting Involved

HDFS is an open source volunteer project under the Apache Software Foundation. We encourage you to learn about the project and contribute your expertise. Here are some starter links:

  1. See our How to Contribute page.
  2. Give us feedback: What can we do better?
  3. Join the mailing list: Meet the community.