The Apache HBase™ Reference Guide

Revision History
Revision 0.94.27 2015-11-03T11:44

Abstract

This is the official reference guide for Apache HBase (TM), a distributed, versioned, column-oriented database built on top of Apache Hadoop and Apache ZooKeeper.


Table of Contents

Preface
1. Getting Started
1.1. Introduction
1.2. Quick Start
2. Apache HBase (TM) Configuration
2.1. Basic Prerequisites
2.2. HBase run modes: Standalone and Distributed
2.3. Configuration Files
2.4. Example Configurations
2.5. The Important Configurations
3. Upgrading
3.1. Upgrading from 0.94.x to 0.96.x
3.2. Upgrading from 0.92.x to 0.94.x
3.3. Upgrading from 0.90.x to 0.92.x
3.4. Upgrading to HBase 0.90.x from 0.20.x or 0.89.x
4. The Apache HBase Shell
4.1. Scripting
4.2. Shell Tricks
5. Data Model
5.1. Conceptual View
5.2. Physical View
5.3. Table
5.4. Row
5.5. Column Family
5.6. Cells
5.7. Data Model Operations
5.8. Versions
5.9. Sort Order
5.10. Column Metadata
5.11. Joins
5.12. ACID
6. HBase and Schema Design
6.1. Schema Creation
6.2. On the number of column families
6.3. Rowkey Design
6.4. Number of Versions
6.5. Supported Datatypes
6.6. Joins
6.7. Time To Live (TTL)
6.8. Keeping Deleted Cells
6.9. Secondary Indexes and Alternate Query Paths
6.10. Schema Design Smackdown
6.11. Operational and Performance Configuration Options
6.12. Constraints
7. HBase and MapReduce
7.1. Map-Task Splitting
7.2. HBase MapReduce Examples
7.3. Accessing Other HBase Tables in a MapReduce Job
7.4. Speculative Execution
8. Secure Apache HBase (TM)
8.1. Secure Client Access to Apache HBase
8.2. Access Control
8.3. Secure Bulk Load
9. Architecture
9.1. Overview
9.2. Catalog Tables
9.3. Client
9.4. Client Request Filters
9.5. Master
9.6. RegionServer
9.7. Regions
9.8. Bulk Loading
9.9. HDFS
10. Apache HBase (TM) External APIs
10.1. Non-Java Languages Talking to the JVM
10.2. REST
10.3. Thrift
10.4. C/C++ Apache HBase Client
11. Apache HBase (TM) Performance Tuning
11.1. Operating System
11.2. Network
11.3. Java
11.4. HBase Configurations
11.5. ZooKeeper
11.6. Schema Design
11.7. Writing to HBase
11.8. Reading from HBase
11.9. Deleting from HBase
11.10. HDFS
11.11. Amazon EC2
11.12. Case Studies
12. Troubleshooting and Debugging Apache HBase (TM)
12.1. General Guidelines
12.2. Logs
12.3. Resources
12.4. Tools
12.5. Client
12.6. MapReduce
12.7. NameNode
12.8. Network
12.9. RegionServer
12.10. Master
12.11. ZooKeeper
12.12. Amazon EC2
12.13. HBase and Hadoop version issues
12.14. Case Studies
13. Apache HBase (TM) Case Studies
13.1. Overview
13.2. Schema Design
13.3. Performance/Troubleshooting
14. Apache HBase (TM) Operational Management
14.1. HBase Tools and Utilities
14.2. Region Management
14.3. Node Management
14.4. HBase Metrics
14.5. HBase Monitoring
14.6. Cluster Replication
14.7. HBase Backup
14.8. HBase Snapshots
14.9. Capacity Planning
15. Building and Developing Apache HBase (TM)
15.1. Apache HBase Repositories
15.2. IDEs
15.3. Building Apache HBase
15.4. Adding an Apache HBase release to Apache's Maven Repository
15.5. Generating the HBase Reference Guide
15.6. Updating hbase.apache.org
15.7. Tests
15.8. Maven Build Commands
15.9. Getting Involved
15.10. Developing
15.11. Submitting Patches
16. ZooKeeper
16.1. Using existing ZooKeeper ensemble
16.2. SASL Authentication with ZooKeeper
17. Community
17.1. Decisions
17.2. Community Roles
A. FAQ
B. hbck In Depth
B.1. Running hbck to identify inconsistencies
B.2. Inconsistencies
B.3. Localized repairs
B.4. Region Overlap Repairs
C. Compression In HBase
C.1. CompressionTest Tool
C.2. hbase.regionserver.codecs
C.3. LZO
C.4. GZIP
C.5. SNAPPY
C.6. Changing Compression Schemes
D. YCSB: The Yahoo! Cloud Serving Benchmark and HBase
E. HFile format version 2
E.1. Motivation
E.2. HFile format version 1 overview
E.3. HBase file format with inline blocks (version 2)
F. Other Information About HBase
F.1. HBase Videos
F.2. HBase Presentations (Slides)
F.3. HBase Papers
F.4. HBase Sites
F.5. HBase Books
F.6. Hadoop Books
G. HBase History
H. HBase and the Apache Software Foundation
H.1. ASF Development Process
H.2. ASF Board Reporting
I. Enabling Dapper-like Tracing in HBase
I.1. SpanReceivers
I.2. Client Modifications
Index

List of Tables

2.1. Hadoop version support matrix
5.1. Table webtable
5.2. ColumnFamily anchor
5.3. ColumnFamily contents
8.1. Operation To Permission Mapping