Use Apache HBase™ when you need random, realtime read/write access to your Big Data. The project's goal is to host very large tables (billions of rows by millions of columns) on clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable builds on the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
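To make "versioned" concrete, the following is a minimal in-memory sketch of a sorted map from row key to column to timestamp to value, in the spirit of the Bigtable data model. It is not the HBase API: the class `VersionedTableSketch` and its methods are hypothetical, and it uses `String` keys and values for readability where HBase works with byte arrays.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of a versioned, sorted table. Hypothetical names; not the HBase API. */
public class VersionedTableSketch {
    // row key -> column name -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    /** Stores one version of a cell; earlier versions are kept. */
    public void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns the newest value of the cell, or null if the cell does not exist. */
    public String get(String row, String column) {
        NavigableMap<Long, String> versions = versions(row, column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    /** Returns the newest value written at or before asOf, or null if there is none. */
    public String get(String row, String column, long asOf) {
        NavigableMap<Long, String> versions = versions(row, column);
        if (versions == null) {
            return null;
        }
        // Timestamps are sorted in descending order, so ceilingEntry(asOf)
        // yields the largest timestamp that is <= asOf.
        Map.Entry<Long, String> entry = versions.ceilingEntry(asOf);
        return entry == null ? null : entry.getValue();
    }

    private NavigableMap<Long, String> versions(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        VersionedTableSketch table = new VersionedTableSketch();
        table.put("row1", "cf:a", 10L, "old");
        table.put("row1", "cf:a", 20L, "new");
        System.out.println(table.get("row1", "cf:a"));      // newest version
        System.out.println(table.get("row1", "cf:a", 15L)); // version as of timestamp 15
    }
}
```

Keeping every layer sorted is what makes range scans over row keys and "latest version" lookups cheap in this kind of model; a real deployment distributes and persists the map rather than holding it in one process.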
- Linear and modular scalability.
- Strictly consistent reads and writes.
- Automatic and configurable sharding of tables.
- Automatic failover support between RegionServers.
- Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
- Easy-to-use Java API for client access.
- Block cache and Bloom Filters for real-time queries.
- Query predicate pushdown via server-side Filters.
- Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options.
- Extensible JRuby-based (JIRB) shell.
- Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.
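The sharding feature above can be illustrated with a small sketch of range-based partitioning by row key, where each shard owns the keys from its start key up to the next shard's start key. This is a toy model under that assumption, not HBase code: the class `RangeShardingSketch`, its methods, and the server names are all hypothetical.

```java
import java.util.TreeMap;

/** Toy range-based sharding by row key. Hypothetical names; not the HBase API. */
public class RangeShardingSketch {
    // shard start key (inclusive) -> name of the server hosting that shard
    private final TreeMap<String, String> shardsByStartKey = new TreeMap<>();

    public RangeShardingSketch() {
        // The empty start key sorts before every other key, so one shard
        // initially covers the whole key space.
        shardsByStartKey.put("", "server-1");
    }

    /** Splits the key space at splitKey; keys >= splitKey move to the given server. */
    public void split(String splitKey, String server) {
        shardsByStartKey.put(splitKey, server);
    }

    /** Returns the server whose shard contains rowKey. */
    public String locate(String rowKey) {
        // floorEntry finds the greatest start key that is <= rowKey.
        return shardsByStartKey.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        RangeShardingSketch shards = new RangeShardingSketch();
        shards.split("m", "server-2");
        System.out.println(shards.locate("apple"));
        System.out.println(shards.locate("zebra"));
    }
}
```

Because rows stay sorted within contiguous ranges, a lookup only needs the shard boundaries, and a growing shard can be split at a key without rehashing the rest of the table.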
Apache HBase 2.0 released.
Notable new features include:
- Preliminary support for JDK11 (requires Hadoop 3.2.0+, see HBASE-22972)
- Hadoop versions increased to 2.10.0 (HBASE-23986) and 3.1.2 (HBASE-22500)
- ZooKeeper version increased to 3.5.7 (HBASE-24132)
- Lots of improvements around HBCK2
- HBase client connections to ZooKeeper are now optional (HBASE-18095)
- TinyLFU-based BlockCache (HBASE-15560)
- Bucket cache on Persistent memory (HBASE-21874)
- Improve MTTR: Split WAL to HFile (HBASE-23286)