Apache Kudu is open source software. A Kudu cluster stores tables that look just like the tables you’re used to from relational (SQL) databases. A table can be as simple as a binary value, or as complex as a few hundred different strongly-typed attributes.
Just like SQL, every table has a `PRIMARY KEY` made up of one or more columns. This might be a single column like a unique user identifier, or a compound key such as a `(host, metric, timestamp)` tuple for a machine time series database. Rows can be efficiently read, updated, or deleted by their primary key.
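To make the compound-key idea concrete, here is a minimal in-memory sketch in Python. It is not the Kudu client API: `MetricKey` and `MetricsTable` are hypothetical names, and the integer timestamp unit is an assumption. It only illustrates how a `(host, metric, timestamp)` tuple can uniquely address a row for reads, updates, and deletes.

```python
from typing import Dict, NamedTuple, Optional


class MetricKey(NamedTuple):
    """Compound primary key: (host, metric, timestamp)."""
    host: str
    metric: str
    timestamp: int  # assumed to be seconds since the Unix epoch


class MetricsTable:
    """Toy table (hypothetical): every row is addressed by its primary key."""

    def __init__(self) -> None:
        self._rows: Dict[MetricKey, float] = {}

    def insert(self, key: MetricKey, value: float) -> None:
        # A primary key must be unique, so a second insert is rejected.
        if key in self._rows:
            raise KeyError(f"duplicate primary key: {key}")
        self._rows[key] = value

    def read(self, key: MetricKey) -> Optional[float]:
        # Point lookup by primary key; None if the row does not exist.
        return self._rows.get(key)

    def update(self, key: MetricKey, value: float) -> None:
        if key not in self._rows:
            raise KeyError(f"no row with primary key: {key}")
        self._rows[key] = value

    def delete(self, key: MetricKey) -> None:
        del self._rows[key]


table = MetricsTable()
key = MetricKey("web01", "cpu_load", 1_589_000_000)
table.insert(key, 0.42)   # create the row
table.update(key, 0.57)   # overwrite its value by key
```

In a real deployment the storage engine, not a Python dict, enforces key uniqueness and provides the efficient lookup; the sketch only mirrors the access pattern.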
Kudu’s simple data model makes it a breeze to port legacy applications or build new ones: no need to worry about how to encode your data into binary blobs or make sense of a huge database full of hard-to-interpret JSON. Tables are self-describing, so you can use standard tools like SQL engines or Spark to analyze your data.
Apache Kudu 1.12 released
- Kudu now supports native fine-grained authorization via integration with Apache Ranger. Kudu may now enforce access control policies defined for Kudu tables and columns stored in Ranger. See the authorization documentation for more details.
- Kudu’s web UI now supports proxying via Apache Knox. Kudu may be deployed in a firewalled state behind a Knox Gateway which will forward HTTP requests and responses between clients and the Kudu web UI.
- Kudu’s web UI now supports HTTP keep-alive. Operations that access multiple URLs will now reuse a single HTTP connection, improving their performance.
- The `kudu tserver quiesce` tool is added to quiesce tablet servers. While a tablet server is quiescing, it will stop hosting tablet leaders and stop serving new scan requests. This can be used to orchestrate a rolling restart without stopping ongoing Kudu workloads.
- An `auto` time source is added for HybridClock timestamps. With `--time_source=auto` in AWS and GCE cloud environments, Kudu masters and tablet servers use the built-in NTP client synchronized with dedicated NTP servers available via host-only networks. With `--time_source=auto` in environments other than AWS/GCE, Kudu masters and tablet servers rely on their local machine’s clock synchronized by NTP. The default setting for the HybridClock time source (`--time_source=system`) is backward-compatible, requiring the local machine’s clock to be synchronized by the kernel’s NTP discipline.
- The `kudu cluster rebalance` tool now supports moving replicas away from specific tablet servers by supplying the `--move_replicas_from_ignored_tservers` argument (see KUDU-2914 for more details).
- The `kudu table create` tool is added to allow users to specify table creation options using JSON.
- Kudu now supports DATE and VARCHAR data types. See the schema design documentation for more details.
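
The rolling-restart use of `kudu tserver quiesce` mentioned in the list above could be scripted roughly as follows. This is a sketch under stated assumptions, not an official procedure: the `start` action and the positional tablet server address are not given in these release notes (check `kudu tserver quiesce --help` for the exact syntax), and `restart` is a hypothetical hook that you would supply for your own service manager.

```python
import subprocess
from typing import Callable, List, Sequence


def quiesce_argv(tserver_addr: str) -> List[str]:
    """Command line asking one tablet server to start quiescing.

    ASSUMPTION: the `start` action and the positional address are not
    taken from the release notes; verify them against the tool's help.
    """
    return ["kudu", "tserver", "quiesce", "start", tserver_addr]


def rolling_restart(
    tservers: Sequence[str],
    restart: Callable[[str], None],
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Quiesce and restart tablet servers one at a time."""
    for addr in tservers:
        # Ask the server to stop hosting leaders and refuse new scans.
        run(quiesce_argv(addr))
        # A production script would wait here until leadership has moved
        # and in-flight scans have finished; that check is omitted.
        restart(addr)  # hypothetical hook, e.g. a call into a service manager
```

Passing `run=print` together with a no-op `restart` gives a dry run that only prints the commands, which is a cheap way to review the plan before touching a cluster.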