Apache Kylin v3.0.0-alpha released, Open source distributed analytics engine

Apache Kylin

Apache Kylin is an open source Distributed Analytics Engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets. It was originally contributed by eBay Inc.

Apache Kylin lets you query massive datasets at sub-second latency in 3 steps.

  1. Identify a Star Schema on Hadoop.
  2. Build Cube from the identified tables.
  3. Query with ANSI SQL and get results in under a second via ODBC, JDBC, or the RESTful API.
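
As a rough illustration of step 3, the sketch below builds a SQL query request for the RESTful API using only the Python standard library. The endpoint path (`/kylin/api/query`) and the payload fields are assumptions that should be checked against the Kylin REST documentation for your version; the host, project, table name, and credentials are hypothetical placeholders. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request


def build_kylin_query(base_url, project, sql, user, password, limit=100):
    """Build (but do not send) a POST request for a Kylin SQL query."""
    # Payload field names are assumed; verify them against your Kylin version.
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # HTTP Basic authentication header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical host, project, table, and credentials, for illustration only.
request = build_kylin_query(
    "http://localhost:7070",
    "my_project",
    "SELECT part_dt, SUM(price) FROM sales_fact GROUP BY part_dt",
    "ADMIN",
    "KYLIN",
)

# To run the query against a live server:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect before pointing it at a real cluster.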

WHAT IS KYLIN?

– Extremely Fast OLAP Engine at Scale: 

Kylin is designed to reduce query latency on Hadoop for datasets of more than 10 billion rows.

– ANSI SQL Interface on Hadoop: 

Kylin offers ANSI SQL on Hadoop and supports most ANSI SQL query functions.

– Interactive Query Capability: 

Users can interact with Hadoop data via Kylin at sub-second latency, faster than Hive queries on the same dataset.

– MOLAP Cube:

Users can define a data model and pre-build it in Kylin as a cube over more than 10 billion raw data records.
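
To make the idea of pre-building a cube concrete, here is a minimal in-memory sketch, not Kylin's actual build process: it pre-aggregates a SUM measure for every combination of dimensions (each combination is one cuboid), so that a later GROUP BY becomes a lookup rather than a scan. The function name, rows, and column names are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for row in rows:
                # Group key is the tuple of this cuboid's dimension values.
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[measure]
            cube[dims] = dict(cuboid)
    return cube


# Hypothetical fact rows, for illustration only.
rows = [
    {"country": "US", "year": 2018, "price": 10},
    {"country": "US", "year": 2019, "price": 20},
    {"country": "CN", "year": 2019, "price": 5},
]
cube = build_cube(rows, ["country", "year"], "price")

# "SELECT country, SUM(price) ... GROUP BY country" is now a dictionary lookup.
sales_by_country = cube[("country",)]
```

With n dimensions this produces 2^n cuboids, which is why a real system has to be selective about which combinations it materializes.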

– Seamless Integration with BI Tools:

Kylin currently offers integration with BI tools such as Tableau, Power BI, and Excel. Integration with MicroStrategy is coming soon.

– Other Highlights: 

– Job Management and Monitoring
– Compression and Encoding Support
– Incremental Refresh of Cubes
– Leverage HBase Coprocessor to reduce query latency
– Both approximate and precise Query Capabilities for Distinct Count
– Approximate Top-N Query Capability
– Easy Web interface to manage, build, monitor and query cubes
– Security capability to set ACL at Cube/Project Level
– Support LDAP and SAML Integration
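
For readers unfamiliar with approximate Top-N queries, the sketch below shows one well-known technique, the Space-Saving algorithm, which tracks frequent items using a fixed number of counters. This is an illustrative example of the general idea only; the text above does not say which algorithm Kylin uses, and the function name and sample stream are invented.

```python
def space_saving_top_n(stream, capacity):
    """Approximate the most frequent items using at most `capacity` counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer inherits
            # that count plus one, so counts can overestimate but never
            # underestimate the true frequency.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return sorted(counters.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical stream: true counts are a=5, b=2, c=1.
stream = ["a", "a", "a", "b", "a", "c", "a", "b"]
top_two = space_saving_top_n(stream, capacity=2)
```

In this run the count for "a" is exact, while "b" is reported as 3 rather than its true count of 2, which shows the bounded overestimation that makes the result approximate.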

Apache Kylin v3.0.0-alpha was released.

Changelog

New Feature

  • [KYLIN-3654] – Kylin Real-time Streaming
  • [KYLIN-3795] – Submit Spark jobs via Apache Livy
  • [KYLIN-3820] – Add a curator-based scheduler

Improvement

  • [KYLIN-3716] – FastThreadLocal replaces ThreadLocal
  • [KYLIN-3744] – Add javadoc and unit tests for Kylin New Streaming Solution
  • [KYLIN-3759] – Streaming ClassNotFoundException when lambda is enabled in MR job
  • [KYLIN-3786] – Add integration test for real-time streaming
  • [KYLIN-3791] – Map returned by Maps.transformValues is an immutable view
  • [KYLIN-3797] – Too many OR filters may break the Kylin server when flattening filters
  • [KYLIN-3814] – Add pause interval for job retry
  • [KYLIN-3821] – Expose real-time streaming data consuming lag info
  • [KYLIN-3834] – Add monitor for curator-based scheduler
  • [KYLIN-3839] – Storage clean up after refreshing or deleting a segment
  • [KYLIN-3864] – Provide a function to determine whether the OS type is Mac OS X
  • [KYLIN-3867] – Enable JDBC to use key store & trust store for https connection
  • [KYLIN-3901] – Use multi threads to speed up the storage cleanup job
  • [KYLIN-3905] – Enable shrunken dictionary by default
  • [KYLIN-3908] – KylinClient’s HttpRequest.releaseConnection is not needed in retrieveMetaData & executeKylinQuery
  • [KYLIN-3929] – Check satisfaction before executing the cube planner algorithm
  • [KYLIN-3690] – New streaming backend implementation
  • [KYLIN-3691] – New streaming UI implementation
  • [KYLIN-3692] – New streaming UI implementation
  • [KYLIN-3745] – Real-time segment state change from active to immutable is not sequential
  • [KYLIN-3747] – Use FQDN to register a streaming receiver instead of IP
  • [KYLIN-3768] – Save streaming metadata to a standard Kylin path in ZooKeeper

Bug Fix

  • [KYLIN-3787] – NPE thrown when a dimension value is null while querying real-time data
  • [KYLIN-3789] – Stream receiver admin page issue fix
  • [KYLIN-3800] – Real-time streaming count distinct result wrong
  • [KYLIN-3817] – Duration in Cube building is a negative number
  • [KYLIN-3818] – After Cube disabled, auto-merge cube job still running
  • [KYLIN-3830] – Wrong result for ‘SELECT SUM(dim1)’ when no related metric is set for dim1
  • [KYLIN-3866] – Whether to set mapreduce.application.classpath is determined by the user
  • [KYLIN-3880] – DataType is incompatible in Kylin HBase coprocessor
  • [KYLIN-3888] – TableNotDisabledException when running “Convert Lookup Table to HFile”
  • [KYLIN-3898] – Cube level properties are ineffective in some build steps
  • [KYLIN-3902] – NoRealizationFoundException due to creating a wrong JoinDesc
  • [KYLIN-3909] – Spark cubing job failed for MappeableRunContainer is not registered
  • [KYLIN-3911] – Check if HBase table is enabled before disabling table in DeployCoprocessorCLI
  • [KYLIN-3916] – Fix cube build action issue after streaming migration
  • [KYLIN-3922] – Fail to update coprocessor when running DeployCoprocessorCLI
  • [KYLIN-3923] – UT GeneralColumnDataTest fails

Download