Apache Kylin 2.3.0 release, Open source distributed analytics engine

Apache Kylin™ is an open source Distributed Analytics Engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets. It was originally contributed by eBay Inc.

Apache Kylin™ lets you query massive datasets at sub-second latency in 3 steps.

  1. Identify a Star Schema on Hadoop.
  2. Build Cube from the identified tables.
  3. Query with ANSI SQL and get results in under a second, via ODBC, JDBC or the RESTful API.
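
As a rough illustration of step 3 over the RESTful API, the sketch below builds a query request in Python. It assumes a Kylin server at `http://localhost:7070` with the sample `learn_kylin` project, the `kylin_sales` table, and the default `ADMIN`/`KYLIN` account; these names are assumptions for illustration, so adjust them to your deployment. The request is only constructed and printed, not sent.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, project, sql, user, password, limit=50):
    """Build (but do not send) a POST request for Kylin's query endpoint."""
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # HTTP Basic authentication: base64("user:password")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_query_request(
        "http://localhost:7070",  # assumed host and port
        "learn_kylin",            # assumed sample project name
        "SELECT part_dt, SUM(price) FROM kylin_sales GROUP BY part_dt",
        "ADMIN",                  # assumed default account
        "KYLIN",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To run the query against a live server, send the request:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it makes it easy to inspect the JSON body before pointing the script at a real cluster.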

WHAT IS KYLIN?

– Extremely Fast OLAP Engine at Scale: 

Kylin is designed to reduce query latency on Hadoop for datasets of more than 10 billion rows

– ANSI SQL Interface on Hadoop: 

Kylin offers ANSI SQL on Hadoop and supports most ANSI SQL query functions

– Interactive Query Capability: 

Users can interact with Hadoop data via Kylin at sub-second latency, faster than Hive queries on the same dataset

– MOLAP Cube:

Users can define a data model and pre-build it in Kylin with more than 10 billion raw data records
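
To make the idea of a pre-built MOLAP cube concrete, here is a toy Python sketch that pre-aggregates a measure for every combination of dimensions (each combination is often called a cuboid). It is only an in-memory illustration of the concept, not how Kylin itself builds or stores cubes; the function names, table rows, and column names are all made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query(cube, dims, key):
    """Answer a GROUP BY lookup from the pre-built cube without scanning rows.

    dims must be listed in the same order as when the cube was built.
    """
    return cube[tuple(dims)].get(tuple(key), 0)


# Hypothetical fact table with two dimensions and one measure.
rows = [
    {"region": "EU", "year": 2017, "price": 10},
    {"region": "EU", "year": 2018, "price": 20},
    {"region": "US", "year": 2018, "price": 5},
]
cube = build_cube(rows, ["region", "year"], "price")

print(len(cube))                        # 4 cuboids for 2 dimensions
print(query(cube, ["region"], ["EU"]))  # 30
print(query(cube, [], []))              # 35 (grand total)
```

The number of cuboids doubles with each added dimension, which is why choosing which cuboids to build matters at scale.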

– Seamless Integration with BI Tools:

Kylin currently integrates with BI tools such as Tableau, Power BI, and Excel. Integration with MicroStrategy is coming soon

– Other Highlights: 

– Job Management and Monitoring
– Compression and Encoding Support
– Incremental Refresh of Cubes
– Leverages HBase Coprocessor to reduce query latency
– Both approximate and precise Query Capabilities for Distinct Count
– Approximate Top-N Query Capability
– Easy Web interface to manage, build, monitor and query cubes
– Security capability to set ACL at Cube/Project Level
– Support LDAP and SAML Integration

Apache Kylin v2.3.0 has been released. This is a major release after 2.1, with over 70 bug fixes and improvements:

New features:

– [KYLIN-3125] – Support SparkSql in Cube building step “Create Intermediate Flat Hive Table”
– [KYLIN-3052] – Support Redshift as data source
– [KYLIN-3044] – Support SQL Server as data source
– [KYLIN-2999] – One click migrate cube in web
– [KYLIN-2960] – Support user/group and role authentication for LDAP
– [KYLIN-2902] – Introduce project-level concurrent query number control
– [KYLIN-2776] – New metric framework based on dropwizard
– [KYLIN-2727] – Introduce cube planner able to select cost-effective cuboids to be built by cost-based algorithms
– [KYLIN-2726] – Introduce a dashboard for showing Kylin service related metrics, like query count, query latency, job count, etc.
– [KYLIN-1892] – Support volatile range for segments auto merge

Improvements:

– [KYLIN-3265] – Add “jobSearchMode” as a condition to “/kylin/api/jobs” API
– [KYLIN-3245] – Searching cube support fuzzy search
– [KYLIN-3243] – Optimize the code and keep the code consistent in the access.html
– [KYLIN-3239] – Refactor the ACL code about “checkPermission” and “hasPermission”
– [KYLIN-3215] – Remove ‘drop’ option when job status is stopped and error
– [KYLIN-3214] – Initialize ExternalAclProvider when starting kylin
– [KYLIN-3209] – Optimize job partial statistics path be consistent with existing one
– [KYLIN-3196] – Replace StringUtils.containsOnly with Regex
– [KYLIN-3194] – Tolerate broken job metadata caused by executable ClassNotFoundException
– [KYLIN-3193] – No model clone across projects
– [KYLIN-3182] – Update Kylin help menu links
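
As a hedged illustration of the “jobSearchMode” condition from KYLIN-3265, the Python sketch below builds a URL for the “/kylin/api/jobs” endpoint. Apart from `jobSearchMode` and the endpoint path, everything here is an assumption: the other parameter names (`projectName`, `offset`, `limit`), the mode value `CUBING_ONLY`, the host, and the project name should all be checked against the REST API documentation for your Kylin version.

```python
from urllib.parse import urlencode


def jobs_url(base_url, project, job_search_mode="ALL", offset=0, limit=15):
    """Build a job-listing URL filtered by jobSearchMode (names are assumed)."""
    params = {
        "projectName": project,            # assumed parameter name
        "jobSearchMode": job_search_mode,  # condition added by KYLIN-3265
        "offset": offset,                  # assumed parameter name
        "limit": limit,                    # assumed parameter name
    }
    return base_url.rstrip("/") + "/kylin/api/jobs?" + urlencode(params)


# Assumed host, project, and mode value, for illustration only.
print(jobs_url("http://localhost:7070", "learn_kylin", "CUBING_ONLY"))
```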

Reference: kylin.apache.org