Apache Kylin is an open-source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop for extremely large datasets. It was originally contributed by eBay Inc.
Apache Kylin lets you query massive datasets at sub-second latency in three steps:
- Identify a Star Schema on Hadoop.
- Build Cube from the identified tables.
- Query with ANSI SQL and get results with sub-second latency via ODBC, JDBC, or the RESTful API.
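The third step can be sketched against Kylin's REST query endpoint. This is a minimal sketch rather than an official client. It assumes a Kylin instance at http://localhost:7070 with the default ADMIN/KYLIN credentials and a sample project named learn_kylin that contains a kylin_sales table; replace all of these with your own values. The helper names build_kylin_query_request and run_query are hypothetical.

```python
import base64
import json
import urllib.request


def build_kylin_query_request(sql, project, host="http://localhost:7070",
                              user="ADMIN", password="KYLIN", limit=100):
    """Build (but do not send) a POST request for Kylin's /kylin/api/query endpoint.

    host, user and password are assumed defaults; adjust them for your cluster.
    """
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # Kylin's REST API uses HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{host}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running server, `run_query(build_kylin_query_request("select count(*) from kylin_sales", "learn_kylin"))` would submit the query. Building the request separately from sending it makes the payload easy to inspect before it reaches the cluster.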
WHAT IS KYLIN?
– Extremely Fast OLAP Engine at Scale:
– ANSI SQL Interface on Hadoop:
– Interactive Query Capability:
– MOLAP Cube:
– Seamless Integration with BI Tools:
– Other Highlights:
– Compression and Encoding Support
– Incremental Refresh of Cubes
– Leverage HBase Coprocessor to reduce query latency
– Both approximate and precise Query Capabilities for Distinct Count
– Approximate Top-N Query Capability
– Easy Web interface to manage, build, monitor and query cubes
– Security capability to set ACL at Cube/Project Level
– Support LDAP and SAML Integration
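To illustrate the MOLAP cube idea behind the "Build Cube" step, here is a toy sketch of pre-aggregation. It is not Kylin's build engine: it runs in memory, supports only a SUM measure, and materializes every combination of dimensions (every cuboid) without pruning. The function names build_cube and query_cube and the row layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of dimensions (every cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for row in rows:
                # The group key is the tuple of this row's values for the cuboid's dimensions.
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[measure]
            cube[dims] = dict(cuboid)
    return cube


def query_cube(cube, group_by):
    """Answer a GROUP BY query with a lookup of the matching cuboid.

    group_by must list dimensions in the same order used when building the cube.
    """
    return cube[tuple(group_by)]


rows = [
    {"region": "EU", "year": 2018, "price": 10},
    {"region": "EU", "year": 2019, "price": 5},
    {"region": "US", "year": 2019, "price": 7},
]
cube = build_cube(rows, ["region", "year"], "price")
totals_by_region = query_cube(cube, ["region"])
```

Because every grouping is computed at build time, a query becomes a dictionary lookup instead of a scan over the raw rows. That is the trade-off a MOLAP cube makes: more build time and storage in exchange for low query latency.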
Apache Kylin v3.0 was released.
- [KYLIN-4098] – Add cube auto merge api
- [KYLIN-3883] – Kylin supports column count aggregation
- [KYLIN-565] – Unsupported SQL Functions
- [KYLIN-1772] – Highlight segment at HBase tab page of cube admin view when the segment is not healthy.
- [KYLIN-1850] – Show Kylin Version on GUI
- [KYLIN-2431] – StorageCleanupJob will remove intermediate tables created by other kylin instances
- [KYLIN-3756] – Support check-port-availability script for mac os x
- [KYLIN-3865] – Centralize the zookeeper related info
- [KYLIN-3906] – ExecutableManager is spelled as ExecutableManger
- [KYLIN-3907] – Sort the cube list by create time in descending order.
- [KYLIN-3917] – Add max segment merge span to cleanup intermediate data of cube building
- [KYLIN-4010] – Auto adjust offset according to query server’s timezone for time derived column
- [KYLIN-4096] – Make cube metadata validator rules configurable
- [KYLIN-4097] – Throw exception when too many dict slice eviction in AppendTrieDictionary
- [KYLIN-4163] – CreateFlatHiveTableStep has no YARN app URL when the Hive job is running
- [KYLIN-4167] – Refactor streaming coordinator
- [KYLIN-4175] – Support secondary hbase storage config for hbase cluster migration
- [KYLIN-4178] – Job scheduler support safe mode
- [KYLIN-4180] – Prevent abnormal CPU usage by limiting flat filters length
- [KYLIN-4187] – Building dimension dictionary using spark
- [KYLIN-4193] – More user-friendly page for loading streaming tables
- [KYLIN-4198] – “bin/system-cube.sh cron” will overwrite user’s crontab
- [KYLIN-4201] – Allow users to delete unused receivers from streaming page
- [KYLIN-4208] – RT OLAP kylin.stream.node configuration optimization: allow all receivers to share the same config
- [KYLIN-4257] – Build historical data by layer in real time Lambda cube
- [KYLIN-4258] – Real-time OLAP may return incorrect results in some cases
- [KYLIN-4273] – Make cube planner work for real-time streaming job
- [KYLIN-4283] – FileNotFound error in “Garbage Collection” step should not break cube building.
- [KYLIN-1716] – leave executing query page action stop bug
- [KYLIN-3730] – TableMetadataManager.reloadSourceTableQuietly is wrong
- [KYLIN-3741] – When the SQL result is empty and limit is 0, there should be no “load more” bar
- [KYLIN-3842] – kylinProperties.js Unable to get the public configuration of the first line in the front end
- [KYLIN-3881] – Calcite isolating expression with its condition may throw ‘Division Undefined’ exception
- [KYLIN-3887] – Query with decimal sum measure of double failed to compile after KYLIN-3703
- [KYLIN-3933] – Replica set related operations currently require refreshing the front-end page
- [KYLIN-4135] – Real time streaming segment build task discard but can’t be rebuilt
- [KYLIN-4147] – User has project’s admin permission but doesn’t have permission to see the Storage/Planner/streaming tab in Model page
- [KYLIN-4162] – After dropping the build task on the monitor page, subsequent segments cannot be built.
- [KYLIN-4165] – RT OLAP building job on “Save Cube Dictionaries” step concurrency error
- [KYLIN-4169] – Too many logs while DataModelManager init, cause the first RESTful API hang for a long time
- [KYLIN-4172] – Can’t rename field when map streaming schema to table
- [KYLIN-4176] – Filter the intermediate tables when loading table metadata from tree
- [KYLIN-4183] – Clicking ‘Submit’ button is unresponsive, when the segment is not selected.
- [KYLIN-4190] – hiveproducer write() function throws an exception because the Hive metrics table location path prefix differs from the default FS when HDFS uses router-based federation
- [KYLIN-4194] – Throw KylinConfigCannotInitException at STEP “Extract Fact Table Distinct Columns” with spark
- [KYLIN-4203] – After disabling a real-time cube and then enabling it, the cube may no longer be able to submit build jobs
- [KYLIN-4229] – String index out of range -1
- [KYLIN-4242] – Usage instructions in ‘PasswordPlaceholderConfigurer’ don’t work
- [KYLIN-4244] – ClassNotFoundException while using org.apache.kylin.engine.mr.common.CubeStatsReader in bash
- [KYLIN-4246] – Wrong results from real-time streaming when an optional field is used as a dimension
- [KYLIN-4248] – When adding a user, the prompt message is incorrect when the user name is empty.
- [KYLIN-4254] – The result exported from Insight in CSV format is empty when the SQL contains Chinese
- [KYLIN-4262] – pid in GC filename inconsistent with real pid
- [KYLIN-4265] – SQL tab of cube failed when filter is not empty