Apache Kylin | Analytical Data Warehouse for Big Data

Bring OLAP Back to Big Data!

Apache Kylin™ is an open source, distributed Analytical Data Warehouse for Big Data; it was designed to provide OLAP (Online Analytical Processing) capability in the big data era. By renovating the multi-dimensional cube and precalculation technology on Hadoop and Spark, Kylin is able to achieve near constant query speed regardless of the ever-growing data volume. Reducing query latency from minutes to sub-second, Kylin brings online analytics back to big data.

Apache Kylin™ lets you query billions of rows at sub-second latency in 3 steps.

1 Identify a Star/Snowflake Schema on Hadoop.
2 Build Cube from the identified tables.
3 Query with ANSI SQL over ODBC, JDBC or the RESTful API and get results in under a second.
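
As a rough illustration of step 3, the sketch below builds (but does not send) an HTTP request for a SQL query over the RESTful API. The endpoint path `/kylin/api/query`, the host and port, the `learn_kylin` project, the `kylin_sales` table and the `ADMIN`/`KYLIN` credentials are assumptions modeled on a default sample installation; check them against the REST API documentation for your Kylin version before relying on them.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, project, sql, user, password, limit=100):
    """Build a POST request for a SQL query (assumed endpoint: /kylin/api/query)."""
    payload = {"sql": sql, "project": project, "offset": 0, "limit": limit}
    # HTTP Basic authentication: base64("user:password")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_query_request(
    base_url="http://localhost:7070",  # assumed host and port
    project="learn_kylin",             # assumed sample project
    sql="SELECT part_dt, SUM(price) FROM kylin_sales GROUP BY part_dt",
    user="ADMIN",                      # assumed default credentials
    password="KYLIN",
)

# To run the query against a live server, uncomment:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it keeps the example runnable without a Kylin server and makes the payload easy to inspect.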

Apache Kylin™ can also integrate with BI tools such as Tableau and PowerBI to enable BI on Hadoop.

Kylin 5 is now released. Please visit the Kylin 5.0 home page! Some highlighted features are:

1 Smooth modeling process in a single canvas, which is based on Vue.js
2 More flexible and enhanced data model, adding Computed Column and Table Index features
3 Toward a native and vectorized query engine

Why Apache Kylin?

Timely Decision Making on Big Data

Kylin can analyze more than 10 billion rows in less than a second. No more waiting on reports for critical decisions.

BI on Hadoop Accelerated

Kylin connects data on Hadoop to BI tools like Tableau, PowerBI/Excel, MSTR, QlikSense, Hue and Superset, making BI on Hadoop faster than ever.

ANSI SQL Interface for Big Data on Hadoop

As an Analytical Data Warehouse, Kylin offers ANSI SQL on Hadoop/Spark and supports most ANSI SQL query functions.

Interactive Queries at High Concurrency

Kylin can support thousands of interactive queries at the same time, thanks to the low resource consumption of each query.

Real-time OLAP for Streaming Big Data

Kylin is able to compute streaming data as soon as it is generated, allowing real-time data analysis at second-level latency.

MOLAP Cube Precalculation

Analysts can define their favorite multi-dimensional model and precalculate the cube in Kylin.
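
To make the idea of cube precalculation concrete, here is a toy sketch in plain Python. It precomputes `SUM(sales)` for every combination of dimensions (each combination is a "cuboid") and then answers a GROUP BY with a dictionary lookup instead of a scan over the raw rows. The rows, dimension names and helper functions are hypothetical, and this is not how Kylin builds or stores cubes internally; it only illustrates why query time can stay roughly constant as the raw data grows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact table: three dimensions and one measure.
DIMENSIONS = ("region", "product", "year")
ROWS = [
    {"region": "EU", "product": "phone", "year": 2023, "sales": 10},
    {"region": "EU", "product": "laptop", "year": 2023, "sales": 20},
    {"region": "US", "product": "phone", "year": 2023, "sales": 30},
    {"region": "US", "product": "phone", "year": 2024, "sales": 40},
]


def build_cube(rows, dimensions, measure):
    """Precompute SUM(measure) for every subset of dimensions (every cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for cuboid in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in cuboid)
                totals[key] += row[measure]
            cube[cuboid] = dict(totals)
    return cube


def query(cube, dimensions, group_by):
    """Answer a GROUP BY from the matching cuboid without touching the rows."""
    cuboid = tuple(d for d in dimensions if d in group_by)
    return cube[cuboid]


cube = build_cube(ROWS, DIMENSIONS, "sales")
by_region = query(cube, DIMENSIONS, {"region"})
by_region_year = query(cube, DIMENSIONS, {"year", "region"})
grand_total = query(cube, DIMENSIONS, set())
```

With three dimensions the cube holds 2^3 = 8 cuboids. The number of cuboids doubles with each added dimension, which is the trade-off a precalculation approach makes: more build time and storage in exchange for cheap queries.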

Other Highlights

Job Management and Monitoring
Compression and Encoding Support
Incremental Refresh of Cubes
Leverages HBase Coprocessor to reduce query latency
Both approximate and precise Query Capabilities for Distinct Count
Approximate Top-N Query Capability
Easy Web interface to manage, build, monitor and query cubes
Security capability to set ACL at Project/Table Level
Support for LDAP and SAML Integration
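
The difference between precise and approximate distinct counting, mentioned in the list above, can be sketched as follows. The precise count keeps every distinct value in a set, while the approximate count uses a small fixed-size HyperLogLog sketch. This is a generic textbook HyperLogLog written for illustration, with a hypothetical class name and parameters; it is not Kylin's implementation and makes no claim about Kylin's internal data structures.

```python
import hashlib
import math


class HyperLogLog:
    """Textbook HyperLogLog with 2**precision registers (illustrative only)."""

    def __init__(self, precision=12):
        self.p = precision
        self.m = 1 << precision
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, valid for m >= 128

    def add(self, value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # use 64 bits of the hash
        index = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1   # position of the leftmost 1-bit
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting)
            return self.m * math.log(self.m / zeros)
        return raw


# Hypothetical data: 50,000 distinct user ids, each seen twice.
user_ids = [f"user-{i}" for i in range(50_000)] * 2

precise = len(set(user_ids))       # exact, memory grows with distinct values

sketch = HyperLogLog(precision=12)  # 4,096 registers regardless of input size
for user_id in user_ids:
    sketch.add(user_id)
approximate = sketch.count()

relative_error = abs(approximate - precise) / precise
```

With 4,096 registers the expected standard error is about 1.04 / sqrt(4096), or roughly 1.6%, so the approximate count trades a small error for a bounded memory footprint.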

Kylin Ecosystem

Kylin Core:

The fundamental framework of the Kylin OLAP Engine comprises the Metadata Engine, Query Engine, Job Engine and Storage Engine, which together run the entire stack. It also includes a REST Server to service client requests.

Extensions:

Plugins to support additional functions and features

Integration:

Lifecycle Management Support to integrate with Job Scheduler, ETL, Monitoring and Alerting Systems

User Interface:

Allows third parties to build customized user interfaces on top of the Kylin core

Drivers:

ODBC and JDBC drivers to support different tools and products, such as Tableau