Cassandra adapter
For instructions on downloading and building Calcite, start with the tutorial.
Once you’ve managed to compile the project, you can return here to start querying Cassandra with Calcite. First, we need a model definition. The model gives Calcite the necessary parameters to create an instance of the Cassandra adapter. Note that while models can contain definitions of materializations, the adapter will attempt to automatically populate any materialized views defined in Cassandra.
A basic example of a model file is given below:
Note that you can also specify username
and password
keys along with
the host
and keyspace
if your server requires authentication.
Assuming this file is stored as model.json
, you can connect to
Cassandra via sqlline
as
follows:
sqlline
will now accept SQL queries which access your CQL tables.
However, you’re not restricted to issuing queries supported by
CQL.
Calcite allows you to perform complex operations such as aggregations
or joins. The adapter will attempt to compile the query into the most
efficient CQL possible by exploiting filtering and sorting directly in
Cassandra where possible.
For example, in the example dataset there is a CQL table named timeline
with username
as the partition key and time
as the clustering key.
We can issue a simple query to fetch the most recent tweet ID of the user by writing standard SQL:
While executing this query, the Cassandra adapter is able to recognize
that username
is the partition key and can be filtered by Cassandra.
It also recognizes the clustering key time
and pushes the ordering to
Cassandra as well.
The final CQL query given to Cassandra is below:
There is still significant work to do in improving the flexibility and performance of the adapter, but if you’re looking for a quick way to gain additional insights into data stored in Cassandra, Calcite should prove useful.