This release comes four months after 1.16.0. It includes more than 90 resolved
issues, comprising a large number of new features as well as general improvements
and bug-fixes. Among others:
This release comes three months after 1.15.0. It includes more than 80 resolved
issues, comprising a large number of new features as well as general improvements
and bug-fixes to Calcite core. Among others:
Calcite has been upgraded to use
Avatica 1.11.0,
which was recently released.
The Apache Calcite PMC
is pleased to announce
Apache Calcite release 1.15.0.
In this release, three months after 1.14.0, 50 issues are fixed by 22
contributors. Among more modest improvements and bug-fixes, here are
some features of note:
[CALCITE-707]
adds DDL commands to Calcite for the first time, including CREATE and DROP
commands for schemas, tables, foreign tables, views, and materialized views.
We know that DDL syntax is a matter of taste, so we added the extensions to a
new “server” module, leaving the “core” parser unchanged;
[CALCITE-2061]
allows dynamic parameters in the LIMIT and OFFSET clauses;
[CALCITE-1913]
refactors the JDBC adapter to make it easier to plug in a new SQL dialect;
[CALCITE-1616]
adds a data profiler, an algorithm that efficiently analyzes large data sets
with many columns, estimating the number of distinct values in columns and
groups of columns, and finding functional dependencies. The improved
statistics are used by the algorithm that designs summary tables for a
lattice.
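As a rough illustration of what such a profiler reports, the sketch below brute-forces distinct-value counts for every combination of columns in a tiny in-memory table and derives functional dependencies from those counts. It is not Calcite's algorithm, which is designed to avoid this kind of exhaustive enumeration on wide tables; the `profile` function and the sample rows are hypothetical.

```python
from itertools import combinations


def profile(columns, rows):
    """Return (cardinalities, dependencies) for a small in-memory table.

    cardinalities maps each tuple of column names to its number of
    distinct value combinations; dependencies lists (determinant, column)
    pairs where the determinant columns fix the value of the column.
    """
    index = {name: i for i, name in enumerate(columns)}
    cardinalities = {}
    for size in range(1, len(columns) + 1):
        for group in combinations(columns, size):
            distinct = {tuple(row[index[c]] for c in group) for row in rows}
            cardinalities[group] = len(distinct)

    dependencies = []
    for group, count in cardinalities.items():
        for column in columns:
            if column in group:
                continue
            # X -> c holds exactly when adding c to X creates no new
            # distinct combinations.
            extended = tuple(sorted(group + (column,), key=index.get))
            if cardinalities[extended] == count:
                dependencies.append((group, column))
    return cardinalities, dependencies


columns = ["deptno", "dname", "empno"]
rows = [
    (10, "Sales", 1),
    (10, "Sales", 2),
    (20, "Marketing", 3),
]
cardinalities, dependencies = profile(columns, rows)
```

In this toy table `deptno` determines `dname` but not `empno`, which is the kind of fact a summary-table designer can exploit: a group of columns that adds no new distinct combinations adds no new rows to a summary.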
Calcite now supports JDK 10 and Guava 23.0. (It continues to run on
JDK 7, 8 and 9, and on versions of Guava as early as 14.0.1. The default
version of Guava remains 19.0, the latest version compatible with JDK 7
and the Cassandra adapter’s dependencies.)
This release comes three months after 1.13.0. It includes 68 resolved issues with many improvements and bug fixes.
This release brings some big new features.
The GEOMETRY data type was added along with 35 associated functions as the start of support for Simple Feature Access.
There are also two new adapters.
The first is the Elasticsearch 5 adapter, which now exists in parallel with the previous Elasticsearch 2 adapter.
The second is an OS adapter, which exposes operating system metrics as relational tables.
ThetaSketch and HyperUnique support has also been added to the Druid adapter.
Several minor improvements have been added as well, including improved MATCH_RECOGNIZE support, quantified comparison predicates, and ARRAY and MULTISET support for UDFs.
This release comes three months after 1.12.0. It includes more than 75 resolved issues, comprising
a large number of new features as well as general improvements and bug-fixes.
First, Calcite has been upgraded to use
Avatica 1.10.0,
which was recently released.
Moreover, Calcite core includes improvements which aim at making it more powerful, stable and robust.
In addition to numerous bug-fixes, we have implemented a
new materialized view rewriting algorithm
and new metadata providers which
should prove useful for data processing systems relying on Calcite.
In addition, more progress has been made for the different adapters.
For instance, the Druid adapter now relies on
Druid 0.10.0 and
it can generate more efficient plans where most of the computation can be pushed to Druid,
e.g., using extraction functions.
The Apache Calcite PMC
is pleased to announce further growth of its sub-project, Avatica.
Avatica has been slowly growing inside of Calcite for many years (dating back
to Optiq-0.4.x!). The team has taken the next step to hoist the Avatica code
out of the Calcite repository into its own. The team felt like this was the
next logical step given the maturity of the project.
The previous “/avatica” directory in the Calcite repository has been removed, so
further contributions should be submitted against the new repository. The de-facto
repository can be found at the ASF’s Git hosting,
with a mirrored copy also available on GitHub at apache/calcite-avatica.
Calcite now supports JDK 9 and Guava 21.0. (It continues to run on
JDK 7 and 8, and on versions of Guava as early as 14.0.1. The default
version of Guava remains 19.0, due to the Cassandra adapter’s
dependencies, and the fact that Guava 21.0 requires JDK 8 or later.)
There are two new adapters:
The File adapter
can read files of various formats (such as CSV, JSON, zipped files,
and HTML) over various protocols (including file and HTTP). If
reading HTML files, it can extract data from nested <TABLE>
elements.
And there are continuing improvements in performance and stability of
the Druid adapter. (The Druid project now
embeds Calcite to provide SQL support,
and there has been cross-fertilization between the projects.)
To err is human, as the saying goes. If you mis-type the name of a
schema, table or column in a SQL statement, Calcite now
helps you correct it.
The error message indicates whether it was the schema,
table or column that was not found; if the mistake was just due to an
upper- or lower-case letter, it suggests the correct name.
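The behavior described above can be sketched roughly as follows. This is only an illustration of the idea, not Calcite's validator code: the function names and the wording of the message are hypothetical.

```python
def suggest_name(requested, available):
    """Return the stored name that differs from `requested` only in case, else None."""
    if requested in available:
        return None  # exact match: nothing to correct
    for name in available:
        if name.lower() == requested.lower():
            return name
    return None


def not_found_message(kind, requested, available):
    """Build a message naming what was missing and, if possible, a suggestion."""
    message = f"{kind} '{requested}' not found"
    suggestion = suggest_name(requested, available)
    if suggestion is not None:
        message += f"; did you mean '{suggestion}'?"
    return message
```

For example, `not_found_message("Table", "emps", ["EMPS", "DEPTS"])` names the object kind and proposes `EMPS`, while a name with no case-insensitive match produces the plain "not found" message.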
New SQL syntax and functions:
HOP, TUMBLE and SESSION functions in the GROUP BY clause
allow you to aggregate over window types (especially useful for
streaming queries);
Experimental support for the MATCH_RECOGNIZE clause for
Complex-Event Processing (CEP);
New YEAR, MONTH, WEEK, DAYOFYEAR, DAYOFMONTH, DAYOFWEEK,
HOUR, MINUTE, SECOND, DATABASE, IFNULL, and USER
functions to comply with the ODBC/JDBC standard. Also, EXTRACT now
allows the corresponding time-unit arguments.
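To make the window types concrete, here is a minimal sketch of how a tumbling window and a hopping window assign a timestamp to groups. It assumes integer timestamps and windows aligned to zero, which may not match Calcite's alignment rules; the function names are hypothetical, and SESSION windows (which depend on gaps between neighboring rows) are not covered.

```python
def tumble_start(ts, size):
    """Start of the fixed-size, non-overlapping window that contains ts."""
    return ts - ts % size


def hop_starts(ts, slide, size):
    """Starts of all windows of length `size`, opened every `slide`, that contain ts."""
    start = ts - ts % slide  # latest window that has opened by time ts
    starts = []
    # A window [start, start + size) contains ts while start > ts - size.
    while start > ts - size:
        starts.append(start)
        start -= slide
    return sorted(starts)


def count_by_tumbling_window(timestamps, size):
    """Count rows per tumbling window, like COUNT(*) grouped by a tumbling window."""
    counts = {}
    for ts in timestamps:
        key = tumble_start(ts, size)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

A row belongs to exactly one tumbling window but, when the slide is smaller than the size, to several hopping windows, which is why a hopping aggregate can count the same row more than once.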
Nearly three months after the previous release, there is a
long list of improvements and bug-fixes,
many of them making planner rules smarter. The following are some of
the more important ones.
Several adapters have improvements:
The JDBC adapter can now push down DML (INSERT, UPDATE, DELETE),
windowed aggregates (OVER), IS NULL and IS NOT NULL operators.
The Cassandra adapter now supports authentication.
Several key bug-fixes in the Druid adapter.
For correlated and uncorrelated sub-queries, we generate more
efficient plans (for example, in some correlated queries we no longer
require a sub-query to generate the values of the correlating
variable), can now handle multiple correlations, and have also fixed a
few correctness bugs.
New SQL syntax:
CROSS APPLY and OUTER APPLY;
MINUS as a synonym for EXCEPT;
an AS JSON option for the EXPLAIN command;
compound identifiers in the target list of INSERT, allowing you to
insert into individual fields of record-valued columns (or column
families if you are using the Apache Phoenix adapter).
A variety of new and extended built-in functions: CONVERT, LTRIM,
RTRIM, 3-parameter LOCATE and POSITION, RAND, RAND_INTEGER,
and SUBSTRING applied to binary types.
There are minor but potentially breaking API changes in
[CALCITE-1519]
(interface SubqueryConverter becomes SubQueryConverter and some
similar changes in the case of classes and methods) and
[CALCITE-1530]
(rename Shuttle to Visitor, and create a new class Visitor<R>).
See the cases for more details.
This release comes shortly after 1.9.0. It includes mainly bug fixes for the core and
Druid adapter. For the latter, we fixed an
important issue that
prevented us from handling time dimensions consistently in different time zones.
This release includes extensions and fixes for the Druid adapter. New features were
added, such as the capability to
recognize and translate Timeseries and TopN Druid queries.
Moreover, this release contains multiple bug fixes over the initial implementation of the
adapter. It is worth mentioning that most of these fixes were contributed by Druid developers,
which demonstrates the good reception of the adapter by that community.
We also added support for
SELECT without FROM
(equivalent to the VALUES clause, and widely used in MySQL and PostgreSQL),
and added a
conformance
parameter to allow you to selectively enable this and other SQL features.
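The equivalence between the two forms can be seen in any engine that accepts both. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in (an assumption made for the sake of a self-contained example); it does not exercise Calcite or its conformance setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SELECT with no FROM clause evaluates its expressions once.
select_rows = conn.execute("SELECT 1 + 1, 'calcite'").fetchall()

# The equivalent single-row VALUES clause.
values_rows = conn.execute("VALUES (1 + 1, 'calcite')").fetchall()

conn.close()
```

Both statements produce the same single row, which is why an engine can treat one as shorthand for the other.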
And, as usual, there are a couple of dozen bug-fixes and enhancements to
planner rules and APIs.
A new Apache Calcite adapter allows you to access
Apache Cassandra via industry-standard SQL.
You can map a Cassandra keyspace into Calcite as a schema, Cassandra
CQL tables as tables, and execute SQL queries on them, which Calcite
converts into CQL.
Cassandra can define and maintain materialized views but the adapter
goes further: it can transparently rewrite a query to use a
materialized view even if the view is not mentioned in the query.
The Cassandra adapter is available as part of
Apache Calcite version 1.7.0,
which has just been released. Calcite also has
adapters
for CSV and JSON files, JDBC data sources, MongoDB, Spark and Splunk.
We have added
an adapter for
Apache Cassandra.
You can map a Cassandra keyspace into Calcite as a schema, Cassandra
CQL tables as tables, and execute SQL queries on them, which Calcite
converts into CQL.
Cassandra can define and maintain materialized views but the adapter
goes further: it can transparently rewrite a query to use a
materialized view even if the view is not mentioned in the query.
This release adds an
Oracle-compatibility mode.
If you add fun=oracle to your JDBC connect string, you get all of
the standard operators and functions plus Oracle-specific functions
DECODE, NVL, LTRIM, RTRIM, GREATEST and LEAST. We look
forward to adding more functions, and compatibility modes for other
databases, in future releases.
We’ve replaced our use of JUL (java.util.logging)
with SLF4J. SLF4J provides an API which Calcite can use
independent of the logging implementation. This ultimately provides additional
flexibility to users, allowing them to configure Calcite’s logging within their
own chosen logging framework. This work was done in
[CALCITE-669].
For users experienced with configuring JUL in Calcite previously, there are some
differences, as some of the JUL logging levels do not exist in SLF4J: FINE,
FINER, and FINEST, specifically. To deal with this, FINE was mapped
to SLF4J’s DEBUG level, while FINER and FINEST were mapped to SLF4J’s TRACE.
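For anyone translating an old JUL configuration, the mapping just described can be written down as a small lookup table. The helper below is a hypothetical convenience, not part of Calcite or SLF4J, and it deliberately returns nothing for levels the text above does not mention.

```python
# JUL levels with no SLF4J namesake, and the SLF4J level each was mapped to.
JUL_TO_SLF4J = {
    "FINE": "DEBUG",
    "FINER": "TRACE",
    "FINEST": "TRACE",
}


def slf4j_level(jul_level):
    """Return the SLF4J level for a remapped JUL level, or None if not listed."""
    return JUL_TO_SLF4J.get(jul_level.upper())
```

Because FINER and FINEST both collapse to TRACE, a configuration that previously distinguished between them will see the same output for either setting.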
The Apache Calcite project management committee (PMC) today announced the
appointment of Josh Elser
to the committee.
Josh has only been a committer for a few months, but has become a prominent
member of the Calcite project, and has taken leadership in several areas,
not least in discussing the future of Avatica.
As usual in this release, there are new SQL features, improvements to
planning rules and Avatica, and lots of bug fixes. We’ll spotlight a
couple of features that make it easier to handle complex queries.
[CALCITE-816]
allows you to represent sub-queries (EXISTS, IN and scalar) as
RexSubQuery,
a kind of expression in the relational algebra. Until
now, the sql-to-rel converter was burdened with expanding sub-queries,
and people creating relational algebra directly (or via
RelBuilder)
could only create ‘flat’ relational expressions. Now we have planner
rules to expand and de-correlate sub-queries.
Metadata is the fuel that powers query planning. It includes
traditional query-planning statistics such as cost and row-count
estimates, but also information such as which columns form unique
keys, and what predicates are known to apply to a relational
expression’s output rows. From the predicates we can deduce which
columns are constant, and following
[CALCITE-1023]
we can now remove constant columns from GROUP BY keys.
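A small sketch shows why the rewrite is safe: once a predicate pins a column to a single value, that column cannot split any group. The data, column names, and helper functions here are hypothetical, and the code imitates the effect of the rule rather than Calcite's implementation.

```python
from collections import Counter


def group_count(rows, keys):
    """COUNT(*) per group, keyed by the tuple of grouping-column values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def drop_constant_keys(keys, constants):
    """Remove grouping columns that are known to hold a single value."""
    return [k for k in keys if k not in constants]


rows = [
    {"deptno": 10, "job": "CLERK"},
    {"deptno": 10, "job": "CLERK"},
    {"deptno": 20, "job": "CLERK"},
    {"deptno": 20, "job": "ANALYST"},
]

# The predicate job = 'CLERK' makes `job` constant in the filtered rows.
filtered = [r for r in rows if r["job"] == "CLERK"]

full = group_count(filtered, ["deptno", "job"])
reduced = group_count(filtered, drop_constant_keys(["deptno", "job"], {"job"}))
```

Grouping by `deptno` alone yields the same groups and counts as grouping by `deptno, job`; the constant value can be added back to each output row afterwards.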
Metadata is often computed recursively, and it is hard to safely and
efficiently calculate metadata on a graph of RelNodes that is large,
frequently cyclic, and constantly changing.
[CALCITE-794]
introduces a context to each metadata call. That context can detect
cyclic metadata calls and produce a safe answer to the metadata
request. It will also allow us to add finer-grained caching and
further tune the metadata layer.
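The idea of a per-call context can be sketched as follows, under heavy simplification: the "context" is just the set of nodes whose metadata is currently being computed, the metadata is a row-count estimate that sums its inputs, and the safe answer on a cycle is an arbitrary default. None of these names or choices come from Calcite's API.

```python
def row_count(node, inputs, base_rows, active, default=1.0):
    """Estimate rows for `node`, returning `default` if the request is cyclic.

    inputs maps a node to its input nodes, base_rows gives estimates for
    leaf nodes, and active is the set of nodes with a request in progress.
    """
    if node in active:
        # We are already computing this node further up the call stack,
        # so recursing again would never terminate.
        return default
    active.add(node)
    try:
        children = inputs.get(node, [])
        if not children:
            return base_rows[node]
        return sum(
            row_count(child, inputs, base_rows, active, default)
            for child in children
        )
    finally:
        active.discard(node)


# A cyclic graph: "union" reads from "subset", which refers back to "union".
inputs = {"union": ["scan", "subset"], "subset": ["union"]}
base_rows = {"scan": 100.0}
active = set()
estimate = row_count("union", inputs, base_rows, active)
```

Without the `active` set the call above would recurse forever; with it, the cyclic branch contributes the default and the request completes.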
This is our first release as a top-level Apache project! Thanks to everyone who has contributed to it.
In addition to a large number of bug fixes and minor enhancements, this release includes major improvements to Avatica, planner rules, and RelBuilder.
Further, we built Piglet, a subset of the classic Hadoop language Pig. Pig is particularly interesting because it makes heavy use of nested multi-sets. You can follow this example to implement your own query language, and immediately take advantage of Calcite’s back-ends and optimizer rules.
On October 21st, 2015 the board of the
Apache Software Foundation
voted to establish Calcite as a top-level Apache project.
Describing itself as “the foundation for your next high-performance
database”, Calcite is a
framework for building data management systems.
Calcite includes a comprehensive implementation of relational algebra
and an extensible cost-based query optimizer. It also includes an
optional SQL parser and JDBC driver.
Calcite joined Apache as an incubator project in May, 2014. To
graduate from the incubator, projects have to prove that they can
create high quality releases, form a diverse community, and operate as
a meritocracy.
Calcite’s committers have delivered eight releases during incubation
(roughly one every two months) including the
milestone 1.0 release in January, 2015.
The project has become a key component in many high-performance
databases, including the
Apache Drill,
Apache Hive,
Apache Kylin and
Apache Phoenix open source projects,
and several commercial products.
In addition to a large number of bug fixes and minor enhancements,
this release includes improvements to
lattices and
materialized views,
and adds a
builder API
so that you can easily create relational algebra expressions.
Julian Hyde’s talk Apache Calcite: One planner fits all won
Best Lightning Talk
at the XLDB-2015 conference (with Eric Tschetter’s talk “Sketchy
Approximations”).
XLDB is an annual conference that brings together experts from
science, industry and academia to find practical solutions to problems
involving extremely large data sets.
As a result of winning Best Lightning Talk, Julian will get a 30
minute keynote speaking slot at XLDB-2016.
Calcite’s foundation is a comprehensive implementation of relational
algebra (together with transformation rules, cost model, and metadata)
but to create algebra expressions you had to master a complex API.
We’re solving this problem by introducing an
algebra builder,
a single class with all the methods you need to build any relational
expression.
We’re still working on the algebra builder, but plan to release it
with Calcite 1.4 (see
[CALCITE-748]).
The algebra builder will make some existing tasks easier (such as
writing planner rules), but will also enable new things, such as
writing applications directly on top of Calcite, or implementing
non-SQL query languages. These applications and languages will be able
to take advantage of Calcite’s existing back-ends (including
Hive-on-Tez, Drill, MongoDB, Splunk, Spark, JDBC data sources) and
extensive set of query-optimization rules.
If you have questions or comments, please post to the
mailing list.
There have been many changes to Avatica, hugely improving its coverage of the
JDBC API and overall robustness. A new provider, JdbcMeta, allows
you to remote an existing JDBC driver.
[CALCITE-606]
improves how the planner propagates traits such as collation and
distribution among relational expressions.
This Calcite release makes it possible to exploit physical properties
of relational expressions to produce more efficient plans, introducing
collation and distribution as traits, Exchange relational operator,
and several new forms of metadata.
We add experimental support for streaming SQL.
This release drops support for JDK 1.6; Calcite now requires 1.7 or
later.
We have introduced static create methods for many sub-classes of
RelNode. We strongly suggest that you use these rather than
calling constructors directly.
Since the previous release we have re-organized the code into the org.apache.calcite
namespace. To make migration of your code easier, we have described the
mapping from old to new class names
as an attachment to
[CALCITE-296].
The release adds SQL support for GROUPING SETS, EXTEND, UPSERT and sequences;
a remote JDBC driver;
improvements to the planner engine and built-in planner rules;
improvements to the algorithms that implement the relational algebra,
including an interpreter that can evaluate queries without compilation;
and fixes about 30 bugs.
A fairly minor release, and the last release before we rename all of the
packages and lots of classes, in what we expect to call 1.0. If you
have an existing application, it’s worth upgrading to this first,
before you move on to 1.0.
Several new features, including a heuristic rule to plan queries with
a large number of joins, a number of windowed aggregate functions, and
a new utility, SqlRun.
The official @ApacheCalcite
Twitter account pushes announcements about Calcite. If you give a talk about
Calcite, let us know and we'll tweet it out and add it to the news section
of the website.