HOWTO
Here’s some miscellaneous documentation about using Calcite and its various adapters.
- Building from a source distribution
- Building from git
- If you already have Apache Maven
- Running tests
- Running integration tests
- Contributing
- Getting started
- Setting up an IDE for contributing
- Tracing
- Debugging generated classes in IntelliJ
- CSV adapter
- MongoDB adapter
- Splunk adapter
- Implementing an adapter
- Advanced topics for developers
- Advanced topics for committers
- Merging pull requests (for Calcite committers)
- Set up PGP signing keys (for Calcite committers)
- Set up Maven repository credentials (for Calcite committers)
- Making a snapshot (for Calcite committers)
- Making a release (for Calcite committers)
- Cleaning up after a failed release attempt (for Calcite committers)
- Validate a release
- Get approval for a release via Apache voting process (for Calcite committers)
- Publishing a release (for Calcite committers)
- Publishing the web site (for Calcite committers)
Building from a source distribution
Prerequisite is Java (JDK 8, 9, 10 or 11) on your path.
Unpack the source distribution .tar.gz file, cd to the root directory of the unpacked source, then build using the included Maven wrapper (for example, ./mvnw install).
Running tests describes how to run more or fewer tests.
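The JDK prerequisite can be checked before building. The sketch below is not part of Calcite: java_major_version and is_supported_jdk are hypothetical helper names, and the version strings are assumed to follow the usual java -version formats (1.8.0_201 for JDK 8, 11.0.2 for JDK 11).

```shell
# Hypothetical helpers, not part of the Calcite build.

# Prints the major version for a Java version string such as 1.8.0_201 or 11.0.2.
java_major_version() {
  case "$1" in
    1.*) rest=${1#1.}; echo "${rest%%.*}" ;;   # legacy scheme: 1.8.0_201 -> 8
    *)   echo "${1%%[!0-9]*}" ;;               # modern scheme: 11.0.2 -> 11
  esac
}

# Succeeds when the version string belongs to JDK 8, 9, 10 or 11.
is_supported_jdk() {
  case "$(java_major_version "$1")" in
    8|9|10|11) return 0 ;;
    *)         return 1 ;;
  esac
}

if is_supported_jdk 1.8.0_201; then echo "JDK 8 is supported"; fi
```

Pass in the version string reported by your own JDK; anything outside the listed versions is rejected.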
Building from git
Prerequisites are git and Java (JDK 8, 9, 10 or 11) on your path.
Create a local copy of the GitHub repository, cd to its root directory, then build using the included Maven wrapper (for example, ./mvnw install).
Calcite includes some machine-generated code. By default, it is regenerated on every build, which has the negative side effect of recompiling the entire project even when the non-machine-generated code has not changed. To make sure incremental compilation still works as intended, provide the skipGenerate command line option with your Maven command. If you invoke the clean lifecycle phase, you must not specify the skipGenerate option, because the generated code that the build needs would then not be regenerated and recompiled, and the build would fail.
Running tests describes how to run more or fewer tests.
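The rule about not combining clean with skipGenerate can be guarded against in a wrapper script. This is only a sketch: check_maven_args is a hypothetical helper, and because the exact spelling of the option is not shown here, it matches any argument that contains the text skipGenerate.

```shell
# Hypothetical guard, not part of the Calcite build.
# Fails when the argument list contains both the clean phase and a skipGenerate option.
check_maven_args() {
  has_clean=no
  has_skip=no
  for arg in "$@"; do
    case "$arg" in
      clean)          has_clean=yes ;;
      *skipGenerate*) has_skip=yes ;;
    esac
  done
  if [ "$has_clean" = yes ] && [ "$has_skip" = yes ]; then
    echo "do not combine clean with skipGenerate" >&2
    return 1
  fi
  return 0
}

check_maven_args install && echo "arguments look fine"
```

A wrapper could call check_maven_args "$@" first and only then hand the same arguments to ./mvnw.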
If you already have Apache Maven
If you have already installed Maven and it is on your path, then you can use mvn rather than ./mvnw in commands. You need Maven version 3.5.2 or later.
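The choice between mvn and ./mvnw can be sketched as a version comparison. The helper names below are hypothetical, and the comparison relies on sort -V (version sort), which is assumed to be available in your sort implementation.

```shell
# Hypothetical helpers, not part of the Calcite build.

# Succeeds when version $1 is at least version $2 (uses sort -V for version ordering).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints mvn when the given Maven version is 3.5.2 or later, otherwise ./mvnw.
# An empty argument means that no installed Maven was found.
maven_command() {
  if [ -n "$1" ] && version_at_least "$1" 3.5.2; then
    echo mvn
  else
    echo ./mvnw
  fi
}

# Assumption: the first line of "mvn -v" looks like "Apache Maven 3.6.3 (...)", so the
# installed version could be extracted with something like:
#   mvn -v 2>/dev/null | sed -n 's/^Apache Maven \([0-9.]*\).*/\1/p'
maven_command 3.6.3
```

Falling back to ./mvnw when no version is given keeps the sketch safe on machines without Maven installed.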
Running tests
The test suite runs by default when you build, unless you specify -DskipTests (for example, ./mvnw -DskipTests clean install).
There are other options that control which tests are run, and in what environment, as follows.
- -Dcalcite.test.db=DB (where DB is h2, hsqldb, mysql, or postgresql) allows you to change the JDBC data source for the test suite. Calcite’s test suite requires a JDBC data source populated with the foodmart data set.
  - hsqldb, the default, uses an in-memory hsqldb database.
  - All others access a test virtual machine (see integration tests below). mysql and postgresql might be somewhat faster than hsqldb, but you need to populate it (i.e. provision a VM).
- -Dcalcite.debug prints extra debugging information to stdout.
- -Dcalcite.test.slow enables tests that take longer to execute. For example, there are tests that create virtual TPC-H and TPC-DS schemas in-memory and run tests from those benchmarks.
- -Dcalcite.test.splunk enables tests that run against Splunk. Splunk must be installed and running.
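The accepted values of calcite.test.db, and which of them need the test virtual machine, can be captured in a small script. This is a sketch with hypothetical helper names (test_db_flag, needs_test_vm); only the option name and its values come from the list above.

```shell
# Hypothetical helpers, not part of the Calcite build.

# Prints the -Dcalcite.test.db flag for a database name, defaulting to hsqldb.
test_db_flag() {
  db=${1:-hsqldb}
  case "$db" in
    h2|hsqldb|mysql|postgresql) echo "-Dcalcite.test.db=$db" ;;
    *) echo "unsupported calcite.test.db value: $db" >&2; return 1 ;;
  esac
}

# Succeeds when the database lives in the test VM (everything except in-memory hsqldb).
needs_test_vm() {
  [ "${1:-hsqldb}" != hsqldb ]
}

if needs_test_vm mysql; then
  echo "start the test VM before running with $(test_db_flag mysql)"
fi
```

The flag printed by test_db_flag can be appended to whichever Maven command you use to run the tests.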
Running integration tests
For testing Calcite’s external adapters, a test virtual machine should be used. The VM includes Cassandra, Druid, H2, HSQLDB, MySQL, MongoDB, and PostgreSQL.
The test VM requires 5 GiB of disk space and takes about 30 minutes to build.
Note: you can use calcite-test-dataset to populate your own database; however, it is recommended to use the test VM so that the test environment can be reproduced.
VM preparation
0) Install dependencies: Vagrant and VirtualBox
1) Clone https://github.com/vlsi/calcite-test-dataset.git at the same level as calcite repository. For instance:
Note: integration tests search for ../calcite-test-dataset or ../../calcite-test-dataset. You can specify full path via calcite.test.dataset system property.
2) Build and start the VM:
VM management
The test VM is provisioned by Vagrant, so the regular Vagrant commands vagrant up and vagrant halt should be used to start and stop the VM.
The connection strings for the different databases are listed in the calcite-test-dataset readme.
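The dataset lookup mentioned in the VM preparation note (../calcite-test-dataset, then ../../calcite-test-dataset, or an explicit path) can be sketched as follows. find_test_dataset is a hypothetical helper, its optional argument stands in for the calcite.test.dataset system property, and giving the explicit path precedence is an assumption rather than documented behavior.

```shell
# Hypothetical helper, not part of the Calcite build.
# Prints the first existing dataset directory, or fails if none is found.
# $1 (optional): explicit path, standing in for the calcite.test.dataset property.
find_test_dataset() {
  for candidate in "${1:-}" ../calcite-test-dataset ../../calcite-test-dataset; do
    if [ -n "$candidate" ] && [ -d "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  echo "calcite-test-dataset not found" >&2
  return 1
}

# Run from the calcite checkout; prints a hint when the dataset clone is missing.
find_test_dataset || echo "clone calcite-test-dataset next to the calcite repository"
```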
Suggested test flow
Note: test VM should be started before you launch integration tests. Calcite itself does not start/stop the VM.
Command line:
- Executing regular unit tests (does not require external data): no change. mvn test or mvn install.
- Executing all tests, for all the DBs: mvn verify -Pit. it stands for “integration-test”. mvn install -Pit works as well.
- Executing just tests for external DBs, excluding unit tests: mvn -Dtest=foo -DfailIfNoTests=false -Pit verify
- Executing just MongoDB tests: cd mongo; mvn verify -Pit
From within IDE:
- Executing regular unit tests: no change.
- Executing MongoDB tests: run MongoAdapterIT.java as usual (no additional properties are required)
- Executing MySQL tests: run JdbcTest and JdbcAdapterTest with setting -Dcalcite.test.db=mysql
- Executing PostgreSQL tests: run JdbcTest and JdbcAdapterTest with setting -Dcalcite.test.db=postgresql
Integration tests technical details
Tests with external data are executed at Maven’s integration-test phase.
We do not currently use pre-integration-test/post-integration-test, however we could use that in future.
The verification of build pass/failure is performed at the verify phase.
Integration tests should be named ...IT.java, so they are not picked up on unit test execution.
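The naming convention can be checked mechanically. The sketch below uses a hypothetical helper, is_integration_test, that classifies a file purely by its name; it does not reproduce how the Maven plugins select tests.

```shell
# Hypothetical helper, not part of the Calcite build.
# Succeeds when the file name follows the ...IT.java naming convention.
is_integration_test() {
  case "$1" in
    *IT.java) return 0 ;;
    *)        return 1 ;;
  esac
}

for f in MongoAdapterIT.java JdbcTest.java; do
  if is_integration_test "$f"; then
    echo "$f: integration test"
  else
    echo "$f: unit test"
  fi
done
```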
Contributing
See the developers guide.
Getting started
See the developers guide.
Setting up an IDE for contributing
Setting up IntelliJ IDEA
To set up IntelliJ IDEA, follow the standard steps for the installation of IDEA and set up one of the JDK versions currently supported by Calcite.
Start with building Calcite from the command line.
Go to File > Open… and open up Calcite’s pom.xml file.
When IntelliJ asks if you want to open it as a project or a file, select project.
Also, say yes when it asks if you want a new window.
IntelliJ’s Maven project importer should handle the rest.
There is a partially implemented IntelliJ code style configuration that you can import located on GitHub. It does not do everything needed to make Calcite’s style checker happy, but it does a decent amount of it. To import, go to Preferences > Editor > Code Style, click the gear next to “scheme”, then Import Scheme > IntelliJ IDEA Code Style XML.
Once the importer is finished, test the project setup.
For example, navigate to the method JdbcTest.testWinAgg with Navigate > Symbol and enter testWinAgg. Run testWinAgg by right-clicking and selecting Run (or the equivalent keyboard shortcut).
If you encounter an error while running JdbcTest.testWinAgg, run the following Maven command from the command line:
$ ./mvnw -DskipTests clean install
You should see "BUILD SUCCESS".
Once that is complete, proceed with running JdbcTest.testWinAgg.
Setting up NetBeans
From the main menu, select File > Open Project, navigate to the project (Calcite), which is shown with a small Maven icon, and open it. (See this tutorial for an example of how to open a Maven project.) Wait for NetBeans to finish importing all dependencies.
To ensure that the project is configured successfully, navigate to the method testWinAgg in org.apache.calcite.test.JdbcTest.
Right-click on the method and select Run Focused Test Method.
NetBeans will run a Maven process, and you should see in the command output window a line with Running org.apache.calcite.test.JdbcTest followed by "BUILD SUCCESS".
Tracing
To enable tracing, add the following flag to the java command line:
-Dcalcite.debug=true
This flag causes Calcite to print the Java code it generates (to execute queries) to stdout. It is especially useful if you are debugging mysterious problems like this:
Exception in thread "main" java.lang.ClassCastException: Integer cannot be cast to Long
at Baz$1$1.current(Unknown Source)
By default, Calcite uses the Log4j bindings for SLF4J. There is a provided configuration file, core/src/test/resources/log4j.properties, which outputs logging at the INFO level to the console.
You can modify the level for the rootLogger to increase verbosity or change the level
for a specific class if you so choose.
Debugging generated classes in IntelliJ
Calcite uses Janino to generate Java code. The generated classes can be debugged interactively (see the Janino tutorial).
To debug generated classes, set two system properties when starting the JVM:
- -Dorg.codehaus.janino.source_debugging.enable=true
- -Dorg.codehaus.janino.source_debugging.dir=C:\tmp (This property is optional; if not set, Janino will create temporary files in the system’s default location for temporary files, such as /tmp on Unix-based systems.)
After code is generated, either go into IntelliJ and mark the folder that contains generated temporary files as a generated sources root or sources root, or directly set the value of org.codehaus.janino.source_debugging.dir to an existing source root when starting the JVM.
CSV adapter
See the tutorial.
MongoDB adapter
First, download and install Calcite, and install MongoDB.
Note: you can use MongoDB from the integration test virtual machine described above.
Import MongoDB’s zipcode data set into MongoDB:
Log into MongoDB to check it’s there:
Connect using the mongo-model.json Calcite model:
Splunk adapter
To run the test suite and sample queries against Splunk, load Splunk’s tutorialdata.zip data set as described in the Splunk tutorial.
(This step is optional, but it provides some interesting data for the sample queries. It is also necessary if you intend to run the test suite, using -Dcalcite.test.splunk=true.)
Implementing an adapter
New adapters can be created by implementing CalcitePrepare.Context:
Testing adapter in Java
The example below shows how a SQL query can be submitted to CalcitePrepare with a custom context (AdapterContext in this case). Calcite prepares and implements the query execution, using the resources provided by the Context. CalcitePrepare.PrepareResult provides access to the underlying enumerable and methods for enumeration. The enumerable itself can naturally be some adapter-specific implementation.
Advanced topics for developers
The following sections might be of interest if you are adding features to particular parts of the code base. You don’t need to understand these topics if you are just building from source and running tests.
JavaTypeFactory
When Calcite compares types (instances of RelDataType), it requires them to be the same object. If there are two distinct type instances that refer to the same Java type, Calcite may fail to recognize that they match. It is recommended to:
- Use a single instance of JavaTypeFactory within the Calcite context;
- Store the types so that the same object is always returned for the same type.
Rebuilding generated Protocol Buffer code
Calcite’s Avatica Server component supports RPC serialization using Protocol Buffers. In the context of Avatica, Protocol Buffers can generate a collection of messages defined by a schema. The library itself can parse old serialized messages using a new schema. This is highly desirable in an environment where the client and server are not guaranteed to have the same version of objects.
Typically, the code generated by the Protocol Buffers library doesn’t need to be regenerated on every build, only when the schema changes.
First, install Protobuf 3.0:
Then, re-generate the compiled code:
Advanced topics for committers
The following sections are of interest to Calcite committers and in particular release managers.
Merging pull requests (for Calcite committers)
Ask the contributor to squash the PR into a single commit with a message starting with [CALCITE-XXX], where XXX is the associated JIRA issue number. You can take this step yourself if needed. The contributor’s name should also be added in parentheses at the end of the first line of the commit message. Finally, after a couple of newlines, make sure the message contains “Close apache/calcite#YYY”, where YYY is the GitHub issue number. This is important as it is the only way we have to close issues on GitHub without asking the originator to do so manually. When the PR has been merged and pushed, be sure to mark the JIRA issue as resolved (do not use closed, as that is reserved for release time).
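The commit message conventions can be checked before pushing. The following is a sketch rather than an official tool: check_commit_message is a hypothetical helper, it assumes the contributor name in parentheses is always required, and the issue numbers in the example are made up.

```shell
# Hypothetical helper, not part of the Calcite tooling.
# Succeeds when the commit message in $1 follows the conventions described above.
check_commit_message() {
  # First line: "[CALCITE-<number>] summary (Contributor Name)"
  printf '%s\n' "$1" | head -n 1 \
    | grep -Eq '^\[CALCITE-[0-9]+] .*\([^()]+\)$' || return 1
  # Some later line: "Close apache/calcite#<number>"
  printf '%s\n' "$1" | tail -n +2 \
    | grep -Eq 'Close apache/calcite#[0-9]+' || return 1
  return 0
}

# Example with made-up issue numbers.
message='[CALCITE-1234] Fix the widget (Jane Doe)

Close apache/calcite#567'
check_commit_message "$message" && echo "commit message looks fine"
```

To check the most recent commit, the message can be obtained with git log -1 --format=%B and passed to the function.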
Set up PGP signing keys (for Calcite committers)
Follow instructions here to create a key pair. (On macOS, I did brew install gpg and gpg --gen-key.)
Add your public key to the KEYS file by following instructions in the KEYS file.
(The KEYS file is not present in the git repo or in a release tarball because that would be redundant.)
Set up Maven repository credentials (for Calcite committers)
Follow the instructions here to add your credentials to your Maven configuration.
Making a snapshot (for Calcite committers)
Before you start:
- Set up signing keys as described above.
- Make sure you are using JDK 8.
- Make sure build and tests succeed with -Dcalcite.test.db=hsqldb (the default)
When the dry-run has succeeded, change install to deploy.
Making a release (for Calcite committers)
Before you start:
- Set up signing keys as described above.
- Make sure you are using JDK 8 (not 9 or 10).
- Check that README and site/_docs/howto.md have the correct version number.
- Check that NOTICE has the current copyright year.
- Set version.major and version.minor in pom.xml.
- Make sure build and tests succeed, including with -P it,it-oracle.
- Make sure that ./mvnw javadoc:javadoc javadoc:test-javadoc succeeds (i.e. gives no errors; warnings are OK)
- Generate a report of vulnerabilities that occur among dependencies, using -Ppedantic; if you like, run again with -DfailBuildOnCVSS=8 to see whether serious vulnerabilities exist. Report to private@calcite.apache.org if new critical vulnerabilities are found among dependencies.
- Make sure that ./mvnw apache-rat:check succeeds. (It will be run as part of the release, but it’s better to trouble-shoot early.)
- Decide the supported configurations of JDK, operating system and Guava. These will probably be the same as those described in the release notes of the previous release. Document them in the release notes. To test Guava version x.y, specify -Dguava.version=x.y
- Optional extra tests:
  - -Dcalcite.test.db=mysql
  - -Dcalcite.test.db=hsqldb
  - -Dcalcite.test.slow
  - -Dcalcite.test.mongodb
  - -Dcalcite.test.splunk
- Trigger a Coverity scan by merging the latest code into the julianhyde/coverity_scan branch, and when it completes, make sure that there are no important issues.
- Add release notes to site/_docs/history.md. Include the commit history, and say which versions of Java, Guava and operating systems the release is tested against.
- Make sure that every “resolved” JIRA case (including duplicates) has a fix version assigned (most likely the version we are just about to release)
Smoke-test sqlline with Spatial and Oracle function tables:
Create a release branch named after the release, e.g. branch-1.1, and push it to Apache.
We will use the branch for the entire release process. Meanwhile, we do not allow commits to the master branch. After the release is final, we can use git merge --ff-only to append the changes on the release branch onto the master branch. (Apache does not allow reverts to the master branch, which makes it difficult to clean up the kind of messy commits that inevitably happen while you are trying to finalize a release.)
Now, set up your environment and do a dry run. The dry run will not commit any changes back to git and gives you the opportunity to verify that the release process will complete as expected.
If any of the steps fail, clean up (see below), fix the problem, and start again from the top.
Check the artifacts.
Note that when performing the dry run, SNAPSHOT will appear in any file or directory names given below. The version will be automatically changed when performing the release for real.
- In the target directory should be these 3 files, among others:
  - apache-calcite-X.Y.Z-src.tar.gz
  - apache-calcite-X.Y.Z-src.tar.gz.asc
  - apache-calcite-X.Y.Z-src.tar.gz.sha256
- Note that the file names start apache-calcite-.
- In the source distro .tar.gz (currently there is no binary distro), check that all files belong to a directory called apache-calcite-X.Y.Z-src.
- That directory must contain files NOTICE, LICENSE, README, README.md
  - Check that the version in README is correct
  - Check that the copyright year in NOTICE is correct
- Make sure that there is no KEYS file in the source distros
- In each .jar (for example core/target/calcite-core-X.Y.Z.jar and mongodb/target/calcite-mongodb-X.Y.Z-sources.jar), check that the META-INF directory contains DEPENDENCIES, LICENSE, NOTICE and git.properties
- In core/target/calcite-core-X.Y.Z.jar, check that org-apache-calcite-jdbc.properties is present and does not contain un-substituted ${...} variables
- Check PGP, per this
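Two of the artifact checks above lend themselves to scripting: the presence of the three source-distribution files, and the search for un-substituted ${...} variables. The helpers below are hypothetical and only a sketch; the properties file is assumed to have been extracted from the jar beforehand.

```shell
# Hypothetical helpers, not part of the Calcite release tooling.

# Succeeds when the three source-distribution files for version $2 exist in directory $1.
check_source_artifacts() {
  base="apache-calcite-$2-src.tar.gz"
  for f in "$base" "$base.asc" "$base.sha256"; do
    if [ ! -f "$1/$f" ]; then
      echo "missing: $1/$f" >&2
      return 1
    fi
  done
  return 0
}

# Succeeds when file $1 contains a literal "${", i.e. a variable that was not substituted.
has_unsubstituted_variables() {
  grep -Fq '${' "$1"
}

# Example calls, with X.Y.Z replaced by the real version:
#   check_source_artifacts target X.Y.Z
#   has_unsubstituted_variables org-apache-calcite-jdbc.properties && echo "needs fixing"
```

During a dry run the version argument would carry the SNAPSHOT suffix, as noted above.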
Now, remove the -DdryRun flag and run the release for real.
For this step you’ll have to add the Apache servers to ~/.m2/settings.xml.
Verify the staged artifacts in the Nexus repository:
- Go to https://repository.apache.org/ and login
- Under Build Promotion, click Staging Repositories
- In the Staging Repositories tab there should be a line with profile org.apache.calcite
- Navigate through the artifact tree and make sure the .jar, .pom, .asc files are present
- Check the box in the first column of the row, and press the ‘Close’ button to publish the repository at https://repository.apache.org/content/repositories/orgapachecalcite-1000 (or a similar URL)
Upload the artifacts via subversion to a staging area, https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-X.Y.Z-rcN:
Cleaning up after a failed release attempt (for Calcite committers)
Validate a release
Get approval for a release via Apache voting process (for Calcite committers)
Release vote on dev list
After vote finishes, send out the result:
Use the Apache URL shortener to generate shortened URLs for the vote proposal and result emails. Examples: s.apache.org/calcite-1.2-vote and s.apache.org/calcite-1.2-result.
Publishing a release (for Calcite committers)
After a successful release vote, we need to push the release out to mirrors and complete a few other tasks.
Choose a release date. This is based on the time when you expect to announce the release. This is usually a day after the vote closes. Remember that UTC date changes at 4pm Pacific time.
In JIRA, search for all issues resolved in this release, and do a bulk update changing their status to “Closed”, with a change comment “Resolved in release X.Y.Z (YYYY-MM-DD)” (fill in release number and date appropriately). Uncheck “Send mail for this update”.
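The bulk-update comment can be generated so that the wording stays identical across releases. closing_comment is a hypothetical helper, and the version and date in the example are placeholders, not a real release.

```shell
# Hypothetical helper, not part of the Calcite release tooling.
# Prints the JIRA change comment for release version $1 and release date $2 (YYYY-MM-DD).
closing_comment() {
  echo "Resolved in release $1 ($2)"
}

# Placeholder version and date.
closing_comment 1.2.3 2000-01-31

# The current UTC date, if that is the chosen release date, is printed by:
#   date -u +%Y-%m-%d
```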
Promote the staged Nexus artifacts.
- Go to https://repository.apache.org/ and login
- Under “Build Promotion” click “Staging Repositories”
- In the line with “orgapachecalcite-xxxx”, check the box
- Press “Release” button
Check the artifacts into svn.
Svnpubsub will publish to the release repo and propagate to the mirrors within 24 hours.
If there are now more than 2 releases, clear out the oldest ones:
The old releases will remain available in the release archive.
You should receive an email from the Apache Reporter Service. Make sure to add the version number and date of the latest release at the site linked to in the email.
Add a release note by copying site/_posts/2016-10-12-release-1.10.0.md, generate the javadoc using ./mvnw site, publish the site, and check that it appears in the contents in news.
Merge the release branch back into master (e.g. git merge --ff-only branch-X.Y).
After 24 hours, announce the release by sending an email to announce@apache.org. You can use the 1.10.0 announcement as a template. Be sure to include a brief description of the project.
Publishing the web site (for Calcite committers)
See instructions in site/README.md.