This document describes the Cassandra Query Language (CQL) version 3. CQL v3 is not backward compatible with CQL v2 and differs from it in numerous ways. Note that this document describes the latest version of the language; the changes section lists the differences between the successive versions of CQL v3.
CQL v3 offers a model very close to SQL in the sense that data is put in tables containing rows of columns. For that reason, when used in this document, these terms (tables, rows and columns) have the same definition as they have in SQL. Note, however, that they do not refer to the concept of rows and columns found in the internal implementation of Cassandra and in the Thrift and CQL v2 APIs.
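To aid in specifying the CQL syntax, we will use the following conventions in this document:
- Language rules will be given in a BNF-like notation:
      <start> ::= TERMINAL <non-terminal1> <non-terminal1>
- Non-terminal symbols will have <angle brackets>.
- As additional shortcut notations to BNF, we’ll use traditional regular expression symbols (?, + and *) to signify that a given symbol is optional and/or can be repeated. We’ll also allow parentheses to group symbols and the [<characters>] notation to represent any one of <characters>.
- The grammar is provided for documentation purposes and leaves some minor details out. For instance, the last column definition in a CREATE TABLE statement is optional but supported if present, even though the grammar provided in this document suggests it is not supported.
- Sample code will be provided in a code block:
      SELECT sample_usage FROM cql;
- References to keywords or pieces of CQL code in running text will be shown in a fixed-width font.
The CQL language uses identifiers (or names) to identify tables, columns and other objects. An identifier is a token matching the regular expression [a-zA-Z][a-zA-Z0-9_]*.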
A number of such identifiers, like SELECT or WITH, are keywords. They have a fixed meaning for the language and most are reserved. The list of those keywords can be found in Appendix A.
Identifiers and (unquoted) keywords are case insensitive. Thus SELECT is the same as select or sElEcT, and myId is the same as myid or MYID, for instance. A convention often used (in particular by the samples of this documentation) is to use upper case for keywords and lower case for other identifiers.
There is a second kind of identifier, called a quoted identifier, defined by enclosing an arbitrary sequence of characters in double-quotes ("). Quoted identifiers are never keywords. Thus "select" is not a reserved keyword and can be used to refer to a column, while select would raise a parse error. Also, unlike unquoted identifiers and keywords, quoted identifiers are case sensitive ("My Quoted Id" is different from "my quoted id"). A fully lowercase quoted identifier that matches [a-zA-Z][a-zA-Z0-9_]* is equivalent to the unquoted identifier obtained by removing the double-quotes (so "myid" is equivalent to myid and to myId but different from "myId"). Inside a quoted identifier, the double-quote character can be repeated to escape it, so "foo "" bar" is a valid identifier.
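The identifier rules above can be sketched in Python as follows. This is only an illustration of the case-folding and quoting rules, not how Cassandra itself parses identifiers; the helper name canonical_identifier is hypothetical, and the sketch does not reject reserved keywords.

```python
import re

# Unquoted identifiers: a letter followed by letters, digits or underscores.
UNQUOTED = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def canonical_identifier(token: str) -> str:
    """Return the name an identifier token refers to (hypothetical helper)."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        inner = token[1:-1]
        # A double-quote inside a quoted identifier must be doubled.
        if '"' in inner.replace('""', ''):
            raise ValueError(f"unescaped double-quote in {token!r}")
        # Quoted: case is preserved and "" stands for one literal double-quote.
        return inner.replace('""', '"')
    if UNQUOTED.fullmatch(token):
        # Unquoted: case insensitive, so fold to lower case.
        return token.lower()
    raise ValueError(f"not a valid identifier: {token!r}")
```

Two tokens refer to the same object exactly when this function returns the same string for both, so myId, MYID and "myid" coincide while "myId" does not.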
CQL defines the following kinds of constants: strings, integers, floats, booleans, uuids and blobs:
- A string constant is an arbitrary sequence of characters enclosed by single-quotes ('). One can include a single-quote in a string by repeating it, e.g. 'It''s raining today'. Those are not to be confused with quoted identifiers that use double-quotes.
- An integer constant is defined by '-'?[0-9]+.
- A float constant is defined by '-'?[0-9]+('.'[0-9]*)?([eE][+-]?[0-9]+)?. On top of that, NaN and Infinity are also float constants.
- A boolean constant is either true or false up to case-insensitivity (i.e. True is a valid boolean constant).
- A UUID constant is defined by hex{8}-hex{4}-hex{4}-hex{4}-hex{12} where hex is a hexadecimal character, i.e. [0-9a-fA-F], and {4} is the number of such characters.
- A blob constant is a hexadecimal number defined by 0[xX](hex)+ where hex is a hexadecimal character, i.e. [0-9a-fA-F].
For how these constants are typed, see the data types section.
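As an illustration, the constant definitions above translate fairly directly into regular expressions. The Python sketch below is an assumption-laden reading of them, not Cassandra's lexer: the function name classify_constant is hypothetical, the float exponent is read as one or more digits, NaN and Infinity are matched with exactly that capitalization, and integers are tested before floats because every integer also matches the float pattern.

```python
import re

HEX = r"[0-9a-fA-F]"

# Order matters: the first pattern that matches the whole token wins.
PATTERNS = [
    ("uuid", re.compile(rf"{HEX}{{8}}-{HEX}{{4}}-{HEX}{{4}}-{HEX}{{4}}-{HEX}{{12}}")),
    ("blob", re.compile(rf"0[xX]{HEX}+")),
    ("integer", re.compile(r"-?[0-9]+")),
    ("float", re.compile(r"-?[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?|NaN|Infinity")),
    ("boolean", re.compile(r"true|false", re.IGNORECASE)),
    # A string is single-quoted; an embedded quote is written twice.
    ("string", re.compile(r"'([^']|'')*'")),
]


def classify_constant(token: str):
    """Return the kind of constant the token denotes, or None."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return kind
    return None
```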
A comment in CQL is a line beginning with either double dashes (--) or a double slash (//).
Multi-line comments are also supported through enclosure within /* and */ (but nesting is not supported).
    -- This is a comment
    // This is a comment too
    /* This is
       a multi-line comment */
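A rough way to picture the comment rules is a filter that removes comments from a script before it is sent to the server. The Python sketch below is a simplification under stated assumptions: strip_comments is a hypothetical helper, it also removes trailing comments that follow a statement on the same line, and it does not recognize string literals, so a -- or // inside a quoted string would be wrongly treated as a comment.

```python
import re

# Single-line comments run to the end of the line; a multi-line comment ends
# at the first */ encountered, which is why nesting does not work.
COMMENT = re.compile(r"--[^\n]*|//[^\n]*|/\*.*?\*/", re.DOTALL)


def strip_comments(cql: str) -> str:
    """Remove CQL comments from a script (ignores string literals)."""
    return COMMENT.sub("", cql)
```

For instance, strip_comments("/* a /* b */ c */") leaves " c */" behind, which shows why nested comments are not supported.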
CQL consists of statements. As in SQL, these statements can be divided into 3 categories:
- Data definition statements, which set and change the way data is stored.
- Data manipulation statements, which change data.
- Queries, which look up data.
All statements end with a semicolon (;), but that semicolon can be omitted when dealing with a single statement. The supported statements are described in the following sections. When describing the grammar of said statements, we will reuse the non-terminal symbols defined below:
    <identifier> ::= any quoted or unquoted identifier, excluding reserved keywords
    <tablename> ::= (<identifier> '.')? <identifier>
    <string> ::= a string constant
    <integer> ::= an integer constant
    <float> ::= a float constant
    <number> ::= <integer> | <float>
    <uuid> ::= a uuid constant
    <boolean> ::= a boolean constant
    <hex> ::= a blob constant
    <constant> ::= <string> | <number> | <uuid> | <boolean> | <hex>
    <variable> ::= '?' | ':' <identifier>
    <term> ::= <constant> | <collection-literal> | <variable> | <function> '(' (<term> (',' <term>)*)? ')'
    <collection-literal> ::= <map-literal> | <set-literal> | <list-literal>
    <map-literal> ::= '{' ( <term> ':' <term> ( ',' <term> ':' <term> )* )? '}'
    <set-literal> ::= '{' ( <term> ( ',' <term> )* )? '}'
    <list-literal> ::= '[' ( <term> ( ',' <term> )* )? ']'
    <function> ::= <ident>
    <properties> ::= <property> (AND <property>)*
    <property> ::= <identifier> '=' ( <identifier> | <constant> | <map-literal> )
Please note that not every possible production of the grammar above will be valid in practice. Most notably, <variable> and nested <collection-literal> are currently not allowed inside <collection-literal>.
A <variable> can be either anonymous (a question mark (?)) or named (an identifier preceded by :). Both declare bind variables for prepared statements. The only difference between an anonymous and a named variable is that a named one will be easier to refer to (how exactly depends on the client driver used).
The <properties> production is used by statements that create and alter keyspaces and tables. Each <property> is either a simple one, in which case it just has a value, or a map one, in which case its value is a map grouping sub-options. The following will refer to one or the other as the kind (simple or map) of the property.
A <tablename> will be used to identify a table. This is an identifier representing the table name that can be preceded by a keyspace name. The keyspace name, if provided, makes it possible to identify a table in a keyspace other than the currently active one (the currently active keyspace is set through the USE statement).
For supported <function>, see the section on functions.
CQL supports prepared statements. A prepared statement is an optimization that allows a query to be parsed only once but executed multiple times with different concrete values.
In a statement, each time a column value is expected (in the data manipulation and query statements), a <variable> (see above) can be used instead. A statement with bind variables must then be prepared. Once it has been prepared, it can be executed by providing concrete values for the bind variables. The exact procedure to prepare a statement and execute a prepared statement depends on the CQL driver used and is beyond the scope of this document.
In addition to providing column values, bind markers may be used to provide values for LIMIT, TIMESTAMP, and TTL clauses. If anonymous bind markers are used, the names for the query parameters will be [limit], [timestamp], and [ttl], respectively.
Syntax:
<create-keyspace-stmt> ::= CREATE KEYSPACE (IF NOT EXISTS)? <identifier> WITH <properties>
Sample:
    CREATE KEYSPACE Excelsior
               WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};

    CREATE KEYSPACE Excalibur
               WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 3}
                AND durable_writes = false;
The CREATE KEYSPACE statement creates a new top-level keyspace. A keyspace is a namespace that defines a replication strategy and some options for a set of tables. Valid keyspace names are identifiers composed exclusively of alphanumerical characters and whose length is less than or equal to 32. Note that as identifiers, keyspace names are case insensitive: use a quoted identifier for case sensitive keyspace names.
The supported <properties> for CREATE KEYSPACE are:
name | kind | mandatory | default | description |
---|---|---|---|---|
replication | map | yes | | The replication strategy and options to use for the keyspace. |
durable_writes | simple | no | true | Whether to use the commit log for updates on this keyspace (disable this option at your own risk!). |
The replication <property> is mandatory. It must at least contain the 'class' sub-option, which defines the replication strategy class to use. The rest of the sub-options depend on that replication strategy class. By default, Cassandra supports the following 'class' values:
- 'SimpleStrategy': A simple strategy that defines a single replication factor for the whole cluster. The only sub-option supported is 'replication_factor', which defines that replication factor and is mandatory.
- 'NetworkTopologyStrategy': A replication strategy that allows the replication factor to be set independently for each data center. The rest of the sub-options are key-value pairs where the key is the name of a data center and the value is the replication factor for that data center.
- 'OldNetworkTopologyStrategy': A legacy replication strategy. You should avoid this strategy for new keyspaces and prefer 'NetworkTopologyStrategy'.
Attempting to create an already existing keyspace will return an error unless the IF NOT EXISTS option is used. If it is used, the statement will be a no-op if the keyspace already exists.
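The rules for the replication sub-options described above can be summarized as a small check. The Python sketch below is illustrative only: replication_problems is a hypothetical helper, it covers just the two strategies whose sub-options are spelled out above, and it treats any other class as unchecked rather than invalid.

```python
def replication_problems(replication):
    """Return a list of problems found in a replication map (empty if none)."""
    options = dict(replication)
    strategy = options.pop("class", None)
    if strategy is None:
        return ["the 'class' sub-option is mandatory"]

    problems = []
    if strategy == "SimpleStrategy":
        # Exactly one sub-option is supported, and it is mandatory.
        if "replication_factor" not in options:
            problems.append("'replication_factor' is mandatory for SimpleStrategy")
        for name in options:
            if name != "replication_factor":
                problems.append(f"unsupported sub-option {name!r} for SimpleStrategy")
    elif strategy == "NetworkTopologyStrategy":
        # Every remaining sub-option maps a data center name to its factor.
        for datacenter, factor in options.items():
            if not isinstance(factor, int):
                problems.append(f"replication factor for {datacenter!r} must be an integer")
    return problems
```

Both sample maps from the CREATE KEYSPACE statements above pass this check with no problems reported.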
Syntax:
<use-stmt> ::= USE <identifier>
Sample:
USE myApp;
The USE statement takes an existing keyspace name as argument and sets it as the per-connection current working keyspace. All subsequent keyspace-specific actions will be performed in the context of the selected keyspace, unless otherwise specified, until another USE statement is issued or the connection terminates.
Syntax:
<alter-keyspace-stmt> ::= ALTER KEYSPACE <identifier> WITH <properties>
Sample:
ALTER KEYSPACE Excelsior WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 4};
The ALTER KEYSPACE statement alters the properties of an existing keyspace. The supported <properties> are the same as for the CREATE KEYSPACE statement.
Syntax:
<drop-keyspace-stmt> ::= DROP KEYSPACE ( IF EXISTS )? <identifier>
Sample:
DROP KEYSPACE myApp;
A DROP KEYSPACE statement results in the immediate, irreversible removal of an existing keyspace, including all column families in it, and all data contained in those column families.
If the keyspace does not exist, the statement will return an error, unless IF EXISTS is used, in which case the operation is a no-op.
Syntax:
    <create-table-stmt> ::= CREATE ( TABLE | COLUMNFAMILY ) ( IF NOT EXISTS )? <tablename>
                            '(' <column-definition> ( ',' <column-definition> )* ')'
                            ( WITH <option> ( AND <option>)* )?
    <column-definition> ::= <identifier> <type> ( STATIC )? ( PRIMARY KEY )?
                          | PRIMARY KEY '(' <partition-key> ( ',' <identifier> )* ')'
    <partition-key> ::= <identifier>
                      | '(' <identifier> (',' <identifier> )* ')'
    <option> ::= <property>
               | COMPACT STORAGE
               | CLUSTERING ORDER
Sample:
    CREATE TABLE monkeySpecies (
        species text PRIMARY KEY,
        common_name text,
        population varint,
        average_size int
    ) WITH comment='Important biological records'
       AND read_repair_chance = 1.0;

    CREATE TABLE timeline (
        userid uuid,
        posted_month int,
        posted_time uuid,
        body text,
        posted_by text,
        PRIMARY KEY (userid, posted_month, posted_time)
    ) WITH compaction = { 'class' : 'LeveledCompactionStrategy' };
The CREATE TABLE statement creates a new table. Each such table is a set of rows (usually representing related entities) for which it defines a number of properties. A table is defined by a name, the columns composing its rows, and a number of options. Note that the CREATE COLUMNFAMILY syntax is supported as an alias for CREATE TABLE (for historical reasons).
Attempting to create an already existing table will return an error unless the IF NOT EXISTS option is used. If it is used, the statement will be a no-op if the table already exists.
<tablename>
Valid table names are the same as valid keyspace names (alphanumerical identifiers up to 32 characters long). If the table name is provided alone, the table is created within the current keyspace (see USE), but if it is prefixed by an existing keyspace name (see the <tablename> grammar), it is created in the specified keyspace (this does not change the current keyspace).
<column-definition>
A CREATE TABLE statement defines the columns that rows of the table can have. A column is defined by its name (an identifier) and its type (see the data types section for more details on allowed types and their properties).
Within a table, a row is uniquely identified by its PRIMARY KEY (or more simply the key), and hence all table definitions must define a PRIMARY KEY (and only one). A PRIMARY KEY is composed of one or more of the columns defined in the table. If the PRIMARY KEY is only one column, this can be specified directly after the column definition. Otherwise, it must be specified by following PRIMARY KEY with the comma-separated list of column names composing the key within parentheses. Note that:
CREATE TABLE t ( k int PRIMARY KEY, other text )
is equivalent to
CREATE TABLE t ( k int, other text, PRIMARY KEY (k) )
In CQL, the order in which columns are defined for the PRIMARY KEY matters. The first column of the key is called the partition key. It has the property that all the rows sharing the same partition key (even across tables, in fact) are stored on the same physical node. Also, insertions, updates and deletions on rows sharing the same partition key for a given table are performed atomically and in isolation. Note that it is possible to have a composite partition key, i.e. a partition key formed of multiple columns, using an extra set of parentheses to define which columns form the partition key.
The remaining columns of the PRIMARY KEY definition, if any, are called clustering columns. On a given physical node, rows for a given partition key are stored in the order induced by the clustering columns, making the retrieval of rows in that clustering order particularly efficient (see SELECT).
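To make the split between partition key and clustering columns concrete, the Python sketch below takes a PRIMARY KEY definition written as a Python list and separates the two parts. The representation is an assumption made for illustration: a composite partition key (the extra set of parentheses in CQL) is modelled as a tuple in first position, and split_primary_key is a hypothetical helper.

```python
def split_primary_key(primary_key):
    """Split a PRIMARY KEY column list into (partition key, clustering columns).

    The first element is the partition key; a tuple or list in first position
    models a composite partition key. Everything after it is a clustering column.
    """
    first, *rest = primary_key
    if isinstance(first, (tuple, list)):
        partition_key = list(first)
    else:
        partition_key = [first]
    return partition_key, list(rest)
```

With the timeline sample above, PRIMARY KEY (userid, posted_month, posted_time) gives userid as the partition key and posted_month, posted_time as clustering columns, whereas PRIMARY KEY ((userid, posted_month), posted_time) would make the first two columns a composite partition key.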
STATIC columns
Some columns can be declared as STATIC in a table definition. A column that is static will be “shared” by all the rows belonging to the same partition (having the same partition key). For instance, in:
    CREATE TABLE test (
        pk int,
        t int,
        v text,
        s text static,
        PRIMARY KEY (pk, t)
    );
    INSERT INTO test(pk, t, v, s) VALUES (0, 0, 'val0', 'static0');
    INSERT INTO test(pk, t, v, s) VALUES (0, 1, 'val1', 'static1');
    SELECT * FROM test WHERE pk=0 AND t=0;
the last query will return 'static1' as the value for s, since s is static and thus the 2nd insertion modified this “shared” value. Note however that static columns are only static within a given partition, and if in the example above both rows were from different partitions (i.e. if they had different values for pk), then the 2nd insertion would not have modified the value of s for the first row.
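One way to picture the behaviour described above is to keep static values in a per-partition map and regular values in a per-row map. The Python class below is a toy model for the test table of the example, with hypothetical names throughout; it says nothing about how Cassandra actually stores static columns.

```python
class ToyTable:
    """Toy model of a table with PRIMARY KEY (pk, t) and some static columns."""

    def __init__(self, static_columns):
        self.static_columns = set(static_columns)
        self.static = {}  # pk -> {static column: value}, shared by the partition
        self.rows = {}    # (pk, t) -> {regular column: value}

    def insert(self, pk, t, **values):
        row = self.rows.setdefault((pk, t), {})
        for name, value in values.items():
            if name in self.static_columns:
                # Static values live at partition level: later writes win.
                self.static.setdefault(pk, {})[name] = value
            else:
                row[name] = value

    def select(self, pk, t):
        row = self.rows.get((pk, t))
        if row is None:
            return None
        # A row is presented together with its partition's static values.
        return {"pk": pk, "t": t, **row, **self.static.get(pk, {})}
```

Replaying the two INSERT statements of the example against ToyTable(["s"]) and then selecting (0, 0) yields s = 'static1', while a row inserted under a different pk keeps its own value of s.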
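A few restrictions apply to when static columns are allowed:
- tables with the COMPACT STORAGE option (see below) cannot have them
- a table without clustering columns cannot have static columns (in such a table every partition has only one row, so every column is inherently static)
- only non PRIMARY KEY columns can be static
<option>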
The CREATE TABLE statement supports a number of options that control the configuration of a new table. These options can be specified after the WITH keyword.
The first of these options is COMPACT STORAGE. This option is mainly targeted towards backward compatibility for definitions created before CQL3 (see www.datastax.com/dev/blog/thrift-to-cql3 for more details). The option also provides a slightly more compact layout of data on disk but at the price of diminished flexibility and extensibility for the table. Most notably, COMPACT STORAGE tables cannot have collections or static columns, and a COMPACT STORAGE table with at least one clustering column supports exactly one (as in not 0 nor more than 1) column not part of the PRIMARY KEY definition (which implies in particular that you cannot add or remove columns after creation). For those reasons, COMPACT STORAGE is not recommended outside of the backward compatibility reason mentioned above.
Another option is CLUSTERING ORDER. It defines the ordering of rows on disk. It takes the list of the clustering column names with, for each of them, the on-disk order (ascending or descending). Note that this option affects which ORDER BY clauses are allowed during SELECT.
Table creation supports the following other <property> values:
option | kind | default | description |
---|---|---|---|
comment | simple | none | A free-form, human-readable comment. |
read_repair_chance | simple | 0.1 | The probability with which to query extra nodes (i.e. more nodes than required by the consistency level) for the purpose of read repairs. |
dclocal_read_repair_chance | simple | 0 | The probability with which to query extra nodes (i.e. more nodes than required by the consistency level) belonging to the same data center as the read coordinator for the purpose of read repairs. |
gc_grace_seconds | simple | 864000 | Time to wait before garbage collecting tombstones (deletion markers). |
bloom_filter_fp_chance | simple | 0.00075 | The target probability of false positives of the sstable bloom filters. The bloom filters will be sized to achieve this probability (thus lowering this value increases the size of bloom filters in memory and on disk). |
default_time_to_live | simple | 0 | The default expiration time (“TTL”) in seconds for a table. |
compaction | map | see below | Compaction options, see below. |
compression | map | see below | Compression options, see below. |
caching | map | see below | Caching options, see below. |
The compaction property must at least define the 'class' sub-option, which defines the compaction strategy class to use. The default supported classes are 'SizeTieredCompactionStrategy', 'LeveledCompactionStrategy' and 'DateTieredCompactionStrategy'. A custom strategy can be provided by specifying the full class name as a string constant. The rest of the sub-options depend on the chosen class. The sub-options supported by the default classes are:
option | supported compaction strategy | default | description |
---|---|---|---|
enabled | all | true | A boolean denoting whether compaction should be enabled or not. |
tombstone_threshold | all | 0.2 | A ratio such that if a sstable has more than this ratio of gcable tombstones over all contained columns, the sstable will be compacted (with no other sstables) for the purpose of purging those tombstones. |
tombstone_compaction_interval | all | 1 day | The minimum time to wait after an sstable creation time before considering it for “tombstone compaction”, where “tombstone compaction” is the compaction triggered if the sstable has more gcable tombstones than tombstone_threshold . |
unchecked_tombstone_compaction | all | false | Setting this to true enables more aggressive tombstone compactions – single sstable tombstone compactions will run without checking how likely it is that they will be successful. |
min_sstable_size | SizeTieredCompactionStrategy | 50MB | The size tiered strategy groups SSTables to compact in buckets. A bucket groups SSTables that differ by less than 50% in size. However, for small sizes, this would result in a bucketing that is too fine grained. min_sstable_size defines a size threshold (in bytes) below which all SSTables belong to one unique bucket. |
min_threshold | SizeTieredCompactionStrategy | 4 | Minimum number of SSTables needed to start a minor compaction. |
max_threshold | SizeTieredCompactionStrategy | 32 | Maximum number of SSTables processed by one minor compaction. |
bucket_low | SizeTieredCompactionStrategy | 0.5 | The size tiered strategy considers sstables to be within the same bucket if their size is within [average_size * bucket_low , average_size * bucket_high ] (i.e. the default groups sstables whose sizes diverge by at most 50%). |
bucket_high | SizeTieredCompactionStrategy | 1.5 | The size tiered strategy considers sstables to be within the same bucket if their size is within [average_size * bucket_low , average_size * bucket_high ] (i.e. the default groups sstables whose sizes diverge by at most 50%). |
sstable_size_in_mb | LeveledCompactionStrategy | 5MB | The target size (in MB) for sstables in the leveled strategy. Note that while sstable sizes should stay less than or equal to sstable_size_in_mb , it is possible to exceptionally have a larger sstable as during compaction, data for a given partition key are never split into 2 sstables. |
timestamp_resolution | DateTieredCompactionStrategy | MICROSECONDS | The timestamp resolution used when inserting data, could be MILLISECONDS, MICROSECONDS etc (should be understandable by Java TimeUnit) - don’t change this unless you do mutations with USING TIMESTAMP |
base_time_seconds | DateTieredCompactionStrategy | 60 | The base size of the time windows. |
max_sstable_age_days | DateTieredCompactionStrategy | 365 | SSTables only containing data that is older than this will never be compacted. |
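The interplay of bucket_low, bucket_high and min_sstable_size in the table above can be illustrated with a little arithmetic. The Python sketch below is a reading of the descriptions in that table, not the strategy's actual implementation: fits_bucket is a hypothetical helper, the bounds are treated as inclusive, and the small-sstable rule is assumed to apply when both the sstable and the bucket average are below min_sstable_size.

```python
MB = 1024 * 1024


def bucket_bounds(average_size, bucket_low=0.5, bucket_high=1.5):
    """Size interval accepted by a bucket whose sstables average average_size."""
    return average_size * bucket_low, average_size * bucket_high


def fits_bucket(sstable_size, average_size,
                bucket_low=0.5, bucket_high=1.5, min_sstable_size=50 * MB):
    """Tell whether an sstable would join a bucket with the given average size."""
    # Assumption: sstables below the threshold all share one bucket.
    if sstable_size < min_sstable_size and average_size < min_sstable_size:
        return True
    low, high = bucket_bounds(average_size, bucket_low, bucket_high)
    return low <= sstable_size <= high
```

With the defaults, a bucket averaging 100 MB accepts sstables between 50 MB and 150 MB, so a 120 MB sstable fits while a 200 MB one does not.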
For the compression property, the following sub-options are available:
option | default | description |
---|---|---|
sstable_compression | LZ4Compressor | The compression algorithm to use. The default compressors are: LZ4Compressor, SnappyCompressor and DeflateCompressor. Use an empty string ('' ) to disable compression. A custom compressor can be provided by specifying the full class name as a string constant. |
chunk_length_kb | 64KB | On disk, SSTables are compressed by block (to allow random reads). This defines the size (in KB) of said block. Bigger values may improve the compression rate, but increase the minimum size of data to be read from disk for a read. |
crc_check_chance | 1.0 | When compression is enabled, each compressed block includes a checksum of that block for the purpose of detecting disk bitrot and avoiding the propagation of corruption to other replicas. This option defines the probability with which those checksums are checked during read. By default they are always checked. Set to 0 to disable checksum checking and to 0.5, for instance, to check them every other read. |
For the caching property, the following sub-options are available:
option | default | description |
---|---|---|
keys | ALL | Whether to cache keys (“key cache”) for this table. Valid values are: ALL and NONE . |
rows_per_partition | NONE | The number of rows to cache per partition (“row cache”). If an integer n is specified, the first n queried rows of a partition will be cached. Other possible options are ALL , to cache all rows of a queried partition, or NONE to disable row caching. |
Syntax:
    <alter-table-stmt> ::= ALTER (TABLE | COLUMNFAMILY) <tablename> <instruction>
    <instruction> ::= ALTER <identifier> TYPE <type>
                    | ADD <identifier> <type>
                    | DROP <identifier>
                    | WITH <option> ( AND <option> )*
Sample:
    ALTER TABLE addamsFamily
    ALTER lastKnownLocation TYPE uuid;

    ALTER TABLE addamsFamily
    ADD gravesite varchar;

    ALTER TABLE addamsFamily
    WITH comment = 'A most excellent and useful column family'
     AND read_repair_chance = 0.2;
The ALTER statement is used to manipulate table definitions. It allows for adding new columns, dropping existing ones, changing the type of existing columns, or updating the table options. As with table creation, ALTER COLUMNFAMILY is allowed as an alias for ALTER TABLE.
The <tablename> is the table name optionally preceded by the keyspace name. The <instruction> defines the alteration to perform:
- ALTER: Updates the type of a given defined column. Note that the type of the clustering columns cannot be modified as it induces the on-disk ordering of rows. Columns on which a secondary index is defined have the same restriction. Other columns are free from those restrictions (no validation of existing data is performed), but it is usually a bad idea to change the type to a non-compatible one, unless no data has been inserted for that column yet, as this could confuse CQL drivers/tools.
- ADD: Adds a new column to the table. The <identifier> for the new column must not conflict with an existing column. Moreover, columns cannot be added to tables defined with the COMPACT STORAGE option.
- DROP: Removes a column from the table. Dropped columns will immediately become unavailable in queries and will not be included in compacted sstables in the future. If a column is re-added, queries won’t return values written before the column was last dropped. It is assumed that timestamps represent actual time, so if this is not your case, you should NOT re-add previously dropped columns. Columns can’t be dropped from tables defined with the COMPACT STORAGE option.
- WITH: Updates the options of the table. The supported <option> (and syntax) are the same as for the CREATE TABLE statement except that COMPACT STORAGE is not supported. Note that setting any compaction sub-option has the effect of erasing all previous compaction options, so you need to re-specify all the sub-options if you want to keep them. The same note applies to the set of compression sub-options.
Syntax:
<drop-table-stmt> ::= DROP TABLE ( IF EXISTS )? <tablename>
Sample:
DROP TABLE worldSeriesAttendees;
The DROP TABLE statement results in the immediate, irreversible removal of a table, including all data contained in it. As for table creation, DROP COLUMNFAMILY is allowed as an alias for DROP TABLE.
If the table does not exist, the statement will return an error, unless IF EXISTS is used, in which case the operation is a no-op.
Syntax:
<truncate-stmt> ::= TRUNCATE ( TABLE | COLUMNFAMILY )? <tablename>
Sample:
TRUNCATE superImportantData;
The TRUNCATE statement permanently removes all data from a table.
Syntax:
    <create-index-stmt> ::= CREATE ( CUSTOM )? INDEX ( IF NOT EXISTS )? ( <indexname> )?
                            ON <tablename> '(' <index-identifier> ')'
                            ( USING <string> ( WITH OPTIONS = <map-literal> )? )?
    <index-identifier> ::= <identifier>
                         | keys( <identifier> )
Sample:
    CREATE INDEX userIndex ON NerdMovies (user);
    CREATE INDEX ON Mutants (abilityId);
    CREATE INDEX ON users (keys(favs));
    CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass';
    CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass' WITH OPTIONS = {'storage': '/mnt/ssd/indexes/'};
The CREATE INDEX statement is used to create a new (automatic) secondary index for a given (existing) column in a given table. A name for the index itself can be specified before the ON keyword, if desired. If data already exists for the column, it will be indexed asynchronously. After the index is created, new data for the column is indexed automatically at insertion time.
Attempting to create an already existing index will return an error unless the IF NOT EXISTS option is used. If it is used, the statement will be a no-op if the index already exists.
When creating an index on a map column, you may index either the keys or the values. If the column identifier is placed within the keys() function, the index will be on the map keys, allowing you to use CONTAINS KEY in WHERE clauses. Otherwise, the index will be on the map values.
Syntax:
<drop-index-stmt> ::= DROP INDEX ( IF EXISTS )? ( <keyspace> '.' )? <identifier>
Sample:
    DROP INDEX userIndex;
    DROP INDEX userkeyspace.address_index;
The DROP INDEX statement is used to drop an existing secondary index. The argument of the statement is the index name, which may optionally specify the keyspace of the index.
If the index does not exist, the statement will return an error, unless IF EXISTS is used, in which case the operation is a no-op.
Syntax:
    <create-type-stmt> ::= CREATE TYPE ( IF NOT EXISTS )? <typename>
                           '(' <field-definition> ( ',' <field-definition> )* ')'
    <typename> ::= ( <keyspace-name> '.' )? <identifier>
    <field-definition> ::= <identifier> <type>
Sample:
    CREATE TYPE address (
        street_name text,
        street_number int,
        city text,
        state text,
        zip int
    );

    CREATE TYPE work_and_home_addresses (
        home_address address,
        work_address address
    );
The CREATE TYPE statement creates a new user-defined type. Each type is a set of named, typed fields. Field types may be any valid type, including collections and other existing user-defined types.
Attempting to create an already existing type will result in an error unless the IF NOT EXISTS option is used. If it is used, the statement will be a no-op if the type already exists.
<typename>
Valid type names are identifiers. The names of existing CQL types and reserved type names may not be used.
If the type name is provided alone, the type is created within the current keyspace (see USE). If it is prefixed by an existing keyspace name, the type is created within the specified keyspace instead of the current keyspace.
Syntax:
    <alter-type-stmt> ::= ALTER TYPE <typename> <instruction>
    <instruction> ::= ALTER <field-name> TYPE <type>
                    | ADD <field-name> <type>
                    | RENAME <field-name> TO <field-name> ( AND <field-name> TO <field-name> )*
Sample:
    ALTER TYPE address ALTER zip TYPE varint;
    ALTER TYPE address ADD country text;
    ALTER TYPE address RENAME zip TO zipcode AND street_name TO street;
The ALTER TYPE statement is used to manipulate type definitions. It allows for adding new fields, renaming existing fields, or changing the type of existing fields.
When altering the type of a column, the new type must be compatible with the previous type.
Syntax:
<drop-type-stmt> ::= DROP TYPE ( IF EXISTS )? <typename>
The DROP TYPE statement results in the immediate, irreversible removal of a type. Attempting to drop a type that is still in use by another type or a table will result in an error.
If the type does not exist, an error will be returned unless IF EXISTS is used, in which case the operation is a no-op.
Syntax:
<create-trigger-stmt> ::= CREATE TRIGGER ( IF NOT EXISTS )? ( <triggername> )? ON <tablename> USING <string>
Sample:
CREATE TRIGGER myTrigger ON myTable USING 'org.apache.cassandra.triggers.InvertedIndex';
The actual logic that makes up the trigger can be written in any Java (JVM) language and exists outside the database. The trigger code is placed in a lib/triggers subdirectory of the Cassandra installation directory; it loads during cluster startup and exists on every node that participates in a cluster. The trigger defined on a table fires before a requested DML statement occurs, which ensures the atomicity of the transaction.
Syntax:
<drop-trigger-stmt> ::= DROP TRIGGER ( IF EXISTS )? ( <triggername> )? ON <tablename>
Sample:
DROP TRIGGER myTrigger ON myTable;
The DROP TRIGGER statement removes the registration of a trigger created using CREATE TRIGGER.
Syntax:
    <insertStatement> ::= INSERT INTO <tablename>
                          '(' <identifier> ( ',' <identifier> )* ')'
                          VALUES '(' <term-or-literal> ( ',' <term-or-literal> )* ')'
                          ( IF NOT EXISTS )?
                          ( USING <option> ( AND <option> )* )?
    <term-or-literal> ::= <term>
                        | <collection-literal>
    <option> ::= TIMESTAMP <integer>
               | TTL <integer>
Sample:
INSERT INTO NerdMovies (movie, director, main_actor, year) VALUES ('Serenity', 'Joss Whedon', 'Nathan Fillion', 2005) USING TTL 86400;
The INSERT statement writes one or more columns for a given row in a table. Note that since a row is identified by its PRIMARY KEY, at least the columns composing it must be specified.
Note that unlike in SQL, INSERT does not check the prior existence of the row by default: the row is created if none existed before, and updated otherwise. Furthermore, there is no means of knowing whether a creation or an update happened.
It is however possible to use the IF NOT EXISTS condition to only insert if the row does not exist prior to the insertion. But please note that using IF NOT EXISTS will incur a non-negligible performance cost (internally, Paxos will be used), so this should be used sparingly.
All updates for an INSERT are applied atomically and in isolation.
Please refer to the UPDATE section for information on the <option> available and to the collections section for use of <collection-literal>. Also note that INSERT does not support counters, while UPDATE does.
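The difference between a plain INSERT and an INSERT with IF NOT EXISTS, as described above, can be modelled on an in-memory dictionary. The Python sketch below is a toy under stated assumptions: the function and its boolean return value are hypothetical, a table is modelled as a dict from primary key to column values, and nothing here reflects the Paxos round that a real conditional insert involves.

```python
def insert(table, key, values, if_not_exists=False):
    """Toy INSERT on a dict-backed table; returns whether it was applied."""
    if if_not_exists and key in table:
        # Conditional insert: leave the existing row untouched.
        return False
    # Plain insert: create the row if missing, otherwise overwrite the given
    # columns. The caller cannot tell which of the two happened.
    table.setdefault(key, {}).update(values)
    return True
```

Calling insert twice with the same key and no condition simply merges the second set of values into the row, while the same call with if_not_exists=True on an existing key changes nothing and reports that it was not applied.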
Syntax:
<update-stmt> ::= UPDATE <tablename> ( USING <option> ( AND <option> )* )? SET <assignment> ( ',' <assignment> )* WHERE <where-clause> ( IF <condition> ( AND condition )* )? <assignment> ::= <identifier> '=' <term> | <identifier> '=' <identifier> ('+' | '-') (<int-term> | <set-literal> | <list-literal>) | <identifier> '=' <identifier> '+' <map-literal> | <identifier> '[' <term> ']' '=' <term> <condition> ::= <identifier> <op> <term> | <identifier> IN (<variable> | '(' ( <term> ( ',' <term> )* )? ')') | <identifier> '[' <term> ']' <op> <term> | <identifier> '[' <term> ']' IN <term> <op> ::= '<' | '<=' | '=' | '!=' | '>=' | '>' <where-clause> ::= <relation> ( AND <relation> )* <relation> ::= <identifier> '=' <term> | <identifier> IN '(' ( <term> ( ',' <term> )* )? ')' | <identifier> IN <variable> <option> ::= TIMESTAMP <integer> | TTL <integer>
Sample:
UPDATE NerdMovies USING TTL 400
SET director = 'Joss Whedon', main_actor = 'Nathan Fillion', year = 2005
WHERE movie = 'Serenity';

UPDATE UserActions SET total = total + 2
WHERE user = B70DE1D0-9908-4AE3-BE34-5573E5B09F14 AND action = 'click';
The UPDATE statement writes one or more columns for a given row in a table. The <where-clause> is used to select the row to update and must include all columns composing the PRIMARY KEY (the IN relation is only supported for the last column of the partition key). Other column values are specified through <assignment> after the SET keyword.
Note that unlike in SQL, UPDATE does not check the prior existence of the row by default (except through the use of <condition>, see below): the row is created if none existed before, and updated otherwise. Furthermore, there is no way to know whether a creation or an update happened.
It is however possible to put conditions on some columns through IF, in which case the row will not be updated unless those conditions are met. Note, however, that using IF conditions incurs a non-negligible performance cost (internally, Paxos is used), so they should be used sparingly.
In an UPDATE statement, all updates within the same partition key are applied atomically and in isolation.
The c = c + 3 form of <assignment> is used to increment/decrement counters. The identifier after the ‘=’ sign must be the same as the one before the ‘=’ sign (only increment/decrement is supported on counters, not the assignment of a specific value).
The id = id + <collection-literal> and id[value1] = value2 forms of <assignment> are for collections. Please refer to the relevant section for more details.
<options>
The UPDATE and INSERT statements support the following options:
TIMESTAMP: sets the timestamp for the operation. If not specified, the coordinator will use the current time (in microseconds) at the start of statement execution as the timestamp. This is usually a suitable default.
TTL: specifies an optional Time To Live (in seconds) for the inserted values. If set, the inserted values are automatically removed from the database after the specified time. Note that the TTL concerns the inserted values, not the columns themselves. This means that any subsequent update of the column will also reset the TTL (to whatever TTL is specified in that update). By default, values never expire. A TTL of 0 or a negative one is equivalent to no TTL.
Syntax:
<delete-stmt> ::= DELETE ( <selection> ( ',' <selection> )* )?
                  FROM <tablename>
                  ( USING TIMESTAMP <integer>)?
                  WHERE <where-clause>
                  ( IF ( EXISTS | ( <condition> ( AND <condition> )*) ) )?

<selection> ::= <identifier> ( '[' <term> ']' )?

<where-clause> ::= <relation> ( AND <relation> )*

<relation> ::= <identifier> '=' <term>
             | <identifier> IN '(' ( <term> ( ',' <term> )* )? ')'
             | <identifier> IN <variable>

<condition> ::= <identifier> <op> <term>
              | <identifier> IN (<variable> | '(' ( <term> ( ',' <term> )* )? ')')
              | <identifier> '[' <term> ']' <op> <term>
              | <identifier> '[' <term> ']' IN <term>

<op> ::= '<' | '<=' | '=' | '!=' | '>=' | '>'
Sample:
DELETE FROM NerdMovies USING TIMESTAMP 1240003134 WHERE movie = 'Serenity';

DELETE phone FROM Users WHERE userid IN (C73DE1D3-AF08-40F3-B124-3FF3E5109F22, B70DE1D0-9908-4AE3-BE34-5573E5B09F14);
The DELETE statement deletes columns and rows. If column names are provided directly after the DELETE keyword, only those columns are deleted from the row indicated by the <where-clause> (the id[value] syntax in <selection> is for collections; please refer to the collection section for more details). Otherwise whole rows are removed. The <where-clause> specifies the key for the row(s) to delete (the IN relation is only supported for the last column of the partition key).
DELETE supports the TIMESTAMP option with the same semantics as in the UPDATE statement.
In a DELETE statement, all deletions within the same partition key are applied atomically and in isolation.
A DELETE operation can be made conditional using IF, as for UPDATE and INSERT. Please note that, as with those statements, this incurs a non-negligible performance cost (internally, Paxos is used) and so should be used sparingly.
Syntax:
<batch-stmt> ::= BEGIN ( UNLOGGED | COUNTER )? BATCH
                 ( USING <option> ( AND <option> )* )?
                 <modification-stmt> ( ';' <modification-stmt> )*
                 APPLY BATCH

<modification-stmt> ::= <insert-stmt>
                      | <update-stmt>
                      | <delete-stmt>

<option> ::= TIMESTAMP <integer>
Sample:
BEGIN BATCH
  INSERT INTO users (userid, password, name) VALUES ('user2', 'ch@ngem3b', 'second user');
  UPDATE users SET password = 'ps22dhds' WHERE userid = 'user3';
  INSERT INTO users (userid, password) VALUES ('user4', 'ch@ngem3c');
  DELETE name FROM users WHERE userid = 'user1';
APPLY BATCH;
The BATCH statement groups multiple modification statements (insertions/updates and deletions) into a single statement. It serves several purposes:
All updates in a BATCH belonging to a given partition key are performed in isolation.
By default, all operations in the batch are performed as LOGGED, to ensure all mutations eventually complete (or none will). See the notes on UNLOGGED for more details.
Note that:
BATCH statements may only contain UPDATE, INSERT and DELETE statements.
Operations are not necessarily applied in the order in which they are listed in the BATCH statement. To force a particular operation ordering, you must specify per-operation timestamps.
UNLOGGED
By default, Cassandra uses a batch log to ensure all operations in a batch eventually complete or none will (note however that operations are only isolated within a single partition).
There is a performance penalty for batch atomicity when a batch spans multiple partitions. If you do not want to incur this penalty, you can tell Cassandra to skip the batchlog with the UNLOGGED option. If the UNLOGGED option is used, a failed batch might leave the batch only partly applied.
COUNTER
Use the COUNTER option for batched counter updates. Unlike other updates in Cassandra, counter updates are not idempotent.
<option>
BATCH supports the TIMESTAMP option, with semantics similar to those described in the UPDATE statement (the timestamp applies to all the statements inside the batch). However, if used, TIMESTAMP must not be used in the statements within the batch.
Syntax:
<select-stmt> ::= SELECT <select-clause>
                  FROM <tablename>
                  ( WHERE <where-clause> )?
                  ( ORDER BY <order-by> )?
                  ( LIMIT <integer> )?
                  ( ALLOW FILTERING )?

<select-clause> ::= DISTINCT? <selection-list>
                  | COUNT '(' ( '*' | '1' ) ')' (AS <identifier>)?

<selection-list> ::= <selector> (AS <identifier>)? ( ',' <selector> (AS <identifier>)? )*
                   | '*'

<selector> ::= <identifier>
             | WRITETIME '(' <identifier> ')'
             | TTL '(' <identifier> ')'
             | <function> '(' (<selector> (',' <selector>)*)? ')'

<where-clause> ::= <relation> ( AND <relation> )*

<relation> ::= <identifier> <op> <term>
             | '(' <identifier> (',' <identifier>)* ')' <op> <term-tuple>
             | <identifier> IN '(' ( <term> ( ',' <term>)* )? ')'
             | '(' <identifier> (',' <identifier>)* ')' IN '(' ( <term-tuple> ( ',' <term-tuple>)* )? ')'
             | TOKEN '(' <identifier> ( ',' <identifier>)* ')' <op> <term>

<op> ::= '=' | '<' | '>' | '<=' | '>=' | CONTAINS | CONTAINS KEY

<order-by> ::= <ordering> ( ',' <ordering> )*

<ordering> ::= <identifier> ( ASC | DESC )?

<term-tuple> ::= '(' <term> (',' <term>)* ')'
Sample:
SELECT name, occupation FROM users WHERE userid IN (199, 200, 207);

SELECT name AS user_name, occupation AS user_occupation FROM users;

SELECT time, value FROM events WHERE event_type = 'myEvent' AND time > '2011-02-03' AND time <= '2012-01-01';

SELECT COUNT(*) FROM users;

SELECT COUNT(*) AS user_count FROM users;
The SELECT statement reads one or more columns for one or more rows in a table. It returns a result-set of rows, where each row contains the collection of columns corresponding to the query.
<select-clause>
The <select-clause> determines which columns need to be queried and returned in the result-set. It consists of either a comma-separated list of <selector> or the wildcard character (*) to select all the columns defined for the table.
A <selector> is either a column name to retrieve, or a <function> of one or more column names. The functions allowed are the same as for <term> and are described in the function section. In addition to these generic functions, the WRITETIME (resp. TTL) function selects the timestamp of when the column was inserted (resp. the time to live (in seconds) for the column (or null if the column has no expiration set)).
Any <selector> can be aliased using the AS keyword (see examples). Please note that the <where-clause> and <order-by> clauses should refer to the columns by their original names and not by their aliases.
The COUNT keyword can be used with parentheses enclosing *. If so, the query will return a single result: the number of rows matching the query. Note that COUNT(1) is supported as an alias.
<where-clause>
The <where-clause> specifies which rows must be queried. It is composed of relations on the columns that are part of the PRIMARY KEY and/or have a secondary index defined on them.
Not all relations are allowed in a query. For instance, non-equal relations (where IN is considered as an equal relation) on a partition key are not supported (but see the use of the TOKEN method below to do non-equal queries on the partition key). Moreover, for a given partition key, the clustering columns induce an ordering of rows, and relations on them are restricted to those that select a contiguous (for the ordering) set of rows. For instance, given
CREATE TABLE posts (
    userid text,
    blog_title text,
    posted_at timestamp,
    entry_title text,
    content text,
    category int,
    PRIMARY KEY (userid, blog_title, posted_at)
)
The following query is allowed:
SELECT entry_title, content FROM posts WHERE userid='john doe' AND blog_title='John''s Blog' AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31'
But the following one is not, as it does not select a contiguous set of rows (and we suppose no secondary indexes are set):
// Needs a blog_title to be set to select ranges of posted_at
SELECT entry_title, content FROM posts WHERE userid='john doe' AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31'
When specifying relations, the TOKEN function can be used on the PARTITION KEY column to query. In that case, rows will be selected based on the token of their PARTITION KEY rather than on the value. Note that the token of a key depends on the partitioner in use, and that in particular the RandomPartitioner won’t yield a meaningful order. Also note that ordering partitioners always order token values by bytes (so even if the partition key is of type int, token(-1) > token(0) in particular). Example:
SELECT * FROM posts WHERE token(userid) > token('tom') AND token(userid) < token('bob')
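The byte-wise ordering remark can be checked with a small Python sketch. It assumes that an int partition key is serialized as a 4-byte big-endian two's complement value and that an order-preserving partitioner compares those bytes as unsigned values; `int_key_bytes` is a hypothetical helper, not a Cassandra function.

```python
import struct


def int_key_bytes(value):
    """Serialize a 32-bit signed int as 4 big-endian bytes (assumed layout)."""
    return struct.pack(">i", value)


minus_one = int_key_bytes(-1)  # b'\xff\xff\xff\xff'
zero = int_key_bytes(0)        # b'\x00\x00\x00\x00'

# Python compares bytes objects lexicographically as unsigned values,
# which is the comparison a byte-ordered partitioner is assumed to use here.
print(minus_one.hex(), zero.hex())  # ffffffff 00000000
print(minus_one > zero)             # True, although -1 < 0 as integers
```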
Moreover, the IN relation is only allowed on the last column of the partition key and on the last column of the full primary key.
It is also possible to “group” CLUSTERING COLUMNS together in a relation using the tuple notation. For instance:
SELECT * FROM posts WHERE userid='john doe' AND (blog_title, posted_at) > ('John''s Blog', '2012-01-01')
will request all rows that sort after the one having “John's Blog” as blog_title and ‘2012-01-01’ for posted_at in the clustering order. In particular, rows having a posted_at <= '2012-01-01' will be returned as long as their blog_title > 'John''s Blog', which wouldn’t be the case for:
SELECT * FROM posts WHERE userid='john doe' AND blog_title > 'John''s Blog' AND posted_at > '2012-01-01'
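The difference between the two queries matches the difference between lexicographic tuple comparison and independent per-column comparisons. The following Python sketch illustrates it on made-up clustering values (the rows and the `bound` variable are assumptions for illustration; dates are kept as ISO strings so that string order equals chronological order).

```python
# (blog_title, posted_at) pairs for one partition, in no particular order.
rows = [
    ("John's Blog", "2011-12-31"),
    ("John's Blog", "2012-06-01"),
    ("Zebra Blog", "2011-01-01"),
    ("Zebra Blog", "2013-01-01"),
]
bound = ("John's Blog", "2012-01-01")

# Tuple notation: (blog_title, posted_at) > (bound), compared lexicographically.
tuple_form = [r for r in rows if r > bound]

# Two separate relations: blog_title > ... AND posted_at > ...
separate_form = [r for r in rows if r[0] > bound[0] and r[1] > bound[1]]

print(tuple_form)     # three rows, including ("Zebra Blog", "2011-01-01")
print(separate_form)  # only ("Zebra Blog", "2013-01-01")
```

The tuple form keeps the row with an earlier posted_at because its blog_title already sorts after the bound, which is the behaviour described above.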
The tuple notation may also be used for IN clauses on CLUSTERING COLUMNS:
SELECT * FROM posts WHERE userid='john doe' AND (blog_title, posted_at) IN (('John''s Blog', '2012-01-01'), ('Extreme Chess', '2014-06-01'))
The CONTAINS operator may only be used on collection columns (lists, sets, and maps). In the case of maps, CONTAINS applies to the map values. The CONTAINS KEY operator may only be used on map columns and applies to the map keys.
<order-by>
The ORDER BY option selects the order of the returned results. It takes as argument a list of column names along with the order for the column (ASC for ascending and DESC for descending, omitting the order being equivalent to ASC). Currently the possible orderings are limited (and depend on the table CLUSTERING ORDER):
if the table has been defined without any specific CLUSTERING ORDER, then the allowed orderings are the order induced by the clustering columns and the reverse of that one.
otherwise, the allowed orderings are the order of the CLUSTERING ORDER option and the reversed one.
LIMIT
The LIMIT option to a SELECT statement limits the number of rows returned by a query.
ALLOW FILTERING
By default, CQL only allows select queries that don’t involve “filtering” server side, i.e. queries where we know that all (live) records read will be returned (maybe partly) in the result set. The reasoning is that those “non filtering” queries have predictable performance in the sense that they will execute in a time that is proportional to the amount of data returned by the query (which can be controlled through LIMIT).
The ALLOW FILTERING option explicitly allows (some) queries that require filtering. Please note that a query using ALLOW FILTERING may thus have unpredictable performance (for the definition above), i.e. even a query that selects a handful of records may exhibit performance that depends on the total amount of data stored in the cluster.
For instance, considering the following table holding user profiles with their year of birth (with a secondary index on it) and country of residence:
CREATE TABLE users (
    username text PRIMARY KEY,
    firstname text,
    lastname text,
    birth_year int,
    country text
);

CREATE INDEX ON users(birth_year);
Then the following queries are valid:
SELECT * FROM users;
SELECT firstname, lastname FROM users WHERE birth_year = 1981;
because in both cases, Cassandra guarantees that the performance of these queries will be proportional to the amount of data returned. In particular, if no users are born in 1981, then the performance of the second query will not depend on the number of user profiles stored in the database (not directly at least: due to secondary index implementation considerations, this query may still depend on the number of nodes in the cluster, which indirectly depends on the amount of data stored. Nevertheless, the number of nodes will always be several orders of magnitude lower than the number of user profiles stored). Of course, both queries may return very large result sets in practice, but the amount of data returned can always be controlled by adding a LIMIT.
However, the following query will be rejected:
SELECT firstname, lastname FROM users WHERE birth_year = 1981 AND country = 'FR';
because Cassandra cannot guarantee that it won’t have to scan a large amount of data even if the result of the query is small. Typically, it will scan all the index entries for users born in 1981 even if only a handful are actually from France. However, if you “know what you are doing”, you can force the execution of this query by using ALLOW FILTERING, and so the following query is valid:
SELECT firstname, lastname FROM users WHERE birth_year = 1981 AND country = 'FR' ALLOW FILTERING;
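The cost argument can be made concrete with a toy model that counts how many index entries are read versus how many rows are returned. This is a Python sketch under simplifying assumptions: the `index` dictionary stands in for the secondary index on birth_year, the data set is made up, and `query` is a hypothetical helper rather than anything Cassandra exposes.

```python
# 1000 made-up users, all born in 1981, of which only 3 live in France.
users = [
    {"username": f"u{i}", "birth_year": 1981, "country": "FR" if i < 3 else "US"}
    for i in range(1000)
]

# Stand-in for the secondary index: birth_year -> matching rows.
index = {}
for user in users:
    index.setdefault(user["birth_year"], []).append(user)


def query(birth_year, country=None):
    """Return (entries read, rows returned) for an indexed lookup plus optional filter."""
    scanned = index.get(birth_year, [])
    returned = [u for u in scanned if country is None or u["country"] == country]
    return len(scanned), len(returned)


print(query(1981))        # (1000, 1000): work is proportional to the result size
print(query(1981, "FR"))  # (1000, 3): filtering reads far more than it returns
```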
Syntax:
<create-user-statement> ::= CREATE USER ( IF NOT EXISTS )? <identifier> ( WITH PASSWORD <string> )? (<option>)?

<option> ::= SUPERUSER
           | NOSUPERUSER
Sample:
CREATE USER alice WITH PASSWORD 'password_a' SUPERUSER;
CREATE USER bob WITH PASSWORD 'password_b' NOSUPERUSER;
By default users do not possess SUPERUSER status.
Permissions on database resources (keyspaces and tables) are granted to users.
User names should be quoted if they contain non-alphanumeric characters.
Use the WITH PASSWORD clause to set a password for internal authentication, enclosing the password in single quotation marks.
If internal authentication has not been set up, the WITH PASSWORD clause is not necessary.
Attempting to create an existing user results in an invalid query condition unless the IF NOT EXISTS option is used. If the option is used and the user exists, the statement is a no-op.
CREATE USER carlos;
CREATE USER IF NOT EXISTS carlos;
Syntax:
<alter-user-statement> ::= ALTER USER <identifier> ( WITH PASSWORD <string> )? ( <option> )?

<option> ::= SUPERUSER
           | NOSUPERUSER
Sample:
ALTER USER alice WITH PASSWORD 'PASSWORD_A';
ALTER USER bob SUPERUSER;
ALTER USER requires SUPERUSER status, with two caveats:
A user cannot alter its own SUPERUSER status.
A user without SUPERUSER status is permitted to modify a subset of its own properties (e.g. its PASSWORD).
Syntax:
<drop-user-stmt> ::= DROP USER ( IF EXISTS )? <identifier>
Sample:
DROP USER alice;
DROP USER IF EXISTS bob;
DROP USER requires SUPERUSER status, and users are not permitted to DROP themselves.
Attempting to drop a user which does not exist results in an invalid query condition unless the IF EXISTS option is used. If the option is used and the user does not exist, the statement is a no-op.
Syntax:
<list-users-stmt> ::= LIST USERS
Sample:
LIST USERS;
Returns all known users in the system.
Permissions on resources are granted to users. Data resources in Cassandra are organized hierarchically, like so: ALL KEYSPACES -> KEYSPACE -> TABLE
Permissions can be granted at any level of the hierarchy and they flow downwards. So granting a permission on a resource higher up the chain automatically grants that same permission on all resources lower down. For example, granting SELECT on a KEYSPACE automatically grants it on all TABLES in that KEYSPACE.
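The downward flow of permissions can be sketched as a lookup that walks up the resource hierarchy. This is a minimal Python model, not how Cassandra stores or checks grants: resources are represented as tuples (an assumption made here for brevity), and `ancestors` and `is_allowed` are hypothetical helpers.

```python
def ancestors(resource):
    """Yield the resource itself and every resource above it.

    Resources are modeled as tuples: () is ALL KEYSPACES,
    ("ks",) is a keyspace and ("ks", "tbl") is a table.
    """
    for depth in range(len(resource), -1, -1):
        yield resource[:depth]


def is_allowed(grants, user, permission, resource):
    """A permission holds if it was granted on the resource or on any ancestor."""
    return any((user, permission, r) in grants for r in ancestors(resource))


grants = {
    ("alice", "SELECT", ()),             # GRANT SELECT ON ALL KEYSPACES TO alice
    ("bob", "MODIFY", ("keyspace1",)),   # GRANT MODIFY ON KEYSPACE keyspace1 TO bob
}

print(is_allowed(grants, "alice", "SELECT", ("keyspace1", "table1")))  # True
print(is_allowed(grants, "bob", "MODIFY", ("keyspace1", "table1")))    # True
print(is_allowed(grants, "bob", "MODIFY", ("keyspace2", "table1")))    # False
```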
Modifications to permissions are visible to existing client sessions; that is, connections need not be re-established following permissions changes.
The full set of available permissions is:
CREATE
ALTER
DROP
SELECT
MODIFY
AUTHORIZE
permission | resource | operations |
---|---|---|
CREATE | ALL KEYSPACES | CREATE KEYSPACE; CREATE TABLE in any keyspace |
CREATE | KEYSPACE | CREATE TABLE in specified keyspace |
ALTER | ALL KEYSPACES | ALTER KEYSPACE; ALTER TABLE in any keyspace |
ALTER | KEYSPACE | ALTER KEYSPACE; ALTER TABLE in keyspace |
ALTER | TABLE | ALTER TABLE |
DROP | ALL KEYSPACES | DROP KEYSPACE; DROP TABLE in any keyspace |
DROP | KEYSPACE | DROP TABLE in specified keyspace |
DROP | TABLE | DROP TABLE |
SELECT | ALL KEYSPACES | SELECT on any table |
SELECT | KEYSPACE | SELECT on any table in keyspace |
SELECT | TABLE | SELECT on specified table |
MODIFY | ALL KEYSPACES | INSERT, UPDATE, DELETE and TRUNCATE on any table |
MODIFY | KEYSPACE | INSERT, UPDATE, DELETE and TRUNCATE on any table in keyspace |
MODIFY | TABLE | INSERT, UPDATE, DELETE and TRUNCATE |
AUTHORIZE | ALL KEYSPACES | GRANT PERMISSION and REVOKE PERMISSION on any table |
AUTHORIZE | KEYSPACE | GRANT PERMISSION and REVOKE PERMISSION on table in keyspace |
AUTHORIZE | TABLE | GRANT PERMISSION and REVOKE PERMISSION |
Syntax:
<grant-permission-stmt> ::= GRANT ( ALL ( PERMISSIONS )? | <permission> ( PERMISSION )? ) ON <resource> TO <identifier>

<permission> ::= CREATE | ALTER | DROP | SELECT | MODIFY | AUTHORIZE

<resource> ::= ALL KEYSPACES
             | KEYSPACE <identifier>
             | ( TABLE )? <tablename>
Sample:
GRANT SELECT ON ALL KEYSPACES TO alice;
This gives alice permission to execute SELECT statements on any table across all keyspaces.
GRANT MODIFY ON KEYSPACE keyspace1 TO bob;
This gives bob permission to perform INSERT, UPDATE, DELETE and TRUNCATE queries on all tables in the keyspace1 keyspace.
GRANT DROP ON keyspace1.table1 TO carlos;
This gives carlos permission to DROP keyspace1.table1.
Syntax:
<revoke-permission-stmt> ::= REVOKE ( ALL ( PERMISSIONS )? | <permission> ( PERMISSION )? ) ON <resource> FROM <identifier>

<permission> ::= CREATE | ALTER | DROP | SELECT | MODIFY | AUTHORIZE

<resource> ::= ALL KEYSPACES
             | KEYSPACE <identifier>
             | ( TABLE )? <tablename>
Sample:
REVOKE SELECT ON ALL KEYSPACES FROM alice;
REVOKE MODIFY ON KEYSPACE keyspace1 FROM bob;
REVOKE DROP ON keyspace1.table1 FROM carlos;
Syntax:
<list-permissions-stmt> ::= LIST ( ALL ( PERMISSIONS )? | <permission> )
                            ( ON <resource> )?
                            ( OF <identifier> ( NORECURSIVE )? )?

<resource> ::= ALL KEYSPACES
             | KEYSPACE <identifier>
             | ( TABLE )? <tablename>
Sample:
LIST ALL PERMISSIONS OF alice;
Show all permissions granted to alice.
LIST ALL PERMISSIONS ON keyspace1.table1 OF bob;
Show all permissions on keyspace1.table1 granted to bob. This also includes any permissions higher up the resource hierarchy which can be applied to keyspace1.table1. For example, should bob have ALTER permission on keyspace1, that would be included in the results of this query. Adding the NORECURSIVE switch restricts the results to only those permissions which were directly granted to bob.
LIST SELECT PERMISSIONS OF carlos;
Show any permissions granted to carlos, limited to SELECT permissions on any resource.
CQL supports a rich set of data types for columns defined in a table, including collection types. On top of those native and collection types, users can also provide custom types (through a Java class extending AbstractType loadable by Cassandra). The syntax of types is thus:
<type> ::= <native-type>
         | <collection-type>
         | <tuple-type>
         | <string>       // Used for custom types. The fully-qualified name of a Java class

<native-type> ::= ascii
                | bigint
                | blob
                | boolean
                | counter
                | decimal
                | double
                | float
                | inet
                | int
                | text
                | timestamp
                | timeuuid
                | uuid
                | varchar
                | varint

<collection-type> ::= list '<' <native-type> '>'
                    | set  '<' <native-type> '>'
                    | map  '<' <native-type> ',' <native-type> '>'

<tuple-type> ::= tuple '<' <type> (',' <type>)* '>'
Note that the native types are keywords and as such are case-insensitive. They are however not reserved ones.
The following table gives additional information on the native data types, and on which kind of constants each type supports:
type | constants supported | description |
---|---|---|
ascii | strings | ASCII character string |
bigint | integers | 64-bit signed long |
blob | blobs | Arbitrary bytes (no validation) |
boolean | booleans | true or false |
counter | integers | Counter column (64-bit signed value). See Counters for details |
decimal | integers, floats | Variable-precision decimal |
double | integers, floats | 64-bit IEEE-754 floating point |
float | integers, floats | 32-bit IEEE-754 floating point |
inet | strings | An IP address. It can be either 4 bytes long (IPv4) or 16 bytes long (IPv6). There is no inet constant; IP addresses should be input as strings |
int | integers | 32-bit signed int |
text | strings | UTF8 encoded string |
timestamp | integers, strings | A timestamp. String constants are allowed to input timestamps as dates, see Working with dates below for more information. |
timeuuid | uuids | Type 1 UUID. This is generally used as a “conflict-free” timestamp. Also see the functions on Timeuuid |
uuid | uuids | Type 1 or type 4 UUID |
varchar | strings | UTF8 encoded string |
varint | integers | Arbitrary-precision integer |
For more information on how to use the collection types, see the Working with collections section below.
Values of the timestamp type are encoded as 64-bit signed integers representing a number of milliseconds since the standard base time known as “the epoch”: January 1 1970 at 00:00:00 GMT.
Timestamps can be input in CQL as simple long integers, giving the number of milliseconds since the epoch, as defined above.
They can also be input as string literals in any of the following ISO 8601 formats, each representing the time and date Feb 3, 2011, at 04:05:00 AM, GMT:
2011-02-03 04:05+0000
2011-02-03 04:05:00+0000
2011-02-03 04:05:00.000+0000
2011-02-03T04:05+0000
2011-02-03T04:05:00+0000
2011-02-03T04:05:00.000+0000
The +0000 above is an RFC 822 4-digit time zone specification; +0000 refers to GMT. US Pacific Standard Time is -0800. The time zone may be omitted if desired, in which case the date will be interpreted as being in the time zone under which the coordinating Cassandra node is configured.
2011-02-03 04:05
2011-02-03 04:05:00
2011-02-03 04:05:00.000
2011-02-03T04:05
2011-02-03T04:05:00
2011-02-03T04:05:00.000
There are clear difficulties inherent in relying on the time zone configuration being as expected, though, so it is recommended that the time zone always be specified for timestamps when feasible.
The time of day may also be omitted, if the date is the only piece that matters:
2011-02-03
2011-02-03+0000
In that case, the time of day will default to 00:00:00, in the specified or default time zone.
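To make the relationship between the string formats and the underlying millisecond value concrete, here is a small Python sketch that converts the zone-qualified literals listed above into milliseconds since the epoch. It is not how Cassandra parses dates; `FORMATS` and `to_epoch_millis` are hypothetical names, and literals without a time zone are deliberately left out because their meaning depends on the coordinator's configuration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# strptime patterns for the zone-qualified formats shown above.
FORMATS = [
    "%Y-%m-%d %H:%M%z",
    "%Y-%m-%d %H:%M:%S%z",
    "%Y-%m-%d %H:%M:%S.%f%z",
    "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%dT%H:%M:%S.%f%z",
    "%Y-%m-%d%z",
]


def to_epoch_millis(text):
    """Return the number of milliseconds since the epoch for a date literal."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next pattern
        return (parsed - EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported timestamp literal: {text!r}")


print(to_epoch_millis("2011-02-03 04:05+0000"))         # 1296705900000
print(to_epoch_millis("2011-02-03T04:05:00.000+0000"))  # 1296705900000
print(to_epoch_millis("2011-02-02 20:05-0800"))         # same instant in PST
print(to_epoch_millis("2011-02-03+0000"))               # midnight: 1296691200000
```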
The counter type is used to define counter columns. A counter column is a column whose value is a 64-bit signed integer and on which 2 operations are supported: incrementation and decrementation (see UPDATE for syntax). Note that the value of a counter cannot be set. A counter doesn’t exist until first incremented/decremented, and the first incrementation/decrementation is made as if the previous value was 0. Deletion of counter columns is supported but has some limitations (see the Cassandra Wiki for more information).
The use of the counter type is limited in the following way:
It cannot be used for a column that is part of the PRIMARY KEY of a table.
Either all the columns of a table outside the PRIMARY KEY have the counter type, or none of them have it.
Collections are meant for storing/denormalizing relatively small amounts of data. They work well for things like “the phone numbers of a given user”, “labels applied to an email”, etc. But when items are expected to grow unbounded (“all the messages sent by a given user”, “events registered by a sensor”, ...), then collections are not appropriate anymore and a specific table (with clustering columns) should be used. Concretely, collections have the following limitations:
Please note that while some of those limitations may or may not be loosened in the future, the general rule that collections are for denormalizing small amounts of data is meant to stay.
A map is a typed set of key-value pairs, where keys are unique. Furthermore, note that maps are internally sorted by their keys and will thus always be returned in that order. To create a column of type map, use the map keyword suffixed with comma-separated key and value types, enclosed in angle brackets. For example:
CREATE TABLE users (
    id text PRIMARY KEY,
    given text,
    surname text,
    favs map<text, text>   // A map of text keys, and text values
)
Writing map data is accomplished with a JSON-inspired syntax. To write a record using INSERT, specify the entire map as a JSON-style associative array. Note: This form will always replace the entire map.
// Inserting (or Updating) INSERT INTO users (id, given, surname, favs) VALUES ('jsmith', 'John', 'Smith', { 'fruit' : 'apple', 'band' : 'Beatles' })
Adding or updating key-values of a (potentially) existing map can be accomplished either by subscripting the map column in an UPDATE statement or by adding a new map literal:
// Updating (or inserting)
UPDATE users SET favs['author'] = 'Ed Poe' WHERE id = 'jsmith'
UPDATE users SET favs = favs + { 'movie' : 'Cassablanca' } WHERE id = 'jsmith'
Note that TTLs are allowed for both INSERT and UPDATE, but in both cases the TTL set only applies to the newly inserted/updated values. In other words,
// Updating (or inserting)
UPDATE users USING TTL 10 SET favs['color'] = 'green' WHERE id = 'jsmith'
will only apply the TTL to the { 'color' : 'green' } record, the rest of the map remaining unaffected.
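The per-entry nature of the TTL can be illustrated with a toy model in which every map entry carries its own expiration time. This Python sketch uses an explicit integer clock in seconds; `Cell` and `live_entries` are hypothetical names, not Cassandra structures.

```python
class Cell:
    """One map entry with its own optional expiration (toy model)."""

    def __init__(self, value, written_at, ttl=0):
        self.value = value
        # A TTL of 0 or a negative one is treated as "never expires".
        self.expires_at = written_at + ttl if ttl > 0 else None

    def is_live(self, now):
        return self.expires_at is None or now < self.expires_at


favs = {"fruit": Cell("apple", written_at=0)}           # written without TTL
favs["color"] = Cell("green", written_at=100, ttl=10)   # UPDATE ... USING TTL 10


def live_entries(now):
    """Return the map as a reader would see it at time `now`."""
    return {key: cell.value for key, cell in favs.items() if cell.is_live(now)}


print(live_entries(105))  # {'fruit': 'apple', 'color': 'green'}
print(live_entries(111))  # {'fruit': 'apple'}: only the TTL'd entry expired
```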
Deleting a map record is done with:
DELETE favs['author'] FROM users WHERE id = 'jsmith'
A set is a typed collection of unique values. Sets are ordered by their values. To create a column of type set, use the set keyword suffixed with the value type enclosed in angle brackets. For example:
CREATE TABLE images (
    name text PRIMARY KEY,
    owner text,
    date timestamp,
    tags set<text>
);
Writing a set is accomplished by comma separating the set values, and enclosing them in curly braces. Note: An INSERT will always replace the entire set.
INSERT INTO images (name, owner, date, tags) VALUES ('cat.jpg', 'jsmith', 'now', { 'kitten', 'cat', 'pet' });
Adding and removing values of a set can be accomplished with an UPDATE by adding/removing set values to/from an existing set column.
UPDATE images SET tags = tags + { 'cute', 'cuddly' } WHERE name = 'cat.jpg';
UPDATE images SET tags = tags - { 'lame' } WHERE name = 'cat.jpg';
As with maps, TTLs, if used, only apply to the newly inserted/updated values.
A list is a typed collection of non-unique values where elements are ordered by their position in the list. To create a column of type list, use the list keyword suffixed with the value type enclosed in angle brackets. For example:
CREATE TABLE plays (
    id text PRIMARY KEY,
    game text,
    players int,
    scores list<int>
)
Do note that as explained below, lists have some limitations and performance considerations to take into account, and it is advised to prefer sets over lists when this is possible.
Writing list data is accomplished with a JSON-style syntax. To write a record using INSERT, specify the entire list as a JSON array. Note: An INSERT will always replace the entire list.
INSERT INTO plays (id, game, players, scores) VALUES ('123-afde', 'quake', 3, [17, 4, 2]);
Adding (appending or prepending) values to a list can be accomplished by adding a new JSON-style array to an existing list column.
UPDATE plays SET players = 5, scores = scores + [ 14, 21 ] WHERE id = '123-afde';
UPDATE plays SET players = 5, scores = [ 12 ] + scores WHERE id = '123-afde';
It should be noted that append and prepend are not idempotent operations. This means that if an append or a prepend operation times out, it is not always safe to retry the operation (as this could result in the record being appended or prepended twice).
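The retry hazard can be seen by replaying the same write twice, as a client might after a timeout whose first attempt actually succeeded. The Python sketch below contrasts a list append with a set addition; the variable and function names are made up for the illustration.

```python
scores = [17, 4, 2]        # stands in for a list<int> column
tags = {"kitten", "cat"}   # stands in for a set<text> column


def append_score(value):
    scores.append(value)   # like: SET scores = scores + [ value ]


def add_tag(value):
    tags.add(value)        # like: SET tags = tags + { value }


# The first attempt is applied but the client sees a timeout and retries.
for _attempt in range(2):
    append_score(14)
    add_tag("cute")

print(scores)        # [17, 4, 2, 14, 14]: the value was appended twice
print(sorted(tags))  # ['cat', 'cute', 'kitten']: the set is unchanged by the retry
```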
Lists also provide the following operations: setting an element by its position in the list, removing an element by its position in the list, and removing all the occurrences of a given value in the list. However, contrary to all the other collection operations, these three operations induce an internal read before the update, and will thus typically have slower performance characteristics. Those operations have the following syntax:
UPDATE plays SET scores[1] = 7 WHERE id = '123-afde';                // sets the 2nd element of scores to 7 (raises an error if scores has less than 2 elements)
DELETE scores[1] FROM plays WHERE id = '123-afde';                   // deletes the 2nd element of scores (raises an error if scores has less than 2 elements)
UPDATE plays SET scores = scores - [ 12, 21 ] WHERE id = '123-afde'; // removes all occurrences of 12 and 21 from scores
As with maps, TTLs, if used, only apply to the newly inserted/updated values.
CQL3 supports a few functions (more to come). Currently, it only supports functions on values (functions that transform one or more column values into a new value); in particular, aggregation functions are not supported. The functions supported are described below:
The token function computes the token for a given partition key. The exact signature of the token function depends on the table concerned and on the partitioner used by the cluster.
The type of the arguments of the token function depends on the type of the partition key columns. The return type depends on the partitioner in use:
For Murmur3Partitioner, the return type is bigint.
For RandomPartitioner, the return type is varint.
For ByteOrderedPartitioner, the return type is blob.
For instance, in a cluster using the default Murmur3Partitioner, if a table is defined by
CREATE TABLE users ( userid text PRIMARY KEY, username text, ... )
then the token function will take a single argument of type text (in that case, the partition key is userid; there are no clustering columns, so the partition key is the same as the primary key), and the return type will be bigint.
The uuid function takes no parameters and generates a random type 4 uuid suitable for use in INSERT or SET statements.
now
The now function takes no arguments and generates a new unique timeuuid (at the time the statement using it is executed). Note that this method is useful for insertion but is largely nonsensical in WHERE clauses. For instance, a query of the form
SELECT * FROM myTable WHERE t = now()
will never return any result by design, since the value returned by now() is guaranteed to be unique.
minTimeuuid and maxTimeuuid
The minTimeuuid (resp. maxTimeuuid) function takes a timestamp value t (which can be either a timestamp or a date string) and returns a fake timeuuid corresponding to the smallest (resp. biggest) possible timeuuid having t as its timestamp. So for instance:
SELECT * FROM myTable WHERE t > maxTimeuuid('2013-01-01 00:05+0000') AND t < minTimeuuid('2013-02-02 10:00+0000')
will select all rows where the timeuuid column t is strictly later than ‘2013-01-01 00:05+0000’ but strictly earlier than ‘2013-02-02 10:00+0000’. Please note that t >= maxTimeuuid('2013-01-01 00:05+0000') would still not select a timeuuid generated exactly at ‘2013-01-01 00:05+0000’ and is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').
Warning: We call the values generated by minTimeuuid and maxTimeuuid fake UUIDs because they do not respect the time-based UUID generation process specified by RFC 4122. In particular, the values returned by these two functions will not be unique. This means you should only use them for querying (as in the example above). Inserting the result of these functions is almost certainly a bad idea.
dateOf and unixTimestampOf
The dateOf and unixTimestampOf functions take a timeuuid argument and extract the embedded timestamp. However, while the dateOf function returns it with the timestamp type (which most clients, including cqlsh, interpret as a date), the unixTimestampOf function returns it as a raw bigint value.
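For example, assuming a table myTable with a timeuuid column t (as in the queries above), both functions can be applied in the select clause to compare the two representations of the same embedded timestamp:

```cql
// dateOf returns a timestamp; unixTimestampOf returns the same instant as a raw bigint.
SELECT dateOf(t), unixTimestampOf(t) FROM myTable;
```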
A number of functions are provided to “convert” the native types into binary data (blob). For every <native-type> supported by CQL3 (written type below; a notable exception is blob, for obvious reasons), the function typeAsBlob takes an argument of type type and returns it as a blob. Conversely, the function blobAsType takes a blob argument and converts it to a value of type type. For instance, blobAsBigint takes a 64-bit blob argument and converts it to a bigint value, so bigintAsBlob(3) is 0x0000000000000003 and blobAsBigint(0x0000000000000003) is 3.
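A short sketch of a round trip through these conversion functions. The payloads table and its columns are hypothetical and serve only to illustrate the calls.

```cql
// Hypothetical table, defined only for this example.
CREATE TABLE payloads ( id text PRIMARY KEY, data blob );

// Store the bigint 3 as its 8-byte blob representation (0x0000000000000003).
INSERT INTO payloads (id, data) VALUES ('answer', bigintAsBlob(3));

// Read the blob back as a bigint.
SELECT blobAsBigint(data) FROM payloads WHERE id = 'answer';
```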
CQL distinguishes between reserved and non-reserved keywords. Reserved keywords cannot be used as identifiers; they are truly reserved for the language (but one can enclose a reserved keyword in double quotes to use it as an identifier). Non-reserved keywords, however, only have a specific meaning in certain contexts and can be used as identifiers otherwise. The only raison d'être of these non-reserved keywords is convenience: a keyword is non-reserved when it is always easy for the parser to decide whether it is being used as a keyword or not.
Keyword | Reserved? |
---|---|
ADD | yes |
ALL | no |
ALTER | yes |
AND | yes |
ANY | yes |
APPLY | yes |
AS | no |
ASC | yes |
ASCII | no |
AUTHORIZE | yes |
BATCH | yes |
BEGIN | yes |
BIGINT | no |
BLOB | no |
BOOLEAN | no |
BY | yes |
CLUSTERING | no |
COLUMNFAMILY | yes |
COMPACT | no |
CONSISTENCY | no |
COUNT | no |
COUNTER | no |
CREATE | yes |
DECIMAL | no |
DELETE | yes |
DESC | yes |
DOUBLE | no |
DROP | yes |
EACH_QUORUM | yes |
FLOAT | no |
FROM | yes |
GRANT | yes |
IN | yes |
INDEX | yes |
CUSTOM | no |
INSERT | yes |
INT | no |
INTO | yes |
KEY | no |
KEYSPACE | yes |
LEVEL | no |
LIMIT | yes |
LOCAL_ONE | yes |
LOCAL_QUORUM | yes |
MODIFY | yes |
NORECURSIVE | yes |
NOSUPERUSER | no |
OF | yes |
ON | yes |
ONE | yes |
ORDER | yes |
PASSWORD | no |
PERMISSION | no |
PERMISSIONS | no |
PRIMARY | yes |
QUORUM | yes |
REVOKE | yes |
SCHEMA | yes |
SELECT | yes |
SET | yes |
STORAGE | no |
SUPERUSER | no |
TABLE | yes |
TEXT | no |
TIMESTAMP | no |
TIMEUUID | no |
THREE | yes |
TOKEN | yes |
TRUNCATE | yes |
TTL | no |
TWO | yes |
TYPE | no |
UPDATE | yes |
USE | yes |
USER | no |
USERS | no |
USING | yes |
UUID | no |
VALUES | no |
VARCHAR | no |
VARINT | no |
WHERE | yes |
WITH | yes |
WRITETIME | no |
DISTINCT | no |
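To illustrate the distinction using two entries from the table above: ORDER is reserved and must be double-quoted to serve as a column name, whereas KEY is non-reserved and can be used as is. The items table is hypothetical. Note that quoted identifiers are case sensitive.

```cql
// KEY is non-reserved, so it can be used unquoted; ORDER is reserved, so it must be quoted.
CREATE TABLE items ( key text PRIMARY KEY, "order" int );

SELECT key, "order" FROM items;
```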
The following type names are not currently used by CQL, but are reserved for potential future use. User-defined types may not use reserved type names as their name.
type |
---|
byte |
smallint |
complex |
enum |
date |
interval |
macaddr |
bitstring |
The following describes the changes in each version of CQL.
- TRUNCATE TABLE X is now accepted as an alias for TRUNCATE X.
- User-defined types are now supported through CREATE TYPE, ALTER TYPE, and DROP TYPE.
- CREATE INDEX now supports indexing collection columns, including indexing the keys of map collections through the keys() function.
- Indexed collections can be queried using the new CONTAINS and CONTAINS KEY operators.
- DROP INDEX now supports optionally specifying a keyspace.
- SELECT statements now support selecting multiple rows in a single partition using an IN clause on combinations of clustering columns. See SELECT WHERE clauses.
- IF NOT EXISTS and IF EXISTS syntax is now supported by CREATE USER and DROP USER statements, respectively.
- A new uuid method has been added.
- Support for the DELETE ... IF EXISTS syntax has been added.
- Support for STATIC columns has been added; see static in CREATE TABLE.
- CREATE INDEX now allows specifying options when creating CUSTOM indexes (see CREATE INDEX reference).
- NaN and Infinity have been added as valid float constants. They are now reserved keywords. In the unlikely case you were using them as a column identifier (or a keyspace/table one), you will now need to double-quote them (see quoted identifiers).
- The SELECT statement now allows listing the partition keys (using the DISTINCT modifier). See CASSANDRA-4536.
- The syntax c IN ? is now supported in WHERE clauses. In that case, the value expected for the bind variable will be a list of whatever type c is.
- Named bind variables are now supported (using :name instead of ?).
- The ALTER TABLE DROP option has been reenabled for CQL3 tables and has new semantics: the space formerly used by dropped columns will now eventually be reclaimed (post-compaction). You should not re-add previously dropped columns unless you use timestamps with microsecond precision (see CASSANDRA-3919 for more details).
- The SELECT statement now supports aliases in the select clause. Aliases in WHERE and ORDER BY clauses are not supported. See the section on SELECT for details.
- CREATE statements for KEYSPACE, TABLE and INDEX now support an IF NOT EXISTS condition. Similarly, DROP statements support an IF EXISTS condition.
- INSERT statements optionally support an IF NOT EXISTS condition and UPDATE supports IF conditions.
- SELECT, UPDATE, and DELETE statements now allow empty IN relations (see CASSANDRA-5626).
- The token method should always be used for range queries on the partition key (see WHERE clauses).
- Type validation of constants is now stricter. Previously, '2' was accepted as a valid value for an int column (interpreted as the equivalent of 2), and 42 as a valid blob value (in which case 42 was interpreted as a hexadecimal representation of the blob). This is no longer the case. See the data types section for details on which constants are allowed for which type.
- Date strings are no longer accepted as valid timeuuid values. Doing so was a bug in the sense that date strings are not valid timeuuid values, and it resulted in confusing behaviors. However, the following new methods have been added to help work with timeuuid: now, minTimeuuid, maxTimeuuid, dateOf and unixTimestampOf. See the section dedicated to these methods for more detail.
- 4.2E10 is now a valid floating point value.

Versioning of the CQL language adheres to the Semantic Versioning guidelines. Versions take the form X.Y.Z where X, Y, and Z are integer values representing major, minor, and patch level respectively. There is no correlation between Cassandra release versions and the CQL language version.
version | description |
---|---|
Major | The major version must be bumped when backward incompatible changes are introduced. This should rarely occur. |
Minor | Minor version increments occur when new, but backward compatible, functionality is introduced. |
Patch | The patch version is incremented when bugs are fixed. |