0.8.9
=====

Upgrading
---------
- Nothing specific to 0.8.9


0.8.8
=====

Upgrading
---------
- Nothing specific to 0.8.8


0.8.7
=====

Upgrading
---------
- Nothing specific to 0.8.7


0.8.6
=====

Features
--------
- describe_ring now returns both the listen_address and rpc_address


0.8.5
=====

Features
--------
- SSTables copied to a data directory can be loaded by a live node through
  nodetool refresh (which may be handy for loading snapshots).
- The configured compaction throughput is exposed through JMX.

Other
-----
- The sstableloader is now bundled with the Debian package.
- Repair detects when a participating node is dead and fails instead of
  hanging forever.


0.8.4
=====

Upgrading
---------
- Nothing specific to 0.8.4

Other
-----
- This release fixes a bug in counters that could lead to (significant)
  over-counts.
- It also fixes a slight upgrade regression from 0.8.3. It is thus advised
  to jump directly to 0.8.4 if upgrading from before 0.8.3.


0.8.3
=====

Upgrading
---------
- Token removal has been revamped. Removing tokens in a mixed cluster with
  0.8.3 will not work, so the entire cluster will need to be running 0.8.3
  first, except for the dead node.

Features
--------
- It is now possible to use Thrift asynchronous and
  half-synchronous/half-asynchronous servers (see cassandra.yaml for more
  details).
- It is now possible to access counter columns through Hadoop.

Other
-----
- This release fixes a regression in 0.8 that could cause a commit log
  segment to be deleted even though not all of the data it contains had
  been flushed. Upgrading from 0.8.* is strongly encouraged.


0.8.2
=====

Upgrading
---------
- 0.8.0 and 0.8.1 shipped with a bug that set the replicate_on_write
  option for counter column families to false (this option has no effect
  on non-counter column families). This is an unsafe default and 0.8.2
  corrects it: the default for replicate_on_write is now true.
  It is advised to update your counter column family definitions if
  replicate_on_write was incorrectly set to false (before or after
  upgrade).

Tools
-----
- Add new simplified classes to write sstables (to complement the bulk
  loading utility).

Other
-----
- This release fixes a regression in 0.8.1 that caused hinted handoffs
  never to be delivered. Upgrading from 0.8.1 is thus highly encouraged.


0.8.1
=====

Upgrading
---------
- 0.8.1 is backwards compatible with 0.8; the upgrade can be achieved by a
  simple rolling restart.
- If upgrading from an earlier version (0.7), please refer to the 0.8
  section for instructions.

Features
--------
- Numerous additions/improvements to CQL (support for counters, TTL, batch
  inserts/deletes, index dropping, ...).
- Add two new AbstractTypes (comparator) to support compound keys
  (CompositeType and DynamicCompositeType), as well as a ReverseType to
  reverse the order of any existing comparator.
- New option to bypass the commit log on some keyspaces (for advanced
  users).

Tools
-----
- Add new data bulk loading utility (sstableloader).


0.8
===

Upgrading
---------
- Upgrading from version 0.7.1 or later can be done with a rolling
  restart, one node at a time. You do not need to bring down the whole
  cluster at once.
- After upgrading, run nodetool scrub against each node before running
  repair, moving nodes, or adding new ones.
- Running nodetool drain before shutting down the 0.7 node is recommended
  but not required. (Skipping this will result in replay of the entire
  commitlog, so it will take longer to restart but is otherwise harmless.)
- 0.8 is fully API-compatible with 0.7. You can continue to use your 0.7
  clients.
- Avro record classes used in map/reduce and Hadoop streaming code have
  been removed. Map/reduce can be switched to Thrift by changing
  org.apache.cassandra.avro in import statements to
  org.apache.cassandra.thrift (no class names change). Streaming support
  has been removed for the time being.
- The loadbalance command has been removed from nodetool. For similar
  behavior, decommission then rebootstrap with empty initial_token.
- Thrift unframed mode has been removed.
- The addition of key_validation_class means the cli will assume keys are
  bytes, instead of strings, in the absence of other information. See
  http://wiki.apache.org/cassandra/FAQ#cli_keys for more details.

Features
--------
- added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
  Python, respectively (see: drivers/ subdirectory and doc/cql)
- added distributed Counters feature; see
  http://wiki.apache.org/cassandra/Counters
- optional intranode encryption; see comments around 'encryption_options'
  in cassandra.yaml
- compaction multithreading and rate-limiting; see 'concurrent_compactors'
  and 'compaction_throughput_mb_per_sec' in cassandra.yaml
- cassandra will limit total memtable memory usage to 1/3 of the heap by
  default. This can be adjusted or disabled with the
  memtable_total_space_in_mb option. The old per-ColumnFamily throughput,
  operations, and age settings are still respected but will be removed in
  a future major release once we are satisfied that
  memtable_total_space_in_mb works adequately.

Tools
-----
- stress and py_stress moved from contrib/ to tools/
- clustertool was removed (see
  https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples of how
  to script nodetool across the cluster instead)

Other
-----
- In the past, sstable2json wrote column names and values as hex strings;
  it now creates human-readable values based on the comparator/validator.
  As a result, JSON dumps created with older versions of sstable2json are
  no longer compatible with json2sstable, and imports must be made with a
  configuration that is identical to the export.
- manually-forced compactions ("nodetool compact") will do nothing if only
  a single SSTable remains for a ColumnFamily.
  To force it to compact that anyway (which will free up space if there
  are a lot of expired tombstones), use the new forceUserDefinedCompaction
  JMX method on CompactionManager.
- most of contrib/ (which was not part of the binary releases) has been
  moved either to examples/ or tools/. We plan to move the rest for 0.8.1.

JMX
---
- By default, JMX now listens on port 7199.


0.7.6
=====

Upgrading
---------
- Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading from
  earlier than 0.7.1.


0.7.5
=====

Upgrading
---------
- Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading from
  earlier than 0.7.1.

Changes
-------
- system_update_column_family no longer snapshots before applying the
  schema change. (_update_keyspace never did. _drop_keyspace and
  _drop_column_family continue to snapshot.)
- added memtable_flush_queue_size option to cassandra.yaml to avoid
  blocking writes when multiple column families (or a column family with
  indexes) are flushed at the same time.
- allow overriding initial_token, storage_port and rpc_port using system
  properties


0.7.4
=====

Upgrading
---------
- Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading from
  earlier than 0.7.1.

Features
--------
- Output to Pig is now supported as well as input


0.7.3
=====

Upgrading
---------
- 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level bloom
  filters to be generated when compacting sstables generated with earlier
  versions. This would manifest in IOExceptions during column name-based
  queries. 0.7.3 provides "nodetool scrub" to rebuild sstables with
  correct bloom filters, with no data lost. (If your cluster was never on
  0.7.0 or earlier, you don't have to worry about this.) Note that
  nodetool scrub will snapshot your data files before rebuilding, just in
  case.


0.7.1
=====

Upgrading
---------
- 0.7.1 is completely backwards compatible with 0.7.0. Just restart each
  node with the new version, one at a time.
  (The cluster does not all need to be upgraded simultaneously.)

Features
--------
- added flush_largest_memtables_at and reduce_cache_sizes_at options to
  cassandra.yaml as an escape valve for memory pressure
- added option to specify -Dcassandra.join_ring=false on startup to allow
  "warm spare" nodes or performing JMX maintenance before joining the ring

Performance
-----------
- Disk writes and sequential scans avoid polluting page cache (requires
  JNA to be enabled)
- Cassandra performs writes efficiently across datacenters by sending a
  single copy of the mutation and having the recipient forward that to
  other replicas in its datacenter.
- Improved network buffering
- Reduced lock contention on memtable flush
- Optimized supercolumn deserialization
- Zero-copy reads from mmapped sstable files
- Explicitly set higher JVM new generation size
- Reduced i/o contention during saving of caches


0.7.0
=====

Features
--------
- Secondary indexes (indexes on column values) are now supported
- Row size limit increased from 2GB to 2 billion columns. Rows are no
  longer read into memory during compaction.
- Keyspace and ColumnFamily definitions may be added and modified live
- Streaming data for repair or node movement no longer requires
  anticompaction step first
- NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
  use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments
  in `cassandra.yaml`.
- Optional per-Column time-to-live field allows expiring data without
  having to issue explicit remove commands
- `truncate` thrift method allows clearing an entire ColumnFamily at once
- Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
  support
- Up to 8x faster reads from row cache
- A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
  and orders keys by their byte value. This should be used in new
  deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant
  clusters
- Dynamic endpoint snitch mitigates the impact of impaired nodes
- New `IntegerType`, which is faster than LongType and allows integers
  with both fewer and more bits than Long's 64
- A revamped authentication system that decouples authorization and allows
  finer-grained control of resources.

Upgrading
---------
The Thrift API has changed in incompatible ways; see below, and refer to
http://wiki.apache.org/cassandra/ClientOptions for a list of higher-level
clients that have been updated to support the 0.7 API.

The Cassandra inter-node protocol is incompatible with 0.6.x releases (and
with 0.7 beta1), meaning you will have to bring your cluster down prior to
upgrading: you cannot mix 0.6 and 0.7 nodes.

The hints schema was changed from 0.6 to 0.7. Cassandra automatically
snapshots and then truncates the hints column family as part of starting
up 0.7 for the first time.

Keyspace and ColumnFamily definitions are stored in the system keyspace,
rather than the configuration file.

The process to upgrade is:
1) run "nodetool drain" on _each_ 0.6 node. When drain finishes (log
   message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
   "bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
   to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
   "bin/schematool import". _You only need to do this to one node_.

Thrift API
----------
- The Cassandra server now defaults to framed mode, rather than unframed.
  Unframed is obsolete and will be removed in the next major release.
- The Cassandra Thrift interface file has been updated for Thrift 0.5. If
  you are compiling your own client code from the interface, you will need
  to upgrade the Thrift compiler to match.
- Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
  returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
  CollatingOrderPreservingPartitioner continue to expect that keys contain
  UTF-8 encoded strings, but RandomPartitioner now works on any key data.
- keyspace parameters have been replaced with the per-connection
  set_keyspace method.
- The return type for login() is now AccessLevel.
- The get_string_property() method has been removed.
- The get_string_list_property() method has been removed.

Configuration
-------------
- Configuration file renamed to cassandra.yaml and log4j.properties to
  log4j-server.properties
- PropertyFileSnitch configuration file renamed to
  cassandra-topology.properties
- The ThriftAddress and ThriftPort directives have been renamed to
  RPCAddress and RPCPort respectively.
- EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch
  has been added.
- RackUnawareStrategy and RackAwareStrategy have been renamed to
  SimpleStrategy and OldNetworkTopologyStrategy, respectively.
- RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
- GCGraceSeconds is now per-ColumnFamily instead of global
- Keyspace and column family names that do not conform to a '^\w+' regex
  are considered illegal.
- Keyspace and column family definitions will need to be loaded via
  "bin/schematool import". _You only need to do this to one node_.
- In addition to an authenticator, an authority must be configured as
  well. Users of SimpleAuthenticator should use SimpleAuthority for this
  value (the default is AllowAllAuthority, which corresponds with
  AllowAllAuthenticator).
- The format of access.properties has changed; see the sample
  configuration conf/access.properties for documentation on the new
  format.
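
Before converting a 0.6 configuration, it can help to list which keyspace
and column family names would be rejected by the '^\w+' naming rule. The
following is only a sketch in Python, not part of Cassandra: the function
name illegal_names is hypothetical, and it assumes that the whole name must
consist of word characters and that "word character" means ASCII letters,
digits and underscore. The notes above give only the regex, so both points
are assumptions.

```python
import re

# Assumption: the entire name must be made of word characters, and \w is
# limited to [A-Za-z0-9_] (hence re.ASCII). The notes only quote '^\w+'.
_LEGAL_NAME = re.compile(r"\w+", re.ASCII)


def illegal_names(names):
    """Return the names that would need renaming, in their original order."""
    return [name for name in names if not _LEGAL_NAME.fullmatch(name)]


if __name__ == "__main__":
    # Example names are made up for illustration.
    candidates = ["Keyspace1", "my-keyspace", "Standard 1", "users_2"]
    for name in illegal_names(candidates):
        print("rename before upgrading:", name)
```

Run against the names from storage-conf.xml, anything printed is a
candidate for renaming in step 3 of the upgrade process.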
JMX
---
- StreamingService moved from o.a.c.streaming to o.a.c.service
- GMFD renamed to GOSSIP_STAGE
- {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize since it
  no longer has to wait until compaction to be computed

Other
-----
- If extending AbstractType, make sure you follow the singleton pattern
  followed by Cassandra core AbstractType classes: provide a public static
  final variable called 'instance'.


0.6.6
=====

Upgrading
---------
- As part of the cache-saving feature, a third directory (along with data
  and commitlog) has been added to the config file. You will need to set
  and create this directory when restarting your node into 0.6.6.


0.6.1
=====

Upgrading
---------
- We try to keep minor versions 100% compatible (data format, commitlog
  format, network format) within the major series, but we introduced a
  network-level incompatibility in 0.6.1. Thus, if you are upgrading from
  0.6.0 to any higher version (0.6.1, 0.6.2, etc.) then you will need to
  restart your entire cluster with the new version, instead of being able
  to do a rolling restart.


0.6.0
=====

Features
--------
- row caching: configure with the RowsCached attribute in ColumnFamily
  definition
- Hadoop map/reduce support: see contrib/word_count for an example
- experimental authentication support, described under Authenticator in
  storage.conf

Configuration
-------------
- MemtableSizeInMB has been replaced by MemtableThroughputInMB which
  triggers a memtable flush when the specified amount of data has been
  written, including overwrites.
- MemtableObjectCountInMillions has been replaced by the
  MemtableOperationsInMillions directive which causes a memtable flush to
  occur after the specified number of operations.
- Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
  BinaryMemtableThroughputInMB.
- Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached
- RowWarningThresholdInMB added, to warn before very large rows get big
  enough to threaten node stability

Thrift API
----------
- removed deprecated get_key_range method
- added batch_mutate method
- deprecated multiget and batch_insert methods in favor of multiget_slice
  and batch_mutate, respectively
- added ConsistencyLevel.ANY, for when you want write availability even
  when it may not be readable immediately. Unlike CL.ZERO, though, it will
  throw an exception if it cannot be written *somewhere*.

JMX metrics
-----------
- read and write statistics are reported as lifetime totals, instead of
  averages over the last minute. average-since-last-requested values are
  also available for convenience.
- cache hit rate statistics are now available from JMX under
  org.apache.cassandra.db.Caches
- compaction JMX metrics are moved to
  org.apache.cassandra.db.CompactionManager. PendingTasks is now a much
  better estimate of compactions remaining, and the progress of the
  current compaction has been added.
- commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
- progress of data streaming during bootstrap, loadbalance, or other data
  migration, is available under
  org.apache.cassandra.streaming.StreamingService. See
  http://wiki.apache.org/cassandra/Streaming for details.

Installation/Upgrade
--------------------
- 0.6 network traffic is not compatible with earlier versions. You will
  need to shut down all your nodes at once, upgrade, then restart.


0.5.0
=====

0. The commitlog format has changed (but sstable format has not). When
   upgrading from 0.4, empty the commitlog either by running
   bin/nodeprobe flush on each machine and waiting for the flush to
   finish, or simply remove the commitlog directory if you only have test
   data. (If more writes come in after the flush command, starting 0.5
   will error out; if that happens, just go back to 0.4 and flush again.)
   The format changed twice: from 0.4 to beta1, and from beta2 to RC1.
.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist in a
   cluster of 0.4 nodes or vice versa; you must upgrade your whole cluster
   at the same time.
1. Bootstrap, move, load balancing, and active repair have been added. See
   http://wiki.apache.org/cassandra/Operations. When upgrading from 0.4,
   leave autobootstrap set to false for the first restart of your old
   nodes.
2. Performance improvements across the board, especially on the write path
   (over 100% improvement in stress.py throughput).
3. Configuration:
   - Added "comment" field to ColumnFamily definition.
   - Added MemtableFlushAfterMinutes, a global replacement for the old
     per-CF FlushPeriodInMinutes setting
   - Key cache settings
4. Thrift:
   - Added get_range_slice, deprecating get_key_range


0.4.2
=====

1. Improve default garbage collector options significantly -- throughput
   will be 30% higher or more.


0.4.1
=====

1. SnapshotBeforeCompaction configuration option allows snapshotting
   before each compaction, which allows rolling back to any version of the
   data.


0.4.0
=====

1. On-disk data format has changed to allow billions of keys/rows per node
   instead of only millions. The new format is incompatible with 0.3; see
   0.3 notes below for how to import data from a 0.3 install.
2. Cassandra now supports multiple keyspaces. Typically you will have one
   keyspace per application, allowing applications to be able to create
   and modify ColumnFamilies at will without worrying about collisions
   with others in the same cluster.
3. Many Thrift API changes and documentation. See
   http://wiki.apache.org/cassandra/API
4. Removed the web interface in favor of JMX and bin/nodeprobe, which has
   significantly enhanced functionality.
5. Renamed configuration "" to "".
6. Added commitlog fsync; see "" in configuration.


0.3.0
=====

1.
   With enough and large enough keys in a ColumnFamily, Cassandra will run
   out of memory trying to perform compactions (data file merges). The
   size of what is stored in memory is (S + 16) * (N + M) where S is the
   size of the key (usually 2 bytes per character), N is the number of
   keys and M is the map overhead (which can be estimated at around 32
   bytes per key). So, if you have 10-character keys and 1GB of headroom
   in your heap space for compaction, you can expect to store about 17M
   keys before running into problems.
   See https://issues.apache.org/jira/browse/CASSANDRA-208

2. Because fixing #1 requires a data file format change, 0.4 will not be
   binary-compatible with 0.3 data files. A client-side upgrade can be
   done relatively easily with the following algorithm:

     for key in old_client.get_key_range(everything):
         columns = old_client.get_slice or get_slice_super(key, all columns)
         new_client.batch_insert or batch_insert_super(key, columns)

   The inner loop can be trivially parallelized for speed.

3. Commitlog does not fsync before reporting a write successful. Using
   blocking writes mitigates this to some degree, since all nodes that
   were part of the write quorum would have to fail before sync for data
   to be lost.
   See https://issues.apache.org/jira/browse/CASSANDRA-182

Additionally, row size (that is, all the data associated with a single key
in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.
See https://issues.apache.org/jira/browse/CASSANDRA-16
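
The client-side upgrade loop from item 2 of the 0.3.0 notes, including the
parallelized inner loop, might be sketched in Python as follows. This is an
illustration only: the OldClient and NewClient interfaces and the
InMemoryStore class are hypothetical stand-ins rather than the real Thrift
bindings, and the method signatures are simplified assumptions (only the
method names get_key_range, get_slice and batch_insert come from the notes).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Protocol

Columns = Dict[str, bytes]


class OldClient(Protocol):
    """Hypothetical view of a 0.3 client; signatures are assumptions."""
    def get_key_range(self) -> Iterable[str]: ...
    def get_slice(self, key: str) -> Columns: ...


class NewClient(Protocol):
    """Hypothetical view of a 0.4 client; signatures are assumptions."""
    def batch_insert(self, key: str, columns: Columns) -> None: ...


class InMemoryStore:
    """Toy stand-in implementing both interfaces so the sketch can run."""
    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def get_key_range(self):
        return sorted(self.rows)

    def get_slice(self, key):
        return dict(self.rows[key])

    def batch_insert(self, key, columns):
        self.rows[key] = dict(columns)


def migrate(old: OldClient, new: NewClient, workers: int = 4) -> int:
    """Copy every row from old to new; return the number of keys copied."""
    keys = list(old.get_key_range())

    def copy_row(key: str) -> None:
        new.batch_insert(key, old.get_slice(key))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the results so a failure in any worker is raised here.
        list(pool.map(copy_row, keys))
    return len(keys)


if __name__ == "__main__":
    source = InMemoryStore({"a": {"c1": b"1"}, "b": {"c2": b"2"}})
    target = InMemoryStore()
    print("rows copied:", migrate(source, target))
```

Each row is copied independently, which is why the per-key work can be
handed to a thread pool. A real migration would also need to handle super
columns (get_slice_super / batch_insert_super) and retries, which this
sketch leaves out.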