Flume 0.9.5 =========== Files written by a collector now have a '.tmp' suffix that indicates that the file is still being written. When the file is completed, this suffix is removed. Flume 0.9.4 =========== Old Thrift sinks have been removed and are no longer available. These are 'tacksink' and 'trawsink'. Additionally, the old aliases 'tsink' and 'tsource' are no longer present. Use the full names 'thriftSink' and 'thriftSource' instead. The setting 'flume.collector.dfs.compress.gzip' was deprecated in a previous release. It has now been removed. You should use the 'flume.collector.dfs.compress.codec' setting instead. Flume 0.9.3 =========== There have been some changes that removed advanced and purposely undocumented sinks such as the 'let' sink and the 'failchain' sink. Additions to the language include a new syntax for decorators, python-style keyword arguments, and a new 'collector' syntax that allows for multiple collector destinations. Flume 0.9.2 =========== Multiple masters does not work in conjunction with automatic failover chains configurations. This version significantly reduces the chances of data duplication encountered due to the FLUME-252 (races in tail) bug. However, it does not completely fix the problem. A workaround is to use 'exec("tail -F ")' instead of the tail source Linux/Unix systems. A new issue has been filed as FLUME-320 to continue tracking this problem. Flume 0.9.2 is *not backwards compatible* with <0.9.2 installations due to FLUME-216 - this release changes how configurations and node maps are persisted to ZooKeeper and 0.9.2 masters cannot read configurations or node maps written by masters from <0.9.2 installations. There is no upgrade path yet to preserve both node maps and configurations, but one is planned for the 0.9.2 release. The bug that caused FLUME-286 has been fixed for Thrift RPCs but not for Avro-based RPC mechanisms. Previous to the fix, the Thrift version would not recover when a downstream RPC server had a network partion such as a wire cut or power down. Unlike killing a server, these situations provided no failure feedback. The Avro version currently does not properly detect or recover from these situations so the DFO mode can lose data. This is reported as issue FLUME-313. Flume 0.9.1 Update 1 (CDH3b3) Release Notes =========================================== This release is a bugfix update to the 0.9.1 release. The only new features added to this release are: * Support for writing to kerberized HDFS. * Some minor fixes that prevent some user errors * Some cosmetic fixes to the web interface. Bug fixes: * Critical bug fixes having to do with node hangs or situations where agents do not properly recover from failures. Many documentation updates including: * A "cookbook" of example use case tutorials * Documentation corrections * More details on plugin behavior Compatiblity Changes: * When run from the command line, Flume now defaults to storing the agent data in /tmp/flume-${user.name} instead of /tmp/flume. When upgrading an existing installation, if you have not configured an alternate path, please rename /tmp/flume to /tmp/flume-flume Known bugs (includes 0.9.1 issues): * Tail is updated to potentially fix a race condition but races are still possible (FLUME-218). Another implementation is pending for a 0.9.2 release (FLUME-252). Flume 0.9.1 Release notes ========================= This release improves and adds several key features including: * gzip HDFS output file compression * tailDir source * Exposed internal configuration and extension information via web pages * Automated eclipse project generation via ant Documentation: * Added documentation to source repository * Example plugins and documentation on plugins. * Numerous typographic corrections. There were several critical fixes in this release as well. * Fixed support for data coming from scribe * Fixed syslog source underflow problem * Fixed issues with hangs when changing or decommissioning certain node configurations * Prevent certain user errors Compatability changes: * We no longer specifiy a -Xmx memory setting for the JVM. By default the JVM will set this value to 128MB. This value may need to be increased on nodes. (FLUME-72) Now Stable: * Fully-Distributed Mode * Sink/Source/Decorator Plugins * Automatic versions of failover. * Logical Naming translations. Highlighted known bugs: * Multiple Flume Masters does not work with automatic failover chains and automatic flow isolation. (FLUME-114, FLUME-150) * There are several known performance bugs (FLUME-264, FLUME-192). * One event is lost in BE mode and two events are lost in DFO mode when recovering from collector outage. These losses are technically allowed but can probably be improved. (FLUME-257) * Tail has a race condition that results in event duplication (FLUME-218) and there are issues with non ascii characters (FLUME-205). Flume 0.9.0 (CDH3b2) Release Notes ================================== This release of Flume implements the core framework and has implementations of most of the major features of the Flume system described in the Flume User Guide. Some sections are stable, while others are under active development. We also list several known bugs and instabilities. Supported platforms: Linux, requires Java 1.6, not tested on windows Stable: * Introduction / Design * Quick Start * Psuedo-Distributed Mode (node configuration via master) ** Simple web interface ** Agent/collector roles * Fully-Distributed Mode section: ** Static configuration of failover chains. ** Single master Memory-Backed and Zookeeper-Backed Configuration Store. * Data model of Flume event * Output Bucketing with the CollectorSink * Flume's Dataflow language * Flume Shell Stabilizing: * Fully-Distributed Mode: ** Modes for unreliable network conditions in testing ** Disk failover mode and End-to-end modes ** DFO and E2E retransmission is unstable when connected to a network sink. * Sink/Source/Decorator Plugins Active Development: * Automatic versions of failover. * Logical Naming translations. * Multiple Flume Masters * Output Formatting options * Custom metadata extraction Known Bugs * ZK session timeout makes master unrecoverable * Shell does not provide useful feedback on errors * Translated configurations are not always auto-updated when logical nodes are newly spawned onto physical nodes. An extra 'refreshAll' command may need to be executed. * Two collectorSources on the same node sometimes fails to translate the more recent collectorSource * Logical node names must be unique * Logical nodes with the same name as the physical nodes on which they run cannot be deleted