////////////////////
Licensed to Cloudera, Inc. under one or more contributor license
agreements.  See the NOTICE file distributed with this work for
additional information regarding copyright ownership.  Cloudera, Inc.
licenses this file to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance with the
License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied.  See the License for the specific language governing
permissions and limitations under the License.
////////////////////

=== Troubleshooting

include::DefaultPorts[]

==== What versions of Hadoop HDFS can I use?  How do I change this?

Currently, there are constraints on writing to HDFS.  A Flume node can
only write to one version of Hadoop.  Although Hadoop's HDFS API has
been fairly stable, HDFS clients are only guaranteed to be wire
compatible with the same major version of HDFS.  Cloudera's testing
used Hadoop HDFS 0.20.x and HDFS 0.18.x.  They are API compatible, so
all that is necessary to switch versions is to swap out the Hadoop jar
and restart the node that will write to the other Hadoop version.

You still need to configure this instance of Hadoop so that it talks
to the correct HDFS NameNode.  You configure the Hadoop client
settings (such as pointers to the NameNode) the same way as with
Hadoop DataNodes or worker nodes -- modify and use
`conf/core-site.xml`.

By default, Flume checks for a Hadoop jar in +/usr/lib/hadoop+.  If it
is not present, Flume defaults to a jar found in its own lib
directory, +/usr/lib/flume/lib+.

==== Why doesn't a Flume node appear on Flume Master?

If a node does not show up on the Master, you should first check to
see if the node is running.

----
# jps | grep FlumeNode
----

You can also check the logs found in
+/usr/lib/flume/logs/flume-flume-node*.log+.  Common problems include
error messages or warnings about being unable to contact the masters.
This could be due to host misconfiguration, port misconfiguration
(35872 is the default heartbeat port), or firewall problems.

Another possible cause is a permissions problem with the local
machine's write-ahead log directory.  On an out-of-the-box setup, this
is the +/tmp/flume/agent+ directory.  If a Flume node is ever run as a
user other than +flume+ (especially if it was run as +root+), the
directory needs to either be deleted or have its permissions changed
so that the +flume+ user can use it.
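A minimal sketch of working through these checks is shown below.  It
assumes the default heartbeat port and write-ahead log location
described above; +master-host+ is a placeholder for your actual Master
hostname, and the exact commands may need adjusting for your system.

----
# Is the node process running at all?
jps | grep FlumeNode

# Can this machine reach the Master's heartbeat port?
# (35872 is the default; "master-host" is a placeholder)
nc -z master-host 35872 && echo "master reachable" || echo "cannot reach master"

# If the write-ahead log directory was created by another user (e.g. root),
# either delete it or hand it back to the flume user:
sudo rm -rf /tmp/flume/agent
# ...or...
sudo chown -R flume:flume /tmp/flume/agent
----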
==== Why is the state of a Flume node changing rapidly?

By default, nodes start and choose their +hostname+ as their physical
node name.  Physical node names are supposed to be unique.  Unexpected
results may occur if multiple physical nodes are assigned the same
name.

==== Where are the logs for troubleshooting Flume itself?

On Ubuntu-based installations, logs are written to
+/usr/lib/flume/logs/+.

Master logs :: +/usr/lib/flume/logs/flume-flume-master-_host_.log+
Master stdout :: +/usr/lib/flume/logs/flume-flume-master-_host_.out.*+
Node logs :: +/usr/lib/flume/logs/flume-flume-node-_host_.log+
Node stdout :: +/usr/lib/flume/logs/flume-flume-node-_host_.out.*+

==== What can I do if I get node failures due to running out of file handles?

There are two limits in Linux -- the system-wide maximum number of
open files (328838), and the maximum number of open files allowed per
user (default 1024).  Sockets are file handles, so this limits the
number of open TCP connections available.

On Ubuntu, you need to add a line to +/etc/security/limits.conf+:

----
*       hard    nofile  10000
----

The user should also add the following line to a `~/.bash_profile` to
raise the soft limit to the hard limit:

----
ulimit -n 10000
----

==== Failures when using Disk Failover or Write Ahead Log

Flume currently relies on the file system for its logging mechanisms.
You must make sure that the user running Flume has permission to write
to the specified logging directory.  Currently the default is to write
to +/tmp/flume+.  In a production system you should write to a
directory that does not automatically get deleted on reboot.

==== Can I write data to Amazon S3?

Yes.  In the collector sink or dfs sinks you can use the +s3n://+ or
+s3://+ prefixes to write to S3 buckets.

First you must add some jars to your CLASSPATH in order to support S3
writing.  Here is a set that Cloudera has tested (other versions will
likely work as well):

* commons-codec-1.3.jar
* commons-httpclient-3.0.1.jar
* jets3t-0.6.1.jar

+s3n+ uses the native S3 file system and has some limitations on
individual file sizes.  Files written with this method are compatible
with other programs like +s3cmd+.

+s3+ writes using an overlay file system; data written this way must
go through Hadoop for file system interaction.
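One simple way to get these jars onto the CLASSPATH is to copy them
into Flume's own lib directory (+/usr/lib/flume/lib+, as noted above)
and restart the node.  The sketch below assumes the three jars have
already been downloaded to the current directory; adjust the paths for
your installation.

----
# Assumed: the jars are in the current directory; the destination is
# Flume's lib directory mentioned earlier in this section.
sudo cp commons-codec-1.3.jar \
        commons-httpclient-3.0.1.jar \
        jets3t-0.6.1.jar \
        /usr/lib/flume/lib/

# Then restart the Flume node so the jars are picked up on its CLASSPATH.
----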