Frequently Asked Questions about log4j

Ceki Gülcü
May 2001


What is log4j?

log4j is a tool to help the programmer output log statements to a variety of output targets.

In case of problems with an application, it is helpful to enable logging so that the problem can be located. With log4j it is possible to enable logging at runtime without modifying the application binary. The log4j package is designed so that log statements can remain in shipped code without incurring a high performance cost. It follows that the speed of logging (or rather not logging) is paramount.

At the same time, log output can be so voluminous that it quickly becomes overwhelming. One of the distinctive features of log4j is the notion of hierarchical categories. Using categories it is possible to selectively control which log statements are output at arbitrary granularity.

log4j is designed with two distinct goals in mind: speed and flexibility. There is a tight balance between these two requirements. I believe that log4j strikes the right balance.

Is log4j a reliable logging system?

No. log4j is not reliable. It is a best-effort and fail-stop logging system.

By fail-stop, we mean that log4j will not throw unexpected exceptions at run-time potentially causing your application to crash. If for any reason, log4j throws an uncaught exception, please send an email to the log4j-user@jakarta.apache.org mailing list. Uncaught exceptions are handled as serious bugs requiring immediate attention.

Moreover, log4j will not revert to System.out or System.err when its designated output stream is not opened, is not writable or becomes full. This avoids corrupting an otherwise working program by flooding the user's terminal because logging fails. However, log4j will output a single message to System.err indicating that logging cannot be performed.

What are the prerequisites for log4j?

Is there javadoc documentation for log4j?

The javadoc documentation is part of the log4j package. There is also an introductory manual. In case of problems, make sure to have a look at the log4j troubleshooting document.

What other logging packages are there?

There are many other logging packages out there. I know of Grace Software's JavaLog, JLog, Software ZOO's toolkit, and the SDSU logging package. This list is not exhaustive.

Is there example code for using log4j?

There is a directory containing examples in org/log4j/examples. See also org/log4j/xml/examples.

What are the features of log4j?

Is log4j thread-safe?

Yes, log4j is thread-safe.

What does log output look like?

The log output can be customized in many ways. Moreover, one can completely override the output format by implementing one's own Layout.

Here is an example output using PatternLayout with the conversion pattern "%r [%t] %-5p %c{2} %x - %m%n"

176 [main] INFO  examples.Sort - Populating an array of 2 elements in reverse order.
225 [main] INFO  examples.SortAlgo - Entered the sort method.
262 [main] DEBUG SortAlgo.OUTER i=1 - Outer loop.
276 [main] DEBUG SortAlgo.SWAP i=1 j=0 - Swapping intArray[0] = 1 and intArray[1] = 0
290 [main] DEBUG SortAlgo.OUTER i=0 - Outer loop.
304 [main] INFO  SortAlgo.DUMP - Dump of interger array:
317 [main] INFO  SortAlgo.DUMP - Element [0] = 0
331 [main] INFO  SortAlgo.DUMP - Element [1] = 1
343 [main] INFO  examples.Sort - The next log statement should be an error message.
346 [main] ERROR SortAlgo.DUMP - Tried to dump an uninitialized array.
        at org.log4j.examples.SortAlgo.dump(SortAlgo.java:58)
        at org.log4j.examples.Sort.main(Sort.java:64)
467 [main] INFO  examples.Sort - Exiting main method.

The first field is the number of milliseconds elapsed since the start of the program. The second field is the thread outputting the log statement. The third field is the priority of the log statement. The fourth field is the rightmost two components of the category making the log request. The fifth field (just before the '-') is the nested diagnostic context (NDC). Note the nested diagnostic context may be empty as in the first two statements. The text after the '-' is the message of the statement.
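To make the field order concrete, here is a minimal sketch that assembles a line in the same shape as the conversion pattern above. It does not use log4j itself: the class PatternSketch and its two methods are hypothetical names invented for this illustration, and the sketch only mimics this one pattern.

```java
class PatternSketch {
    // Keep only the rightmost `count` dot-separated components of a
    // category name, which is what %c{2} does for count = 2.
    static String lastComponents(String category, int count) {
        int idx = category.length();
        for (int i = 0; i < count; i++) {
            idx = category.lastIndexOf('.', idx - 1);
            if (idx < 0) {
                return category; // fewer components than requested: keep the whole name
            }
        }
        return category.substring(idx + 1);
    }

    // Mimics "%r [%t] %-5p %c{2} %x - %m%n" for already-collected fields.
    // An empty ndc simply leaves nothing between the category and the dash.
    static String format(long elapsedMillis, String thread, String priority,
                         String category, String ndc, String message) {
        return elapsedMillis + " [" + thread + "] "
            + String.format("%-5s", priority) + " "   // %-5p: left-justified, padded to 5
            + lastComponents(category, 2) + " "       // %c{2}
            + ndc + " - " + message                    // %x - %m
            + System.lineSeparator();                  // %n
    }
}
```

Assuming the full category name is "org.log4j.examples.SortAlgo.OUTER" (the sample only shows its last two components), calling PatternSketch.format(262, "main", "DEBUG", "org.log4j.examples.SortAlgo.OUTER", "i=1", "Outer loop.") produces the same text as the third line of the sample output.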

What are Categories?

The notion of categories lies at the heart of log4j. Categories define a hierarchy and give the programmer run-time control on which statements are printed or not.

Categories are assigned priorities. A log statement is printed depending on its priority and its category.

Make sure to read the log4j manual for more information.
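As a rough illustration of how a hierarchy of named categories can decide what gets printed, here is a self-contained sketch. It is not the log4j implementation: CategorySketch, its integer priority constants and its fields are hypothetical, and it assumes that a category with no assigned priority takes the priority of its nearest ancestor.

```java
import java.util.HashMap;
import java.util.Map;

class CategorySketch {
    static final int DEBUG = 0, INFO = 1, WARN = 2, ERROR = 3, FATAL = 4;

    private static final Map<String, CategorySketch> registry = new HashMap<>();
    private static final CategorySketch root = new CategorySketch("root", null);
    static { root.assigned = DEBUG; } // the root always has a priority

    final String name;
    final CategorySketch parent;
    Integer assigned; // null means "use the nearest ancestor's priority"

    private CategorySketch(String name, CategorySketch parent) {
        this.name = name;
        this.parent = parent;
    }

    // "x.y" is the parent of "x.y.z"; names without a dot hang off the root.
    // For simplicity this sketch creates every missing ancestor eagerly.
    static CategorySketch getInstance(String name) {
        CategorySketch c = registry.get(name);
        if (c == null) {
            int dot = name.lastIndexOf('.');
            CategorySketch parent = dot < 0 ? root : getInstance(name.substring(0, dot));
            c = new CategorySketch(name, parent);
            registry.put(name, c);
        }
        return c;
    }

    // Walk up the parent links until a category with an assigned priority is found.
    int effectivePriority() {
        for (CategorySketch c = this; c != null; c = c.parent) {
            if (c.assigned != null) {
                return c.assigned;
            }
        }
        throw new IllegalStateException("root must have a priority");
    }

    // A statement is printed only if its priority is at least the effective one.
    boolean isEnabledFor(int priority) {
        return priority >= effectivePriority();
    }
}
```

With "x.y" set to WARN, the category "x.y.z" rejects INFO statements and accepts ERROR statements, while an unrelated category such as "a.b" still follows the root.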

How can I change log behavior at runtime?

Log behavior can be set using configuration files which are parsed at runtime. Using configuration files the programmer can define categories and set their priorities.

The PropertyConfigurator defines a particular format of a configuration file. See also the org.log4j.examples.Sort example and associated configuration files.

Configuration files can be specified in XML. See log4j.dtd and org.log4j.xml.DOMConfigurator for more details.

See the various Layout and Appender components for specific configuration options.

In addition to configuration files, the user may disable all messages belonging to a set of priorities. See next item.

How can I reduce the computational cost of debug and info statements?

For public releases of your code, calling the BasicConfigurator.disable(pri) method will disable all messages of priority pri and below.

In cases of problems with an application, technical support can re-enable logging by setting the log4j.disableOverride system property without changing the binary at the client's site.

What is the fastest way of (not) logging?

For some category cat, writing,

  cat.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));

incurs the cost of constructing the message parameter, that is, converting both integer i and entry[i] to a String and concatenating the intermediate strings. This cost is paid regardless of whether the message will be logged or not.

If you are worried about speed, then write

   if(cat.isDebugEnabled()) {
     cat.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));
   }

This way you will not incur the cost of parameter construction if debugging is disabled for category cat. On the other hand, if the category is debug enabled, you will incur the cost of evaluating whether the category is enabled or not twice: once in isDebugEnabled and once in debug. This is an insignificant overhead since evaluating a category takes less than 1% of the time it takes to actually log a statement.
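The difference can be observed without log4j. The sketch below counts how often the message argument is converted to a String when debugging is disabled. GuardSketch, Entry and DisabledCategory are hypothetical stand-ins written for this illustration, not log4j classes.

```java
class GuardSketch {
    static int conversions = 0;

    // Stand-in for a value that is costly to convert to a String.
    static class Entry {
        @Override
        public String toString() {
            conversions++; // record every conversion
            return "entry";
        }
    }

    // Hypothetical stand-in for a category with debugging disabled.
    static class DisabledCategory {
        boolean isDebugEnabled() { return false; }
        void debug(String message) { /* dropped: debug is disabled */ }
    }

    // The message is built on every call even though it is never printed.
    static int unguarded(DisabledCategory cat, Entry[] entry) {
        conversions = 0;
        for (int i = 0; i < entry.length; i++) {
            cat.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));
        }
        return conversions;
    }

    // The guard skips message construction entirely.
    static int guarded(DisabledCategory cat, Entry[] entry) {
        conversions = 0;
        for (int i = 0; i < entry.length; i++) {
            if (cat.isDebugEnabled()) {
                cat.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));
            }
        }
        return conversions;
    }
}
```

For an array of three entries, unguarded performs three conversions and guarded performs none.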

What is the use of the debug method expecting a String array as one of its parameters?

This method no longer exists. Use the Category.isDebugEnabled method instead.

Why was the Category class introduced and how do I migrate from the previous String based implementation?

The reason was speed, speed, speed.

In the former implementation, when evaluating whether a category should be logged or not, we potentially computed a hash and performed an equality check multiple times, once for each higher ranking category. For example, if the category name was "x.y.z", we computed the hash of "x.y.z" and checked if it was already defined (costing an equality check). If not, we parsed "x.y.z" to discover that "x.y" was higher ranking, then computed the hash of "x.y" and checked whether it was defined (costing another equality check). So on, until a valid category was found or there were no possible categories left.

It turns out that for long strings, hash computations and equality checks are computationally expensive operations.
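The lookup described above can be sketched as follows. This is a reconstruction for illustration only, not the former log4j code: StringLookupSketch, its map of defined names and its lookup counter are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class StringLookupSketch {
    // Category name -> priority, for explicitly defined categories only.
    final Map<String, String> defined = new HashMap<>();
    int lookups = 0; // each lookup costs a hash computation and an equality check

    // Walk "x.y.z" -> "x.y" -> "x" until a defined category is found.
    String resolve(String name) {
        String current = name;
        while (true) {
            lookups++;
            String priority = defined.get(current);
            if (priority != null) {
                return priority;
            }
            int dot = current.lastIndexOf('.');
            if (dot < 0) {
                return null; // no possible categories left
            }
            current = current.substring(0, dot);
        }
    }
}
```

If only "x" is defined, resolving "x.y.z" takes three lookups, and that price is paid again on every log statement issued under that name.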

The new Category class retains the flexibility of the former implementation and offers much better performance. I would go as far as to claim that the performance cannot be improved upon without losing functionality. Please do not hesitate to debunk this assertion. Contributions from Alex Blewitt, F. Hoering and M. Oestreicher were instrumental to these performance improvements.

The new syntax for defining a category is

  
  Category cat = Category.getInstance("x.y.z");
  cat.setPriority(Priority.DEBUG);

Previously, to achieve a similar effect, one had to write

  log.setCategory("x.y.z", "DEBUG"); // where log is an instance of Log

As of release 0.8.0, the syntax was further modified so that log statements (debug, info, ... methods) no longer need a log singleton but use a Category instance instead.

For some class X one previously wrote,

package a.b.c;

class X {
  static String cat = "a.b.c.X";

  void foo() {
    log.debug(cat, "Some foo message");
    ...
  }
}

This code needs to be modified as follows:

package a.b.c;

import org.log4j.Category; 

class X {
  static Category cat = Category.getInstance("a.b.c.X");

  void foo() {
    cat.debug("Some foo message");
    ...
  }
}

Are there any suggested ways for naming categories?

Yes, there are.

You can name categories by locality. It turns out that instantiating a category in each class, with the category name equal to the fully-qualified name of the class, is a useful and straightforward approach to defining categories. This approach has many benefits.

However, this is not the only way for naming categories. A common alternative is to name categories by functional areas. For example, the "database" category, "RMI" category, "security" category, or the "XML" category.

You may choose to name categories by functionality and subcategorize by locality, as in "DATABASE.com.ibm.some.package.someClass" or "DATABASE.com.ibm.some.other.package.someOtherClass".

You are totally free in choosing the names of your categories. The log4j package merely allows you to manage your names in a hierarchy. However, it is your responsibility to define this hierarchy.

Note that by naming categories by locality one tends to name things by functionality, since in most cases the locality relates closely to functionality.

How do I get the fully-qualified name of a class in a static block?

You can easily retrieve the fully-qualified name of a class in a static block for class X with the expression X.class.getName(). Note that X is the class name and not an instance. The X.class expression does not create a new instance of class X.

Here is the suggested usage template:

package a.b.c;

public class Foo {
  static Category cat = Category.getInstance(Foo.class.getName());
  ... other code

}

Can the log output format be customized?

Yes. Since release 0.7.0, you can extend the Layout class to create your own customized log format. Appenders can be parameterized to use the layout of your choice.

Can the outputs of multiple client requests go to different log files?

Many developers are confronted with the problem of distinguishing the log output originating from the same class but different client requests. They come up with ingenious mechanisms to fan out the log output to different files. In most cases, this is not the right approach.

It is simpler to use a nested diagnostic context (NDC). Typically, one would NDC.push() client specific information, such as the client's hostname, ID or any other distinguishing information when starting to handle the client's request. Thereafter, log output will automatically include the nested diagnostic context so that you can distinguish logs from different client requests even if they are output to the same file.

See the NDC and the PatternLayout classes for more information. The NumberCruncher example shows how the NDC can be used to distinguish the log output from multiple clients even if they share the same log file.
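The idea behind a nested diagnostic context can be sketched as a per-thread stack of strings. The code below is a simplified stand-in, not the log4j NDC class: NdcSketch and its line method are hypothetical, and it assumes that each client request is handled on its own thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class NdcSketch {
    // One stack per thread, so concurrent requests do not mix their contexts.
    private static final ThreadLocal<Deque<String>> STACK =
        ThreadLocal.withInitial(ArrayDeque::new);

    // Called when starting to handle a request, e.g. with the client's hostname.
    static void push(String context) {
        STACK.get().addLast(context);
    }

    // Called when leaving the context; returns "" if nothing was pushed.
    static String pop() {
        Deque<String> stack = STACK.get();
        return stack.isEmpty() ? "" : stack.removeLast();
    }

    // All contexts of the current thread, outermost first, separated by spaces.
    static String get() {
        return String.join(" ", STACK.get());
    }

    // How a layout might prepend the context to a message.
    static String line(String message) {
        return get() + " - " + message;
    }
}
```

After push("client=host1") and push("req=7"), line("Handling request") returns "client=host1 req=7 - Handling request". The values pushed here are made-up examples.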

For select applications, such as virtual hosting web-servers, the NDC solution is not sufficient. As of version 0.9.0, log4j supports multiple hierarchy trees. Thus, it is possible to log to different targets from the same category depending on the current context.

Category instances seem to be create only. Why isn't there a method to remove category instances?

It is quite nontrivial to define the semantics of a "removed" category which is still referenced by the user.

Future releases may include a remove method in the Category class.

Is it possible to direct log output to different appenders by priority?

Yes, it is. Set the Threshold option of any appender extending AppenderSkeleton (most log4j appenders extend AppenderSkeleton) to filter out all log events with a lower priority than the value of the threshold option.

For example, setting the threshold of an appender to DEBUG also allows INFO, WARN, ERROR and FATAL messages to log along with DEBUG messages (DEBUG is the lowest priority). This is usually acceptable as there is little use for DEBUG messages without the surrounding INFO, WARN, ERROR and FATAL messages. Similarly, setting the threshold of an appender to ERROR will filter out DEBUG, INFO and WARN messages but not ERROR or FATAL messages.

This policy usually best encapsulates what the user actually wants to do, as opposed to her mind-projected solution.

See sort4.lcf for an example threshold configuration.

If you must filter events by exact priority match, then you can attach a PriorityMatchFilter to any appender.
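The two policies differ only in the comparison they apply. Here is a minimal sketch, assuming the priority order DEBUG < INFO < WARN < ERROR < FATAL given above. ThresholdSketch and its methods are hypothetical and do not reproduce the AppenderSkeleton or PriorityMatchFilter code.

```java
import java.util.EnumSet;
import java.util.Set;

class ThresholdSketch {
    // Declared from lowest to highest, so compareTo follows the priority order.
    enum Priority { DEBUG, INFO, WARN, ERROR, FATAL }

    // Threshold policy: accept the threshold priority and everything above it.
    static boolean passesThreshold(Priority threshold, Priority event) {
        return event.compareTo(threshold) >= 0;
    }

    // Exact-match policy: accept one priority only.
    static boolean matchesExactly(Priority wanted, Priority event) {
        return event == wanted;
    }

    // All priorities an appender with the given threshold would let through.
    static Set<Priority> accepted(Priority threshold) {
        return EnumSet.range(threshold, Priority.FATAL);
    }
}
```

Here accepted(Priority.ERROR) contains ERROR and FATAL only, whereas matchesExactly(Priority.ERROR, Priority.FATAL) is false.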

How do I get multiple processes to log to the same file?

You may have each process log to a SocketAppender. The receiving SocketServer (or SimpleSocketServer) can receive all the events and send them to a single log file.

If I have many processes across multiple hosts (possibly across multiple timezones) logging to the same file using the method above, what happens to timestamps?

The timestamp is created when the logging event is created, that is to say, when the debug, info, warn, error or fatal method is invoked. It is unaffected by the time at which the event may arrive at a remote socket server. Since the timestamps are stored in UTC format inside the event, they all appear in the time zone of the host creating the log file. Since the clocks of various machines may not be synchronized, this may account for time interval inconsistencies between events generated on different hosts.
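A small sketch may help to see why the time zone of the sending host does not matter. It assumes the event carries its creation time as milliseconds since the epoch (a UTC-based value) and that the host writing the log file formats it in its own time zone. TimestampSketch, Event and render are hypothetical names, not log4j classes.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class TimestampSketch {
    // Hypothetical event: the timestamp is fixed when the event is created,
    // not when it arrives at the log server.
    static class Event {
        final long timeStamp; // milliseconds since 1970-01-01 00:00:00 UTC
        final String message;

        Event(long timeStamp, String message) {
            this.timeStamp = timeStamp;
            this.message = message;
        }
    }

    // The host writing the log file renders the instant in its own time zone.
    static String render(Event event, TimeZone serverZone) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        format.setTimeZone(serverZone);
        return format.format(new Date(event.timeStamp)) + " " + event.message;
    }
}
```

An event created at timestamp 0 is rendered as "1970-01-01 00:00:00" by a server in UTC and as "1970-01-01 02:00:00" by a server in GMT+02:00, whatever the time zone of the host that created it.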

While this is the intended behavior, it only recently became so due to a bug discovery between version 1.0.4 and 1.1b1. Versions 1.0.4 and before had their timestamp regenerated in the converter. In this case the timestamps seen in the log file would all appear in order, generated at the time they arrived at the log server host according to its local clock.

Why should I donate my extensions to log4j back to the project?

Unlike the GNU General Public License (GPL), the Apache Software License does not make any claims over your extensions. By extensions, we mean totally new code that invokes existing log4j classes. You are free to do whatever you wish with your proprietary log4j extensions. In particular, you may choose to never release your extensions to the wider public.

We are very careful not to change the log4j client API so that newer log4j releases are backward compatible with previous versions. We are a lot less scrupulous with the internal log4j API. Thus, if your extension is designed to work with log4j version n, then when log4j release version n+1 comes out, you will probably need to adapt your proprietary extensions to the new release. Thus, you will be forced to spend precious resources in order to keep up with log4j changes. This is commonly referred to as the "stupid-tax." By donating the code and making it part of the standard distribution, you save yourself the unnecessary maintenance work.

If your extensions are useful then someone will eventually write an extension providing the same or very similar functionality. Your development effort will be wasted.

Unless the proprietary log4j extension is business critical, there is little reason for not donating your extensions back to the project.

What should I keep in mind when contributing code?

  1. Stick to the existing indentation style even if you hate it.

    Alternating between indentation styles makes it hard to understand the source code. Make it hard on yourself but easier on others. Log4j follows the Code Conventions for the Java(TM) Programming Language.

  2. Make every effort to stick to the JDK 1.1 API.

    One of the important advantages of log4j is its compatibility with JDK 1.1.x.

  3. Thoroughly test your code.

    There is nothing more irritating than finding the bugs in debugging (i.e. logging) code.

  4. Keep it simple, small and fast.

    It's all about the application not about logging.

  5. Identify yourself as the contributor at the top of the relevant file.

  6. Take responsibility for your code.

    Authoring software is like parenting. It takes many years to raise a child.

  7. Did I mention sticking with the indentation style?

How fast do bugs in log4j get fixed?

Rather than wait for the next release to be ready, we get bug fixes out of the door as soon as possible. Moreover, once a bug is found or reported, it is treated as fire in the house. All other activities stop until the bug is fixed.

Consequently, confirmed bugs are fixed after a short period following the initial bug report.

What is the history of log4j?

The first ancestor of log4j was written for the SEMPER project. Jose-Luis Abad-Peiro wrote the initial 30-line version that was picked up by Ceki Gülcü and enhanced by Andreas Fleuti. Michael Steiner, N. Asokan and Ceki Gülcü proposed category/priority based evaluation, which has remained conceptually the same since 1996.

How do I report bugs?

Report bugs using the Apache Bug Database.

Please specify the version of log4j you are using. It is helpful to include log configuration files, if any, plus source code. A short example reproducing the problem is very much appreciated.

Where can I find the latest distribution of log4j?

The log4j project is hosted at http://jakarta.apache.org/log4j/.