We have been running performance soak tests on a large web application. They showed that we had a memory leak and that the application would eventually run out of memory. After configuring WebSphere to generate a memory dump automatically, we analyzed the dump with HeapAnalyzer, which revealed that the culprit was the Log4J NDC class: it was holding onto 400 megabytes of data. We were using the NDC class to associate a user name and a UUID with all logged messages. This information is set up and removed in a Servlet filter.

The JavaDoc for the `remove()` method has the following comment:

Each thread that created a diagnostic context by calling push(java.lang.String) should call this method before exiting. Otherwise, the memory used by the thread cannot be reclaimed by the VM.

It then goes on to say:

As this is such an important problem in heavy duty systems and because it is difficult to always guarantee that the remove method is called before exiting a thread, this method has been augmented to lazily remove references to dead threads. In practise, this means that you can be a little sloppy and occasionally forget to call remove() before exiting a thread. However, you must call remove sometime. If you never call it, then your application is sure to run out of memory.

As I have blogged in the past, Log4J has caused my application to hang. In this case, you can argue that I should have RTFM more carefully. But how can I call the `remove()` method before exiting a thread? Since I am running in a WebSphere Servlet container, I have no control over the thread life cycle. Also, the documentation for `remove()` does not state what happens if the method is called multiple times on a thread that is not exiting.
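For what it is worth, the usual way to cope with container-owned threads is to clean up at the end of every request rather than at the end of the thread. Below is a minimal sketch of that pattern using only the JDK. The class `RequestContextSketch` and its `push`, `clear`, `depth` and `handleRequest` methods are hypothetical stand-ins I made up for illustration; they are not Log4J's API. In a real filter, `push` would correspond to `NDC.push(...)` and the `finally` block to `NDC.remove()`. I have not verified that this would have avoided the leak described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RequestContextSketch {
    // Hypothetical stand-in for NDC: one stack of context strings per thread.
    private static final ThreadLocal<Deque<String>> CONTEXT =
            ThreadLocal.withInitial(ArrayDeque::new);

    static void push(String value) {
        CONTEXT.get().push(value);
    }

    // Drops the whole per-thread stack so a pooled thread carries nothing over.
    static void clear() {
        CONTEXT.remove();
    }

    static int depth() {
        return CONTEXT.get().size();
    }

    // Plays the role of a Servlet filter's doFilter(): set up the context,
    // run the rest of the chain, and always clean up, even on exceptions.
    static void handleRequest(String user, String requestId, Runnable next) {
        push(user + " " + requestId);
        try {
            next.run();
        } finally {
            clear();
        }
    }

    public static void main(String[] args) throws Exception {
        // A single pooled thread, reused for every "request", like a container worker.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        for (int i = 0; i < 3; i++) {
            final int n = i;
            pool.submit(() -> handleRequest("alice", "req-" + n,
                    () -> System.out.println("depth during request: " + depth())));
        }
        pool.submit(() -> System.out.println("depth after requests: " + depth()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The point of the `try`/`finally` is that the cleanup does not depend on the thread ever exiting, which is exactly the event a Servlet container never lets you observe.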

So, in the end we:

  • Removed all use of NDC.
  • Re-read the documentation for MDC to verify there are no gotchas. Being paranoid, I also looked at the code.

This is the second time I have been burned by Log4J. Am I alone in thinking that I should just use the `java.util.logging` package directly? Sure, I would miss tools like Chainsaw, but I would rather have something reliable and simple. Maybe I should use SLF4J, with Log4J and Chainsaw in development and `java.util.logging` in production.