Thursday, March 25. 2010
Java PermGen space, memory leaks, and debug mode
Over the last couple of days of development on a Java web application using the Wicket framework, I noticed a peculiar behavior on the server while in debug mode. After several redeployments of the application war file to the Tomcat server (without restarting the server), eventually a page reload would hang and the server would start spewing java.lang.OutOfMemoryError: PermGen space
stack traces to the console. Clearly this is indicating a memory leak somewhere in my code or in the libraries my code relies on and on top of that, this isn't the sort of memory leak usually associated with poor memory management, since this is the PermGen space not the Heap space which is running out. Since PermGen is a special place in memory for ClassLoader objects to reside, its not something that most memory profilers will pick up since it shouldn't ever be a problem for normal code.
At this point, after searching for information on what would cause PermGen space to run out, and reading a bunch of mixed responses (most programmers that have blogged about this before me seem to take the head in the sand approach and just increase the PermGen size so that won't run out as fast), I found these two wonderful blog posts by Frank Kieviet. He's analysis of the situation is very enlightened and helped me understand what was going on behind the scenes.
So, using Sun's VisualVM tool, I started profiling the PermGen space of my local Tomcat install while continually deploying and redeploying the code base. Sure enough, every time I changed the code and eclipse automatically redeployed the code to the Tomcat instance, the PermGen size would increase the next time I requested a page and Tomcat started allocating objects. No matter how long I waited or clicked the garbage collect button, the PermGen space was never reclaimed and eventually I'd hit the memory ceiling and the server would start spewing those familiar OutOfMemoryError messages again.
Having read Frank's excellent analysis of static references, I took a trip through the code base in order to find offending static object references that might be causing the leak. Unfortunately, no where in the code was I creating static references that looked like they would have resulted in a ClassLoader leak (not to say that there aren't, as was pointed out in Frank's article, this is a tricky problem to track down and none of my references looked suspicious). Well, if its not in my code, where is it?
Since I'm using Wicket, I created a skeletal test project that creates a single webpage and dynamically sets a span tag on the page to a hard coded String object. I fired up VisualVM, and ran my same deploy and redeploy test and sure enough the PermGen space starts getting eaten up like clockwork. To be fair, the PermGen space didn't fill up nearly as fast since my test project is only allocating a few classes and they're associated ClassLoader objects every redeploy, but given enough time it did eventually result in the same OutOfMemoryErrors in the console.
"Aha!", I thought, "Wicket's developers have made a grievous mistake, and aren't watching their code for potential ClassLoader leaks". I figured that since I'm using Wicket, I'll just have to live with the knowledge that eventually I'll run out of PermGen space and have to restart my server. Certainly annoying to me as a developer, but its definitely a viable workaround. The only problem will be when this code hits production, and I'll have to be careful to restart the server after each code deployment or run the risk of having the server be brought to its knees from memory problems.
I can't remember what particular Google search netted me this link, but I did eventually find this gem of information:
The JDK's permanent memory behaves differently depending on whether a debugging is enable, i.e. if there is an active agent.
If there is an active agent, the JDK can fail to collect permanent memory in some cases. (Specifically, when some code introspects a class with a primitive array like, byte[] or char[].) This can cause the permanent space to run out, for example, when redeploying .war files.
I had no idea that the JVM's garbage collector behaved differently depending on whether the application is running in debug mode or not. Since I'm constantly running my code in debug mode and using Eclipse's hot fix code replace feature, and every code change redeploys the entire .war archive to the server, resulting in more class loaders on the class path, and another chunk of memory that will never be reclaimed by the server.
Therefore, the moral of the story is while trying to track down memory leaks, do not run your application (especially a web application) in debug mode. It is still true that you can leak PermGen space without your application being run in debug mode, so this isn't an excuse for poor coding practices. Rather, its a cautionary tale of my part of how even a simple thing like running your application in debug mode can mean rather drastically different program performance even outside of the debug instruction overhead.