Thursday, March 25. 2010
Java PermGen space, memory leaks, and debug mode
Over the last couple of days of development on a Java web application using the Wicket framework, I noticed peculiar behavior on the server while in debug mode. After several redeployments of the application war file to the Tomcat server (without restarting the server), a page reload would eventually hang and the server would start spewing java.lang.OutOfMemoryError: PermGen space stack traces to the console. Clearly this indicates a memory leak somewhere in my code or in the libraries my code relies on. On top of that, it isn't the sort of memory leak usually associated with poor memory management, since it is the PermGen space, not the heap, that is running out. Since PermGen is a special region of memory where the JVM keeps the classes loaded by each ClassLoader, it's not something that most memory profilers will pick up, since it shouldn't ever be a problem for normal code.
At this point, after searching for information on what would cause PermGen space to run out and reading a bunch of mixed responses (most programmers who have blogged about this before me seem to take the head-in-the-sand approach and just increase the PermGen size so that it won't run out as fast), I found these two wonderful blog posts by Frank Kieviet. His analysis of the situation is very enlightening and helped me understand what was going on behind the scenes.
So, using Sun's VisualVM tool, I started profiling the PermGen space of my local Tomcat install while continually deploying and redeploying the code base. Sure enough, every time I changed the code and Eclipse automatically redeployed it to the Tomcat instance, the PermGen size would increase the next time I requested a page and Tomcat started allocating objects. No matter how long I waited or how often I clicked the garbage collect button, the PermGen space was never reclaimed, and eventually I'd hit the memory ceiling and the server would start spewing those familiar OutOfMemoryError messages again.
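The same measurement can also be logged from code instead of read off VisualVM's graph. The following is only a minimal sketch and was not part of the original investigation: it uses the standard java.lang.management API, while the class name ClassMetadataPools and the pool-name matching ("Perm Gen" on older HotSpot JVMs, "Metaspace" on Java 8 and later) are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the original post): reports the JVM memory
// pools that hold class metadata.
class ClassMetadataPools {

    // Assumption: HotSpot names these pools "... Perm Gen" before Java 8
    // and "Metaspace" from Java 8 on; other JVMs may use other names.
    static boolean isClassMetadataPool(String poolName) {
        String name = poolName.toLowerCase();
        return name.contains("perm gen") || name.contains("metaspace");
    }

    // One line per matching pool: name, bytes used, and max (-1 if undefined).
    static List<String> report() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null || !isClassMetadataPool(pool.getName())) {
                continue;
            }
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " bytes, max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

Run standalone, this only describes its own JVM. To watch the server, report() would have to be called from code running inside Tomcat (a servlet, for instance) after each redeploy; a "used" figure that only ever climbs is the same pattern the VisualVM graph showed.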
Having read Frank's excellent analysis of static references, I took a trip through the code base in order to find offending static object references that might be causing the leak. Unfortunately, nowhere in the code was I creating static references that looked like they would have resulted in a ClassLoader leak (which is not to say that there aren't any; as Frank's article points out, this is a tricky problem to track down, but none of my references looked suspicious). Well, if it's not in my code, where is it?
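To make "a static reference that leaks a ClassLoader" concrete, here is a small sketch of the reference chain involved. It is an illustration under stated assumptions, not code from my application or from Frank's posts: SERVER_WIDE_REGISTRY stands in for any static field owned by a class that outlives the deployment, the URLClassLoader stands in for the loader Tomcat creates per webapp, and java.lang.reflect.Proxy is used only as a convenient way to get a class defined in that loader.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are made up for illustration.
class StaticReferenceLeak {

    // Stands in for a static field of a class loaded by the server itself
    // (or by the JDK), which outlives any single deployment.
    static final List<Object> SERVER_WIDE_REGISTRY = new ArrayList<Object>();

    // Simulates one deployment: create a fresh loader, define a class in
    // it, and register an instance of that class with the static registry.
    static ClassLoader deployAndRegister() {
        ClassLoader webappLoader = new URLClassLoader(
                new URL[0], StaticReferenceLeak.class.getClassLoader());
        InvocationHandler doNothing = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                return null;
            }
        };
        // Proxy defines the generated class in webappLoader, so this object
        // plays the role of "an instance of a webapp class".
        Object listener = Proxy.newProxyInstance(
                webappLoader, new Class<?>[] { Runnable.class }, doNothing);
        SERVER_WIDE_REGISTRY.add(listener);
        return webappLoader;
    }

    public static void main(String[] args) {
        ClassLoader loader = deployAndRegister();
        Object leaked = SERVER_WIDE_REGISTRY.get(0);
        // instance -> its Class -> the defining ClassLoader: the registry
        // entry alone keeps the whole "webapp" loader strongly reachable.
        System.out.println("registry entry pins loader: "
                + (leaked.getClass().getClassLoader() == loader));
    }
}
```

As long as the registry entry is never removed, the loader and every class it defined stay reachable, so each simulated redeploy would add one more loader that cannot be collected. This is the kind of reference I was looking for in my own code.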
Since I'm using Wicket, I created a skeletal test project that creates a single webpage and dynamically sets a span tag on the page to a hard-coded String object. I fired up VisualVM and ran my same deploy and redeploy test, and sure enough the PermGen space started getting eaten up like clockwork. To be fair, the PermGen space didn't fill up nearly as fast, since my test project only allocates a few classes and their associated ClassLoader objects on every redeploy, but given enough time it did eventually result in the same OutOfMemoryErrors in the console.
"Aha!", I thought, "Wicket's developers have made a grievous mistake, and aren't watching their code for potential ClassLoader leaks". I figured that since I'm using Wicket, I'll just have to live with the knowledge that eventually I'll run out of PermGen space and have to restart my server. Certainly annoying to me as a developer, but it's definitely a viable workaround. The only problem will be when this code hits production, where I'll have to be careful to restart the server after each code deployment or run the risk of having the server brought to its knees by memory problems.
I can't remember what particular Google search netted me this link, but I did eventually find this gem of information:
The JDK's permanent memory behaves differently depending on whether a debugging is enable, i.e. if there is an active agent.
If there is an active agent, the JDK can fail to collect permanent memory in some cases. (Specifically, when some code introspects a class with a primitive array like, byte[] or char[].) This can cause the permanent space to run out, for example, when redeploying .war files.
I had no idea that the JVM's garbage collector behaved differently depending on whether the application is running in debug mode. I'm constantly running my code in debug mode and using Eclipse's hot code replace feature, and every code change redeploys the entire .war archive to the server. Each redeploy creates another class loader, and with it another chunk of memory that will never be reclaimed by the server.
Therefore, the moral of the story is: while trying to track down memory leaks, do not run your application (especially a web application) in debug mode. It is still true that you can leak PermGen space without running your application in debug mode, so this isn't an excuse for poor coding practices. Rather, it's a cautionary tale on my part of how even a simple thing like running your application in debug mode can lead to drastically different program behavior, even beyond the overhead of the debug instrumentation.
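One way to avoid repeating my mistake is to have the leak-hunting code check how its JVM was launched before trusting any PermGen numbers. The sketch below is an assumption-laden illustration, not a complete detector: the class name DebugAgentCheck is made up, and it only recognizes the usual JDWP launch flags (-Xdebug, -Xrunjdwp and -agentlib:jdwp). Agents loaded in other ways, such as -agentpath or a profiler attached at runtime, would not be noticed.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (not from the original post).
class DebugAgentCheck {

    // Returns true if any JVM argument looks like a JDWP debugger flag.
    static boolean hasDebugAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.equals("-Xdebug")
                    || arg.startsWith("-Xrunjdwp")
                    || arg.startsWith("-agentlib:jdwp")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The arguments this JVM was started with (not the program's args).
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (hasDebugAgent(jvmArgs)) {
            System.out.println("Debug agent flags found: PermGen readings may be misleading.");
        } else {
            System.out.println("No JDWP flags found in the JVM arguments.");
        }
    }
}
```

As with any check of this kind, it has to run inside the server's JVM to say anything about the server.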
Saturday, March 20. 2010
WshShell.Exec Considered Harmful Due To Blocking
For the unfamiliar, the Exec method of WshShell (used in Windows scripting) runs an application and provides access to that application's standard streams as TextStream objects. The problem with this method is that the blocking behavior of these streams is not defined (as noted in the comments on the StdOut property) and, more importantly, the streams are impossible to use safely. The problem, familiar to anyone who has dealt with reading from and writing to child programs, is that it is very easy to block while attempting to read from (or write to) the child process. What makes this problem worse is that TextStream provides no method for dealing with blocking; there is no way to set non-blocking mode, to check whether input is ready to be read, or to check whether it would be safe to write (at least, none that I am aware of).
As a demonstration of the problem, consider the following applications. First, the child program (written in C):
#include <stdio.h>

int main(void)
{
    for (int i=0; i<10; ++i) {
        for (int j=0; j<8192; ++j)
            fputc('x', stdout);
        for (int j=0; j<8192; ++j)
            fputc('x', stderr);
    }
    return 0;
}
Next, the script that runs it (written in JScript):
// Note: cmd is assumed to hold the path to the compiled child program above
var WShell = new ActiveXObject("WScript.Shell");
var wsexec = WShell.Exec(cmd);
var output = "";
var error = "";
// Keep looping until the program exits
while (wsexec.Status == 0) {
    while (!wsexec.StdOut.AtEndOfStream) {
        output += wsexec.StdOut.Read(1);
    }
    while (!wsexec.StdErr.AtEndOfStream) {
        error += wsexec.StdErr.Read(1);
    }
    WScript.Sleep(100);
}
WScript.Echo("Output: " + output);
WScript.Echo("Error Output: " + error);
Run against the child program above, this script can deadlock. The script sits in the StdOut loop (AtEndOfStream blocks until there is data or the stream is closed) while the child is stuck writing to stderr, whose pipe buffer has filled because nobody is reading it. Each side is waiting on the other, and neither can make progress.
The solution that I came up with (which has its own drawbacks in terms of performance) is to write the program output to files, then read it back in the script once the program has finished. This solves the deadlock problem at the expense of performance, since the output has to be written to disk as an intermediate step (although the OS may not flush it to the physical disk). The code for this solution is presented below:
/** Run a command in a separate process and retrieve its output.
 *
 * This is a safer, slower alternative to WshShell.Exec that supports
 * retrieving the output (to stdout and stderr) only after the command
 * has completed execution. It does not support writing to the standard
 * input of the command. Its only redeeming quality is that it will
 * not cause deadlocks due to the blocking behavior of attempting to read
 * from StdOut/StdErr.
 *
 * @param cmd The name/path of the command to run
 * @return An object with an exitcode property set to the exit code of the
 * command, an output property set to the string of text written by the
 * command to stdout, and an errors property with the string of text written
 * by the command to stderr.
 */
// The objects used by run(); FSO was not defined in the original listing
var FSO = new ActiveXObject("Scripting.FileSystemObject");
var WShell = new ActiveXObject("WScript.Shell");

function run(cmd) {
    var tmpdir = FSO.GetSpecialFolder(2 /* TemporaryFolder */);
    if (!/(\\|\/)$/.test(tmpdir))
        tmpdir += "\\";
    var outfile = tmpdir + FSO.GetTempName();
    var errfile = tmpdir + FSO.GetTempName();
    // Note: See KB278411 for this recipe
    // Note2: See cmd.exe /? for interesting quoting behavior...
    var runcmd = '%comspec% /c "' + cmd + ' > "' + outfile + '" 2> "' + errfile + '""';
    var wshexec = WShell.Exec(runcmd);
    // Write stuff to the standard input of the command (through cmd.exe)
    // Note: This will block until the program exits if significant amounts
    // of information are written and not read. But no deadlock will occur.
    // Note2: This will error if the program has exited
    try {
        wshexec.StdIn.Write("stuff\n");
    } catch (ex) {
        WScript.Echo("Unable to write to program.");
    }
    // Do stuff, or write more stuff while cmd executes, or wait...
    while (wshexec.Status == 0)
        WScript.Sleep(100);
    var exitcode = wshexec.ExitCode;
    // Read back stdout; a missing or unreadable file yields an empty string
    var output = "";
    try {
        var outfs = FSO.OpenTextFile(outfile, 1 /* ForReading */);
        output = outfs.ReadAll();
        outfs.Close();
        FSO.DeleteFile(outfile);
    } catch (ex) { }
    // Read back stderr the same way
    var errors = "";
    try {
        var errfs = FSO.OpenTextFile(errfile, 1 /* ForReading */);
        errors = errfs.ReadAll();
        errfs.Close();
        FSO.DeleteFile(errfile);
    } catch (ex) { }
    return { exitcode: exitcode, output: output, errors: errors };
}

var result = run("dir");
WScript.Echo("Exit Code: " + result.exitcode);
WScript.Echo("Output:\n" + result.output);
WScript.Echo("Error Output:\n" + result.errors);
Remember: don't ever WshShell.Exec a command directly if you are not sure of its inputs and outputs and a deadlocked script would be a problem.