We've got ourselves a sick server/application and I'd like to gather a little community advice if I may. I believe the evidence supports a memory leak in my application somewhere and would love to hear a second opinion and/or suggestions.
The issue is that used memory (as seen by FusionReactor) climbs to 90% or more, at which point the service starts to queue requests and eventually stops processing them altogether. A service restart brings everything back up again, and it can then run for 2 days or 2 hours before the issue repeats itself. Due to the inconsistent uptime, I can't tell whether the cause is some troublesome bit of code that runs only occasionally or something that's a core part of the application. My current plan is to review the heap graph on the "sick" server, look for sudden jumps in memory usage, and then review the IIS logs for requests at those times to try to establish a pattern. If anyone has better suggestions, though, I'm all ears! The following are some facts about the situation that may be useful.
The "sick" server:
- CF 9.0.1.274733 Standard
- FusionReactor 4.0.9
- Win2k8 Web R2 (IIS7.5)
- Dual Xeon 2.8GHz CPUs
- 4GB RAM
JVM Config (same on "sick" and "good" servers):
- Initial and max heap: 1536 MB
- JVM arguments: -server -Xss10m -Dsun.io.useCanonCaches=false -XX:PermSize=192m -XX:MaxPermSize=256m -XX:+UseParNewGC -Xincgc -Xbatch -Dcoldfusion.rootDir={application.home}/../ -Dcoldfusion.libPath={application.home}/../lib -Dcoldfusion.dotnet.disableautoconversion=true
What I believe a "healthy" server graph should look like (from "good" server):
And the "sick" server graph looks like this:
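
In case it helps anyone picture the log-review plan mentioned earlier, here's a rough sketch of how the matching could be scripted rather than done by eye. It assumes the IIS logs are in the W3C extended format with a `#Fields:` header, and that the spike times have been read off the heap graph by hand. The sample log lines, URLs, spike time, and function names are all made up for illustration. One thing to watch: IIS writes W3C log times in UTC by default, so spike times taken from a graph shown in server local time would need converting first.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical sample data: real logs would be read from the IIS log directory.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 7.5
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2011-03-01 14:00:05 GET /index.cfm 200 120
2011-03-01 14:31:50 GET /reports/export.cfm 200 45210
2011-03-01 14:32:10 GET /index.cfm 200 98
2011-03-01 15:10:00 GET /index.cfm 200 110
""".splitlines()


def parse_iis_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]
        elif line and not line.startswith("#") and fields:
            row = dict(zip(fields, line.split()))
            if "date" not in row or "time" not in row:
                continue  # cannot place this request on the timeline
            row["timestamp"] = datetime.strptime(
                row["date"] + " " + row["time"], "%Y-%m-%d %H:%M:%S"
            )
            yield row


def requests_near_spikes(rows, spikes, window=timedelta(minutes=2)):
    """Map each spike time to the requests logged within +/- window of it."""
    rows = list(rows)
    return {
        spike: [r for r in rows if abs(r["timestamp"] - spike) <= window]
        for spike in spikes
    }


def count_urls(near):
    """Count how often each URL shows up across all spike windows."""
    counts = Counter()
    for rows in near.values():
        counts.update(r["cs-uri-stem"] for r in rows)
    return counts


if __name__ == "__main__":
    # Spike times would come from the FusionReactor heap graph (already in UTC here).
    spikes = [datetime(2011, 3, 1, 14, 32, 0)]
    near = requests_near_spikes(parse_iis_log(SAMPLE_LOG), spikes)
    for url, hits in count_urls(near).most_common():
        print(hits, url)
```

The idea would be that a URL turning up near most or all of the spikes is a candidate for a closer look, while one that appears only because it is hit constantly (like a home page) would need to be discounted against its normal request rate.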