Hello,
Last week we began seeing some timeout errors, but today there have been many more of them and I’m not sure how to proceed with troubleshooting.
We have two load-balanced app servers:
Windows Server 2008 R2 (virtualized)
ColdFusion 8, Enterprise
And a database server:
Windows Server 2008 R2
SQL Server 2012
The error in “coldfusion-out.log” is:
A non-SQL error occurred while requesting a connection from <DSN>.
Timed out trying to establish connection
This only happens on one of the web servers, and it happens more frequently when that app server is under heavier load. Most of the day it works fine, but 4 or 5 times throughout the day we will have 1 to 15 requests time out in a row (the count always varies), and then things will be fine again without a restart or any other intervention.
The database server is much more powerful than the app server, and I don't believe the issue involves long-running queries or the DB server being under heavy load.
I’m a DBA who is temporarily handling system admin responsibilities, and I’m not sure how to troubleshoot this further. I just started some pings running from the app server to the database server, using both its IP and the listener's name. How else can I prepare for the next time it happens? What other logs should I be looking at to gain some insight? Are there PerfMon counters I could use to help troubleshoot?
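In case it helps frame the question, here is a minimal sketch of the kind of probe I could leave running alongside the pings. Ping only exercises ICMP, so this times a full TCP connect to the SQL Server port instead and logs each attempt with a timestamp that can be matched against coldfusion-out.log. It assumes Python is available on the app server and that SQL Server listens on the default port 1433; the function names, the log file name, and the 5-second interval are all made up for illustration.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection; return (ok, elapsed_ms, error_text)."""
    start = time.perf_counter()
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            ok, error = True, ""
    except OSError as exc:  # covers timeouts, refusals, and DNS failures
        ok, error = False, str(exc)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ok, elapsed_ms, error


def format_result(target, port, ok, elapsed_ms, error):
    """Build one log line: timestamp, target, status, latency, error text."""
    stamp = datetime.now().isoformat(timespec="seconds")
    status = "OK" if ok else "FAIL"
    return f"{stamp} {target}:{port} {status} {elapsed_ms:.0f}ms {error}".rstrip()


def run(targets, port=1433, interval=5.0, log_path="db_probe.log", rounds=None):
    """Probe every target once per interval; rounds=None means run until stopped."""
    done = 0
    while rounds is None or done < rounds:
        with open(log_path, "a") as log:
            for target in targets:
                ok, ms, err = probe(target, port)
                log.write(format_result(target, port, ok, ms, err) + "\n")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling `run()` with two targets, the database server's IP and the listener's name, would show whether failures hit both or only the name, which might separate a name-resolution problem from a network or port problem. It would not say anything about the ColdFusion connection pool itself.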
Thanks for your help!