We aim to provide 100% uptime and service availability to our clients by maintaining, managing and supporting their infrastructure. While doing this, we come across multitudes of instances where hardware or software doesn’t want to play nice, and we have to step in to prevent downtime.
Earlier this month I ran into an issue where I was unable to remotely access a client's Windows Server 2008 R2 SP1 machine within their environment. When I attempted to log in with domain credentials, I received the error:
“There are currently no logon servers available to service the logon request”
The message immediately led me to suspect a problem with DNS. Often this is easy enough to fix; a quick disconnect from and reconnect to the domain does the trick in most cases. I launched a console to the client's server through vSphere and logged in as the local admin to dig a little deeper and identify the root cause.
I opened a command prompt and ran a simple test to see whether DNS was at fault or something else was going on: I pinged another server in the domain by name and got a full set of replies. That told me name resolution was working, so DNS was fine and something else was causing the error.
I continued the investigation, running through our standard troubleshooting steps to look for other possible causes. While testing connectivity and communication, I noticed far more TCP/IP ports in a ‘TIME_WAIT’ status than would be normal, and checked the Microsoft Support site to see whether this was a known issue or expected behaviour with this error.
It turns out that this is not expected behaviour; it is a known issue with computers running Windows Server 2008 / 2008 R2. After 497 days of uptime (our client's server had reached 512 when the error presented itself), Windows no longer closes TCP/IP ports that are in the TIME_WAIT status. Eventually all the available TCP/IP ports are used up and new TCP/IP sessions, such as those needed to authenticate domain credentials with the Domain Controller, cannot be created. You can check what your TCP/IP ports are currently doing by using the command below:
netstat -ab | more
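If you would rather have a count per state than page through the raw list, the output of `netstat -an` can be tallied. The following is only a rough sketch in Python, assuming Python is installed on the machine and that netstat prints the usual Windows column layout (protocol, local address, foreign address, state). The helper names `count_tcp_states` and `live_tcp_states` are made up for this illustration.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Tally TCP connection states from the text printed by `netstat -an`.

    Assumes the Windows layout: Proto, Local Address, Foreign Address, State.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP rows carry a state; UDP rows and header lines are skipped.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


def live_tcp_states() -> Counter:
    """Run `netstat -an` on the local machine and tally its TCP states."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return count_tcp_states(result.stdout)
```

Calling `live_tcp_states()["TIME_WAIT"]` on the affected server gives a single number to compare against a healthy server. What counts as "too many" depends on the workload, so treat it as a relative indicator rather than a hard threshold.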
Luckily, this doesn’t affect all Windows Server 2008 / 2008 R2 machines, and Microsoft have released a KB article and hotfix to resolve the error. Unluckily, if the error has already occurred on your client's server, both resolutions require downtime. The first is to restart the server, which closes the TCP/IP ports and resolves the issue until the uptime threshold is reached again. The other is to log in to the server as a local administrator, apply the hotfix provided by Microsoft and then restart the server. Run both options past your client's IT team or manager and find out what their plans are for the server. If it’s going to be decommissioned or replaced within the next 496 days, you can suggest just restarting the server without the hotfix.
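To judge how much headroom a server has before it reaches the 497-day mark, the arithmetic can be sketched in Python. One common explanation for the figure, which is an assumption here and not something stated above, is a 32-bit counter of 10-millisecond ticks wrapping around: 2^32 ticks at 10 ms each comes to roughly 497.1 days. The constants and the function names `wrap_interval` and `days_until_threshold` are illustrative only.

```python
from datetime import datetime, timedelta

# Assumed values: a 32-bit counter incremented every 10 ms.
TICK_MS = 10
COUNTER_BITS = 32


def wrap_interval() -> timedelta:
    """Time taken for the assumed tick counter to wrap around."""
    return timedelta(milliseconds=TICK_MS * 2**COUNTER_BITS)


def days_until_threshold(boot_time: datetime, now: datetime) -> float:
    """Days left before uptime reaches the wrap interval (negative if passed)."""
    remaining = boot_time + wrap_interval() - now
    return remaining.total_seconds() / 86400
```

Given a boot time (for example, the "System Boot Time" line reported by `systeminfo`), a small or negative result means the restart or hotfix should be scheduled soon, while a large one leaves time to plan the downtime with the client.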
If you have any issues that are causing you a headache or don’t seem to make sense, feel free to send us an email. One of our Service Desk team will be more than happy to dig deeper and work with you to find a resolution.