SQL Server Failover Cluster Instance Won’t Fail Over

 

At a previous client, I had an issue where the SQL Server instance in a 2-node failover cluster wouldn’t fail over to Node B, which meant it was not actually highly available.

NB: this relates to SQL Server Failover Cluster Instances, not SQL Server Always On Availability Groups.

During the patching maintenance windows, a particular application would go offline.
So I performed a controlled test and manually failed over the instance and services to Node B. When I checked whether the instance had failed over, it was sitting back on Node A. Hmmm… weird. I tried again and watched it more closely. It failed over, errored while trying to start the SQL Server instance, and then failed back to Node A.

Went and checked the error logs on Node B and found the following:

SQL Server Failover Cluster #1

 

Checking the Registry, I could see that the executable sqlservr.exe is supposed to be located under
C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn\.
Checking that location on the server, I could see that neither the exe nor the folder structure existed. This meant that SQL Server wouldn’t be able to start on this node, as it was unable to find its binaries and files.

SQL Server Failover Cluster #2
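For anyone wanting to run the same check on each node before a patching window, here is a rough Python sketch of it. It is not what I ran at the time. It assumes the service is the default instance (service name MSSQLSERVER) and that the path to check is the service's ImagePath value under HKLM\SYSTEM\CurrentControlSet\Services. The function names and the sample ImagePath string are made up for illustration.

```python
import os
import sys


def executable_from_image_path(image_path: str) -> str:
    """Pull the executable path out of a service ImagePath value.

    ImagePath usually looks like: "C:\\...\\sqlservr.exe" -sMSSQLSERVER
    """
    image_path = image_path.strip()
    if image_path.startswith('"'):
        # Quoted path: take everything up to the closing quote.
        end = image_path.find('"', 1)
        return image_path[1:end] if end != -1 else image_path[1:]
    # Unquoted path: cut just after the first ".exe" (case-insensitive).
    idx = image_path.lower().find(".exe")
    return image_path[: idx + 4] if idx != -1 else image_path


def binaries_present(image_path: str, exists=os.path.exists):
    """Return (executable_path, True/False) for the given ImagePath."""
    exe = executable_from_image_path(image_path)
    return exe, exists(exe)


def read_service_image_path(service_name: str = "MSSQLSERVER") -> str:
    """Read ImagePath for a service from the local registry (Windows only).

    Assumption: default instance, so the service name is MSSQLSERVER.
    """
    import winreg  # only available on Windows

    key_path = rf"SYSTEM\CurrentControlSet\Services\{service_name}"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "ImagePath")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    exe, found = binaries_present(read_service_image_path())
    print(f"{exe}: {'found' if found else 'MISSING'}")
```

Run locally on each node, a MISSING result on the passive node would have flagged this problem before the failover was ever attempted.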

Solution

A re-install of the SQL Server binaries on the problematic node fixed the issue. I was then able to fail over the SQL Server service, so the instance met its intended high-availability design.

I was never able to find a reason why the SQL Server binaries just up and left, but touch wood, it hasn’t happened since.