Troubleshoot SQL Server Error 983 AlwaysOn Availability Groups Failover
SQL Server primary purpose is to archive, manipulate and retrieve data as per requested by other several software applications. These applications may run either on identical computers or on different systems connected via the Internet (or a network). To handle such large amount of data, which can also be lost or damaged due to some issue, there is a very common feature in SQL server i.e. AlwaysOn Availability groups. This feature aims to provide high-availability and disaster recovery solution, generally, providing an alternative to business-level users to backup their database. In addition, there may occur multiple errors in SQL server like SQL Server error 983, SQL Server error 976, SQL Server 156, etc.
Therefore, in this article. we are going to discuss SQL error 983 in different scenarios, which will help you to get rid of this error. Database corruption in SQL server can also be a reason for such errors as well.
Error: 983, Severity: 14, State: 1. Unable to access database ‘HADB’ because its replica role is RESOLVING which does not allow connections. Try the operation again later.
Table of Content
SQL Error 983 – User Query
Here, we have an actual user query that states the issues of a user that originates with this 983 SQL error. Understanding the pressure points of users is quite crucial for them to get the perfect solution without facing any difficulties.
Here, we need to troubleshoot the AlwaysOn Availability Groups failover issue. Therefore, let’s have a look at the most common issues that could have caused this problem. Then we can proceed with the solution while maintaining utmost accuracy.
Unable to Access Database Because Its Replica Role is Resolving
There can be multiple reasons for users to face this error. However, in case this issue occurs because of the replica role is resolving, users will get a message like this:
Error 983, Severity 14: Unable to access database ‘%.*ls’ because its replica role is RESOLVING which does not allow connections.
Users are most likely to get this error because of network communication issues among the primary & secondary. To solve the error caused by this particular reason, there are some checks that users need to undergo:
- Windows Firewall enabled
- Restricted endpoint access to the SQL Server service account.
- Having cuts in the cables is another major issue that we observe.
- Secondary storage corruption troubles users equally as other issues.
- Enabling Transparent Data Encryption on the primary but not on the secondary.
There can be other minor issues present as well, that users need to check before they run any operation.
Scenario 1: Examine Value of FailureConditionLevel
For setting the conditions related to AlwaysOn feature, we require an additional property named as FailureConditionLevel. This property performs controlling when failover occurs either from failover cluster or from AlwaysOn availability group. In SQL Server, by default, the value of FailureConditionLevel is 3 and one can also modify this value using T-SQL script.
To resolve the Microsoft SQL Server error 983, first examine whether the condition of SQL server service is down or not. If it is down, then increase up the service; else check the conditions described in below snapshot.
NOTE: Perform the Cluster action only if any sub-system encounters any kind of error, do not perform any action when a warning occurs.
Scenario 2: Cluster Diagnostic
The Cluster Diagnostic logs (or Cluster log) are situated in the log directory of MS SQL server and are different from log files. These files are in the format:
When you double-click on the cluster log file, you will find several events present in it. Multiple attributes of the events will display in front of you.
These attributes define the health status of SQL server i.e. whether sp_server_diagnostics displays an error or not. The error occurs when the resource is performing its functioning and suddenly an interruption named Failure, occurs. Such failure occurs due to FailureConditionLevel set up & display that the resources are unhealthy i.e. NOTHEALTH
Also Read: How to Fix Table Corruption in SQL Server Database
Scenario 3: AlwaysOn Extended Event Log
SQL Server comprises of multiple extended events log files, which are related to AlwaysOn feature. To view such files, make use of the following query & get rid of the SQL Server error 983 issue:
SELECT * FROM sys.dm_xe_objects WHERE name LIKE '%hadr%'
One such extended event log file is AlwaysOn Health Extended Event log, which deals with the AlwaysOn availability group related to diagnostics like State changes for Group, errors report, expiration days, etc. These files are of the format:
One can double-click on the health file and view the events related to such file. Sometimes, we find that availability groups are to be expired, and warned to go for failover that will migrate the states from PRIMARY_NORMAL to RESOLVING_NORMAL.
Scenario 4: Dealing with Cluster Log Files
In this scenario, we are going to deal with restart operation of nodes. Within cluster log files, one can see the ‘restart action’ command, which indicates that a restart is tried on a present node before performing failover to the previous node. Therefore, if restart action is successful it denotes that failing over was unsuccessful to previous node.
In addition, we have a property box that confirms the Restart action of node.
Scenario 5: ErrorLog of SQL Server
Microsoft SQL Server Error 983 occurs due to multiple reasons. Commonly, it is the outcome of SAP performing re-connection during its working process. In addition, we have several events (in sequence) initially documented in stack of SQL Server ErrorLog.
There are multiple reasons behind occurrence of Microsoft SQL Server error 983, but when it comes to resolving it different scenario helps us to troubleshoot this error. A User should learn about these scenarios and then perform activities accordingly.