Forums on Intune, SCCM, and Windows 11


SOLVED CAS Site Recovery Issue

Vavamoose

Hi Guys!

I am administering a CAS recovery after an HA SQL cluster failure. The cluster was dismantled, rebuilt and synced by a DB admin, but unsurprisingly this broke SCCM. The CAS would not connect, which pointed to a site recovery as the way to reconnect the CAS to the new HA cluster listener.
Towards the end, I got the following error in the wizard:

"Setup has encountered fatal errors while installing SMS services. Click the View Log button for more info"

The 'ConfigMgrSetup.log' has the below errors:

"
*** *** Unknown SQL Error!
CSiteControlSetup::Create_BackupSQLCert : Failed to get SSB certificate thumbprint.
CSiteControlSetup::SetupCertificateForSSB : Failed to create/backup SQL SSB certificate.
ERROR: Failed to set up SQL Server certificate for service broker on replica node "Contoso-SQL02" .
"
"Contoso-SQL02" is the secondary node of the HA SQL cluster.

SPN configs have not changed on the domain.
Permissions are unchanged from before this issue and are correctly configured.
Firewalls are configured correctly, with the appropriate ports open.

All activities on the SCCM participating servers are carried out with a single service account.

I would appreciate any recommendations. I suspect the DB recovery actions on this SQL02 node were not done with the service account but with a personal Domain Admin account, which broke the permissions on the SQL certs.

I am looking for a way out of this. A little digging around the net points to a DB and Configuration Manager rebuild, but we are not considering that option yet.

Thanks in advance for all feedback.
 

Solution
I was able to make some headway on this today. Below are my workstream notes. I hope this helps anyone who finds themselves where I am today. I am very poor with SQL, so my wording may not make sense to seasoned DB admins, but be assured the query commands are correct:

  • CAS treats restored Cluster nodes as new DB Servers
  • This causes failure to start console on CAS due to DB connectivity issues (cert based) and causes sites to fail replication downstream.
  • Steps to restore sites
From SSMS, connect to both DB nodes of the cluster separately and run the following query:
select name, collation_name, user_access_desc, is_read_only, state_desc, is_trustworthy_on, is_broker_enabled, is_honor_broker_priority_on from sys.databases

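This lists the databases on the node and their SQL Server Service Broker (SSB) status.

  • Confirm that, for the SCCM_CAS database on each node, is_broker_enabled, is_trustworthy_on and is_honor_broker_priority_on all have a value of 1
  • If any is at 0, use the queries below to enable it (replace CM_PS1 with the name of your site database)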
ALTER DATABASE CM_PS1 SET ENABLE_BROKER
ALTER DATABASE CM_PS1 SET TRUSTWORTHY ON
ALTER DATABASE CM_PS1 SET HONOR_BROKER_PRIORITY ON
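If you are checking several nodes, the "any flag at 0" check above can be sketched in Python. This is only an illustration and not part of what I actually ran: the function name and the sample row are made up, and it assumes you copy the flag values by hand from the sys.databases query output. It only prints the ALTER DATABASE statements, it does not connect to SQL Server.

```python
# Flag columns from sys.databases, mapped to the ALTER DATABASE option
# that switches each one on (same three options as in the notes above).
FLAG_TO_OPTION = {
    "is_broker_enabled": "ENABLE_BROKER",
    "is_trustworthy_on": "TRUSTWORTHY ON",
    "is_honor_broker_priority_on": "HONOR_BROKER_PRIORITY ON",
}


def build_fix_statements(row):
    """Return ALTER DATABASE statements for every flag that is 0 in `row`.

    `row` is a dict holding the database name and the three flag values,
    typed in by hand from the query output (hypothetical input format).
    """
    # Escape closing brackets so the name is safe inside [ ] delimiters.
    name = row["name"].replace("]", "]]")
    return [
        f"ALTER DATABASE [{name}] SET {option}"
        for flag, option in FLAG_TO_OPTION.items()
        if not row[flag]
    ]


if __name__ == "__main__":
    # Hypothetical values for one node; replace with what your query returns.
    sample = {
        "name": "CM_PS1",
        "is_broker_enabled": 0,
        "is_trustworthy_on": 1,
        "is_honor_broker_priority_on": 0,
    }
    for statement in build_fix_statements(sample):
        print(statement)
```

Review the printed statements before running them in SSMS against the node that reported the 0 values.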


  • From E:\Microsoft Configuration Manager\cd.latest\SMSSETUP\BIN\X64, run setup.exe as administrator and select site reset
  • Verify replication status in the SCCM console once you can connect
  • Be PATIENT while components re-initialize at the CAS. This took almost 2 hours
  • Monitor logs - compmon.log, replmgr.log, rcmctrl.log
  • Once compmon.log shows that components have initialized, verify replication and link health
    • Run the following on the SCCM_SiteDatabase and resolve replication issues as normal
EXEC spDiagDRS

From <https://learn.microsoft.com/en-us/t...ubleshoot-database-replication-service-issues>

  • Run link analysis for each site from CAS and also from site to CAS.
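For the log monitoring step above, here is a rough Python sketch for pulling suspicious lines out of the three logs while you wait. It is an illustration only: the Logs folder location and the "error"/"failed" keywords are my assumptions, and it does not understand the ConfigMgr log format, it is just a plain text search. CMTrace is still the better tool for reading the logs properly.

```python
import re
from pathlib import Path

# Assumed keywords; adjust to the messages you actually see in your logs.
PATTERN = re.compile(r"error|failed", re.IGNORECASE)


def find_problem_lines(log_path, pattern=PATTERN):
    """Return (line_number, text) pairs for lines in log_path that match pattern."""
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Assumed log folder under the install path; change it to match your site server.
    log_dir = Path(r"E:\Microsoft Configuration Manager\Logs")
    for name in ("compmon.log", "replmgr.log", "rcmctrl.log"):
        path = log_dir / name
        if not path.exists():
            print(f"{name}: not found in {log_dir}")
            continue
        for number, text in find_problem_lines(path):
            print(f"{name}:{number}: {text}")
```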
 