Date: 2023-02-16
With the January 2023 CU for SharePoint Server 2016, 2019 and Subscription Edition we released a security fix which increased the transport security for communications between the SharePoint applications and the distributed cache cluster.
The change was implemented in two parts:
- a change in the configuration wizard to update the Distributed Cache Cluster Security Settings
- a change in the client logic (the code that connects the SharePoint worker processes to the Distributed Cache Server) to identify if the security of the server has been updated
For some customers these changes were not applied correctly which caused connections to the Distributed Cache server to fail with the following ULS log messages:
w3wp.exe (0xXXXX) 0xXXXX SharePoint Foundation DistributedCache 837o4 Unexpected Unexpected error while executing ExportCacheClusterConfig with parameters provider: 'SPDistributedCacheClusterProvider' , connectionString: '...
and
w3wp.exe (0xXXXX) 0xXXXX SharePoint Foundation DistributedCache 4y50r Unexpected Unexpected Exception in getting cache cluster security config...
and
w3wp.exe (0x285C) 0x2DA4 SharePoint Foundation DistributedCache ah24w Unexpected Unexpected Exception in SPDistributedCachePointerWrapper::InitializeDataCacheFactory for usage 'DistributedLogonTokenCache' - Exception 'Microsoft.SharePoint.Internal.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later. ...'
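To check whether a server is affected, the ULS logs can be searched for the event IDs of the three messages above (837o4, 4y50r and ah24w). The following is only a minimal sketch: Find-DistributedCacheErrors is a hypothetical helper name, and the log folder has to be passed in because its location depends on the diagnostic logging settings of your farm.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): search ULS log files for the
# event IDs of the three messages quoted above.
function Find-DistributedCacheErrors {
    param(
        # Folder containing the ULS *.log files of this server (assumption: you know
        # where your farm writes its diagnostic logs).
        [Parameter(Mandatory = $true)]
        [string]$LogDirectory
    )

    # Event IDs taken from the log messages in this post.
    $eventIds = '837o4', '4y50r', 'ah24w'

    # -SimpleMatch treats each ID as a literal string instead of a regular expression.
    Get-ChildItem -Path $LogDirectory -Filter '*.log' -File |
        Select-String -Pattern $eventIds -SimpleMatch |
        Select-Object -Property Filename, LineNumber, Line
}

# Usage (the path is a placeholder):
# Find-DistributedCacheErrors -LogDirectory 'D:\Logs\ULS'
```

An empty result only means that none of the three event IDs appear in the files that were searched.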
To resolve these problems please follow the steps below.
Important: The PowerShell commands used below are slightly different for SharePoint Server Subscription Edition compared to SharePoint Server 2016/2019 as AppFabric got integrated into SharePoint Server Subscription Edition and the PowerShell CmdLets were updated with this integration.
Verifying the Distributed Cache configuration
To verify if the Distributed Cache configuration is correct you can use the following commands in a SharePoint Management Shell:
SharePoint Server 2016 and 2019
Use-CacheCluster
Export-CacheClusterConfig -Path c:\temp\clusterconfig.xml
SharePoint Server Subscription Edition
Export-SPCacheClusterConfig -Path c:\temp\clusterconfig.xml
Open the exported clusterconfig.xml file, scroll to the end of the file and check the advancedProperties section.
The correct configuration after applying January 2023 CU or later looks like this:
...
<advancedProperties>
  <partitionStoreConnectionSettings leadHostManagement="false" />
  <securityProperties>
    <authorization>
      <allow users="WSS_ADMIN_WPG" />
      <allow users="WSS_WPG" />
    </authorization>
  </securityProperties>
</advancedProperties>
...
This is what the configuration looked like before applying January 2023 CU. If it still looks like this after installing the CU and running the configuration wizard, the configuration is incorrect and needs to be updated:
...
<advancedProperties>
  <partitionStoreConnectionSettings leadHostManagement="false" />
  <securityProperties mode="None" protectionLevel="None">
    <authorization>
      <allow users="WSS_ADMIN_WPG" />
      <allow users="WSS_WPG" />
    </authorization>
  </securityProperties>
</advancedProperties>
...
You will notice the additional mode and protectionLevel attributes in the above incorrect configuration.
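The manual check can also be scripted. The sketch below assumes the configuration was exported to c:\temp\clusterconfig.xml as shown earlier. Test-LegacyCacheSecurity is a hypothetical helper name, and the lookup deliberately ignores XML namespaces because the full layout of the exported file is not shown in this post.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): returns $true when the exported
# configuration still carries the pre-CU security attributes.
function Test-LegacyCacheSecurity {
    param(
        [Parameter(Mandatory = $true)]
        [xml]$Config
    )

    # Match on local-name() so the lookup works with or without a declared namespace.
    $node = $Config.SelectSingleNode("//*[local-name()='securityProperties']")
    if ($null -eq $node) {
        throw 'No securityProperties element found in the exported configuration.'
    }

    # The incorrect configuration shows mode="None" and protectionLevel="None".
    # GetAttribute returns an empty string when the attribute is absent.
    return ($node.GetAttribute('mode') -eq 'None') -or
           ($node.GetAttribute('protectionLevel') -eq 'None')
}

# Usage:
# $config = [xml](Get-Content -Path 'c:\temp\clusterconfig.xml' -Raw)
# if (Test-LegacyCacheSecurity -Config $config) { 'The configuration still needs to be updated.' }
```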
Fixing the Distributed Cache configuration
To resolve the issue execute the following PowerShell commands:
SharePoint Server 2016 and 2019
Stop-CacheCluster
Set-CacheClusterSecurity -SecurityMode Transport -ProtectionLevel EncryptAndSign
Start-CacheCluster
SharePoint Server Subscription Edition
Stop-SPCacheCluster
Set-SPCacheClusterSecurity -SecurityMode Transport -ProtectionLevel EncryptAndSign
Start-SPCacheCluster
Be aware that the Set-SPCacheClusterSecurity command was added in January 2023 CU for SharePoint Server Subscription Edition and is not available in older builds.
The commands above stop the cache cluster, update the cache cluster configuration (this only works if the cache cluster has been stopped first, otherwise you will get an error) and start the cluster again.
Ensure that the client logic is able to identify the security configuration of the distributed cache server
Required change on servers NOT hosting the Distributed Cache service
To ensure that the SharePoint worker processes on servers which do not host the Distributed Cache service can read the Distributed Cache configuration, a custom provider has to be registered in the registry.
Copy the text below AS IS into a file with the .reg extension (e.g. DistributedCacheProvider.reg) and apply it on each server in your farm which does not host the Distributed Cache service.
It is best to hover over the text below and use the “Copy to Clipboard” function to ensure that you copy the exact text you need.
SharePoint Server 2016 and 2019
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\AppFabric\V1.0\Providers\AppFabricCaching\SPDistributedCacheClusterProvider]
"DisplayName"="Microsoft SharePoint AppFabric Caching Service Configuration Store Provider"
"Type"="Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheClusterCustomProvider, Microsoft.SharePoint, Version=16.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c"
SharePoint Server Subscription Edition
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\16.0\Caching\Providers\SPDistributedCacheClusterProvider]
"DisplayName"="Microsoft SharePoint AppFabric Caching Service Configuration Store Provider"
"Type"="Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheClusterCustomProvider, Microsoft.SharePoint, Version=16.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c"
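To confirm that the provider key is in place on a server, the registry can be queried from PowerShell. This is a minimal sketch: Get-CacheProviderRegistration is a hypothetical helper name, and the key paths in the usage lines are the ones from the .reg files above, written in PowerShell's HKLM: drive notation.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): returns the DisplayName and Type
# values of the provider key, or $null when the key does not exist on this server.
function Get-CacheProviderRegistration {
    param(
        [Parameter(Mandatory = $true)]
        [string]$KeyPath
    )

    if (-not (Test-Path -Path $KeyPath)) {
        return $null
    }

    Get-ItemProperty -Path $KeyPath | Select-Object -Property DisplayName, Type
}

# Usage for SharePoint Server 2016 and 2019:
# Get-CacheProviderRegistration -KeyPath 'HKLM:\SOFTWARE\Microsoft\AppFabric\V1.0\Providers\AppFabricCaching\SPDistributedCacheClusterProvider'
# Usage for SharePoint Server Subscription Edition:
# Get-CacheProviderRegistration -KeyPath 'HKLM:\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\16.0\Caching\Providers\SPDistributedCacheClusterProvider'
```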
Required change on servers hosting the Distributed Cache service
There is one final step that needs to be completed: the worker process needs to be able to access the registry on the Distributed Cache servers to read the Distributed Cache registry key. To ensure that this works, grant read access to the WSS_WPG group on the following registry key on all servers hosting the Distributed Cache service:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurePipeServers\winreg
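The permission can be set in the Registry Editor or with PowerShell. The following is only a sketch of the PowerShell route, to be run from an elevated session on each server hosting the Distributed Cache service. Grant-WinregReadAccess is a hypothetical helper name, and applying the rule to subkeys (ContainerInherit) is an assumption that you may want to adjust.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): grants read access on the winreg
# key to a group. Must be run from an elevated PowerShell session.
function Grant-WinregReadAccess {
    param(
        [string]$Identity = 'WSS_WPG',
        [string]$KeyPath = 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurePipeServers\winreg'
    )

    # Read-only rights; ContainerInherit (assumption) extends the rule to subkeys.
    $rule = New-Object -TypeName System.Security.AccessControl.RegistryAccessRule `
        -ArgumentList $Identity, 'ReadKey', 'ContainerInherit', 'None', 'Allow'

    $acl = Get-Acl -Path $KeyPath
    $acl.AddAccessRule($rule)
    Set-Acl -Path $KeyPath -AclObject $acl

    # Return the entries for the group so the change can be reviewed.
    (Get-Acl -Path $KeyPath).Access |
        Where-Object { $_.IdentityReference -like "*$Identity" }
}

# Usage:
# Grant-WinregReadAccess
```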
After this step is completed, restart IIS on your web front-end servers so that the SharePoint worker processes try to reconnect to the SharePoint Distributed Cache cluster.
Credits
Thanks a lot to my colleague Dan Cristureanu from Romania, whose analysis of several support cases was a significant contribution to this blog post.
17 Comments
If we have not installed the January patch, would installing the February patch route around this entire issue?
Hi Charles,
unfortunately not.
I will update this thread as soon as I have further information if and when it will be addressed.
Cheers,
Stefan
May I please ask if the Feb 2023 updates cover this fix?
Thanks,
Hi Phu, the issue is not addressed in February 2023 CU.
Where are the links to the language dependent and language independent?
Hi Peter,
please check my other blog posts about CU releases.
This post is not a CU release post – it talks about a specific problem.
Thanks,
Stefan
That seems to have fixed my issues on my two 2016 test farms, thanks Stefan (and Dan!) for the instructions and details on the issue. I’d been seeing the feed cache repopulation jobs fail and they’re both running properly after the fix. The Distcache seems happy & healthy.
Thanks for the confirmation!
🙂
These steps fixed my issues as well in SP 2019 test farm 🙂 Our configuration was correct but the last registry update did the trick. Thank you!
Great – thanks for the confirmation!
Just to get things clear:
We have the jan CU patch installed. Will we still have this issue, after installing the feb patch?
Hi Rickard, yes.
I suppose that the registry fix is not required for single server farms. Am I right?
On a single server farm with DCache enabled you should already have this registry key in place on your server.
Hello Stefan, we are getting these unexpected errors in our logs after applying the January patch “Unexpected error while executing ExportCacheClusterConfig with parameters provider” and “Unexpected Exception in getting cache cluster security config” but when we check the config file it looks correct and if we run the powershell to set “-SecurityMode Transport -ProtectionLevel EncryptAndSign” it tells us no change required.
Just curious if anyone else has this issue where we are getting the errors but the config file is set correctly?
Thanks
Yes, that is expected. You have to apply the other steps on this page as well to resolve the problem.
Cheers,
Stefan
Hello Stefan,
We are on SP 2019.
What are the day to day usability impacts of this issue?
I have 3 farms on the Jan 2023 CU since January. All passed testing and seem to be functioning as expected.
I confirmed our cluster config on all servers/farms are correct and I don’t find the error messages in the ULS logs of any server. I searched a full day’s worth of ULS logs from each server; is that enough?
Our Dev farm is missing the registry entries for the non-DC servers, but QA and Prod have the registry entries.
None of the DC servers have the registry permission for WSS_WPG.
All farms appear to function as expected, but none meet all of the conditions of the fix instructions. Dev is particularly out of spec.
If I were to apply all of the fix items will the Feb 2023 CU possibly put servers back into a problem state?
Thanks so much!