Solving the extended install time for SPSE CUs

When comparing the installation time for CUs for SharePoint Server 2019 and SharePoint Server Subscription Edition there is a significant difference.
E.g. on my test machines installing a SharePoint Server 2019 fix takes roughly 20 minutes compared to nearly an hour for a similar SharePoint Server Subscription Edition fix.

The reason for this extended install time is that for SharePoint Server 2019 fixes the installer stops all relevant SharePoint and IIS Windows services before invoking the fix installation, while for SharePoint Server Subscription Edition fixes the services fail to be stopped. The root cause is that the method used to stop the services no longer works with SharePoint Server Subscription Edition. This leads to performance problems when the installer tries to update assemblies in the global assembly cache, because the files are in use.

Solution

To improve performance, the relevant Windows services need to be stopped before the fix is applied and restarted afterwards.
Here is the list of services to be stopped before the fix is applied. The sequence is important to ensure that the services remain stopped and are not automatically restarted:

Server Role          Service
All                  SharePoint Timer Service (SPTimerV4)
All                  SharePoint Tracing Service (SPTraceV4)
All                  SharePoint Administration (SPAdminV4)
All                  World Wide Web (W3SVC)
Search               SharePoint Server Search (OSearch16)
Search               SharePoint Search Host Controller Service (SPSearchHostController)
Distributed Cache    SharePoint Caching Service (SPCache)

The reverse sequence needs to be applied when restarting the services after the SharePoint CU has been installed.
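
The stop and restart sequence above can be sketched in PowerShell as follows. This is a minimal illustration and not the linked script: the service names come from the table above, while the function names Stop-FarmServices and Start-FarmServices are hypothetical. Services that are not installed or not running on the current server are skipped, and only services that were actually stopped are started again.

```powershell
# Stop order taken from the table above.
$stopOrder = @(
    'SPTimerV4',               # SharePoint Timer Service
    'SPTraceV4',               # SharePoint Tracing Service
    'SPAdminV4',               # SharePoint Administration
    'W3SVC',                   # World Wide Web Publishing Service
    'OSearch16',               # SharePoint Server Search (Search role only)
    'SPSearchHostController',  # SharePoint Search Host Controller (Search role only)
    'SPCache'                  # SharePoint Caching Service (Distributed Cache role only)
)

# Restarting uses the reverse of the stop order.
$startOrder = $stopOrder.Clone()
[array]::Reverse($startOrder)

# Hypothetical helper: stops each running service in the given order
# and outputs the names of the services it stopped.
function Stop-FarmServices {
    param([string[]]$Names)
    foreach ($name in $Names) {
        $svc = Get-Service -Name $name -ErrorAction SilentlyContinue
        if ($svc -and $svc.Status -eq 'Running') {
            Stop-Service -Name $name -Force
            $name
        }
    }
}

# Hypothetical helper: starts, in the given order, only the services
# that were stopped earlier.
function Start-FarmServices {
    param([string[]]$Names, [string[]]$PreviouslyStopped)
    foreach ($name in $Names) {
        if ($PreviouslyStopped -contains $name) {
            Start-Service -Name $name
        }
    }
}

# Usage (run elevated on the server being patched):
#   $stopped = @(Stop-FarmServices -Names $stopOrder)
#   ... install the CU ...
#   Start-FarmServices -Names $startOrder -PreviouslyStopped $stopped
```

Tracking which services were running beforehand avoids starting a service that was intentionally stopped or disabled on that server.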

Important: Extra care needs to be taken for the Distributed Cache when using zero-downtime-patching as a graceful shutdown of the Distributed Cache host is required to guarantee that the cache data is replicated to other Distributed Cache servers still running in the farm before the service is stopped.
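
As an illustration of a graceful shutdown, the following minimal sketch calls the Stop-SPDistributedCacheServiceInstance cmdlet with its -Graceful switch. It assumes it is run in the SharePoint Management Shell on the Distributed Cache host being patched; the wrapper function name Stop-DCacheGracefully is hypothetical, and the linked script may handle this step differently.

```powershell
# Hypothetical wrapper around the graceful Distributed Cache shutdown.
# Returns $true if the graceful stop was invoked, $false if the
# SharePoint cmdlet is not available in the current session.
function Stop-DCacheGracefully {
    $cmd = Get-Command Stop-SPDistributedCacheServiceInstance -ErrorAction SilentlyContinue
    if (-not $cmd) {
        Write-Warning 'SharePoint cmdlets not loaded; run this in the SharePoint Management Shell.'
        return $false
    }
    # -Graceful lets cached items move to the other cache hosts before the service stops.
    Stop-SPDistributedCacheServiceInstance -Graceful | Out-Null
    return $true
}

# Usage, before stopping the remaining Windows services on this host:
#   Stop-DCacheGracefully
```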

Solution Script

The above listed steps of stopping/restarting the services can be automated using a script. For your convenience I have created a sample script which performs the required operations:
https://github.com/stefangossner/Install-SPSE_Fix/blob/main/Install-SPSE_Fix.ps1

The script has two parameters:

  1. CULocation (mandatory): Here you pass in the location of the CU you are planning to install. E.g. C:\temp\uber-subscription-kb5002560-fullfile-x64-glb.exe
  2. ShouldGracefulStopDCache (optional): Pass in $true if a graceful shutdown of the Distributed Cache service on the current server should be attempted.

E.g. to apply the February 2024 SharePoint CU, previously downloaded to the temp directory, on a server hosting an instance of the Distributed Cache in a farm with multiple Distributed Cache hosts where zero downtime patching is being performed:

Install-SPSE_Fix.ps1 -CULocation C:\temp\uber-subscription-kb5002560-fullfile-x64-glb.exe -ShouldGracefulStopDCache $true

Without zero downtime patching the following command should be used:

Install-SPSE_Fix.ps1 -CULocation C:\temp\uber-subscription-kb5002560-fullfile-x64-glb.exe

34 Comments


  1. Many thanks for sharing, Stefan! I have some “déjà vu” with SP2013 patching when I see the approach. 🙂

    1. Indeed – although the root cause there was different. In SP2013 the delay was caused by the large number of packages we installed – between each package we stopped and restarted the services, which took a long time. The famous “Russ Maxwell script” disabled the services, preventing them from being started and stopped, which led to performance gains.
      Here the problem is that the services are not stopped in the first place.

  2. We usually restart the server for extra precaution after the CU patch. So I guess we would still start the services in reverse order before restarting the server then. This is a good find and will quicken the patching process.

  3. Hi Stefan!
    I need your advice on how to reduce the running time of the Configuration Wizard during CU installation. I have a very large SharePoint 2019 farm (4 servers per role), 4 SQL clusters and 1200 content databases. As a result, the Configuration Wizard on the main server (on which the CA is installed) runs for about 10 hours, plus about 15-20 minutes on each of the other servers. The total time, including the installation of the files themselves, is 15-16 hours.

    Naturally, before installing the CU, I stop the search service via Suspend-SPEnterpriseSearchServiceApplication.

    Is there any way to speed up the Configuration Wizard in this farm configuration?

    1. Hi Evgeny,
      I’m pretty sure the majority of the time is spent on upgrading the 1200 content databases.
      But this is something you don’t have to do in the config wizard!
      My suggestion would be: before starting the config wizard, start 10+ PowerShell windows and run Upgrade-SPContentDatabase in parallel for multiple databases.
      You can even create a PowerShell script which launches multiple PowerShell instances in parallel to automate this.
      After the content databases are upgraded you can then launch the config wizard, and it will only have to upgrade the config db and the service application databases and perform the other local machine tasks.
      Cheers,
      Stefan
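
A minimal sketch of the parallel approach suggested above, assuming it is run on a farm server with the SharePoint cmdlets available. The function names Split-IntoBatches and Start-ParallelContentDbUpgrade are hypothetical, and the sketch assumes database names contain no single quotes. It selects only the content databases that report NeedsUpgrade, distributes them round-robin over a number of batches, and opens one PowerShell window per batch.

```powershell
# Hypothetical helper: distributes items round-robin into $BatchCount groups.
# Returns a hashtable keyed by batch index (0-based).
function Split-IntoBatches {
    param([string[]]$Items, [int]$BatchCount)
    $batches = @{}
    for ($i = 0; $i -lt $Items.Count; $i++) {
        $key = $i % $BatchCount
        if (-not $batches.ContainsKey($key)) { $batches[$key] = @() }
        $batches[$key] += $Items[$i]
    }
    return $batches
}

# Hypothetical helper: launches one PowerShell window per batch, each of
# which upgrades its databases one after another.
function Start-ParallelContentDbUpgrade {
    param([int]$Windows = 10)

    # Loads the cmdlets on SharePoint 2019; ignored if already available.
    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    $names = @(Get-SPContentDatabase |
        Where-Object { $_.NeedsUpgrade } |
        ForEach-Object { $_.Name })

    $batches = Split-IntoBatches -Items $names -BatchCount $Windows
    foreach ($key in $batches.Keys) {
        # Build a quoted, comma-separated list of database names for the child process.
        $list = ($batches[$key] | ForEach-Object { "'$_'" }) -join ','
        $cmd = "Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue; " +
               "foreach (`$db in @($list)) { Upgrade-SPContentDatabase -Identity `$db -Confirm:`$false }"
        Start-Process powershell.exe -ArgumentList '-NoExit', '-Command', $cmd
    }
}

# Usage, before launching the Configuration Wizard:
#   Start-ParallelContentDbUpgrade -Windows 10
```

The number of windows is a tuning choice; how many parallel upgrades the SQL Server and the farm server can sustain depends on the environment.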

      1. Stefan, thanks a lot!

        Yes, the majority of the time is spent on upgrading the 1200 content databases; this information appears in the diagnostic logs.
        I’ll try running multiple PS sessions next time.

        1. I’ve done the same with 120 databases. With 1200 databases it may be beneficial to spread the load across 6 or more servers. I do my upgrade in batches of 10, with 4 parallel PS windows launched on each of 3 different servers.

      2. Hi Stefan,

        can the same approach be used with SPSE? As in, upgrade the databases in PowerShell before running the wizard? Thanks!

        1. Hi Randy,
          yes of course!
          Cheers,
          Stefan

  4. Hi Stefan, thanks for sharing. As usual you’re a life saver. I want to ask: will this situation be fixed in future updates?

    1. Hi Hasan,
      a fix request is currently in the backlog, but we do not have information on when it will be picked up.
      Cheers,
      Stefan

  5. Hi Stefan,

    Using the new script on our 2 APP/3 WFE SPSE farm has greatly improved the install time on our app servers (non-user access, crawl servers), but we are still seeing an hour to an hour and a half to install the same fix on the WFE servers. We stop user traffic via the load balancer; however, we have noticed several suspended noderunner processes whose PIDs correspond to the Restart Manager event log errors:
    ‘Unknown’ (pid 16848) cannot be restarted – Application SID does not match Conductor SID..

    If stopping the OSearch16 service isn’t killing those noderunner processes, is there another way to do that before installing updates?

    1. Hi Allison,

      I just suspend the Search Service Application, run the installation, and resume it again after the complete patching process:
      $ssa = Get-SPEnterpriseSearchServiceApplication -Identity "Your Search Service Application"
      Suspend-SPEnterpriseSearchServiceApplication -Identity $ssa
      Then run the patching process, and after that:
      Resume-SPEnterpriseSearchServiceApplication -Identity $ssa

      That helped for me. It took just 30 minutes on my single server (with all the search components as well).

  6. Hi Stephan,

    like your script. better than patchit.
    However: I guess I found a bug

    if ($srvOSearch16.Status -eq "Running" -and $restartOSearch16)

    guess you mean “-ne”

  7. Hi,

    Is this going to be fixed in future patches?

    1. Hi Dilan,
      a fix is on the backlog, but at this time there is no information available about when it might be implemented.
      Cheers,
      Stefan

  8. Thanks for the great script! The script installs the CU on our SP farm server in approx. 1 hr. That is 2 hr faster.

  9. Hello,
    it does go faster, but on some servers something must still be blocking the install, as it stalls for quite a while.
    For example, on a WFE we saw the same RestartManager event (event ID 10010) that Allison wrote about above. In our case the PID belonged to the PowerShell window that launched the install script:
    Application ‘C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe’ (pid 11660) cannot be restarted – Application SID does not match Conductor SID..
    In Event Viewer we can see this type of event for almost an hour, then eventually it succeeds.
    I hope we can find the cause of these issues, as patching many servers is very time-consuming now.
    Thanks,
    David

    1. Hi David,

      Same problem here. Last week, with one of my customers, I tried to install the CU and the same issue happened to me. On the app servers it installed in a short time, but on the front-end servers it took almost one hour.

      1. Hi David, Hi Hasan,
        the error message “Application SID does not match Conductor SID” comes from Windows and is related to permissions. I would recommend opening a support ticket with the Windows team of Microsoft Support to investigate this.
        Cheers,
        Stefan

        1. Hello Stefan, we opened a ticket (2504100050001065), but with SharePoint support. Do you suggest we get this closed or moved to a Windows team support ticket?
          I noticed now that the error appears for the PowerShell window that is running your install script. If I kill that PID, the install proceeds quickly. A good workaround for us right now is to run the install script with the Windows Application event log open in Event Viewer. After about 10 minutes these errors appear; we then look up the PID in the Task Manager Details tab and kill it. The install completes successfully after that. A restart of that server then brings back the services that were shut down by your script.
          All the best,
          David

          1. Hi David,

            If you find a solution for this situation, could you please share it with us? Every time I install the CUs, I have to check the event logs for RestartManager warning messages. Sometimes it says powershell, sometimes w3wp…


          2. Hi Hasan,

            We pursued this support case for a while, but unfortunately the support team was unwilling to engage in deeper troubleshooting. Their response was limited to standard statements such as “patch speeds vary depending on the system” and “ensure the system meets the necessary requirements.” I had a brief exchange with Stefan about it, but due to the difficulty in reproducing the issue consistently, we ultimately decided to close the case.

            At this point, we’ve implemented a very manual workaround:
            Approximately 10 minutes after initiating the patch installation, we monitor the Event Viewer for specific events (10010, 1005, and 10001), and manually terminate the processes involved once the restart loop begins. This significantly reduces the total installation time.

            If we take no action, the system eventually recovers on its own, but it can take over an hour. This strongly suggests that there is an underlying issue that can be resolved—though the root cause remains unidentified.

            Interestingly, some SharePoint roles always complete the patching process quickly. However, on servers running components like Trend Micro PortalProtect, the process consistently stalls, and the events mentioned above are logged.

            For context: all our SharePoint 2019 farms (and previously SharePoint 2016) running in the same environment—with identical GPOs and DSC-based installation scripts—do not experience this issue.

            Best regards,
            David
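
The manual monitoring described above could be partly scripted. The following minimal sketch lists recent Application log events with the IDs mentioned (10010, 1005 and 10001) and extracts the process ID from messages of the form shown earlier in this thread. The function names Get-PatchStallEvents and Get-PidFromMessage are hypothetical, and the sketch only reports; it deliberately does not terminate any process.

```powershell
# Hypothetical helper: extracts the process ID from a Restart Manager
# message such as "... (pid 11660) cannot be restarted ...".
# Returns $null if the message contains no PID.
function Get-PidFromMessage {
    param([string]$Message)
    if ($Message -match '\(pid (\d+)\)') {
        return [int]$Matches[1]
    }
    return $null
}

# Hypothetical helper: lists matching Application log events since $Since.
function Get-PatchStallEvents {
    param([datetime]$Since = (Get-Date).AddMinutes(-30))
    Get-WinEvent -FilterHashtable @{
        LogName   = 'Application'
        Id        = 10010, 1005, 10001
        StartTime = $Since
    } -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, Id, ProviderName,
            @{ Name = 'ProcessId'; Expression = { Get-PidFromMessage $_.Message } }
}

# Usage, roughly 10 minutes after starting the installation:
#   Get-PatchStallEvents | Format-Table -AutoSize
```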


          3. Hi David, Hi Hasan,
            I experience exactly the same issue on EVERY SPSE farm across a bunch of different customers with various antivirus and system configurations. SP16/19 usually work fine. Therefore I strongly assume that it is a SharePoint issue, not an environmental issue.
            Cancelling the installation is my approach as well, but I recently experienced issues with a corrupted install dir due to this. Neither the cancellation nor the installation has been handled properly by SharePoint SE for years… 🙁
            If someone has issues with the install dir after cancelling the installer, using oputil.vbs to clean it may help: https://www.c-sharpcorner.com/blogs/sharepoint-patch-installation-issues-and-fixes


  10. Hello,
    After executing the script, do I need to run the SharePoint Configuration Wizard?

    Thanks,

    1. Hi Laura,
      as always after completing the installation of SharePoint fixes on all machines in the farm you need to run the SharePoint Products Configuration Wizard on each of these servers.
      Cheers,
      Stefan

  11. What is the best sequence for installing the fix and running the config wizard in order to shorten the downtime?
    E.g. for the installation: WFE, then Search, then APP & CA;
    for running the config wizard: APP & CA, then Search, then WFE.

    I have 8 WFE (1 also hosting DC), 2 Search and 2 APP & CA servers.

  12. Hi Stefan,
    after using the script to install the latest CU for SharePoint SE on a server which hosts a Distributed Cache cluster node, I am not able to bring the Distributed Cache instance back up.
    Get-SPCacheHost -CachePort 22233 -HostName … gives:

    Get-SPCacheHost : ErrorCode:SubStatus:Error while loading the provider “SPDistributedCacheClusterProvider”. Check HKEY_LOCAL_MACHINE -> SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\16.0\Caching\Providers -> SPDistributedCacheClusterProvider.

    Trying to remove instance via Remove-SPDistributedCacheServiceInstance and add with Add-SPDistributedCacheServiceInstance does not help. Any advice?

      1. Sadly it doesn’t seem to be that. Now it shows:

        Get-SPServiceInstance | ? {($_.Service.Tostring()) -eq $instanceName}

        TypeName            Status     Id
        --------            ------     --
        Distributed Cache   Disabled   c5a58cf9-87cc-45b7-844a-d8ebf517d5bf
        Distributed Cache   Online     49a776f7-2e8e-4dc4-8000-318396a61935

        The disabled instance is on the problematic host.
        I can delete it via:
        $ins = Get-SPServiceInstance -Identity c5a58cf9-87cc-45b7-844a-d8ebf517d5bf
        $ins.Delete()

        But when I try to add instance using
        Add-SPDistributedCacheServiceInstance

        It ends with:
        Add-SPDistributedCacheServiceInstance : The operation completed successfully.
        At line:1 char:1
        + Add-SPDistributedCacheServiceInstance
        + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo : InvalidData: (Microsoft.Share…ServiceInstance:SPCmdletAddDist…ServiceInstance) [Add-
        SPDistributedCacheServiceInstance], CryptographicException
        + FullyQualifiedErrorId : Microsoft.SharePoint.PowerShell.SPCmdletAddDistributedCacheServiceInstance

        And there is a disabled instance again…

        1. Hi Vaclav,
          in this case it needs more research. Please open a support ticket with Microsoft Support to get assistance.
          Cheers,
          Stefan

          1. I finally solved it. It was the McAfee antivirus exploit protection, which was somehow blocking the provisioning of the Distributed Cache instance.
