My plan was to post the results of patching all 50,000 clients in one night with 4 Full Service Satellites. However, our change management team still wants to break the rollout into four nights until we truly know the impact of the change. So I will not have great results until late February or early March. What I can tell you is that on an average night of running a software, patch, audit, and an additional connect we call a "config" connect, our highest load, at 2 AM, is around 25 active tasks per RCS. We used to be around 75 per RCS before the change. If I subtract the 10 standard active tasks, that is 15 vs. 65. My public school math tells me we just quadrupled our capacity with this fix! I will provide more real-life numbers once we do some more rollouts.
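For anyone who wants to rerun that arithmetic with their own numbers, here is a minimal sketch. The function and variable names are made up for illustration, and it assumes the 10 standard active tasks are a fixed baseline that can simply be subtracted from the peak.

```python
# Hypothetical helper: names are illustrative, not part of any Radia tooling.
def patch_load(peak_tasks, baseline_tasks):
    """Active tasks attributable to client connects, after removing the baseline."""
    return peak_tasks - baseline_tasks

BASELINE = 10                        # standard active tasks per RCS (from the post)
before = patch_load(75, BASELINE)    # peak per RCS before the fix
after = patch_load(25, BASELINE)     # peak per RCS after the fix
ratio = before / after               # how many times less load per RCS

print(f"before={before}, after={after}, ratio={ratio:.2f}")
```

The ratio comes out a little above 4, which is where "quadrupled" comes from.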
Disconnect from RCS on Discover_Patch with Metadata only options
Kudos go out to the Accelerite team for quickly addressing an issue we reported. I have been a proponent of creating a less RCS-intensive Patch Manager product for quite a while. When the "Patch with Metadata Only" option came out, I believed this was the answer. It was actually called "Offline Scanning" or "OPUS" (no idea what that stood for) at certain points. That model downloads all the patch data needed for discover_patch to the device, so it does not have to talk to the RCS during the scan. I had assumed that meant it would disconnect from the RCS and not use an Active Task. We had designed our capacity around this assumption. However, after troubleshooting our infrastructure issues during our production patch rollouts over the last couple of months, I discovered it was not working that way. The client was not actually talking to the RCS, but it was still holding an Active Task and a connection to the RCS during the 3 to 5 minutes or so it took to do discover_patch (all locally). I am referring to the Microsoft patches, BTW.
Accelerite understood the issue and addressed it, and the fix is already in place for the next time you do a patch acquire. I was impressed.
Long-time customers may remember that a big selling point of Radia over EDM when it was first released was that EDM held the connection to the RCS (manager) during the entire connect. With Radia it was broken up, so the client would connect and disconnect multiple times so as not to waste resources on the server side while the client was busy doing work locally. We need to make sure we maintain that model. It helps us customers keep our infrastructure costs down. I actually have an enhancement request for disconnecting during the BDELETE for similar reasons.
I have not implemented this in Prod yet (it will not happen until the Jan patches are released), but I will repost any interesting results on reduced capacity needs after I analyze them.
Hi Brian, how do you handle the deployment of the wusscan2.cab file? We looked at the Metadata only model, but since we have found that the cab file is by far the largest thing we have to download to everyone on most patch months (so far anyway), we opted for the status quo because the file does not come down using the metadata only process. Incidentally, our numbers are similar to yours: 65,000 clients, 10 RCS, and average ztoptask threads of 50-60 (including the 10 base threads) per server. The vast majority of these threads are Patch Manager. (We also run software and audit daily.)
@John, we just let the wusscan2.cab go; we don't do anything special. We import discover_patch to our Prod environment a good week or two before we entitle the monthly patches to all machines. If I remember right, that file comes down with the old way (non Metadata only) too (or maybe I am remembering wrong). The additional file for metadata only is MSFT_PATCH.XML, which holds all the data that used to be held in patchmgr/zservices. That XML is 85 MB on my machines. However, I am pretty sure it compresses to something a lot smaller during download. I love the metadata only option for a few reasons. However, having to use the Apache download manager for it is another story. I have another post about all the issues I had getting that working to an acceptable level. I still want a version of this that is totally disconnected from the RCS during the scan but that still uses RIS/RCS instances when it downloads the actual patches.
Update from 2/2/2016: last night was the first night we got to use the new update for "metadata only", with 1/4 of our enterprise (about 12,000 clients) patching. With 4 FSS, we peaked around 50 Active Tasks per RCS. That is down from last month, when we patched 1/4 of the enterprise and maxed out at around 170 per RCS (and at that time we were hitting some sort of TCP/IP or Radia limit, not the hard or soft task limit). Another thing with "metadata only" is that you do 3 patch connects on patch night, not just 2 like the other model. So, our base was around 25 Active Tasks, and we added 25 more by patching 1/4 of the enterprise. My math says we would be peaking around 125 Active Tasks per RCS to patch everyone in one night. I have no issue with that! Of course, now that the Active Tasks are actually doing something instead of waiting, I will have to re-baseline the server performance. I will update again once we pull the trigger on all in one night.
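The projection above can be sketched as follows. This is only a back-of-the-envelope model: it assumes the added Active Tasks scale linearly with the fraction of clients patched, which the numbers so far do not prove, and the function name is made up for illustration.

```python
# Hypothetical helper: assumes patch load grows linearly with the share of clients patched.
def projected_peak(baseline, observed_peak, fraction_patched):
    """Estimate peak Active Tasks per RCS if 100% of clients patch in one night."""
    added = observed_peak - baseline           # tasks added by the partial rollout
    return baseline + added / fraction_patched

# Numbers from the 1/4-enterprise night: base ~25, peak ~50 per RCS.
full_night = projected_peak(baseline=25, observed_peak=50, fraction_patched=0.25)
print(f"Projected peak per RCS for a full rollout: {full_night:.0f}")
```

If the real load turns out not to be linear (for example, if clients start failing to connect before the peak is reached), the estimate will be off, so treat it as a planning number only.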
We did roll out patches to all 50,000 clients last night. We used 4 FSS to capacity, with the 5th and 6th receiving some clients from failover. So, it looks like with 6 FSS we can patch our 50,000 clients in one night without issue, with Patch Manager using the metadata only option. Considering that when we had 6 RCS in Classic using the "normal" Patch Manager we still had to split the Patch Manager load so each client connected once a week (about 8,000 a day), this is a huge improvement in the product. An interesting note is that the clients were failing to connect at around 75 Active Tasks (it was 175 before). There were no processor, memory, or disk slowdowns. Clients just failed to connect over port 3464 (the RCS does not actually have a log entry for those clients at all). I am starting to wonder if this is related to Symantec Endpoint Protection on the server. The actual network seems to be fine.