Most folks who work with servers know the monthly drill:
Vendors release patches -> the patches are tested -> the patches are deployed to production. What could possibly go wrong?
Anyone who has ever experienced the nail-biting joy of patching, and then awaiting a restart, knows exactly what could go wrong. Does anyone remember the really good old days when patches had to be manually staged prior to deployment? For those of you who entered the tech world after Windows NT was retired, consider yourself lucky!
If you think about it, most organizations that patch on a monthly basis are considered to have an aggressive patching strategy. As evidenced by the legendary Equifax breach, some organizations take months to apply patches. This is true even when the organization has been forewarned that the patch fixes a vulnerability under active exploitation. (Strictly speaking, a “zero-day” is a flaw exploited before any patch exists; once the fix ships, there is even less excuse for delay.)
Patching is never a flawless operation. There is always one server that just seems to have problems. What is the first response when this happens? Blame the patch, of course! After all, what else could have changed on the server? Plenty, actually.
Sometimes, removal of the patch doesn’t fix the problem. I have seen the patch still held responsible for whatever has gone wrong with the server. I am not blindly defending the patch authors, as there have been too many epic blunders in patching for me to exhibit that kind of optimism and not laugh at myself. But what can we do to avoid the patch blame game?
The simple solution is to restart the servers before deploying patches. This is definitely an unorthodox approach, but it can certainly reduce troubleshooting time and “patch blame” when something goes wrong. If you restart a server and it doesn’t come back properly, that indicates an underlying problem existed before any patch was applied.
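To make the idea concrete, here is a minimal sketch of a single-server pre-patch restart check. The `restart` and `is_up` callables are hypothetical placeholders for whatever tooling you actually use (SSH, WinRM, a health endpoint); only the wait-and-verify logic is shown.

```python
import time

def pre_patch_restart_check(host, restart, is_up, timeout_s=300, poll_s=5):
    """Restart a host and wait for it to come back before any patching.

    `restart` and `is_up` are injected callables (hypothetical here) so the
    check can wrap whatever remote-management tooling is actually in place.
    Returns True if the host is reachable again within the timeout.
    """
    restart(host)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_up(host):
            return True   # survived a plain restart: safe to proceed with patching
        time.sleep(poll_s)
    return False          # pre-existing problem: fix this before blaming any patch
```

A host that returns `False` here goes to the repair queue, not the patch queue, which is the whole point: the failure happened with zero patches involved.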
This may seem like a waste of time; however, the alternative is usually more time-consuming.
If you patch a server and it fails at restart, you will first waste time hunting for the offending patch and then removing it. Then, on the subsequent restart, the machine still fails. Now what?
Even if we scale this practice to 1,000 servers, the time is still not wasted. If you are confident that your servers can withstand a simple restart, then restart them all. The odds are in your favor that most will restart without any problems. If fewer than 1% of them fail, you can address those problems directly instead of mistakenly chasing them as patch failures.
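At fleet scale, the practice amounts to a simple triage: restart everything, set aside the machines that fail a plain restart, and clear the rest for patching. The helper below is an illustrative sketch; `restart_ok` stands in for whatever per-host restart check you already have.

```python
def triage_fleet(hosts, restart_ok):
    """Split a fleet into patch-ready and needs-repair before patch day.

    `restart_ok` is a hypothetical callable: it restarts one host and
    reports whether the host came back. Hosts that fail a plain restart
    are set aside for repair; the rest are cleared for patching.
    Returns (cleared, broken, failure_rate).
    """
    cleared, broken = [], []
    for host in hosts:
        (cleared if restart_ok(host) else broken).append(host)
    failure_rate = len(broken) / len(hosts) if hosts else 0.0
    return cleared, broken, failure_rate
```

If `failure_rate` stays under your threshold (the article suggests around 1%), the broken handful gets fixed on its own merits, and every machine that later fails after patching really can be blamed on the patch.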
Once all the servers restart normally, proceed with your normal patching, and feel free to blame the patch if a server fails after patching.
The same approach can also be applied to workstations in a corporate environment. Since most organizations manage workstation patching centrally rather than relying on automatic updates, a pre-patch restart can be forced on workstations as well.
Patching has come a long way from the early days, when the internet was young and no vulnerabilities existed (insert sardonic smile here). The rate of exploits and vulnerabilities has accelerated, requiring faster action to protect your networks. Since patches are not without flaws, one easy way to rule out patching as the source of a problem is to restart before patching.