Learn From the CrowdStrike Crash

Jul 24, 2024

Last Friday’s (July 19, 2024) CrowdStrike outage led to over 5,000 canceled flights, affected 8.5 million Windows devices, and could ultimately cost billions of dollars once the litigation is finalized. Though CrowdStrike has not revealed any compensation plans, tech analysts predict that CrowdStrike’s lawyers are “probably not going to enjoy” their summers.

The incident, already identified as the single largest IT outage in history, started with a software update. Microsoft’s dreaded “blue screen of death” (BSOD) shut down government services and businesses across the globe on Friday, disrupting emergency call centers, banks, airlines, and hospitals. CrowdStrike says its software is used by more than half of the Fortune 500 companies, and Microsoft’s Windows is the most widely used computer operating system in the world, accounting for over 68 percent of the desktop, tablet, and console OS market as of February 2024. Apple and Linux systems were not affected by this outage.

While Microsoft said a faulty software update from U.S. cybersecurity firm CrowdStrike was responsible for the major IT outage, the incident brought attention to just how large a market share both companies have in their respective sectors. This was not a cybersecurity incident; an update to the CrowdStrike application broke, with devastating consequences. The glitch was due to an update of CrowdStrike’s Falcon software, which is, ironically, designed to prevent harm from viruses and cyber threats. Described as a “tiny, single, lightweight sensor,” Falcon is installed on Windows computers that subscribe to this cyber protection.

What can we learn from this?
These days, too many companies and organizations are vulnerable to a “single point of failure.” When three companies (Microsoft, Amazon, and Google) dominate the market for cloud computing, one minor incident can, and did, have global ramifications.

The initial “fix,” as presented by Microsoft, required the end user (or IT staff) to jump through hoops to get their computers back up and running in Windows: boot into “safe mode,” find and delete a specific CrowdStrike file, and then reboot the computer normally. These seemingly simple steps are far beyond the skill set of most PC users.
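For the technically inclined, the “find and delete” step can be sketched in a few lines of Python. This is only an illustration, not an official tool: the folder and file pattern below are assumptions taken from CrowdStrike’s public guidance rather than from this article, and the function names are hypothetical. A computer stuck on the blue screen could not run a script like this, which is why the real fix had to be done by hand in safe mode or from a recovery drive.

```python
from pathlib import Path

# Assumptions (not stated in this article): the folder and file pattern
# come from CrowdStrike's public remediation guidance for the July 19 update.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
DEFAULT_PATTERN = "C-00000291*.sys"


def find_faulty_files(directory=DEFAULT_DIR, pattern=DEFAULT_PATTERN):
    """Return the files in `directory` whose names match `pattern`."""
    directory = Path(directory)
    if not directory.is_dir():
        return []  # folder missing: nothing to clean up
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def remove_faulty_files(directory=DEFAULT_DIR, pattern=DEFAULT_PATTERN, dry_run=True):
    """Delete matching files; with dry_run=True, only report them."""
    matches = find_faulty_files(directory, pattern)
    for path in matches:
        if not dry_run:
            path.unlink()  # actually delete the file
    return matches


if __name__ == "__main__":
    # Default is a dry run: list what would be deleted, delete nothing.
    for path in remove_faulty_files(dry_run=True):
        print(f"Would delete: {path}")
```

The sketch defaults to a “dry run” that only lists matching files, since deleting the wrong file from a Windows system folder can cause problems of its own.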

Microsoft releases a CrowdStrike Outage Recovery Tool
Microsoft has released a tool to help recover affected systems after last week’s global outage caused by cybersecurity firm CrowdStrike. The tool allows you to create a bootable USB drive to help recover impacted machines after the July 19 outage that affected an estimated 8.5 million Windows devices. Keep in mind that you’ll need access to a working Windows computer to actually create the USB tool.

https://redmondmag.com/Articles/2024/07/22/Microsoft-Releases-CrowdStrike-Outage-Recovery-Tool.aspx

Thanks to WIRED magazine
https://www.wired.com/story/crowdstrike-outage-update-windows/

David Snell and Rob Hakala

You can listen to this broadcast here: https://actsmartit.com/learn-from-crowdstrike/
David Snell joins Rob Hakala of the South Shore’s Morning News on 95.9 WATD FM every Tuesday at 8:11.