this post was submitted on 19 Jul 2024
784 points (94.4% liked)

Linux

This isn't a gloat post. In fact, I was completely oblivious to this massive outage until I tried to check my bank balance and it wouldn't log in.

Apparently Visa Paywave, banks, some TV networks, EFTPOS, etc. have gone down. Flights have had to be cancelled as some airlines' systems have also gone down. Gas stations and public transport systems are inoperable, and numerous Windows systems and Microsoft services are affected. (At least according to one of my local MSMs.)

Seems insane to me that one company's messed-up update could cause so much global disruption and take down so many systems :/ This is exactly why centralisation of services and large corporations gobbling up smaller companies and becoming behemoth services is so dangerous.

[–] aard@kyu.de 185 points 4 months ago (1 children)

The annoying aspect, from somebody with decades of IT experience, is this: what should happen is that crowdstrike gets sued into oblivion, and the people responsible for buying that shit should have an epiphany and properly look at how they are doing their infra.

But what will happen is that they'll just buy a new crowdstrike product that promises to mitigate the fallout of them fucking up again.

[–] 0x0@programming.dev 93 points 4 months ago (3 children)

decades of IT experience

Do any changes - especially upgrades - on local test environments before applying them in production?

The scary bit is what most in the industry already know: critical systems are held together with duct tape and maintained by juniors 'cos they're the cheapest Big Money can find. And even if not, "There's no time" or "It's too expensive" are probably the most common answers a PowerPoint manager will give to a serious technical issue being raised.

The Earth will keep turning.

[–] goodgame 34 points 4 months ago (1 children)

Some years back I was the 'Head' of systems stuff at a national telco that provided the national telco infra. Part of my job was to manage the national systems upgrades. I had the stop/go decision to deploy, and indeed pushed the 'enter' button to do it. I was a complete PowerPoint Manager and had no clue what I was doing; it was total Accidental Empires, and I should not have been there. Luckily I got away with it for a few years. It was horrifically stressful and not the way to mitigate national risk. I feel for the CrowdStrike engineers. I wonder if the latest embargo on Russian oil sales is in any way connected?

[–] 0x0@programming.dev 18 points 4 months ago

I wonder if the latest embargo on Russian oil sales is in any way connected?

Doubt it, but it's ironic that this happens shortly after Kaspersky gets banned.

[–] ik5pvx@lemmy.world 30 points 4 months ago (2 children)

Unfortunately, Falcon self-updates. And it will not work properly if you don't let it do so.

Also add "customer has rejected the maintenance window" to your list.

[–] MyNameIsRichard@lemmy.ml 35 points 4 months ago

Turns out it doesn't work properly if you do let it

[–] HumanPenguin 25 points 4 months ago (2 children)

Not OP, but that is how it used to be done. The issue is the attacks we have seen over the years, i.e. ransomware attacks etc., which have made corps feel they need to fix and update instantly to avoid attacks. So they depend on the corp they pay for the software to test the rollout.

Autoupdate is a two-edged sword. Without it, attackers etc. will take advantage of delays. With it, well, today.

[–] 0x0@programming.dev 15 points 4 months ago* (last edited 4 months ago) (2 children)

I'd wager most ransomware relies on old vulnerabilities. Yes, keep your software updated but you don't need the latest and greatest delivered right to production without any kind of test first.

[–] shirro@aussie.zone 133 points 4 months ago* (last edited 4 months ago) (11 children)

It isn't even a Linux vs Windows thing but a competent-at-your-job vs don't-know-what-the-fuck-you-are-doing thing. Critical systems are immutable and isolated, or as close as reasonably possible. They don't do live updates of third party software and certainly not software that is running privileged and can crash the operating system.

I couldn't face working in corporate IT with this sort of bullshit going on.

[–] rozodru@lemmy.world 61 points 4 months ago (1 children)

This is just like "what not to do in IT/dev/tech 101" right here. Ever since I've been in the industry, literally decades at this point, I was always told, even when in school, "Never test in production, never roll anything out to production on a Friday, if you're unsure have someone senior code review", all of which Crowdstrike failed to do. Even the most junior of junior devs should know better. So the fact that this update was allowed to go through... I mean, blame the juniors, the seniors, the PMs, the CTOs, everyone. If your shit is so critical that a couple of bad lines of poorly written code (which apparently is what it was) can cripple the majority of the world... yeah, crowdstrike is done.

[–] magic_lobster_party@kbin.run 35 points 4 months ago (1 children)

It’s incredible how an issue of this magnitude didn’t get discovered before they shipped it. It’s not exactly an issue that happens in some niche cases. It’s happening on all Windows computers!

This can only happen if they didn’t test their product at all before releasing to production. Or worse: maybe they did test, got the error, went “eh, it’s probably just something wrong with the test systems”, and then shipped anyway.

This is just stupid.

[–] CalcProgrammer1@lemmy.ml 28 points 4 months ago* (last edited 4 months ago) (4 children)

It's also a "don't allow third party proprietary shit into your kernel" issue. If the driver was open source it would actually go through a public code review and the issue would be more likely to get caught. Even if it did slip through, people would publicly have a fix by now with all the eyes on the code. It also wouldn't get pushed to everyone simultaneously under the control of a single company; it would get tested and packaged by distributions before making it to end users.

[–] monoboy@lemmy.zip 83 points 4 months ago* (last edited 4 months ago) (3 children)

Didn't Crowdstrike have a bad update to Debian systems back in April this year that caused a lot of problems? I don't think it was a big thing since not as many companies are using Crowdstrike on Debian.

Sounds like the issue here is Crowdstrike and not Windows.

[–] balder1993@programming.dev 43 points 4 months ago* (last edited 4 months ago) (1 children)

They didn’t even bother to do a gradual rollout, like even small apps do.

The level of company-wide incompetence is astounding, but considering how organizations work and disregard technical people’s concerns, I’m never surprised when these things happen. It’s a social problem more than a technical one.
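For anyone who hasn't seen one, a gradual rollout can be sketched roughly like this. It is only an illustration of the general idea, not how CrowdStrike or any particular vendor ships updates: the function name, the stage fractions, and the `apply_update` / `is_healthy` callbacks are all made up for the example.

```python
import math
from typing import Callable, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    stages: Sequence[float] = (0.01, 0.10, 0.50, 1.0),  # hypothetical wave sizes
    max_failure_rate: float = 0.0,
) -> List[str]:
    """Update hosts in growing waves and stop as soon as a wave looks unhealthy.

    Returns the list of hosts that actually received the update.
    """
    updated: List[str] = []
    for fraction in stages:
        # Cumulative number of hosts that should be updated after this wave.
        target = min(len(hosts), max(1, math.ceil(len(hosts) * fraction)))
        wave = list(hosts[len(updated):target])
        if not wave:
            continue
        for host in wave:
            apply_update(host)
        updated.extend(wave)
        failures = sum(1 for host in wave if not is_healthy(host))
        if failures / len(wave) > max_failure_rate:
            # Halt here: the remaining hosts never see the bad update.
            break
    return updated


# A fleet of 1000 machines where the update breaks every host it touches.
fleet = [f"host-{i}" for i in range(1000)]
touched = staged_rollout(fleet, apply_update=lambda h: None, is_healthy=lambda h: False)
print(len(touched))  # 10
```

With a first wave of 1%, a completely broken update stops after 10 machines instead of reaching all 1000. The health check is the hard part in practice, since a machine stuck in a boot loop cannot report anything, so a real system would more likely treat "no check-in after the update" as a failure.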

[–] PlexSheep@infosec.pub 18 points 4 months ago* (last edited 4 months ago) (1 children)

They didn't even bother to test their stuff, must have pushed to prod

(Technically, test in prod)

[–] DaneGerous@lemmy.world 17 points 4 months ago

A crowdstrike update killed a bunch of our Linux VMs that had a newer kernel a month or so ago.

[–] TCB13@lemmy.world 80 points 4 months ago (17 children)

While I don’t totally disagree with you, this has mostly nothing to do with Windows and everything to do with a piece of corporate spyware garbage that some IT Manager decided to install. If tools like that existed for Linux, doing what they do to the OS, trust me, we would be seeing kernel panics as well.

[–] tenchiken@lemmy.dbzer0.com 65 points 4 months ago (2 children)

Hate to break it to you, but CrowdStrike falcon is used on Linux too...

[–] kautau@lemmy.world 55 points 4 months ago* (last edited 4 months ago) (9 children)

And if it was a kernel-level driver that failed, Linux machines would fail to boot too. The number of people seeing this and saying “MS Bad” (which is true, but has nothing to do with this) instead of “how does an 83 billion dollar IT security firm push an update this fucked” is hilarious

[–] biscuitswalrus@aussie.zone 33 points 4 months ago (1 children)

Hate to break it to you, but most IT Managers don't care about crowdstrike: they're forced to choose some kind of EDR to complete audits. But yes things like crowdstrike, huntress, sentinelone, even Microsoft Defender all run on Linux too.

[–] Mikina@programming.dev 24 points 4 months ago (1 children)

I wouldn't call Crowdstrike corporate spyware garbage. I work as a Red Teamer in cybersecurity, and EDRs are the bane of my existence - they are useful, and pretty good at what they do. In the last few years, I've been struggling more and more with the engagements we do, because EDRs just get in the way and catch a lot of what would have passed undetected a month ago. Staying on top of them with our tooling is getting more and more difficult, and I would call that a good thing.

I've recently tested a company without EDR, and boy was it a treat. Not defending Crowdstrike, to call that a major fuckup is a great understatement, but calling it "corporate spyware garbage" feels a little bit unfair - EDRs do make a difference, and this wasn't an issue with their product in itself, but with the irresponsibility of their patch management.

[–] Swarfega@lemm.ee 64 points 4 months ago (7 children)

I've just spent the past 6 hours booting into safe mode and deleting crowd strike files on servers.

[–] allywilson@lemmy.ml 18 points 4 months ago (3 children)

Feel you there. 4 hours here. All of them cloud instances where getting access to the actual console isn't as easy as it should be, and trying to hit F8 to get the menu to get into safe mode can take a very long time.

[–] fin@sh.itjust.works 58 points 4 months ago (2 children)

That’s a hell of a strike to the crowd

[–] areyouevenreal@lemm.ee 57 points 4 months ago* (last edited 4 months ago) (15 children)

Crowdstrike already killed some Linux machines. Let's not pretend Windows is at fault here or Linux is magically better in this area. No one is immune from software that can run as a kernel module going bad.

[–] resin85@lemmy.ca 53 points 4 months ago
[–] nickiam2@aussie.zone 45 points 4 months ago* (last edited 4 months ago) (1 children)

I work in hospitality and our systems are completely down. No POS, no card processing, no reservations, we're completely f'ked.

Our only saving grace is the fact that we are in a remote location and we have power outages frequently. So operating without a POS is semi-normal for us.

[–] axzxc1236@lemm.ee 36 points 4 months ago* (last edited 4 months ago) (11 children)

I was born too late to understand what the Y2K problem was; this (the result) might be what people thought could happen.

[–] HumanPenguin 40 points 4 months ago* (last edited 4 months ago) (1 children)

Yep pretty much but on a larger scale.

1st, please do not believe the bull that there was no problem. Many folks like me were paid to fix it before it was an issue. So other than at a few companies, few saw the result, not because it did not exist, but because we were warned. People make jokes about the over-panic. But if that had not happened, it would have been years to fix, not days. Because without the panic, most corporations would have ignored it. Honestly, the panic scared shareholders. So boards of directors had to get experts to confirm the systems were compliant. And so much dependent crap was found running it was insane.

But the exaggerations of planes falling out of the sky etc. were also bull. Most systems would have failed, but BSODs would be rare. Code would crash, some of it catching the errors and shutting down cleanly, some failures going undiscovered until a short while later, as accounting or other errors showed up.

As others have said, the issue was that since the 1960s, computers were set up to treat years as 2 digits. So they had no expectation of handling 2000 other than to assume it was 1900. While from the early 90s most systems were built with ways to adapt to it, not all were, as many were only developing top layer stuff. And many libraries etc. had not been checked for this issue. Huge amounts of the infra of the world's IT ran on legacy systems, especially in the financial sector where I worked at the time.
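To make the two-digit problem concrete, here is a tiny sketch. It is not code from any real system; the function names are invented for illustration, and the pivot of 50 is just an assumed cut-off. "Windowing" like this was one common style of fix, alongside widening the stored year to four digits.

```python
def age_two_digit(birth_yy: int, current_yy: int) -> int:
    """Age computed the legacy way, with both years stored as two digits."""
    return current_yy - birth_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing fix: read 00..pivot-1 as 20xx and pivot..99 as 19xx.

    The pivot value is an assumption chosen for this example.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 2000 + yy if yy < pivot else 1900 + yy


print(age_two_digit(65, 99))             # 34, fine in 1999
print(age_two_digit(65, 0))              # -65, wrong once the year rolls over to "00"
print(expand_year(0) - expand_year(65))  # 35, correct after windowing
```

Nothing crashes in the broken case, it just quietly produces a negative age, which is the kind of error that only shows up later in accounting or interest calculations.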

The internet was a fairly new thing. So often stuff had been running for decades with no one needing to change it. Or having any real knowledge of how it was coded. So folks like me were forced to hunt through code or often replace systems that were badly documented or more often not at all.

A lot of modern software development practices grew out of discovering what a fucking mess can grow if people accept an "if it ain't broke, don't touch it" mentality.

[–] Reddfugee42@lemmy.world 33 points 4 months ago (1 children)

Most people are completely oblivious because it only affects people using crowdstrike, which practically excludes general consumers.

[–] SitD@lemy.lol 33 points 4 months ago* (last edited 4 months ago) (4 children)

I love how everyone understands the issue wrong. It's not about being on Windows or Linux. It's about the ecosystem that is commonplace and that people are used to on Windows or Linux. On Windows it's accepted that every stupid anticheat can drop its filthy paws into ring 0 and normies don't mind. Linux has fostered a less clueless community, but ultimately it's a reminder to keep vigilant and strive for pure and well documented open source with the correct permissions.

BSODs won't come from userspace software

[–] nonagonOrc@lemmy.world 16 points 4 months ago (1 children)

While that is true, it makes sense for antivirus/edr software to run in kernelspace. This is a fuck-up of a giant company that sells very expensive software. I wholeheartedly agree with your sentiment, but I mostly see this as a cautionary tale against putting excessive trust and power in the hands of one organization/company.

Imagine if this was actually malicious instead of the product of incompetence, and the update instead ran ransomware.

[–] digdilem@lemmy.ml 29 points 4 months ago (2 children)

Am on holiday this week - called in to help deal with this shit show :(

[–] Botzo@lemmy.world 20 points 4 months ago

Don't worry, George Kurtz (crowdstrike CEO) is unavailable today. He's got racing to do #04 https://www.gt-world-challenge-america.com/event/95/virginia-international-raceway

[–] jet@hackertalks.com 19 points 4 months ago

i hope you get overtime!

[–] bitwolf@lemmy.one 18 points 4 months ago

Would you really be paying for Crowdstrike for use at home?

[–] Strit@lemmy.linuxuserspace.show 17 points 4 months ago (2 children)
[–] bricklove@midwest.social 17 points 4 months ago* (last edited 4 months ago) (5 children)

I wanted to share the article with friends and copy a part of the text I wanted to draw attention to but the asshole site has selection disabled. Now I will not do that and timesnownews can go fuck themselves

[–] catculation@lemmy.zip 16 points 4 months ago (3 children)