this post was submitted on 20 Jun 2024
Operating Systems
Don't follow tutorials, understand them. I'm so tired of seeing useless uses of cat because some asshole writing a tutorial 20 years ago decided to illustrate how pipes work with a good ol'
cat file | grep string
as if grep didn't take a file name as an argument. The more time I spend being mad about this, the more I notice people using horrible practices in tutorials because they're too lazy to set up a legit use case.
A new user sees this and thinks this is how grep works.
Loops are another common one. People going around not knowing you can pass a glob to a shell for loop. Because the tutorial they read was lazily written and they didn't bother to understand the bits of what they were being shown, only how to reproduce/mangle the command until they manage to get close enough to what they want out of it.
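A minimal sketch of the glob loop mentioned above, assuming a POSIX shell. The scratch directory and file names are made up for the demo. The point is that the shell expands the pattern itself, so there is no need to wrap ls or find in a command substitution first.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, created only for this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/notes one.txt" "$demo_dir/todo.txt" "$demo_dir/image.png"

# The shell expands the glob into one word per matching file.
# Quoting "$f" keeps a name with a space in it in one piece, which
# something like: for f in $(ls *.txt) would split into two words.
count=0
for f in "$demo_dir"/*.txt; do
    # If nothing matches, the pattern is left as literal text, so skip it.
    [ -e "$f" ] || continue
    printf 'found: %s\n' "${f##*/}"
    count=$((count + 1))
done

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Only the two .txt files are visited; the .png is never touched because it doesn't match the pattern.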
This is an advanced answer for someone who hasn't even installed Linux on their desktop yet. I've been using Linux for 4 or 5 years and don't even know what you're talking about.
You're absolutely right. For what it's worth, it's just the first part that's important.
When you pick up a new concept from a "resource" such as a tutorial, take a minute to explore the concept and understand the semantics of what you're doing. In the name of illustrating a concept, tutorials can often be misleading in subtle ways.
An explanation of my "useless use of cat" example:
The command line has a concept called "piping". This lets one command send output to a second command. It's very handy. There is usually also a "cat" command, which will read a file and send the contents where you tell it. This is often your screen, or through a "pipe" to a second command. There is also a "grep" command that lets you search data for certain words.
Many "linux newbie" tutorials combine these tools to show how "piping" lets you send data from one command to another. "cat" some text file, then "pipe" the output to "grep" to search for your words. It usually looks something like
cat ./my_address_book.txt | grep Giles
to find lines in "./my_address_book.txt" that contain the word "Giles". The thing is that "grep" can take a file name as an argument. You can just do
grep Giles ./my_address_book.txt
and cat is for concatenating files into one. If you want to simply read a file there are more appropriate tools such as "less". This, by the way, is the "useless use of cat".
When you're a newbie though, it may be the first time you're seeing either "grep" or "cat". The tutorial is just trying to show you "pipes". Along the way you're picking up these "bad habits". I've met professional sysadmins who didn't know grep took a filename as an argument. It was always "cat blah | grep my_search". I will see people type "cat /some/file | less" instead of "less /some/file". It shows a lack of understanding of what these tools actually do, and IMO it just comes down to regurgitating tutorial actions without bothering to understand the semantics of what you're being shown.
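To see that the two forms really do the same job, here's a small sketch you can paste into a POSIX shell. The address book contents are invented for the demo and written to a temp file; only the "Giles" search term comes from the example above.

```shell
#!/bin/sh
# Hypothetical address book, written to a temp file just for this demo.
book=$(mktemp)
printf '%s\n' \
    'Giles, 12 Example Road' \
    'Doe, 7 Sample Street' \
    'Giles, 3 Placeholder Lane' > "$book"

# The tutorial version: an extra process and a pipe that add nothing.
with_cat=$(cat "$book" | grep Giles)

# grep opens the file itself when given a file name.
direct=$(grep Giles "$book")

# If grep should read standard input, a redirect does that without cat.
redirected=$(grep Giles < "$book")

# Count the matching lines and show them.
match_count=$(grep -c Giles "$book")
printf '%s\n' "$direct"

rm -f "$book"
```

All three variables end up holding the same matching lines. A pipe into grep still earns its place when the input comes from another command's output rather than from a file sitting on disk.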
Not related to your point, but I always felt like piping from cat to grep is crazy inefficient. I’m a programmer so I imagine grep is much more efficient at finding stuff in files (in chunks maybe?) whereas cat likely reads the entire thing into memory (somehow less efficiently) to send it through the pipe.
…though now I’m wondering if my understanding is off.
I don't think that's what's happening. There's no hard requirement for cat to read everything straight into memory. It can send data once it's available, and the receiving process can read it as fast as it wants. There are cases where this might be more clear: Let's say you have a big video file that you want to convert to something that only supports like y4m input and is not in ffmpeg. A common way is something like
ffmpeg -i infile -f yuv4mpegpipe - | encoder --y4m outfile
I'm pretty sure ffmpeg won't read the whole infile into memory, nor will it store the whole y4m representation in memory. Instead, it will decode infile as necessary and push into the pipe at the speed the encoder can handle.
But yeah, I remember something about tar using libraries for compression being more efficient than piping its output to a compressor. So it's still the better route, but probably not as much better as you think.
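One way to convince yourself that a pipe streams data is to feed it from something that never ends. This is just a sketch; it assumes the usual "yes" command and a /dev/zero device are available (true on typical Linux and macOS systems), and it says nothing about ffmpeg specifically.

```shell
#!/bin/sh
# "yes" prints "y" forever, so there is no "whole thing" to load up front.
# head exits after three lines, the pipe closes, and yes gets stopped.
first_lines=$(yes | head -n 3)
printf '%s\n' "$first_lines"

# Same idea with cat reading an endless device file (assumes /dev/zero
# exists). If cat had to read all of its input before writing any of it,
# this pipeline would never finish.
# The $(( )) wrapper strips the padding some wc versions print.
byte_count=$(( $(cat /dev/zero | head -c 4096 | wc -c) ))
printf 'bytes through the pipe: %s\n' "$byte_count"
```

Both pipelines return right away, which only works because the writer hands data over in chunks and the reader is free to stop early.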
Good points. Yeah, rethinking it, it doesn’t make sense at all that it would read the whole thing into memory.
I'm absolutely going to do my best to understand and not copy/paste without doing that. I don't like doing things to my computer when I don't know what is happening, so that makes sense to me! I already ran into that issue plenty of times with my servers, so I'm trying to go all in now.
thank you!