this post was submitted on 12 Apr 2024

Help with HDD (lemmy.ml)
submitted 5 months ago* (last edited 5 months ago) by gary_host_laptop@lemmy.ml to c/linux@lemmy.ml
 

I have a 4TB HDD that I use to store music, films, images, and text files, and a 250GB SSD where I install my OS and video games. So far I haven't had any problems with this setup. Reading from the HDD has always been a bit slower, but nothing too serious. Lately, though, it's gotten way worse: it lags badly whenever I try to access files on that disk, and it's especially annoying when listening to music. I'm using the Elisa music player and it takes ages to load the albums.

Below is my system and HDD information. I think I'm supposed to use hardlinks or something to access those files; could that be a reason? I've never even completely filled my HDD and it's only 3 years old.

System Details Report


Hardware Information:

  • Hardware Model: ASUSTeK COMPUTER INC. PRIME A320M-K
  • Memory: 16.0 GiB
  • Processor: AMD Ryzen™ 5 5600G with Radeon™ Graphics × 12
  • Graphics: AMD Radeon™ Graphics
  • Disk Capacity: 4.2 TB

Software Information:

  • Firmware Version: 6042
  • OS Name: Fedora Linux 39 (Workstation Edition)
  • OS Build: (null)
  • OS Type: 64-bit
  • GNOME Version: 45.5
  • Windowing System: Wayland
  • Kernel Version: Linux 6.7.11-200.fc39.x86_64

[–] LordTE7R1S@lemmy.sdf.org 8 points 5 months ago (1 children)

I would check the kernel messages (sudo dmesg) for errors on the ATA bus. If there are any, it's most likely the disk that is failing.
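
As a rough illustration of what to look for, here is a minimal Python sketch that filters kernel log text for lines that usually accompany SATA link trouble. The function name `find_ata_errors` and the marker list are assumptions made for this example; the list is not exhaustive and is not an official set of libata messages.

```python
import re

# Substrings that commonly appear in kernel messages when a SATA link
# misbehaves. Illustrative assumption only, not an exhaustive list.
ATA_MARKERS = (
    "ATA bus error",
    "hard resetting link",
    "failed command",
    "I/O error",
    "SError",
)

# Matches port tokens such as "ata2" or "ata2.00".
ATA_PORT = re.compile(r"\bata\d+(?:\.\d+)?\b")


def find_ata_errors(log_text):
    """Return (port, line) pairs for lines containing a trouble marker.

    `port` is the ataN / ataN.MM token found on the line, or None when
    the line does not name a port.
    """
    hits = []
    for line in log_text.splitlines():
        if any(marker in line for marker in ATA_MARKERS):
            match = ATA_PORT.search(line)
            hits.append((match.group(0) if match else None, line.strip()))
    return hits
```

To try it, save the output of `sudo dmesg` to a file, read that file into a string, and pass the string to `find_ata_errors`. An empty result does not prove the drive is healthy; it only means none of these particular markers showed up.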

[–] burrito@sh.itjust.works 6 points 5 months ago (1 children)

A bad SATA cable will cause this too.

[–] LordTE7R1S@lemmy.sdf.org 1 points 5 months ago

Yes, that's true, and so will a bad or undersized power supply, but in my experience a bad disk is much more likely than something else failing.

[–] boredsquirrel@slrpnk.net 7 points 5 months ago* (last edited 5 months ago) (1 children)

Get a new one, maybe a 4TB SATA SSD (if you have the space, SATA is just better than NVMe imho, way cheaper and less heat) and DO A BACKUP.

"Spinning things" will break way easier than nonmoving parts.

These might very well be signs of failure.

[–] atzanteol@sh.itjust.works -1 points 5 months ago

"Spinning things" will break way easier than nonmoving parts.

Just because there are no moving parts doesn't mean there is no wear. SSDs have a maximum number of program/erase cycles, which causes them to fail over time. They do tend to be more reliable than HDDs, but the difference is not as dramatic as you might think, and it likely depends on the quality of the drive more than anything.

Some info: https://www.backblaze.com/blog/how-reliable-are-ssds/

[–] EinfachUnersetzlich@lemm.ee 4 points 5 months ago (1 children)

What results do you get from hdparm's speed tests?

What filesystem is on the disk?

[–] Extrasvhx9he@lemmy.today 5 points 5 months ago (1 children)

Not OP, but the pic says ext4.

[–] EinfachUnersetzlich@lemm.ee 2 points 5 months ago

Ah, right you are! Didn't spot that

[–] Atemu@lemmy.ml 2 points 5 months ago

Monitor I/O on the drive; is anything using it while your system is idle?

What's I/O like when loading an album?
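
Tools like iotop or iostat do this properly, but as a minimal sketch of the idea, the Python below samples `/proc/diskstats` twice and reports how many bytes were read and written in between. It assumes a Linux system; the device name (for example `sdb`) and the function names are placeholders chosen for this example.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (sectors_read, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # fields: major, minor, name, reads, reads merged, sectors read,
        # ms reading, writes, writes merged, sectors written, ...
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def io_delta(before, after, device):
    """Return (bytes_read, bytes_written) for `device` between two samples."""
    r0, w0 = before[device]
    r1, w1 = after[device]
    return (r1 - r0) * SECTOR_BYTES, (w1 - w0) * SECTOR_BYTES


def watch(device, seconds=5.0, path="/proc/diskstats"):
    """Print bytes read/written on `device` over `seconds` (Linux only)."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_diskstats(f.read())
    read_b, written_b = io_delta(before, after, device)
    print(f"{device}: {read_b} bytes read, {written_b} bytes written in {seconds}s")
```

Calling `watch("sdb")` once while the system is idle and once while an album loads gives a rough answer to both questions. It does not tell you which process is responsible for the traffic.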

[–] bizdelnick@lemmy.ml 2 points 5 months ago* (last edited 5 months ago) (1 children)

Check its SMART: smartctl -a /dev/sdb.

[–] gary_host_laptop@lemmy.ml 2 points 5 months ago (1 children)

It's a bit long but here it is.

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.7.11-200.fc39.x86_64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba X300
Device Model:     TOSHIBA HDWE140
Serial Number:    41IUK67JFBRG
LU WWN Device Id: 5 000039 abc58066c
Firmware Version: FP1R
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5528
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Apr 13 17:09:03 2024 -03
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  120) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 481) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       464
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       11263
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       11051
 10 Spin_Retry_Count        0x0033   253   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       11259
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       8
192 Power-Off_Retract_Count 0x0032   083   083   000    Old_age   Always       -       8637
193 Load_Cycle_Count        0x0032   099   099   000    Old_age   Always       -       11332
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       38 (Min/Max 16/59)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
222 Loaded_Hours            0x0032   073   073   000    Old_age   Always       -       11003
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       545
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more
[–] bizdelnick@lemmy.ml 2 points 5 months ago (1 children)

Everything seems ok. It is unlikely that the disk itself is dying. Maybe the problem is a bad cable or bus controller. Or something is wrong with the filesystem.
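
One detail in the output above is at least consistent with a link problem: the "SATA Version is" line says the drive supports 6.0 Gb/s but is currently running at 1.5 Gb/s. That alone is not proof, since an older controller or port can also cap the speed. Here is a minimal Python sketch for pulling those two numbers out of `smartctl -a` text. The function names are made up for this example, and the regular expression assumes the line looks like the one posted here.

```python
import re

# Assumes a line shaped like:
#   SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
LINK_RE = re.compile(
    r"SATA Version is:.*?([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)"
)


def link_speeds(smartctl_output):
    """Return (max_gbps, current_gbps), or None if the line is absent."""
    match = LINK_RE.search(smartctl_output)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def link_is_downgraded(smartctl_output):
    """True when the current link speed is below the drive's maximum."""
    speeds = link_speeds(smartctl_output)
    return speeds is not None and speeds[1] < speeds[0]
```

For the report in this thread, `link_speeds` would return `(6.0, 1.5)`, so `link_is_downgraded` is true. I'd treat that as a hint worth following up with a cable or port swap, not as a diagnosis.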

[–] gary_host_laptop@lemmy.ml 2 points 4 months ago

Hey, I replaced the cable and everything seems to be working fine, thank you!

[–] nyan@sh.itjust.works 2 points 5 months ago

I've got a 6TB SATA HDD (also formatted ext4) and while files on it don't always open instantaneously, the pause is only a fraction of a second at most (barely enough to notice). So I'll join the chorus suggesting you check for hardware issues—bad drive, bad or loose cables, or a bad controller on the mobo.

[–] Toast@feddit.de 0 points 5 months ago (1 children)
[–] ryannathans@aussie.zone 3 points 5 months ago

Not on ext4 that's not full, unless it was previously almost completely full and a lot of deletes and rewrites occurred.

[–] Extrasvhx9he@lemmy.today -4 points 5 months ago* (last edited 5 months ago) (1 children)

IIRC, mechanical drives do slow down significantly over time due to wear and tear, so it kind of sounds like it's on its way out. If speed is a must, maybe look at how much storage capacity you're using and switch to appropriately sized SSD(s). You can keep the mechanical drive as a cold backup.

Edit: not sure if you've already done this, and I usually don't recommend it if you don't have backups, but benchmarking would show you the read and write speeds. Also, depending on warranty status, you have the option of a manufacturer replacement. Not sure what info Toshiba asks for, but it doesn't hurt to look into it if you do decide to replace it.

[–] atzanteol@sh.itjust.works 7 points 5 months ago (2 children)

with time mechanical drives do slow down significantly due to wear and tear

Do you mean "because it's dying?" Because otherwise I've never heard of this or seen this before. The disks must spin at a precise speed for the read/write head to work since it expects data to be read at a constant rate.

If it's dying there should be a ton of crap in the system logs (try dmesg or journalctl -k -f). You can also use smartctl to check for reported errors, or badblocks to see if there are issues with the disk.

[–] gary_host_laptop@lemmy.ml 1 points 5 months ago (1 children)

I used journalctl -k -f and this is the output. I've also tried dmesg and smartctl and replied with the output in other comments, and all show some kind of I/O error. I guess I'm fucked, or could it be because of a bad cable or something?

Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#14 Sense Key : Illegal Request [current] 
Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#14 Add. Sense: Unaligned write command
Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#14 CDB: Read(16) 88 00 00 00 00 00 39 dd 26 78 00 00 00 08 00 00
Apr 13 17:06:59 fedora kernel: I/O error, dev sdb, sector 970794616 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#15 Sense Key : Illegal Request [current] 
Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#15 Add. Sense: Unaligned write command
Apr 13 17:06:59 fedora kernel: sd 1:0:0:0: [sdb] tag#15 CDB: Read(16) 88 00 00 00 00 00 39 dd 26 b8 00 00 00 20 00 00
Apr 13 17:06:59 fedora kernel: I/O error, dev sdb, sector 970794680 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 2
Apr 13 17:06:59 fedora kernel: ata2: EH complete
Apr 13 17:19:28 fedora kernel: ata2.00: exception Emask 0x50 SAct 0x400 SErr 0x30802 action 0xe frozen
Apr 13 17:19:28 fedora kernel: ata2.00: irq_stat 0x00400000, PHY RDY changed
Apr 13 17:19:28 fedora kernel: ata2: SError: { RecovComm HostInt PHYRdyChg PHYInt }
Apr 13 17:19:28 fedora kernel: ata2.00: failed command: READ FPDMA QUEUED
Apr 13 17:19:28 fedora kernel: ata2.00: cmd 60/00:50:38:e3:9d/01:00:39:00:00/40 tag 10 ncq dma 131072 in
                                        res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x50 (ATA bus error)
Apr 13 17:19:28 fedora kernel: ata2.00: status: { DRDY }
Apr 13 17:19:28 fedora kernel: ata2: hard resetting link
Apr 13 17:19:31 fedora kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr 13 17:19:31 fedora kernel: ata2.00: configured for UDMA/33
Apr 13 17:19:31 fedora kernel: sd 1:0:0:0: [sdb] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
Apr 13 17:19:31 fedora kernel: sd 1:0:0:0: [sdb] tag#10 Sense Key : Illegal Request [current] 
Apr 13 17:19:31 fedora kernel: sd 1:0:0:0: [sdb] tag#10 Add. Sense: Unaligned write command
Apr 13 17:19:31 fedora kernel: sd 1:0:0:0: [sdb] tag#10 CDB: Read(16) 88 00 00 00 00 00 39 9d e3 38 00 00 01 00 00 00
Apr 13 17:19:31 fedora kernel: I/O error, dev sdb, sector 966648632 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 1
Apr 13 17:19:31 fedora kernel: ata2: EH complete


[–] atzanteol@sh.itjust.works 1 points 5 months ago

That's not great... No idea what those errors are specifically but something is very wrong. Hopefully you've made backups already because the drive may be failing.
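
One part of those lines can be decoded, at least. The `CDB: Read(16)` bytes are the SCSI command that was sent: in the standard READ(16) layout, byte 0 is the opcode (0x88), bytes 2 to 9 hold the starting LBA in big-endian order, and bytes 10 to 13 hold the number of logical blocks to read. Below is a minimal Python sketch (the name `decode_read16` is hypothetical) that decodes the first CDB from the log above.

```python
def decode_read16(cdb_hex):
    """Decode a SCSI READ(16) CDB given as space-separated hex bytes.

    Returns (lba, block_count). Raises ValueError for anything that is
    not a 16-byte CDB starting with opcode 0x88.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # starting logical block
    blocks = int.from_bytes(cdb[10:14], "big")  # transfer length in blocks
    return lba, blocks


print(decode_read16("88 00 00 00 00 00 39 dd 26 78 00 00 00 08 00 00"))
# (970794616, 8)
```

The decoded LBA matches the `sector 970794616` in the kernel's I/O error line, so the commands that failed were plain reads of 8 logical blocks (4096 bytes at 512 bytes per logical sector). That doesn't say why they failed, only what was being attempted.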

[–] Extrasvhx9he@lemmy.today 0 points 5 months ago* (last edited 5 months ago) (1 children)

Huh, weird. I do have experience of this happening, especially on used drives that are technically beyond their rated life (7+ years, etc.), so I guess it depends on the manufacturer and drive class, since you haven't personally experienced it. When I say slow down, I'm referring to the read and write speeds, not the platter RPM, even though that could happen too, such as with motor bearing wear. There are really multiple potential hardware issues that could cause read and write speeds to drop: head wear, platter degradation, etc. I do want to clarify that I'm not necessarily saying it's dying, or even that it's 100% a hardware issue, since fragmentation could be the cause. I'm not even sure it'll be throwing errors yet, so I can't wait to see what OP updates us with.

[–] atzanteol@sh.itjust.works 5 points 5 months ago* (last edited 5 months ago) (1 children)

Those are signs that the drive is failing, not "normal wear and tear".

The spindle must spin at a constant angular velocity for the drive to operate. It can't slow down and work since the heads expect the bits to be on the platters in certain locations and to come at a fixed rate. It doesn't slow down over time. HDD motors are extremely precise with feedback to ensure the rate is correct. It's not a simple DC motor.

Same with the heads - they don't touch the platters so I'm not sure why they would be "wearing down." But if they are that's a sign of failing not just "normal wear and tear". In fact if it does touch a platter that's catastrophic...

HDDs are made with excellent precision. I have a drive with a powered-on time of over 8 years and it performs as it did when I bought it. Just because there are moving parts doesn't mean they're bad. We've been making them for many years and have gotten very good at it.

And fragmentation is almost certainly not an issue with ext4.

[–] TimeSquirrel@kbin.social 2 points 5 months ago

We’ve been making them for many years and have gotten very good at it.

This can't be overstated enough. Modern mechanical hard drives are a hundred times more impressive than Swiss watches.