this post was submitted on 19 Nov 2023

Self-Hosted Main

Having been so meticulous about taking backups, I've perhaps not been as careful about where I stored them, so I now have loads of duplicate files in various places. I've tried various tools (fdupes, czkawka, etc.), but none seems to do what I want. I need a tool that I can tell which folder (and its subfolders) is the source of truth, and have it look anywhere else for duplicates of those files and give me the option to move or delete them. Seems simple enough, but I have found nothing that allows me to do that. Does anyone know of anything?

[–] speculatrix@alien.top 2 points 11 months ago (2 children)

Write a simple script which iterates over the files and generates a hash list, with the hash in the first column.

find . -type f -exec md5sum {} ; >> /tmp/foo

Repeat for the backup files.

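Then make a third file by concatenating the two, sort that file, and run "uniq -d" on it, comparing only the hash column (with GNU uniq, "uniq -w 32 -d"), because the file names on each line will differ even when the hashes match. The output will tell you which files are duplicated.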

You can take the output of uniq and de-duplicate.
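The steps above can be sketched roughly as follows. This is only an illustration, not a tested tool: the function name and its two directory arguments are made up for the example, it pipes the hashes straight into sort instead of going through /tmp/foo, and it assumes GNU coreutils, since `-w` and `-D` are GNU uniq options. Because each md5sum line holds both the hash and the path, only the first 32 characters (the hash) are compared.

```shell
# Sketch only: list files whose MD5 hash occurs more than once across two trees.
# Assumes GNU coreutils (md5sum, and uniq with the -w and -D options).
list_dupes() {
    # $1 and $2 are the two directory trees to compare (placeholder arguments).
    {
        # md5sum prints "<32-char hash>  <path>" for each file.
        find "$1" -type f -exec md5sum {} +
        find "$2" -type f -exec md5sum {} +
    } | sort | uniq -w 32 -D   # sort groups equal hashes; -D prints every repeated line
}
```

Something like `list_dupes ~/photos /mnt/backup` (both paths are placeholders) would print every file that shares its hash with at least one other file. It only lists; nothing is changed on disk.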

[–] jerwong@alien.top 1 points 11 months ago

I think you need a \ in front of the ;

i.e.: find . -type f -exec md5sum {} \; >> /tmp/foo

[–] parkercp@alien.top 1 points 11 months ago (1 children)

Thanks @speculatrix - I wish I had your confidence in scripting - hence I'm hoping to find something that does all that clever stuff for me. The key thing for me is to be able to say something like: multimedia/photos/ is the source of truth, and anything found elsewhere is a duplicate.
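For what it's worth, the "source of truth" idea can be sketched in a few lines of shell. This is a minimal sketch under stated assumptions, not a finished tool: the function name and its arguments are hypothetical, it assumes md5sum is available, and it only prints paths. The source-of-truth path has to be written the way find would print it (for example, the search root followed by the subpath) so that it can be skipped during the scan.

```shell
# Sketch only: print files under a search tree whose content also exists
# under a source-of-truth tree. Nothing is moved or deleted.
dupes_outside_truth() {
    truth=$1     # source-of-truth tree, e.g. /data/multimedia/photos
    search=$2    # tree to scan for copies, e.g. /data
    truth_hashes=$(mktemp)

    # Collect the hashes of everything in the source of truth.
    find "$truth" -type f -exec md5sum {} + | cut -c1-32 | sort -u > "$truth_hashes"

    # Hash everything else (the truth tree itself is pruned) and print the
    # path of each file whose hash appears in the truth list.
    # md5sum output is "<32-char hash>  <path>", so the path starts at column 35.
    find "$search" -path "$truth" -prune -o -type f -exec md5sum {} + |
        awk 'NR == FNR { seen[$1]; next } $1 in seen { print substr($0, 35) }' "$truth_hashes" -

    rm -f "$truth_hashes"
}
```

Calling `dupes_outside_truth /data/multimedia/photos /data` (placeholder paths) would list the copies found outside the photos folder, one per line, leaving the decision to move or delete them to a separate step.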

[–] Digital-Chupacabra@alien.top 1 points 11 months ago (1 children)

I wish I had your confidence in scripting

You know how you get it? By fucking around and finding out! I'd say give it a go!

Do a dry run of the de-dup to make sure you don't delete anything you care about.
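One way to build in that dry run is to make "print what would happen" the default and require an explicit word to actually touch anything. The sketch below is an assumption-laden example: the function name, the quarantine directory and the "apply" keyword are all invented for illustration, and `mv -n` (do not overwrite an existing file) is a GNU/BSD option rather than plain POSIX. It moves files aside instead of deleting them, so a mistake can still be undone.

```shell
# Sketch only: read duplicate paths (one per line) on stdin and move them into
# a quarantine directory. By default it only prints what it would do.
quarantine_dupes() {
    dest=$1              # hypothetical quarantine directory
    mode=${2:-dry-run}   # pass "apply" as the second argument to really move files
    while IFS= read -r path; do
        if [ "$mode" = "apply" ]; then
            # -n refuses to overwrite a file already in the quarantine directory.
            mkdir -p "$dest" && mv -n -- "$path" "$dest/"
        else
            printf 'would move: %s -> %s/\n' "$path" "$dest"
        fi
    done
}
```

Feed it any newline-separated list of paths, read the "would move" lines carefully, and only then rerun it with `apply`. Files that share a basename with something already quarantined are left where they are because of `-n`.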

[–] parkercp@alien.top 1 points 11 months ago

Give me a few years and maybe :P - but for now I'd rather not risk important data with my own limited skills, especially if there is a product out there that's tried and tested and hopefully recommended by someone in this sub. I didn't expect my ask to be quite so unique.