IPFS? I assure you, no individual here can afford to host even a single copy of the whole Internet Archive.
So, in my understanding ActivityPub is fine for different forms of decentralised communication — what you're suggesting sounds more to me like a generalised peer-to-peer network or distributed file storage (see DAT or IPFS)?
ActivityPub seems like the wrong tool for this job.
What you're looking for is a decentralised, distributed file system or object store as the base for this.
And it's going to require a lot of participants in the network to reach the storage capacity and redundancy it needs to function well.
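To make "distributed object store" a bit more concrete, here is a minimal sketch of content addressing, the idea underneath systems like IPFS: a blob is stored under the hash of its own bytes, so any participant can verify what it received without trusting the sender. The `ContentStore` class is hypothetical and in-memory only, and the plain SHA-256 hex key is a simplification, not IPFS's actual CID format.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (illustration only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself, so identical
        # data always maps to the same key on every node.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on read: a mismatch means the blob was corrupted
        # or tampered with.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("blob does not match its key")
        return data


store = ContentStore()
key = store.put(b"archived page contents")
restored = store.get(key)
```

Because keys depend only on content, two nodes that archive the same bytes end up with the same key, which is what makes deduplication and verification across untrusted peers possible.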
The issue I see is ensuring that a distributed archive is comprehensive. How do you know what’s missing and needs to be added unless there’s a central coordinating process aware of what everyone already has?
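A rough sketch of what such a coordinating process would have to compute, assuming every participant publishes the set of item identifiers it holds and that someone maintains a list of everything that should be archived. All names here (`coverage_report`, the item IDs, the node names) are made up for illustration.

```python
from collections import Counter


def coverage_report(wanted, holdings, min_replicas=2):
    """Compare a target list against what participants report holding.

    wanted:   iterable of item identifiers that should be archived
    holdings: dict mapping participant name -> set of identifiers held
    Returns (missing, under_replicated) as two sets.
    """
    # Count how many participants hold each item.
    copies = Counter()
    for items in holdings.values():
        copies.update(set(items))

    wanted = set(wanted)
    missing = {item for item in wanted if copies[item] == 0}
    under_replicated = {
        item for item in wanted if 0 < copies[item] < min_replicas
    }
    return missing, under_replicated


wanted = {"item-a", "item-b", "item-c"}
holdings = {
    "node-1": {"item-a", "item-b"},
    "node-2": {"item-a"},
}
missing, under_replicated = coverage_report(wanted, holdings)
```

The comparison itself is trivial. The hard part is the two inputs: agreeing on the `wanted` list and trusting the self-reported `holdings`, which is exactly where a central coordinator (or something playing that role) comes in.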
Internet Archive itself has apparently been involved with something called Filecoin, which I assume would solve that kind of issue with 'blockchain,' somehow.
https://blog.archive.org/2023/10/20/celebrating-1-petabyte-on-the-filecoin-network/
Is that like the usual blockchains where every computer has to store a complete copy? That would get huge with the Internet Archive.
No, just some metadata:
Filecoin is an open protocol and uses a blockchain to record participation in the network.
There are distributed filesystems with redundancy, but the last time I tried something like that, it was extremely slow for both reading and writing. For an offline archive it might be feasible, but you'd have to do a lot of redundancy and error correction to be sure you didn't lose chunks. Plus, the Internet Archive is so big that even with the data distributed, each participant might have to store a prohibitively large amount.
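For a sense of scale, here is the back-of-the-envelope arithmetic as a small sketch. The archive size and participant count below are placeholders, not real figures for the Internet Archive. The overhead factor is 3.0 for plain three-way replication, or (k + m) / k for an erasure code with k data shards and m parity shards.

```python
def per_participant_tb(archive_pb, overhead, participants):
    """Average storage per participant in terabytes.

    archive_pb:   size of the archive in petabytes (placeholder value)
    overhead:     redundancy factor, e.g. 3.0 for three full copies
    participants: number of nodes sharing the load equally
    """
    total_tb = archive_pb * 1000 * overhead  # 1 PB = 1000 TB
    return total_tb / participants


def erasure_overhead(data_shards, parity_shards):
    """Storage overhead of a k + m erasure code: (k + m) / k."""
    return (data_shards + parity_shards) / data_shards


# Hypothetical numbers: a 100 PB archive spread over 10,000 volunteers.
replicated = per_participant_tb(100, 3.0, 10_000)
erasure_coded = per_participant_tb(100, erasure_overhead(10, 4), 10_000)
```

With these made-up inputs, three-way replication works out to 30 TB per participant. Erasure coding cuts that roughly in half, but it is still far more than a typical home user has spare, and it assumes every node stays online.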
This is well beyond my skillset (or knowledge level), but something like ArchiveBox combined with ActivityPub might be able to distribute internet archiving, each instance sharing with the fediverse what it has archived.
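As a sketch of what "sharing what it has archived" could look like on the wire, the snippet below builds an ActivityStreams 2.0 `Create` activity announcing one snapshot. The actor and snapshot URLs are invented, nothing here uses ArchiveBox's real API, and putting the original URL in `name` is just one possible convention, not an established one.

```python
import json
from datetime import datetime, timezone

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def snapshot_activity(actor, original_url, snapshot_url, archived_at):
    """Build a Create activity describing one archived snapshot."""
    timestamp = archived_at.isoformat()
    return {
        "@context": AS_CONTEXT,
        "type": "Create",
        "actor": actor,
        "to": [AS_PUBLIC],  # addressed to the public collection
        "published": timestamp,
        "object": {
            "type": "Page",
            "name": f"Snapshot of {original_url}",
            "url": snapshot_url,  # where this instance serves its copy
            "published": timestamp,
        },
    }


activity = snapshot_activity(
    actor="https://archive.example/actors/archiver",  # hypothetical
    original_url="https://example.com/article",
    snapshot_url="https://archive.example/snapshots/123",  # hypothetical
    archived_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
payload = json.dumps(activity, indent=2)
```

Other instances following that actor would receive the activity in their inbox and learn that a copy exists and where. The archived bytes themselves would still have to live in, and be fetched from, some separate storage layer.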