this post was submitted on 11 Jul 2023

Lemmy Support

Support / questions about Lemmy.

Matrix Space: #lemmy-space


Hi every lemmy. I've just stood up a couple of new instances and I've been hanging out in the Admin chat over at https://matrix.to/#/#lemmy-support-general:discuss.online. Someone there asked if they could view subscriptions, so I wrote and shared the SQL query. (Could I have done better on the joins, with 2 joins to instance?)

SQL query to list all user subscriptions
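
For readers without the screenshot, a query of this kind might look roughly like the sketch below. This is not the query that was originally shared. It runs against an in-memory SQLite database with a heavily simplified schema, and the table and column names (`person`, `community`, `instance`, `community_follower`, `instance_id`, `domain`) are assumptions modelled on Lemmy's PostgreSQL schema. It joins `instance` twice under different aliases, once for the user's home instance and once for the community's.

```python
import sqlite3

# Simplified, assumed schema; a real Lemmy database has many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instance (id INTEGER PRIMARY KEY, domain TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, instance_id INTEGER);
CREATE TABLE community (id INTEGER PRIMARY KEY, name TEXT, instance_id INTEGER);
CREATE TABLE community_follower (person_id INTEGER, community_id INTEGER);

-- Made-up sample data for illustration only.
INSERT INTO instance VALUES (1, 'home.example'), (2, 'remote.example');
INSERT INTO person VALUES (1, 'alice', 1), (2, 'bob', 1);
INSERT INTO community VALUES (10, 'support', 1), (11, 'cats', 2);
INSERT INTO community_follower VALUES (1, 10), (1, 11), (2, 11);
""")

# Two joins to instance: p_inst is the user's instance, c_inst the community's.
SUBSCRIPTIONS_SQL = """
SELECT p.name || '@' || p_inst.domain AS subscriber,
       c.name || '@' || c_inst.domain AS community
FROM community_follower cf
JOIN person    p      ON p.id      = cf.person_id
JOIN instance  p_inst ON p_inst.id = p.instance_id
JOIN community c      ON c.id      = cf.community_id
JOIN instance  c_inst ON c_inst.id = c.instance_id
ORDER BY subscriber, community
"""

subscriptions = conn.execute(SUBSCRIPTIONS_SQL).fetchall()
for subscriber, community in subscriptions:
    print(subscriber, "->", community)
```

Aliasing the same table twice is the usual way to resolve two different foreign keys that point at it, so two joins to `instance` are not unusual in themselves.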

And that's when I realized what an invasion of privacy that is. Maybe there's an easier way to do it but could we add optional support for user key pairs, so that if I associated a public key with my account, everything related to me in the db gets hashed with that key? Then I provide my private key at login?

I say optional because I know that's hard for a lot of folks. But maybe there's a way to make it easier with something like Let's Encrypt at sign-up, so it would be trivial for everyone to do it. Or maybe there's a way to do it globally with a central key common to all instances, perhaps paired with instance-specific keys?

I understand there are other aspects of user activity that would be best made private too, so this could also work for, say, votes or whatever else.

[–] Max_P@lemmy.max-p.me 6 points 1 year ago* (last edited 1 year ago) (1 children)

There's no reasonable way around it. The best that can be done would be to anonymize the votes, but then there's nothing preventing a rogue instance from reporting "yup, 500 users have upvoted this".

Tying the votes to an account can be helpful in mitigating spam. Bots can analyze your patterns and, based on your account age, comment history and whatnot, establish whether you're a legit user. If everything is anonymous, there's nothing that can be done.

ActivityPub was just not designed with privacy in mind. There are debates as to whether Lemmy can possibly be GDPR compliant at all.

Keys would not help at all here; you would just switch the identifier from a user ID to a public key.

Either we have trusted mega instances, or we have complete transparency.

[–] boulderly@lemmyadmin.site 1 points 1 year ago* (last edited 1 year ago) (2 children)

Let's take community subscriptions specifically. Here's a handful of rows from community_follower with my person_id. Why couldn't you hash community_id with my public key, and then I provide my private key to whatever UI client I'm using to populate my feeds when I log in?

rows from the community_follower table
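
To make the idea concrete, here is a minimal sketch. It swaps in an HMAC (a keyed hash with one shared secret) from Python's standard library in place of a public/private key pair, so it only approximates the proposal. The function name `blind_community_id`, the sample ids and the community names are all hypothetical. The stored rows hold a keyed hash in place of `community_id`, and only a client holding the secret can tell which communities they refer to.

```python
import hashlib
import hmac
import secrets


def blind_community_id(user_secret: bytes, community_id: int) -> str:
    """Return a keyed hash of a community id (hypothetical helper)."""
    message = str(community_id).encode()
    return hmac.new(user_secret, message, hashlib.sha256).hexdigest()


# Assumption: this secret lives only with the user's client, not in the database.
user_secret = secrets.token_bytes(32)

# What the database would hold for this user in place of plain community ids.
stored_rows = {blind_community_id(user_secret, cid) for cid in (10, 11)}

# Client side: with the secret, test each known community for membership.
known_communities = {10: "support", 11: "cats", 12: "news"}
my_feed = sorted(
    name
    for cid, name in known_communities.items()
    if blind_community_id(user_secret, cid) in stored_rows
)
print(my_feed)

# Without the right secret the same lookup matches nothing.
wrong_secret = secrets.token_bytes(32)
server_view = [
    name
    for cid, name in known_communities.items()
    if blind_community_id(wrong_secret, cid) in stored_rows
]
print(server_view)
```

The trade-off is visible in the sketch: the stored values can no longer be joined to the community table, so anything the server does with subscriptions on its own would need the secret or a different design.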

[–] Max_P@lemmy.max-p.me 3 points 1 year ago (3 children)

Then how is the server gonna know what content to push to your home instance?

Arguably that might not be needed for subscriptions specifically, the home instance could just know the overall list of subscriptions and use that. But then the subscription counters would be wrong, and lead to the same problem as votes.

But at least subscriptions are already strictly between the home instance and the remote instance, and never leave the instance if the community is local.

[–] boulderly@lemmyadmin.site 1 points 1 year ago

There, you've already found a reasonable way around it! 😀

[–] boulderly@lemmyadmin.site 1 points 1 year ago

What is the problem with votes, btw? Someone else just mentioned those should be private too, in the chat where I first raised this.

[–] boulderly@lemmyadmin.site 1 points 1 year ago

Also, you could modify subscription counters so you had a count of subscribers from an instance without knowing who they were.
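
A rough sketch of what per-instance counters could look like, using an in-memory SQLite table. The table `instance_subscriber_count` and the helpers `report_count` and `total_subscribers` are hypothetical and not part of Lemmy. Each instance reports one number per community, and the community total is the sum of those numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per (community, reporting instance), no person ids.
conn.execute("""
CREATE TABLE instance_subscriber_count (
    community_id    INTEGER NOT NULL,
    instance_domain TEXT    NOT NULL,
    subscribers     INTEGER NOT NULL,
    PRIMARY KEY (community_id, instance_domain)
)
""")


def report_count(community_id: int, instance_domain: str, subscribers: int) -> None:
    """Store the latest count an instance reports, replacing its previous one."""
    conn.execute(
        "INSERT OR REPLACE INTO instance_subscriber_count VALUES (?, ?, ?)",
        (community_id, instance_domain, subscribers),
    )


def total_subscribers(community_id: int) -> int:
    """Sum the reported counts across all instances for one community."""
    row = conn.execute(
        "SELECT COALESCE(SUM(subscribers), 0) "
        "FROM instance_subscriber_count WHERE community_id = ?",
        (community_id,),
    ).fetchone()
    return row[0]


report_count(10, "home.example", 3)
report_count(10, "remote.example", 5)
report_count(10, "home.example", 4)  # home.example updates its earlier report
print(total_subscribers(10))
```

Nothing in this sketch stops an instance from reporting an inflated number, which is the same concern raised earlier in the thread about anonymized votes.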

[–] scrubbles@poptalk.scrubbles.tech 2 points 1 year ago (1 children)

You're just joining on your public key to a community then, but your public key still uniquely identifies you. If you encrypt the entire list of subscriptions then you lose all the benefits of SQL. The payoff of a tiny bit more privacy would require an entire rearchitecture and require everyone to update.

At the end of the day you're still going to have a unique userId that somehow matches to a community. It's just a question of how complicated you want to make that for the sake of privacy.

There's already a valid workaround. Create a burner account for going to lemmynsfw.

[–] boulderly@lemmyadmin.site 1 points 1 year ago* (last edited 1 year ago) (1 children)

The point is not to encrypt your user id. Check this out if you haven't seen it; I think I explain it better here: https://lemmyadmin.site/comment/46. It's a lot more privacy. And thinking as an admin who wants to provide a safe space for my users, I think it's worth the effort. I took a very quick look at the tables related to person and I'd bet you could treat these similarly to community_follower:

TABLE "comment_like" CONSTRAINT "comment_like_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "comment_saved" CONSTRAINT "comment_saved_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "community_block" CONSTRAINT "community_block_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "community_follower" CONSTRAINT "community_follower_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "person_follower" CONSTRAINT "person_follower_follower_id_fkey" FOREIGN KEY (follower_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "post_like" CONSTRAINT "post_like_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "post_read" CONSTRAINT "post_read_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "post_saved" CONSTRAINT "post_saved_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "private_message" CONSTRAINT "private_message_creator_id_fkey" FOREIGN KEY (creator_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "private_message" CONSTRAINT "private_message_recipient_id_fkey" FOREIGN KEY (recipient_id) REFERENCES person(id) ON UPDATE CASCADE ON DELETE CASCADE

[–] scrubbles@poptalk.scrubbles.tech 1 points 1 year ago (1 children)

Again, I don't see the payoff really. I don't think posting on a site that's publicly available really means that you get privacy, so to speak. DMs are the only thing that I'd maybe be wary of, but even then I'd say there is no expectation of privacy on Lemmy servers. Matrix chat would be a place to go for private DMs, but here I would take the '90s parent advice and say "Anything you put on the internet should be assumed public".

In Lemmy's case I'll reiterate that yes, it all points to you as in "your unique user ID", but you can control whether it points to you, the human. Who cares if userId 42 subscribes to communities A and B? The privacy question is whether we know who user 42 is.

[–] boulderly@lemmyadmin.site 1 points 1 year ago

So consider a smaller local instance like the one I'm setting up. If it's ever anything more than me and my mom, it's gonna be a bunch of people I know and their friends. And if my instance is their entry point to the fediverse then yeah, I want it to be as private as we can make it for them.

But also, even if someone's IRL identity was masked, I've only been around a week and I'm starting to recognize handles on the fediverse. Ideally we make friends here and it's a community for us.

Now imagine how humiliating it would be if someone malicious gained control over an instance and published everyone's subscriptions, likes, etc. Sure, more savvy users probably do have separate accounts, but honestly most will not.

Against the grain, but I don't think users have an expectation of privacy here. This isn't on some data dashboard for everyone to see. This is admin specific, and even then it's currently not exposed in an admin panel; it's only in the database.

Two big things that I have already done as an admin that use this:

  • Querying for bot accounts. I had open signups for a while and wanted to know if anyone had signed up for my instance but had no subscriptions, something theoretically that would root out some bots, and it did.
  • Querying for trolls. I run a safe space instance where I aim to have quality content. For the time being I allow downvotes, but I keep an eye out for trolling. I have a script that shows me anyone with negative "karma" or who downvotes more than they upvote. This way I know who my bad actors are and I can keep an eye on them.
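
The two checks above could be sketched roughly as follows. This is not the commenter's actual script. It uses an in-memory SQLite database with a simplified, assumed schema in which `post_like` and `comment_like` carry a `score` column of 1 or -1, and it covers only the "downvotes more than upvotes" test, not the negative "karma" one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, assumed tables; made-up sample data.
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE community_follower (person_id INTEGER, community_id INTEGER);
CREATE TABLE post_like (person_id INTEGER, post_id INTEGER, score INTEGER);
CREATE TABLE comment_like (person_id INTEGER, comment_id INTEGER, score INTEGER);

INSERT INTO person VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO community_follower VALUES (1, 10), (3, 10);
INSERT INTO post_like VALUES (1, 100, 1), (1, 101, 1), (3, 100, -1), (3, 101, 1);
INSERT INTO comment_like VALUES (1, 200, -1), (3, 200, -1);
""")

# Accounts with no subscriptions at all: LEFT JOIN and keep the unmatched rows.
NO_SUBSCRIPTIONS_SQL = """
SELECT p.name
FROM person p
LEFT JOIN community_follower cf ON cf.person_id = p.id
WHERE cf.person_id IS NULL
ORDER BY p.name
"""

# Accounts that cast more downvotes than upvotes across posts and comments.
MORE_DOWN_THAN_UP_SQL = """
SELECT p.name
FROM person p
JOIN (
    SELECT person_id, score FROM post_like
    UNION ALL
    SELECT person_id, score FROM comment_like
) v ON v.person_id = p.id
GROUP BY p.id, p.name
HAVING SUM(CASE WHEN v.score < 0 THEN 1 ELSE 0 END)
     > SUM(CASE WHEN v.score > 0 THEN 1 ELSE 0 END)
ORDER BY p.name
"""

no_subscriptions = [row[0] for row in conn.execute(NO_SUBSCRIPTIONS_SQL)]
heavy_downvoters = [row[0] for row in conn.execute(MORE_DOWN_THAN_UP_SQL)]
print(no_subscriptions, heavy_downvoters)
```

Both queries depend on being able to join on person_id, which is what per-user hashing or encryption of these rows would take away.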

Some may say it's too far, but meh, don't be a jerk on my instance. It keeps my other users happy so I'll continue to do it.

Finally, I'd say none of this is PII except for the email address, which is also something that's allowed to be a spam/throwaway email. If a user wants to sign up completely anonymously, that's completely up to them. From a Lemmy perspective, it's a binary "should this user be banned or not".

[–] lemann@lemmy.one 1 points 1 year ago

This sounds like it would achieve encryption of data at rest. It's a good idea in theory, but I'm not sure how sensible this would be for Lemmy?

Even if our subscriptions are protected, the server would need some way to establish what your followed communities are to populate your feed... Because of this it could still be possible to identify a user's subscriptions, maybe via looking at proxy logs or outgoing traffic?

Would be useful if someone more knowledgeable had some input, I'm not really a cryptography guy myself 😅

Tbh, if you did a GDPR request to some site like the Alien R, part of that result package will include a plaintext list of communities you follow...
