@hongminhee洪 民憙 (Hong Minhee) I certainly like the focus on growth not punishment. It really aligns with the brief discussion in the " Alternative models of content moderation" sectino of Sarah Gilbert's Towards Intersectional Moderation: An Alternative Model of Moderation Built on Care and Power of justice-based models of moderation that "foster education, rehabilitation, and forgiveness" (as opposed to the more punitive and carceral models of moderation that are the norm in commercial social networks).

My impression is that most fedi moderation teams don't actually notify the reporters about what actions are or aren't taken. Some of that is due to software limitations -- if the report is forwarded from another instance, I'm pretty sure you don't even know who the reporter is (and the Mastodon moderator UX doesn't as far as I know directly support two-way communications, instead you. have to DM the user from the moderator account; and) . Just as importantly, though, reporters sometimes get upset and escalate if action isn't taken. If the original report is malicious, as part of a harassment campaign against the target, this sometimes leads to the moderators getting harassed as well.

Of course it's very frustrating to report something and never hear back so I can certainly see the arguments in favor of doing this as well! But with malicious reporting, or if a team of attackers are reporting different posts to try to understand just what borderline content get through the moderators, there are arguments against it as well. One possibility is to treat reports from people who are part of the HackersPub community (differently on this front than reports from other instances.

While I like the idea of not requiring the reporter to cite the specific clause(s) of the guidelines that are violated, I'm personally skeptical about using LLMs to address the problem. Even assuming the datasets used to train the LLMs were gathered with consent (which isn't typically the case) they're likely to have racial, gender, and cultural biases. Don't get me wrong, the idea of using a tool to help the moderators and report recipient understand just what's been violated is a good one, I'm just not sure this is the right technology. Timnit Gebru and others at DAIR have been thinkiing about approaches to content moderations, so might have some ideas here.

In terms of cross-instance reports, it's really important to leave it up to the reporter whether or not to forward to another instance -- sometimes the admins of the other instance are hostile! And like the discussion of malicious reporting above, this highlights the importance of threat modeling.

Anyhow those are my quick initial thoughts. Looking forward to seeing how the functionality evolves! As you're moving forward with this, it might be interesting to talk to some of the academic researchers who have looked at moderation on the Fediverse -- Sohyeon Hwang, Owen Xingjian Zhang, Tolulope Oshinowo and others at Princeton have done some excellent work and talked to a lot of people here in the process. If it's useful, I can see if they're interested in connecting with you.

@jdp23Jon Thank you so much for this thoughtful feedback and for the paper reference! I'll definitely read Sarah Gilbert's work.

You raise excellent points I hadn't fully considered:

  • On notifying reporters: The malicious reporting scenario is a real concern. I like your suggestion of treating reports from within the community differently from forwarded reports. We should think more carefully about what information to share and with whom.

  • On LLM usage: That's a fair criticism. The bias issue is something we need to take seriously. Perhaps the LLM should only assist moderators as a reference tool, not be exposed to reporters or reported users directly. I'll look into DAIR's work on this.

  • On cross-instance forwarding: Absolutely agreed; this should be opt-in by the reporter, not automatic. I'll update the spec to make that explicit.

I'd love to connect with the Princeton researchers if they're open to it. And thank you for taking the time to share your experience: this is exactly the kind of input we were hoping for.

1

If you have a fediverse account, you can quote this note from your own instance. Search https://hackers.pub/ap/notes/019b81a6-ddde-781c-854c-8d726052258f on your instance and quote it. (Note that quoting is not supported in Mastodon.)