@hongminhee洪 民憙 (Hong Minhee) I certainly like the focus on growth, not punishment. It aligns well with the "Alternative models of content moderation" section of Sarah Gilbert's Towards Intersectional Moderation: An Alternative Model of Moderation Built on Care and Power, which briefly discusses justice-based models of moderation that "foster education, rehabilitation, and forgiveness" (as opposed to the more punitive and carceral models that are the norm on commercial social networks).
My impression is that most fedi moderation teams don't actually notify reporters about what actions are or aren't taken. Some of that is due to software limitations -- if the report is forwarded from another instance, I'm pretty sure you don't even know who the reporter is (and as far as I know the Mastodon moderation UX doesn't directly support two-way communication; instead you have to DM the user from the moderator account). Just as importantly, though, reporters sometimes get upset and escalate if action isn't taken. If the original report is malicious, made as part of a harassment campaign against the target, this sometimes leads to the moderators getting harassed as well.
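To make the software limitation concrete: a report forwarded from another instance generally arrives as an ActivityPub Flag activity whose actor is the sending instance rather than the individual reporter, which is why there's often no one to reply to. A rough, illustrative shape in TypeScript (a simplification, not an exact schema):

```typescript
// Simplified, illustrative shape of a forwarded report (Flag activity).
// Not an authoritative schema -- just enough to show why the individual
// reporter usually isn't identifiable on the receiving side.
interface ForwardedFlagActivity {
  "@context": string;
  id: string;
  type: "Flag";
  actor: string;    // typically the sending instance's actor, not the reporter
  content?: string; // the reporter's free-form comment, if any
  object: string[]; // URIs of the reported account and/or posts
}
```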
Of course it's very frustrating to report something and never hear back, so I can certainly see the arguments in favor of doing this as well! But with malicious reporting, or with a team of attackers reporting different posts to probe just what borderline content gets past the moderators, there are arguments against it too. One possibility is to treat reports from people who are part of the HackersPub community differently on this front than reports forwarded from other instances.
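Purely as a sketch of what that could look like (hypothetical names, not HackersPub's actual code), the outcome-notification step could branch on whether the reporter is a known member of the local community:

```typescript
// Hypothetical sketch: only notify reporters who are local community
// members with a known identity; forwarded reports usually arrive via the
// remote instance's actor, so there's no one to notify directly anyway.
type ReportOrigin = "local" | "remote";

interface Report {
  id: string;
  origin: ReportOrigin; // "remote" = forwarded from another instance
  reporterId?: string;  // often unknown for forwarded reports
}

interface ModerationOutcome {
  reportId: string;
  actionTaken: boolean;
  summary: string; // human-written note to share with the reporter
}

function shouldNotifyReporter(report: Report): boolean {
  return report.origin === "local" && report.reporterId !== undefined;
}

function concludeReport(report: Report, outcome: ModerationOutcome): void {
  if (shouldNotifyReporter(report)) {
    sendReportOutcomeNotification(report.reporterId!, outcome);
  }
  // For remote or anonymous reports, record the outcome internally
  // without notifying, to avoid feeding malicious or probing reports.
}

// Placeholder for whatever notification mechanism the platform uses.
declare function sendReportOutcomeNotification(
  reporterId: string,
  outcome: ModerationOutcome,
): void;
```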
While I like the idea of not requiring the reporter to cite the specific clause(s) of the guidelines that were violated, I'm personally skeptical about using LLMs to address the problem. Even assuming the datasets used to train the LLMs were gathered with consent (which typically isn't the case), they're likely to carry racial, gender, and cultural biases. Don't get me wrong, the idea of a tool that helps the moderators and the report recipient understand just what's been violated is a good one; I'm just not sure this is the right technology. Timnit Gebru and others at DAIR have been thinking about approaches to content moderation, so they might have some ideas here.
In terms of cross-instance reports, it's really important to leave it up to the reporter whether or not to forward to another instance -- sometimes the admins of the other instance are hostile! As with the malicious-reporting discussion above, this highlights the importance of threat modeling.
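Concretely, that could just be an explicit per-report opt-in that the federation layer checks before anything leaves the instance -- again a hypothetical sketch, not the actual API:

```typescript
// Hypothetical sketch: cross-instance forwarding is an explicit choice
// made by the reporter for each report, never a default.
interface ReportSubmission {
  targetActorId: string;            // the account being reported
  comment: string;                  // free-form description from the reporter
  forwardToRemoteInstance: boolean; // reporter opts in explicitly
}

async function submitReport(submission: ReportSubmission): Promise<void> {
  await storeLocalReport(submission);
  if (submission.forwardToRemoteInstance) {
    // Only send a Flag activity to the remote instance when the reporter
    // has opted in -- the remote admins may be hostile to the reporter.
    await forwardFlagActivity(submission.targetActorId, submission.comment);
  }
}

// Placeholders for the platform's storage and federation layers.
declare function storeLocalReport(s: ReportSubmission): Promise<void>;
declare function forwardFlagActivity(
  targetActorId: string,
  comment: string,
): Promise<void>;
```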
Anyhow, those are my quick initial thoughts. Looking forward to seeing how the functionality evolves! As you move forward with this, it might be interesting to talk to some of the academic researchers who have looked at moderation on the Fediverse -- Sohyeon Hwang, Owen Xingjian Zhang, Tolulope Oshinowo, and others at Princeton have done some excellent work and talked to a lot of people here in the process. If it's useful, I can see whether they're interested in connecting with you.