Why it's impossible to agree on what's allowed

On large platforms, it's impossible to have policies on things like moderation, spam, fraud, and sexual content that people agree on. David Turner made a simple game to illustrate how difficult this is even in a trivial case, No Vehicles in the Park. If you haven't played it yet, I recommend playing it now before continuing to read this document.

The idea behind the site is that it's very difficult to get people to agree on what moderation rules should apply to a platform. Even if you take a much simpler example, what vehicles should be allowed in a park given a rule and some instructions for how to interpret the rule, and then ask a small set of questions, people won't be able to agree. On doing the survey myself, one of the first reactions I had was that the questions aren't chosen to be particularly nettlesome and there are many edge cases Dave could've asked about if he wanted to make it a challenge. And yet, despite not making the survey particularly challenging, there isn't broad agreement on the questions. Comments on the survey also indicate another problem with rules, which is that it's much harder to get agreement than people think it will be. If you read comments on rule interpretation or moderation on lobsters, HN, reddit, etc., when people suggest a solution, the vast majority of people will suggest something that anyone who's done moderation or paid attention to how moderation works knows cannot work, the moderation equivalent of "I could build that in a weekend"¹. Of course we see this on Dave's game as well. The top HN comment, the most agree-upon comment, and a very common sentiment elsewhere is²:

I'm fascinated by the fact that my takeaway is the precise opposite of what the author intended.

To me, the answer to all of the questions was crystal-clear. Yes, you can academically wonder whether an orbiting space station is a vehicle and whether it's in the park, but the obvious intent of the sign couldn't be clearer. Cars/trucks/motorcycles aren't allowed, and obviously police and ambulances (and fire trucks) doing their jobs don't have to follow the sign.

So if this is supposed to be an example of how content moderation rules are unclear to follow, it's achieving precisely the opposite.

And someone agreeingly replies with:

Exactly. There is a clear majority in the answers.

After going through the survey, you get a graph showing how many people answered yes and no to each question, which is where the "clear majority" comes from. First of all, I think it's not correct to say that there is a clear majority. But even supposing that there were, there's no reason to think that there being a majority means that most people agree with you even if you take the majority position in each vote. In fact, given how "wiggly" the per-question majority graph looks, it would be extraordinary if it were the case that being in the majority for each question meant that most people agreed with you or that there's any set of positions that the majority of people agree on. Although you could construct a contrived dataset where this is true, it would be very surprising if this were true in a natural dataset.

If you look at the data (which isn't available on the site, but Dave was happy to pass it along when I asked), as of when I pulled the data, there was no set of answers which the majority of users agreed on and it was not even close. I pulled this data shortly after I posted on the link to HN, when the vast majority of responses were HN readers, who are more homogeneous than the population at large. Despite these factors making it easier to find agreement, the most popular set of answers was only selected by 11.7% of people. This is the position the top commenter says is "obvious", but it's a minority position not only in the sense that only 11.7% of people agree and 88.3% of people disagree, almost no one holds a position with only a small amount of disagreement from this allegedly obvious position. The 2nd and 3rd most common positions, representing 8.5% and 6.5% of the vote, respectively, are similar and only disagree on whether or not a non-functioning WW-II era tank that's part of a memorial violates the rule. Beyond that, approximately 1% of people hold the 4th, 5th, 6th, and 7th most popular positions, with every less popular position having less than 1% agreement, with a fairly rapid drop from there as well. So, 27% of people find themselves in agreement with significantly more than 1% of other users (the median user agrees with 0.16% of other users). See below for a plot of what this looks like. The opinions are sorted from most popular to least popular, with the most popular on the left. A log scale is used because there's so little agreement on opinions that a linear scale plot looks like a few points above zero followed by a bunch of zeros.

a plot illustrating the previous paragraph

Another way to look at this data is that 36902 people expressed an opinion on what constitutes a vehicle in the park and they came up with 9432 distinct opinions, for an average of ~3.9 people, per distinct expressed opinion. i.e., the average user agreement is ~0.01%. Although averages are, on average, overused, an average works as a summary for expressing the level of agreement because while we do have a small handful of opinions with much higher than the average 0.01% agreement, to "maintain" the average, this must be balanced out by a ginormous number of people who have even less agreement with other users. There's no way to have a low average agreement with high actual agreement unless that's balanced out by even higher disagreement, and vice versa.

On HN, in response to the same comment, Michael Chermside had the reasonable but not highly upvoted comment,

> To me, the answer to all of the questions was crystal-clear.

That's not particularly surprising. But you may be asking the wrong question.

If you want to know whether the rules are clear then I think that the right question to ask is not "Are the answers crystal-clear to you?" but "Will different people produce the same answers?".

If we had a sharp drop in the graph at one point then it would suggest that most everyone has the same cutoff; instead we see a very smooth curve as if different people read this VERY SIMPLE AND CLEAR rule and still didn't agree on when it applied.

Many (and probably actually most) people are overconfident when predicting what other people think is obvious and often incorrectly assume that other people will think the same thoughts and find the same things obvious. This is more true of the highly-charged issues that result in bitter fights about moderation than the simple "no vehicles in the park" example, but even this simple example demonstrates not only the difficulty in reaching agreement, but the difficulty in understanding how difficult it is to reach agreement.

To use an example from another context that's more charged, consider in any sport and whether or not a player is considered to be playing fair or is making dirty plays and should be censured. We could look at many different players from many different sports, so let's arbitrarily pick Draymond Green. If you ask any serious basketball fan who's not a Warriors fan, who's the dirtiest player in the NBA today, you'll find general agreement that it's Draymond Green (although some people will argue for Dillon Brooks, so if you want near uniform agreement, you'll have to ask for the top two dirtiest players). And yet, if you ask a Warriors fan about Draymond, most have no problem explaining away every dirty play of his. So if you want to get uniform agreement to a question that's much more straightforward than the "no vehicles in the park" question, such as, "is it ok to stomp on another player's chest and then use them as a springboard to leap into the air? on top of a hundred other dirty plays", you'll find that for many such seemingly obvious questions, a sizable group of people will have extremely strong disagreements with the "obvious" answer. When you move away from a contrived, abstract, example like "no vehicles in the park" to a real-world issue that people have emotional attachments to, it generally becomes impossible to get agreement even in cases where disinterested third parties would all agree, which we observed is already impossible even without emotional attachment. And when you move away from sports into issues people care even more strongly about, like politics, the disagreements get stronger.

While people might be able to "agree to disagree" on whether or not a a non-functioning WW-II era tank that's part of a memorial violates the "no vehicles in the park" rule (resulting in a pair of positions that accounts for 15% of the vote), in reality, people often have a hard time agreeing to disagree over what outsiders would consider very small differences of opinion. Charged issues are often fractally contentious, causing disagreement among people who hold all but identical opinions, making them significantly more difficult to agree on than our "no vehicles in the park" example.

To pick a real-world example, consider Jo Freeman, a feminist who, in 1976, wrote about her experienced being canceled for minute differences in opinion and how this was unfortunately common in the Movement (using the term "trashed" and not "canceled" because cancellation hadn't come into common usage yet and, in my opinion, "trashed" is the better term anyway). In the nearly fifty years since Jo Freeman wrote "Trashing", the propensity of humans to pick on minute differences and attempt to destroy anyone who doesn't completely agree with them hasn't changed; for a recent, parallel, example, Natalie Wynn's similar experience.

For people with opinions far away in the space of commonly held opinions, the differences in opinion between Natalie and the people calling for her to be deplatformed are fairly small. But, not only did these "small" differences in opinion result in people calling for Natalie to be deplatformed, they called for her to be physically assaulted, doxed, etc., and they suggested the same treatment suggested for her friends and associates as well as people who didn't really associate with her, but publicly talked about similar topics and didn't cancel her. Even now, years later, she still gets calls to be deplatformed and I expect this will continue past the end of my life (when I wrote this, years after the event Natalie discussed, I did a Twitter search and found a long thread from someone ranting about what a horrible human being Natalie is for the alleged transgression discussed in the video, dated 10 days ago, and it's easy to find more of these rants). I'm not going to attempt to describe the difference in positions because the positions are close enough that, to describe them would take something like 5k to 10k words (as opposed to, say, a left-wing vs. a right-wing politician, where the difference is blatant enough that you can describe in a sentence or two); you can watch the hour in the 1h40m video that's dedicated to the topic if you want to know the full details.

The point here is just that, if you look at almost any person who has public opinions on charged issues, the opinion space is fractally contentious. No large platform can satisfy user preferences because users will disagree over what content should be moderated off the platform and what content should be allowed. And, of course, this problem scales up as the platform gets larger³.

If you're looking for work, Freshpaint is hiring (US remote) in engineering, sales, and recruiting. Disclaimer: I may be biased since I'm an investor, but they seem to have found product-market fit and are rapidly growing.

Thanks to Peter Bhat Harkins, Dan Gackle, Laurence Tratt, Gary Bernhardt, David Turner, Kevin Burke, Sophia Wisdom, Justin Blank, and Bert Muthalaly for comments/corrections/discussion.

Something I've repeatedly seen on every forum I've been on is the suggestion that we just don't need moderation after all and all our problems will be solved if we just stop this nasty censorship. If you want a small forum that's basically 4chan, then no moderation can work fine, but even if you want a big platform that's like 4chan, no moderation doesn't actually work. If we go back to those Twitter numbers, 300M users and 1M bots removed a day, if you stop doing this kind of "censorship", the platform will quickly fill up with bots to the point that everything you see will be spam/scam/phishing content or content from an account copying content from somewhere else or using LLM-generated content to post scam/scam/phishing content. Not only will most accounts be bots, bots will be a part of large engagement/voting rings that will drown out all human content.

The next most naive suggestion is to stop downranking memes, dumb jokes, etc., often throw in with a comment like "doesn't anyone here have a sense of humor?". If you look at why forums with upvoting/ranking ban memes, it generally happens after the forum becomes totally dominated by memes/comics because people upvote those at a much higher rate than any kind of content with a bit of nuance, and not everyone wants a forum that's full of the lowest common denominator meme/comic content. And as for "having a sense of humor" in comments, if you look forums that don't ban cheap humor, top comments will generally end up dominated by these, e.g., for maybe 3-6 months, one the top comments on any kind of story about a man doing anything vaguely heroic on reddit forums that don't ban this kind of cheap was some variant of "I'm surprised he can walk with balls that weigh 900 lbs.", often repeated multiple times by multiple users, amidst a sea of the other cheap humor that was trendy during that period. Of course, some people actually want that kind of humor to dominate the comments, they actually want to see the same comment 150 times a day for months on end, but I suspect most people who grumpily claim "no one has a sense of humor here" when their cheap humor gets flagged don't actually want to read a forum that's full of other people's cheap humor.
^[return]
This particular commenter indicates that they understand that moderation is, in general, a hard problem; they just don't agree with the "no vehicles in the park" example, but many other people think that both the park example and moderation are easy. ^[return]
Nowadays, it's trendy to use "federation" as a cure-all in the same way people used "blockchain" as a cure-all five years ago, but federation doesn't solve this problem for the typical user. I actually had a conversation with someone who notes in their social media bio that they're one of the creators of the ActivityPub spec, who claimed that federation does solve this problem and that Threads adding ActivityPub would create some kind of federating panacea. I noted that fragmentation is already a problem for many users on Mastodon and whether or not Threads will be blocked is contentious and will only increase fragmentation, and the ActivityPub guy replied with something like "don't worry about that, most people won't block Threads, and it's their problem if they do."

I noted that a problem many of my non-technical friends had when they tried Mastodon was that they'd pick a server and find that they couldn't follow someone they wanted to follow due to some kind of server blocking or ban. So then they'd try another server to follow this one person and then find that another person they wanted to follow is blocked. The fundamental problem is that users on different servers want different things to be allowed, which then results in no server giving you access to everything you want to see. The ActivityPub guy didn't have a response to this and deleted his comment.

By the way, a problem that's much easier than moderation/spam/fraud/obscene content/etc. policy that the fediverse can't even solve is how to present content. Whenever I use Mastodon to interact with someone using "honk", messages get mangled. For example, a Mastodon message " in the subject (and content warning) field gets converted to " the Mastodon user sees the reply from the honk user, so every reply from a honk user forks the discussion into a different subject. Here's something that can be fully specified without ambiguity, where people are much less emotionally attached to the subject than they are for moderation/spam/fraud/obscene content/etc., and the fediverse can't even solve this problem across two platforms.
^[return]