Defeating Coordinated Inauthentic Behavior at Scale
Over the weekend, the YouTube channel SmarterEveryDay posted this video (part 1 of a 3-part series) discussing the use of coordinated inauthentic behavior (via crappy content generation) to manipulate YouTube's algorithm.
As a cryptography engineer (who, in my professional life, wrote code that secures a nontrivial percentage of websites on the Internet), I thought it’d be appropriate for me to respond to this excellent video and offer my own insights on the strategies that media companies (social or otherwise) like YouTube, Facebook, and Twitter can employ to combat these attempts at coordinated inauthentic behavior.
This post is organized in the following structure:
- The problem faced by media platforms
  - The public countermeasures discussed, and how attackers are bypassing these countermeasures
- Soatok's Proposal for Solving This Problem
If you’ve watched the video above, feel free to skip to the proposal.
Coordinated Inauthentic Behavior
Whether for financial gain or to influence the public, attackers using hacked accounts have been crap-flooding YouTube with computer-generated content specifically designed to avoid detection.
With a lot of inauthentic content in place, it’s easy to construct a web of referral links that trick ranking algorithms into promoting the mass-produced nonsense into the eyes of human viewers.
From there, the attackers' end goal is to either:
- Extract ad revenue from the platform.
- Celebrate having successfully pushed bullshit onto the general public, possibly influencing people’s beliefs.
The Countermeasures Being Employed
As SmarterEveryDay suggests, attackers employ a myriad of techniques to ensure that the crapflood of essentially duplicate content evades detection by YouTube's moderators.
On the defensive side, the detection techniques employ selective sampling and cryptographic hash functions. At some point, we can also imagine machine learning algorithms being brought to bear, although that technology is still immature.
By using compromised accounts with existing authentic content uploads, different computer-generated voices, randomized script structures for the text-to-speech software, distinct visual effects that tell the same story, and even digital snow effects, it’s incredibly unlikely that two pieces of content that a human would recognize as duplicate would be flagged by the algorithms.
Unless a novel breakthrough in machine learning saves the day, it seems unlikely that defenders will ever win this arms race against the attackers.
I’d like to propose a solution to this problem that doesn’t rely on novel cryptography or machine learning techniques to succeed.
The core reason why the prospect of defeating the attackers’ counter-countermeasures seems so bleak is easy to understand if you’re familiar with the six dumbest ideas in computer security.
Specifically: Enumerating Badness.
If your input domain is infinite, you’ll end up doing one of two things:
- Creating a blacklist of size infinity, OR
- Missing some badness
So I’d like to propose the antithesis: Let’s enumerate goodness instead.
First, take all content categorized as (or appearing to be) news and default it to unvetted. Put a big scary banner on all such unvetted material, warning users that they may be watching fake news or propaganda.
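The default-deny property can be sketched in a few lines of Python. Everything here (names, return values) is a hypothetical illustration, not anything YouTube actually exposes:

```python
# Minimal sketch of "enumerate goodness": news-like content defaults to
# unvetted, and only explicit volunteer attestations (the allowlist)
# flip that state. All names here are hypothetical.

vetted_ids = set()  # populated only by successful volunteer attestations

def display_status(video_id: str, looks_like_news: bool) -> str:
    """Decide what banner (if any) to show for a video."""
    if not looks_like_news:
        return "normal"
    # Default-deny: absent positive attestations, show the scary banner.
    return "vetted" if video_id in vetted_ids else "unvetted"
```

The key property is that nothing an attacker uploads ever *starts* in the trusted state; the burden of proof is inverted.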
Next, introduce a team of volunteers who have permission to do two things:
- Attest that certain news-related videos are trustworthy and accurate (or affirm that they are bullshit).
- Invite other users to become a volunteer.
Some important implementation details:
- Outside the initial seed users (which, for example, could be selected from college professors and graduate students at public universities), every volunteer must be invited by another user.
- Every attestation must be linked back to the volunteer vetter’s account for transparency and accountability.
- If any users are found to be abusing their vetting privilege, they will be removed. If several abusers were invited by the same user, YouTube may also revoke that common parent's ability to invite future volunteers, and all of their other invitees should similarly be investigated.
- In order to move from unvetted to vetted status, several positive attestations must be provided from users whose distance from a common parent is either undefined (i.e. different seed users, so different trees entirely) or greater than 2.
- Any edge cases should be escalated past the volunteer vetters to the usual YouTube moderation staff.
Optionally, you can partition vetters into areas of expertise, but that gets really complicated fast.
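The invite tree and the distance rule can be sketched concretely. This is my own interpretation of the rule (pairwise distance to the nearest common ancestor must be undefined or greater than 2); the function names, the quorum threshold, and the tree representation are all illustrative assumptions:

```python
from itertools import combinations

# Hypothetical sketch: parent[v] is the inviter of volunteer v;
# seed volunteers have parent None (each seed roots its own tree).

def ancestry(parent, v):
    """Chain of v and its ancestors, nearest first (v itself at index 0)."""
    chain = [v]
    while parent.get(v) is not None:
        v = parent[v]
        chain.append(v)
    return chain

def pair_distance(parent, a, b):
    """Max distance from a or b to their nearest common ancestor,
    or None if they belong to entirely different invite trees."""
    anc_a, anc_b = ancestry(parent, a), ancestry(parent, b)
    common = set(anc_a) & set(anc_b)
    if not common:
        return None  # different seed trees: treated as independent
    nca = min(common, key=anc_a.index)
    return max(anc_a.index(nca), anc_b.index(nca))

def has_quorum(parent, attesters, threshold=3, min_distance=3):
    """Vetted status requires `threshold` attesters who are pairwise
    unrelated, or far enough apart in the invite tree (distance > 2)."""
    if len(attesters) < threshold:
        return False
    return all(
        (d := pair_distance(parent, a, b)) is None or d >= min_distance
        for a, b in combinations(attesters, 2)
    )
```

For example, three attesters descended from three different seeds achieve quorum, while three attesters who were all invited by the same user (or by each other) do not — which is exactly what makes a farm of colluding sock-puppet vetters expensive to build.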
Will anyone volunteer?
Compare the success of Encyclopedia Britannica with that of Wikipedia. When was the last time any of us looked up an article in Encyclopedia Britannica?
Wikipedia’s success came from hordes of volunteers working without financial incentive.
Can we incentivize users to volunteer?
I would strongly advise against monetary incentives.
YouTube can gamify this by creating a score system for vetters who successfully attest content as trustworthy (with a mildly stronger incentive for identifying unflagged content), with points only awarded upon achieving (and maintaining) quorum and therefore vetted status.
Gamification can include leaderboards, social badges, etc. If StackOverflow is a reasonable precedent, it may even lead to greater career success for users who are proficient at picking out bullshit from the content farms.
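A scoring scheme like this could look something like the sketch below. The point values, function names, and clawback mechanism are all hypothetical; the real weights would need careful tuning and abuse-resistance review:

```python
# Hypothetical point values; "maintaining" quorum is modeled as a clawback.
POINTS_CONFIRM = 10     # attestation on a video that reaches vetted status
POINTS_FLAG_BONUS = 5   # mildly stronger reward for catching unflagged content

def award_points(scores, volunteer, reached_quorum, caught_unflagged=False):
    """Bank points only once the video achieves quorum (vetted status)."""
    if not reached_quorum:
        return scores.get(volunteer, 0)
    earned = POINTS_CONFIRM + (POINTS_FLAG_BONUS if caught_unflagged else 0)
    scores[volunteer] = scores.get(volunteer, 0) + earned
    return scores[volunteer]

def revoke_points(scores, volunteer, earned):
    """If quorum is later lost, claw the award back."""
    scores[volunteer] = scores.get(volunteer, 0) - earned
    return scores[volunteer]
```

Awarding nothing until quorum is reached means a lone vetter can't farm points by spamming attestations; the clawback keeps the incentive aligned with attestations that actually hold up.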
What about anything that slips through the cracks?
The default state is unvetted. If the content is sneaky enough to evade volunteer detection, users will be instructed to distrust it until it’s been vetted. (Maybe add a “please vet me” button?)
If the volume of content eclipses the available pool of volunteer vetters, most of it will remain unvetted and be marked as potentially fake news.
Are University professors the best seed to select?
I don’t know of a better one that would lead to an initial pool of users capable of fact-checking. Feel free to replace that example with something that makes more sense.
It might not matter that much. The important thing is recording “who invited whom?” in a tree, so that common sources of abuse/misuse can be identified and cut out. This is a proactive defense against coordinated semi-authentic behavior that tries to evade detection through malicious fake vetting.
My prediction is that my proposal will totally disrupt the strategies and incentives of these attackers and they’ll be forced to adapt to increasingly deceptive tactics in an attempt to fool experts.
Since this level of deception is uncommon, it is unlikely to succeed at the same scale as current attacks, and will likely be costlier for attackers.
Of course, attackers do have a strong incentive to innovate. And they very well might find another way to game the system, but by inverting the fundamental strategy away from enumerating badness and towards enumerating goodness (with a default state of untrusted/unvetted), none of their current tactics will prevail.