New R Street Study: AI Note Writers Outperform Humans on X’s Community Notes Platform
WASHINGTON, D.C. — Today, the R Street Institute released a new policy study, “AI Note Writers on Community Notes: An Evaluation of Seven Months of Data.” The analysis examines how AI-powered accounts on X (formerly Twitter) are performing within the platform’s Community Notes fact-checking program and what their growing presence means for the future of decentralized content moderation.
The study is authored by Spence Purnell, a senior fellow with R Street’s technology and innovation program. Drawing on seven months of publicly available Community Notes data spanning September 2025 through March 2026, Purnell analyzed 423,915 notes written by human contributors and by 24 active AI accounts operating through X’s application programming interface (API). To assess quality, the study introduces a metric called the Verdict Success Rate (VSR), which measures how often a note succeeds once it goes through a full review process.
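As a rough illustration of how a metric like the VSR could be computed, the sketch below treats a note as fully reviewed once it ends with either a helpful or a not-helpful verdict, and ignores notes still awaiting ratings. That reading, the status labels, and the function name are assumptions made for this example; the study's exact definition may differ.

```python
from collections import Counter

# Hypothetical status labels, assumed for illustration only.
HELPFUL = "helpful"
NOT_HELPFUL = "not_helpful"
PENDING = "needs_more_ratings"


def verdict_success_rate(statuses):
    """Share of notes with a final verdict that were rated helpful.

    Assumption: a note has gone through a full review when it ends with
    either a helpful or a not-helpful verdict; pending notes are excluded.
    """
    counts = Counter(statuses)
    decided = counts[HELPFUL] + counts[NOT_HELPFUL]
    if decided == 0:
        return None  # no verdicts yet, so the rate is undefined
    return counts[HELPFUL] / decided


# Made-up toy data, not figures from the study.
sample = [HELPFUL, HELPFUL, HELPFUL, NOT_HELPFUL, PENDING, PENDING]
print(verdict_success_rate(sample))  # 3 of 4 decided notes -> 0.75
```

Under this reading, a writer whose notes rarely receive any verdict is not penalized by the metric; only notes that were actually judged count toward the rate.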
Among the key findings, AI notes achieved a VSR of 88.8 percent, compared to 68.5 percent for human-written notes, a gap that held across topic categories including political content, war and conflict, and celebrity entertainment. AI notes also reached “Currently Rated Helpful” status, the designation that causes a note to appear publicly beneath a post, at more than double the rate of human-written notes (18.0 percent vs. 8.9 percent), and accounted for 13.9 percent of all shown notes despite representing only 7.4 percent of total output. Additionally, AI notes were roughly three times as efficient as human notes, requiring 304 ratings per successful correction versus 908. According to the study, the AI notes substituted for the equivalent of $577,000 to $3.4 million in professional fact-checking labor over the study period.
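The comparisons in these findings can be reproduced from the reported figures alone. The short sketch below is only a sanity check on the arithmetic; the variable names are chosen for this example and nothing here draws on the underlying dataset.

```python
# Figures as quoted in the release (percentages unless noted).
ai_helpful_rate, human_helpful_rate = 18.0, 8.9  # notes reaching "Currently Rated Helpful"
ai_share_shown, ai_share_output = 13.9, 7.4      # AI share of shown notes vs. of all notes
ai_ratings, human_ratings = 304, 908             # ratings per successful correction (counts)
ai_vsr, human_vsr = 88.8, 68.5                   # Verdict Success Rate

helpful_ratio = ai_helpful_rate / human_helpful_rate   # "more than double"
efficiency_ratio = human_ratings / ai_ratings          # "roughly three times"
representation_ratio = ai_share_shown / ai_share_output
vsr_gap = ai_vsr - human_vsr                           # percentage points

print(f"Helpful-rate ratio:   {helpful_ratio:.2f}x")         # 2.02x
print(f"Ratings efficiency:   {efficiency_ratio:.2f}x")      # 2.99x
print(f"Shown vs. output:     {representation_ratio:.2f}x")  # 1.88x
print(f"VSR gap:              {vsr_gap:.1f} points")         # 20.3 points
```

The third ratio shows that AI notes appear among shown notes at a little under twice their share of total output.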
The study also identifies areas where the system falls short. The most common criticism of AI notes from human raters was that the notes were unnecessary, a pattern Purnell attributes to targeting logic that flags posts for correction even when no factual claim is being made. The study recommends that AI operators implement pre-screening steps to reduce this over-correction tendency, and that X explore priority routing to direct scarce rater attention toward the highest-quality note writers.
The study further argues that X’s Community Notes model—permissionless, distributed, and built on cross-partisan human judgment rather than regulatory prescription—has direct implications for ongoing policy debates. Purnell recommends that Congress resist importing European-style content moderation frameworks, preempt state-level AI regulations that would disadvantage small and independent developers, preserve Section 230, and sustain federal funding for the statistical agencies whose data underpins both human and AI fact-checking.
As Purnell writes, “The most productive AI note writers in our dataset were deployed by small-scale operators without institutional backing or dedicated compliance staff. A 50-state patchwork of compliance rules would not merely slow the development of AI-driven moderation tools—it would structurally disadvantage the small and experimental developers whose notes the Community Notes rating system has already proven capable of evaluating without external oversight.”
If you would like to speak with Purnell about this study, please contact pr@rstreet.org.