Why the ‘Wisdom of the Crowd’ Can’t Make Generative AI Safe from Misinformation

The crowd may be able to provide reliable feedback on AI-generated text output about how to scramble an egg—but not about whether Covid vaccines work or Nazis rule Ukraine. However, trained journalists applying transparent, accountable, and apolitical criteria can offer AI models trustworthy training data.

By Elan Kane and Veena McCoole | Published on August 7, 2023

When it comes to news, the “crowd” is often inclined to believe definitive, far-left or far-right narratives that align with their personal beliefs instead of moderate sources that don’t take a hard stance on an issue they care about. As the social media platforms learned ten years ago, asking the crowd to “rate” generative AI’s responses to questions or prompts about news will produce skewed results. The crowd may be able to provide reliable feedback on AI-generated text output about how to scramble an egg—but not about whether Covid vaccines work or Nazis rule Ukraine.

The problem with ‘Reinforcement Learning from Human Feedback’

The work of trained journalists applying transparent, accountable, and apolitical criteria offers AI trustworthy training data. In contrast, biased opinions dominate results from “Reinforcement Learning from Human Feedback,” or RLHF. And, as recent news reports have shown, those tasked with training generative AI models are often contractors who are underpaid, overworked, and undertrained, especially when it comes to assessing news topics or sources.

A 2021 study published in the Journal of Online Trust and Safety notes that “while crowd-based systems provide some information on news quality, they are nonetheless limited—and have significant variation—in their ability to identify false news.”

A group of global academic researchers who compared NewsGuard’s assessment of news sources with the opinions of people “crowdsourcing” their views found a vast difference between the work of journalistically trained analysts applying transparent, apolitical criteria and that of the general public, fewer than 20% of whom could correctly spot a false claim in the news.

“Both psychological and neurological evidence shows that people are more likely to believe and pay attention to information that aligns with their political views – regardless of whether it’s true. They distrust and ignore posts that don’t line up with what they already think,” the researchers concluded in a 2019 study.

“What we learned indicates that expert ratings provided by companies like NewsGuard are likely more effective at reducing the spread of propaganda and disinformation than having users rate the reliability and accuracy of news sources themselves,” they wrote. “That makes sense, considering that… crowdsourcing ‘news’ was what got us into this mess in the first place.”

Assessing topics in the news requires an apolitical approach, something that RLHF cannot provide because user feedback is so often politically biased. That’s why NewsGuard’s apolitical approach to evaluating source credibility, based on nine basic, transparent criteria, is an effective way to fine-tune AI models to distinguish between trustworthy news sources and unreliable information.

There is already evidence that training AI with misinformation tools developed by trained journalists can reduce the propensity of these tools to spread false information. As Semafor reported recently, Microsoft’s Bing Chat is trained on NewsGuard data and provides what Semafor editor Ben Smith called “transparent, clear Bing results” on topics in the news that “represent a true balance between transparency and authority, a kind of truce between the demand that platforms serve as gatekeepers and block unreliable sources, and that they exercise no judgment at all.”

How NewsGuard’s Misinformation Fingerprints can help

NewsGuard extracts and catalogs the top misinformation narratives spreading online in our Misinformation Fingerprints dataset. We provide constantly updated data about each narrative in human- and machine-readable format, such as example language, links containing the false claim, and related keywords and hashtags. These Fingerprints can be used as data seeds to implement guardrails during post-processing of outputs to help an AI model recognize and debunk demonstrably false narratives.

NewsGuard's Misinformation Fingerprints tracks the false narratives spreading globally online.
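
As a rough illustration of the post-processing guardrail idea described above, here is a minimal Python sketch. It assumes a simplified record with a narrative label, a keyword list, and a debunk sentence, and it uses naive keyword matching. The `Fingerprint` class, its fields, and the `min_hits` threshold are hypothetical stand-ins for illustration, not NewsGuard’s actual schema or method, and the sample narrative is a deliberately silly toy.

```python
from dataclasses import dataclass


@dataclass
class Fingerprint:
    """Hypothetical record shape for illustration only."""
    narrative: str        # short label for the false narrative
    keywords: list[str]   # related keywords and hashtags
    debunk: str           # one-sentence correction to attach


def flag_output(text: str, fingerprints: list[Fingerprint],
                min_hits: int = 2) -> list[Fingerprint]:
    """Return fingerprints with at least min_hits keywords present in text."""
    lowered = text.lower()
    matches = []
    for fp in fingerprints:
        hits = sum(1 for kw in fp.keywords if kw.lower() in lowered)
        if hits >= min_hits:
            matches.append(fp)
    return matches


def apply_guardrail(text: str, fingerprints: list[Fingerprint]) -> str:
    """Append a debunk note for every matched narrative; else pass through."""
    matches = flag_output(text, fingerprints)
    if not matches:
        return text
    notes = "\n".join(f"Note: {fp.debunk}" for fp in matches)
    return f"{text}\n\n{notes}"


# Toy data, invented for this example.
toy = Fingerprint(
    narrative="Moon is made of cheese",
    keywords=["moon", "cheese"],
    debunk="The moon is made of rock.",
)
print(apply_guardrail("The Moon is made of cheese.", [toy]))
```

A production guardrail would need far more than substring matching, since paraphrased claims would slip past it. The sketch only shows where catalogued narrative data could plug into an output pipeline.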

‘Red teaming’ AI models

NewsGuard analysts apply their training when “red teaming,” or offering challenging prompts to determine whether the AI model can distinguish reliable sources from sources of misinformation and detect false narratives. For example, we test AI models with prompts written in the journalistic style of malign actors, such as Russian, Chinese, and other actors engaged in hostile information operations. We also test AI models the way news consumers use them, with prompts simply seeking an explanation of topics in the news.

On the opening day of the DEF CON 31 conference on Aug. 10, NewsGuard will present the results of its audit of some of the leading generative AI tools so far launched in the market.

NewsGuard’s analysts will deploy a sample of 100 of NewsGuard’s more than 1,400 Misinformation Fingerprints. Our analysts will then prompt each generative AI tool to repeat the false narrative. The same 100 false narratives and corresponding prompts will be used for each of the contestants. This “red team” assessment will indicate how many of the 100 false narratives in the news each generative AI model spread and how many it refused to spread.
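
The tallying step of an audit like the one described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each tool’s results are taken to be a list of booleans, one per prompt, where `True` means an analyst judged that the tool repeated the false narrative. The `tally_audit` function, the tool names, and the sample outcomes are hypothetical and are not NewsGuard’s tooling or data.

```python
def tally_audit(results: dict[str, list[bool]]) -> dict[str, dict[str, float]]:
    """Count, per tool, how many prompts led to a spread narrative vs. a refusal.

    results maps a tool name to one boolean per prompt:
    True = the tool repeated the false narrative, False = it did not.
    """
    summary = {}
    for tool, outcomes in results.items():
        total = len(outcomes)
        spread = sum(outcomes)  # True counts as 1
        summary[tool] = {
            "spread": spread,
            "refused": total - spread,
            "spread_rate": spread / total if total else 0.0,
        }
    return summary


# Hypothetical outcomes for two made-up tools, four prompts each.
sample = {
    "tool_a": [True, True, True, False],
    "tool_b": [False, True, False, False],
}
for tool, stats in tally_audit(sample).items():
    print(tool, stats)
```

Because the same prompts are used for every tool, the resulting counts are directly comparable across tools.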

In a similar red teaming exercise in January 2023, NewsGuard analysts prompted OpenAI’s ChatGPT-3.5 with prompts based on 100 Misinformation Fingerprints false narratives. ChatGPT-3.5 spread the false narratives in response to 80 of the 100 prompts. In a subsequent audit of ChatGPT-4 conducted in March, the newer version repeated all 100 false narratives and did so more eloquently. NewsGuard’s red-teaming of Google’s Bard in April 2023 found that it spread 76 out of the 100 false narratives.

The bottom line is clear: when it comes to news, training generative AI with crowdsourced opinions does not work. Data created by journalists applying transparent, accountable criteria does.

To learn more about NewsGuard for AI, contact us at [email protected].

Editor’s note: This post has been updated to reflect a change in the number of Misinformation Fingerprints used in NewsGuard’s AI audit, from 50 to 100.