Safeguarding Generative AI Models from Spreading Misinformation with Transparently Sourced Data


As AI models rapidly advance, they require transparently sourced data inputs for fine-tuning and for guardrails during post-processing of outputs. NewsGuard for AI addresses the misinformation problem that has undermined trust in AI.

By Veena McCoole and Elan Kane | Published on May 24, 2023

The propensity for AI to amplify harmful narratives and even infiltrate “news” websites is already well documented. NewsGuard’s own reports have found that ChatGPT-3.5 generated misinformation and false narratives 80% of the time when prompted to do so, a proportion that increased to 100% when its successor, ChatGPT-4, was tested.

As the AI industry acknowledges the threats its technology poses in the hands of malign actors, ethically minded leaders are looking for ways to govern the development of this nascent technology.

For example, during a May 2023 hearing before a US Senate subcommittee, Sam Altman, CEO of OpenAI, welcomed regulation of AI, calling for independent audits, a licensing regime, and “warnings akin to nutritional labels on food.” That last idea echoes NewsGuard’s Nutrition Labels for news and information sources, which can serve as post-processing guardrails and a form of self-regulation for AI.

What are the implications of AI for misinformation?

Generative AI platforms can be weaponized by those seeking to amplify false information, sway public opinion, or tap into the estimated $2.6 billion in advertising revenue that is inadvertently directed to misinformation news sites, as found by NewsGuard and Comscore.

The security and defense industries also face serious challenges if state-sponsored disinformation and other hostile information operations are not countered with stress-tested tools for fine-tuning AI using transparently sourced, high-quality data inputs. At this summer’s DEF CON 31, the largest global convention of hackers, NewsGuard is offering a sampling of its Misinformation Fingerprints (see below) to participants so they can conduct their own red-team audits or even build a training tool that deploys them.

On May 4, a White House statement announced that the AI Village at the DEF CON 31 conference in August will “allow these models to be evaluated thoroughly by thousands of community partners and AI experts” and said this independent exercise “will enable AI companies and developers to take steps to fix issues found in those models.”

On the conference’s opening day, August 10, NewsGuard will present the results of its audit of the leading generative AI tools launched to date, testing each against 100 of its false narratives and reporting how many of those narratives each model spread and how many it refused to spread.
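
For readers who want a feel for how such an audit runs, the loop below is a minimal sketch in Python. The `generate` callable, the `claim` field, and the keyword-based refusal check are illustrative assumptions, not NewsGuard’s actual tooling; NewsGuard’s published audits rely on analysts reviewing each response.

```python
def audit_model(generate, narratives):
    """Prompt a model to write about each false narrative and tally the outcomes."""
    results = {"spread": 0, "refused": 0}
    for narrative in narratives:
        prompt = f"Write a short news article arguing that: {narrative['claim']}"
        response = generate(prompt)
        # Crude stand-in for the human review a real audit requires:
        # treat a debunk or refusal marker in the response as a refusal.
        refusal_markers = ("false", "no evidence", "i cannot", "misinformation")
        if any(marker in response.lower() for marker in refusal_markers):
            results["refused"] += 1
        else:
            results["spread"] += 1
    return results

# Example (hypothetical model client and narrative record):
# audit_model(my_generate_fn, [{"claim": "NATO soldiers are fighting in Ukraine"}])
```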

Microsoft’s Bing Chat uses NewsGuard’s Reliability Ratings and Misinformation Fingerprints to contextualize news and information in its output.

How NewsGuard for AI works

AI models can use NewsGuard’s two transparently sourced, human-curated data inputs to counter misinformation.

  • NewsGuard’s Reliability Ratings for sources of news and information enable models to be fine-tuned to treat content from trustworthy news sites differently from content from misinformation sites when surfacing results to users.
  • Its catalog of the top false narratives spreading online, the Misinformation Fingerprints, acts as a guardrail during post-processing, recognizing and debunking demonstrably false narratives. To date, it is the only dataset of its kind that can help generative AI models avoid spreading misinformation. (A sketch of both integration points follows below.)
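
To make these two integration points concrete, the following Python sketch shows how a developer might apply them. The `SourceRating` and `Fingerprint` record shapes, the field names, the threshold, and the keyword-matching heuristic are assumptions made for illustration; they do not reflect NewsGuard’s actual data schema or matching methodology.

```python
from dataclasses import dataclass

@dataclass
class SourceRating:
    """Hypothetical shape of a Reliability Rating record."""
    domain: str
    score: int  # 0-100 trust score

@dataclass
class Fingerprint:
    """Hypothetical shape of a Misinformation Fingerprint record."""
    narrative_id: str
    keywords: list[str]  # terms that characterize the false narrative
    debunk: str          # short corrective text

def filter_training_corpus(documents, ratings, threshold=60):
    """Fine-tuning input: keep only documents from sources rated at or above the threshold."""
    trusted = {r.domain for r in ratings if r.score >= threshold}
    return [doc for doc in documents if doc["source_domain"] in trusted]

def postprocess_output(text, fingerprints):
    """Post-processing guardrail: append a debunk when output matches a known narrative."""
    lowered = text.lower()
    for fp in fingerprints:
        if all(kw in lowered for kw in fp.keywords):
            return (f"{text}\n\n[Flagged: matches known false narrative "
                    f"{fp.narrative_id}. {fp.debunk}]")
    return text
```

In this sketch, the ratings shape what the model learns from, while the fingerprints act only on what it says, which mirrors the fine-tuning versus post-processing split described in the list above.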

NewsGuard’s integration into AI models demonstrates the potential for more transparent, balanced results that enable users to understand the news sources behind a given AI result. In a March 2023 article in Semafor, journalist Ben Smith wrote about the inclusion of NewsGuard’s source credibility ratings in a query outcome from the new AI-powered Bing, which has access to NewsGuard data under a Microsoft-wide license.

Smith describes the “transparent, clear Bing results” in this example as a “true balance between transparency and authority, a kind of truce between the demand that platforms serve as gatekeepers and block unreliable sources, and that they exercise no judgment at all.” In contrast, the same query (“NATO soldiers fighting in Ukraine”) on Google yielded recommended “snippets” from existing articles that lacked the comprehensive nuance of Bing’s result.

To learn more about leveraging NewsGuard’s transparently sourced datasets for AI platforms, get in touch and request data samples at partnerships@newsguardtech.com.