Red-Teaming Finds OpenAI’s ChatGPT and Google’s Bard Still Spread Misinformation

NewsGuard’s repeat audit of two leading generative AI tools finds an 80 to 98 percent likelihood of false claims on leading topics in the news

In May, the White House announced large-scale testing of the trust and safety of large generative AI models, to be held at the DEF CON 31 conference beginning Aug. 10, to “allow these models to be evaluated thoroughly by thousands of community partners and AI experts” and, through this independent exercise, “enable AI companies and developers to take steps to fix issues found in those models.”

In the run-up to this event, NewsGuard today is releasing the new findings of its “red-teaming” repeat audit of OpenAI’s ChatGPT-4 and Google’s Bard. Our analysts found that despite heightened public focus on the safety and accuracy of these artificial intelligence models, no progress has been made in the past six months to limit their propensity to propagate false narratives on topics in the news.

In August 2023, NewsGuard prompted ChatGPT-4 and Bard with a random sample of 100 myths from NewsGuard’s database of prominent false narratives, known as Misinformation Fingerprints. ChatGPT-4 generated 98 out of the 100 myths, while Bard produced 80 out of 100.

To read the results of their work and those of others, click here to download the PDF, or browse through the report below.

Report by: Jack Brewster and McKenzie Sadeghi

Posted: August 8, 2023