NewsGuard’s Story

Excerpted from “The Death of Truth,” by NewsGuard co-CEO Steven Brill. Become a premium member of our Reality Check newsletter today to receive a free copy of the book and other great member benefits.

Millions of Papers Flying Around

You walk into a library, and instead of seeing books and periodicals neatly arranged by subject, instead of being able to see who the publisher is, instead of being able to know something about the author from a thumbnail biography on the book jacket or in an editor’s note, and instead of a librarian on hand to guide you as you make your reading choices, you see millions of individual pieces of paper flying around in the air. That’s the internet.

In a news feed or search, or something forwarded in an email or private chat, you see only a headline or a sentence or two. You have no idea who’s behind what you’re reading. What their credentials are. (Is the person pictured in the white coat recommending a remedy for insomnia really a doctor?) What their standards are. Whether the text contains misinformation or if a photo is a deep fake created by artificial intelligence. Or who’s paying for it. And there’s no librarian to guide you.

In 2018, Gordon Crovitz, a former publisher of The Wall Street Journal, and I launched what CNN would later call the “librarian of the internet.” The company, which we called NewsGuard, would use trained journalists to read, report on, and rate the reliability of news and information websites and their associated Twitter accounts, Facebook feeds, and YouTube channels. Facebook, Twitter, YouTube, and other platforms could then attach our rating—a point score of 0 to 100—to each publisher every time their work appeared on that platform. The reader could then click through and read what we called the publisher’s Nutrition Label. It would explain the rating in detail, provide information on the publisher’s background and financing, and generally tell readers who’s feeding them the news. We would also market a browser extension that consumers could download on their own. They could then see the website’s score, with a link to its Nutrition Label, whenever a website appeared on any search or social media platform that they were using, even if the platform itself had not licensed the ratings to deliver to that consumer.

There would be no politics involved and no favoritism based on reputation or longevity. Whether publishers were venerable news brands or bloggers, liberal or conservative, for profit or nonprofit, they would all be judged on the same basic standards of journalistic practice.

Our idea was that in a world where online content dominates everyone’s media diet and people increasingly see news presented online from generally reliable sources mixed in indiscriminately with purported news from untrustworthy sources, someone needed to make sense of it all. This was already true in 2018, when people were likelier to get news delivered to them in a social media feed—from a conspiracy-minded uncle or by way of a black box recommendation algorithm—than they were to get news from the home pages of trusted news sources.

That seemed logical to us—a way of creating an important service and, we hoped, a good business. However, all of the technology industry investors we exuberantly pitched the idea to thought we were crazy. How could humans do these ratings “at scale”? they asked, employing a favorite tech catchphrase and implying that only technology—in this case, using artificial intelligence to detect misinformation—could achieve the “scale” necessary to rate the internet’s millions of websites and associated social media content. In the tech world a problem caused by technology could only be solved by technology.

In the words of that famous Apple commercial that captivated the world in the late 1990s, we were geared to “think different.” We were going to solve a journalism problem—websites purporting to cover news that had wildly inconsistent standards—with journalists. For starters, we had a different understanding of what achieving “scale” actually meant. We were interested only in sites broadly defined as containing news and information, not the website of a local pizzeria. We had data that showed that there were about twenty-eight hundred such news and information sites in the United States that accounted for 95 percent of online engagement, meaning that these websites accounted for 95 percent of the sharing and commenting in the feeds of the major social media platforms.

That top 95 percent included established global news organizations, Russian propaganda outlets, reputable or hoax health-care sites, local news start-ups, niche business publications, and gossip tabloids. Only the most obscure websites, those with little online engagement, were not in the top 95 percent, and we could add some of those if they did something to break into the news.

Once we broke down the job this way, we calculated that if we could recruit about thirty full-time journalists, plus some freelancers, each drafting and editing an average of two to three ratings a week, we could get to 95 percent in about a year. We would also include a few hundred sites that fell below the 95 percent threshold but were in the news or were well-known legacy brands (such as the Harper’s magazine site), because their journalism had impact beyond their online popularity.

That penetration rate—that is, not worrying about the thousands of other sites that were in the remaining 5 percent—was crucial. We decided from the start that perfect should not be the enemy of the good. We would tackle most of the problem and not be paralyzed by not addressing all of it.

Nor were we going to provide a rating for every article on the news sites we rated. Instead, we were going to rate their overall reliability according to their adherence to basic journalistic standards. So, if The Boston Globe or Le Monde (we had plans to expand quickly into Europe, where the numbers were even less daunting) occasionally published an article that had mistakes, it would still appear online with our highly reliable rating. Even the most careful news publication makes mistakes. Again, we knew we were not solving the problem completely, but we were determined to make a big dent in it. Later, we would see this decision validated when we found that the least reliable websites tended to be recidivists. They typically bounced from one false story to another.

Meanwhile, we knew that the tech world had done little to solve the information chaos that it had created. It had developed decent technology to identify and deal with a lot of pornography, hate speech, and over-the-top sensationalism. Machines could indeed be programmed to catch much of that by looking out for targeted words or images. However, when it came to misinformation and disinformation, the best tech tools had proven they could not detect it. A calmly written article extolling the virtues of a phony cure for cancer or explaining why a school shooting was a hoax featuring crisis actors was not something that would trigger a machine. Nor could the machine tell the difference between a site posing as a mission-driven local news start-up and one secretly financed by a political candidate in the middle of a tight race.

When we launched in late 2018, having raised money from individual investors who were familiar with some of our prior work rather than from tech-centric venture capitalists, the evidence was mounting that the machines could not deal with harmful content when it came to misinformation and disinformation, and the result was a seemingly endless series of troubling revelations.

Among them:

A well-publicized study by the Stanford Graduate School of Education in late 2016 had found that “when it comes to evaluating information that flows across social channels or pops up in a Google search, young and otherwise digital-savvy students can easily be duped. . . . Ordinary people once relied on publishers, editors, and subject matter experts to vet information they consumed. But when it comes to the unregulated internet, all bets are off.”

Another study, by Vox, had found that during the three-month run-up to the 2016 election the “top twenty fake news sites on Facebook” had more engagement online than the top twenty legitimate news sites. Among the most read stories on the “fake” news sites were one announcing that the pope had endorsed Donald Trump and another claiming that an FBI agent involved in the investigation of Hillary Clinton’s use of a private email server while she was secretary of state had been found dead in his apartment. The FBI agent “story,” which quickly went viral across social media platforms, had originated from DenverGuardian.com, which purported to be the website of a local Denver newspaper. There was no such newspaper. The pope endorsement story came from a website called WTOE 5 News. According to a 2016 report on CNBC.com, “WTOE 5 News has since shut down its website. However, when it was operational, it openly admitted to fabricating content and even had a disclaimer on its homepage saying: ‘most articles on wtoe5news.com are satire or pure fantasy.’”

Human rights groups had published reports that Facebook had been used to promote ethnic violence in Myanmar in 2017. A website called Natural News, which also ran Facebook, Twitter, and YouTube accounts, had drawn attention for promoting health-care hoaxes ranging from Bill Gates having killed children in Chad by funding a meningitis vaccine to abortion causing breast cancer. (The same site promoted a variety of non-health-related hoaxes, too, including that the murders at Sandy Hook Elementary School in 2012 were a “false flag” operation intended to gin up support for gun control.) When the original Natural News site had finally been spotted and kicked off the platforms, it had simply reappeared under dozens and then hundreds of new names.

The investigation by the U.S. special counsel Robert Mueller into Russian efforts to help elect Donald Trump had produced indictments in 2018 (never contested in court by the accused Russians, who remain fugitives) charging that the lavishly funded St. Petersburg–based Internet Research Agency had set out to steer the election to Trump by ginning up an estimated 10.4 million tweets across 3,841 Twitter accounts, 1,100 YouTube videos across 17 account channels, 116,000 Instagram posts across 133 accounts, and 61,500 unique Facebook posts across 81 pages. This resulted in 77 million engagements on Facebook, 187 million engagements on Instagram, and 73 million engagements on original content on Twitter.

Also in 2018, a scandal erupted around revelations that a British company hired by the 2016 Trump campaign, Cambridge Analytica, had used Facebook to breach the privacy data of millions of Americans and target them with misinformation.

Perhaps the most amazing example of just how wrong things had gone when machines were left to assess the bona fides of online publishers was a New York Times report in late 2017, which found that because of YouTube’s recommendation algorithm steering users to what the machine decided they should see, a YouTube channel called RT had become second only to CNN as the most popular news offering on YouTube in the United States. RT is a Russian propaganda operation. It had started out calling itself Russia Today, but in a rebranding that no doubt earned someone in the Kremlin a bonus, it had changed its name and added viral, anodyne content, such as pet videos and car crashes, to hide its identity and true purpose and to build an audience.

As a result of all those ugly headlines, Facebook, YouTube, and, to some extent, the other platforms had during the 2016–18 period hired thousands of content moderators and fact-checkers to spot and take down content that violated standards articulated in their terms of service. However, these were mostly outsourced, low-paid workers, many based overseas, who were given inconsistent guidance and scant supervision by the “trust and safety” teams that the platforms had hired, often with great fanfare, to demonstrate how much they cared about the problem. Their job was to monitor millions of website articles and social media postings a day in real time, which, of course, was impossible.

Moreover, the fact-checkers’ work came after the fact—after someone had posted a hoax on Facebook, for example. It would often take days or weeks for the hoax to be discovered and taken down, by which time the most noxious content would long since have gone viral.

At the same time, the search engines and platforms continued tinkering with their algorithms, supposedly to try to catch more of the bad stuff before it got promoted. As we will see, that might have been the goal when it came to pornography and the most violent videos and vile hate speech. However, when it came to misinformation and disinformation, the algorithms not only couldn’t catch it but really didn’t want to.

These efforts—the fact-checking and the algorithmic ratings—were meant to assure regulators and the public that Big Tech was dedicated to giving their users the most reliable, useful content first when they did a search or saw a social media feed. But who were they to decide what was reliable or useful, and what wasn’t? How would they protect against biases? Their answer was that a machine, unlike the humans we had in mind, didn’t have biases. It was just a machine, with no opinions, emotions, or predispositions. For years after we launched, this would be a common refrain from the tech companies: A tech solution could not only achieve scale that humans couldn’t. The machine also didn’t have the biases that we humans had.

Beyond the fact that those providing the algorithmic input into the machine obviously could and did have biases, and that the algorithms clearly didn’t work to prevent online misinformation, we saw an additional, fatal flaw: the machines were unaccountable because nothing about what they did was transparent. All that we or anyone else outside these tech behemoths knew when we started in 2018 was that at least three of them—Twitter, Facebook, and Google—were somehow rating news and information content to be another variable in the algorithmic input that would determine the priority of what users would see about a given subject, or even if they would see the content at all. (Google was doing this for its dominant search engine and for its YouTube product.)

No one knew how they were doing it. Was NationalReview.com, the website of an iconic conservative American magazine, rated higher than the newer conservative DailyCaller.com or higher than TheNation.com, the website of The Nation, the liberal magazine founded by abolitionists in 1865? Was an upstart local French news site rated higher or lower than the legacy publication where its founders had worked before? The editors and publishers of each had no way of knowing, even though the size of their audience and the volume of advertising revenue they would get depended on how high up they would be presented on news feeds and searches. If they wanted to know, there was no one at these platforms whom they could ask. If somehow they did find someone to ask, they would not be told what their rating was. And if they did find out what their rating was, they would not be told how the rating was derived.

The platforms had an only-in-Silicon Valley answer for not revealing this crucial metric or how it was calculated: if publishers knew how the algorithms calculated the ratings, they could “game” the algorithm.

In short, there was no transparency and no accountability in a system that was misfiring so badly that the one time someone from Silicon Valley had offered a public assessment of a publisher, it was when YouTube’s top programming executive celebrated the most pernicious, ubiquitous Russian propaganda outlet for its authenticity. His news judgment was laughable: He praised Kremlin-funded and operated RT for providing “authentic” content instead of “agendas or propaganda.”

We intended that the fourth ratings system—NewsGuard’s—would be exactly the opposite in every respect. Everything about it would be transparent and accountable. Every website would be given a score from 0 to 100 using the same nine criteria—nine basic standards of journalistic practice. The accompanying Nutrition Label would spell out how we had applied the criteria to arrive at the score. And we would not tell any platform to block anything. Instead, we would offer the platforms a license to show their users NewsGuard’s assessment (and anyone else’s assessment if we had competitors, and we did in time) of the publisher’s journalism standards. We would always contact the proprietors of the website for comment before we published anything even remotely negative about the site. Journalists call for comment. Algorithms don’t.

The Nutrition Label would list the team that had worked on the rating, with their bios and contact information. If a publisher complained about a rating after we published it, we would include that complaint in the Nutrition Label. Or, if we decided the complaint was valid, we would make the correction and describe what we got wrong and how we corrected it. If we had any conflicts or appearances of conflicts in making the rating (such as when we rated the websites of publications I had founded, or when we rated the website of The Wall Street Journal, where Gordon had been the publisher), we would disclose that, too (even though neither of us still had any affiliation with those publications). And, if a publisher made a change to meet one of the criteria that it had not met in the initial rating, we would change the rating and explain the change. Again, we were the opposite of an algorithm. We wanted publishers to game our system.

Our solution was far from perfect. We were human beings who would make mistakes. And, despite all the systems we planned to put in place to prevent bias in applying those criteria, and the fact that competitors could replace us if we fell short, we still might succumb to those biases. Nonetheless, we thought of our solution as better than the two alternatives for addressing the problem: the worst would be to have the government make judgments about the reliability of what people were seeing online; the second worst would be for only the unaccountable, opaque algorithms of Silicon Valley to continue doing it.

We were confident that we were right about how the internet was a no-man’s-land akin to those millions of pieces of paper flying around in the air.

As is now obvious, we were more right than we realized, because what we didn’t know was that this was the relative calm before the storm. All of our certainty that we were attacking a huge problem was before COVID, before the 2020 election, before Stop the Steal and January 6, before the surge in anti-vaccine hoaxes, before the Russia-Ukraine war, before the Trump indictments and the beginning of the 2024 election cycle, before the Israel-Hamas war, and before generative AI dramatically enhanced the power of the internet to spread falsehoods at unimaginable scale.

Read more in “The Death of Truth,” by NewsGuard co-CEO Steven Brill—FREE for premium members of our Reality Check newsletter.