Friday, October 19, 2007

Part 2: Why is Journalism a Math Free Zone?

Part 1 established that your local newsroom is a math-free zone and that the problem is extremely well known in the field of journalism. Ironically, newsrooms never run news reports about their own math incompetence. Sounds like censorship to me.

Part 2 looks at how this situation came about, plus how reporters cope with numbers when they have to cope with them at all.

Q: Why has journalism developed into a math-free-zone?
A: Journalism did not develop into a math-free zone - it has always been a math-free zone. Journalists self-selected into the field, in part, because they liked to write and did not like math. There are numerous journalist quotes online about failing math, failing the SAT math section, skipping math class, etc. It seems peculiar to me that they boast about this, but they do:
"Our distaste for figuring led us to avoid the study and practice that gives more mathematically oriented folks such ease with numerical procedures. And maybe we just lack the talent. Plenty of journalists received verbal SAT scores that were 30 or 40 percentile points above their quantitative scores."
And
"In the grand scheme of things, most journalists rank numbers somewhere below cockroaches. If the truth be told, a good number of us chose journalism as a college major because it allowed us to avoid math courses."
(Source: No Train, No Gain, a web site resource for news trainers. More examples of math problems here.)

Reporters take pride in knowing so little math:
"Sometimes that's because sources are deliberately misleading, and sometimes it's because we relish our mathematical ineptitude so much that we encourage stories that are inaccurate and unfair."

Q: Is math considered an important qualification for getting hired as a journalist?
A: No, not at all. Math skill has no bearing on most job opportunities. Until a year ago, math (and by math I mean simple middle school math) did not appear in the accrediting standards for colleges and schools of journalism. Math was not seen as important to getting hired:
"If Simon [a reporter] had been an illiterate -- someone who lacks the ability to read and write -- he never would have been allowed in a newsroom. But as an innumerate, it didn't matter that he wasn't good at math. If you don't know the difference between a noun and a verb, you could never get a job as a reporter or editor. But newsrooms are full of people who don't know how to calculate a percentage."

Q: So how do journalists handle numbers?
A: By avoiding numbers altogether! Even avoiding stories that require too many numbers. But their main trick is simply to write entire stories without numbers, replacing numerical values with comparisons and metaphors.

Most are trained to remove numbers from stories because numbers are said to "bog down" an otherwise interesting story. It is common for a reporter to wrap a story with colorful adjectives to suggest trends and comparisons to past events, glued together with colorful quotes from equally colorful "experts". By eliminating numbers and using colorful adjectives, the reporter avoids bogging down the story with actual details that, well, might even make the story moot.

This is what journalists are trained to do - avoid numbers, eliminate them whenever possible. Simplify, simplify, simplify. Give only what "the reader needs to know".

Example 1:

  • Use Only What You Need
    • "In all, conservation groups sought to halt 421 proposed national forest timber sales containing 2.5 billion board feet of timber. That's more than half the 4.8 billion board feet the U.S. Forest Service plans to sell on the 13 national forests in Oregon and Washington this year"
    "The point of this passage was to indicate the scope of the environmentalist attack on logging. So what are the critical figures? To some degree, that's a judgment call. But you can make a good argument that all readers really needed to know [emphasis added] was that the conservationists wanted to stop more than 400 sales containing more than half of all the timber the Forest Service planned to sell in Oregon and Washington. Adding all the other figures just detracts from the central point without contributing anything terribly meaningful."

    On the surface, the recommendation sounds acceptable. So let's rewrite the story using those guidelines:
    In all, conservation groups sought to stop more than 400 timber sales containing more than half the timber the Forest Service planned to sell in Oregon and Washington.
    Sounds good. What could be wrong? The revised version, while sort of accurate, has lost any sense of the volume of timber involved. Change all of the original "billion" references to "thousand" references. The revised sentence is still accurate, but does it capture the scope of the controversy? No. But that's okay, the reporter knows what I "really needed to know". The Associated Press, in Orwellian doublespeak, proclaims that it stands for your "right to know", although that apparently means only your right to know what the AP thinks you need to know.

    Example 2:
    Let's look at another example of scope. A frequent news headline reads something like "Researchers announce new drug therapy that is 50% more effective". Left out of the math-free and number-free report is "more effective" than what?

    Typical of most medical research, the old drug may have improved outcomes for 4% of patients and the new drug for 6% of patients. Going from 4% to 6% is a 50% relative increase but only a 2 percentage point absolute increase. Did the "50%" claim capture the idea that we are discussing a large percentage increase in a small number?
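The arithmetic behind that headline can be sketched in a few lines of Python. This is only an illustration using the hypothetical 4% and 6% figures from the paragraph above; the function names are made up for this example.

```python
def absolute_change(old_rate: float, new_rate: float) -> float:
    """Change in percentage points (new minus old)."""
    return new_rate - old_rate


def relative_change(old_rate: float, new_rate: float) -> float:
    """Change expressed as a percent of the old rate."""
    return (new_rate - old_rate) / old_rate * 100.0


# Hypothetical figures from the text: 4% of patients improved on the
# old drug, 6% improved on the new drug.
old_rate = 4.0
new_rate = 6.0

print(f"Absolute change: {absolute_change(old_rate, new_rate):.1f} percentage points")
print(f"Relative change: {relative_change(old_rate, new_rate):.0f}%")
```

Both numbers describe the same result. The headline reports only the relative change, which sounds large precisely because the starting rate is small.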

    This type of reporting loses another part of the story. In a typical medical study you might have results like these:
    • 30% took a placebo (received no treatment) and got better
    • 34% of those taking an older medication got better
    • 36% took the new medication and got better
    Are you excited by those numbers? When you see the numbers this way, you discover that hardly anyone gets better as a result of either the old or the new drug: most of the patients who improved would have improved on the placebo alone. Could you tell that from the cheery "50% more effective" headline? Virtually all news reports will proclaim the above results as "the drug is 50% more effective" (6 points better than placebo versus 4), which is misleading at best. (Junkfood Science has more examples of the bad reporting on relative risks in health care study results. We are slowly learning that many purported treatments, typically prescription drugs, either do not work or barely work at all. In fact the majority of drugs do not work at all for most patients.)
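Here is one way a "50% more effective" headline could be squeezed out of the illustrative 30% / 34% / 36% figures above. It is a sketch under the assumption that "effectiveness" means improvement beyond the placebo rate; the variable and function names are hypothetical.

```python
# Illustrative figures from the list above (percent of each group that got better).
placebo_rate = 30.0
old_drug_rate = 34.0
new_drug_rate = 36.0


def net_benefit(treated_rate: float, control_rate: float) -> float:
    """Percentage points of improvement beyond the control group."""
    return treated_rate - control_rate


old_net = net_benefit(old_drug_rate, placebo_rate)  # points attributable to old drug
new_net = net_benefit(new_drug_rate, placebo_rate)  # points attributable to new drug

# The headline number: how much bigger the new drug's net benefit is,
# relative to the old drug's net benefit.
headline_percent = (new_net - old_net) / old_net * 100.0

# What a patient might actually want to know: out of 100 people taking
# the new drug, how many more improve than would have on the old drug.
extra_per_100 = net_benefit(new_drug_rate, old_drug_rate)

print(f"Old drug beats placebo by {old_net:.0f} points")
print(f"New drug beats placebo by {new_net:.0f} points")
print(f"Headline: {headline_percent:.0f}% more effective")
print(f"Extra patients helped per 100, new vs. old drug: {extra_per_100:.0f}")
```

The same data supports both "50% more effective" and "2 more patients in 100 improve"; only one of them makes a headline.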

    Example 3:
    Here is another example of science fiction news reporting from my morning newspaper:
    "From 1985 to 1994, 62 percent of the female polar bears studied dug dens in snow on sea ice. From 1998 to 2004, just 37 percent made dens on ice."
    Those are the only numbers provided in the story. But percent of what? 10 bears? 100 bears? 1000 bears? The first period (1985 to 1994) represents 10 years, but the second period (1998 to 2004) represents 7 years. What happened to the years 1995 to 1997? If the full 10-year period is shown, what are the results? Someone removed a lot of numbers from the story to avoid bogging it down with data - but turned the story into meaningless gibberish.

    (I looked into this briefly and found that it involved 89 bears across the whole study period and was presented at a conference in Alaska; the study itself is under review and has not yet been submitted for publication or peer reviewed. Treating it as a binomial confidence interval problem and assuming part of the missing information, the 95% confidence interval would be about plus or minus 10 percentage points. Stated another way, the actual share in the overall population lies somewhere between 52% and 72% of polar bears denning in snow on sea ice in the first period, and somewhere between 27% and 47% in the second period. This would be statistically significant - but barely (or maybe "bearly"?) - which means that the finding is "probably true". Would it still be statistically significant if the 1995 to 1997 period were included? Why is it missing? Was the data "cherry-picked"? Why has no reporter asked that question? We know the answer - it involves both math and skeptically questioning an expert. All the media outlets have since picked up the story and run it - often after removing even more numbers - all you need to know is that "more polar bears are denning on land".)
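The back-of-the-envelope margin of error above can be reproduced with the usual normal approximation for a proportion. This is a rough sketch, not the study's own method: the news story gives no per-period sample sizes, so the code assumes 89 bears in each period (the total reported for the whole study), which is almost certainly too generous. The function names are made up for illustration.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval
    for a proportion p observed in a sample of size n (z=1.96 ~ 95%)."""
    return z * math.sqrt(p * (1.0 - p) / n)


def confidence_interval(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Lower and upper bounds of the interval around p."""
    m = margin_of_error(p, n, z)
    return p - m, p + m


# ASSUMPTION: 89 bears per period. The story only says 89 bears overall.
ASSUMED_N = 89

first_period = confidence_interval(0.62, ASSUMED_N)   # 1985 to 1994
second_period = confidence_interval(0.37, ASSUMED_N)  # 1998 to 2004

for label, (low, high) in (("1985-1994", first_period),
                           ("1998-2004", second_period)):
    print(f"{label}: {low:.0%} to {high:.0%}")

# A crude check: do the two intervals overlap?
intervals_overlap = first_period[0] <= second_period[1]
print("Intervals overlap:", intervals_overlap)
```

With fewer bears in each period the intervals get wider, which is exactly why the missing sample sizes matter. Try a smaller value for `ASSUMED_N` to see how quickly the margin grows.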

    Example 4:
    In January 2008, the NY Times published an article filled with the usual "he said, she said" quotes and the startling claim that returning American veterans were murdering people across the United States due to the stress of the war zones. Sadly, the NY Times ignored the actual numbers in order to reach its conclusion, because the actual data contradicted the story. Presumably the reporter did not wish to "bog down" a good story with actual numbers.

    Q: How are numbers handled on TV news?
    A: Numbers on TV? You're kidding! TV newscasters are taught to avoid all numbers unless they can display a graphic, typically a cute graph (a stack of dollar bills, columns on a building, a line of puppies, etc.).

    The assumption made by the news industry is that readers know less math than the reporters.

    Back when news had to fit a dwindling sheet of paper (newsprint), dumbing down stories to make them fit may have made sense. With rapidly shrinking page counts in most papers, they are dumbing down the stories even more. Their online editions ought to provide more depth, ought to provide the data, and ought to link to the supporting data and key sources. But they cannot do that because the reporters often lack the analytical skills necessary to do much beyond simple averages and percentages.

    Part 3 looks at the impact of the math-free zone on science and other reporting that is heavily dependent on statistical analysis. We will see that press release guidelines from organizations such as the Royal Society (perhaps the oldest scientific organization in the world), NASA and the European Space Agency call for removing as much data as possible from press releases, with the admonition that writers should assume "the reader knows nothing". Worse, everyone knows this and uses reporter math phobia as a tool to shape the news reports in their favor.