Tag Archives: statistics

Dissecting an Infographic

This morning I woke up, made a cup of coffee, picked up my tablet, and took a quick look at Twitter. The following was in my feed via retweet:

[Screenshot: the retweeted infographic]


Huh, that’s interesting. We do often see the media blowing things out of proportion. See: the Ebola panic in the United States this fall. I was about to move on before I noticed the scale in the upper left.

2014 had fewer than 500 deaths? That can’t be. The two Malaysia Airlines planes that went down this year (one shot down and one lost entirely) had to account for more than 500 deaths between them. Neither flight had survivors.

Then I noticed the lower right corner.

[Screenshot: the infographic’s lower right corner]

So I get that things are getting safer year over year, but there is no way 2014 actually saw an order-of-magnitude drop in deaths in a single year. A 10% decrease seems plausible; an 80% drop does not.

That’s because this infographic is from March (which makes a ton more sense). Its publication date is March 10, so that final bar covers at most about a fifth of the year.

[Screenshot: the infographic’s publication date]
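A quick back-of-the-envelope check (a sketch; the ~500 figure is simply read off the chart’s scale, not an exact count) shows how badly that truncation distorts the final bar:

```python
from datetime import date

# How much of 2014 had elapsed when the infographic was published?
pub = date(2014, 3, 10)
fraction = pub.timetuple().tm_yday / 365  # day 69 of 365

print(f"Fraction of 2014 covered: {fraction:.2f}")  # ~0.19

# Naive full-year pace for the final bar. The ~500 deaths figure is an
# assumption read off the chart.
deaths_shown = 500
print(f"Implied full-year pace: ~{deaths_shown / fraction:.0f} deaths")
```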


Note: over the course of the Twitter thread, the original author completely refuses to acknowledge he was wrong. Self-denial is amazing.

This CNN article, written right after the Malaysia Airlines flight was lost over Ukraine, puts the toll at 761 deaths as of the end of July.

This year is still going to end up with fewer deaths than most years; it will look more like the early 2000s and less like 2013. Not the worst year on record, but definitely not the safest either.

Powerball Probability

If you win the Powerball jackpot today, there are a few things you should know. After beating the 1 in 175 million odds, you have an 11 in 175 million chance of being killed in your car after collecting the winnings. If you survive that, you have a 327,250 to 175 million chance of being robbed of those winnings, and a 805,000 to 175 million chance that new mansion will go up in flames, according to Eve Waltermaurer, associate professor of sociology at State University of New York at New Paltz.

Probability is a bitch sometimes. (From the Poughkeepsie Journal)
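Treating each quoted figure as a probability over the same 175 million denominator (a minimal sketch; the risk numbers are taken from the quote as given, not independently verified) makes the punchline explicit:

```python
# Every number in the quote shares a denominator of 175 million.
DENOM = 175_000_000

p_win    = 1 / DENOM        # jackpot odds
p_car    = 11 / DENOM       # killed in your car, per the quote
p_robbed = 327_250 / DENOM  # robbed of the winnings
p_fire   = 805_000 / DENOM  # mansion goes up in flames

# Conditional on having won, the follow-on risks dwarf the win itself.
print(f"Robbery: {p_robbed / p_win:,.0f}x more likely than the win was")
print(f"House fire: {p_fire / p_win:,.0f}x more likely than the win was")
```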

How to manufacture facts like a champ

“Boomerang kids: 85% of college grads move home,” blared a headline on CNNMoney.com. “85% of college grads return to nest,” echoed the New York Post. “Survey: 85% of New College Grads Move Back in with Mom and Dad,” said Time magazine’s website.

Recently, the 85 percent figure emerged in the presidential campaign, in an ad from the Republican group American Crossroads that blames President Barack Obama for the boomerang.

We rated the claim False, but as we dug into the number, we found the media had repeated it with little scrutiny. Journalists were content to copy a number from other news reports without verifying it — or even asking when the survey was conducted.

If the reporters had looked deeper, they would have found some oddities about the firm that claimed to have conducted the survey, a Philadelphia-area company called Twentysomething. The company’s website had an impressive list of staffers, but when we checked on them, we found several who either didn’t work for the company or appeared to be fictional.

The whole story is even weirder than you might imagine, and can be seen over at Politifact. Moral of the story: news without public citations is suspect.

With all the crap Wikipedia gets on accuracy, it is quite good about creating a culture of “citation needed.” We need more of that.


Evolution has a speed limit

From Phys.org, a statistical analysis of the speed of evolution in the fossil record, looking for an upper limit:

Large evolutionary changes in body size take a very long time. A mouse-to-elephant size change would take at least 24 million generations based on the maximum speed of evolution in the fossil record, according to the work of Alistair Evans and co-authors. Becoming smaller can happen much faster than becoming bigger: the evolution of pygmy elephants took 10 times fewer generations than the equivalent sheep-to-elephant size change.
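To get a feel for what a “speed limit” implies, here is a toy calculation (the body masses are illustrative assumptions, and the constant per-generation growth model is a simplification, not the paper’s actual method):

```python
import math

# Illustrative body masses (assumptions, not from the paper).
mouse_g, elephant_g = 20, 5_000_000
mass_ratio = elephant_g / mouse_g  # a 250,000x size increase

# Back out the per-generation growth factor implied by the headline
# figure of ~24 million generations for mouse to elephant.
generations = 24_000_000
per_gen = mass_ratio ** (1 / generations)
print(f"Implied max growth per generation: {per_gen:.9f}")

# Under this toy model, smaller changes take proportionally fewer steps.
print(f"10x size change: {math.log(10) / math.log(per_gen):,.0f} generations")
```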

Now that computational cycles are becoming cheaper, science can do some really interesting statistical analysis to ask questions we never could before. Very cool.

Freakonomics off the rails

Great post at American Scientist about how Freakonomics has gone off the rails.

In our analysis of the Freakonomics approach, we encountered a range of avoidable mistakes, from back-of-the-envelope analyses gone wrong to unexamined assumptions to an uncritical reliance on the work of Levitt’s friends and colleagues. This turns accessibility on its head: Readers must work to discern which conclusions are fully quantitative, which are somewhat data driven and which are purely speculative.

I loved their first book, but as the authors say, the strongest part was around Levitt’s own peer-reviewed research. As they’ve gotten further away from that over the years, the methodology has become a lot more hearsay and writing to a deadline.

The Seeds of Psychohistory?

Isaac Asimov’s Foundation series was based on the idea that, in the future, overall societal events like the rise and fall of governments could be modeled mathematically.

Folks at the New England Complex Systems Institute have posted a paper that compares global food prices with food riots and unrest, which in many cases directly preceded much larger overthrows of governments. It’s a really interesting read.

This has not yet been peer reviewed, so take it with appropriate caution. The paper makes the point that many countries no longer have much of a local food system, which means they are directly affected by the global food price index. It cites commodity speculation and US corn ethanol policy as two major statistical factors in the rise of global food prices.
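As a sketch of the paper’s threshold-style argument (every number below is a made-up placeholder, not the paper’s data), the analysis amounts to flagging periods when a global food price index sits above some critical level:

```python
# Hypothetical monthly values of a global food price index. These are
# placeholders for illustration, not the series the paper actually uses.
index_by_month = {
    "2010-10": 197, "2010-11": 205, "2010-12": 223,
    "2011-01": 231, "2011-02": 238, "2011-03": 230,
}

CRITICAL_LEVEL = 210  # assumed placeholder parameter

for month, value in index_by_month.items():
    status = "elevated unrest risk" if value > CRITICAL_LEVEL else "ok"
    print(f"{month}: {value}  {status}")
```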

I encourage interested people to read the whole paper, available on arXiv.

Statistical Zombies

Kevin Drum has a good post on what he calls Statistical Zombies, ten of the top mistakes people make when using statistics. I particularly love #2:

What’s the survey error? Statistical sampling error in opinion polls is trivial compared to the error from other sources. Things such as question wording, question order, interviewer bias, and non-response rates, not to mention Bayesian reasons for suspecting that even the standard mathematical confidence interval is misleading, give most polls an accuracy of probably no more than ±15%. Example: a couple of years ago a poll asked respondents if they had voted in the last election. 72% said yes, even though the reality was that voter turnout in that election had been only 51%. Most polls and studies are careful to document the statistical sampling error, but who cares about a 3% sampling error when there might be 21 points of error from other causes?
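For context on why “3% sampling error” is the number polls report, here is a minimal sketch, assuming a simple random sample of roughly 1,000 respondents:

```python
import math

# Standard 95% margin of error for a sampled proportion.
def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    return z * math.sqrt(p * (1 - p) / n)

# The worst case is p = 0.5; a typical 1,000-person poll gives ~3%.
print(f"n=1000: +/-{margin_of_error(0.5, 1000):.1%}")  # ~3.1%

# Compare that to the voting example above: 72% claimed vs. 51% actual.
print(f"Gap from non-sampling error: {0.72 - 0.51:.0%}")  # 21 points
```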

More fun with statistics: beware your kitchen appliances

Via Bruce Schneier, there is this quite good write-up on risk assessment in government. Apparently, most government agencies have explicit risk metrics for allocating resources based on the chance of things causing human fatalities:

An unacceptable risk is often called de manifestis, meaning of obvious or evident concern — a risk so high that no “reasonable person” would deem it acceptable. A widely cited de manifestis risk assessment comes from a 1980 United States Supreme Court decision regarding workers’ risk from inhaling gasoline vapors. It concluded that an annual fatality risk — the chance per year that a worker would die of inhalation — of 1 in 40,000 is unacceptable. This is in line with standard practice in the regulatory world. Typically, risks considered unacceptable are those found likely to kill more than 1 in 10,000 or 1 in 100,000 per year.

At the other end of the spectrum are risks that are considered acceptable, and there is a fair degree of agreement about that area of risk as well. For example, after extensive research and public consultation, the United States Nuclear Regulatory Commission decided in 1986 that the fatality risk posed by accidents at nuclear power plants should not exceed 1 in 2 million per year and 1 in 500,000 per year from nuclear power plant operations. The governments of Australia, Japan, and the United Kingdom have come up with similar numbers for assessing hazards. So did a review of 132 U.S. federal government regulatory decisions dealing with public exposure to environmental carcinogens, which found that regulatory action always occurred if the individual annual fatality risk exceeded 1 in 700,000. Impressively, the study found a great deal of consistency among a wide range of federal agencies about what is considered an acceptable level of risk.

This falls down when it comes to terrorism:

As can be seen, annual terrorism fatality risks, particularly for areas outside of war zones, are less than one in one million and therefore generally lie within the range regulators deem safe or acceptable, requiring no further regulations, particularly those likely to be expensive. They are similar to the risks of using home appliances (200 deaths per year in the United States) or of commercial aviation (103 deaths per year).
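Putting the quoted numbers on the same per-person scale (a sketch; the population figure is my rough assumption for the conversion):

```python
# Convert the quoted annual death counts into individual annual fatality
# risks. The US population figure is a rough assumption.
US_POPULATION = 310_000_000

annual_deaths = {
    "home appliances": 200,      # from the quote
    "commercial aviation": 103,  # from the quote
}

ACTION_LINE = 1 / 700_000  # level above which regulators always acted

for hazard, deaths in annual_deaths.items():
    risk = deaths / US_POPULATION
    side = "below" if risk < ACTION_LINE else "above"
    print(f"{hazard}: ~1 in {round(1 / risk):,} per year ({side} the action line)")
```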

Hmmm… I’m going to have to start keeping an eye on my dishwasher. I’m pretty sure it has it in for me.