LIES, DAMNED LIES AND STATISTICS*
By Monica A. Frank, Ph.D.
Not a day goes by when I don't throw down the morning newspaper complaining
about the use of statistics in an article. In our world the media
liberally sprinkles statistics throughout articles and television programs to
support a point of view. The problem, however, is that statistics are
frequently misleading if not outright inaccurate. Without a clear
understanding of the nature of statistics and the definitions of statistical
terms, the public believes the statistic-supported statements as if they were
fact. In addition, without understanding the agenda of the journalist or
analyst using the statistics, the public accepts these "facts" uncritically.
And yet, if we "look under the hood" we will find the true nature of how
statistics work. Recently The Wall Street Journal published an
article regarding the health benefits of exercise (Landro, January 5, 2010). Most people who
know me or who have read much on my website know that I'm a strong supporter of
exercise, so in choosing this article to review and dissect, I am choosing one
that has the same agenda I do: increase people's awareness of the benefits of
exercise. Therefore, there should be less bias on my part in critiquing
such an article since I support the underlying premise. In addition, I
chose The Wall Street Journal because of its reputation whereas an
article from a less respected source could easily be dismissed as "atypical."
Even though I am using an article on exercise as my example of the problems with
statistics, we could take any article on any topic and find the same problems.
Once you have a better understanding of the use of statistics try applying the
concepts learned here to the global warming controversy or to the efficacy of
medications. You might discover some very interesting "facts."
Why are the underlying numbers important?
Most articles, like the one I'm quoting, report the percentage of risk or the
percentage of improvement but do not indicate the underlying numbers that were
used. As a result, the conclusions are based upon meaningless numbers.
For instance, the
article mentioned above indicates that "studies show that exercise can lower the
risk of colon cancer by over 60%." On the surface this statement is most
probably accurate. In addition, it sounds pretty impressive. Who wouldn't
want to reduce the risk of a potentially fatal disease by over 60%? However, what does
that number really mean? Without the baseline number indicating actual
risk, the claim of 60% reduction is meaningless. For example, if the
chances of having colon cancer (the following numbers are made up) were 1 out of 10 or
10%, and if our risk of colon cancer is reduced 60% by exercise we now have a
4% chance of colon cancer. That may be considered a significant enough
reduction that a person is willing to exercise routinely for the health benefit.
However, the actual annual incidence in the U.S. (I'm rounding all numbers so
they are easier to understand) of colorectal cancer is .05% or 1 out of 2000 if
we use a population figure of 300 million. So, if we consider that
exercise reduces the "risk" of colon cancer, which is .05%, by 60%, to determine the
actual change we multiply the risk (.05%) by the share that remains after the
reduction (100% minus 60%, or 40%) and find that
the risk for exercisers is .02% or 1 out of 5000. What this means is that
if you are an exerciser the chance of colon cancer would be 1 out of 5000
instead of 1 out of 2000.
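The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names `risk_after_reduction` and `one_in_n` are hypothetical, the 10% baseline is the made-up figure from the text, and .05% is the rounded incidence used in this article. Because a 60% reduction leaves 40% of the baseline risk, the remaining risk is the baseline multiplied by (1 - 0.60).

```python
def risk_after_reduction(baseline_risk, relative_reduction):
    """Return the risk that remains after a relative reduction.

    Both arguments are fractions (0.10 means 10%).
    """
    return baseline_risk * (1 - relative_reduction)


def one_in_n(risk):
    """Express a risk as '1 out of N', rounded to the nearest whole number."""
    return round(1 / risk)


# Made-up baseline from the text: 1 out of 10
made_up = risk_after_reduction(0.10, 0.60)
# Rounded U.S. annual incidence from the text: .05%, or 1 out of 2000
actual = risk_after_reduction(0.0005, 0.60)

print(f"Made-up baseline: {made_up:.1%} remaining risk")
print(f"Rounded actual baseline: {actual:.2%}, or 1 out of {one_in_n(actual)}")
```

The same 60% claim produces a drop of six percentage points in the made-up case but only three hundredths of a percentage point in the rounded real one, which is why the baseline matters.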
For some people, such as myself, this would be significant enough to change
their behavior; for others it would not. It depends upon what the change means
to the individual. For instance, if these numbers were instead reported
for non-chocolate eaters (in other words, I would have to give up chocolate) rather than exercisers, I would probably consider the
reduction in my quality of life to not be worth the reduction of risk. My
point is that we need to know the underlying numbers to make an informed
decision instead of relying on someone else to tell us what the numbers mean and
what we "should" do.
Certainly, the purpose of this article is not to give people another excuse not
to exercise. Therefore, let's use a different example that has different
underlying numbers. The Wall Street Journal (January 5, 2010) indicates
that exercise "reduces the incidence of diabetes by approximately 50%."
You may think, "That's not even as good as the 60% reduction for colon cancer
and she just showed that my chances of colon cancer are not high." But
wait a minute! We need to look at the underlying numbers for diabetes and
we will see an entirely different picture.
The annual incidence of diabetes in the U.S. is 1 out of 340 or .29%; that is, 798,000
people develop Type 2 diabetes each year. Therefore, to determine the
difference regarding how exercise impacts the development of diabetes, we again
(as we did with colon cancer) apply the reduction (50%) to the risk (.29%), which leaves half of it,
and find that the risk for exercisers is about .15% or roughly 1 out of 690 (using a U.S.
population figure of 275,000,000 as the source did). Now this may be much
more significant for many people. It certainly was for me. I was at
risk for developing diabetes 15 years ago because I already had an
insulin-resistant disorder and a family history of diabetes. However, I
didn't like the idea of giving up sweets, especially chocolate, for the rest of
my life so I decided to lose weight and exercise. As a result, my blood
sugars, blood pressure, and cholesterol are all perfect and I can still eat
chocolate (within reason). Having information regarding the effect of
exercise on the development of diabetes provided me with a method to gain
greater control over my health.
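One way to see why the smaller percentage can matter more is to convert each claim into people. The sketch below uses only the rounded figures quoted in this article to estimate how many new cases per 100,000 people per year would be avoided. The function name `cases_avoided_per_100k` is hypothetical, and treating the reported percentages as applying equally to everyone is a simplifying assumption.

```python
def cases_avoided_per_100k(annual_incidence, relative_reduction):
    """New cases avoided per 100,000 people per year.

    annual_incidence and relative_reduction are fractions (0.0029 means .29%).
    Assumes, for illustration only, that the reduction applies equally to everyone.
    """
    return annual_incidence * relative_reduction * 100_000


# Rounded figures quoted in the article
colon = cases_avoided_per_100k(0.0005, 0.60)     # .05% incidence, 60% reduction
diabetes = cases_avoided_per_100k(0.0029, 0.50)  # .29% incidence, 50% reduction

print(f"Colon cancer: about {colon:.0f} cases avoided per 100,000 people per year")
print(f"Diabetes: about {diabetes:.0f} cases avoided per 100,000 people per year")
```

Under these assumptions the "smaller" 50% reduction for diabetes corresponds to several times as many avoided cases as the 60% reduction for colon cancer, because the underlying incidence is so much higher.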
If this hasn't convinced you of both the importance of the underlying numbers
for understanding statistics and the importance of exercise, read the
discussion of hypertension below under the definition of incidence.
Definitions
Failure to clearly define terms can lead to
inaccuracies in the reporting of statistics. Frequently, by the time
statistics are reported to the public the writer is not the statistician who
compiled the statistics but typically a writer who has little or no training in
the interpretation of statistics or understanding of mathematics.
Therefore, although statisticians may define terms precisely, the terms used in
many articles are imprecise or the same terms are used inaccurately. For
instance, to understand the basics of statistics one must understand the
difference between prevalence and incidence of diseases.
Prevalence
is the number of cases in a population during a particular time period, typically
the lifetime. Although prevalence can be reported for any specified time period, most
popular media use the concept of "lifetime prevalence" although they frequently
only state "prevalence" and we must assume they are referring to lifetime
prevalence.
However, it does make a difference what number they are reporting. For
example, the one-year prevalence of anxiety disorders in the adult population,
meaning the number of cases that are present in a typical year, is 18%
(Kessler, Chiu, et al., 2005) whereas the lifetime prevalence for anxiety
disorders in adults is almost 29% (Kessler, Berglund, et al., 2005).
So, if I were trying to make a point to a college class about how common anxiety
disorders are I could say, "Look around you. One out of every five people
you see in this room has an anxiety disorder" because I'm using the concept of
one-year prevalence and since few anxiety disorders are resolved in less than a
year, I'm referring to how many people currently present in the room have an
anxiety disorder. However, I could also make a similar, but different,
point if I were to say to a client with an anxiety disorder "You are not unusual.
Over the course of their lifetime, almost 1 in 3 adults will suffer with an
anxiety disorder." These comments are based on different statistics:
one-year prevalence versus lifetime prevalence.
Incidence
is the number of new cases that occur in a population during a specified period
which is typically reported as annual incidence. Incidence is usually
reported as a rate which is the number of people who developed the disease
during the period divided by the population. So, if we are looking at the
incidence of diabetes reported above we can obtain the U.S. rate of .29% by
dividing 798,000 (number of annual cases) by 275,000,000 (the population of the
U.S. at the time the statistics were obtained). Incidence statistics tend
to be used to indicate the "risk" of developing a disease.
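The rate calculation described above is a single division. The sketch below reproduces it with the diabetes figures quoted in this article; the function name `incidence_rate` is hypothetical and chosen only for illustration.

```python
def incidence_rate(new_cases, population):
    """Annual incidence rate as a fraction: new cases divided by population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases / population


# Figures quoted in the article for Type 2 diabetes in the U.S.
rate = incidence_rate(798_000, 275_000_000)

print(f"Annual incidence: {rate:.2%}")
print(f"Roughly 1 out of {round(1 / rate)}")
```

The unrounded result is about 1 out of 345, close to the rounded "1 out of 340" used earlier in the article.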
A great deal of confusion occurs between the terms "prevalence" and
"incidence" and, unfortunately, many writers of publicly disseminated
information use these terms interchangeably. For example, when I searched
the web for the incidence of hypertension I found the following statement on a
site called "Up to Date Online" providing medical information to the public:
"NHANES
data from 1999-2000 and United States Census bureau information demonstrated a
29 to 31 percent incidence (italics mine) of hypertension in the 18 year
and older population of the United States (www.utdol.com)."
Knowing the correct definition of "incidence" I would translate this statement
to mean that there are 29 to 31 percent new cases of hypertension every year.
You can see how ridiculous that is because it would mean the entire population
of the U.S. would have high blood pressure within 4 years! I don't know
the quality of the site from which I obtained this data; however, I do know that
they are using the term "incidence" inappropriately, which causes me to question
other data they provide. This common error of confusing the two also tends to
create further inaccuracies in the reporting of statistics. If I weren't a
savvy reader I could easily pass on their incidence data to other unsuspecting
readers contributing to the confusion among the public.
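The sanity check applied to the quoted hypertension figure can be written out explicitly. The sketch below assumes, purely for illustration, a fixed population in which the same share of people become new cases every year and nobody recovers; the function name `years_until_everyone_is_a_case` is hypothetical.

```python
import math


def years_until_everyone_is_a_case(annual_incidence):
    """Years for cumulative new cases to reach 100% of a fixed population.

    Assumes, for illustration only, a fixed population in which the same
    share of people become new cases every year and nobody recovers.
    """
    return math.ceil(1 / annual_incidence)


for claimed in (0.29, 0.31):
    years = years_until_everyone_is_a_case(claimed)
    print(f"At {claimed:.0%} new cases per year, everyone is affected within {years} years")
```

A figure that implies the whole country develops a chronic condition in four years is a strong hint that the number is really a prevalence, not an incidence.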
The Wall Street Journal reported that exercise reduces the incidence of
high blood pressure by 40%. Even though articles published on websites
used the term "incidence," I had trouble finding true incidence figures because
most information about high blood pressure focuses upon prevalence statistics.
This is due to the difference between a chronic disorder such as hypertension
and an acute disorder such as the flu. Annual incidence is a more
important statistic for the flu because the prevalence statistic would not be
meaningful as people only have the flu for a short period of time.
However, the Wall Street Journal article reported the risk or incidence of
developing hypertension for exercisers and non-exercisers so I needed to use the
same type of statistic in order to examine the meaning of the percentages.
Therefore, incidence is 1 out of 31 (extrapolating from Canadian statistics: Tu,
2008) or over 3% of the adult population. Again, reducing the risk (over 3%)
by 40% leaves 60% of it, so we obtain an incidence of just under 2%
for exercisers, which would be equivalent to roughly 1 out of 52. Now that
might be considered a substantial difference by most people's standards.
In addition to that, since hypertension is a chronic disorder, if everyone in
the U.S. exercised, the number of people with hypertension would be reduced over
a 5-year period by 14 million people!
Agendas
Sometimes agendas are beneficent and sometimes they
are self-serving, but agendas always exist. Therefore, to fully
evaluate the statistics, the agenda of the reporter needs to be considered.
Usually the primary source of data, the scientific article, is fairly free from
bias as it typically needs to provide all the underlying numbers and to be clear
and precise with the interpretation of these numbers to be eligible for
publishing in a scientific journal. However, the further away from the
primary source and the more times the results have been paraphrased, the more
potential for inaccuracies exists.
So, understanding the agenda of the reporter can help with interpreting the
data. However, certain assumptions must be made about agendas. For
instance, I assume that the agenda for the journalist who wrote
The Wall Street Journal article described above is to provide an interesting,
accurate article that will inform the public about an important topic.
However, what is the agenda of her source for the statistics she provided?
One source mentioned in the sidebar to the article (from which the percentage
statistics discussed above were drawn) is the American College of Sports
Medicine (ACSM). A search on the internet shows the ACSM to be a
professional society providing basic and applied exercise science conferences,
meetings and workshops. I certainly do not want to denigrate an
organization I know little about, but is it possible that their agenda is to
convince others of the importance of exercise so as to obtain paid registrants
for their conferences and workshops? If that were the case, they may be
more likely to present data in a positively skewed format such as using
percentages instead of providing the underlying numbers. Another
possibility is that they did provide the underlying numbers but the newspaper
didn't use them because extreme numbers are more likely to sell newspapers.
I'm not saying that I know this is the case in this situation but it is part of
what we need to consider when determining others' agendas.
To determine agendas we want to speculate as to what benefit the individual or
the organization obtains from providing the information. This allows us to
determine why we might find certain inaccuracies in the information provided.
For instance, a statement in the sidebar of The Wall Street Journal article
states that according to the ACSM exercise can "decrease depression as
effectively as Prozac or behavioral therapy." As a behavioral therapist
myself, I am puzzled by this statement because I know that behavioral therapy is
about changing behaviors which would include the behavior of exercise.
So, of course exercise is as effective as behavioral therapy at decreasing
depression because exercise is one of the tools of behavioral therapy.
What I don't know is what study this statement was derived from: was it research
examining the difference between medication, therapist-aided behavioral
therapy, and exercise alone? In which case, I would also want to know how
the subjects were motivated to exercise: were they self-motivated or did the
researcher ask them to exercise? If they were asked to exercise, how was
that condition of the research different from the therapist-aided behavioral
therapy? I could continue with this line of questioning but I just want to
give you an idea of some of the questions that can be considered when evaluating
data that is provided to the public. For all I know, the source of this
statement may have an anti-therapy bias, and that agenda may color the overall
presentation of the statistics.
Although there are many other issues to be considered when evaluating the
statistics that are presented by the national media, I hope that I have provided
you with an understanding of some of the most critical issues to evaluate as you
read or hear statistics that are presented. In another article I intend to
use the concepts presented here to evaluate the efficacy of medications and how
to determine whether the benefits outweigh the side effects.
*
"Figures often beguile me, particularly when I have the arranging of them
myself; in which case the remark attributed to Disraeli would often apply with
justice and force: 'There are three kinds of lies: lies, damned lies and
statistics (Benjamin Disraeli).'"
- Mark Twain's Own Autobiography: The Chapters from the North American Review
Kessler, R.C., Berglund, P., Demler, O., Jin, R.,
Merikangas, K.R., Walters, E.E. (2005). Lifetime Prevalence and
Age-of-Onset Distributions of DSM-IV Disorders in the National Comorbidity
Survey Replication. Archives of General Psychiatry, 62,
593-602.
Kessler, R.C., Chiu, W.T., Demler, O., Walters, E.E.
(2005). Prevalence, severity, and comorbidity of twelve-month DSM-IV
disorders in the National Comorbidity Survey Replication (NCS-R).
Archives of General Psychiatry, 62, 617-27.
Landro, L.
(2010, Jan. 5). The Hidden Benefits of Exercise. The Wall Street
Journal. New York: Dow Jones & Company.
Tu, K., Chen, Z.,
& Lipscombe, L.L. for the Canadian Hypertension Education Program Outcomes
Research Taskforce (2008). Prevalence and incidence of hypertension from
1995 to 2005: a population-based study. Canadian Medical Association
Journal, 178, 1429-1435.
Copyright ©
2010 by www.excelatlife.com.
Permission to reprint this article is granted if it includes this entire
copyright and link.