Thursday, Jul 14
Doctor Doctor
Looking at statistics sometimes scares writers, particularly those who fled math class for English. But statistics are critical to determining the quality of a study, and it's important to understand that being able to analyze studies effectively doesn't require any higher-level math skills.

The 5% Rule

When we look at the results of studies, one of the first things we want to know is whether the results are statistically significant. "Statistical significance" means:

a) There's a big enough sample size

If you chose (d), you're not alone. The other choices aren't right, either. Statistical significance simply means that the probability of getting results like these by chance alone - that is, when they don't reflect a true result - is quite small. How small? Well, usually 5% or less, by convention. Put another way, if there were no real effect and you repeated the experiment 100 times, chance alone would hand you a result this striking no more than about 5 times.

Leaving aside terms and numbers for a moment, why would researchers think that their results were due to chance? Think about flipping a coin. If you flipped a coin 100 times, and got 70 heads, and then 100 times again, and got 68 heads, and then another 100 times, and got 73 heads, you might have a funny coin - maybe it's weighted toward one side, or something. Checking for statistical significance is another way to make sure that your results aren't the result of a medical funny coin that you didn’t know you were testing.

Now, you ask, why did statisticians pick 5%? That has to do with the way that data is typically distributed - a.k.a. the famous bell curve. If you remember the appearance of a bell curve, you'll remember two "tails" of 2.5% of the data points on either side, for a total of 5%. In a distribution of people's heights, for example, that might be people who are taller than 6'4" or shorter than 4'6". Those tails are also referred to as outliers. You can see how it would be a bad idea to be a statistical outlier if you're a medical study.
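For anyone who wants to see the funny coin and the bell-curve tails as numbers, here is a minimal sketch using only Python's standard library. The function name `prob_at_least` is made up for this illustration, and the calculation assumes each flip is independent. It asks how often a fair coin would give 70 or more heads in 100 flips, and where the cutoffs sit that leave 2.5% of a bell curve in each tail.

```python
from math import comb
from statistics import NormalDist


def prob_at_least(heads: int, flips: int, p_heads: float = 0.5) -> float:
    """Exact binomial probability of `heads` or more in `flips` independent tosses."""
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )


# How likely is 70 or more heads in 100 flips if the coin is actually fair?
p_70 = prob_at_least(70, 100)
print(f"P(70+ heads in 100 flips, fair coin) = {p_70:.6f}")
print("Below the 5% convention?", p_70 <= 0.05)

# Bell curve: the cutoff (in standard deviations) that leaves 2.5% in each tail.
z_cut = NormalDist().inv_cdf(0.975)
tail_total = 2 * (1 - NormalDist().cdf(z_cut))
print(f"Cutoff: +/- {z_cut:.2f} standard deviations, total in tails: {tail_total:.3f}")
```

The first number is far smaller than 5%, which is the statistical way of saying a fair coin almost never behaves like that.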
Outliers are just plain unlikely, and if your statistical analysis says there's something incredibly unlikely about your results, they may not be true. They may be, to coin another phrase, false positives.

P values

How do we express this statistically in medical studies? That's the p value, which is used to say how likely it is that a difference between two groups was due to chance. When you see p ≤ .05, that means it's 5% or less likely that such a difference was due to chance, and that's considered statistically significant.

Researchers use p values at two points in a trial. The first is when they compare the characteristics of their control and treatment groups. At this point, you don't want to see much of a difference between the two groups, so you want to see p values larger than .05. At the second point, however, you want to see a difference between your groups if you want to prove that a treatment had an effect, so you'll be looking for a p value smaller than .05.

Trends

But what about when p values are greater than .05? In a study published in Gastrointestinal Endoscopy in February 2005 comparing the use by gastroenterologists of headsets during colonoscopy with the use of traditional video screens, researchers reported that "there was a trend toward increased time to cecum with the headset (9.8 vs. 8.0 minutes, p = 0.055)." Again, leave aside the medicalese for now.

So close! But no cigar. .055 is greater than .05, and so this result is not statistically significant. The researchers report this as a "trend," which you'll sometimes find is a word researchers use to couch their non-statistically significant findings. If a p value is close to .05, they'll write that there's a trend toward an effect. To me, that's sort of cheating, even if it’s an accepted way to express "close to statistically significant."

Still, context is important. If this is a treatment for a terminal illness that doesn't have any other treatments, maybe it's worth it.
But if it's the tenth treatment for a mild disease, I’d say pass on writing about it.

What about p values less than .05? Sometimes you'll see p = .02 or some such. Is that better, statistically speaking, than a p equal to or less than .05? I suppose that if you go strictly on what p values tell you, such smaller numbers are technically better. But for the purposes of analyzing studies, once it's at or below .05, it's all the same.

There's something else - a sort of warning - that this 5% implies, too. If we've accepted that 5% of studies show an effect when there isn’t really one, then we're accepting that one in 20 studies may actually be wrong. Should you write about the first major trial of a particular drug? Your call, but with each trial showing a similar result, it's less likely that the first one was just a fluke.

Confidence intervals

The issue of confidence intervals is another important one that's tied to this 5% rule. Confidence intervals often appear next to risk ratios. In medical studies, you'll see notations such as "RR 1.6; 95% CI 1.4 to 1.8." In English, that's "risk ratio 1.6; 95% confidence interval 1.4 to 1.8." In understandable English, that's something to the effect of "the risk of [insert horrible disease] in people who [insert bad habit here] was probably 1.6 times that of the general population, but we're 95% sure it was between 1.4 and 1.8 times that of the general population." In other words, the risk may actually be greater than 1.6 times, or less.

Another way to think about confidence intervals is that they are similar to the margins of error that you hear describing polls. If a poll says that 32% of people say they're voting for candidate X, and the margin of error is 3%, you can add 3% to the 32% to get the best-case scenario for candidate X, and subtract 3% to get the worst-case scenario. That's really the same concept as 1.4 to 1.8 above.
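To make the poll comparison concrete, here is a rough sketch of where a 3% margin of error could come from. The poll size of 900 respondents is an assumption made up for this example (the poll above doesn't give one), and the formula is the usual textbook approximation for a simple random sample, not necessarily what any particular pollster uses.

```python
from math import sqrt

Z_95 = 1.96  # standard bell-curve multiplier for a 95% interval


def margin_of_error(share: float, n: int, z: float = Z_95) -> float:
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * sqrt(share * (1 - share) / n)


share = 0.32          # 32% say they're voting for candidate X
n_respondents = 900   # hypothetical sample size, chosen only for illustration

moe = margin_of_error(share, n_respondents)
low, high = share - moe, share + moe
print(f"{share:.0%} plus or minus {moe:.1%}: worst case {low:.1%}, best case {high:.1%}")
```

With those assumed numbers the margin comes out at roughly 3%, so the range runs from about 29% to about 35%, the same kind of low-to-high range as 1.4 to 1.8.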
The quick and dirty way to tell statistical significance from confidence intervals is to do what is often referred to as checking whether the interval crosses 1. A risk ratio of 1 means that there’s no difference in risk between two groups, an odds ratio of 1 means the odds of an event are the same in two groups, and a hazard ratio of 1 means the same thing as a risk ratio of 1, but after an analysis that takes a number of variables into account. So if 1 is part of the 95%, it's within reason to think that if you repeat this trial, you may find no difference. That means the result is not statistically significant at a confidence level of 95%, which is standard. This holds whether you're trying to lower the risk, in which case the whole interval should stay lower than 1, or looking for an increased risk, in which case the whole interval should stay higher than 1.

Again, using the raloxifene/MORE study, looking at the abstract on the first page (1140), women taking raloxifene 60 mg/day had a 68% decrease in the risk of new clinical vertebral fractures, and the 95% confidence interval was reported as 20% to 87%. Expressed as risk ratios for the sake of consistency, those become .32, .80, and .13 (1 minus each decrease). So yes, this finding was statistically significant at a 95% level; the CI didn't cross 1.

Let’s rip an example from the headlines, as it were. In a study of celecoxib (Celebrex) and heart disease among patients in a clinical trial for prevention of colon cancer, subjects reached "a composite cardiovascular end point of death from cardiovascular causes, myocardial infarction, stroke, or heart failure" 1.0% of the time in the placebo group and 2.3% of the time in the treatment group, which was receiving 200 mg of celecoxib twice a day. This gave a hazard ratio of 2.3, with a 95 percent confidence interval of 0.9 to 5.5. See something about those numbers? Yep, it crossed one. So at a 95% confidence level, the finding is not statistically significant.
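The crosses-1 check is simple enough to write down as a few lines of code. This is only a sketch: the helper names `crosses_one` and `reduction_to_ratio` are invented for this example, and the raloxifene interval is converted from a percent decrease to a ratio by subtracting each figure from 1, so that a 68% decrease with an interval of 20% to 87% becomes a ratio of .32 with an interval of .13 to .80.

```python
def crosses_one(ci_low: float, ci_high: float) -> bool:
    """True when 1 (meaning no difference between groups) lies inside the interval."""
    return ci_low <= 1.0 <= ci_high


def reduction_to_ratio(reduction: float) -> float:
    """Turn a fractional risk decrease (0.68 for 68%) into a ratio (0.32)."""
    return 1.0 - reduction


examples = {
    "risk ratio 1.6 (95% CI 1.4 to 1.8)": (1.4, 1.8),
    "celecoxib hazard ratio 2.3 (95% CI 0.9 to 5.5)": (0.9, 5.5),
    # The largest decrease (87%) gives the lowest ratio, so the ends swap places.
    "raloxifene 68% decrease (95% CI 20% to 87%)": (
        reduction_to_ratio(0.87),
        reduction_to_ratio(0.20),
    ),
}

for label, (low, high) in examples.items():
    verdict = "not statistically significant" if crosses_one(low, high) else "statistically significant"
    print(f"{label}: interval {low:.2f} to {high:.2f} -> {verdict}")
```

Only the celecoxib interval contains 1, which matches the reading of the three examples in the text.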
Again, although it isn't true here, researchers may say this showed a "trend" toward significance.

And just as you may see p values lower than .05, you may see confidence intervals greater than 95%. Again, by convention, 95% is usually good enough, but sometimes just to prove something closer to beyond a shadow of a doubt, studies will report 97% or 99%.
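One thing worth knowing about those stricter levels is that they make the interval wider, not narrower. The sketch below assumes the usual bell-curve approximation, in which an interval is the estimate plus or minus some multiple of its standard error; the helper name `z_multiplier` is made up for this illustration.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Bell-curve multiplier that leaves (1 - confidence) split evenly across both tails."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.95, 0.97, 0.99):
    print(f"{level:.0%} interval: estimate +/- {z_multiplier(level):.3f} standard errors")
```

The multiplier grows as the confidence level rises, so a result whose 95% interval barely clears 1 may well cross 1 at 99%.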