The phrase "studies show" has become a rhetorical weapon. It's used to lend scientific authority to claims regardless of the quality, relevance, or representativeness of the actual studies. A study of 12 people over 2 weeks, funded by the company selling the product, published in a pay-to-publish journal, is still "a study." But it's not evidence in any meaningful sense.
Learning to evaluate the evidence behind "studies show" is one of the most valuable critical thinking skills you can develop. It doesn't require a statistics degree; it requires knowing which questions to ask.
The five questions that filter 90% of bad science: (1) How many people were studied? (2) Was there a control group? (3) Who funded it? (4) Has it been replicated? (5) What did the study ACTUALLY measure vs what's being claimed?
Tip
The hierarchy of evidence, from strongest to weakest: Systematic reviews/meta-analyses > Randomized controlled trials (RCTs) > Cohort studies > Case-control studies > Case series/reports > Expert opinion > Anecdotal evidence. When someone says "studies show," ask where on this hierarchy the study sits.
Sample Size: A study of 12 people cannot detect subtle effects and is extremely vulnerable to random variation. Most nutrition studies need hundreds to thousands of participants to produce reliable results. When evaluating a claim, the sample size tells you how much weight to give the finding. N=12 is exploratory at best. N=10,000 is meaningful.
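To see why small samples are so unstable, here's a minimal simulation sketch (all numbers invented for illustration, not from any real study): we assume a true effect of 2 units against outcome noise with a standard deviation of 10, then watch how wildly the estimated effect swings at different sample sizes.

```python
# Hypothetical simulation: how sample size affects the stability of an
# estimated effect. TRUE_EFFECT and SD are invented for illustration.
import random
import statistics

random.seed(42)
TRUE_EFFECT = 2.0   # assumed true difference between groups
SD = 10.0           # assumed outcome variability

def estimated_effect(n):
    """Draw n treated and n control outcomes; return the observed difference in means."""
    treated = [random.gauss(TRUE_EFFECT, SD) for _ in range(n)]
    control = [random.gauss(0.0, SD) for _ in range(n)]
    return statistics.mean(treated) - statistics.mean(control)

for n in (12, 100, 10_000):
    estimates = [estimated_effect(n) for _ in range(1_000)]
    spread = statistics.stdev(estimates)
    print(f"n={n:>6}: estimates land roughly within {TRUE_EFFECT:.1f} ± {2 * spread:.1f}")
```

At n=12 the estimate routinely lands anywhere from strongly negative to several times the true effect; at n=10,000 it pins the true effect down tightly. Same underlying reality, entirely different reliability.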
Selection Bias: WHO was studied matters as much as how many. A study of college students doesn't generalize to elderly populations. A study of people who voluntarily join a health program doesn't represent the general population (self-selection bias). The question: does the study population match the population the claim is being applied to?
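A toy simulation makes the self-selection mechanism concrete (every number here is invented): if healthier people are more likely to volunteer for a wellness program, volunteers look healthier than the general population before the program has done anything at all.

```python
# Hypothetical self-selection sketch. Health scores and the volunteering
# rule are invented purely to illustrate the bias.
import random
import statistics

random.seed(11)
population = [random.gauss(50, 10) for _ in range(10_000)]  # baseline health score

# Healthier people are more likely to volunteer (probability rises with health).
volunteers = [h for h in population if random.random() < (h - 30) / 40]

print(f"population mean health: {statistics.mean(population):.1f}")
print(f"volunteer mean health:  {statistics.mean(volunteers):.1f}")
# Comparing volunteers to the general population credits the program with a
# difference that existed before anyone enrolled.
```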
Statistical Significance (p < 0.05): A p-value of 0.05 means that if there were no real effect, a result at least this extreme would occur about 5% of the time by chance alone. It does not mean there's a 5% chance the finding is wrong. The 0.05 threshold is arbitrary: convention, not natural law. Important limitations: (1) p < 0.05 is not proof of an effect; 1 in 20 random comparisons will clear that bar by pure chance. (2) Statistical significance is not the same as practical significance: a drug that reduces blood pressure by 0.5 mmHg might be statistically significant but clinically meaningless. (3) p-hacking (running many tests until one produces p < 0.05) inflates false positive rates dramatically.
Warning
p-hacking is the practice of running many statistical tests on the same data until one produces a "significant" result. If you test 20 independent hypotheses at p < 0.05, you expect 1 to be significant by pure chance. Researchers who report only the significant result (and not the 19 null results) create a false impression of a real effect. This is a major driver of the replication crisis.
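You can watch this happen in a few lines of code. The sketch below (an illustrative setup using scipy's standard two-sample t-test) runs 20 tests on pure noise, where every null hypothesis is true by construction.

```python
# Multiple-comparisons sketch: all 20 "hypotheses" are tested on pure noise,
# so any 'significant' result is a false positive. Seed and sizes are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N_TESTS, N_PER_GROUP = 20, 50

false_positives = 0
for i in range(N_TESTS):
    a = rng.normal(0, 1, N_PER_GROUP)   # "treatment" group: no real effect
    b = rng.normal(0, 1, N_PER_GROUP)   # "control" group: same distribution
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1
        print(f"test {i + 1}: p = {p:.3f}  <- 'significant', but purely chance")

print(f"{false_positives} of {N_TESTS} null tests came out 'significant'")
# Expected: about 1 in 20. Report only that one, shelve the rest, and you
# have manufactured a p-hacked 'finding' out of noise.
```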
Correlation means two things vary together. Causation means one thing causes the other. The gap between them is where most bad science lives.
Classic example: countries with higher chocolate consumption win more Nobel Prizes. Correlation? Yes. Causation? Obviously not. The confound: wealth. Wealthy countries consume more chocolate AND invest more in research and education. The chocolate isn't producing the Nobels.
Confounding variables are factors that affect both the measured variable and the outcome, creating a false appearance of a direct relationship. In health research: people who exercise more also tend to eat better, sleep more, stress less, and have higher incomes. When a study finds that "exercise reduces disease risk," some of that effect may be the other lifestyle factors that correlate with exercise, not exercise alone.
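A quick simulation shows how a confounder manufactures correlation out of nothing (the numbers are invented; "wealth" stands in for the real confound): chocolate has zero direct effect on Nobels here, yet the two correlate strongly until you adjust for wealth.

```python
# Hypothetical confounding sketch modeled on the chocolate/Nobel example.
# Wealth drives both variables; chocolate has no direct effect at all.
import numpy as np

rng = np.random.default_rng(1)
n = 200
wealth = rng.normal(0, 1, n)                       # the confounder
chocolate = 0.8 * wealth + rng.normal(0, 0.6, n)   # wealthier -> more chocolate
nobels = 0.8 * wealth + rng.normal(0, 0.6, n)      # wealthier -> more Nobels

print("raw corr(chocolate, nobels):", round(np.corrcoef(chocolate, nobels)[0, 1], 2))

# Adjust for wealth: regress it out of both variables, then correlate residuals.
choc_resid = chocolate - np.polyval(np.polyfit(wealth, chocolate, 1), wealth)
nobel_resid = nobels - np.polyval(np.polyfit(wealth, nobels, 1), wealth)
print("corr after controlling for wealth:",
      round(np.corrcoef(choc_resid, nobel_resid)[0, 1], 2))
# Raw correlation is strong (~0.6); after adjusting for wealth it collapses to ~0.
```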
The only reliable way to establish causation: randomized controlled trials (RCTs). Random assignment distributes confounding variables equally between groups, isolating the effect of the intervention. Observational studies (which are what most "studies show" claims reference) can only show correlation.
Relative risk vs absolute risk: "Drug X reduces heart attack risk by 50%!" sounds dramatic. But if the baseline risk was 2 in 1,000, the drug reduced it to 1 in 1,000. The absolute risk reduction is 0.1%. The relative risk reduction (50%) is technically accurate but creates a distorted impression of the benefit. Always ask for absolute numbers.
Real World
Headlines almost always report relative risk because the numbers sound more impressive. "Coffee reduces diabetes risk by 25%!" In absolute terms: baseline risk might drop from 8% to 6%. A 2-percentage-point reduction is real but modest. The relative framing makes the same finding sound transformative. This is not deception — it's selective emphasis. But it consistently misleads the public about the magnitude of effects.
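The arithmetic is worth making explicit. This snippet uses the drug example above (2 in 1,000 falling to 1 in 1,000) and adds the number needed to treat, a standard way to express how many patients must take the drug to prevent one event.

```python
# Relative vs absolute risk, using the article's drug example.
baseline_risk = 2 / 1000
treated_risk = 1 / 1000

relative_risk_reduction = (baseline_risk - treated_risk) / baseline_risk
absolute_risk_reduction = baseline_risk - treated_risk
number_needed_to_treat = 1 / absolute_risk_reduction  # patients treated per event prevented

print(f"Relative risk reduction: {relative_risk_reduction:.0%}")    # 50%
print(f"Absolute risk reduction: {absolute_risk_reduction:.1%}")    # 0.1%
print(f"Number needed to treat:  {number_needed_to_treat:,.0f}")    # 1,000
```

Both framings describe the same drug; one sounds like a breakthrough, the other like treating 1,000 people to help one.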
Survivorship Bias: You only see the winners. Successful entrepreneurs get profiled; the 90% who failed are invisible. Drugs that work get published; drugs that don't work get shelved. Buildings that survived are studied for their design; buildings that collapsed are forgotten. You draw conclusions from the visible survivors while ignoring the invisible majority that didn't make it.
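A toy simulation (all parameters invented) shows the distortion: give 1,000 "funds" zero true skill, shut down any fund that has a bad year, then study only the survivors.

```python
# Hypothetical survivorship-bias sketch: 1,000 funds with zero true skill;
# we then examine only the ones that survived five years. Numbers invented.
import random
import statistics

random.seed(3)
funds = []
for _ in range(1000):
    yearly = [random.gauss(0.0, 0.15) for _ in range(5)]  # 0% true mean return
    survived = all(r > -0.10 for r in yearly)             # any bad year -> shut down
    funds.append((statistics.mean(yearly), survived))

all_mean = statistics.mean(m for m, _ in funds)
survivor_mean = statistics.mean(m for m, s in funds if s)
print(f"mean return, all funds:      {all_mean:+.1%}")
print(f"mean return, survivors only: {survivor_mean:+.1%}")
# Studying only survivors makes a zero-skill population look skilled.
```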
Publication Bias: Studies with positive (significant) results are far more likely to be published than studies with null results. This creates a systematic distortion in the scientific literature — the published record overrepresents effects and underrepresents null findings. If 20 labs test the same hypothesis, the 1 that gets a significant result (by chance) publishes. The 19 null results sit in file drawers. The published literature looks like there's an effect; reality says there isn't.
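Here is that scenario as a simulation (illustrative parameters, again using scipy's t-test): the true effect is exactly zero, but only positive "significant" results make it out of the file drawer, so the published record shows a substantial effect.

```python
# Publication-bias sketch: 1,000 labs study an effect whose true size is
# zero; only results with p < 0.05 in the positive direction get "published".
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
N_LABS, N_PER_GROUP, TRUE_EFFECT = 1000, 30, 0.0

published = []
for _ in range(N_LABS):
    treated = rng.normal(TRUE_EFFECT, 1, N_PER_GROUP)
    control = rng.normal(0, 1, N_PER_GROUP)
    t, p = stats.ttest_ind(treated, control)
    if p < 0.05 and t > 0:              # only positive 'significant' findings publish
        published.append(treated.mean() - control.mean())

print(f"{len(published)} of {N_LABS} studies published")
print(f"true effect: {TRUE_EFFECT}, mean published effect: {np.mean(published):.2f}")
# The file drawer hides the nulls; the published record shows an effect of ~0.6.
```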
The File Drawer Problem: Unpublished null results are estimated to outnumber published results by 3-10x for many research questions. Pre-registration (declaring your hypothesis and methods BEFORE collecting data) and registered reports (journals committing to publish regardless of results) are emerging safeguards, but most of the existing literature predates them.
How to defend against these biases: Look for meta-analyses and systematic reviews that attempt to account for publication bias. Check if the study was pre-registered. Be skeptical of single dramatic findings that haven't been replicated. And remember: absence of published evidence is not evidence of absence — it might just mean the null results weren't published.
"Studies show" is not evidence — it's a claim that requires evaluation. Five questions filter 90% of bad science: How many people? Was there a control group? Who funded it? Has it been replicated? What was actually measured? Always ask for absolute numbers (not just relative risk), demand randomized controlled trials for causal claims, and remember that published literature systematically overrepresents positive findings.