Epidemiological Methods Over Biochemical Methods
Most health research uses animal studies, in vitro experiments, or short-term human studies to make claims about the causes and cures of chronic diseases.
These methods can be useful in some cases. But it’s become clear that they are not a good fit for studying chronic disease, because they have not been successful. If nothing else, the problems that have not been solved by these methods, even after many years of trying, will probably never be solved by these methods. As blogger Dynomight puts it, “complex mechanistic arguments for diet do not have a good track record. So far they’ve worked for… basically nothing? We’re still debating if eating salt or cholesterol are bad for you.”
Looking back at cases where public health research has succeeded, we almost always got there by starting with epidemiological methods. Notable successes include identifying the causes of chimney sweeps’ carcinoma and Minamata disease, and discovering the link between smoking and lung cancer. Sir Austin Bradford Hill, one of the researchers who established this link, once wrote,
To take a more modern and more general example upon which I have now reflected for over fifteen years, prospective inquiries into smoking have shown that the death rate from cancer of the lung in cigarette smokers is nine to ten times the rate in non-smokers and the rate in heavy cigarette smokers is twenty to thirty times as great.
Maybe people prefer lab-based studies because it seems like lab work is “real” and “hard science”. The fact that these studies involve test tubes and invoke the names of different molecules gives them an air of legitimacy, an air that’s not always deserved. You can add test tubes to any study you want, but that doesn’t make it a better study.
The truth is that comically simple arguments like “people who smoke more cigarettes get more lung cancer” are the foundation of successful research. Whenever you are feeling too galaxy-brained, remember to reflect and consult the mental gymnastics meme.
Epidemiological research does have limitations. In particular, these methods are almost always correlational, which makes it hard to figure out causality. You have to be especially aware of confounders — when you find a possible cause of a disease, you need to make sure there’s not another demographic feature or feature of life that happens to go along with the possible cause being studied, and might be the real underlying cause.
But for careful researchers, there are still ways to make a strong case for causality, like pseudoexperiments and looking for dose-dependent relationships. In general we subscribe to the Bradford Hill criteria.
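As a minimal sketch of what checking for a dose-dependent relationship can look like, the snippet below takes the relative death rates from the Bradford Hill passage quoted above and checks that risk rises at every step up in exposure. Using the lower end of each range he gives, treating non-smokers as a reference group at 1, and the function name itself are all illustrative assumptions. A monotonic trend is only one piece of a causal argument, and confounding would still need to be addressed separately.

```python
def is_dose_dependent(rates_by_exposure):
    """Return True if each higher exposure level has a strictly higher rate.

    `rates_by_exposure` is a list of (label, relative_rate) pairs, ordered
    from lowest exposure to highest.
    """
    rates = [rate for _, rate in rates_by_exposure]
    return all(lower < higher for lower, higher in zip(rates, rates[1:]))


# Relative lung cancer death rates, lower bounds taken from the quote above.
# Non-smokers are the reference group (assumed here to be 1).
smoking = [
    ("non-smokers", 1),
    ("cigarette smokers", 9),
    ("heavy cigarette smokers", 20),
]

print(is_dose_dependent(smoking))
```

A relationship that rose and then fell across exposure levels would fail this check, which is a hint to look harder for confounders or measurement problems.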
The Right Way
Epidemiological research has many strengths that make it competitive with, and often superior to, lab-based research.
If a lab study is trying to discover whether some factor causes a disease, or whether some treatment cures it, the answer almost always has to turn out “yes”; otherwise the paper doesn’t get published and the research team has wasted huge amounts of time and energy. Consequently, the findings of lab studies are often manipulated or even fabricated to get the result the research team wants. Incidentally, this does mean that when lab papers find that a factor doesn’t cause a disease, or a treatment doesn’t cure it, they can be trusted slightly more.
In contrast, epidemiological data is usually collected in a more neutral way. While every party has their own biases and agenda, epidemiological data was usually collected for a reason totally unrelated to the disease you’re studying.
Lab studies on a particular disease or treatment are conducted by people who know that their work will reflect on that disease or treatment, and they may intentionally or unintentionally twist their results to suggest the conclusions they want. But the people collecting government data on the depth of domestic drinking supply wells never dreamed that this data might someday be used to evaluate whether drinking water contamination could be causing obesity. No one is completely neutral, but they can be expected to have collected their data in a way that is mostly neutral towards that particular question, because they had no reason to think about it.
Most importantly, epidemiological research tends to have especially good external validity. The results of lab and animal studies are not guaranteed to generalize to new contexts, and often they don’t. The fact that a treatment works in the context of a clinical trial, on the kinds of people who sign up for an experimental treatment, doesn’t provide great evidence that it will work for the average patient under normal circumstances. But epidemiological evidence tends to speak directly to the questions we actually care about.
If in vitro studies say that a chemical will cause cancer, but people who work with the chemical every day don’t have a higher incidence of cancer than the general population, it’s hard to imagine being all that concerned about exposure to the chemical. Whatever the lab studies say, it appears to be perfectly safe in practice. And in contrast, if in vitro studies say that a chemical has no cancer risk, but 90% of people who work with the chemical develop cancer within 10 years, it’s hard to imagine not being concerned.
Lab studies can be useful to confirm or extend epidemiological findings, and they should be carefully considered as the evidence for a hypothesis increases. But lab studies are a bad place to start looking for leads. Lab studies should not be used as the foundation of a hypothesis in public health, because the evidence they supply is so weak and idiosyncratic.
How Much is this Research Worth?
Public health is immensely valuable. A single 2012–2018 CDC anti-smoking campaign was estimated to prevent almost 130,000 early deaths and save $7.3 billion in smoking-related healthcare costs.
The direct medical costs of obesity in the United States alone are estimated at nearly $173 billion per year, and the total annual economic impact of obesity in the U.S. has been estimated at over $1.4 trillion. A recent report in the British Medical Journal projected that the global economic impact of the obesity epidemic will reach $4.3 trillion annually within the next 10 years, accounting for nearly 3% of global GDP.
For chronic disease in general, the CDC has called chronic diseases “the leading drivers of our nation's $4.5 trillion in annual health care costs”. Healthcare costs for people with a chronic disease are five times higher than for those without, and about 40% of American adults have more than one chronic condition. Treatment of the most common chronic diseases, including productivity losses, has been estimated to cost the U.S. economy almost $2 trillion annually.
Open Projects
Organize the NHANES
The CDC has a project called the National Health and Nutrition Examination Survey, or NHANES for short. This is a program of studies that collects health and nutritional data of adults and children in the United States. It began in the early 1960s, and since 1999 the survey has been a continuous program that examines a nationally representative sample of about 5,000 people from across the country every year.
The NHANES datasets are publicly available — they include hundreds of health and nutrition measures for thousands of people, collected in multiple rounds of examination, across several decades. This makes these datasets an amazing resource for public health research.
But NHANES data is very hard to work with. Each two-year period of data (e.g. 1999-2000, 2001-2002, etc.) is split up into several different datasets, which have to be manually combined for analysis.
This combination is not as simple as merging the tables. Formatting and variable names often change year-to-year, sometimes with no explanation. Variables are often added or removed. It’s not always clear if a measure used in one year is the same as a similarly-named measure from another year. There’s a lot of ambiguity. So while the dataset is extremely rich, its hugeness and the fact that it changes so much over time can make it confusing.
This project is simple. Take the NHANES data and clean it up. Compile all years into a single dataset, combine whatever variables can be combined across years, and give them consistent names. Release this cleaned dataset to the public. Then dive in and see what can be learned now that it’s all more carefully organized.
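A minimal sketch of the core cleaning step might look like the following. It assumes each survey cycle has already been loaded as a list of records, and that a hand-curated codebook maps each cycle's variable names onto one consistent name. The variable names, values, and the `harmonize` function are hypothetical placeholders for illustration, not real NHANES codes or data.

```python
# Hypothetical miniature stand-ins for two survey cycles. Real NHANES files
# are much larger and would be read from the published data files.
cycles = {
    "1999-2000": [{"ID": 1, "POT_MG": 2500.0, "BMI": 24.1}],
    "2003-2004": [{"ID": 7, "POTASSIUM": 2300.0, "BMI": 31.0, "NEW_VAR": 5}],
}

# Hand-curated codebook: (cycle, original name) -> consistent name.
# Variables with no entry are left out until someone confirms what they are.
CODEBOOK = {
    ("1999-2000", "ID"): "respondent_id",
    ("1999-2000", "POT_MG"): "potassium_mg",
    ("1999-2000", "BMI"): "bmi",
    ("2003-2004", "ID"): "respondent_id",
    ("2003-2004", "POTASSIUM"): "potassium_mg",
    ("2003-2004", "BMI"): "bmi",
}


def harmonize(cycles, codebook):
    """Stack all cycles into one list of rows with consistent column names."""
    rows = []
    for cycle, records in cycles.items():
        for record in records:
            row = {"cycle": cycle}
            for name, value in record.items():
                new_name = codebook.get((cycle, name))
                if new_name is not None:
                    row[new_name] = value
            rows.append(row)
    # Give every row the same columns, using None where a cycle lacks a variable.
    columns = sorted({column for row in rows for column in row})
    return [{column: row.get(column) for column in columns} for row in rows]


combined = harmonize(cycles, CODEBOOK)
for row in combined:
    print(row)
```

Keeping the mapping in an explicit codebook, rather than renaming columns ad hoc, means every decision about which variables count as "the same" across years is written down and can be reviewed or corrected.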
The main limit of this project is that anything we find in the NHANES data would be correlational. But this is not as much of a problem as people think.
The limit on correlational findings is that they don’t provide very strong causal evidence, so we shouldn’t rely on them to justify causal claims. But we don’t always want to justify causal claims — there are lots of other kinds of arguments we might be interested in. In particular, the NHANES data seems like it would be useful for finding new leads, discovering relationships we didn’t expect, possibilities that can be investigated further. And correlational findings can still constrain theories, or provide strong evidence against a theory, even when they can’t provide strong causal evidence in favor.
One particular benefit of using NHANES data to investigate public health questions is that the NHANES automatically gives you multiple years of data. Most variables appear in more than one year, so if you find a relationship between some variables in one year, you can confirm that relationship is robust by checking that the same relationship exists in other years. This means you can get somewhat independent verification for a finding within this single dataset — you can replicate your analysis. This isn’t robust to everything, but it is robust in a way that’s unusual and hard to come by.
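The within-dataset replication described here can be sketched as follows: compute the same correlation separately in each survey cycle and check that it points the same way every time. The column names (`exposure`, `outcome`), the made-up numbers, and the function names are illustrative assumptions. The sketch also ignores NHANES survey weights and assumes every cycle has at least two distinct values per variable.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_cycle(rows, x, y):
    """Compute the x-y correlation separately within each survey cycle."""
    by_cycle = {}
    for row in rows:
        if row[x] is not None and row[y] is not None:
            by_cycle.setdefault(row["cycle"], []).append((row[x], row[y]))
    return {
        cycle: pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
        for cycle, pairs in by_cycle.items()
    }


def same_sign_everywhere(correlations):
    """True if the relationship points the same direction in every cycle."""
    return len({r > 0 for r in correlations.values()}) == 1


# Made-up rows purely for illustration.
rows = [
    {"cycle": "A", "exposure": 1, "outcome": 2},
    {"cycle": "A", "exposure": 2, "outcome": 3},
    {"cycle": "A", "exposure": 3, "outcome": 5},
    {"cycle": "B", "exposure": 1, "outcome": 1},
    {"cycle": "B", "exposure": 2, "outcome": 4},
    {"cycle": "B", "exposure": 3, "outcome": 4},
]

results = correlation_by_cycle(rows, "exposure", "outcome")
print(results, same_sign_everywhere(results))
```

A relationship that flips sign or vanishes in some cycles is a warning that the original finding may have been noise, or that the variable changed meaning between years.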
A small version of this project already exists as a proof of concept. We worked with a data scientist and a team of bloggers to combine just some of the NHANES datasets, then looked through the data for unusual correlations with BMI. The results were fairly surprising and provide a good example of the kind of findings we might expect from a full-scale version of the project. You can check out this analysis in the resulting blog post, NHANES: Copper and Γ-Tocopherol.
Nutrition Review
Potassium is an essential nutrient that is necessary for the normal functioning of all cells. To be healthy, a person needs a few thousand milligrams of potassium per day.
So it’s quite strange that you can usually only get potassium supplements in doses of 99 mg or less, a tiny fraction of the required daily amount. When a team of independent researchers looked into this (search for “Theory Viability” on that page), they found that this limitation probably comes from a misinterpretation of a small number of studies of Hydrosaluric-K (“enteric-coated hydrochlorothiazide with potassium chloride”) from the 1960s. These specific pills may have been dangerous, but not because of the amount of potassium they contained. People get much higher doses than 99 mg from their diets every day. In general, potassium pills containing more than 99 mg appear perfectly safe in nearly all contexts.
This is not the only unusual thing about the dosing of potassium. For a long time, the recommended daily value for adults (technically, the “Adequate Intake”) was 4,700 mg of potassium per day. But in 2019, the National Academies of Sciences, Engineering, and Medicine changed the recommended / adequate intake to 2,600 mg/day for women and 3,400 mg/day for men.
They say that the change is “due, in part, to the expansion of the DRI model in which consideration of chronic disease risk reduction was separate from consideration of adequacy,” but we can’t help but wonder if they changed it because it was embarrassing to have less than 5% of the population getting the recommended amount. In every CDC NHANES dataset from 1999 to 2018, median potassium intake hovers around 2,400 mg/day, and mean intake around 2,600 mg/day. And in this report from 2004, the National Academy of Medicine found that “most American women … consume no more than half of the recommended amount of potassium, and men’s intake is only moderately higher.”
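To put the 99 mg limit in proportion, the short calculation below compares one pill against the Adequate Intake figures quoted above. The function name and the labels are illustrative. The numbers come straight from the text.

```python
from math import ceil

PILL_MG = 99  # typical maximum supplement dose mentioned above

# Adequate Intake values quoted above, in mg of potassium per day.
adequate_intake_mg = {
    "adults, before 2019": 4700,
    "women, from 2019": 2600,
    "men, from 2019": 3400,
}


def pill_share_and_count(target_mg, pill_mg=PILL_MG):
    """Return (fraction of target in one pill, whole pills needed to reach it)."""
    return pill_mg / target_mg, ceil(target_mg / pill_mg)


for group, target in adequate_intake_mg.items():
    share, pills = pill_share_and_count(target)
    print(f"{group}: one pill = {share:.1%} of target, {pills} pills to reach it")
```

Under any of these targets a single pill covers only about 2 to 4 percent of the recommended amount, and it would take dozens of pills to reach the target from supplements alone.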
The point is, there are many recommended dietary allowances, adequate intake values, estimated average requirements, tolerable upper intake levels, and various limits or warnings.
Government agencies are happy to provide numbers. But the history of each number is often obscure. And some of these recommendations are unjustified or entirely mistaken, based on limited data, or misinterpretations of studies that by now are several decades old.
For this project, we propose a series of in-depth literature reviews to be written in plain language and released publicly, documenting the history of each of these recommendations and offering analysis as to whether each recommendation is justified. We could start with an extension of the analysis for potassium RDA and tolerable upper limit, and continue from there.
While nutrition is complicated, the foundations can be made simple, or can at least be mapped out clearly. We can enumerate the most important pieces one by one and lay out the available information for each.
There are a finite number of amino acids — only 22 α-amino acids make up all proteins, and we can compile the research on each. How much does a person need of each to have a baseline of health? How much variation is there in how much of each amino acid each person needs? How much of each is too much? Can you take 10 mg of glycine per day, or is that dangerous? If we can’t provide definitive answers to these questions, we can at least outline what is known, and what is still left to be figured out.
There are a finite number of dietary minerals, some subset of the 118 elements. Some of these, like sodium and mercury, have known good or bad nutritional properties. Some, like carbon, are not considered minerals per se because they are among the CHON elements (carbon, hydrogen, oxygen, and nitrogen). And there are others where it is still not clear whether they are essential micronutrients in humans.
For all of these minerals, we can for starters:
1) Document the current RDA, tolerable upper intake, etc.
2) Document the history of these numbers, if they have changed since they were introduced, and if so what justification (if any) was given for the change.
3) Document the reasoning and evidence behind these numbers.
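One way to keep these three kinds of documentation organized is a small record per recommendation, sketched below. The class and field names are hypothetical, chosen only to mirror the three steps above. The example entry uses the potassium figures already quoted in this section, with the year left empty where the text does not give one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    """One published value for a recommendation."""
    year: Optional[int]  # None when the source does not give a year
    population: str
    value: float
    unit: str
    stated_justification: Optional[str] = None


@dataclass
class RecommendationRecord:
    nutrient: str
    measure: str  # e.g. "Adequate Intake" or "Tolerable Upper Intake Level"
    revisions: List[Revision] = field(default_factory=list)  # oldest first
    evidence_notes: List[str] = field(default_factory=list)

    def latest(self, population: str) -> Optional[Revision]:
        """Most recent revision that applies to the given population."""
        matches = [r for r in self.revisions if r.population == population]
        return matches[-1] if matches else None


potassium = RecommendationRecord(
    nutrient="potassium",
    measure="Adequate Intake",
    revisions=[
        Revision(None, "adults", 4700, "mg/day"),
        Revision(2019, "women", 2600, "mg/day",
                 "expansion of the DRI model (as quoted in the text above)"),
        Revision(2019, "men", 3400, "mg/day",
                 "expansion of the DRI model (as quoted in the text above)"),
    ],
)

print(potassium.latest("women"))
```

Recording the stated justification next to each revision makes it easy to spot changes that were made with little or no explanation.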
To provide a specific example, it’s recommended that people get at least 150 mcg of iodine per day. This certainly seems like it is enough to keep you from developing goiter, and that goal may be the source of this specific number — we think they may have chosen 150 mcg because that was the minimum needed to keep Swiss people from getting this disease.
But we wonder if the optimal level of iodine for general health (actually thriving, not just the minimum amount to keep you from getting goiter) might be much higher, and if the tolerable limit for iodine might be even higher still. These are questions we might be able to answer with a detailed analysis of the literature. And if a detailed analysis doesn’t answer this question, then we’ll have discovered an important gap in the literature.
European Wheat
There are lots of stories where an American goes on vacation for a few weeks, to Europe or Asia or wherever, and loses a significant amount of weight without any special effort. Though sometimes it’s not weight, it’s something else, like acne breakouts or digestive issues.
There are also some stories that are exactly the opposite: someone from Europe or Asia goes on vacation to America for a few weeks, and gains a significant amount of weight without any changes.
The anecdotes are interesting, at times even compelling. But so far there hasn’t been any systematic study. If people really do rapidly gain and lose weight when they briefly visit other countries, that would be good to know. It would give us a powerful tool for causing weight loss (just take a trip to somewhere leaner than where you’re living right now). And most importantly, it would help us get closer to finding out what causes obesity, because we could extend this kind of study to discover exactly what difference(s) between the two locations is causing the weight loss and/or weight gain.
There are two different kinds of design: one for smaller incremental effects like weight loss, and another for large binary results like digestive issues.
The incremental effects studies can be relatively simple. One design is: Send fat people from the United States to Vietnam (or other lean country) for a few weeks, and see if they lose weight. Or: Send lean people from Vietnam (or other lean country) to the United States for a few weeks, and see if they gain weight.
Obesity is a good choice for these incremental methods because weight is easy to measure and there are a lot of anecdotes claiming that people have gained/lost weight this way. But you can use this same method to study any other health issue where there are obvious international differences.
The studies for large binary issues will usually be more complex, but they make up for it in that they don’t require as many participants. A small number of people, even just a single participant, can produce compelling results.
Here’s an example. Find a small number of people with gluten, wheat, or dairy intolerance who claim that their intolerance went away when they spent a few weeks in Europe. Recruit them to your study. First, confirm that they have this dietary sensitivity in the US and find a way to measure it. Have them consume some gluten / wheat / dairy so you can observe and confirm.
Then, send these people to Europe for a few weeks, and confirm that they don’t have this same issue in Europe. If so, congratulations, you have an interesting finding.
The next question becomes, does this difference come from something about the food in Europe, or some other difference about the environment? Well, you can find out more about that. Either:
Have these people stay in Europe, but ship them American food, exactly what they ate at home, and have them eat the American food while staying in Europe. Does their issue return? If so, then the problem is something in the American food. If not, it suggests that the problem is something else about the American environment, something missing in Europe.
Or, have these people stay in America, but ship them European food, exactly what they ate in Europe, and have them eat the European food while staying in America. Does their issue go away? If so, then the problem is something in American food, something they don’t consume when they eat food shipped from Europe. If not, it suggests that the problem is something else about the American environment, that can’t be fixed by eating carefully imported European foods.
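The logic of the two swapped arms above can be written down as a small decision function. This is only a sketch of the reasoning in the text. The function name and the "US" / "Europe" labels are illustrative assumptions, and a real study would need repeated and ideally blinded observations rather than a single yes or no.

```python
def interpret(location, food_source, symptoms_present):
    """Interpret one arm of the food-versus-environment swap.

    `location` and `food_source` are each "US" or "Europe". Only the two
    mismatched arms described above are covered by this sketch.
    """
    if location == food_source:
        raise ValueError("This sketch only covers the two swapped arms.")
    if location == "Europe" and food_source == "US":
        # American food eaten abroad: symptoms returning implicates the food.
        return "food" if symptoms_present else "environment"
    if location == "US" and food_source == "Europe":
        # European food eaten at home: symptoms vanishing implicates the food.
        return "environment" if symptoms_present else "food"
    raise ValueError("location and food_source must be 'US' or 'Europe'")


print(interpret("Europe", "US", symptoms_present=True))
print(interpret("US", "Europe", symptoms_present=True))
```

Running both arms is more informative than running one, since the two should agree if the food really is the cause.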
If you can confirm this, then you can go even deeper. For example, imagine you have an American who thought he had a gluten problem. But when you feed him bread made from flour you shipped from Europe, he has no problem at all with this bread. Obviously this can’t be the gluten, because European flour contains gluten just like American flour does.
Well, what you can do is you can look up the different ingredients and additives that are in American flour, but not in this European flour, and add them back in one at a time. When you add one that brings back his “gluten problem”, you’ve found the real cause. If it’s not one of the ingredients, then you test the two flours for contaminants, and you add the contaminants from the American flour to the European flour one by one, again until you discover the true problem.
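The add-back procedure can be sketched as a simple search: test the listed ingredients first, then the measured contaminants, stopping at the first one that brings the problem back. Here `causes_reaction` is a hypothetical stand-in for a supervised food challenge, and the additive and contaminant names are placeholders rather than real substances.

```python
def find_trigger(candidates, causes_reaction):
    """Add candidates back one at a time; return the first that triggers.

    `causes_reaction` takes one candidate name and returns True if the
    participant reacts to European flour with that candidate added.
    """
    for candidate in candidates:
        if causes_reaction(candidate):
            return candidate
    return None


def investigate(ingredients, contaminants, causes_reaction):
    """Test listed ingredients first, then measured contaminants."""
    culprit = find_trigger(ingredients, causes_reaction)
    if culprit is not None:
        return ("ingredient", culprit)
    culprit = find_trigger(contaminants, causes_reaction)
    if culprit is not None:
        return ("contaminant", culprit)
    return (None, None)


# Placeholder example: pretend the participant reacts only to "contaminant X".
def fake_challenge(candidate):
    return candidate == "contaminant X"


result = investigate(
    ingredients=["additive A", "additive B"],
    contaminants=["contaminant X", "contaminant Y"],
    causes_reaction=fake_challenge,
)
print(result)
```

Testing one candidate at a time keeps each challenge easy to interpret, but it would miss a problem that only appears when two substances are combined, so a null result here would not fully rule out the flour.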
Sign up for our mailing list here, or email us at root@whylome.org to get in contact!
Or make a donation by check to:
Whylome, Inc.
PO Box 211
Jonesville, VT 05466