An intro to exposure assessment in air pollution epidemiology (… and can we trust it?)

Acknowledgement: Thank you to my adviser and my lab group for their helpful feedback and discussion related to this post.

Even though I’m pursuing a PhD in environmental toxicology, my dissertation uses methods from environmental epidemiology to answer the question of whether long-term air pollution exposure (specifically, fine particulate matter (PM2.5)) is associated with dementia or dementia pathology in a cohort based in the Seattle area.

When I tell people about my project, they always ask: “how are you measuring exposure to air pollution?”

My answer: “Well, it’s complicated…”

Exposure assessment – “the process of estimating or measuring the magnitude, frequency, and duration of exposure” – is often considered the Achilles’ heel of environmental epidemiology. Maybe in the future, we’ll have advanced personal sensors that measure all possible chemicals in our bodies. But for now, environmental epidemiologists have to use a variety of imperfect methods.

Different pollutants require different methods of measurement (for example, some are measured internally, in urine or blood), but here I’ll focus on common methods to assess long-term exposure to ambient (outdoor) air pollution for use in cohort studies evaluating chronic health effects. Other methods can be used for time-series studies, which are focused on short-term exposures and acute health effects. All of this information exists in the academic literature, but as per the title of this post, my goal is to present a summary so that non-exposure science experts (like me!) can understand it.

A brief review of selected air pollution exposure assessment methods for cohort studies

(1) Monitoring & Simple Interpolation Models

TLDR: We can use information from ground-level monitors, or a weighted average from those monitors, as a basic way to estimate air pollution for a given population. This is easy and cheap, but overly simplistic.

The most consistent, long-term measurements of air pollution come from stationary monitors, such as the Environmental Protection Agency’s (EPA’s) Air Quality System (AQS). A highly simplistic (and inexpensive) approach, then, would be to represent each person’s exposure by the nearest monitor. But, there are several problems with this method. First, as you can see if you click on this map (and choose pollutant monitors under the “select layer” function), the existing network of monitors is quite sparse.  Here’s a map of the three active PM2.5 monitors in Seattle:

I live in northwest Seattle (see the blue triangle on the map), so it is highly unlikely that those monitors in south Seattle adequately represent my long-term air pollution exposures – particularly for pollutants linked to specific local sources, like roadways or factories. Second, these monitors usually only take measurements every third or sixth day, so there are a lot of missing data over time. (However, long term averages in the region tend to be fairly stable from day-to-day, so this probably isn’t a huge problem for cohort studies looking at the effects of long-term exposures).  And third, since we need some amount of population variability to estimate a meaningful association between an exposure and an outcome, assigning the same monitoring information to everyone in a certain area essentially eliminates (spatial) contrasts and decreases our chances of detecting any effects.

To overcome the lack of spatial coverage from federal regulatory monitors, researchers can set up their own monitors. But, developing a new monitoring network is expensive, and you can only get exposure data for the specific period of time that your monitors are set up (as opposed to being able to look at the cumulative effects of exposures over years in the past, which is what most cohort studies want to do).

We can improve on this with “interpolation,” which is a basic model that assigns more weight (influence) to the monitors that are closer to the individual (such as through inverse distance weighting or kriging). You can think about it like a weighted average. While this is an improvement, it often still does not really provide the type of detailed (spatial) resolution that we need.
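To make the weighted-average idea concrete, here’s a minimal sketch of inverse distance weighting in Python. The monitor coordinates and PM2.5 readings are invented for illustration; real kriging is considerably more sophisticated.

```python
import math

def idw_estimate(monitors, target, power=2):
    """Inverse distance weighting: estimate a concentration at `target`
    (x, y) from a list of (x, y, concentration) monitor readings.
    Closer monitors get more weight; `power` sets how fast weight decays."""
    num, den = 0.0, 0.0
    for mx, my, conc in monitors:
        d = math.hypot(target[0] - mx, target[1] - my)
        if d == 0:
            return conc  # target sits exactly on a monitor
        w = 1.0 / d ** power
        num += w * conc
        den += w
    return num / den

# Hypothetical PM2.5 readings (ug/m^3) at three monitor locations
monitors = [(0.0, 0.0, 8.0), (4.0, 0.0, 12.0), (0.0, 3.0, 10.0)]
estimate = idw_estimate(monitors, (1.0, 1.0))  # pulled toward the nearest monitor
```

Note that the estimate is always bounded by the lowest and highest monitor readings, which is exactly why interpolation can’t capture local hotspots that no monitor sees.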

Overall, these are fairly basic approaches that were more commonly used in the early days of air pollution epidemiology (ex: the seminal study by Dockery et al.), especially when the research focus was on between-city rather than within-city differences in health outcomes. We can definitely do better now.

(2) Spatial Regression Models

TLDR: We can build regression models based on ground monitors, geography, and population features to predict air pollution at various locations. These models are commonly used in cohort studies, but each one varies in quality. 

Land use regression (LUR) and related spatial regression models (e.g., universal kriging) use monitoring data as their foundation, but they also integrate information such as traffic density, population, land use, geographic features, and proximity to known emissions sources. All of these inputs are incorporated into a complex regression model to predict concentrations at sites where no monitoring data exist.  

LUR model quality is judged through one or more validation processes, which test how well the model can predict concentrations at sites where the true concentration is known. Researchers also report how much of the total variability in the exposure the model can explain (the technical term for this is “R²”).
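As a toy illustration of that validation step, here’s a one-predictor “LUR” with leave-one-out cross-validated R². Real models combine many geographic covariates; the road-density and PM2.5 numbers below are invented.

```python
def fit_ols(xs, ys):
    """Ordinary least squares for y = a + b*x (a toy one-predictor "LUR")."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def loocv_r2(xs, ys):
    """Leave-one-out cross-validation: refit with each monitoring site held
    out, predict at that site, then score the out-of-sample predictions
    against the observed values with R^2."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Invented data: road density near six monitors vs. measured PM2.5 (ug/m^3)
road_density = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pm25 = [6.1, 7.0, 8.2, 8.9, 10.1, 10.8]
r2 = loocv_r2(road_density, pm25)  # near 1 means good out-of-sample prediction
```

Holding out each site before predicting it is what distinguishes honest validation from simply reporting in-sample fit, which always looks better.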

While LUR models are better than relying on monitors alone, they are only as good as their inputs. So, if the model does not include some crucial feature that substantially affects air pollution concentration (like a local factory) or if the input data are not detailed enough, the final model will be inaccurate. There are also concerns with how measurement/prediction errors affect the health effect analyses, a complicated topic that deserves a post of its own (hopefully forthcoming!).

LUR is one of the most common methods used to estimate exposure in air pollution cohort studies. However, model quality varies greatly. And in reality, there’s no single number or assessment that can truly capture whether a model is “good enough”; it’s complicated and depends on the model inputs and methods, as well as the decisions that the modelers make along the way. In other words, you have to read the “methods” section of the publication very carefully. (I know that is not very satisfying, especially since the quality of these models can affect how much we can trust the results of the associated health analysis…)

(3) Dispersion & Chemical Transport Models (DCTMs)

TLDR: We can use physical-chemical models to predict air pollution concentrations. These models don’t use ground-level data, which is useful if you want to predict in places where those data don’t exist, but less ideal in the sense that we generally trust ground-level measurements more than theoretical models alone.

DCTMs (such as Box models, Gaussian models, Lagrangian models, & Computational Fluid Dynamics models) use physical and chemical properties, meteorological data, and topography to model the dispersion and transport of pollutants. In contrast to LURs, these models do not take into account any ground monitoring data with actual pollution measurements.

As with LUR, these models need to be validated, and model performance seems to vary by pollutant. A strength of these models is that they can provide information in places where we don’t have any monitoring data. But, because they are often very expensive to develop, provide limited spatial detail, and are not anchored to any ground-level data (which we tend to trust more than pure models), they are less commonly used in epidemiologic cohort studies than LUR.

(4) Satellite Remote Sensing

TLDR: We have advanced satellite technology that can estimate air pollution from space, based on light scattering! Cool! But, right now, these models provide limited spatial resolution, so they are best when supplemented with ground-level data.  

At a global scale, particularly when there are no ground-level monitors or detailed information on local emissions sources (such as in many low income countries), satellite measures of aerosol optical depth (AOD) can provide good estimates of certain air pollutants.

How does this work? A satellite measures the absorption and scattering of specific wavelengths of light at the same time in the same location each day, and these measures are then converted to ground level concentrations using chemical models such as GEOS-Chem.

Incredibly enough, these models can provide fairly accurate predictions of some ground-level pollutants, such as nitrogen dioxide and PM2.5 (but, modeling based on ground-level monitors is usually still more accurate). Remote sensing works less well for ozone, however, since the high levels of ozone in the stratosphere make it more complicated to accurately extrapolate to the earth’s surface. Other issues include potential interference from clouds, limited temporal characterization (usually just at a single time per day), and limited spatial detail (though spatial resolution continues to improve).

That last point – limited spatial detail – is the main downside of remote sensing right now. One way to mitigate this issue, though, is to incorporate data from LUR or ground level monitors, which can substantially improve accuracy.

Application to my work

I’ve skimmed over many details about each of these methods in the interest of making this post semi-accessible (but even so, I know it’s long!). There are also several other assessment methods as well as some interesting hybrid approaches. (For more details, I recommend Chapter 3 of EPA’s recent Integrated Science Assessment for Particulate Matter).

For my study, we’re relying on a sophisticated version of a LUR model with kriging that incorporates both spatial and temporal variability, similar to what has been done previously for other regions of the country. The inputs come from five federal agency monitoring sources and three different monitoring campaigns that have been implemented by the UW Dept. of Environmental and Occupational Health Sciences over the years, and – like all LURs – the final regression model incorporates different geographic features of the area that influence local pollutant levels. In the end, we will have a model that can estimate air pollution exposure at any location in the Puget Sound area all the way back to the early 1980s!

The elephant in the room (or, the person in the house)

TLDR: Why is it relevant and meaningful to use measures of outdoor air pollution in our health effect studies? 1) Outdoor air enters the home; 2) Measurement error likely underestimates health effects (so we are not exaggerating our results!); 3) Health effects based on outdoor air can influence policy.

All of the methods I’ve described so far provide exposure estimates based on a given geographic area or point. Some new, small studies use air pollution sensors to get measurements for each person, but that is too expensive for studies with thousands of people – and completely infeasible for those that were started years ago (like the one I’m using). So, instead, I will get estimates of outdoor air pollution based on each person’s residential address and then look at associations with dementia and related pathology.  

There are several reasons why a measure of outdoor air pollution based on a home address might not accurately represent someone’s actual exposure. First, we spend most (apparently, 90%!) of our time inside. Second, most of us spend a majority of our waking hours away from our homes, including commuting (in traffic) and working. (Although, for my study of mostly retired individuals, this should be less of an issue). Additionally, from a toxicology perspective, it’s important to consider that a measure of external exposure is very different from internal dose; what actually gets inside your body depends on your breathing rate, among other factors.

So, how can we trust results from epidemiological studies that use these measures of outdoor air pollution from someone’s home address as the “exposure?” When I first began my research in this area, I was quite bothered by this question and talked extensively with my adviser. Here are a few responses and associated thoughts (some more satisfying than others, I admit):

  • Outdoor pollutants infiltrate into our homes

While we like to believe that we are protected from outdoor air when inside, there is actually measurable and meaningful infiltration of outdoor pollutants into our homes. I experienced this firsthand during the wildfires in Seattle two summers ago, when I could smell smoke inside my room even though all the windows were closed.

  • Measurement error likely underestimates pollution effects

When we measure pollution at people’s homes, we are not capturing the full picture of their exposures. For someone like me, who commutes by bike, a measure of outdoor air pollution at my home is likely an underestimate of my daily exposure.

This concept – when the measured/predicted exposure does not represent the actual exposure – is a technical term referred to as “measurement error.” As I mentioned above, this is a complicated topic that I plan to return to in a future post. For now, a highly simplistic summary is that in most cases, exposure measurement error either 1) attenuates (decreases) the observed association, or 2) has no impact on the observed association (but it could affect the margin of error). (See here and here, for example.) So, in this simplified framework, we assume that the consequence of using these imperfect estimates of ambient air pollution is that our results are underestimated.
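A quick simulation makes the attenuation case concrete. This is a generic statistics illustration with made-up numbers (not my study’s data): we generate an exposure-outcome relationship with a known slope, add random non-differential error to the exposure, and watch the estimated slope shrink.

```python
import random

random.seed(0)  # reproducible illustration

def slope(xs, ys):
    """OLS slope of y regressed on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

# True exposure with a known effect on the outcome (slope = 2.0)
true_x = [random.gauss(10, 2) for _ in range(5000)]
outcome = [2.0 * x + random.gauss(0, 1) for x in true_x]

# Classical, non-differential measurement error added to the exposure
noisy_x = [x + random.gauss(0, 2) for x in true_x]

b_true = slope(true_x, outcome)    # recovers roughly 2.0
b_noisy = slope(noisy_x, outcome)  # attenuated toward roughly 1.0, since
                                   # var(x) / (var(x) + var(error)) = 4 / (4 + 4) = 0.5
```

The shrinkage factor var(x) / (var(x) + var(error)) is the classic “regression dilution” result; with error as large as the true exposure variability, the estimated effect is cut roughly in half.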

[Note: to reiterate…this is very simplified. Research by my own adviser suggests that it could also have the opposite effect.]

[Another note: this simplified summary assumes that the measurement error is “non-differential” (similar for all groups). When measurement error is “differential” (differs between groups), the impact on the health effect estimates could be different than what I described. See, I told you this was complicated!] 

  • Effects linked to outdoor air pollution are policy-relevant

Some people consider personal monitors to be the ideal method for exposure assessment. However, a complication of this approach is that personal monitors also capture air pollutants from indoor sources, since they measure everything a person is exposed to throughout the day.

Yet, the Clean Air Act regulates outdoor air pollution, not personal or indoor exposures. Therefore, a study demonstrating that exposure to a certain amount of outdoor air pollution is associated with a specific health outcome provides meaningful information that could inform updates to air pollution standards.

Closing thoughts

Yes, I’m the first to admit that our methods to assess exposure to air pollution are imperfect. But, they allow us to efficiently quantify exposure for large segments of the population. (We really do need these data on large populations to be able to see evidence of the relatively small increased risks for various health effects.) And, while these exposure estimates might not capture the whole story, they are close enough to allow us to consistently see that air pollution is associated with premature mortality, cardiovascular disease, diabetes, asthma….and more.

In the future, I think we will see exciting refinements to these approaches, such as incorporating infiltration rates (to describe how outdoor pollutants enter indoor environments) or more personal behavior information. We might also be able to assess the specific type of particle (i.e.: metal vs. carbon-based) rather than just evaluating by size (as is done now, in distinguishing between PM2.5 and PM10); this is important because different particle types may be associated with different health effects. These additional details will increase the accuracy of our exposure assessment methods.

One final note to try to assuage any remaining doubts about epidemiological studies based on these methods. Take the case of PM2.5 and mortality, for example…. Sure, we might not trust one-off results from a study using a single exposure assessment method on a single population. But science moves forward incrementally, based on the results of many studies rather than a single finding. When we see similar results from many studies, even across different populations and based on different exposure assessment methods, we can have strong confidence that PM2.5 is associated with increased risk of death (for example).

In this way, I think that air pollution epidemiology is strengthened by the use of these various methods, each with their own pros and cons. It’s certainly good to be skeptical, though. I think our work (like all science) could benefit from some healthy questioning of the accepted approaches, which could prompt us to improve our methods.

Concerning Glyphosate

Apologies for the long blog absence. I’ve been busy PhD-ing (including preparing for and passing my oral general exam!) and working on various side projects.

One of those side projects has been focused on glyphosate. Glyphosate, the active ingredient in Monsanto’s (now owned by Bayer) Roundup, is the most widely used herbicide in the world. It was first marketed in 1974, and its usage skyrocketed after the introduction of “Roundup-ready” (i.e.: Roundup-resistant) crops in 1996 and the practice of “green burndown” (i.e.: using the chemical as a desiccant shortly before harvest) in the mid-2000s. In 2014, global usage was estimated to be 1.8 billion pounds.

But these staggering statistics are not the only claim to fame for glyphosate. It has also been the subject of intense international regulatory and scientific scrutiny in recent years, for its possible link to cancer. The stakes are high (billions of dollars for Monsanto, related to sales of both the herbicide itself and its line of herbicide-resistant crops), and the conclusions are controversial.

Carcinogenic or not, that is the question.

In 2015, the International Agency for Research on Cancer (IARC) declared that glyphosate was a “probable human carcinogen” (relevant links: explanation of IARC classifications; official summary for glyphosate; IARC webpage with follow-up links). However, that same year, the European Food Safety Authority (EFSA) concluded that “glyphosate is unlikely to pose a carcinogenic hazard to humans, and the evidence does not support classification with regard to its carcinogenic potential.” In 2016, the US Environmental Protection Agency (EPA) determined that glyphosate was “not likely to be carcinogenic to humans at doses relevant for human health risk assessment.”

Ok, so that’s confusing. How did these agencies, all of which are supposed to conduct unbiased reviews of all of the evidence, come to such different conclusions? There have been several recent publications that explain these inconsistencies (for example, see here and here). In essence, it boils down to: 1) differences in how the agencies weighed peer-reviewed, publicly available studies (most show adverse health effects) versus unpublished regulatory studies submitted by manufacturers (most do not show adverse health effects); 2) whether the agencies focused on studies of pure glyphosate or the final formulated glyphosate-based product that is used in agricultural applications (which is known to be more toxic); and 3) whether the agencies considered dietary exposures to the general population only or also took into account elevated exposures in occupational scenarios (i.e. individuals who apply glyphosate-based herbicides in agricultural settings).

Meanwhile, as the debate continues… 27 countries (as of November 2018) have decided to move forward with implementing their own bans or restrictions. And, Monsanto/Bayer faces more than 9,000 lawsuits in the US from individuals who link their cancer to the herbicide. (The courts ruled the first case in favor of the plaintiff, though Monsanto is appealing the decision).

My connection

This highly contentious area is outside the topic of my dissertation research, but I got involved because my advisor was a member of the EPA scientific advisory panel that reviewed the agency’s draft assessment of glyphosate in 2016. The panel’s final report raised a number of concerns with EPA’s process and conclusions, including that the agency did not follow its own cancer guidelines and made some inappropriate statistical decisions in the analysis.

Because of their dissatisfaction with EPA’s report, my advisor and two other panel members decided to pursue related research to dig further into the issues. I enthusiastically accepted the invitation to join.   

Our collaborative group recently published two review papers on glyphosate. I’ll provide brief highlights of both below.

Reviewing our reviews, part 1: exposure to glyphosate  

In January 2019, we published a review of the evidence of worldwide exposure to glyphosate. Even though glyphosate-based products are the most heavily used herbicides in the world, we were surprised (and dismayed) to find fewer than twenty published studies, documenting exposure in only 3,721 individuals.

So, our paper mostly serves to highlight the limitations of the existing data:

  • These studies sampled small numbers of individuals from certain geographic regions, mostly in the US and Europe, and therefore are not representative of the full scope of global exposures
  • Most studies relied on a single urine spot sample, which does not represent exposure over the long term and/or in different agricultural seasons
  • The occupational studies covered only 403 workers in total, a serious deficiency given glyphosate’s widespread agricultural use. Few assessed exposure before and after spraying, and no studies evaluated patterns related to seasonality, crop use, etc.
  • Only two small studies evaluated how population exposure has changed over time. So, we definitely don’t know enough about whether the dramatic increases in global usage have resulted in similarly dramatic increased concentrations in our bodies. (Presumably, yes).  

In addition to highlighting the need to address the points above, we specifically recommended incorporating glyphosate into the National Health and Nutrition Examination Survey (NHANES), a national survey that monitors exposure to many chemicals – including other common pesticides. This is an obvious and fairly straightforward suggestion; in reality, it’s quite bizarre that it has not already been incorporated into NHANES. Testing for glyphosate would allow us to better understand exposure across the US – which is not reflective of global levels, of course, but an important start.

Reviewing our reviews, part 2: glyphosate & non-Hodgkin Lymphoma (NHL)  

Our second paper, published earlier this week, was a meta-analysis of the link between glyphosate exposure and non-Hodgkin Lymphoma (NHL). Yes, diving right in to the controversy.

There had already been several prior meta-analyses that showed an association between glyphosate and NHL, but ours incorporates new research and applies a method that would be more sensitive to detecting an association.

A meta-analysis combines results from separate studies to better understand the overall association. While they technically do not generate any “new” data, meta-analyses are essential in the field of public health. A single study may have certain weaknesses, focus only on selected populations, or reflect a chance finding. In drawing conclusions about hazards (especially in this scenario, affecting millions of people and billions of dollars), we want to look across the collection of data from many studies so we can be confident in our assessment.
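For intuition, here’s a minimal fixed-effect (inverse-variance) pooling sketch in Python. The three studies’ risk ratios and confidence intervals below are invented for illustration, not the actual glyphosate data, and a real meta-analysis would typically also consider a random-effects model and check heterogeneity.

```python
import math

def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of study risk ratios.
    `estimates` is a list of (risk_ratio, lower_95, upper_95) tuples;
    we work on the log scale, weighting each study by 1/variance."""
    log_rrs, weights = [], []
    for rr, lo, hi in estimates:
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # back out the SE from the CI
        log_rrs.append(math.log(rr))
        weights.append(1 / se ** 2)
    pooled_log = sum(w * b for w, b in zip(weights, log_rrs)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return math.exp(pooled_log), (math.exp(pooled_log - 1.96 * pooled_se),
                                  math.exp(pooled_log + 1.96 * pooled_se))

# Hypothetical risk ratios (with 95% CIs) from three studies
studies = [(1.3, 0.9, 1.9), (1.6, 1.1, 2.3), (1.2, 0.8, 1.8)]
pooled_rr, pooled_ci = pool_fixed_effect(studies)
```

Notice how the pooled confidence interval is narrower than any single study’s: combining studies buys precision, which is exactly why a pooled estimate can be conclusive when individual studies are not.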

We were able to include a newly published follow-up study of over 54,000 licensed pesticide applicators (part of the Agricultural Health Study (AHS)). Compared to an earlier paper of the same cohort, this updated AHS study reports on data for an additional 11-12 years. This extension is important to consider, given that cancer develops over a long period of time, and shorter studies may not have followed individuals long enough for the disease to arise.

We conducted this meta-analysis with a specific and somewhat unusual approach. We decided to focus on the highly exposed groups in order to most directly address the question of carcinogenicity. In other words, we would expect the dangers (or, proof of safety: is it safe enough to drink?) to be most obvious in those who are highly exposed. Combining people who have low exposure with those who have high exposure would dilute the association. IMPORTANT NOTE: this approach of picking out the high exposure groups is only appropriate because we are simply looking for the presence or absence of a link. If you were interested in the specific dose-response relationship (i.e.: how a certain level of exposure relates to a certain level of hazard), this would not be ok.

Our results indicate that individuals who are highly exposed to glyphosate have an increased risk of NHL, compared to the control/comparison groups. This finding itself is not entirely earth-shattering: the results from prior meta-analyses were similar. But, it adds more support to the carcinogenic classification.

More specifically, we report a 41% increased risk. For comparison, the average lifetime risk of NHL is about 2%. However, I want to emphasize that because our analytical method prioritized the high-exposure groups, the precise numerical estimate is less important than the significant positive association. Basically, the purpose of this and other related assessments (like IARC’s) is to understand whether glyphosate is carcinogenic or not: this is a yes/no question. It is up to regulatory agencies to judge the scale of this effect and decide how to act on this information.
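As a back-of-the-envelope translation of those two numbers (treating the ~2% lifetime baseline as if it applied directly to the highly exposed group, which is a simplification):

```python
baseline_lifetime_risk = 0.02  # average lifetime risk of NHL (~2%)
relative_risk = 1.41           # the 41% relative increase in the highly exposed

# Under this simplification, the absolute lifetime risk for the highly
# exposed group, and the excess risk over baseline:
absolute_risk = baseline_lifetime_risk * relative_risk       # about 2.8%
risk_difference = absolute_risk - baseline_lifetime_risk     # about 0.8 percentage points
```

This is why a large-sounding relative risk can correspond to a modest absolute change, and why, as noted above, the direction of the association matters more here than the exact magnitude.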

As with any scientific project, there are several limitations. In particular, we combined estimates from studies that differed in important ways, including their design (cohort vs. case-control), how they controlled for confounding by exposure to other pesticides, and which reference group they chose for the comparison (unexposed vs. lowest exposed). When studies are very different, we need to be cautious about combining them. This is another reason to focus more on the direction of the effect rather than the exact numerical estimate.  

Beyond the headlines

The news coverage of this work has focused on the overarching results (especially the 41% statistic), as expected. But I want to highlight a few other aspects that have been overlooked.

To better understand the timing of these studies in relation to glyphosate usage, we put together a timeline of market milestones and epidemiological study events.


This took me SO MANY HOURS.

Of note is that all of the studies conducted to date evaluated cancers that developed prior to 2012-2013, at the latest. Most were much earlier (80s, 90s, early 00s). As illustrated in the timeline, we’ve seen a huge increase in glyphosate usage since green burndown started in the mid-2000s. Yet none of these studies would have captured the effects of these more recent exposures, which should be easier to detect in newer studies if/when they are conducted.

Also, as I mentioned above, we included the newly published AHS cohort study in our meta-analysis. One might expect the old and new AHS studies to be directly comparable, given that they were conducted by the same research group. However, our deep dive into both papers elucidated important differences; consequently, they are not directly comparable (see Table 8 of our paper). An in-depth discussion of these issues (and some of their potential implications) is a topic for a separate post, but there’s a clear lesson here about how important it is to carefully understand study design and exposure assessment methods when interpreting results.

Finally, two brief points on the animal toxicology studies, which we also reviewed in our paper because they provide complementary evidence for assessing hazard in humans. We discuss these data but did not conduct a formal pooled analysis (to combine results from separate but similarly designed animal studies), which would allow us to better understand overarching results from the animal studies. Anyone ready for a project?  

Additionally, in future animal toxicology studies, researchers should use the formulated glyphosate product that is actually used around the world rather than the pure glyphosate chemical that has been the focus of prior testing. There is growing evidence to suggest that the final formulated product is more toxic, perhaps due to the added adjuvants and surfactants. And this would allow for better comparisons to the human epidemiological studies, which assess effects of exposure to the formulated product.

Reflecting on the process

I had followed the evolving story on glyphosate with great interest for several years, so it was exciting to be part of these projects. Contributing to research with a real-world public health impact has always been a priority for me, and this high-profile research (affecting millions of people, billions of dollars) certainly fits the bill.

That being said, it was not an easy process. These two papers represent years of work by our group, which we did on top of our regular commitments. Collaborating with three researchers whom I had never met also proved challenging, since we did not have established rapport or an understanding of each other’s work and communication styles. So, in addition to gaining skills in conducting literature reviews and meta-analyses, I learned valuable lessons in group dynamics. 🙂

Given the high-stakes and high-profile nature of this work, we were extra meticulous about the details of this project. We knew that it would be scrutinized carefully, and any error could damage our credibility (especially worrisome for me, since I’m just establishing myself in my career). It took many, many rounds of review and editing to get everything right. A good lesson in patience.

Speaking of patience, I know that scientific research and related policy decisions take time. But I hope that these two projects can contribute to moving forward in a direction that protects public health.


Breastfeeding in the Age of Chemicals

It’s a catch-22 that would drive any new mother crazy.

Should she breastfeed, which is linked to many lasting health benefits for the newborn child, but take the risk of delivering toxic chemicals, such as dioxins and DDT, that are stored in her breast milk?

Or, should she use infant formula, which avoids the problem of breast milk contaminants but does not offer the same benefits to her newborn and may also contain toxic chemicals (because of lax food safety regulations or if contaminated water is used to reconstitute the formula, for example)?

Last month, two papers (from the same group of collaborators) published in Environmental Health Perspectives attempted to address these issues by reviewing decades of relevant research. These papers are both quite extensive and represent impressive work by the authors – but it’s unlikely that non-scientists will wade through the details. So, I’ll do my best to help you out.

Breast milk vs. infant formula: What chemicals are in each?

The first paper starts by documenting all of the chemicals detected in either breast milk or infant formula, based on studies published between 2000 and 2014 (mostly in the United States). Below is a highly simplified table, with just the chemicals rather than other details (refer to the paper if you’re interested in more).

Abbreviated list of chemicals detected in breast milk and infant formula in studies of women in the United States between 2000-2014. Adapted from Lehmann et al, 2018.
*No data from US studies, so information taken from international studies


What can we learn from these data, other than that it looks like complicated alphabet soup?

Well, toxic chemicals have been detected in both breast milk and infant formula, but the types of chemicals found in each differ. Breast milk is more likely to contain lipophilic (fat-loving, stored in fat) and long-lasting chemicals, such as dioxins and certain pesticides. Meanwhile, both breast milk and formula contain some common short-lived chemicals, such as bisphenol-A (BPA) and parabens.

While the paper also provides information about the average and range of concentrations of chemicals in each medium (and how they compare to acceptable levels of exposure for infants), it’s hard to draw general conclusions because there are such limited data available. It is complicated, expensive and invasive to get samples of breast milk across wide segments of the population, and relatively few studies have looked at chemicals found in infant formula. We need more information before we can accurately understand the patterns of exposure across the population.

Nevertheless, the presence of all of these chemicals seems concerning. No one wants to deliver toxic milk to children during their early months of life, when they are more vulnerable because their organ systems and defense mechanisms are still developing.

But, what do the data indicate about the health consequences of these exposures?

Early dietary exposures and child health outcomes

That’s where the second paper comes in. Here, the same group of authors reviewed the literature on the association between chemicals in breast milk and adverse health outcomes in children. (Note: they had planned to ask the same question for infant formula, but there were not enough published studies). They looked at many chemicals (such as dioxins, PCBs, organochlorine pesticides, PBDEs) and many outcomes (including neurological development, growth & maturation, immune system, respiratory illness, infection, thyroid hormone levels).

Early studies in the field had indeed suggested cause for concern. For example, infants in Germany fed breast milk contaminated with high levels of PCBs were found to have neurodevelopmental deficits in early life. However, levels of PCBs in the general population have declined in recent years (because of worldwide bans), and subsequent studies in the same region found that these lower levels of PCBs were not associated with harmful neurodevelopmental effects.

Overall, when looking across various chemicals and health outcomes, the current literature is actually… inconclusive. Many studies reported no associations, and studies asking similar questions often reported conflicting results. Furthermore, studies that reported significant effects often evaluated health outcomes at only one or two periods in early life, and we don’t know if those changes really persist over time.

A glass half full…of challenges

In the end, the authors were left with more questions than answers – and a long list of challenges that prevent us from understanding the effects of breast milk-related chemical exposures on children’s health. For example:

  • Chemicals in breast milk are often also present in the mother during pregnancy. How can we disentangle the effects of exposures during the prenatal period from exposures due only to breast milk in early postnatal life?
  • Many of these studies represent a classic case of “looking for your keys under the lamppost.” We can only study chemicals and outcomes that we choose to focus on, so we could be missing other important associations that exist.
  • On a related note, most studies focused on exposure to only one or a small group of chemicals, rather than the real-world scenario of the complex mixtures in breast milk.
  • There was little study replication (i.e., more than one study asking the same question). Generally, we feel more confident drawing conclusions based on a larger pool of studies.
  • The few studies that did ask the same questions often used different experimental designs. These distinctions also pose challenges for interpretation, since differences in how researchers measure exposures and outcomes could affect their results.
  • Most studies evaluated levels of chemicals in breast milk using one or two samples only. How accurate are these exposure assessments, given that levels in the milk may change over time?
  • Measuring chemicals in breast milk is just one aspect of exposure, but it doesn’t tell us how much the infant actually received. Mothers breastfeed for different amounts of time, which affects how much is delivered to the infant. These person-to-person differences within a study could make it challenging to see clear results in an analysis.
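The last point can be made concrete with a back-of-the-envelope calculation. The sketch below is purely illustrative – the concentration, milk volume, and durations are assumptions I chose for the example, not values from the paper:

```python
# Hypothetical estimate of an infant's cumulative chemical intake from
# breast milk. All numeric values are illustrative assumptions.

def cumulative_intake_ug(
    milk_concentration_ug_per_l: float,  # measured chemical level in milk (µg/L)
    daily_milk_volume_l: float,          # milk consumed per day (L)
    breastfeeding_days: int,             # duration of breastfeeding (days)
) -> float:
    """Cumulative dose (µg) = concentration x daily volume x days."""
    return milk_concentration_ug_per_l * daily_milk_volume_l * breastfeeding_days

# Two infants whose mothers' milk has the same measured concentration
# (0.5 µg/L) receive very different cumulative doses if the duration
# of breastfeeding differs:
short_duration = cumulative_intake_ug(0.5, 0.8, 90)   # ~3 months: ≈36 µg
long_duration = cumulative_intake_ug(0.5, 0.8, 365)   # ~12 months: ≈146 µg
```

The point: a one-time milk measurement captures concentration, but the delivered dose also depends on duration and volume of feeding, which vary from person to person within a study.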

Filling in the gaps

Perhaps the only certain conclusion from these publications is that much work remains. Not only do we need more studies that document the levels of chemicals in breast milk and infant formula (as the first paper highlighted), but we also need more data on the links between these exposures and health outcomes – including targeted research to address the challenges and key gaps noted above.

Importantly, because breastfeeding is associated with many key health benefits (such as improved neurodevelopment and reduced risk of obesity, diabetes, infections, and more), any study that looks at the impact of chemical exposures in breast milk should also ask a similar question in a comparison group of formula-fed infants. It is likely that the positive effects of breast milk far outweigh any potential negative impacts from the chemicals in the milk, and that the infants would actually be worse off if they were fed formula that had the same level of chemicals (but did not receive the benefits of breast milk).

I’ll be the first to admit: it is scary to think about all of these chemicals in breast milk. But, all decisions have trade-offs, and here, when weighing the risks and benefits, the balance still seems to favor breastfeeding in most situations.

A Decade into the “Vision,” Environmental Health gets a Progress Report

This year represents an important 10-year milestone for science and society.

No, I’m not referring to the 10th anniversary of the Apple iPhone, though that has undoubtedly changed all of our lives. Rather, 2017 marks ten years since the National Academy of Sciences (NAS) released its seminal report, Toxicity Testing in the 21st Century: A Vision and a Strategy.

In that report, the NAS laid out a vision for a new approach to toxicology that incorporates emerging cell-based testing techniques, rather than costly and time-intensive whole animal models, and utilizes early biological pathway perturbations as indications of adverse events, rather than relying on evaluations of end disease states. Tox21 and ToxCast, two federal programs focused on using alternative assays to predict adverse effects in humans, were initiated as first steps in this strategy. In the years since its release, the report has profoundly shaped the direction of environmental health sciences, particularly toxicology. (An analogous exposure sciences report, Exposure Science in the 21st Century: A Vision and a Strategy, was published in 2012.)

Now, one decade later, the NAS has reviewed progress on these efforts in its recently released report, Using 21st Century Science to Improve Risk-Related Evaluations.

How are we doing, and what are next steps?

Overall, the committee supports efforts to use data from new tools, such as biological pathway evaluations, in risk assessment and decision-making. (Of course, limitations should be clearly communicated, and tools should be validated for their specific purposes.) Several case studies are described as examples of situations where emerging tools can be useful, such as quickly prioritizing chemicals of concern or evaluating risks from chemical mixtures at a contaminated site.

This report also documents advancements and challenges for each of the three interconnected fields of environmental health sciences: toxicology, exposure science, and epidemiology. I’ve summarized some of these key points in the chart below, and additional (digestible) information is available in the NAS report summary.



Toxicology

Key Challenges

  • Incorporate metabolic capacity in in vitro assays
  • Understand applicability & limitations of in vitro assays
  • Improve biological coverage
  • Address human variability & diversity in response

Exposure Science

Recent Advancements

  • Coordination of exposure science data (ex: databases)
  • Integration of exposure data of multiple chemicals obtained through varied methods

Key Challenges

  • Improved data management & data sharing
  • Improved methods for estimation of exposures
I won’t go into detail on all of these points, but I do want to highlight some of the key challenges that the field of toxicology will need to continue to address in the coming years, such as:

  • Improving metabolic capacity of in vitro assays: Cell-based assays hold promise for predicting biological responses of whole animals, but it is critical to remember that these new tools rarely reflect human metabolic capacity. For example, if a chemical is activated or detoxified by an enzyme in our bodies, reductionist assays would not adequately reflect these changes – and thus their prediction would not be fully relevant to human health. We need continued work to incorporate metabolic capacity into such assays.
  • Improving biological coverage: An analogy that I’ve often heard in relation to the limitations of these new tools is that they are only “looking under the biological lamp post.” Essentially, we can only detect effects that the assays are designed to evaluate. So, we need further development of assays that capture the wide array of possible adverse outcomes. And we cannot assume that there is no hazard for endpoints that have not been evaluated.

New models of disease causation

Not only is the environmental health science ‘toolkit’ changing, but so is our understanding of disease causation. As discussed in the report, 21st century risk assessment must acknowledge that disease is “multifactorial” (multiple different exposures can contribute to a single disease) and “nonspecific” (a single exposure can lead to multiple different adverse outcomes). This advanced understanding of causality will pose challenges for interpreting data and making decisions about risk, and we will need to incorporate new practices and methods to address these complexities.

For example, we can no longer investigate only whether a certain exposure, triggering a certain pathway, causes disease in isolation; we must also ask whether it may increase the risk of disease when combined with other potential exposures. It gets even more complicated when we consider that individuals may respond to the same exposures in different ways, based on their genetics or pre-existing medical conditions.

The Academy suggests borrowing a tool from epidemiology to aid in these efforts. The sufficient-component-cause model provides a framework for thinking about a collection of events or exposures that, together, could lead to an outcome.


Sufficient-component-cause model. Three disease mechanisms (I, II, III), each with different component causes. Image from NAS Report, Using 21st Century Science to Improve Risk-Related Evaluations


Briefly, each disease has multiple component causes that fit together to complete the causal pie. These components may be necessary (present in every disease pie) or sufficient (able to cause disease alone), and different combinations of component causes can produce the same disease. Using this model may promote a transition away from a focus on finding a single pathway of disease to a broadened evaluation of causation that better incorporates the complexities of reality. (I’ve blogged previously about the pitfalls of a tunnel-vision, single pathway approach in relation to cancer causation.)
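The causal-pie logic above lends itself to a small illustration. In the sketch below, the mechanisms and component names are hypothetical (not taken from the NAS report); each pie is modeled as a set of component causes:

```python
# Illustrative sketch of the sufficient-component-cause ("causal pie") model.
# The mechanisms and component names below are hypothetical examples.

# Each sufficient cause (pie) is the set of component causes that,
# together, are enough to produce the disease.
PIES = [
    {"genetic_susceptibility", "exposure_A", "exposure_B"},  # mechanism I
    {"genetic_susceptibility", "exposure_A", "infection"},   # mechanism II
    {"genetic_susceptibility", "exposure_C"},                # mechanism III
]

def disease_occurs(components: set) -> bool:
    """Disease occurs when any pie is completely filled."""
    return any(pie <= components for pie in PIES)

def necessary_causes() -> set:
    """A necessary cause appears in every sufficient cause (every pie)."""
    return set.intersection(*PIES)

# Exposure A alone does not complete any pie...
assert not disease_occurs({"exposure_A"})
# ...but combined with the other components of mechanism I, it does.
assert disease_occurs({"genetic_susceptibility", "exposure_A", "exposure_B"})
# In this toy example, genetic susceptibility sits in every pie,
# so it is a necessary component cause.
assert necessary_causes() == {"genetic_susceptibility"}
```

This captures the shift the report describes: rather than asking whether exposure A "causes" disease on its own, we ask which combinations of components it can complete.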

Integration of information, and the importance of interdisciplinary training

As the fields of toxicology, exposure science, and epidemiology continue to contribute data towards this updated causal framework, a related challenge will be the integration of these diverse data streams for risk assessment and decision-making. How should we weigh different types of data in drawing conclusions about causation and risk? For example, what if the in vitro toxicology studies provide results that are different than the epidemiology studies?

The committee notes that we will need to rely on “expert judgment” in this process, at least in the short term until standardized methods are developed. And they discuss the need for more interaction between individuals from different disciplines, so that knowledge can be shared and applied towards making these difficult decisions.

One issue that was not discussed, however, is the importance of training the next generation of scientists to address these complex challenges. Given the inevitable need to integrate multiple sources of data, I believe it is critical that the students in these fields (like me!) receive crosscutting training as well as early practice with examples of these multi-faceted assessments. Some programs offer more opportunities in this area than others, but this should be a priority for all departments in the coming years. Otherwise, how can we be prepared to step up to the challenges of 21st century environmental health sciences?

Looking forward

Speaking of challenges, we certainly have our next decade of work cut out for us. It is amazing to think about how much progress we have made over the last ten years to develop new technologies, particularly in toxicology and exposure sciences. Now we must: refine and enhance these methods so they provide more accurate information about hazard and exposure; address the complexities of multifactorial disease causation and inter-individual susceptibility; and work across disciplines to make decisions that are more protective of public health and the environment.