Acknowledgement: Thank you to my adviser and my lab group for their helpful feedback and discussion related to this post.
Even though I’m pursuing a PhD in environmental toxicology, my dissertation uses methods from environmental epidemiology to answer the question of whether long-term air pollution exposure (specifically, fine particulate matter (PM2.5)) is associated with dementia or dementia pathology in a cohort based in the Seattle area.
When I tell people about my project, they always ask: “how are you measuring exposure to air pollution?”
My answer: “Well, it’s complicated…”
Exposure assessment – “the process of estimating or measuring the magnitude, frequency, and duration of exposure” – is often considered the Achilles’ heel of environmental epidemiology. Maybe in the future, we’ll have advanced personal sensors that measure all possible chemicals in our bodies. But for now, environmental epidemiologists have to use a variety of imperfect methods.
Different pollutants require different methods of measurement (for example, some are measured internally, in urine or blood), but here I’ll focus on common methods to assess long-term exposure to ambient (outdoor) air pollution for use in cohort studies evaluating chronic health effects. Other methods can be used for time-series studies, which are focused on short-term exposures and acute health effects. All of this information exists in the academic literature, but as per the title of this post, my goal is to present a summary so that non-exposure science experts (like me!) can understand it.
A brief review of selected air pollution exposure assessment methods for cohort studies
(1) Monitoring & Simple Interpolation Models
TLDR: We can use information from ground-level monitors, or a weighted average from those monitors, as a basic way to estimate air pollution for a given population. This is easy and cheap, but overly simplistic.
The most consistent, long-term measurements of air pollution come from stationary monitors, such as the Environmental Protection Agency’s (EPA’s) Air Quality System (AQS). A highly simplistic (and inexpensive) approach, then, would be to represent each person’s exposure by the nearest monitor. But, there are several problems with this method. First, as you can see if you click on this map (and choose pollutant monitors under the “select layer” function), the existing network of monitors is quite sparse. Here’s a map of the three active PM2.5 monitors in Seattle:
I live in northwest Seattle (see the blue triangle on the map), so it is highly unlikely that those monitors in south Seattle adequately represent my long-term air pollution exposures – particularly for pollutants linked to specific local sources, like roadways or factories. Second, these monitors usually only take measurements every third or sixth day, so there are a lot of missing data over time. (However, long term averages in the region tend to be fairly stable from day-to-day, so this probably isn’t a huge problem for cohort studies looking at the effects of long-term exposures). And third, since we need some amount of population variability to estimate a meaningful association between an exposure and an outcome, assigning the same monitoring information to everyone in a certain area essentially eliminates (spatial) contrasts and decreases our chances of detecting any effects.
To overcome the lack of spatial coverage from federal regulatory monitors, researchers can set up their own monitors. But, developing a new monitoring network is expensive, and you can only get exposure data for the specific period of time that your monitors are set up (as opposed to being able to look at the cumulative effects of exposures over years in the past, which is what most cohort studies want to do).
We can improve on this with “interpolation,” which is a basic model that assigns more weight (influence) to the monitors that are closer to the individual (such as through inverse distance weighting or kriging). You can think about it like a weighted average. While this is an improvement, it often still does not really provide the type of detailed (spatial) resolution that we need.
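To make the "weighted average" idea concrete, here is a minimal sketch of inverse distance weighting. The monitor locations and concentrations are entirely made up for illustration:

```python
import numpy as np

def idw_estimate(monitor_xy, monitor_conc, target_xy, power=2):
    """Inverse-distance-weighted average of monitor concentrations.

    monitor_xy: (n, 2) array of monitor coordinates (e.g., in km)
    monitor_conc: (n,) array of long-term mean concentrations (ug/m^3)
    target_xy: (2,) coordinates of the location to estimate
    """
    d = np.linalg.norm(monitor_xy - target_xy, axis=1)
    if np.any(d == 0):                 # target sits exactly on a monitor
        return float(monitor_conc[np.argmin(d)])
    w = 1.0 / d**power                 # closer monitors get more influence
    return float(np.sum(w * monitor_conc) / np.sum(w))

# Three hypothetical monitors and one home location (coordinates in km)
monitors = np.array([[0.0, 0.0], [5.0, 1.0], [2.0, 8.0]])
conc = np.array([8.0, 10.0, 6.0])      # hypothetical PM2.5 long-term means
home = np.array([1.0, 1.0])
print(idw_estimate(monitors, conc, home))
```

The estimate always falls within the range of the monitor values, which is exactly why IDW can’t capture local hot spots (like a roadway between monitors) — the limitation described above.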
Overall, these are fairly basic approaches that were more commonly used in the early days of air pollution epidemiology (e.g., the seminal Six Cities study by Dockery et al.), especially when the research focus was on between-city rather than within-city differences in health outcomes. We can definitely do better now.
(2) Spatial Regression Models
TLDR: We can build regression models based on ground monitors, geography, and population features to predict air pollution at various locations. These models are commonly used in cohort studies, but each one varies in quality.
Land use regression (LUR) and related spatial regression models (e.g., universal kriging) use monitoring data as their foundation, but they also integrate information such as traffic density, population, land use, geographic features, and proximity to known emissions sources. All of these inputs are incorporated into a complex regression model to predict concentrations at sites where no monitoring data exist.
LUR model quality is judged through one or more validation processes that test how well the model predicts concentrations at sites where the true concentration is known. Researchers also look at how much of the total variability in the exposure the model can explain (the technical term for this is “R²”).
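As a toy illustration of that validation idea, here is a sketch of a (vastly simplified) land use regression, fit to synthetic "monitoring" data and evaluated with leave-one-out cross-validation. The predictors, coefficients, and data are all made up; real LUR models use many more covariates and more sophisticated validation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "monitoring" data: concentration driven by traffic density
# and distance to a major road (all values hypothetical).
n = 40
traffic = rng.uniform(0, 1, n)          # normalized traffic density
road_dist = rng.uniform(0, 5, n)        # km to nearest major road
conc = 10 + 4 * traffic - 1.2 * road_dist + rng.normal(0, 0.5, n)

X = np.column_stack([np.ones(n), traffic, road_dist])

# Leave-one-out cross-validation: fit on all monitors but one,
# predict at the held-out site, and compare to its "true" value.
preds = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    beta, *_ = np.linalg.lstsq(X[keep], conc[keep], rcond=None)
    preds[i] = X[i] @ beta

ss_res = np.sum((conc - preds) ** 2)
ss_tot = np.sum((conc - conc.mean()) ** 2)
cv_r2 = 1 - ss_res / ss_tot             # cross-validated R-squared
print(round(cv_r2, 3))
```

The cross-validated R² is the kind of number reported in the "methods" sections mentioned below: it tells you how well the model predicts at locations it wasn't trained on, which matters far more than how well it fits its own training monitors.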
While LUR models are better than relying on monitors alone, they are only as good as their inputs. So, if the model does not include some crucial feature that substantially affects air pollution concentration (like a local factory) or if the input data are not detailed enough, the final model will be inaccurate. There are also concerns with how measurement/prediction errors affect the health effect analyses, a complicated topic that deserves a post of its own (hopefully forthcoming!).
LUR is one of the most common methods used to estimate exposure in air pollution cohort studies. However, model quality varies greatly. And in reality, there’s no single number or assessment that can truly capture whether a model is “good enough;” it’s complicated and depends on the model inputs and methods, as well as the decisions that the modelers make along the way. In other words, you have to read the “methods” sections of the publications very carefully. (I know that is not very satisfying, especially since the quality of these models can affect how much we can trust the results of the associated health analysis…)
(3) Dispersion & Chemical Transport Models (DCTMs)
TLDR: We can use physical-chemical models to predict air pollution concentrations. These models don’t use ground-level data, which is good if you want to predict in places where those data don’t exist, but it’s a drawback in the broader sense that we generally trust ground-level measurements more than theoretical models alone.
Dispersion and chemical transport models (DCTMs) – such as box models, Gaussian plume models, Lagrangian models, and computational fluid dynamics models – use physical and chemical properties, meteorological data, and topography to model the dispersion and transport of pollutants. In contrast to LURs, these models do not take into account any ground monitoring data with actual pollution measurements.
As with LUR, these models need to be validated, and model performance seems to vary by pollutant. A strength of these models is that they can provide information in places where we don’t have any monitoring data. But, because they are often very expensive to develop, provide limited spatial detail, and are not anchored to any ground-level data (which we tend to trust more than pure models), they are less commonly used in epidemiologic cohort studies than LUR.
(4) Satellite Remote Sensing
TLDR: We have advanced satellite technology that can estimate air pollution from space, based on light scattering! Cool! But, right now, these models provide limited spatial resolution, so they are best when supplemented with ground-level data.
At a global scale, particularly when there are no ground-level monitors or detailed information on local emissions sources (such as in many low income countries), satellite measures of aerosol optical depth (AOD) can provide good estimates of certain air pollutants.
How does this work? A satellite measures the absorption and scattering of specific wavelengths of light at the same time in the same location each day, and these measures are then converted to ground level concentrations using chemical models such as GEOS-Chem.
Incredibly enough, these models can provide fairly accurate predictions of some ground-level pollutants, such as nitrogen dioxide and PM2.5 (but, modeling based on ground-level monitors is usually still more accurate). Remote sensing works less well for ozone, however, since the high levels of ozone in the stratosphere make it more complicated to accurately extrapolate to the earth’s surface. Other issues include potential interference from clouds, limited temporal characterization (usually just at a single time per day), and limited spatial detail (though spatial resolution continues to improve).
That last point – limited spatial detail – is the main downside of remote sensing right now. One way to mitigate this issue, though, is to incorporate data from LUR or ground level monitors, which can substantially improve accuracy.
Application to my work
I’ve skimmed over many details about each of these methods in the interest of making this post semi-accessible (but even so, I know it’s long!). There are also several other assessment methods as well as some interesting hybrid approaches. (For more details, I recommend Chapter 3 of EPA’s recent Integrated Science Assessment for Particulate Matter).
For my study, we’re relying on a sophisticated version of a LUR model with kriging that incorporates both spatial and temporal variability, similar to what has been done previously for other regions of the country. The inputs come from five federal agency monitoring sources and three different monitoring campaigns that have been implemented by the UW Dept. of Environmental and Occupational Health Sciences over the years, and – like all LURs – the final regression model incorporates different geographic features of the area that influence local pollutant levels. In the end, we will have a model that can estimate air pollution exposure at any location in the Puget Sound area all the way back to the early 1980s!
The elephant in the room (or, the person in the house)
TLDR: Why is it relevant and meaningful to use measures of outdoor air pollution in our health effect studies? 1) Outdoor air enters the home; 2) Measurement error likely underestimates health effects (so we are not exaggerating our results!); 3) Health effects based on outdoor air can influence policy.
All of the methods I’ve described so far provide exposure estimates based on a given geographic area or point. Some new, small studies use air pollution sensors to get measurements for each person, but that is too expensive for studies with thousands of people – and completely infeasible for those that were started years ago (like the one I’m using). So, instead, I will get estimates of outdoor air pollution based on each person’s residential address and then look at associations with dementia and related pathology.
There are several reasons why a measure of outdoor air pollution based on a home address might not accurately represent someone’s actual exposure. First, we spend most (apparently, 90%!) of our time inside. Second, most of us spend a majority of our waking hours away from our homes, including commuting (in traffic) and working. (Although, for my study of mostly retired individuals, this should be less of an issue). Additionally, from a toxicology perspective, it’s important to consider that a measure of external exposure is very different from internal dose; what actually gets inside your body depends on your breathing rate, among other factors.
So, how can we trust results from epidemiological studies that use these measures of outdoor air pollution from someone’s home address as the “exposure?” When I first began my research in this area, I was quite bothered by this question and talked extensively with my adviser. Here are a few responses and associated thoughts (some more satisfying than others, I admit):
- Outdoor pollutants infiltrate into our homes
While we like to believe that we are protected from outdoor air when inside, there is actually measurable and meaningful infiltration of outdoor pollutants into our homes. I experienced this firsthand during the wildfires in Seattle two summers ago, when I could smell smoke inside my room even though all the windows were closed.
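A common way to quantify this is a steady-state mass-balance model, in which the indoor concentration of outdoor-origin particles equals the outdoor concentration times an infiltration factor. Here is a minimal sketch; the parameter values below are illustrative, not measured:

```python
def infiltration_factor(P=0.8, a=0.5, k=0.2):
    """Steady-state infiltration factor F_inf = P*a / (a + k).

    P: penetration efficiency of the building envelope (unitless)
    a: air-exchange rate (1/hr)
    k: particle deposition/decay rate indoors (1/hr)
    All default values are hypothetical, for illustration only.
    """
    return P * a / (a + k)

outdoor_pm25 = 12.0  # hypothetical outdoor PM2.5, ug/m^3
indoor_of_outdoor_origin = infiltration_factor() * outdoor_pm25
print(round(indoor_of_outdoor_origin, 1))  # → 6.9 ug/m^3
```

With these (made-up) parameters, more than half of the outdoor particle concentration shows up indoors — consistent with the smoky-room experience above, and with why outdoor estimates remain informative even for time spent inside.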
- Measurement error likely underestimates pollution effects
When we measure pollution at people’s homes, we are not capturing the full picture of their exposures. For someone like me, who commutes by bike, a measure of outdoor air pollution at my home is likely an underestimate of my daily exposure.
This concept – when the measured/predicted exposure does not represent the actual exposure – is known by the technical term “measurement error.” As I mentioned above, this is a complicated topic that I plan to return to in a future post. For now, a highly simplistic summary is that in most cases, exposure measurement error either 1) attenuates (decreases) the observed association, or 2) has no impact on the observed association (but it could affect the margin of error). (See here and here, for example.) So, in this simplified framework, we assume that the consequence of using these imperfect estimates of ambient air pollution is that our results are underestimated.
[Note: to reiterate…this is very simplified. Research by my own adviser suggests that it could also have the opposite effect.]
[Another note: this simplified summary assumes that the measurement error is “non-differential” (similar for all groups). When measurement error is “differential” (differs between groups), the impact on the health effect estimates could be different than what I described. See, I told you this was complicated!]
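The attenuation described above (for classical, non-differential measurement error) is easy to demonstrate with a small simulation. All the numbers here are invented for illustration — a "true" exposure, a health outcome that depends on it, and a noisy measured version of the exposure:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical "true" long-term PM2.5 exposure and a continuous outcome
# that depends on it with a true slope of 0.3 (all units arbitrary).
true_exposure = rng.normal(10, 2, n)
outcome = 0.3 * true_exposure + rng.normal(0, 1, n)

# Classical, non-differential measurement error: the measured exposure
# is the true exposure plus random noise unrelated to the outcome.
measured_exposure = true_exposure + rng.normal(0, 2, n)

def slope(x, y):
    """Simple-linear-regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(slope(true_exposure, outcome))      # close to the true 0.3
print(slope(measured_exposure, outcome))  # attenuated toward zero
```

With equal variances for the true exposure and the error (both sd = 2), classical theory predicts the observed slope shrinks by a factor of 4/(4+4) = 0.5, i.e., to about 0.15 — an underestimated association, just as the simplified framework above says.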
- Effects linked to outdoor air pollution are policy-relevant
Some people consider personal monitors to be the ideal method for exposure assessment. However, because personal monitors measure everything a person is exposed to throughout the day, they also capture air pollutants from indoor sources.
Yet, the Clean Air Act regulates outdoor air pollution, not personal or indoor exposures. Therefore, a study demonstrating that exposure to a certain amount of outdoor air pollution is associated with a specific health outcome provides meaningful information that could inform updates to air pollution standards.
Yes, I’m the first to admit that our methods to assess exposure to air pollution are imperfect. But, they allow us to efficiently quantify exposure for large segments of the population. (We really do need these data on large populations to be able to see evidence of the relatively small increased risks for various health effects.) And, while these exposure estimates might not capture the whole story, they are close enough to allow us to consistently see that air pollution is associated with premature mortality, cardiovascular disease, diabetes, asthma, and more.
In the future, I think we will see exciting refinements to these approaches, such as incorporating infiltration rates (to describe how outdoor pollutants enter indoor environments) or more personal behavior information. We might also be able to assess the specific type of particle (i.e., metal vs. carbon-based) rather than just evaluating by size (as is done now, in distinguishing between PM2.5 and PM10); this is important because different particle types may be associated with different health effects. These additional details will increase the accuracy of our exposure assessment methods.
One final note to try to assuage any remaining doubts about epidemiological studies based on these methods. Take the case of PM2.5 and mortality, for example…. Sure, we might not trust one-off results from a study using a single exposure assessment method on a single population. But science moves forward incrementally, based on the results of many studies rather than a single finding. When we see similar results from many studies, even across different populations and based on different exposure assessment methods, we can have strong confidence that PM2.5 is associated with increased risk of death (for example).
In this way, I think that air pollution epidemiology is strengthened by the use of these various methods, each with their own pros and cons. It’s certainly good to be skeptical, though. I think our work (like all science) could benefit from some healthy questioning of the accepted approaches, which could prompt us to improve our methods.