Center for Interdisciplinary Research
in Environmental Exposures and Health

Workshop on ecologic inference: 28-30 November 2007, DIMACS, Rutgers University
The workshop will be sponsored through the DIMACS Special Focus on Computational and Mathematical Epidemiology. The focus will be on study designs combining individual and group level data.

Ecologic Bias
Individual-level studies collect information on exposure, outcome and covariates for each individual; purely ecologic studies collect group-level (aggregate) information for these variables. Ecologic bias can occur when aggregate data are used to make inferences about individuals. Many of the important features of ecologic bias can be seen with very simple examples, e.g., 2x2 tables using the risk difference (RD) as the effect measure.

Loss of information: Suppose we are investigating a group of people with dichotomous exposures and outcomes. Individual-level information on exposure and outcome is shown by the interior of a two-by-two table; it is summarized by the risks in the exposed and unexposed and the risk difference (RD). Ecologic studies possess only the margins of the table, summarized by the average exposure X, average risk Y and group size n. The goal of most ecologic studies is to make inferences about individuals based on ecologic data, i.e., to use the margins to deduce the interior of the table. Put another way, we'd like to use X, Y, n to deduce the RD. Unfortunately, many different interior cell contents are compatible with the same margins. This loss of information is the fundamental problem facing ecologic studies.

Table 1. Individual vs. ecologic data
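
To make the loss of information concrete, the following minimal Python sketch enumerates every interior of a 2x2 table that is consistent with a fixed set of margins. The margins used (n = 100, 50 exposed, 30 cases) are illustrative assumptions, not the entries of Table 1; they are chosen so that one compatible interior has risks of 0.2 and 0.4. The function name is hypothetical.

```python
def compatible_risk_differences(n, n_exposed, n_cases):
    """Map each feasible count of exposed cases to the RD it implies.

    Assumes 0 < n_exposed < n so that both risks are defined.
    """
    n_unexposed = n - n_exposed
    # The number of exposed cases is bounded by the margins.
    lowest = max(0, n_cases - n_unexposed)
    highest = min(n_cases, n_exposed)
    results = {}
    for exposed_cases in range(lowest, highest + 1):
        unexposed_cases = n_cases - exposed_cases
        risk_exposed = exposed_cases / n_exposed
        risk_unexposed = unexposed_cases / n_unexposed
        results[exposed_cases] = risk_exposed - risk_unexposed
    return results


# Assumed margins for illustration only (not taken from Table 1).
rds = compatible_risk_differences(n=100, n_exposed=50, n_cases=30)
print(len(rds), "interiors share these margins")
print("RD ranges from", min(rds.values()), "to", max(rds.values()))
```

Under these assumed margins, the interior with 20 exposed cases gives an RD of about 0.2, but interiors with RDs of opposite sign fit the same margins equally well.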

Risk diagrams: Risk diagrams provide a convenient way to summarize the individual and ecologic data for a group. We summarize individual-level information for a group with a solid black line, ecologic data with a solid black dot. Figure 1 diagrams the data in Table 1. The line connects the risk in the unexposed (q=0.2 at individual-level exposure x=0) with the risk in the exposed (0.4 at x=1) and has slope equal to the risk difference b. The ecologic data are the average exposure X and average risk Y for the group.

Figure 1. Risk diagram illustrating Table 1
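
The relationship between the line and the dot in a risk diagram can be sketched in a few lines of Python. The average risk is a mixture of the two individual-level risks, Y = (1 - X)q + X(q + b) = q + bX, so the ecologic point always lies on the individual-level line. The values q = 0.2 and b = 0.2 follow the text; the average exposure X = 0.5 is an assumption made for illustration, and the function name is hypothetical.

```python
def ecologic_point(q, b, X):
    """Return (X, Y) for a group with background risk q, risk difference b
    and average exposure X (the proportion exposed)."""
    risk_unexposed = q
    risk_exposed = q + b
    # Average risk is the exposure-weighted mixture of the two risks.
    Y = (1 - X) * risk_unexposed + X * risk_exposed
    return X, Y


# q and b from the text; X = 0.5 is an assumed value.
X, Y = ecologic_point(q=0.2, b=0.2, X=0.5)
print(X, round(Y, 3))
```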

Ecologic bias due to confounding by group: Given that many sets of individual-level data (interior cells of the 2x2 table) are compatible with the same ecologic data (margins of the 2x2 table), how can ecologic inference proceed? One typically regresses the ecologic data (X, Y) from several groups. This can produce unbiased estimates of the RD under certain conditions, but serious error can result if the assumptions are wrong.

For example, assume as in Table 2 that two groups have different background risks (risks in the unexposed) and different exposure distributions. The crude individual-level table (ignoring groups) is thus confounded: the crude RD=0.28 instead of 0.2. The ecologic estimate of the RD is given by regressing Y against X, i.e., the slope of the line through the two ecologic data points, which here is (Y1-Y0)/(X1-X0) = 2.2. The severe bias that occurred here is caused by confounding by group.

Table 2. Confounding by group
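
The arithmetic behind this example can be sketched in Python. The group sizes, average exposures and background risks below are hypothetical values chosen to be consistent with the figures quoted in the text (true RD 0.2, crude RD 0.28, ecologic slope 2.2); they are not necessarily the entries of Table 2. Function names are illustrative.

```python
# Hypothetical inputs, not taken from Table 2: two equal-sized groups with
# different exposure prevalence X and different background risk q.
b = 0.2  # common within-group risk difference
groups = [
    {"n": 1000, "X": 0.4, "q": 0.1},
    {"n": 1000, "X": 0.6, "q": 0.5},
]


def crude_risk_difference(groups, b):
    """RD from the pooled individual-level table, ignoring group."""
    exposed = sum(g["n"] * g["X"] for g in groups)
    unexposed = sum(g["n"] * (1 - g["X"]) for g in groups)
    exposed_cases = sum(g["n"] * g["X"] * (g["q"] + b) for g in groups)
    unexposed_cases = sum(g["n"] * (1 - g["X"]) * g["q"] for g in groups)
    return exposed_cases / exposed - unexposed_cases / unexposed


def ecologic_slope(groups, b):
    """Slope of the line through the two ecologic points (X, Y)."""
    (X0, Y0), (X1, Y1) = [(g["X"], g["q"] + b * g["X"]) for g in groups]
    return (Y1 - Y0) / (X1 - X0)


crude = crude_risk_difference(groups, b)
ecologic = ecologic_slope(groups, b)
print(round(crude, 2), round(ecologic, 2))
```

With these assumed inputs, the two estimates come out, up to rounding, at the 0.28 and 2.2 quoted above, although the risk difference within each group is 0.2.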

The bias in this example is illustrated in Figure 2. A) Individual level: The solid black lines describing the individual-level information in the two groups are parallel (same risk difference b) but have different intercepts (different background risks, q0 ≠ q1). The crude estimate of the risk difference, b_c, is confounded (blue line). B) Group level: The ecologic estimate of the risk difference, b_e, is the slope of the red line through the two ecologic data points. Massive confounding has occurred, but we can’t tell this from the ecologic data alone. C) Comparison of results on the two levels: The ecologic estimate of the risk difference b_e is much more biased than the crude individual-level estimate b_c. Both biases are in the same direction.

Figure 2. Confounding by group, illustrating Table 2.

Note that the bias is in the same direction on both levels but much larger on the ecologic level. This difference is due to bias magnification, caused by the reduction in variance of the exposure when individual-level data are aggregated. To learn more about this phenomenon and other types of ecologic bias, read my recent paper: Webster TF. Bias magnification in ecologic studies: a methodological investigation. Environmental Health 2007; 6:17 (5 July 2007). The full text is freely available.
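
The role of the exposure variance can be illustrated with a short Python sketch. In a simple setting with a common risk difference and group-specific background risks, the crude bias is the between-group covariance of background risk and exposure divided by the total exposure variance, while the ecologic bias divides the same covariance by the (smaller) between-group variance. The input values are hypothetical, chosen to agree with the figures quoted in the text, and this is a sketch of the idea rather than the full treatment in the paper.

```python
# Hypothetical groups as (n, X, q): size, average exposure, background risk.
groups = [(1000, 0.4, 0.1), (1000, 0.6, 0.5)]

N = sum(n for n, _, _ in groups)
X_bar = sum(n * X for n, X, _ in groups) / N
q_bar = sum(n * q for n, _, q in groups) / N

# Variance of a binary exposure across all individuals.
var_total = X_bar * (1 - X_bar)
# Size-weighted variance of the group average exposures.
var_between = sum(n * (X - X_bar) ** 2 for n, X, _ in groups) / N
# Size-weighted covariance of background risk and average exposure.
cov_between = sum(n * (X - X_bar) * (q - q_bar) for n, X, q in groups) / N

crude_bias = cov_between / var_total
ecologic_bias = cov_between / var_between
magnification = var_total / var_between

print(round(crude_bias, 2), round(ecologic_bias, 2), round(magnification, 1))
```

With these assumed values the crude bias is about 0.08 (0.28 versus 0.2) and the ecologic bias about 2.0 (2.2 versus 0.2), so the bias is magnified by roughly the ratio of total to between-group exposure variance.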

Page last modified on August 14, 2007, at 09:11 PM