Center for Interdisciplinary Research in Environmental Exposures and Health
GAM code and synthetic data for mapping
The following article describes the methods and analyzes a synthetic data set: Webster T, Vieira V, Weinberg J, Aschengrau A. Method for mapping population-based case-control studies using Generalized Additive Models. International Journal of Health Geographics 2006, 5:26 (9 June 2006). The full text is freely available online.
The synthetic data used in the paper and the code for analyzing them are available. This work is available for use under the General Public License.
R code (Note that R is freely available at The R Project for Statistical Computing)
Locations of cases (red) and controls (blue) are shown stratified by a dichotomous variable (age). Disease odds are constant within each stratum, but four times higher in the old than in the young. Young subjects are uniformly distributed; old subjects are clustered in the northeast quadrant.
The crude map of the synthetic data is elevated in the northeast quadrant due to spatial confounding, i.e., spatial clustering of the risk factor age. (To cause confounding, a variable must be associated with both outcome and exposure. In spatial confounding, location acts as the "exposure.") Adjusting for age produced a nearly flat map, as expected, since we constructed the data assuming uniform disease odds within each stratum.
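The contrast between the crude and age-adjusted maps can be illustrated with a minimal R sketch. It does not use the published data or code: the sample sizes, the unit-square study area, the baseline odds of 0.25, and the variable names are all assumptions made for illustration, and the bivariate smooth from the `mgcv` package is a stand-in that may differ from the smoother in the published code. Only the design (uniform young, old clustered in the northeast quadrant, odds four times higher in the old) follows the description above.

```r
library(mgcv)  # assumption: mgcv's s(x, y) smooth stands in for the paper's smoother

set.seed(1)
n_young <- 2000  # hypothetical sample sizes
n_old   <- 2000

# Young: uniform over the unit square. Old: confined to the northeast quadrant.
young <- data.frame(x = runif(n_young), y = runif(n_young), old = 0)
old   <- data.frame(x = runif(n_old, 0.5, 1), y = runif(n_old, 0.5, 1), old = 1)
d <- rbind(young, old)

# Disease odds are constant within each stratum and four times higher in the old.
odds   <- ifelse(d$old == 1, 1.0, 0.25)  # hypothetical baseline odds of 0.25
d$case <- rbinom(nrow(d), 1, odds / (1 + odds))

# Crude model: log odds as a smooth function of location only.
crude <- gam(case ~ s(x, y), family = binomial, data = d, method = "REML")

# Adjusted model: the same smooth plus a term for the confounder (age).
adjusted <- gam(case ~ s(x, y) + old, family = binomial, data = d, method = "REML")

# Compare predicted log odds at a northeast and a southwest point,
# holding age fixed at "young" for the adjusted model.
pts <- data.frame(x = c(0.75, 0.25), y = c(0.75, 0.25), old = 0)
crude_pred <- predict(crude, newdata = pts)
adj_pred   <- predict(adjusted, newdata = pts)

crude_diff <- unname(crude_pred[1] - crude_pred[2])
adj_diff   <- unname(adj_pred[1] - adj_pred[2])

cat("Crude NE minus SW log odds:   ", round(crude_diff, 2), "\n")
cat("Adjusted NE minus SW log odds:", round(adj_diff, 2), "\n")
```

In this sketch the crude northeast-southwest difference reflects the concentration of old subjects in the northeast, while the adjusted difference should be much closer to zero because the odds were generated without any spatial effect within strata.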