Last active
September 14, 2022 12:55
-
-
Save lwaldron/f755fdcc98ad941ecedbfd1b85c01c6b to your computer and use it in GitHub Desktop.
Demo of importing, recoding, creating survey, and logistic regression on CHS20 survey data
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
title: "NYC CHS import" | |
author: "Levi Waldron" | |
date: "`r Sys.Date()`" | |
output: html_document | |
--- | |
```{r setup, include=FALSE} | |
knitr::opts_chunk$set(echo = TRUE, message = FALSE) | |
``` | |
See https://www1.nyc.gov/site/doh/data/data-sets/community-health-survey-public-use-data.page for the codebook, other years, and further information about the dataset. | |
Import: | |
```{r, cache=TRUE} | |
library(haven) | |
chs21 <- read_sas("https://www1.nyc.gov/assets/doh/downloads/sas/episrv/chs2020_public.sas7bdat") | |
``` | |
Recoding: | |
```{r} | |
library(dplyr) | |
chs21_recode <- | |
transform(chs21, | |
newrace = recode_factor(newrace, `1`="White/N Afri/MidEastern, non-Hispanic", | |
`2`="Black, non-Hispanic", | |
`3`="Hispanic", | |
`4`="Asian/PI, non-Hispanic", | |
`5`="Other, non-Hispanic"), | |
agegroup = recode_factor(agegroup, | |
`1` = "18-24", | |
`2` = "25-44", | |
`3` = "45-64", | |
`4` = "65+"), | |
hiv12months20 = recode_factor(hiv12months20, | |
`2` = "No", # put first to make "No" the reference group | |
`1` = "Yes"), | |
condom20 = recode_factor(condom20, | |
`2` = "No", | |
`1` = "Yes"), | |
## may or may not want sexpartner as an ordered factor: if ordered, | |
## R will add a test for trend to regression analyses | |
sexpartner = factor(sexpartner, ordered = TRUE), | |
## survey strata: | |
strata = as.character(strata)) | |
``` | |
Define the survey - based on code from the NYC CHS web page above: | |
```{r} | |
library(survey) | |
chs.dsgn <- | |
svydesign( | |
ids = ~ 1, | |
strata = ~ strata, | |
weights = ~ wt21_dual, # match to current year dataset | |
data = chs21_recode, # match to current year dataset | |
nest = TRUE, | |
na.rm = TRUE | |
) | |
``` | |
From `svyglm` help page: | |
> for binomial and Poisson families use family=quasibinomial() and family=quasipoisson() | |
to avoid a warning about non-integer numbers of successes. The ‘quasi’ versions | |
of the family objects give the same point estimates and standard errors and | |
do not give the warning. | |
```{r} | |
library(svydiags) | |
fit <- svyglm(hiv12months20 ~ condom20, | |
design = chs.dsgn, | |
family="quasibinomial") | |
summary(fit) | |
``` |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment