Skip to content

Instantly share code, notes, and snippets.

@lwaldron
Last active September 14, 2022 12:55
Show Gist options
  • Save lwaldron/f755fdcc98ad941ecedbfd1b85c01c6b to your computer and use it in GitHub Desktop.
Save lwaldron/f755fdcc98ad941ecedbfd1b85c01c6b to your computer and use it in GitHub Desktop.
Demo of importing, recoding, creating survey, and logistic regression on CHS20 survey data
---
title: "NYC CHS import"
author: "Levi Waldron"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE)
```
See https://www1.nyc.gov/site/doh/data/data-sets/community-health-survey-public-use-data.page for the codebook, other years, and further information about the dataset.
Import:
```{r, cache=TRUE}
library(haven)
chs21 <- read_sas("https://www1.nyc.gov/assets/doh/downloads/sas/episrv/chs2020_public.sas7bdat")
```
Recoding:
```{r}
library(dplyr)
chs21_recode <-
transform(chs21,
newrace = recode_factor(newrace, `1`="White/N Afri/MidEastern, non-Hispanic",
`2`="Black, non-Hispanic",
`3`="Hispanic",
`4`="Asian/PI, non-Hispanic",
`5`="Other, non-Hispanic"),
agegroup = recode_factor(agegroup,
`1` = "18-24",
`2` = "25-44",
`3` = "45-64",
`4` = "65+"),
hiv12months20 = recode_factor(hiv12months20,
`2` = "No", # put first to make "No" the reference group
`1` = "Yes"),
condom20 = recode_factor(condom20,
`2` = "No",
`1` = "Yes"),
## may or may not want sexpartner as an ordered factor: if ordered,
## R will add a test for trend to regression analyses
sexpartner = factor(sexpartner, ordered = TRUE),
## survey strata:
strata = as.character(strata))
```
Define the survey - based on code from the NYC CHS web page above:
```{r}
library(survey)
chs.dsgn <-
svydesign(
ids = ~ 1,
strata = ~ strata,
weights = ~ wt21_dual, # match to current year dataset
data = chs21_recode, # match to current year dataset
nest = TRUE,
na.rm = TRUE
)
```
From `svyglm` help page:
> for binomial and Poisson families use family=quasibinomial() and family=quasipoisson()
to avoid a warning about non-integer numbers of successes. The ‘quasi’ versions
of the family objects give the same point estimates and standard errors and
do not give the warning.
```{r}
library(svydiags)
fit <- svyglm(hiv12months20 ~ condom20,
design = chs.dsgn,
family="quasibinomial")
summary(fit)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment