@n8thangreen
Last active December 15, 2016 09:54
Read in the whole of an Excel workbook and then extract whichever fields are wanted
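## read every sheet of the Excel workbook into a named list of data frames
library(XLConnect)
wb <- loadWorkbook("C:/Users/ngreen1/Documents/IDEA/raw_data/TB_database_patientdata_300614.xlsx")
tab.names <- getSheets(wb)
## lapply (not sapply) so the result is always a list and is never simplified to a matrix
TBlist <- lapply(tab.names, function(x) readWorksheet(wb, sheet = x, header = TRUE))
names(TBlist) <- tab.names
## extract only columns of interest
library(plyr)
## relevant_fields.csv is assumed to hold one field name per line
extractNames <- readLines("C:/Users/ngreen1/Documents/IDEA/raw_data/relevant_fields.csv")
## left-join all sheets on the patient ID, then keep the wanted columns and drop duplicate rows
data <- join_all(TBlist, by = "PatientStudyID")
data <- unique(data[, extractNames, drop = FALSE])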
@n8thangreen
Author

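This joins all of the worksheets on the patient ID (PatientStudyID), merging the separate data frames into a single data frame. join_all defaults to a left join, so only patients that appear in the first sheet are kept.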
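
To see what the join step does without the workbook, here is a minimal sketch on toy data. The sheet names, columns other than PatientStudyID, and all values are invented for illustration. It uses base R `Reduce` with `merge(..., all.x = TRUE)` as a stand-in for `join_all`, so it runs without plyr; one difference to be aware of is that `merge` sorts rows by the key, whereas plyr's join keeps the row order of the first data frame.

```r
# Toy stand-ins for the worksheets; every name and value here is hypothetical.
sheets <- list(
  demographics = data.frame(PatientStudyID = c(1, 2, 3), Age = c(34, 51, 28)),
  tests        = data.frame(PatientStudyID = c(1, 2), Smear = c("pos", "neg")),
  outcomes     = data.frame(PatientStudyID = c(1, 3), Outcome = c("cured", "lost"))
)

# Fold the list into one data frame with successive left joins on the ID.
# Patients missing from a later sheet get NA in that sheet's columns.
merged <- Reduce(
  function(x, y) merge(x, y, by = "PatientStudyID", all.x = TRUE),
  sheets
)

# Keep only the fields of interest and drop duplicate rows.
wanted <- c("PatientStudyID", "Age", "Outcome")
result <- unique(merged[, wanted, drop = FALSE])
print(result)
```

Because the joins are left joins, the order of the sheets matters: a patient who is absent from the first sheet is dropped even if later sheets contain them.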