@blahah
Created April 7, 2014 13:21
extract gene counts from transcript ID/count
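# toy data: transcript IDs (gene ID plus a ".N" isoform suffix) and a count for each
df <- data.frame(a = c("AT1G2.1", "AT1G2.3", "AT1G2.3", "AT1G3.1", "AT1G3.3", "AT1G3.3"), b = 1:6)
# convert transcript ID to gene ID by stripping the trailing ".N" suffix
# (the pattern is anchored with $ so only the end of the ID is removed)
df$t <- gsub(pattern = "\\.[0-9]+$", replacement = "", x = df$a)
# sum counts by gene ID
aggregate(df$b, by = list(gene = df$t), FUN = sum)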
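
The two steps above can be wrapped in a small function, sketched here for demonstration only. The name `sum_by_gene` and the output column name `count` are hypothetical, not part of the original snippet. The sketch assumes every transcript ID is a gene ID followed by a `.<digits>` suffix.

```r
# Hypothetical helper: collapse per-transcript values to per-gene sums.
# Assumes each transcript ID is a gene ID followed by ".<digits>".
sum_by_gene <- function(ids, counts) {
  # strip the trailing isoform suffix to recover the gene ID
  gene <- sub("\\.[0-9]+$", "", as.character(ids))
  # group the counts by gene ID and sum within each group
  out <- aggregate(counts, by = list(gene = gene), FUN = sum)
  names(out)[2] <- "count"
  out
}

res <- sum_by_gene(
  c("AT1G2.1", "AT1G2.3", "AT1G2.3", "AT1G3.1", "AT1G3.3", "AT1G3.3"),
  1:6
)
print(res)
```

With the toy data, `AT1G2` collects the values 1, 2 and 3 (sum 6) and `AT1G3` collects 4, 5 and 6 (sum 15).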

blahah commented Apr 7, 2014

NOTE: Please don't ever just sum transcript counts to get gene counts in a real analysis. This code is just to demonstrate gsub and aggregate to a colleague.
