- ballgown creates a ballgown object from tablemaker output
- ballgownrsem creates a ballgown object from RSEM output (not yet well-tested)
- gffRead and gffReadGR read GTF (annotation) files into R
  - gffRead gives you a data frame
  - gffReadGR gives you a GRanges object
import pandas as pd
from numpy.random import randint
from numpy import median, percentile
my_data = pd.read_csv('dataset.csv')
n = len(my_data)
num_bootstrap_samples = 1000
bootstrap_results = []
for b in xrange(num_bootstrap_samples):
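The loop body is not shown here. As a rough sketch, assuming the goal is a percentile bootstrap confidence interval for a median (which the median and percentile imports suggest but do not confirm), something like the following could work. It is written for Python 3, uses only the standard library, and substitutes a made-up list of numbers for dataset.csv; bootstrap_median_ci, example_values, and the seed argument are hypothetical names, not part of the original code.

```python
import random
import statistics


def bootstrap_median_ci(values, num_bootstrap_samples=1000, alpha=0.05, seed=None):
    """Percentile bootstrap confidence interval for the median (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(values)
    bootstrap_results = []
    for _ in range(num_bootstrap_samples):
        # Resample n observations with replacement and record the resample's median.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        bootstrap_results.append(statistics.median(resample))
    bootstrap_results.sort()
    # Use the alpha/2 and 1 - alpha/2 empirical quantiles of the bootstrap medians.
    lower_index = int((alpha / 2) * num_bootstrap_samples)
    upper_index = min(int((1 - alpha / 2) * num_bootstrap_samples),
                      num_bootstrap_samples - 1)
    return bootstrap_results[lower_index], bootstrap_results[upper_index]


# Hypothetical data standing in for one numeric column of dataset.csv.
example_values = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 3.0]
lower, upper = bootstrap_median_ci(example_values, seed=0)
print("95% bootstrap interval for the median:", lower, upper)
```

Passing a fixed seed makes the interval reproducible from run to run; leaving it as None gives a fresh set of resamples each time.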
## simple analysis code for GEUVADIS data
## AF Oct 2014
library(ballgown) #biocLite
library(RSkittleBrewer) #install_github
library(RColorBrewer) #CRAN
library(usefulstuff) #install_github
library(RCurl) #CRAN
load('fpkm.rda') # download at http://files.figshare.com/1625419/fpkm.rda
FOR LEARNING!
- Make a repository on GitHub. Check the box that says "initialize this repo with a README."
- Clone that repository on to your computer. That's git clone + the ssh URL you can find on the right-hand side of the repository. Go to the directory where you want the repository_name folder to live.
- Run a git status as a sanity check. (Everything should be clean.)
- Write some code!
- Run a git status again. See that your code now lives in your repository, but hasn't yet been added to version control. (The file names should be red in your terminal.)
- "Add" (git add) the code to version control.
- Run another git status. The file names should now be green, meaning they've been added to the version control staging area, but not committed to your repository's history.
- Commit your changes with git commit -m 'message_here'.
## create ballgown objects with GEUVADIS data
source("http://bioconductor.org/biocLite.R")
biocLite('ballgown')
library(ballgown)
system('mkdir -p Ballgown/small_objects')
## make phenotype table:
dataDir = 'Ballgown/' #tablemaker output lives here
sampnames = list.files(dataDir, pattern = 'H|N')
notes from BioC 2014
instructions on website. I didn't get my invite to get an SVN account - mystery.
- add bioc-sync as a collaborator on your git repo
- then, webhooks & services --> add webhook. (there's a URL for this on the bioc instruction page.)
- git-svn bridge doesn't really do merging very well: it's "winner-take-all." You have to pick whether git or svn wins on merge conflicts.
- only deals with master branch of git repo
# power calculation examples
get_power = function(truep, p0, n, alpha=0.05) {
    num_rejections = 0
    for(i in 1:10000){
        dat = rbinom(n, size=1, prob=truep)
        pv = 2*(1-pbinom(sum(dat), size=n, prob=p0))
        if(pv < alpha) num_rejections = num_rejections + 1
    }
    return(num_rejections / 10000)
}
- Hilary Mason's collection of research-quality datasets
- 100+ Interesting Data Sets: seems really great for ML/data science practice or fun side projects.
- Most of the datasets available with R, but here ALSO available in CSV format! 700+ datasets.
- Kaggle Higgs Boson data (I think this is a super cool problem)
- Statistical Sleuth data problems -- pretty good for intro stats concepts
- Hadley's data packages -- baby names by sex (1880-2013), fuel economy of cars, atmospheric measurements from Central America, and info on all NYC flights in 2013. Set up for R, but I bet you could process these with other software if you wanted.
- others? (leave a comment!)