@Ruhshan
Created November 9, 2017 18:00
Reads a FASTA file, computes the amino acid composition of every protein, and writes the results to a CSV file.
library(Peptides)
library(seqinr)
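# Return the name of one FASTA entry together with the mole percentage
# of each amino acid class reported by Peptides::aaComp().
get_comp <- function(fasta) {
  s <- getSequence(fasta, as.string = TRUE)
  sname <- getName(fasta)
  # aaComp() returns a list with one matrix per sequence: one row per
  # class, column 1 = residue count, column 2 = mole percentage.
  comp <- aaComp(s)[[1]]
  # c() coerces every value to character because the name is a string.
  c(name = sname,
    tiny = comp[1, 2],
    small = comp[2, 2],
    aliphatic = comp[3, 2],
    aromatic = comp[4, 2],
    nonpolar = comp[5, 2],
    polar = comp[6, 2],
    charged = comp[7, 2],
    basic = comp[8, 2],
    acidic = comp[9, 2])
}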
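# Read the protein sequences; seqtype = "AA" keeps them as amino acids
# (the default, "DNA", treats the input as nucleotides and lowercases it).
fastas <- read.fasta("all_sequences.fasta", seqtype = "AA")
compositions_list <- lapply(fastas, get_comp)
# One row per protein, one column per field returned by get_comp().
df_res <- data.frame(t(sapply(compositions_list, c)))
# The output is semicolon-separated, despite the .csv extension.
write.table(df_res, "compositions.csv", sep = ";", row.names = FALSE, col.names = TRUE)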
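
The script writes a semicolon-separated file with "." as the decimal mark, and because the composition values pass through a character vector they end up quoted in the output. A minimal sketch of reading such a file back follows. It uses only base R, and the small `df_res` built here is a hypothetical stand-in with made-up protein names and values, not real output of the script.

```r
# Hypothetical stand-in for the df_res built by the script: every column is
# character, as happens when names and numbers are combined with c().
df_res <- data.frame(
  name  = c("protA", "protB"),
  tiny  = c("31.25", "18.75"),
  small = c("56.25", "43.75"),
  stringsAsFactors = FALSE
)

# Write it the same way the script does, but to a temporary file.
out_file <- tempfile(fileext = ".csv")
write.table(df_res, out_file, sep = ";", row.names = FALSE, col.names = TRUE)

# read.csv2() assumes "," as the decimal mark, so state sep and dec explicitly.
res <- read.table(out_file, header = TRUE, sep = ";", dec = ".",
                  stringsAsFactors = FALSE)

# The quoted numbers are converted back to numeric columns on reading.
str(res)
```

Stating `sep` and `dec` explicitly avoids depending on the defaults of the `read.csv()` and `read.csv2()` wrappers, which differ in both.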