@sckott
Created October 12, 2012 22:55
Write a FASTA file that is ready to run in ClustalW for multiple sequence alignment.
# ClustalW does not handle spaces in species names well, so `write_fasta` replaces them with underscores in case you forget.
write_fasta <- function(sequences, file.out) {
  # Open the output file for writing and make sure it is closed on exit,
  # even if an error occurs part-way through
  outfile <- file(description = file.out, open = "w")
  on.exit(close(outfile))
  # Write one record: a ">name" header line followed by the sequence
  write.oneseq <- function(sequence, name) {
    # Replace every whitespace character in the name with an underscore
    name <- gsub("\\s", "_", name)
    writeLines(paste0(">", name), outfile)
    writeLines(sequence[[1]], outfile)
  }
  for (x in seq_along(sequences)) {
    write.oneseq(sequence = as.character(sequences[[x]]),
                 name = names(sequences[x]))
  }
  invisible(NULL)
}
# Example input. Note that all three sequences are identical.
myseqs <- c("TCTTATTTACAATAGGAGGATTATCAGGAATTATATTATCAAATTCATCTATTGATATTATACTACACGATACTTATTACGTTATTGGACACTTTCATTATGTACTCTCAATA",
"TCTTATTTACAATAGGAGGATTATCAGGAATTATATTATCAAATTCATCTATTGATATTATACTACACGATACTTATTACGTTATTGGACACTTTCATTATGTACTCTCAATA",
"TCTTATTTACAATAGGAGGATTATCAGGAATTATATTATCAAATTCATCTATTGATATTATACTACACGATACTTATTACGTTATTGGACACTTTCATTATGTACTCTCAATA")
names(myseqs) <- c("Apis_mellifera","Homo sapiens","Helianthus annuus")
write_fasta(myseqs, "myseqs.fas")
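
# A quick sanity check, sketched in base R (not part of the original gist):
# read a FASTA file back and confirm that no header line contains whitespace.
# `fasta_headers` is a hypothetical helper name introduced here for illustration.
fasta_headers <- function(path) {
  fasta_lines <- readLines(path)
  # Header lines are the ones that start with ">"
  fasta_lines[grepl("^>", fasta_lines)]
}
# Only run the check if the example file above was actually written
if (file.exists("myseqs.fas")) {
  headers <- fasta_headers("myseqs.fas")
  print(headers)
  # Stops with an error if any header still contains whitespace
  stopifnot(!any(grepl("\\s", headers)))
}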