@thanhleviet
Created May 31, 2017 11:00
Demo script for converting a data frame with a nucleotide sequence column to phyDat format
rm(list = ls())
# Le Viet Thanh
# May 31st 2017
# Demo script for converting a data frame with a nucleotide sequence column to phyDat format
# Asked by Dung Nguyen Chi
library(seqRFLP)
library(readr)
library(dplyr)
# DECIPHER is a Bioconductor package; biocLite() has since been retired, so install it with:
# install.packages("BiocManager")
# BiocManager::install("DECIPHER")
library(DECIPHER)
library(phangorn)
seq_file <- read_csv("DNA.csv")
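# Drop rows with a missing sequence, keep the name (BC) and sequence (seq)
# columns, and convert the tibble to a plain data frame for dataframe2fas()
clean_seq <- seq_file %>%
  dplyr::filter(!is.na(seq)) %>%
  dplyr::select(BC, seq) %>%
  as.data.frame()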
# Convert the data frame to FASTA format and write it out to a FASTA file
dataframe2fas(clean_seq, file = "dna.fasta")
# Read the FASTA file back in as a DNAStringSet for alignment with DECIPHER
seq <- readDNAStringSet("dna.fasta", "fasta")
# Performing alignment
aligned <- AlignSeqs(seq)
# Write out the aligned sequences
writeXStringSet(aligned, "dna_aligned.fasta")
# Read the alignment into phangorn as a phyDat object
phydat <- read.phyDat("dna_aligned.fasta", format = "fasta", type = "DNA")
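# Optional sanity check (a sketch added for illustration, not part of the
# original demo): print a summary of the phyDat object and warn if the number
# of sequences differs from the number of rows written to dna.fasta.
# Assumes clean_seq and phydat exist as created above.
print(phydat)
if (length(phydat) != nrow(clean_seq)) {
  warning("phydat has ", length(phydat), " sequences but clean_seq has ",
          nrow(clean_seq), " rows")
}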