@jimhester
Created November 27, 2012 20:17
[C++] Parsing FASTA files in R using Rcpp
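#include <algorithm>
#include <fstream>
#include <string>
#include <Rcpp.h>
using namespace Rcpp;

// Read a FASTA file into a named character vector:
// names are the header lines, values are the sequences.
// [[Rcpp::export]]
CharacterVector read_fasta(std::string file) {
  CharacterVector records;
  std::ifstream in(file.c_str());
  if (!in) stop("Cannot open file: " + file);
  in.get(); // skip the leading '>' of the first record
  std::string rec;
  // Split on '>' so each chunk is one record: header line, then sequence lines
  while (std::getline(in, rec, '>')) {
    const std::string::size_type newLineLoc = rec.find('\n');
    std::string header = rec.substr(0, newLineLoc);
    // A header with no following line yields an empty sequence
    std::string sequence =
        (newLineLoc == std::string::npos) ? std::string() : rec.substr(newLineLoc + 1);
    // Join multi-line sequences by dropping every newline
    sequence.erase(std::remove(sequence.begin(), sequence.end(), '\n'), sequence.end());
    records[header] = sequence;
  }
  return records;
}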
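
The splitting logic above does not depend on R, so it can be exercised on its own. The following is a minimal sketch, not part of the gist: `FastaRecord`, `parse_fasta`, and `parse_fasta_string` are hypothetical names, and it assumes records are returned as (header, sequence) pairs in file order rather than as a named `CharacterVector`. It also strips carriage returns, which the gist does not do.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type (not in the gist): one (header, sequence) pair per record.
using FastaRecord = std::pair<std::string, std::string>;

// Parse FASTA records from any input stream.
std::vector<FastaRecord> parse_fasta(std::istream& in) {
    std::vector<FastaRecord> records;
    std::string rec;
    // Discard everything up to and including the first '>'.
    std::getline(in, rec, '>');
    // Each following chunk is one record: header line, then sequence lines.
    while (std::getline(in, rec, '>')) {
        // Assumption: drop '\r' so Windows line endings do not end up in the output.
        rec.erase(std::remove(rec.begin(), rec.end(), '\r'), rec.end());
        const std::string::size_type nl = rec.find('\n');
        std::string header = rec.substr(0, nl);  // whole chunk if there is no newline
        std::string sequence =
            (nl == std::string::npos) ? std::string() : rec.substr(nl + 1);
        // Join multi-line sequences by removing every newline.
        sequence.erase(std::remove(sequence.begin(), sequence.end(), '\n'),
                       sequence.end());
        records.emplace_back(std::move(header), std::move(sequence));
    }
    return records;
}

// Convenience wrapper for parsing text already held in memory.
std::vector<FastaRecord> parse_fasta_string(const std::string& text) {
    std::istringstream in(text);
    return parse_fasta(in);
}
```

Calling `parse_fasta_string(">seq1 desc\nACGT\nTTAA\n>seq2\nGG")` yields two records, with the two sequence lines of `seq1 desc` joined into a single string. Returning pairs instead of a named vector keeps records with duplicate headers distinct.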
@RanaivosonHerimanitra

Hi, I was looking for this kind of implementation...
I have one question: is it possible to extend this code to read a CSV file and return a data frame?
Thanks in advance, and sorry if this is too basic.
