@jackwasey
Created September 26, 2015 00:51
#include <Rcpp.h>
#include <algorithm>
#include <string>
#include <vector>

//' @title Sort using STL
//' @description If OpenMP and standard library support are available (only
//' tested with GNU libstdc++), this will use a parallel sort algorithm, which
//' is significantly faster. It does not, however, deal with NA values.
//' TODO: handle NA values.
//' @param x vector of strings. In the current implementation, NA is not
//' treated specially, so it is probably converted to "NA".
//' @examples
//' \dontrun{
//' pts <- icd9:::randomPatients(1e7)
//' microbenchmark::microbenchmark(sort_std(pts$icd9), sort(pts$icd9), times = 5)
//' # four times faster on a machine with 4 real cores (8 with HT).
//' }
// [[Rcpp::export]]
std::vector<std::string> sort_std(std::vector<std::string> x) {
  // uses the libstdc++ parallel sort (parallel/algorithm) when parallel mode
  // is enabled at build time; otherwise this is the ordinary sequential sort
  std::sort(x.begin(), x.end());
  return x;
}
@jackwasey
Author

#include <parallel/algorithm>

_GLIBCXX_PARALLEL_PARALLEL_H
The build also needs -fopenmp.
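
One way to make the parallel path explicit, rather than relying on `std::sort` being swapped out, is to call `__gnu_parallel::sort` directly when the header and OpenMP are both available. The following is a minimal sketch under that assumption, not the gist's code: `sort_maybe_parallel` and `HAVE_GNU_PARALLEL_SORT` are hypothetical names, and the Rcpp export is left out so the snippet stands alone.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// The compiler defines _OPENMP when -fopenmp is passed. __has_include then
// checks whether the libstdc++ parallel-mode header is actually present.
#if defined(_OPENMP) && defined(__has_include)
#if __has_include(<parallel/algorithm>)
#include <parallel/algorithm>
#define HAVE_GNU_PARALLEL_SORT 1  // hypothetical macro, local to this sketch
#endif
#endif

// Hypothetical helper: sorts a copy of x, in parallel when possible.
std::vector<std::string> sort_maybe_parallel(std::vector<std::string> x) {
#ifdef HAVE_GNU_PARALLEL_SORT
  // Explicit libstdc++ parallel-mode sort; needs OpenMP at compile and link.
  __gnu_parallel::sort(x.begin(), x.end());
#else
  // Portable fallback: ordinary sequential sort.
  std::sort(x.begin(), x.end());
#endif
  return x;
}
```

Built without `-fopenmp`, or against a standard library other than libstdc++, this compiles to the sequential sort, so the same source stays portable.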

Better still, for factor creation in R, use Boost multi_index to get an insertion-ordered container with hash lookup (for deduplication during filling).
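
To illustrate the idea, here is a rough sketch that uses only the standard library as a stand-in for Boost multi_index: a `std::vector` keeps the levels in first-seen order and a `std::unordered_map` provides the hash lookup for deduplication. `FactorSketch` and `make_factor` are hypothetical names, and the first-seen level order is an assumption of this sketch (R's `factor()` sorts levels by default).

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type: unique levels in first-seen order plus 1-based
// integer codes, mirroring how an R factor is laid out.
struct FactorSketch {
  std::vector<std::string> levels;
  std::vector<int> codes;
};

FactorSketch make_factor(const std::vector<std::string>& x) {
  FactorSketch f;
  std::unordered_map<std::string, int> seen;  // level -> 1-based code
  f.codes.reserve(x.size());
  for (const std::string& s : x) {
    // emplace inserts only if s is new; otherwise it returns the existing
    // entry, so each string is hashed once per element.
    auto res = seen.emplace(s, static_cast<int>(f.levels.size()) + 1);
    if (res.second) f.levels.push_back(s);  // first time this level is seen
    f.codes.push_back(res.first->second);
  }
  return f;
}
```

For example, the input `{"410", "250", "410", "V10", "250"}` gives levels `410, 250, V10` and codes `1, 2, 1, 3, 2`. A Boost multi_index container would hold both views in one structure and avoid storing each string twice.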
