@hadley
Created October 14, 2012 23:01
clusters <- read.csv("clusters.csv", stringsAsFactors = FALSE)
x <- clusters$clean_text
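# TRUE if the string contains at least one byte outside the 7-bit ASCII range.
non_ascii <- function(x) {
  any(charToRaw(x) > 0x7F)
}
# Keep only the strings that contain non-ASCII bytes.
bad <- x[vapply(x, non_ascii, logical(1), USE.NAMES = FALSE)]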
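# Extract just the non-ASCII bytes of a string.
# Named non_ascii_bytes rather than get to avoid masking base::get().
non_ascii_bytes <- function(x) {
  rw <- charToRaw(x)
  rawToChar(rw[rw > 0x7F])
}
outside <- vapply(bad, non_ascii_bytes, character(1), USE.NAMES = FALSE)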
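# Transliterate to ASCII; bytes that cannot be converted are shown as <xx> hex codes.
iconv(outside, from = "UTF-8", to = "ASCII//TRANSLIT", sub = "byte")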