@geoffjentry
Created January 2, 2014 20:42
remove weird encodings
# Goal: lowercase every status. tolower() fails on the 67 statuses that have
# weird encodings, so record their indices and drop them.
bad_statuses = integer()          # indices of statuses that tolower() rejects
lowercase_statuses = character()  # successfully lowercased statuses
for (i in seq_along(statuses)) {
  tl = try(tolower(statuses[[i]]), silent=TRUE)
  if (inherits(tl, "try-error")) {
    bad_statuses = c(bad_statuses, i)
  } else {
    lowercase_statuses = c(lowercase_statuses, tl)
  }
}
# Guard is required: x[-integer(0)] selects nothing rather than everything.
if (length(bad_statuses) > 0) {
  filtered_tweets = filtered_tweets[-bad_statuses]
}
statuses = lowercase_statuses
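
The loop above grows two vectors one element at a time. A vectorized alternative is sketched below, under two assumptions the gist does not confirm: that the failures come from bytes that are not valid UTF-8, and that the session runs in a UTF-8 locale. The sample `statuses` and `filtered_tweets` values are hypothetical stand-ins, not the original data. `validUTF8()` is in base R from version 3.3.0.

```r
# Hypothetical sample data: the second element contains a lone 0xE9 byte,
# which is not valid UTF-8.
statuses <- c("Hello World", "caf\xe9 TIME", "Plain ASCII")
filtered_tweets <- list("tweet1", "tweet2", "tweet3")  # stand-in objects

# TRUE for each element whose bytes form valid UTF-8.
ok <- validUTF8(statuses)

# Logical indexing keeps both objects aligned and needs no empty-index guard.
filtered_tweets <- filtered_tweets[ok]
statuses <- tolower(statuses[ok])
```

If `statuses` is a list rather than a character vector, flatten it first, for example with `unlist(statuses)`. Where dropping a whole status is too aggressive, `iconv(x, "UTF-8", "UTF-8", sub = "")` removes only the unconvertible bytes instead.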