@Terkwood
Created October 16, 2022 18:54
emoji-friendly iterator over chars
// Based on https://stackoverflow.com/a/50396438
// and https://stackoverflow.com/questions/47193584/is-there-an-owned-version-of-stringchars
use std::io::{BufRead, BufReader, Read};
use std::vec::IntoIter;

/// Owns the chars of a `String` so they can be iterated without borrowing it.
struct Chunks {
    remaining: IntoIter<char>,
}

impl Chunks {
    fn new(s: String) -> Self {
        Chunks {
            remaining: s.chars().collect::<Vec<_>>().into_iter(),
        }
    }
}

/// Yields every `char` read from `rdr`, line by line.
/// Note: `lines()` strips line terminators, so '\n' is never yielded.
fn reader_chars<R: Read>(rdr: R) -> impl Iterator<Item = char> {
    // We use 6 bytes here to force emoji to be segmented for demo purposes.
    // Pick a more appropriate size for your case.
    let reader = BufReader::with_capacity(6, rdr);
    reader
        .lines()
        .flat_map(|l| l) // Skips lines that fail to read, ignoring any errors
        .flat_map(|s| Chunks::new(s).remaining) // from https://stackoverflow.com/q/47193584/155423
}

fn main() {
    // These emoji are 4 bytes each in UTF-8
    let data = "😻🧐🐪💩a1.💐\nf😃";
    let data = data.as_bytes();
    for c in reader_chars(data) {
        println!(">{}<", c);
    }
}
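
Because `lines()` drops line terminators, the iterator above never yields the '\n' in the sample data. If the newlines matter, one possible variant (a sketch only, not part of the original gist) reads with `read_line` instead, which keeps the terminator. The name `reader_chars_keep_newlines` is hypothetical, and stopping at the first read error is an assumption about the desired behavior.

```rust
use std::io::{BufRead, BufReader, Read};

/// Hypothetical variant of `reader_chars` that keeps line terminators.
fn reader_chars_keep_newlines<R: Read>(rdr: R) -> impl Iterator<Item = char> {
    // Same deliberately tiny 6-byte buffer as the demo above.
    let mut reader = BufReader::with_capacity(6, rdr);
    std::iter::from_fn(move || {
        let mut line = String::new();
        match reader.read_line(&mut line) {
            // Assumption: stop at end of input or on the first error
            // (I/O failure or invalid UTF-8) rather than skipping it.
            Ok(0) | Err(_) => None,
            Ok(_) => Some(line),
        }
    })
    // Collect into an owned Vec so the chars outlive the temporary String.
    .flat_map(|line| line.chars().collect::<Vec<_>>())
}

fn main() {
    let data = "😻🧐🐪💩a1.💐\nf😃";
    let out: String = reader_chars_keep_newlines(data.as_bytes()).collect();
    // The input round-trips unchanged, newline included.
    assert_eq!(out, data);
}
```

`read_line` accumulates bytes until it sees a newline and only then validates UTF-8, so an emoji split across two buffer fills is still decoded as a single `char`.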