Last active
December 25, 2019 10:55
Save vurtun/2fb0ed9d3319f6cf21c2aebd42da8dfc to your computer and use it in GitHub Desktop.
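#include <stdint.h> /* uintptr_t */

/* Find the first occurrence of chr in str, testing four bytes per iteration
 * once s is 4-byte aligned. Returns a pointer to the terminating NUL if chr
 * does not occur. Assumes unsigned is 32 bits wide. */
static char*
str_chr(char *str, int chr)
{
    char *s = str;
    unsigned c = chr & 0xFF;
    unsigned m = (c << 24)|(c << 16)|(c << 8)|c; /* chr in every byte */
    for (;;) {
        /* bytewise until s is aligned; the word loop also jumps back in here */
        while (((uintptr_t)s) & 3) {
        chk1: if ((unsigned char)s[0] == c) return s;
        chk2: if (s[0] == 0) return s;
            ++s;
        }
        for (;;) {
            unsigned v = *(unsigned*)s;
            unsigned x = v ^ m; /* bytes equal to chr become 0 */
            if ((x - 0x01010101) & ~x & 0x80808080) goto chk1; /* a byte equals chr */
            if ((v - 0x01010101) & ~v & 0x80808080) goto chk2; /* a byte is 0 */
            s += 4;
        }
    }
}

/* Length of str, testing four bytes per iteration once s is 4-byte aligned. */
static int
str_len(const char *str)
{
    const char *s = str;
    while (((uintptr_t)s & 3)) {
    chk: if (*s == 0) return (int)(s-str);
        s++;
    }
    for (;;) {
        unsigned int v = *(unsigned int*)s;
        unsigned int c = ~v & 0x80808080;
        if ((v - 0x01010101) & c) goto chk; /* a byte is 0, find it bytewise */
        s += 4;
    }
}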
...is already undefined.
Didn't read that far ;)
Btw, where do you (want to) use this code? I'm certain that the library functions are faster on most/all CPUs.
The trick is from here: https://github.com/nothings/stb/blob/master/stb_sprintf.h#L306.
Based on: https://graphics.stanford.edu/~seander/bithacks.html#HasLessInWord.
I spent some time trying to understand how it works, so I wrote these two to:
1.) Confirm I understood how it works
2.) Create a reference if I ever need it
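For reference, the word test behind both functions can be isolated like this. This is only a sketch: the helper names `has_zero_byte` and `has_byte` are made up for illustration and do not appear in stb_sprintf.h or in the gist, and `uint32_t` is used here instead of `unsigned` so the width is explicit.

```c
#include <stdint.h>

/* Nonzero iff at least one byte of v is 0x00.
 * Subtracting 1 from every byte sets the top bit of a byte that was 0
 * (it wraps to 0xFF); "& ~v" drops bytes whose top bit was already set. */
static uint32_t has_zero_byte(uint32_t v)
{
    return (v - 0x01010101u) & ~v & 0x80808080u;
}

/* Nonzero iff at least one byte of v equals c.
 * XOR with c repeated in every byte turns matching bytes into 0x00,
 * which reduces the problem to the zero-byte test above. */
static uint32_t has_byte(uint32_t v, unsigned char c)
{
    uint32_t m = 0x01010101u * c; /* c in all four bytes */
    return has_zero_byte(v ^ m);
}
```

The result only says that such a byte exists somewhere in the word. Because of borrows, bytes above the first zero byte can be flagged too, which is why the functions above fall back to a bytewise scan to find the exact position.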
Yeah this whole thing is undefined behavior. In theory this line:
Is already undefined. As for
unsigned m = (c << 24)|(c << 16)|(c << 8)|c;
at this point all architectures outside microcontrollers have a 4-byte int/unsigned, so I don't care anymore. Heck, if you
look at all commonly supported hardware for Linux, you will not find a single instance with int/unsigned being smaller
than 4 bytes.
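If someone does care, both points can be narrowed a bit. The sketch below is not part of the gist, and the names `splat_byte` and `load32` are hypothetical: it states the 32-bit assumption at compile time, builds the mask with unsigned arithmetic only, and loads the word through `memcpy` instead of a pointer cast. It does nothing about reading past the end of the string, so that part of the objection still stands.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Fail the build instead of misbehaving where unsigned is not 32 bits. */
_Static_assert(sizeof(unsigned) * CHAR_BIT == 32,
               "word trick assumes a 32-bit unsigned");

/* chr repeated in all four bytes, computed without shifting a signed int. */
static uint32_t splat_byte(int chr)
{
    return 0x01010101u * (uint32_t)(chr & 0xFF);
}

/* Read four bytes starting at p without casting p to unsigned*.
 * All four bytes must lie inside the object p points into. */
static uint32_t load32(const char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers typically turn a fixed-size `memcpy` like this into a single load, so the cost should be the same as the cast on common targets.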