The trailing zero bit count (TZCNT) can be computed by isolating the lowest set bit, then depositing it into the appropriate bit positions of the count. The leading zero bit count (LZCNT) can be computed by reversing the bits, then taking the TZCNT of the result.
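As a scalar illustration of the same idea, here is a minimal sketch for a single byte. The helper names `tzcnt8`, `reverse8` and `lzcnt8` are hypothetical, and returning 8 for a zero input is an assumption (it matches the convention of the hardware TZCNT/LZCNT instructions, which return the operand width), not something stated above.

```c
#include <stdint.h>

/* Hypothetical scalar helpers for illustration only. */

/* Trailing zero count of one byte. Assumption: returns 8 when x == 0. */
static unsigned tzcnt8(uint8_t x) {
    if (x == 0) return 8;
    /* isolate the lowest set bit: x & ~(x - 1) */
    uint8_t low = (uint8_t)(x & ~(x - 1u));
    /* deposit each bit of the index by testing the isolated bit against a mask */
    unsigned n = 0;
    if (low & 0xAA) n |= 1;  /* bit positions 1, 3, 5, 7 */
    if (low & 0xCC) n |= 2;  /* bit positions 2, 3, 6, 7 */
    if (low & 0xF0) n |= 4;  /* bit positions 4, 5, 6, 7 */
    return n;
}

/* Reverse the bit order of one byte by swapping halves, pairs, then single bits. */
static uint8_t reverse8(uint8_t x) {
    x = (uint8_t)(((x & 0xF0) >> 4) | ((x & 0x0F) << 4));
    x = (uint8_t)(((x & 0xCC) >> 2) | ((x & 0x33) << 2));
    x = (uint8_t)(((x & 0xAA) >> 1) | ((x & 0x55) << 1));
    return x;
}

/* Leading zero count: reverse the bits, then count trailing zeros. */
static unsigned lzcnt8(uint8_t x) {
    return tzcnt8(reverse8(x));
}
```

For example, `0x28` is `00101000` in binary, so `tzcnt8(0x28)` is 3 and `lzcnt8(0x28)` is 2. The SIMD version applies the same steps to every byte lane at once.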
__m128i _mm_tzcnt_epi8(__m128i a) {
// isolate lowest set bit: a & ~(a - 1), where adding 0xff is a - 1 in each byte
a = _mm_andnot_si128(_mm_add_epi8(a, _mm_set1_epi8(0xff)), a);
// convert lowest bit to index