Bit Manipulation

Tricks

A value is a power of 2 if you can AND it with itself - 1 and have the result be 0.

Example: 16

16: 10000 16 - 1: 01111

As can be seen, ANDing those results in no bits being set, thus the value is 0.

To create a mask of N bits, shift 1 over by N positions to the left and subtract 1:

N = 4 (you want a mask of 4 bits)

1 << 4 -> 10000
-1: 01111

To get mask for all odd bits, use 0xA and one A for every four bits (2 for a byte, 16 for an std::size_t). For all even bits (note: from right to left), use 0x5 and a 5 for every four bits.

hex(0b01010101) # all even 0x55

hex(0b10101010) # all odd 0xAA

Bit-merging

You are given two 32-bit number, N and M, and two bit positions i and j with i being less significant than j. Insert M into N at those positions.

Solution: Create a proper mask to unset the bits between i and j in N, then shift M over by i bits and or them.

Create the mask.
- For signed integers: -~0 << j | ((1 << i) - 1) (first set the bits to the left of the interval, then add, i.e. or, the bits to the right of it.)
- For unsigned types also: ~(((1 << (j - i)) - 1) << i) (create the mask of correct length, shift them into place, then twiddle).
Binary AND N by the mask, to clear the relevant bits between i and j.
Simply "insert" M by ORing N by M.

template<typename T>
void insert(T& first, const T& second, std::size_t i, std::size_t j)
{
	const std::size_t mask = ~(((1 << (j - i + 1)) - 1) << i);
	
	first &= mask;
	
	first |= (second << i);
}

Finding Certain Bit Positions

Find the LSB.

Example: (A) 011010

Subtract 1 from A to complement the bits before the LSB

011010 - 1 = 011001 -> B

OR the result with the original value, so that also those complemented bits are set in A.

A 011010 OR B 011001

C 011011

The consequence is that when you now XOR B with C, the only position where bits differ is at the LSB (because you also set the bits before the LSB and the bits after it are unaffected).

C 011011 XOR B 011001

D 000010

If you now take base-2 logarithm of this value you get the LSB position.

template<typename T>
std::size_t find_lsb(T value)
{
	T less = value - 1;
	
	value = (less | value) ^ less;
	
	return std::log2(value);
}

Find the MSB.

Given that you have the LSB, remove the 0s before the LSB by shifting all bits to the right (by LSB positions). Then you have a cluster of bits. If you add 1, you get a power of two. If you take the base-2 logarithm of that you get a bit position, which is the MSB, but shifted to the right by LSB positions. Thus the real MSB is then LSB + the base-2 logarithm value.

template<typename T>
T find_msb(T value, std::size_t lsb)
{
	value = (value >> lsb) + 1;
	
	return lsb + std::log2(value);
}

Determine the cardinality of a value (how many bits set).

We'll have to count, but optimize a bit by testing if the value is a power of 2 or one value before a power of 2.

template<typename T>
std::size_t cardinality(const T& value, std::size_t msb)
{
	if ((value & (value - 1)) == 0) return 1;
	
	if ((value & (value + 1)) == 0) return std::log2(value + 1);
	
	std::size_t count = 1;
	
	for (std::size_t i = 0; i < msb; ++i)
	{
		if (value & (1 << i)) ++count;
	}
	
	return count;
}

Floating-Point Representation

Given a real number between 0 and 1, print its binary representation if it can be represented with at most 32 characters, else print "Error".

Binary numbers are generally structured such that each bit signifies $0 \text{ or } 1 \cdot 2^N$. This is true for positive values, i.e. 101 means, from right to left, $1 \cdot 2^0 + 0 \cdot 2^1 + 1 \cdot 2^2 = 5$. But it is also true for negative values, where each digit to the right of the 0/1 bit stands for $1 \text{ or } 0 \cdot 2^{-N}$ (note the minus), such that 0.101 means $1 \cdot 2^{-1} = 1 \cdot \frac{1}{2} \dots$.

void print_binary_double(double value)
{
	static const std::size_t limit = 32;
	
	double significance = 0.5;
	
	std::string representation;
	
	for (std::size_t count = 0; count < limit; ++count)
	{
		if (value >= significance)
		{
			representation += "1";
			
			value -= significance;
			
			if (value == 0)
			{
				std::cout << representation << std::endl;
				
				return;
			}
		}
		
		else representation += "0";
		
		significance /= 2;
	}
	
	std::cout << "Error" << std::endl;
}

Bit-Twins

Given an integer with $N$, find the next greater and smaller values with the same number of bits set as that integer.

Note: the number of bit set in an integer is called the cardinality of that number.

Solution 1: Brute force.

template<typename T>
std::size_t cardinality(const T& value)
{
	if ((value & (value - 1)) == 0) return 1;
	
	if ((value & (value + 1)) == 0) return std::log2(value + 1);
	
	const std::size_t bits = sizeof(value) * 8;
	
	std::size_t count = 0;
	
	for (std::size_t i = 0; i < bits; ++i)
	{
		if (value & (1 << i)) ++count;
	}
	
	return count;
}

template<typename T>
std::pair<std::size_t, std::size_t> twins(const T& value)
{
	static const std::size_t maximum = std::numeric_limits<T>::max();
	
	if (value == 0) return {0, maximum};
	
	const std::size_t bits = cardinality(value);
	
	T next = value + 1;
	
	while (next != maximum && cardinality(next) != bits) ++next;
	
	T previous = value - 1;
	
	while (previous != 0 && cardinality(previous) != bits) --previous;
	
	return {previous, next};
}

Solution 2:

template<typename T>
std::size_t find_lsb(T value)
{
	T less = value - 1;
	
	value = (less | value) ^ less;
	
	return std::log2(value + 1);
}

template<typename T>
T find_msb(T value, std::size_t lsb)
{
	value = (value >> lsb) + 1;
	
	return lsb + std::log2(value);
}

template<typename T>
T find_msb(const T& value)
{
	return find_msb(value, find_lsb(value));
}

template<typename T>
std::size_t cardinality(const T& value, std::size_t msb)
{
	if ((value & (value - 1)) == 0) return 1;
	
	if ((value & (value + 1)) == 0) return std::log2(value + 1);
	
	std::size_t count = 1;
	
	for (std::size_t i = 0; i < msb; ++i)
	{
		if (value & (1 << i)) ++count;
	}
	
	return count;
}

template<typename T>
std::size_t cardinality(const T& value)
{
	return cardinality(value, find_msb(value));
}

template<typename T>
std::pair<T, T> twins(const T& value)
{
	// Nothing to do here.
	if (value == 0) return {0, 0};
	
	// If its a power of 2, just left/right shift
	if ((value & (value - 1)) == 0)
	{
		return {value << 1, value >> 1};
	}
	
	T lsb = find_lsb(value);
	
	T msb = find_msb(value, lsb);
	
	T next_largest;
	
	// We need to differentiate between when a number is
	// a cluster of bits, and when it is not. If it is a
	// cluster of bits, the solution to find the next-largest
	// value is to left shift the MSB and right shift the
	// non-MSB bits.
	
	// A value is a cluster of bits when we can right
	// shift the values to the start to remove all 0s
	// before the LSB (11000 -> 00011), then add one
	// and have a power of 2. I.e. if this is a cluster,
	// adding 1 will make it a power of 2. A value is
	// a power of 2 if you can substract 1, AND those
	// two values and get 0. If it is not a cluster of
	// bits, you'll get something like 10110 -> 01011,
	// to remove the 0 padding, then 01100 when adding 1
	// and then the power-of-2 check will fail because
	// 01100 & 01011 is not 0, but 01000 -> thus not a cluster
	
	T copy = value >> lsb;
	
	// Is a cluster of bits
	if ((copy & (copy + 1)) == 0)
	{
		// Unset the old MSB, which we want
		// to left shift afterwards
		next_largest = copy & ~(1 << (msb - 1));
		
		// If we can't right shift the non-msb bits, we
		// already have the next-largest value
		if (lsb > 1) next_largest >>= 1;
	
		// Add the new, left-shifted MSB
		next_largest |= (1 << msb);
		
		// If it's a cluster of bits and lsb is at bit 1,
		// there is no smaller value with the same # of bits
		if (lsb == 0) return {value, next_largest};
	}
	
	else
	{
		// If the value is not composed of a cluster of bits
		// the idea is to find the first gap in the bits
		// and shift the values before the gap to the left
		
		// We find the first gap in the bits of A = 1001 by
		// (1) adding 1: 1001 + 1 = 1010 -> B
		// (2) ORing those two: 1001 | 1010 = 1011 -> C
		// (3) XORing C with A: 1001 ^ 1011 = 0010
		// (4) Taking the log2 to find the bit position where the gap was.
		
		T temp = copy | (copy + 1);
		
		// The gap bit
		std::size_t gap = lsb + std::log2(temp ^ copy);
		
		// Mask off the bits before the gap
		T before_gap_mask = value & ((1 << gap) - 1);
		
		// Get the bits after the gap by unsetting the masked bits
		T after_gap = value & ~before_gap_mask;
		
		next_largest = after_gap | (before_gap_mask << 1);
	}
	
	T next_smallest = 0;
	
	// To find the next smallest value, again two cases
	// If the first bit is not set, just shift the LSB one
	// to the right. Else if the first bit is set, we'll have
	// to shift the MSB one to the right and then cram all the
	// non-MSB bits right next to the MSB to get the highest
	// possible value with the MSB being one to the right.
	
	if (value & 1)
	{
		std::size_t not_msb_bits = cardinality(value, msb) - 1;
		
		msb -= 1;
		
		// Add in the new MSB
		next_smallest |= (1 << msb);
		
		// Create a mask containing bits for all the non-MSB bits
		T mask = (1 << not_msb_bits) - 1;
		
		// Shift those next to the MSB
		mask <<= (msb - not_msb_bits);
		
		// Add in the non-MSB bits
		next_smallest |= mask;
	}
	
	else
	{
		// Unset the old LSB
		next_smallest = value & ~(1 << lsb);
		
		// Insert the LSB one position before
		next_smallest |= (1 << (lsb - 1));
	}
	
	return {next_smallest, next_largest};
}

Edit-Distance for Bits

Write a function to determine the number of bits you would need to flip to convert integer $A$ to integer $B$.

Definitely, one would want to XOR the two integers to determine which bits differ. Then, it depends on how optimize the counting of bits.

Solution 1: Count between the LSB and MSB of the XORed value:

template <typename T>
std::size_t differing_bits(const T& first, const T& second)
{
	T value = first ^ second;
	
	T bit = find_lsb(value);
	
	T msb = find_msb(value, bit);
	
	std::size_t count = 2;
	
	for (++bit; bit < msb; ++bit)
	{
		if (value & (1 << bit)) ++count;
	}
	
	return count;
}

Solution 2: As above, but move to then ext LSB if the next bit is not set:

template <typename T>
std::size_t differing_bits(const T& first, const T& second)
{
	T value = first ^ second;
	
	auto bit = find_lsb(value);
	
	auto msb = find_msb(value, bit);
	
	std::size_t count = 2;
	
	for (++bit; bit < msb; )
	{
		if (value & (1 << bit++)) ++count;
		
		else bit = bit + find_lsb(value >> bit);
	}
	
	return count;
}

Even-Odd bit-swapping

Write ap rogram to swap odd and even bits in an integer with as few instructions as possible

Solution 1: Iterative and inefficient.

template<typename T>
void swap_bits(T& value, std::size_t i, std::size_t j)
{
	bool first = value & (1 << i);
	bool second = value & (1 << j);

	value &= ~(1 << i);
	value |= (second << i);
	
	value &= ~(1 << j);
	value |= (first << j);
}

template<typename T>
T swap_even_odd(T value)
{
	auto msb = find_msb(value);
	
	for (auto bit = 0; bit <= msb; bit += 2)
	{
		swap_bits(value, bit, bit + 1);
	}
	
	return value;
}

Solution 2: Mask off the odd bits and so shifting.

An appropriate mask for all odd bits is 0xAA (two As for every 8 bits / one for every 4).

template<typename T>
T swap_even_odd(const T& value)
{
	static const std::size_t mask = 0xAAAAAAAAAAAAAAAA;
	
	return ((value << 1) & ~(mask >> 1)) | (value & mask) >> 1;
}

Drawing a line in a monochrome screen.

A monochrome screen is a single array of bytes, allowing eight consecutive pixels to be stored in one byte and w bytes in one row. Given a vertical row index y and two x-indices (bits in those bytes) x_1 and x_2, write a function to draw a line between those bits, on that row.

Be careful about edge cases.

void drawLine(char screen [],
			  std::size_t width,
			  std::size_t x_1,
			  std::size_t x_2,
			  std::size_t y)
{
	if (x_1 == x_2) return;
	
	const std::size_t first_byte = width * y;
	
	std::size_t lower_byte = first_byte + (x_1 / 8);
	std::size_t upper_byte = first_byte + (x_2 / 8);
	
	x_1 %= 8;
	x_2 %= 8;
	
	if (lower_byte == upper_byte)
	{
		screen[lower_byte] = ((1 << (x_2 - x_1)) - 1) << (8 - x_2);
	}
	
	else
	{
		screen[lower_byte] = 0xFF >> x_1;
		
		screen[upper_byte] = 0xFF << (8 - x_2);
		
		for (++lower_byte; lower_byte < upper_byte; ++lower_byte)
		{
			screen[lower_byte] = 0xFF;
		}
	}
	
}

int main(int argc, char* argv[])
{
	char screen [] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
	
	drawLine(screen, 4, 1, 1, 3);
	
	for (std::size_t i = 0; i < 16; ++i)
	{
		if (i % 4 == 0) std::cout << "\n";
		
		std::cout << std::bitset<8>(screen[i]);
		
		if ((i + 1) % 4 != 0) std::cout << "|";
	}
	
	std::cout << std::endl;
}

goldsborough/bit-manipulation.md