2018-01-10-how-two-s-complement-works.md

language	title	date
en-us	How two's complement works	2018-01-10

Two's complement is widely used to represent integer numbers in computers. We'll explain how it works with examples and justify why it works by mapping it onto modular arithmetic.

Two's Complement

Digital computers use storage units with fixed length in bits: a byte (8 bits), a word (32 or 64 bits) etc. Each bit is denoted by 0 or 1 composing a bit string. It doesn't require any additional structure to represent natural numbers, $\mathbb{N}$ (unsigned integers): a bit string itself represents a number in base 2, for example, an unsigned byte $(10010011)_2 = 147$. However, for integers, $\mathbb{Z}$, a special representation is necessary and two's complement is widely used.

Given a set of bit string elements with $n$ bits each, two's complement consists of using the lower half of them to represent zero and positive integers and the upper half to represent negative integers, and it is ruled by modular arithmetic properties. The previous example, if in two's complement, would represent a negative integer: $(10010011)_{\overline{2}} = -109$. What will tell us whether a bit string is representing an unsigned or signed integer is the semantics, usually a data type in programming languages.

When necessary throughout this post, we'll denote fixed-length numbers in parentheses along with a base (or an overlined base if referring to a complement) in subscript.

How it works

For the following examples, we'll consider bit strings with $n = 8$ bits (a byte). Thus, there will be $2^8 = 256$ bit strings to represent integers evenly distributed in range $[-128, 127]$, where the lower 128 bit strings will represent integers in $[0, 127]$ and the upper 128 bit strings will represent integers in $[-128, -1]$.

For integers in $[0, 127]$, representation is immediate from base 2:

$$\begin{align*} 0 &= (00000000)_2 \\\ 1 &= (00000001)_2 \\\ 2 &= (00000010)_2 \\\ & \dots \\\ 126 &= (01111110)_2 \\\ 127 &= (01111111)_2 \end{align*}$$

For integers in $[-128, -1]$, representation consists of three steps:

Write the corresponding opposite's bit string
Invert (negate) its bits
Add $1$

For example, for $-91$:

$$+91 = (01011011)_2$$

$$\neg (01011011)_2 = (10100100)_2$$

$$\begin{align*} (10100100)_2 + 1 = (10100101)_2 \\\ \implies -91 = (10100101)_{\overline{2}} \end{align*}$$

As a result, $-91$ is represented as $(10100101)_{\overline{2}}$. Note that if it weren't in two's complement such bit string would represent the unsigned integer $165$ which is not in range $[-128, 127]$, and we get back the $+91$ by applying those steps on it again.

Applying those steps to all integers in $[-128, -1]$, we have:

$$\begin{align*} -128 &= (10000000)_{\overline{2}} \\\ -127 &= (10000001)_{\overline{2}} \\\ & \dots \\\ -109 &= (10010011)_{\overline{2}} \\\ & \dots \\\ -91 &= (10100101)_{\overline{2}} \\\ & \dots \\\ -3 &= (11111101)_{\overline{2}} \\\ -2 &= (11111110)_{\overline{2}} \\\ -1 &= (11111111)_{\overline{2}} \end{align*}$$

A nice feature of two's complement is that we can easily tell whether it is representing a negative number by looking at its first bit: if starting with 1, then negative (just like a sign bit). And an unlikable feature is that the minimum negative integer, here $-128$, has no positive counterpart: applying those steps again on $-128$ yields the same bit string.

Now let's investigate further what that operation is all about. In the example for $-91$, by picking the positive $91$, inverting its bits and adding $1$, we are actually performing a "substitute" for the operation $256 - (+91)$ that yields $165$. Before talking about what this operation means, let's see why we said it is a substitute.

We don't have enough bits to write an unsigned integer $256$ with our bit strings because that would require $9$ bits, $(100000000)_2$, but we fixed $n = 8$ bits. For now, in order to perform $256 - (+91)$, we rewrite it as $255 - (+91) + 1$ so it fits in with $8$ bits (more on this later). In particular, subtracting $91$ from $255$ in binary is equivalent to invert (negate) all bits of $91$, also known as ones' complement, that's why we invert its bits and then add $1$:

$$\begin{array}{ccccc} \begin{array}{rr} & 255 \\\ - & 91 \\ \hline & 164 \\\ + & 1 \\ \hline & 165 \end{array} & \boldsymbol{=} & \begin{array}{rr} & (11111111)_2 \\\ - & (01011011)_2 \\ \hline & (10100100)_2 \\\ + & (00000001)_2 \\ \hline & (10100101)_2 \end{array} & \boldsymbol{\equiv} & \begin{array}{rr} & \\\ \neg & (01011011)_2 \\ \hline & (10100100)_2 \\\ + & (00000001)_2 \\ \hline & (10100101)_2 \end{array} \end{array}$$

OK, and what's special about the calculation $256 - (+91)$? It yields $165$ which is the missing amount for $91$ to reach $256$, namely, its complement to $256$. Generically, we are representing the negative of a number $x$ by using the complement of $x$ to $2^n$, denoted here as $\overline{x}$:

$$\begin{align*} & \overline{x} + x = 2^n \\\ \implies & \overline{x} + x - x = 2^n - x \\\ \implies & \overline{x} = 2^n - x \end{align*}$$

Such operation is called two's complement, and can be used to reserve the upper half of the $2^n$ elements to signify a negative number and still be able to perform arithmetic with them.

At this point you may note: if $\overline{x}$ is being considered the opposite of $x$, then $2^n$ must be an identity element equivalent to zero. And it is: because of the constraint of $n$ bits, an attempt to represent $2^n$ in base 2 truncates it to a bit string of $n$ 0 bits and also causes zero to have a unique representation. We'll see later that it relates to modular arithmetic: $2^n \equiv 0 \pmod{2^n}$.

Back to the previous example, then why should we calculate the tricky $255 - (+91) +1 \equiv \neg (01011011)_2 + (00000001)_2$ if we can just calculate $256 - (+91) \equiv (00000000)_2 - (01011011)_2$ instead? Straight answer: simplicity. The former is easier to implement in digital hardware and it simplifies subtraction as a complement addition.

Radix Complement

The concept explained above can be generalized for an $n$ digit number $x$ in base (radix) $b$ and it is called radix complement defined as:

$$\begin{align*} \overline{x} = b^n - x \end{align*}$$

A quick example for base $b = 5$ with numerals ${0, 1, 2, 3, 4}$ and a fixed-length of $n = 2$ digits:

$$\begin{array}{l|l|l|l|l} \begin{array}{c} 0 = (00)_{\overline{5}} \\\ 1 = (01)_{\overline{5}} \\\ 2 = (02)_{\overline{5}} \\\ 3 = (03)_{\overline{5}} \\\ 4 = (04)_{\overline{5}} \end{array} & \begin{array}{c} 5 = (10)_{\overline{5}} \\\ 6 = (11)_{\overline{5}} \\\ 7 = (12)_{\overline{5}} \\\ 8 = (13)_{\overline{5}} \\\ 9 = (14)_{\overline{5}} \end{array} & \begin{array}{c} 10 = (20)_{\overline{5}} \\\ 11 = (21)_{\overline{5}} \\\ 12 = (22)_{\overline{5}} \\\ -12 = (23)_{\overline{5}} \\\ -11 = (24)_{\overline{5}} \end{array} & \begin{array}{c} -10 = (30)_{\overline{5}} \\\ -9 = (31)_{\overline{5}} \\\ -8 = (32)_{\overline{5}} \\\ -7 = (33)_{\overline{5}} \\\ -6 = (34)_{\overline{5}} \end{array} & \begin{array}{c} -5 = (40)_{\overline{5}} \\\ -4 = (41)_{\overline{5}} \\\ -3 = (42)_{\overline{5}} \\\ -2 = (43)_{\overline{5}} \\\ -1 = (44)_{\overline{5}} \end{array} \end{array}$$

You can verify it yourself by hand as an exercise or just play with the implementation in the next section. Notice that due to an odd number of elements, $5^2 = 25$, we end up with a nice symmetry around zero: every negative integer has a positive counterpart.

Implementation

<form lang="en" style="font-family: monospace;">
  <p>
    base &nbsp;&nbsp;&nbsp;&nbsp;=
    <input
      oninput="calc();"
      id="b"
      type="number"
      step="1"
      style="width: 4em;"
      value="2"
      min="2"
      max="16"
    />
    <small>min=2, max=16</small><br />
    n digits =
    <input
      oninput="calc();"
      id="n"
      type="number"
      step="1"
      style="width: 4em;"
      value="8"
      min="1"
    />
    <small>min=1, max=<span id="max_n"></span></small><br />
    decimal &nbsp;=
    <input
      oninput="calc();"
      id="decimal"
      type="number"
      step="1"
      style="width: 8em;"
      value="-91"
    />
    <small
      >min=<span id="min_decimal"></span>, max=<span id="max_decimal"></span
    ></small>
  </p>
  <p>
    radix complement = (<span id="radix-complement-string"></span>)<sub
      id="b-sub"
      style="text-decoration: overline;"
    ></sub>
  </p>
</form>

<script>
  // radixComplementString returns a string with n digits representing decimal
  // in a radix complement according to base b.
  // Ranges:
  // b in [2, 16]
  // n in [1, ceil(log_b(2^32))]
  // decimal in
  //     [-(b^n)/2, (b^n)/2]    , if b^n is odd,
  //     [-(b^n)/2, (b^n)/2 - 1], if b^n is even.
  function radixComplementString(b, n, decimal) {
    // Apply radix complement formula
    if (decimal < 0) {
      decimal = Math.pow(b, n) + decimal;
    }
    // Get digit string and pad zeros
    return decimal.toString(b).padStart(n, "0");
  }

  function calc() {
    // Get values.
    var b = document.getElementById("b").valueAsNumber;
    var n = document.getElementById("n").valueAsNumber;
    var decimal = document.getElementById("decimal").valueAsNumber;

    // Validate base.
    if (isNaN(b) || b == 1) {
      return; // User hasn't finished typing yet; do nothing.
    } else if (b < 2) {
      b = 2;
      document.getElementById("b").value = "2";
    } else if (b > 16) {
      b = 16;
      document.getElementById("b").value = "16";
    }
    document.getElementById("b-sub").innerHTML = b.toString();

    // Validate n digits.
    var max_n = Math.ceil(Math.log(Math.pow(2, 32)) / Math.log(b));
    if (isNaN(n)) {
      n = 1;
    } else if (n < 1) {
      n = 1;
      document.getElementById("n").value = "1";
    } else if (n > max_n) {
      n = max_n;
      document.getElementById("n").value = max_n.toString();
    }
    document.getElementById("max_n").innerHTML = max_n.toString();

    // Define min and max decimal values.
    var N = Math.pow(b, n);
    var min_decimal = -Math.floor(N / 2);
    var max_decimal = Math.floor(N / 2);
    if (N % 2 == 0) {
      max_decimal -= 1;
    }
    document.getElementById("min_decimal").innerHTML = min_decimal.toString();
    document.getElementById("max_decimal").innerHTML = max_decimal.toString();

    // Validate decimal.
    if (isNaN(decimal)) {
      decimal = 0;
    } else if (decimal < min_decimal) {
      decimal = min_decimal;
      document.getElementById("decimal").value = min_decimal.toString();
    } else if (decimal > max_decimal) {
      decimal = max_decimal;
      document.getElementById("decimal").value = max_decimal.toString();
    }

    // Show radix complement string.
    document.getElementById("radix-complement-string").innerHTML =
      radixComplementString(b, n, decimal);
  }

  calc();
</script>

Why it works

Two's complement, or radix complement in general, works because the elements and operation involved form a cyclic abelian group directly related to the group of integers modulo m under addition, denoted as $(\mathbb{Z}_m, +_m)$.

Let $G$ be a set of bit strings with fixed-length $n$ representing integers in base 2, and $+$ be the addition of integers in base 2 constrained to $n$ bits.

Given $a, b \in G \implies a + b \in G$ because the way $+$ is defined, $a + b$ must have $n$ bits making $G$ closed under $+$. Also, the integers that a given $a \in G$ can represent in base 2 are in $[0, 2^n -1]$, so $+$ is equivalent to addition $\text{mod } 2^n$. Consequently, letting $m = 2^n$, the group $(G, +)$ can be mapped onto $(\mathbb{Z}_m, +_m)$, therefore holding all its modular arithmetic properties.

Regarding the $n = 8$ example from a previous section, we can calculate two's complement for $-91$ as:

$$\begin{align*} -91 &= -91 & \\\ &\equiv -91 + 256 &\pmod{256} \\\ &\equiv 165 &\pmod{256} \end{align*}$$

Then $-91$ is equivalent to $165$ under $\text{mod } 256$ and we can use $165$'s bit string to represent $-91$ in two's complement. For example, suppose we have to calculate $112 - 91$, thus:

$$\begin{align*} 112 - 91 &= 112 - 91 & \\\ &\equiv 112 + 165 &\pmod{256} \\\ &\equiv 277 &\pmod{256} \\\ &\equiv 21 &\pmod{256} \\\ &= 21 & \end{align*}$$

elias19r/2018-01-10-how-two-s-complement-works.md

Two's Complement

How it works

Radix Complement

Implementation

Why it works