language |
title |
date |
en-us |
How two's complement works |
2018-01-10 |
Two's complement is widely used to represent integer numbers in computers.
We'll explain how it works with examples and justify why it works by mapping it
onto modular arithmetic.
Digital computers use storage units with fixed length in bits: a byte
(8 bits), a word (32 or 64 bits) etc. Each bit is denoted by 0 or 1 composing a
bit string. It doesn't require any additional structure to represent natural
numbers, $\mathbb{N}$ (unsigned integers): a bit string itself represents a
number in base 2, for example, an unsigned byte $(10010011)_2 = 147$. However,
for integers, $\mathbb{Z}$, a special representation is necessary and two's
complement is widely used.
Given a set of bit string elements with $n$ bits each, two's complement
consists of using the lower half of them to represent zero and positive integers
and the upper half to represent negative integers, and it is ruled by modular
arithmetic properties. The previous example, if in two's complement, would
represent a negative integer: $(10010011)_{\overline{2}} = -109$. What will tell
us whether a bit string is representing an unsigned or signed integer is the
semantics, usually a data type in programming languages.
When necessary throughout this post, we'll denote fixed-length numbers in
parentheses along with a base (or an overlined base if referring to
a complement) in subscript.
For the following examples, we'll consider bit strings with $n = 8$ bits
(a byte). Thus, there will be $2^8 = 256$ bit strings to represent integers
evenly distributed in range $[-128, 127]$, where the lower 128 bit strings
will represent integers in $[0, 127]$ and the upper 128 bit strings will
represent integers in $[-128, -1]$.
For integers in $[0, 127]$, representation is immediate from base 2:
$$\begin{align*}
0 &= (00000000)_2 \\\
1 &= (00000001)_2 \\\
2 &= (00000010)_2 \\\
& \dots \\\
126 &= (01111110)_2 \\\
127 &= (01111111)_2
\end{align*}$$
For integers in $[-128, -1]$, representation consists of three steps:
- Write the corresponding opposite's bit string
- Invert (negate) its bits
- Add $1$
For example, for $-91$:
$$+91 = (01011011)_2$$
$$\neg (01011011)_2 = (10100100)_2$$
$$\begin{align*}
(10100100)_2 + 1 = (10100101)_2 \\\
\implies -91 = (10100101)_{\overline{2}}
\end{align*}$$
As a result, $-91$ is represented as $(10100101)_{\overline{2}}$. Note that if
it weren't in two's complement such bit string would represent the unsigned
integer $165$ which is not in range $[-128, 127]$, and we get back the
$+91$ by applying those steps on it again.
Applying those steps to all integers in $[-128, -1]$, we have:
$$\begin{align*}
-128 &= (10000000)_{\overline{2}} \\\
-127 &= (10000001)_{\overline{2}} \\\
& \dots \\\
-109 &= (10010011)_{\overline{2}} \\\
& \dots \\\
-91 &= (10100101)_{\overline{2}} \\\
& \dots \\\
-3 &= (11111101)_{\overline{2}} \\\
-2 &= (11111110)_{\overline{2}} \\\
-1 &= (11111111)_{\overline{2}}
\end{align*}$$
A nice feature of two's complement is that we can easily tell whether it is
representing a negative number by looking at its first bit: if starting with 1,
then negative (just like a sign bit). And an unlikable feature is that
the minimum negative integer, here $-128$, has no positive counterpart:
applying those steps again on $-128$ yields the same bit string.
Now let's investigate further what that operation is all about. In the example
for $-91$, by picking the positive $91$, inverting its bits and adding
$1$, we are actually performing a "substitute" for the operation
$256 - (+91)$ that yields $165$. Before talking about what this operation
means, let's see why we said it is a substitute.
We don't have enough bits to write an unsigned integer $256$ with our bit
strings because that would require $9$ bits, $(100000000)_2$, but we fixed
$n = 8$ bits. For now, in order to perform $256 - (+91)$, we rewrite it as
$255 - (+91) + 1$ so it fits in with $8$ bits (more on this later). In
particular, subtracting $91$ from $255$ in binary is equivalent to invert
(negate) all bits of $91$, also known as ones' complement, that's why we
invert its bits and then add $1$:
$$\begin{array}{ccccc}
\begin{array}{rr}
& 255 \\\
- & 91 \\ \hline
& 164 \\\
+ & 1 \\ \hline
& 165
\end{array}
&
\boldsymbol{=}
&
\begin{array}{rr}
& (11111111)_2 \\\
- & (01011011)_2 \\ \hline
& (10100100)_2 \\\
+ & (00000001)_2 \\ \hline
& (10100101)_2
\end{array}
&
\boldsymbol{\equiv}
&
\begin{array}{rr}
& \\\
\neg & (01011011)_2 \\ \hline
& (10100100)_2 \\\
+ & (00000001)_2 \\ \hline
& (10100101)_2
\end{array}
\end{array}$$
OK, and what's special about the calculation $256 - (+91)$? It yields $165$
which is the missing amount for $91$ to reach $256$, namely, its complement
to $256$. Generically, we are representing the negative of a number $x$ by
using the complement of $x$ to $2^n$, denoted here as $\overline{x}$:
$$\begin{align*}
& \overline{x} + x = 2^n \\\
\implies & \overline{x} + x - x = 2^n - x \\\
\implies & \overline{x} = 2^n - x
\end{align*}$$
Such operation is called two's complement, and can be used to reserve the
upper half of the $2^n$ elements to signify a negative number and still be
able to perform arithmetic with them.
At this point you may note: if $\overline{x}$ is being considered the
opposite of $x$, then $2^n$ must be an identity element equivalent to zero.
And it is: because of the constraint of $n$ bits, an attempt to represent
$2^n$ in base 2 truncates it to a bit string of $n$ 0 bits and also causes
zero to have a unique representation. We'll see later that it relates to modular
arithmetic: $2^n \equiv 0 \pmod{2^n}$.
Back to the previous example, then why should we calculate the tricky
$255 - (+91) +1 \equiv \neg (01011011)_2 + (00000001)_2$ if we can just
calculate $256 - (+91) \equiv (00000000)_2 - (01011011)_2$ instead? Straight
answer: simplicity. The former is easier to implement in digital hardware and it
simplifies subtraction as a complement addition.
The concept explained above can be generalized for an $n$ digit number $x$
in base (radix) $b$ and it is called
radix complement
defined as:
$$\begin{align*}
\overline{x} = b^n - x
\end{align*}$$
A quick example for base $b = 5$ with numerals ${0, 1, 2, 3, 4}$ and a
fixed-length of $n = 2$ digits:
$$\begin{array}{l|l|l|l|l}
\begin{array}{c}
0 = (00)_{\overline{5}} \\\
1 = (01)_{\overline{5}} \\\
2 = (02)_{\overline{5}} \\\
3 = (03)_{\overline{5}} \\\
4 = (04)_{\overline{5}}
\end{array}
&
\begin{array}{c}
5 = (10)_{\overline{5}} \\\
6 = (11)_{\overline{5}} \\\
7 = (12)_{\overline{5}} \\\
8 = (13)_{\overline{5}} \\\
9 = (14)_{\overline{5}}
\end{array}
&
\begin{array}{c}
10 = (20)_{\overline{5}} \\\
11 = (21)_{\overline{5}} \\\
12 = (22)_{\overline{5}} \\\
-12 = (23)_{\overline{5}} \\\
-11 = (24)_{\overline{5}}
\end{array}
&
\begin{array}{c}
-10 = (30)_{\overline{5}} \\\
-9 = (31)_{\overline{5}} \\\
-8 = (32)_{\overline{5}} \\\
-7 = (33)_{\overline{5}} \\\
-6 = (34)_{\overline{5}}
\end{array}
&
\begin{array}{c}
-5 = (40)_{\overline{5}} \\\
-4 = (41)_{\overline{5}} \\\
-3 = (42)_{\overline{5}} \\\
-2 = (43)_{\overline{5}} \\\
-1 = (44)_{\overline{5}}
\end{array}
\end{array}$$
You can verify it yourself by hand as an exercise or just play with the
implementation in the next section. Notice that due to an odd number of
elements, $5^2 = 25$, we end up with a nice symmetry around zero: every
negative integer has a positive counterpart.
<form lang="en" style="font-family: monospace;">
<p>
base =
<input
oninput="calc();"
id="b"
type="number"
step="1"
style="width: 4em;"
value="2"
min="2"
max="16"
/>
<small>min=2, max=16</small><br />
n digits =
<input
oninput="calc();"
id="n"
type="number"
step="1"
style="width: 4em;"
value="8"
min="1"
/>
<small>min=1, max=<span id="max_n"></span></small><br />
decimal =
<input
oninput="calc();"
id="decimal"
type="number"
step="1"
style="width: 8em;"
value="-91"
/>
<small
>min=<span id="min_decimal"></span>, max=<span id="max_decimal"></span
></small>
</p>
<p>
radix complement = (<span id="radix-complement-string"></span>)<sub
id="b-sub"
style="text-decoration: overline;"
></sub>
</p>
</form>
<script>
// radixComplementString returns a string with n digits representing decimal
// in a radix complement according to base b.
// Ranges:
// b in [2, 16]
// n in [1, ceil(log_b(2^32))]
// decimal in
// [-(b^n)/2, (b^n)/2] , if b^n is odd,
// [-(b^n)/2, (b^n)/2 - 1], if b^n is even.
function radixComplementString(b, n, decimal) {
// Apply radix complement formula
if (decimal < 0) {
decimal = Math.pow(b, n) + decimal;
}
// Get digit string and pad zeros
return decimal.toString(b).padStart(n, "0");
}
function calc() {
// Get values.
var b = document.getElementById("b").valueAsNumber;
var n = document.getElementById("n").valueAsNumber;
var decimal = document.getElementById("decimal").valueAsNumber;
// Validate base.
if (isNaN(b) || b == 1) {
return; // User hasn't finished typing yet; do nothing.
} else if (b < 2) {
b = 2;
document.getElementById("b").value = "2";
} else if (b > 16) {
b = 16;
document.getElementById("b").value = "16";
}
document.getElementById("b-sub").innerHTML = b.toString();
// Validate n digits.
var max_n = Math.ceil(Math.log(Math.pow(2, 32)) / Math.log(b));
if (isNaN(n)) {
n = 1;
} else if (n < 1) {
n = 1;
document.getElementById("n").value = "1";
} else if (n > max_n) {
n = max_n;
document.getElementById("n").value = max_n.toString();
}
document.getElementById("max_n").innerHTML = max_n.toString();
// Define min and max decimal values.
var N = Math.pow(b, n);
var min_decimal = -Math.floor(N / 2);
var max_decimal = Math.floor(N / 2);
if (N % 2 == 0) {
max_decimal -= 1;
}
document.getElementById("min_decimal").innerHTML = min_decimal.toString();
document.getElementById("max_decimal").innerHTML = max_decimal.toString();
// Validate decimal.
if (isNaN(decimal)) {
decimal = 0;
} else if (decimal < min_decimal) {
decimal = min_decimal;
document.getElementById("decimal").value = min_decimal.toString();
} else if (decimal > max_decimal) {
decimal = max_decimal;
document.getElementById("decimal").value = max_decimal.toString();
}
// Show radix complement string.
document.getElementById("radix-complement-string").innerHTML =
radixComplementString(b, n, decimal);
}
calc();
</script>
Two's complement, or radix complement in general, works because the elements and
operation involved form a cyclic abelian group directly related to the
group of integers modulo m under addition,
denoted as $(\mathbb{Z}_m, +_m)$.
Let $G$ be a set of bit strings with fixed-length $n$ representing integers
in base 2, and $+$ be the addition of integers in base 2 constrained to $n$
bits.
Given $a, b \in G \implies a + b \in G$ because the way $+$ is defined,
$a + b$ must have $n$ bits making $G$ closed under $+$. Also, the
integers that a given $a \in G$ can represent in base 2 are in
$[0, 2^n -1]$, so $+$ is equivalent to addition $\text{mod } 2^n$.
Consequently, letting $m = 2^n$, the group $(G, +)$ can be mapped onto
$(\mathbb{Z}_m, +_m)$, therefore holding all its modular arithmetic
properties.
Regarding the $n = 8$ example from a previous section, we can calculate two's
complement for $-91$ as:
$$\begin{align*}
-91 &= -91 & \\\
&\equiv -91 + 256 &\pmod{256} \\\
&\equiv 165 &\pmod{256}
\end{align*}$$
Then $-91$ is equivalent to $165$ under $\text{mod } 256$ and we can use
$165$'s bit string to represent $-91$ in two's complement. For example,
suppose we have to calculate $112 - 91$, thus:
$$\begin{align*}
112 - 91 &= 112 - 91 & \\\
&\equiv 112 + 165 &\pmod{256} \\\
&\equiv 277 &\pmod{256} \\\
&\equiv 21 &\pmod{256} \\\
&= 21 &
\end{align*}$$