_____ _____ _____ _____
/ ____| / ____|_ /\ | __ \ | __ \
| | | (___ (_) / \ | |__) || |__) |
| | \___ \ / /\ \ | ___/ | ___/
| |____ ____) |_ / ____ \ | | | |
\_____||_____/(_)/_/ \_\|_| |_|
-
-
Save GoBigorGoHome/4d0383d7f7f9e6c5cf6b6639a6495b8a to your computer and use it in GitHub Desktop.
_____ _____ _____ _____
/ ____| / ____|_ /\ | __ \ | __ \
| | | (___ (_) / \ | |__) || |__) |
| | \___ \ / /\ \ | ___/ | ___/
| |____ ____) |_ / ____ \ | | | |
\_____||_____/(_)/_/ \_\|_| |_|
Byte ordering not bit ordering. Order of bits within a byte is same whether in big-endian or small endian.
0x200b43 in 00 20 0b 43 in big-endian, 43 0b 20 00 in little-endian. It seems how the bits are ordered within a bit is not specified.
- logical right shift: fill the left end with zeros
- arithmetic right shift: fill the left end with repetions of the most significant bit
Note it wiritten as "two's complement" not "twos' complement"
The term arises from the fact that for nonnegative x we compute a w-bit representation of -x as 2^w - x.
Let x be arbitrary nonnegative integer representable with w bit, i.e. x is in the range [0, 2^{w-1}-1].
Suppose -x has a w-bit representation as 1......., and suppose the trailing w-1 bits are the unsigned presentation of ineteger P.
Immediately, we have -x = -2^{w-1} + P, then x = 2^{w-1} - P. View the bit pattern of -x as unsigned, the value is 2^{w-1} + P, obviously, 2^{w-1} + P = 2^w - (2^{w-1} - P), that is T2U(-x) = 2^w - T2U(x), since x is nonnegative, T2U(x) is same as x.
Expanding is always possible. Zero expansion for two's-complement number.
Zero extension for unsigned number (this case is trivial).
When converting short to an unsigned, the C standard requires that first short is converted to int (change the size) and then from int to unsigned.
Machine code does not associate a data type with each program value. It mostly uses the same instructions for the two cases, because many arithmetic operations have the same bit level behavior for unsigned and two's complement arithmetic. Some circumstances require different instructions to handle signed and unsigned operations, such as using different versions of right shifts, division and multiplication instructions, and different combinations of condition codes.
_____ _____ _____ _____
/ ____| / ____|_ /\ | __ \ | __ \
| | | (___ (_) / \ | |__) || |__) |
| | \___ \ / /\ \ | ___/ | ___/
| |____ ____) |_ / ____ \ | | | |
\_____||_____/(_)/_/ \_\|_| |_|
-
Write entire functions in assembly code and combine them with C functions during the linking stage.
-
Use gcc's support for embedding assembly code directly within C programs.
different registers server different roles in typical programs
The most unique register is %rsp (stack pointer) indicate the end position in the run-time stack.
The other 15 registers have more flexibility in their uses.
the movq instruction can only have immediate source operands that can be represented as 32-bits two's-complement numbers. This value is then sign extended to produce the 64-bit value for the destination. The movabsq instruction can have an arbirary 64-bit immediate value as its source operand and can only have a register as a destination.
Both classes of instructions can only have a register or memory location as the source and a register as the destination.
There is no movzlq because movl is eactaly what movzlq would have been.
cltq means Convert Long To Quad.
The effect of cltq is to sign-extend %eax to %rax. It has the exact same effect as the instruction movslq %eax, %rax, but it has a more compact encoding.
This problem asks one to explain what is wrong with each line of assembly code given.
(I include this into the note to show how I expressed the answers compared to how the authors (native speakers) had said the same thing (in italic style). This indicates how I can improve my English.)
| code | I say | they say |
|---|---|---|
movb $0xF, (%ebx) |
address must be stored in a 64-bit register | Cannot use %ebx as address register |
movl %rax, (%rsp) |
movl doesn't math %rax in length |
Mismatch between instruction suffix and register ID |
movw (%rax), 4(%rsp) |
cannot move from one memory location to another | Cannot have both source and destination be memory reference |
movb %al, %sl |
I didn't find what's wrong:( | No register named %sl |
movl %eax, $0x123 |
an immediate cannot be the destination | Cannot have immediate as destination |
movl %eax, %dx |
the length of two registers not match | Destination operand incorrect size |
movl %si, 8(%rbp) |
movb and %si do not match in size |
Mismatch between instruction suffix and register ID |
%ax %al
%bx %bl
%cx %cl
%dx %dl
%si %sil
%di %dil
%bp %bpl
%sp %spl
So it's strange that the low-order byte of %si is not named %sl since there is no register named %sx.
On page 177 (first pagraph of section 3.3), it is said
With x86-64, data type
longis implemented with 64 bits, allowing a very wide range of values.
However, on my machine (MSYS2 port of GCC 8.3.0, 64 bit Windows), size of long is 4 bytes.
The code on CS:APP3e was built with GCC 4.8.1 (Ubuntu 4.8.1-2ubuntu~12.04)
So there is some difference between the assembly code given in the book and those given by my compiler.
use ";" to start a single line comment
convert unsigned to long.
Solution from the book is
movzbl (%rdi), %eax
movq %rax, (%rsi)However, logically the first instruction should be movzbq (%rdi), %rax. I am confused here. For this "issue", Bryant, coauthor of the book, made a clarification (not an erratum!) on the book's errata page and refer to the notes the note on Figure 3.5 (p. 184). From the notes:
... copy data from a source, which can be either a register or stored in memory, to a register destination. ... fill out the remaining bytes of the destination with zeros. ...the second specifying the destination size.
....the property that an instruction (e.g.
movl) generating a 4 byte-value with a register as the destination will fill the upper 4 bytes with zeros.
From the author's clarification, it seems that we can infer that movzbl is also an instruction that generates a 4 byte-value. However, if this is true, there is no need to have movzbq and movzwq, as movzbl and movzwl do the same thing.
After all, it's only a guess, the authors didn't make the point.
But, a few more search guides me to a lecture handout made at 2005 by Bryant. From the handout:
... Perhaps unexpectedly, instructions that move or generate 32-bit register values also set the upper 32 bits of the register to zero. Consequently there is no need for an instruction movzlq. Similarly, the instruction
movzbqhas the exact same behavior asmovzblwhen the destination is a register—both set the upper 56 bits of the destination register to zero. This is in contrast to instructions that generate 8 or 16-bit values, such as movb; these instructions do not alter the other bits in the register.
In the book, however, the authors say nothing about the exact same behavior of movzbl and movzbq explicitly, leaving the reader guessing.
P.S. I find another evidence of the same effect of movzbq and movzbl in the book! On page 203, the forth line of the assembly code listing goes
movzbl %al, %eax ; Clear rest of %eax (and rest of %rax)note for this code listing says
Recall also, as discussed in Section 2.4.2, that the
movzblinstruction (line 4) clears not just the high-order 3 bytes of%eax, but the upper 4 bytes of the entire register,%rax, as well.
I got wrong on two entries.
- assign a
charto anunsigned.
Remember that unsigned is short for unsigned int, so the answer should be
movsbl (%rdi), %eax
movl %eax, (%rsi)- assign a
charto ashort
The answer is
movsbw (%rdi), %ax
movw %ax, (%rsi)leaq: (calculate and) load effective address, the destination operand must be a register.
From the book:
leaqhas no other size variants.
in first paragraph of section 3.5.
Compling the function
long scale(long x, long y, long z) {
long t = x * 4 * y + 12 * z;
return t;
}with command gcc -Og -S filename.c, gcc generates the following assembly code
leal (%rcx,%rdx,4), %eax
leal (%r8,%r8,2), %ecx
leal 0(,%rcx,4), %edx
addl %edx, %eax
retRemember that on my laptop, sizeof(long) is 4.
Obviouly, leal is a vriant of leaq.
So the author might be wrong here?
The same code when compiled on Compiler Explorer with x86-64 gcc (trunk) leads to the exact same assembly as given on the book.
So I'll try more with Compiler Explorer.
In addition to the integer registers, the CPU maintains a set of single-bit condition code registers describing attributes of the most recent arithmetic or logical operation. These registers can then be tested to perform conditional branches.
On page 204 the second paragraph, it says
in performing the computation
t = a - b, the carry flag will be set by the CMP instruction when a - b < 0, ...
But I think the carry flag will be set by the CMP instruction when a - b >= 0. See https://stackoverflow.com/a/44623872/6052725 for a supporting evidence.
I am going to make an error report to the author.
是我搞错了。关于 SUB 指令是如何影响进位标志(carry flag, CF)的,书上并未说明。不过我在 StackOverflow 上找到一个提问,正是我想问的东西。 [https://stackoverflow.com/questions/45631132/setting-cf-while-cmp]
简言之,a - b 和 a + (-b) 结果相同,但是二者对 carry flag 的作用可能不同。
The result of
a-bis the same as the result ofa+(-b). However the flags set by the operation a-b are NOT (at least not always) the same flags as the flags set by the operationa+(-b)!On the x86 the "carry flag" is defined as "borrow out" for subtractions.
PC is short for Program Counter.
From the book:
That is, they (PC relative encodings) encode the difference between the address of the target instruction and the address of the instruction immediately following the jump.
..., the value of the program counter when performing PC-relative addressing is the address of the instruction following the jump, not that of the jump itself. This convention dates back to early implementations, when the processor would update the program counters as its first step in executing an instruction.
From Wikipedia:
The PC-relative addressing mode can be used to load a register with a value stored in program memory a short distance away from the current instruction. It can be seen as a special case of the "base plus offset" addressing mode, one that selects the program counter (PC) as the "base register".