_____   _____              _____   _____  
  / ____| / ____|_     /\    |  __ \ |  __ \ 
 | |     | (___ (_)   /  \   | |__) || |__) |
 | |      \___ \     / /\ \  |  ___/ |  ___/ 
 | |____  ____) |_  / ____ \ | |     | |     
  \_____||_____/(_)/_/    \_\|_|     |_|

Ways to incorporate assembly code into C programs:

Write entire functions in assembly code and combine them with C functions during the linking stage.
Use gcc's support for embedding assembly code directly within C programs.

registers in an x86-64 CPU

different registers server different roles in typical programs

The most unique register is %rsp (stack pointer) indicate the end position in the run-time stack. The other 15 registers have more flexibility in their uses.

data movement instructions

`movq` vs `movabsq`

the movq instruction can only have immediate source operands that can be represented as 32-bits two's-complement numbers. This value is then sign extended to produce the 64-bit value for the destination. The movabsq instruction can have an arbirary 64-bit immediate value as its source operand and can only have a register as a destination.

`movz` and `movs` classes of instructions

Both classes of instructions can only have a register or memory location as the source and a register as the destination.

There is no movzlq because movl is eactaly what movzlq would have been.

why `cltq` is so named?

cltq means Convert Long To Quad.

The effect of cltq is to sign-extend %eax to %rax. It has the exact same effect as the instruction movslq %eax, %rax, but it has a more compact encoding.

Practice Problem 3.3

This problem asks one to explain what is wrong with each line of assembly code given.

(I include this into the note to show how I expressed the answers compared to how the authors (native speakers) had said the same thing (in italic style). This indicates how I can improve my English.)

code	I say	they say
`movb $0xF, (%ebx)`	address must be stored in a 64-bit register	Cannot use `%ebx` as address register
`movl %rax, (%rsp)`	`movl` doesn't math `%rax` in length	Mismatch between instruction suffix and register ID
`movw (%rax), 4(%rsp)`	cannot move from one memory location to another	Cannot have both source and destination be memory reference
`movb %al, %sl`	I didn't find what's wrong:(	No register named `%sl`
`movl %eax, $0x123`	an immediate cannot be the destination	Cannot have immediate as destination
`movl %eax, %dx`	the length of two registers not match	Destination operand incorrect size
`movl %si, 8(%rbp)`	`movb` and `%si` do not match in size	Mismatch between instruction suffix and register ID

names of registers and their low-order byte in the original 8086

%ax %al %bx %bl %cx %cl %dx %dl

%si %sil %di %dil %bp %bpl %sp %spl

So it's strange that the low-order byte of %si is not named %sl since there is no register named %sx.

size of `long` in C

On page 177 (first pagraph of section 3.3), it is said

With x86-64, data type long is implemented with 64 bits, allowing a very wide range of values.

However, on my machine (MSYS2 port of GCC 8.3.0, 64 bit Windows), size of long is 4 bytes.

The code on CS:APP3e was built with GCC 4.8.1 (Ubuntu 4.8.1-2ubuntu~12.04)

So there is some difference between the assembly code given in the book and those given by my compiler.

comment in assembly code

use ";" to start a single line comment

Notes (and an issue) on practice problem 3.4

convert unsigned to long.

Solution from the book is

movzbl (%rdi), %eax
movq %rax, (%rsi)

However, logically the first instruction should be movzbq (%rdi), %rax. I am confused here. For this "issue", Bryant, coauthor of the book, made a clarification (not an erratum!) on the book's errata page and refer to the notes the note on Figure 3.5 (p. 184). From the notes:

... copy data from a source, which can be either a register or stored in memory, to a register destination. ... fill out the remaining bytes of the destination with zeros. ...the second specifying the destination size.

....the property that an instruction (e.g. movl) generating a 4 byte-value with a register as the destination will fill the upper 4 bytes with zeros.

From the author's clarification, it seems that we can infer that movzbl is also an instruction that generates a 4 byte-value. However, if this is true, there is no need to have movzbq and movzwq, as movzbl and movzwl do the same thing.

After all, it's only a guess, the authors didn't make the point.

But, a few more search guides me to a lecture handout made at 2005 by Bryant. From the handout:

... Perhaps unexpectedly, instructions that move or generate 32-bit register values also set the upper 32 bits of the register to zero. Consequently there is no need for an instruction movzlq. Similarly, the instruction movzbq has the exact same behavior as movzbl when the destination is a register—both set the upper 56 bits of the destination register to zero. This is in contrast to instructions that generate 8 or 16-bit values, such as movb; these instructions do not alter the other bits in the register.

In the book, however, the authors say nothing about the exact same behavior of movzbl and movzbq explicitly, leaving the reader guessing.

P.S. I find another evidence of the same effect of movzbq and movzbl in the book! On page 203, the forth line of the assembly code listing goes

movzbl %al, %eax  ; Clear rest of %eax (and rest of %rax)

note for this code listing says

Recall also, as discussed in Section 2.4.2, that the movzbl instruction (line 4) clears not just the high-order 3 bytes of %eax, but the upper 4 bytes of the entire register, %rax, as well.

I got wrong on two entries.

assign a char to an unsigned.

Remember that unsigned is short for unsigned int, so the answer should be

movsbl (%rdi), %eax
movl %eax, (%rsi)

assign a char to a short

The answer is

movsbw (%rdi), %ax
movw %ax, (%rsi)

arithmetic and logical operations

leaq: (calculate and) load effective address, the destination operand must be a register.

notes on `leaq`

From the book:

leaq has no other size variants.

in first paragraph of section 3.5.

Compling the function

long scale(long x, long y, long z) {
   long t = x * 4 * y + 12 * z;
   return t;
}

with command gcc -Og -S filename.c, gcc generates the following assembly code

leal	(%rcx,%rdx,4), %eax
leal	(%r8,%r8,2), %ecx
leal	0(,%rcx,4), %edx
addl	%edx, %eax
ret

Remember that on my laptop, sizeof(long) is 4. Obviouly, leal is a vriant of leaq. So the author might be wrong here?

The same code when compiled on Compiler Explorer with x86-64 gcc (trunk) leads to the exact same assembly as given on the book.

So I'll try more with Compiler Explorer.

Condition code

In addition to the integer registers, the CPU maintains a set of single-bit condition code registers describing attributes of the most recent arithmetic or logical operation. These registers can then be tested to perform conditional branches.

An issue with the testing of unsigned comparisons

On page 204 the second paragraph, it says

in performing the computation t = a - b, the carry flag will be set by the CMP instruction when a - b < 0, ...

~~But I think the carry flag will be set by the CMP instruction when a - b >= 0. See https://stackoverflow.com/a/44623872/6052725 for a supporting evidence.~~

~~I am going to make an error report to the author.~~

是我搞错了。关于 SUB 指令是如何影响进位标志(carry flag, CF)的，书上并未说明。不过我在 StackOverflow 上找到一个提问，正是我想问的东西。 [https://stackoverflow.com/questions/45631132/setting-cf-while-cmp]

简言之，a - b 和 a + (-b) 结果相同，但是二者对 carry flag 的作用可能不同。

The result of a-b is the same as the result of a+(-b). However the flags set by the operation a-b are NOT (at least not always) the same flags as the flags set by the operation a+(-b)!

On the x86 the "carry flag" is defined as "borrow out" for subtractions.

Understanding how jump targets are encoded will become important when study linking.

interpret the output of a disassembler

program counter

why PC relative so named？

PC is short for Program Counter.

From the book:

That is, they (PC relative encodings) encode the difference between the address of the target instruction and the address of the instruction immediately following the jump.

..., the value of the program counter when performing PC-relative addressing is the address of the instruction following the jump, not that of the jump itself. This convention dates back to early implementations, when the processor would update the program counters as its first step in executing an instruction.

From Wikipedia:

The PC-relative addressing mode can be used to load a register with a value stored in program memory a short distance away from the current instruction. It can be seen as a special case of the "base plus offset" addressing mode, one that selects the program counter (PC) as the "base register".

GoBigorGoHome/chap1.md

Select an option

No results found

Select an option

No results found

understand byte ordering

two forms of right shift

why is “two's complement” so named

Explanation

effect of expanding or shrinking an encoded integer to fit a represent'n with a different length

relative order of conversion from one data size to another and between unsigned and signed

How machine code does or does not distinguish between signed and unsigned values

Ways to incorporate assembly code into C programs:

registers in an x86-64 CPU

data movement instructions

`movq` vs `movabsq`

`movz` and `movs` classes of instructions

why `cltq` is so named?

Practice Problem 3.3

names of registers and their low-order byte in the original 8086

size of `long` in C

comment in assembly code

Notes (and an issue) on practice problem 3.4

arithmetic and logical operations

notes on `leaq`

Condition code

An issue with the testing of unsigned comparisons

Understanding how jump targets are encoded will become important when study linking.

interpret the output of a disassembler

program counter

why PC relative so named？

GoBigorGoHome/chap1.md

understand byte ordering

two forms of right shift

why is “two's complement” so named

Explanation

effect of expanding or shrinking an encoded integer to fit a represent'n with a different length

relative order of conversion from one data size to another and between unsigned and signed

How machine code does or does not distinguish between signed and unsigned values

Ways to incorporate assembly code into C programs:

registers in an x86-64 CPU

data movement instructions

movq vs movabsq

movz and movs classes of instructions

why cltq is so named?

Practice Problem 3.3

names of registers and their low-order byte in the original 8086

size of long in C

comment in assembly code

Notes (and an issue) on practice problem 3.4

arithmetic and logical operations

notes on leaq

Condition code

An issue with the testing of unsigned comparisons

Understanding how jump targets are encoded will become important when study linking.

interpret the output of a disassembler

program counter

why PC relative so named？

`movq` vs `movabsq`

`movz` and `movs` classes of instructions

why `cltq` is so named?

size of `long` in C

notes on `leaq`