@yunchih
Last active August 29, 2015 14:08

#The TOY Machine Language

The TOY language is an experimental (educational) assembly language targeting the TOY architecture, and is part of our final project, C14-Assembler. Primary contributors to this project include Andy, Tony and Borris.

##Directive

  • .DATA
  • .TEXT

Directives are case-insensitive.
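
As a rough illustration of how the two directives partition a source file, the following Python sketch groups lines under `.DATA` and `.TEXT`, matching the directives case-insensitively. The function name `split_sections` is hypothetical, and the sketch assumes each directive sits alone on its own line, which this document does not state explicitly.

```python
def split_sections(source):
    """Group non-blank lines under the most recent .DATA / .TEXT directive."""
    sections = {".data": [], ".text": []}
    current = None
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.lower() in sections:      # directives are case-insensitive
            current = line.lower()
        elif current is None:
            # Assumption: nothing may appear before the first directive.
            raise ValueError("content found before the first directive")
        else:
            sections[current].append(line)
    return sections
```

For example, a file that starts with `.DATA` and later switches to `.text` yields one list of declarations and one list of instructions.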

##Data

  • All variables must be declared within .DATA
  • Each declaration must follow the form < variable name > < space > < value >, where
    • < variable name > := An alphanumeric symbol of at most 10 characters. It must not start with a digit. _ is allowed. For example, my_var is valid, while 0_0_My_Var is not.
    • < space > := A space or a tab.
    • < value > := Any integer in the range -32,768 to +32,767, i.e. a 16-bit signed integer. Binary literals are not supported. Prefix the value with 0x if it is written in hexadecimal. To declare an uninitialized variable, set the value to ?.
  • Note that size directives are not supported. Every variable is 16 bits wide, which corresponds to WORD in Intel x86 syntax.

####Legal declaration

.DATA
   myVar 32766
   foo   -10
   bar   ?
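
The declaration rules above can be sketched as a small Python checker. This is only an illustration, not the project's parser: the name `parse_declaration` is hypothetical, runs of spaces or tabs are accepted as the separator (as in the example above), a leading `_` is allowed, and hexadecimal values are range-checked as plain integers (so `0xFFFF` is rejected rather than read as -1). All of these are assumptions.

```python
import re

# Letters, digits and underscore; no leading digit; at most 10 characters.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,9}")
# Decimal or 0x-prefixed hexadecimal, with an optional sign. No binary form.
INT_RE = re.compile(r"[+-]?(0[xX][0-9A-Fa-f]+|[0-9]+)")

def parse_declaration(line):
    """Return (name, value) for one .DATA line; value is None for '?'."""
    parts = line.split()
    if len(parts) != 2:
        raise ValueError("expected '<variable name> <value>'")
    name, raw = parts
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid variable name: {name!r}")
    if raw == "?":
        return name, None                  # uninitialized variable
    if not INT_RE.fullmatch(raw):
        raise ValueError(f"invalid value: {raw!r}")
    value = int(raw, 16) if "x" in raw.lower() else int(raw, 10)
    if not -32768 <= value <= 32767:
        raise ValueError(f"value out of 16-bit signed range: {raw}")
    return name, value
```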

##Instruction

####Data movements

  • mov
  • lea

####Arithmetic operations

  • add
  • sub
  • inc
  • dec

####Logical operations

  • and
  • xor
  • shl
  • shr

####Flow control

  • jmp
  • jz
  • cmp

####Program counter operation

  • call
  • ret

####Stack operation

  • push
  • pop
  • top

####IO

  • read
  • print

All instructions are case-insensitive

##Complete Instruction Specification

####Terminology

  • <reg> := R<i> where i is a hexadecimal digit from 1 to F. The R is case-sensitive. For example, R1 and RA are valid register names, whereas r1 and R17 are invalid.
  • <con8> := Any 8-bit integer.
  • <con16> := Any 16-bit integer.
  • <var> := Any variable declared in .data
  • <mem> := [<reg>], [<var>] or [<con8>]. For example, [R1], [myVariable] and [0xF8] are valid, but [ R1 ] is invalid because of the spaces inside the brackets.
  • <instr> := Instruction.
  • <dest> := Destination operand.
  • <src> := Source operand.
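
To make the operand terminology concrete, here is a hedged Python sketch that classifies a single operand token. The function `classify_operand` and its `variables` argument are hypothetical. The sketch assumes the hexadecimal register digit must be uppercase, that variable names are matched exactly, and that an 8-bit address constant may range from -128 to 255; none of these details are specified here.

```python
import re

REG_RE = re.compile(r"R[1-9A-F]")   # R1..RF, capital R only
INT_RE = re.compile(r"[+-]?(0[xX][0-9A-Fa-f]+|[0-9]+)")

def _to_int(text):
    return int(text, 16) if "x" in text.lower() else int(text, 10)

def classify_operand(token, variables=()):
    """Return 'reg', 'con', 'var' or 'mem'; raise ValueError otherwise."""
    if REG_RE.fullmatch(token):
        return "reg"
    if INT_RE.fullmatch(token):
        return "con"
    if token in variables:
        return "var"
    if token.startswith("[") and token.endswith("]"):
        inner = token[1:-1]          # not stripped, so "[ R1 ]" is rejected
        if REG_RE.fullmatch(inner) or inner in variables:
            return "mem"
        # Assumption: <con8> covers both signed and unsigned 8-bit values.
        if INT_RE.fullmatch(inner) and -128 <= _to_int(inner) <= 255:
            return "mem"
    raise ValueError(f"unrecognized operand: {token!r}")
```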

####Rule

  • Binary instructions: <instr> <dest>,<src>. For example: mov R2,R1.
  • Unary instructions: <instr> <src>. For example: inc R3.
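
A minimal Python sketch of splitting one instruction line into its mnemonic and operands follows. The helper `split_instruction` is hypothetical. Accepting either commas or whitespace between operands is a leniency chosen for this sketch, not a rule taken from the specification, and the mnemonic is lowercased because instructions are case-insensitive.

```python
import re

def split_instruction(line):
    """Split an instruction line into (mnemonic, [operands])."""
    # Assumption: operands may be separated by commas, whitespace, or both.
    tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
    if not tokens:
        raise ValueError("empty instruction line")
    return tokens[0].lower(), tokens[1:]
```

With this sketch, `split_instruction("MOV R1, R2")` yields the mnemonic `mov` with operands `R1` and `R2`, and `split_instruction("inc R3")` yields `inc` with the single operand `R3`.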

####Mov

Move the content of <src> into the location referred to by <dest>

######Syntax

  • mov <reg>,<reg>
  • mov <reg>,<mem>
  • mov <mem>,<reg>
  • mov <reg>,<con16>
  • mov <mem>,<con16>
  • A memory-to-memory move is strictly forbidden: mov <mem>,<mem>

####Lea

Write the address of <src> into <dest>. ( The content of <src> is unaffected. )

######Syntax

  • lea <reg>,<mem>

####Add / Sub

Add <src> to <dest>, or subtract <src> from <dest>, and store the result in <dest>.

######Syntax

  • add / sub <reg>,<reg>
  • add / sub <reg>,<mem>
  • add / sub <mem>,<reg>
  • add / sub <reg>,<con16>
  • add / sub <mem>,<con16>
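
The syntax lists for mov, lea, add and sub can be condensed into a lookup table. The Python sketch below is one possible encoding, with hypothetical names (`ALLOWED_FORMS`, `form_allowed`) and operand kinds written as the strings `reg`, `mem` and `con16`. It only checks the shape of an instruction, not its effect.

```python
# (dest kind, src kind) pairs shared by mov, add and sub in the lists above.
TWO_OPERAND_FORMS = {
    ("reg", "reg"),
    ("reg", "mem"),
    ("mem", "reg"),
    ("reg", "con16"),
    ("mem", "con16"),
}

ALLOWED_FORMS = {
    "mov": TWO_OPERAND_FORMS,
    "lea": {("reg", "mem")},      # lea only loads an address into a register
    "add": TWO_OPERAND_FORMS,
    "sub": TWO_OPERAND_FORMS,
}

def form_allowed(mnemonic, dest_kind, src_kind):
    """Return True if the operand kinds form a listed syntax for mnemonic."""
    forms = ALLOWED_FORMS.get(mnemonic.lower())   # mnemonics ignore case
    if forms is None:
        raise KeyError(f"no syntax table for {mnemonic!r}")
    return (dest_kind, src_kind) in forms
```

A table like this keeps the forbidden memory-to-memory form out simply by never listing it.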

##Symbol / label

  • A label declaration must be on its own line; it never shares a line with an instruction.
; This is OK.
loop1:          
   mov R1, R2
   
; This is not OK
loop2: mov R1, R2        
  • A symbol can be referenced before it is declared.
  • A symbol contains only alphanumeric characters and _, and must not conflict with a register name.
  • A symbol name must not start with a digit.
  • A symbol must contain no more than 20 characters.

####Invalid symbol name

R8 ; conflicts with register name
myLongLongLongLongLongSymbol ; too long
2BeOrNot2Be ; starts with digit
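
The symbol and label rules can be checked mechanically. The Python sketch below uses the hypothetical helpers `is_valid_symbol` and `label_of`. It assumes that a leading `_` is acceptable, that only the uppercase register spellings (R1 to RF) count as conflicts, and that a trailing `;` comment may follow a label.

```python
import re

# Letters, digits and underscore; no leading digit; at most 20 characters.
SYMBOL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,19}")
REG_RE = re.compile(r"R[1-9A-F]")

def is_valid_symbol(name):
    """True if name is well-formed and does not collide with a register."""
    return bool(SYMBOL_RE.fullmatch(name)) and not REG_RE.fullmatch(name)

def label_of(line):
    """Return the label declared on this line, or None if there is none."""
    text = line.split(";", 1)[0].strip()     # ignore a trailing comment
    if ":" not in text:
        return None
    name, rest = text.split(":", 1)
    if rest.strip():
        raise ValueError("a label must be declared on its own line")
    if not is_valid_symbol(name):
        raise ValueError(f"invalid symbol name: {name!r}")
    return name
```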

##Comment

Anything following a ; is treated as a comment and is not parsed by the assembler.
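
A comment-stripping pass might look like the following Python sketch. The function `strip_comments` is hypothetical, and the sketch assumes that `;` never appears inside an operand, so the first `;` on a line always starts the comment.

```python
def strip_comments(source):
    """Remove ';' comments and blank lines; return the remaining lines."""
    kept = []
    for raw in source.splitlines():
        code = raw.split(";", 1)[0].rstrip()   # drop the comment, if any
        if code.strip():                        # skip lines left empty
            kept.append(code)
    return kept
```

Leading indentation is preserved so that later passes still see each line as it was written.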
