RFC X____
J. Kuijpers, Ed.
Jarvix
M. Beermann, Ed.
April 17, 2012

0xSCA: Standards Committee Assembly

Abstract

This document describes an assembly and preprocessor syntax suitable
for the DCPU-16 environment. This syntax is called the 0xSCA, or
Standards Committee Assembly.

This is not a standard.

Kuijpers & Beermann [Page 1]
Assembly Syntactics April 2012
Table of Contents

1. Introduction
  1.1. Requirements Language
2. Document Markup
  2.1. Filename
  2.2. Lines
  2.3. Indentation and whitespacing
3. Preprocessor Markup
  3.1. Comments
  3.2. Prefix
  3.3. Case insensitivity
  3.4. Directives
    3.4.1. Inclusion
      3.4.1.1. Code
      3.4.1.2. Binary
    3.4.2. Definitions
    3.4.3. Data insertion
    3.4.4. Origin relocation
    3.4.5. Macros: macro block and macro insertion
    3.4.6. Repeat block
    3.4.7. Conditionals
    3.4.8. Error reporting
    3.4.9. Alignment
  3.5. Preprocessor inline arithmetic
4. Tokenizer Markup
  4.1. Labels
  4.2. Case sensitivity
  4.3. Inline character literals
5. Conformance
  5.1. Recognition of conformance
6. Security Considerations
7. Normative References
Authors' Addresses
1. Introduction

TODO

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Document Markup

2.1. Filename

Assembly files on the DCPU-16 platform SHOULD have the filename suffix
'.dasm' or '.dasm16'. These suffixes were first used by GitHub [GHBP]
to identify DCPU-16 assembly files.
2.2. Lines

An empty line MUST be ignored by the assembler. A line MUST NOT
contain more than one instruction. A line MAY both define a label
and contain an instruction, in this order.
2.3. Indentation and whitespacing

Whitespace MUST be allowed between all elements of a line, including
but not limited to opcodes, values, syntactic characters and
preprocessor directives. Both a space (' ' U+0020) and a tab
(U+0009) are considered whitespace characters.

Indenting instructions is RECOMMENDED. Labels and preprocessor
directives SHOULD NOT be indented. An assembler MUST NOT require
indentation in order to assemble successfully.
3. Preprocessor Markup

3.1. Comments

Comments are used to add information to the code, making it more
readable and understandable. Comments can consist of any characters
in any combination. This document specifies one-line comments only.

Any characters following a semicolon (; U+003B) on the same line are
a comment and MUST be ignored, except when the semicolon resides
within the representation of a string. In that case, the semicolon
MUST NOT be treated as the start of a comment.
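
The comment rule can be illustrated with a minimal Python sketch. It
is not part of the specification: the function name strip_comment is
hypothetical, and the sketch assumes that strings are delimited by
double quotes and contain no escape sequences, since this document
does not define a string syntax beyond the .ascii and .include
examples.

```python
def strip_comment(line: str) -> str:
    """Return the line without its trailing comment.

    Assumption: strings are delimited by double quotes and have no
    escape sequences; the specification does not define either.
    """
    in_string = False
    for index, char in enumerate(line):
        if char == '"':
            # Every double quote opens or closes a string.
            in_string = not in_string
        elif char == ";" and not in_string:
            # First semicolon outside a string starts the comment.
            return line[:index].rstrip()
    return line.rstrip()
```

A semicolon inside a character literal (Section 4.3) is not handled
by this sketch and would need the same treatment as strings.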
3.2. Prefix

Every preprocessor directive starts with a prefix character. This
prefix is used to distinguish preprocessor directives from other
code.

For historical reasons, directives can start with either a dot (.
U+002E) or a number sign (# U+0023).

Preprocessor directives MUST start with a dot (. U+002E) or a number
sign (# U+0023). Documents SHOULD NOT mix usage of these, and
assemblers SHOULD NOT accept mixing them in a single document.

Using a dot is RECOMMENDED, to distinguish directives from C
preprocessor syntax.
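
As an illustration only, the following Python sketch shows one way a
tool might recognise directives and detect mixed prefixes. The names
find_directive and mixed_prefixes are hypothetical. The sketch also
assumes that a dot-prefixed name followed directly by a colon is a
local label (Section 4.1) rather than a directive, and it lowercases
directive names in line with Section 3.3.

```python
import re

# A prefix character followed by a name; a name directly followed by
# a colon is taken to be a local label, not a directive (assumption).
DIRECTIVE = re.compile(r"^\s*([.#])([A-Za-z_]\w*)\b(?!:)")


def find_directive(line):
    """Return (prefix, lowercased name) for a directive line, else None."""
    match = DIRECTIVE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2).lower()


def mixed_prefixes(lines):
    """Return True if the lines use both '.' and '#' as directive prefix."""
    found = (find_directive(line) for line in lines)
    prefixes = {directive[0] for directive in found if directive is not None}
    return len(prefixes) > 1
```

An assembler following the SHOULD NOT above could emit a warning or
an error whenever mixed_prefixes returns True for a document.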
3.3. Case insensitivity

Assemblers MUST accept directives, definitions and constants without
regard to case.
3.4. Directives

All directives in this section MUST be handled in the order in which
they appear, taking their position in the document into account. To
avoid ambiguity, a dot (.) is used throughout this section to denote
preprocessor directives.
3.4.1. Inclusion

3.4.1.1. Code

    .include "file"
    .include <file>

The former directive MUST include the file into the current file.
The path is relative to the current file. If the given file does
not exist, compilation MUST be aborted.

The latter form includes the file from an implementation-defined
location. The file need not physically exist; the name may instead
trigger implementation-specific behaviour, e.g. the inclusion of
intrinsics.
3.4.1.2. Binary

    .incbin "file"
    .incbin <file>

incbin MUST include the specified binary as raw, unprocessed data.
The path to the file is relative to the current file. All labels
following this directive MUST be offset by the size of the file.

The latter form of incbin MUST include the file from an
implementation-defined location.
3.4.2. Definitions

    .def name [value]
    .undef name

def MUST assign the constant value to name. If the value is omitted,
the literal 1 (one) MUST be assumed.

undef MUST remove the given symbol from the namespace. If the given
symbol does not exist, compilation SHOULD continue and a warning MAY
be emitted.
3.4.3. Data insertion

    .word value [,value...]
    .byte value [,value...]
    .ascii "string"

word MUST store the values literally and unpacked at the location of
the directive.

byte MUST pack (i.e. two bytes per word, first byte is LSB) the
values at the location of the directive.

ascii MUST store the string unpacked (i.e. character is LSB, one word
per character) at the location of the directive.
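
The three layouts can be sketched in Python as follows. This is an
illustration under assumptions, not a reference implementation: the
function names are hypothetical, values are masked to their storage
width, and an odd number of .byte values is padded with a zero high
byte, a case this document does not address.

```python
def emit_word(values):
    """One 16-bit word per value, stored unpacked."""
    return [value & 0xFFFF for value in values]


def emit_byte(values):
    """Two bytes per word; the first byte of each pair is the LSB.

    Assumption: a trailing unpaired byte gets a zero high byte.
    """
    words = []
    for index in range(0, len(values), 2):
        low = values[index] & 0xFF
        high = values[index + 1] & 0xFF if index + 1 < len(values) else 0
        words.append((high << 8) | low)
    return words


def emit_ascii(text):
    """One word per character, character code in the low byte."""
    # encode() raises UnicodeEncodeError for non-ASCII characters.
    return list(text.encode("ascii"))
```

For example, the values 0x12, 0x34, 0x56 given to .byte would occupy
two words in this sketch, 0x3412 and 0x0056, while the same values
given to .word would occupy three.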
3.4.4. Origin relocation

    .org address

The org preprocessor directive MUST take an address as its only
argument. Assemblers SHOULD verify that the address fits in 16 bits.
The assembler MUST add this address to the address of all labels,
creating a relocation of the program.
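
A minimal Python sketch of the relocation step, assuming labels are
held in a dictionary from name to word address. The name relocate is
hypothetical, and wrapping the sum at 16 bits is an assumption that
this document does not state.

```python
def relocate(labels, origin):
    """Add the .org address to every label address."""
    if not 0 <= origin <= 0xFFFF:
        raise ValueError("origin does not fit in 16 bits")
    # Assumption: addresses wrap around at the 16-bit boundary.
    return {name: (address + origin) & 0xFFFF
            for name, address in labels.items()}
```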
3.4.5. Macros: macro block and macro insertion

    .macro name([param [,param...]])
        code
    .end
    .ins name([param [,param...]])

The macro directive defines a macro, a parametrized block of code
that can be inserted any time later. Parameters, if any, are written
in parentheses separated by commas (,).

The ins directive MUST insert a previously defined macro, replacing
the parameters of the macro with the comma-separated arguments
following the name of the macro to insert.

Parameter substitutions can only be constant values and memory
references. Preprocessor directives inside the macro MUST be handled
upon insertion, not definition.
3.4.6. Repeat block

    .rep times
        code
    .end

The code in the repeat block MUST be repeated the number of times
specified. 'times' MUST be a positive integer. Preprocessor
directives inside the repeat block MUST be handled when the
repetition is complete, to allow conditional repetitions.
3.4.7. Conditionals

    .if expression
        codeTrue
    .else
        codeElse
    .end

    isdef(definition)

For the definition of valid expressions, see Section 3.5.

The if clause is REQUIRED. The else clause is OPTIONAL.

If expression consists of a single constant value, it MUST be treated
as the comparison 'expression = 1'.

If expression evaluates to 1, the codeTrue block MUST be assembled;
in any other case the codeElse block, if an else clause is specified,
MUST be assembled.

isdef(symbol) can be used in place of expression. isdef MUST evaluate
to 1 if the given symbol is currently defined; otherwise it MUST
evaluate to 0.

Nesting of if directives MUST be supported.
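
Nested conditionals can be handled with a small stack, as the
following Python sketch suggests. It is a simplification under stated
assumptions: the names condition_holds and select_lines are
hypothetical, the expression is limited to isdef(symbol) or a single
integer constant, symbol names are compared without regard to case
(Section 3.3), and every .end is assumed to close an .if, although
.end also closes macro and repeat blocks in this document.

```python
import re

ISDEF = re.compile(r"isdef\(\s*(\w+)\s*\)", re.IGNORECASE)


def condition_holds(expression, defined):
    """Evaluate isdef(symbol) or a single constant (true only if 1)."""
    match = ISDEF.fullmatch(expression.strip())
    if match:
        return match.group(1).lower() in defined
    return int(expression, 0) == 1


def select_lines(lines, defined):
    """Return the lines that remain after resolving .if/.else/.end."""
    defined = {name.lower() for name in defined}
    output = []
    stack = []      # one (parent_active, condition) pair per open .if
    active = True   # whether the current line is to be assembled
    for line in lines:
        words = line.strip().split(None, 1)
        keyword = words[0].lower() if words else ""
        if keyword == ".if":
            # An .if inside a skipped block is never evaluated.
            condition = active and condition_holds(words[1], defined)
            stack.append((active, condition))
            active = condition
        elif keyword == ".else":
            parent_active, condition = stack[-1]
            active = parent_active and not condition
        elif keyword == ".end":
            active, _ = stack.pop()
        elif active:
            output.append(line)
    return output
```

A full preprocessor would evaluate the expression with the rules of
Section 3.5 instead of the two special cases handled here.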
3.4.8. Error reporting

    .error message

Triggers an assembler error with the given message, stopping
execution of the assembler. The message SHOULD be shown together
with the filename and line number.
3.4.9. Alignment

    .align boundary

Aligns code or data on a doubleword or other boundary.

The assembler MUST add NOPs (0x0000) to the generated machine code
until the alignment is correct. When currentPosition is not already
a multiple of boundary, the number of words inserted can be
calculated using the formula 'boundary - (currentPosition %
boundary)' (% indicates modulus); when it already is, no words need
to be inserted.
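
The padding calculation can be sketched in Python as below. The names
padding_words and align are hypothetical. The sketch assumes that a
position which is already aligned needs no padding, as the "until the
alignment is correct" wording implies; for every other position it
yields the same count as the formula above.

```python
def padding_words(current_position, boundary):
    """Number of 0x0000 words needed to reach the next boundary."""
    if boundary <= 0:
        raise ValueError("boundary must be a positive integer")
    # Equals boundary - (current_position % boundary) when unaligned,
    # and 0 when current_position is already a multiple of boundary.
    return (-current_position) % boundary


def align(words, boundary):
    """Return the word list padded with 0x0000 up to the boundary."""
    return words + [0x0000] * padding_words(len(words), boundary)
```

With a boundary of 4, position 5 needs three padding words and
position 8 needs none.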
3.5. Preprocessor inline arithmetic

Source code can include inline arithmetic anywhere a constant value
is permitted. Inline arithmetic may only consist of + (addition), -
(subtraction), * (multiplication), / (integer division) and %
(modulus); parentheses may be used to group expressions. The
evaluation order MUST be as follows: multiplication, division,
modulus, addition, subtraction.

The following comparison, bitwise and logical operators MUST also be
supported: = (equal, also ==), != (not equal, also <>), < (smaller
than), > (greater than), <= (smaller or equal), >= (greater or
equal), & (bitwise AND), ^ (bitwise XOR), | (bitwise OR), && (logical
AND), || (logical OR) and ^^ (logical XOR). They MUST be evaluated
in the order listed.

Inline arithmetic MUST be evaluated as soon as possible, and the
result MUST be used as a literal value in place of the expression.
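
The following Python sketch shows one possible evaluator for such
expressions, using precedence climbing. It is an illustration under
assumptions, not a definitive implementation: all names are
hypothetical, only decimal and 0x-prefixed integer constants are
accepted (no symbols, no unary minus), and the listed evaluation order
is read literally, so every operator gets its own precedence level.
Whether operators such as * and / were meant to share a level is not
stated in this document.

```python
import re

# Operators in the order the section lists them; earlier binds tighter.
ORDER = ["*", "/", "%", "+", "-", "=", "!=", "<", ">", "<=", ">=",
         "&", "^", "|", "&&", "||", "^^"]
ALIASES = {"==": "=", "<>": "!="}
PRECEDENCE = {op: len(ORDER) - rank for rank, op in enumerate(ORDER)}

APPLY = {
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,
    "%": lambda a, b: a % b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "=": lambda a, b: int(a == b),
    "!=": lambda a, b: int(a != b),
    "<": lambda a, b: int(a < b),
    ">": lambda a, b: int(a > b),
    "<=": lambda a, b: int(a <= b),
    ">=": lambda a, b: int(a >= b),
    "&": lambda a, b: a & b,
    "^": lambda a, b: a ^ b,
    "|": lambda a, b: a | b,
    "&&": lambda a, b: int(bool(a) and bool(b)),
    "||": lambda a, b: int(bool(a) or bool(b)),
    "^^": lambda a, b: int(bool(a) != bool(b)),
}

TOKEN = re.compile(
    r"\s*(0[xX][0-9a-fA-F]+|\d+|==|!=|<>|<=|>=|&&|\|\||\^\^|[-+*/%()=<>&^|])")


def tokenize(text):
    """Split an expression into number, operator and parenthesis tokens."""
    text = text.rstrip()
    tokens, position = [], 0
    while position < len(text):
        match = TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected input at column {position}")
        tokens.append(ALIASES.get(match.group(1), match.group(1)))
        position = match.end()
    return tokens


def evaluate(text):
    """Evaluate an inline arithmetic expression to an integer."""
    tokens = tokenize(text)
    position = 0

    def primary():
        nonlocal position
        if position >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[position]
        position += 1
        if token == "(":
            value = expression(1)
            if position >= len(tokens) or tokens[position] != ")":
                raise ValueError("missing closing parenthesis")
            position += 1
            return value
        if token[0].isdigit():
            return int(token, 16) if token[:2].lower() == "0x" else int(token)
        raise ValueError(f"unexpected token {token!r}")

    def expression(minimum):
        nonlocal position
        left = primary()
        while (position < len(tokens)
               and PRECEDENCE.get(tokens[position], 0) >= minimum):
            operator = tokens[position]
            position += 1
            # Operands of tighter-binding operators are grouped first.
            right = expression(PRECEDENCE[operator] + 1)
            left = APPLY[operator](left, right)
        return left

    result = expression(1)
    if position != len(tokens):
        raise ValueError(f"unexpected token {tokens[position]!r}")
    return result
```

One consequence of the literal reading is that addition binds tighter
than subtraction, so '8 - 2 + 1' is grouped as '8 - (2 + 1)' in this
sketch. Parentheses avoid depending on that choice.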
4. Tokenizer Markup

4.1. Labels

Labels MUST be single-word identifiers containing only alphabetical
characters (/[A-Za-z]/), digits (/[0-9]/) and underscores (_
U+005F). A label MUST represent the address of the following
instruction or data. A label MUST NOT start with a digit. A label
definition MUST end with a colon (: U+003A). When the label is used,
the tokenizer MUST translate the label into the address it
represents.

Local labels MUST start with a dot (. U+002E) and end with a colon
(: U+003A). Local labels MUST be scoped between the surrounding
global labels. Local labels in different scopes MUST be able to have
the same name.
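
One way to implement local label scoping is to qualify each local
label with the global label that precedes it, as in this Python
sketch. The name qualify_labels and the qualified form
"global.local" are hypothetical, and the sketch assumes that local
label names use the same character set as global labels after the
leading dot.

```python
import re

GLOBAL_LABEL = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*):")
LOCAL_LABEL = re.compile(r"^(\.[A-Za-z_][A-Za-z0-9_]*):")


def qualify_labels(lines):
    """Return the label definitions, with local labels made unique."""
    scope = ""   # most recent global label
    names = []
    for line in lines:
        text = line.strip()
        if (match := GLOBAL_LABEL.match(text)):
            scope = match.group(1)
            names.append(scope)
        elif (match := LOCAL_LABEL.match(text)):
            # Prefixing with the scope lets two scopes reuse a name.
            names.append(scope + match.group(1))
    return names
```

Two .loop labels under the global labels start and other would thus
become start.loop and other.loop, and no longer clash.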
4.2. Case sensitivity

Assemblers MUST accept registers and opcodes without regard to case.
Assemblers MUST treat labels as case-sensitive.
4.3. Inline character literals

A character surrounded by apostrophes (' U+0027) MUST be interpreted
as its corresponding 7-bit ASCII value in a word (LSB). An assembler
MUST support at least the ASCII values ranging from 32 to 126
(printable characters).
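
A minimal Python sketch of the translation, with the hypothetical
name char_literal_value. Rejecting characters outside the range 32 to
126 is a choice made for this sketch; this document only requires
that at least that range is supported.

```python
def char_literal_value(token):
    """Translate a literal such as 'A' into its ASCII word value."""
    if len(token) != 3 or token[0] != "'" or token[2] != "'":
        raise ValueError("expected one character between apostrophes")
    code = ord(token[1])
    if not 32 <= code <= 126:
        # Assumption: only the mandatory printable range is accepted.
        raise ValueError("character outside the printable ASCII range")
    return code
```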
5. Conformance

5.1. Recognition of conformance

An assembler, formatter or any other assembly-related program that
is fully compliant with 0xSCA MAY label itself "0xSCA compatible".
When using this label, the program SHOULD state the version of the
RFC it was written against.
6. Security Considerations

This memo has no applicable security considerations.
7. Normative References

[GHBP]     Marti, V., "Take Over The Galaxy with GitHub", April 2012,
           <https://github.com/blog/1098-take-over-the-galaxy-with-github>.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.
Authors' Addresses

Jos Kuijpers (editor)
Jarvix
Email: [email protected]
URI: http://www.jarvix.org/

Marian Beermann (editor)
Email: [email protected]
URI: http://www.enkore.de/
It is not hard to rewrite :label to label:; some assemblers could even build in a rewriter that does this for you.
Local labels are a good idea indeed. It would be .label:, which is common in ASM languages.
Notch syntax is odd. Moreover, his example was just an example. Instead, everyone jumps on top of it and sees it as The Thing. With 0xSCA we want to fight against that because it is odd syntax.
I'll be adding local labels.
"With 0xSCA we want to fight against that because it is odd syntax."
Hey, I'm not going to argue about the weirdness of the syntax; it's clearly abnormal.
But, I am concerned somewhat by that tone. If the entire community is using a particular syntax, and dozens of tools have already been built to use that syntax, then it seems quite presumptuous for you to say, "No, you're all doing it wrong!"
I'd just as soon the standard say "any token with a colon in it will be considered a label, stripped of the colon".
There is already a lot of DCPU-16 assembly code with labels defined by starting with a colon. It is the "notch style" and has caught on tremendously quickly. It would probably be acceptable to accept labels either starting or ending with a colon, but not both.
In addition, most compilers output local labels with a leading dot. This must be supported.