TUNGUSKA Assembly language documentation

Apologies

This is a terrible manual. It no doubt contains more typos than anything else, and reads like a technical manual from hell. See it as a test of the theory 'Bad documentation is better than no documentation at all'.

Conventions

This document uses the following conventions. Anything inside square brackets [like this] is optional. Anything inside braces and separated by pipes {like|this} means you must select exactly one of the options (in the example, either like or this). 'n' means any numeral, 's' means any string, 'a' means any address or label name, and 'l' means specifically a label name.

Numerical conventions
Instructions and addressing modes
Assembler macros
Inline arithmetics
Labels and assembler variables

1. Numerical conventions

There are two main numeral classes in the Tunguska assembler: trytes and words. Trytes are 6 trits wide, and words are 12 trits wide. A tryte can generally be used in place of a word, but a word cannot be used in place of a tryte. Two special operators, 'LOW' and 'HIGH', let you extract either tryte of a word and use it as a tryte.

In the strictest sense, it is sometimes possible to use a word in place of a tryte without the assembler complaining, but it isn't recommended.

There are two allowed numeral bases, decimal and balanced nonary. Balanced nonary digits range over -4,...,4; but since there are no symbols for negative digits, the letters A,...,D are used to represent -1,...,-4. Balanced nonary is to ternary roughly what octal is to binary: each nonary digit stands for two trits. There is more documentation of balanced nonary on the Tunguska webpage.

31: Regular decimal. No prefix required. Since it is within +-364, both a word and a tryte.
1000: Regular decimal. No prefix required. Only a word, not a tryte.
%0DA: Nonary triplet. Always a tryte long.
%0DA114: Nonary sextet. Always a word long.
%000000: Nonary sextet. Always a word long, even though its value would fit in a tryte.
LOW %0DA114: Lower tryte of nonary sextet. Equivalent to %114.
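
The literals above can be checked by hand with a small sketch. The Python below is not part of the Tunguska toolchain; the function names parse_nonary, low and high are hypothetical, and the sketch assumes only what this section states: digits 0-4 and A-D, most significant digit first, trytes in the range +-364.

```python
# Hypothetical helpers for illustration; not part of the Tunguska assembler.
DIGITS = {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4,
          "A": -1, "B": -2, "C": -3, "D": -4}

TRYTE_SPAN = 3 ** 6                 # 729 distinct values in a 6-trit tryte
TRYTE_MAX = (TRYTE_SPAN - 1) // 2   # 364, the largest tryte value


def parse_nonary(literal):
    """Convert a %-prefixed balanced nonary triplet or sextet to an integer."""
    if not literal.startswith("%") or len(literal) not in (4, 7):
        raise ValueError("expected % followed by 3 or 6 nonary digits")
    value = 0
    for ch in literal[1:]:
        value = value * 9 + DIGITS[ch]
    return value


def low(word):
    """Lower tryte of a word, as a balanced value in -364..364."""
    return (word + TRYTE_MAX) % TRYTE_SPAN - TRYTE_MAX


def high(word):
    """Upper tryte of a word."""
    return (word - low(word)) // TRYTE_SPAN
```

With these definitions, low(parse_nonary("%0DA114")) and parse_nonary("%114") give the same value, which matches the LOW example above.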

2. Instructions and addressing modes

In general, there are no forbidden instructions in Tunguska. Even if you pass an unexpected addressing mode to an operator, the whole machine shouldn't crash and burn. The behavior may be unexpected, though, so don't do it on purpose. For a complete list of Tunguska instructions, see the machine specifications.

The Tunguska assembler expects all instructions to be uppercase. Any lowercase instruction will be interpreted as a label or variable, and the assembler will complain.

2.1 Addressing modes

Tunguska has the following 10 addressing modes:

OP: Implicit addressing. No argument.
OP A: Accumulator. The operation is performed with the accumulator as its argument. Strictly speaking, this is the same as implicit addressing.
OP #n: Immediate addressing. The operation is performed with the directly specified numeral as its argument.
OP a: Absolute addressing. The operation is performed on the memory at address a.
OP a,X: Absolute addressing with X offset. The operation is performed on the memory at address a+X.
OP a,Y: Absolute addressing with Y offset. The operation is performed on the memory at address a+Y.
OP (a): Indirect addressing. The operation is performed on the memory pointed to by the memory at address a.
OP (a,X): Indirect addressing with X offset. The operation is performed on the memory pointed to by the memory at address a+X.
OP (a,Y): Indirect addressing with Y offset. The operation is performed on the memory pointed to by the memory at address a+Y.
OP X,Y: XY-addressing. The operation is performed on the memory pointed to by X:Y.
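
The way each mode picks its memory address can be summarized in a rough Python sketch. It is a model for reading the list above, not a description of the machine: the mode names, the effective_address function and the read_word callback are hypothetical. Two things are assumed because this document does not specify them: how a pointer is laid out in memory (hidden behind read_word), and that X:Y means X is the high tryte and Y the low tryte of the address.

```python
TRYTE_SPAN = 3 ** 6  # 729 values per tryte


def effective_address(mode, a=0, x=0, y=0, read_word=None):
    """Return the memory address an operation would touch, or None.

    read_word(addr) is a caller-supplied function that returns the
    pointer stored at addr; its storage layout is deliberately left open.
    """
    if mode in ("implicit", "accumulator", "immediate"):
        return None                      # no memory operand
    if mode == "absolute":
        return a
    if mode == "absolute_x":
        return a + x
    if mode == "absolute_y":
        return a + y
    if mode == "indirect":
        return read_word(a)
    if mode == "indirect_x":
        return read_word(a + x)
    if mode == "indirect_y":
        return read_word(a + y)
    if mode == "xy":
        return x * TRYTE_SPAN + y        # assumption: X high tryte, Y low tryte
    raise ValueError("unknown addressing mode: " + mode)
```

For example, with a pointer table {100: 5000, 103: 6000} and X = 3, the "indirect" mode on address 100 yields 5000, while "indirect_x" yields 6000.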

3. Assembler macros

The Tunguska assembler supports a series of pseudo-instructions, or macros, that do not produce machine code themselves but let you place raw data in the assembled output or perform other assembly-time operations.

As a rule of thumb, all macros begin with an @ and are all uppercase.

@DT {n|s}[, {n|s}, ...]: Define tryte. Accepts a comma separated list of numerals and strings. These are entered into the assembled memory output as-is.
@DW {n|s|a}[, {n|s|a}, ...]: Define word. Accepts a comma separated list of numerals, strings or memory addresses (labels).
@REST {n|s} [{n|s} = 0]: Reserve argument1 trytes of memory and set each of them to argument2 (0 if omitted).
@EQU l {n|s|a}: Set variable argument1 to argument2. For most intents and purposes, this is identical to jumping to argument2 and declaring a label there.
@ORG {n|a}: Sets the instruction counter to the specified address or label. Interpret as "this is where I want the following code to go into memory."
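
To make the effect of these macros on memory layout concrete, here is a minimal Python model of the assembler's location counter. It is a sketch only: the layout function and the tuple encoding of directives are hypothetical, characters are stored as-is rather than in Tunguska's character encoding, and it assumes that a string in @DT occupies one tryte per character. @DW is left out because this document does not say how the two trytes of a word are ordered in memory.

```python
def layout(directives, origin=0):
    """Place @ORG, @DT and @REST directives into a dict of address -> value.

    Each directive is a tuple such as ("ORG", 100), ("DT", "Hi", 2)
    or ("REST", 3). Returns the memory dict and the final counter.
    """
    memory = {}
    counter = origin
    for name, *args in directives:
        if name == "ORG":
            counter = args[0]            # following data goes here
        elif name == "DT":
            for item in args:
                # Assumption: one tryte per character, one per numeral.
                values = list(item) if isinstance(item, str) else [item]
                for value in values:
                    memory[counter] = value
                    counter += 1
        elif name == "REST":
            fill = args[1] if len(args) > 1 else 0   # default fill is 0
            for _ in range(args[0]):
                memory[counter] = fill
                counter += 1
        else:
            raise ValueError("unsupported directive: " + name)
    return memory, counter
```

Calling layout([("ORG", 100), ("DT", "Hi", 2), ("REST", 3)]) fills addresses 100 through 105 and leaves the counter at 106.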

4. Inline arithmetics

The Tunguska assembler supports inline arithmetic; that is, it can evaluate pretty much any algebraic expression based on values available to it at assembly time. A magic $$ token is available, resolving to the address of the current memory position. Expressions work pretty much like you'd expect, with regular infix syntax. The only quirk is that you can't use parentheses to override precedence; you must use braces instead.

This works: {1+2} * 3 - 5 + $$ - somelabel*2
This doesn't: (1+2) * 3 - 5 + $$ - somelabel*2
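
As a rough illustration of how such an expression could be evaluated, the Python sketch below rewrites braces to parentheses and substitutes $$ and label values. The evaluate function is hypothetical and is not how the Tunguska assembler is implemented; it assumes conventional operator precedence and supports only +, - and *, since the full operator set is not listed here.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.USub: operator.neg}


def evaluate(expr, here, symbols):
    """Evaluate a brace-grouped expression.

    here is the value of $$; symbols maps label names to addresses.
    """
    if "(" in expr or ")" in expr:
        raise ValueError("use braces, not parentheses, for grouping")
    text = expr.replace("{", "(").replace("}", ")").replace("$$", "__here__")
    names = dict(symbols, __here__=here)

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(text.strip(), mode="eval").body)
```

With $$ at address 100 and somelabel at address 10, the working example above comes out as 3*3 - 5 + 100 - 20 = 84, while the parenthesized version is rejected.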

5. Labels and assembler variables

While labels and assembler variables are in many ways interchangeable, there are a few differences. A label can have local child labels and child variables, which are accessible from within the label as .localname and from outside it as label.localname; a variable cannot.

A label is declared by writing labelname: at the beginning of a line, and is used by substituting labelname wherever an address is expected.

A variable is declared with @EQU variable value, and is accessed in much the same way as a label. Furthermore, a variable can hold not only words but also trytes, which can be used, for instance, in immediate addressing mode like this: OP #variable.

A very powerful combination is the $$ token and variables. For instance, if you want to determine the length of a string automatically for later use, you can use a construct like this:

mystring: 	@DT 	'Hello world!', 2, 'How are you doing?'
		@EQU 	.length		$$ - mystring

The length of mystring (which reads: 'Hello world![new line]How are you doing?') will be stored in mystring.length at no cost in machine memory.
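
The arithmetic behind this example can be reproduced with a short Python sketch. The helper names resolve and data_length are hypothetical, and the sketch assumes, as the construct above implies, that @DT emits one tryte per character and one per numeral.

```python
def resolve(name, current_label):
    """Expand a leading-dot local name to its fully qualified form."""
    return current_label + name if name.startswith(".") else name


def data_length(*items):
    """Count the trytes an @DT list emits: one per character or numeral."""
    return sum(len(item) if isinstance(item, str) else 1 for item in items)


# Mirror of the example: $$ - mystring equals the number of trytes emitted.
symbols = {}
symbols[resolve(".length", "mystring")] = data_length(
    "Hello world!", 2, "How are you doing?")
```

Under that assumption, symbols["mystring.length"] is 12 + 1 + 18 = 31.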