This is the manual for 64tass, the multi pass optimizing macro assembler for the 65xx series of processors. Key features:
linkerwith section support
Contrary how the length of this document suggests 64tass can be used with just basic 6502 assembly knowledge in simple ways like any other assembler. If some advanced functionality is needed then this document can serve as a reference.
This is a development version. Features or syntax may change as a result
of corrections in non-backwards compatible ways in some rare cases. It's
difficult to get everything right
first time.
Project page: https://sourceforge.net/projects/tass64/
The page hosts the latest and older versions with sources and a bug and a feature request tracker.
64tass is a command line assembler, the source can be written in any text
editor. As a minimum the source filename must be given on the command line. The
command line option is highly recommended if the source is Unicode or
ASCII.
-a
64tass -a src.asm
There are also some useful parameters which are described later.
For comfortable compiling I use such Makefile
s (for make):
demo.prg: source.asm macros.asm pic.drp music.bin 64tass -C -a -B -i source.asm -o demo.tmp pucrunch -ffast -x 2048 demo.tmp >demo.prg
This way demo.prg
is recreated by compiling source.asm
whenever source.asm
, macros.asm
, pic.drp
or music.bin
had changed.
Of course it's not much harder to create something similar for win32 (make.bat), however this will always compile and compress:
64tass.exe -C -a -B -i source.asm -o demo.tmp pucrunch.exe -ffast -x 2048 demo.tmp >demo.prg
Here's a slightly more advanced Makefile example with default action as testing in VICE, clean target for removal of temporary files and compressing using an intermediate temporary file:
all: demo.prg x64 -autostartprgmode 1 -autostart-warp +truedrive +cart $< demo.prg: demo.tmp pucrunch -ffast -x 2048 $< >$@ demo.tmp: source.asm macros.asm pic.drp music.bin 64tass -C -a -B -i $< -o $@ .INTERMEDIATE: demo.tmp .PHONY: all clean clean: $(RM) demo.prg demo.tmp
It's useful to add a basic header to your source files like the one below, so that the resulting file is directly runnable without additional compression:
* = $0801 .word (+), 2005 ;pointer, line number .null $9e, format("%4d", start);will be sys 4096 + .word 0 ;basic line end * = $1000 start rts
A frequently coming up question is, how to automatically allocate
memory, without hacks like *=*+1
? Sure
there's .byte
and friends for variables with initial values
but what about zero page, or RAM outside of program area? The solution
is to not use an initial value by using
or not
giving a fill byte value to ?
.fill
.
* = $02 p1 .addr ? ;a zero page pointer temp .fill 10 ;a 10 byte temporary area
Space allocated this way is not saved in the output as there's no data to save at those addresses.
What about some code running on zero page for speed? It needs to be relocated, and the length must be known to copy it there. Here's an example:
ldx #size(zpcode)-1;calculate length - lda zpcode,x sta wrbyte,x dex ;install to zero page bpl - jsr wrbyte rts ;code continues here but is compiled to run from $02 zpcode .logical $02 wrbyte sta $ffff ;quick byte writer at $02 inc wrbyte+1 bne + inc wrbyte+2 + rts .endlogical
The assembler supports lists and tuples, which does not seems interesting at first as it sound like something which is only useful when heavy scripting is involved. But as normal arithmetic operations also apply on all their elements at once, this could spare quite some typing and repetition.
Let's take a simple example of a low/high byte jump table of return
addresses, this usually involves some unnecessary copy/pasting to create a pair
of tables with constructs like >(label-1)
.
jumpcmd lda hibytes,x ; selected routine in X register
pha
lda lobytes,x ; push address to stack
pha
rts ; jump, rts will increase pc by one!
; Build a list of jump addresses minus 1
_ := (cmd_p, cmd_c, cmd_m, cmd_s, cmd_r, cmd_l, cmd_e)-1
lobytes .byte <_ ; low bytes of jump addresses
hibytes .byte >_ ; high bytes
There are some other tips below in the descriptions.
Integer constants can be entered as decimal digits of arbitrary length. An underscore can be used between digits as a separator for better readability of long numbers. The following operations are accepted:
x + y | add x to y | 2 + 2 is 4
|
x - y | subtract y from x | 4 - 1 is 3
|
x * y | multiply x with y | 2 * 3 is 6
|
x / y | integer divide x by y | 7 / 2 is 3
|
x % y | integer modulo of x divided by y | 5 % 2 is 1
|
x ** y | x raised to power of y | 2 ** 4 is 16
|
-x | negated value | -2 is -2
|
+x | unchanged | +2 is 2
|
~x | -x - 1 | ~3 is -4
|
x | y | bitwise or | 2 | 6 is 6
|
x ^ y | bitwise xor | 2 ^ 6 is 4
|
x & y | bitwise and | 2 & 6 is 2
|
x << y | logical shift left | 1 << 3 is 8
|
x >> y | arithmetic shift right | -8 >> 3 is -1
|
Integers are automatically promoted to floats as necessary in expressions.
Other types can be converted to integer using the integer type
int
.
Integer division is a floor division (rounding down) so 7 / 4
is 1
and not 1.75
. If ceiling division is required (rounding up) that
can be done by negating both the divident and the result. Typically it's done like 0 - -5 / 4
which results in 2
.
.byte 23 ; as unsigned .char -23 ; as signed ; using negative integers as immediate values ldx #-3 ; works as '#-' is signed immediate num = -3 ldx #+num ; needs explicit '#+' for signed 8 bits lda #((bitmap >> 10) & $0f) | ((screen >> 6) & $f0) sta $d018
Bit string constants can be entered in hexadecimal form with a leading dollar sign or in binary with a leading percent sign. An underscore can be used between digits as a separator for better readability of long numbers. The following operations are accepted:
~x | invert bits | ~%101 is ~%101
|
y .. x | concatenate bits | $a .. $b is $ab
|
y x n | repeat | %101 x 3 is %101101101
|
x[n] | extract bit(s) | $a[1] is %1
|
x[s] | slice bits | $1234[4:8] is $3
|
x | y | bitwise or | ~$2 | $6 is ~$0
|
x ^ y | bitwise xor | ~$2 ^ $6 is ~$4
|
x & y | bitwise and | ~$2 & $6 is $4
|
x << y | bitwise shift left | $0f << 4 is $0f0
|
x >> y | bitwise shift right | ~$f4 >> 4 is ~$f
|
Length of bit string constants are defined in bits and is calculated from the number of bit digits used including leading zeros.
Bit strings are automatically promoted to integer or floating point as necessary in expressions. The higher bits are extended with zeros or ones as needed.
Bit strings support indexing and slicing. This is explained in detail
in section Slicing and indexing
.
Other types can be converted to bit string using the bit string type bits
.
.byte $33 ; 8 bits in hexadecimal .byte %00011111 ; 8 bits in binary .text $1234 ; $34, $12 (little endian) lda $01 and #~$07 ; 8 bits even after inversion ora #$05 sta $01 lda $d015 and #~%00100000 ;clear a bit sta $d015
Floating point constants have a radix point in them and optionally an
exponent. A decimal exponent is
while a binary one is e
.
An underscore can be used between digits as a separator for better
readability. The following operations can be used:
p
x + y | add x to y | 2.2 + 2.2 is 4.4
|
x - y | subtract y from x | 4.1 - 1.1 is 3.0
|
x * y | multiply x with y | 1.5 * 3 is 4.5
|
x / y | integer divide x by y | 7.0 / 2.0 is 3.5
|
x % y | integer modulo of x divided by y | 5.0 % 2.0 is 1.0
|
x ** y | x raised to power of y | 2.0 ** -1 is 0.5
|
-x | negated value | -2.0 is -2.0
|
+x | unchanged | +2.0 is 2.0
|
~x | almost -x | ~2.1 is almost -2.1
|
x | y | bitwise or | 2.5 | 6.5 is 6.5
|
x ^ y | bitwise xor | 2.5 ^ 6.5 is 4.0
|
x & y | bitwise and | 2.5 & 6.5 is 2.5
|
x << y | logical shift left | 1.0 << 3.0 is 8.0
|
x >> y | arithmetic shift right | -8.0 >> 4 is -0.5
|
As usual comparing floating point numbers for (non) equality is a bad idea due to rounding errors.
The only predefined constant is pi
.
Floating point numbers are automatically truncated to integer as necessary.
Other types can be converted to floating point by using the type float
.
Fixed point conversion can be done by using the shift operators. For example
an 8.16 fixed point number can be calculated as (3.14 << 16) & $ffffff
.
The binary operators operate like if the floating point number would be a fixed
point one. This is the reason for the strange definition of inversion.
.byte 3.66e1 ; 36.6, truncated to 36 .byte $1.8p4 ; 4:4 fixed point number (1.5) .sint 12.2p8 ; 8:8 fixed point number (12.2)
Character strings are enclosed in single or double quotes and can hold any Unicode character.
Operations like indexing or slicing are always done on the original representation. The current encoding is only applied when it's used in expressions as numeric constants or in context of text data directives.
Doubling the quotes inside string literals escapes them and results in a single quote.
y .. x | concatenate strings | "a" .. "b" is "ab"
|
y in x | is substring of | "b" in "abc" is true
|
a x n | repeat | "ab" x 3 is "ababab"
|
a[i] | character from start | "abc"[1] is "b"
|
a[-i] | character from end | "abc"[-1] is "c"
|
a[:] | no change | "abc"[:] is "abc"
|
a[s:] | cut off start | "abc"[1:] is "bc"
|
a[:-s] | cut off end | "abc"[:-1] is "ab"
|
a[s] | reverse | "abc"[::-1] is "cba"
|
Character strings are converted to integers, byte and bit strings as necessary using the current
encoding and escape rules. For example when using a sane encoding "z"-"a"
is
25
.
Other types can be converted to character strings by using the type
str
or by using the repr
and format
functions.
Character strings support indexing and slicing. This is explained in detail
in section Slicing and indexing
.
mystr = "oeU" ; character string constant .text 'it''s' ; it's .word "ab"+1 ; conversion result is "bb" usually .text "text"[:2] ; "te" .text "text"[2:] ; "xt" .text "text"[:-1] ; "tex" .text "reverse"[::-1]; "esrever"
Byte strings are like character strings, but hold bytes instead of characters.
Quoted character strings prefixing by
, b
, l
, n
, p
, s
or x
z
characters can be used to create byte strings. The resulting byte
string contains what .text
, .shiftl
,
.null
, .ptext
and .shift
would
create. Direct hexadecimal entry can be done using the
prefix and
x
denotes a z85 encoded byte string. Spaces can be used between pairs of
hexadecimal digits as a separator for better readability.
z
y .. x | concatenate strings | x"12" .. x"34" is x"1234"
|
y in x | is substring of | x"34" in x"1234" is true
|
a x n | repeat | x"ab" x 3 is x"ababab"
|
a[i] | byte from start | x"abcd12"[1] is x"cd"
|
a[-i] | byte from end | x"abcd"[-1] is x"cd"
|
a[:] | no change | x"abcd"[:] is x"abcd"
|
a[s:] | cut off start | x"abcdef"[1:] is x"cdef"
|
a[:-s] | cut off end | x"abcdef"[:-1] is x"abcd"
|
a[s] | reverse | x"abcdef"[::-1] is x"efcdab"
|
Byte strings support indexing and slicing. This is explained in detail
in section Slicing and indexing
.
Other types can be converted to byte strings by using the type bytes
.
.enc "screen" ;use screen encoding mystr = b"oeU" ;convert text to bytes, like .text .enc "none" ;normal encoding .text mystr ;text as originally encoded .text s"p1" ;convert to bytes like .shift .text l"p2" ;convert to bytes like .shiftl .text n"p3" ;convert to bytes like .null .text p"p4" ;convert to bytes like .ptext
Binary data may be embedded in source code by using hexadecimal byte
strings. This is more compact than using .byte
followed by a lot
of numbers. As expected 1 byte becomes 2 characters.
.text x"fce2" ;2 bytes: $fc and $e2 (big endian)
If readability is not a concern then the more compact z85 encoding may be used which encodes 4 bytes into 5 characters. Data lengths not a multiple of 4 are handled by omitting leading zeros in the last group.
.text z"FiUj*2M$hf";8 bytes: 80 40 20 10 08 04 02 01
For data lengths of multiple of 4 bytes any z85 encoder will do. Otherwise the
simplest way to encode a binary file into a z85 string is to create a source file
which reads it using the line
. Now if the labels
are listed to a file then there will be a z85 encoded definition for this
label.
label = binary('filename')
Lists and tuples can hold a collection of values. Lists are defined from
values separated by comma between square brackets [1, 2, 3]
, an
empty list is []
. Tuples are similar but are enclosed in
parentheses instead. An empty tuple is ()
, a single element tuple
is (4,)
to differentiate from normal numeric expression
parentheses. When nested they function similar to an array. Both
types are immutable.
y .. x | concatenate lists | [1] .. [2] is [1, 2]
|
y in x | is member of list | 2 in [1, 2, 3] is true
|
a x n | repeat | [1, 2] x 2 is [1, 2, 1, 2]
|
a[i] | element from start | ("1", 2)[1] is 2
|
a[-i] | element from end | ("1", 2, 3)[-1] is 3
|
a[:] | no change | (1, 2, 3)[:] is (1, 2, 3)
|
a[s:] | cut off start | (1, 2, 3)[1:] is (2, 3)
|
a[:-s] | cut off end | (1, 2.0, 3)[:-1] is (1, 2.0)
|
a[s] | reverse | (1, 2, 3)[::-1] is (3, 2, 1)
|
*a | convert to arguments | format("%d: %s", *mylist)
|
... op a | left fold | ... + (1, 2, 3) is ((1+2)+3)
|
a op ... | right fold | (1, 2, 3) - ... is (1-(2-3))
|
Arithmetic operations are applied on the all elements recursively,
therefore [1, 2] + 1
is [2, 3]
, and abs([1,
-1])
is [1, 1]
.
Arithmetic operations between lists are applied one by one on their
elements, so [1, 2] + [3, 4]
is [4, 6]
.
When lists form an array and columns/rows are missing the smaller array is
stretched to fill in the gaps if possible, so [[1], [2]] * [3, 4]
is [[3, 4], [6, 8]]
.
Lists and tuples support indexing and slicing. This is explained in detail
in section Slicing and indexing
.
mylist = [1, 2, "whatever"] mytuple = (cmd_e, cmd_g) mylist = ("e", cmd_e, "g", cmd_g, "i", cmd_i) keys .text mylist[::2] ; keys ("e", "g", "i") call_l .byte <mylist[1::2]-1; routines (<cmd_e-1, <cmd_g-1, <cmd_i-1) call_h .byte >mylist[1::2]-1; routines (>cmd_e-1, >cmd_g-1, >cmd_i-1)
Although lists elements of variables can't be changed using indexing (at the moment) the same effect can be achieved by combining slicing and concatenation:
lst := lst[:2] .. [4] .. lst[3:]; same as lst[2] := 4 would be
Folding is done on pair of elements either forward (left) or reverse (right). The list must contain at least one element. Here are some folding examples:
minimum = size([part1, part2, part3]) <? ... maximum = size([part1, part2, part3]) >? ... sum = size([part1, part2, part3]) + ... xorall = list_of_numbers ^ ... join = list_of_strings .. ... allbits = sprites.(left, middle, right).bits | ... all = [true, true, true, true] && ... any = [false, false, false, true] || ...
The range(start, end, step)
built-in function can be used to
create lists of integers in a range with a given step value. At least the end
must be given, the start defaults to 0 and the step to 1. Sounds not very
useful, so here are a few examples:
;Bitmask table, 8 bits from left to right .byte %10000000 >> range(8) ;Classic 256 byte single period sinus table with values of 0–255. .byte 128 + 127.5 * sin(range(256) * pi / 128) ;Screen row address tables _ := $400 + range(0, 1000, 40) scrlo .byte <_ scrhi .byte >_
Dictionaries hold key and value pairs. Definition is done by collecting
key:value pairs separated by comma between braces {"key":"value",
:"default value"}
.
Looking up a non-existing key is normally an error
unless a default value is given. An empty dictionary is {}
.
This type is immutable. There are limitations what may be used as a
key but the value can be anything.
y .. x | combine dictionaries | {1:2, 3:4} .. {2:3, 3:1} is {1:2, 2:3, 3:1}
|
x[i] | value lookup | {"1":2}["1"] is 2
|
x.i | symbol lookup | {.ONE:1, .TWO:2}.ONE is 1
|
y in x | is a key | 1 in {1:2} is true
|
; Simple lookup .text {1:"one", 2:"two"}[2]; "two" ; 16 element "fader" table 1->15->12->11->0 .byte {1:15, 15:12, 12:11, :0}[range(16)] ; Symbol accessible values. May be useful as a function return value too. coords = {.x: 24, .y: 50} ldx #coords.x ldy #coords.y
Code holds the result of compilation in binary and other enclosed objects. In an arithmetic operation it's used as the numeric address of the memory where it starts. The compiled content remains static even if later parts of the source overwrite the same memory area.
Indexing and slicing of code to access the compiled content might be implemented differently in future releases. Use this feature at your own risk for now, you might need to update your code later.
a.b | b member of a | label.locallabel
|
.b in a | if a has symbol b | .locallabel in label
|
a[i] | element from start | label[1]
|
a[-i] | element from end | label[-1]
|
a[:] | copy as tuple | label[:]
|
a[s:] | cut off start, as tuple | label[1:]
|
a[:-s] | cut off end, as tuple | label[:-1]
|
a[s] | reverse, as tuple | label[::-1]
|
mydata .word 1, 4, 3 mycode .block local lda #0 .endblock ldx #size(mydata) ;6 bytes (3*2) ldx #len(mydata) ;3 elements ldx #mycode[0] ;lda instruction, $a9 ldx #mydata[1] ;2nd element, 4 jmp mycode.local ;address of local label
Addressing modes are used for determining addressing modes of instructions.
For indexing there must be no white space between the comma and the register letter, otherwise the indexing operator is not recognized. On the other hand put a space between the comma and a single letter symbol in a list to avoid it being recognized as an operator.
# | immediate |
#+ | signed immediate |
#- | signed immediate |
( ) | indirect |
[ ] | long indirect |
,b | data bank indexed |
,d | direct page indexed |
,k | program bank indexed |
,r | data stack pointer indexed |
,s | stack pointer indexed |
,x | x register indexed |
,y | y register indexed |
,z | z register indexed |
Parentheses are used for indirection and square brackets for long indirection. These operations are only available after instructions and functions to not interfere with their normal use in expressions.
Several addressing mode operators can be combined together. Currently the complexity is limited to 4 operators. This is enough to describe all addressing modes of the supported CPUs.
# | immediate | lda #$12
|
#+ | signed immediate | lda #+127
|
#- | signed immediate | lda #-128
|
#addr,#addr | move | mvp #5,#6
|
addr | direct or relative | lda $12 lda $1234 bne $1234
|
bit,addr | direct page bit | rmb 5,$12
|
bit,addr,addr | direct page bit relative jump | bbs 5,$12,$1234
|
(addr) | indirect | lda ($12) jmp ($1234 )
|
(addr),y | indirect y indexed | lda ($12),y
|
(addr),z | indirect z indexed | lda ($12),z
|
(addr,x) | x indexed indirect | lda ($12,x) jmp ($1234,x)
|
[addr] | long indirect | lda [$12] jmp [$1234]
|
[addr],y | long indirect y indexed | lda [$12],y
|
#addr,b | data bank indexed | lda #0,b
|
#addr,b,x | data bank x indexed | lda #0,b,x
|
#addr,b,y | data bank y indexed | lda #0,b,y
|
#addr,d | direct page indexed | lda #0,d
|
#addr,d,x | direct page x indexed | lda #0,d,x
|
#addr,d,y | direct page y indexed | ldx #0,d,y
|
(#addr,d) | direct page indirect | lda (#$12,d)
|
(#addr,d,x) | direct page x indexed indirect | lda (#$12,d,x)
|
(#addr,d),y | direct page indirect y indexed | lda (#$12,d),y
|
(#addr,d),z | direct page indirect z indexed | lda (#$12,d),z
|
[#addr,d] | direct page long indirect | lda [#$12,d]
|
[#addr,d],y | direct page long indirect y indexed | lda [#$12,d],y
|
#addr,k | program bank indexed | jsr #0,k
|
(#addr,k,x) | program bank x indexed indirect | jmp (#$1234,k,x)
|
#addr,r | data stack indexed | lda #1,r
|
(#addr,r),y | data stack indexed indirect y indexed | lda (#$12,r),y
|
#addr,s | stack indexed | lda #1,s
|
(#addr,s),y | stack indexed indirect y indexed | lda (#$12,s),y
|
addr,x | x indexed | lda $12,x
|
addr,y | y indexed | lda $12,y
|
Direct page, data bank, program bank indexed and long addressing modes
of instructions are intelligently chosen based on the instruction type,
the address ranges set up by .dpage
, .databank
and the current program counter address. Therefore the
,
,d
and ,b
indexing is only used in very special cases.
,k
The immediate direct page indexed
addressing
mode is usable for direct page access. The 8 bit constant is a
direct offset from the start of actual direct page. Alternatively it may
be written as #0,d
.
0,d
The immediate data bank indexed
addressing
mode is usable for data bank access. The 16 bit constant is a direct
offset from the start of actual data bank. Alternatively it may be
written as #0,b
.
0,b
The immediate program bank indexed
addressing mode
is usable for program bank jumps, branches and calls. The 16 bit constant
is a direct offset from the start of actual program bank. Alternatively it may
be written as #0,k
.
0,k
The immediate stack indexed
and data stack indexed
#0,s
accept 8 bit constants as an offset from the
start of (data) stack. These are sometimes written without the immediate
notation, but this makes it more clear what's going on. For the same reason the
move instructions are written with an immediate addressing mode #0,r
as well.
#0,#0
The immediate (#
) addressing mode expects unsigned values of byte or
word size. Therefore it only accepts constants of 1 byte or in range 0–255
or 2 bytes or in range 0–65535.
The signed immediate (#+
and #-
) addressing
mode is to allow signed numbers to be used as immediate constants. It accepts
a single byte or an integer in range −128–127, or two bytes or an integer of
−32768–32767.
The use of signed immediate (like #-3
) is seamless, but
it needs to be explicitly written out for variables or expressions
(#+variable
). In case the unsigned variant is needed but the
expression starts with a negation then it needs to be put into parentheses
(#(-variable)
) or else it'll change the address mode to
signed.
Normally addressing mode operators are used in expressions right after instructions. They can also be used for defining stack variable symbols when using a 65816, or to force a specific addressing mode.
param = #1,s ;define a stack variable const = #1 ;immediate constant lda #0,b ;always "absolute" lda $0000 lda param ;results in lda #$01,s lda param+1 ;results in lda #$02,s lda (param),y ;results in lda (#$01,s),y ldx const ;results in ldx #$01 lda #-2 ;negative constant, $fe
There's a special value for uninitialized memory, it's represented by a
question mark. Whenever it's used to generate data it creates a hole
where the previous content of memory is visible.
Uninitialized memory holes without previous content are not saved unless it's really necessary for the output format, in that case it's replaced with zeros.
It's not just data generation statements (e.g. .byte
) that can
create uninitialized memory, but .fill
, .align
or address manipulation as
well.
* = $200 ;bytes as necessary .word ? ;2 bytes .fill 10 ;10 bytes .align 64 ;bytes as necessary
There are two predefined boolean constant variables, true
and false
.
Booleans are created by comparison operators (<
,
<=
, !=
, ==
, >=
, >
),
logical operators (&&
, ||
, ^^
,
!
), the membership operator (in
) and the
all
and any
functions.
Normally in numeric expressions true
is 1
and
false
is 0
, unless the
command line option was
used.
-Wstrict-bool
Other types can be converted to boolean by using the type bool
.
bits | At least one non-zero bit |
bool | When true |
bytes | At least one non-zero byte |
code | Address is non-zero |
float | Not 0.0
|
int | Not zero |
str | At least one non-zero byte after translation |
The various types mentioned earlier have predefined names. These can used for conversions or type checks.
address | Address type |
bits | Bit string type |
bool | Boolean type |
bytes | Byte string type |
code | Code type |
dict | Dictionary type |
float | Floating point type |
gap | Uninitialized memory type |
int | Integer type |
list | List type |
str | Character string type |
tuple | Tuple type |
type | Type type |
Bit and byte string conversions can take a second parameter to specify and exact size. Values which can fit in shorter space will be padded but longer ones give an error.
.cerror type(var) != str, "Not a string!" .text str(year) ; convert to string
Symbols are used to reference objects. Regularly named, anonymous and local symbols are supported. These can be constant or re-definable.
Scopes are where symbols are stored and looked up. The global scope is always defined and it can contain any number of nested scopes.
Symbols must be uniquely named in a scope, therefore in big programs it's hard to come up with useful and easy to type names. That's why local and anonymous symbols exists. And grouping certain related symbols into a scope makes sense sometimes too.
Scopes are usually created by .proc
and .block
directives, but there are a few other ways. Symbols in a scope can be accessed
by using the dot operator, which is applied between the name of the scope and the symbol (e.g. myconsts.math.pi
).
Regular symbol names are starting with a letter and containing
letters, numbers and underscores. Unicode letters are allowed if the
command line option was used.
There's no restriction on the length of symbol names.
-a
Care must be taken to not use duplicate names in the same scope when the symbol is used as a constant as there can be only one definition for them.
Duplicate names in parent scopes are not a problem and this gives the
ability to override names defined in lower scopes. However this can just as well lead to mistakes if
a lower scoped symbol with the same name was meant so there's a
command line option to warn if such ambiguity exists.
-Wshadow
Case sensitivity can be enabled with the
command line option, otherwise all
symbols are matched case insensitive.
-C
For case insensitive matching it's possible to check for consistent symbol name use
with the
command
line option.
-Wcase-symbol
A regular symbol is looked up first in the current scope, then in lower scopes until the global scope is reached.
f .block
g .block
n nop ;jump here
.endblock
.endblock
jsr f.g.n ;reference from a scope
f.x = 3 ;create x in scope f with value 3
Local symbols have their own scope between two regularly named code symbols and are assigned to the code symbol above them.
Therefore they're easy to reuse without explicit scope declaration directives.
Not all regularly named symbols can be scope boundaries just plain code symbol ones
without anything or an opcode after them (no macros!). Symbols defined as procedures, blocks,
macros, functions, structures and unions are ignored. Also symbols defined by
.var
, :=
or =
don't apply, and there are a few more
exceptions, so stick to using plain code labels.
The name must start with an underscore (_
), otherwise the same character
restrictions apply as for regular symbols. There's no restriction on the length of the name.
Care must be taken to not use the duplicate names in the same scope when the symbol is used as a constant.
A local symbol is only looked up in it's own scope and nowhere else.
incr inc ac bne _skip inc ac+1 _skip rts decr lda ac bne _skip dec ac+1 _skip dec ac ;symbol reused here jmp incr._skip ;this works too, but is not advised
Anonymous symbols don't have a unique name and are always called as a single plus or minus sign. They are also called as forward (+
)
and backward (-
) references.
When referencing them
means the first backward,
-
means the second backwards and so on. It's the same for forward, but with
--
. In expressions it may be necessary to put them into brackets.
+
ldy #4 - ldx #0 - txa cmp #3 bcc + adc #44 + sta $400,x inx bne - dey bne --
Excessive nesting or long distance references create poorly readable code. It's also very easy to copy-paste a few lines of code with these references into a code fragment already containing similar references. The result is usually a long debugging session to find out what went wrong.
These references are also useful in segments, but this can create a nice trap when segments are copied into the code with their internal references.
bne + #somemakro ;let's hope that this segment does + nop ;not contain forward references...
Anonymous symbols are looked up first in the current scope, then in lower scopes until the global scope is reached.
Anonymous labels within conditionally assembled code are counted even if the code itself is not compiled and the label won't get defined. This ensures that anonymous labels are always at the same "distance" independent of the conditions in between.
Constant symbols can be created with the equal sign. These are not re-definable. Forward referencing of them is allowed as they retain the objects over compilation passes.
Symbols in front of code or certain assembler directives are created as constant symbols too. They are bound to the object following them.
Re-definable symbols can be created by the .var
directive
or :=
construct. These are also called as variables. They
don't carry their content over from the previous pass therefore it's not
possible to use them before their definition.
If the variable already exists in the current scope it'll get updated.
If an existing variable needs to be updated in a parent scope then the
::=
variable reassign operator is able to do that.
Variables can be conditionally defined using the :?=
construct.
If the variable was defined already then the original value is retained
otherwise a new one is created with this value.
WIDTH = 40 ;a constant lda #WIDTH ;lda #$28 variabl .var 1 ;a variable var2 := 1 ;another variable variabl .var variabl + 1;update it verbosely var2 += 1 ;compound assignment (add one) var3 :?= 5 ;assign 5 if undefined
The
symbol denotes the current program
counter value. When accessed it's value is the program counter at the
beginning of the line. Assigning to it changes the program counter and
the compiling offset.
*
Built-in functions are pre-assigned to the symbols listed below. If you reuse these symbols in a scope for other purposes then they become inaccessible, or can perform a different function.
Built-in functions can be assigned to symbols (e.g. sinus = sin
), and
the new name can be used as the original function. They can even be passed as
parameters to functions.
floor(-4.8)
is -5.0
round(4.8)
is 5.0
ceil(1.1)
is 2.0
trunc(-1.9)
is -1
frac(1.1)
is 0.1
sqrt(16.0)
is 4.0
cbrt(27.0)
is 3.0
log10(100.0)
is 2.0
log(1)
is 0.0
exp(0)
is 1.0
pow(2.0, 3.0)
is 8.0
sin(0.0)
is 0.0
asin(0.0)
is 0.0
sinh(0.0)
is 0.0
cos(0.0)
is 1.0
acos(1.0)
is 0.0
cosh(0.0)
is 1.0
tan(0.0)
is 0.0
atan(0.0)
is 0.0
tanh(0.0)
is 0.0
rad(0.0)
is 0.0
deg(0.0)
is 0.0
hypot(4.0, 3.0)
is 5.0
atan2(0.0, 3.0)
is 0.0
abs(-1)
is 1
sign(-5)
is -1
These functions return byte strings of various lengths for signed numbers, unsigned numbers and addresses.
The naming of functions is not a coincidence and they return the bytes what the data directives with the same names normally emit.
byte(0)
is x"00"
and
char(-1)
is x"ff"
word(1024)
is x"0004"
and
sint(-1)
is x"ffff"
long(123456)
is x"40E201"
and
lint(-1)
is x"ffffff"
dword(123456789)
is x"15CD5B07"
and
dint(-1)
is x"ffffffff"
addr(start)
is x"0d08"
rta(4096)
is x"ff0f"
all.
all bits set or no bits at all | all($f) is true
|
all characters non-zero or empty string | all("c") is true
|
all bytes non-zero or no bytes | all(x"ac24") is true
|
all elements true or empty list | all([true, true, false]) is false
|
Only booleans in a list are accepted with the
command line option.-Wstrict-bool
any.
at least one bit set | any(~$f) is false
|
at least one non-zero character | any("c") is true
|
at least one non-zero byte | any(x"ac24") is true
|
at least one true element | any([true, true, false]) is true
|
Only booleans in a list are accepted with the
command line option.-Wstrict-bool
This function reads the content of a binary file as a byte string. It also accepts optional offset and length parameters.
Read everything | binary(name)
|
Skip starting bytes | binary(name, offset)
|
Some bytes from offset | binary(name, offset, length)
|
sid = binary("music.sid"); read in the SID file as bytes offs := sid[[$7, $6]] ; data offset (big endian) load := sid[[$9, $8]] ; load address (big endian) init = sid[[$b, $a]] ; init address (big endian) play = sid[[$d, $c]] ; play address (big endian) ; if load address is zero then it's the first 2 bytes of data .if load == 0 load := sid[offs:offs+2] ; load address (little endian) offs += 2 ; skip load address bytes .endif * = load ; set pc to load address .text sid[offs:] ; dump music data
The format
function converts a list of values into a character string.
The converted values are inserted in place of the %
sign. Optional
conversion flags and minimum field length may follow, before the conversion
type character. These flags can be used:
# | alternate form (-$a , ~$a , -%10 , ~%10 , -10. )
|
* | width/precision from list |
. | precision |
0 | pad with zeros |
- | left adjusted (default right) |
| blank when positive or minus sign |
+ | sign even if positive |
~ | binary and hexadecimal as bits |
The following conversion types are implemented:
b | binary |
c | Unicode character |
d | decimal |
e E | exponential float (uppercase) |
f F | floating point (uppercase) |
g G | exponential/floating point |
s | string |
r | representation |
x X | hexadecimal (uppercase) |
% | percent sign |
.text format("%#04x bytes left", 1000); $03e8 bytes left
bit string | length in bits | len($034) is 12
|
character string | number of characters | len("abc") is 3
|
byte string | number of bytes | len(x"abcd23") is 3
|
tuple, list | number of elements | len([1, 2, 3]) is 3
|
dictionary | number of elements | len({1:2, 3:4]) is 2
|
code | number of elements | len(label)
|
The sequence does not change across compilations and is the same every time.
Different sequences can be generated by seeding with .seed
.
floating point number 0.0 <= x < 1.0 | random()
|
integer in range of 0 <= x < e | random(e)
|
integer in range of s <= x < e | random(s, a)
|
integer in range of s <= x < e , step t | random(s, a, t)
|
.seed 1234 ; default is boring, seed the generator .byte random(256); a pseudo random byte (0–255) .byte random([16] x 8); 8 pseudo random bytes (0–15)
integers from 0 to e-1 | range(e)
|
integers from s to e-1 | range(s, a)
|
integers from s to e (not including e ), step t | range(s, a, t)
|
.byte range(16) ; 0, 1, ..., 14, 15 .char range(-5, 6); -5, -4, ..., 4, 5 mylist = range(10, 0, -2); [10, 8, 6, 4, 2]
.warn repr(var) ; pretty print value, for debugging
var .word 0, 0, 0 ldx #size(var) ; 6 bytes var2 = var + 2 ; start 2 bytes later ldx #size(var2) ; what remains is 4 bytes
If the original list contains further lists then these must be all of the same length. In this case the order of lists is determined by comparing their elements from the start until a difference is found. The sort is stable.
; sort IRQ routines by their raster lines sorted = sort([(60, irq1), (50, irq2)]) lines .byte sorted[:, 0] ; 50, 60 irqs .addr sorted[:, 1] ; irq2, irq1
The following operators are available. Not all are defined for all types of arguments and their meaning might slightly vary depending on the type.
- | negative | + | positive |
! | not | ~ | invert |
* | convert to arguments | ^ | decimal string |
The
decimal string operator will be changed to mean the
bank byte soon. Please update your sources to use ^
format("%d", xxx)
instead!
This is done to be in line with it's use in most other assemblers.
+ | add | - | subtract |
* | multiply | / | divide |
% | modulo | ** | raise to power |
| | binary or | ^ | binary xor |
& | binary and | << | shift left |
>> | shift right | . | member |
.. | concat | x | repeat |
in | contains | !in | excludes |
Spacing must be used for the
and x
operators or else they won't
be recognized as such. For example the expression in
should be written as [1,2]x2
instead.
[1,2]x 2
Parenthesis (( )
) can be used to override operator precedence.
Don't forget that they also denote indirect addressing mode for certain
opcodes.
lda #(4+2)*3
Traditional comparison operators give false or true depending on the result.
The compare operator (<=>
) gives −1 for less, 0 for equal and 1 for more.
<=> | compare | ||
== | equals | != | not equal |
< | less than | >= | more than or equals |
> | more than | <= | less than or equals |
=== | identical | !== | not identical |
These unary operators extract 8 or 16 bits. Usually they are used to get parts of a memory address.
< | lower byte | > | higher byte |
<> | lower word | >` | higher word |
>< | lower byte swapped word | ` | bank byte |
lda #<label ; low byte of address ldy #>label ; high byte of address jsr $ab1e ldx #<>source ; word extraction ldy #<>dest lda #size(source)-1 mvn #`source, #`dest; bank extraction
Please note that these prefix operators are not strongly binding like negation or inversion. Instead they apply to the whole expression to the right. This may be unexpected but is required for compatibility with old sources which expect this behaviour.
lda #<label+10 ;This is <(label+10) and not (<label)+10
;The check below is wrong and should be written as (>start) != (>end)
.cerror >start != >end;Effectively this is >(start != (>end))
Boolean conditional operators give false or true or one of the operands as the result.
x || y | if x is true then x otherwise y
|
x ^^ y | if both false or true then false otherwise x || y
|
x && y | if x is true then y otherwise x
|
!x | if x is true then false otherwise true
|
c ? x : y | if c is true then x otherwise y
|
c ?? x : y | if c is true then x otherwise y (broadcasting)
|
x <? y | if x is smaller then x otherwise y
|
x >? y | if x is greater then x otherwise y
|
;Silly example for 1=>"simple", 2=>"advanced", else "normal" .text MODE == 1 && "simple" || MODE == 2 && "advanced" || "normal" .text MODE == 1 ? "simple" : MODE == 2 ? "advanced" : "normal" ;Limit result to 0 .. 8 light .byte 0 >? range(-16, 101)/6 <? 8
Please note that these are not short circuiting operations and both sides are calculated even if thrown away later.
With the
command line option
booleans are required as arguments and only the -Wstrict-bool
operator may return something else.
?
Special addressing length forcing operators in front of an expression can be used to make sure the expected addressing mode is used. Only applicable when used directly at the mnemonic.
@b | to force 8 bit address |
@w | to force 16 bit address |
@l | to force 24 bit address (65816) |
lda @w $0000 ; force the use of 2 byte absolute addressing bne @b label ; prevent upgrade to beq+jmp with long branches in use lda @w #$00 ; use 2 bytes independent of accumulator size
These assignment operators are short hands for updating variables. Constants can't be changed of course.
The variables on the left must be defined beforehand by
or :=
.
.var
Compound assignment operators can modify variables defined in parent scopes as well.
+= | add | -= | subtract |
*= | multiply | /= | divide |
%= | modulo | **= | raise to power |
|= | binary or | ^= | binary xor |
&= | binary and | ||= | logical or |
&&= | logical and | <<= | shift left |
>>= | shift right | ..= | concat |
<?= | smaller | >?= | greater |
x= | repeat | .= | member |
v += 1 ; same as 'v ::= v + 1'
Lists, character strings, byte strings and bit strings support various
slicing and indexing possibilities through the []
operator.
Indexing elements with positive integers is zero based. Negative indexes
are transformed to positive by adding the number of elements to them, therefore
−1 is the last element. Indexing with list of integers is possible as well so
[1, 2, 3][(-1, 0, 1)]
is [3, 1, 2]
.
Slicing is an operation when parts of sequence is extracted from a start
position to an end position with a step value. These parameters are separated
with colons enclosed in square brackets and are all optional. Their default
values are [start:maximum:step=1]
. Negative start and end characters are
converted to positive internally by adding the length of string to them.
Negative step operates in reverse direction, non-single steps will jump over
elements.
This is quite powerful and therefore a few examples will be given here:
a[x]
"abcd"[1]
results in "b"
.a[-x]
"abcd"[-2]
results in "c"
.a[:to]
to. So
[10,20,30,40][:-1]
results in [10,20,30]
.a[from:]
from. So
[10,20,30,40][-2:]
results in [30,40]
.a[from:to]
fromand stopping before
to. The two end positions can be positive or negative indexes. So
[10,20,30,40][1:-1]
results in [20,30]
.a[:]
a[::-1]
"abcd"[::-1]
is "dcba"
.a[from:to:step]
stepth element starting from
fromand stopping before
to. So
"abcdef"[1:4:2]
results in "bd"
.
The fromand
tocan be omitted in case it starts from the beginning or end at the end. If the
stepis negative then it's done in reverse.
a[list]
"abcd"[[1,3]]
will be "bd"
.The fun start with nested lists and tuples, as these can be used to create a matrix. The examples will be given for a two dimensional matrix for easier understanding, but this also works in higher dimensions.
a[x]
[(1,2),(3,4)]
matrix [0]
will give the first row which is (1,2)
a[from:to]
[(1,2),(3,4),(5,6),(7,8)]
matrix [1:3]
will give [(3,4),(5,6)]
a[x]
[(1,2),(3,4)]
matrix [:,0]
will give the first column of all rows which is [1,3]
a[:,from:to]
[(1,2,3,4),(5,6,7,8)]
matrix [:,1:3]
will give [(2,3),(6,7)]
And it works for list of indexes, negative indexes, stepped ranges, reversing, etc. on all axes in too many ways to show all possibilities.
Basically it's just the indexing and slicing applied on nested constructs, where each nesting level is separated by a comma.
Two counters are used while assembling.
The compile offset is where the data and code ends up in memory (or in image file).
The program counter is what labels get set to and what the special star label refers to.
Normally both are the same (code is compiled to the location it runs from) but it does not need to be.
;Offset ;PC ;Hex ;Monitor ;Source * = $0800 .0800 label1 .logical $1000 .0800 1000 label2 * = $1200 .0a00 1200 label3 .endlogical .0a00 label4
Popular in old TASM code where this was the only way to create relocated code, otherwise it's use is not recommended as there are easier to use alternatives below.
;Offset ;PC ;Hex ;Monitor ;Source * = $1000 .1000 ea nop nop .offs 100 .1065 1001 ea nop nop
Changes the program counter only, the compile offset is not changed. When finished all continues where it was left off before.
The naming is not logical at all for relocated code, but that's how it was named in old 6502tass.
It's used for code copied to it's proper location at runtime. Can be nested of course.
;Offset ;PC ;Hex ;Monitor ;Source * = $1000 .logical $300 .1000 0300 a9 80 lda #$80 drive lda #$80 .1002 0302 85 00 sta $00 sta $00 .1004 0304 4c 00 03 jmp $0300 jmp drive .endlogical
Changes the program counter to the expression (if given) and discards the result of compilation. This is useful to define structures to fixed addresses.
.virtual $d400 ; base address sid .block freq .word ? ; frequency pulsew .word ? ; pulse width control .byte ? ; control ad .byte ? ; attack/decay sr .byte ? ; sustain/release .endblock .endvirtual
Or to define stack "allocated" variables on 65816.
.virtual #1,s p1 .addr ? ; at #1,s tmp .byte ? ; at #3,s .endvirtual lda (p1),y ; lda ($01,s),y
Alignment is about constraining data/code placement in memory.
The processor architecture doesn't have hard constraints on instruction or data placement still pages (256 bytes) come up quite often in instruction cycle times tables. Or even in errata like the indirect JMP bug which happens only if the word of the vector is crossing such page.
Other components like video chips can only display object if placed at an address divisible by 64 for example.
For code half of an address table might be spared if it's known that all the addresses have the same high bytes. Or if all interrupt routines are on the same page then it's enough to change the low byte of the vector when selecting another one.
Now it shouldn't come as a surprise that the following directives are mainly concerned about how dividing the program counter address gives a certain remainder.
The divisor in this context is called the alignment interval and is usually a number which is a power of two. Quite often 256, so that's the default.
The remainder is called offset and is by default 0. Negative offsets are a convenience feature and are internally corrected by adding the interval to it.
An interval sized memory area is called a page. It's boundary is at it's start. If data spans more than one page it's known as a page boundary cross.
Having a non-zero offset effectively shifts the boundary of a page in memory
further up or down (if negative). An interval of 256 with offset of 8 gives page
boundaries of $1008
, $1108
or $1208
for
example.
If the alignment is not good enough some alignment directives might try to
correct it by adding padding. This is by default uninitialized (skip forward) but may be a fixed byte
or anything more complex similarly to what the .fill
directive accepts.
When alignment is done within named structures then it's relative to the start of the structure. This means the structure layout will always be the same independent of which address it's instantiated at. Anonymous structures do not change the way the alignment works.
The
command line option can be used to emit
warnings on where and how much padding was necessary for alignment.-Walign
This directive is a passive assertion and checks for a page difference or page crossing.
By default or with a negative interval parameter it verifies that the start and end directives are on the same page. This is what's needed to guard relative branches against jumping across pages:
ldx #3
.page ;now this will execute
- dex ;in 14 cycles for sure
bne -
.endpage
With a positive size parameter it verifies that there's no page cross in the memory range between the directives. This is what's needed to guard against indexed access page cross cycle penalties:
* = $10c0 .page 256 table .fill $40 ;table within the same page .endpage ;different page here but no crossing
Normally a page check results in an error but the
command line option can
reduce it into a warning.
-Wno-error=page
Once this directive reports an error it's time to rearrange the source in a way that the check passes. Or alternatively the alignment directives below can be used to avoid violating the assertion.
This directive is useful when code/data needs to be placed exactly to a page boundary. If that's not already the case sufficient padding is added until the next one is reached.
.align $40 ;sprite bitmap (64 byte aligned) sprite .fill 63 .align $400 ;screen memory (1024 byte aligned) screen .fill 1000 .align $400, ?, -8;sprite pointers (last 8 bytes) spritep .fill 8 .align ; page sized buffer at page boundary sendbuf .fill 256 ;to avoid indexing penalty cycles
Often the start address is not important only avoiding the page boundary matters.
This often can be achieved without any padding at all. If padding is necessary then this directive works
the same as .align
including alignment within structures.
It's typically used to place tables so that absolute indexed read accesses won't suffer page crossing cycle penalties.
.alignblk ;avoid page cross table .byte 0, 1, 2, 3, 4, 5, 6, 7 .endalignblk lda table,x ;no cycles wasted on access
In case the stronger guarantee of having both the start and the end directives in the same page is required then the alignment interval needs to be given as a negative number (e.g. −256). This may be necessary for aligning code with relative branches.
If the block size varies based on its memory location then doing the alignment may become impossible.
Using .alignblk
in the middle of
executable code is usually problematic as the alignment is done there as well.
This directive can do the alignment padding outside of the execution flow.
rts
.alignpageind pageblk;add alignment padding here
wait ldx #3
pageblk .page ;now this will execute
- dex ;in 14 cycles for sure
bne -
.endpage
By default and with a negative interval it tries to avoids page differences.
With positive intervals page crosses. Same as the .page
assertion block.
It is assumed that the padding inserted will move the target block as if it'd be right in front of it. If this isn't the case the alignment will fail.
If the block size varies based on its memory location then doing the alignment may become impossible.
This directive tries to align the target to a page boundary. If not already on one then sufficient padding will be added until the next one is reached.
;Align "pos" to page boundary. It must come right after "neg". .alignind pos neg .fill 8 pos .fill 8 .cerror (<pos) != 0, "pos should be page aligned" .cerror pos - neg != size(neg), "there should be no gap"
It is assumed that the padding inserted will move the target as if it'd be right in front of it. If this isn't the case the alignment will fail.
.fill
directive is used to
reserve space but it may be useful to do alignments as well.
;replacement for a .cerror overrun check and *= combo .fill start_address - * ;align the vectors "block" so it ends at end_address .fill end_address - size(vectors) - * vectors .logical * ;dummy non-scoped block for size() ... ;screen memory is needed but if at $9xxx then take $a000 instead .align $400 ;next 1024 byte alignment .fill (* >> 12) == $9 ? ($a000 - *) : 0 screen .fill 1000
Multi byte numeric data is stored in the little-endian order, which is the natural byte order for 65xx processors. Numeric ranges are enforced depending on the directives used. Signed numbers are stored as two's complement.
When using lists or tuples their content will be used one by one.
Uninitialized data (
) creates holes of different sizes. Character string constants
are converted using the current encoding.
?
Please note that multi character strings usually don't fit into 8 bits
and therefore the .byte
directive is not appropriate for them.
Use .text
instead which accepts strings of any length.
>1000 ff 03 .byte 255, $03 >1002 41 .byte "a" >1003 .byte ? ; reserve 1 byte >1004 fd .char -3 ;Store 4.4 signed fixed point constants >1005 c8 34 32 .char (-3.5, 3.25, 3.125) * 1p4 ;Compact computed jumps using self modifying code .1008 bd 0f 10 lda $1010,x lda jumps,x .100b 8d 0e 10 sta $100f sta smod+1 .100e d0 fe bne $100e smod bne * ;Routines nearby (-128 to 127 bytes) >1010 23 49 jumps .char (routine1, routine2)-smod-2
>1000 42 23 55 45 .word $2342, $4555 >1004 .word ? ; reserve 2 bytes >1006 eb fd 51 11 .sint -533, 4433 ;Store 8.8 signed fixed point constants >100a 80 fc 40 03 20 03 .sint (-3.5, 3.25, 3.125) * 1p8 .1010 bd 19 10 lda $1019,x lda texts,x .1013 bc 1a 10 ldy $101a,x ldy texts+1,x .1016 4c 1e ab jmp $ab1e jmp $ab1e >1019 33 10 59 10 texts .word text1, text2
* = $12000 .012000 7c 03 20 jmp ($012003,x) jmp (jumps,x) >012003 50 20 32 03 92 15 jumps .addr $12050, routine1, routine2 ;Computed jumps by using stack (current bank) * = $103000 .103000 bf 0c 30 10 lda $10300c,x lda rets+1,x .103004 48 pha pha .103005 bf 0b 30 10 lda $10300b,x lda rets,x .103009 48 pha pha .10300a 60 rts rts >10300b ff ef a1 36 f3 42 rets .rta $10f000, routine1, routine2
>1000 56 34 12 .long $123456 >1003 .long ? ; reserve 3 bytes >1006 eb fd ff 51 11 00 .lint -533, 4433 ;Store 8.16 signed fixed point constants >100c 5d 8f fc 66 66 03 1e 85 .lint (-3.44, 3.4, 3.52) * 1p16 >1014 03 ;Computed long jumps with jump table (65816) .1015 bd 2a 10 lda $102a,x lda jumps,x .1018 8d 11 03 sta $0311 sta ind .101b bd 2b 10 lda $102b,x lda jumps+1,x .101e 8d 12 03 sta $0312 sta ind+1 .1021 bd 2c 10 lda $102c,x lda jumps+2,x .1024 8d 13 03 sta $0313 sta ind+2 .1027 dc 11 03 jmp [$0311] jmp [ind] >102a 32 03 01 92 05 02 jumps .long routine1, routine2
>1000 78 56 34 12 .dword $12345678 >1004 .dword ? ; reserve 4 bytes >1008 5d 7a 79 e7 .dint -411469219 ;Store 16.16 signed fixed point constants >100c 5d 8f fc ff 66 66 03 00 .dint (-3.44, 3.4, 3.52) * 1p16 >1014 1e 85 03 00
For cases not covered by the numeric store directives above it's possible to
convert numbers to byte or bit strings and store the resulting string. If the
count expression of bytes()
and bits()
is negative then the stored number is
signed otherwise unsigned.
>1000 74 65 78 74 00 00 00 00 .text bytes("text", 8);pad up to 8 bytes >1008 f4 ff ff ff ff ff ff ff .text bytes(-12, -8) ;8 bytes signed >1010 00 04 00 00 00 00 .text bits(1024, 48) ;48 bits unsigned >1016 f4 ff ff ff ff ff .text bits(-12, -48) ;48 bits signed
The following directives store strings of characters, bytes or bits as bytes. Small numeric constants can be mixed in to represent single byte control characters.
When using lists or tuples their content will be used one by one.
Uninitialized data (
) creates byte sized holes. Character string constants
are converted using the current encoding.?
>1000 4f 45 d5 .text "oeU" >1003 4f 45 d5 .text 'oeU' >1006 17 33 .text 23, $33 ; bytes >1008 0d 0a .text $0a0d ; $0d, $0a, little endian! >100a 1f .text %00011111; more bytes
>1000 .fill $100 ;no fill, just reserve $100 bytes >1100 00 00 00 .fill $4000, 0 ;16384 bytes of 0 ... >5100 55 aa 55 .fill 8000, [$55, $aa];8000 bytes of alternating $55, $aa ... >7040 ff ff ff .fill $8000 - *, $ff;fill up rest of EPROM with $ff ...
Any byte which already has the most significant bit set will cause an error. The last byte can't be uninitialized or missing of course.
The naming comes from old TASM and is a reference to setting the high bit of alphabetic letters which results in it's uppercase version in PETSCII.
.1000 a2 00 ldx #$00 ldx #0 .1002 bd 10 10 lda $1010,x loop lda txt,x .1005 08 php php .1006 29 7f and #$7f and #$7f .1008 20 d2 ff jsr $ffd2 jsr $ffd2 .100b e8 inx inx .100c 28 plp plp .100d 10 f3 bpl $1002 bpl loop .100f 60 rts rts >1010 53 49 4e 47 4c 45 20 53 txt .shift "single", 32, "string" >1018 54 52 49 4e c7
Any byte which already has the most significant bit set will cause an error as this is cut off on shifting. The last byte can't be uninitialized or missing of course.
The naming is a reference to left shifting.
.1000 a2 00 ldx #$00 ldx #0 .1002 bd 0d 10 lda $100d,x loop lda txt,x .1005 4a lsr a lsr a .1006 9d 00 04 sta $0400,x sta $400,x ;screen memory .1009 e8 inx inx .100a 90 f6 bcc $1002 bcc loop .100c 60 rts rts .enc "screen" >100d a6 92 9c 8e 98 8a 40 a6 txt .shiftl "single", 32, "string" >1015 a8 a4 92 9c 8f .enc "none"
.text
, but adds a zero byte to the end. An existing
zero byte is an error as it'd cause a false end marker.
.1000 a9 07 lda #$07 lda #<txt .1002 a0 10 ldy #$10 ldy #>txt .1004 20 1e ab jsr $ab1e jsr $ab1e >1007 53 49 4e 47 4c 45 20 53 txt .null "single", 32, "string" >100f 54 52 49 4e 47 00
.text
, but prepend the number of bytes in front of the
string (pascal style string). Therefore it can't do more than 255 bytes.
.1000 a9 1d lda #$1d lda #<txt .1002 a2 10 ldx #$10 ldx #>txt .1004 20 08 10 jsr $1008 jsr print .1007 60 rts rts .1008 85 fb sta $fb print sta $fb .100a 86 fc stx $fc stx $fc .100c a0 00 ldy #$00 ldy #0 .100e b1 fb lda ($fb),y lda ($fb),y .1010 f0 0a beq $101c beq null .1012 aa tax tax .1013 c8 iny - iny .1014 b1 fb lda ($fb),y lda ($fb),y .1016 20 d2 ff jsr $ffd2 jsr $ffd2 .1019 ca dex dex .101a d0 f7 bne $1013 bne - .101c 60 rts null rts >101d 0d 53 49 4e 47 4c 45 20 txt .ptext "single", 32, "string" >1025 53 54 52 49 4e 47
64tass supports sources written in UTF-8, UTF-16 (be/le) and RAW 8 bit encoding. To take advantage of this capability custom encodings can be defined to map Unicode characters to 8 bit values in strings. Even in plain ASCII sources it could be useful to define escape sequences for control codes.
Predefined encodings names are none
and screen
(screen code),
anything else is user defined. All user encodings start without any character
or escape definitions, add some as required. Please note that the encoding names
are global.
This directive changes the text encoding after it therefore it's usually placed somewhere at the beginning of the source to make sure everything is covered.
While it is possible to juggle with multiple encodings throughout the source
code using the .enc
directive this is not recommended. For such
use case .encode
is better suited.
.enc "screen";screen code mode >1000 13 03 12 05 05 0e 20 03 .text "screen codes" >1008 0f 04 05 13 .100c c9 15 cmp #$15 cmp #"u" ;compare screen code .enc "none" ;normal mode again .100e c9 55 cmp #$55 cmp #"u" ;compare PETSCII
This directive either creates a new text encoding (if used without a parameter) or makes the one in the parameter effective within the enclosed area.
The text encoding can be assigned to a symbol in front of the directive so it can be reused whenever it's needed. This symbol can also act as a conversion function which converts a character string to a byte string using the encoding.
.encode ;starts anonymous local encoding scope .enc "titlefont";special character set .text "game title" .endencode ;restores original encoding vt100 .encode ;define custom encoding .cdef " ~", 32 .edef "{esc}", 27;add escape codes .edef "{moff}", [27, "[", "m"] .edef "{bold}", [27, "[", "1", "m"] .endencode .encode vt100 ;use custom encoding from here .text "{bold}bold{moff} text" lda #"{esc}" .endencode ;restores original encoding cmp #vt100("{esc}");conversion when not in scope .enc vt100 ;select custom encoding (at start of source)
This is a simple single character to byte translation definition. It's useful to map a range of Unicode characters to a range of bytes. The start and end positions are Unicode character codes either by numbers or by typing them. Overlapping ranges are not allowed.
.enc "ascii" ;define an ascii encoding .cdef " ~", 32 ;identity mapping for printable
Similar to .cdef
it is a single character to byte translation
definition. It's easier to use when the character codes are not consecutive.
Overlapping ranges with the former and itself are not allowed.
It tries to assign Unicode character codes from the first expression to byte values from the second. More than one pair of such assignments can be given.
If the byte value expression is not iterable then it will get incremented for each character definition. This allows easy assignment of randomly scattered Unicode values to a consecutive range of bytes values.
.tdef "A", 65 ;A -> 65 .tdef "ACX", 65 ;A -> 65, C-> 66, X -> 67 .tdef "ACX", [65, 33, 11];A -> 65, C-> 33, X -> 11
When these substrings are found in a text they are replaced by bytes defined here. When strings with common prefixes are used the longest match wins. Useful for defining non-typeable control code aliases, or as a simple tokeniser.
.edef "\n", 13 ;one byte control codes .edef "{clr}", 147 .edef "{crlf}", [13, 10];two byte control code .edef "<nothing>", [];replace with no bytes
The example below shows how all this fits together:
petscii .namespace common .segment;common definitions .cdef " @", $20;32-64 is identical .tdef "[£]↑←", $5b, "┼🮌🭳", $db .edef "{clr}", 147, "{cr}", 13 .endsegment upper .encode;uppercase PETSCII #common .cdef "AZ", $41 .tdef "▌▄▔▁▏▒▕🮏◤🮇├▗└┐▂┌┴┬┤▎▍🮈🮂🮃▃🭿▖▝┘▘▚🭹", $a1 .tdef "♠🭲🭸🭷🭶🭺🭱🭴╮╰╯🭼╲╱🭽🭾●🭻♥🭰╭╳○♣🭵♦", $c1 .tdef "◥π─│", [$df, $ff, $c0, $dd] .endencode lower .encode;lowercase PETSCII #common .cdef "az", $41, "AZ", $c1;the easy ranges .tdef "▌▄▔▁▏▒▕🮏🮙🮇├▗└┐▂┌┴┬┤▎▍🮈🮂🮃▃✓▖▝┘▘▚🭹", $a1 .tdef "🮘🮕─│", [$df, $ff, $c0, $dd];random one to ones .endencode .endnamespace .encode petscii.lower >1000 93 d4 45 58 54 20 49 4e .text "{clr}Text in PETSCII{cr}" >1008 20 d0 c5 d4 d3 c3 c9 c9 0d .endencode
Structures and unions can be defined to create complex data types. The offset of fields are available by using the definition's name. The fields themselves by using the instance name.
The initialization method is very similar to macro parameters, the
difference is that unset parameters always return uninitialized data (
)
instead of an error.
?
Structures are for organizing sequential data, so the length of a structure is the sum of lengths of all items.
Structure definition, with named parameters and default values
.struct ;anonymous structure x .byte 0 ;labels are visible y .byte 0 ;content compiled here .endstruct ;useful inside unions nn_s .struct col, row;named structure x .byte \col ;labels are not visible y .byte \row ;no content is compiled here .endstruct ;it's just a definition nn .dstruct nn_s, 1, 2;structure instance (within label) lda nn.x ;direct field access ldy #nn_s.x ;get offset of field lda nn,y ;and use it indirectly nnarray .brept 4 ;4 element "array" here .dstruct nn_s ;fields directly here (without a label) .endrept lda nnarray[0].y;access of "array" field coords2 .bfor x2, y2 in (1,3),(4,2),(7,5) .dstruct nn_s, x2, y2 .next ;initialized "array" from list
Unions can be used for overlapping data as the compile offset and program counter remains the same on each line. Therefore the length of a union is the length of it's longest item.
Union definition, with named parameters and default values
.union ;anonymous union x .byte 0 ;labels are visible y .word 0 ;content compiled here .endunion nn_u .union ;named union x .byte ? ;labels are not visible y .word \1 ;no content is compiled here .endunion ;it's just a definition nn .dunion nn_u, 1 ;union instance here lda nn.x ;direct field access ldy #nn_u.x ;get offset of field lda nn,y ;and use it indirectly
The example below shows how to define structure to a binary include.
.union .binary "pic.drp", 2 .struct color .fill 1024 screen .fill 1024 bitmap .fill 8000 backg .byte ? .endstruct .endunion
Anonymous structures and unions in combination with sections are useful for
overlapping memory assignment. The example below shares zero page allocations
for two separate parts of a bigger program. The common subroutine variables
are assigned after in the zp
section.
* = $02
.union ;spare some memory
.struct
.dsection zp1 ;declare zp1 section
.endstruct
.struct
.dsection zp2 ;declare zp2 section
.endstruct
.endunion
.dsection zp ;declare zp section
Macros can be used to reduce typing of frequently used source lines. Each invocation is a copy of the macro's content with parameter references replaced by the parameter texts.
Copies the code segment as it is, so symbols can be used from outside, but this also means multiple use will result in double defines unless anonymous labels are used.
The code is enclosed in it's own block so symbols inside are non-accessible, unless a label is prefixed at the place of use, then local labels can be accessed through that label.
#
or .
with the parameters. Normally the name of
the macro is used, but it can be any expression..macro
and .segment
for compatibility.;A simple macro copy .macro ldx #size(\1) lp lda \1,x sta \2,x dex bpl lp .endmacro #copy label, $500 ;Use macro as an assembler directive lohi .macro lo .byte <(\@) hi .byte >(\@) .endmacro var .lohi 1234, 5678 lda var.lo,y ldx var.hi,y
The first 9 parameters can be referenced by
–\1
.
The entire parameter list including separators is \9
.
\@
name .macro lda #\1 ;first parameter 23+1 .endmacro #name 23+1 ;call macro
Parameters can be named, and it's possible to set a default value after an equal sign which is used as a replacement when the parameter is missing.
These named parameters can be referenced by \name
or
\{name}
. Names must match completely, if unsure use the quoted
name reference syntax.
name .macro first, b=2, , last lda #\first ;first parameter lda #\b ;second parameter lda #\3 ;third parameter lda #\last ;fourth parameter .endmacro #name 1, , 3, 4 ;call macro
In the original turbo assembler normal references are passed by value and
can only appear in place of one. Text references on the other hand can appear
everywhere and will work in place of e.g. quoted text or opcodes and labels. The
first 9 parameters can be referenced as text by
@1
–@9
.
name .macro jsr print .null "Hello @1!";first parameter .endm #name "wth?" ;call macro
Beyond the built-in functions mentioned earlier it's possible to define custom ones for frequently used calculations.
Function parameters are assigned to comma separated variable names on invocation. These variables are visible in the function scope.
Parameter values may be converted using a function whose name can be given after a colon following the variable name.
Default values may be supplied for each parameter after an equal sign. These values are calculated at function definition time only and are used when a parameter was not specified.
Extra parameters are not accepted, unless the last parameter symbol is preceded with a star, in this case these parameters are collected into a tuple.
Only those external variables and functions are available which were accessible at the place of definition, but not those at the place of invocation.
vicmem .sfunction _font, _scr=0, ((_font >> 10) & $0f) | ((_scr >> 6) & $f0) lda #vicmem($2000, $0400); calculate constant sta $d018
If a multi line macro is used in an expression only the returned result is used. If multiple values are returned these will form a tuple.
If a multi line function is used as macro, directive or pseudo instruction and there's a label in front then the returned value is assigned to it. If nothing is returned then it's used as regular label.
mva .function value, target
lda value
sta target
.endfunction
mva #1, label
To prevent parts of source from compiling conditional constructs can be used. This is useful when multiple slightly different versions needs to be compiled from the same source.
Anonymous labels are still recognized in the non-compiling parts even if they won't get defined. This ensures consistent relative referencing across conditionally compiled areas with such labels.
The .ifne
, .ifeq
, .ifpl
and
.ifmi
directives exists for compatibility only, in practice it's
better to use comparison operators instead.
.if wait==2 ;2 cycles nop .elsif wait==3 ;3 cycles bit $ea .elsif wait==4 ;4 cycles bit $eaea .else ;else 5 cycles inc $2 .endif
.elsif
because it's a popular typo and it's difficult to notice.Similar to the .if
, .elsif
, .else
, .endif
construct, but the compared value needs to be written only once in the switch
statement.
.switch wait .case 2 ;2 cycles nop .case 3 ;3 cycles bit $ea .case 4 ;4 cycles bit $eaea .default ;else 5 cycles inc $2 .endswitch
.switch
conditional compilation block..comment lda #1 ;this won't be compiled sta $d020 .endcomment
.comment
block.The following directives are used to repeat code or data.
The regular non-scoped variants cover most cases except when normal labels are required as those will be double defined.
Scoped variants (those starting with the letter b) create a new scope for each iteration. This allows normal labels without collision but it's a bit more resource intensive.
If the scoped variant is prefixed with a label then the list of individual scopes for each iteration will be assigned to it. This allows accessing labels within.
First a variable is set, usually this is used for counting. This is optional, the variable may be set already before the loop.
Then the condition is checked and the enclosed lines are compiled if it's true. If there's no condition then it's an infinite loop and .break
must be used to terminate it.
After an iteration the second assignment is calculated, usually it's updating the loop counter variable. This is optional as well.
ldx #0 lda #32 lp .for ue := $400, ue < $800, ue += $100 sta ue,x ;do $400, $500, $600 and $700 .endfor dex bne lp
The expression is usually the range
function or some sort of list.
.for col in 0, 11, 12, 15, 1 lda #col ;0, 11, 12, 15 and 1 sta $d020 .endfor
.for
or .bfor
loop block.rept 100 - inx bne - .endrept lst .brept 100 ;each iteration into a tuple label jmp label ;not a duplicate definition .endrept jmp lst[5].label ;use label of 6th iteration
.rept
or .brept
blockWorks as expected however the scoped variant might be tricky to use as
variables of the condition are usually part of the parent scope. So
modifying them in the loop body should be done with compound assignments
or the reassign operator (::=
).
.while
or .bwhile
loop block.for
,
.bfor
, .rept
,
.brept
, .while
and .bwhile
loop for compatibility..goto
i .var 100 loop .lbl nop i .var i - 1 .ifne i .goto loop ;generates 100 nops .endif ;the hard way ;)
Longer sources are usually separated into multiple files for easier handling. Precomputed binary data can also be included directly without converting it into source code first.
Search path is relative to the location of current source file. If it's not found there the include search path is consulted for further possible locations.
To make your sources portable please always use forward slashes
(/
) as a directory separator and use lower/uppercase consistently
in file names!
.include "macros.asm" ;include macros menu .binclude "menu.asm" ;include in a block jmp menu.start
By using offset and length it's possible to break out chunks of data from a file separately, like bitmap and colors for example. Negative offsets are calculated from the end of file.
.binary "stuffz.bin" ;simple include, all bytes .binary "stuffz.bin", 2 ;skip start address .binary "stuffz.bin", 2, 1000;skip start address, 1000 bytes max
Scopes may contain symbols or further nested scopes. The same symbol name can be reused as long as it's in a different scope.
A symbol is looked up in the local scope first. If it's a non-local symbol
then parent scopes and the global scope may be searched in addition. This means
that a symbol in a parent or global scope may be shadowed
.
Symbols of a named scope can be looked up using the
operator.
The searched symbol stands on the right and it's looked up in the scope on the
left. More than one symbol may be looked up at the same time and the result
will be a list or tuple.
.
lda #0
sta vic.sprite.enable
; same as .byte colors.red, colors.green, colors.blue
ctable .byte colors.(red, green, blue)
If the symbol in front is not referenced anywhere then the enclosed source won't be compiled.
Symbols inside are enclosed in a scope and are accessible through the symbol of the procedure using the dot notation. This forces compilation of the whole procedure of course.
ize .proc nop cucc nop .endproc jsr ize jmp ize.cucc
The compilation only if used
behaviour of .proc
eases the building of libraries
from a
collection of subroutines and tables where not everything is needed all the time.
Alternative dead-code reduction techniques I encountered:
includethat part. Much better than conditional compilation but these macro calls still need to be manually managed.
All symbols inside a block are enclosed in a scope. If the block had a symbol then local symbols are accessible through that using the dot notation.
.block inc count + 1 count ldx #0 .endblock
This directive either creates a new scope (if used without a parameter) or activates the one in the parameter.
The scope can be assigned to a symbol in front of the directive so that it can be reactivated later. This enables label definitions into the same scope in different files.
colors .namespace red = 2 blue = 6 .endnamespace lda #colors.red
Any symbols defined inside can be overridden by stronger
symbols in
the same scope from outside. Can be nested as necessary.
This gives the possibility of giving default values for symbols which might
not always exist without resorting to .ifdef
/.ifndef
or
similar directives in other assemblers.
symbol = 1 ;stronger symbol than the one below .weak symbol = 0 ;default value if the one above does not exists .endweak .if symbol ;almost like an .ifdef ;)
Other use of weak symbols might be in included libraries to change default values or replace stub functions and data structures.
If these stubs are defined using
.proc
/.endproc
then their default implementations will
not even exists in the output at all when a stronger symbol overrides them.
Multiple definition of a symbol with the same strength
in the same
scope is of course not allowed and it results in double definition error.
Please note that .ifdef
/.ifndef
directives are
left out from 64tass for of technical reasons, so don't wait for them to appear
anytime soon.
This directive is similar to .namespace
but it gives access to
another scope's variables without leaving the current scope. May be
useful to allow a short hand access in some situations.
It's advised to use the
command line
option to warn about any unexpected symbol ambiguity.-Wshadow
Sections can be used to collect data or code into separate memory areas without moving source code lines around. This is achieved by having separate compile offset and program counters for each defined section.
Defines a section fragment. The name at .endsection
must match but
it's optional.
All .section
fragments are compiled to the memory area
allocated by the .dsection
directive. Compilation happens as the
code appears, this directive only assigns enough space to hold all the content
in the section fragments.
The space used by section fragments is calculated from the difference of
starting compile offset and the maximum compile offset reached. It is possible
to manipulate the compile offset in fragments, but putting code before the
start of .dsection
is not allowed.
* = $02 .dsection zp ;declare zero page section .cerror * > $30, "Too many zero page variables" * = $334 .dsection bss ;declare uninitialized variable section .cerror * > $400, "Too many variables" * = $0801 .dsection code ;declare code section .cerror * > $1000, "Program too long!" * = $1000 .dsection data ;declare data section .cerror * > $2000, "Data too long!" ;-------------------- .section code .word ss, 2005 .null $9e, format("%4d", start) ss .word 0 start sei .section zp ;declare some new zero page variables p2 .addr ? ;a pointer .endsection zp .section bss ;new variables buffer .fill 10 ;temporary area .endsection bss lda (p2),y lda #<label ldy #>label jsr print .section data ;some data label .null "message" .endsection data jmp error .section zp ;declare some more zero page variables p3 .addr ? ;a pointer .endsection zp .endsection code
The compiled code will look like:
>0801 0b 08 d5 07 .word ss, 2005 >0805 9e 32 30 36 31 00 .null $9e, format("%4d", start) >080b 00 00 ss .word 0 .080d 78 start sei >0002 p2 .addr ? ;a pointer >0334 buffer .fill 10 ;temporary area .080e b1 02 lda (p2),y .0810 a9 00 lda #<label .0812 a0 10 ldy #>label .0814 20 1e ab jsr print >1000 6d 65 73 73 61 67 65 00 label .null "message" .0817 4c e2 fc jmp error >0004 p2 .addr ? ;a pointer
Sections can form a hierarchy by nesting a .dsection
into
another section. The section names must only be unique within a section but can
be reused otherwise. Parent section names are visible for children, siblings can be
reached through parents.
In the following example the included sources don't have to know which
code
and data
sections they use, while the bss
section is
shared for all banks.
;First 8K bank at the beginning, PC at $8000 * = $0000 .logical $8000 .dsection bank1 .cerror * > $a000, "Bank1 too long" .endlogical bank1 .block ;Make all symbols local .section bank1 .dsection code ;Code and data sections in bank1 .dsection data .section code ;Pre-open code section .include "code.asm"; see below .include "iter.asm" .endsection code .endsection bank1 .endblock ;Second 8K bank at $2000, PC at $8000 * = $2000 .logical $8000 .dsection bank2 .cerror * > $a000, "Bank2 too long" .endlogical bank2 .block ;Make all symbols local .section bank2 .dsection code ;Code and data sections in bank2 .dsection data .section code ;Pre-open code section .include "scr.asm" .endsection code .endsection bank2 .endblock ;Common data, avoid initialized variables here! * = $c000 .dsection bss .cerror * > $d000, "Too much common data" ;------------- The following is in "code.asm" code sei .section bss ;Common data section buffer .fill 10 .endsection bss .section data ;Data section (in bank1) routine .addr print .endsection bss
.al
lda #$4322
.xl
ldx #$1000
SEP
/REP
instructions.
.autsiz rep #$10 ;implicit .xl ldx #$1000
When data bank is switched off only data bank indexed (,b) addresses create data bank accessing instructions.
.databank $10 ;data bank at $10xxxx lda $101234 ;results in $ad, $34, $12 .databank ? ;no data bank lda $1234 ;direct page or long addressing lda #$1234,b ;results in $ad, $34, $12
When direct page is switched off only the direct page indexed (,d) addresses create direct page accessing instructions.
.dpage $400 ;direct page $400-$4ff lda $456 ;results in $a5, $56 .dpage ? ;no direct page lda $56 ;data bank or long addressing lda #$56,d ;results in $a5, $56
.option allow_branch_across_page = 0 bcc + ;same execution time inx ;needed in both cases + bcs + dex + .option allow_branch_across_page = 1
.error "Unfinished here..." .cerror * > $1200, "Program too long by ", * - $1200, " bytes"
.warn "FIXME: handle negative values too!" .cwarn * > $1200, "This may not work!"
.cpu "6502" ;standard 65xx .cpu "65c02" ;CMOS 65C02 .cpu "65ce02" ;CSG 65CE02 .cpu "6502i" ;NMOS 65xx .cpu "65816" ;W65C816 .cpu "65dtv02" ;65dtv02 .cpu "65el02" ;65el02 .cpu "r65c02" ;R65C02 .cpu "w65c02" ;W65C02 .cpu "4510" ;CSG 4510 .cpu "default" ;cpu set on command line
encryption.
counter .var 0 ;define, same as := counter .var counter + 1 ;redefine, same as += 1
This directive looks up the symbol name to be defined in the other scope for its value. Useful for shorthand definitions without repeating the name if it's unchanged.
red .from vic.colors ;same as red = vic.colors.red init .from + ;expose these symbols publicly play .from + + .block ;other symbols hidden in block init sei ...
.proff ;Don't put filler bytes into listing * = $8000 .fill $2000, $ff ;Pre-fill ROM area .pron * = $8000 .addr reset, restore .text "CBM80" reset cld
For better code readability BCC
has an alias named BLT
(Branch Less Than) and BCS
one named BGE
(Branch Greater Equal).
cmp #3
blt exit ; less than 3?
For similar reasons ASL
has an alias named SHL
(SHift Left) and LSR
one named SHR
(SHift Right). This naming however is not very common.
The implied variants LSR
, ROR
, ASL
and ROL
are a
shorthand for LSR A
, ROR A
, ASL A
and
ROL A
. Using the implied form is considered poor coding style.
For compatibility INA
and DEA
is a shorthand of
INC A
and DEC A
. Therefore there's no implied
variants like INC
or DEC
. The full form with the
accumulator is preferred.
The longer forms of INC X
, DEC X
, INC
Y
, DEC Y
, INC Z
and DEC Z
are
available for INX
, DEX
, INY
,
DEY
, INZ
and DEZ
. For this to work care
must be taken to not reuse the x
, y
and
z
single letter register symbols for other purposes. Same goes for
a
of course.
Load instructions with registers are translated to transfer instructions.
For example LDA X
becomes TXA
.
Store instructions with registers are translated to transfer instructions,
but only if it involves the s
or b
registers. For example
STX S
becomes TXS
.
Many illegal opcodes have aliases for compatibility as there's no standard naming convention.
For writing short code there are some special pseudo instructions for always
taken branches. These are automatically compiled as relative branches when the
jump distance is short enough and as JMP
or BRL
when
longer.
The names are derived from conditional branches and are:
GEQ
, GNE
, GCC
, GCS
,
GPL
, GMI
, GVC
, GVS
,
GLT
and GGE
.
.0000 a9 03 lda #$03 in1 lda #3 .0002 d0 02 bne $0006 gne at ;branch always .0004 a9 02 lda #$02 in2 lda #2 .0006 4c 00 10 jmp $1000 at gne $1000 ;branch farther
If the branch would skip only one byte then the opposite condition is compiled and only the first byte is emitted. This is now a never executed jump, and the relative distance byte after the opcode is the jumped over byte. If the CPU has long conditional branches (65CE02/4510) then the same method is applied to two byte skips as well.
There's a pseudo opcode called GRA
for CPUs supporting
BRA
, which is expanded to BRL
(if available) or
JMP
. A one byte skip will be shortened to a single byte if the
CPU has a NOP
immediate instruction (R65C02/W65C02).
If the branch would not skip anything at all then no code is generated.
.0009 geq in3 ;zero length "branch" .0009 18 clc in3 clc .000a b0 bcs gcc at2 ;one byte skip, as bcs .000b 38 sec in4 sec ;sec is skipped! .000c 20 0f 00 jsr $000f at2 jsr func .000f func
Please note that expressions like Gxx *+2
or Gxx
*+3
are not allowed as the compiler can't figure out if it has to create
no code at all, the 1 byte variant or the 2 byte one. Therefore use normal or
anonymous labels defined after the jump instruction when jumping forward!
To avoid branch too long errors the assembler also supports long branches.
It can automatically convert conditional relative branches to it's opposite and
a JMP
or BRL
. This can be enabled on the command
line using the
option.
--long-branch
.0000 ea nop nop .0001 b0 03 bcs $0006 bcc $1000 ;long branch (6502) .0003 4c 00 10 jmp $1000 .0006 1f 17 03 bbr 1,$17,$000c bbs 1,23,$1000 ;long branch (R65C02) .0009 4c 00 10 jmp $1000 .000c d0 04 bne $0012 beq $10000 ;long branch (65816) .000e 5c 00 00 01 jmp $010000 .0012 30 03 bmi $0017 bpl $1000 ;long branch (65816) .0014 82 e9 lf brl $1000 .0017 ea nop nop
Please note that forward jump expressions like Bxx *+130
,
Bxx *+131
and Bxx *+132
are not allowed as the
compiler can't decide between a short/long branch. Of course these destinations
can be used, but only with normal or anonymous labels defined after the jump
instruction.
In the above example extra JMP
instructions are emitted for
each long branch. This is suboptimal and wasting space if there are several
long branches to the same location in close proximity. Therefore the assembler
might decide to reuse a JMP
for more than one long branch to save
space.
Currently there are two options, either use TMPview
by Style to convert the
source file directly, or do the following:
SYS 9*4096
or SYS 8*4096
depending on version)The resulting file should then (with the restrictions below) assemble using the following command line:
64tass -C -T -a -W -i source.asm -o outfile.prg
64tass is nearly 100% compatible with the original Turbo Assembler
, and supports
most of the features of the original Turbo Assembler Macro
.
The remaining notable differences are listed here.
The original turbo assembler uses case sensitive labels, use the
command line option to enable this behaviour.
--case-sensitive
There are a few differences which can be worked around by the
command line option.
These are:
--tasm-compatible
The original expression parser has no operator precedence, but 64tass has. That
means that you will have to fix expressions using braces accordingly, for example
1+2*3
becomes (1+2)*3
.
The following operators used by the original Turbo Assembler are different:
. | bitwise or, now |
|
: | bitwise eor, now ^
|
! | force 16 bit address, now @w
|
The default expression evaluation is not limited to 16 bit unsigned numbers anymore.
Macro parameters are referenced by
–\1
instead of using the pound sign.
\9
Parameters are always copied as text into the macro and not passed by value as the original turbo assembler does, which sometimes may lead to unexpected behaviour. You may need to make use of braces around arguments and/or references to fix this.
Some versions of the original turbo assembler had bugs that are not reproduced by 64tass, you will have to fix the code instead.
In some versions labels used in the first .block
are globally available. If
you get a related error move the respective label out of the .block
.
Short command line options consist of
and a letter, long options
start with -
.
--
If
is encountered then further options are not recognized and are
assumed to be file names.
--
Options requiring file names are marked with <filename>
. A
single
as name means standard input or output. File name quoting is
system specific.
-
a.out
. This option changes it.
64tass a.asm -o a.prg
This option may be used multiple times to output different sections in different formats of a single compilation.
For multiple outputs the format options and output section selection must be placed before this option. The format selection will be unchanged if no new selection was made but the output section selection and the map file must be repeated for each output. The maximum image size will be the smallest of all selected formats. Using the same name multiple times is not a good idea.
--output
option but appends instead of overwrites.
Normally output files are overwritten but in some cases it's useful to append them instead.
Useful for test compiles.
Normally the memory map is displayed on the standard output together with
other messages. It's possible to write it to a file or to the standard output
by using
as the file name.-
--map
option but appends instead of overwrites.By default all sections go into the output file. Using this option limits the output to specific section and it's children. This is useful to split a larger program into several files.
64tass a.asm --output-section main -o main.prg \ --output-section loader -o loader.prg
While it's possible to enter the address as a number it's recommended to use a label instead.
64tass --long-address --m65816 a.asm
Overlapping blocks are flattened and uninitialized memory is filled up with zeros. Uninitialized memory before the first and after the last valid bytes are not saved. Up to 64 KiB or 16 MiB with long address.
Used for C64 binaries. The first 2 bytes are the little endian address of the first valid byte (load address). This is followed by the data.
64tass --cbm-prg a.asm
* = $2000
start rts
00 20 | load to $2000 |
60 | data |
Overlapping blocks are flattened and uninitialized memory is filled up with zeros. Uninitialized memory before the first and after the last valid bytes are not saved. Up to 64 KiB or 16 MiB with long address.
Useful for small ROM files.
64tass --nostart a.asm
* = $2000
rts
60 | data |
Overlapping blocks are flattened and uninitialized memory is filled up with zeros. Uninitialized memory after the last valid byte is not saved. Up to 4 GiB.
Useful for creating huge multi bank ROM files. See sections for an example.
Overlapping blocks are flattened. Blocks are saved in sorted order and uninitialized memory is skipped. Up to 64 KiB or 16 MiB with long address.
Used for linkers and downloading. Before writing each memory block the length and the memory address is saved in a little endian order. Once everything was saved a zero length block is written without an address or data. These zeros serve as an end marker.
64tass --nonlinear a.asm * = $1000 ;1st segment lda #2 * = $2000 ;2nd segment rts
02 00 | load 2 bytes |
00 10 | to $1000
|
a9 02 | data |
01 00 | load 1 byte |
00 20 | to $2000
|
60 | data |
00 00 | load 0 bytes, end marker |
Overlapping blocks are kept, continuing blocks are concatenated. Saving happens in the definition order without sorting, and uninitialized memory is skipped in the output. Up to 64 KiB.
Used for Atari executables. First 2 bytes of signature is written. Then before saving each memory block the words of load address and last byte address is written in little endian format.
If the --output-exec
command line
parameter was given then a 6 byte run block is added to the end of the
output.
64tass --output-exec=start --atari-xex a.asm
* = $2000
start rts
ff ff | header |
00 20 | load to $2000
|
00 20 | until $2000
|
60 | data |
e0 02 e1 02 | run marker |
00 20 | run address ($2000 )
|
Overlapping blocks are flattened and uninitialized memory is filled up with zeros. Uninitialized memory before the first and after the last valid bytes are not saved. Up to 64 KiB.
Used for Apple II executables. First the load address and the data length words are written in little endian format. This is followed by the data.
64tass --apple-ii a.asm
* = $0c00
rts
00 0c | load to $0c00
|
01 00 | length is 1 byte |
60 | data |
Overlapping blocks are flattened and uninitialized memory is filled up with zeros. Uninitialized memory before the first and after the last valid bytes are not saved. Up to 16 MiB.
Used for single segment C256 Foenix executables. After the PGX signature a four byte little endian load address is written. This is followed by the data.
64tass --c256-pgx a.asm
* = $1000
rts
50 47 58 01 | PGX signature |
00 10 00 00 | load to $1000
|
60 | data |
Overlapping blocks are flattened. Blocks are saved in sorted order and uninitialized memory is skipped. Up to 16 MiB.
Used for multi segment C256 Foenix binaries. It starts with a single byte signature. Then before each memory block a three byte load address and length is written in little endian format.
If the --output-exec
command line
parameter was given then a 6 byte execution block is added to the end of the output.
64tass --output-exec=start --c256-pgz a.asm
* = $1000
start rts
5a | PGZ signature byte |
00 10 00 | load to $1000
|
01 00 00 | length is 1 bytes |
60 | data |
00 10 00 | execute at $1000
|
00 00 00 | execution marker |
Overlapping blocks are kept, data is stored in the definition order, and uninitialized areas are skipped. I8HEX up to 64 KiB, I32HEX up to 4 GiB.
Used for EPROM programming or downloading. Data bytes are written using
00
records. If the file is larger than 64 KiB then 04
records are used as needed. The output ends with a 01
record.
If the --output-exec
command line
parameter was given then a 05
record is added with the execution
address right before the end 01
record.
64tass --intel-hex a.asm
* = $0c00
rts
Example Intel HEX output:
:010C00006093 :00000001FF
Overlapping blocks are kept, data is stored in the definition order, and uninitialized areas are skipped. Up to 64 KiB.
64tass --mos-hex a.asm
* = $0c00
rts
Example MOS Technology output:
;010C0060006D ;0000010001
Overlapping blocks are kept, data is stored in the definition order, and uninitialized memory areas are skipped. S19 up to 64 KiB, S28 up to 16 MiB and S37 up to 4 GiB.
Used for EPROM programming or downloading. First a S0
header record is
written which is followed by S1
, S2
, or
S3
data records. Then an S5
or S6
count
record comes and a S9
, S8
or S7
termination record.
If the --output-exec
command line
parameter was given then the termination record will use this address.
Without this the address of the first data record is used.
64tass --s-record a.asm
* = $0c00
rts
Example Motorola S-record output:
S00600004844521B S1040C00608F S5030001FB S9030C00F0
Normally no conversion takes place, this is for backwards compatibility with a DOS based Turbo Assembler editor, which could create PETSCII files for 6502tass. (including control characters of course)
Using this option will change the default none
and screen
encodings to
map 'a'–'z'
and 'A'–'Z'
into the correct PETSCII range of $41–$5A
and $C1–$DA
,
which is more suitable for an ASCII editor. It also adds predefined petcat style
PETSCII literals to the default encodings, and enables Unicode letters in symbol names.
For writing sources in UTF-8/UTF-16 encodings this option is required!
64tass a.asm .0000 a9 61 lda #$61 lda #"a" >0002 31 61 41 .text "1aA" >0005 7b 63 6c 65 61 72 7d 74 .text "{clear}text{return}more" >000e 65 78 74 7b 72 65 74 75 >0016 72 6e 7d 6d 6f 72 65 64tass --ascii a.asm .0000 a9 41 lda #$41 lda #"a" >0002 31 41 c1 .text "1aA" >0005 93 54 45 58 54 0d 4d 4f .text "{clear}text{return}more" >000e 52 45
BXX *+5 JMP xxx
. Branch too long messages are usually solved
by manually rewriting them as BXX *+5 JMP xxx
.
64tass can do this automatically if this option is used. BRA
is of course not converted.
64tass a.asm * = $1000 bcc $1233 ;error... 64tass a.asm * = $1000 bcs *+5 ;opposite condition jmp $1233 ;as simple workaround 64tass --long-branch a.asm * = $1000 bcc $1233 ;no error, automatically converted to the one above bcs @b $1233 ;keep this one short regardless and fail if too far
64tass a.asm label nop Label nop ;double defined... 64tass --case-sensitive a.asm label nop Label nop ;Ok, it's a different label...
Same syntax is allowed as in source files. Be careful with strings, the shell might eat the quotes unless escaped.
Using hexadecimal numbers might be tricky as the shell might try to expand them as variables. Either quoting or backslash escaping might help.
In Makefiles all $ signs need to be escaped by doubling them. This needs to
be done over the normal shell escaping. For example
becomes $1000
.\$$1000
64tass -D ii=2 -D var=\"string\" -D FAST:=true a.asm
lda #ii ;result: $a9, $02
FAST :?= false ;define if undefined
64tass --quiet a.asm
Switches the expression evaluator into compatibility mode. This
enables
, .
and :
operators and disables 64tass specific extensions, disables precedence handling
and forces 16 bit unsigned evaluation (see !
differences to original Turbo Assembler
below)
If an included source or binary file can't be found in the directory of the source file then this path is tried. More than one directories can be specified by repeating this option. If multiple matches exist the first one is used.
Writes a dependency rules suitable for make
from the list of files used during compilation.
Please choose source file names which are compatible with Makefiles as there are very little escaping possibilities.
--dependencies
option but appends instead of overwrites.This is useful for automatic dependency generation to avoid missing target errors on file rename.
The following Makefile uses the rules generated by 64tass (in .dep
)
to achieve automatic dependency tracking:
demo: demo.asm .dep 64tass --make-phony -M.dep $< -o $@ .dep: -include .dep
Diagnostic message switched start with a
and can have an optional -W
prefix to
disable them. The options below with this prefix are enabled by default, the others are disabled.no-
Normally compilation errors a written to the standard error output.
It's possible to redirect them to a file or to the standard output by
using
as the file name.-
--error
option but appends instead of overwrites.Disables warnings during compile. For fine grained diagnostic message suppression see the diagnostic options section.
64tass --no-warn a.asm
This is for cases where the fault log is automatically processed and no one ever looks at it and therefore there's no point to display the source lines.
64tass --no-caret-diag a.asm
This is for cases where the fault log is processed by an editor which also displays the compilation output somewhere. Only lines which are the result of macro processing will be output to aid debugging. Those which would just duplicate what's in the source editor window will be not.
64tass --macro-caret-diag a.asm
no-
prefix disable all except those enabled.For example
makes this check an error. The -Werror=implied-reg
variant is useful with -Wno-error=
to set some to warnings.-Werror
There are several opcodes for the same task, especially for the "6502i" target. This warning helps to find where their use.
Can be used to see where space is wasted for alignment.
Sometimes alternative addressing modes are used as the fitting one is not available. For example there's no lda direct page y so instead data bank y is used with a warning.
Page crossing branches execute with a penalty cycle. This option helps to locate them easily.
This option can be used to enforce letter case matching of symbols in case insensitive mode. This gives similar results to the case sensitive mode (symbols must match exactly) with the main difference of disallowing symbol name definitions differing only in case (these are reported as duplicates).
It may be hard to notice if a
was missed. The code still compiles but
there's a huge difference between #
and
cpx #const
. Unless the right sort of garbage was on zero
page at the time of testing...
cpx const
This check might have a lot of false positives if zero page locations are accessed by using small numbers, which is a popular coding style. But there are ways to reduce them.
For "known" fixed locations address(x)
can be used, preferably
bound to a symbol. Automatic allocation of zero page variables works too
(e.g. zpstuff .byte ?
). And basically everything which is a
traditional "label" or derived from a label with an offset.
Some instructions have implied aliases like
for asl
for compatibility
reasons, but this shorthand is not the preferred form.asl a
A leading zero could be a prefix for an octal number but as octals are not supported the result will be decimal.
This option gives a warning for instructions which were modified by the long branch function. Less intrusive than disabling long branches and see where it fails.
Such macro calls can easily be mistaken to be labels if invoked without parameters. Also it's hard to notice that an unchanged call turned into label after the definition got renamed. This warning helps to find such calls so that prefixes can be added.
Unfortunately there were some features added previously which shouldn't have been included. This option disables warnings about their uses.
Floating point numbers have a finite precision and comparing them might give unexpected results.
For example 2.1 + 0.2 == 2.3
is true but gives a warning as the left side
is actually bigger by approximately 4.44E−16.
Normally this is solved by rounding or changing the comparison values.
A lot of parameters and the data dumping directives need integers but floating point numbers are accepted as well. The style of rounding used may or may not be what you wanted.
By default floor rounding (to lower) is used and not truncate (towards zero). The reason for this is to enable calculation of fixed point integers by using floating point.
The difference is subtle and only noticable for negative numbers. The division
of -300/256
is -2
which matches floor(-300/256.0)
but not
trunc(-300/256.0)
.
To get symmetric sine waves around zero trunc()
needs to be used. Some other
calculation might result in 126.9999997
due to inaccuracies in logarithm which
would need round()
.
To avoid unexpected rounding this option helps to find those places where no explicit rounding was done.
jmp ($xxff)
bug.
With this option it's fine that the high byte is read from the wrong
address on a 6502, NMOS 6502 and 65DTV02.
jmp (vector) .alignpageind vector, 256; jmp bug workaround vector .addr ? ; by avoiding page cross
You may disable this if you use labels which look like mistyped versions of implied addressing mode instructions and you don't want to put them in the first column.
This check is there to catch typos, unsupported implied instructions, or unknown aliases and not for enforcing label placement.
Can be used to ignore .page
assertion
block failures. As a middle ground
can turn
the assertion to a warning only.-Wno-error=page
There are some common mistakes, but experts and those who read this don't need extra notes about them. These are:
.byte
instead of .text
.
.byte
enforces the 0–255 range for each value.label *=*+1
style space reservations.
*=
is also the compound multiply operator. The *=*+1
needs to be on a separate line without a label. A better alternatively is to use .fill 1
or .byte ?
..byte
or .word
.char
, .sint
.lda #xxx
lda #+xx
will make it workThese cross platform development annoyances are checked for:
Not all of the unary operators are strongly binding and this may cause surprises. This warning is intended to catch mistakes like this:
.cerror >start != >end; possibly wrong it's >(start != (>end)) .cerror (>start) != >end; correct high byte check
size()
and len()
can be used to measure a memory area.
Normally there's no offset used but a positive offset may be used to reduce
available space up until nothing remains.
On the other hand if a negative offset is used then more space will be available (ahead of the area) which may or may not be desired.
var .byte ?, ?, ? var2 = var - 2 ; start 2 bytes earlier ldx #size(var2) ; size is 5 bytes as it's 2 bytes ahead
Normally
means compound multiply of the variable in front.
Unfortunately this looks the same a symbol *= ...
which is an old-school way to allocate space.
label *=*+x
If the symbol was a variable defined earlier then the multiply is performed without a warning. If it's a new label definition then this warning is used to note that possibly a variable definition was missed earlier.
If the intention was really a label definition then the
can be moved to a separate line, or in case of space allocation it could be
improved to use *=
or .byte ?
..fill x
Applying offsets to memory locations may result in addresses which end up outside of the processors address space.
For example "tmp"
is at $1000
and then it's
addressed as lda tmp-$2000
then the result will be lda $f000
or lda $fff000
depending on the CPU. If this is fine then this
warning can be disabled otherwise it can be made into an error by using
-Werror=wrap-addr
.
Adding an offset to a bank 0 address may end up outside of bank 0. If this happens a warning is issued and the address wraps around.
The warning may be ignored using this command line parameter. Alternatively
it could be turned into an error by using -Werror=wrap-bank0
.
Adding an offset to a direct page address may end up outside of the direct page. For a 65816 or 65EL02 an alternative addressing mode is used but on other processors if this happens a warning is issued and the address wraps around.
The warning may be ignored using this command line parameter. Alternatively
it could be turned into an error by using -Werror=wrap-dpage
.
While assembling the compile offset may reach the end of memory image. If this happens a warning is issued and the compile offset is set to the start of image.
The warning may be ignored using this command line parameter. Alternatively
it could be turned into an error by using -Werror=wrap-mem
.
The image size depends on the output format. See the Output options section above.
While assembling the program counter may reach the end of the current program bank. If this happens a warning is issued as a real CPU will not cross the bank on execution. On the other hand some addressing modes handle bank crosses so this might not be actually a problem for data.
The warning may be ignored using this command line parameter. Alternatively
it could be turned into an error by using -Werror=wrap-pc
.
Adding an offset to a program bank address may end up outside of the current program bank. If this happens a warning is issued and the address wraps around.
The warning may be ignored using this command line parameter. Alternatively
it could be turned into an error by using -Werror=wrap-pbank
.
The single
operator is only there for compatibility reasons and should be written as =
normally.==
Warns on things that could be optimized, at least according to the limited analysis done. Currently it's easy to fool with these constructs:
.byte $2c
and similar tricks to skip instructions.*+5
and similar tricks to skip instructions, or to loop like *-1
.It's also rather simple and conservative, so some opportunities will be missed. Most CPUs are supported with the notable exception of 65816 and 65EL02, but this could improve in later versions.
Checks if local variables shadow
other variables of same name in upper scopes in
ambiguous ways.
This is useful to detect hard to notice bugs where a new local variable takes the place of a global one by mistake.
bl .block a .byte 2 ;'a' is a built-in register x .byte 2 ;'x' is a built-in register asl a ; accumulator or the byte above? .end asl bl.x ; not ambiguous
Boolean values can be interpreted as numeric 0/1 and other types as booleans. This is convenient but may cause mistakes.
To pass this option the following constructs need improvements:
1
and 0
as boolean constants. Use the slightly longer true
and false
..if (lbl & 1) != 0
.!
. Write it out like lbl == 0
.||
, &&
and ^^
operators.1 + (lbl > 3)
. It's better as (lbl > 3) ? 2 : 1
.Symbols which have no references to them are likely redundant. Before removing them check if there's any conditionally compiled out code which might still need them.
The following options can be used to be more specific:
Symbols which appear in a default 64tass symbol list file and their root symbols are treated as used for exporting purposes.
These options will select the default architecture. It can be overridden by
using the
directive in the source..cpu
64tass --m65xx a.asm
lda $14 ;regular instructions
64tass --m65c02 a.asm
stz $d020 ;65c02 instruction
64tass --m65ce02 a.asm inz
64tass --m6502 a.asm
lax $14 ;illegal instruction
64tass --m65dtv02 a.asm
sac #$00
64tass --m65816 a.asm
lda $123456,x
--nostart
as well.
64tass --m65el02 a.asm
lda #0,r
64tass --mr65c02 a.asm rmb 7,$20
64tass --mw65c02 a.asm wai
64tass --m4510 a.asm map eom
64tass -l labels.txt a.asm
* = $1000
label jmp label
result (labels.txt):
label = $1000
This option may be used multiple times. In this case the format and root scope options must be placed before this option. Using the same name multiple times is not a good idea.
64tass --vice-labels -l all.l --export-labels --labels-root=export -l myexport.inc source.asm
This writes symbols for VICE into all.l
and symbols from scope
export
into myexport.inc
.
--labels
option but appends instead of overwrites.This option can be used to limit the output to only a subset of labels. The parameter is an expression which must resolve to a namespace. It's usually just the name of a label in the root scope which contains the labels to be listed.
This option can be used to limit the output to a section which code labels refer to.
If defined adds a prefix to labels for some formats.
List labels without any side effects. Usually for display purposes or for later include.
The difference to normal symbol listing is that 64tass assumes these
symbols will be used in another source. In practice this means that any
.proc
/.endproc
blocks appearing in the symbol file will
always be compiled even if unused otherwise.
This format may be used to translate memory locations to something readable in VICE monitor. Therefore simple numeric constants will not show up unless converted to an address first.
VICE symbols may only contain ASCII letters, numbers and underscore. Symbols not meeting this requirement will be omitted.
There's a good chance VICE will complain about already existing labels on import. In the past an attempt was made to filter out such duplicates to eliminate these warnings. However soon it was pointed out that omitted labels are now unavailable for commands like setting breakpoints. As the latter use case is rather more important than some bogus import warnings one has to live with them.
64tass --vice-labels -l labels.l a.asm
* = $1000
label jmp label
result (labels.l):
al 1000 .label
For now colons are used as scope delimiter due to a VICE limitation, but this will be changed to dots in the future.
The normal VICE label list does not include symbols like chrout =
$ffd2
or keybuff = 631
as these are numeric constants and
not memory addresses.
Of course there are ways around that. For example:
chrout = address($ffd2) keybuff = address(631) * = $ffd2 chrout .fill 3 * = 631 keybuff .fill 10
For those who don't want to waste time on explicitly marking addresses as such there's an easy way out by using this command line option.
The tradeoff is that depending on the coding style the label list will become polluted by non-address constants to various degrees. However if one mostly uses numeric constants for addresses only this may be acceptable.
The output will contain symbol locations and paths.
label = $x
fashion for interoperatibility.
Somewhat limited but much easier to parse than the normal output with all it's data types.
It's a text file in the following format:
<type>:<range>:<name>
--labels-add-prefix
command line optionIf the --labels-section
command
line option was given then the range is relative to the section start.
If more than one type of labels need to be listed then the --labels-append
command
line option can be used to append them.
64tass --mesen-labels @labeloptions.txt a.asm Option file (labeloptions.txt): --labels-section=rom --labels-add-prefix=P --labels labels.mlb --labels-section=ram --labels-add-prefix=R --labels-append labels.mlb --labels-section=save --labels-add-prefix=S --labels-append labels.mlb --labels-section=work --labels-add-prefix=W --labels-append labels.mlb --labels-section=registers --labels-add-prefix=G --labels-append labels.mlb
; 64tass Turbo Assembler Macro V1.5x listing file ; 64tass -L list.txt a.asm ; Fri Dec 9 19:08:55 2005 ;Offset ;Hex ;Monitor ;Source ;****** Processing input file: a.asm .1000 a2 00 ldx #$00 ldx #0 .1002 ca dex loop dex .1003 d0 fd bne $1002 bne loop .1005 60 rts rts ;****** End of listing
--list
option but appends instead of overwrites.; 64tass Turbo Assembler Macro V1.5x listing file ; 64tass --no-monitor -L list.txt a.asm ; Fri Dec 9 19:11:43 2005 ;Offset ;Hex ;Source ;****** Processing input file: a.asm .1000 a2 00 ldx #0 .1002 ca loop dex .1003 d0 fd bne loop .1005 60 rts ;****** End of listing
; 64tass Turbo Assembler Macro V1.5x listing file ; 64tass --no-source -L list.txt a.asm ; Fri Dec 9 19:13:25 2005 ;Offset ;Hex ;Monitor ;****** Processing input file: a.asm .1000 a2 00 ldx #$00 .1002 ca dex .1003 d0 fd bne $1002 .1005 60 rts ;****** End of listing
; 64tass Turbo Assembler Macro V1.5x listing file ; 64tass --line-numbers -L list.txt a.asm ; Fri Dec 9 19:13:25 2005 ;Line ;Offset ;Hex ;Monitor ;Source :1 ;****** Processing input file: a.asm 3 .1000 a2 00 ldx #$00 ldx #0 4 .1002 ca dex loop dex 5 .1003 d0 fd bne $1002 bne loop 6 .1005 60 rts rts ;****** End of listing
; 64tass Turbo Assembler Macro V1.5x listing file ; 64tass --verbose-list -L list.txt a.asm ; Fri Dec 9 19:13:25 2005 ;Offset ;Hex ;Monitor ;Source ;****** Processing input file: a.asm * = $1000 .1000 a2 00 ldx #$00 ldx #0 .1002 ca dex loop dex .1003 d0 fd bne $1002 bne loop .1005 60 rts rts ;****** End of listing
Command line arguments can be read from a file as well. This is useful to store common options for multiple files in one place or to overcome the argument list length limitations of some systems.
The filename needs to be prefixed with an at sign, so @argsfile
reads options from argsfile
. It will only work if there's not another file
named @argsfile
. The content is expanded in-place of @argsfile
.
Stored options must be separated by white space. Single or double quotes can be used in case file names have white space in their names.
Backslash can be used to escape the character following it and it must be used to escape itself. Single and double quotes need to be escaped if needed for string quoting.
Forward slashes can be used as a portable path separation on all systems.
Faults and warnings encountered are sent to the standard error for logging. To
redirect them to a file use the
command
line option. The message format is the following:
-E
<filename>:<line>:<character>: <severity>: <message>
The faulty line will be displayed after the message with a caret pointing to
the error location unless this is disabled by using
option.
--no-caret-diag
a.asm:3:21: error: not defined symbol 'label' lda label ^ a.asm:3:21: note: searched in the global scope
This is helpful for macro expansions as it displays the processed line which usually looks different to the one in the original source file.
Error buried deep in included files or macros display a backtrace of files after an In
file included from
text where all the files and positions involved are listed down to the main file.
In file included from main.asm:3:3: included.asm:2:11: error: not defined symbol 'test' #macro1 test ^ In file included from included.asm:2:3, main.asm:3:3: macros.asm:3:7: note: original location in an expanded macro was here lda test ^
Messages ending with
are user controllable. This means
that using [-Wxxx]
on the command line will silence them and
-Wno-xxx
will turn them it into a fault. See Diagnostic options for more details.
-Werror=xxx
bxx *+5 jmp
)@
and so will interpret the immediately following numbers and letters as part if the symbol. Please insert a space between @b
, @w
or @l
and the following label or number now to avoid surprises!Original 6502tass written for DOS by Marek Matula of Taboo.
It was ported to ANSI C by BigFoot/Breeze. This is when it's name changed to 64tass.
Soci/Singular reworked the code over the years to the point that practically nothing was left from original at this point.
Improved TASS compatibility, PETSCII codes by Groepaz.
Additional code: my_getopt command-line argument parser by Benjamin Sittler, avl tree code by Franck Bui-Huu, ternary tree code by Daniel Berlin, snprintf Alain Magloire, Amiga OS4 support files by Janne Peräaho.
Pierre Zero helped to uncover a lot of faults by fuzzing. Also there were a lot of discussions with oziphantom about the need of various features.
Main developer and maintainer: soci at c64.rulez.org
By default raw 8-bit encoding is used and nothing is translated or escaped. This mode is for compiling sources which are already PETSCII.
noneencoding for raw 8-bit
Does no translation at all, no translation table, no escape sequences.
screenencoding for raw 8-bit
The following translation table applies, no escape sequences.
Input | Byte | Input | Byte |
---|---|---|---|
00–1F | 80–9F | 20–3F | 20–3F |
40–5F | 00–1F | 60–7F | 40–5F |
80–9F | 80–9F | A0–BF | 60–7F |
C0–FE | 40–7E | FF | 5E |
Unicode encoding is used when the
option is given on the command line.
-a
noneencoding for Unicode
This is a Unicode to PETSCII mapping, including escape sequences for control codes.
Glyph | Unicode | Byte | Glyph | Unicode | Byte |
---|---|---|---|---|---|
–@ | U+0020–U+0040 | 20–40 | A–Z | U+0041–U+005A | C1–DA |
[ | U+005B | 5B | ] | U+005D | 5D |
a–z | U+0061–U+007A | 41–5A | £ | U+00A3 | 5C |
π | U+03C0 | FF | ← | U+2190 | 5F |
↑ | U+2191 | 5E | ─ | U+2500 | C0 |
│ | U+2502 | DD | ┌ | U+250C | B0 |
┐ | U+2510 | AE | └ | U+2514 | AD |
┘ | U+2518 | BD | ├ | U+251C | AB |
┤ | U+2524 | B3 | ┬ | U+252C | B2 |
┴ | U+2534 | B1 | ┼ | U+253C | DB |
╭ | U+256D | D5 | ╮ | U+256E | C9 |
╯ | U+256F | CB | ╰ | U+2570 | CA |
╱ | U+2571 | CE | ╲ | U+2572 | CD |
╳ | U+2573 | D6 | ▁ | U+2581 | A4 |
▂ | U+2582 | AF | ▃ | U+2583 | B9 |
▄ | U+2584 | A2 | ▌ | U+258C | A1 |
▍ | U+258D | B5 | ▎ | U+258E | B4 |
▏ | U+258F | A5 | ▒ | U+2592 | A6 |
▔ | U+2594 | A3 | ▕ | U+2595 | A7 |
▖ | U+2596 | BB | ▗ | U+2597 | AC |
▘ | U+2598 | BE | ▚ | U+259A | BF |
▝ | U+259D | BC | ○ | U+25CB | D7 |
● | U+25CF | D1 | ◤ | U+25E4 | A9 |
◥ | U+25E5 | DF | ♠ | U+2660 | C1 |
♣ | U+2663 | D8 | ♥ | U+2665 | D3 |
♦ | U+2666 | DA | ✓ | U+2713 | BA |
🭰 | U+1FB70 | D4 | 🭱 | U+1FB71 | C7 |
🭲 | U+1FB72 | C2 | 🭳 | U+1FB73 | DD |
🭴 | U+1FB74 | C8 | 🭵 | U+1FB75 | D9 |
🭶 | U+1FB76 | C5 | 🭷 | U+1FB77 | C4 |
🭸 | U+1FB78 | C3 | 🭹 | U+1FB79 | C0 |
🭺 | U+1FB7A | C6 | 🭻 | U+1FB7B | D2 |
🭼 | U+1FB7C | CC | 🭽 | U+1FB7D | CF |
🭾 | U+1FB7E | D0 | 🭿 | U+1FB7F | BA |
🮂 | U+1FB82 | B7 | 🮃 | U+1FB83 | B8 |
🮇 | U+1FB87 | AA | 🮈 | U+1FB88 | B6 |
🮌 | U+1FB8C | DC | 🮏 | U+1FB8F | A8 |
🮕 | U+1FB95 | FF | 🮘 | U+1FB98 | DF |
🮙 | U+1FB99 | A9 |
Escape | Byte | Escape | Byte | Escape | Byte |
---|---|---|---|---|---|
{bell} | 07 | {black} | 90 | {blk} | 90 |
{blue} | 1F | {blu} | 1F | {brn} | 95 |
{brown} | 95 | {cbm-*} | DF | {cbm-+} | A6 |
{cbm--} | DC | {cbm-0} | 30 | {cbm-1} | 81 |
{cbm-2} | 95 | {cbm-3} | 96 | {cbm-4} | 97 |
{cbm-5} | 98 | {cbm-6} | 99 | {cbm-7} | 9A |
{cbm-8} | 9B | {cbm-9} | 29 | {cbm-@} | A4 |
{cbm-^} | DE | {cbm-a} | B0 | {cbm-b} | BF |
{cbm-c} | BC | {cbm-d} | AC | {cbm-e} | B1 |
{cbm-f} | BB | {cbm-g} | A5 | {cbm-h} | B4 |
{cbm-i} | A2 | {cbm-j} | B5 | {cbm-k} | A1 |
{cbm-l} | B6 | {cbm-m} | A7 | {cbm-n} | AA |
{cbm-o} | B9 | {cbm-pound} | A8 | {cbm-p} | AF |
{cbm-q} | AB | {cbm-r} | B2 | {cbm-s} | AE |
{cbm-t} | A3 | {cbm-up arrow} | DE | {cbm-u} | B8 |
{cbm-v} | BE | {cbm-w} | B3 | {cbm-x} | BD |
{cbm-y} | B7 | {cbm-z} | AD | {clear} | 93 |
{clr} | 93 | {control-0} | 92 | {control-1} | 90 |
{control-2} | 05 | {control-3} | 1C | {control-4} | 9F |
{control-5} | 9C | {control-6} | 1E | {control-7} | 1F |
{control-8} | 9E | {control-9} | 12 | {control-:} | 1B |
{control-;} | 1D | {control-=} | 1F | {control-@} | 00 |
{control-a} | 01 | {control-b} | 02 | {control-c} | 03 |
{control-d} | 04 | {control-e} | 05 | {control-f} | 06 |
{control-g} | 07 | {control-h} | 08 | {control-i} | 09 |
{control-j} | 0A | {control-k} | 0B | {control-left arrow} | 06 |
{control-l} | 0C | {control-m} | 0D | {control-n} | 0E |
{control-o} | 0F | {control-pound} | 1C | {control-p} | 10 |
{control-q} | 11 | {control-r} | 12 | {control-s} | 13 |
{control-t} | 14 | {control-up arrow} | 1E | {control-u} | 15 |
{control-v} | 16 | {control-w} | 17 | {control-x} | 18 |
{control-y} | 19 | {control-z} | 1A | {cr} | 0D |
{cyan} | 9F | {cyn} | 9F | {delete} | 14 |
{del} | 14 | {dish} | 08 | {down} | 11 |
{ensh} | 09 | {esc} | 1B | {f10} | 82 |
{f11} | 84 | {f12} | 8F | {f1} | 85 |
{f2} | 89 | {f3} | 86 | {f4} | 8A |
{f5} | 87 | {f6} | 8B | {f7} | 88 |
{f8} | 8C | {f9} | 80 | {gray1} | 97 |
{gray2} | 98 | {gray3} | 9B | {green} | 1E |
{grey1} | 97 | {grey2} | 98 | {grey3} | 9B |
{grn} | 1E | {gry1} | 97 | {gry2} | 98 |
{gry3} | 9B | {help} | 84 | {home} | 13 |
{insert} | 94 | {inst} | 94 | {lblu} | 9A |
{left arrow} | 5F | {left} | 9D | {lf} | 0A |
{lgrn} | 99 | {lower case} | 0E | {lred} | 96 |
{lt blue} | 9A | {lt green} | 99 | {lt red} | 96 |
{orange} | 81 | {orng} | 81 | {pi} | FF |
{pound} | 5C | {purple} | 9C | {pur} | 9C |
{red} | 1C | {return} | 0D | {reverse off} | 92 |
{reverse on} | 12 | {rght} | 1D | {right} | 1D |
{run} | 83 | {rvof} | 92 | {rvon} | 12 |
{rvs off} | 92 | {rvs on} | 12 | {shift return} | 8D |
{shift-*} | C0 | {shift-+} | DB | {shift-,} | 3C |
{shift--} | DD | {shift-.} | 3E | {shift-/} | 3F |
{shift-0} | 30 | {shift-1} | 21 | {shift-2} | 22 |
{shift-3} | 23 | {shift-4} | 24 | {shift-5} | 25 |
{shift-6} | 26 | {shift-7} | 27 | {shift-8} | 28 |
{shift-9} | 29 | {shift-:} | 5B | {shift-;} | 5D |
{shift-@} | BA | {shift-^} | DE | {shift-a} | C1 |
{shift-b} | C2 | {shift-c} | C3 | {shift-d} | C4 |
{shift-e} | C5 | {shift-f} | C6 | {shift-g} | C7 |
{shift-h} | C8 | {shift-i} | C9 | {shift-j} | CA |
{shift-k} | CB | {shift-l} | CC | {shift-m} | CD |
{shift-n} | CE | {shift-o} | CF | {shift-pound} | A9 |
{shift-p} | D0 | {shift-q} | D1 | {shift-r} | D2 |
{shift-space} | A0 | {shift-s} | D3 | {shift-t} | D4 |
{shift-up arrow} | DE | {shift-u} | D5 | {shift-v} | D6 |
{shift-w} | D7 | {shift-x} | D8 | {shift-y} | D9 |
{shift-z} | DA | {space} | 20 | {sret} | 8D |
{stop} | 03 | {swlc} | 0E | {swuc} | 8E |
{tab} | 09 | {up arrow} | 5E | {up/lo lock off} | 09 |
{up/lo lock on} | 08 | {upper case} | 8E | {up} | 91 |
{white} | 05 | {wht} | 05 | {yellow} | 9E |
{yel} | 9E |
screenencoding for Unicode
This is a Unicode to PETSCII screen code mapping, including escape sequences for control code screen codes.
Glyph | Unicode | Translated | Glyph | Unicode | Translated |
---|---|---|---|---|---|
–? | U+0020–U+003F | 20–3F | @ | U+0040 | 00 |
A–Z | U+0041–U+005A | 41–5A | [ | U+005B | 1B |
] | U+005D | 1D | a–z | U+0061–U+007A | 01–1A |
£ | U+00A3 | 1C | π | U+03C0 | 5E |
← | U+2190 | 1F | ↑ | U+2191 | 1E |
─ | U+2500 | 40 | │ | U+2502 | 5D |
┌ | U+250C | 70 | ┐ | U+2510 | 6E |
└ | U+2514 | 6D | ┘ | U+2518 | 7D |
├ | U+251C | 6B | ┤ | U+2524 | 73 |
┬ | U+252C | 72 | ┴ | U+2534 | 71 |
┼ | U+253C | 5B | ╭ | U+256D | 55 |
╮ | U+256E | 49 | ╯ | U+256F | 4B |
╰ | U+2570 | 4A | ╱ | U+2571 | 4E |
╲ | U+2572 | 4D | ╳ | U+2573 | 56 |
▁ | U+2581 | 64 | ▂ | U+2582 | 6F |
▃ | U+2583 | 79 | ▄ | U+2584 | 62 |
▌ | U+258C | 61 | ▍ | U+258D | 75 |
▎ | U+258E | 74 | ▏ | U+258F | 65 |
▒ | U+2592 | 66 | ▔ | U+2594 | 63 |
▕ | U+2595 | 67 | ▖ | U+2596 | 7B |
▗ | U+2597 | 6C | ▘ | U+2598 | 7E |
▚ | U+259A | 7F | ▝ | U+259D | 7C |
○ | U+25CB | 57 | ● | U+25CF | 51 |
◤ | U+25E4 | 69 | ◥ | U+25E5 | 5F |
♠ | U+2660 | 41 | ♣ | U+2663 | 58 |
♥ | U+2665 | 53 | ♦ | U+2666 | 5A |
✓ | U+2713 | 7A | 🭰 | U+1FB70 | 54 |
🭱 | U+1FB71 | 47 | 🭲 | U+1FB72 | 42 |
🭳 | U+1FB73 | 5D | 🭴 | U+1FB74 | 48 |
🭵 | U+1FB75 | 59 | 🭶 | U+1FB76 | 45 |
🭷 | U+1FB77 | 44 | 🭸 | U+1FB78 | 43 |
🭹 | U+1FB79 | 40 | 🭺 | U+1FB7A | 46 |
🭻 | U+1FB7B | 52 | 🭼 | U+1FB7C | 4C |
🭽 | U+1FB7D | 4F | 🭾 | U+1FB7E | 50 |
🭿 | U+1FB7F | 7A | 🮂 | U+1FB82 | 77 |
🮃 | U+1FB83 | 78 | 🮇 | U+1FB87 | 6A |
🮈 | U+1FB88 | 76 | 🮌 | U+1FB8C | 5C |
🮏 | U+1FB8F | 68 | 🮕 | U+1FB95 | 5E |
🮘 | U+1FB98 | 5F | 🮙 | U+1FB99 | 69 |
Escape | Byte | Escape | Byte | Escape | Byte |
---|---|---|---|---|---|
{cbm-*} | 5F | {cbm-+} | 66 | {cbm--} | 5C |
{cbm-0} | 30 | {cbm-9} | 29 | {cbm-@} | 64 |
{cbm-^} | 5E | {cbm-a} | 70 | {cbm-b} | 7F |
{cbm-c} | 7C | {cbm-d} | 6C | {cbm-e} | 71 |
{cbm-f} | 7B | {cbm-g} | 65 | {cbm-h} | 74 |
{cbm-i} | 62 | {cbm-j} | 75 | {cbm-k} | 61 |
{cbm-l} | 76 | {cbm-m} | 67 | {cbm-n} | 6A |
{cbm-o} | 79 | {cbm-pound} | 68 | {cbm-p} | 6F |
{cbm-q} | 6B | {cbm-r} | 72 | {cbm-s} | 6E |
{cbm-t} | 63 | {cbm-up arrow} | 5E | {cbm-u} | 78 |
{cbm-v} | 7E | {cbm-w} | 73 | {cbm-x} | 7D |
{cbm-y} | 77 | {cbm-z} | 6D | {left arrow} | 1F |
{pi} | 5E | {pound} | 1C | {shift-*} | 40 |
{shift-+} | 5B | {shift-,} | 3C | {shift--} | 5D |
{shift-.} | 3E | {shift-/} | 3F | {shift-0} | 30 |
{shift-1} | 21 | {shift-2} | 22 | {shift-3} | 23 |
{shift-4} | 24 | {shift-5} | 25 | {shift-6} | 26 |
{shift-7} | 27 | {shift-8} | 28 | {shift-9} | 29 |
{shift-:} | 1B | {shift-;} | 1D | {shift-@} | 7A |
{shift-^} | 5E | {shift-a} | 41 | {shift-b} | 42 |
{shift-c} | 43 | {shift-d} | 44 | {shift-e} | 45 |
{shift-f} | 46 | {shift-g} | 47 | {shift-h} | 48 |
{shift-i} | 49 | {shift-j} | 4A | {shift-k} | 4B |
{shift-l} | 4C | {shift-m} | 4D | {shift-n} | 4E |
{shift-o} | 4F | {shift-pound} | 69 | {shift-p} | 50 |
{shift-q} | 51 | {shift-r} | 52 | {shift-space} | 60 |
{shift-s} | 53 | {shift-t} | 54 | {shift-up arrow} | 5E |
{shift-u} | 55 | {shift-v} | 56 | {shift-w} | 57 |
{shift-x} | 58 | {shift-y} | 59 | {shift-z} | 5A |
{space} | 20 | {up arrow} | 1E |
ADC | 61 65 69 6D 71 75 79 7D | AND | 21 25 29 2D 31 35 39 3D |
---|---|---|---|
ASL | 06 0A 0E 16 1E | BCC | 90 |
BCS | B0 | BEQ | F0 |
BIT | 24 2C | BMI | 30 |
BNE | D0 | BPL | 10 |
BRK | 00 | BVC | 50 |
BVS | 70 | CLC | 18 |
CLD | D8 | CLI | 58 |
CLV | B8 | CMP | C1 C5 C9 CD D1 D5 D9 DD |
CPX | E0 E4 EC | CPY | C0 C4 CC |
DEC | C6 CE D6 DE | DEX | CA |
DEY | 88 | EOR | 41 45 49 4D 51 55 59 5D |
INC | E6 EE F6 FE | INX | E8 |
INY | C8 | JMP | 4C 6C |
JSR | 20 | LDA | A1 A5 A9 AD B1 B5 B9 BD |
LDX | A2 A6 AE B6 BE | LDY | A0 A4 AC B4 BC |
LSR | 46 4A 4E 56 5E | NOP | EA |
ORA | 01 05 09 0D 11 15 19 1D | PHA | 48 |
PHP | 08 | PLA | 68 |
PLP | 28 | ROL | 26 2A 2E 36 3E |
ROR | 66 6A 6E 76 7E | RTI | 40 |
RTS | 60 | SBC | E1 E5 E9 ED F1 F5 F9 FD |
SEC | 38 | SED | F8 |
SEI | 78 | STA | 81 85 8D 91 95 99 9D |
STX | 86 8E 96 | STY | 84 8C 94 |
TAX | AA | TAY | A8 |
TSX | BA | TXA | 8A |
TXS | 9A | TYA | 98 |
ASL | 0A | BGE | B0 |
---|---|---|---|
BLT | 90 | CPA | C1 C5 C9 CD D1 D5 D9 DD |
GCC | 4C 90 | GCS | 4C B0 |
GEQ | 4C F0 | GGE | 4C B0 |
GLT | 4C 90 | GMI | 30 4C |
GNE | 4C D0 | GPL | 10 4C |
GVC | 4C 50 | GVS | 4C 70 |
LSR | 4A | ROL | 2A |
ROR | 6A | SHL | 06 0A 0E 16 1E |
SHR | 46 4A 4E 56 5E |
This processor is a standard 6502 with the NMOS illegal opcodes.
ANC | 0B | ANE | 8B |
---|---|---|---|
ARR | 6B | ASR | 4B |
DCP | C3 C7 CF D3 D7 DB DF | ISB | E3 E7 EF F3 F7 FB FF |
JAM | 02 | LAX | A3 A7 AB AF B3 B7 BF |
LDS | BB | NOP | 04 0C 14 1C 80 |
RLA | 23 27 2F 33 37 3B 3F | RRA | 63 67 6F 73 77 7B 7F |
SAX | 83 87 8F 97 | SBX | CB |
SHA | 93 9F | SHS | 9B |
SHX | 9E | SHY | 9C |
SLO | 03 07 0F 13 17 1B 1F | SRE | 43 47 4F 53 57 5B 5F |
AHX | 93 9F | ALR | 4B |
---|---|---|---|
AXS | CB | DCM | C3 C7 CF D3 D7 DB DF |
INS | E3 E7 EF F3 F7 FB FF | ISC | E3 E7 EF F3 F7 FB FF |
LAE | BB | LAS | BB |
LXA | AB | TAS | 9B |
XAA | 8B |
This processor is an enhanced version of standard 6502 with some illegal opcodes.
BRA | 12 | SAC | 32 |
---|---|---|---|
SIR | 42 |
GRA | 12 4C |
---|
ANC | 0B | JAM | 02 |
---|---|---|---|
LDS | BB | NOP | 04 0C 14 1C 80 |
SBX | CB | SHA | 93 9F |
SHS | 9B | SHX | 9E |
SHY | 9C |
AHX | 93 9F | AXS | CB |
---|---|---|---|
LAE | BB | LAS | BB |
TAS | 9B |
This processor is an enhanced version of standard 6502.
ADC | 72 | AND | 32 |
---|---|---|---|
BIT | 34 3C 89 | BRA | 80 |
CMP | D2 | DEC | 3A |
EOR | 52 | INC | 1A |
JMP | 7C | LDA | B2 |
ORA | 12 | PHX | DA |
PHY | 5A | PLX | FA |
PLY | 7A | SBC | F2 |
STA | 92 | STZ | 64 74 9C 9E |
TRB | 14 1C | TSB | 04 0C |
CLR | 64 74 9C 9E | CPA | D2 |
---|---|---|---|
DEA | 3A | GRA | 4C 80 |
INA | 1A |
This processor is an enhanced version of standard 65C02.
Please note that the bit number is not part of the instruction name (like
rmb7 $20
). Instead it's the first element of coma separated
parameters (e.g. rmb 7,$20
).
BBR | 0F 1F 2F 3F 4F 5F 6F 7F | BBS | 8F 9F AF BF CF DF EF FF |
---|---|---|---|
NOP | 44 54 82 DC | RMB | 07 17 27 37 47 57 67 77 |
SMB | 87 97 A7 B7 C7 D7 E7 F7 |
This processor is an enhanced version of R65C02.
STP | DB | WAI | CB |
---|
HLT | DB |
---|
This processor is an enhanced version of 65C02.
ADC | 63 67 6F 73 77 7F | AND | 23 27 2F 33 37 3F |
---|---|---|---|
BRL | 82 | CMP | C3 C7 CF D3 D7 DF |
COP | 02 | EOR | 43 47 4F 53 57 5F |
JMP | 5C DC | JSL | 22 |
JSR | FC | LDA | A3 A7 AF B3 B7 BF |
MVN | 54 | MVP | 44 |
ORA | 03 07 0F 13 17 1F | PEA | F4 |
PEI | D4 | PER | 62 |
PHB | 8B | PHD | 0B |
PHK | 4B | PLB | AB |
PLD | 2B | REP | C2 |
RTL | 6B | SBC | E3 E7 EF F3 F7 FF |
SEP | E2 | STA | 83 87 8F 93 97 9F |
STP | DB | TCD | 5B |
TCS | 1B | TDC | 7B |
TSC | 3B | TXY | 9B |
TYX | BB | WAI | CB |
WDM | 42 | XBA | EB |
XCE | FB |
CLP | C2 | CPA | C3 C7 CF D3 D7 DF |
---|---|---|---|
CSP | 02 | HLT | DB |
JML | 5C DC | SWA | EB |
TAD | 5B | TAS | 1B |
TDA | 7B | TSA | 3B |
This processor is an enhanced version of standard 65C02.
ADC | 63 67 73 77 | AND | 23 27 33 37 |
---|---|---|---|
CMP | C3 C7 D3 D7 | DIV | 4F 5F 6F 7F |
ENT | 22 | EOR | 43 47 53 57 |
JSR | FC | LDA | A3 A7 B3 B7 |
MMU | EF | MUL | 0F 1F 2F 3F |
NXA | 42 | NXT | 02 |
ORA | 03 07 13 17 | PEA | F4 |
PEI | D4 | PER | 62 |
PHD | DF | PLD | CF |
REA | 44 | REI | 54 |
REP | C2 | RER | 82 |
RHA | 4B | RHI | 0B |
RHX | 1B | RHY | 5B |
RLA | 6B | RLI | 2B |
RLX | 3B | RLY | 7B |
SBC | E3 E7 F3 F7 | SEA | 9F |
SEP | E2 | STA | 83 87 93 97 |
STP | DB | SWA | EB |
TAD | BF | TDA | AF |
TIX | DC | TRX | AB |
TXI | 5C | TXR | 8B |
TXY | 9B | TYX | BB |
WAI | CB | XBA | EB |
XCE | FB | ZEA | 8F |
CLP | C2 | CPA | C3 C7 D3 D7 |
---|---|---|---|
HLT | DB |
This processor is an enhanced version of R65C02.
ASR | 43 44 54 | ASW | CB |
---|---|---|---|
BCC | 93 | BCS | B3 |
BEQ | F3 | BMI | 33 |
BNE | D3 | BPL | 13 |
BRA | 83 | BSR | 63 |
BVC | 53 | BVS | 73 |
CLE | 02 | CPZ | C2 D4 DC |
DEW | C3 | DEZ | 3B |
INW | E3 | INZ | 1B |
JSR | 22 23 | LDA | E2 |
LDZ | A3 AB BB | NEG | 42 |
PHW | F4 FC | PHZ | DB |
PLZ | FB | ROW | EB |
RTS | 62 | SEE | 03 |
STA | 82 | STX | 9B |
STY | 8B | TAB | 5B |
TAZ | 4B | TBA | 7B |
TSY | 0B | TYS | 2B |
TZA | 6B |
ASR | 43 | BGE | B3 |
---|---|---|---|
BLT | 93 | NEG | 42 |
RTN | 62 |
CLR | 64 74 9C 9E |
---|
This processor is an enhanced version of 65CE02.
MAP | 5C |
---|
EOM | EA |
---|