Category:68000 Assembly: Difference between revisions


Revision as of 13:42, 20 August 2022


Puppydrum64


Revision as of 17:47, 11 September 2021, Puppydrum64, m (→‎Effective Address)
Latest revision as of 13:42, 20 August 2022, Puppydrum64, m (→‎The Vector Table)
(28 intermediate revisions by the same user not shown)
Line 1:
{{stub}}
{{merge language | M680x0 }}
{{language}}
68000 assembly is the assembly language used for the Motorola 68000, commonly known as the 68K. It should not be confused with the 6800 (which predates it). The Motorola 68000 is a big-endian processor with full 32-bit capabilities (despite most systems that use it being considered 16-bit). It was used in many computers such as the Amiga and the Canon Cat, as well as game consoles such as the Sega Genesis and Neo Geo.

Line 22 ⟶ 24:
This difference isn't relevant in the majority of situations, so don't concern yourself too much with it. It's much more important when doing [[6502 Assembly]], where registers are smaller than the address space.

===Notation Conventions===
How you write the source code depends on your assembler and the syntax it uses. This page is written using Motorola syntax, but there is also MIT syntax, which has different conventions. How you define numbers or text in your code varies wildly between assemblers. I'm using VASM and these are the rules I have to follow, but your assembler may be different.

* A number with a # in front represents a constant, literal value. For example, the 3 in <code>MOVE.B #3,D0</code> represents the number 3.
* A number without a # in front represents a memory location. For example, the 3 in <code>MOVE.B 3,D0</code> represents <i>the byte stored at memory address <code>$00000003</code></i>.
* A number that doesn't have a $ or % prefix is a decimal (base 10) value.
* A number that begins with a $ is a hexadecimal value.
* A number that begins with a % is a binary value.
* Single and double quotes can be used for ASCII values. When you type <code>MOVE.L #"EGGS",D0</code> for example, this is equivalent to <code>MOVE.L #$45474753,D0</code>.
* The first operand of <code>BTST</code>, <code>BSET</code>, <code>BCLR</code>, or <code>BCHG</code> represents a bit position, starting at the rightmost binary digit as 0 and counting up from right to left. For example, <code>BCLR #3,D0</code> performs the same bitwise operation that you would get from doing <code>AND.B #%11110111,D0</code>.
* The operand before the comma is the "source", and the operand after is the "destination." For example, <code>MOVE.L D3,D2</code> takes the value in D3 and stores it into D2, not the other way around. This is the opposite of x86 and ARM, which have the source on the right and the destination on the left.

Data blocks, on the other hand, begin with <code>DC.B</code>, <code>DC.W</code>, or <code>DC.L</code>, and each entry represents a constant numeric value. Strangely, you do NOT prefix these with # to signify them as constants (doing so will cause an error on most assemblers). However, you can use the $ or % modifiers to denote hexadecimal or binary.

Keep in mind that there is no requirement to use hexadecimal, decimal, or binary in your source code. It all gets converted to binary anyway. However, it is recommended to use the notation that matches how your data is meant to be interpreted, for readability purposes.

===Data Registers===
There are eight 32-bit data registers on the 68000, numbered D0-D7. As the name implies, these are designed to hold data. Much like in [[ARM Assembly]], each one is identical in terms of which commands it can use. A command that can be used for D0 can be used for any other D-register.

<lang 68000devpac>MOVE.B #$FF,D0 ;move the hexadecimal value 0xFF into the bottom byte of D0.</lang>

Line 40 ⟶ 68:
<lang 68000devpac>MOVE.L D2,(A5) ;store the contents of D2 into the memory address pointed to by A5.</lang>

Note that it's also possible to transfer values to/from memory directly, without involving address registers at all. For constant memory locations, this is fine. However, the real strength of the address registers is in their pre-decrement and post-increment modes, which constant memory locations cannot use.
<lang 68000devpac>MOVE.L ($00FF0000),D0            ;load the long stored at $00FF0000 into D0.
MOVE.W D1,($00FFFFFE)            ;store the bottom word of D1 at $00FFFFFE.
MOVE.W ($00FF0000),($00FF1000)   ;copy the word stored at $00FF0000 to $00FF1000.</lang>

The use of parentheses is not required on most assemblers, but can be used as a reminder to someone reading your code that these represent the values stored at the specified memory locations rather than literal numbers.

====Post-Increment====

Line 78 ⟶ 114:
====The Stack====
The 68000's stack pointer is commonly referred to as <code>SP</code>, but it is also address register <code>A7</code>. This register is handled differently than the other address registers when pushing bytes onto the stack. A byte value pushed onto the stack will be padded to the <b>right</b>. The stack needs to pad byte-length data so that it can stay word-aligned at all times. Otherwise the CPU would crash as soon as you tried to use the stack for anything other than a byte!

===Length===

Line 108 ⟶ 135:
<lang 68000devpac>MOVE.L #0,D3 ;D3 = #$00000000</lang>

Loading immediate values into address registers is different. You can only move words or longs into address registers, and if you move a word, the value is sign-extended. This means that if the top nibble of the word is 8 or greater, the value gets padded to the left with Fs, and if the top nibble is 7 or less, it is padded with zeroes.
If you're adding a constant value no greater than $7FFF to an address, it's usually safe to use the word length operation, which takes fewer bytes to encode than the long length version.

<lang 68000devpac>MOVEA.W #$8000,A4 ;A4 = #$FFFF8000. Remember the top byte is ignored so this is the same as #$00FF8000.
MOVEA.W #$7FFF,A3 ;A3 = #$00007FFF</lang>

==The Flags==
The flags are stored in the Condition Code Register, also known as the <code>CCR</code>. The 68000 has no built-in commands like <code>CLC</code> for clearing/setting individual flags. Rather, you can alter them directly with <code>MOVE</code>, <code>AND</code>, <code>OR</code>, and <code>EOR</code>. Unfortunately, this means you'll have to remember which bits represent which flags. Or, if your assembler supports macros, you can define a macro that handles this for you.

The flags update automatically after most operations, and take the operand size into consideration when doing so. Check out this example:

<lang 68000devpac>
MOVE.L #$12FF,D0
ADD.B #1,D0
</lang>

Since we used <code>ADD.B</code>, D0 now contains $1200, and the extend, carry, and zero flags are all set. Had we done <code>ADD.W</code>, we would get $1300 in D0 with none of those flags set. The flags are based on what the actual instruction "sees", not the entire register at all times.
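As a rough sketch of the size-dependent flag behavior described above, the same addition can be run at byte length and then at word length, with a conditional branch after each one. This assumes Motorola syntax on an assembler such as VASM, and the label <code>Done</code> is a hypothetical name chosen purely for illustration.

<lang 68000devpac>    MOVE.L #$12FF,D0 ;D0 = $000012FF
    ADD.B #1,D0      ;only the bottom byte takes part: $FF + 1 wraps to $00. D0 = $00001200; X, C and Z are set.
    BCC Done         ;not taken, since the byte-sized add produced a carry.
    MOVE.L #$12FF,D0 ;reload the same starting value.
    ADD.W #1,D0      ;the bottom word takes part: $12FF + 1 = $1300. D0 = $00001300; X, C and Z are clear.
    BEQ Done         ;not taken, since the word-sized result is not zero.
Done:
    RTS</lang>

Neither branch is taken here, so execution simply falls through to <code>Done</code>. Swapping the size suffixes on the two <code>ADD</code> instructions would reverse which branch fires.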
* X: The eXtend flag is bit 4 of the CCR, and is similar to the carry flag. It gets set and cleared for mostly the same reasons, and is used with the <code>ADDX</code>, <code>SUBX</code>, <code>NEGX</code>, <code>ROXL</code>, and <code>ROXR</code> commands. Why the 68000 has both this and the carry flag, I still don't know.

* N: The negative flag is bit 3 of the CCR, and is set when the last operation resulted in a "negative" value. What constitutes a negative value depends on the size of the last operation: $80-$FF for <code>.B</code> instructions, $8000-$FFFF for <code>.W</code> instructions, and $80000000-$FFFFFFFF for <code>.L</code> instructions.

* Z: The zero flag is bit 2 of the CCR and works like you would expect: it's set whenever an operation results in zero. Unlike x86 Assembly, this also includes moving 0 directly into a register, clearing a register <b>or memory</b> with <code>CLR</code>, etc.

* V: The overflow flag is bit 1 of the CCR. It is set whenever a math operation results in a value crossing the signed boundary, which is $7F-$80 for byte-length operations. (Wraparound from 00 to FF doesn't count as overflow, but it does set the carry flag.)

* C: The carry flag is bit 0 of the CCR. It is set when a math operation results in a carry or borrow. Rolling over from FF to 00, or a 1 getting "pushed out" via a bit shift or rotate, sets the carry flag. After a <code>CMP</code>, the carry flag gives the result of the unsigned comparison: carry set means the destination was less than the source, and carry clear means it was greater than or equal.

In truth, the flags live in a 16-bit register, of which the CCR is just the "low half". The SR (status register) is the full 16-bit register. There are a few additional flags in the upper half, which are used by the operating system. You can read these, but for the most part you won't need to write to them.

==The Vector Table==
Certain memory locations have special meanings to the 68000 CPU. Most of these are used for system calls or exception handling. Sixteen of them can be accessed with the <code>TRAP #?</code> command, which takes an immediate constant ? ranging from 0 to 15 as an operand. This effectively equates to a <code>JSR</code> to the address stored at <code>$00000080 + (4 * ?)</code>. For the most part, what each <code>TRAP</code> does depends on the system's firmware, so you'll need to read the documentation. Some machines allow the user to define their own TRAPs at $00000080 - $000000BF; on others they are hardcoded. Still others take a "middle ground" and have the hard-coded trap read a function pointer from some other address that the user is allowed to write to, and jump to that.

The 68000's vector table contains more than this, but to keep things simple a few things are omitted here.

* Address $00000000 contains the initial value of the stack pointer. You never need to <code>MOVE.L (0),SP</code>; the CPU does this automatically.
* Address $00000004 contains the program start. The second thing the CPU does (after loading the default stack pointer from $00000000) is <code>JMP</code> to the address stored in $00000004. (Note that it doesn't jump TO $00000004; it loads the 4 bytes stored there and makes that the new program counter value.)
* The next 8 longwords are the hardware traps. Each holds the memory location of a function or procedure meant to handle a hardware error (the 68000 doesn't have segmentation faults, but it's a similar concept). Among them are handlers for division by zero, signed overflow, etc.
* Interrupt requests and user-defined traps go here as well.

Programming your own trap is a lot like programming a typical subroutine. However, there are a few differences:
* The statement to return to your regular program must be <code>RTE</code> rather than <code>RTS</code>. Otherwise, you'll most likely crash the CPU.
* It's best to push all registers (D0-D7 and A0-A6) onto the stack at the start and pop them off at the end. This is not required for the traps you call with the <code>TRAP #?</code> command, but for the others it's absolutely necessary, since you can't know in advance when they'll happen. You can leave out A7 when doing this, since that's the stack pointer itself.

==Interrupts==
The 68000 supports 7 different interrupts, often called IRQs or Interrupt Requests. Enabling interrupts is often a twofold process. First, the interrupt source must be enabled, which is usually an implementation-defined process that involves interacting with memory-mapped ports. Second, the status register must be set accordingly to allow the interrupt to occur, which can be achieved with <code>MOVE #$2x00,SR</code>, where x is the desired interrupt level. x can range from 0 to 7, and for an interrupt to occur, its interrupt level (which is determined by the hardware implementation and the placement of the desired address in the vector table) must be greater than x, or it will not happen (even if the source is enabled).

==Alignment==

Line 174 ⟶ 212:
<lang 68000devpac>DC.W $1000,$2000,$3000,$4000</lang>

Another way is to pad the data with an extra byte, so that there is an even number of entries in the table. This becomes impractical with large data tables, so the <code>EVEN</code> directive can be placed after a series of bytes. If the byte count is odd, <code>EVEN</code> will pad the data with an extra byte. If it's already even, the <code>EVEN</code> command is ignored. This saves you the trouble of counting a long series of bytes, and you don't have to worry about wasting space.

<lang 68000devpac>MyString:
    DC.B "HELLO WORLD 12345678900000",0
    EVEN ;some assemblers require this to be on its own line</lang>

A third way is to perform a "dummy read." This is when a value is read from an address using pre-decrement or post-increment with the sole purpose of moving the pointer; the value being read is of zero interest. This method lets you work with mixed data types in the same table, but it requires the programmer to know in advance where the byte-length data begins and ends.

<lang 68000devpac>TestData:
    DC.B $02,$03,$04
    DC.W $0345

    LEA TestData,A0    ;load effective address of TestData into A0.
                       ;A1 is assumed to already point to a word-aligned destination.
    MOVE.B (A0)+,(A1)+ ;copy $02 to a new memory location
    MOVE.B (A0)+,(A1)+ ;copy $03 to a new memory location
    MOVE.B (A0)+,(A1)+ ;copy $04 to a new memory location
    ;if we did MOVE.W (A0)+,(A1)+ now we'd crash. First we need to adjust the pointers.
    MOVE.B (A0)+,D7    ;dummy read to D7. Now A0 is word aligned.
    MOVE.B (A1)+,D7    ;dummy read to D7. Now A1 is word aligned.
    MOVE.W (A0)+,(A1)+ ;copy $0345 to a new memory location</lang>

Using <code>ADDA.L #1,A0</code> and <code>ADDA.L #1,A1</code> instead of the dummy reads would also have worked. The 68000 gives the programmer a lot of different ways to do a task.

==Subroutines==
Subroutines work exactly the same as they do in [[6502 Assembly]]. Even the commands are the same: <code>JSR</code> and <code>RTS</code>. The only difference is that return spoofing doesn't require the return address to be decremented by 1.
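A minimal sketch of return spoofing on the 68000, assuming Motorola syntax: <code>PEA</code> pushes a full 32-bit address onto the stack, and the following <code>RTS</code> pops it into the program counter as though it were a genuine return address. The label <code>Target</code> is a hypothetical name used only for illustration.

<lang 68000devpac>    PEA Target ;push the address of Target onto the stack, with no adjustment needed.
    RTS        ;pop that address into the program counter, which jumps to Target.

Target:
    ;execution continues here.
    RTS        ;return to whoever originally called this code with JSR or BSR.</lang>

On the 6502 the pushed address would have to be one less than the intended destination. Here the address is pushed as-is.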
The 68000 also adds <code>BSR</code> for nearby subroutines. These still need to end in an <code>RTS</code> just the same, but <code>BSR</code> saves CPU cycles compared to a <code>JSR</code>.

==Citations==
#'Akuyou', Keith. <i>Learn Multiplatform Assembly Programming with Chibiakumas!</i> Las Vegas, NV. Self-published, 05 April 2021.
#[[https://www.chibiakumas.com/68000/ ChibiAkumas Motorola 68000 Tutorial]]
#[[http://www.easy68k.com/paulrsm/doc/trick68k.htm 68000 Tricks and Traps]]