Category:8086 Assembly

8086 Assembly is the assembly language used by the Intel 8086 processor. Its close variant, the 8088, was used in the original IBM PC and in its various clones. The 8086 gave rise, starting with the 80186 processor, to the x86 family, which today is the most widely used processor family in desktop computers. All the 32-bit and 64-bit processors from this family can operate in an 8086-compatible mode (real mode), for backward compatibility with legacy software and for running very low-level code (like the BIOS). For the evolution of this assembly language to 32 bits, see X86 assembly.

Architecture Overview

Segmented Memory

The 8086 uses a segmented memory model, similar to the Super Nintendo Entertainment System. Unlike the banked memory models used in the Commodore 64 and late NES games, segment addresses are held in segment registers. Each segment register is 16 bits wide; its value is left-shifted by 4 and added to the 16-bit pointer register of interest to form the 20-bit memory address to look up. The 8086 has four segment registers in total: the Code Segment (CS), Data Segment (DS), Extra Segment (ES), and Stack Segment (SS). CS and SS are tied to the instruction pointer and the stack pointer, so DS and ES are the two that the programmer normally changes freely. On the 8086, a segment register cannot be loaded with an immediate value; it can only be loaded from a 16-bit register or memory operand, or with the POP command. So first you must load the segment value into a register such as AX, THEN into the segment register.

;This is NOT valid code!
mov ds, @data ;you're trying to move the data segment location directly into DS. You can't!
 
 
;This is the proper way:
mov ax, @data ;I chose AX but I could have used BX, CX, or DX.
mov ds, ax ;load DS with the data segment.
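
As a worked example of the shift-and-add calculation described above, the sketch below writes one byte through ES:DI. The segment value 0B800h is an assumption chosen for illustration (on IBM PC compatibles it is the colour text-mode video buffer); any other segment value works the same way.

```asm
mov ax, 0B800h            ;the segment value goes through a data register first
mov es, ax                ;ES = 0B800h
mov di, 0010h             ;offset within the segment
mov byte ptr es:[di], 41h ;physical address = (0B800h shl 4) + 0010h = 0B8010h
```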

Data Registers

There are four data registers: AX, BX, CX, and DX. While most commands can use any of them, some only work with particular data registers. System calls in particular are very specific about which registers can be used for what. On MS-DOS, the AH register (the upper half of AX) selects which function an INT 21h system call performs; the interrupt number itself is written as part of the INT command.
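
A minimal sketch of such a system call follows. It assumes DS already points at the segment containing the string, and the label Message is a hypothetical name used only for this example. DOS function 09h prints the '$'-terminated string found at DS:DX.

```asm
mov ah, 09h             ;AH selects the DOS function: 09h = print string
mov dx, offset Message  ;DS:DX must point at a '$'-terminated string
int 21h                 ;call DOS

mov ax, 4C00h           ;AH = 4Ch (exit program), AL = 00h (return code)
int 21h

Message:                ;hypothetical label; never executed, the program exits above
byte "Hello$"
```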

Each data register is 16 bits wide, but can also be accessed as two 8-bit halves whose names end in H (high byte) or L (low byte), e.g. AH, AL. Instructions can be executed using the whole register or just half of it.

mov ax, 1000h ;move 1000h into AX, or equivalently, move 10h into AH and 00h into AL.

Moving a value that fits in 8 bits into ?X is the same as moving it into ?L and moving 0 into ?H. (? represents the data register of your choice. They are all the same in this regard.)

mov ax,0030h
;is the same as:
mov al, 30h
mov ah, 0h

Commands that alter one half of a register will not affect the other half.

mov ax,00FFh
inc al

If we had executed INC AX, then AX would have been incremented to 0x0100. Since we only incremented AL, AL wraps around from 0xFF to 0x00 without carrying into AH, so AH still equals 0x00 and AX as a whole is now 0x0000!

Generally speaking, the 8086's registers serve the following purposes:

  • AX is the "Accumulator" and is used for advanced mathematics routines, as well as the source/destination for the STOSW and LODSW commands when loading/storing words from consecutive regions of memory (the byte versions, STOSB and LODSB, use AL).
  • BX can be used as a variable offset on the Source Index/Destination Index registers (more on those later).
  • CX is used as a loop counter. The JCXZ command jumps to the specified address, but only if CX = 0. In addition, CL is used to specify a shift amount when performing bit shifts and rotates. On later x86 CPUs, this can be any constant value, but on the original 8086 you can only specify 1 or CL.
  • DX contains the "high word" of a 32-bit product when multiplying AX with another 16-bit register. For example, if AX contains 0x2000 and BX contains 0x10, the command MUL BX will create the product 0x20000, with DX containing 0x0002 and AX containing 0x0000. In addition, DX is also used with the IN and OUT commands when selecting which ports to read from/write to external hardware.
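
The register roles listed above can be sketched in a few lines. The label Skipped is a hypothetical name used only for this example; the values are the ones from the multiplication example in the list.

```asm
mov ax, 2000h
mov bx, 0010h
mul bx                  ;DX:AX = AX * BX = 00020000h, so DX = 0002h and AX = 0000h

mov ax, 0001h
mov cl, 4               ;on the 8086, a shift count other than 1 must be placed in CL
shl ax, cl              ;AX = 0010h

mov cx, 0
jcxz Skipped            ;jump is taken, because CX = 0
inc ax                  ;not executed
Skipped:                ;AX is still 0010h here
```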

Other Registers

When writing to or reading from consecutive sections of memory, it is helpful to apply an offset from a base value. The Base Pointer register BP, Source Index SI, and Destination Index DI can point to various regions of memory. The string commands that work with SI and DI can auto-increment or decrement them after each load or store. In addition, SI and DI can optionally be offset by a constant, the value stored in BX, or both at the same time. Unlike the data registers, BP, DI, and SI cannot be split in half and worked on separately. Only data registers allow you to work in 8-bit.

The syntax for offsetting an index register will vary depending on your assembler. Many assemblers accept several different ways of writing it, and the choice comes down to personal preference.

mov si, offset MyArray
mov bx,2
 
mov al,[bx+si] ;loads decimal 30 into AL
 
MyArray:
byte 10,20,30,40,50
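
The other two features mentioned above, a constant offset combined with BX and auto-increment, can be sketched with the same MyArray. This assumes DS points at the segment that contains MyArray.

```asm
mov si, offset MyArray
mov bx, 2
mov al, [bx+si+1]       ;constant + BX + SI: loads MyArray[3], decimal 40

cld                     ;clear the direction flag so SI counts upward
mov si, offset MyArray
lodsb                   ;AL = 10, and SI now points at the 20
lodsb                   ;AL = 20, and SI now points at the 30
```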

The Stack

As with Z80 Assembly, you can't push 8-bit registers onto the stack. For data registers, you have to push/pop both halves together as one 16-bit register.

;This is valid code.
push ax
push bx
 
pop bx
pop ax
;This is NOT valid code.
push ah
push al
push bh
push bl
 
 
pop bl
pop bh
pop al
pop ah

As with all processors that use a stack, if you push one or more registers and want to restore the backed-up values correctly, you must pop them in the reverse order. You can pop them out of order on purpose to swap registers around. In fact, this is a quick way to move the segment from DS into ES, or vice-versa:

push DS
pop ES ;you can't do "mov es, ds" but you can do this!

The proper way to use the stack to preserve registers:

 
call foo
 
mov ax,4C00h
int 21h ;exit program and return to DOS. Instruction pointer cannot move beyond this except with a function call.
 
 
foo:
push ax
push bx
push cx
 
pop cx
pop bx
pop ax
ret

If one of the push/pop commands in the routine above were missing, the RET instruction would not return to where it came from. As long as you pop the same number of registers at the end as you pushed at the start, the stack is "balanced" and your return instruction will return correctly. This is because RET is effectively POP IP (IP being the instruction pointer, which you can think of as what "line" of code the CPU is on). The CPU assumes the top of the stack holds the correct return address, but has no way of verifying it. If the function you just wrote causes the CPU to crash or jump to a completely different part of the code, there's a good chance you forgot to balance the stack.

Arithmetic

The 8086 has a lot of advanced mathematics commands. Like the 68000, it can multiply and divide. Although the 8086 cannot hold a 32-bit number in a single register, it has many more options for Binary Coded Decimal values, including commands for both the "packed" format (two digits per byte) and the "unpacked" format (one digit per byte). By contrast, 68000 Assembly only has commands for the "packed" format.
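
As a brief sketch of the two Binary Coded Decimal formats, the example below adds 19 + 28 in packed form and 9 + 8 in unpacked form. The operand values are chosen only for illustration.

```asm
;Packed BCD: two decimal digits per byte
mov al, 19h             ;packed BCD 19
add al, 28h             ;the binary sum is 41h, not the decimal answer 47
daa                     ;decimal adjust after addition: AL = 47h (BCD 47)

;Unpacked BCD: one decimal digit per byte
mov ax, 0009h           ;AH = 0, AL = the digit 9
add al, 08h             ;the binary sum is 11h
aaa                     ;ASCII adjust after addition: AH = 01h, AL = 07h (digits 1 and 7)
```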

The 8087 co-processor adds many commands for floating-point mathematics. These instructions typically start with the letter "F" (e.g. FADD, FMUL, etc.). When the 8086 reads an instruction that is meant for the 8087, it passes that instruction to the 8087, which then executes it. The 8087's registers are more complicated than those of the 8086; the 8087 uses a stack-based register system.

While both the 8086 and the 8087 can read the same code, data, and memory, they cannot read the contents of each other's registers. So for example, if you wanted to use the 8087 to perform a calculation and then output the result to the screen, you'd need to give the 8087 the command to do the math and store the result into RAM. Then, the 8086 would read that RAM and output the result to the screen. The 8087 also doesn't have the robust indexed addressing modes of the 8086, so the 8086 will often do the job of looking up a value from a table, then copying that value into a temporary "loading zone" at a known location that the 8087 can more easily read from.

To make sure that the 8086 and 8087 stay in sync, the 8086 can WAIT for the 8087 to finish its current instruction before executing more instructions. This matters whenever the 8086 needs the result of a calculation that the 8087 did: without the wait, it might read from the "loading zone" before the 8087 has actually put the result in it! (Most assemblers will handle this for you.)
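
The hand-off through RAM described above might look like the following sketch. The labels ValueA, ValueB, and Result are hypothetical, the REAL4 data directive is an assumption about a MASM-style assembler, and DS is assumed to point at the segment holding the data.

```asm
finit                       ;reset the 8087
fld dword ptr [ValueA]      ;push ValueA onto the 8087's register stack
fadd dword ptr [ValueB]     ;top of the 8087 stack = ValueA + ValueB
fstp dword ptr [Result]     ;store the sum to RAM and pop the register stack
fwait                       ;the 8086 waits here until the 8087 has finished
mov ax, word ptr [Result]   ;the 8086 can now safely read the low word of the result
mov dx, word ptr [Result+2] ;and the high word

;data, placed where execution will not reach it
ValueA:
real4 1.5
ValueB:
real4 2.25
Result:
real4 0.0
```

The explicit FWAIT before the 8086 reads Result is the synchronisation step the paragraph above describes.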

Looping Constructs

The 8086 has a lot more of these than most CPUs. The most obvious one is LOOP, which will subtract 1 from CX, then jump back to a specified label if CX is nonzero after the subtraction. If CX becomes zero after subtracting 1, then no jump will occur and the instruction pointer simply moves to the next instruction. Because the subtraction happens before the test, starting the loop with CX = 0 makes it run 65,536 times rather than zero times.

mov cx,0100h ;set the loop counter's starting value. This must be outside the loop, otherwise you'll loop forever!
 
foo:
;; your code that you want to loop goes here
loop foo

It's important not to alter CX inside the loop. Sometimes you'll have to, for example if you want to do bit shifts with a shift amount other than 1. There's a simple fix: use the stack to help you out!

mov cx,0100h
 
foo:
push cx
mov cl,2
ror ax,cl
pop cx
loop foo

By using PUSH CX and POP CX, you can temporarily use CX for something else, as long as you restore it before the LOOP instruction.


Another frequently used looping construct is REP. REP can be combined with certain instructions to repeat that instruction, decrementing CX each time, until CX equals zero. However, unlike LOOP, which can repeat a block of instructions, REP can only repeat one. It doesn't work on all instructions, only the "string" instructions, which operate on a block of memory: MOVSB, LODSB, STOSB, CMPSB, and SCASB (each has a variant that ends in W instead of B, for 16-bit data). There are also REPZ and REPNZ, which stand for "Repeat while Zero" and "Repeat while Nonzero" respectively: in addition to stopping when CX reaches zero, REPZ stops as soon as a comparison clears the zero flag, and REPNZ stops as soon as a comparison sets it. These two only work properly with CMPSB and SCASB, as MOVSB, LODSB, and STOSB do not affect the flags. (The CPU doesn't use the flags to detect whether CX equals zero; the countdown of CX performed by REP never updates them.)
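
A short sketch of both forms follows. The labels Source, Dest, and NotFound are hypothetical, and DS and ES are both assumed to point at the segment that holds the two arrays.

```asm
cld                     ;string instructions will count SI and DI upward
mov si, offset Source   ;MOVSB reads from DS:SI
mov di, offset Dest     ;and writes to ES:DI
mov cx, 5               ;number of bytes to copy
rep movsb               ;copy 5 bytes; CX = 0 afterwards

mov di, offset Dest     ;SCASB compares AL with the byte at ES:DI
mov al, 30              ;value to search for
mov cx, 5               ;maximum number of bytes to examine
repnz scasb             ;stop at the first match, or when CX reaches 0
jne NotFound            ;zero flag clear: no byte matched
dec di                  ;DI has stepped one past the match, so back up
NotFound:

;data, placed where execution will not reach it
Source:
byte 10,20,30,40,50
Dest:
byte 5 dup (0)
```

The zero flag is tested after REPNZ, rather than CX, because CX also reaches zero when the match happens on the very last byte.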



See Also

Ralf Brown's Interrupt List - a detailed documentation of MS-DOS system calls
