Array: Difference between revisions

This method is most commonly used with strings. An ASCII value that is not associated with any keyboard key, typically 0, is placed at the end of the string. A typical PrintString assembly routine takes a pointer to the 0th entry of the string as its only parameter. The routine reads from the pointer, prints that character, increments the pointer, and repeats until it reads the terminator, at which point it returns. Without the terminator, the routine would not know when to stop: it would keep reading whatever memory follows the string, printing garbage and possibly crashing. In [[C]], a string literal has a 0 placed at its end automatically, without you having to write it yourself. This method works well for strings and other arrays where the terminator's value is not a possible value for actual data. On more general arrays whose entries represent non-ASCII data, it causes problems whenever a datum just so happens to equal the terminator.
 
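<lang asm>PrintString: ; 8086 Assembly example
; input: [DS:SI] = string pointer
mov al,[ds:si]
inc si ;advance to the next character
cmp al,0 ;mov does not set the flags, so compare against the terminator explicitly
jz Terminated ;we've reached the terminator
call PrintChar ;call to hardware-specific printing routine
jmp PrintString
Terminated:
ret</lang>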
 
<lang asm>HelloText: ;6502, z80, 8086
db "Hello World",0 ;a value in quotes is its ASCII equivalent, anything else is a numeric value.</lang>
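
As a rough C counterpart to the routine above, the sketch below walks a string until it reads the terminator. The function name <code>terminated_length</code> is hypothetical and chosen only for this illustration; it counts characters instead of printing them so that the result is easy to check.

```c
#include <stddef.h>

/* Hypothetical helper (not part of any library): counts the entries
   that come before the first 0, i.e. before the terminator. */
size_t terminated_length(const char *s)
{
    size_t n = 0;
    while (s[n] != 0)   /* stop as soon as the terminator is read */
        n++;
    return n;
}
```

Called on the literal <code>"Hello World"</code> this returns 11, while <code>sizeof "Hello World"</code> is 12 because the compiler appends the terminator. Called on the bytes 5, 0, 7, 0 it returns 1: the 0 that was meant as data is mistaken for the end of the array, which is the weakness described above.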
 
In cases like this, high-level languages often implement <i>escape characters</i>. When one is encountered in a string, the routine branches to a section that reads the next character without checking whether it is a terminator or other control code. Effectively this removes the special meaning of that character, but only if an escape character comes before it. This concept is often reversed to let the programmer easily write ASCII control characters, such as <code>\n</code> for new line (ASCII 10, line feed; some systems output a carriage return, ASCII 13, before it). Here the backslash signals that the next letter stands for a particular ASCII control code: if the next character read is an "n", the new line code is printed instead of the letter, after which normal reading and printing resumes. (In C, the compiler performs this substitution when it reads the string literal, so <code>printf()</code> itself never sees the backslash.)
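
Both uses of the escape character can be sketched in C as a routine that interprets the backslash at run time. This is only an illustration under stated assumptions: <code>expand_escapes</code> is a hypothetical name, the output goes to a caller-supplied buffer rather than to a screen, and only <code>n</code> is treated as a control code.

```c
#include <stddef.h>

/* Hypothetical helper: copies src into out, treating '\' as an escape
   character, and returns the number of bytes written. Reading stops at
   the first 0 that is NOT preceded by a backslash. The caller must make
   sure out is large enough. */
size_t expand_escapes(const char *src, char *out)
{
    size_t n = 0;
    for (;;) {
        char c = *src++;
        if (c == '\\') {
            /* Extra read: this character is never checked against
               the terminator, so an escaped 0 is kept as data. */
            char next = *src++;
            out[n++] = (next == 'n') ? '\n' : next;
            continue;
        }
        if (c == 0)     /* unescaped terminator: stop */
            break;
        out[n++] = c;
    }
    return n;
}
```

Given the seven bytes <code>'A', '\\', 0, 'B', '\\', 'n', 0</code>, the routine writes four bytes: <code>'A'</code>, a literal 0, <code>'B'</code> and a line feed. The escaped 0 is treated as data, and only the final, unescaped 0 ends the string.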
The example below shows a simple implementation of an escape character. The first instance of the null terminator gets printed as-is rather than used to end the routine.

<lang asm>PrintString: ; 8086 Assembly example
; modified to use the \ as an escape character.
; input: [DS:SI] = string pointer
mov al,[ds:si]
inc si ;advance to the next character
cmp al,5Ch ;ascii for backslash, this is the escape character
jz EscapeNextChar
cmp al,0h ;check the terminator
jz Terminated ;we've reached the terminator
call PrintChar ;call to hardware-specific printing routine
jmp PrintString
EscapeNextChar:
mov al,[ds:si] ;perform an additional read, except this read doesn't compare the fetched character to anything.
inc si ;step past the escaped character
call PrintChar ;print that character as-is
jmp PrintString

Terminated:
ret

HelloText:
db "Hello World\",0,13,10,0</lang>

<lang asm>; the plain null-terminated string in 68000 and ARM syntax
HelloText:
dc.b "Hello World",0 ;68000
even ;aligns to the next 2 byte boundary if not already aligned.

HelloText
.byte "Hello World",0 ;ARM
.align 4 ;aligns to the next 4 byte boundary if not already aligned.</lang>
 
=====End Label=====