
Gotchas: Difference between revisions

m (syntax highlighting fixup automation)
Line 10:
===Numeric Literals===
Integer literals used as instruction operands must begin with a #; otherwise, the assembler treats them as a memory address. This applies to any integer representation, even ASCII.
<syntaxhighlight lang="6502asm">LDA 'J' ;load the 8-bit value stored at memory address 0x004A into the accumulator.
ORA 3 ;bitwise OR the accumulator with the 8-bit value stored at memory address 0x0003


LDA #'7' ;load the ASCII code for the numeral 7 (which is 0x37) into the accumulator.</syntaxhighlight>
 
However, data blocks do ''not'' get the # treatment:
 
<syntaxhighlight lang="6502asm">byte $45 ;this is the literal constant value $45, not "the value stored at memory address 0x0045"</syntaxhighlight>
 
 
Line 31:
 
Some examples of bad memory-mapped port I/O for various hardware:
<syntaxhighlight lang="6502asm">INC $2005 ;the intent was to scroll the NES's screen to the right one pixel. That's not gonna happen.
;What actually happens? Who knows! (I made this mistake once long ago.)</syntaxhighlight>
 
===Inverted Carry===
The carry flag is "backwards" compared to most languages with regard to comparisons and subtractions.
On most CPUs, carry set is used to mean "less than", and carry clear is used to mean "greater than or equal." The 6502 is the opposite!
<syntaxhighlight lang="6502asm">LDA #$20
CMP #$19
BCS foo ;this branch is always taken, since #$20 >= #$19. On a CPU where carry set means "less than" (such as x86), the equivalent branch would never be taken!</syntaxhighlight>
 
Not only that, carry clear indicates a borrow when subtracting. This is also the opposite to most CPUs. To do a normal subtraction you need to ''set'' the carry before subtracting.
<syntaxhighlight lang="6502asm">LDA #8
SEC
SBC #4 ;eight minus four</syntaxhighlight>
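
The borrow convention can be modelled in a few lines of Python. This is only a sketch of binary-mode arithmetic (decimal mode and the other flags are ignored), and the names <code>sbc</code> and <code>cmp_carry</code> are made up for illustration.

```python
def sbc(a, m, carry):
    """Sketch of 6502 SBC in binary mode: computes A - M - (1 - C).

    Returns (8-bit result, carry out). Carry out is 1 when no borrow occurred.
    """
    result = a - m - (1 - carry)
    carry_out = 1 if result >= 0 else 0
    return result & 0xFF, carry_out


def cmp_carry(a, m):
    """CMP sets carry like a subtraction with the carry set, but keeps no result."""
    _, carry_out = sbc(a, m, 1)
    return carry_out


print(sbc(8, 4, 1))           # SEC first: (4, 1)
print(sbc(8, 4, 0))           # carry left clear: (3, 1), one too small
print(cmp_carry(0x20, 0x19))  # 1, so BCS is taken
print(cmp_carry(0x19, 0x20))  # 0, so BCC is taken
```

Forgetting the <code>SEC</code> corresponds to the second call, where the answer comes out one too small.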
 
===Carry===
Line 50:
 
Therefore, any time you want to add two numbers without involving the carry flag you have to do this:
<syntaxhighlight lang="6502asm">CLC
ADC ___ ;your operand/addressing mode of choice goes here</syntaxhighlight>
 
Failure to correctly use the carry flag can often result in unexpected "off-by-one" errors.
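
As a rough illustration of where the off-by-one comes from, here is a Python sketch of binary-mode <code>ADC</code>. The function name <code>adc</code> is hypothetical, and decimal mode and the other flags are left out.

```python
def adc(a, m, carry):
    """Sketch of 6502 ADC in binary mode: returns (8-bit result, carry out)."""
    total = a + m + carry
    return total & 0xFF, 1 if total > 0xFF else 0


print(adc(5, 3, 0))  # CLC first: (8, 0)
print(adc(5, 3, 1))  # stale carry from earlier code: (9, 0)

# The carry is what makes multi-byte addition possible: $01FF + $0001
low, carry = adc(0xFF, 0x01, 0)       # CLC before the low byte only
high, carry = adc(0x01, 0x00, carry)  # the high byte consumes the carry
print(hex((high << 8) | low))         # 0x200
```

The same carry input that causes the surprise in the second call is what lets the 16-bit addition at the end work.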
Line 75:
* When using <code>$nn,x</code> or <code>$nn,y</code>, if <code>$nn + x</code> or <code>$nn + y</code> exceeds 255, the instruction will wrap around back to $00 rather than advancing to $0100. For a more concrete example:
 
<syntaxhighlight lang="6502asm">LDX #$FF
LDA $80,X ;loads from address $007F, not $017F</syntaxhighlight>
 
In other words, ''an indexed zero page addressing mode cannot exit the zero page.''
 
This wraparound does '''not''' apply to indexed absolute addressing.
<syntaxhighlight lang="6502asm">LDA $2080,X ;loads from the correct address regardless of the value of X.</syntaxhighlight>
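
The difference between the two addressing modes comes down to how wide the address addition is. A minimal Python sketch, with illustrative function names that are not part of any real tool:

```python
def zero_page_indexed(base, index):
    """Effective address for $nn,X or $nn,Y: the sum is truncated to 8 bits."""
    return (base + index) & 0xFF


def absolute_indexed(base, index):
    """Effective address for $nnnn,X or $nnnn,Y: the sum keeps all 16 bits."""
    return (base + index) & 0xFFFF


print(f"{zero_page_indexed(0x80, 0xFF):04X}")   # 007F
print(f"{absolute_indexed(0x2080, 0xFF):04X}")  # 217F
```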
 
A similar wraparound affects the indirect JMP instruction, which jumps to the 16-bit address stored at the specified address. In the normal case it works as expected:
<syntaxhighlight lang="6502asm">LDA #$20
STA $3000
LDA #$40
STA $3001
JMP ($3000) ;evaluates to JMP $4020</syntaxhighlight>
 
The following will '''not''' execute in the way you expect, because this instruction has a similar bug where it doesn't advance to the next page when calculating the address.
<syntaxhighlight lang="6502asm">LDA #$20
STA $30FF
LDA #$40
STA $3100
JMP ($30FF) ;rather than take the high byte from $3100, the high byte is taken from $3000 instead.</syntaxhighlight>
 
As long as you don't put a number that looks like <code>$nnFF</code> in the parentheses as shown above, you can avoid this bug entirely.
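
The pointer fetch can be sketched in Python as follows. This models the original NMOS 6502 behaviour described above; the function name and the 64 KiB <code>bytearray</code> standing in for memory are assumptions made for the example.

```python
def jmp_indirect_target(memory, pointer):
    """Return the address JMP (pointer) jumps to, including the page-wrap bug."""
    low = memory[pointer]
    # Only the low byte of the pointer is incremented; the page stays the same.
    high_address = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    high = memory[high_address]
    return (high << 8) | low


memory = bytearray(0x10000)

memory[0x3000] = 0x20
memory[0x3001] = 0x40
print(f"{jmp_indirect_target(memory, 0x3000):04X}")  # 4020

memory[0x30FF] = 0x20
memory[0x3100] = 0x40
print(f"{jmp_indirect_target(memory, 0x30FF):04X}")  # 2020, not 4020
```

In the second case the high byte comes from $3000, which still holds $20 from the first example.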
Line 102:
===Numeric Literals===
Integer literals used as instruction operands must begin with a #; otherwise, the assembler treats them as a memory address. This applies to any integer representation, even ASCII.
<syntaxhighlight lang="68000devpac">MOVE.L $12345678,D0 ;move the 32-bit value stored at memory address $12345678 into D0
MOVE.L #$12345678,D0 ;load the D0 register with the constant value $12345678</syntaxhighlight>
 
However, data blocks do ''not'' get the # treatment:
 
<syntaxhighlight lang="68000devpac">DC.B $45 ;this is the literal constant value $45, not "the value stored at memory address 0x0045"</syntaxhighlight>
 
===LEA Does Not Dereference===
When dereferencing a pointer, it is necessary to use parentheses. For example:
<syntaxhighlight lang="68000devpac">MOVEA.L #$A04000,A0 ;load the address $A04000 into A0
MOVE.L A0,D0 ;move the quantity $A04000 into D0
MOVE.B (A0),D0 ;get the 8-bit value stored at memory address $A04000, and store it into the lowest byte of D0.
MOVE.W (4,A0),D1 ;get the 16-bit value stored at memory address $A04004, and store it into the low word of D1.</syntaxhighlight>
 
However, the <code>LEA</code> instruction (load effective address) uses this parentheses syntax, but ''does not dereference!'' For extra weirdness, you don't put a # in front of a literal operand either.
 
<syntaxhighlight lang="68000devpac">LEA $A04000,A0 ;effectively MOVEA.L #$A04000,A0
LEA (4,A0),A0 ;effectively ADDA.L #4,A0</syntaxhighlight>
 
===Partitioned Registers===
One key feature of the 68000 is that its instructions can operate at different "lengths" (8-bit, 16-bit, or 32-bit). When performing an operation at the "word length" (16-bit), only the least significant 16 bits are affected, and the rest of the register is ignored. The flags also only reflect the result of the calculation with respect to the instruction's "length", not the entire register. For example:
<syntaxhighlight lang="68000devpac">MOVE.L #$12345678,D0 ;set the entire register to a known value for demonstration purposes.

MOVE.W #$7FFF,D0 ;D0 = $12347FFF
ADD.W #1,D0 ;D0 = $12348000
TRAPV ;the above operation set the overflow flag, so this instruction will call the signed overflow handler.
;Even though the entire register didn't overflow, the portion we were operating on did, so that counts.</syntaxhighlight>
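
To make the word-length behaviour concrete, here is a small Python sketch of <code>ADD.W</code> on a 32-bit register. The function name <code>add_w</code> is hypothetical, and only the overflow flag is modelled.

```python
def add_w(register, value):
    """Sketch of ADD.W #value,Dn: returns (new 32-bit register, overflow flag)."""
    a = register & 0xFFFF
    b = value & 0xFFFF
    result = (a + b) & 0xFFFF
    # Signed overflow: both inputs share a sign that the 16-bit result lacks.
    overflow = ((a ^ result) & (b ^ result) & 0x8000) != 0
    # The upper 16 bits of the register pass through untouched.
    return (register & 0xFFFF0000) | result, overflow


register, overflow = add_w(0x12347FFF, 1)
print(f"{register:08X}", overflow)  # 12348000 True
```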
 
As implied by the previous example, loading a value at a length less than 32 bits into a register will leave the "high bits" the same. This can often cause subtle errors that lead to your program failing unexpectedly.
 
<syntaxhighlight lang="68000devpac">MOVE.B #16-1,D0 ;loop 16 times
forloop:
; loop body goes here
DBRA D0,forloop</syntaxhighlight>
 
The above code is flawed in that <code>DBRA</code> (and its cousins) ''operate at word length.'' Given that, and the fact that we only loaded the loop counter at byte length, the loop will execute <code>$nn10</code> times instead of the intended <code>$10</code> times, where <code>$nn</code> is the prior value of the register being used as the loop counter.
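
The iteration count can be checked with a short Python simulation. The helper names are illustrative, and $12345678 is an arbitrary stand-in for whatever stale value the register held before the loop.

```python
def move_b(register, value):
    """MOVE.B #value,Dn: only the low 8 bits of the register change."""
    return (register & 0xFFFFFF00) | (value & 0xFF)


def move_w(register, value):
    """MOVE.W #value,Dn: only the low 16 bits of the register change."""
    return (register & 0xFFFF0000) | (value & 0xFFFF)


def dbra_iterations(register):
    """Count how many times the loop body runs before DBRA falls through."""
    counter = register & 0xFFFF  # DBRA only looks at the low word
    iterations = 0
    while True:
        iterations += 1                   # loop body executes
        counter = (counter - 1) & 0xFFFF  # DBRA decrements the word
        if counter == 0xFFFF:             # reached -1: fall through
            return iterations


stale = 0x12345678
print(dbra_iterations(move_b(stale, 16 - 1)))  # 22032, i.e. $5610
print(dbra_iterations(move_w(stale, 16 - 1)))  # 16
```

Loading the counter with <code>MOVE.W</code> (or clearing the register first) gives the intended 16 iterations.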
Line 143:
===Automatic Sign-Extension===
When moving values into ''address registers at word length'', the value is sign-extended first.
<syntaxhighlight lang="68000devpac">MOVEA.W #$8000,A0 ;A0 = $FFFF8000, same as MOVEA.L #$FFFF8000,A0
MOVEA.W #$7FFF,A0 ;A0 = $00007FFF, same as MOVEA.L #$00007FFF,A0</syntaxhighlight>
 
There is no sign-extension when moving values into data registers.
<syntaxhighlight lang="68000devpac">MOVE.W #$FF,D0 ;MOVE.W #$00FF,D0
MOVE.L #$8000,D2 ;MOVE.L #$00008000,D2</syntaxhighlight>
 
If you want sign-extension on data registers, you'll need to do it manually:
<syntaxhighlight lang="68000devpac">MOVE.W #$FF,D0
EXT.W D0 ;D0 = $????FFFF

MOVE.L #$8000,D1
EXT.L D1 ;D1 = $FFFF8000</syntaxhighlight>
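
Sign-extension itself is easy to model. The following Python sketch uses a made-up helper, <code>sign_extend</code>, to reproduce the values in the comments above.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Widen a from_bits-wide two's complement value to to_bits bits."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    extended = (value ^ sign_bit) - sign_bit  # negative if the sign bit was set
    return extended & ((1 << to_bits) - 1)


print(f"{sign_extend(0x8000, 16):08X}")   # FFFF8000, like MOVEA.W #$8000,A0
print(f"{sign_extend(0x7FFF, 16):08X}")   # 00007FFF, like MOVEA.W #$7FFF,A0
print(f"{sign_extend(0xFF, 8, 16):04X}")  # FFFF, like EXT.W on a low byte of $FF
```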
 
=={{header|MIPS Assembly}}==
Line 161:
Due to the way MIPS's instruction pipeline works, an instruction placed after a branch instruction is executed during the branch, ''even if the branch is taken.''
 
<syntaxhighlight lang="mips">move $t0,$zero ;load 0 into $t0
beqz $t0,myLabel ;branch if $t0 equals 0
addiu $t0,1 ;add 1 to $t0. This happens during the branch, even though the program counter never reaches this instruction.</syntaxhighlight>
 
Now, you may think that the 1 gets added first and therefore the branch doesn't take place since the conditions are no longer true. However, this is '''not the case.''' The condition is already "locked in" by the time <code>addiu $t0,1</code> finishes. If you compared again immediately upon arriving at <code>myLabel</code>, the condition would be false.
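
The ordering can be written out as a tiny Python sketch. It is not a pipeline model, just the sequence of events described above, and the function name is invented for the example.

```python
def beqz_with_delay_slot(t0):
    """Sketch of `beqz $t0,label` followed by `addiu $t0,1` in the delay slot.

    Returns (branch taken?, value of $t0 on arrival).
    """
    taken = (t0 == 0)            # the condition is evaluated first and locked in
    t0 = (t0 + 1) & 0xFFFFFFFF   # the delay slot executes whether or not we branch
    return taken, t0


print(beqz_with_delay_slot(0))  # (True, 1): branch taken, yet $t0 is already 1
print(beqz_with_delay_slot(7))  # (False, 8): not taken, the addiu still ran
```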
Line 171:
On earlier versions of MIPS, this also happened when loading from memory. The register you loaded into wouldn't have the new value during the instruction after the load. This "load delay slot" doesn't exist on MIPS III (which the Nintendo 64 uses) but it does exist on the PlayStation 1.
 
<syntaxhighlight lang="mips">la $a0,0xDEADBEEF
lw $t0,($a0) ;load the 32-bit value at memory address 0xDEADBEEF
addiu $t0,5 ;5 is actually added BEFORE the register has gotten its new value from the memory load above. It will be clobbered.</syntaxhighlight>
 
Like with branches, putting a <code>NOP</code> after a load will solve this problem.
Line 181:
Issues with array rank and type should perhaps be classified as gotchas. J's display forms are not serialized forms and thus different results can look the same.
 
<syntaxhighlight lang="j"> ex1=: 1 2 3 4 5
ex2=: '1 2 3 4 5'
ex1
Line 190:
11 12 13 14 15
10+ex2
|domain error</syntaxhighlight>
 
Also, constant values with a single element are "rank 0" arrays (they have zero dimensions) while constant values with some other count of elements are "rank 1" arrays (they have one dimension -- the count of their elements -- they are lists).
Line 198:
Another gotcha with J has to do with function composition and J's concept of "[[wp:Rank_(J_programming_language)|rank]]". Many operations, such as <code>+</code> are defined on individual numbers and J automatically maps these over larger collections of numbers. And, this is significant when composing functions. So, a variety of J's function composition operators come in pairs. One member of the pair composes at the rank of the initial function, the other member of the pair composes at the rank of the entire collection. Picking the wrong compose operation can be confusing for beginners.
 
For example:<syntaxhighlight lang="j"> 1 2 3 + 4 5 6
5 7 9
+/ 1 2 3 + 4 5 6
Line 205:
21
1 2 3 +/@+ 4 5 6
5 7 9</syntaxhighlight>
 
Here, we are adding two lists and then (after the first sentence) summing the result. But as you can see in the last sentence, summing the individual numbers by themselves doesn't accomplish anything useful.
Line 285:
Perl has lists (which are data, and ephemeral) and arrays (which are data structures, and persistent). They are distinct entities but tend to be thought of as interchangeable. Combine this with the idea of <i>context</i>, which can be 'scalar' or 'list', and the results might not be as expected. Consider the handling of results from a subroutine, in a scalar context:
 
<syntaxhighlight lang="perl">sub array1 { return @{ [ 1, 2, 3 ] } }
sub list1 { return qw{ 1 2 3 } }

Line 296:

say scalar array2(); # prints '3', number of elements in array
say scalar list2(); # prints '1', last item in list</syntaxhighlight>
 
The behavior is documented, but does provide an evergreen topic for SO questions and blog posts.
Line 332:
<br>
Forward calls may thwart constant setup, e.g.:
<!--<syntaxhighlight lang="phix">(phixonline)-->
<span style="color: #008080;">forward</span> <span style="color: #008080;">procedure</span> <span style="color: #000000;">p</span><span style="color: #0000FF;">()</span>
<span style="color: #000000;">p</span><span style="color: #0000FF;">()</span>
Line 340:
<span style="color: #0000FF;">?</span><span style="color: #000000;">hello</span> <span style="color: #000080;font-style:italic;">-- fatal error: hello has not been assigned a value</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">procedure</span>
<!--</syntaxhighlight>-->
Not a problem if the first executed statement in your program is a final main(), or more accurately not a problem after such a last statement has been reached.<br>
Quite a few of the standard builtins avoid a similar issue, at the cost of things not officially being "constant" anymore, using a simple flag and setup routine.
Line 395:
strings may be used almost interchangeably in most cases.
 
<syntaxhighlight lang="raku" line>say 123 ~ 456; # join two integers together
say "12" + "5.7"; # add two numeric strings together
say .sqrt for <4 6 8>; # take the square root of several allomorphic numerics</syntaxhighlight>
 
<pre>123456
Line 411:
each object are within. Works great for strings.
 
<syntaxhighlight lang="raku" line>say my $bag = <a b a c a b d>.Bag;
say $bag{'a'}; # a count?
say $bag< a >; # another way</syntaxhighlight>
 
<pre>Bag(a(3) b(2) c d)
Line 421:
But numerics can present unobvious problems.
 
<syntaxhighlight lang="raku" line>say my $bag = (1, '1', '1', <1 1 1>).Bag;
say $bag{ 1 }; # how many 1s?
say $bag{'1'}; # wait, how many?
say $bag< 1 >; # WAT
dd $bag; # The different numeric types LOOK the same, but are different types behind the scenes</syntaxhighlight>
 
<pre>Bag(1 1(2) 1(3))
Line 452:
and we want to print the contents to the console.
 
<syntaxhighlight lang="raku" line>.say for (1,2,3), (4,5,6);</syntaxhighlight>
 
<pre>(1,2,3)
Line 460:
flatten the object due to the single argument rule.
 
<syntaxhighlight lang="raku" line>.say for (1,2,3);</syntaxhighlight>
 
<pre>1
Line 470:
like multiple object parameters even if there is only one. (Note the trailing comma after the list.)
 
<syntaxhighlight lang="raku" line>.say for (1,2,3),;</syntaxhighlight>
 
<pre>(1,2,3)</pre>
Line 476:
Conversely, if we want the flattening behavior when passing multiple objects, we need to manually, explicitly flatten the objects.
 
<syntaxhighlight lang="raku" line>.say for flat (1,2,3), (4,5,6);</syntaxhighlight>
 
<pre>1
Line 508:
 
I've tried to construct an example below which illustrates these pitfalls.
<syntaxhighlight lang="ecmascript">class Rectangle {
construct new(width, height) {
// Create two fields.
Line 540:
System.print(rect.area) // runtime error: Null does not implement *(_)
System.print(rect.diagonal) // runtime error (if previous line commented out)
// Rectangle does not implement 'sqrt(_)'</syntaxhighlight>
 
=={{header|X86 Assembly}}==
===Don't use LOOP===
This doesn't affect the 16-bit 8086, but on later x86 CPUs <code>LOOP</code> is slower than the equivalent:
<syntaxhighlight lang="asm">label:
;loop body goes here
DEC ECX
JNZ label</syntaxhighlight>
 
This is ironic considering that <code>LOOP</code> was originally designed to be a more efficient version of the above construct. (It was more efficient on the original 8086, but not on today's processors.) Thankfully, compilers are "aware" of this and don't use <code>LOOP</code>.
Line 555:
===JP (HL) Does Not Dereference===
For every other instruction, parentheses indicate a dereference of a pointer.
<syntaxhighlight lang="z80">LD HL,(&C000) ;load the word at address &C000 into HL
LD A,(HL) ;treating the value in HL as a memory address, load the byte at that address into A.
EX (SP),HL ;exchange HL with the top two bytes of the stack.

JP (HL) ;set the program counter equal to HL. Nothing is loaded from memory pointed to by HL.</syntaxhighlight>
 
This syntax is misleading since nothing is being dereferenced. For example, if HL equals &8000, then <code>JP (HL)</code> effectively becomes <code>JP &8000</code>. It doesn't matter what bytes are stored at &8000-&8001. In other words, <code>JP (HL)</code> ''does not dereference.''
Line 568:
Depending on how the Z80 is wired, ports can be either 8-bit or 16-bit. This creates somewhat confusing syntax with the <code>IN</code> and <code>OUT</code> commands. A system with 16-bit ports will use <code>BC</code> even though the assembler syntax doesn't change. Luckily, this isn't something that's going to change at runtime. The documentation will tell you how to use your ports.
 
<syntaxhighlight lang="z80">ld a,&46
ld bc,&0734
out (C),a ;write &46 to port &0734 if the ports are 16-bit. Otherwise, it writes to port &34.</syntaxhighlight>
 
Unfortunately, this means that instructions like <code>OTIR</code> and <code>INIR</code> aren't always useful, since the B register is performing double duty as the high byte of the port and the loop counter. On systems with 16-bit ports, the port address therefore changes on every iteration. Not good!
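
A Python sketch makes the moving target visible. It assumes that <code>OTIR</code> decrements B before each write reaches the bus; treat that ordering, and the function name, as assumptions of this example rather than a statement about any particular machine.

```python
def otir_port_sequence(b, c):
    """List the 16-bit port addresses hit by OTIR when it starts with B=b, C=c."""
    ports = []
    count = b if b != 0 else 256  # B = 0 means 256 iterations
    for _ in range(count):
        b = (b - 1) & 0xFF          # assumption: B is decremented before the write
        ports.append((b << 8) | c)  # a 16-bit port sees B as the high byte
    return ports


print([f"{port:04X}" for port in otir_port_sequence(0x03, 0x34)])
# ['0234', '0134', '0034']
```

Three bytes go to three different ports, which is rarely what was intended.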
Line 576:
===RET/RETI/RETN===
Here's one I didn't learn until recently. Depending on the wiring, <code>RETI</code> (return and enable interrupts) and <code>RETN</code> (return from Non-Maskable Interrupt) may end up functioning the same as a normal <code>RET</code>. This means that sometimes you have to use the following (which just makes anyone else reading your code think that you don't know what <code>RETI</code> does.)
<syntaxhighlight lang="z80">EI ;RETI doesn't enable interrupts on this Z80.
RET</syntaxhighlight>
 
Fortunately, there's a bit of a "reverse gotcha" that helps us out. When interrupts are enabled with <code>EI</code>, there is no chance that an interrupt will occur during the next instruction. <code>EI</code> doesn't actually enable interrupts until the instruction ''after'' it is finished.