m68k subtraction and absolute value
Moderator: BigEvilCorporation
Hello, I have two quick questions about the m68k. Is the sub (subtraction) instruction signed, and if so, is the sign bit the MSB or the LSB? Also, does zero count as positive or negative as far as the sign bit goes? I want to calculate the delta of two points and then get the ABS (absolute value) of it.
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
It depends on how you look at it... the 68000 add/sub are BOTH signed AND unsigned at the same time. The only thing that matters is how you interpret the results. If you are thinking signed, the two numbers added/subtracted are two's complement numbers, and then you use signed branch operators (mi, pl, gt, ge, lt, le). If you are thinking unsigned, the numbers are absolute integers and you use unsigned branch operators (hi, hs, lo, ls).
If you remember your programming intro, two's complement means that you consider the msb the sign - 0 = positive, and 1 = negative. On the 68000, the msb of the result of any operation is stored in the N flag to show a negative result. In two's complement, 0 is always positive.
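A small C sketch of the point above, for illustration only (the helper names sub16, n_flag and as_signed are made up here, they are not part of any 68000 toolchain): a 16-bit subtract produces one bit pattern, and signed versus unsigned is purely a matter of how that pattern is read afterwards.

```c
#include <stdint.h>

/* Model of sub.w: one 16-bit result, wrapped modulo 65536. */
uint16_t sub16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b);
}

/* Model of the N flag: a copy of the result's most significant bit. */
int n_flag(uint16_t result)
{
    return (result >> 15) & 1;
}

/* Read the same 16 bits as a two's complement (signed) value. */
int32_t as_signed(uint16_t result)
{
    return (result & 0x8000u) ? (int32_t)result - 65536 : (int32_t)result;
}
```

With these definitions, sub16(3, 5) gives the pattern 0xFFFE, which reads as 65534 unsigned and as -2 signed, with n_flag reporting 1. sub16(0, 0) gives 0 with n_flag reporting 0, which is why zero lands on the positive side.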
One way of doing abs() on the 68000 after a subtract would be
Code: Select all
sub.w d1,d0
bpl.b 1f
neg.w d0
1:
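For anyone who reads C more easily, here is a rough model of those three instructions (abs_delta16 is a hypothetical name used only for illustration). It assumes the true difference fits in a signed word; if it does not, the wrapped result has the wrong sign bit and the answer comes out wrong.

```c
#include <stdint.h>

/* d0 and d1 stand in for the low words of the two data registers. */
uint16_t abs_delta16(uint16_t d0, uint16_t d1)
{
    d0 = (uint16_t)(d0 - d1);       /* sub.w d1,d0 : d0 = d0 - d1       */
    if (d0 & 0x8000u) {             /* bpl.b 1f    : skip if N is clear  */
        d0 = (uint16_t)(0u - d0);   /* neg.w d0    : d0 = 0 - d0         */
    }
    return d0;                      /* 1:                                */
}
```

Either operand order gives the same magnitude, so abs_delta16(10, 25) and abs_delta16(25, 10) both come out as 15.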
-
- Very interested
- Posts: 710
- Joined: Sat Feb 18, 2012 2:44 am
Slightly off topic, but the thread is very close to what I've been meaning to post about.
With the information in the preceding posts, I wonder how I could improve my pretty lame abs() function. Basically:
Code: Select all
s16 mathutils_abs(s16 value) // terrible and badly implemented.
{
    if (value < 0)
    {
        value = value * -1;
    }
    return value;
}
Thanks!
DJCC
I saw this:
http://www.strchr.com/optimized_abs_function
They say that you can do this:
Code: Select all
#include <limits.h> /* CHAR_BIT */
int my_abs(int a) {
    /* mask is -1 when a is negative, 0 otherwise (assumes an arithmetic right shift) */
    int mask = (a >> (sizeof(int) * CHAR_BIT - 1));
    return (a + mask) ^ mask;
}
To port it to SGDK, just do something like this:
Code: Select all
int my_abs(s16 a)
{
    s16 mask = (a >> 15);
    return (a + mask) ^ mask;
}
I am not sure, though, whether this works on big endian; it should be the same.
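On the big-endian question: shifts, adds and XOR operate on values, not on the byte order in memory, so endianness should not matter here. What the trick does depend on is the right shift of a negative number being arithmetic, which C leaves implementation-defined (GCC does sign-extend). A quick sketch that compares the mask version against the plain branch version over every s16 value except -32768, whose absolute value does not fit in an s16 anyway. The s16 typedef and the function names are assumptions for the sake of a self-contained example.

```c
#include <stdint.h>

typedef int16_t s16;  /* assumption: stands in for SGDK's s16 */

/* Plain version: compare and negate. */
s16 abs_branch(s16 v)
{
    return (s16)(v < 0 ? -v : v);
}

/* Mask version: mask is -1 for negatives, 0 otherwise
   (relies on an arithmetic right shift). */
s16 abs_mask(s16 a)
{
    s16 mask = (s16)(a >> 15);
    return (s16)((a + mask) ^ mask);
}

/* Counts the inputs in -32767..32767 where the two versions disagree. */
int count_mismatches(void)
{
    int bad = 0;
    for (int v = -32767; v <= 32767; v++) {
        if (abs_branch((s16)v) != abs_mask((s16)v)) {
            bad++;
        }
    }
    return bad;
}
```

If the compiler uses an arithmetic shift, count_mismatches() should report zero disagreements.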
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
Shifting is WAY too slow for an abs() function! Might as well do a multiply by -1 with an actual multiply op!!
The best portable C would probably be
Code: Select all
if (val < 0) val = -val;
The unary minus operator should compile to negate on the 68000. If you don't care for portability, you could do
Code: Select all
static inline s16 absolute(s16 val)
{
    asm("tst.w %0\n\t"
        "bpl.b 1f\n\t"
        "neg.w %0\n"
        "1:\n"
        : "+d" (val) : : "cc" /* "+d": val is read by tst.w as well as written */
    );
    return val;
}
So I attempted to port this: http://en.wikipedia.org/wiki/Bresenham% ... lification
over to the Genesis, but for some reason it does not work correctly. I have looked at each line of code at least 10 times, if not more, and I cannot understand what the problem is.
Code: Select all
draw_line:
;function line(x0, y0, x1, y1)
;x0,y0,x1,y1 are stored in ram with that name
;they are all words
;d0 temp broken after draw pixel
;d1 temp broken after draw pixel
;d2 temp broken after draw pixel
;d3 temp broken after draw pixel
;d4=loopvar
;d5=dx
;d6=dy
;d7=err
;sx and sy are a word each (2 bytes)
;function line(x0, y0, x1, y1)
;dx := abs(x1-x0)
move.w (x0),D0
move.w (x1),D1
sub.w d0,d1 ;second number-first
bpl.w delta_one
neg.w d1
delta_one:
move.w d1,d5;d5=dx
;dy := abs(y1-y0)
move.w (y0),D0
move.w (y1),D1
sub.w d0,d1 ;second number-first
bpl.w delta_two
neg.w d1
delta_two:
move.w d1,d6;d6=dy
;if x0 < x1 then sx := 1 else sx := -1
;see what sx should equal
;cmp.w (x0),(x1) I don't see why you can not do this however I get an illegal addressing error
move.w (x1),d2
cmp.w (x0),d2
bgt.s set_sx;branch if A < B
move.w #-1,(sx)
set_sx:
move.w #1,(sx)
;if y0 < y1 then sy := 1 else sy := -1
;do the same for sy
;cmp.w (y0),(y1) second number appears to have to be a register
move.w (y1),d2
cmp.w (y0),d2
bgt.s set_sy;branch if A < B
move.w #-1,(sy)
set_sy:
move.w #1,(sy)
;d5=dx
;d6=dy
;d7=err
;err := dx-dy
move.w d5,d7;dx
;move.w d6,d0;dy
sub.w d6,d7 ;second number - first
;loop
line_loop:
;draw_pixel parameters
;d0=x
;d1=y
;d2=temp
;d3=color
;setPixel(x0,y0)
lea (bmp_buffer),A0
moveq #0,d2
move.w (x0),d0
move.w (y0),d1
move.b #1,d3
bsr draw_pixel ;breaks d0-d3 and A0
moveq #0,d2
;rts
;if x0 = x1 and y0 = y1 exit loop
move.w (x1),d2
cmp.w (x0),d2
beq check_y
carry_on_carry_on: ;part of the asm fun that you don't get in c is more chances to come up with creative label names that are not to be used again
;e2 := 2*err
;d0=e2
move.w d7,d0
muls.w #2,d0
;if e2 > -dy then
move.w d6,d2
neg.w d2
cmp.w D0,D2;branch if A < B
BGT endif_error_one ;we needed to reverse the conditional statement a bit to avoid a lot of branches
;err := err - dy
sub.w d6,d7;second number - first
;x0 := x0 + sx
;add.w (sx),(x0)
move.w (x0),d2
add.w (sx),d2
move.w d2,(x0)
;end if
endif_error_one:
;if e2 < dx then
cmp.w D0,D5 ;a > b
BLT endif_error_two
;err := err + dx
add.w d5,d7
;y0 := y0 + sy
;add.w (sy),(y0) what would have been one instruction now becomes 3
move.w (y0),d2
add.w (sy),d2
move.w d2,(y0)
;end if
endif_error_two:
;end loop
bra line_loop
end_line:
rts
check_y:
move.w (y1),d2
cmp.w (y0),d2
bne carry_on_carry_on
rts
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
Couple comments:
While the 68000 is rather orthogonal (especially compared to the x86), you don't have COMPLETE orthogonality. Compare has the following forms -
cmp ea,Dn
cmpa ea,An
cmpi #imm,ea
cmpm (An)+,(Am)+
There's no cmp address1,address2.
Never use mul for powers of 2. On the 68000, use add Dn,Dn to double a value; use multiple add Dn,Dn up to *8. For *16 or higher, use lsl #n,Dn instead. That will be much faster than mul.
As to your code, I didn't see anything offhand... what does it actually draw? "Does not work" really doesn't tell us anything useful. Does it draw backwards? The wrong way? Not stop? Stop too early?
In general, if..else looks kinda like this:
Code: Select all
_if:
CMP dx,dy
Bcc _else
_then:
do this
BRA _end
_else:
do that
_end:
It works now but it looks weird
Here is the fixed code.
Code: Select all
;function line(x0, y0, x1, y1)
;dx := abs(x1-x0)
move.w (x0),D0
move.w (x1),D1
sub.w d0,d1 ;second number-first
bpl.w delta_one
neg.w d1
delta_one:
move.w d1,d5;d5=dx
;dy := abs(y1-y0)
move.w (y0),D0
move.w (y1),D1
sub.w d0,d1 ;second number-first
bpl.w delta_two
neg.w d1
delta_two:
move.w d1,d6;d6=dy
;if x0 < x1 then sx := 1 else sx := -1
;see what sx should equal
;cmp.w (x0),(x1) I don't see why you can not do this however I get an illegal addressing error
move.w (x1),d2
cmp.w (x0),d2
bgt.s set_sx;branch if A < B
move.w #-1,(sx)
bra endif_set_sx
set_sx:
move.w #1,(sx)
endif_set_sx:
;if y0 < y1 then sy := 1 else sy := -1
;do the same for sy
;cmp.w (y0),(y1) second number appears to have to be a register
move.w (y1),d2
cmp.w (y0),d2
bgt.s set_sy;branch if A < B
move.w #-1,(sy)
bra endif_set_sy
set_sy:
move.w #1,(sy)
endif_set_sy:
In assembly, 'if a <> b then blah' becomes 'if NOT a <> b then branch'. So, in your case:
* if e2 > -dy then blah
becomes
* if e2 <= -dy then branch
That BGT (Branch if Greater Than) should be BGE (Branch if Greater or Equal).
* if e2 < dx then blah
becomes
* if e2 >= dx then branch
That BLT (Branch if Less Than) should be BLE (Branch if Less or Equal).
Everything else looks OK.
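For reference, here is the same loop written out in C, following the Wikipedia pseudocode quoted in the assembly comments. This is only a sketch: line_points and its output arrays are hypothetical, chosen so the result can be inspected without a draw_pixel routine. The comments mark where the assembly version has to branch on the inverted condition.

```c
#include <stdlib.h>

/* Walks the line from (x0,y0) to (x1,y1) and stores each point in xs[]/ys[].
   Returns the number of points written (never more than max_points). */
int line_points(int x0, int y0, int x1, int y1,
                int *xs, int *ys, int max_points)
{
    int dx = abs(x1 - x0);
    int dy = abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;
    int sy = (y0 < y1) ? 1 : -1;
    int err = dx - dy;
    int n = 0;

    while (n < max_points) {
        xs[n] = x0;                 /* setPixel(x0,y0) */
        ys[n] = y0;
        n++;
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 > -dy) {             /* asm: branch past this when e2 <= -dy */
            err -= dy;
            x0 += sx;
        }
        if (e2 < dx) {              /* asm: branch past this when e2 >= dx */
            err += dx;
            y0 += sy;
        }
    }
    return n;
}
```

Tracing it by hand from (0,0) to (3,1) gives the four points (0,0), (1,0), (2,1), (3,1), which is a handy reference to compare against what the assembly version plots.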
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
sega16 wrote: Thank you, it works now. I applied TascoDLX's changes and changed the muls to add, and after both changes it worked.
I was going to mention the branch conditions, but he beat me to it!
The mul would work fine, it would simply be slower. Also remember that multiplies by small constants can be broken into shifts or adds as well. For example, *6 = *2 + *4.
Code: Select all
add.w d0,d0 ; *2
move.w d0,d1
add.w d1,d1 ; *4
add.w d0,d1 ; *6
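A quick way to sanity-check the decomposition is to mirror the four instructions in C (times6 is a made-up name; d0 and d1 stand in for the low words of the registers):

```c
#include <stdint.h>

uint16_t times6(uint16_t x)
{
    uint16_t d0 = x;
    uint16_t d1;

    d0 = (uint16_t)(d0 + d0);  /* add.w d0,d0  : d0 = x*2         */
    d1 = d0;                   /* move.w d0,d1 : d1 = x*2         */
    d1 = (uint16_t)(d1 + d1);  /* add.w d1,d1  : d1 = x*4         */
    d1 = (uint16_t)(d1 + d0);  /* add.w d0,d1  : d1 = x*4 + x*2   */
    return d1;
}
```

Note that the result ends up in d1, while d0 is left holding x*2 rather than the original value.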
Another big win for speed is to use the registers as much as possible when you can. Avoid memory accesses. In the case of your line drawing routine, you might consider using a few address registers for holding coords. Remember that you can add/sub/cmp address registers as words or longs. Address registers are not as useful for bytes.
For me, add d0,d0 and muls.w #2,d0 did two different things. Here is the ROM:
http://www.mediafire.com/download.php?gzqmperxcrc9nb7 with everything the same, except that the one called bmp test mul.bin uses the muls operation and the one called bmp test add.bin uses addition.
Add works just like it should; mul does not work, it seems to skip some lines.
The reason it runs slowly has nothing to do with the line drawing. It is because I tried using the DMA macros from the Sonic 1 assembly but nothing happened, so I had to use a software transfer to get it to work. Also, my bmp-to-tiles command doubles each pixel horizontally, so when I input 160x112 it outputs 320x112, and the code in the hblank doubles the lines. However, since I was using a software transfer instead of DMA during vblank, I had to disable ints for the software transfer to work.
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
Very weird... I tried it on real hardware to verify it's not an emulator problem... does the same thing.
All I can think of is the fact that muls.w #2,d0 sets d0 as a long, not a word, whereas add.w d0,d0 doesn't affect the upper part of d0, and something in your code is relying on the upper part of d0 remaining the same... maybe in the draw function.
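To make that difference concrete, here is a small C model of what each instruction does to the full 32-bit register. The function names are invented for this sketch, and it only models the register contents, not the flags.

```c
#include <stdint.h>

/* add.w d0,d0 : only the low word changes, the upper word is preserved. */
uint32_t add_w_self(uint32_t d0)
{
    uint32_t low = (d0 + d0) & 0xFFFFu;
    return (d0 & 0xFFFF0000u) | low;
}

/* muls.w #imm,d0 : the low word is treated as a signed 16-bit value,
   multiplied by the signed immediate, and the full 32-bit product
   replaces the whole register. */
uint32_t muls_w_imm(uint32_t d0, int16_t imm)
{
    int32_t low = (int32_t)(d0 & 0xFFFFu);
    if (low & 0x8000) {
        low -= 0x10000;         /* sign-extend the low word */
    }
    return (uint32_t)(low * (int32_t)imm);
}
```

Starting from d0 = 0x12340003, the add.w model leaves 0x12340006 while the muls.w model leaves 0x00000006. With a negative word such as 0x0000FFFE (-2), add.w gives 0x0000FFFC and muls.w gives 0xFFFFFFFC. Any later code that reads d0 as a long, or depends on its upper word, would see different values in the two ROMs.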