Sprite List Code Messed Up

Scorpion Illuminati
Interested
Posts: 28
Joined: Fri Oct 02, 2015 4:58 pm

Sprite List Code Messed Up

Post by Scorpion Illuminati » Thu Nov 09, 2017 8:50 pm

I am having a weird problem where the sprite changes randomly and the game soft-locks. ClearSprites and AddSprite are both subroutines that I wrote myself, based on notes from a conversation I had with Sik, while UpdateSprites is just a modification of LoadSpriteTables.

Code: Select all

; *************************************************************
; ClearSprite
; Clears the sprite counter
; *************************************************************
ClearSprites:
    move.b #0, (sprite_count)                                                  ; set sprite count to 0
    rts

; *************************************************************
; AddSprite
; Adds a sprite to the sprite table and increments counter
; d0 - sprites y position
; d1 - width and height tile bits
; d2 - special (flipping, palette index, priority)
; d3 - tile id
; d4 - sprites x position
; a0 - sprite table
; *************************************************************
AddSprite:
    move.b (sprite_count), d5                                              ; sprint count in d5
    move.b d5, d6                                                              ; d6 is used to calculate the sprite table offset
    cmp #80, d5                                                                ; are there too many sprites in the table already?
    blt SpriteLimitNotReached                                               ; if not branch
    rts                                                                              ; return
SpriteLimitNotReached:
    lsl.b #$03, d6                                                               ; calculate sprite table offset (sprite count * sprite size)
    add.w d6, a0                                                                ; increment address by offset
    move.w d0, (a0)                                                            ; move sprites y position into table
    addq.l #4, a0                                                                ; increment address by a word
    move.b d1, (a0)                                                            ; move sprites demensions into table
    addq.l #2, a0                                                                ; increment address by a byte
    move.b d2, (a0)                                                            ; save sprites special bits in table
    addq.l #2, a0                                                                ; increment address by a byte
    addq.l #1, d5                                                                ; increment d5(sprite_count) by 1
    move.b d5, (a0)                                                            ; index of next sprite is d5(sprite_count +1) 
    addq.l #2, a0                                                                ; increment address by a byte
    move.b d3, (a0)                                                            ; save sprites tile id in table
    addq.l #2, a0                                                                ; increment address by a byte
    move.w d4, a0                                                              ; save sprites x position in table
    addq #1, (sprite_count)                                                  ; increment sprite counter by 1
    rts

; *************************************************************
; Updates and draws the sprite table
; *************************************************************
UpdateSprites:
    lea (spritelist_table), a0                                                 ; sprite table in a0
    move.b (sprite_count), d0                                              ; load sprite counter into d0
    and.l #0x000000FF, d0
    mulu.w #SizeSpriteDesc, d0                                           ; Offset into sprite table
    swap d0                                                                     ; To upper word
    or.l #vdp_write_sprite_table, d0                                      ; Add to write address
    move.l d0, vdp_control                                                  ; Set read address

    move.l (a0)+, vdp_data                                                 ; 8 bytes of data
    move.l (a0)+, vdp_data
    rts

Code: Select all

LoadSpriteTables:
	; a0 --- Sprite data address
	; d0 (b) Number of sprites

	move.l	#vdp_write_sprite_table, vdp_control
	
	and.l   #0x000000FF, d0
	subq.b	#0x1, d0           ; Minus 1 for counter
	@AttrCopy:
	move.l	(a0)+, vdp_data    ; 8 bytes of data
	move.l	(a0)+, vdp_data
	dbra	d0, @AttrCopy
	
	rts
Notes:

Code: Select all

OK so you want an arbitrary number of sprites
1) Reserve 640 bytes in RAM for a table *Complete*

2) Make a ClearSprites function that sets the sprite count to 0 *Complete?*

3) Make an AddSprite function that inserts a sprite into the table
(may want to check that there aren't too many sprites, i.e. 80)

4) Make an UpdateSprites function that loads the table to VRAM
:P

Incrementing a0
addq.l #1, a0

Replace #1 with a value from 1 to 8
Or replace addq with add if you need a larger value
:v

Storing offset to a0
lea (a0,d5.w), a0
-or-
add.w d5, a0

Pick either
Both do the job here

Then every frame:
1) Call ClearSprites at the beginning
2) Call AddSprite for every sprite you want to draw
3) Call UpdateSprites at the end
You don't need to keep track of sprites once added.
Also if you do run into the limit (80 sprites), huuuuh, I'd say to just drop the excess sprites (i.e. just return without adding them)
Here is a link to the ROM in case you want to see what I mean.

Sincerely,

Scorpion Illuminati
Scorpion Illuminati - An open source rhythm game for the Sega Genesis
http://www.scorpionilluminati.tk

Chilly Willy
Very interested
Posts: 2487
Joined: Fri Aug 17, 2007 9:33 pm

Re: Sprite List Code Messed Up

Post by Chilly Willy » Thu Nov 09, 2017 10:55 pm

Code: Select all

lsl.b #$03, d6
The operand size is too small. If d6 is more than 31, the shift will overflow the byte. Make that lsl.w.
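As a rough illustration of the overflow described above, here is a small C model of the two shift widths. The function names are hypothetical and the code only mimics the arithmetic of lsl.b and lsl.w; it is a sketch, not the game's code.

```c
#include <assert.h>
#include <stdint.h>

/* Models lsl.b #3: only the low 8 bits of the result survive. */
static uint8_t offset_byte_shift(uint8_t sprite_count) {
    return (uint8_t)(sprite_count << 3);
}

/* Models lsl.w #3 on a zero-extended count: 16 bits survive. */
static uint16_t offset_word_shift(uint8_t sprite_count) {
    assert(sprite_count < 80); /* sprite limit mentioned in the thread */
    return (uint16_t)((uint16_t)sprite_count << 3);
}
```

With a count of 32 the byte version wraps back to offset 0, so sprite 32 would overwrite sprite 0, while the word version gives 256 as expected.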

Miquel
Very interested
Posts: 180
Joined: Sat Jul 30, 2016 12:33 am

Re: Sprite List Code Messed Up

Post by Miquel » Fri Nov 10, 2017 11:22 am

Scorpion Illuminati wrote:
Thu Nov 09, 2017 8:50 pm
addq.l #4, a0 ; increment address by a word

[several times:]
addq.l #2, a0 ; increment address by a byte
Oops! What you say is not what you do. To advance by 2 bytes you should add 2, and to advance by 1 byte you should add 1, or better, use post-increment addressing.
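To make the point about post-increment concrete, here is a C sketch in which each store returns the advanced pointer, the way move.w d0,(a0)+ and move.b d1,(a0)+ advance a0 by 2 and 1. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a big-endian word and advance by 2, like move.w d0,(a0)+ */
static uint8_t *put_word(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
    return p + 2;
}

/* Store a byte and advance by 1, like move.b d1,(a0)+ */
static uint8_t *put_byte(uint8_t *p, uint8_t v) {
    *p = v;
    return p + 1;
}

/* Writes one word and two bytes, returns how many bytes were consumed. */
static size_t demo_advance(uint8_t *buf) {
    assert(buf != NULL);
    uint8_t *p = buf;
    p = put_word(p, 0x0123);
    p = put_byte(p, 0x45);
    p = put_byte(p, 0x67);
    return (size_t)(p - buf);
}
```

The pointer always moves by exactly the size of what was stored, so the comments and the increments cannot drift apart.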

Anyway, perhaps you should reconsider the whole thing. What about having a function that takes an Actor and a list of sprites in a custom structure and builds the required hardware sprite list?

AddSprite sounds convenient for starters toying with the machine, but not for an actual game setup.

If you are good at C code consider starting by doing the initial code with C, ask GCC to show the assembler code, learn from it, and finally write your own asm code.
Are you talking about the "monolith"? I'm in communications with them, just tell me the question or ask yourself. If it's about it's movement: yes it really moves, but at a tremendous speed! From the Sun to Jupiter in less than 2 seconds!

Sik
Very interested
Posts: 592
Joined: Thu Apr 10, 2008 3:03 pm

Re: Sprite List Code Messed Up

Post by Sik » Fri Nov 10, 2017 12:46 pm

Miquel wrote:
Fri Nov 10, 2017 11:22 am
Anyway, perhaps you should reconsider the whole thing. What about having a function that takes an Actor and a list of sprites in a custom structure and builds the required hardware sprite list?

AddSprite sounds convenient for starters toying with the machine, but not for an actual game setup.
I've used the clear/add/update setup forever and it works just fine. It's the low-level functionality; anything more complex (like what you suggested) would be built on top of it, not replace it. (EDIT: example I have at hand, note how AddMetasprite piggybacks on top of AddSprite, and you could even have higher level parts of the game piggyback on this the same way)

That he botched the implementation is a whole different matter. He's still trying to get his head around assembly.
Sik is pronounced as "seek", not as "sick".

Miquel
Very interested
Posts: 180
Joined: Sat Jul 30, 2016 12:33 am

Re: Sprite List Code Messed Up

Post by Miquel » Fri Nov 10, 2017 2:17 pm

Sik, I break a lot of things in a day; that's how I learn. That's not bad, on the contrary.

I'm not claiming to have The Only Good Solution, just the idea that works best in most environments. Best means it is fast while keeping some common capabilities in mind.

About the "Reset/Clear" function:
Just as a typical 3D game resends all the meshes (triangles) to the render engine every frame, I think it is convenient to reset the sprite list every frame, not as optional behaviour in your custom code but as the built-in working model for most situations. Are there exceptions? Of course! But given the complexity of an MD game, this approach is necessary as a basis.

About the "AddSprite" function, but first a little about the 68K philosophy:
The idea behind the design of the 68000 is that, ideally, a function/routine uses all 15 free registers, and when it needs more registers another function is called. It was designed this way because an experienced programmer uses around 10 local variables in a complex function. You are probably thinking that not every function in a program should be complex; if so, don't forget about inlining/macros/defines and their virtues.

This is how the CPU is used to its maximum: you use all 15 registers for your current task; when you need to call another task, you push almost all the registers, push the parameters, call the subroutine, do the task, pop all the registers back, and so on.
So calling a subroutine is a heavy task!

About the "AddSprite":
In other environments, when going for real, you don't call a "PutPixel" or a "PutTriangle" because the constant calling/returning would completely kill the CPU; the same happens on the 68K when using an "AddSprite".

And it's not only that: if you think about how a typical game structures its resources, there is no need for an AddSprite except in very exceptional cases. Usually you have a sprite list array per Actor, not a single sprite. And all the operations you apply to this sprite list are the same. You certainly have to add the current Actor position, but you can also add other features like blinking, mirroring, and so on; those operations affect the whole sprite list in the same way.

I don't claim it to be the only path, but it's a matter of organization and performance.
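A minimal C sketch of the Actor-plus-sprite-list idea described above might look like this. The struct names, fields and the function are all hypothetical; the only hardware facts assumed are the 80-sprite limit from the thread and that VDP sprite coordinates place the top-left of the screen at 128,128.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SPRITES 80

/* One piece of an actor: offset from the actor origin plus VDP fields. */
typedef struct { int16_t dx, dy; uint8_t size; uint16_t attr; } SpritePiece;

/* Hypothetical actor: a position and a list of pieces. */
typedef struct {
    int16_t x, y;
    const SpritePiece *pieces;
    uint8_t piece_count;
} Actor;

/* Unpacked copy of one hardware sprite entry. */
typedef struct { uint16_t y; uint8_t size; uint8_t link; uint16_t attr; uint16_t x; } HwSprite;

/* Appends every piece of the actor to the list in one call and
   returns the new sprite count; excess pieces are dropped. */
static uint8_t build_actor_sprites(const Actor *a, HwSprite *list, uint8_t count) {
    assert(a != 0 && list != 0);
    for (uint8_t i = 0; i < a->piece_count && count < MAX_SPRITES; i++) {
        const SpritePiece *p = &a->pieces[i];
        HwSprite *s = &list[count];
        s->y = (uint16_t)(a->y + p->dy + 128); /* 128 = top of the screen */
        s->x = (uint16_t)(a->x + p->dx + 128); /* 128 = left of the screen */
        s->size = p->size;
        s->attr = p->attr;
        s->link = (uint8_t)(count + 1);        /* provisional link to the next entry */
        count++;
    }
    return count;
}
```

The per-actor work (adding the position, and potentially blinking or mirroring) happens once per piece inside a single call, which is the cost argument being made here. The link of the final entry still has to be set to 0 before upload.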
Scorpion Illuminati
Interested
Posts: 28
Joined: Fri Oct 02, 2015 4:58 pm

Re: Sprite List Code Messed Up

Post by Scorpion Illuminati » Fri Nov 10, 2017 8:11 pm

Hey, I just did some research, took your advice and reworked the code a bit. Here is what I have so far:

Code: Select all

; *************************************************************
; ClearSprite
; Clears the sprite counter
; *************************************************************
ClearSprites:
    move.b #0, (sprite_count)                                                  ; set sprite count to 0
    rts

; *************************************************************
; AddSprite
; Adds a sprite to the sprite table and increments counter
; d0 - sprites y position
; d1 - width and height tile bits
; d2 - special (flipping, palette index, priority)
; d3 - tile id
; d4 - sprites x position
; a0 - sprite table
; *************************************************************
AddSprite:
    move.b (sprite_count), d5                                                  ; sprint count in d5
    move.b d5, d6                                                              ; d6 is used to calculate the sprite table offset
    cmp.b #80, d5                                                              ; are there too many sprites in the table already?
    blt SpriteLimitNotReached                                                  ; if not branch
    rts                                                                        ; return
SpriteLimitNotReached:
    lsl.w #$03, d6                                                             ; calculate sprite table offset (sprite count * sprite size)
    add.w d6, a0                                                               ; increment address by offset
    move.w d0, (a0)+                                                           ; move sprites y position into table
    move.b d1, (a0)+                                                           ; move sprites demensions into table
    move.b d2, (a0)+                                                           ; save sprites special bits in table
    addq.l #1, d5                                                              ; increment d5(sprite_count) by 1
    move.b d5, (a0)+                                                           ; index of next sprite is d5(sprite_count +1) 
    move.b d3, (a0)+                                                           ; save sprites tile id in table
    move.w d4, (a0)+                                                           ; save sprites x position in table
    addq #1, (sprite_count)                                                    ; increment sprite counter by 1
    rts

; *************************************************************
; Updates and draws the sprite table
; *************************************************************
UpdateSprites:
    lea (spritelist_table), a0                                                 ; sprite table in a0
    move.b (sprite_count), d0                                                  ; load sprite counter into d0
	and.l #0x000000FF, d0
	mulu.w #SizeSpriteDesc, d0                                                 ; Offset into sprite table
	swap d0                                                                    ; To upper word
	or.l #vdp_write_sprite_table, d0                                           ; Add to write address
	move.l d0, vdp_control                                                     ; Set read address

	move.l (a0)+, vdp_data                                                     ; 8 bytes of data
	move.l (a0)+, vdp_data
    rts
Yes, Sik is absolutely correct: I am learning 68k assembly language while trying to implement some of the basic engine features, which doesn't make it any easier. :( But I'm really trying. Looking at the sprite table through the gensKMod debugger produces the following output:
Num-Ypos-Xpos-Size-Link-Pal-Tile-Flags
00-0-175-08x08-0-0-263-000
So Ypos, Link and Tile all have incorrect data in the VDP. Using BlastEm's debugger I poked into RAM, and 00ff0014 and above are random each time the game is booted, which is strange. Anyway, here is a link to the latest working copy of the source code. Any assistance in this matter would be greatly appreciated.
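For reference, the VDP expects each 8-byte sprite entry as: Y position (word), size (byte), link (byte), tile/flags attribute (word), X position (word). The AddSprite posted above appears to write the flags byte where the link belongs and splits the attribute word into two separate bytes, which would be consistent with the bad Link and Tile values. Below is a C model of the clear/add/update flow using the hardware layout; all names are hypothetical and it is only a sketch of the logic, not a drop-in replacement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SPRITES 80
#define SPRITE_SIZE 8

static uint8_t sprite_table[MAX_SPRITES * SPRITE_SIZE]; /* the 640-byte RAM table */
static uint8_t sprite_count;

/* The 68000 is big-endian, so the high byte goes first. */
static void store_be16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

static void clear_sprites(void) { sprite_count = 0; }

/* Returns 1 if the sprite was added, 0 if the table is full. */
static int add_sprite(uint16_t y, uint8_t size, uint16_t attr, uint16_t x) {
    if (sprite_count >= MAX_SPRITES) return 0;      /* drop excess sprites */
    uint8_t *e = &sprite_table[(uint16_t)sprite_count * SPRITE_SIZE];
    store_be16(e + 0, y & 0x03FF);                  /* bytes 0-1: Y position */
    e[2] = size & 0x0F;                             /* byte 2: width/height bits */
    e[3] = (uint8_t)(sprite_count + 1);             /* byte 3: link to next sprite */
    store_be16(e + 4, attr);                        /* bytes 4-5: priority, palette, flips, tile */
    store_be16(e + 6, x & 0x01FF);                  /* bytes 6-7: X position */
    sprite_count++;
    return 1;
}

/* Terminates the list and returns how many bytes need uploading to VRAM. */
static uint16_t finish_sprites(void) {
    if (sprite_count == 0) {
        memset(sprite_table, 0, SPRITE_SIZE);       /* one off-screen sprite, link 0 */
        return SPRITE_SIZE;
    }
    sprite_table[(sprite_count - 1) * SPRITE_SIZE + 3] = 0; /* link 0 ends the list */
    assert(sprite_count <= MAX_SPRITES);
    return (uint16_t)(sprite_count * SPRITE_SIZE);
}
```

Two design points worth noting: the last sprite's link is forced to 0 so the VDP stops walking the list, and the upload covers every used entry starting at the beginning of the table, whereas the posted UpdateSprites seems to copy a single 8-byte entry at an offset of count times size.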

Sincerely,

Scorpion Illuminati

Chilly Willy
Very interested
Posts: 2487
Joined: Fri Aug 17, 2007 9:33 pm

Re: Sprite List Code Messed Up

Post by Chilly Willy » Sat Nov 11, 2017 4:11 am

You've got inconsistent operand sizes everywhere.

First you're moving bytes,

Code: Select all

    move.b (sprite_count), d5                                                  ; sprint count in d5
    move.b d5, d6                                                              ; d6 is used to calculate the sprite table offset
    cmp.b #80, d5  
then you're treating them as a different size, without knowing what was in the upper parts of the registers to begin with.

Code: Select all

lsl.w #$03, d6
...
addq.l #1, d5
You need to be sure the registers are not holding stale data larger than a byte, or clear them to begin with.

Code: Select all

    moveq #0,d5
    moveq #0,d6
    move.b (sprite_count), d5                                                  ; sprint count in d5
    move.b d5, d6                                                              ; d6 is used to calculate the sprite table offset
    cmp.b #80, d5  
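The stale-data problem can be modelled in C by treating a data register as a 32-bit value where move.b touches only the low byte and lsl.w touches only the low word. The helper names are hypothetical; this is a sketch of the register behaviour, assuming d6 holds leftover data on entry.

```c
#include <assert.h>
#include <stdint.h>

/* move.b src,dN: replaces bits 0-7 and keeps bits 8-31. */
static uint32_t move_b(uint32_t reg, uint8_t src) {
    return (reg & 0xFFFFFF00u) | src;
}

/* lsl.w #n,dN: shifts the low word and keeps the upper word. */
static uint32_t lsl_w(uint32_t reg, unsigned n) {
    assert(n >= 1 && n <= 8); /* immediate shift counts are 1 to 8 */
    uint16_t low = (uint16_t)((reg & 0xFFFFu) << n);
    return (reg & 0xFFFF0000u) | low;
}

/* Offset that add.w d6,a0 would use, with or without moveq #0,d6 first. */
static uint16_t table_offset(uint32_t d6_before, uint8_t sprite_count, int clear_first) {
    uint32_t d6 = clear_first ? 0u : d6_before; /* moveq #0,d6 */
    d6 = move_b(d6, sprite_count);
    d6 = lsl_w(d6, 3);
    return (uint16_t)d6;
}
```

With leftover data such as 0x12345678 in d6 and a count of 2, the offset comes out as 0xB010 instead of 16, which would send the writes far outside the table.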

Miquel
Very interested
Posts: 180
Joined: Sat Jul 30, 2016 12:33 am

Re: Sprite List Code Messed Up

Post by Miquel » Tue Nov 14, 2017 3:10 pm

About "sprite_count" being a byte, a Golden Rule on a 68K:
Unless necessary, avoid working with byte (.b) or long (.l) sizes; word (.w) is ideal for most cases because it's faster and avoids extra instructions.

Code: Select all

addq #1, (sprite_count)
It's ok, but this is faster:

Code: Select all

move.b d5, (sprite_count)

Code: Select all

move.b (sprite_count), d0
and.l #0x000000FF, d0
It's ok, but this is faster:

Code: Select all

clr.l d0
move.b (sprite_count), d0

Code: Select all

mulu.w #SizeSpriteDesc, d0
SizeSpriteDesc is for sure a 2^n number: use shifting!

Forgive me for repeating myself, but by looking at how GCC builds code you will begin to learn these tricks.

About "UpdateSprites":
Better use DMA for ALL hardware sprites, time on VBLANK is precious!
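As a sketch of what a DMA upload involves, the C helpers below compute the values that would be written to the VDP control port: five register writes for length and source, then the VRAM write command with the DMA bit set. The function names are hypothetical, the 0xF000 sprite table address in the usage note is only an assumed example (the thread never states where vdp_write_sprite_table points), and DMA also has to be enabled in VDP register 1 beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit control word that sets up a VRAM write at vram_addr. */
static uint32_t vdp_vram_write_cmd(uint16_t vram_addr) {
    return 0x40000000u
         | ((uint32_t)(vram_addr & 0x3FFF) << 16)  /* address bits 0-13 */
         | ((uint32_t)(vram_addr >> 14) & 3u);     /* address bits 14-15 */
}

/* Same command with the DMA flag set. */
static uint32_t vdp_vram_dma_cmd(uint16_t vram_addr) {
    return vdp_vram_write_cmd(vram_addr) | 0x80u;
}

/* Register writes (regs 19-23) for a 68k-to-VDP DMA transfer.
   len_words is the transfer length in words; the source must be even. */
static void vdp_dma_setup(uint32_t src_addr, uint16_t len_words, uint16_t regs[5]) {
    assert((src_addr & 1u) == 0);
    uint32_t src = src_addr >> 1;                  /* the VDP takes a word address */
    regs[0] = (uint16_t)(0x9300 | (len_words & 0xFF));
    regs[1] = (uint16_t)(0x9400 | (len_words >> 8));
    regs[2] = (uint16_t)(0x9500 | (src & 0xFF));
    regs[3] = (uint16_t)(0x9600 | ((src >> 8) & 0xFF));
    regs[4] = (uint16_t)(0x9700 | ((src >> 16) & 0x7F));
}
```

For a full 640-byte table at RAM address 0xFF0000 going to VRAM 0xF000, this yields 0x9340, 0x9401, 0x9500, 0x9680, 0x977F followed by the command 0x70000083. Note that address bits 14-15 land in the low bits of the command, so a plain swap of an offset into the upper word only works while the final address stays inside the same 16 KB range.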

One more thing: I hope you have in mind who works with the stack, the callee or the caller. By the looks of things you have decided it's the caller, but this produces larger code; I'm just saying you should know and decide these things. And remember to clear the high part of registers when needed; this is also a good reason for the callee to be the one who works with the stack.

flamewing
Interested
Posts: 38
Joined: Tue Sep 23, 2014 2:39 pm
Location: France

Re: Sprite List Code Messed Up

Post by flamewing » Tue Nov 14, 2017 5:49 pm

Miquel wrote:
Tue Nov 14, 2017 3:10 pm
Unless necessary, avoid working with byte (.b) or long (.l) sizes; word (.w) is ideal for most cases because it's faster and avoids extra instructions.
.b is neither faster nor slower than .w, in general. In some cases, .b is faster (sCC instructions, bCC.s when branch is not taken), in others .w is faster (mostly, when you need a word or long). But avoiding .l in favor of .b or .w is good advice in general.
Miquel wrote:
Tue Nov 14, 2017 3:10 pm

Code: Select all

move.b (sprite_count), d0
and.l #0x000000FF, d0
It's ok, but this is faster:

Code: Select all

clr.l d0
move.b (sprite_count), d0
Use "moveq #0, d0" instead of "clr.l d0", please, and you will save 2 cycles (this is 50% of the moveq time, for comparison). clr microcode was not properly optimized, and it takes an additional microinstruction to clear the register compared to moveq.
Miquel wrote:
Tue Nov 14, 2017 3:10 pm
Forgive me for repeating myself, but by looking at how GCC builds code you will begin to learn these tricks.
GCC is a terrible place to learn assembly in general, and 68k in particular. Sure, it can do some incredible optimizations from source to assembly, but most of the optimizations it does, at least for the 68k, are in the machine-agnostic optimizer. Its 68k optimizer is awful, especially if you don't pass -fomit-frame-pointer, -mpcrel and -mshort. And while -mshort will require you to use a custom stdint.h (to get correct int32_t and intptr_t), its absence will make GCC use .l almost everywhere something ends up being a C int (and promoting to int by accident is ridiculously easy in C).
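The remark about accidental promotion to int can be seen in a couple of lines of standard C: adding two uint8_t values produces an int, and the result only becomes 8-bit again when it is explicitly narrowed. The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the type the compiler actually uses for a + b. */
static size_t sum_width(uint8_t a, uint8_t b) {
    return sizeof(a + b);
}

/* The sum is computed as an int and only then truncated to 8 bits. */
static uint8_t sum_narrow(uint8_t a, uint8_t b) {
    assert(sizeof(a + b) == sizeof(int)); /* the addition happened at int width */
    return (uint8_t)(a + b);
}
```

On a target where int is 32 bits, that intermediate int is what turns into .l operations in the generated 68k code.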

Miquel
Very interested
Posts: 180
Joined: Sat Jul 30, 2016 12:33 am

Re: Sprite List Code Messed Up

Post by Miquel » Tue Nov 14, 2017 6:57 pm

1) .b instructions are 2 byte larger (except for move) so they are necessarily slower.

2) I will check that, but I very much doubt that clr is slower than moveq. If that is the case, there is something wrong with the microcode. Both must be the 4 typical cycles.

3) That's a matter of opinion, GCC is an excellent compiler. In some instances not as good as an expert human, in others better. Perhaps you are too used to good compilers to realize what a bad compiler really is.

EDIT:

About question 2, "clr vs moveq":

What happens is that in the case of clr, the operand is read before the write (a side effect of saving die space). In the case of a register that means nothing: "clr.l d0" and "moveq.l #0, d0" both take the 4 typical cycles. In the case of "clr.l (a0)", yes, it will add 2 more cycles in comparison with "moveq.l #0,(a0)".

I'm checking at what GCC does and:
- I see a lot: "clr.l d0" -like
- Not a single "clr.l (a0)" -like
- Some "clr.w -(%sp)", "clr.b -45(%a6)" -like which, unless I'm much mistaken, are not doable with moveq, in other words: saves cycles by combining instructions.

So it's really not that bad a compiler at the 68K level after all.

Stef
Very interested
Posts: 2627
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Re: Sprite List Code Messed Up

Post by Stef » Tue Nov 14, 2017 8:02 pm

GCC >= 3.0 and < 6.0 were not good at generating m68k code, but things improved greatly since GCC 6.0 (the one SGDK is using), of course it won't replace hand made assembly but it allows decent performance with pure C code.

Miquel
Very interested
Posts: 180
Joined: Sat Jul 30, 2016 12:33 am

Re: Sprite List Code Messed Up

Post by Miquel » Tue Nov 14, 2017 8:50 pm

I think at this point we must be more concrete in our assertions, unless we are just collecting opinions.

GCC's output could be better or worse, but it is never (or only extremely rarely, I must say) wrong, provided you know what you are doing in C. That's a step above human learning, as this thread shows. As a reference point it is more valid than anything else except a dedicated teacher.

Stef
Very interested
Posts: 2627
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Re: Sprite List Code Messed Up

Post by Stef » Tue Nov 14, 2017 11:30 pm

Of course GCC-generated code is always correct! And I second you: just looking at the generated code is a good way to learn...
Myself, sometimes when I need to write assembly code I start with the code GCC generated from my old C function, then I manually tune it :) Most of the time it's much faster than writing all the code in assembly from scratch ;)

flamewing
Interested
Posts: 38
Joined: Tue Sep 23, 2014 2:39 pm
Location: France

Re: Sprite List Code Messed Up

Post by flamewing » Wed Nov 15, 2017 10:04 am

Miquel wrote:
Tue Nov 14, 2017 6:57 pm
1) .b instructions are 2 byte larger (except for move) so they are necessarily slower.
That is not true at all. This is true for immediate .l instructions, but for most other instructions, .b and .w differ by a couple of bits. Those that are different by more than a couple bits are generally shorter in .b by 2 bytes. Hint: I already wrote a 68k disassembler in 68k assembly that is used in a lot of Sonic hacks as a generic error screen.
Miquel wrote:
Tue Nov 14, 2017 6:57 pm
3) That's a matter of opinion, GCC is an excellent compiler. In some instances not as good as an expert human, in others better. Perhaps you are too used to good compilers to realize what a bad compiler really is.
No, that is not it. GCC can do some amazing things on x86 and x86_64, but its peephole optimizer for 68k misses some obvious stuff. For example, it emits code like this:

Code: Select all

move.w d0, d1 ; both d0 and d1 will be used for different things
tst.w d1
bCC.b whatever
It also does not tend to hoist constants out of loops if they are used as address unless you use it more than once, which makes the code slower, and tends to do some dumb stuff. Example at the bottom.

But those are understandable: GCC's peephole optimizer has a lot more rules aimed at later models of the 68k family, which are pipelined, and so these things do not affect performance too much.
Miquel wrote:
Tue Nov 14, 2017 6:57 pm
About question 2, "clr vs moveq":

What happens is that in the case of clr, the operand is read before the write (a side effect of saving die space). In the case of a register that means nothing: "clr.l d0" and "moveq.l #0, d0" both take the 4 typical cycles. In the case of "clr.l (a0)", yes, it will add 2 more cycles in comparison with "moveq.l #0,(a0)".
A hint: when someone says things like "the clr microcode is not well optimized", that is a good hint that they know what they are talking about.

The microcode for clr (on registers) takes one microopcode to clear .w or .b, and two microopcodes to clear .l; the additional microopcode finalizes prefetch, which is concurrently started with one of the word clears (or byte clear, for .b). That makes it 4 cycles for .b or .w, 6 cycles for .l, both with one read cycle and no write cycles.

moveq, on the other hand, copies the byte, sign-extends it and starts prefetch in the first microopcode, and finalizes prefetch in the second. Thus, 4 cycles, with one read cycle and no write cycles.

For memory operands, yes, clr is even worse because it reads from the destination before writing.

You can also check on yacht, which is based on microcode analysis and measurements, and then analyze the patents if you still don't believe me.
Miquel wrote:
Tue Nov 14, 2017 6:57 pm
I'm checking at what GCC does and:
- I see a lot: "clr.l d0" -like
- Not a single "clr.l (a0)" -like
- Some "clr.w -(%sp)", "clr.b -45(%a6)" -like which, unless I'm much mistaken, are not doable with moveq, in other words: saves cycles by combining instructions.
If you only ever do a single .b or .w clr on memory, it will be faster, yes. If you do more than one then doing this will be faster:

Code: Select all

moveq #0,d0
move.w d0,-(sp)
move.b d0,-(sp)
And even if you do a single .l, This will be faster:

Code: Select all

moveq #0,d0
move.l d0,-(sp)
For what it's worth, GCC will generate these if you pass -m68000 or -march=68000, so it is something to do if you don't already.

Here is the example I mentioned above:

Code: Select all

#include "stdint.h"
void InitVDP(uint16_t *init_vals, int size) {
    volatile uint16_t *Ctrl = (uint16_t*)0xC00004;
    for (int ii = 0; ii < size; ii++) {
        *Ctrl = init_vals[ii];
    }
}
This gets compiled by GCC 7.2 with -O3 -m68000 -fomit-frame-pointer -mpcrel to:

Code: Select all

InitVDP(unsigned short*, int):
    move.l 8(sp),d0 ; No way to tell GCC we will always have at least one element
    jle .L1
    move.l 4(sp),a0
    add.l d0,d0 ; No way to tell it count is < 0x18
    add.l a0,d0
.L3:
    move.w (a0)+,d1 ; Why break up into two moves?
    move.w d1,$C00004 ; The address should have been in an address register
    cmp.l d0,a0
    jne .L3 ; what happened to dbra and use count?
.L1:
    rts
Here is what it ought to generate (when also adding -mshort to the command line):

Code: Select all

InitVDP(unsigned short*, int):
    move.w 8(sp),d0 ; No way to tell GCC we will always have at least one element
    jle .L1
    move.l 4(sp),a0
    subq.w #1,d0
    lea $C00004,a1
.L3:
    move.w (a0)+,(a1)
    dbra d0, .L3
.L1:
    rts
That is not even what I would write by hand (no reason to pass the parameters in the stack, since it only uses registers that can be clobbered anyway), but GCC will happily inline this function, so it is not too much of an issue.

And try out what GCC emits when you unroll loop manually. Suddenly, GCC will put Ctrl into a1 and use it (great). But it will also stop using post-increment mode in a0, and will use "move.w N(a0),(a1)". Which is faster in 68020+ (no data dependencies to get in the way of the pipeline), but slower on 68000.

Note that I am not bashing GCC; it is important to know about its shortcomings, especially in code like this, when you only have ~128k cycles per frame.

This post ended up being too long...

Chilly Willy
Very interested
Posts: 2487
Joined: Fri Aug 17, 2007 9:33 pm

Re: Sprite List Code Messed Up

Post by Chilly Willy » Wed Nov 15, 2017 4:06 pm

Page 8-6 of the 68000 user manual clearly states that clr.l reg takes 6 cycles. It's been well known for a long time that moveq #0,d0 is faster than clr.l d0. You only use clr when you need to only clear a byte or word and need the upper bits to remain unchanged.

And yes, gcc is a terrible way to learn assembly for any processor. It's not meant to be human-readable. If you're already experienced in assembly, you can learn a few tricks from it, but I don't recommend beginners look at the gcc generated assembly files. :D
