DMA setup performance penalty?
-
- Very interested
- Posts: 710
- Joined: Sat Feb 18, 2012 2:44 am
Hi,
If I understand correctly, during vblank I can copy up to 7k odd bytes from the CPU to the VDP. But that's the theoretical maximum of one DMA transfer.
How much bandwidth would I lose if I set up multiple DMA transfers? Does setting up and stopping a DMA transfer on the hardware side take enough time to lose a non-trivial amount of bandwidth?
Thanks!
djcc
-
- Very interested
- Posts: 3131
- Joined: Thu Nov 30, 2006 9:46 pm
- Location: France - Sevres
- Contact:
I would say it depends a lot on how many DMA transfers you do and how you do your DMA setup!
If you do it through SGDK, which does a lot of "safe" work such as setting the VRAM increment register and checking for split transfers (crossing a 128 KB bank), you will lose a lot of time. If you use a very fast asm setup method that only writes the DMA registers, then you won't lose much (unless you plan to split into many short DMAs).
By the way, do you need that because of some crappy PCM playback issues?
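To make the "split transfer" check concrete, here is a minimal sketch in C of what such a safety check might look like. It is not SGDK's actual code; the `dma_piece` type and `dma_split_128k` name are made up for illustration, and it assumes a single transfer is never longer than 128 KB.

```c
#include <stdint.h>

/* One piece of a (possibly split) transfer; all values are in bytes.
   Hypothetical type, not taken from SGDK. */
typedef struct {
    uint32_t src;  /* 68K source address */
    uint16_t dst;  /* VRAM destination address */
    uint32_t len;  /* length in bytes */
} dma_piece;

/* Split a transfer when its source range crosses a 128 KB boundary.
   Writes 1 or 2 pieces to out and returns how many were written.
   Assumes len <= 128 KB. */
static int dma_split_128k(uint32_t src, uint16_t dst, uint32_t len,
                          dma_piece out[2])
{
    /* Bytes left before the next 128 KB boundary of the source. */
    uint32_t room = 0x20000u - (src & 0x1FFFFu);

    if (len <= room) {
        out[0] = (dma_piece){ src, dst, len };
        return 1;
    }
    out[0] = (dma_piece){ src, dst, room };
    out[1] = (dma_piece){ src + room, (uint16_t)(dst + room), len - room };
    return 2;
}
```

Doing this once when the data is laid out, instead of on every transfer, is one way to keep the per-frame setup cheap.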
-
- Very interested
- Posts: 292
- Joined: Sat Apr 21, 2007 1:14 am
Re: DMA setup performance penalty?
djcouchycouch wrote: How much of bandwidth would I lose if I setup multiple DMA transfers? Does the setup and stopping of the DMA transfer on the hardware side take enough time to lose a non-trivial amount of bandwidth?
When I was doing some FMV tests there was no appreciable change in speed between doing one huge transfer and a lot of small transfers. Eventually there's a point of diminishing returns, but I tried DMA'ing an 8K data block broken into repeated transfers of sizes from 1x8K down to 8x1K with roughly the same performance.
I did precalculate the DMA parameters so I could move them directly from a table in memory to the VDP registers; there was no calculation of VRAM addresses or source addresses at transfer time. I can see there being some overhead in calculating those parameters every time when doing small DMA transfers.
The process of entering and exiting DMA takes hardly any time at all, so there isn't a significant penalty for when the VDP takes over the 68K bus or releases it afterwards. It's only a few clock cycles.
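As an illustration of precalculating the parameters, the sketch below builds the five register writes (VDP registers 19 to 23, DMA length and source) plus the VRAM write command with the DMA bit set, so that vblank code only has to copy words from a table to the control port. The `dma_entry` type and `dma_precalc` name are assumptions for this example, not code from the post above.

```c
#include <stdint.h>

/* Precalculated values for one 68K -> VRAM DMA (hypothetical layout). */
typedef struct {
    uint16_t regs[5];  /* writes for VDP registers 19..23 (0x93xx..0x97xx) */
    uint32_t cmd;      /* VRAM write command with the DMA bit set */
} dma_entry;

/* src and len_bytes are in bytes; the VDP counts both in words. */
static dma_entry dma_precalc(uint32_t src, uint16_t vram, uint16_t len_bytes)
{
    dma_entry e;
    uint32_t src_w = src >> 1;                 /* source address in words */
    uint16_t len_w = (uint16_t)(len_bytes >> 1); /* length in words */

    e.regs[0] = (uint16_t)(0x9300u | (len_w & 0xFFu));          /* length low  */
    e.regs[1] = (uint16_t)(0x9400u | (len_w >> 8));             /* length high */
    e.regs[2] = (uint16_t)(0x9500u | (src_w & 0xFFu));          /* source low  */
    e.regs[3] = (uint16_t)(0x9600u | ((src_w >> 8) & 0xFFu));   /* source mid  */
    e.regs[4] = (uint16_t)(0x9700u | ((src_w >> 16) & 0x7Fu));  /* source high */

    /* VRAM write + DMA: address bits 13..0 go in the upper word,
       bits 15..14 in the low two bits of the lower word. */
    e.cmd = 0x40000080u
          | ((uint32_t)(vram & 0x3FFFu) << 16)
          | (uint32_t)(vram >> 14);
    return e;
}
```

With a table of such entries built ahead of time, starting a transfer comes down to six writes to the control port, which matches the observation that small transfers cost little when nothing is computed at transfer time.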
-
- Very interested
- Posts: 710
- Joined: Sat Feb 18, 2012 2:44 am
In my setup, I was thinking of building a queue of DMA transfers (with source and destination points precalculated) during the normal game loop and then fire them all during vblank. So I was wondering if I'd be losing any bandwidth if I had multiple DMA calls like that. But according to Charles it shouldn't be a problem. I don't think I'd be hitting the maximum amount of bandwidth, anyway.
I'm not doing it for any sound playback. I was thinking about using it for copying a lot of sprite/tile data from rom to VDP during a frame for lots of dynamic changes, and seeing how much I could get away with.
I'm already using DMA to copy a sprite character's animation to VDP every frame.
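A queue like the one described could be sketched roughly as follows. Everything here is hypothetical (the names, the 32-entry limit, the byte budget, and the callback used in place of real VDP writes); the callback just keeps the sketch testable off-hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define DMA_QUEUE_MAX 32  /* arbitrary limit chosen for this sketch */

typedef struct {
    uint32_t src;  /* 68K source address, bytes */
    uint16_t dst;  /* VRAM destination address, bytes */
    uint16_t len;  /* length, bytes */
} dma_req;

typedef struct {
    dma_req  items[DMA_QUEUE_MAX];
    size_t   count;
    uint32_t bytes;   /* total bytes queued for this vblank */
    uint32_t budget;  /* maximum bytes allowed per vblank */
} dma_queue;

/* Called once per queued request during the flush; on real hardware
   this would write the DMA registers. */
typedef void (*dma_issue_fn)(const dma_req *req, void *ctx);

static void dma_queue_init(dma_queue *q, uint32_t budget)
{
    q->count = 0;
    q->bytes = 0;
    q->budget = budget;
}

/* Queue a transfer during the game loop. Returns 0 if the queue is
   full or the per-vblank byte budget would be exceeded. */
static int dma_queue_push(dma_queue *q, uint32_t src, uint16_t dst,
                          uint16_t len)
{
    if (q->count == DMA_QUEUE_MAX || q->bytes + len > q->budget)
        return 0;
    q->items[q->count++] = (dma_req){ src, dst, len };
    q->bytes += len;
    return 1;
}

/* Fire every queued transfer (call from vblank), then empty the queue.
   Returns the number of transfers issued. */
static size_t dma_queue_flush(dma_queue *q, dma_issue_fn issue, void *ctx)
{
    size_t n = q->count;
    for (size_t i = 0; i < n; i++)
        issue(&q->items[i], ctx);
    q->count = 0;
    q->bytes = 0;
    return n;
}

/* Stand-in issuer for host-side testing: sums the bytes "transferred"
   into the uint32_t that ctx points to. */
static void dma_sum_bytes(const dma_req *req, void *ctx)
{
    *(uint32_t *)ctx += req->len;
}
```

Rejecting a push once the budget is reached lets the game loop defer lower-priority updates to the next frame instead of overrunning vblank.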
-
- Very interested
- Posts: 2440
- Joined: Tue Dec 05, 2006 1:37 pm
- Location: Estonia, Rapla City
- Contact:
A VRAM transfer queue like that is the way to go. It will maximize use of the VBL.
For sound you will need a synchronous driver that stays out of ROM for the duration of the VBL and uses precached data to last that time.
What are you reading? You won't understand it anyway
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen
-
- Very interested
- Posts: 710
- Joined: Sat Feb 18, 2012 2:44 am
I just realized that even if I could fit 3 kilobytes of bandwidth, it would only be about 96 tiles. A bit less than what I was hoping/expecting! I don't really have any specific effects in mind, but wow, that's not a lot. Currently just copying the main character's tiles from ROM to VDP is 36 tiles.
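The arithmetic behind those numbers, as a small sketch: a tile is 8x8 pixels at 4 bits per pixel, so 32 bytes. The helper names are made up for this example.

```c
#include <stdint.h>

/* One 8x8 tile at 4 bits per pixel: 8 * 8 * 4 / 8 = 32 bytes. */
enum { TILE_BYTES = 32 };

/* How many whole tiles fit in a given number of bytes. */
static uint32_t tiles_for_bytes(uint32_t bytes)
{
    return bytes / TILE_BYTES;
}

/* How many bytes a given number of tiles occupies. */
static uint32_t bytes_for_tiles(uint32_t tiles)
{
    return tiles * TILE_BYTES;
}
```

So `tiles_for_bytes(3 * 1024)` gives the 96 tiles mentioned above, and the 36-tile main character takes `bytes_for_tiles(36)`, which is 1152 bytes of that budget.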
-
- Very interested
- Posts: 3131
- Joined: Thu Nov 30, 2006 9:46 pm
- Location: France - Sevres
- Contact:
djcouchycouch wrote: I just realized that even if I could fit 3 kilobytes of bandwidth, it would only be about 96 tiles. A bit less than what I was hoping/expecting! I don't really have any specific effects in mind, but wow, that's not a lot. Currently just copying the main character's tiles from ROM to VDP is 36 tiles.
3 KB is not much; I guess you can expect to get close to 5 KB.
But the key is that you should interleave sprite tile updates.
For instance, you split the whole update over 4 frames, where:
frame 0 = player character
frame 1 = enemies group 1
frame 2 = enemies group 2
frame 3 = bullets / explosions
Of course, that limits you to a maximum of 15 images/s for animation, but that is enough in almost all cases.
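The 4-frame rotation above could be sketched like this. The enum and function names are invented for the example, and the 15 images/s figure assumes a 60 Hz display.

```c
#include <stdint.h>

/* The four update groups from the schedule above (hypothetical names). */
typedef enum {
    GROUP_PLAYER,
    GROUP_ENEMIES_1,
    GROUP_ENEMIES_2,
    GROUP_BULLETS_EXPLOSIONS,
    GROUP_COUNT
} update_group;

/* Which group gets its tiles uploaded on a given frame number. */
static update_group group_for_frame(uint32_t frame)
{
    return (update_group)(frame % GROUP_COUNT);
}

/* Highest animation rate any one group can reach, in images per second,
   when each group is updated once every GROUP_COUNT frames. */
static uint32_t max_anim_rate(uint32_t display_hz)
{
    return display_hz / GROUP_COUNT;
}
```

Each vblank then only queues the tiles for `group_for_frame(frame)`, so each group gets the whole per-frame DMA budget on its turn.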
-
- Very interested
- Posts: 710
- Joined: Sat Feb 18, 2012 2:44 am
Yes, this is a good idea. I don't think I have many (or any) animations that change every frame.