I've been fighting with my DMA code for too long, so I'm looking for some help.

Here is the story:
In my game, every sprite on screen is 4x4 tiles.
Each sprite has its own dedicated tiles in VRAM.
Every vint, I DMA the tiles of each sprite whose current animation frame has changed (which means I could transfer up to sprite_count * 256 bytes per vint).
According to the DMA doc, the minimum available during vint is 205 bytes * 86 scanlines.
So I should have all the bandwidth I want.
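To make the numbers concrete, here is a rough sketch of the arithmetic in C. It assumes 8x8 tiles at 4 bits per pixel (32 bytes each), which would make a 4x4 sprite 512 bytes, i.e. 256 words, so the 256 figure above may really be words rather than bytes. The 205 and 86 values are simply the ones quoted from the doc, and all names are made up for illustration.

```c
#include <stdint.h>

enum {
    /* Assumption: 8x8 pixels at 4 bits per pixel. */
    BYTES_PER_TILE        = 32,
    /* Every sprite is 4x4 tiles. */
    TILES_PER_SPRITE      = 4 * 4,
    /* Figures quoted from the DMA doc; not verified here. */
    VBLANK_BYTES_PER_LINE = 205,
    VBLANK_LINES          = 86
};

/* Worst case: every sprite changes its animation frame on the same vint. */
uint32_t worst_case_bytes(uint32_t sprite_count)
{
    return sprite_count * TILES_PER_SPRITE * BYTES_PER_TILE;
}

/* Theoretical transfer budget for one vertical blank, in bytes. */
uint32_t vblank_budget_bytes(void)
{
    return (uint32_t)VBLANK_BYTES_PER_LINE * VBLANK_LINES;
}

/* Largest sprite count whose worst case still fits in that budget. */
uint32_t max_sprites_in_budget(void)
{
    return vblank_budget_bytes() / worst_case_bytes(1);
}
```

Under these assumptions, 7 sprites come to 3584 bytes against a budget of 17630 bytes, so on paper the transfer alone should not be the bottleneck.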
Unfortunately, it starts to lag a lot at 7 sprites (yes, only SEVEN!).
What seems to happen:
the main loop runs
the vint handler does not finish in time, so it is still finishing when the main loop (re)starts
Does it mean I can only transfer 205 bytes per DMA call?
How can I tell when I'm out of scanlines (in that case, I would skip the rest of the current DMA queue and keep it for the next vint)?
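What I have in mind for the "skip and keep for next vint" part is roughly the sketch below. It is only a hardware-free model: it counts bytes against a budget instead of reading the scanline counter, the send callback stands in for the real DMA call, and every name in it (DmaRequest, DmaQueue, queue_push, queue_flush, QUEUE_CAPACITY) is hypothetical rather than taken from any library.

```c
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 80  /* hypothetical: one slot per hardware sprite */

typedef struct {
    uint16_t vram_addr;  /* destination address in VRAM */
    uint16_t size;       /* transfer size in bytes */
} DmaRequest;

typedef struct {
    DmaRequest items[QUEUE_CAPACITY];
    size_t count;
} DmaQueue;

/* Stand-in for the real transfer; on hardware this would start the DMA. */
typedef void (*DmaSendFn)(const DmaRequest *req);

/* Append a request. Returns 1 on success, 0 if the queue is full. */
int queue_push(DmaQueue *q, uint16_t vram_addr, uint16_t size)
{
    if (q->count >= QUEUE_CAPACITY) {
        return 0;
    }
    q->items[q->count].vram_addr = vram_addr;
    q->items[q->count].size = size;
    q->count++;
    return 1;
}

/*
 * Send requests in order until the next one would exceed the byte budget,
 * then move the unsent requests to the front of the queue so they are
 * handled first on the next vint. Returns the number of bytes sent.
 */
uint32_t queue_flush(DmaQueue *q, uint32_t budget, DmaSendFn send)
{
    uint32_t sent = 0;
    size_t done = 0;

    while (done < q->count && sent + q->items[done].size <= budget) {
        if (send != NULL) {
            send(&q->items[done]);
        }
        sent += q->items[done].size;
        done++;
    }

    /* Keep the leftovers, preserving their order. */
    size_t left = q->count - done;
    for (size_t i = 0; i < left; i++) {
        q->items[i] = q->items[done + i];
    }
    q->count = left;
    return sent;
}
```

With seven 512-byte requests and a budget of 2048 bytes, the first flush would send four requests and leave three for the next vint. One caveat of this sketch: a single request larger than the budget would never be sent, so the budget has to be at least as large as the biggest transfer.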
Does it mean that although you're able to get 80 sprites on screen, you can't get 80 DIFFERENT sprites?
Do you know how to master all of this?
Or perhaps I'm completely on the wrong track with my 1 sprite = 16 tiles approach?
Thanks for any help, I would like to avoid rewriting everything for nothing...