Posted: Tue Jun 12, 2007 9:33 am
by TmEE co.(TM)
I believe there is something I don't know about the DMA process. The result of the incorrect transfer is the same on both Fusion and my MD... It is really funny: I transfer 320x240 pixels and everything is OK, then I transfer a little less and, after a stretch of correct data, everything gets really messy... I believe some bit gets into the wrong place...
Posted: Wed Jun 13, 2007 3:41 pm
by Jorge Nuno
Well, this is my crappy DMAcopy routine (LOL). It always works in Fusion and can DMA to CRAM too, but it needs heavy optimisation in size and speed (compared to yours).
Can you test it please?
Read the parameters line in the header comment to see the inputs needed to make this work.
Code:
;******************************************************************************************************************************
;DMAcopy 68000MEM -> VRAM
;
;Parameters: a0(Source start), a1(Dest start), d0(Length, bytes), variable: DMA_Launch(1 = VRAM, 2 = CRAM)
;Destroys: -
;Returns: -
;******************************************************************************************************************************
DMA68kVRAM:
MOVEM.l d0-d5/a2-a3, -(a7) ;PUSH 8 regs
LEA VDPcontrol, a2
LEA DMA_Launch, a3
MOVE.w #$8174, (a2) ;Activate DMA
MOVE.l a0, d1 ;Break a0 in 3 pieces(bytes)
ANDI.l #$FFFFFF, d1
LSR.l #1, d1
MOVE.w d1, d2
MOVE.b d2, d3
SWAP d1 ;d1: High order Address source
;d2: Mid order Address source
LSR.l #8, d2 ;d3: Low order Address source
MOVE.w #$9700, d4 ;Base config of reg 23 ($17) DMAset + HiAddress
OR.b d1, d4
MOVE.w d4, (a2)
MOVE.w #$9600, d4 ;Base config of reg 22 ($16) MidAddress
OR.b d2, d4
MOVE.w d4, (a2)
MOVE.w #$9500, d4 ;Base config of reg 21 ($15) LowAddress
OR.b d3, d4
MOVE.w d4, (a2) ;####DMA Source Address configured####
MOVE.l #$9300, d1 ;Base config of reg 19 ($13) WordLength LOW
LSR.l #1, d0 ;#words=#bytes/2
OR.b d0, d1
MOVE.w d1, (a2)
MOVE.l #$9400, d1 ;Base config of reg 20 ($14) WordLength HIGH
LSR.l #8, d0
OR.b d0, d1
MOVE.w d1, (a2) ;####DMA Length configured####
MOVE.l a1, d1 ;Manipulate Dest with a data reg
MOVEQ #$40, d2 ;UPconfig base for VRAM write (MOVEQ also clears the stale upper bits of d2)
CMPI.b #2, (a3) ;Check CRAM write
BNE VRAM
CRAM: ORI.b #$80, d2 ;Toggle to CRAM write
;[1][1][0][0][0][0][0][0]b
SUB.b #1, (a3) ;Deactivate DMA to CRAM
VRAM: MOVE.w d1, d5 ;6 medium bits of Dest Address
LSR #8, d5
ANDI.l #$3f, d5 ;[---][---][A13][A12][A11][A10][A9][A8]
OR.l d5, d2 ;[CD1][CD0][A13][A12][A11][A10][A9][A8]
SWAP d2
LSL.l #8, d2 ;[4/c][Add][0][0][0][0][0][0]h
SUB.b #1, (a3) ;Deactivate DMA to VRAM
;MediumAddress READY
MOVE.b d1, d3 ;[A7][A6][A5][A4][A3][A2][A1][A0]
SWAP d3 ;
ANDI.l #$FF0000, d3 ;CleanUP
;LowerAddress READY
MOVE.w d1, d4 ;2 upper bits of Dest Address
LSL.l #2, d4
SWAP d4 ;[---][---][---][---][---][---][A15][A14]
MOVE.b #$80, d5
OR.b d4, d5 ;LOWpart command(has A15 and a14)
ANDI.l #$83, d5
OR.l d2, d5 ;MEDpart command
OR.l d3, d5 ;Hipart command
MOVE.l d5, (a2) ;####LAUNCH DMA!####
DMAbusyF:
MOVE.w (a2), d0 ;Read status register to check DMA-busy
ANDI.w #$0002, d0 ;Clean Useless bits
BNE DMAbusyF ;if (status bit 1, DMA busy, is set) {wait}, else {continue}
DMAend:
MOVE.w #$8164, (a2) ;Disable DMA
MOVEM.l (a7)+, d0-d5/a2-a3 ;POP 8 regs
RTS ;POP PC
;******************************************************************************************************************************
EDIT: Umm, you are not cleaning d2 enough, I think...
Try putting AND.l #$00030000, d2 instead of clr.w d2 in the LoadTiles section.
EDIT2: DMA routine reviewed a little and translated to English.
Posted: Fri Jun 15, 2007 12:07 pm
by TmEE co.(TM)
Nice and ugly routine of yours. Yesterday I did HUGE optimizations of my code; now the DMA goes wrong only after a longer stretch of correct data. I would post some pictures, but I want to keep my project secret until it is at least playable.
My new and better code:
Code:
LoadTiles: ; D0=Start tile , D3=tiles , A0=Source address
LSL.W #5, D0 ; uses DMA, works incorrectly sometimes
MOVE.W D0, D1
AND.W #$3FFF, D0
OR.W #$4000, D0
SWAP D0
ROL.W #2, D1
AND.W #3, D1
MOVE.W D1, D0
MOVE.L D0, D2
LSL.W #4, D3
MOVE.W D3, D0
JSR DoDMAtoVRAM
RTS
DoDMAtoVRAM: ; Does DMA to VRAM
MOVE.L #CPORT, A1 ; D0 = WORDs to transfer
MOVE.W D0, D1 ; D2 = VRAM address
AND.W #$00FF, D1 ; A0 = source address
OR.W #$9300, D1
SWAP D1
LSR.W #8, D0
OR.W #$9400, D0
MOVE.W D0, D1
MOVE.L D1, (A1)
MOVE.L A0, D0
LSR.L #1, D0
MOVE.W #$9500, D1
OR.B D0, D1
SWAP D1
LSR.L #8, D0
MOVE.W #$9600, D1
OR.B D0, D1
MOVE.L D1, (A1)
MOVEQ #9, D1
LSR.L D1, D0
MOVE.W #$9700, D1
OR.B D0, D1
MOVE.W D1, (A1)
OR.W #$80, D2
MOVE.L D2, (A1)
RTS
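As a cross-check on the arithmetic in LoadTiles and the first part of DoDMAtoVRAM, here is a minimal Python sketch (a model for checking values by hand, not console code). It assumes 32 bytes, i.e. 16 words, per tile, as the LSL.W #5 and LSL.W #4 shifts imply; the function names are made up for illustration.

```python
def tile_vram_address(start_tile):
    """VRAM byte address of a tile index, mirroring LSL.W #5, D0 (16-bit wrap)."""
    return (start_tile << 5) & 0xFFFF


def dma_length_words(tiles):
    """Words written for VDP regs $13/$14, mirroring LSL.W #4, D3 and the $93xx/$94xx packing."""
    words = (tiles << 4) & 0xFFFF      # 16 words per tile, wraps at 16 bits like LSL.W
    low = 0x9300 | (words & 0xFF)      # reg 19: low byte of the word count
    high = 0x9400 | (words >> 8)       # reg 20: high byte of the word count
    return [low, high]


print(hex(tile_vram_address(4)))                 # 0x80
print([hex(w) for w in dma_length_words(100)])   # ['0x9340', '0x9406']
```

Because the shifts are word-sized, a tile count of 4096 or more wraps the word count back around, which the model reproduces with the 16-bit mask.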
Posted: Fri Jun 15, 2007 5:20 pm
by Jorge Nuno
Jorge Nuno wrote:
EDIT: umm you are not cleaning d2 enough i think...
try putting this AND.l #$00030000, d2 instead of clr.w d2 in the LoadTiles section
Did you try this? ^^
It was on my previous post but it was for your old routine
And now my routine uses less registers
Posted: Fri Jun 15, 2007 7:32 pm
by TmEE co.(TM)
Actually I didn't, as I optimized my routines BEFORE you posted your message (and I don't keep old code). BTW, you can change those two ADD.B to OR.B under the LowAddress comment; it should be faster then.
Posted: Fri Jun 15, 2007 10:35 pm
by Jorge Nuno
TmEE co.(TM) wrote:Actually I didn't, as I optimized my routines BEFORE you posted your message (and I don't keep old code). BTW, you can change those two ADD.B to OR.B under the LowAddress comment; it should be faster then.
LOL, I always make my code work in the first place and then try to optimize it if possible.
I edited the DMA post.
Every ADD is now changed to OR; I didn't remember the OR instruction existed when I wrote it.
You could just copy the routine you posted here before and then fix it like I said.
Posted: Sat Jun 16, 2007 10:53 am
by TascoDLX
Jorge Nuno wrote:try putting this AND.l #$00030000, d2 instead of clr.w d2 in the LoadTiles section
Nope, not a problem...
Code:
LoadTiles:
LSL.W #5, D0
SWAP D0
CLR.W D0 ; <- top half of D0 cleared (watch the swaps)
SWAP D0
MOVE.L D0, D2 ; <- D0 copied to D2
LSL.L #2, D2
CLR.W D2
SWAP D2 ; <- here: just two bits in the bottom half
AND.L #$3FFF, D0
SWAP D0 ; <- here: the other fourteen bits in the top half
OR.L D0, D2 ; <- combine
LSL.W #4, D3
MOVE.W D3, D0
JSR DoDMAtoVRAM
RTS
All checks out.
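The same packing can be written out as a small Python model (a sketch for checking values, not console code). It assumes the control-port layout the routines in this thread build: the code bits and A13-A0 in the first word, the DMA bit ($80) and A15-A14 in the second. `dma_command` is a hypothetical name.

```python
def dma_command(dest, cram=False):
    """32-bit control-port command for a DMA write to VRAM (or CRAM) at `dest`."""
    dest &= 0xFFFF
    code = 0xC000 if cram else 0x4000     # $4000 = VRAM write, $C000 = CRAM write
    first = code | (dest & 0x3FFF)        # code bits + A13-A0
    second = 0x0080 | (dest >> 14)        # DMA bit + A15-A14
    return (first << 16) | second


print(hex(dma_command(0x0000)))              # 0x40000080
print(hex(dma_command(0xC000)))              # 0x40000083
print(hex(dma_command(0x0000, cram=True)))   # 0xc0000080
```

Comparing the value left in D2 by LoadTiles against this function for a few destinations is a quick way to confirm that no stray bits survive the swaps.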
Back to the problem at hand: in case you're going from ROM to VRAM, here's a bit from Charles MacDonald's wonderful VDP document:
When a transfer is done out of the ROM area ($000000-3FFFFF), the machine
will lock up unless the write that triggers the DMA operation is done
using RAM.
Usually this means putting the command word or the latter half of the
command word in RAM and moving that into the control port, putting
the command word on the stack and moving that into the control port,
or having the instruction that moves the command word into the control
port execute out of RAM.
Not a problem if the code is executing out of RAM, but might be helpful.
Posted: Sat Jun 16, 2007 12:10 pm
by TmEE co.(TM)
I've read CMD's wonderful docs, the official docs and everything else I've found, and tried everything, but still no luck. I'll bet it is something simple; it happens in Fusion too, and every pixel shows the same garbage... But until I find it, I'll use 68K loops.
Posted: Mon Jun 18, 2007 6:14 am
by TascoDLX
In your "new and better code":
Code:
MOVEQ #9, D1 ; <-- right shift
LSR.L D1, D0 ; <-- by nine?
MOVE.W #$9700, D1
OR.B D0, D1
MOVE.W D1, (A1)
I'm sure it's just a typo as your previous code was correct.
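To see what the extra bit costs, here is a rough Python model of the three source-address register writes (a sketch, not console code). It assumes a 24-bit source address; `high_shift` is a made-up parameter standing for the count loaded by MOVEQ, so 8 is the intended shift and 9 is the one posted.

```python
def dma_source_words(src, high_shift=8):
    """Words written for VDP regs $15/$16/$17 from a 68000 byte address."""
    word_addr = (src & 0xFFFFFF) >> 1          # LSR.L #1: the VDP counts in words
    low = word_addr & 0xFF                     # reg 21
    rest = word_addr >> 8                      # LSR.L #8
    mid = rest & 0xFF                          # reg 22
    high = (rest >> high_shift) & 0xFF         # reg 23 (OR.B takes the low byte)
    return [0x9500 | low, 0x9600 | mid, 0x9700 | high]


print([hex(w) for w in dma_source_words(0x020000)])                # ['0x9500', '0x9600', '0x9701']
print([hex(w) for w in dma_source_words(0x020000, high_shift=9)])  # ['0x9500', '0x9600', '0x9700']
```

With a shift of 9 the register 23 value is halved, so the two versions agree only when that byte is 0, that is, for sources below $020000.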
Some other ideas:
You could try checking the DMA status flag -- VDP status bit 1 -- before starting the transfer, in case you have overlapping DMA ops.
Also, at least as a precaution, you may want to keep DMA disabled -- VDP register $01 bit 4 -- when you don't need it. I don't know the chances of accidentally triggering a DMA transfer but most games keep it disabled.
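Both suggestions come down to single-bit operations. A minimal Python sketch of the bit logic (not console code), assuming the register 1 values used elsewhere in this thread ($8174 with DMA on, $8164 with it off); the function names are illustrative only.

```python
DMA_BUSY = 0x0002      # VDP status register, bit 1
DMA_ENABLE = 0x0010    # VDP register $01, bit 4


def dma_busy(status):
    """True while the status word reports a DMA in progress."""
    return bool(status & DMA_BUSY)


def set_dma_enable(reg1_word, enable):
    """Return the register 1 write ($81xx) with the DMA enable bit set or cleared."""
    if enable:
        return reg1_word | DMA_ENABLE
    return reg1_word & ~DMA_ENABLE


print(dma_busy(0x0002))                      # True
print(hex(set_dma_enable(0x8164, True)))     # 0x8174
print(hex(set_dma_enable(0x8174, False)))    # 0x8164
```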
Posted: Mon Jun 18, 2007 12:06 pm
by TmEE co.(TM)
Thanks for noticing, I have no idea why I shift right by 9 bit positions...
I'll disable DMA and enable it only when I need it. Any accidental triggering should be impossible, as I don't touch that register more than once, in the boot-up process. And I only use DMA to load tiles, so the 68K would be halted, and starting a DMA while one is in progress would be impossible.
I'll correct my weirdness and I think it will start to work then.
Posted: Mon Jun 18, 2007 9:24 pm
by TmEE co.(TM)
Everything worked from the beginning... I think I stumbled on a weird feature: I think DMA cannot cross 64K boundaries (like on the PC, where DMA won't cross a 64K segment). It's just a guess, but ALIGN $10000 helped to correct it. And BTW, I got Jorge's code working in my code, and the result was the same pixel for pixel, so I came to this conclusion.
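If the guess holds, a transfer that straddles a boundary could also be issued as two DMAs instead of realigning the data. A minimal Python sketch of the split (not console code); the 64K boundary is only the guess from this post, kept as a parameter, and the addresses are made up.

```python
def split_at_boundary(src, length, boundary=0x10000):
    """Split [src, src + length) into pieces that never cross a multiple of `boundary`."""
    pieces = []
    while length > 0:
        room = boundary - (src % boundary)   # bytes left before the next boundary
        chunk = min(length, room)
        pieces.append((src, chunk))
        src += chunk
        length -= chunk
    return pieces


# A ~20KB block starting $1000 bytes before a boundary becomes two transfers.
for start, size in split_at_boundary(0x00F000, 0x5000):
    print(hex(start), hex(size))
# 0xf000 0x1000
# 0x10000 0x4000
```

Data that already starts on a boundary, as ALIGN $10000 arranges, comes back as a single piece as long as it is shorter than the boundary size.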
Posted: Tue Jun 19, 2007 2:40 am
by Jorge Nuno
Well, thanks for testing it, and I tested the routine you first posted and it worked OK, so I couldn't find the error... And BTW, were you trying to DMA more than 64K? Or was it just memory mirroring?
gfxDMACopy
Posted: Tue Jun 19, 2007 7:47 am
by ob1
Here's the one I use. It is more versatile, since you can specify size, source and destination.
Code:
gfxDMACopy:
; 496 cycles.
link a6,#0
movem.l d3-d6/a2,-(sp)
move.l #GFX_CTRL,a2 ; GFX_CTRL
move.w #$817C,(a2) ; Enable DMA
move.w 8(a6),d3 ; int size
move.l 10(a6),d4 ; long addr_src
move.w 14(a6),d6 ; addr_dest (read as a 16-bit word)
lsl.l #8,d3 ; --------FEDCBA98 76543210--------
lsr.w #8,d3 ; --------FEDCBA98 --------76543210
andi.l #$00FF00FF,d3 ; --------FEDCBA98 --------76543210
addi.l #$94009300,d3 ; 9 4 FEDCBA98 9 3 76543210
move.l d3,(a2)
lsr.l #1,d4 ; -FEDCBA987654321 0FEDCBA987654321
move.l d4,d5
andi.l #$007F00FF,d5 ; --------87654321 --------87654321
; 68k - VDP RAM : bit 7 of reg 23 = 0
addi.l #$97009500,d5 ; 9 7 87654321 9 5 87654321
move.l d5,(a2)
lsr.w #8,d4 ; --------0FEDCBA9
addi.w #$9600,d4 ; 9 6 0FEDCBA9
move.w d4,(a2)
andi.l #$0000FFFF,d6 ; ---------------- FEDCBA9876543210
rol.w #2,d6 ; ---------------- DCBA9876543210FE
swap d6 ; DCBA9876543210FE ----------------
addi.w #$0021,d6 ; DCBA9876543210FE ----------100001
ror.l #2,d6 ; 01DCBA9876543210 FE----------1000
lsl.b #2,d6 ; 01DCBA9876543210 FE--------1000--
rol.w #2,d6 ; 01DCBA9876543210 --------1000--FE
move.l d6,(a2)
move.w #$816C,(a2) ; Disable DMA
movem.l (sp)+,d3-d6/a2
unlk a6
rts
Maybe it helps.
Posted: Tue Jun 19, 2007 11:57 am
by TmEE co.(TM)
Jorge Nuno wrote:Well, thanks for testing it, and I tested the routine you first posted and it worked OK, so I couldn't find the error... And BTW, were you trying to DMA more than 64K? Or was it just memory mirroring?
No no, I transferred ~20KB, but that part of the image started in one 64KB chunk and ended in another (I only talk about "chunks" to make things clearer; in reality there aren't any).
I'll try ob1's code too; I'm pretty sure it acts the same.
Posted: Tue Jun 19, 2007 5:18 pm
by Stef
TmEE co.(TM) wrote:Everything worked from the beginning... I think I stumbled on a weird feature: I think DMA cannot cross 64K boundaries (like on the PC, where DMA won't cross a 64K segment). It's just a guess, but ALIGN $10000 helped to correct it. And BTW, I got Jorge's code working in my code, and the result was the same pixel for pixel, so I came to this conclusion.
Hmmm, sounds interesting, I didn't know that... I didn't emulate it this way in Gens (0x400000 boundary for ROM and 0x10000 for RAM).
Well, I have to fix my library functions :p
Thanks for the tips
