VDP odds and ends

Charles MacDonald
Very interested
Posts: 292
Joined: Sat Apr 21, 2007 1:14 am

Post by Charles MacDonald » Sat Mar 09, 2013 6:09 am

Doing some experiments, may add some test programs later.

Mode 4 CRAM access

You must set the auto-increment value to $00 to sequentially write to CRAM addresses. Each word written has the following format:

---- ---- --bb ggrr : How word is interpreted in mode 4
---- bbb- --gg grrr : How word is interpreted in mode 5

Notice that when loading a palette in mode 4 you just put the 6-bit value in the LSB of the word written as before. But bits 11-9 are latched too, and if you load a palette in mode 4 and switch to mode 5 the values are re-interpreted as shown.

This applies in the other direction too, e.g. writing $0EEE in mode 5 becomes $0E3F when you switch to mode 4, though only bits $003F matter.
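
A minimal sketch of the re-interpretation described above, not taken from the original post. It assumes the usual mode 5 CRAM word layout (---- bbb- ggg- rrr-) and reads the two diagrams literally: bits 11-9 carry over unchanged, and the low six bits of a mode 4 word become the mode 5 green (bits 5-3) and red (bits 2-0) fields. The function names are hypothetical.

```python
def mode5_word_seen_in_mode4(word):
    """Re-express a CRAM word written in mode 5 as it reads back in mode 4."""
    blue = (word >> 9) & 0x7    # bits 11-9 are latched in both modes
    green = (word >> 5) & 0x7   # mode 5 green field, bits 7-5
    red = (word >> 1) & 0x7     # mode 5 red field, bits 3-1
    return (blue << 9) | (green << 3) | red


def mode4_word_seen_in_mode5(word):
    """Re-express a CRAM word written in mode 4 as the equivalent mode 5 word."""
    blue = (word >> 9) & 0x7    # bits 11-9, ignored by mode 4 rendering
    green = (word >> 3) & 0x7   # bits 5-3 of the mode 4 word
    red = word & 0x7            # bits 2-0 of the mode 4 word
    return (blue << 9) | (green << 5) | (red << 1)


print(hex(mode5_word_seen_in_mode4(0x0EEE)))  # 0xe3f, matching the example above
print(hex(mode4_word_seen_in_mode5(0x003F)))  # 0xee
```

Only bits $003F of the first result matter to mode 4 rendering, as noted above.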

Mode 4 VRAM access

In mode 4 VRAM is addressed differently. You have to use an auto-increment value of $01. The mapping of address register bits (as written by a %01xxxxxxxxxxxxxx command) to physical VRAM address bits is as follows:

Address register bit 0 = Specifies to byte-swap word written
Address register bit 1 = VRAM address bit 2
Address register bit 2 = VRAM address bit 3
Address register bit 3 = VRAM address bit 4
Address register bit 4 = VRAM address bit 5
Address register bit 5 = VRAM address bit 6
Address register bit 6 = VRAM address bit 7
Address register bit 7 = VRAM address bit 8
Address register bit 8 = VRAM address bit 9
Address register bit 9 = VRAM address bit 1
Address register bit 10 = VRAM address bit 10
Address register bit 11 = VRAM address bit 11
Address register bit 12 = VRAM address bit 12
Address register bit 13 = VRAM address bit 13
VRAM address bit 14 = 0
VRAM address bit 15 = 0

Since the address counter is 14 bits, it wraps from $3FFE to $0000 after writing a word. When in mode 4 only the first 16K is used, and the remaining 48K of VRAM is free to store mode 5 specific data.

In this mode it is helpful to consider that the VDP takes the upper 13 bits of the 14-bit register and treats that as a word offset into VRAM, such that an auto-increment of 1 is adding 1 to the word offset after each write.
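
The bit mapping listed above can be written as a small function. This is only a sketch of the table, not code from the original post; the function name and the (address, swap flag) return shape are made up for illustration.

```python
def mode4_vram_address(addr_reg):
    """Map a 14-bit mode 4 address register value to a physical VRAM address.

    Returns (vram_address, byte_swap), following the bit table above.
    """
    addr_reg &= 0x3FFF                       # address counter is 14 bits
    byte_swap = bool(addr_reg & 0x0001)      # bit 0 selects byte-swapping
    vram = ((addr_reg & 0x01FE) << 1)        # register bits 1-8 -> VRAM bits 2-9
    vram |= (addr_reg & 0x0200) >> 8         # register bit 9    -> VRAM bit 1
    vram |= addr_reg & 0x3C00                # register bits 10-13 unchanged
    return vram, byte_swap


for reg in (0x0000, 0x0002, 0x0038, 0x0200, 0x3800):
    vram, swap = mode4_vram_address(reg)
    print(f"${reg:04X} -> ${vram:04X} swap={swap}")
```

Note that $3800 maps to itself, since only register bits 1-9 are moved around.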

Because the VRAM addressing changes depending on what mode you are in, you have to take care when accessing data intended for one mode while in the other. The easy method is to just switch to mode 4 when accessing the first 16K and mode 5 for the latter. But if you were going to load a tile set in mode 5 that was intended for mode 4, the tile data would need to be preprocessed and have the data byte locations moved around to accommodate the different addressing expected in mode 4.

Other Mode 4 issues

Remember that you have to be in mode 5 to change register $8F to set the auto-increment, so you'll do a lot of switching in and out of mode 5 as you load VRAM and CRAM data.

If the width is set to 320 pixels the sync is a little off and there's junk in the rightmost 8 columns that mostly relates to sprite data. If you keep it set to 256 pixels then there's much less distortion when switching between mode 4 and 5.

Both interlace modes work in mode 4, but the positioning of even and odd frames is off by a large amount (like 32 lines?) so it is unusable. Even if you blanked out the top and the bottom and put data in the middle such that it would display correctly, the half-line offset between frames isn't quite right either so you don't really get a full 525 scanlines across two frames, some of them overlap instead of appearing below one another.

In mode 4 the 128K addressing feature is disabled. This bit is the 4K/16K DRAM size for the TMS9918, which does nothing on the SMS though some software may set or clear it. For compatibility with those games, bit 7 of VDP register 1 has no effect in mode 4.

Speaking of mode 4 there's already some older documentation here about it:

http://cgfm2.emuviews.com/txt/msvdp.txt

It mentions some of the specifics about how Mode 4 in the Genesis VDP operates differently to that in the SMS and SMS 2.

Read codes

Using that DTACK generator circuit I posted about earlier:

viewtopic.php?p=17678

I tried mapping out the code value for reads by running through all possible combinations. Here are the results:

00 : Normal VRAM read
04 : Normal VSRAM read
08 : Normal CRAM read
0C : 8-bit VRAM read

20 : Normal VRAM read
24 : Normal VSRAM read
28 : Normal CRAM read
2C : 8-bit VRAM read

01-03 : Return VRAM read buffer on each read
05-07 : Return VRAM read buffer on each read
09-0B : Return VRAM read buffer on each read
0D-0F : Return VRAM read buffer on each read

21-23 : Return VRAM read buffer on each read
25-27 : Return VRAM read buffer on each read
29-2B : Return VRAM read buffer on each read
2D-2F : Return VRAM read buffer on each read

10-1F : Return VRAM read buffer on each read
30-3F : Return VRAM read buffer on each read

For the 8-bit VRAM read, the LSB works like normal and the MSB of the value returned is fixed to the byte at VRAM offset $C000. I wonder why that location specifically?
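
The table above collapses to a simple rule. A sketch that only summarizes the listed results and says nothing about why the hardware behaves this way: bit 5 is ignored, any of bits 4, 1 or 0 being set just returns the read buffer, and bits 3-2 pick the target. The names are hypothetical.

```python
READ_TARGETS = ("Normal VRAM read", "Normal VSRAM read",
                "Normal CRAM read", "8-bit VRAM read")


def classify_read_code(code):
    """Return the read behaviour listed in the table for a 6-bit code value."""
    code &= 0x3F
    if code & 0x13:                      # bit 4, bit 1 or bit 0 set
        return "Return VRAM read buffer"
    return READ_TARGETS[(code >> 2) & 0x3]   # bits 3-2 select the target


for code in (0x00, 0x04, 0x08, 0x0C, 0x2C, 0x01, 0x10, 0x3F):
    print(f"{code:02X} : {classify_read_code(code)}")
```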

More fun with $C0001E

When you set bit 8 of this register, color 0 of a palette is visible instead of them all mapping to the same one. So you can get 64 colors instead of 60.
Last edited by Charles MacDonald on Sun Mar 10, 2013 4:56 pm, edited 1 time in total.

tristanseifert
Interested
Posts: 35
Joined: Tue Sep 06, 2011 2:16 am
Location: /dev/sa0

Post by tristanseifert » Sat Mar 09, 2013 10:06 pm

Nice to know there are people out there who look at all these oddities of the VDP. Maybe we'll have a 100% accurate understanding of the VDP someday.

Anyways, relating to $C0001E, I did some tests regarding $C0001C a few months ago, and documented my results here, as well as a test ROM I ran on real hardware to test the behaviour. At the time it appeared to me that $C0001E is a mirror of $C0001C, but eh.

Post by Charles MacDonald » Sun Mar 10, 2013 7:20 am

Here's a test program that shows a four-color 256x224 bitmap in mode 4 and 5. You can press the C button to toggle modes, default is mode 5. When you change modes only some registers are rewritten, all the VRAM and CRAM contents remain unchanged.

The data for mode 4 is loaded in mode 5, so check out the data generation program which makes a regular linear 16K chunk of mode 4 VRAM and then swizzles the address bits around to the expected format.

Demo program and source code:
http://www.sendspace.com/file/n2d07b

I forgot to set register $89 to $10, which keeps the mode 4 graphics centered in comparison with the mode 5 graphics. That way when you switch modes, there is no vertical shift and the only thing that changes is that the upper and lower 16 lines are visible (mode 5) or blanked (mode 4) on a CRT. Probably doesn't matter for emulators. :)

Eke
Very interested
Posts: 884
Joined: Wed Feb 28, 2007 2:57 pm

Post by Eke » Sun Mar 10, 2013 4:45 pm

Charles MacDonald wrote: The data for mode 4 is loaded in mode 5, so check out the data generation program which makes a regular linear 16K chunk of mode 4 VRAM and then swizzles the address bits around to the expected format.
I'm not sure I get this. Does this mean that when switched into Mode 4, the VDP actually does the address line mixing when reading pattern and pixel data from VRAM? It's not just when writing (resp. reading) data in VRAM from the 68k interface during Mode 4? It was not clear from your initial post; I thought it only affected the address register.

From looking at your code, you do indeed perform all the VRAM writes during Mode 5, then switch into Mode 4 and write $FF into registers #2 to #6, which puts the Pattern Table base address at $3800. Do you mean that the VDP will in reality read the first pattern data from another VRAM address?
Doesn't $3800 translate to $3800 as well when using the address decoding you described?

This is really strange behavior, I cannot find any logic behind this

Post by Charles MacDonald » Sun Mar 10, 2013 4:54 pm

Eke wrote: I'm not sure I get this. Does this mean that when switched into Mode 4, the VDP actually does the address line mixing when reading pattern and pixel data from VRAM? It's not just when writing data to VRAM from the 68k interface during Mode 4? It was not clear from your initial post; I thought it only affected the address register.
Oops, sorry. To be clear, the VDP changes the address line ordering in mode 4 when it accesses VRAM, either for CPU I/O or rendering.

However the address you specify in the VDP table registers, or when writing a VRAM read or VRAM write command to the VDP, is unmodified.

If you operate exclusively in mode 4 this change is transparent to the programmer. It's only an issue when you switch modes because the data is in different locations than what was expected.

FWIW, it's the exact same thing on the TMS9918 where the address lines are changed for 4K mode compared to 16K.
Eke wrote: From looking at your code, you do indeed perform all the VRAM writes during Mode 5, then switch into Mode 4 and write $FF into registers #2 to #6, which puts the Pattern Table base address at $3800. Do you mean that the VDP will in reality read the first pattern data from another VRAM address?
I just do that because the same registers have different values in mode 5, since the mode 5 name table location is $C000 and the mode 4 name table is at $3800.
Eke wrote: Doesn't $3800 translate to $3800 as well when using the address decoding you described?
The address decoding only affects how the VDP reads and writes VRAM. The addresses set in the VDP table registers as well as the VRAM read or write command are specified normally.

So writing #$781E to the control port is a VRAM write to offset $381E, but the actual address in VRAM that will be accessed is subject to the address bit reordering I described.

Post by Eke » Sun Mar 10, 2013 5:20 pm

Thanks for the clarification. So this means the address lines are re-arranged at the VRAM interface. I wonder why they made it work like this and how it is implemented in hardware.

The way I see the VDP doing internal processing, when outputting a VRAM address for reading, there are some bits that are fixed (those which come from the internal VDP registers that are currently used to retrieve data) and others that are related to some internal counter (column counter, sprite counter, pixel counter, etc). It's just strange to think they tied those bits to different address lines when doing Mode 4.

Also, I bet it's probably the same address decoding when using the Z80 interface, switching to Mode 5 in this case will get you the same re-arranged VRAM...

Post by Charles MacDonald » Sun Mar 10, 2013 5:46 pm

Eke wrote: It's just strange to think they made those bits related to different address lines when doing Mode 4.
I'd almost expect it if they had "cut and pasted" the SMS VDP design into the Genesis VDP, and then only hacked up the RAM interface to use VRAM instead of pseudo SRAMs. But surely that's not how a proper chip is designed... (Maybe it is?) :D
Eke wrote: Also, I bet it's probably the same address decoding when using the Z80 interface, switching to Mode 5 in this case will get you the same re-arranged VRAM...
Excellent point! I bet that's probably the case. I did some "mode 5 in Z80 mode" demos a while ago but never checked that.
I suppose I have to now. :D

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Sun Mar 10, 2013 9:15 pm

Thanks for doing these tests! One critical thing to check though: What happens when the FIFO is full of writes, and you switch to/from VDP mode 4? Do the pending writes which are currently held in the FIFO get written according to the mode 4 bit mappings or the mode 5 bit mappings? This will tell us whether the address mapping occurs when entries are added to the FIFO, or whether the address mapping occurs when entries are read from the FIFO and written to the target. Also, it'll be interesting to know if a VRAM write can be "split" with this, i.e. if the first byte-wide write has already been performed, then the mode 4 enable state is changed, does the second byte write go to the remapped address? This could also reveal something about how these byte-wide writes are handled from the FIFO.

Post by Charles MacDonald » Sun Mar 10, 2013 10:14 pm

Nemesis wrote:Thanks for doing these tests! One critical thing to check though: What happens when the FIFO is full of writes, and you switch to/from VDP mode 4?
This probably needs to be tested more exhaustively, but this was a quick test I did now:

- Set mode 5, screen on, wait for active display
- Write $40000000 to ctrl
- Write 16 longs of $aabbaabb to data
- Write $81404800 to ctrl
- Long delay, then write $eeff to data

In VRAM (read in mode 5), the data is:

- 28 words of $AABB at $0000+ (at the Mode 5 normal addresses)
- last 4 words of $AABB at $0070, $0074, $0078, $007C (at the Mode 4 "swizzled" addresses)
- 1 word of $EEFF at $0800

So setting mode 4 did change how the addresses loaded into the FIFO in mode 5 were interpreted.

I would imagine the changed addressing scheme is transparent to the VDP (FIFO part, etc.) and right at the interface between internally generated addresses and the pins to the VRAM chips, that's where the mode 4/5 adjustments occur.

I also remember running SMS tests from the Z80 side on a Super Magic Drive years ago, and cases that caused screen corruption on a SMS didn't happen on the Genesis. So the FIFO may always be active, even when the Z80 is in control, and it doesn't know (or react differently to) the current display mode.

EDIT: As expected, doing this with $40000003 as the first write put 28 words at $C000+ and the last four at $0070+.
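
The reported addresses are consistent with the bit mapping from the first post being applied when each FIFO entry is written out. A rough sketch that replays the test on paper, under the assumptions that the 32 queued words targeted consecutive addresses with an auto-increment of 2 and that exactly the last four were still pending when mode 4 was set. The function names and parameters are hypothetical.

```python
def mode4_vram_address(addr_reg):
    """Physical VRAM address for a mode 4 address register value."""
    addr_reg &= 0x3FFF
    return ((addr_reg & 0x01FE) << 1) | ((addr_reg & 0x0200) >> 8) | (addr_reg & 0x3C00)


def replay_fifo_test(start, words=32, written_in_mode5=28, increment=2):
    """Return the VRAM address each queued word lands at (assumed timing)."""
    landed = []
    for i in range(words):
        target = (start + i * increment) & 0xFFFF
        if i < written_in_mode5:
            landed.append(target)                      # written before the switch
        else:
            landed.append(mode4_vram_address(target))  # written after the switch
    return landed


print([hex(a) for a in replay_fifo_test(0x0000)[-4:]])  # ['0x70', '0x74', '0x78', '0x7c']
print([hex(a) for a in replay_fifo_test(0xC000)[-4:]])  # ['0x70', '0x74', '0x78', '0x7c']
```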
Last edited by Charles MacDonald on Mon Mar 11, 2013 2:08 pm, edited 1 time in total.

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City

Post by TmEE co.(TM) » Sun Mar 10, 2013 10:20 pm

The MD VDP gives twice as many access slots per line to SMS stuff as the real SMS VDP in passive scan. Active scan still gives one slot every 32 pixels, but the FIFO is definitely there. Unlike the 68K, the Z80 cannot be stalled by the VDP, so you'll end up with missed writes once the FIFO overflows (the 68K will be stalled until the FIFO has at least one free slot).

Post by Eke » Mon Mar 11, 2013 11:59 am

Thanks for this last test.
So it also confirms the FIFO holds the full VDP command (including destination code and address) and that writes to CTRL port are not delayed until FIFO is empty (which would be necessary if code/addr value were not written to FIFO).

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Fri Mar 15, 2013 10:31 pm

My internet has been down for the last week... which figures that this would appear right as it went down. I was going to ask about using mode 4 from the 68000... I'm working on an SMS emulator for the 32X since you can't switch to SMS mode while the 32X is in place. Basically, I run a Z80 emulation continuously on the Master SH2, and have the 68000 emulate IO operations to the VDP/PSG/etc. I was wondering how one would do mode 4 commands, read/write data, that sort of thing. The old VDP doc mentioned mode 4, but not how you do the 68000 accesses.

Post by Charles MacDonald » Sat Mar 16, 2013 12:41 am

Wow, this sounds super cool.

How are you handling the mode 4 video? Just plugging the A/V out cable into the Genesis and not connecting it to the 32X? I didn't check exhaustively, but the 32X freaks out if the Genesis is in mode 4 and won't pass the video through correctly.

Here's what you need to know:

Mode 4 is enabled by setting bit 2 of register $80 and resetting bit 2 of register $81. Once in that mode, all VDP commands are 16 bits only, so you can't touch VSRAM or read CRAM, do DMA, etc. So commands might look like this:

$C000 : Write to CRAM
$7800 : Write to VRAM $3800
$0000 : Read from VRAM
$8700 : Change border color register

If you wrote something like $40000003 (write to $C000) like you would in mode 5, the VDP sees it as two writes of $4000 (write to $0000) and $0003 (read from $0003). So just keep in mind that commands are 16 bits instead of 32.
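
A small decoder for the command words listed above, as a sketch only. The top two bits select the operation and the low 14 bits are the address. For register writes it assumes the SMS-style layout (register number in bits 11-8, data in bits 7-0), which the post does not spell out. The function names are hypothetical.

```python
def split_long(value):
    """Split a 32-bit control port write into the two words the VDP sees."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def decode_mode4_command(word):
    """Decode one 16-bit mode 4 control word into a descriptive tuple."""
    word &= 0xFFFF
    kind = word >> 14
    if kind == 0b10:
        # Assumption: SMS-style register write, number in bits 11-8.
        return ("register write", (word >> 8) & 0x0F, word & 0xFF)
    names = {0b00: "VRAM read", 0b01: "VRAM write", 0b11: "CRAM write"}
    return (names[kind], word & 0x3FFF)


for command in (0xC000, 0x7800, 0x0000, 0x8700):
    print(f"${command:04X}", decode_mode4_command(command))

# A mode 5 style longword is simply treated as two separate commands.
for half in split_long(0x40000003):
    print(f"${half:04X}", decode_mode4_command(half))
```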

You can only do word reads/writes from VRAM which is a little awkward as the SMS was byte-oriented. Longword access to the control or data port is fine.

Other things to watch out for:

- Set register $90 to $00, the playfield width bits affect how many bytes the VDP skips per row in mode 4 but don't actually make the screen wider.

- You need to read the status register to acknowledge an interrupt, they aren't automatically acknowledged like they are in mode 5.

If you run into any trouble please let me know, this is definitely an emulator I'd like to see up and running. :D

Post by Chilly Willy » Sat Mar 16, 2013 2:07 am

Yeah, that's the kind of info I need. I need to do some testing of the video through the 32X - the info available says that as long as the 32X video is OFF, it should pass through 256 wide mode unmolested. I had assumed that meant mode 4 as well as 256 wide mode 5, but maybe that's not the case.

Handling the VDP accesses is one of the main things I need to work on - I was just passing on data reads/writes while accumulating two bytes for commands. I'm still not sure which addresses I should be dealing with for these, or the width. Here's the main 68000 code for the SMS IO right now:

in_byte:
        move.w  #0x00FF,d1
        move.w  0xA15122,d0         /* COMM2 holds port address */
        btst    #7,d0
        bne.b   in_1x
        btst    #6,d0
        bne.b   in_01
| b7-6 = 00
        move.w  d1,0xA15120         /* done - COMM1 = return byte */
        bra     main_loop
| b7-6 = 01
in_01:
        lea     0xC00008,a0
        andi.w  #1,d0
        move.b  0(a0,d0.w),d1       /* vcounter/hcounter */
        move.w  d1,0xA15120         /* done - COMM1 = return byte */
        bra     main_loop
in_1x:
        btst    #6,d0
        bne.b   in_11
| b7-6 = 10
        btst    #0,d0
        bne.b   0f
        move.b  0xC00001,d1         /* vdp data LSB */
        move.w  d1,0xA15120         /* done - COMM1 = return byte */
        bra     main_loop
0:
        andi.w  #0xFD,0xA15128      /* reset STATE_INT */
        move.b  0xC00005,d1         /* vdp status LSB */
        move.w  d1,0xA15120         /* done - COMM1 = return byte */
        bra     main_loop
| b7-6 = 11
in_11:
        btst    #0,d0
        bne.b   0f
        /* read joy1 */
        move.w  joy_1,d0
        andi.w  #0x003F,d0
        move.w  joy_2,d1
        andi.w  #0x0003,d1
        lsl.w   #6,d1
        or.w    d0,d1
        not.b   d1
        move.w  d1,0xA15120         /* done - COMM1 = return byte */
        bra     main_loop
0:
        /* read joy2 */
        move.w  joy_2,d1
        andi.w  #0x003F,d1
        lsr.w   #2,d1
        ori.w   #0x00C0,d1
        move.w  joy_1,d0
        or.w    joy_2,d0
        andi.w  #0x0800,d0
        beq.b   1f
        ori.w   #0x0010,d1          /* RESET */
1:
        not.b   d1
        move.b  ioctl,d0
        andi.w  #0x00A0,d0          /* THB and THA */
        add.w   d0,d0
        add.b   d0,d0
        lsr.w   #1,d0
        or.b    d0,d1
        move.w  d1,0xA15120         /* done - COMM1 = return byte */
        bra     main_loop

out_byte:
        move.w  0xA15122,d1         /* COMM2 holds port address */
        btst    #7,d1
        bne.b   out_1x
        btst    #6,d1
        bne.b   out_01
| b7-6 = 00
        btst    #0,d1
        bne.b   0f
        move.b  d0,memctl
        move.w  #0,0xA15120         /* done */
        bra     main_loop
0:
        move.b  d0,ioctl
        move.w  #0,0xA15120         /* done */
        bra     main_loop
| b7-6 = 01
out_01:
        move.b  d0,0xC00011         /* write PSG */
        move.w  #0,0xA15120         /* done */
        bra     main_loop
out_1x:
        btst    #6,d1
        bne.b   out_11
| b7-6 = 10
        btst    #0,d1
        bne.b   0f
        move.b  d0,0xC00000         /* write vdp data */
        move.w  #0,0xA15120         /* done */
        bra     main_loop
0:
        eori.b  #1,vdpflg
        beq.b   1f
        move.b  d0,vdplsb
        move.w  #0,0xA15120         /* done */
        bra     main_loop
1:
        lsl.w   #8,d0
        move.b  vdplsb,d0
        move.w  d0,0xC00004         /* write vdp control */
        move.w  #0,0xC00004
        move.w  #0,0xA15120         /* done */
        bra     main_loop
| b7-6 = 11
out_11:
        /* to do - FM */
        move.w  #0,0xA15120         /* done */
        bra     main_loop

Turns out the entire project is assembly - one assembly file for the MD side, and ten assembly files for the 32X side, of which seven are for the Z80 emulation (I separated out each opcode page by prefix, or lack thereof, to make working on the Z80 code easier). I'm still debugging the Z80 code, but I really need to get that MD side code above working right so I can see/hear when it's actually doing something.

Anyone who wants to look at ALL the code as it stands now, you'll find it here:
http://www.mediafire.com/download.php?nezihzjt1h9zo93

Like virtually everything else I do for retro, this will be all open source, and anyone who wants to help out, even just with suggestions, is welcome. I really appreciate the help in handling the VDP - clearly the most important part of the emulator next to the CPU.

Post by Charles MacDonald » Sat Mar 16, 2013 2:21 am

If you have the RAM to spare, can you emulate the VDP interface and VRAM I/O with a local chunk of 16k ram, then mark dirty words and write those to VRAM during vblank? I'm sure that breaks 'racing the beam' effects and games that might try to read back VRAM within the same frame as writing it, but those are rare cases. Then you don't need to spend nearly as much time doing real VRAM I/O and passing data back and forth.
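
A rough sketch of the shadow-RAM idea, written in Python only to show the bookkeeping (the real thing would live in the 68000 or SH2 code). The class name, the `write_word` callback and the choice of putting the even byte in the high half of each word are all assumptions for illustration.

```python
class ShadowVram:
    """16K byte shadow of mode 4 VRAM with per-word dirty tracking."""

    SIZE = 0x4000

    def __init__(self):
        self.data = bytearray(self.SIZE)
        self.dirty = set()              # even addresses of modified words

    def write_byte(self, addr, value):
        """Emulated SMS data port write: update the shadow, mark the word."""
        addr &= self.SIZE - 1
        value &= 0xFF
        if self.data[addr] != value:
            self.data[addr] = value
            self.dirty.add(addr & ~1)

    def read_byte(self, addr):
        """Emulated read, served from the shadow without touching the VDP."""
        return self.data[addr & (self.SIZE - 1)]

    def flush(self, write_word):
        """Call write_word(addr, word) for each dirty word, e.g. during vblank."""
        for addr in sorted(self.dirty):
            # Assumption: even byte goes in the high half of the word.
            write_word(addr, (self.data[addr] << 8) | self.data[addr + 1])
        count = len(self.dirty)
        self.dirty.clear()
        return count


shadow = ShadowVram()
shadow.write_byte(0x3800, 0x12)
shadow.write_byte(0x3801, 0x34)
shadow.flush(lambda addr, word: print(f"${addr:04X} <- ${word:04X}"))
```

Writes that do not change a byte are skipped, so static name table or tile data costs nothing after the first frame.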

If you want an easy game to start with, try Teddy Boy (32K game). It doesn't need much to get up and running.

I noticed you are passing the HV counter value back directly, is the 32X fast enough to emulate the Z80 in realtime?
