VDP Internals

Mask of Destiny
Very interested
Posts: 624
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Tue Nov 06, 2012 11:44 am

I've recently acquired a logic analyzer (an Open Bench Logic Sniffer, specifically) and I've hooked it up to the control and address lines of the VRAM in my Genesis 2 (VA1 board with 315-5708-01) to try and get some information about the internal workings of the VDP. Below are my initial findings based on a game (Super Street Fighter 2: Special Championship Edition) that uses 32-cell mode. More observations will follow (including 40-cell mode) as I have time.

Address Mapping

Addresses are not mapped to VRAM physical addresses in the most straightforward fashion. This doesn't matter for emulators or folks writing games/demos, but may be of interest to others that want to snoop on the VRAM bus. The lists below show the mapping between logical address bits and physical row and column address bits.

Row:
  1. A2
  2. A3
  3. A4
  4. A5
  5. A6
  6. A7
  7. A8
  8. A9
Column:
  1. A0
  2. A1
  3. A10
  4. A11
  5. A12
  6. A13
  7. A14
  8. A15
I suspect things are mapped this way so that column addresses can be loaded directly from the relevant register. All the bits that need to change as the line progresses are in the 8-bit row address.
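To make the mapping concrete, here is a minimal Python sketch that splits a 16-bit logical VRAM address into its row and column parts. It assumes that entry 1 in each list above is physical address bit 0 and entry 8 is bit 7, which the lists do not state explicitly. The function name `vram_row_col` is made up for illustration.

```python
def vram_row_col(addr: int) -> tuple[int, int]:
    """Split a 16-bit logical VRAM address into (row, column).

    Assumption: list entry 1 is physical address bit 0 and entry 8 is bit 7.
    """
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("VRAM addresses are 16 bits wide")
    # A2..A9 become row bits 0..7.
    row = (addr >> 2) & 0xFF
    # A0, A1 become column bits 0, 1; A10..A15 become column bits 2..7.
    col = (addr & 0x3) | ((addr >> 10) << 2)
    return row, col


if __name__ == "__main__":
    for a in (0x0000, 0x0003, 0x0004, 0x0400, 0xFFFF):
        print(hex(a), vram_row_col(a))
```

Under that assumption, the four bytes at logical addresses N..N+3 (N a multiple of 4) share a row and differ only in the two lowest column bits.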

General RAM Access Info

The Genesis uses a dual port DRAM for its video RAM. This dual port RAM has both a random access port, which allows (slow) random access to arbitrary bytes, and a serial access port, which allows (fast) sequential access to bytes within a single row. Setting up the serial access port requires using the same address and control lines as the random access port (this is called a read transfer operation), but once that is done you can use the ports independently until you need to switch to a different row on the serial access port. The Genesis VDP doesn't take advantage of this though. It uses the serial port almost exclusively and only uses the random access port for 68K initiated access (either through data port reads/writes or DMA operations).

The VDP always reads data from the serial port in 4 byte chunks and all memory operations (even random access and refresh cycles) take 4 cycles of SC (the serial port clock). Please note that during a serial port access, the data for the 4 bytes being addressed in the memory operation are actually on the data bus during the next memory operation.

In 32-cell mode SC appears to run at approximately 10.74MHz, which gives 684 cycles and 171 memory operations per line. Of these 171 operations, 4 will be refresh cycles (CAS before RAS refresh), 16 will be random access slots (reserved for the 68K whether it uses them or not), 1 will be used to read the horizontal scroll values for Planes A and B, 16 will be used to read the second half of sprite attribute table entries (i.e. the part that specifies horizontal position and which tile to use), 34 will be used to read name table entries for planes A and B, and the remainder will be used to read tiles. The order of these operations is given in the following section.
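The per-line budget can be checked with a few lines of Python. This is only a restatement of the numbers above; the variable names are made up for illustration.

```python
SC_CYCLES_PER_LINE = 684   # 32-cell mode, from the capture
SC_CYCLES_PER_OP = 4       # every memory operation takes 4 SC cycles

ops_per_line = SC_CYCLES_PER_LINE // SC_CYCLES_PER_OP

# Operations with a fixed count per line, as listed in the post.
fixed_ops = {
    "refresh": 4,
    "68k_access": 16,
    "hscroll": 1,
    "sat_second_half": 16,
    "name_table": 34,
}

# Whatever is left over is used for tile reads (planes and sprites).
tile_ops = ops_per_line - sum(fixed_ops.values())

print(ops_per_line, tile_ops)  # prints "171 100"
```

So 100 of the 171 operations on a line are tile reads.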

Access Order and Timing

For a good portion of rendering (the part that involves the background planes), memory operations are bundled in groups of 4. Since there are only a few variations on what's in those groups I'll define them ahead of time to allow for a more compact representation of the information:

Group A_stile:
Name Table Plane A
Sprite Tile Row (previous line)
Plane A Tile Row
Plane A Tile Row

Group A_68k:
Name Table Plane A
68K Access Slot
Plane A Tile Row
Plane A Tile Row

Group A_refresh:
Name Table Plane A
Refresh
Plane A Tile Row
Plane A Tile Row

Group B_stile:
Name Table Plane B
Sprite Tile Row (previous line)
Plane B Tile Row
Plane B Tile Row

Group B_sat:
Name Table Plane B
Sprite Attribute Table entry (well, half of one)
Plane B Tile Row
Plane B Tile Row

The order of memory operations for an active display line is as follows:

!HSYNC low
7X Sprite Tile Row (previous line)
68K Access Slot
Horizontal Scroll Data
4X Sprite Tile Row (previous line)
!HSYNC high
A_stile
B_stile
A_68k
B_sat
A_68k
B_sat
A_68k
B_sat
A_refresh
B_sat
A_68k
B_sat
A_68k
B_sat
A_68k
B_sat
A_refresh
B_sat
A_68k
B_sat
A_68k
B_sat
A_68k
B_sat
A_refresh
B_sat
A_68k
B_sat
A_68k
B_sat
A_68k
B_sat
A_refresh
B_sat
2X 68K Access Slot
14X Sprite Tile Row
68K Access Slot
5X Sprite Tile Row
!HSYNC low
7X Sprite Tile Row
68K Access Slot
Horizontal Scroll Data (next line)
4X Sprite Tile Row
!HSYNC high
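As a cross-check, the schedule above can be expanded into individual slots and counted. The sketch below is a transcription of the list for one line (from one !HSYNC low to the next) in 32-cell mode. The slot labels and the function name `build_h32_line` are made up for illustration.

```python
from collections import Counter

# The four-slot groups defined above.
GROUPS = {
    "A_stile":   ["nt_a", "sprite_tile", "tile_a", "tile_a"],
    "A_68k":     ["nt_a", "68k", "tile_a", "tile_a"],
    "A_refresh": ["nt_a", "refresh", "tile_a", "tile_a"],
    "B_stile":   ["nt_b", "sprite_tile", "tile_b", "tile_b"],
    "B_sat":     ["nt_b", "sat", "tile_b", "tile_b"],
}


def build_h32_line():
    """Return the list of memory operations for one active line."""
    slots = []
    # While !HSYNC is low.
    slots += ["sprite_tile"] * 7 + ["68k", "hscroll"] + ["sprite_tile"] * 4
    # After !HSYNC goes high.
    slots += GROUPS["A_stile"] + GROUPS["B_stile"]
    # Four repeats of: three A_68k/B_sat pairs, then one A_refresh/B_sat pair.
    for _ in range(4):
        for a_group in ("A_68k", "A_68k", "A_68k", "A_refresh"):
            slots += GROUPS[a_group] + GROUPS["B_sat"]
    # Tail of the line, up to the next !HSYNC low.
    slots += ["68k"] * 2 + ["sprite_tile"] * 14 + ["68k"] + ["sprite_tile"] * 5
    return slots


counts = Counter(build_h32_line())
print(sum(counts.values()), dict(counts))
```

The total comes out to 171 slots, with 16 68K slots, 4 refresh slots, 16 SAT reads, 1 horizontal scroll read, 34 name table reads and 100 tile reads (68 plane, 32 sprite), which matches the budget given earlier.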

Notes

From this it's clear why doing CRAM DMA during HBlank still causes display noise: there's only one 68K slot during HBlank. Turning off the display frees up some slots at the expense of some of the sprite data, but you can't use the entirety of HBlank because the horizontal scroll value is read early.

Unused 68K access slots will perform a read of whatever address was used in the previous memory operation.

Next Steps
40 cell mode
More details on sprite rendering access
Timing of read of the other half of the Sprite Attribute Table entries
Refresh timing during VBlank/Display off


Hopefully at least some of this information is new and useful. I had fun figuring it out at least.

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm

Post by Eke » Tue Nov 06, 2012 12:23 pm

Well, it seems to confirm Nemesis analysis, see this thread:
viewtopic.php?t=851

It's also interesting to compare the evolution from the TMS9918 timings, cf. http://www.smspower.org/Development/TMS ... ingDiagram

Anyway, I didn't know about the unused external access slots performing reads to the current address, so thank you for these tests; it's always nice to have alternate measurements. What about the address, it should be the auto-incremented address ? I wonder what happens if a CRAM or VSRAM access is done by CPU, most likely it uses external slot as well and default VRAM read is not done this time. Also, you say 4 bytes are always read per access, are you sure that this remains true for external (CPU) access slots ?

Mask of Destiny
Very interested
Posts: 624
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Tue Nov 06, 2012 12:48 pm

Eke wrote:Well, it seems to confirm Nemesis analysis, see this thread:
viewtopic.php?t=851
Ah, I either missed that or forgot about it (probably the latter).

One thing I think I can clear up from that thread is that the VDP uses a (almost certainly double-buffered) line buffer. Rendering to the buffer is almost certainly done using SC whereas reading from the buffer is done using EDCLK.
Eke wrote:What about the address, it should be the auto-incremented address ?
No. The address is exactly the same as the address from the previous memory operation (the immediately previous operation, not necessarily the previous external access slot operation). Presumably there's a latch for the full 16-bit address that feeds the address line control logic.
Eke wrote:I wonder what happens if a CRAM or VSRAM access is done by CPU, most likely it uses external slot as well and default VRAM read is not done this time.
Everything I've seen and read suggests that the external access slots occur at the same time regardless of the target RAM. It's hard to measure this directly, but the behavior of the CRAM access "snow" is suggestive.
Eke wrote:Also, you say 4 bytes are always read per access, are you sure that this remains true for external (CPU) access slots ?
Every access to the serial port is 4 bytes. External access uses the random access port and only reads (or writes) a single byte. The VDP still outputs SC during this time though so technically it's also reading data from the serial port, but this data is ignored.

Charles MacDonald
Very interested
Posts: 292
Joined: Sat Apr 21, 2007 1:14 am

Post by Charles MacDonald » Tue Nov 06, 2012 4:18 pm

Thanks for doing these tests, and in such detail! It's really fascinating stuff.

I always wondered why vertical column scrolling worked in two tile units, your findings basically prove why. :D
Mask of Destiny wrote:One thing I think I can clear up from that thread is that the VDP uses a (almost certainly double-buffered) line buffer. Rendering to the buffer is almost certainly done using SC whereas reading from the buffer is done using EDCLK.
I completely agree; there are a few cases where you can see this exposed in weird ways:

- When the line buffer is scanned out to the display it is also erased to prepare it for rendering the next line. Some of Sega's arcade games do this too. But the erasing doesn't happen if the display is blanked. If you blank the leftmost column for one line, then disable column blanking on the next line, the leftmost 8 pixels retain the sprite data of the previous line since they weren't erased/displayed. The rightmost pixels of the line are all normal.

- The same thing happens if you turn the screen off, then on at the next line. Since the line buffer wasn't scanned out to the display it retains the old data, and since the screen was off the VDP didn't render anything into the other buffer, which is empty. On the line you are seeing (where the old data is being shown), the VDP now renders into the empty buffer and on the next line everything is back to normal.

This is why turning the screen off and on mid-frame causes some graphical distortion, since when the screen is on again you get whatever sprites were present at the time it was turned off.

- The VDP renders sprites to the line buffer for line 0 at line 262 (last line of the previous frame). So messing with sprite settings at that point affects line 0, and making the same changes on line 0 affects line 1, etc.
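The screen off/on case described above can be illustrated with a toy model. This is only a sketch of one reading of the behavior: it assumes two buffers that swap once per displayed line, and that a blanked line neither scans out, erases, renders, nor swaps. The class and method names are hypothetical, and the left-column blanking case is not modeled.

```python
class SpriteLineBuffers:
    """Toy model of a double-buffered sprite line buffer (hypothetical)."""

    def __init__(self):
        self.display = None  # buffer being scanned out; None means erased
        self.render = None   # buffer being filled for the following line

    def step(self, display_on, next_line_sprites):
        """Advance one line and return what the sprite layer shows."""
        if not display_on:
            # Blanked: nothing is scanned out, erased, rendered or swapped.
            return None
        # Scanning out the display buffer also erases it.
        shown, self.display = self.display, None
        # Meanwhile the sprites for the following line are rendered.
        self.render = next_line_sprites
        # Swap roles for the next line.
        self.display, self.render = self.render, self.display
        return shown


if __name__ == "__main__":
    bufs = SpriteLineBuffers()
    for on, sprites in [(True, "line 1"), (True, "line 2"),
                        (False, "line 3"), (True, "line 4"), (True, "line 5")]:
        print(on, bufs.step(on, sprites))
```

In this model, the line after the blanked one shows the stale sprites that were rendered before the screen was turned off, and the line after that is back to normal.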

Can't wait to see what you find out about H40 mode! Thanks again for sharing all this.

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm

Post by Eke » Tue Nov 06, 2012 5:43 pm

I think only sprites are rendered into the line buffer while plane A/B pixels are rendered one column (2-cells) earlier than they are displayed. I would not be surprised if this is the reason the H counter does not exactly match the display range (i.e. 0x00 does not correspond to the first displayed pixel).

Other interesting bits:

1) vblank flag is cleared at the start of line 262 (or 313 for PAL) so the last "inactive" line can process sprite data from VRAM. This makes 262-224-1 = 37 lines available for max. access rate in NTSC mode, 313-224-1 = 88 lines or 313-240-1 = 72 in PAL mode, which is still one more than what is specified in the official doc (DMA timing section p.34), so maybe the first line of vblank is special as well ?

2) there are 4 fixed refresh slots in H32 mode and 5 in H40 mode, even when display is disabled, which makes 342/2 - 4 = 167 external (CPU or VDP DMA) accesses in H32 mode and 420/2 - 5 = 205 in H40 mode, which fits with the official doc metrics.

3) the maximal wait time when the VDP write FIFO is full is the maximal delay between 2 external access slots, which, during active display, is exactly 32 pixels or 64 SC clocks (the case when the last write occurred before a refresh cycle). That makes 64/(53.69/5) = 5.96 us in H32 mode and 64/(53.69/4) = 4.77 us in H40 mode, which again fits with the official doc numbers.
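The arithmetic in the three points above can be reproduced directly. The sketch below just restates those formulas; the function names are made up, and 53.69 MHz is the master clock value used in the post (an NTSC figure, applied here as written).

```python
MCLK_MHZ = 53.69  # master clock value used in the post


def blank_lines(total_lines, active_lines):
    """Lines available at the maximum access rate (point 1)."""
    return total_lines - active_lines - 1


def blanked_line_slots(pixels_per_line, refresh_slots):
    """External access slots on a line with display disabled (point 2)."""
    return pixels_per_line // 2 - refresh_slots


def fifo_wait_us(sc_cycles, mclk_divider):
    """Time taken by a number of SC cycles, in microseconds (point 3)."""
    return sc_cycles / (MCLK_MHZ / mclk_divider)


if __name__ == "__main__":
    print(blank_lines(262, 224), blank_lines(313, 224), blank_lines(313, 240))
    print(blanked_line_slots(342, 4), blanked_line_slots(420, 5))
    print(round(fifo_wait_us(64, 5), 2), round(fifo_wait_us(64, 4), 2))
```

This gives 37, 88 and 72 lines, 167 and 205 slots, and 5.96 and 4.77 microseconds respectively.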

TmEE co.(TM)
Very interested
Posts: 2443
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City

Post by TmEE co.(TM) » Tue Nov 06, 2012 6:14 pm

H32 has 161 access slots on a passive line and 16 on an active line.
H40 has 198 / 18.

BG tile data is seen accessed from VRAM 16 pixels before it is displayed

Charles MacDonald
Very interested
Posts: 292
Joined: Sat Apr 21, 2007 1:14 am

Post by Charles MacDonald » Tue Nov 06, 2012 6:26 pm

Eke wrote:I think only sprites are rendered into the line buffer while plane A/B pixels are rendered one column (2-cells) earlier than they are displayed. I would not be surprised it is the reason H counter does not exactly match the display range (i.e 0x00 does not correspond to first displayed pixel).
That sounds right, I don't think the background tiles are buffered either, just the sprites. This is why you can do mid-scanline VRAM changes and modify the background ahead of the raster beam.

Hopefully no games depend on that behavior though. :)

Mask of Destiny
Very interested
Posts: 624
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Tue Nov 06, 2012 8:34 pm

Eke wrote:I think only sprites are rendered into the line buffer while plane A/B pixels are rendered one column (2-cells) earlier than they are displayed. I would not be surprised it is the reason H counter does not exactly match the display range (i.e 0x00 does not correspond to first displayed pixel).
I had assumed that EDCLK was 2X the pixel clock for the current mode, which would make not buffering the background tiles problematic. But I now see what you said as making more sense. The weird behavior of sprites respecting priority when being combined with the background, but not with each other makes more sense with only sprites using a line buffer.
Eke wrote: 1) vblank flag is cleared at the start of line 262 (or 313 for pal) so the last "inactive" line can process sprite data from VRAM. This makes 262-224-1 =37 lines available for max. access rate in NTSC mode, 313-224-1 = 88 lines or 313-240-1 = 72 in PAL mode, which is still one more than what is specified in official doc (DMA timing section p.34) so maybe the first line of vblank is special as well ?
Getting timing information around start and end of VBLANK is probably my next step.
Charles MacDonald wrote:That sounds right, I don't think the background tiles are buffered either, just the sprites. This is why you can do mid-scanline VRAM changes and modify the background ahead of the raster beam.
Well you could still do that if the entire line was buffered. You would just need to do your writes a line early.

Mask of Destiny
Very interested
Posts: 624
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Mon Nov 12, 2012 12:24 am

I've been spending some time with Sonic 2 and a slightly different wiring configuration (ditching EDCLK and !SE for !VSYNC and !IPL2). I haven't seen anything that contradicts the 40-cell mode diagram Nemesis put together, but I was able to capture some data for the last active line that might be interesting. On the last active line of the display, the last 17 slots before !HSYNC goes low appear to be external access slots.

This is a bit strange, but I believe it supports the "line buffer for sprites only" hypothesis. My best guess is that the VDP doesn't really treat the last line differently than the others for the most part. As a result, it starts rendering sprites for the next line even though that line will never be displayed. 6 slots after it does the final tile read for bg plane b, it switches over into "inactive display area" mode.

Other potentially interesting observations :

In 40-cell mode, !IPL2 goes low for VINT (at least I'm pretty sure it's VINT, don't have !IPL1 hooked up so I suppose it could be an HINT) 40 ticks of SC after !HSYNC goes high on the first inactive line. I don't have the timing for 32-cell mode yet.

!HSYNC is low for exactly half the usual number of ticks of SC (so 32 for 40-cell and 26 for 32-cell) for some portion of the inactive display period. This starts on the 10th pulse of !HSYNC after the beginning of the inactive display period and it goes back to normal on the 7th !HSYNC pulse after !VSYNC goes high again. Stated another way, there are 8 short !HSYNC pulses before !VSYNC goes low and 6 short !HSYNC pulses after !VSYNC goes high. All six of the !HSYNC pulses that occur while !VSYNC is low are short. The time between !HSYNC pulses is also cut in half.

Based on a little reading, this seems to be for generating pre-equalization and post-equalization pulses. This is generally not an important detail for proper emulation, but it does have a small impact if you want to be truly cycle accurate as SC slows down while !HSYNC is low in 40-cell mode (as observed in the other thread it's mostly MCLK/5 during this period with a few cycles of MCLK/4).

Notes:
When I say active and inactive, I'm referring to the periods where the VDP is actively rendering and not rendering respectively. So non-blanking lines on which the VDP just displays the border color would be inactive.

Above measurements are from an NTSC Genesis.

Mask of Destiny
Very interested
Posts: 624
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Mon Nov 12, 2012 7:55 am

I've done some more captures and analysis and I now have some information about the beginning of the active display period.

Rendering resumes at the same point in the line that it left off, 17 slots before !HSYNC goes low. It will then render line 255 (based on the Horizontal Scroll Table offset read) which fills the sprite line buffer for line 0 (again based on the HScroll Table offset). Presumably line 255 is not actually sent to the display.

I captured all of the period from the end of active display period to the beginning of the active display period for the next frame (in chunks, I only have 24KB of sample memory so I can only capture 3 or 4 lines at a time if I want reasonable resolution) and I haven't seen any evidence of the VDP reading the first half of the sprite attribute table entries. Assuming I didn't screw anything up (which is quite possible given the number of individual captures I had to make), it would appear that the VDP only reads the first half of the entries from the table when register 5 is set. It then must detect writes to the appropriate addresses to keep its cache up to date. I believe someone around here or one of the other forums has said as much before, but I can't remember.

This unfortunately leaves me without any solid reads as to when the VDP scans the cached copy of the SAT to assign sprites to the available slots for each line. My best guess is that it scans the entire table once per line starting after the two back-to-back external access slots. That would give 160 cycles of SC until it starts rendering the line (and reading the other half of the SAT entries) which is 2 cycles per sprite. That's not a whole lot, but it's not completely unreasonable that it could be done in that time. This would also help explain why it starts rendering line 255 so early, though I would expect to see it fire up 21 slots before !HSYNC rather than 17.

I think I'm roughly as far as I can go without writing some test programs and that's not likely to happen until I get a bit further on my emulator project. So no updates for a while unless someone has questions or wants me to check something in particular.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue Nov 13, 2012 12:21 am

Hey, I've been very busy and not online much lately, but I just logged on and saw this thread, and coincidentally, I'm doing more timing tests on the VDP right now myself. I still have a lot of unpublished information in this area, which I'm planning to release along with my emulator early next year. I have a cycle accurate VDP core online. There's still quite a bit to do, but the timing model is working very well. I have the "Direct Color" CRAM DMA demos working for example ( viewtopic.php?t=1203&start=0 ), with just the horizontal alignment slightly off to the left right now due to some instruction timing problems.

Is there anything anyone wants to know about VDP timings specifically that I can add to this thread?
Mask of Destiny wrote:I captured all of the period from the end of active display period to the beginning of the active display period for the next frame (in chunks, I only have 24KB of sample memory so I can only capture 3 or 4 lines at a time if I want reasonable resolution) and I haven't seen any evidence of the VDP reading the first half of the sprite attribute table entries. Assuming I didn't screw anything up (which is quite possible given the number of individual captures I had to make), it would appear that the VDP only reads the first half of the entries from the table when register 5 is set. It then must detect writes to the appropriate addresses to keep its cache up to date. I believe someone around here or one of the other forums has said as much before, but I can't remember.
The VDP doesn't actually read the SAT cache entries when register 5 is set, it only ever updates the contents of the cache if it detects a VRAM write being made within the SAT cache "window" based on the current location of the SAT as set by register 5. This means, you can change register 5, and the cache will continue to use the cached data from the old memory location. CastleVania Bloodlines relies on this.
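The write-through behavior described here can be sketched as a small model. This is not taken from any emulator; the class name, the 80-entry default and the 8-byte entry size are assumptions for illustration, as is the register 5 decoding (bits 6..0 supplying A15..A9 of the table base, ignoring any mode-dependent masking).

```python
class SatCache:
    """Toy write-through cache of the first half of each SAT entry."""

    ENTRY_SIZE = 8    # assumed bytes per SAT entry
    CACHED_BYTES = 4  # first half of each entry is cached

    def __init__(self, num_sprites=80):
        self.num_sprites = num_sprites
        self.sat_base = 0
        self.cache = bytearray(num_sprites * self.CACHED_BYTES)

    def write_reg5(self, value):
        # Assumption: bits 6..0 of register 5 supply A15..A9 of the base.
        self.sat_base = (value & 0x7F) << 9
        # Deliberately no reload: the cache keeps whatever it held before.

    def vram_write(self, addr, byte):
        """Snoop a VRAM byte write and update the cache if it hits."""
        offset = addr - self.sat_base
        if 0 <= offset < self.num_sprites * self.ENTRY_SIZE:
            entry, within = divmod(offset, self.ENTRY_SIZE)
            if within < self.CACHED_BYTES:
                self.cache[entry * self.CACHED_BYTES + within] = byte & 0xFF


if __name__ == "__main__":
    sat = SatCache()
    sat.write_reg5(0x78)          # table at 0xF000
    sat.vram_write(0xF000, 0x12)  # lands in the cache
    sat.write_reg5(0x70)          # table moved to 0xE000, cache untouched
    print(hex(sat.cache[0]))
```

After the second register write the cache still holds the byte written at the old location, which is the effect the post describes games relying on.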

As for when this list is traversed, that's a really tricky one isn't it? As you point out, there's no external access to indicate when this occurs. It seems logical to me that it occurs in sync with the sprite pattern reads, with two entries read per pattern slot (IE, it takes 2SC cycles to process an entry), but it's hard to confirm this. The best way might be in H32 mode to target a VRAM write to the cached portion of the SAT designed to be released in the access slot which comes immediately before the read of the hscroll data. There's still 6 sprite pattern read slots after this point. If you perform a write to the cached portion of the SAT at this time, while the list itself is still being traversed, you could determine which exact entry in the list the VDP is up to when that write is made by observing the effect it has on the way the VDP renders sprites on the following line. Remember that VRAM writes are 8-bit, so only half of the full 16-bit write would get through at that access slot, but it should be possible to construct a test ROM which uses this method to determine the synchronization of the external access slots to the traversal of the SAT cache data, relative to this one fixed point at least. From there, it should be possible to infer the exact timing and progression of the rest of the reads. I'm planning to perform this test myself, since this is one of the things I still need to determine for my emulator.

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm
Contact:

Post by Eke » Tue Nov 13, 2012 12:57 am

Nemesis wrote:Is there anything anyone wants to know about VDP timings specifically that I can add to this thread?
Well, for fun (meaning not a single game except the few aforementioned demos needs it), I started some time ago to design a cycle accurate (and still efficient, hopefully) VDP renderer, based on your info about VRAM access, but the thing I am missing is the VSRAM fetch timings relative to external access slots.

I am pretty sure it is fetched close to HSCROLL data in normal V-Scroll mode and just before Plane A/B fetches in 2-Cell Vscroll mode, but I don't know the exact timing, and some games actually rely on this to be properly emulated if you start to render at pixel granularity (no problem when using the usual line granularity, of course ^^).

Other missing info is the timing of VDP register changes and when exactly they are being fetched/latched. I guessed most of them from analysis but it would be good to have an exhaustive and accurate description: like someone wise once said, that's the kind of thing there is no point in trying to do accurately if it's not *exactly* accurate :-)

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue Nov 13, 2012 1:30 am

Eke wrote:Well, for fun (meaning not a single game except the few aforementioned demos needs it), I started some time ago to design a cycle accurate (and still efficient, hopefully) VDP renderer, based on your info about VRAM access, but the thing I am missing is the VSRAM fetch timings relative to external access slots.

I am pretty sure it is fetched close to HSCROLL data in normal V-Scroll mode and just before Plane A/B fetches in 2-Cell Vscroll mode, but I don't know the exact timing, and some games actually rely on this to be properly emulated if you start to render at pixel granularity (no problem when using the usual line granularity, of course ^^).
I've confirmed that on every single mapping read slot for layers A and B, the current state of the following registers is re-evaluated:
-Layer mapping base address (reg 2/3/4)
-Vertical and horizontal field size (reg 16)
-Undocumented reg 1 bit 7 display mode

I believe that the vertical scroll mode is also latched at every mapping read slot, and the vscroll data read from VSRAM at each mapping read slot for the NEXT slot, IE, so in the first two cells, the scroll data is actually latched for the following two cells, but I have yet to run tests to confirm this. That's on my list to do in this current round of timing tests.

The following registers are tied to the analog render process, and are re-evaluated every pixel:
-Palette depth (reg 0 bit 2)
-Shadow highlight mode (reg 12 bit 3)
-Display enable (reg 1 bit 6)
-Mode 5 enable (reg 1 bit 2)
-Background colour settings (reg 7)
-RS0/RS1 bits for H32/H40 and EDCLK selection (reg 12, bits 7/0)

The hscroll mode (reg 11, bit 1/0) is re-evaluated every time hscroll data is read

The following registers are re-evaluated every frame, at the point where vblank is set in the status register:
-V28/V30 screen mode (reg 1, bit 3)
-Interlace mode (reg 12, bits 2/1)


As a sidenote, because I don't think it's documented anywhere, and is kind of critical when you start talking about timing issues, RS0 (reg 12 bit 7) switches the VDP to use the EDCLK signal to drive the serial clock, while RS1 (reg 12 bit 0) enables the H40 cell mode, with an internal MCLK/4 clock divider. The EDCLK signal is only required on the Mega Drive in order to make H40 mode produce a valid PAL/NTSC video signal. It's possible to use EDCLK to drive a H32 mode display, or use a H40 mode display with the internal MCLK/4 divider, it just produces a video signal outside the specifications for PAL/NTSC video, but it's perfectly valid, and could be synched to with the right monitor.
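The RS0/RS1 description can be written down as a small decoder. This is a sketch based only on the bit assignments given above; the function name and the returned labels are made up, and the MCLK/5 fallback for the case with both bits clear is taken from the H32 figures quoted earlier in the thread rather than from this post.

```python
def decode_reg12_clocking(reg12: int) -> dict:
    """Decode the cell mode and serial clock source from register 12."""
    rs0 = bool(reg12 & 0x80)  # bit 7: drive the serial clock from EDCLK
    rs1 = bool(reg12 & 0x01)  # bit 0: H40 cell mode, internal MCLK/4 divider
    if rs0:
        clock = "EDCLK"
    elif rs1:
        clock = "MCLK/4"
    else:
        clock = "MCLK/5"  # assumption: internal divider used for H32
    return {"cells": 40 if rs1 else 32, "serial_clock": clock}


if __name__ == "__main__":
    for value in (0x00, 0x01, 0x80, 0x81):
        print(hex(value), decode_reg12_clocking(value))
```

The two non-standard combinations mentioned above (H32 on EDCLK, H40 on the internal divider) show up as 0x80 and 0x01.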

Mask of Destiny
Very interested
Posts: 624
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Tue Nov 13, 2012 3:00 am

Nemesis wrote:The VDP doesn't actually read the SAT cache entries when register 5 is set, it only ever updates the contents of the cache if it detects a VRAM write being made within the SAT cache "window" based on the current location of the SAT as set by register 5. This means, you can change register 5, and the cache will continue to use the cached data from the old memory location. CastleVania Bloodlines relies on this.
Ah, that does indeed fit better with the overall level of sophistication in the VDP.
Nemesis wrote:As for when this list is traversed, that's a really tricky one isn't it? As you point out, there's no external access to indicate when this occurs. It seems logical to me that it occurs in sync with the sprite pattern reads, with two entries read per pattern slot (IE, it takes 2SC cycles to process an entry), but it's hard to confirm this.
After thinking about it some more, I think this is close, but not quite correct. The timing of when the VDP resumes rendering on line 255/-1 is suggestive. The only useful thing it could be doing 17 slots before !HSYNC (4 slots after the start of reading sprite pattern data) is scanning the sprite list since the results of the tile pattern reads there will be thrown away. The VDP could start scanning the table before it turns everything else on, but this is more complicated than necessary and you would expect if they were going to the trouble of supporting that they would wait to turn on the rest of rendering to free up more external access slots.

If they're clever, sprite processing could even take 3 cycles of SC per sprite. The VDP doesn't necessarily have to finish processing all 80 sprites before it hits the first sprite mapping slot, just the first 61. Even if you remove the external access slot and the slot immediately before the sprite mapping read, you've still got 188 cycles of SC to play with. That's five more than you need though. It makes me wonder if there's some kind of pipelining going on.
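The cycle budgets being discussed are easy to tabulate. The helper below is a hypothetical illustration that only restates the numbers from the posts: 160 SC cycles for 80 sprites at 2 cycles each, and 188 SC cycles for the first 61 sprites at 3 cycles each.

```python
def sat_scan_spare_cycles(sc_cycles_available, sprites, cycles_per_sprite):
    """Return the spare SC cycles (negative if the scan does not fit)."""
    return sc_cycles_available - sprites * cycles_per_sprite


if __name__ == "__main__":
    # Whole table at 2 SC cycles per sprite in the 160-cycle window.
    print(sat_scan_spare_cycles(160, 80, 2))
    # First 61 sprites at 3 SC cycles per sprite in the 188-cycle window.
    print(sat_scan_spare_cycles(188, 61, 3))
```

The first case fits exactly and the second leaves 5 cycles spare, matching the figures in the text.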
Nemesis wrote:The best way might be in H32 mode to target a VRAM write to the cached portion of the SAT designed to be released in the access slot which comes immediately before the read of the hscroll data. There's still 6 sprite pattern read slots after this point. If you perform a write to the cached portion of the SAT at this time, while the list itself is still being traversed, you could determine which exact entry in the list the VDP is up to when that write is made by observing the effect it has on the way the VDP renders sprites on the following line. Remember that VRAM writes are 8-bit, so only half of the full 16-bit write would get through at that access slot, but it should be possible to construct a test ROM which uses this method to determine the synchronization of the external access slots to the traversal of the SAT cache data, relative to this one fixed point at least. From there, it should be possible to infer the exact timing and progression of the rest of the reads. I'm planning to perform this test myself, since this is one of the things I still need to determine for my emulator.
Another option might be some carefully timed writes to the display enable bit in register 1. If you do it just on line 255/-1, you can mess with just sprite rendering without messing up the rest of the display.

Thanks for sharing the additional info, Nemesis!

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Tue Nov 13, 2012 4:32 am

Could the internal timing be used to figure out a way to start the DMA at the same time every frame? Right now, the code used in the DMA direct color example is black magic.
