VDP registers timings

For anything related to VDP (plane, color, sprite, tiles)

Moderators: BigEvilCorporation, Mask of Destiny

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm
Contact:

Post by Eke » Mon May 19, 2008 8:02 am

Actually, the dots that appear are the colour of the value you write. Essentially, when you write to CRAM, the single pixel being drawn at that moment displays the colour you were writing to the CRAM, regardless of the colour that should have been displayed. This occurs even when the display is disabled, or when the VDP is rendering the screen borders, and one write always changes the colour of a single pixel if the VDP is in the process of rasterizing a line.

I've done extensive testing on this behaviour. You can actually use it to measure the timing of the FIFO buffer during active scan, as it allows you to see visually on the screen when each CRAM write gets through. I've emulated mid-line writes to CRAM, as well as this flicker bug, in my emulator. As for other settings though, like when specific register changes take effect, I haven't sampled that information yet. I'm planning to test each and every register setting, and writes to all the various buffers in ram to determine when the VDP samples all its data, and compile the results into a single document. It's essential in order to get the VDP emulation "perfect", and avoid all potential single-line raster problems like the one in Galahad, so I'll definitely be doing it at some point.


I may as well add a bit more info, I've noticed from observing the FIFO timing that there are a series of "slots" in fixed positions for each scanline when buffered writes go through. I'll be calculating the precise coordinates for those quite soon, but they appear to be 2 cells apart starting from the left side of the screen. There was also one mark in an odd position near the end of each line, which I believe corresponds with the position where the HINT is triggered.

There are a number of factors that lead me to believe that the VDP renders the screen in 2-cell segments internally, and it is logical that the only time data can be changed with predictable results is at the end of each of these 2-cell blocks, which the FIFO enforces. I suspect changes to RAM are seen at the end of each of these blocks, but there are some questionable values, like hscroll data, which I'd have to test to be sure, and I very much doubt mid-line changes to the sprite table are seen. As for register changes, I suspect these are only applied at some point during hblank. Again, this needs testing though.
Thanks a lot for this, I'm impatiently waiting for your documentation.

Currently, I think I have finally achieved 100% accurate VDP emulation, since I can't find any more games with single-line glitches or flickering (but who knows, testing ALL Genesis games takes sooo much time :?). There is indeed still a lot of Genesis timing stuff to discover though, like the exact occurrence of interrupts and HVC values relative to other VDP events (HBlank/VBlank), hblank/vblank/vint flag triggering, register latches, etc...

About the FIFO, the documentation gives the max number of permitted CPU accesses to VRAM per line (btw, the data rate is likely the DMA rate), as well as the max waiting time for the FIFO to execute one command. Did you figure out the exact slot timesharing between the VDP and the CPU? Can we consider that a VRAM write and a VRAM read take approximately the same number of VDP clocks?

About the sprites, you probably already know, but Charles MacDonald discovered that there is an internal copy of the SAT: when rendering sprites, some of the sprite attributes are taken from this internal table, while others are taken from the current VRAM table location.
Most probably, sprite data is latched somewhere on the line.

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City
Contact:

Post by TmEE co.(TM) » Mon May 19, 2008 12:09 pm

You can fill a sprite-wide (8 to 32 pixel) screen area vertically using 1 sprite... I haven't done any further tests on that though.
What are you reading? You won't understand it anyway ;)
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen

Jorge Nuno
Very interested
Posts: 374
Joined: Mon Jun 11, 2007 3:09 am
Location: Azeitão, PT

Post by Jorge Nuno » Mon May 19, 2008 9:35 pm

Eke wrote:
[....]
Most probably, sprite data is latched somewhere on the line.
Yup, it's fetched at its end, in the Hsync period...

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue May 20, 2008 4:13 am

About the FIFO, the documentation gives the max number of permitted CPU accesses to VRAM per line (btw, the data rate is likely the DMA rate), as well as the max waiting time for the FIFO to execute one command. Did you figure out the exact slot timesharing between the VDP and the CPU? Can we consider that a VRAM write and a VRAM read take approximately the same number of VDP clocks?
I need to do more tests to make definitive statements, but here's what I can say at this point.

I know from testing that writes to CRAM/VRAM/VSRAM are only allowed at fixed positions during each line. I have the ability to calculate the exact positions of each of these slots. A single slot allows a single bus request from an external device to be processed. The FIFO buffers each VRAM write until one of these slots comes around, then it uses the slot to perform the write. The slots have a regular spacing of what appears to be a 2-cell period. I suspect that very few slots exist during hblank, but I haven't done any testing on that whatsoever.

I'm almost certain that VRAM reads and DMA occur on the exact same timing, with the exact same restrictions, as VRAM writes. I would be extremely surprised if they did not. An external device is obviously always left waiting for a read however, regardless of the state of the FIFO, because the operation cannot be completed until the read can be processed. What I don't know at this point is, if there is one or more writes in the FIFO queue and a read request comes through, does the read "jump the queue" and get processed first, or do the buffered writes already in the FIFO get processed first. If reads always get processed first, it should be possible to read back an old value even if you've already tried to overwrite it, if the write is stuck in the FIFO. If reads have to wait in the FIFO just like writes do, the maximum wait time for reads will be much longer than for writes, because a read can then only be processed when all existing entries in the FIFO have been processed.
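To make the two read-ordering possibilities concrete, here's a hypothetical Python sketch of the question. The addresses, values, and FIFO contents are invented for illustration; nothing here is verified hardware behaviour, it just models the two outcomes described above:

```python
from collections import deque

# Hypothetical model of the two read-ordering possibilities discussed
# above. Addresses, values and FIFO contents are invented for
# illustration; nothing here is verified hardware behaviour.

def cpu_read(fifo, vram, addr, reads_jump_queue):
    """Value a CPU read would return under each hypothesis."""
    if reads_jump_queue:
        # Read is serviced before buffered writes: it can return a
        # stale value if a write to addr is still sitting in the FIFO.
        return vram.get(addr, 0)
    # Read waits behind every queued write: drain the FIFO first.
    while fifo:
        a, v = fifo.popleft()
        vram[a] = v
    return vram.get(addr, 0)

vram = {0x100: 0xAAAA}
pending = [(0x100, 0x5555)]            # one write still buffered

stale = cpu_read(deque(pending), dict(vram), 0x100, reads_jump_queue=True)
fresh = cpu_read(deque(pending), dict(vram), 0x100, reads_jump_queue=False)
print(hex(stale), hex(fresh))          # 0xaaaa 0x5555
```

If reads jump the queue, the read-back of an old value despite a pending overwrite would be directly observable from the 68000, which suggests a way to test this.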


Here's what the official documentation says about FIFO access timing:
[Image: FIFO access timing table from the official documentation]

Note that the documentation says that the wait times vary between H32 and H40 cell mode, and that for H32, 16 accesses are processed, while for H40, 20 accesses are processed. This matches my observed period of 2 cells for each access. They provide maximum wait times of 5.96 microseconds for H32, and 4.77 microseconds for H40. The "maximum wait time" I can be sure of at this point is the time between two cells on a single line, when running at 50Hz, in H32 cell mode. Using some dodgy math and completely ignoring the hblank/vblank periods:

(((1000000/50)/224)/16)=5.58
(((1000000/50)/224)/20)=4.46

That's close, but not quite the same as the officially reported values. If I was taking into account hblank/vblank, my results would be lower, not higher. Make of that what you will. I suspect the maximum wait time actually occurs around hblank, either leading into it or coming out of it, as the VDP loads its information for the next line. Of course, these numbers in the official docs could be wrong anyway. Personally, I doubt they took into account blanking periods when they came up with these figures. They're just a guideline anyway.
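For what it's worth, the back-of-envelope arithmetic above can be reproduced directly. These are the same rough numbers, ignoring blanking entirely, not hardware measurements:

```python
# Reproducing the rough figures above: one 50Hz PAL frame, 224 active
# lines, blanking ignored. These are estimates, not measured values.
frame_us = 1_000_000 / 50      # one frame in microseconds
line_us = frame_us / 224       # crude per-line time

h32_wait = line_us / 16        # 16 access slots per line in H32
h40_wait = line_us / 20        # 20 access slots per line in H40
print(round(h32_wait, 2), round(h40_wait, 2))   # 5.58 4.46
```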


As you can see, I need to do a lot more tests. I haven't even started running tests to see when register changes are applied yet. I suspect the VDP reads the internal register settings and copies/applies them during the hblank of each frame, meaning no register changes can take effect mid-line, but again this needs to be tested. I've got the means and the method to test all these cases, I just need the time.

tomaitheous
Very interested
Posts: 256
Joined: Tue Sep 11, 2007 9:10 pm

Post by tomaitheous » Tue May 20, 2008 5:17 am

Nice bit of info. I wonder if the doc is wrong though. It says 16 word writes per active line for CRAM, but from the tests I know of for CRAM writes, the pixel garbage/artifacts on screen showed up in 32-pixel wide increments, suggesting only 8 words for CRAM in H32 per active line. Can anyone else confirm this?

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm
Contact:

Post by Eke » Tue May 20, 2008 8:23 am

I know from testing that writes to CRAM/VRAM/VSRAM are only allowed at fixed positions during each line. I have the ability to calculate the exact positions of each of these slots. A single slot allows a single bus request from an external device to be processed. The FIFO buffers each VRAM write until one of these slots comes around, then it uses the slot to perform the write. The slots have a regular spacing of what appears to be a 2-cell period. I suspect that very few slots exist during hblank, but I haven't done any testing on that whatsoever
the only thing we knew was:

- there are 16 CPU slots in H32 mode
- there are 20 CPU slots in H40 mode
- remaining slots are VDP read slots

what was not certain:
- what is the length of one slot, and do the VDP and CPU have the same slot length (this could be retrieved from the total number of slots and the line duration)?
- what is the slot order?

From what you say (how did you verify this btw?), CPU + VDP slots share a 2-cell period.

Maybe (pure speculation again), there is a 1-cell period for CPU access and a 1-cell period for VDP access. What we are missing is the exact read/write cycle duration for VRAM/CRAM/VSRAM.
I'm almost certain that VRAM reads and DMA occur on the exact same timing, with the exact same restrictions, as VRAM writes. I would be extremely surprised if they did not. An external device is obviously always left waiting for a read however, regardless of the state of the FIFO, because the operation cannot be completed until the read can be processed. What I don't know at this point is, if there is one or more writes in the FIFO queue and a read request comes through, does the read "jump the queue" and get processed first, or do the buffered writes already in the FIFO get processed first. If reads always get processed first, it should be possible to read back an old value even if you've already tried to overwrite it, if the write is stuck in the FIFO. If reads have to wait in the FIFO just like writes do, the maximum wait time for reads will be much longer than for writes, because a read can then only be processed when all existing entries in the FIFO have been processed.
Well, these are only speculations: on a hardware basis, I don't see why a read command should have priority over existing write commands; they are most probably added at the FIFO's last entry and are blocked as long as the FIFO is full, like writes.

This definitely requires real testing, but I don't see how you can get access to the internal VRAM bus.
Note that the documentation says that the wait times vary between H32 and H40 cell mode, and that for H32, 16 accesses are processed, while for H40, 20 accesses are processed. This matches my observed period of 2 cells for each access. They provide maximum wait times of 5.96 microseconds for H32, and 4.77 microseconds for H40. The "maximum wait time" I can be sure of at this point is the time between two cells on a single line, when running at 50Hz, in H32 cell mode. Using some dodgy math and completely ignoring the hblank/vblank periods:

(((1000000/50)/224)/16)=5.58
(((1000000/50)/224)/20)=4.46
Here's how I once retrieved the correct value. Your formula is not correct, as it has nothing to do with 50Hz/60Hz:

one line is approx. 63.69 us
hdisplay is 63.69*256/342 = 47.67 us (same for both modes)
this gives you :

H32 (worst case is 8 words for VRAM) : 47.67/8 = 5.96 us
H40 (worst case is 10 words for VRAM): 47.67/10 = 4.77 us

see ;)


EDIT

I have redone my maths recently, here is a better explanation:
The worst case is still a single VRAM word write, which actually takes two internal VDP write operations, making a maximal wait time of 2*2 = 4 cells...

4 cells = 4 * 8 = 32 pixels, and knowing the dot clock in both modes:

H32 dot clock is 53693175/10 --> max. wait time is 32 / 5.3693175 = 5.96 us
H40 dot clock is 53693175/8 --> max. wait time is 32 * 8 / 53.693175 = 4.77 us
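Both derivations can be checked numerically. This little sketch just reproduces the arithmetic above; the 53693175 Hz master clock and the 342-pixel H32 line are the figures from the posts:

```python
# First method: active display time split over worst-case word writes.
line_us = 63.69                      # approximate line duration
hdisp_us = line_us * 256 / 342       # active display portion of the line
h32_first = round(hdisp_us / 8, 2)   # 8 worst-case VRAM words in H32
h40_first = round(hdisp_us / 10, 2)  # 10 worst-case VRAM words in H40

# Second method: a 4-cell (32-pixel) worst-case wait at each dot clock.
MCLK = 53_693_175
h32_second = round(32 / (MCLK / 10) * 1e6, 2)   # H32 dot clock = MCLK/10
h40_second = round(32 / (MCLK / 8) * 1e6, 2)    # H40 dot clock = MCLK/8

print(h32_first, h40_first)     # 5.96 4.77
print(h32_second, h40_second)   # 5.96 4.77
```

Both methods land on the official 5.96 us / 4.77 us figures, which is reassuring.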
Last edited by Eke on Fri Feb 20, 2009 2:47 pm, edited 1 time in total.

notaz
Very interested
Posts: 193
Joined: Mon Feb 04, 2008 11:58 pm
Location: Lithuania

Post by notaz » Tue May 20, 2008 9:09 am

Nemesis wrote:The slots have a regular spacing of what appears to be a 2-cell period.
Yeah, it seems VDP does some additional processing every 2 cells, that's why we have 2-cell vscroll too.

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Tue May 20, 2008 9:59 am

notaz wrote:Yeah, it seems VDP does some additional processing every 2 cells, that's why we have 2-cell vscroll too.
!?! Could it be ? I thought it was a hardware VDP limitation.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue May 20, 2008 1:01 pm

Eke wrote: Here's how I once retrieved the correct value. Your formula is not correct, as it has nothing to do with 50Hz/60Hz:

one line is approx. 63.69 us
hdisplay is 63.69*256/342 = 47.67 us (same for both modes)
this gives you :

H32 (worst case is 8 words for VRAM) : 47.67/8 = 5.96 us
H40 (worst case is 10 words for VRAM): 47.67/10 = 4.77 us
Yep, that sounds more sensible. At least we know how they got those figures then.


I've been going over my notes at home, and re-running some of the tests I did. It'll be easier to demonstrate what I'm talking about with a few roms:
http://nemesis.hacking-cult.org/MegaDri ... v28h40.bin
http://nemesis.hacking-cult.org/MegaDri ... v28h40.bin

And since these won't give you anything on an emulator, here are some short videos of them running:
http://nemesis.hacking-cult.org/MegaDri ... v28h40.avi
http://nemesis.hacking-cult.org/MegaDri ... v28h40.avi

These are video captures directly from a PAL unit, model 1600. The noise and slight wobble is due to the capture card. On the system, those are perfect vertical lines.

These are a couple of the test roms I've made to test the times at which writes are performed. Both of them are in a constant loop writing data to the CRAM. Both are running in H40 mode. The first rom shown uses only the CRAM flicker bug to produce an image. Technically, the backdrop colour is using a palette entry which remains set to 0 constantly. Each pixel of colour is a single write being written to the CRAM, which is also written to the output as a result of the bug. The second rom is writing the data to the palette entry which is also being used for the backdrop.


Looking at these test roms again, there are a few interesting points I'd forgotten.
1. The "slots" at which buffered writes are allowed through are at fixed positions in each line, but interestingly, there's a regular "gap" where you'd expect a write to be processed, but it doesn't happen.
2. Corresponding with the "gap" mentioned above, writes are held in the FIFO, even when drawing the screen borders outside of active scan. The period is so small that it is impossible for the M68000, clocked at the speed it is, to perform a second write or query the status register before the first write is removed from the FIFO, but the delay exists nonetheless. You can see it on the video in the border as a series of blank columns where no writes ever appear, with the next column having an unusually high number of writes.
3. The FIFO is purged at the maximum rate (one access per pixel) near the end of the last line, but before the line has finished being drawn. You can see the results of this dumped near the end of the last line of the display. This shows that the VDP pre-fetches data for batches of pixels, and marks the point at which access to the VDP is no longer restricted by active scan.

From checking these test roms again too, the numbers in the official docs regarding the number of writes per line in H32 and H40 cell mode are wrong. The number in H40 cell mode is a multiple of 3, with 16 visible on the image. Apart from that, I've already said most of what I can without doing further tests, which I don't have the time to do tonight.

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm
Contact:

Post by Eke » Thu Jun 10, 2010 1:36 pm

Nemesis wrote: I've been going over my notes at home, and re-running some of the tests I did. It'll be easier to demonstrate what I'm talking about with a few roms:
http://nemesis.hacking-cult.org/MegaDri ... v28h40.bin
http://nemesis.hacking-cult.org/MegaDri ... v28h40.bin
I tried both programs on a PAL MD2 and only got a black screen. Same goes for your Sprite Test Program. Did you by any chance forget to write the TMSS register at the start of your program?
Nemesis wrote:Looking at these test roms again, there are a few interesting points I'd forgotten.
1. The "slots" at which buffered writes are allowed through are at fixed positions in each line, but interestingly, there's a regular "gap" where you'd expect a write to be processed, but it doesn't happen.
2. Corresponding with the "gap" mentioned above, writes are held in the FIFO, even when drawing the screen borders outside of active scan. The period is so small that it is impossible for the M68000, clocked at the speed it is, to perform a second write or query the status register before the first write is removed from the FIFO, but the delay exists nonetheless. You can see it on the video in the border as a series of blank columns where no writes ever appear, with the next column having an unusually high number of writes.
3. The FIFO is purged at the maximum rate (one access per pixel) near the end of the last line, but before the line has finished being drawn. You can see the results of this dumped near the end of the last line of the display. This shows that the VDP pre-fetches data for batches of pixels, and marks the point at which access to the VDP is no longer restricted by active scan.

From checking these test roms again too, the numbers in the official docs regarding the number of writes per line in H32 and H40 cell mode are wrong. The number in H40 cell mode is a multiple of 3, with 16 visible on the image. Apart from that, I've already said most of what I can without doing further tests, which I don't have the time to do tonight.
Although I could not test your programs on the real thing, I actually tested a few demo programs that intensively write to CRAM during the active line (this one, for example, which runs in 40-cell mode) and was able to see what you described.

I also tested an old 512 color demo that runs in H32 mode to see the difference and here is what I observed:


H40 (15 visible accesses):
--x----x--x--x----x--x--x----x--x--x----x--x--x----xx--

H32 (12 visible accesses):
--x----x--x--x----x--x--x----x--x--x----xx--

x = one CRAM access from CPU
- = one cell (8-pixel) interval

Unfortunately, it's impossible to see what happens during Horizontal Blanking, but there doesn't seem to be any access during the right border (I could not see the left one), or maybe the CRAM 'dot' does not happen during the border. There might be some additional accesses during the invisible area though, to match the allowed number of accesses given in the documentation (16 in H32 mode, 18 in H40 mode).

Also, I'm not exactly sure about the pixel interval between the start of the line and the first visible access: it seems to be approximately 2 cells in both modes, but I'm not sure. Same goes for the interval between the 2 last accesses (occurring at max. rate) and the end of the line.
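As a quick sanity check, the visible access counts in the two patterns above can be tallied mechanically:

```python
# 'x' marks one visible CRAM access, '-' one 8-pixel cell, copying the
# two observed patterns from the demo programs mentioned above.
h40 = "--x----x--x--x----x--x--x----x--x--x----x--x--x----xx--"
h32 = "--x----x--x--x----x--x--x----x--x--x----xx--"
print(h40.count("x"), h32.count("x"))   # 15 12
```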

EDIT: I adjusted my TV factory settings to see more of the horizontal borders and it seems that:

1. in both modes, there is a slot during the left border, occurring 2 pixels before the end of the border (so the next slot occurs 14 pixels after the start of the active line).

2. in both modes, the last 2 slots are spaced by 2 pixels (maximal FIFO rate?) and occur approx. 18 pixels before the end of the active line (maybe triggered just after HINT and just before the vcounter is incremented / sprites are processed?)

3. in H32 mode ONLY, there is an additional slot during the right border, occurring approx. 13 pixels after the start of the border (end of sprite pre-processing?).

4. active line rendering (shifted a bit compared to line display) seems to be divided into 64-pixel sections, with 3 slots per section and 16 pixels between slots. The maximum wait time is between the last slot of a section and the first one of the next section: it is 32 pixels, which fits with the values given in the official doc.

5. Although the first slot of the first section is not visible (it would occur before the left border, 18 pixels from the start of the active line), it seems there are five complete (3-slot) sections in H40 mode and four in H32 mode. Counting the two (three in H32 mode) additional slots triggered at the end of the line, and assuming the official doc is valid regarding the number of accesses per line (18 in H40 mode, 16 in H32 mode), it would mean there is one additional slot happening somewhere during Hblank, but I guess it's impossible to know where exactly without spying on the VRAM bus...
Last edited by Eke on Mon Jun 14, 2010 9:38 am, edited 4 times in total.

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City
Contact:

Post by TmEE co.(TM) » Fri Jun 11, 2010 1:55 pm

Sonic 3 does 3 writes per line in the first stage, and you can clearly see 2 writes right next to each other on the left side before BG+sprite pixels start to be drawn... I've never looked into it, but it probably triggers a DMA right when HBL happens...

I'm quite positive the values in the manual are right.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Mon Jun 14, 2010 11:35 pm

I'm sure the values in the manual are wrong, or at least, on a PAL system they're wrong. In the H40 test, I'm alternating between three colours as quickly as possible. Notice how in the video, the result gives stable columns of the same colour down the active region of the display? That means the number of writes per line must be a multiple of 3, otherwise the bands wouldn't be stable, they'd alternate each line. The manual says 20 writes per line are allowed, which doesn't have 3 as a factor. We can see 16 on the image, so maybe the number is 16, 19, 22, 25, etc.

I didn't do enough testing to confirm exactly how many writes per line are allowed, but it'd be easy to do, just keep on looking for the highest number which gives stable columns. I'm back into VDP testing mode now, so I'll be looking at this again in the near future, as well as testing on other systems and video modes. I'll make sure I post what I find.
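The stable-column argument can be sketched in a few lines of Python. This is only a model of the reasoning, not of the hardware: with 3 colours cycling, the colour landing in column c of line n is (n * writes_per_line + c) mod 3, so columns repeat identically down the screen only when the per-line write count is a multiple of 3:

```python
# Model of the stable-column argument: cycling 3 colours, column c of
# line n receives colour (n * writes_per_line + c) % 3.
def columns_stable(writes_per_line, lines=8):
    rows = [[(n * writes_per_line + c) % 3 for c in range(writes_per_line)]
            for n in range(lines)]
    return all(row == rows[0] for row in rows)

print(columns_stable(18))   # True  (18 % 3 == 0, bands line up)
print(columns_stable(20))   # False (20 % 3 != 0, bands shift each line)
```

So a per-line count of 20 (the manual's H40 figure) would produce shifting bands, which is not what the video shows.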

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm
Contact:

Post by Eke » Tue Jun 15, 2010 7:22 am

Nemesis wrote:I'm sure the values in the manual are wrong, or at least, on a PAL system they're wrong. In the H40 test, I'm alternating between three colours as quickly as possible. Notice how in the video, the result gives stable columns of the same colour down the active region of the display? That means the number of writes per line must be a multiple of 3, otherwise the bands wouldn't be stable, they'd alternate each line. The manual says 20 writes per line are allowed, which doesn't have 3 as a factor. We can see 16 on the image, so maybe the number is 16, 19, 22, 25, etc.
Again, nowhere does the manual say there are 20 accesses per line during active display, at least not the official one (maybe they made an error in the transcript).

[Image: access timing table from the official manual]

As you can see, they only confirm the 16 accesses per line in H32 mode, and they made a mistake by recopying the values from within vblank for the active display timings (which maybe was later "corrected" in the transcript).

On the contrary, we have the table for DMA access timings, which logically should use the exact same slots as CPU accesses :idea:
[Image: DMA transfer rate table from the official manual]

It suggests there are 18 accesses per line in H40 mode (again, 16 in H32 mode):
- DMA fill requires one additional write when the DMA starts, so it is taken into account in the amount of transferred bytes (though it should only affect the first line if the DMA takes more than one)
- DMA copy requires a read followed by a write, so the number of transferred bytes is divided by 2.

18 is a multiple of 3, which fits with your observations. In H32 mode, I observed there was an additional slot during the right border which is not seen in H40 mode.

PS: Also notice how the number of lines available for full-speed access is 2 less than the supposed number of lines during VBLANK (224+36=260, 224+87=311, 240+71=311), so if it's not a typo, maybe the last two lines of the top border use the slower timings of the active display. Maybe it's related to the fact that the VBLANK flag is cleared one line before line 0. Again, some stuff that needs to be verified :wink:

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City
Contact:

Post by TmEE co.(TM) » Tue Jun 22, 2010 12:57 am

I tested these and I could not see anything fun happen outside the usable screen... BUT

You can clearly see the image is made of 6x 64-pixel wide zones. Each zone has 3 possible writes.

I took a photo of the TV :
http://www.fileden.com/files/2008/4/21/ ... sBands.jpg

Now, you can see there's a green strip on the right... but there's also a green one on the left... but the colours only go RGBRGBRGB... so there are 2 missing strips there. Now when you add those 2, you nicely get 3 strips per zone...

Here is Sonic3 first stage :

http://www.fileden.com/files/2008/4/21/ ... Change.jpg

There are 16 lines used to do the transfer, and each line transfers 3 colors. I did some fun stuff with Genecyst and looked at which colors were changed and which were not: 16 colors were not changed, which means 48 do change, and 48 / 16 = 3. It's a little difficult to tell, but there are 2 pixels right next to each other, much like how it is on the right side of the screen...
There seems to be something fun going on regarding those tight-together writes... it seems that when you have got it on, say, the right side, you do not get it on the left side...

Another question is what goes on in 256-pixel mode; there are only 5 zones, but the manual says 16 writes are possible... 5 zones would mean 15 writes, unless there is a hidden write.

BTW, the right side where the 2 strips are close together is the point where the VDP starts building sprites. It does 32 sprite pixels in 8 screen pixels...
I need to do a bit more tests around it... I basically just literally looked at VRAM activity (used the red color for probing...)... now that I have a camera I need to redo that stuff...
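The zone arithmetic from this post, written out. This purely restates the counts above; whether the "hidden" writes exist is the open question:

```python
# 64-pixel zones with 3 CRAM writes each: 6 zones in H40, 5 in H32.
h40_total = 6 * 3    # 18 writes (16 visible strips + 2 hidden on the left)
h32_total = 5 * 3    # 15 writes, vs the 16 the manual gives for H32
print(h40_total, h32_total)   # 18 15
```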

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm
Contact:

Post by Eke » Wed Nov 10, 2010 5:27 pm

Something I recently figured out by looking at these shots:

- when looking at the borders, it seems the maximal rate is the pixel clock divided by 2: indeed, in each 64-pixel section, you can see 31 CPU slots and one reserved slot. Interestingly, the position of the reserved slot corresponds to the 4th slot that is "jumped" during the active area.

- in H40 mode, there are 420 pixels: this makes a total of 210 slots. If you remove the five reserved slots (one per section, as shown in the picture), it gives you 205 access slots for the CPU, the same as the value in the official doc. In H32 mode, there would be only 342 pixels and one 64-pixel section less than in H40 mode: this would make 342/2 - 4 = 171 - 4 = 167 access slots for the CPU.

I guess we know where these values come from now :wink:
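The accounting above, as a tiny sketch (assuming 2 pixels per slot and one reserved slot per 64-pixel section, as described):

```python
# H40: 420 pixels per line, 5 reserved slots (one per 64-pixel section).
h40_cpu_slots = 420 // 2 - 5    # 210 slots total, minus 5 reserved
# H32: 342 pixels per line, one section fewer, so 4 reserved slots.
h32_cpu_slots = 342 // 2 - 4    # 171 slots total, minus 4 reserved
print(h40_cpu_slots, h32_cpu_slots)   # 205 167
```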


It's a little difficult to tell but there are 2 pixels right next to each other, much like how it is on the right side of the screen...
I think this counts for only one access. The colour is the same, while in Nemesis's video you can clearly see that those on the right are two different accesses. It's very likely the effect is visible on neighbouring pixels.

Anyway, I think the CPU access slots are fixed: the first visible one (the one we see in Sonic 3, the green strip on the left in Nemesis's test program and some homebrew) is 2 pixels (one access cycle) before the start of the active area, in the left border. There is most likely another one (16 pixels before) which is not visible and marks the start of the first 64-pixel area (3 CPU slots per area, followed by a reserved slot, a refresh cycle maybe?).

All of this shows that the VDP is processing data in advance compared to the pixel output, probably with a one-column (16-pixel) offset. The additional 2-pixel delay probably corresponds to one cycle of internal latency.

Similarly, the end of data processing happens 16-18 pixels before the end of the active display (at the same time, the VCounter is incremented & HINT is triggered). At this point, the VDP stops reading data from RAM and probably fetches data from internal registers to prepare for the next line, leaving full bus access to the CPU for a short time (2 consecutive accesses are visible in H32 & H40 modes).

Then it starts prefetching sprites for the next line, so it needs full bus access again. Most likely, there is one extra CPU access slot available somewhere during HBLANK (2 in H32 mode, one being visible at the end of the right border).
