MOVEM quirk

Post by Charles MacDonald » Thu Mar 26, 2015 6:09 pm

I ran into an interesting problem when debugging software on a 68K SBC. When MOVEM does a memory-to-register transfer of any size (word, long) with any EA mode, the CPU reads one extra address past the last legitimate location. This extra read is word-sized whether you are transferring words or longwords.
For example when examining the bus for this sequence:

MOVEM.W ($C000).w, D0
JMP 0x083E(PC)

The words in ROM are:

0 - 48B8 : movem.w opcode
2 - 0001 : movem.w register list
4 - C000 : movem.w source address
6 - 4EFA : jmp opcode
8 - 083E : jmp displacement

The bus activity is:

1. Read word 0 (48B8)
2. Read word 2 (0001)
3. Read word 4 (C000)
4. Read word 6 (4EFA, prefetch opcode of jmp)
5. Read FFC000.w
6. Read FFC002.w (extra read that isn't expected)
7. Read word 8 (083E, jmp displacement)

(Note: With MOVEM.L, the reads of FFC000.w and FFC002.w are both legitimate data. The extra word read then comes from FFC004.w and is inserted between bus cycles 6 and 7.)
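
The trace above can be condensed into a small model. This is a Python sketch rather than 68K code, purely for illustration: the function names are made up for this example, and it only encodes the behavior described in this post (one trailing word read after the real data), plus the usual sign extension of an absolute short address onto the 68000's 24-bit address bus.

```python
def abs_short_to_bus(addr16):
    """Sign-extend a 16-bit absolute short address, then keep the 24 address bits."""
    if addr16 & 0x8000:
        addr16 |= 0xFFFF0000
    return addr16 & 0xFFFFFF


def movem_load_reads(base, reg_count, size):
    """Word addresses read by a memory-to-register MOVEM, as observed in this post.

    size is 'w' or 'l'. The last entry is the extra word read past the real data.
    """
    words_per_reg = {"w": 1, "l": 2}[size]
    data_words = reg_count * words_per_reg
    # data_words legitimate word reads, plus one trailing extra word read
    return [(base + 2 * i) & 0xFFFFFF for i in range(data_words + 1)]


base = abs_short_to_bus(0xC000)
print([f"{a:06X}" for a in movem_load_reads(base, 1, "w")])  # ['FFC000', 'FFC002']
print([f"{a:06X}" for a in movem_load_reads(base, 1, "l")])  # ['FFC000', 'FFC002', 'FFC004']
```

The last address in each list is the one to check against the memory map.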

This unexpected behavior caused two issues:

1. When checksumming ROM using "movem.l (a0)+, d4-d7" to gather data words, I got a bus error because the first address past the end of ROM is undefined (e.g. the ROM limit is 0x0FFFFF, but MOVEM will read from 0x100000).

2. When repeatedly reading the data register of an IDE device to read a 256-word sector, reading 8 words at a time by "movem.l (IDE_DREG).w, d0-d7", I was losing data because really 9 words were read per iteration.

If you map extra hardware to the Genesis (or any 68K system) and use MOVEM near the edges of the memory ranges, or add memory-mapped I/O devices like the IDE interface in my case, you definitely have to watch out for this.
What's interesting is that there is no similar behavior for writes. It's only for reads, of any size, with any register list (one register, all registers, etc.) and any EA mode.

To avoid reading past a memory range and avoid triggering a bus fault, I made sure the last iteration of the checksum loop transferred one less register:

movem.l (a0)+, d0-d6
move.l (a0)+, d7
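
To see why dropping one register from the MOVEM is enough, the address arithmetic can be checked with a quick Python sketch. It assumes the ROM limit of 0x0FFFFF from the example above and a final block of eight longwords; the helper name and the block size are choices made for this illustration only.

```python
ROM_END = 0x0FFFFF  # last valid ROM byte, taken from the example in this post


def last_byte_touched(base, reg_count, bytes_per_reg, extra_read=True):
    """Highest byte address accessed by one memory-to-register transfer."""
    end = base + reg_count * bytes_per_reg  # first byte past the real data
    if extra_read:
        end += 2  # MOVEM's trailing word read
    return end - 1


final_block = ROM_END + 1 - 32  # last eight longwords of ROM (assumed block size)

# One MOVEM.L of eight registers: the extra read falls past the end of ROM.
naive = last_byte_touched(final_block, 8, 4)

# MOVEM.L of seven registers, then a plain MOVE.L for the eighth longword.
split_movem = last_byte_touched(final_block, 7, 4)
split_move = last_byte_touched(final_block + 28, 1, 4, extra_read=False)

print(hex(naive))                         # 0x100001
print(hex(max(split_movem, split_move)))  # 0xfffff
```

With the split, MOVEM's extra word read lands inside the longword that the following MOVE.L reads anyway, so nothing past ROM_END is touched.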

For memory-mapped I/O, I would suggest spreading the registers out so there is an unused memory location past the last valid I/O address. In my case the next available address maps to an IDE register that is safe to read without affecting anything, and I had spread out the 16-bit register through a 256 byte region.
So by doing a MOVEM.L of eight registers (16 bytes) at offset 0x0F0, there is one dummy read of 0x0100 which doesn't map to anything critical. Thus the extra read is redirected and won't cause problems. In hindsight I would have interleaved the registers to have an unused area between each one.
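
The effect of the dummy read on a read-sensitive port can be sketched with a toy model in Python. Everything here is an assumption for illustration: the port is a simple FIFO mirrored across offsets 0x000 to 0x0FF, offset 0x100 is treated as harmless, each MOVEM is taken to fetch eight data words, and the class and function names are made up.

```python
from collections import deque


class DataPort:
    """Hypothetical read-sensitive port: any word read inside the window pops the FIFO."""

    def __init__(self, words, window_start, window_size):
        self.fifo = deque(words)
        self.window = range(window_start, window_start + window_size)

    def read_word(self, addr):
        if addr in self.window and self.fifo:
            return self.fifo.popleft()
        return None  # outside the window (or FIFO empty): nothing is consumed


def movem_read(port, addr, data_words):
    """Read data_words words plus MOVEM's one extra word; keep only the real data."""
    values = [port.read_word(addr + 2 * i) for i in range(data_words + 1)]
    return values[:data_words]  # the extra word is fetched but thrown away


def read_sector(offset):
    """Read a 256-word sector as 32 transfers of eight words from the given offset."""
    port = DataPort(range(256), window_start=0x000, window_size=0x100)
    got = []
    for _ in range(32):
        got += movem_read(port, offset, 8)
    return got


print(read_sector(0x0F0) == list(range(256)))  # True: extra read hits 0x100
print(read_sector(0x000)[:10])                 # [0, 1, 2, 3, 4, 5, 6, 7, 9, 10]
```

At offset 0x000 every transfer silently consumes a ninth word, which matches the data loss described above; at offset 0x0F0 the ninth read falls just outside the window.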

I think this behavior is still consistent with the Motorola timing. They say each register access takes 2n bus cycles. It really takes 1, but if you consider all the extra cycles used for prefetching and the dummy read then it amounts to 2n total. It was probably simpler to explain it like that rather than discuss the prefetch mechanism in detail. Though with all the prefetching it's harder to see where one instruction ends and the next one begins as it's all interleaved.
Last edited by Charles MacDonald on Fri Jun 19, 2015 7:20 pm, edited 1 time in total.


Post by Stef » Fri Mar 27, 2015 9:38 am

Interesting, does the extra cycle also happen with (An) indexing? I guess so! That also explains why MOVEM.w Mem-->Reg is 12+4n cycles when MOVEM.w Reg-->Mem is 8+4n cycles.

It always annoyed me: because of that, MOVEM.w Mem-->Reg is only worth using when you have 3 or more registers to transfer :-/

In both cases we should have only 8 + 4n cycles:
- 4 cycles for instruction fetch
- 4 cycles for registers mask
- 4*n depending on the number of registers to read/write.

So if MOVEM.w Mem-->Reg does an extra bus cycle, that explains where the 4 extra cycles go. It's probably due to the internal implementation of MOVEM.w Mem-->Reg. Still, the extra bus cycle happens at the end of the instruction, whereas I would have expected it at the beginning (a duplicated bus cycle to wait for some internal logic, for instance).
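
That accounting can be written out as arithmetic. A rough Python sketch, assuming a 4-clock bus cycle and treating the extra read as one more bus cycle on the memory-to-register side only; the function name is hypothetical and the breakdown simply follows the list above.

```python
CLOCKS_PER_BUS_CYCLE = 4  # assumed: one 68000 bus cycle with no wait states


def movem_w_clocks(n, load):
    """Clock count for MOVEM.w with n registers, following the breakdown above.

    load=True is Mem-->Reg (includes the extra read), load=False is Reg-->Mem.
    """
    opcode_fetch = CLOCKS_PER_BUS_CYCLE
    mask_fetch = CLOCKS_PER_BUS_CYCLE
    transfers = CLOCKS_PER_BUS_CYCLE * n
    extra_read = CLOCKS_PER_BUS_CYCLE if load else 0
    return opcode_fetch + mask_fetch + transfers + extra_read


for n in (1, 3, 8):
    print(n, movem_w_clocks(n, load=True), movem_w_clocks(n, load=False))
# 1 16 12
# 3 24 20
# 8 44 40
```

The two columns differ by exactly one bus cycle for every n, which is consistent with the 12+4n and 8+4n figures.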


Post by Charles MacDonald » Sat Mar 28, 2015 12:52 am

Stef wrote: Interesting, does the extra cycle also happen with (An) indexing? I guess so! That also explains why MOVEM.w Mem-->Reg is 12+4n cycles when MOVEM.w Reg-->Mem is 8+4n cycles.

That's right, it happened when the source was (An), (An)+, ($nnnn).w, etc. I tried all of the ones I could think of.
