Confusing crash Address Error.
Moderator: BigEvilCorporation
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
r57shell wrote: may be I need high word of d0? but yes, it's fastest way in other cases.
If you need the high word preserved, I think move.w #0,d0 is the same speed. One of the first things you learn about 68000 assembly is to avoid clr whenever possible, as it's (unnecessarily) slow and will even cause problems when used on some hardware registers. NEVER use clr on hardware registers, period. Now in many cases, the difference in speed doesn't matter, but if you're trying to optimize an inner loop or working on hardware, it's one of the things you learn.
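To make the distinction concrete, here is a minimal sketch of the cases described above. The label HW_REG and its address are hypothetical placeholders, not a real register; substitute whatever port you are actually writing to.

```asm
; Hypothetical memory-mapped register, for illustration only.
HW_REG  equ     $00FF8000

ClearExamples:
        moveq   #0,d0           ; clears all 32 bits of d0
        move.l  #$12345678,d1   ; sample value so the effect is visible
        move.w  #0,d1           ; clears the low word only: d1 = $12340000
        move.w  #0,HW_REG       ; plain write to the register, used here instead of clr
        rts
```

The last line is the form to reach for on hardware: it writes the zero without going through clr.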
I don't know why you think clr.w is bad. As far as I know, only moveq beats clr.l; against everything else, clr is better. To me it's obvious, because any immediate operation requires an extension word (the immediate value), except the ones with q at the end (moveq, addq, subq...), which are one-word opcodes. Reading an extension word takes at least 4 cycles, which is why a one-word opcode beats a two-word opcode.
According to the timings:
Code:
moveq #0,d0    =4
clr.l d0       =6
clr.w d0       =4
move.w #0,d0   =8
move.l #0,d0   =12
As I said, clr beats any move #0,<ea>. You may check if you want.
It is even almost the same as move d0,<ea>, and that is strange (move is faster only on -(An) and (d8,An,Xn)). So the best way to clear several variables:
Code:
moveq #0,d0
move d0,<ea>
...
Something is strange, because the Programmer's Manual states
"In the MC68000 and MC68008 a memory location is read before it is cleared."
and in the timings that read is invisible. Maybe it's a read-modify-write cycle with one bus access? But as stated in the User's Manual
"The test and set (TAS) instruction uses this cycle to provide a signaling capability without deadlock between processors in a multiprocessing environment. The TAS instruction (the only instruction that uses the read-modify-write cycle) only operates on bytes. Thus, all read-modify-write cycles are byte operations."
Something is wrong in the timings or in the clr description.
-
- Very interested
- Posts: 619
- Joined: Thu Nov 30, 2006 6:30 am
r57shell wrote: As I said, clr beats any of move #0, <ea>. You may check if you want.
I'm not sure how you're timing the instruction, but you're wrong. The overall instruction duration is the same except for the case in which <ea> is a register direct mode. This is quite clear if you look at the microcode. clr uses the standard effective address microcode subroutines, which do a separate read first before the instruction-specific code runs. I suppose it's possible they changed this between the patent filing and the final chip, but since the manual agrees with the microcode in this case, that seems unlikely.
Now where exactly the bus operations occur in the overall duration will be different (too lazy to check the microcode listings at the moment to describe it precisely), and I suppose this could impact the measured performance based on how things sync up with the various refresh delays.
r57shell wrote: Something strange because in Programmer Manual stated "In the MC68000 and MC68008 a memory location is read before it is cleared." and in timings that read is invisible. May be RW cycle with one bus access? But as stated in User Manual "The test and set (TAS) instruction uses this cycle to provide a signaling capability without deadlock between processors in a multiprocessing environment. The TAS instruction (the only instruction that uses the read-modify-write cycle) only operates on bytes. Thus, all read-modify-write cycles are byte operations." Something wrong in timings or in clr description.
Yeah, TAS is the only instruction to use the read-modify-write cycle, and I don't think that cycle is any faster (or at least not significantly so) than two separate bus operations. Its only purpose is to prevent another bus master from taking the bus between the read and the write.
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
In the programmers reference, there's this note for clr:
"NOTE: In the MC68000 and MC68008 a memory location is read before it is cleared."
The hardware manual specifically states that CLR <memory> takes 8 cycles + ea calculation time, does one read cycle, and one write cycle. EA calculation is 0 only for data or address register direct. It's 4 cycles for simple EAs, and as much as 16 cycles for more complex EAs. Note that the extra read is in the calculate EA timing, not the instruction timing. The calculate EA timing for (An), for example, states + 4 cycles + 1 read cycle. So the fastest clr <memory> is 12 cycles, 2 reads, and one write.
CLR is good for clearing a word register, but no better for anything else, and worse for memory.
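As a worked example of that arithmetic, the sketch below annotates a few clr.w forms with the instruction time plus the EA calculation time. The notation is total cycles(reads/writes); the per-mode EA costs are taken from the standard MC68000 effective address timing table, and the register choices are arbitrary.

```asm
; clr.w <memory> = 8(1/1) for the instruction + the EA calculation time
        clr.w   d0              ; register direct, no EA cost:  4(1/0)
        clr.w   (a0)            ; 8(1/1) + 4(1/0) = 12(2/1)
        clr.w   -(a0)           ; 8(1/1) + 6(1/0) = 14(2/1)
        clr.w   4(a0)           ; 8(1/1) + 8(2/0) = 16(3/1)
```

In each memory case the extra read comes from the EA calculation, which is why it does not show up in the bare instruction figure.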
Mask of Destiny wrote: The overall instruction duration is the same except for the case in which <ea> is a register direct mode. This is quite clear if you look at the microcode.
Where can I look at the microcode?
Mask of Destiny wrote: Yeah TAS is the only instruction to use the read-modify-write cycle and I don't think that cycle is any faster (or at least not significantly so) than two separate bus operations.
Yeah, I checked this cycle timing, and it looks like just a read and a write one after another without releasing the bus.
I don't know how I was computing the timings last time, but it's obvious that I was wrong.
Code:
           dn       an       (An)     (An)+    -(An)    (d16,An)  (d8,An,Xn)*  (xxx).W  (xxx).L
clr.w      4(1/0)   4(1/0)   12(2/1)  12(2/1)  14(2/1)  16(3/1)   18(3/1)      16(3/1)  20(4/1)
move.w #0  8(2/0)   8(2/0)   12(2/1)  12(2/1)  12(2/1)  16(3/1)   18(3/1)      16(3/1)  20(4/1)
move.w dn  4(1/0)   4(1/0)   8(1/1)   8(1/1)   8(1/1)   12(2/1)   14(2/1)      12(2/1)  16(3/1)
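Read as a recipe, the table says that when several words in memory must be cleared, loading zero into a data register once and storing it is cheaper per store than clr.w or move.w #0. A minimal sketch, assuming a hypothetical four-word buffer at an even address (the label Buffer and its address are placeholders):

```asm
; Hypothetical buffer of four words in RAM, for illustration only.
Buffer  equ     $00FF0000

ClearBuffer:
        lea     Buffer,a0       ; a0 -> first word of the buffer
        moveq   #0,d0           ; zero source, loaded once
        move.w  d0,(a0)+        ; 8(1/1) each, versus 12(2/1) for clr.w (a0)+
        move.w  d0,(a0)+
        move.w  d0,(a0)+
        move.w  d0,(a0)+
        rts
```

The one-time moveq is amortized over every store that follows, so the saving grows with the number of variables cleared.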
-
- Very interested
- Posts: 619
- Joined: Thu Nov 30, 2006 6:30 am
r57shell wrote: Where I can look at the microcode.
Check out this thread. Tasco Deluxe posted some links to patents in the first reply. One of them has a full listing of the micro and nanocode for a pre-production version of the 68000. There are some differences between that and the final microcode (for instance, this version has a different looping instruction called dcnt instead of dbra), but for most instructions it's probably the same. Until someone is able to determine the organization of the individual microcode bits on the die, this is the best we have.
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
r57shell wrote: I don't know how I was computing the timings last time, but it's obvious that I was wrong.
Everyone has a brain-fart now and then. I've said some really bone-headed stuff myself on occasion. I try not to get too defensive when people call me on it.
And I like that table. It makes it really easy to see which opcode you should use under which conditions.