So I was wondering exactly how effective the XOR checksum that Commodore used for the amiga sectors was. If I read a sector, and perform the checksum, and the checksum matches the stored checksum, then the data is the same, right? Not always. I had expected it to be better, maybe not MD5 or SHA-1 level, but I expected it to be decent.
I had run into this paper a year ago when I was looking at my transfer checksums. But this certainly also applies to the amiga sectors, too.
Some good excerpts:
- the percentage of undetected two-bit
errors becomes approximately 1/k (k being checksum size), which is rather poor performance for
a checksum (i.e., 12.5% undetected errors for an 8-bit checksum, and 3.125% for a 32-bit checksum).
- The XOR checksum has the highest probability
of undetected errors for all checksum algorithms in this study…..
- There is generally no reason to continue the common practice of
using an XOR checksum in new designs, because it has the same software computational cost as an
addition-based checksum but is only about half as effective at detecting errors.
So I wanted to actually prove myself that XOR is that bad, looking at it from the amiga sector perspective, so I wrote a quick program to:
- Create a random block of data 512 bytes long
- Calculate the 32-bit XOR checksum based on 32-bit chunks at a time and store it as the last byte (as does the amiga)
- Select a number of bit inversions(basically corrupt the data in a specific way) which can affect data and/or stored checksum
- Recalculate the checksum of the modified block
- If the checksums MATCH, meaning that two different sets of data yield the same checksum, then this is an undetected error.
- Count the number of undetected errors vs total errors.
That paper lists the properties of an XOR checksum, and I wanted to compare results:
- Detects all single-bit errors. (my testing confirms)
- Fails to detect all two-bit errors. (my testing confirms, see below)
- If total number of bit errors is odd, XOR checksum detects it. (confirmed with 1,3,5,7 bit errors)
- Fails to detect any even number of bit errors that occur in the same bit position of the checksum computational block. (confirmed with 2,4,6,8)
The two-bit errors is really the case that I worry about. If two bit errors occur in the same bit position, and the inverted oppositely(1—>0 and 0—>1), then it won’t be detected. So how often does this happen with random data? Paper’s author, Maxino, says 3.125% of the time. I can definitely confirm this. My testing shows 1.8 million hits over 56 million tries. There are some differences with Amiga data, but I think the results can be the same. I might load some example amiga sectors and try them.
Once the number of bit errors increase, the probability of them happening in the same bit position, in the same direction, and occurring evenly on each position goes down —- and therefore the overall chance of undetection decreases as well.
4 bit undetected errors happen 0.3% of the time.
6 bit undetected errors happen 0.04% of the time.
8 bit undetected errors happen 0.009% of the time.
I have other testing to do, including running burst error tests.
So for each sector, you really don’t want to see only two bit errors occur. A single bit error, sure. Multiple bit errors, ok, because we can detect them. You don’t want to think you have a good copy when it fact the checksum passing was a false positive.