One thing I like about the holidays is the chance to finish off hobby projects. Earlier this month, I released the first beta of the xum1541, which is a C64 USB floppy adapter. The first release supported basic functions to read and write disks via the OpenCBM utilities.
Today, I finished testing parallel nibbler support. The code is available in the OpenCBM CVS repository, and directions are on my xum1541 page. When used with nibtools, it can now copy protected disks and transfer data much faster than before. I’ve successfully tested both read and write support on Windows and Mac OS X. This is quite a milestone, as it is the first USB interface to support the parallel nibbler protocol.
A bit of explanation is in order. The built-in interface for a 1541 floppy drive is serial and has CLK, DATA, and ATN signals. It is a serial version of the parallel IEEE-488 bus, with conditions such as EOI signalled in-band instead of requiring separate wires. Commodore originally used IEEE-488 on the PET, but moved to the IEC serial protocol to cost-reduce the cables and avoid shortages in Belden’s supply. The serial protocol is slower than parallel, but the legendary slowness of Commodore drives had more to do with maintaining backwards compatibility with older drives than with the serial protocol itself. Third-party speeder cartridges fixed this in software by repurposing the serial signals for higher-speed signalling.
To get the full bitrate the drive mechanism is capable of, though, hardware modifications were required. Copiers such as Burst Nibbler added an 8-bit parallel cable in addition to the serial lines. This was relatively easy since there are two 6522 IO chips in the 1541 drive. Each has two 8-bit IO ports, and one of those ports is not normally used, so the parallel cable can be connected to the unused lines. Since the drive ROM does not use these lines, the copier has to load a custom routine into the drive’s RAM while initializing. That routine is then activated to manage the data transfer.
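As a rough illustration of loading a routine into the drive’s RAM, here is a minimal sketch that splits a blob of drive code into Commodore DOS "M-W" (memory-write) commands. The post does not say which mechanism the copier uses; "M-W" is simply the standard DOS command for this job. The function name, the load address $0500, and the 32-byte chunk size are assumptions chosen for the example.

```python
def memory_write_commands(load_addr: int, code: bytes, chunk: int = 32) -> list[bytes]:
    """Split `code` into "M-W" commands: "M-W", addr low, addr high, count, data.

    `chunk` is an assumed conservative payload size per command.
    """
    if not 0 <= load_addr <= 0xFFFF or load_addr + len(code) > 0x10000:
        raise ValueError("code does not fit in the 16-bit address space")
    commands = []
    for offset in range(0, len(code), chunk):
        part = code[offset:offset + chunk]
        addr = load_addr + offset
        header = b"M-W" + bytes([addr & 0xFF, addr >> 8, len(part)])
        commands.append(header + part)
    return commands


# Hypothetical 70-byte routine loaded at $0500 (an assumed address).
routine = bytes(range(70))
for cmd in memory_write_commands(0x0500, routine):
    print(cmd[:6].hex(), len(cmd) - 6, "data bytes")
```

Each command would then be sent over the drive’s command channel before the routine is started.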
When Commodore hardware died out, users still needed to transfer data to and from floppies. The X-series of cables was invented, using the PC printer port for interfacing. That worked for a while, until Windows NT and later made it harder to get accurate inb/outb timing, and then the DB25 printer port disappeared completely. USB established itself as the next great thing.
USB is high bandwidth but also high latency, so the bit-banging approach used with the printer port no longer works. It takes around 1 ms to get data to a USB device, no matter how small the transfer. Since the 1541 drive mechanism transfers data at 40 KB/sec, each byte lasts about 25 microseconds, much less than the latency. The xum1541 does all the handshaking with the drive in an AT90USB microcontroller running at 8 MHz, which gives precise timing. The data transfers to the host are done via a double-buffered hardware USB engine. It has a state machine that handles the actual USB signalling, so we can flip buffers while it is clocking data out to the host. This gives us the cycles we need for the drive.
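To make the numbers concrete, here is a small Python model of the timing and of double buffering. It is only a sketch, not the xum1541 firmware: the 64-byte packet size and all names are assumptions, and the "USB engine" is just a list that collects finished packets.

```python
DRIVE_BYTES_PER_SEC = 40_000   # figure quoted in the text
USB_LATENCY_US = 1_000         # roughly 1 ms per transaction, per the text
PACKET_SIZE = 64               # assumed packet size, for illustration only


def byte_period_us(bytes_per_sec: int) -> float:
    """Time between consecutive bytes coming off the drive, in microseconds."""
    return 1_000_000 / bytes_per_sec


def bytes_per_latency(bytes_per_sec: int, latency_us: int) -> int:
    """Bytes the drive delivers during a single USB round trip."""
    return int(latency_us / byte_period_us(bytes_per_sec))


class PingPongBuffer:
    """Two banks: one fills from the drive while the other is being sent."""

    def __init__(self, size: int = PACKET_SIZE) -> None:
        self.size = size
        self.banks = [bytearray(), bytearray()]
        self.active = 0                 # bank currently being filled
        self.sent: list[bytes] = []     # stand-in for the hardware USB engine

    def push(self, value: int) -> None:
        bank = self.banks[self.active]
        bank.append(value & 0xFF)
        if len(bank) == self.size:
            # Flip banks. Real hardware would clock the full bank out on its
            # own while the CPU keeps filling the other one.
            self.active ^= 1
            self.sent.append(bytes(bank))
            bank.clear()


print(byte_period_us(DRIVE_BYTES_PER_SEC))                     # 25.0
print(bytes_per_latency(DRIVE_BYTES_PER_SEC, USB_LATENCY_US))  # 40
```

About 40 bytes arrive from the drive during one round trip, which is why a per-byte handshake over USB cannot keep up.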
The protocol is actually pretty simple. The setup routines, such as which track to select, signal a byte is ready for the drive by toggling ATN, while the drive toggles DATA to acknowledge it has seen it. The custom drive code reads these bytes and then jumps to the appropriate handler. When it is done, it sends back a status byte via the same protocol.
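The handshake can be modelled in a few lines of Python. This is a simulation under assumptions, not the real firmware: it assumes the byte itself travels on the parallel port, steps the drive by hand instead of running it concurrently, and uses made-up class and function names.

```python
class Lines:
    """Simulated wires shared by the adapter and the drive."""

    def __init__(self) -> None:
        self.atn = 0        # toggled by the adapter: "a byte is ready"
        self.data = 0       # toggled by the drive: "I have seen it"
        self.parallel = 0   # assumed to carry the byte itself


class DriveModel:
    """Stand-in for the custom drive code's receive routine."""

    def __init__(self, lines: Lines) -> None:
        self.lines = lines
        self.last_atn = lines.atn
        self.received: list[int] = []

    def poll(self) -> None:
        if self.lines.atn == self.last_atn:
            return                       # no new byte signalled
        self.last_atn = self.lines.atn
        self.received.append(self.lines.parallel)
        self.lines.data ^= 1             # acknowledge by toggling DATA


def send_byte(lines: Lines, drive: DriveModel, value: int) -> None:
    """Put a byte on the port, toggle ATN, and wait for the DATA toggle."""
    data_before = lines.data
    lines.parallel = value & 0xFF
    lines.atn ^= 1
    drive.poll()                         # the real drive runs on its own
    if lines.data == data_before:
        raise TimeoutError("drive did not acknowledge the byte")


lines = Lines()
drive = DriveModel(lines)
for value in (0x01, 0x12):               # hypothetical command and track bytes
    send_byte(lines, drive, value)
print(drive.received)                    # [1, 18]
```

Because both sides signal with a toggle, not a level, neither has to return a line to idle between bytes.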
For the high-speed transfer, something even lighter weight is needed. The drive CPU is a 6502 running at 1 MHz, which at 25 microseconds per byte leaves about 25 clock cycles, or roughly 12 of the fastest instructions, per byte. The transfer is started with a handshaked read or write as above. Then the drive begins to transfer data one byte at a time, toggling the DATA line each time a byte is ready. The microcontroller stays in sync by waiting for each transition and then reading a byte from the parallel cable. Thus, the path from the initial handshake to the data transfer loop must be very quick and must then continue without interruption.
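The cycle budget and the transition-driven read loop can be sketched as follows. This is an illustrative model with made-up names: the "samples" stand for what the microcontroller would see on the DATA line and the parallel port each time it looks, and the 2-cycle figure is the cost of the shortest 6502 instructions.

```python
def cycles_per_byte(cpu_hz: int, bytes_per_sec: int) -> int:
    """CPU clock cycles available between two consecutive bytes."""
    return cpu_hz // bytes_per_sec


def max_instructions(cycles: int, cycles_per_instruction: int = 2) -> int:
    """Upper bound on instructions per byte, assuming the shortest ones."""
    return cycles // cycles_per_instruction


def capture_stream(samples, initial_data: int = 0) -> bytes:
    """Read one byte from the parallel port on every DATA transition.

    `samples` is an iterable of (data_line, parallel_port) pairs.
    """
    out = bytearray()
    last = initial_data
    for data_line, port in samples:
        if data_line != last:      # the drive toggled DATA: a byte is ready
            out.append(port & 0xFF)
            last = data_line
    return bytes(out)


budget = cycles_per_byte(1_000_000, 40_000)
print(budget, max_instructions(budget))    # 25 12

samples = [(0, 0x00), (1, 0x52), (1, 0x52), (0, 0x24), (0, 0x99), (1, 0xFF)]
print(capture_stream(samples).hex())       # 5224ff
```

Note that the port value 0x99 is ignored because DATA did not change: only a transition marks a new byte.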
The parallel transfer gets you something else besides high speed. Many protection schemes were built on the fact that the 1541 only has 2 KB of RAM, not enough to store a full track, which is up to 8 KB. If a mastering machine wrote a track pattern that had many similarities, ordinary copying software that read the track in pieces could not be sure it lined up properly when reassembling the pattern on the backup copy. The protection scheme, which could read and analyze the entire track in one pass, would detect this difference and refuse to run the game. To duplicate this kind of disk, users either added 8 KB of RAM to the drive or added a parallel cable. Both allow an entire track to be read in a single pass.
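A toy example shows why reassembling pieces is ambiguous. It uses a tiny 32-byte "track" in place of a real one, and the function names are invented for this sketch; the point is only that a piece taken from a repetitive pattern fits at many positions, while a piece containing a unique marker fits at one.

```python
import math


def pieces_needed(track_bytes: int, ram_bytes: int) -> int:
    """How many separate reads it takes if the track does not fit in RAM."""
    return math.ceil(track_bytes / ram_bytes)


def possible_offsets(track: bytes, piece: bytes) -> list[int]:
    """Every position on the circular track where `piece` could belong."""
    wrapped = track + track[:len(piece) - 1]   # a track is a ring, so wrap it
    return [
        i for i in range(len(track))
        if wrapped[i:i + len(piece)] == piece
    ]


print(pieces_needed(8 * 1024, 2 * 1024))       # 4

# A highly repetitive track with a single distinguishing byte.
track = b"\x55" * 16 + b"\xaa" + b"\x55" * 15
print(len(possible_offsets(track, b"\x55" * 4)))    # 28 candidate positions
print(possible_offsets(track, b"\x55\xaa\x55"))     # [15]
```

With the whole track captured in one pass, there is nothing to line up, so the ambiguity never arises.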
It was fun implementing this protocol because microcontrollers are a dedicated platform. You can count clock cycles for your instructions and get guaranteed latency. Compared to desktop PCs, where you’re running concurrently with questionable software written by people who definitely don’t count cycles, this is a dream. If you make a mistake, it is your fault. There is nothing like an SMI handler that could lock the CPU for seconds while it handles a volume button press.
Happy Holidays from all of us at Root Labs!