Anachro-mputing blog

retro and not-so-retro computing topics

  • Updates

    2024 Nov 30 BDOS disassembly that can be rebuilt with NOWUT

    BDOS is the main memory-resident portion of CP/M, occupying 3.5KB of space (compared to only 2KB for the command processor, which can be overwritten during program execution). It provides the console and file I/O APIs through a single entry point, invoked by CALL 5.

    By disassembling an original CP/M 2.2 binary and doing some manual cleanup, I've produced a source file in NOWUT Z280 syntax. It can be relocated to a different base address during linking.

    Fixing the stack pointer alignment is definitely the low-hanging fruit, in terms of performance tweaks for the Z280 (when the Z280 is using a 16-bit data bus). The 'seek' routine can be made much smaller, but I didn't find it to be a performance bottleneck, at least during sequential file I/O.

    Download here

    2024 Nov 21 Windows Root Certificates

    I mentioned previously that I suspected Windows File Protection was being triggered because I deleted some expired root certificates (in Control Panel / Internet Options). Well, then I noticed that Windows Media was broken somehow, causing a number of games to crash when attempting to play their intro movie. Both of these issues were fixed by reverting the change.

    I have no idea where this data is stored. Doing a brute-force search of the filesystem and the registry yielded nothing. I had to restore a recent backup of the entire WINNT directory (which is often a good thing to have) to get those certificates back.

    All the certificates have an expiration date, meaning new ones will need to be added regularly, but you're not allowed to delete old ones? It seems that this data is doomed to grow endlessly. A poor design choice, IMO.

    2024 Nov 19 looking at CP/M code

    A number of programs I've downloaded, from sites like the *HUMONGOUS* CP/M Software Archives, have crashed when I tried to run them. Some of those triggered a 'privileged instruction' exception on the Z280. Are the files corrupt? Do they need CP/M 3 or a specific hardware configuration? Some CP/M programs have source code available, but it tends to use Intel 8080 syntax which I am unfamiliar with. I decided to make a Z80/Z280 disassembler so I could investigate and modify things.

    One thing I noticed was an unfortunate habit of setting the stack pointer to odd addresses. This is bad for Z280 performance, since PUSH and POP instructions transfer 16-bit data. But I also found, in CP/M itself, a DI opcode followed by a HALT. It seems this is designed to freeze the system when some bytes preceding the BDOS entry point are found not to match expected values, for instance if a program has too large a memory footprint and overwrites the beginning of BDOS. This would be one possible explanation for 'privileged instruction' exceptions.

    Another thing I saw was the 8080 version of a memory copy routine (since it does not have the LDIR instruction):
    lc00D2:
            dec c                                 ; $0000E750 0D
            retz                                  ; $0000E751 C8
            ld a,[de]                             ; $0000E752 1A
            ld [hl],a                             ; $0000E753 77
            inc de                                ; $0000E754 13
            inc hl                                ; $0000E755 23
            jp lc00D2                             ; $0000E756 C3 50 E7
    It might be fun to optimize things like this using Z280 instructions.
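
    To make the behavior of that loop explicit, here is a small Python model of it. This is only a sketch: the function name and the flat `mem` array are made up for illustration. It shows that because the DEC C comes before the copy, the routine moves one byte fewer than the value in C. Note also that Z80 LDIR copies from (HL) to (DE) with the count in BC, so a drop-in replacement would presumably need the registers swapped and the count adjusted.

```python
def copy_8080(mem, de, hl, c):
    """Model of: dec c / retz / ld a,[de] / ld [hl],a / inc de / inc hl / jp."""
    while True:
        c = (c - 1) & 0xFF          # dec c (8-bit register wraps around)
        if c == 0:                  # retz: return once the counter hits zero
            return de, hl
        mem[hl] = mem[de]           # ld a,[de] / ld [hl],a
        de = (de + 1) & 0xFFFF      # inc de
        hl = (hl + 1) & 0xFFFF      # inc hl


mem = bytearray(16)
mem[0:4] = b"ABCD"
end_de, end_hl = copy_8080(mem, de=0, hl=8, c=5)   # C=5 copies 4 bytes
```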

    2024 Nov 1 Windows File Protection

    According to Wikipedia, Windows File Protection is a component of Windows 2000 and XP which is designed to silently restore certain system files whenever they are overwritten by wayward software. It goes on to explain that the system was scrapped with Windows Vista (and replaced with the whole insanity that is winsxs, making NTFS a requirement for the OS to run), though one would never know it from doing a web search for 'Windows File Protection' and getting mostly results about Windows 10 and 11.

    The preferred versions of system files are supposed to be sourced from the system32\dllcache directory. It's not clear what determines which files go in there in the first place. In the past, when I've triggered WFP by way of WinDBG kernel debugger (and inserting breakpoints into code in memory), I just get a prompt wanting me to insert the Windows CD. It never says what file it is looking for, so I can't be sure whether it is something in system32\dllcache or not.

    One of my PCs now has a WFP prompt every time it boots. I'm guessing this resulted from me deleting some expired root certificates. Maybe WFP needed one of those to check file signatures? Not worth experimenting with at the moment. I just click cancel and move on.

    Now I'm dealing with a laptop where the audio driver refuses to work (and it's not because of WDMAUD.DRV this time). Every time I tried installing one driver or another, I'd see in the system event logs several notices that WFP had preserved some files. (KSCLOCKF.AX and a few others like that.) The files are dated 1999, so there's no way the drivers were trying to install even older versions. Could this be why the audio driver doesn't work? (It always fails with the generic 'code 10' 'device can't start'.)

    So I went looking for a way to disable WFP to see if that would fix the audio. Some websites claim it can be disabled with a registry value:

    HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\SFCDisable
    But this does NOT work. One site has a DLL patch for disabling WFP in Windows XP (though this DLL doesn't exist in 2000): https://www.overclockersclub.com/guides/disablesystemfilechecker/

    What I did was simply rename SFC.EXE. No more WFP messages in the system event log after that. Though oddly enough, KSCLOCKF.AX also doesn't get replaced with a different version when installing audio drivers. It doesn't get installed at all. I deleted the files and they stayed deleted after installing drivers. And the audio still doesn't work. WTF.

    2024 Oct 16 Z280 + CP/M + working hard disk

    I was helpfully informed by posters on VCF that CP/M 2.2 has an 8MB partition limit, on account of using 16 bits to specify a (128-byte) record. So I made an 8MB partition on my vintage WDAC2340, and it somewhat worked but had big problems. I fixed some bugs in the IDE driver code, but still had problems. It was recommended that I use the PIP command to copy files to the HDD with verification.
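
    The 8MB figure falls straight out of the numbers above. A quick check in Python (the constant names are just for illustration):

```python
RECORD_SIZE = 128        # bytes per CP/M record
RECORD_BITS = 16         # width of the record number

max_bytes = (1 << RECORD_BITS) * RECORD_SIZE
max_megabytes = max_bytes // (1024 * 1024)
print(max_bytes, max_megabytes)
```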

    PIP b:=a:*.*[v]

    I did this and it failed every time. When I read back one of the corrupt files and compared it with the original I found that sectors occasionally had a byte missing. Every subsequent byte in the sector would be shifted ahead by one and a garbage byte would be inserted at the end. When reading the corrupt file from the HDD again, errors would show up in different places. I immediately suspected unreliability of the ISA interface, which had already been evident from the wrong pixels in the VGA demo (though the VGA demo uses memory cycles while the IDE interface uses I/O cycles, so it wasn't guaranteed that they'd share the same issue). I also confirmed that IORDY is driven when reading the IDE data register.
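
    The corruption pattern described above (one byte lost, the rest of the sector shifted, a garbage byte at the end) is easy to test for when comparing a read-back sector against the original. The following is a minimal sketch, not the tool I actually used; `find_dropped_byte` is a hypothetical helper name.

```python
def find_dropped_byte(original: bytes, readback: bytes):
    """Return an index where one byte appears to have been dropped, or None.

    Matches the pattern: readback equals original with one byte removed,
    everything after it shifted by one, and one arbitrary byte appended.
    """
    if len(original) != len(readback):
        return None
    n = len(original)
    for i in range(n):
        if original[i] != readback[i]:
            break
    else:
        return None                      # sectors are identical
    # Everything after the first mismatch must match the original shifted by one.
    if readback[i:n - 1] == original[i + 1:]:
        return i
    return None


sector = bytes(range(256)) * 2                       # 512-byte test pattern
corrupt = sector[:100] + sector[101:] + b"\xEE"      # drop byte 100, pad with garbage
print(find_dropped_byte(sector, corrupt))
```

    If the dropped byte sits inside a run of identical bytes, the reported index is the first position that is consistent with the data, which may not be the exact byte that went missing.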

    So I reprogrammed the CPLD with a modification intended to solve this, by holding the Z280's /WAIT signal low longer than IORDY. This reduced the number of wrong pixels in the VGA demo (they are now less than one in a million) but failed to fix file corruption. I also tried adding waitstates to I/O cycles, with no improvement.

    Just for the heck of it, I connected a different drive, a WDAC1170. PIP now ran without errors!

    Certainly the 340MB drive is not defective, since I tested it beforehand in my 286, but some kind of quirk of my Z280 board apparently doesn't agree with it. I would have thought the Winbond 83757f chip on the 'super I/O' card was the more important node in dictating timing. But somehow changing to another HDD made a difference.

    Some source code:

  • imagemak.no NOWUT source for a Win32 program that builds a CP/M disk image
  • zsys.no NOWUT Z280 source for CP/M 2.2 BIOS with IDE driver (work in progress)

    What next? Now that I have a local mass storage device, there are plenty more projects to be done. Booting CP/M from HDD. More partitions. Reading up on how FCBs work so I can make CP/M programs. Monkeying with the IDE driver to improve performance. And if possible, exchanging the super I/O card for a sound card that has an IDE port, so I can have both sound and IDE with one slot.

    2024 Oct 11 MOAR DRAM and Z280 updates

    This little ISA card is going to give my 286 8MB of RAM, after I produce the verilog code for the CPLD.

    (If you're wondering why I've installed a decoupling capacitor next to a footprint where no IC or socket has been installed, you're not alone...)

    Since my 286 is only 8MHz (though overclocking to 10MHz might be possible), running with 0 waitstates shouldn't be too complicated.

    As for the Z280 keyboard situation, it is now solved (maybe for good). I found another issue beyond that of debouncing/filtering the keyboard signals. Because of an oversight in the CPLD code, /IOR was being asserted when the CPU read the keycode from the CPLD. No ISA cards are using the same port address, but the ISA data buffer was turning on and fighting the CPLD during these bus cycles. The bus buffer was originally a 74LS245, but recently I swapped it with a 74ALS245 to see if it would fix the wrong pixels in my VGA demo (which it didn't). The CPLD could pretty well overpower a 74LS part attempting to drive a 'high', since those tend to output only 400uA. But the 74ALS has a higher current than that, so bits were being read high by the CPU when they should have been low.

    With the keyboard working again, I can resume trying to access a 3.5" IDE drive from the Z280. In fact I can already read and write sectors using CHS addressing and a lame loop of IN or OUT instructions. But there may be an occasional data corruption issue, similar to the wrong pixels when writing to the VGA card. Needs more debugging. (I don't think I can use INIR or OTIR to transfer sector data, because I need the B register to hold the high byte of the IO address. In theory I should be able to use the Z280 DMA controller though.)

    2024 Oct 8 mass storage for the Z280

    Running CP/M on the Z280 board is currently constrained by the size of RAMdisk I can make, and the need to transfer data into and out of it via serial. A 512KB disk image doesn't go super fast, even at 115kbps. I originally planned to try connecting an SD card dongle to my DIAGS board to enable a form of mass storage, but haven't been able to do so because of a lack of usable 44-pin CPLDs.

    What are some other options? I considered making a miniature 'hardcard', an 8-bit ISA card which could mount a 2.5" 44-pin IDE drive. That would be cute, and would avoid needing any ribbon cable, but I have no idea where to get the 44-pin connector except by finding an old laptop that has one and is worthless enough to steal from. I would also need to decide how to interface the 16-bit ATA interface to the 8-bit slot. Using the low byte only, losing half the disk space and complicating interoperability with other systems, feels messy but works fine. And it should work with the Z280's DMA controller. Latching the high bytes, so they can still be transferred, is possible but implementing it in the best way to facilitate fast data transfer might be a little complicated and require several ICs (or another big CPLD, which I'd rather avoid).
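
    To illustrate the low-byte-only option: each 512-byte sector carries only 256 useful bytes, and another system reading the same sector 16 bits at a time would see the data interleaved with filler. The sketch below assumes the low byte of each word comes first in the sector (as a PC sees it) and shows the unused high bytes as zero; what really ends up in them depends on the hardware. The function names are made up.

```python
SECTOR_BYTES = 512

def pack_low_bytes(data: bytes) -> bytes:
    """Spread 256 data bytes over the low byte of each 16-bit word."""
    if len(data) != SECTOR_BYTES // 2:
        raise ValueError("expected exactly 256 bytes")
    sector = bytearray(SECTOR_BYTES)     # high bytes shown as zero (assumption)
    sector[0::2] = data
    return bytes(sector)

def unpack_low_bytes(sector: bytes) -> bytes:
    """Recover the 256 data bytes from a sector written low-byte-only."""
    return bytes(sector[0::2])


payload = bytes(range(256))
raw = pack_low_bytes(payload)
```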

    I also ran across DIP-32 DiskOnChip devices on eBay. I hadn't seen these before, and at first glance they look ideal for a small retro machine like this. But they are old chips and the reliability of the flash memory at this point is questionable. Worse, the data sheet I found does not have enough information on how to use it. The device contains an x86 PC option ROM which is supposed to emulate a disk, but this obviously isn't going to work on Z280.

    I briefly considered bit-banging ATA through the parallel ports on my YM2203 ISA card, but that would be slow and require ugly wiring.

    I decided to just plug in a normal ISA IDE card, and try to write some IDE driver code that uses 8 bits only. At least I have plenty of these around already, and plenty of 40-pin IDE drives.

    In preparation, I soldered the second ISA socket onto the PCB, and while I was at it, I used a wire to connect an additional address line from the Z280 to the '7128. Despite holding 2MB of DRAM, the board was laid out with only 20 address lines connected due to a shortage of pins. In the end one pin went unused, though, so I wanted to use it to get direct access to the whole 2MB.

    After changing the CPLD code to add the additional address line, suddenly there are big problems with the keyboard! I can only assume that chip resources have been reallocated in the process of compiling the verilog code, such that the logic which was working by sheer luck before has now failed. I am going to need some more robust debouncing of the keyboard signals.

    Cleaning things up at the analog level doesn't seem possible. It has 5.6K pullup resistors. Adding a 0.01uF capacitor does not fix the ringing but makes the rise time very slow. A 1K pullup improves the rise time but is already causing a noticeable rise in the 'low' voltage level. A 0.1uF capacitor removes the ringing but the rise time becomes hopeless.
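
    Some rough numbers back this up. For a pullup resistor charging a capacitor, the 10%-90% rise time is about 2.2 x R x C. The sketch below uses only the part values mentioned above and ignores cable and pin capacitance, so treat it as an estimate. For comparison, a keyboard clock of 10 to 15kHz has a period of roughly 67 to 100 microseconds.

```python
def rise_time_us(r_ohms: float, c_farads: float) -> float:
    """Approximate 10%-90% rise time of an RC pullup, in microseconds."""
    return 2.2 * r_ohms * c_farads * 1e6


for r in (5600, 1000):
    for c in (0.01e-6, 0.1e-6):
        print(f"{r} ohm, {c * 1e6:.2f} uF: {rise_time_us(r, c):.1f} us")
```

    Even the 5.6K / 0.01uF combination comes out longer than a whole clock period, which fits with the rise time looking very slow on the scope.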

    I found a post from someone else about trouble with PS/2 keyboard signals. https://jacob.jkrall.net/ps2-on-65c02

    2024 Sep 24 CP/M 2.2 running on Z280

    I put working VGA and keyboard code into the EPROM, along with a menu that enables some primitive functionality like serial transfers and tweaking memory. Since that made it possible to power on the board and see signs of life without relying on the serial port, I could experiment with different CPU clocks. Though the only oscillator I have handy, besides 16 and 24MHz, is 28.6MHz. So I put that in, and yes the Z280 can run at 14.3MHz. It also lets it run the UART at 111860bps which is close enough to 115200 that it communicates with the PC just fine.
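
    How close is "close enough"? A quick sketch of the mismatch, using only the two rates quoted above (the function name is arbitrary). Asynchronous serial links are generally said to tolerate a few percent of clock mismatch, though the exact margin depends on the UARTs at both ends.

```python
def rate_error_percent(actual_bps: float, nominal_bps: float) -> float:
    """Relative difference between the actual and nominal bit rates, in percent."""
    return (actual_bps - nominal_bps) / nominal_bps * 100.0


error = rate_error_percent(111860, 115200)
print(f"{error:.2f}%")
```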

    After doing some reading into CP/M, and writing a small BIOS, I was able to load this vintage operating system into a RAMdisk on the Z280 board and start it!

    There is a lot of CP/M software I can now try to run. As CP/M 2.2 has no time API however, there are not many benchmark programs...

    2024 Sep 9 Solution to an annoying and obscure problem

    A while ago, I replaced a 500GB HDD in a PC with a 750GB HDD. I made a straight sector copy from the old to the new, because this is the easiest way to end up with a disk that boots (without having to hack boot sectors or reinstall the OS). The downside of a sector copy is that the partitions stay the same size. But aha! This disk had two partitions, so I just keep the first (bootable) one the same as it was, and remake the second one to fill the additional HDD space.

    This usually works, but something went a little weird this time.

    While booting up, Windows wanted to run CHKDSK. But this message came up: "The specified disk appears to be a non-Windows 2000 disk. Do you want to continue?" (This message text exists in ULIB.DLL, and in the Win XP version of the DLL the text has changed to "...non-Windows XP..." so I'm guessing other Windows versions may be susceptible to this same kind of problem.)

    I get no opportunity to respond to this prompt. 'No' is automatically chosen, which means CHKDSK doesn't run. This is not good, because after Windows has booted it is not possible to run CHKDSK (or the equivalent GUI option) on the boot partition since the OS itself has a zillion open file handles on it.

    The message must exist for some reason, but it gives no hint as to what that reason is. It's also arguably wrong on its face. How could the disk that Windows 2000 just booted from be a "non-Windows 2000 disk"? I compared the MBR and boot sectors between the new HDD and the previous HDD to look for discrepancies but found nothing.

    The issue remained unresolved for some time, until I noticed that I couldn't use my wireless ethernet card because the directory which had contained the configuration utility had gone missing. Then I had the idea of booting to (Win98) DOS and running SCANDISK. I couldn't remember if it supported FAT32 but tried it and found that it does. The first thing it complained about was the 'media descriptor byte', before fixing various other things. Afterward, Windows wouldn't boot because it claimed the SYSTEM registry hive was corrupt. I replaced it with SYSTEM.ALT. Then Windows booted, but still prompted to run CHKDSK and still declined to actually run it, as before.

    What exactly is the 'media descriptor byte'? For HDDs, it is supposed to be $F8, and it is located both in the boot sector (at offset $15) and the first sector of the FAT (at offset 0). I checked this and it wasn't the problem. But while looking at the FAT, and comparing it to other partitions I finally found THE issue:

    The second 32-bit word contains $0FFFFFFF. But the other computer has $FFFFFFFF. How did it get that way and why does it matter? I have no idea. But I changed it to $FFFFFFFF, and now the whole ordeal is over :) excepting any additional fallout from cross-linked files and lost cluster chains.
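
    For anyone wanting to check the same fields, here is a minimal Python sketch that pulls them out of raw sector dumps. It only uses the offsets given above; the sample bytes are synthetic, and the value shown for the first FAT entry is an assumption for illustration rather than something copied from my disk.

```python
import struct

def inspect_fat32(boot_sector: bytes, fat_sector: bytes) -> dict:
    """Extract the media descriptor bytes and the second 32-bit FAT word."""
    fat0, fat1 = struct.unpack_from("<II", fat_sector, 0)   # little-endian words
    return {
        "media_boot": boot_sector[0x15],   # media descriptor in the boot sector
        "media_fat": fat0 & 0xFF,          # media descriptor at FAT offset 0
        "fat1": fat1,                      # the word that differed between disks
    }


# Synthetic example data (not from a real disk)
boot = bytearray(512)
boot[0x15] = 0xF8
fat = struct.pack("<II", 0x0FFFFFF8, 0x0FFFFFFF) + bytes(504)

info = inspect_fat32(bytes(boot), fat)
print({k: hex(v) for k, v in info.items()})
```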

    2024 Aug 28 Random thing that I noticed

    So I was looking at keyboard scancode and ASCII charts and seeing this:

    Then I look down at the row of numbers on the keyboard:

    How about that, eh? There's a correspondence between the numerals, shifted characters, and ASCII values, with the exception of the zero being moved to the other end. However, this is a Japanese keyboard. The same correspondence doesn't exist on a US keyboard.
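
    The pattern can be written down in a couple of lines: subtracting $10 from the ASCII code of each numeral gives the shifted character. The sketch below assumes the standard JIS and US layouts for the number row; the variable names are arbitrary.

```python
DIGITS = "123456789"

# Shifted character predicted by "ASCII code of the numeral minus $10"
predicted = "".join(chr(ord(d) - 0x10) for d in DIGITS)

US_SHIFTED = "!@#$%^&*("      # shifted number row on a US layout, keys 1-9
us_matches = sum(p == u for p, u in zip(predicted, US_SHIFTED))

print(predicted)
print(us_matches, "of", len(DIGITS), "keys match on a US keyboard")
```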

    2024 Aug 25 NOWUT version 0.34 release

    This version fixes several bugs. Aside from the ones noted in previous blog updates, I discovered that 'datadumps' could, in fact, be inserted in the middle of an assembly code routine, despite what NOWUT.TXT said. This behavior was unintentionally present since version 0.21. This is now fixed, and datadumps have been tweaked to hold more values (up to 48). On the ARM CPU they can also be further apart, owing to the 12-bit relative addressing. A new statement, FLUSHIMM, allows code to force datadump placement (if any) at a particular location. This is the only new feature.

    Check the documentation. Download the complete archive.

    And be sure to obtain Go Link from the Go Tools website.

    2024 Aug 13 The latest news about the oldest stuff...

    Previously I reported about the process of repairing my quite old washing machine when it would no longer 'spin'. (Later it also had a seized drain pump which snapped the drive belt, and was repaired again.) Now an issue has arisen with the matching gas dryer, giving me cause to go off-topic again.

    The dryer was no longer heating up. Popping open the lower front panel, I could see the igniter cycling on and off but the gas evidently was not flowing. This valve assembly, FSP 688631, has two solenoids which both need to energize and retract an internal, spring-loaded plunger for gas to flow. The one in the center has failed.

    2024 Jul 21 5x Disassemblers release 0.98

    Unsuper, Revmips, and Disarm are NOWUT-syntax disassemblers for SuperH, MIPS, and ARM, respectively. These support raw binary input only. I mostly used them for debugging NOWUT itself, and a little bit of snooping inside other code.

    DeHunk is the disassembler for 68000/68010 code. It supports Amiga Hunk, Human68K, Atari .PRG, and raw.

    PEon is the x86 disassembler. It is the most complex one, yet still has some issues. It is intended to support raw, MZ, and PE. In the case of PE, it attempts to use import and export tables, and relocation data to improve the disassembly.

    Win32 builds, i386 Linux builds, and source: 5x disassemblers

    2024 Jul 4 latest Z280

    With a 12MHz CPU clock, the Z280 can run its serial port at 37500bps. I've tested this and it seems to be close enough to the standard rate of 38400bps to work with the PC. It's a nice boost over the 14400bps I have been using.

    With changes to the CPLD logic, the keyboard is working without problems. And now, I have 80x25 text mode working on the Trident card. In theory, I should be able to write some new code for the EPROM so the system can boot to a menu or prompt. Then I can make a server program on the PC side that allows the Z280 system to fetch programs.

    2024 Jul 2 microscopes

    I thought it would be fun to interface a DSLR to a microscope and get high-res images of minuscule things. Working out how to do this took a lot of web searches and some experimentation. I already had the DSLR, so that's a given. I did not have a microscope. A few articles raised the possibility of assembling some adapters/tubes/bellows to put a microscope objective on a camera, but that would leave the (nontrivial) problem of getting the camera properly positioned with respect to the subject for long enough to take a photo, when the area of interest and the depth of field are tiny.

    So I bought a cheap microscope to play with. I haven't seen any information about it online, but it looks very similar to other microscopes with different branding that were made in Japan in the '60s. It also appears to conform to the JIS standard, with 170mm tube length and 36mm parfocal length. It has 10x 0.25, 20x 0.40, and 40x 0.65 objectives, and a 5x eyepiece.

    Next I got this set of adapters: Canon EF to 42mm thread, and 42mm thread to 23mm tube. The 23mm part fits inside the microscope tube, in place of the eyepiece. The obvious problem with this setup is that the camera's sensor is going to be a couple inches too far away, beyond where the eyepiece would be. That's probably bad, though I'm not sure how bad.

    I thought it would be good to get the camera closer by unscrewing the top section of tube and then inserting the adapter. However, the diameter of the bottom section is slightly smaller, and has a screw in the way. I put the adapter on my lathe and turned it down a bit, and then cut a slot so it would go in.

    While the JIS standard has 170mm as the tube length, I've also seen a separate specification of 146.5mm for the image distance. AFAICT, the latter distance would be the correct location for the camera sensor focal plane (which is marked on the outside of my camera). With my current setup it is somewhere in between 146.5mm and 170mm.

    The next problem is that I don't have any slides. I have an integrated circuit die which I wanted to look at, but it can't be lit from underneath like a slide can because it isn't transparent. At first, I had it sitting on a piece of paper. That was bad, because the paper reflected ambient light all over the place which illuminated the inside of the microscope tube. The inside of the tube is NOT blackened. This stray light made the image hazy and low contrast. Setting the die on a dark surface worked better.

    Now I can take a blurry, distorted die photo at 10x magnification. It's nearly impossible to get enough light for a photo at 20x even with 45W CFL bulbs close in on either side. Better die photos most likely will require upgraded equipment, but in the meantime I'll have to try putting other things under there and see what they look like.

    2024 Jun 28 MIDI to VGM translator, beta 4

    Update to this program:

  • output file ends with a 0x66 code (fixes VGM ROM CREATOR)
  • new X68000 build, and X68000 player for YM2151 VGMs
  • supports vibrato using controller #1 (modulation wheel) and XM2MID can convert 4xx vibrato effects
  • SN76 drums are less buggy and terrible
  • non-DOS builds allocate more memory to avoid "reproc overflow"
  • some other tweaks, new options, and a README file
  • MID2VGM-beta4 download

    2024 Jun 23 more NOWUT bugs, and Z280 keyboard port

    I've bumped into another NOWUT bug, this time pertaining to data relocations:

            dd symbol123
    NOWUT is supposed to generate a COFF relocation so that the linker will patch in an address here. It used one relocation type for little-endian targets and another for big-endian. In some cases (like ARMLE), this fails to produce the type that LINKBIN now expects.
            dw symbol280
    A closely related problem is that on the Z280, where addresses are generally limited to 16 bits, NOWUT was nevertheless allocating 32 bits for them.
    vectable:
            dw 0,nullint280                ; $00 reserved
            dw 0,nullint280                ; $04 NMI
            dw 0,nullint280                ; $08 int A
            dw 0,nullint280                ; $0C int B
            dw 0,keyboardint               ; $10 int C
            dw 0,nullint280                ; $14 counter/timer 0
            dw 0,nullint280                ; $18 counter/timer 1
            dw 0,nullint280                ; $1C reserved
            dw 0,nullint280                ; $20 counter/timer 2
            dw 0,nullint280                ; $24 DMA 0
            dw 0,nullint280                ; $28 DMA 1
            dw 0,nullint280                ; $2C DMA 2
            dw 0,nullint280                ; $30 DMA 3
            dw 0,nullint280                ; $34 UART receive
            dw 0,nullint280                ; $38 UART transmit
            dw 0,nulltrap                  ; $3C single-step trap
            dw 0,nulltrap                  ; $40 breakpoint-on-halt trap
            dw 0,sysreset                  ; $44 division exception
            dw 0,sysreset                  ; $48 stack overflow warning
            dw 0,sysreset                  ; $4C page fault
            dw 0,nullint280                ; $50 system call trap
            dw 0,nulltrap                  ; $54 privileged instruction trap
            dw 0,sysreset                  ; $58 EPU <- memory trap
            dw 0,sysreset                  ; $5C memory <- EPU trap
            dw 0,sysreset                  ; $60 A <- EPU trap
            dw 0,sysreset                  ; $64 EPU internal trap
    So my nice Z280 interrupt vector table was badly mangled. After correcting it, I now have a working keyboard interrupt on the Z280. Although the story doesn't end there, as I've noticed that occasionally the keyboard data coming in is corrupt. As of right now I'm going to blame it on this ringing in the signals:

    I might have to refrain from trying to clock flip-flops in the CPLD with the keyboard clock directly.

    2024 Jun 7 Z280 board PS/2 keyboard port

    For the cost of ~20 macrocells in the 7128 I've made the keyboard port on my Z280 board functional, at least in one direction. There is no provision for sending commands from the host to the keyboard. The verilog code uses a shift register to record data bits whenever the keyboard drives the CLK line low. (AT / PS/2 keyboard protocol is similar to I2C, in that it uses a pair of open-collector clock and data lines.) There is also a 4-bit counter to keep track of which bit in the sequence we are expecting to arrive next. If we are expecting a start bit, and the data line is not low (like it is supposed to be) then we ignore it and start over. A wrong stop bit similarly causes a restart.
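
    The receive logic described above can be modeled in software. This Python sketch is not a translation of the verilog, just the same idea: a bit counter, a shift register, and a restart on a bad start or stop bit. It assumes the usual 11-bit frame (start, 8 data bits LSB first, odd parity, stop) and, as an assumption of this sketch, skips the parity check. The function names are made up.

```python
def make_frame(byte: int) -> list:
    """Build the 11 data-line samples a keyboard would send for one byte."""
    data = [(byte >> i) & 1 for i in range(8)]       # LSB first
    parity = (sum(data) + 1) % 2                     # odd parity
    return [0] + data + [parity] + [1]               # start, data, parity, stop

def decode_ps2(samples) -> list:
    """Decode data-line samples taken on each falling edge of the clock."""
    received = []
    count = 0        # which bit of the frame is expected next (0 = start bit)
    shift = 0
    for bit in samples:
        if count == 0:
            if bit == 0:             # valid start bit, begin a new byte
                count, shift = 1, 0
            # otherwise ignore the sample and keep waiting
        elif count <= 8:
            shift |= bit << (count - 1)
            count += 1
        elif count == 9:
            count = 10               # parity bit, not checked in this sketch
        else:
            if bit == 1:             # valid stop bit, accept the byte
                received.append(shift)
            count = 0                # a wrong stop bit just restarts
    return received


print([hex(b) for b in decode_ps2(make_frame(0x1C) + make_frame(0xF0))])
```

    The timeout for incomplete bytes is not modeled here.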

    To ensure it doesn't go out of sync and start joining half of one byte with half of the next, I implemented a timeout, so incomplete bytes are dropped after a while. The keyboard bitrate is supposed to be 10~15KHz, but creating a divider to get from the 6MHz bus clock to there would have taken too many macrocells. So instead, I save the two highest bits of the DRAM refresh counter, and then compare those against the current count to determine if a keyboard byte has taken too long to finish.

    Aside from the verilog code, the port also needed to be rewired. Much like the DE-9 serial port, the pin assignments came out backward. If I ever make a second revision of the Z280 board, there are more than a few things to be fixed on it. (Though I would rather get a 68-pin NEC V60 and make a board out of that.)

    So the 7128 collects data, transfers a byte to the data bus flip-flops, lowers the /INTC line, and then the CPU can read the byte from an IO port. My Z280 code can tell that INTC is active by polling the 'interrupt pending' bit in control register $16. Somehow I should be able to get an actual interrupt service routine to start, but I have not yet succeeded at this.

    2024 Jun 2 another NOWUT bug

    The COPYBYTES statement produces an infinite loop on ARM CPUs...

    2024 Jun 1 plan for Z280 board memory banking

    In order to run NOWUT code with faked 21-bit addressing on the Z280 board, I might be able to convert a linear address to a segment/offset pair like this, beginning with a valid 21-bit address in DE/HL:

    ld b,d
    ld c,e
    out [c],h
    res 7,h

    DE/HL will be used to contain 32-bit values, like DX/AX are on 8086. When there is a valid 21-bit address (to access the Z280 board's 2MB RAM) in DE/HL, the byte in D will be all zeros. So the first thing I'll do is load B with D to make it zero, using a one-byte opcode. The reason for this is that the value in B drives address bits 8-15 when doing the OUT instruction. If I don't clear B to zero, then it is possible that I could write garbage data to a valid IO port on an ISA card (which generally decodes 10 address bits for IO cycles).

    C is copied from E, and therefore contains a value in the range 0-31. Here I'm going to use the old "the address is the data" trick, and write H to whatever IO port this is. When the CPLD sees an IO write to a port in the 0-31 range, it can take these 5 address bits and combine them with bit 7 from the data bus (from the H register) to make a 6-bit bank number which selects a 32KB bank. Next I'll clear bit 7 in H so that (HL) points to the first 32KB in the CPU's 16-bit logical address space, where the selected bank is now accessible.

    The second 32KB will remain unswitched, for use by code and stack, or only stack when using the separated program/data MMU feature.
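
    Here is the same address split modeled in Python, as a sanity check on the arithmetic. The way the CPLD combines the five port address bits with data bit 7 (port bits as the upper part of the bank number) is an assumption of this sketch, as is the function name.

```python
def split_address(addr: int):
    """Split a 21-bit linear address the way the four instructions would."""
    if not 0 <= addr < (1 << 21):
        raise ValueError("not a valid 21-bit address")
    e = (addr >> 16) & 0x1F          # register E, copied to C: IO port 0-31
    h = (addr >> 8) & 0xFF           # register H, written to the port as data
    l = addr & 0xFF                  # register L
    bank = (e << 1) | (h >> 7)       # 6-bit bank number (assumed bit order)
    offset = ((h & 0x7F) << 8) | l   # HL after res 7,h: within the first 32KB
    return e, h, bank, offset


port, data, bank, offset = split_address(0x018000)
print(port, hex(data), bank, hex(offset))
```

    With this bit order, bank x 32KB + offset gives back the original linear address, so consecutive addresses stay consecutive in memory.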

    Only six bytes worth of fiddling around. The equivalent 8086 code is similar in length:

    mov bx,dx
    shl bx
    cs mov ds,[bx].w

    2024 May 31 NOWUT bug

    Turns out that the LOADLITTLE and LOADBIG statements, which work fine for SH2, MIPS, and ARMBE, were never corrected for the little-endian counterparts SH4, MIPSLE, and ARMLE. Oops!

    2024 May 31 16-bit addressing, Z280 board, and Z280 alternatives

    64KB segments can be a pain when you're working with larger amounts of data. (80286 protected mode was famously described as "brain damaged.") The way that NOWUT gets around this on 8086 is by monkeying with the DS register before every memory access that isn't the stack. The segment value is either a constant which has been inserted by the linker, or it is calculated at run time by taking the high word of a 32-bit (only 21 bits used) linear address and running it through a lookup table. This carries a performance penalty but also makes it far easier to create real-mode DOS programs.

    I've been trying to come up with an equivalent scheme for the Z280. It doesn't have segment registers, instead it has a clunky MMU which is accessed through IO ports. That's not good. But on the other hand, I can fiddle with address bits inside the CPLD and maybe cause some magic to happen there. Even so, there are a number of challenges:

  • I can't switch an entire 64KB bank because I'd lose the program itself. The Z280 MMU has a mode where program and data use separate 64KB areas, but I'm assuming the stack remains in the data area. Blowing away the stack would also be bad. (Though the manual doesn't explicitly say what happens to SP or PC relative addressing when the program and data areas are separated.)
  • It's also possible to separate the User address space from the Supervisor address space, and the latter can still access the former, but only 8 bits at a time and only with (HL) or (Ii+disp) addressing by way of the LDUP/LDUD instructions.
  • If I use smaller banks then I need to shuffle address bits around to obtain the bank number. The Z280 isn't great at shifts. There are no 16-bit shifts, and the 8-bit ones are mostly 2-byte or larger opcodes.
  • The Z280's DMA controller uses linear addresses, so any banking scheme that results in consecutive addresses from the programmer's perspective being jumbled in memory would limit usage of the DMA controller.
  • The case where a segment/offset pair can be calculated by the linker is not too hard and could be handled a variety of ways. Doing it at runtime is different. Starting with a 21-bit address in a few registers (say E and HL), turn that into a segment+offset, write the segment value to some IO port so the CPLD can do something, and then access the data at the proper offset. This needs to happen with as few bytes/cycles as possible. Hmmm... Maybe I'll try to get the keyboard port working first.

    The Z280 is obscure, quirky, and it can go in a socket, thanks to the 68-pin PLCC package. These were the factors that motivated me to make a homebrew system around it. I considered 68K and x86 to be too ordinary. Same goes for 6502 and other 8-bit CPUs. A Z380 would be interesting, but was not an option because of the QFP package. I considered the Z8000, which comes in DIP-48, although it really doesn't seem that quirky. With the big-endian byte order and 16-bit instruction words, it looks kind of like a 68000 with lower clock speeds. Still, not a bad choice.

    Now I've become aware that NEC V60 processors were available in a 68-pin PGA package. Before now I had only known about the QFP ones. There is even documentation that describes the pin assignments and bus timing at http://mess.redump.net/datasheets/nec. These chips are hard to find, but until one turns up I am imagining what a homebrew V60 board would look like. Certainly, it wouldn't do to settle for 8-bit expansion slots when you have a powerful 32-bit CPU. Might as well include a full 16-bit ISA slot and run the extra address lines for 8MB of RAM. I wouldn't need to worry about bank-switching schemes in the CPLD, but it would need to handle DRAM refresh by itself.

    There is also word of a 132-pin PGA version of the V70, but I haven't seen these chips either, nor the pinouts.

    2024 May 25 disassemble and reassemble part 2

    While fooling with the disassembler I discovered that SETcc instructions were not being handled correctly. This is because the opcodes were incorrect in the data table, and that's because they are incorrect in the documentation. They are also incorrect in NOWUT, which I hadn't noticed since I never use these instructions, and because some disassemblers will permit a wrong mod/r/m byte in this case. Bits 3-5 are supposed to be 0, but other bit patterns don't seem to be defined for use as anything different.

    The documentation in question is this old NASMDOC.TXT which has a very nice instruction set reference towards the bottom. There are several places where it lists incorrect values for the spare register field though:

  • CMP
  • FDIVR, FIDIVR
  • FISTP
  • SAR
  • the aforementioned SETcc

    The document does not cover any SSE instructions, and I have not seen anything that does in a similar fashion. I need a list of mnemonics, corresponding opcode values, valid operand types, separated according to which generation of SSE they fit into, and preferably a description of what the instruction actually does. I'll probably have to write my own documentation for this, which is not unusual. It's generally a time-saver in the long run to pull the relevant information out of some 300-page PDF and condense it to a few pages rather than trying to work with such unwieldy things.
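
    For reference, the spare register field discussed above is bits 3-5 of the mod/r/m byte. The sketch below (Python, with made-up function names) splits a mod/r/m byte into its fields and builds the register form of SETcc, which is 0F 90+cc followed by a mod/r/m byte whose spare field is 0. It illustrates the encoding only and is not taken from NOWUT or the disassembler.

```python
def split_modrm(byte):
    """Return (mod, spare, rm): bits 7-6, bits 5-3, and bits 2-0."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7


def encode_setcc_reg(cc, reg8):
    """SETcc with an 8-bit register operand: 0F 90+cc, mod=11, spare=000."""
    opcode = 0x90 | (cc & 0x0F)
    modrm = 0xC0 | (reg8 & 7)      # spare field (bits 3-5) left at zero
    return bytes([0x0F, opcode, modrm])


# Condition code 4 is E/Z and register number 0 is AL, so this is SETE AL.
code = encode_setcc_reg(4, 0)
print(code.hex(), split_modrm(code[2]))
```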

    2024 May 20 NOWUT version 0.33 release

    New features:

  • CPUZ280 module which supports assembly only, for Zilog Z80 and Z280 CPUs
  • SIB addressing and MMX instructions for x86
  • little-endian MIPS support and additional MIPS instructions
  • new command line options for LINKBIN
  • slightly more efficient generated code for all platforms

    Check the documentation. Download the complete archive.

    And be sure to obtain GoLink from the Go Tools website.

    2024 May 12 disassemble and reassemble

    I've been working on an x86 disassembler, based on the familiar formula used to make the DeHunk 68k disassembler (and the Unsuper SH2 disassembler, RevMIPS MIPS disassembler, DisARM ARM disassembler). I can successfully disassemble a Win32 program compiled by NOWUT, feed it back into NOWUT, and get the original executable again. This is a nice trick, but not very useful. I'd much rather be able to do this with programs compiled with different tool chains.

    I added SIB addressing modes to NOWUT so I could try to assemble code not originating from NOWUT. How about a FreeBASIC 0.23 program? I tried one, and found that it contained MMX instructions. Weird, because FB 0.23 programs run on pre-MMX CPUs, unlike later versions of FB which I don't use for that reason. I guess the MMX codepath, though present, doesn't execute on pre-MMX CPUs.

    So I added MMX instructions to NOWUT, and proceeded to the next problem: imports. When a NOWUT program is linked with GoLink, GoLink creates a jump table in the .idata section. The jump table has an indirect JMP pointing to an address in an import address table (IAT). So calling a DLL routine ends up touching four memory locations: the CALL, the JMP, the import address, and finally the DLL code. To make a source listing that can be reassembled, my disassembler works backward from the IAT to find the CALL instructions and point them to the associated import name, rather than the jump table. This strategy fails with executables from FB and elsewhere. They do things like putting the jump table or the IAT in some other section instead of in an .idata section, or using an indirect CALL to the IAT instead of a normal CALL. If I disassemble the IAT as data, e.g.:

            dd GetLastError
            dd ExitProcess

    then when I try to rebuild the program, GoLink doesn't put the address of the imported symbol there, but instead the address of an entry in another (redundant) IAT.
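
    The CALL / JMP / IAT chain for the simple GoLink-style case can be illustrated with a short Python sketch. This is not the disassembler's actual code: it does a naive byte scan for E8 (CALL rel32) rather than real instruction decoding, so it can report false positives, and the function name, base address, and IAT contents are invented for the example.

```python
import struct


def resolve_import_calls(image, base, iat_names):
    """Map CALL rel32 sites to import names by following JMP [abs32] thunks."""
    calls = {}
    for i in range(len(image) - 4):
        if image[i] != 0xE8:                      # CALL rel32
            continue
        (rel,) = struct.unpack_from("<i", image, i + 1)
        t = i + 5 + rel                           # thunk offset within the image
        if 0 <= t <= len(image) - 6 and image[t] == 0xFF and image[t + 1] == 0x25:
            (slot,) = struct.unpack_from("<I", image, t + 2)  # JMP [slot]
            if slot in iat_names:
                calls[base + i] = iat_names[slot]
    return calls


# Tiny fake image: a CALL at offset 0 targeting a thunk at offset 0x10.
image = bytearray(0x20)
image[0:5] = b"\xE8\x0B\x00\x00\x00"
image[0x10:0x16] = b"\xFF\x25\x00\x20\x40\x00"
found = resolve_import_calls(bytes(image), 0x401000, {0x402000: "ExitProcess"})
print(found)
```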

    I'm sure it's not worth trying to solve every one of these issues, but having a nice looking disassembly with labels and import/export names will at least make it easier to reverse engineer stuff (and fix it so it works in Windows 2000!)

    2024 Apr 8 PIC32MX1xx

    Microchip makes a DIP28 microcontroller which uses the MIPS instruction set. That means it can run NOWUT code, so it could be useful for something. Though now that I have one and I've tried communicating with it using the PICkit3 dongle, I've discovered that it's little-endian MIPS. So now I have to add support for little-endian MIPS :(

    2024 Apr 3 updated text/hex editor: RaeN 0.91

    changelog:

  • less time wastage on unnecessary screen redraw
  • line number display updates when using PGUP/PGDN or mousewheel
  • case-insensitive search is possible (prefix the string with an ESC character)
  • pressing numlock doesn't create mojibake
  • pasting from Windows clipboard is possible (use CTRL+V or menu item)
  • pasting a column (to the right of existing text) is possible
  • CTRL+F3 or CTRL+F4 opens adjacent file in the current directory
  • pressing function key while a menu is in focus is ignored instead of doing weird stuff
  • download RaeN 0.91 here (win32 only)

    (see the Oct 2022 posting for the Shift-JIS font and additional info)

    2024 Mar 22 Z280 VGA demo revisited

    After a day to lament my dismal framerate, I remembered that the Z280 has a set of shadow registers which can be quickly swapped in and out with the EXX instruction. By using these I can optimize my inner loop and increase my framerate to 3fps :)

    There are occasional wrong pixels that appear on the screen, until they are overwritten. In this 8-bit slot, the Trident card drives IORDY low every time /MEMW goes low, usually only for a couple of cycles, not long enough to actually generate a waitstate on the Z280. (It doesn't drive the /0WS line, even though the manual for the card says it supports "zero-wait" operation. I have not tested whether it drives /0WS in a 16-bit slot.) The CPLD uses IORDY to drive the Z280's /WAIT signal. I wonder if /IOW and the data bus aren't staying valid for long enough after IORDY goes high.

    2024 Mar 21 Z280 VGA demo

    I was having difficulties getting the VEGA VGA card to display something, so I switched video cards a few times until I finally succeeded with a Trident 8900D SVGA card. (I will likely revisit the VEGA VGA card later. I think I now know why I had failed getting it to work before.) Perhaps a 1MB SVGA card is slightly overkill for the Z280, but at least it is a small board.

    My immediate plan was to port my small color-warping graphics demo that I had previously run on my FPGA-based "toadroar" RISC CPU. In contrast with that platform, the Z280 has a lower clock speed, fewer registers, and only a 16-bit wide ALU. I simplified the demo's algorithm a bit, and still struggle to reach 2fps on a 320x200 screen. Oh well. Probably better than an 8086.

    30sec video recording from CRT

    some messy Z280 assembly code (which I build with the not-yet-released Z280 NOWUT module)

    2024 Mar 15 MIDI to VGM translator, beta 3

    Update to this program:

  • added workaround for MIDI files that have very short duration note-on times for percussion
  • fixed bugs related to sticking notes and to SSG sound
  • includes a new version of the XM-to-MIDI conversion tool
  • MID2VGM-beta3 download

    2024 Mar 13 Z280 benchmark

    I used an on-chip timer to measure the execution time for small unrolled loops containing an instruction repeated 20 times. I started the timer after the first loop iteration, so that the code would have all entered the cache, then did another 10 loops. So I have a count of timer ticks (which should be one quarter the CPU frequency) for 200 iterations of an instruction, and some loop overhead to subtract out. Assuming I've done this correctly...

  • LD L,B (one byte) ~2.5 cycles per instruction
  • LD L,$00 (two bytes) ~3.0 cycles per instruction
  • LD IXL,$00 (three bytes) ~3.5 cycles per instruction
  • LD IX,$E000 (four bytes) ~4.5 cycles per instruction
  • LDW IY,[IX+$1000] (five bytes, and a memory read) ~17.5 cycles per instruction
  • LDW [IX+$1000],IY (five bytes, and a memory write) ~10.5 cycles per instruction

    Since my bus speed is currently half the CPU speed, this seems to confirm the manual's specification of 11 cycles for a memory read. Running the bus at full speed would likely save 3 cycles on the LDW IY,[IX+$1000].
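
    The arithmetic behind the per-instruction figures is simple enough to write down. In the Python sketch below, the tick counts are invented purely to show the calculation; only the divide-by-four timer rate and the 200 instructions per measurement come from the description above.

```python
def cycles_per_instruction(ticks, overhead_ticks, instructions=200, clocks_per_tick=4):
    """Convert a timer reading into CPU cycles per instruction.

    ticks           raw timer count for the measured loops
    overhead_ticks  timer count attributed to loop overhead (measured separately)
    clocks_per_tick CPU clocks per timer tick (timer runs at 1/4 CPU frequency)
    """
    return (ticks - overhead_ticks) * clocks_per_tick / instructions


# Hypothetical readings, not measurements: 160 ticks total, 35 ticks of overhead.
print(cycles_per_instruction(160, 35))
```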

    Overall, code execution from the cache is relatively fast, but memory access is quite slow. I'm guessing this is because of extra cycles needed for MMU address translation and cache hit detection.

    2024 Mar 4 dumping ancient CDP1833 mask ROMs

    Over on my BMWs page I have some info about the Bosch Motronic fuel injection and ignition controller. My '82 528e and '83 633CSi are among the first cars to have used this system, back when it was based on the RCA 1802 "COSMAC" processor. These are the '007 and '008 Motronic ECUs, which were reportedly preceded (in BMWs) only by an earlier '002 ECU contained in the 733i. While the unit from the 528e contains an ordinary 2732 EPROM, the units for the M30B32 engine (featured in the 533i, 633CSi, and 733i) instead use four 1KB ROMs.

    Before now I have never seen a dump of these 1KB ROMs. There was a rumor that a different version of ECU existed for the M30B32 which used a normal EPROM. I never saw this ECU or a dump from it either. There was also a rumor that the ROM contents were identical to the 528e. But now I know this is not true.

    The issue with these 1KB ROMs is that they use a multiplexed address bus and aren't supported by most programmers (not by my old Needhams, or my new T56). A long time ago, before I had seen the CDP1833 datasheet, I desoldered one of the chips and tried to dump it but only got 256 bytes of actual data. Reading it properly requires putting A8~A15 on the bus and pulsing the TPA pin high to latch the upper half of the address, then proceeding with A0~A7. You might wonder why a 1KB chip has 16 address bits. The address decoding is essentially built into the ROM, to minimize external glue logic. The first chip responds at $000 to $3FF, the second chip responds at $400 to $7FF, etc.
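
    The read sequence can be modeled in a few lines of Python. This is a toy model of the behavior described above, not something derived from the datasheet's timing diagrams, and the class and function names are made up. It also shows why a dump that never latches the upper address byte would return only 256 distinct bytes in this model.

```python
class Cdp1833Model:
    """Toy 1KB ROM with a latched upper address byte and built-in decoding."""

    def __init__(self, base, data):
        assert len(data) == 1024
        self.base = base          # first address the chip responds to
        self.data = data
        self.high = 0             # latched A8~A15

    def pulse_tpa(self, bus_value):
        """TPA pulse: latch whatever is on the address pins as A8~A15."""
        self.high = bus_value & 0xFF

    def read(self, bus_value):
        """Present A0~A7 and read; returns None if the chip is not selected."""
        addr = (self.high << 8) | (bus_value & 0xFF)
        if self.base <= addr < self.base + 1024:
            return self.data[addr - self.base]
        return None


def dump(chip, base):
    """Latch the high byte, then read the low byte, for all 1024 addresses."""
    out = bytearray()
    for addr in range(base, base + 1024):
        chip.pulse_tpa(addr >> 8)
        out.append(chip.read(addr & 0xFF))
    return bytes(out)


rom = bytes((i * 7) & 0xFF for i in range(1024))   # arbitrary test pattern
second_chip = Cdp1833Model(0x400, rom)
print(dump(second_chip, 0x400) == rom)
```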

    Some months ago I posted about my FRASH board which I made to dump and program Genesis cartridges. Prior to that I had also made a few prototype Genesis cartridges with socketed EPROMs. Well, one day I suddenly came up with a plan to put a CDP1833 into the socket of a prototype cart, put that into the FRASH board, and modify the FRASH software to read the data from it. I desoldered the remaining chips from a junkyard Motronic box, and have now dumped them all.

    The first 1KB does appear to be identical to the 528e code. This is a lucky thing, since my chip had bit 1 stuck high. So for the first 1KB I have actually substituted the 528e code. The second 1KB is only different at the very end. The third and fourth chips are significantly different. In fact, the fourth chip has some quite interesting junk data containing these strings:

    ERROR
    COSMAC ASM4 VERSION 00
    TYPE SOURCE FILENAME>
    WRITE?

    Here is the complete dump

    2024 Feb 29 Video 7 VEGA VGA

    This card has a Cirrus Logic GD-510A / GD-520A chipset and plugs into an 8-bit slot. I want to initialize it from the Z280, which means I need to configure all its registers myself without being able to rely on the BIOS. I've booted my 286 with the card installed and dumped the register values from it, and also picked through the BIOS code to see what it does at power-on, so in theory I should have the necessary info. Bitsavers has a scanned manual for the GD-610/620 which appears to be substantially similar.

    The card isn't hugely different from standard VGA. The main difference is that it can emulate CGA or Hercules and drive older 9-pin TTL monitors. It has a 32.514MHz crystal which it can divide in half to get the EGA/mono pixel clock, but it's also available for doing super-VGA modes like 800x600 (with 16 colors only, as there is only 256KB of memory). The 610/620 manual also explains that the memory access pattern is able to reserve more bandwidth for the CPU than the original IBM card. It also appears to support a mouse cursor sprite.

    So far I have managed to get 31KHz H-sync and 70Hz V-sync signals coming out of it, but not an actual image. I verified that the Z280 can set VGA registers and read back the expected values, including the extended registers $80-$AF behind the $3C4/$3C5 port. (The 610/620 manual states that extensions are enabled via bit 0 in register $06, but the VEGA VGA BIOS writes $EC here.) I have not yet verified RAMDAC or video memory access. Since the Z280 is currently running at 12MHz with the default 1/2 bus speed (6MHz) and no waitstates, the timings are somewhat faster than what is normally expected for 8-bit cards in a PC/XT or AT. At the least I probably need to implement the IORDY signal.
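
    The register read-back check mentioned above follows the usual VGA index/data pattern: write the register number to $3C4, then read or write the value at $3C5. The Python sketch below runs against a fake in-memory port model rather than real hardware, and every name in it is invented for illustration.

```python
class FakeSequencerPorts:
    """Stand-in for the index port $3C4 and data port $3C5."""

    def __init__(self):
        self.index = 0
        self.regs = [0] * 256

    def out(self, port, value):
        if port == 0x3C4:
            self.index = value & 0xFF
        elif port == 0x3C5:
            self.regs[self.index] = value & 0xFF

    def inp(self, port):
        if port == 0x3C4:
            return self.index
        if port == 0x3C5:
            return self.regs[self.index]
        return 0xFF                       # nothing else is modeled


def write_and_verify(bus, index, value):
    """Write an indexed register, reselect it, and read it back."""
    bus.out(0x3C4, index)
    bus.out(0x3C5, value)
    bus.out(0x3C4, index)
    return bus.inp(0x3C5) == value


bus = FakeSequencerPorts()
print(write_and_verify(bus, 0x06, 0xEC))  # the value the VEGA BIOS writes to $06
```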

    2024 Feb 11 more Z280 debugging

    Once I had the Z280 board sending data over serial, I added a short XMODEM receive routine to the EPROM so I could send files to it. Initially this didn't work. I found out that the Z280 could transmit but not receive, ultimately because the wrong pin had been connected on the DE-9 port. I had to solder on a jumper wire to correct this. I am now up to four jumper wires. The EPROM's /OE had to be connected to the CPU's /IE instead of its /OE, and the PS/2 keyboard clock and data lines needed to be pulled up.

    Serial worked in both directions after that, but XMODEM still seemed to fail. The DRAM controller also needed debugging, so I made myself a little menu of routines to aid in that process. Until that point, I had at most two routines in the EPROM, one which would run at reset and another I could trigger with an NMI.

    Now I finally got the DRAM working and can send a program right from the terminal. The board has two 1MB SIMMs inserted, using TMS44400 DRAM chips. I noticed that when the system is powered down these quickly revert to containing mostly $FF (all bits high), in contrast to the SDRAM on my FPGA development board which was slow about losing data and ultimately ended up with various patterns of garbage data in different regions of the chip.
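
    For anyone unfamiliar with XMODEM, the core of a receive routine is validating each 132-byte packet. The Python sketch below assumes the original checksum variant of the protocol (not CRC or 1K blocks), which may or may not be what the EPROM routine uses, and the function name is made up.

```python
SOH, EOT, ACK, NAK = 0x01, 0x04, 0x06, 0x15


def check_packet(packet, expected_block):
    """Return the 128-byte payload if the packet is valid, otherwise None.

    Layout: SOH, block number, 255 - block number, 128 data bytes, checksum.
    """
    if len(packet) != 132 or packet[0] != SOH:
        return None
    block, inverse = packet[1], packet[2]
    if block != (expected_block & 0xFF) or (block ^ inverse) != 0xFF:
        return None
    payload = packet[3:131]
    if (sum(payload) & 0xFF) != packet[131]:
        return None
    return payload


data = bytes(range(128))
good = bytes([SOH, 1, 0xFE]) + data + bytes([sum(data) & 0xFF])
print(check_packet(good, 1) == data)
```

    A receiver would answer ACK for a valid packet and NAK otherwise, and stop when the sender transmits EOT.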

    2024 Feb 4 Z280 board testing

    After assembling this board, which was fairly simple with 100% through-hole parts, all of the CPLD programming still needed to be done before it could function. It has now reached the point where the CPU can execute code from EPROM and do I/O through the ISA slot. So I popped in my sound card and tried to play some music through an old PC speaker. Success! I also managed to send a few bytes over RS232, using the Z280's built-in UART, at 14400bps. Next up, I'll try to get DRAM working and receive programs via serial. Then I can stop having to program EPROMs for a while. After that, maybe I can figure out how to initialize an 8-bit VGA card. (I already dumped the VGA BIOS from it to disassemble...)

    2023 Dec 16 another one of my PCBs, and "new" but useless CPLDs

    This is my "DIAGS" 8-bit ISA card, which at the very least is intended to light some LEDs with data written to port $80, like a typical POST code reader. I figured it would be useful in getting my Z280 board going.

    Unlike the FRASH board which uses an out-of-production Altera CPLD scavenged from eBay, I planned to use a 44-pin Atmel CPLD which I had purchased NEW. I thought for sure that after building the ByteBlaster parallel port interface I would finally be able to program the Atmel chips...

    The quest to put these chips to some use began in 2020 when I was first learning about CPLDs and wanted to buy something to experiment with. Initially I bought a 1500A, which I quickly found out was impossible to program. Next I got the 1504AS. In theory this was supported by my old (Needhams EMP-11) programmer but it turned out to require an unobtainium adapter cable. Later on I made a dongle to connect a USB-Blaster cable, but that didn't help because Quartus doesn't support Atmel chips and ATMISP doesn't support this type of cable.

    But now I have a ByteBlaster interface, right? This is supposed to work with ATMISP, but it does not. ATMISP just fails with a generic error "Hardware and software settings mismatch." I found the article at www.hackup.net/2020/01/erasing-and-programming-the-atf1504-cpld/ which describes using ATMISP to convert a .JED file to an .SVF file. So I wrote some WinCUPL code, compiled it, and produced an .SVF file.

    I tried to use UrJTAG to send the data, since it is supposed to support both USB-Blaster and ByteBlaster interfaces. The UrJTAG documentation doesn't really make it clear what driver is needed for using a USB-Blaster, since it keeps mentioning FTDI stuff but also mentions the Altera driver (which in my case is usbblstr.sys). But then UrJTAG doesn't even run out of the box because libusb0.dll is missing. That had to be obtained separately. It came with install-filter-win.exe, install-filter.exe, testlibusb-win.exe, and testlibusb.exe. I don't know why there are two of each. I ran install-filter-win. It added registry entries but I had to copy the driver file manually and rename libusb0_x86.dll to libusb0.dll. Then after plugging in the USB-Blaster the test program appeared to succeed. So I ran UrJTAG and got this result:

    jtag: cable usbblaster
    Connected to libftd2xx driver.
    jtag: detect
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    IR length: 10
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    Chain length: 1
    Device Id: 00000001010100000100000000111111 (0x000000000150403F)
      Manufacturer: Atmel
      Unknown part!
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    chain.c(149) Part 0 without active instruction
    chain.c(200) Part 0 without active instruction
    chain.c(149) Part 0 without active instruction

    Not exactly confidence inspiring. I mean, it does say Atmel and I see the digits '1504' but is it supposed to have this many errors? Better try the ByteBlaster instead. I swapped the cables around and tried that.

    jtag: detect
    IR length: 10
    Chain length: 1
    Device Id: 00000001010100000100000000111111 (0x000000000150403F)
      Manufacturer: Atmel
      Unknown part!
    usbconn_ftd2xx_flush(): Error from FT_Read(): 4
    chain.c(149) Part 0 without active instruction
    chain.c(200) Part 0 without active instruction
    chain.c(149) Part 0 without active instruction

    It looks a little better, and didn't take as long to run. So I tried to send the .SVF file.

    jtag: svf d:\tmp\diagled.svf progress
    Error svf: could not establish SIR instruction

    I have no idea what this error means.
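
    As an aside, the Device Id that UrJTAG printed can be decoded by hand using the standard JTAG IDCODE layout: 4 bits of version, 16 bits of part number, 11 bits of JEDEC manufacturer code, and a fixed 1 in bit 0. A small Python sketch (the function name is made up):

```python
def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into its standard fields."""
    return {
        "version": (idcode >> 28) & 0xF,
        "part": (idcode >> 12) & 0xFFFF,
        "manufacturer": (idcode >> 1) & 0x7FF,
        "marker": idcode & 1,            # always 1 for a valid IDCODE
    }


# The bit string exactly as it appeared in the UrJTAG output above.
fields = decode_idcode(int("00000001010100000100000000111111", 2))
print({name: hex(value) for name, value in fields.items()})
```

    The part field comes out as 0x1504 and the manufacturer field as 0x01F, which is Atmel's JEDEC code, so the chip itself is at least answering sensibly.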

    As per the article above, OpenOCD is another program which should be able to send an .SVF file over a JTAG interface. I don't see any indication that it supports ByteBlaster but it's supposed to work with USB-Blaster. (Search for altera-usb-blaster.cfg) Swap cables again. Run OpenOCD with the magic command line.

    Info : usb blaster interface using libftdi
    Error: unable to open ftdi device: libusb_get_device_list() failed

    OpenOCD came with its own libusb0.dll which is different than the other one I had. I tried swapping them but it made no difference.

    I think these chips are going on the scrap pile until and unless ATF150x support gets added to the Xgecu T56 :p

    2023 Nov 21 MD/Genesis code

    I've found that the Genesis example NOWUT program (GENJPG), which works when run from an Everdrive, does not work when running from an EPROM or flash cartridge. The code is clearly executing but the display is not as expected. I can only assume that some initialization which is done by the Everdrive menu/loader is being neglected by GENJPG itself. I've yet to figure out the exact issue, but one thing I've noticed in looking at other code is what appears to be an undocumented 68000 opcode: $4E66. Search engines are showing 0 results for this...

    2023 Nov 8 XM-to-MIDI converter

    OpenMPT can import MIDI files but it isn't so hot at exporting them. I made my own tool for this purpose. Features:

  • Save and reload conversion settings so you don't need to enter them again every time.
  • Assign a GM instrument for each XM instrument.
  • Adjust the octave for each instrument.
  • Adjust volume for each instrument without disrupting volume effects in the song.
  • Melody sounds use the panning value from the sample or an 8xx effect.
  • Melody sounds use the volume from conversion settings as VELOCITY, and use the volume from sample default, Axx/Cxx effects, or volume column as CHANNEL VOLUME.
  • Percussion sounds use a VELOCITY calculated from both the conversion settings and from the sample default, Axx/Cxx effects, or volume column.

    Limitations:

  • Up to 16 XM channels and 16 instruments
  • Patterns must have 64 rows (though they can still end early with a Dxx effect)
  • Percussion and melody sounds may not be both present in the same XM channel.
  • Each instrument should have exactly one sample.
  • Envelopes and fadeout are ignored.
  • Only effects 8xx, Axx, Cxx, Dxx, and Fxx are supported.
  • download XM2MID (win32)

    2023 Oct 30 another one of my PCBs

    This board has a CPLD, JTAG header, parallel port header, and a socket for a 64-pin Genesis/Megadrive cart. A slotted 62-pin ISA socket is close enough, since the pins at either end of the cartridge are just redundant grounds or something else unimportant. The immediate goal was to read and write cartridges, including SRAM and flash ROM. (Commercial products exist for this but they cost too much and their software is not guaranteed to work in Windows 2000.)

    I tried to design the board so that it could use either an EPP parallel port or just a bi-directional port. But when the board was here and assembled I wasn't sure which system to connect it to. The parallel port on my 286 is certainly not EPP, and may not even be bidirectional. The ribbon cable going from the header to the DB25 is not super long. I tried to communicate with it under DOS from a socket AM2 system, with little success, and during the debugging process I began to suspect that the port was fried. I was using a wallwart for the 5V supply and found it was actually putting out 5.7 volts. Oops. I made a small Windows driver to access the parallel port and switched to another AM2 system. EPP was still not working so I went with manual handshaking and eventually managed to read data from a normal game but not from the flash-based cartridge PCB.

    I thought I had the CPLD connected to all the pins that mattered on the cart socket, but it turns out that these carts also need the reset line (pin 27 on the front) driven high. After running a jumper wire to control that, I can now (slowly) transfer the data. There is plenty of room to improve the speed by changing the PC-to-CPLD protocol to minimize parallel port traffic, getting EPP to work somehow, and maybe by adding an independent oscillator to the board so the CPLD doesn't have to be clocked from the PC.

    (The unpopulated IC and resistors are intended for building a parallel port Byte Blaster interface to program Atmel CPLDs. I can program the Altera CPLD with the ubiquitous USB JTAG dongle.)

    2023 Oct 3 Z280-based homebrew motherboard

    For previous board layouts in KiCad I used the default settings: two layers, 0.25mm traces, 0.2mm minimum clearance, and a 0.635mm grid. This works well with through-hole parts that have a 2.54mm pitch. For this project I wanted to try going a bit more advanced.

    Increasing to four layers was fairly straightforward. The inside ones each got a large zone connected to either VCC or GND, with special care being taken to ensure that zones underneath the PLCC socket footprints didn't become unconnected islands.

    I had limited success with packing traces more closely together. The next smaller grid size after 0.635mm is 0.508mm, but this one is not good because it leaves no usable gridline running between pads. Being able to run a trace between pads is crucial, and if you look at old ISA cards or the like you'll see that they sometimes run TWO traces between pads of a DIP chip. I don't really see how to do that with any grid size here. But I went down again to 0.254mm which let me use 0.508mm spacing for parallel traces and still have a gridline running between pads. The downside is that the window has to be zoomed in that much further for the grid to be visible. In the end, I'd say the board came out reasonably compact.
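
    The grid arithmetic can be checked quickly. The Python sketch below works in micrometres to avoid floating-point trouble, and it assumes that a "usable" gridline means one running exactly halfway between two pads on a 2.54mm pitch; the function name is made up.

```python
PITCH_UM = 2540  # 2.54mm pad pitch, in micrometres


def has_centered_gridline(grid_um, pitch_um=PITCH_UM):
    """True if both the pad centres and the midpoint between pads are on the grid."""
    return pitch_um % grid_um == 0 and (pitch_um // 2) % grid_um == 0


for grid in (635, 508, 254):
    print(grid / 1000, "mm:", has_centered_gridline(grid))
```

    The 0.508mm grid does hit every pad centre, but the 1.27mm midpoint falls between its gridlines, whereas 0.635mm and 0.254mm both land on it.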

    I won't have time to assemble a board right away but the theoretical specs are:

  • Z280 CPU with 16-bit data bus (rated for 12.5MHz)
  • 2x 8-bit ISA
  • 1x serial
  • 1x PS/2
  • 2MB DRAM
  • 128KB EPROM

    2023 Sep 24 FreeCAD

    I installed FreeCAD version 0.16 on Windows 2000 and successfully used it to convert a 190KB .STL file into a 9.5MB .STP file (another one of these amazing text-based file formats) for submission to JLCPCB's new CNC machining service. Their 3D printing service accepts .STL files, but the CNC service only accepts .STP files. I have my own 3D printer but there is a part I've been wanting for my car which needs to be able to handle engine heat without melting. Supposedly they can 3D print the part using stainless steel, but I'm curious how the price will compare for a machined aluminum version.

    I found the instructions for doing the conversion here: https://www.makeuseof.com/converting-3d-printable-stl-files-into-step-for-cad/

    2023 Sep 18 Another day in the war between user and tech

    CDJapan sends me promotional emails, which I mostly ignore because they are full of empty rectangles where images hosted on a remote server were meant to appear. Of course, I don't load images that are linked inside an email (and never will!) because HTML email is a dumb thing which should not exist. Spammers are already hitting me with emails that only have one sentence of actual information wrapped in 100KB+ of HTML/CSS/whatever gibberish. It would take even longer to wade through this junk if I loaded all of their web crap too, not to mention falling victim to all of the tracking traps.

    Last time I went to browse CDJapan's site I saw some interesting stuff. Unfortunately, shipping costs went insane and have yet to recover. It's 4000 yen to ship a CD and 5000 to ship a book. Sad. I start thinking, "is it possible to buy a Japanese e-book?" I've never tried before. Yahoo Japan has an e-book store but it requires a Japanese phone number to create an account. I don't like websites that require a phone number to begin with and generally refuse to use them. (If it requires a phone and not a computer then it shouldn't even be called a website.) I certainly won't be supplying a Japanese phone number since I don't have one.

    Shueisha.co.jp has an e-book store and they don't require a phone number. Their FAQ explains that books can be viewed in a browser or with their (bloated, 138MB) app. It doesn't say that books can be downloaded. Also, the app "requires" Windows 8.1, which I don't have. This sounds kind of bad, but I think I might be able to work around this kind of obstruction. I'm willing to risk a few $$ to try it.

    So I bring something up in the online viewer. It's extremely slow here. Table of contents never fully loads. My first question: can I just make the browser window nice and big, and then screenshot the page? No! It's disappointing that my browser would stab me in the back like that. (I am running Serpent, BTW.) I know that pressing PrintScrn results in a window message, so maybe the browser is intercepting that. But before I whip out a disassembler and try to hack the browser, let's try something else.

    Sometimes it's possible to capture files that the browser has silently downloaded by running Nirsoft OpenedFilesView and copying one of the locked temporary files. But I didn't see anything usable in this case.

    So what about their reader app? I downloaded it and extracted it with innosetupextractor. Ran bookend.exe. It appears like it is going to work. In Windows 2000.

    I can log in, and it shows an item in my library. But for some reason it thinks the file is only 701 bytes and if you try to download or open it an error message comes up. In fact, the same thing happens under Windows 7 after installing the app the normal way. Who knows.

    My next plan succeeded. I installed CutePDF Writer and added a 'print' button to Serpent's address bar. While using the in-browser viewer, I can print each page to a file. This is better than a screenshot since it is higher resolution than the monitor. Though there is still some small text atop the image which says "This contents is temporary disabled" just to remind me how hard developers are working to screw over users.

    2023 Sep 16 MIDI to VGM translator, beta 2

    Update to this program:

  • fixed register initialization problem on real OPL3
  • added ISC2DMP to convert ISC instruments to DMP for Deflemask/Furnace
  • OPNZ now disables interrupts while accessing timer to prevent glitches
  • made it a ZIP archive which can be extracted on 16-bit DOS while also preserving file case for Linux
  • included a file (opmfreq2.bin) needed to build OPNZ source
  • MID2VGM-beta2 download

    2023 Aug 30 MIDI to VGM translator, beta release

    Update to this program:

  • fixed an ESFM and dual-OPL3 problem
  • ISCM sounds closer to Yamaha chips now, making it easier to create instrument patches
  • adjusted several instruments
  • fixed problem with OPNZ playing OPL2 files on OPL3
  • MID2VGM-beta download includes Win32, Linux, and 16-bit DOS builds

    Includes OPNZ (DOS .VGM/.VGZ player) and ISCM (Win32 instrument editor)

    2023 Aug 26 NOWUT version 0.32 release

    Minor update with a few bug fixes, tweaks, and a new cross-platform FILEGETSIZE routine.

    Put in the missing XOR EDX,EDX that was causing pioquerytimer in PIOLNX to randomly fail.

    Check the documentation. Download the complete archive.

    2023 Aug 16 MIDI to VGM translator, release alpha6

    Update to this program:

  • YM2612 with -s1 switch can play melody on SN76489 if the notes don't go below A-3
  • OPN2/A and OPM hardware LFOs can be used for instrument vibrato
  • better handling of MIDI files that key-on the same note multiple times
  • OPL3 can use all 9 (virtual) channels, dual OPL3 can use 18
  • added OPL2 and dual OPL2
  • added experimental ESFM
  • MID2VGM-alpha6 download includes Win32, Linux, and 16-bit DOS builds

    OPNZ is also included, a small DOS program which can play the OPLx/ESFM VGMs. (It can play other types of VGM on the DUALOPN card only.) First upload had a big problem. Now fixed.

    2023 Aug 13 Shift-JIS in NOWUT

    I'm not sure if anyone would want to do this, but in case anyone did, it appears to work...

    2023 Aug 1 MIDI to VGM translator, OPL, and FM generally

    The nuts and bolts of audio synthesis are a fascinating topic, but producing music from bits also depends on the creative aspect of having first composed a song to play. Hence this project, which not only lets me put a music score together with instrument patches and hear the result, but facilitates building up a library of patches that I might use for future compositions.

    One thing I've noticed during my time of experimenting with FM thus far, with the aid of a spectrum analyzer, concerns the role of high-frequency content. Knowing that a square wave has harmonics at odd integer multiples of the fundamental frequency, my initial plan for producing a square wave using a 4-operator FM synth was to add the four sine waves together, with multipliers set to 1,3,5, and 7. With the amplitudes set appropriately, this looks vaguely like a square wave on a scope, and it even sounds like one as long as it is only used to play high notes. When playing lower notes, the lack of harmonics beyond 7x becomes more evident. A bass line ends up sounding much different, in a bad way, compared to a real square wave. These screen captures from Visual Analyser 2011 XE Beta depict the bad, fake square wave vs. a real square wave:

    So the lack of high-frequency content can make a sound boring. But I found that too much of it can also be bad, resulting in an unpleasant cacophony particularly when playing chords in higher octaves.
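
    To put rough numbers on the fake-square-wave problem described above, here is a small sketch that compares the highest partial of a four-sine stack (multipliers 1, 3, 5, 7) with the number of odd harmonics a real square wave has. The 20 kHz cutoff and the function names are assumptions chosen for illustration.

```python
# Sketch: how far up the spectrum a 4-operator "fake" square wave reaches,
# compared with the odd harmonics of a real square wave.
# The 20 kHz cutoff and the function names are assumptions for illustration.

def fake_square_top_hz(fundamental_hz, multipliers=(1, 3, 5, 7)):
    """Highest partial produced by summing sines at the given multipliers."""
    return max(multipliers) * fundamental_hz

def real_square_partials(fundamental_hz, limit_hz=20000.0):
    """Number of odd harmonics of a real square wave at or below limit_hz."""
    count = 0
    n = 1
    while n * fundamental_hz <= limit_hz:
        count += 1
        n += 2
    return count

if __name__ == "__main__":
    for note_hz in (55.0, 440.0, 2000.0):
        print(note_hz, fake_square_top_hz(note_hz), real_square_partials(note_hz))
```

    For a 55 Hz bass note the four-sine version stops at 385 Hz, while a real square wave has 182 odd harmonics below 20 kHz. At 2 kHz the fake one already covers four of the five, which fits with it sounding fine on high notes.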

    I made a better sounding imitation square wave by using algorithm 4.

    Speaking of algorithms, I like the 4-op synthesizers. I have a DX-100. I have a Genesis. I made my sound card out of YM2203s. But I've also been thinking about targeting other hardware with the MIDI to VGM translator since it would share most of the code. OPL3 has some hardware support for 4-op synthesis, so I started looking into it more. Only 6 channels can use the 4-op mode, while this leaves 6 other channels that can still be used in 2-op mode. I wasn't keen on building a separate set of 2-op instruments, but leaving those channels unused would also be unsatisfying. (Three of the 2-op channels can be sacrificed to enable the OPL2 percussion sounds, though these aren't great as far as FM percussion sounds go.)

    OPL3 in 4-op mode officially supports only four algorithms. These correspond to numbers 0,4,6 in the diagram, and one other which is not in the normal lineup. But wait a second, algo #7 can be reproduced just by pairing two 2-op channels together. Come to think of it, numbers 4,6, and 7 can all be reproduced by pairing two channels together, even on an OPL2. The only thing gained from OPL3's special mode is algo #0. Is it worth having only 6 channels just to get #0? I'd rather limit myself to numbers 4,6, and 7, and then I can pretend that OPL3 has 9 channels with 4 operators each. (The ninth channel would straddle two register banks, badly clashing with my existing code, so I'm ignoring it for now.)

    There are other differences between OPL and OPM/OPN. For instance, OPL doesn't have separate envelope phases for decay2 and release. The solution to this is to write the decay2 value when beginning a note, and then replace it with the release value when ending the note. (Sustain bit in register $2x/$3x also needs to be toggled in the process.)

    I've now made an OPL instrument set by copying the OPM/OPN ones that use 4/6/7, and replacing the remaining ones. The result is already not bad, probably better than playing MIDI files in Windows 3.1 on your SB-compatible...

    MID2VGM-alpha5 download includes Win32, Linux, and 16-bit DOS builds. (BTW, pitch bends are now supported.)

    2023 Jul 14 MIDI to VGM translator, release alpha4

    Update to this program:

  • Changed the default pitch mapping to one that should be more technically correct.
  • fixed a bug in SSG channel allocation
  • faster conversion (runs in reasonable time on 286)
  • added SSG attenuation switch
  • added checks for memory buffer (datablob and reproc) overflow
  • changed a few instruments
  • MID2VGM-alpha4 download includes Win32, Linux, and 16-bit DOS builds

    2023 Jul 4 MIDI to VGM translator, release alpha3

    This is a fun program for rendering MIDI music using a Yamaha OPN/OPM instrument set, in case one has some purpose for turning MIDI into chiptunes or just wondered what it might sound like. The output can target any one of these chips (or two instances of the same chip):

  • YM2151
  • YM2612
  • YM2608
  • YM2203

    The first challenge in creating this is building up a library of FM instrument patches. As of right now I have 56 unique patches to cover all of the General MIDI melodic and percussion instruments. The assignments are in the NOWUT source code (included) and the patch data is in ISC files which can be modified with the included version of ISCM (Win32 only), though unfortunately the sounds are not emulated very accurately in ISCM. Currently there is no control for key scaling, and LFO settings are ignored. Since YM2203 has no hardware LFO and the others only have a single, shared LFO, I figure that separate LFO settings for each channel could be 'emulated in software' by having MID2VGM insert all the pitch changes into the VGM data, but this hasn't been implemented.

    The next challenge is dealing with songs that are allowed to use 24-part polyphony when the hardware doesn't provide that many channels. My strategy was to play each new note on the Least-Recently-Keyed-Off FM channel. That works OK when the channels are all functionally equivalent. But for my DUALOPN card I have one chip wired to each speaker, so the 2x YM2203 mode has additional logic to assign MIDI channels to one chip or the other and not have instruments randomly bouncing between left and right. There is also an option to use SSG channels in addition to FM channels. This causes some MIDI channels to automatically get a fixed assignment to SSG (no LRKO). Lastly, there are options to use an SN76 noise channel or YM2608 rhythm sounds for percussion instead of FM.
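
    The Least-Recently-Keyed-Off idea can be sketched roughly like this. It is not the MID2VGM code, just a minimal model: the class name, the logical clock, and especially the fallback when no channel is free (steal the channel that was keyed on longest ago) are assumptions, since the text above doesn't say what happens in that case.

```python
# Minimal model of Least-Recently-Keyed-Off (LRKO) channel allocation.
# Assumption: when every channel is busy, the oldest sounding note is stolen.

class LrkoAllocator:
    def __init__(self, num_channels):
        self.clock = 0                           # logical time, bumped per event
        self.note = [None] * num_channels        # note playing on each channel
        self.keyon_time = [0] * num_channels     # when each channel was keyed on
        self.keyoff_time = [0] * num_channels    # when each channel was keyed off

    def key_on(self, note):
        """Pick a channel for a new note and return its index."""
        self.clock += 1
        free = [ch for ch, n in enumerate(self.note) if n is None]
        if free:
            # The channel that has been silent longest gets reused first,
            # so recent release tails are left to ring out.
            ch = min(free, key=lambda c: self.keyoff_time[c])
        else:
            ch = min(range(len(self.note)), key=lambda c: self.keyon_time[c])
        self.note[ch] = note
        self.keyon_time[ch] = self.clock
        return ch

    def key_off(self, note):
        """Release a note; returns the channel it was on, or None."""
        self.clock += 1
        for ch, n in enumerate(self.note):
            if n == note:
                self.note[ch] = None
                self.keyoff_time[ch] = self.clock
                return ch
        return None
```

    With three channels, keying off notes on channel 1 and then channel 0 means the next new note lands on channel 1, because its release started earlier.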

    When two YM2203 or YM2608 chips are specified, using all 6 square waves is possible. YM2608 rhythm is only ever used on the first chip. Since SN76 counts as a separate chip in the VGM file, it can be paired with anything (not just YM2612) but only one is ever used.

    MID2VGM-alpha3 download includes Win32, Linux, and 16-bit DOS builds

    example recordings:

    Megaman 3 Title arrangement, on DUALOPN

    YM2608 - DOOM stage 1

    2x YM2151 - Rhythm Emotion (Two-Mix)

    2023 May 11 NOWUT version 0.31 release

    Preliminary ARM (32-bit) support is here. It's good enough to make a working JPEG decoder demo for the Gameboy Advance. I always figure the JPEG decoder must be a decent proof of things working correctly, based on how much of a pain it is to debug when it fails.

    I've only included instructions for ARMv4 (minus coprocessor instructions), and some mutilation has occurred, as with the other instruction sets that have been incorporated into NOWUT. There is no divide instruction, so division requires a subroutine (like on SH2). I came up with this routine to divide a 32-bit value by a 16-bit value and produce an (unsigned) 16-bit result. I am fairly satisfied with it but maybe you can do better:

    armdivideu:                ; divide r8 by r9 (32/16 unsigned), result in r8
            asm
            mov r10,0
            mov r7,15
    armdiv10:
            cmp r8,r9 shl r7
            cs sub r8,r8,r9 shl r7      ; if r8 >= (r9 shl r7) then r8 becomes r8-(r9 shl r7)
            adc r10,r10,r10             ; result bit is shifted left into r10
            subs r7,r7,1
            bpl armdiv10
            mov r8,r10
            mov r15,r14        ; return
            endasm

    (Note that 'cs' is a condition code prefix.) For signed division, I added some additional code and ended up with this. It can produce a result between -65535 and +65535, which I am sure is not what happens on 8086 or 68000. Overall, this routine is less satisfying, but operands that produce a correct result from 32/16 division should also work here?

    armdivides:                ; divide r8 by r9 (32/16 signed), result in r8
            asm
            eor r10,r8,r9
            mov r10,r10 sar 31          ; r10 becomes 0 (if signs the same) or -1 (if signs different)
    
            rsbs r7,r8,0
            pl mov r8,r7                ; make r8 positive
            rsbs r7,r9,0
            pl mov r9,r7                ; make r9 positive
    
            mov r7,15
    armdiv11:
            cmp r8,r9 shl r7
            cs sub r8,r8,r9 shl r7      ; if r8 >= (r9 shl r7) then r8 becomes r8-(r9 shl r7)
            adc r10,r10,r10             ; result bit is shifted left into r10
            subs r7,r7,1
            bpl armdiv11
            movs r8,r10
            mi rsb r8,r8,r7 shl 16      ; r7 was -1, so we subtract r8 from -65536
    
            mov r15,r14        ; return
            endasm
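
    For anyone who wants to poke at the two routines above without an ARM at hand, here is a rough Python model of them. Variable names follow the registers; the 32-bit masking and the helper name _divide_core are modelling assumptions, and the results are only meaningful when the divisor fits in 16 bits and the quotient fits in the ranges mentioned.

```python
# Python model of the shift-and-subtract division routines above.
# Register names follow the assembly; the 32-bit masking is a modelling
# assumption, not part of NOWUT.

MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000

def _divide_core(r8, r9, r10):
    """16 rounds of compare / conditional subtract / shift a bit into r10."""
    for r7 in range(15, -1, -1):
        shifted = (r9 << r7) & MASK32
        carry = 1 if r8 >= shifted else 0       # cmp: carry set means no borrow
        if carry:                               # cs sub r8,r8,r9 shl r7
            r8 = (r8 - shifted) & MASK32
        r10 = ((r10 << 1) | carry) & MASK32     # adc r10,r10,r10
    return r10

def armdivideu(r8, r9):
    """Unsigned 32/16 division, quotient expected to fit in 16 bits."""
    return _divide_core(r8 & MASK32, r9 & MASK32, 0)

def armdivides(r8, r9):
    """Signed version; quotient between -65535 and +65535."""
    r8 &= MASK32
    r9 &= MASK32
    r10 = MASK32 if (r8 ^ r9) & SIGN32 else 0   # eor, then sar 31
    negated = (-r8) & MASK32                    # rsbs r7,r8,0
    if not (negated & SIGN32):                  # pl mov r8,r7
        r8 = negated
    negated = (-r9) & MASK32                    # rsbs r7,r9,0
    if not (negated & SIGN32):                  # pl mov r9,r7
        r9 = negated
    r8 = _divide_core(r8, r9, r10)              # loop, then movs r8,r10
    if r8 & SIGN32:                             # mi rsb r8,r8,r7 shl 16 (r7 = -1)
        r8 = (0xFFFF0000 - r8) & MASK32
    return r8 - (1 << 32) if r8 & SIGN32 else r8
```

    The signed trick is visible in the model: r10 starts as all ones when the signs differ, so after 16 shifts the upper half is $FFFF and the final reverse-subtract from $FFFF0000 leaves the negated quotient.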

    A RISCOS option has also been added to LINKBIN, though it hasn't been tested on anything more than a 'hello world' program running in RPCEmu, since I still need to find some digestible documentation on how to do things in RISC OS.

    When testing the Amiga build I noticed that filecreate has apparently been broken all this time. That is now corrected. The previously noted problem with LINKBIN running on 8086 has been dealt with. Lastly, the 8086JPG demo now detects whether it is running on an NEC PC-98.

    Check the documentation. Download the complete archive.

    2023 May 9 So I wanted to measure the elapsed time...

    What if the NOWUT compiler could check a timer before and after compilation and then display the elapsed time? Later, if I decide to change some code for the sake of compilation speed, I'll have a built-in way of benchmarking it. The only problem is that I need to make this code for each of the platforms NOWUT can run on. How hard could it be?

    Windows has a routine called GetTickCount. It returns a count in milliseconds, although the accuracy seems to be no finer than 10-15ms. But something like this will do nicely.

    On Linux I can do this:

    mov eax,78
    lea ebx,[qwordmem]
    xor ecx,ecx
    int $80
    and get back two 32-bit words containing a count of seconds, and a count of microseconds. (The microseconds count from 0 to 999,999.) No problems here.
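
    Turning two such readings into an elapsed time is a small exercise in borrowing from the seconds field. A sketch, with the tuple layout (seconds, microseconds) and the function name being my own choices for illustration:

```python
# Elapsed time from two (seconds, microseconds) readings, as returned by
# the Linux call above. The tuple layout and function name are assumptions.

def elapsed_microseconds(start, end):
    start_sec, start_usec = start
    end_sec, end_usec = end
    # The microsecond difference may be negative; the seconds term absorbs it.
    return (end_sec - start_sec) * 1_000_000 + (end_usec - start_usec)

if __name__ == "__main__":
    print(elapsed_microseconds((10, 900_000), (12, 100_000)) / 1000.0, "ms")
```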

    Amiga can return the same information as Linux, except because it's Amiga the process looks like this:

  • call exec/createmsgport
  • call exec/createiorequest
  • open timer.device
  • call timer.device/GetSysTime
  • undo all of that before exiting to avoid memory leaks

    On Atari ST I can use DOS function $2C which returns a count of hours, minutes, and seconds packed into 16 bits. Specifically, there are 5 bits for hours, 6 bits for minutes, and 5 bits for seconds. Of course, 5 bits isn't enough to count 60 seconds, so two-second units are returned instead. This level of precision isn't the greatest, but I don't know how to get anything better without hooking a timer interrupt.

    It seems like an odd choice, considering that the 68000 has 32-bit registers. I don't know the origin of this idea that the time needs to be packed into 16 bits, but I'll note that the FAT filesystem shares the same quirk. Hence the phenomenon of files stored on FAT(32) always having a time stamp with an even number of seconds.
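
    Unpacking the 16-bit format is just shifts and masks. This sketch assumes the usual layout with hours in the top 5 bits, minutes in the middle 6, and two-second units in the bottom 5 (the same order FAT uses); the function names are made up for the example.

```python
# Decode a time packed into 16 bits: hhhhh mmmmmm sssss (seconds / 2).
# Function names are assumptions for illustration.

def unpack_time16(packed):
    hours = (packed >> 11) & 0x1F
    minutes = (packed >> 5) & 0x3F
    seconds = (packed & 0x1F) * 2      # stored in two-second units
    return hours, minutes, seconds

def pack_time16(hours, minutes, seconds):
    # Odd seconds are rounded down, which is why stored times are always even.
    return (hours << 11) | (minutes << 5) | (seconds // 2)

if __name__ == "__main__":
    print(unpack_time16(pack_time16(13, 45, 59)))
```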

    X68000 also has a DOS function $2C which returns a time packed into 16 bits. But then it has a separate function $27 which returns the time in a longer format with the full 6 bits for seconds. So that's a little better.

    Lastly, MS-DOS has an int $21 function $2C (there's that number again) which returns hours, minutes, seconds, and hundredths, across four 8-bit registers. The time updates when the PIT counter overflows, just like the BIOS timer. So despite the hundredths field, resolution is around 1/18 of a second.
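
    The 1/18 figure falls out of the PIT arithmetic. A quick sketch, using the standard 1,193,182 Hz PIT input clock and the full 65536-count divisor; the helper that flattens the four register values (CH, CL, DH, DL) into hundredths is a hypothetical convenience, not something from NOWUT.

```python
# Where "around 1/18 seconds" comes from, plus a helper for the four fields.
# dos_time_to_hundredths is a hypothetical helper for illustration.

PIT_CLOCK_HZ = 1193182      # PC timer chip input clock
PIT_DIVISOR = 65536         # counter 0 runs through its full 16-bit range

def tick_rate_hz():
    return PIT_CLOCK_HZ / PIT_DIVISOR

def dos_time_to_hundredths(ch, cl, dh, dl):
    """CH=hours, CL=minutes, DH=seconds, DL=hundredths -> hundredths since midnight."""
    return ((ch * 60 + cl) * 60 + dh) * 100 + dl

if __name__ == "__main__":
    rate = tick_rate_hz()
    print(round(rate, 4), "ticks/s,", round(1000.0 / rate, 2), "ms per tick")
```

    So the hundredths field advances in jumps of 5 or 6, not in steps of one.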

    2023 Apr 28 running Japanese programs on non-Japanese Windows

    IIRC, a default Windows 2000 install won't display CJK characters because the fonts aren't even there. Step 1 in getting things to work is to go to regional options control panel and enable Japanese.

    Setting it as the system default codepage is a separate step, and it has some side effects, but is necessary for Shift-JIS strings and filenames to display.

    (Touhou Labyrinth 2 mojibake window title, with default English codepage)
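
    The mojibake itself is easy to reproduce: the bytes are fine, they are just being read with the wrong table. A small sketch, with cp1252 standing in for the "default English codepage" (an assumption on my part) and a made-up sample string:

```python
# Same bytes, two interpretations. cp1252 stands in for the English
# system codepage here, which is an assumption; the sample text is made up.

def show_mojibake(text):
    raw = text.encode("shift_jis")                        # what the program stores
    as_japanese = raw.decode("shift_jis")                 # codepage 932: correct
    as_western = raw.decode("cp1252", errors="replace")   # wrong table: garbage
    return raw, as_japanese, as_western

if __name__ == "__main__":
    raw, good, bad = show_mojibake("テスト")
    print(raw.hex(), good, bad)
```

    Each katakana character is two bytes in Shift-JIS, so a three-character title turns into six unrelated Western characters.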

    Sadly, even that still does not fix everything. Japanese Windows has some additional fonts, different fonts, and fonts which can be referred to via different aliases. Solving this requires copying some files and a lot of fiddling with the registry. I've found it very difficult to know exactly why a program fails and exactly which registry setting might fix it. It seems that Fonts, FontMapper, FontSubstitutes, and SystemLink all figure into this.

    (FontSubstitutes registry branch showing alternate Japanese font names)

    After setting the system default codepage and copious addition of fonts and font registry settings, Touhou Labyrinth 2 looks like this:

    But other programs still have problems. For instance, this RPG Maker game, ToK:

    Why, oh why, is it still broken? When the game calls GetLocaleInfo, the response still indicates a non-Japanese environment. Even though the system codepage is set to 932 (Japanese), there is also a user codepage. And it's another separate setting.

    Why does the game call GetLocaleInfo in the first place if it only works in one locale anyway? I doubt the game developers even wrote this code. It's likely part of some boilerplate library/runtime/framework code which would have made sense in the context of true multi-lingual software but in this case only results in incorrect handling of Shift-JIS text.

    Setting this to Japanese has more side effects. Date and currency formats will change. Hovering over the taskbar clock will no longer tell you the day of the week. Maybe other weird things like this. But it fixes ToK, Touhou Kokishin, and probably other software too.

    Aside from changing the user locale, there might be some other workarounds. Hacking .EXEs would be one. Perhaps creating a separate user account with Japanese locale, and then starting the game for that user with RUNAS? Or maybe fiddling with something in the registry under HKEY_CURRENT_USER\Control Panel\International?

    The story doesn't end here, as there is still one game with invisible text...

    2023 Apr 22 tracking on Windows 2000

    List of some known-working Tracker / DAW programs:

    OpenMPT 1.30.09.01 RETRO
    Furnace 0.6pre4-hotfix Win32 build
    Renoise 32-bit v3.1.0 (use innosetupextractor to bypass the installer)
    BambooTracker v0.4.0
    DefleMask v0.12.1
    0CC-LL Tracker 1.0.0.0 (FamiTracker OPLL fork)
    VGM Music Maker v1.1
    TFM Music Maker v1.52
    
    2023 Apr 22 NOWUT linker (LINKBIN) bug

    The COFF spec is not very mindful of word alignment. For instance, a relocation is 10 bytes long and symbol data is 18 bytes long, even though both contain dword fields. This is not a problem for 386 or 68020 CPUs since they allow misaligned memory access. MIPS and SuperH don't allow this, but there is currently no LINKBIN build that runs on these architectures. The 68000/68010 tolerate misaligned 32-bit words only when they fall on a 16-bit boundary. (Not knowing off the top of my head whether Hatari or WinX68 emulate bus errors, I have to wonder if a bug related to alignment on 68000 could still be lurking.)
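
    The record sizes are easy to check with Python's struct module, and the same sketch shows which dword fields end up off a 4-byte boundary when records are packed back to back. The field lists follow the COFF relocation and symbol layouts; the assumption is that each table starts at a 4-byte-aligned offset, which real files don't guarantee.

```python
import struct

# COFF relocation: VirtualAddress (dword), SymbolTableIndex (dword), Type (word)
RELOCATION = struct.Struct("<IIH")
# COFF symbol: Name (8 bytes), Value (dword), SectionNumber, Type (words),
# StorageClass, NumberOfAuxSymbols (bytes)
SYMBOL = struct.Struct("<8sIhHBB")

def misaligned_dwords(record_size, dword_offsets, count):
    """Table offsets of dword fields that are not 4-byte aligned.

    Assumes the table itself starts at an aligned offset (an assumption).
    """
    bad = []
    for index in range(count):
        for field in dword_offsets:
            position = index * record_size + field
            if position % 4:
                bad.append(position)
    return bad

if __name__ == "__main__":
    print(RELOCATION.size, SYMBOL.size)
    print(misaligned_dwords(RELOCATION.size, (0, 4), 4))
    print(misaligned_dwords(SYMBOL.size, (8,), 4))
```

    Every second record is the problem one, since both sizes are even but not multiples of four.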

    However, misaligned dwords are potentially a problem for 8086. If the low word happens to fall on offset $FFFE from a particular segment, then the high word can't be loaded simply by adding 2 (as it would wrap around to $0000). This will need to be fixed in the next release!
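
    The wraparound is easy to demonstrate with a model of real-mode addressing. Normalizing the pointer first (folding most of the offset into the segment) is shown here as one possible workaround; that choice is my assumption and not necessarily how LINKBIN will handle it.

```python
# Model of 8086 segment:offset addressing and the $FFFE wraparound.
# normalize() is one possible workaround, shown as an assumption.

def linear(segment, offset):
    """Physical address of segment:offset (16-bit offset, 20-bit bus)."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

def normalize(segment, offset):
    """Equivalent pointer with the offset reduced to the range 0..15."""
    address = linear(segment, offset)
    return address >> 4, address & 0xF

if __name__ == "__main__":
    seg, off = 0x1000, 0xFFFE
    print(hex(linear(seg, off)))           # low word
    print(hex(linear(seg, off + 2)))       # naive high word: offset wrapped to 0
    nseg, noff = normalize(seg, off)
    print(hex(linear(nseg, noff + 2)))     # high word via normalized pointer
```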

    2023 Apr 13 no updates for a while...

    Plenty of projects, but no significant milestones lately. ARM support for NOWUT is in progress. RaeN is almost due for an update. Disassemblers are almost due. Z280 homebrew is still under consideration.

    In the meantime, IMGTOOL 0.96 was released in 2020 and has accumulated a few changes and fixes since then. So how about an IMGTOOL 0.97 release? Just ignore the 'save as Tex' option which is for an unfinished custom image format.

    Download the new archive or check the documentation.

    2023 Jan 24 GoLink compatibility

    I noticed yesterday that GoLink 1.0.4.1 fails to run on Win95 or NT 4. The OS and Subsystem versions in the PE header changed to 5,1 at some point. There doesn't seem to be any real reason for it though, as the imports list is exactly the same as it was on older versions which did run. Changing the values back to 4,0 should solve the issue.
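
    Changing the values doesn't need a hex editor. A sketch of the patch, operating on a bytearray holding the whole file: it relies on the PE layout where the optional header follows the 4-byte signature and 20-byte COFF header, with the OS version words at offset 40 and the subsystem version words at offset 48. The function name is made up, and I'd keep a backup of the original EXE.

```python
import struct

def patch_pe_versions(image, os_version=(4, 0), subsystem_version=(4, 0)):
    """Rewrite the OS and subsystem version fields of a PE image in place.

    image must be a bytearray containing the whole file.
    """
    if image[0:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)    # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    optional = pe_offset + 4 + 20           # skip signature and COFF header
    # Major/MinorOperatingSystemVersion, then Major/MinorSubsystemVersion
    struct.pack_into("<HH", image, optional + 40, *os_version)
    struct.pack_into("<HH", image, optional + 48, *subsystem_version)
    return image
```

    Usage would be along the lines of reading the file into a bytearray, calling patch_pe_versions on it, and writing the result to a new file.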

    2023 Jan 5 Disassembling stuff for the win

    Documentation for MmMapIoSpace says that it takes three parameters. But I tried to use this routine and it kept returning 0. Hmmm. So I looked for a driver that called MmMapIoSpace. ATAPI.SYS was one. I disassembled it and noticed that it had four PUSHes before the CALL. Hmmm again. Then I disassembled NTOSKRNL.EXE and found the entry point for the routine itself. It ends with ...drum roll... a RET $10.

    Probably 99% of all parameters for Win32 functions are a 32-bit word. This is one of the 1% cases where they decided to use two 32-bit words (for the physical address) and count it as one parameter. Surprise!

    2023 Jan 1 NOWUT programs to convert between .S98 and .VGM

    These chiptune file formats essentially store the same data in a slightly different way, but software does not always support them both. For the simple case of OPN/OPN2/OPNA chips at least, one file type can easily be converted to the other.

    VGMtoS98 source and S98toVGM source

    Old updates

    entries from 2021-2022

    entries from 2020 and prior