clownmdemu – frontend v0.4 & libretro v0.2

Custom Dear ImGui Style

Dear ImGui’s default style looked a bit tacky and ‘programmer art’-like to me, so I’ve made my own:

I’ve tried to create a typical ‘dark mode’ theme while still keeping a high degree of contrast in order to maintain good legibility. Another focus of the style is technical and artistic minimalism, hence the removal of borders, tab/scrollbar rounding, and colour (colour introduces visual noise, and rounding greatly increases the number of rendered polygons).

Optimised VRAM Viewer

The VRAM viewer wasn’t very efficient: it would try to render every single tile even when only a fraction of them were actually visible to the user. This is something that can be seen with Dear ImGui’s built-in debugger:

The visualiser shows a huge number of tiles drawn above and below the window.

As the debugger shows, hidden tiles are rendered above and below the viewable region. This causes 4096 polygons to be drawn in total, when far fewer are actually needed.

I discovered that Dear ImGui has a feature specifically intended to address this kind of problem: the List Clipper! The List Clipper automates the process of selecting only the elements in a list that are visible to be rendered. Applying this to the VRAM viewer only required the smallest bit of refactoring (changing a single for-loop that iterated over each tile into two for-loops that iterate over each row and then each tile in each row) as well as the removal of a small hack, and the problem was solved!

The visualiser now only shows the tiles that are visible.

In this example, the number of polygons has been reduced to just over 500!
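To illustrate the idea behind this kind of clipping, here is a minimal sketch in plain C of the arithmetic involved. It is not Dear ImGui's code nor the frontend's: the names `ComputeVisibleRows` and `CountDrawnTiles`, and the assumption that every row has the same height and that the scroll offset is non-negative, are mine.

```c
#include <assert.h>

/* Hypothetical helper type: a half-open range [first, last) of visible rows. */
typedef struct VisibleRows {
    int first;
    int last;
} VisibleRows;

static int ClampInt(int value, int low, int high)
{
    return value < low ? low : value > high ? high : value;
}

/* Works out which rows overlap a view that starts at scroll_y and is
   view_height pixels tall. All rows are assumed to be row_height tall. */
VisibleRows ComputeVisibleRows(int total_rows, int row_height, int scroll_y, int view_height)
{
    VisibleRows rows;

    assert(row_height > 0);

    /* First row that overlaps the top edge of the view. */
    rows.first = ClampInt(scroll_y / row_height, 0, total_rows);
    /* One past the last row that overlaps the bottom edge (rounded up). */
    rows.last = ClampInt((scroll_y + view_height + row_height - 1) / row_height, 0, total_rows);

    return rows;
}

/* The two nested loops: rows first, then the tiles within each row.
   Returns how many tiles would be submitted for drawing. */
int CountDrawnTiles(int total_tiles, int tiles_per_row, int row_height, int scroll_y, int view_height)
{
    const int total_rows = (total_tiles + tiles_per_row - 1) / tiles_per_row;
    const VisibleRows rows = ComputeVisibleRows(total_rows, row_height, scroll_y, view_height);
    int drawn = 0;
    int row, column;

    for (row = rows.first; row < rows.last; ++row)
        for (column = 0; column < tiles_per_row; ++column)
            if (row * tiles_per_row + column < total_tiles)
                ++drawn; /* A real viewer would draw the tile here. */

    return drawn;
}
```

With 4096 tiles laid out 32 to a row, 16-pixel rows, and a 200-pixel-tall view scrolled down by 100 pixels, only rows 6 to 18 are visited. The exact numbers depend on the window size, so they will not match the screenshot's figures.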

Support for the YM2612’s Timers

Huzzah – an improvement to the actual Mega Drive emulation!

Background

The Mega Drive’s primary sound chip – the YM2612 – has a nifty little feature that went tragically underused by Sega: a pair of timers that are capable of raising CPU interrupts. This is notable because the only other interrupts of this kind are the VDP’s V-blank and H-blank interrupts, but those interrupts’ timings vary based on whether the console is a PAL or NTSC model. Additionally, these timers are fully configurable, allowing for the possibility of arbitrarily-timed interrupts!

…And Sega didn’t think to connect these timers to either of the Mega Drive’s CPUs. What a waste.

These timers are still usable, but they must be manually polled by the CPU to check if they’ve expired yet. This wastes precious CPU time, and squanders the timers’ potential for certain uses.

Games typically use these timers for controlling the timing of their sound engines. There are two alternative ways to achieve this, but they both have their own downsides:

One way to control the speed of music and sound effects is to use the V-blank interrupt. However, this interrupt occurs less often on PAL consoles, in turn causing the sound engine to update less often. This causes music and sound effects to play slower on PAL consoles, a problem perhaps best known from Sonic the Hedgehog (1991). The game’s sequels avoid this by detecting PAL consoles and forcing every fifth V-blank interrupt to update the music twice, resulting in it having roughly the correct speed, albeit with some minor distortion. Using the YM2612’s timers instead would have avoided this issue entirely as they are almost exactly the same on PAL consoles as they are on NTSC consoles.

Another method of controlling the speed of audio is to manually time code execution by writing the code so that it uses a certain number of CPU cycles. This approach is quite extreme, but it does see heavy use in Z80 code that is responsible for feeding PCM samples to the YM2612’s DAC channel (such code is called a ‘DAC driver’). Most DAC drivers use idle loops to waste CPU cycles until it is time to send the next sample. Were the YM2612’s timers capable of raising CPU interrupts, then this technique would be largely unnecessary except in the most advanced DAC drivers.
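The PAL workaround used by the Sonic sequels, as described above, can be sketched as follows. This is only an illustration of the arithmetic, not the games' actual code: the function names are made up, the rates are the nominal 50 and 60 V-blanks per second, and which of the five V-blanks gets the extra update is an arbitrary choice here.

```c
/* Returns how many times the sound engine should update on this V-blank. */
int SoundUpdatesForVBlank(unsigned int vblank_counter, int is_pal)
{
    /* On PAL, every fifth V-blank performs a second update,
       turning 5 V-blanks into 6 sound engine updates. */
    if (is_pal && vblank_counter % 5 == 4)
        return 2;

    return 1;
}

/* Totals the sound engine updates over one second's worth of V-blanks. */
unsigned int SoundUpdatesPerSecond(unsigned int vblanks_per_second, int is_pal)
{
    unsigned int total = 0;
    unsigned int i;

    for (i = 0; i < vblanks_per_second; ++i)
        total += (unsigned int)SoundUpdatesForVBlank(i, is_pal);

    return total;
}
```

Since 50 × 6 / 5 = 60, both console types end up with the same number of updates per second, but on PAL they are unevenly spaced, which is where the minor distortion comes from.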

Implementation

The YM2612 timers are quite rudimentary: every time the YM2612 outputs a full frame of audio, Timer A is decremented, and every 16 times a full frame is output, Timer B is decremented.

Unfortunately, I did not notice this at first, and needlessly refactored my YM2612 emulator to operate in cycles instead of audio frames (there are 144 cycles to 1 audio frame). The reason that I made this mistake was that the (unofficially translated) official documentation for the YM2608 (the chip that the YM2612 is derived from) states that Timer A decrements every 72 cycles, not 144. And yet, my own testing and the documentation found here both suggest that the timers are half as fast as the YM2608 manual claims that they are. This may have something to do with how the YM2612 differs from the YM2608 in how it “mixes” all six channels together (it doesn’t: it just cycles between outputting each one hundreds of thousands of times per second).
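The timer behaviour described in this section can be sketched like so. This is a simplified model rather than the emulator's real code: the `Timers` struct, its field names, and the "count down from a period, then reload" scheme are assumptions for illustration, and the way the real chip encodes timer periods in its registers is not modelled.

```c
#include <assert.h>

#define CYCLES_PER_AUDIO_FRAME 144
#define FRAMES_PER_TIMER_B_TICK 16

/* Hypothetical timer state; periods are measured in timer ticks. */
typedef struct Timers {
    unsigned int cycle_accumulator;         /* Cycles that have not yet formed a full audio frame. */
    unsigned int frames_until_timer_b_tick; /* Timer B only ticks once every 16 audio frames. */
    unsigned int timer_a_period, timer_a_counter;
    unsigned int timer_b_period, timer_b_counter;
    int timer_a_expired, timer_b_expired;
} Timers;

void Timers_Init(Timers *timers, unsigned int timer_a_period, unsigned int timer_b_period)
{
    assert(timer_a_period > 0 && timer_b_period > 0);

    timers->cycle_accumulator = 0;
    timers->frames_until_timer_b_tick = FRAMES_PER_TIMER_B_TICK;
    timers->timer_a_period = timers->timer_a_counter = timer_a_period;
    timers->timer_b_period = timers->timer_b_counter = timer_b_period;
    timers->timer_a_expired = timers->timer_b_expired = 0;
}

/* Decrements a counter; on reaching zero, raises the flag and reloads. */
static void TickCounter(unsigned int *counter, unsigned int period, int *expired)
{
    if (--*counter == 0) {
        *expired = 1;
        *counter = period;
    }
}

/* Advances the timers by a number of YM2612 cycles. */
void Timers_Advance(Timers *timers, unsigned int cycles)
{
    timers->cycle_accumulator += cycles;

    while (timers->cycle_accumulator >= CYCLES_PER_AUDIO_FRAME) {
        timers->cycle_accumulator -= CYCLES_PER_AUDIO_FRAME;

        /* Timer A ticks once per audio frame. */
        TickCounter(&timers->timer_a_counter, timers->timer_a_period, &timers->timer_a_expired);

        /* Timer B ticks once per 16 audio frames. */
        if (--timers->frames_until_timer_b_tick == 0) {
            timers->frames_until_timer_b_tick = FRAMES_PER_TIMER_B_TICK;
            TickCounter(&timers->timer_b_counter, timers->timer_b_period, &timers->timer_b_expired);
        }
    }
}
```

Because nothing is wired to an interrupt line, software would have to poll the two `expired` flags itself, which mirrors the situation on the real console.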

Results

Since these timers are typically used to time sound engines, not emulating them causes many games to output no audio. Such games include Vectorman and Castlevania: Bloodlines. With support for the timers added, these games now produce audio.

Another game affected by this is an old ROM-hack of mine – ‘Sonic 2 except the music goes as fast as you do’:

Unlike the vanilla Sonic the Hedgehog 2, this ROM-hack uses Timer A to control the speed of the sound engine. By adjusting the timer, the speed of the music and sound effects is changed!

About Menu

clownmdemu’s frontend is a rather complex bit of software, so, to do the things that it does, it leverages a number of open-source libraries (and fonts). These libraries are made available under certain conditions: for example, a library may require that its authors are credited in the documentation of any software that uses it, while other libraries go a step further and require that an entire copy of the library’s licence is provided in said documentation. The libraries used by my frontend tend to require the latter.

Being a minimalist, I don’t like the idea of every release of my frontend bundling the executable with a dozen text files, so instead I had the idea of embedding the licences into the frontend itself. To this end, there is now an ‘About’ menu which gives a brief overview of what the program is and provides a list of open-source licences.

This makes it a lot easier for myself and anyone else who uses my emulator to abide by the various licences, as there no longer has to be any worry about forgetting to reproduce licences with every binary distribution.

Personally, I think the requirement to reproduce a big blob of legalese with every binary distribution of non-copyleft software is stupid, which is why my libraries are usually zlib- or 0BSD-licensed instead, as they don’t have that requirement.

Support for the Window Plane

Another big feature for Mega Drive emulation!

The Window Plane is an oddity to me: the first two Sonic games (whose codebases I am very familiar with due to spending over a decade reverse-engineering them) never use it, so it’s completely alien to me. That’s why it took me so long to add support for the Window Plane: I had neither the need nor the knowledge to.

The Window Plane is a bizarre override for Plane A. Unlike Plane A, the Window Plane cannot be scrolled, but it is otherwise capable of everything that Plane A is. The Window Plane is not rendered on top of Plane A, like Plane A is to Plane B; rather, the Window Plane renders instead of Plane A. The VDP specifies two boundaries – one vertical and one horizontal – that determine where Plane A stops being drawn and the Window Plane starts being drawn instead (or vice versa).
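A rough sketch of this "drawn instead of Plane A" rule is below. It is a simplified model with hypothetical names, not the emulator's code: boundaries are expressed in pixels for clarity (the real VDP stores them at a coarser granularity), and it assumes that a pixel belongs to the Window Plane when it falls on the window's side of either boundary.

```c
/* Hypothetical description of the two Window Plane boundaries. */
typedef struct WindowBoundaries {
    int horizontal_boundary; /* X position of the left/right split, in pixels. */
    int window_is_right;     /* Non-zero: window covers x >= boundary; zero: x < boundary. */
    int vertical_boundary;   /* Y position of the top/bottom split, in pixels. */
    int window_is_below;     /* Non-zero: window covers y >= boundary; zero: y < boundary. */
} WindowBoundaries;

/* Returns non-zero if the Window Plane should be drawn at this pixel
   in place of Plane A. */
int PixelUsesWindowPlane(const WindowBoundaries *boundaries, int x, int y)
{
    const int in_horizontal_region = boundaries->window_is_right
        ? x >= boundaries->horizontal_boundary
        : x < boundaries->horizontal_boundary;

    const int in_vertical_region = boundaries->window_is_below
        ? y >= boundaries->vertical_boundary
        : y < boundaries->vertical_boundary;

    /* Assumption: either region is enough to select the Window Plane. */
    return in_horizontal_region || in_vertical_region;
}
```

For example, a HUD occupying the top 32 lines of the screen would use a vertical boundary of 32 with the window above it, and a horizontal boundary of 0 with the window to its left so that the horizontal test never passes.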

This feature tends to be used for drawing a HUD that does not scroll with the game’s foreground or background. An example of a game that does this is Castlevania: Bloodlines:

This feature has a glaring bug on real Mega Drives: if the Window Plane is drawn to the left of Plane A, and Plane A is scrolled horizontally by a number of pixels that is not a multiple of 16, then the two columns of Plane A tiles that are next to the Window Plane will be “disfigured”. This bug is noted in Sega’s official Mega Drive developer documentation (the “Genesis Software Manual”, page 50). My emulator does not yet reproduce this bug, but I plan to add it in the future.

One other bit of software that I have on-hand to test the Window Plane is a little bit of homebrew that I found here, which was made by someone called ‘Fonzie’. It’s useful for illustrating the aforementioned bug on real Mega Drives, but it’s also handy for testing that emulators support the Window Plane. Here’s a screenshot of it running in my emulator:

Like the Sprite Plane, Plane A, and Plane B, the Window Plane can be disabled in my frontend’s debugging toggles menu. As a novelty, disabling the Window Plane will cause Plane A to be drawn in its place. This allows the user to see the part of Plane A that is “hidden” by the Window Plane.

Support for the 68000’s BCD Instructions

The 68000 has three instructions for performing binary-coded-decimal arithmetic. The long and short of it is that performing a BCD addition between 0x1 and 0x9 results in 0x10 instead of 0xA. This is useful in situations where you need to extract individual decimal digits from a number but don’t want to resort to repeatedly dividing by 10 to do so, as, if the number is in BCD format, you can instead just use bit-shifts and bit-masks, which are way faster than divisions. The two games that I know of which use BCD instructions use them for HUD elements, which makes sense since each digit needs to be extracted so that it can be used to determine which number graphic to display on the HUD.
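As a small illustration of the idea, here is a sketch of BCD addition and digit extraction in C. It only handles valid BCD inputs and makes no attempt to reproduce the 68000's flags or its behaviour with non-BCD values; the function names are made up for this example.

```c
/* Adds two valid packed-BCD bytes (two decimal digits each).
   Sets *carry_out to 1 if the result exceeded 99, otherwise 0. */
unsigned int BcdAdd(unsigned int a, unsigned int b, unsigned int *carry_out)
{
    unsigned int low = (a & 0x0F) + (b & 0x0F);
    unsigned int high = ((a >> 4) & 0x0F) + ((b >> 4) & 0x0F);

    /* If the low digit overflowed past 9, carry into the high digit. */
    if (low > 9) {
        low -= 10;
        ++high;
    }

    /* If the high digit overflowed past 9, carry out of the byte. */
    if (high > 9) {
        high -= 10;
        *carry_out = 1;
    } else {
        *carry_out = 0;
    }

    return (high << 4) | low;
}

/* Extracts a decimal digit from a BCD number using only a shift and a mask.
   Digit 0 is the least significant digit. */
unsigned int BcdDigit(unsigned int value, unsigned int digit_index)
{
    return (value >> (digit_index * 4)) & 0x0F;
}
```

Adding 0x01 and 0x09 with `BcdAdd` gives 0x10, and `BcdDigit(0x10, 1)` then yields the digit 1 without any division.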

The reason for these instructions taking so long to be implemented, besides them being quite niche, is that they have an absurd number of undocumented behaviours and edge-cases. For instance, the BCD instructions are some of the only instructions to have their overflow condition code behaviour officially specified by Motorola themselves as ‘undefined’, meaning that the official documentation is of no help in understanding how it works. The documentation also fails to explain what is supposed to happen when a BCD operation is performed on non-BCD numbers. For instance, what’s supposed to happen when 2 is added to 0xC? Is the result 0xE? 0x12? 0x10?

Luckily, other emulator developers encountered this problem too, and exhaustively documented how a real 68000 performs its BCD operations, figuring out every quirk and feature. This information has been collected in this SpritesMind thread. Most notably, Flamewing provided some homebrew that you can run on an emulator or a real Mega Drive to verify that its BCD instructions do as they should. Safe to say, my emulator now passes all of this homebrew’s tests!

With this, three of the 68000’s quirkiest instructions are now fully emulated!

The lack of BCD instruction emulation broke Castlevania: Bloodlines in a couple of humorous ways: it was impossible to use items because the ‘gem’ counter would never go above 0, and it was also not possible to get a Game Over because the lives counter would never go below 0, which effectively gave the player infinite lives.

Vastly-Improved Z80 Emulation Accuracy

Some people have told me that certain Sonic ROM-hacks were missing their DAC audio output when run in my emulator. One such hack that I was able to verify this with was Sonic 1 Megahack: Ultra Edition, which lacked the music and sound effects during the “Sonic have pased” popup after completing a level.

Unlike my 68000 emulator, my Z80 emulator had not been verified against any test suites, so it was likely that my Z80 emulator had a number of bugs and inaccuracies that these ROM-hacks were invoking, causing them to misbehave and not output audio through the DAC channel.

Back when I initially developed my Z80 emulator, I had read a blog post that detailed something called ‘ZEXDOC’, which is a program that is written in Z80 assembly and can be run on both real and emulated Z80 CPUs to verify that each instruction performs properly, which it does by analysing RAM, registers, and flags before and after each instruction. The catch was that this program is intended to be run on CP/M, which is an old operating system for the Z80. With that said, the dependency on CP/M was quite minimal: ZEXDOC only used two of CP/M’s console-printing system calls, and expected itself to be placed at address 0x100.

I was able to quickly rig up a small program that implemented just enough of CP/M to get ZEXDOC running on my Z80 emulator, and it signalled that many instructions were not working as intended:

Around 50% of the tests failed.
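For reference, a CP/M shim of this kind might look something like the sketch below. It is not the program that I wrote, just an illustration: it assumes the two system calls in question are BDOS function 2 (print the character in register E) and function 9 (print the ‘$’-terminated string pointed to by register DE), both reached by calling address 0x0005 with the function number in register C. The `Console` type and the `HandleBdosCall` function are hypothetical.

```c
#include <stddef.h>

#define CPM_MEMORY_SIZE 0x10000
#define CPM_PROGRAM_ADDRESS 0x100 /* Where the test program expects to be loaded. */
#define CPM_BDOS_ADDRESS 0x0005   /* Programs CALL this address to make a system call. */

/* Hypothetical sink for the emulated console's output. */
typedef struct Console {
    char text[256];
    size_t length;
} Console;

static void Console_Put(Console *console, char character)
{
    /* Leave room for the terminating null character. */
    if (console->length + 1 < sizeof(console->text)) {
        console->text[console->length++] = character;
        console->text[console->length] = '\0';
    }
}

/* To be called by the host when the emulated Z80's program counter reaches
   CPM_BDOS_ADDRESS. 'memory' must point to CPM_MEMORY_SIZE bytes.
   Returns 1 if the call was handled, or 0 if the function is unsupported. */
int HandleBdosCall(Console *console, const unsigned char *memory, unsigned int register_c, unsigned int register_de)
{
    switch (register_c) {
    case 2: /* Print the character held in register E (the low byte of DE). */
        Console_Put(console, (char)(register_de & 0xFF));
        return 1;

    case 9: /* Print the '$'-terminated string that DE points to. */
    {
        unsigned int address = register_de & 0xFFFF;
        unsigned int i;

        /* The loop is bounded so that a missing '$' cannot hang the host. */
        for (i = 0; i < CPM_MEMORY_SIZE && memory[address] != '$'; ++i) {
            Console_Put(console, (char)memory[address]);
            address = (address + 1) & 0xFFFF;
        }

        return 1;
    }

    default:
        return 0;
    }
}
```

After handling the call, the host would also need to make the emulated Z80 return to its caller, as if a RET instruction had been executed at address 0x0005.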

Unfortunately, ZEXDOC gave little feedback as to exactly which instructions went wrong and how they went wrong. Not to mention, if my emulator is performing Z80 instructions incorrectly, and ZEXDOC uses Z80 instructions to perform the calculations that determine whether other Z80 instructions work correctly, then how can I be sure that its calculations are being performed correctly? What if some of these failed tests were false positives?

So I couldn’t trust ZEXDOC’s output, nor make attempts to narrow down exactly what inaccuracies my Z80 emulator had.

This problem didn’t exist with my 68000 emulator’s test suite, as the actual tests are not performed by the emulated 68000 itself but rather the host computer’s CPU. If I could find a test suite like that for the Z80, then that would be a great help. Not to mention that a test suite like the one used for my 68000 emulator would do a much better job of helping me verify exactly which instructions are incorrect and why, since it lists the exact contents of each register and memory address before and after the tested instruction’s execution.

I was in luck, because I stumbled across a Z80 test suite that was almost identical to the test suite that I used for my 68000 validator!

Due to its similarity to the 68000 test suite, I was able to make a modified version of my 68000 validator read the Z80 test suite’s data, and begin performing said tests on my Z80 emulator. It immediately uncovered numerous bugs, of which some were minor (such as undocumented flag behaviours) and others were severe (such as entire swaths of instructions using the wrong operands).

Soon, I had my Z80 emulator passing every test, with the exception of instructions which I had yet to implement in the first place such as IN, OUT, CPI, CPIR, and DAA. Now confident that my emulator could run ZEXDOC at least somewhat properly, I tried it again and, this time, most of the tests passed:

Out of the 50 or so tests, only 3 failed.

The only failed tests were for instructions which I had not yet implemented in my emulator. I have never seen these instructions used in any Z80 code for the Mega Drive, be it code from official games or homebrew. Because of this, implementing these instructions was never a priority for me, and I had no test-cases for them either. However, because both the Z80 test suite and ZEXDOC provide exhaustive tests for these instructions, I decided that I would finally add them to my emulator. Eventually, these new instructions passed both validators, meaning that every ZEXDOC test passed:

0 failed tests!

There are still three instructions that are not yet implemented: HALT, IN, and OUT. The latter two are not implemented because they are special instructions that read from and write to “IO ports”, which do not exist on the Mega Drive.

The instructions that I did implement are CPI, CPD, CPIR, CPDR, and DAA. The first four are all variants of each other, and are used for searching for a particular byte in a block of memory (much like C’s memchr function). The DAA instruction is for performing “BCD correction” to the accumulator register. It essentially performs the same task as the second half of the 68000’s ABCD, NBCD, and SBCD instructions: it computes a ‘correction factor’ based on the output of the previous arithmetic instruction, and adds it to the output to make it into a valid BCD number.

Now that my Z80 emulator is much closer to behaving like a real Z80, those Sonic ROM-hacks should all have working DAC audio output, right?

Well, the joke’s on me, because it turns out that Sonic 1 Megahack: Ultra Edition still has missing DAC audio. But why? Why does the audio still not work despite the Z80 emulation being so much more accurate? Well… it’s because the audio doesn’t work on a real Mega Drive!

Yep, you read that right: that ROM-hack only works properly on inaccurate emulators, because that’s all that the developer used to test the game during its development. This is actually a fairly common occurrence when it comes to ROM-hacks, since running a ROM-hack on a real Mega Drive requires an expensive flash-cartridge, so most developers just stick to testing exclusively with emulators.

So, in the end, the audio in Sonic 1 Megahack: Ultra Edition was bugged, not because my emulator was inaccurate, but because it was too accurate. I suppose I should be proud of that.

Improved FM Debug Menu

While debugging Sonic 1 Megahack: Ultra Edition’s missing audio, I realised that the FM debugger could be improved, so I’ve moved the per-FM-channel data out of the tabbed section and into a shared table, and exposed the timers, latched address and port, and channel panning.

Fix 1-Cell Horizontal Scrolling Mode

This is another bug that I didn’t notice because no Sonic game or ROM-hack that I know of uses it. The Mega Drive’s VDP has three ways of scrolling the screen horizontally:

  • Scrolling the entire screen.
  • Scrolling each row of pixels individually.
  • Scrolling each row of pixels in groups of 8 (the size of a tile, or, as Sega’s official documentation calls it, a “cell”).

This bug involves that last one: it just didn’t work at all. This can be seen in Earthworm Jim’s “What the heck?” level:

So what’s going on? When I first wrote my VDP emulator, I assumed that 1-cell mode kept all of its scroll values next to each other in memory, just like in 1-line mode. However, that is not the case: strangely, each cell’s scroll value is actually spaced 8 values apart. Correcting this behaviour fixes Earthworm Jim, and presumably everything else that uses 1-cell mode.

Groovy.
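The difference between the scrolling modes' table layouts can be sketched as below. This is an illustration rather than the emulator's actual code: the enum and function names are hypothetical, and an "entry" here means one scanline's worth of scroll values as laid out in 1-line mode.

```c
/* Hypothetical names for the three horizontal scrolling modes. */
typedef enum HScrollMode {
    HSCROLL_FULL_SCREEN,
    HSCROLL_ONE_CELL,
    HSCROLL_ONE_LINE
} HScrollMode;

/* Returns the index of the scroll table entry used by a given scanline. */
unsigned int HScrollEntryIndex(HScrollMode mode, unsigned int scanline)
{
    switch (mode) {
    case HSCROLL_ONE_LINE:
        /* Every scanline has its own entry, stored consecutively. */
        return scanline;

    case HSCROLL_ONE_CELL:
        /* One entry per 8 scanlines, but the entries are spaced 8 apart,
           as if the table were still laid out for 1-line mode. */
        return (scanline / 8) * 8;

    case HSCROLL_FULL_SCREEN:
    default:
        /* A single entry is shared by the whole screen. */
        return 0;
    }
}

/* The mistaken assumption: entries for each cell packed next to each other. */
unsigned int HScrollEntryIndexOldAssumption(unsigned int scanline)
{
    return scanline / 8;
}
```

For scanline 17, the corrected 1-cell lookup reads entry 16, whereas the old assumption would have read entry 2.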

Add General VDP Debugger

To diagnose the above bug, I needed information on what the VDP was doing, so I added a menu to show the VDP’s various settings. It’s not all that pretty, but it gets the job done for now:

It was with this that I figured out that Earthworm Jim’s scrolling was only broken when the VDP was in 1-cell horizontal scrolling mode, so this menu should prove useful in the future too.

Abandon ‘tiny file dialogs’ Library

Dear ImGui and SDL2 are great and all, but they don’t provide a cross-platform way to browse files. This is a problem since it’s important for the user to be able to select a ROM image or a save-state file to load or save. To this end, the frontend made use of the ‘tiny file dialogs’ library, which enables the use of the operating system’s standard file dialogs.

Unfortunately, the library is as much a help as it is a burden: its code is of questionable quality, producing various compiler warnings for such novice mistakes as returning pointers to local arrays. In addition, despite emphasising portability and supporting POSIX, the library is incompatible with the BSDs. Finally, its support for symbolic links is allegedly completely broken.

With the library being riddled with code hygiene issues and actively limiting the portability of my frontend, I’ve decided to ditch it. In its place, I’ve added a barebones file input prompt that leverages Dear ImGui.

Neither C++11 nor SDL2 provides a way of querying directories for their contents, so this is the only universally-compatible solution. I could have used C++17’s file-system API, but I worry that such a new API is not yet widely supported.

Of course, using such a limited, clunky way of opening files would harm the frontend’s usability, so I intend to add platform-specific logic to use the native file dialogs whenever possible. Right now, this has been done for Windows, and I intend to do the same for Linux (GTK and Qt) soon. Users of other operating systems will have to get comfy with the barebones dialog, but it’s still an improvement over the frontend not compiling at all.

The barebones file dialog supports drag-and-drop: just drag the desired file onto the window and its path will automatically be entered into the text box. The file dialog doesn’t even have to be open: ROMs and save states can be dragged onto the window at any time, and the frontend will apply them appropriately. Intuitiveness is great.

Fix VRAM Fill

The Mega Drive’s Video Display Processor has three Direct Memory Access modes: 68000 to VDP, VRAM Fill, and VRAM to VRAM. The first is used quite often and implemented in my emulator, the second is much less common and supported by my emulator but not well tested, and the third is rare and not yet in my emulator at all.

Across a large number of games (including Sonic, Castlevania, Vectorman, and Earthworm Jim), VRAM Fill is seemingly only used for setting large swaths of VRAM to 0. This basic usage is easy to emulate, but does not serve well to ensure that said emulation is entirely accurate to the behaviour of a real Mega Drive.

Cue Mega Man: The Wily Wars, which makes some interesting use of VRAM Fill: at the start of VRAM, it stores a handful of blank tiles, each of a different colour – these tiles are all generated with VRAM Fill.

This behaviour was enough to expose issues with my emulator’s implementation of VRAM Fill:

There are strange lines running down everything.
Oh dear.

Looking at the VRAM debugger shows exactly what’s going wrong:

Every other pair of pixels is not being filled with the correct colour. I immediately had my suspicions about the cause of this: back when I was first writing the VDP emulator in September 2021, I learnt that VRAM is apparently 8-bit, while CRAM and VSRAM are 16-bit. This caught me by surprise, as I’d always assumed that all three of them are 16-bit, so I was not sure what the proper way to implement this quirk was. Instead, I implemented VRAM as 16-bit, as I was originally going to. I figured that this might cause certain edge-cases – such as uploading data to an odd VRAM address – to behave incorrectly, but otherwise it would not be an issue. However, seeing this bug in Wily Wars convinced me that this workaround had to go.

It turns out that many aspects of my emulator’s implementation of VRAM Fill were incorrect: for instance, it mistakenly assumed that the length was measured in 16-bit words, when in reality it was measured in 8-bit bytes minus one. Additionally, the 16-bit word that specifies the value to fill the VRAM with is not entirely used: only the upper 8 bits of it are actually written to VRAM.
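The two corrections above can be sketched as follows. This is a deliberately simplified model, not the emulator's real implementation: `VramFill` and its parameters are hypothetical, VRAM is treated as a flat array of bytes, and the length register is read as holding the number of bytes to write minus one, which is my interpretation of the behaviour described above. Other quirks of real hardware are ignored.

```c
#define VRAM_SIZE 0x10000

/* Fills VRAM byte-by-byte, starting at 'address' and advancing by
   'increment' bytes after each write. */
void VramFill(unsigned char *vram, unsigned int address, unsigned int increment, unsigned int length_register, unsigned int fill_word)
{
    /* Only the upper 8 bits of the 16-bit fill value are written. */
    const unsigned char fill_byte = (unsigned char)((fill_word >> 8) & 0xFF);

    /* Assumption: the register holds the byte count minus one. */
    unsigned int remaining = length_register + 1;

    while (remaining-- != 0) {
        vram[address % VRAM_SIZE] = fill_byte;
        address += increment;
    }
}
```

As an example, a tile is 32 bytes with two pixels per byte, so a blank tile of colour 1 could be produced with a fill value of 0x1100, an increment of 1, and a length register value of 31.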

Addressing these issues fixed the bug, and Wily Wars now looks much better:

As intended.

Wily Wars still has one small bug: those four coloured dots near the start of VRAM. The game appears to accidentally write a VDP command ($8F02) to the VDP’s data port instead of its command port, causing it to be uploaded to VRAM as graphics. While the game certainly makes the same mistake on a real Mega Drive, it doesn’t seem to result in those pixels being written to that particular place in VRAM, as it causes artifacts to be visible in the level which aren’t present on a real Mega Drive:

See the pixels in that pit? Those aren’t there on a real Mega Drive. Just another inaccuracy to fix in the future, I suppose.

Implement YM2612’s ‘BUSY’ Flag

This is a feature that wasn’t too important to emulate, but it’s so simple that I figured that I might as well.

The YM2612 takes time to process the data that it is given. To signal to the CPU that it can’t accept more data yet, the YM2612 sets the high bit of its status byte. The CPU obtains this byte by reading from one of the YM2612’s address ports.

On the YM2612, the busy flag is extremely basic: it is set whenever either data port is written to, and always lasts for 32 YM2612 cycles (192 68000 cycles), regardless of how long the submitted data actually takes to process. I’ve heard that the longest YM2612 operation is only 24 cycles, and that the YM3438 actually does set the busy flag for lengths that match each operation’s duration.
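The behaviour described above is simple enough to sketch in a few lines. As before, this is an illustration with made-up names rather than the emulator's actual code, and it counts time in YM2612 cycles.

```c
#define BUSY_DURATION_CYCLES 32 /* The flag always lasts this long on the YM2612. */
#define STATUS_BUSY_BIT 0x80    /* The high bit of the status byte. */

/* Hypothetical state: how many cycles remain until the flag clears. */
typedef struct BusyFlag {
    unsigned int cycles_remaining;
} BusyFlag;

/* Call whenever either data port is written to. */
void BusyFlag_OnDataPortWrite(BusyFlag *busy)
{
    busy->cycles_remaining = BUSY_DURATION_CYCLES;
}

/* Call as emulated time passes. */
void BusyFlag_Advance(BusyFlag *busy, unsigned int cycles)
{
    if (cycles >= busy->cycles_remaining)
        busy->cycles_remaining = 0;
    else
        busy->cycles_remaining -= cycles;
}

/* Builds the status byte that the CPU sees when it reads an address port.
   'other_status_bits' stands in for the rest of the status byte. */
unsigned int BusyFlag_ReadStatus(const BusyFlag *busy, unsigned int other_status_bits)
{
    unsigned int status = other_status_bits & 0x7F;

    if (busy->cycles_remaining != 0)
        status |= STATUS_BUSY_BIT;

    return status;
}
```

Software that relies on the flag would typically loop on the status read until the high bit clears before sending its next write.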

With this feature implemented, any software that explicitly relies on the busy flag for timing should work correctly now. An Earthworm Jim game and a game called Hellfire apparently rely on this.

Add ‘Other’ Debugging Menu

To expose yet more of the Mega Drive’s internal state to the user, a menu has been added to show information about the general console, rather than specific components. Right now, this menu revolves around the bus arbiter, but it will be expanded with other information in the future, as the need arises.

Closing

This has been a pretty massive update (and a massive blog post), so I’m feeling a bit burnt-out on working on this emulator. Progress may be slow for a while after this. I’m pretty happy to see how far this emulator has come, though! It wasn’t long ago that this emulator couldn’t even boot anything, and now look at what it can do: play the classic Sonic trilogy, run Linux, and even run ROM-hacks and homebrew!

Standalone frontend: https://github.com/Clownacy/clownmdemu-frontend/releases/tag/v0.4

libretro core: https://github.com/Clownacy/clownmdemu-libretro/releases/tag/v0.2
