With my honours project complete, I decided to put my newfound free time into a project that I’ve been meaning to get around to for almost five years: disassembling Knuckles in Sonic 2.
In case you don’t know, Knuckles in Sonic 2 (which I’m just going to call ‘KiS2’ from now on) is a version of Sonic 2 that lets you play as Knuckles instead of Sonic and Tails. Sonic hackers like to port Knuckles from this version back into regular Sonic 2, but, in the process, they effectively undo the huge number of changes that KiS2 made to Sonic 2’s codebase. These range from simple alterations for Knuckles to bugfixes that have gone undiscovered to this very day.
You might be asking yourself why I want to disassemble this game, since a disassembly for it already exists. Well, the reason is that the existing disassembly is completely separate from the Sonic 2 disassembly that also already exists. Not only does this mean that it is horrifically outdated in comparison to the Sonic 2 disassembly, but this also makes it extremely difficult to compare the two games and find differences between them.
Rather than disassembling the game from scratch like the maker(s) of the other disassembly did, my approach is to take the Sonic 2 disassembly, and edit it to match KiS2. This is exactly what I did to create the disassemblies of Sonic 2’s revisions (REV00 and REV02), the game’s Mega Play arcade version, and the version of Sonic 2 found in Sonic Classics/Sonic Compilation.
As of writing, this task is finally done, and I have a modified Sonic 2 disassembly that produces a perfect copy of KiS2. With this disassembly more or less complete, I figured I should explain everything I’ve learnt about KiS2 here:
Changes
Knuckles
Obviously, Knuckles has replaced Sonic. This is actually surprisingly tacked-on: Knuckles is just a lightly-modified Sonic with all of the gliding and wall-climbing behaviour wrapped in a single function call. I suppose this isn’t surprising, but I was under the impression that the whole Knuckles object was copied from an in-development version of Sonic & Knuckles. I think I got that idea from the Sonic 3 Unlocked blog, but I could just be misremembering.
Notably, Knuckles’ graphics are loaded from the Sonic & Knuckles cartridge: the tiles are recoloured at runtime to suit Sonic 2’s palette. The sprite mappings and dynamic tile loading data are also loaded from the Sonic & Knuckles cartridge. Sonic hackers may find this surprising, since Sonic & Knuckles uses a different sprite mapping format to Sonic 2. This leads me into my next point…
Mappings
All of the game’s mappings were converted to Sonic & Knuckles’ format. This strikes me as very odd, as this means that the mappings now have to be included in the KiS2 ROM, instead of being loaded from the Sonic 2 cartridge, wasting space. Maybe it was considered too much effort to go through the whole game and split the mappings? This conversion was universal: even unused mappings were converted. Heck, even unreferenced parts of mappings were converted. This suggests that the mappings were created using assembly macros, and the macro itself was modified to convert the mappings to Sonic & Knuckles’ format.
The difference between Sonic 2’s and Sonic & Knuckles’ sprite mapping format is that Sonic 2’s has extra data for the game’s two player mode, which uses a fancy rendering mode of the Mega Drive’s VDP. This leads me onto yet another point…
Two Player Mode
Two player mode was removed, but not entirely. It appears that the developer(s?) were struggling to fit the game to the size they wanted, so they began removing code related to two player mode, and once they reached their desired size, they stopped. In the end, they scraped by with only 680 bytes to spare.
There are plenty of leftovers from two player mode in the game: the variable used to detect two player mode (dubbed ‘Two_player_mode’ in the disassembly) still exists, and is referenced frequently in the game’s code. For example, the level title card object still makes heavy use of the flag.
Being a Sonic hacker, I’ve removed two player mode from Sonic 2 before, and I’ve done it much more thoroughly than in KiS2. With that in mind, I know how complex removing two player mode is, so it doesn’t surprise me that the developers didn’t go all the way with it.
Lock-On Technology
This won’t be a surprise to most people reading this, but KiS2, despite being a version of Sonic 2, doesn’t have many of Sonic 2’s assets in it. Instead, it copies them from the attached Sonic 2 cartridge. You see, KiS2 isn’t a standalone game: it’s actually a bonus mode in Sonic & Knuckles. Sonic & Knuckles’ cartridge has a cartridge slot on top of it, allowing you to plug other cartridges into it, with KiS2 being the result of plugging in Sonic 2’s cartridge.
The way Sonic 2’s assets were removed from KiS2 is pretty basic: at the end of Sonic 2 is a massive block of assets (including the game’s music, sounds, drum samples, enemy graphics, player graphics, player sprite mappings, level graphics, level layout, level object placements, and more), and it is simply removed in KiS2. Notably, assets that aren’t part of this giant block were not removed, such as the title screen’s ‘1 PLAYER’ and ‘2 PLAYER VS’ text.
As mentioned earlier, some assets are loaded from the Sonic & Knuckles cartridge, such as Knuckles’ assets. However, those aren’t the only things that are loaded from that cartridge: KiS2 features modified level object placements, which reward the player for exploring with Knuckles’ wall-climbing. Strangely, the data for this is in the Sonic & Knuckles portion of the cartridge instead of KiS2. It’s possible that this was done to free-up space in KiS2, with Sonic & Knuckles having room to spare.
Bugfixes
KiS2 contains a surprising number of bugfixes:
Perhaps most notably, KiS2 removed the air speed cap, which appears to be a leftover from Sonic 1. This is significant because it has always been unclear whether the air speed cap was deliberately retained in Sonic 2 as a feature, or leftover as a bug. The air speed cap is responsible for at least two areas in Sonic 2 not working as intended: the red spring that leads to the ‘monkey island’ in Emerald Hill Zone Act 2, and the launcher that flings you over a large gap in the floor in Wing Fortress Zone. In both cases, the speed cap causes the player to undershoot their target if they press left or right on the D-pad while moving through the air. The removal of this speed cap in KiS2 suggests that it was indeed an unintentional leftover all along.
One of the most well-known bugfixes in KiS2 is the correction of a bug that causes the bottom two lines of the screen to appear incorrectly in Emerald Hill Zone. I wonder how this bug was discovered, since televisions were especially prone to overscan hiding the edges of the screen back then.
One type of bugfix that KiS2 contains is taking the player’s character out of their ‘roll-jumping’ state, where their controls are basically locked. Being left in this state at a bad time can result in the game soft-locking, as the player is unable to move their character. KiS2 makes the character exit their roll-jumping state when they enter a wind-tunnel and when they hover over a propeller in Wing Fortress Zone.
Sonic 2 suffers from a particularly glaring bug, where entering the cheat to gain 15 Continues causes the game to play Oil Ocean Zone’s music forever. The cause is a nonsensical sound ID being submitted to the sound driver. This is corrected in KiS2. This bug was also fixed in the version of Sonic 2 included in Sonic Mega Collection.
The title card appears to have had a bugfix applied to it which prevents odd behaviour if the graphic of the name of the zone goes too far to the left of the screen, causing its X coordinate to drop below 0. This bugfix works by replacing some unsigned conditional branches with signed conditional branches, and only drawing the sprite if it is within 48 pixels of the screen’s left side.
The bumpers in Casino Night Zone have their own layout data. This data needs to be terminated with special byte patterns that prevent the bumper manager from reading beyond them and parsing surrounding code as data. One of these termination patterns is missing from the very start of Act 1’s layout data. In a stroke of good luck, the code before the data happens to resemble the terminating byte pattern, preventing the bumper manager from processing invalid data. In KiS2, however, this is no longer the case. A proper data terminator was added at the start of the data, fixing this problem. Fun fact: this bug appears to have not been fixed in the earliest prototype of KiS2, causing the game to crash if you go to the top left corner of the level.
There are also some modifications to the game’s collision code, which may be an attempt to fix bugs in it. Unfortunately, I haven’t figured out the point of these modifications yet, so I can’t say for sure what bugs, if any, they’re trying to fix. One bug that it appears to be trying to fix is the bug in Sonic 2 where collision with an object from below doesn’t properly push the player out, sometimes resulting in them phasing straight through the object. This fix does not work correctly, however, and cancels-out the player’s inertia when it shouldn’t. You can read more about it here.
One rather funny bug is that if you’re moving at a high speed towards a wall, and then start moving in the other direction at the last second, Sonic will impact the wall and then start moving away from it while playing his pushing animation. KiS2 appears to fix this bug as well, preventing Knuckles from entering his pushing animation if he is not facing towards the object that he pushed against.
In Sonic’s movement code, a register that holds his speed is unintentionally partially overwritten before being used later on to decide whether Sonic is moving fast enough to skid or not. This creates an asymmetry in what speed Sonic needs to be in order to skid when attempting to move in the opposite direction. This too is fixed in KiS2. You can read more about this bug here.
Another bug fixed by KiS2 is that, when the player turns Super, a ring is instantly drained. This is due to a counter never being initialised. Now, the game waits a second before draining the first ring, which is consistent with how it drains every ring afterwards.
In Mystic Cave Zone, it’s possible for the player to become ‘detached’ from a hanging vine switch, appearing suspended in the air away from the vine itself. KiS2 addresses this by forcefully updating the player’s coordinates to match the vine every frame.
Speaking of Mystic Cave Zone, the boss of that zone has a nasty bug where, apparently due to a copy-paste error, the wrong address register is used at one point, causing a random byte of memory to be overwritten. Somehow, KiS2’s developers noticed this and fixed it.
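To illustrate why the title card fix mentioned earlier swaps unsigned branches for signed ones, here’s a minimal Python sketch. It assumes the X coordinate is stored as a 16-bit word; `to_signed16` and `should_draw` are hypothetical helpers, and the exact comparison that KiS2 performs is not reproduced here, only the 48-pixel margin from the description.

```python
LEFT_MARGIN = 48  # the 48-pixel margin mentioned in the text

def to_signed16(word):
    """Interpret a 16-bit word as a signed (two's complement) value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def should_draw(x_word):
    """Hypothetical check: draw only if less than 48 pixels past the left edge."""
    return to_signed16(x_word) > -LEFT_MARGIN

x_word = (-16) & 0xFFFF        # an X coordinate 16 pixels left of the screen
print(x_word)                  # unsigned view: 65520
print(to_signed16(x_word))     # signed view: -16
```

With an unsigned comparison, a coordinate that has dropped just below 0 looks like an enormous positive number, which is the sort of thing that produces odd behaviour; a signed comparison sees it as a small negative number instead.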
And… that’s it. That should be the last of the bugfixes that I’ve found in KiS2. So, what other changes were made in KiS2?
JmpTos
Yep, JmpTos again. They always find an excuse to crop up when I do this kind of thing. For those not in the loop, ‘JmpTo’ is the nickname given to branch extensions that are present throughout Sonic 2’s codebase. If a branch is too short to reach its destination, it instead branches to a long-range jump instruction in order to reach it. In the first two revisions of Sonic 2 (REV00 and REV01), they appear to have been generated by the assembler. In the third revision – REV02 – they changed significantly, presumably because the developers switched to using a different assembler. They’ve once again changed quite a bit in KiS2.
What’s interesting about the JmpTos in KiS2 is that they appear to be hand-made, as opposed to the obviously-automated JmpTos in Sonic 2 REV00 and REV01. You see, it appears that the developers went through much of the game’s code, ‘tidying’ the JmpTos: rather than being messily mixed into code, as they were in REV02, they were grouped and moved to the end of their respective blocks of code. Additionally, redundant branches to JmpTos were eliminated: in Sonic 2 REV02, it wasn’t uncommon to see unconditional branches that branched to JmpTos, when they could have just been jump instructions that jumped straight to the intended destination – KiS2 removed many, if not all, of these.
Further adding to the idea that REV02 and KiS2’s JmpTos were hand-made is the fact that one of the JmpTos in REV02 (‘JmpTo13_MarkObjGone’) is completely unused. It was removed in KiS2.
Restored Debug Features
Invisible objects, such as plane-switchers and invisible walls, become visible in Debug Mode in KiS2. One object in particular is made visible with code that was previously only in REV00. This suggests that the code may have existed in REV01’s and REV02’s source code in a dummied-out form that was simply un-dummied-out in KiS2. Perhaps these debug features were hidden behind a build-time flag?
Removed Development Code
In Sonic 2, after the ‘loadLevelLayout’ function is some leftover code. The first chunk of code is the level layout loading function from Sonic 1, modified to repeat the background layout. This was used in some of Sonic 2’s prototypes.
After that is a function that converts a level’s chunks from Sonic 1’s 256×256 format to Sonic 2’s 128×128 format, and after that is a function for eliminating duplicate 128×128 chunks. These were likely used to convert Green Hill Zone’s chunks to 128×128 for Sonic 2’s “Nick Arcade” prototype.
After surviving through numerous prototypes, all three revisions of the final Sonic 2, Sonic Classics, and the Mega Play arcade version, this code was finally removed in KiS2. RIP.
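For the curious, the chunk conversion and de-duplication steps described above can be sketched in Python. This is not the original 68k code: it assumes chunks are grids of 16×16-pixel block IDs (so a 256×256 chunk is 16×16 blocks and a 128×128 chunk is 8×8 blocks), and `split_chunk` and `dedupe` are hypothetical names.

```python
def split_chunk(chunk256):
    """Split a 16x16 grid of block IDs into four 8x8 grids.

    The quadrants are returned in the order: top-left, top-right,
    bottom-left, bottom-right.
    """
    quadrants = []
    for qy in (0, 8):
        for qx in (0, 8):
            quadrants.append(
                tuple(tuple(row[qx:qx + 8]) for row in chunk256[qy:qy + 8])
            )
    return quadrants

def dedupe(chunks):
    """Drop duplicate chunks; return the unique chunks and an index remap."""
    unique = []
    index_of = {}
    remap = []
    for chunk in chunks:
        if chunk not in index_of:
            index_of[chunk] = len(unique)
            unique.append(chunk)
        remap.append(index_of[chunk])
    return unique, remap

# A mostly-empty 256x256 chunk with a single non-zero block in its corner.
big_chunk = [[0] * 16 for _ in range(16)]
big_chunk[0][0] = 1

unique, remap = dedupe(split_chunk(big_chunk))
print(len(unique), remap)
```

The remap list is what a level layout would need in order to point at the surviving 128×128 chunks after the duplicates are gone.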
Demos
Also known as ‘attract mode’, the game will play some demos if you leave it on the title screen. The developers of KiS2 attempted to preserve compatibility with Sonic 2’s demos, reenabling things like the air speed cap and giving Knuckles Sonic’s jump height when a demo is playing. Unfortunately, the result is not perfect, and the demos still manage to desynchronise at points. The developers went so far in their attempts to keep the demos working that they manually edited the inputs for the Emerald Hill Zone demo.
Other
I could talk about the modified title screen, Wing Fortress Zone cutscene, ending, and logo after the credits, but honestly I can’t think of anything noteworthy about them. Maybe I’ll go over them in a follow-up post, if I can think of anything interesting to say.
Standalone
As an experiment in what is possible with this disassembly, I’ve added an option to build a ‘standalone’ version of KiS2 that doesn’t rely on Sonic 2 or Sonic & Knuckles in order to run. This is similar to the ‘Sonic 3 Complete’ mode of the Sonic & Knuckles disassembly, which produces a version of Sonic 3 & Knuckles that doesn’t rely on Sonic 3. You can find a built ROM of this standalone KiS2 here. The intention of this, in addition to just being a tech demo, is also to make it feasible to produce ROM hacks of KiS2, which is practically impossible whilst it is dependent on two other ROMs.
Conclusion
Personally, I’ve learnt a lot about KiS2 from this disassembly, and I hope others will learn a lot from it too. KiS2 has always been a mysterious black box to me: its many changes and fixes always being out of reach and beyond our understanding, with no easy way to find the new in a sea of old. Every change and every fix was a needle in a haystack… but not anymore. Maybe now we can see a *complete* port of Knuckles to Sonic 2, title screen, ending, compatibility adjustments, and all!
Fun fact: I started this disassembly on the 28th of April, and it was completed on the 5th of May. It took me almost five years to get around to doing something that only took a week. Geez.
This will never not be weird to me: for some reason, there’s a Sonic monitor lookalike in Microsoft Office. To see it, go to the ‘View’ tab and select the ‘Zoom’ option:
I don’t get it. What came first: Sonic’s monitor or Microsoft Office’s monitor? Who was copying who? Were they even copying each other to begin with?
There seem to be multiple variations of this little sprite: I have an old screenshot from 2015 that shows a version which is much closer to Sonic’s monitor sprite:
Here’s a comparison of the three:
Somebody please tell me that I’m not alone in thinking that these look uncannily similar: they’re both 30×30, they use similar greys, and even the image in the Sonic monitor perfectly matches the size and position of the image in the Office monitor. They’re so similar that I can literally put it into a ROM-hack of Sonic 1 and it works perfectly:
Somebody please help: this has been driving me nuts for the last 7 years.
To my understanding, not much is commonly known about the source code of the original Sonic the Hedgehog games. As someone who’s been obsessed with the programming of these games for almost ten years, I believe that I know a lot more than most people do. Unfortunately, my memory is awful, and I’m not going to be a Sonic hacker forever, so I want to preserve this information however I can before it’s lost again.
As of writing, the source code for the ‘classic’ Mega Drive Sonic the Hedgehog games has never been found. We do have an exhaustive list of disassemblies, however those do not capture all information that proper source code would give us. For example, while the logic of code is recovered by these disassemblies, the meaning of it might not be. Sometimes, the logic isn’t all you need to understand code: it may do something that seems pointless or nonsensical, leaving nearby labels and comments to explain it fully. However, a disassembly cannot reproduce the original labels and comments: those are lost. Likewise, a disassembly cannot reproduce code that was never in the ROM to begin with, such as disabled (or “commented-out”) prototype code. Because of this, a disassembly cannot truly replace source code.
However, small snippets of source code have surfaced over the years. The most obvious example I can think of is a fully-intact copy of a single source file (likely “EDIT.ASM”) in Sonic 2’s “Nick Arcade” prototype.
Sonic 2 “Nick Arcade” prototype’s source file
Inside one of Sonic 2’s prototypes is a fully intact copy of the source file responsible for the game’s “edit mode” (commonly known as “debug mode”):
This source code contains everything: code, comments, labels, and even disabled prototype code that we never knew even existed before. Notably, these labels give the internal names of enemies and objects used throughout the game, while the comments elaborate on the intended level order. The source code snippet even gives some insight into the exact assembler used to build the source code into a ROM.
That’s not the only useful data in this prototype however…
Sonic 2 “Nick Arcade” prototype’s symbol list
Starting at ROM address 0x418A8 is what appears to be a partial copy of the assembler’s symbol table. Oddly, it doesn’t appear to match the ROM, as labels that appear in the above source code are recorded in this symbol table at different addresses to where they are in the actual ROM. It can be assumed, based on this, that this symbol table is leftover from a previous execution of the assembler, rather than the one that produced the ROM that we have.
From roughly 0x418A8 to 0x47B14, the symbol table follows this format:
The first four bytes denote the length of the symbol identifier divided by four, and rounded up. That is to say, it denotes the number of longwords needed to hold the identifier. Notably, this length is stored in big-endian format.
The following longwords contain the label. Unused bytes are set to 00.
The following four bytes contain the value that is assigned to that symbol. This value is also in big-endian.
Here’s an example:
00 00 00 02 65 64 69 74 69 6E 69 74 00 01 AB 12
The first four bytes are 00000002, indicating an identifier that is 2 longwords (or 8 bytes) long.
The next eight bytes are 65646974696E6974. When interpreted as ASCII, it is “editinit”, which is one of the labels seen in the above source code.
The following four bytes are 0001AB12, which is the ROM address that was assigned to this label when the assembler that produced this symbol table was run. In the ROM we have, however, “editinit” is located at 0001BABE.
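The format above is simple enough to parse in a few lines. Here’s a minimal Python sketch that decodes the example entry; `parse_symbol_entry` is a hypothetical helper name, and it only handles the first format (the one used from roughly 0x418A8 to 0x47B14).

```python
import struct

def parse_symbol_entry(data, offset=0):
    """Parse one symbol table entry.

    Returns (name, value, offset_of_next_entry).
    """
    # Big-endian longword: the identifier's length in longwords.
    (longwords,) = struct.unpack_from(">I", data, offset)
    name_start = offset + 4
    name_end = name_start + longwords * 4
    # Unused bytes are set to 00, so strip them from the end.
    name = data[name_start:name_end].rstrip(b"\x00").decode("ascii")
    # Big-endian longword: the value assigned to the symbol.
    (value,) = struct.unpack_from(">I", data, name_end)
    return name, value, name_end + 4

entry = bytes.fromhex("00 00 00 02 65 64 69 74 69 6E 69 74 00 01 AB 12")
name, value, next_offset = parse_symbol_entry(entry)
print(name, hex(value), next_offset)
```

Calling the function repeatedly with the returned offset would walk through consecutive entries, assuming they are stored back-to-back.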
Past 0x47B14, the format of the symbol table changes. Unfortunately, the entries don’t seem to specify what each symbol’s value is, instead containing what appears to be three pointers. This format continues to 0x50000, when actual game data resumes.
The symbol table begins again at 0x50A9C, but in another format that seems to just contain identifiers and pointers.
[EDIT: I’m a dummy: apparently you *can* extract meaningful data from the later parts of the symbol table, as someone else was able to and produce this massive list.]
While the later symbols don’t contain any useful information, they are still useful for determining the original names of various bits of code. For example, the label ‘random’ is certainly what we’ve been calling ‘RandomNumber’ in the Sonic 2 disassembly. Additionally, ‘bgmset’ and ‘soundset’ are more than likely ‘PlayMusic’ and ‘PlaySound’. What’s notable about that last one is that it answers the question of whether ‘PlayMusic’ really was intended to play music, since sometimes it’s used to play sounds instead, which caused people to question whether it actually was a dedicated music-playing function, or simply a generic ‘play something’ function.
One might be wondering why there’s an intact source file and a partial symbol table in the middle of a ROM. Well, the space that this data occupies is what would normally be padding: towards the end of the ROM, the game’s data is spaced-out for some reason, leaving big gaps. My theory is that whatever tool the devs were using to produce the final ROM simply malloc’d a huge buffer, and pasted data where it was needed, never initialising the unused space. This resulted in the unused space containing garbage data, leftover in memory from other programs.
This isn’t the only time that symbol data appears in a Sonic game:
Sonic & Knuckles Collection’s symbol list
Unfortunately, I can’t comment too much on this due to not having a copy of Sonic & Knuckles Collection on hand. However, this game contains another instance of symbol data, inside its main EXE.
For the most complete set of symbols, however, one need not look further than…
Sonic CD’s unstripped ELF files
Sonic Gems is a compilation of various emulated classic Sonic titles. Its version of Sonic CD, however, is not emulated: rather, it is a port of Sonic CD’s PC port.
One might be wondering how a game written in Motorola 68000 assembly could have ever been ported to anything with a different CPU architecture. The answer to that is that its code was machine-converted from assembly to C. Notably, this process preserved the original labels, and possibly even the original comments.
Within the files of Sonic Gems, you can find a series of ELF files, which parallel the DLL files of Sonic CD’s original PC port. What’s special about these ELF files is that they are unstripped, containing huge amounts of symbol data.
When loaded into something like IDA or Ghidra, this symbol data is taken advantage of while disassembling or decompiling the code. This reverse-engineered code will have its original labels intact, giving a great look into the original source code. Unfortunately, the code you’ll be looking at is the disassembled/decompiled output of PowerPC/MIPS assembly which was generated from C which was converted from 68k assembly. That is to say, it’s hideous and practically unreadable. Still, if you’re able to draw parallels between this code and the original 68k assembly, you effectively have a pre-labelled disassembly.
These ELF files also contain some debug data, such as the paths of various source files. I was able to extract a bunch of them back in 2018:
This file list gives you a look at the layout of Sonic CD’s source code. Notably, you can see ‘EDIT.C’, which is very likely the file that originally contained the code seen in Sonic 2’s “Nick Arcade” prototype.
Sonic Gems isn’t the only game to contain unstripped executables like this. For instance, The Legend of Zelda Collector’s Edition contains an unstripped executable for its Nintendo 64 emulator (or “simulator”, as it calls itself. Yeah, sure, Nintendo).
Yuji Naka’s video
A few years back, Yuji Naka found some old footage of him working on Sonic 1 back in February 1990. In this video, he scrolls through a portion of the game’s source code, giving us yet another sighting of authentic source code:
Some of these filenames may seem familiar, as they survived into Sonic CD. Note that the items ending with the Yen symbol are actually directories, and not files.
This footage is also noteworthy for showing that Sonic 1 was developed on DOS. This is something that I’ll come back to later.
Patent US5411272A
A tiny snippet of source code can be found in a patent that Sega filed around the time of Sonic 2’s release. It contains two tables that are responsible for Emerald Hill Zone’s spiral loops. While it’s not all that insightful, it does give you a look at the original formatting of these tables, and how the data was grouped.
J2ME
The developers of the J2ME version of Sonic 1 appear to have had access to Sonic 1’s source code, or at least its assets: this is evident through how its filenames reflect labels found in the original source code. For instance, ‘scdtblwk.scd’, matches the label ‘scdtblwk’ that is found in Sonic 2’s “Nick Arcade” prototype.
Most notably, however, some of the files in this version appear to be ‘raw’, unprocessed versions of the data found in the ROM of the Mega Drive version. For example, there’s an unused function in Sonic 2 that converts the collision data from a previously-unknown format to the format that is seen in the ROM itself, and it was discovered that the collision files in the J2ME version are in this mysterious format.
There are other file format oddities as well, such as block priority being stored in its own file, instead of being embedded in the chunk data. It’s possible that this too is how the data was originally formatted, before being converted into its final form for inclusion in the ROM.
Assembler
So we have all this information about the source code itself, and the game’s assets, but what about the development environment? Well, for starters, from the footage we saw earlier, we can tell that Sonic 1 was developed on DOS computers.
According to LazloPsylus, Sonic 1 was very evidently assembled with the 2500AD assembler, X68k, based partly on the fact that the syntax of the code snippet seen in Yuji Naka’s footage is unusual, and unlikely to have been built with any other assembler.
However, it does not appear that Sonic 2 was assembled with X68k, as the “Nick Arcade” source code snippet uses a different syntax. It’s plausible that Sonic 2 was assembled with SN 68k, also known as asm68k, however the presence of an ‘addsym’ directive, which is not supported by asm68k, calls this into question.
Further complicating matters is something that I discovered while writing this very blog post: the “Nick Arcade” symbol table uses big-endian integers. DOS PCs are x86, and store their integers in little-endian. This suggests that Sonic 2 wasn’t assembled on a DOS PC at all, unlike Sonic 1, and rather that it was assembled on a big-endian platform like a 68k-powered Macintosh. Another indication of this is that the “Nick Arcade” source snippet uses Unix-style line endings (0x0A), instead of DOS-style line endings (0x0D 0x0A).
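Both clues are easy to demonstrate. The Python sketch below reads the same four bytes as big-endian and as little-endian, then classifies a file’s line endings; `line_ending_style` is a hypothetical helper, and the sample text is made up for illustration.

```python
import struct

raw = bytes.fromhex("00 00 00 02")   # a length field from the symbol table

big = struct.unpack(">I", raw)[0]     # as a 68k (big-endian) machine reads it
little = struct.unpack("<I", raw)[0]  # as an x86 (little-endian) machine reads it
print(big, little)                    # 2 versus 33554432

def line_ending_style(text):
    """Classify a byte string by the line endings that it contains."""
    if b"\r\n" in text:
        return "DOS (0x0D 0x0A)"
    if b"\n" in text:
        return "Unix (0x0A)"
    return "unknown"

print(line_ending_style(b"first line\nsecond line\n"))
```

Only the big-endian reading gives a sensible identifier length, which is what points towards a big-endian machine.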
There’s some evidence to suggest that Sonic 2 used a different assembler between REV01 and REV02: cross-object-file function calls behave differently, ‘dc.b’ directives are automatically padded, some ‘addi’ and ‘subi’ instructions were optimised to ‘addq’ and ‘subq’ instructions, and some ‘lea’ instructions were unoptimised from PC-relative addressing to absolute long addressing. The cross-object-file behaviour of REV00 and REV01 matches that of the “Nick Arcade” prototype, so it can be assumed that Sonic 2 used the same assembler throughout development up until REV02.
Something I just discovered while writing this blog post is that Sonic 1’s machine code is similar to Sonic 2 REV02: in code that is shared between Sonic 1 and Sonic 2, the same ‘addi’ and ‘subi’ instructions are ‘addq’ and ‘subq’ in both Sonic 1 and Sonic 2 REV02. Likewise, the same ‘lea’ instructions are unoptimised. With this in mind, it would appear that Sonic 2 was migrated back to X68k during the development of REV02, and it would continue to use this assembler as it was used to develop Sonic 3, Knuckles in Sonic 2, and the Mega Play arcade version of Sonic 2.
Sonic 2’s source file boundaries
While we don’t have an exact list of Sonic 2’s source files, we do have a way of determining where its original source files began and ended: the assembler used by Sonic 2 before REV02 would resolve cross-object-file function calls in a clunky way, basically proxying them to little ‘JmpTo’ functions. To put it literally, a ‘bsr’ instruction that referenced a label which was outside of the current object file would instead branch to a single-instruction function that was appended to the end of the object file, which would itself jump to the ‘bsr’ instruction’s original destination. Because these ‘JmpTo’ functions are inserted at the end of the object file, they can be used to tell where an object file (and thus a source file) ended.
From this, we can determine things like that the ring object (object 0x25), the scattered ring object (object 0x37), the big Special Stage ring object leftover from Sonic 1, and the Casino Night Zone ring prize object (object 0xDC) were all stored in the same source file.
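As a rough illustration of how these boundaries could be located automatically, here’s a Python sketch that scans a ROM image for runs of consecutive long jumps. It assumes that each ‘JmpTo’ is a six-byte `jmp (xxx).l` instruction (opcode word 0x4EF9 followed by a four-byte address); `find_jmpto_runs` is a hypothetical name, and since it’s only a heuristic, data that happens to contain 0x4EF9 will produce false positives.

```python
JMP_ABS_LONG = b"\x4E\xF9"  # 68000 opcode word for `jmp (xxx).l`
JMP_SIZE = 6                # opcode word + 32-bit address

def find_jmpto_runs(rom, min_run=2):
    """Return (offset, count) for each run of back-to-back long jumps."""
    runs = []
    offset = 0
    while offset + JMP_SIZE <= len(rom):
        count = 0
        while True:
            start = offset + count * JMP_SIZE
            if start + JMP_SIZE > len(rom):
                break
            if rom[start:start + 2] != JMP_ABS_LONG:
                break
            count += 1
        if count >= min_run:
            runs.append((offset, count))
        # 68000 instructions are word-aligned, so step two bytes at a time.
        offset += count * JMP_SIZE if count else 2
    return runs

# Two NOPs, three long jumps, then an RTS.
sample = bytes.fromhex(
    "4E71 4E71"
    "4EF9 00001234"
    "4EF9 00005678"
    "4EF9 00009ABC"
    "4E75"
)
print(find_jmpto_runs(sample))
```

The end of each run would then be a candidate for where one object file stopped and the next began.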
Sonic 2 “Simon Wai” prototype’s unassembled Kosinski file
Like the “Nick Arcade” prototype before it, Sonic 2’s “Simon Wai” prototype includes another source code file. This one, however, is not entirely complete, but it contains more than enough useful information:
‘Kosinski’ is the nickname of a compression format used by various Mega Drive games, and this source file contains data compressed in this format. The header, when translated, says…
; Before compression $8000 After compression $2c00 Compression ratio 34.4% Number of cells 1024
‘Cells’ refers to Mega Drive tile graphics, which are 32 bytes each. It doesn’t make much sense in this context, since it’s not tiles that are being compressed, but rather Aquatic Ruin Zone’s level chunk data.
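The numbers in the header are at least self-consistent, as a quick Python check shows (the variable names here are my own):

```python
before = 0x8000   # size before compression, in bytes
after = 0x2C00    # size after compression, in bytes
TILE_SIZE = 32    # bytes per Mega Drive tile ('cell')

ratio = after / before * 100
cells = before // TILE_SIZE

print(f"{ratio:.1f}%")  # 34.4%
print(cells)            # 1024
```

So the ‘number of cells’ is just the uncompressed size divided by 32, regardless of whether the data actually contains tiles.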
Anyway, what we can learn from this source file is that Kosinski-compressed data wasn’t included into the ROM as binary data, but rather assembly data for the assembler to process. This is unusual, as every assembler that I’ve worked with supports including binary data directly, without requiring this odd shim.
Regardless, this assembly file also possibly indicates why Kosinski files are always padded to 0x10 bytes: each ‘dc.b’ directive in the source file is 0x10 values long, so perhaps the tool that produced this assembly file was unable to output a ‘dc.b’ that was less than 0x10 values long, and so it would instead output dummy 0 values until the ‘dc.b’ was ‘full’.
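To make that theory concrete, here’s a Python sketch of a tool that behaves this way: it pads its input with zeroes to a multiple of 0x10 bytes and emits one ‘dc.b’ line per 0x10 values. The name `to_dc_b_lines` and the exact text formatting are assumptions, not a reconstruction of the original tool.

```python
def to_dc_b_lines(data, width=0x10):
    """Emit `dc.b` lines of exactly `width` values each, zero-padding the last."""
    padded = data + bytes(-len(data) % width)
    lines = []
    for i in range(0, len(padded), width):
        row = padded[i:i + width]
        lines.append("\tdc.b\t" + ",".join(f"${value:02X}" for value in row))
    return lines

# 18 bytes of input become two full lines, the second padded with 14 zeroes.
for line in to_dc_b_lines(bytes(range(18))):
    print(line)
```

A tool written like this can never output a short final line, which would explain the padding seen in every Kosinski file.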
Closing
Aaaaaaand… that’s it. I think that’s all I know about Sonic the Hedgehog’s source code. Hopefully this information is of use to someone.
It’s been a week since the last update, so what’s new? Well, the biggest improvement is that the assembler can now assemble SuperEgg’s Sonic 2 Nick Arcade disassembly without any modification.
In addition, the assembler is now case-insensitive. Symbol case-insensitivity is user-configurable, just like in asm68k.
Also like asm68k, the assembler can now output a “symbol file”. This is useful because symbol files can be used by Vladikcomper’s Advanced Error Handler. However, the error handler itself cannot be assembled with my assembler, because of its use of advanced asm68k features that I have yet to implement.
One particular difference in how I’m developing this assembler now compared to how I was doing it before is that I’m now consulting asm68k’s manual (SATMAN.pdf) for details on unimplemented features, whereas previously I was avoiding official documentation, and just relying on my own knowledge and guesswork. The reason for this didn’t have anything to do with legal paranoia, just that I didn’t want to make things too easy for myself early on. Now that I’m implementing features that I didn’t even know existed, I feel that using the manual is justified.
One area where using the manual has helped is operator precedence: previously, it matched that of C, but operator precedence is actually different in asm68k, so my assembler was recreating it incorrectly.
Speaking of the operators, I’ve been able to implement my own assembler extension: asm68k doesn’t provide any logical operators, only bitwise ones, so I’ve taken the opportunity to add them to my assembler. Now, in addition to ‘&’ and ‘|’, there’s ‘&&’ and ‘||’. There’s also ‘!’ to complement the bitwise unary ‘~’. I’ve also implemented C-style equality and inequality operators: ‘==’ and ‘!=’, to complement the usual ‘=’ and ‘<>’. This makes it possible to use C-style expressions in the assembler, which is a great fit for me. It’s nice that there’s at least one area where my assembler is already superior to asm68k.
I’m currently working towards getting three asm68k-based projects to assemble with my assembler: Aurora Field’s Z80 assembly macros, Vladikcomper’s Advanced Error Handler, and radioshadow’s Dr. Robotnik’s Mean Bean Machine disassembly. Unfortunately, all of them make use of numerous elaborate asm68k features (the first two especially), so despite spending days implementing missing features, I’m still far from getting any of them to assemble.
I’m planning to focus on the Mean Bean Machine disassembly, since that’s the one that’s closest to assembling. Hopefully my assembler will be compatible with it by the time of the next blog post.
It happened sooner than I was expecting, but here it is: my assembler can now assemble the Sonic 1 disassembly.
As I guessed in the last part, there really weren’t many things left to implement before this was possible: it turns out that the Sonic 1 disassembly doesn’t use variables at all, nor does it use ‘org’ or ‘fatal’ directives. It does use string literals in ‘dc’ directives though, as well as ‘even’ and ‘end’. It also uses if/else/endc blocks, which I somehow completely forgot to mention in the previous post.
Once those were implemented, it was just a matter of ironing out a bunch of bugs before my assembler was eventually outputting a binary file that exactly matched the one produced by asm68k. Not only that, but my assembler detected various mistakes and ambiguities in the disassembly’s code, enabling me to improve the disassembly’s readability. You can see these changes here, here, and here.
I’ve got to say, it’s so cool to assemble a ROM of Sonic 1 with my own assembler, and then play it in my own emulator. It’s like I’m creating an entire software ecosystem or something. Soon enough, it won’t just be Sonic 1: I’ll be assembling my own Mega Drive homebrew too! But, before I can do that, I’ll need to make my assembler at least partially-compatible with AS.
Right, AS. The Macroassembler AS. If I want to assemble Sonic 2, Sonic & Knuckles, or my homebrew, I’m going to have to add support for AS’s features. That’s going to be tricky, because AS is a multi-assembler: it can assemble multiple different assembly languages at once. The Sonic 2 and Sonic & Knuckles disassemblies use this for assembling those games’ sound engines, which are written in Zilog Z80 assembly. On top of that, AS has extensive macro facilities (that ‘Macroassembler’ title is well-earned) which I also need to replicate. Overall, AS support won’t be easy, and it definitely won’t be happening soon.
I’d publicly release this assembler, possibly with a fork of the Sonic 1 disassembly that includes a copy of it, but I’m not sure if that’s the wisest thing to do right now: I think I should keep it private until my honours project is over to avoid any issues regarding “plagiarism” (code contributions) and “human testing” (getting feedback from users). The assembler needs some polish anyway. But don’t worry: once my honours project is over and this becomes a hobby project, it’ll be ‘release early, release often’ all the way!
I think that I started working on this assembler in early February, and I haven’t really stopped since, so that puts me about a month and a half into working on this thing non-stop. I’ve been so busy with it, in fact, that I haven’t even had time to keep this blog updated on its progress.
But what is there to talk about… it still can’t assemble any of the Sonic disassemblies; it can only do tiny custom files that I write myself to test each feature as it’s implemented. Like this one, for example:
; Amazing test
; A line with just a colonless label
AmazingTest
; A line with just an instruction
move.w d0,d1 ; Absolutely NOT add.w d1,d0
; A line with a colonless label and an instruction
Lbl add.w d1,d0 ; Absolutely NOT move.w d0,d1
; A line with a coloned label and an instruction
FunkyLabel: move.w #$2700,sr ; A literal and the status register
; A line with just a coloned label
Label:
move.w #0,ccr ; A literal and the condition code register
; A lot of blank lines to test that empty statements are properly supported
; A line with an address operand
move.w d0,($00000000).w
;adda.w d0,a1 ; Not implemented yet
; sr instructions
move.w d0,sr
move.w ($FFFF8000).w,sr
; usp instructions
move.l a0,usp
move.l usp,a0
; Testing the various effective address modes
move.l d0,d0
move.l a0,d0
move.l (a0),d0
move.l (a0)+,d0
move.l -(a0),d0
move.l 20(a0),d0
move.l (a0,d0.w),d0
move.l 20(a0,d0.w),d0
move.l (a0,a1.w),d0
move.l 20(a0,a1.w),d0
move.l 20(pc),d0
move.l (pc,d0.w),d0
move.l 20(pc,d0.w),d0
move.l (pc,a0.w),d0
move.l 20(pc,a0.w),d0
; Testing ori
ori.w #$FFFF,sr
ori.b #$FF,ccr
ori.l #$FFFFFFFF,d0
; Testing andi
andi.w #$FFFF,sr
andi.b #$FF,ccr
andi.l #$FFFFFFFF,d0
subi.w #1,d0
addi.w #1,d0
; Testing eori
eori.w #$FFFF,sr
eori.b #$FF,ccr
eori.l #$FFFFFFFF,d0
; Testing cmpi
cmpi.w #1,d0
cmpi.w #1,(a0)
cmpi.w #1,(a0)+
cmpi.w #1,-(a0)
cmpi.w #1,$10(a0)
cmpi.w #1,(a0,d0.w)
cmpi.w #1,DummyDummyDummy(pc)
cmpi.w #1,DummyDummyDummy(pc,d0.w)
DummyDummyDummy:
; Testing both modes of BTST/BCHG/BCLR/BSET
btst.l #0,d0
btst.b #0,(a0)
bchg.l #0,d0
bchg.b #0,(a0)
bclr.l #0,d0
bclr.b #0,(a0)
bset.l #0,d0
bset.b #0,(a0)
btst.l d0,d0
btst.b d0,(a0)
bchg.l d0,d0
bchg.b d0,(a0)
bclr.l d0,d0
bclr.b d0,(a0)
bset.l d0,d0
bset.b d0,(a0)
bset.b d0,#$AA ; What a wacky instruction...
movep.l d0,0(a0)
movep.w 10(a0),d0
movea.l #$FFFF,a0
movea.w d0,a0
negx.w d0
clr.w d0
neg.w d0
not.w d0
tst.w d0
ext.w d0
ext.l d1
nbcd ($FFFF8000).w
pea.l ($FFFFFFFF).w
illegal
tas.b d0
trap #4
link a0,#12
unlk a0
reset
nop
stop #$2700
rte
rts
trapv
rtr
jsr (a0)
jmp (0).w
movem.w a1/d0-d2/d4-a1/d3,-(sp)
movem.w (sp)+,a1/d0-d2/d4-a1/d3
movem.w d0-a7,-(sp)
lea 10(a0),a0
chk.w #12,d0
divu.w #2,d0
divs.w #2,d0
mulu.w #2,d0
muls.w #2,d0
addq.w #1,d0
addq.l #8,d0
subq.w #1,d0
subq.l #8,d0
st.b ($FFFF81D0).w
sf.b ($FFFF81D0).w
shi.b ($FFFF81D0).w
sls.b ($FFFF81D0).w
scc.b ($FFFF81D0).w
scs.b ($FFFF81D0).w
sne.b ($FFFF81D0).w
seq.b ($FFFF81D0).w
svc.b ($FFFF81D0).w
svs.b ($FFFF81D0).w
spl.b ($FFFF81D0).w
smi.b ($FFFF81D0).w
sge.b ($FFFF81D0).w
slt.b ($FFFF81D0).w
sgt.b ($FFFF81D0).w
sle.b ($FFFF81D0).w
dbf d0,0
bra.s $180
bra.w 0
bsr.s $180
bsr.w 0
bhi.s $180
bls.s $180
bcc.s $180
bcs.s $180
bne.s $180
beq.s $180
bvc.s $180
bvs.s $180
bpl.s $180
bmi.s $180
bge.s $180
blt.s $180
bgt.s $180
ble.s $180
bhi.w 0
bls.w 0
bcc.w 0
bcs.w 0
bne.w 0
beq.w 0
bvc.w 0
bvs.w 0
bpl.w 0
bmi.w 0
bge.w 0
blt.w 0
bgt.w 0
ble.w 0
moveq #0,d0
moveq #$FFFFFFFF,d1
moveq #$FFFFFF80,d2
sbcd d0,d1
sbcd -(a0),-(a1)
abcd d0,d1
abcd -(a0),-(a1)
subx.w d0,d1
subx.w -(a0),-(a1)
addx.w d0,d1
addx.w -(a0),-(a1)
or.w d0,d0
sub.w d0,d0
eor.w d0,d0
and.w d0,d0
add.w d0,d0
suba.w d0,a0
cmpa.w d0,a0
adda.w d0,a0
cmpm.w (a0)+,(a1)+
cmp.w (0).w,d0
exg d0,d1
exg a0,a1
exg d0,a0
exg a0,d0
asl.w #1,d0
asl.w d0,d0
asr.w #1,d0
asr.w d0,d0
lsl.w #1,d0
lsl.w d0,d0
lsr.w #1,d0
lsr.w d0,d0
roxl.w #1,d0
roxl.w d0,d0
roxr.w #1,d0
roxr.w d0,d0
rol.w #1,d0
rol.w d0,d0
ror.w #1,d0
ror.w d0,d0
asl.w (a0)
asr.w (a0)
lsl.w (a0)
lsr.w (a0)
roxl.w (a0)
roxr.w (a0)
rol.w (a0)
ror.w (a0)
; Testing the symbol table
TestLabel:
bra.s TestLabel
; Testing fix-ups
bra.w DeclareAfterUseLabel
DeclareAfterUseLabel:
; Some actual regular code
CoolFunction:
moveq #0,d2
tst.w d1
beq.s CoolFunction_Exit
CoolFunction_Loop:
add.w d0,d2
dbf d1,CoolFunction_Loop
CoolFunction_Exit:
rts
move.w #%0101010101010101,d0
; Testing out arithmetic in literals
move.w #1+1,d0 ; 2
move.w #1+(2*3),d0 ; 7
move.w #(2*3)+1,d0 ; 7
move.w #-1,d0 ; $FFFF
move.w #1+2-3*4/5,d0 ; 3
move.w #1--1,d0 ; 2
; The Macro Assembler AS would fail to assemble this line (it would mistake the `(1*2)` for an absolute address)
move.w (1*2)+1(a0,d0.w),d0
dc.b 0,1,2,3,4,5
dc.w 0,1,2,3,4,5
dc.l 0,1,2,3,4,5
Object:
moveq #0,d0
move.b 2(a0),d0
add.w d0,d0
move.w @Offsets(pc,d0.w),d0
jmp @Offsets(pc,d0.w)
@Offsets:
dc.w @Offset1-Object@Offsets
dc.w Object@Offset2-@Offsets
@Offset1:
rts
@Offset2:
rts
bra.s *
bra.s 1*(**1) ; Amazingly, this actually works as intended
dc.l *,*,*
; Test operators
dc.b 1<<1, 4>>1, 3&2, 7^5, 0|2, 6%4, 1==1, 1<>0, 1>0, 0<1, 1&&1, 1||1, 0+1, 2-1 , 1*1 , 1/1
; Testing REPT
Delta: rept 8
dc.b *-Delta
endr
; Testing case-insensitivity
MOVE.B D0,D0
mOvE.b (A0,d0.W),(a1,D1.l)
; More blank lines to test support for trailing blank statements
Well, okay, I *say* that they’re tiny…
As you can see, the assembler supports every type of 68k instruction, and even some high-level constructs like REPT directives. I’m working towards adding support for macros, but those are proving to be a real headache. I’ve implemented labels, a symbol table, and arithmetic expressions, just like I said I would in the last blog post. I’ve also added fancy error-reporting, for effective debugging – here’s an example of one:
Semantic error!
On line 2 of '[SOME REPT]'...
On line 333 of 'misc.asm'...
On line 1 of 'shim.asm'...
'ADDA' instruction cannot be this size - allowed sizes are...
ADDA.W
ADDA.L
adda.b d0,a0
It can definitely be improved, but for a rough prototype, I think it’s really coming together. What I like most is that it tracks the nesting of the erroneous statement, all the way through ‘include’ statements and even REPT blocks or macro declarations.
Until yesterday or so, this assembler would use Flex and Bison to produce a parse tree for the entire source file. A lecturer at my university pointed out that it’s unnecessary to produce a parse tree for anything more than a single statement, so I spent a few hours refactoring the assembler to parse only a single statement at a time. I definitely imagine that this saves quite a bit of RAM, but most noteworthy is that it makes supporting macros possible without an entire separate preprocessing step.
One major refactor that I’m thinking of carrying out is to replace the fix-up system with a second assembler pass. These are both solutions to the problem of instructions using labels that don’t yet exist at that point in the source file. The idea is that the assembler reaches the end of the source file, and then goes back to assemble the instructions that couldn’t be done previously.
The current system manually takes note of each instruction that couldn’t be assembled due to then-undefined labels, and stores various bits of information about them (most notably, assembler state) in a linked list; when the end of the source file is reached, the assembler then iterates over this linked list, reverting some of the assembler state to how it was at the time the instruction was first processed, and then attempting again to assemble it (or, rather, ‘fix it up’).
This process is quite a pain, because duplicating internal assembler state is a mess of string duplication and backing up *just enough* internal state for the instruction to correctly assemble, even when done out-of-order. As an alternative, I’m thinking of just making it so that, when the assembler reaches the end of the file, it goes back to the start and assembles the whole file all over again. That way, I don’t need to keep track of what the assembler state was at the time an instruction failed, because that state will naturally be recreated by the assembler re-assembling the file in a step-by-step recreation of how it did the first time. This essentially trades code complexity and memory usage for performance, saving the former and costing the latter.
I think that, in the future, when I add support for build-time variables, I will need to ditch fix-ups anyway, since otherwise I’d have to back up portions of the symbol table too, in addition to the other backed-up state, which is a horrifically-complex task that I would much rather avoid.
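The two-pass idea can be sketched in C as follows. This is only an illustration under simplifying assumptions, not the assembler's actual code: `Statement`, `SymbolTable`, `FindSymbol`, and `AssemblePass` are made-up names, and each 'statement' is reduced to an optional label, an optional label reference, and a size.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 16

typedef struct Statement
{
	const char *label;  /* Label declared by this statement, or NULL. */
	const char *target; /* Label used by this statement's instruction, or NULL. */
	unsigned long size; /* Size of the statement's machine code, in bytes. */
} Statement;

typedef struct SymbolTable
{
	const char *names[MAX_SYMBOLS];
	unsigned long addresses[MAX_SYMBOLS];
	size_t total;
} SymbolTable;

/* Returns a pointer to the symbol's address, or NULL if it has not been declared. */
const unsigned long* FindSymbol(const SymbolTable *table, const char *name)
{
	size_t i;

	for (i = 0; i < table->total; ++i)
		if (strcmp(table->names[i], name) == 0)
			return &table->addresses[i];

	return NULL;
}

/* Performs one pass over the program, returning the number of unresolved labels. */
size_t AssemblePass(SymbolTable *table, const Statement *statements, size_t total_statements, unsigned long *resolved)
{
	unsigned long program_counter = 0; /* Recreated from scratch on every pass. */
	size_t unresolved = 0;
	size_t i;

	for (i = 0; i < total_statements; ++i)
	{
		const Statement *statement = &statements[i];

		/* Declare the label, unless a previous pass already did. */
		if (statement->label != NULL && FindSymbol(table, statement->label) == NULL && table->total < MAX_SYMBOLS)
		{
			table->names[table->total] = statement->label;
			table->addresses[table->total] = program_counter;
			++table->total;
		}

		if (statement->target != NULL)
		{
			const unsigned long *address = FindSymbol(table, statement->target);

			if (address != NULL)
				resolved[i] = *address;
			else
				++unresolved; /* Forward reference: the next pass will resolve it. */
		}

		program_counter += statement->size;
	}

	return unresolved;
}
```

Calling `AssemblePass` twice with the same symbol table is enough for a case like `bra.w DeclareAfterUseLabel`: the first pass reports one unresolved label, and the second reports none. No per-instruction state has to be saved, because the program counter is simply recomputed.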
I think that the remaining features that I need to implement before my assembler can assemble Sonic 1 are as follows…
Macros
Character literals
Constants
Variables
‘equ’ directives
‘org’ directives
‘fatal’ and ‘inform’ directives
‘dcb’ directives
‘incbin’ directives
Having them all written out like this really makes it look like the end is finally in sight: assembling Sonic 1 would be the first major milestone for this project, and it would give this assembler an actual use, instead of it just being a tech-demo.
I have about a month and ten days before my honours project is due for submission; I hope that I can complete this and get it polished-up by then. Considering what I’ve gotten done in the last month and a half, I think that I’ll be just fine.
For my university honours project, I chose to create an assembler for the Motorola 68000. It’s not even close to being completed, but I’m bored right now and writing about it seems like a fun idea.
I’ve written a partial 68k assembler before, for my smps2asm2bin project. This mini-assembler was capable of processing dc.b and dc.w directives, as well as some hardcoded macros. It featured a working symbol table as well as managing fix-ups for instructions that used then-undefined symbols. This partial assembler was entirely custom, written in pure C.
This new assembler takes a different approach: while at university, I studied the creation of compilers, and how the process of parsing can be broken down into three steps: lexical, syntactic, and semantic analysis. I wanted my new assembler to follow this structure, so I’ve made use of the Flex and Bison tools in order to achieve this. Unlike more-modern software like ANTLR, Flex and Bison produce standard C89 code, which fits my assembler nicely.
At the moment, the assembler is capable of producing machine code for about 3/4 of the 68k’s instructions. However, it lacks a symbol table and support for arithmetic operations in instruction operands. It also lacks advanced features like macros. Still, I’m happy with the progress that’s been made so far.
Working on the assembler has taught me some strange things about the 68k assembly language: there are things that I never knew, despite speaking this language since late 2012 – almost 10 years ago. For instance, it’s possible for the index register in an indirect address register operand to be another address register (so ‘clr.b (a0,a1.w)’ is a valid instruction). Additionally, a literal operand can serve as the second operand of ‘btst’ and ‘link’ instructions, when it is usually only ever used as the first operand (‘btst d0,#$55’ and ‘link a0,#8’ are examples of this usage).
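The address-register index is easy to see in the machine code. Below is a small C sketch of the 68000's 'brief extension word', which follows the opcode of any instruction that uses an indexed operand; the function name and parameters are made up for illustration and are not from my assembler.

```c
/* Encodes the 68000's brief extension word for 'd8(An,Xn.size)' operands. */
/* Bit 15:     0 = index is a data register, 1 = index is an address register. */
/* Bits 14-12: index register number. */
/* Bit 11:     0 = index is a sign-extended word, 1 = index is a longword. */
/* Bits 7-0:   signed 8-bit displacement. */
unsigned int EncodeBriefExtensionWord(int index_is_address_register, unsigned int index_register, int index_is_long, int displacement)
{
	return (index_is_address_register ? 0x8000u : 0u)
	     | ((index_register & 7u) << 12)
	     | (index_is_long ? 0x0800u : 0u)
	     | ((unsigned int)displacement & 0xFFu);
}
```

For the ‘(a0,a1.w)’ operand, this gives $9000: the top bit is all that distinguishes an a1 index from a d1 index, which would give $1000 instead.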
While my all-C mini-assembler never got to support the entire 68k assembly syntax, I do find Flex and Bison to be very powerful tools that make supporting the intricacies of the language quite easy, rendering many issues that I expected to encounter – were I to expand my mini-assembler – non-issues in my new assembler. These tools require that the syntax be conceptualised in a very peculiar way, but I found that I clicked with it rather quickly. Essentially, you have to break the language down into ever-shrinking components. For example, a program is a series of statements; a statement can be a macro invocation or an instruction; an instruction is composed of an opcode with an optional size, optionally followed by a number of operands; each operand can be a register, an absolute address, or a literal; a register can be a data register, or an address register. By breaking the syntax down like this, one can use Bison to produce code that can parse the syntax, without needing to write it manually.
One awkward part of writing this assembler is that I will get so far into it, before realising that I need to rework it to support an edge-case that I had failed to consider. One example of this is when I needed to add support for a whole extra type of operand specifically for the MOVEM instruction, which allows you to specify multiple registers in a single operand, whereas normally instructions only let you specify a single one. This kind of problem has recently struck again, as, for the branch instructions, I need to add support for an operand type that expresses an address without a size specifier.
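To show why MOVEM needs its own operand type, here is a rough C sketch of how a register list collapses into the 16-bit mask that the instruction stores. The function names are hypothetical; registers are numbered 0 to 15 here, with d0-d7 as 0-7 and a0-a7 as 8-15.

```c
/* Adds an inclusive range of registers (such as 'd0-d2') to a register mask. */
/* Registers are numbered 0-15: d0-d7 are 0-7, and a0-a7 are 8-15. */
unsigned int AddRegisterRange(unsigned int mask, unsigned int first, unsigned int last)
{
	unsigned int i;

	for (i = first; i <= last && i < 16; ++i)
		mask |= 1u << i;

	return mask;
}

/* Reverses the 16 bits of a mask, as MOVEM requires for '-(An)' destinations. */
unsigned int ReverseRegisterMask(unsigned int mask)
{
	unsigned int reversed = 0;
	unsigned int i;

	for (i = 0; i < 16; ++i)
		if ((mask & (1u << i)) != 0)
			reversed |= 1u << (15 - i);

	return reversed;
}
```

A list like ‘a1/d0-d2/d4-a1/d3’ is just four ranges ORed together, giving $03FF no matter what order they are written in. When the destination is a pre-decrement operand such as ‘-(sp)’, the mask is stored bit-reversed, so the same list becomes $FFC0.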
Soon enough, this assembler should be able to assemble all of the 68k’s instructions to their proper machine code, albeit without support for labels, macros, and other high-level features. Once I reach this stage, I should really produce an exhaustive test suite, which ensures that no instruction can be used with operands that it is incompatible with, and that correct machine code is produced. After that, I can focus on adding a symbol table so that the assembler can then support labels. After that, I want to add support for basic arithmetic in literals, so that expressions like ‘move.w #4*(2+2+2),d0’ work.
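As a rough idea of what evaluating such a literal involves, here is a hand-written recursive-descent evaluator in C. Note that this is a stand-in for illustration only: the assembler itself uses a Bison grammar for this, and the function names, the decimal-only number parsing, and the lack of whitespace handling are all simplifications of mine.

```c
#include <ctype.h>

static long ParseSum(const char **cursor);

/* Parses a number, a parenthesised expression, or a negation. */
static long ParseUnary(const char **cursor)
{
	long value = 0;

	if (**cursor == '-')
	{
		++*cursor;
		return -ParseUnary(cursor);
	}

	if (**cursor == '(')
	{
		++*cursor; /* Skip the '('. */
		value = ParseSum(cursor);

		if (**cursor == ')')
			++*cursor;

		return value;
	}

	while (isdigit((unsigned char)**cursor))
		value = value * 10 + (*(*cursor)++ - '0');

	return value;
}

/* Parses a chain of '*' and '/' operations, which bind tighter than '+' and '-'. */
static long ParseProduct(const char **cursor)
{
	long value = ParseUnary(cursor);

	for (;;)
	{
		if (**cursor == '*')
		{
			++*cursor;
			value *= ParseUnary(cursor);
		}
		else if (**cursor == '/')
		{
			long divisor;

			++*cursor;
			divisor = ParseUnary(cursor);
			value = divisor != 0 ? value / divisor : 0; /* Avoid dividing by zero. */
		}
		else
		{
			return value;
		}
	}
}

/* Parses a chain of '+' and '-' operations. */
static long ParseSum(const char **cursor)
{
	long value = ParseProduct(cursor);

	for (;;)
	{
		if (**cursor == '+')
		{
			++*cursor;
			value += ParseProduct(cursor);
		}
		else if (**cursor == '-')
		{
			++*cursor;
			value -= ParseProduct(cursor);
		}
		else
		{
			return value;
		}
	}
}

/* Evaluates an expression such as "4*(2+2+2)". */
long EvaluateExpression(const char *string)
{
	return ParseSum(&string);
}
```

Each function handles one level of precedence and defers to the next for anything that binds tighter, which is the same 'ever-shrinking components' idea that a Bison grammar expresses declaratively.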
I suppose that’s enough for today. Hopefully, I’ll have some more stuff to talk about soon.
It looks like I wanted to get one more thing done before the project enters hibernation again.
While clownmdemu is not exactly an accuracy-focussed emulator, I do fancy the idea of making it into a useful developer tool. After all, I do make hacks and homebrew for the Mega Drive occasionally.
For now, it features a bunch of video-debugging windows which visualise some of the VDP’s internal processes. There’s also a PSG debugger, which shows each channel’s frequency and volume attenuation level. Eventually I’ll add more, for things like the CPUs and the FM chip.
Previously, my debugging emulator of choice was Regen, but that thing’s gotten pretty long-in-the-tooth, having last been updated in 2009. Since then, many things in it have been found to be inaccurate. In addition, it has a particularly glaring bug where any save data that it creates is accidentally byte-swapped, effectively rendering it unusable without manual correction. Once clownmdemu is fleshed-out enough, however, I can totally see it becoming my new main emulator.
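For what it's worth, the ‘manual correction’ is simple enough to sketch in C. This assumes that ‘byte-swapped’ means that each pair of adjacent bytes has traded places, and `SwapBytePairs` is a name that I made up for the example.

```c
#include <stddef.h>

/* Swaps each pair of adjacent bytes in-place: 'AB CD' becomes 'BA DC'. */
/* If the length is odd, the final byte is left untouched. */
void SwapBytePairs(unsigned char *buffer, size_t length)
{
	size_t i;

	for (i = 0; i + 1 < length; i += 2)
	{
		const unsigned char temporary = buffer[i];

		buffer[i] = buffer[i + 1];
		buffer[i + 1] = temporary;
	}
}
```

Since swapping twice restores the original data, the same function converts a save file in either direction.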
I’m finding Dear ImGui to be very straightforward and fun to develop with (albeit with some annoying exceptions here and there, unfortunately). It certainly beats Qt for me: it’s so much simpler and less invasive. Having a huge demo window whose code you can just peek at to see how to do things is very handy too, and is far preferable to trawling through page after page of incoherent documentation and Stack Overflow posts. Yeah, I know, I’m probably just bad at Qt: there’s probably a reason that projects like Dolphin, Yuzu, and DuckStation use it.
I think I’ve done enough GUI work for now though: the next thing that I work on will probably be FM emulation. Who knows how long that will take to be functional, though.
My emulator has recently seen some work, but development seems to be on the verge of slowing down again, so I think now is a good time to post a write-up here.
So, what’s new in clownmdemu?
Holy moly, an actual GUI
Yes, your eyes aren’t deceiving you: I’ve actually made a GUI. I know – I’m shocked too.
This GUI is implemented using Dear ImGui. I would have used Qt for a more native look-and-feel, but I think it’s a little bloated for my needs, and it’s also a very invasive piece of software, requiring that I effectively completely rewrite my frontend to be based on Qt instead of SDL2 (the two can’t co-exist). Dear ImGui, however, is very light and minimal, with a design philosophy that allows it to be layered atop whatever windowing and renderer libraries I please. This allows me to keep the existing SDL2 base, and merely expand the frontend with GUI elements.
Dear ImGui isn’t entirely vanilla: it sports some customisations to make it extra pretty. In particular, the font has been changed from Proggy Clean to Karla Regular, and the font rasteriser has been swapped from the default stb_truetype rasteriser to FreeType. Another customisation is that Dear ImGui scales according to the display’s DPI. In fact, the whole frontend does. This is notable because SDL2 apparently doesn’t actually support DPI scaling on Windows, so I have to do it manually.
Hopefully the GUI will be further expanded in the future: I’d like to add a controller rebinding menu at some point. For now though, this is enough to make the emulator usable without the command line or excessive hotkey usage.
PSG audio emulation
I’ve been dreading doing audio emulation since I started this emulator, but PSG honestly isn’t that bad. Granted, that’s mostly because the hardware for it has been thoroughly dissected, and an amazing write-up that explains how it works can be found on SMS Power!. Fun fact: the Mega Drive’s PSG can produce frequencies of up to around 100kHz.
At first, I wrote a PSG emulator that generated audio at the sample rate of the emulator itself (usually 48kHz), but this had nearest-neighbour aliasing issues and it also didn’t perform any kind of low-pass filtering. The result was some slightly unpleasant audio that sounded quite harsh.
In my second attempt, I wrote a PSG emulator that generated audio at the PSG’s native sample rate (roughly 200kHz). This was great and all, but required resampling to the emulator’s native sample rate. At first I was content to leave this task to SDL2, which is capable of accepting audio that isn’t in the format needed by the underlying audio playback library, but I had doubts over the quality of its resampler (it appeared to introduce some distortion in situations where low-pass filtering would be required). Additionally, SDL2 is locked to only one sample rate for the lifetime of its ‘audio device’, making supporting the slightly-different sample rate on PAL systems impossible outside of a compile-time flag or forcefully destroying and reinitialising SDL2’s entire audio system whenever the user switches console region.
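To give an idea of what generating audio at the PSG's native rate looks like, here is a simplified C sketch of a single tone channel. It assumes the commonly documented SN76489 behaviour, where a counter ticks down once per native sample and the output flips whenever it runs out; the `ToneChannel` structure and `ToneChannel_Step` function are illustrative names rather than clownmdemu's actual code, and volume and the noise channel are ignored entirely.

```c
typedef struct ToneChannel
{
	unsigned int countdown_master; /* The value written by the game (1 to 1023). */
	unsigned int countdown;        /* Counts down once per native sample. */
	int output;                    /* Current state of the square wave: 0 or 1. */
} ToneChannel;

/* Advances the channel by one native sample, and returns its output. */
int ToneChannel_Step(ToneChannel *channel)
{
	if (channel->countdown != 0)
		--channel->countdown;

	if (channel->countdown == 0)
	{
		/* The countdown has expired: reload it and flip the square wave. */
		channel->countdown = channel->countdown_master;
		channel->output = !channel->output;
	}

	return channel->output;
}
```

With a countdown of 1, the output flips on every single sample, so the resulting square wave has half the frequency of the native sample rate. That is where the figure of around 100kHz comes from.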
To address this, I wrote my own resampler. I’d written linear interpolators before, but this time I wanted something higher quality. For this, I began researching windowed-sinc resampling, particularly the type that uses a Lanczos window. This is something that I’d researched before for CSE2 (maybe I’ll make a blog post on that someday…), but never had a solid grasp on. In fact, the more I studied windowed-sinc resampling, the more I realised that CSE2’s implementation was broken: it completely failed to perform low-pass filtering when downsampling.
So what’s so special about windowed-sinc resampling? Well, while linear interpolation simply plays ‘connect the dots’ with the audio samples, windowed-sinc resampling attempts to recreate the original waveforms from which the samples were produced, and then sample them again at a new sample rate (hence “resampling”). As an added bonus, windowed-sinc resamplers perform low-pass filtering for free (at least when done correctly). The result is a very high-quality resampling, which eliminates aliasing while also producing smooth natural-looking waveforms. The only downside to a windowed-sinc resampler is the complexity of implementing it, which also leads to it being a bit of a performance sinkhole. Still, after some optimisation, it currently appears to be roughly on-par with stb_vorbis (a popular Ogg Vorbis decoder) when it comes to average CPU usage per sample.
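For contrast, here is what the ‘connect the dots’ approach amounts to, as a minimal C sketch. This is the linear interpolator, not the windowed-sinc resampler, and `SampleLinear` is a made-up name rather than anything from my library.

```c
#include <stddef.h>

/* Reads a sample at a fractional position by drawing a straight line */
/* between the two nearest input samples. */
double SampleLinear(const short *samples, size_t total_samples, double position)
{
	const size_t index = (size_t)position;
	const double fraction = position - (double)index;
	double a, b;

	if (position < 0.0 || index >= total_samples)
		return 0.0; /* Outside of the input: output silence. */

	a = samples[index];
	/* At the very last sample, there is no neighbour to interpolate towards. */
	b = index + 1 < total_samples ? samples[index + 1] : a;

	return a + (b - a) * fraction;
}
```

To resample, you would call this once per output sample, advancing the position by the ratio of the input sample rate to the output sample rate each time. Notice that each output only ever looks at two input samples, and that nothing here removes frequencies that are too high for the new sample rate: when downsampling, those alias. A windowed-sinc resampler instead computes a weighted sum over many neighbouring samples, which is what allows it to filter as it goes.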
Personally, I’m pretty proud of this resampler, so I split it off from the frontend and turned it into a public-domain single-header-file library that’s compatible with C89 and C++98. It can be found here.
Other changes
There were also some miscellaneous changes that don’t need much explanation: PAL and Japanese modes can be toggled in the frontend at runtime (just like using a region switch on a real modded Mega Drive), the framebuffer is now scaled with a degree of smoothing at non-integer multiples, and controllers are now supported.
Conclusion
Aside from the lack of input rebinding and persistent configuration, this emulator is quite usable now. If those two issues were addressed, FM and Z80 emulation were added, and some of the missing 68k instructions were implemented, then this emulator would practically be ready for an initial release. We’ll just have to wait and see where this project goes next.
To tell you the truth, I thought that I started this blog at the end of December, not the start. Oh well.
So… one year on, and I think things have gone pretty well: not counting the post before this one, which was written after the anniversary, I put out 24 articles, which statistically works out to two posts for each month (haha, take that Dolphin and your “monthly” progress reports!).
I haven’t been able to dedicate much time to my hobbies lately, which is why the posts have been getting a little sporadic, but soon I’ll be working on a Motorola 68000 assembler, so that should give me a lot to talk about.
As for other things to come in the future, I may finally wrap up my series of posts about clownaudio, since I think it’s approaching the point where there’s really not much left to go over. Additionally, I’ll likely get back to working on my Mega Drive emulator at some point. I’m getting a new laptop soon (assuming that the delivery doesn’t get cancelled for whatever reason), so maybe I’ll write some posts about getting it set up. I’m also thinking of switching from Xfce to KDE and Wayland, so that could be the subject of another post.
But yeah, this has been a fun first year. I can only imagine what hijinks I’ll get up to in the next.