Porting a Nintendo 64 Emulator to the Wii U… at 1FPS

Holy moly this has been a grueling project, and I haven’t even gotten to the part that I wanted to do yet!

Let’s start from the beginning: I want to create an implementation of OpenGL ES 2.0 for the Wii U, because the only cross-platform rendering API that it has right now is SDL2, which is extremely basic. Some brilliant programmers have created a runtime GLSL shader compiler library for the Wii U, which just leaves the comparatively simple task of shimming the Wii U’s GX2 API to OpenGL. I know that a port of the ANGLE project is already underway, but I wanted to try my hand at implementing the API manually instead.

Anyway, I figured that porting Mupen64Plus (a Nintendo 64 emulator) would be a great way to test my prospective OpenGL implementation, as one of its best video plugins (GLideN64, which has its own blog) supports OpenGL ES 2.0.

But I didn’t want to spend a bunch of time implementing OpenGL only to find out that the emulator is impossible to port to the Wii U, so I decided that I would port the emulator first. You might think that this presents a ‘chicken and the egg’ problem: how can I port the emulator when there is no OpenGL? The solution is to use a software renderer (Angrylion) instead. Software renderers are great because they are universally portable, but they are often incredibly slow, which is why this is only a short-term solution, and why OpenGL is still needed.

Rather than port upstream Mupen64Plus, I opted for porting the libretro fork, as it is considerably easier to port due to being a libretro core. This pairs well with my custom libretro frontend (clownlibretro), which was effortlessly ported to the Wii U thanks to the existing Wii U port of SDL2.

The first thing that I did was produce a stripped-down PC build of the emulator, having only a software renderer and no dynamic recompiler. This is easier to port to the Wii U due to there simply being less code that can go wrong.

Unfortunately, this is where the first problem was encountered: libretro’s Mupen64Plus is always built with the OpenGL renderer, but the renderer will fail to build for the Wii U because OpenGL does not exist yet. Fixing this required making it possible to build the emulator without its OpenGL renderer, which was easy enough because of the codebase’s modular design.

Next was an awkward configuration issue: for some reason, disabling the dynamic recompiler did not disable all of the recompiler-related code, causing linker errors, so I had to pass ‘CPUFLAGS="-DNO_ASM"’ to the build script to fix that.

Once that was done, I had a libretro core that theoretically should be able to be compiled for the Wii U without any modifications.

But, of course, that was not the case: in spite of the intended portability of libretro cores, the libretro fork of Mupen64Plus depends on POSIX Threads. There is an option to use SDL2 threads instead, but it had suffered from bit-rot and had to be repaired to get it working again. Additionally, the libretro fork’s version of the software renderer relies on C++11’s concurrency APIs (std::mutex, std::atomic, etc.), which are not provided by the Wii U’s homebrew toolchain (devkitPro). This code can be dummied-out, however, making the renderer single-threaded. Since the software renderer is only a temporary feature, this is acceptable.

With these issues resolved, the emulator could be compiled for the Wii U, producing a neat little ‘mupen64plus_next_libretro_wiiu.a’ file.

Before I could run this libretro core on the Wii U, I needed to link it with a libretro frontend. I could have used RetroArch, which has a Wii U port, but that port is in the middle of a messy refactor after having been broken for several years, so I figured that I would port my own libretro frontend instead. Being based on SDL2, the frontend was almost immediately compatible with the Wii U. However, it did need modifying to support linking with static libretro cores rather than dynamic cores, as dynamic cores aren’t quite supported on the Wii U yet.
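To illustrate the difference between the two linking styles, here is a minimal sketch. It is not the actual clownlibretro code: the ‘example_core’ struct, ‘example_core_load’ and the ‘EXAMPLE_DYNAMIC_CORE’ switch are names made up for this example, and the three ‘retro_’ functions are stubs standing in for a real core. Only ‘retro_init’, ‘retro_deinit’ and ‘retro_api_version’ themselves are real libretro entry points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of the core entry points that a frontend calls. */
typedef struct {
    void (*init)(void);
    void (*deinit)(void);
    unsigned (*api_version)(void);
} example_core;

#ifdef EXAMPLE_DYNAMIC_CORE
#include <dlfcn.h>

/* Dynamic variant: look the entry points up by name in a shared library. */
int example_core_load(example_core *core, const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL)
        return 0;
    /* POSIX-style idiom for storing a dlsym result in a function pointer. */
    *(void **)&core->init = dlsym(handle, "retro_init");
    *(void **)&core->deinit = dlsym(handle, "retro_deinit");
    *(void **)&core->api_version = dlsym(handle, "retro_api_version");
    return core->init != NULL && core->deinit != NULL && core->api_version != NULL;
}
#else
/* Stubs standing in for a real core. In a static build, these symbols would
   come from the core's '.a' archive instead of being defined here. */
int example_init_count;
void retro_init(void) { ++example_init_count; }
void retro_deinit(void) { --example_init_count; }
unsigned retro_api_version(void) { return 1; }

/* Static variant: the linker has already resolved the symbols, so the
   frontend just takes their addresses. The path is ignored. */
int example_core_load(example_core *core, const char *path)
{
    (void)path;
    assert(core != NULL);
    core->init = retro_init;
    core->deinit = retro_deinit;
    core->api_version = retro_api_version;
    return 1;
}
#endif
```

Because the rest of the frontend only ever goes through the function-pointer table, it does not need to care which of the two variants filled it in. The catch with the static approach is that only one core can be linked into each executable, since every core exports the same symbol names.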

Things were looking good: when linked with my Mega Drive emulator (clownmdemu), it booted on the first attempt:

I’m sure that this won’t come up later.

When linked with the N64 emulator, it… did nothing. It was crashing before it even finished booting.

I tried debugging this with decaf-emu and Cemu (two Wii U emulators), but they both crashed along with the emulator. It was at this point that I discovered the amazing Wii U GDB stub, allowing me to debug a real Wii U in real-time. This proved invaluable: no more inserting hundreds of ‘printf’s into the code to figure out where it’s crashing, no more looking at secret crash logs to see the stack frame – I can just boot GDB and start adding breakpoints and probing variables!

With this, I found that the crash was caused by some assembly code that was being used to implement coroutines. Coroutines aren’t a standard feature of older versions of C and C++, but they’re incredibly useful for making libretro cores out of software that makes heavy use of mainloops, so libretro’s developers have provided a library called ‘libco’ that provides them. This requires assembly code. Rather than use inline assembly, libco stores its PowerPC assembly in preassembled form as an integer array. Presumably this is to maximise compatibility with a wide range of compilers (as inline assembly is not standardised), but it appears to be causing issues with either alignment or execute-protection on the Wii U. To work around this, I replaced the pre-assembled arrays with inline assembly, allowing them to work as intended.

However, the emulator was still crashing. After some more debugging, I determined that the crash was due to the emulator assuming the wrong CPU endianness. The Wii U uses a big-endian PowerPC CPU, while PCs use little-endian x86 CPUs. The emulator is unable to detect this at build-time, requiring that I set some flags to notify it. ‘-DM64P_BIG_ENDIAN’ does the trick. Without this, the emulator reads some data backwards, due to numbers being stored in the reverse order.

libretro cores actually have access to a header that allows them to automatically detect endianness at build-time, but for some reason the Mupen64Plus fork doesn’t use it.
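As a rough idea of what build-time detection can look like without any libretro header, here is a sketch that relies on the ‘__BYTE_ORDER__’ macro that GCC and Clang predefine. The ‘EXAMPLE_’ and ‘example_’ names are invented for this illustration and are not what libretro or Mupen64Plus use. On a compiler that lacks ‘__BYTE_ORDER__’, this sketch silently assumes little-endian, which is exactly the kind of assumption that caused the crash above, so the runtime probe is there as a safety net.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro name, for illustration only. GCC and Clang predefine
   __BYTE_ORDER__; other compilers may need a manually supplied flag, and
   this sketch falls back to little-endian when the macro is missing. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define EXAMPLE_BIG_ENDIAN 1
#else
#define EXAMPLE_BIG_ENDIAN 0
#endif

/* Runtime probe: returns 1 when the most significant byte is stored first. */
int example_is_big_endian_at_runtime(void)
{
    const uint32_t probe = 0x01020304;
    unsigned char bytes[sizeof(probe)];

    memcpy(bytes, &probe, sizeof(probe));
    return bytes[0] == 0x01;
}

/* Sanity check that the build-time answer matches what the CPU really does. */
void example_check_endianness(void)
{
    assert(EXAMPLE_BIG_ENDIAN == example_is_big_endian_at_runtime());
}
```

Calling ‘example_check_endianness’ once at start-up would turn a wrongly-configured build into an immediate assertion failure instead of a mysterious crash later on.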

With these problems fixed, the emulator was able to boot and display this:

The swapped colour channels and pixel columns are dead giveaways that there are more endian issues to solve.

The pixel columns being swapped were caused by another missing endian flag: this time it’s ‘-DMSB_FIRST’, which is used by Angrylion. The colour channels being swapped were caused by the libretro interface, which declared a little-endian framebuffer as native-endian. Since the libretro interface only supports native-endian framebuffers, I needed to add some extra code to convert the framebuffer to the expected format.
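The conversion itself is simple. The following is a generic sketch rather than the code from the port: it assumes 32-bit pixels, and ‘example_le_pixels_to_native’ is a made-up name. It reads each pixel one byte at a time, least-significant byte first, and rebuilds it as a native integer, so the same code is correct on both little-endian and big-endian CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads 'count' 32-bit pixels that are stored
   least-significant byte first, and writes them as native-endian integers. */
void example_le_pixels_to_native(const unsigned char *src, uint32_t *dst, size_t count)
{
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < count; ++i)
    {
        const unsigned char *p = &src[i * 4];

        /* Shifts operate on values, not memory, so this is endian-neutral. */
        dst[i] = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    }
}
```

On a little-endian machine this amounts to a plain copy, while on the Wii U it reverses the bytes of every pixel.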

With those two fixed, the emulator looked much better:

But there’s still a problem: where’s the background? Behind the logo, you can faintly see a couple of mangled triangles, but that’s it. Clearly, something was going horribly wrong.

This proved to be one of the most frustrating debugging grinds that I have ever endured: I have been at it for four days straight, and only today have I found the cause.

But just because I already fixed the bug doesn’t mean that you get off easy! No, you’re going to suffer with me!

I assumed that this was a problem with the software renderer, as the game itself appeared to be running correctly. Using GDB, I prodded some variables related to rasterisation and found that the inputs differed between the PC and Wii U builds. This ruled out the renderer as the cause of the bug, as the incorrect data had to be coming from somewhere else. Given that the bug seemed to only affect 3D graphics, I figured that perhaps the problem lay with the N64’s vector mathematics co-processor – the RSP. However, hooking each of its functions once again showed that it was receiving bad data rather than creating it. With the GDB stub’s amazing hardware watchpoint, I traced where this bad input data had come from, and unfortunately it led all the way back to the emulated CPU’s registers. There was no way that I could trace it through those. I’d reached a dead-end. I was stuck.

It was at this point that I got desperate: after hours of trying to find a lead, I decided to try compiling the emulator on a PowerPC-powered PC. This served two purposes: to see if the problem was specific to the Wii U, and to see if the problem was actually the software renderer’s fault, which I would do by swapping the software renderer for the hardware renderer (even if it had to be backed by a software-rendered OpenGL implementation). This led me down a rabbit hole of learning how to use QEMU to emulate a 1999 Macintosh running an 8-year-old version of Debian. Being too lazy to figure out how to cross-compile the emulator for it, I had the emulated Mac slowly do the compiling instead, with each rebuild taking an entire half-hour. In the end, after an entire day of setting this all up, it didn’t even boot, producing a mysterious error message about the N64’s CPU encountering an invalid opcode that did not occur in the Wii U port. To add insult to injury, I then remembered that the hardware renderer did not support big-endian. The whole point of using the software renderer was so that I could have a working build of the emulator without putting a bunch of work into getting the hardware renderer working, so I was far from ready to commit to adding big-endian support to the thing.

That brings us to today, when I decided to investigate why the logo rendered fine but the 3D geometry did not. I had assumed that it was because the 3D required vector mathematics while the 2D logo did not, but I did not know that for sure. Upon investigating the Ocarina of Time decompilation, I found that I was correct: the logo is rendered as a ‘texture rectangle’, while the 3D is rendered as a ‘display list’. Notably, the display list has a matrix attached to it, and this matrix is generated by the CPU rather than the RSP. Given that I had previously traced incorrect values to the CPU, this seemed very suspicious. I used the decompilation to produce a version of the game that logged this matrix to a developer console, and, to my amazement, comparing the PC and Wii U versions of the emulator showed that they wildly differed! Finally, I had another lead!

I added more code to track the matrix, but was baffled to see that different calls to ‘printf’ printed different values for the same matrix, despite it not being modified: one call would claim that its ‘xx’ value was set to 0.0, while others would claim that it is 1.0. Even more strangely, passing an extra dummy ‘1.0’ literal to the end of one ‘printf’ would cause the reported matrix to be mostly overwritten with 1.0! It seemed that I had stumbled upon a heisenbug, which, frankly, was utterly infuriating after all the time that I had spent trying to find it in the first place.

But then I noticed something: I tried printing the same floating point value four times, and every other printed value was 0 instead of the proper value. This made me wonder whether the CPU emulator was having trouble pushing floating point values to the stack. I opened the CPU emulator’s C files and searched for references to the ‘float’ and ‘double’ types, and found this:

typedef union {
    int64_t  dword;
    double   float64;
    float    float32[2];
}cp1_reg;

This looked very suspicious: using a union to access individual parts of a variable is a classic way of making code incompatible with big-endian. I modified the one bit of code that used this union to swap the order of its two floats, and…

Good…
Perfect!

At last! At long, long last I’d found that damn bug! And of course it was a big-endian incompatibility! I swear, people just never give a second thought to writing the most unportable nonsense! I suppose it’s worth once again bragging that my Mega Drive emulator (which has prioritised portability from day one) ran on the Wii U on the first attempt without a single bug. Standards – have them!

Ranting aside, it’s great to finally have this working… albeit at the blistering speed of 1FPS. Yeah, even when compiled with the highest optimisation level, the software renderer is unfathomably slow. But that’s okay, I didn’t need it for performance, I needed it for verifying that the emulator works, and it does!

So, with that done, I have a working proof-of-concept: Mupen64Plus runs with minimal modifications, meaning that I can concentrate on putting together a working OpenGL ES 2.0 implementation and making GLideN64 support big-endian! For all the grief that this project has given me so far, I’m happy that I have several bugfixes to contribute upstream. I’m a perfectionist, so I always love seeing software get even the slightest bit closer to perfection.

15 thoughts on “Porting a Nintendo 64 Emulator to the Wii U… at 1FPS”

  1. Is this project still alive?

    It would be great to have an N64 emulator on the Wii I always wanted that, I researched a lot and didn’t find anything about it until I found this blog, I always thought that the WII U would run the N64, Saturn and Dreamcast well since it already had a beta PSP emulator for it. recent modified with a little improvement

    But how do you know the progress of the project?


  2. This is great to see, I hope you manage to get the Core to a reasonable speed.
    How are you planning on tackling the lack of PPC Dynarec?
    There is the PPC Dynarec for the Wii64 emulator, but when people tried to port the WiiSX Dynarec to the PcsxRearmed Core for WiiU it didn’t perform well at all but luckily Lightrec was an option for PS1 but I don’t think that will work for N64.

    Will your OpenGL ES 2.0 work be specifically for the Mupen Core or could that be used for other Cores that support it? The PS1 Core has speed issues even tho there is the Lightrec Dynarec on some heavier games, it would be nice if the OpenGL changes could be used for the PS1 Core too, tho now that I think about it I’m not sure if the PS1 Core even supports GL ES 2.0?


  3. Amazing work, friend. Please tell me you’re still working on this. N64 on Wii U would be awesome. Resident Evil 2 can be perfectly emulated by Angrylion due to the framebuffer effect glitch when switching camera.

    Star Wars Battle For Naboo Rogue Squadron and Indiana Jones run well with GlideN64.

    Think that the most troubling of Games but at any rate just dreaming and thinking ahead and if you have a Patreon Paypal etc you got my support and many others for sure.

    Please keep up the Impossible 😉


    1. I have been taking a break from the project after being stumped on how to make GLideN64 work on big-endian platforms. My newest plan is to install big-endian Gentoo Linux on my Raspberry Pi 3B+, so that I have a big-endian platform with hardware-accelerated OpenGL support. With that, it should be trivial to get GLideN64 working properly. Once that is done, I can focus on making a proper OpenGLES2 implementation and finally get the emulator running at a decent speed on the Wii U.


      1. thanks for answering.

        I hope you get the N64, it’s very good, I looked for a long time to see if there was a project for that, I wish you success in your port of the emulator for retroarch, which is even better.

        Are there any plans to port other emulators in the future to the Wiiu such as Sega Saturn and Dreamcast?


      2. friend, I’ve waited a long time for this moment, I hope that if one day I intend to port these two emulators that I mentioned above, it will be great, also with the Wii, it even runs on PSP, because not the others, right, thank you friend for responding, I’m looking forward to seeing this core working properly On Wiiu, many games can run on N64 on Wiiu in a decent way without suffering with N64vc and inject, just knowing that is a relief, good luck


  4. I’m glad you are continuing to work on it and hope you can get the OpenGLES2 stuff working on WiiU, there’s still quite a few of us hardcore WiiU Retroarch users still out there looking forward to what you can do.
    N64 on WiiU is like the Holy Grail of RA Cores lol

    Thanks 🙂


  5. Thanks for efforts this would really make WiiU Complete this and a DS Core but I am grateful for you working on this Big Thank you and can’t wait to see your work take care 😉


  6. This is awesome and from a dev’s perspective also a very good read.

    Looks like if anyone ever achieves this, it would be you.

    I’m waiting for decent N64 on the Wii U for years and I gladly wait even a little longer, if you can pull it off someday. The Wii U is still my favorite “TV console”.

    Good job so far and thank you for your upstream PRs!


      1. I don’t know if this can help you but ppsspp was ported to the Wiiu and updated until 2018 so last year someone compiled a new build greatly improving the incompatibilities In short, wouldn’t the source files of this emulator help you understand how well the game’s fps runs and the image so that you could replicate it in mupen64 plus next for the wiiu? I think it has gl, I don’t know how they managed to run the psp on the wiiu, but the N64 doesn’t, maybe it’s a good idea to see how opengl is incorporated into big-endian.

