I have a lot of thoughts on large language models in general, but it has come a little close to home recently with the submission to hacks.guide of a “90% vibe coded” (author’s words, not mine) installer for isfshax, a low-level exploit for Wii U, and the follow-up information that some parts of isfshax itself were made this way.
Let’s get some things out of the way first. This post is my opinion, made up of thoughts that have been stewing for a long time now. I’m not going to try to change anyone’s mind, but given influential scene people are writing policy, I wanted to air some thoughts.
Next, you can do whatever you want within the bounds of your project. If you want to vibe-code a tool, you can, but equally, guide authors and third parties can choose to accept or reject your work for any reason - as that falls within the bounds of their project. These are part of the social consequences of your choices. Nobody is controlling what you do, but they can respond to what they see.
## Who are you?
I’m Ash, ashquarky, quarktheawesome, lucaquarky, whatever. My first Wii U homebrew released in 2016 and it was terrible. Still, things got better. I have contributed to:
- The Homebrew App Store, arguably the most popular Wii U homebrew program outside of launchers and CFW tools
- Pretendo Network, where I am a core team member responsible for the Splatoon, Minecraft LCE, Puyo Puyo game servers, recent improvements to the Juxtaposition Miiverse clone, and wrote the Inkay, Martini and Meowth patchers, used for Pretendo but also as the basis for other patchers such as GiveMiiYouTube or Rose Patcher
- linux-wiiu, a port of.. Linux.. to the Wii U
- RetroArch’s Wii U port, including a complete rewrite for Aroma (sorry that hasn’t merged yet!)
- COSSubstrate, an early plugin system for Wii U which later inspired WUPS’s design, which then (with great effort and investment from Maschell) evolved into Aroma’s plugins
- ProgrammingOnTheU, a tutorial on creating simple homebrew applications for Wii U
- usata, a hardware mod adapter board that allows you to add SATA storage to a Wii U
- Other smaller projects and lots of random documentation work
It’s weird to think I’ve been doing this for 10 years at this point, but that’s what the numbers say. My main motivation for making homebrew is to understand systems more deeply and build my own competence. This silly console has taught me C, C++, Rust, build systems, JavaScript/Node, Go, reverse engineering in IDA and later Ghidra, all skills I’ve been able to transfer to other projects (this was my undergraduate thesis).
I like nothing more than untangling the complex systems this console hides and making them dance, and I love seeing the competencies it’s built in me and feeling like I’m doing good, skilled, useful work. Even though I wish it paid sometimes.
I also work as a coding educator for primary and high schoolers, primarily teaching block, Python and C#. This will come up later!
## LLMs and their capabilities - context
It’s pretty well-documented at this point that using AI for creative writing tasks like essays or novels produces fancy-looking text that passes a cursory inspection, or even a thorough inspection by a yet-unskilled writer, but the blending and un-blending of the words ultimately strips a large portion of the meaning, connection, and sometimes factual accuracy of the writing.
> This is how I know when a student is using AI. Their sentences are fluid and academic, but they don’t say anything. Like ChatGPT, academic writing uses “formal cautious language to avoid ambiguities and misinterpretations,” but that is a characteristic of the common voice used in academic writing. It is not what academic writing is. Writing is more than language.
- josh (with parentheses), You are a better writer than AI. (Yes, you.) - absolutely recommended viewing if you aren’t sure on this creative writing point
As LLMs have developed, they have started being able to output valid code, not just valid sentences. They stuff the syntax sometimes, but agentic environments allow them to iterate and smooth over these issues before presenting the results to you. These systems have become central to many people’s workflows. One of my teammates at Pretendo uses Copilot to do additional code review. Autocomplete in IDEs has moved on from the machine-learning-powered systems of the past (like the ordering of IntelliSense suggestions) to full-line and then full-function code completion, enabled by default. Some people claim that almost all of their code is generated this way, with them acting only as a prompter and reviewer of the LLM’s progress.
I have lots of arguments around this, which we will get into, but I hope as a baseline we can agree that vibe-coded outputs have limited capacity for complexity. There is a point, especially when starting from an empty repository, where the AI is unable to keep up with its own work and either forgets its own prompt or stops a task early. Go ask Summer Yue, whose email inbox was eaten by OpenClaw outrunning its own context window, assuming you believe that actually happened. For a more balanced assessment of their capability, though, I definitely recommend Can AI Pass Freshman CS? by a Cornell TA who had the major models complete his CS course along with the students. In short, the models reveal themselves to be highly capable at simple, well-trodden tasks but lose the plot once tasks get longer, more complex, and start pushing the bounds of their context windows. Definitely watch.
Perhaps a new model revision will resolve that, but for now, AI tooling can struggle. From my experience as an educator, I would say it’s equivalent to about 3 years of experience plus a strong willingness to read technical documentation? Working with AI tools does not feel like working with an equal collaborator - it feels like supervising an intern. (And yes, even on basic tasks, it’s extremely obvious when students have copied their homework task into ChatGPT.)
## The moral and theft argument
LLMs exist entirely as sums of their training data. It’s just that the data is so vast that we can’t recognise it when it appears. Instead, we can look at gaps in the training data.
> It aces all of the hard problems and screws up things no real student got wrong. Like, there are some rather tricky questions about class invariants and preconditions that Gemini blasts straight through. If anything, I actually liked some of its answers better than our own solution key. But then this question asks what happens when the string “hello” is concatenated to itself four times. And somehow Gemini has turned this into “hello hello hello world.” Huh? I guess its training data just has too many “hello world”s in it to the point that it can’t resist sticking a “world” after seeing a “hello.”
- mt_xing, Can AI Pass Freshman CS?
These models are trained on datasets such as “all public code on GitHub” and “all websites”. They do not and cannot remember who wrote each of those pieces of data, and thus can and do unknowingly steal human work. Agentic workflows take this a step further by allowing models to do research and find new code in real-time, not just whatever is stored in their weights.
There’s an interesting file in the “90% vibe coded” wafel_installer. I did not do a full audit; this was something like the second or third file I inspected. Here is a quote from it.
```c
uint64_t p2 = 0x00a4F031FB43193b; // need to align patch for old mocha
if (!applyPatch(0x050282AC, &p2, 8, L"Applying patch 2 (launch_os_hook bl)...")) return;
uint32_t p3 = 0xE3A00000;
if (!applyPatch(0x05052C44, &p3, 4, L"Applying patch 3 (mov r0, #0)...")) return;
uint32_t p4 = 0xE12FFF1E;
if (!applyPatch(0x05052C48, &p4, 4, L"Applying patch 4 (bx lr)...")) return;
uint32_t p5 = 0x20002000;
if (!applyPatch(0x0500A818, &p5, 4, L"Applying patch 5 (mov r0, #0; mov r0, #0)...")) return;
if (!applyPatch(0x05059938, os_launch_hook, sizeof(os_launch_hook), L"Applying os_launch_hook...")) return;
uint32_t ancast_hook_start = (0x05059938 + sizeof(os_launch_hook) + 3) & ~3;
if (!applyPatch(ancast_hook_start, ancast_decrypt_hook, sizeof(ancast_decrypt_hook), L"Applying ancast_decrypt_hook...")) return;
uint32_t p8 = generate_bl_t(0x0500A678, ancast_hook_start);
if (!applyPatch(0x0500A678, &p8, 4, L"Applying patch 8 (generate_bl_t)...")) return;
uint32_t p9 = 0xe00fbf00;
if (!applyPatch(0x0500A7C8, &p9, 4, L"Applying patch 9 (Ancast header nop nop)...")) return;
uint32_t p10 = 0x2302e003;
if (!applyPatch(0x0500a7f4, &p10, 4, L"Applying patch 10 (movs r3, #2; b #0x500a800)...")) return;
```
Now, here is a snippet from Aroma’s fw_img_loader payload. This code dates from early 2020.
```c
*(int*)(0x050282AE - 0x05000000 + 0x081C0000) = 0xF031FB43; // bl launch_os_hook
*(int*)(0x05052C44 - 0x05000000 + 0x081C0000) = 0xE3A00000; // mov r0, #0
*(int*)(0x05052C48 - 0x05000000 + 0x081C0000) = 0xE12FFF1E; // bx lr
*(int*)(0x0500A818 - 0x05000000 + 0x081C0000) = 0x20002000; // mov r0, #0; mov r0, #0
// [ 4 lines removed ]
for (i = 0; i < sizeof(os_launch_hook); i++)
    ((char*)(0x05059938 - 0x05000000 + 0x081C0000))[i] = os_launch_hook[i];
u32 ancast_hook_start = (0x05059938 + sizeof(os_launch_hook) + 3) & ~3;
for (i = 0; i < sizeof(ancast_decrypt_hook); i++)
    ((char*)(ancast_hook_start - 0x05000000 + 0x081C0000))[i] = ancast_decrypt_hook[i];
*(u32*)(0x0500A678 - 0x05000000 + 0x081C0000) = generate_bl_t(0x0500A678, ancast_hook_start);
// remove various Ancast header size checks (somehow needed for unencrypted fw.img)
*(u32*)(0x0500A7C8 - 0x05000000 + 0x081C0000) = 0xbf00bf00; // nop nop
*(u16*)(0x0500A7C8 - 0x05000000 + 0x081C0000) = 0xe00f; // b #0x500a7ea
*(u32*)(0x0500a7f4 - 0x05000000 + 0x081C0000) = 0x2302e003; // movs r3, #2; b #0x500a800
```
This set of patches has been floating around different exploit and setup tools for a while. The origin is smealum’s iosuhax from 2016. You can trace the lineage through haxchi and dimok’s work and eventually into Aroma. Each time it’s used, this patchset evolves a little - it’s refactored into different code styles, patches are added and re-ordered. So, at first glance, the project in question appears to be participating in that tradition.
However, I propose this code was taken specifically from Maschell’s Aroma patchset by an AI tool. This tool doesn’t have access to the binary this code modifies, and doesn’t seem to have researched the earlier history or lineage. I will show that it embarked on a refactor even though it doesn’t understand what the code is doing.
- The same patches are made in the same order, with the exception of the first, which is modified, and 4 Aroma patches which were omitted. I have been informed that the modified patch is original (human) work of the author. None of the other lineage shares this much similarity.
- All the ancillary code (variable names, patch functions, etc) has no additional information or context than the original code did. The variable names are generic and don’t e.g. reference the instruction opcodes they contain.
- Every patch has a comment, and those are all the same between the two code samples, with the exception of two - “launch_os_hook bl” and “Ancast header nop nop”. Both comments demonstrate the AI has a complete lack of understanding of the code.
  - Anyone who has looked at ARM assembly - a required skill to develop these patches - can tell you the correct order is `bl launch_os_hook`. All ARM/Thumb instructions lead with the opcode, no exceptions, and that’s how the comment reads in the source material. If the AI system had access to the binaries, it would have seen the correct syntax, so we can conclude it did not see the disassembly and relied only on Aroma as source material. (The human author may have had access, but even so they didn’t catch this in review, or didn’t care.)
  - “Ancast header nop nop” is kinda just meaningless. I have to assume it’s a “summary” of these comments from the Aroma code:

    ```c
    // remove various Ancast header size checks (somehow needed for unencrypted fw.img)
    *(u32*)(0x0500A7C8 - 0x05000000 + 0x081C0000) = 0xbf00bf00; // nop nop
    // [ 2 lines removed ]
    ```

    If so, it’s missing a lot of context by reducing that to four words. It also missed that the comment gives meaning to three patches, not just one. I feel a skilled human writing this code would have kept the full comment intact, and the AI system shortening it makes the code much worse and harder to understand.
- Curiously, the patches which were skipped in the refactor are also the only ones that don’t have comments on the Aroma side. It’s certainly possible an AI tool skipped code it didn’t understand, but I’ll chalk it up to coincidence or deliberate choice by a human.
I know this seems nitpicky, but if you stare at it you’ll note that I have referenced all the changes compared to the Aroma code. Everything the AI system’s refactor brought to the table smells like lack of understanding, at least to me. Compare to previous refactors that adjusted the patchset according to language, needs and context.
How do I know this was taken from the Aroma version of the patchset? Well, here’s Dimok’s version, which Aroma cites in a copyright header as their source:
```c
// patch IOSC_VerifyPubkeySign to always succeed
*(volatile u32*)(0x05052C44 - 0x05000000 + 0x081C0000) = 0xE3A00000; // mov r0, #0
*(volatile u32*)(0x05052C48 - 0x05000000 + 0x081C0000) = 0xE12FFF1E; // bx lr
// patch OS launch sig check
*(volatile u32*)(0x0500A818 - 0x05000000 + 0x081C0000) = 0x20002000; // mov r0, #0; mov r0, #0
```
Note the extra comments. We know from the “Ancast header nop nop” thing that the AI likes to include information from comments when it can. Therefore it must not have seen this version. I did my best with GitHub search, and the Aroma copy of the code is the only one I could find that remotely matches the patchset used and doesn’t have extra comments.
So, my assertion is that the AI model (Jules here, but all of them do this) did not produce novel work or “make” a new codebase. It fetched and “refactored” code from Aroma, washing it of context in the process. The human reviewer was misled by good-looking output and didn’t inspect it further, missing the nonsense comments and lack of attribution in the process. The code is not novel AI-created work; it is taken from existing sources and massaged to look refactored and new.
There are two major problems with this approach to patch development. First, copyright and attribution. The Aroma fw_img_payload repository is licensed as GPLv2 and the specific patch file (since it came from Dimok) is zlib. Both licenses require, among other things, inclusion of a copy of the copyright notice (Dimok’s in this case). The license notice included is not only for the wrong license, it also omits Dimok’s line as required.
There is, I think, a more significant problem. I hope I have proven that the system which wrote this code did not actually read the IOSU assembly they are patching and modifying. Consider, then:
- The missing patches that are in the Aroma build and not here
- The new, bespoke applyPatch system
- The large amounts of new/refactored code across the whole system and the large surface for bugs that introduces
Do you feel an AI system, which only references existing code and does no original research, is trustworthy for this?
## The responsibility and maintainability argument
In my mind, there are three classes of homebrew projects.
- Homebrew applications and tooling (App store, RetroArch, C@VE). If Nintendo opened their SDK up, we could make these same apps with that.
- Game/OS mods and hacks (everything on Gamebanana, Aroma plugins, menu theming). Modifying existing applications for various purposes.
- Exploits and CFW tooling (Mocha, isfshax, UDPIH, de_Fuse, Aroma)
The last category has a special responsibility, since the functioning of user devices is at risk (“brick risk”). These consoles have monetary value - one need only attempt to buy a 3DS to see that - but also emotional value. The childhood Wii. The save files. For a lot of people, a console is the first piece of technology they personally owned rather than being a shared family device.
Isfshax is a very clever exploit that relies on corrupting the filesystem on a Wii U in a specific way. Once the OS boots, it needs to be patched to deal with the corruption without attempting to repair it or crashing. Obviously, surgical filesystem corruption is a.. delicate process that carries enormous risk to your console. Other exploit chains like contenthax (CBHC) and FailST (Aroma) are recoverable with UDPIH, bluubomb, and other non-invasive methods. To the best of my knowledge, recovering a bad isfshax installation requires an invasive hardware modification (de_Fuse or PTB).
Legendary Wii hacker and former Asahi Linux lead marcan has a post detailing his approach to keeping homebrew tooling safe and reliable. Please go read it - this approach is why BootMii was so solid.
I think anyone working on exploits and CFWs needs to acknowledge that these tools are being put in the hands of uneducated users following user-friendly step-by-step guides that tell them which buttons to press in which order. You absolutely have a responsibility to keep their device safe and minimise the risks as much as possible. If it’s a beta, if it’s not final and tested and safe, don’t release binaries. Don’t buy a domain. If you ship it, you are responsible.
I took this to heart when working on Martini, which had a slight chance of causing a (UDPIH and CBHC recoverable) home menu brick while modifying an SSL certificate. It hashes your system state and existing backups. The file copy routine has a hardcoded check to refuse overwriting the OS. It backs up every file it touches and hashes the backups. If a hash comes back bad on the sensitive cert, it restores the backup and double-hashes to check the restore operation works, and has handling for the bad cert, the original cert, the good cert or an unrelated cert being restored. It has error screens and unique error codes for all of these outcomes that direct the user to get help.
I believe it is essential to go to this level of care if you are risking a brick. Chips fail from age and neglect, cosmic rays fly, and CPUs overheat.
The vibe-coded project downloads isfshax from GitHub and executes it without checks. What happens if the repository URL changes? What happens if the CA cert is bad? What happens if a new release is made with a different file structure? What happens if the user’s WiFi drops out? SD card is corrupt? Hynix MLC dies? We keep getting new versions of isfshax but isfshax_installer moves to a different repository? What if we get different versions for different OS releases? 5.5.4 vs. 5.5.5? At least they included an SSL root CA instead of disabling verification, I guess.
The standard of care is completely different here, I feel. When vibe-coding, it’s about just sending it and assuming it’ll be fine. If the bot takes code from online, it’ll probably be fine - the bot seems to have no safeguards to ensure attribution or license compliance. The lack of hash checks or pinning is probably fine - no matter that Aroma Updater already sets a standard for hash and version-checking every file. Vibe-coding feels like the opposite of the responsibility and care we saw with BootMii.
It genuinely does not make a big difference to me if you make a homebrew game and vibe-code it with AI. Will it be uncreative and boring? Probably, but that’s taste. There’s no actual danger there. The cost of a bug is PPCHalt. The cost of a bug in a CFW installer is a dead console. I cannot trust AI systems for this when I have demonstrated above that they do not read the code they are working with. They cannot do original research on IOSU or the Wii U’s systems. They can only combine existing code without understanding it.
## Epilogue
This post was updated after speaking to the author of wafel_installer. I want to focus specifically on the AI’s contributions rather than theirs. It’s very likely we’ll see unskilled people submitting code they do not understand and cannot review in the future, and we need to consider what we do with that. The issues I raised remain even after someone I believe to be skilled reviewed this code - and they still missed attribution, safety checks, and nonsense code. Entirely vibe-coded content would be worse.
This post originally included arguments on the political and ethical ramifications of AI use, but as these don’t relate to homebrew specifically, I’ve split them into another post. I hope you will read them, as I find them more important than the technical arguments - they are systemic issues that cannot be solved with a new prompt, unlike the technical concerns.