Gameboy cartridge with co-processor
Implemented a password system for saving progress. The password is displayed in between levels like in the SNES version.
There is a new password entry screen and the title screen was altered to a simple “new game” / “continue” menu.
Also implemented Pause functionality, and a cheat menu.
Back from holidays!
Added Elevator, Victory and Game Over screens.
There is really not that much left until I consider the project completed:
Pause functionality, password save system, background music and a few known bugs. I might also need to add an option on the title screen for toggling music on/off. And should I implement the password system I’d need a screen for entering passwords and an option for selecting Start or Continue on the title screen.
Some reasoning behind the hardware choices:
I received a few questions about why I chose this particular MCU, whether the cartridge could run more advanced games, and so on.
For me, the goal of this project is to learn and have fun, much more so than making a game.
I wanted to learn how to design PCBs, work with surface mount components, and generally learn more about hardware. I am a software guy and consider myself still very much a novice when it comes to hardware.
It always helps to have a specific problem to solve when learning something, so I decided to see if I could make something in the spirit of the Super-FX, but for the Gameboy.
The Gameboy was a natural choice: I started my career in games way back in the day working on the Gameboy, so I thought it would be a fun trip down memory lane – I was right!
Originally I wanted to use a CPLD instead of salvaging an MBC1 from an existing game, and I was reluctant to use dual-port SRAM, but taking it one step at a time helps in actually getting anything done at all – perhaps for the next revision!
While it would be easy to stick a beefier MCU in there, it would take away too much of the point of the project for me. And, for me, a Wolfenstein game felt like a realistic/fun software project to go with the hardware.
I did not want to make a source port and Wolfenstein is a simple enough game to remake from scratch in a reasonable time frame.
Anyways… I had this wish list for the MCU:
– Designed for low power consumption
– It cannot be performance overkill. Somewhere in the ~20-40 MHz region would be nice.
– Minimum 16 KB internal RAM, but a little more would be better.
– Internal ROM of at least 256 KB.
– Reasonably small in size
– Must have enough pins (doh!). A few more than I think I would need just in case.
– I must be able to solder it by hand, preferably no smaller than 0.8mm pin pitch.
– Operating on 5V would be top choice
– If not operating on 5V, second choice is for 5V tolerant pins.
– Few external components. Internal oscillator would be a big plus.
ARM Cortex-M0/M0+ seemed to fit the bill quite nicely. While it can be configured to run all the way up to 48 MHz, it’s no powerhouse compared to other ARM chips, as it is specifically designed to be very simple and to have the lowest power consumption possible. It does close to 45 MIPS @ 48 MHz.
The higher-end Cortex-M chips are generally faster and have more advanced instructions, some even floating point, but those really felt like overkill.
The only instruction I wish I had is RBIT, which is present in M3 and above. It’s really not a big deal, but it still bugs me having to spend 1ms doing it in software for the tile conversion.
I don’t really care about DIV, though it is present on the M3, in case someone out there is deciding between M0 and M3 and wants hardware division.
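For anyone who has not met RBIT: it reverses the bit order of a register in one instruction. Without it, the reversal has to be done with shifts and masks. The snippet below is only a readable Python model of one common software fallback for a single byte. It is not the code running on the cartridge, and the function name is made up for this illustration.

```python
def reverse_bits8(value: int) -> int:
    """Reverse the bit order of an 8-bit value using three swap steps."""
    value &= 0xFF
    value = ((value & 0xF0) >> 4) | ((value & 0x0F) << 4)  # swap nibbles
    value = ((value & 0xCC) >> 2) | ((value & 0x33) << 2)  # swap bit pairs
    value = ((value & 0xAA) >> 1) | ((value & 0x55) << 1)  # swap neighbouring bits
    return value
```

On a Cortex-M0 each of those lines turns into several instructions, which is why doing it for every byte of a frame adds up.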
So, narrowing the search for 5V devices resulted in two contenders: KE series from NXP and SAMC series from Atmel.
The SAMC20E from Atmel was my first choice because it fit my RAM and ROM size wish list. However, I was concerned about the usable pin count of the 32-pin version, and the 48-pin version had a pitch that I was worried I wouldn’t be able to solder.
The final nail in the coffin was when I found out about the bit-banding(*) feature in the NXP.
(*) Bit banding:
Essentially this gives you a virtual address range where you can read or write individual bits in RAM as if they were integers. I.e., no need for read/modify/write operations and masking bits. This is huge!
As an example, my framebuffer is made of two planes, each having 1 bit per pixel (3840 bytes in total). Very similar to the Gameboy tile format, but obviously not tiled at this point – it does help speed up the conversion to tiles though.
Without bit banding it would be significantly slower to render to this framebuffer. And with only 16 KB RAM it would not be possible to just make it into even a byte array (160*96 bytes = 15360 bytes = pretty much the entire RAM).
I am glad I picked an MCU with bit banding. I use it for:
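To make the address trick concrete, here is a small host-side Python model of the conventional Cortex-M style bit-band mapping, where every bit of a RAM byte gets its own 32-bit word in an alias region. The base addresses, the row-major pixel layout and the function names are assumptions for illustration; the exact regions on the KE04 should be checked against its reference manual.

```python
# Host-side model of Cortex-M style bit-banding (illustrative only).
SRAM_BASE = 0x2000_0000    # assumed start of the bit-band RAM region
ALIAS_BASE = 0x2200_0000   # assumed start of the bit-band alias region

WIDTH, HEIGHT = 160, 96
PLANE_BYTES = WIDTH * HEIGHT // 8   # 1920 bytes per 1-bit plane


def alias_address(byte_addr: int, bit: int) -> int:
    """Return the alias word address that maps to one bit of one RAM byte."""
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in 0..7")
    return ALIAS_BASE + (byte_addr - SRAM_BASE) * 32 + bit * 4


def pixel_alias(fb_addr: int, plane: int, x: int, y: int) -> int:
    """Alias address of pixel (x, y) in one plane (hypothetical row-major layout)."""
    bit_index = y * WIDTH + x   # one bit per pixel
    byte_addr = fb_addr + plane * PLANE_BYTES + bit_index // 8
    return alias_address(byte_addr, bit_index % 8)
```

With this mapping, neighbouring pixels in a plane are simply neighbouring words in the alias region, so setting a pixel becomes a single word write with no masking. Two planes cost 2 * 1920 = 3840 bytes, compared with 15360 bytes for one byte per pixel.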
– Framebuffer (2 bits per pixel)
– Visibility map (1 bit per tile)
– Collision map (1 bit per tile)
– Texture cache (3 bits per pixel, it has an alpha bit)
The texture cache is used to hold the source texture which will be drawn to the framebuffer. Every time a texture is needed it is copied or decompressed from ROM into the texture cache region of bit-banding RAM.
Once it resides in the texture cache, the renderer can quite quickly read from the source texture and write to the framebuffer. Of course, I have so little RAM that the texture cache ended up being only big enough to hold a single texture, but it works fine and we’re still within performance budgets even though we’re doing a lot of copying.
This ended up being much longer than I expected!
I am planning to write better documentation sometime after the holidays, and after the Wolfenstein game is completed.
Made a few more cartridges:
Good for practicing SMD soldering. It’s a lot easier now than it was when I began this project!
One of the carts is going to be a gift and one is for me to permanently have Wolfenstein on.
The first one I made will continue its life as a development cartridge – I’m interested in perhaps making some kind of Mario Kart style game in the future, or at least a proof of concept.
Work in progress game logic and AI:
Implemented most of the game logic, and basic AI which still needs some tweaking.
Also added some “glue” stuff such as title screens, fade in/out and so on (Menu, Elevator, Game Over & Victory screens are not yet finished).
With 3 enemy types and the 10 levels from Episode 1, I am starting to reach the limit on how much more I can put on the ROM.
Currently sitting at 123 KB out of the 128 KB available, so there’s still room for a little more stuff.
Made a box and a sticker for the cartridge:
Added a few features to the Wolfenstein game.
– Collision against walls and objects
– Some sound effects
Next step is probably going to be interactions with doors & secret walls.
Refactored the map converter to reduce memory overhead.
It looks like we are on track for fitting the first episode (10 levels) on the cartridge.
I’m not worried about RAM any more, 3 KB remaining is plenty!
ROM usage looks promising for fitting the remaining 9 maps, some more map objects, and the enemy sprites.
Some progress on the raycaster test:
Made some tools for converting the original Wolfenstein3D maps.
Added doors that can open and close.
Memory optimizations! RAM and ROM are extremely limited.
Music and sound playback.
HUD in progress.
Raycaster test with sprites and colors:
Started documenting the technical bits, mostly so I don’t forget.
This is roughly what one gameplay frame could look like on the two CPUs.
It should be possible to HDMA additional tiles between the two VBlanks, so we could in fact fill almost the entire screen with tiles from the KE04.
Experimenting with colors:
The Gameboy has a clunky way of handling colors, especially if you try to do anything other than rendering tiles.
It has a total of 8 palettes of 4 colors each, and each 8×8 tile can be assigned to one palette.
The KE04 tries to find the best matching palette for each tile. To reduce color errors I decided to make each palette mostly grayscale with a single accent color.
Each texture on the KE04 will use a single palette for memory and performance reasons.
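One simple way to pick the "best matching palette" for a tile is to measure, for each palette, how far every color in the tile is from its nearest palette entry, and keep the palette with the lowest total error. The Python sketch below illustrates that idea only. The squared-RGB error metric, the 5-bit example colors and all names are assumptions, and the actual matching on the KE04 may well work differently.

```python
def color_error(a, b):
    """Squared RGB distance between two (r, g, b) tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def palette_error(tile_colors, palette):
    """Total error when every tile color snaps to its nearest palette entry."""
    return sum(min(color_error(c, p) for p in palette) for c in tile_colors)


def best_palette(tile_colors, palettes):
    """Index of the palette with the lowest total error for this tile."""
    return min(range(len(palettes)),
               key=lambda i: palette_error(tile_colors, palettes[i]))


# Hypothetical palettes: three grays plus one accent color (5 bits per channel).
GRAYS = [(0, 0, 0), (10, 10, 10), (21, 21, 21)]
PALETTES = [
    GRAYS + [(31, 0, 0)],   # red accent
    GRAYS + [(0, 0, 31)],   # blue accent
]
```

For example, `best_palette([(0, 0, 0), (28, 2, 2)], PALETTES)` picks the red-accent palette. Because the grays are shared, choosing the "wrong" palette only distorts the accent color, which fits the idea of keeping the palettes mostly grayscale.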
Simple work-in-progress raycaster test on Rev.C cartridge:
New PCBs arrived in the mail!
Like last time, it took exactly one week from placing the order. I can’t recommend www.elecrow.com enough.
Here is a fully populated Rev.C board. I am extremely happy that everything works as expected and that I didn’t need any bodge wires this time around.
As far as hardware goes, this could easily be called the final revision. I will now focus more on the software side of things.
Board contains: MBC1B, AM29F040B, CY7C144E-55AXC, MKE04Z128VLD4, resistors & capacitors
First test of image generation:
The coprocessor needs about 2ms to convert the framebuffer to tiles and transfer them to SRAM.
This leaves about 31ms for processing/rendering if we run at 30fps, plenty!
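The conversion step can be pictured with a small Python model. A Gameboy tile is 16 bytes: two bytes per pixel row, low bitplane byte first, with bit 7 as the leftmost pixel. If the framebuffer is held as two row-major 1-bit planes with the same bit order (an assumption made here for simplicity; the real framebuffer may order bits differently), the conversion is mostly a reshuffle of bytes. The function name is hypothetical.

```python
WIDTH, HEIGHT = 160, 96
ROW_BYTES = WIDTH // 8   # 20 bytes per pixel row in each plane


def planes_to_tiles(plane0: bytes, plane1: bytes) -> bytes:
    """Interleave two 1-bit planes into Gameboy 2bpp tile data, tile by tile."""
    out = bytearray()
    for ty in range(HEIGHT // 8):          # tile rows
        for tx in range(ROW_BYTES):        # one tile per plane byte column
            for row in range(8):           # 8 pixel rows per tile
                offset = (ty * 8 + row) * ROW_BYTES + tx
                out.append(plane0[offset])  # low bit of each pixel's color index
                out.append(plane1[offset])  # high bit of each pixel's color index
    return bytes(out)
```

For a 160×96 image this produces 240 tiles, or 3840 bytes, which is the same size as the two planes combined.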
The Gameboy can only DMA 128 tiles to VRAM per VBlank, so we need to run at 30 fps to get a maximum of 256 tiles.
256 tiles gives us either 160×96 or 128×128 pixel resolution.
This does not fill the entire Gameboy screen which is 160×144 pixels, but the Gameboy side could easily fill the blank space with border/gameplay UI so it’s not a big deal.
It is possible to cover the entire screen by rendering the image in half height and double it to normal size by manipulating the scroll-y register in the HBlank interrupt. That would result in only 180 tiles for the entire screen at the expense of halving the vertical resolution.
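The numbers above are easy to check with a few lines of Python. This is just worked arithmetic using the figures quoted in this post (8×8 tiles, 128 tiles per VBlank, about 2ms conversion time). The function names are made up for the example.

```python
TILE = 8                  # Gameboy tiles are 8x8 pixels
TILES_PER_VBLANK = 128    # DMA limit quoted above


def tiles_needed(width: int, height: int) -> int:
    """Number of 8x8 tiles required to cover a width x height image."""
    return (width // TILE) * (height // TILE)


def vblanks_needed(tiles: int) -> int:
    """How many VBlank periods it takes to upload that many tiles."""
    return -(-tiles // TILES_PER_VBLANK)   # ceiling division


def render_budget_ms(fps: float, convert_ms: float = 2.0) -> float:
    """Time left per frame after the framebuffer-to-tile conversion."""
    return 1000.0 / fps - convert_ms
```

`tiles_needed(160, 96)` is 240 and `tiles_needed(128, 128)` is 256, so both fit in two VBlanks. The full 160×144 screen would need 360 tiles, while the half-height 160×72 image needs only 180. `render_budget_ms(30)` comes out at roughly 31.3.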
New board ordered
Rev.C designed and PCBs ordered from Elecrow.
This version fixes the flaws present on Rev.B and includes the KE04 processor on the board.
It has pin headers for connecting the SWD debugger plus a couple of extra unused pins.
I will try to use the KE04 from NXP as a co-processor.
It runs on 5V and has enough pins for the dual-port SRAM, plus some extras that I may or may not need.
It sports an ARM Cortex-M0+ @ 48 MHz, 128 KB flash and 16 KB RAM. It also has bit manipulation features that could be useful for manipulating Gameboy graphics.
I was first looking to use the Atmel SAMC20E18, which is almost identical to the NXP but with double the amount of flash and RAM. However, it might not have enough pins, and it does not have bit-banding features, so I am going with the NXP.
Testing the rev.B cartridge
Wrote a simple program for testing cart SRAM read/write/DMA.
Also verified that I can toggle the EA0 and EA1 pins correctly. I plan on using these as a form of communication/triggers from Gameboy to co-processor.
Second eeprom programmer
I have had a C3 flashcart programmer from Jeff Frohwein lying around for some 20-ish years.
It connects to the PC parallel port, which my Mac lacks, so with some modifications I could interface it with my Teensy++ and now program the AM29F040 which is soldered onto the cartridge.
Second prototype cartridge
– MBC1 memory mapper salvaged from a gameboy game
– AM29F040 (eeprom)
– CY7C144E (dual port sram)
– Pin header for right side of CY7C144E and some other useful pins for the co-processor.
Since OSHPark only offers PCBs of 1.6mm thickness I had to look elsewhere for thinner PCBs.
Update: I later found out that OSHPark actually has an option for 0.8mm PCBs. I must have missed that when I was looking for it.
I heard good things about Elecrow and they did not disappoint. Super fast turnaround time – the PCBs arrived at my home in Sweden almost exactly 1 week after placing the order (I selected DHL for shipping).
Their prices are very good and there is a good selection of options. I will keep using them for all my future PCB orders.
I had missed one trace and routed another one to the wrong pin of the MBC1 chip, but these issues can easily be fixed with wires, so the boards are still good for prototyping.
First EEprom programmer
Made an eeprom programmer using a Teensy++.
It can program 512 KB (4 Mbit) AM29F040 eeproms in either DIP or PLCC package (using an adapter).
A Gameboy IO breakout cartridge. Breaks out the cart connector and MBC chip to pin headers. Designed in Eagle and manufactured by Oshpark.
Oshpark PCBs are way too thick. The correct thickness seems to be roughly half that of an Oshpark PCB.
Nevertheless, it can be inserted in a Gameboy with some convincing (read: brute force!).