One of the most prolific programmers squeezing ever more out of the C64, at a dizzying pace and with increasing finesse, is our guest DrMortalWombat. He is known for his extensive video game catalog and surely even more for Oscar64, the C compiler for our system, which keeps being optimized and updated and which more and more people use for their own projects. To talk about DMW is to talk about a born programmer who knew from a very young age what he liked. Here he shares his story, his beginnings, and how our profession so often goes hand in hand with our hobbies.
–We leave you with DrMortalWombat himself:
–My first programming memories are writing assembly on punch cards for an IBM /360 system. Nothing complex, I must have been in elementary school at the time, so it was the equivalent of BASIC one liners, but with a steel ribbon printer as output. My dad was the admin of a local computer center, so I had some access and exposure there.
In ’82 I got my own VIC20 and, as expected, was amazed by the graphics and sound, but completely underwhelmed by the slow BASIC. So naturally I wrote my own assembler in BASIC to tap into the 6502 machine language “powerhouse”. This small CPU left more of an impression on me than the big mainframe. The amount of compute power in this small chip was and is simply amazing. I sold several small games to computer magazines and used the money to upgrade to a C128 (which I still have and use) in late ’85.
By that time, my brain ran natively on 6502 assembly rather than my natural language, so I was always disturbed by how my friends struggled with assembly but had no big problems with BASIC. As a result I started writing ever more powerful BASIC compilers. I also did some small jobs in x86 assembler for my dad's shop at the time, and this started a lifelong hatred for everything that had an 8080 heritage.
During my army time from ’88 on, I was exposed to Turbo Pascal and immediately fell in love with it. We learned Modula2 in my first university semester and had the MacMETH compiler on a Mac. At the time I had an Amiga that I programmed in 68K assembly. I loved the Modula2 language and the speed of the compiler, but the compiled code was not good enough for me, so I used this compiler to bootstrap my own Modula2 compiler and integrated IDE for the Amiga. This made it into an actual commercial product.
I got into contact with GVP (one of the bigger US Amiga peripheral manufacturers) and started working for them during my university years. Many former Commodore engineers worked there, so I had the pleasure to meet and work side by side with some of the guys that developed the hardware that we all love.
In ’95 with the end of the Amiga era, some of my friends and some of the core people of GVP established new companies now doing digital video products for the PC. This is where I had to abandon two of my loved platforms 68k and Modula2 and work on the dark side of x86 and C++ - and this has not changed since.
I got my computer science doctoral degree in 2000 and worked as a freelancer from 2005 on. I am doing mostly performance engineering and DSP code at the moment, but I have built embedded consumer entertainment systems, AI systems, computer game engines, server platforms, compilers, data analysis software and what not in between.
During the pandemic I started cleaning out the basement and found my old C128 again … well and the rest is history.
I built the first two new games (“Plekthora” and “Gates of the Ancient”) in assembler with the C64Studio IDE, but decided to use something more powerful for the next game (“Shallow Domains”).
The existing compilers looked problematic: KickC was not really C, and the code generated by CC65 was not up to the performance and size that I would need for my planned game. So I decided to build my own compiler for the 6502. I was swaying between Modula2 and C for a while, but then decided on C because it was the more low level language.
[Image: Gates of the Ancient]
I rummaged in old projects and found two that could be married to build a foundation for the compiler: a JavaScript parser and the x64 backend for a compiler I had worked on some 15 years ago.
At the time I was under the assumption that the 6502 would not be a good target for any higher level language, due to the lack of a sufficient value stack and the lack of 16 bit instructions. The idea was to have two code generators: one generating a byte code that would be interpreted for the non time critical parts to save code size, and a native generator for the hot zones.
The project was started on August 21st 2021 and the first compiled C64 byte code programs started to run on August 29th. With the byte code compiler and interpreter working, I started with the native code generator on September 7th and reached the same functional level on September 11th. In the beginning, the code size advantage of the byte code was quite significant, but the more time I spent with improving the native code optimizer the closer the two got – and finally the native code generator overtook the byte code one.
So at the moment, the byte code generator is only used for validation purposes and has no real world application any more.
It turns out that the 6502 is a great target for a C compiler if the optimizer and code generator are designed from start to end to work to its strengths and not its weaknesses. The three biggest improvements in code quality came from:
- Static call graph analysis is used to eliminate the call stack whenever possible and use zero page or absolute addressing for local variables and parameters. Only function pointers and recursion may end up using the value stack.
- Static integer range analysis allows the simplification of many 16bit operations to 8bit operations. The compiler may simply be able to prove that values do not exceed the smaller range.
- Conversion of indirect addressing to indexed addressing when pointer arithmetic or array indexing fits into a single index register.
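As a rough illustration of the three points above, the following plain C fragment has the shape that such analyses can exploit. The function and variable names are made up for this sketch, and what Oscar64 actually emits for it is not shown here.

```c
/* Sketch only: plain C whose shape matches the three points above. */

static unsigned char row[40];

/* Not recursive and never called through a function pointer, so a
   static call graph analysis can give 'n' and 'value' fixed memory
   locations instead of stack slots. */
static void fill(unsigned char n, unsigned char value)
{
    /* 'i' is an int in the source, but it never exceeds 'n' (at most
       255), so range analysis can narrow it to 8 bits. */
    for (int i = 0; i < n; i++)
        row[i] = value;   /* fixed base plus small index: a candidate for indexed addressing */
}

/* Hypothetical caller used for illustration. */
unsigned char fill_and_peek(void)
{
    fill(40, 81);
    return row[39];
}
```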
[Image: Portal Buster]
With every new game that I write, the compiler keeps improving. I spend about 25% of my development time for each game on getting the compiler to generate faster and/or smaller code.
I try to avoid any kind of assembler in my games (although the compiler supports smooth integration of inline assembler and C code). Whenever I feel the need to add an assembler section, I instead improve the compiler to not have to add it – and this policy has worked great so far.
Oscar64 is a combined compiler and linker. It compiles, optimizes and links the whole program and outputs an executable file (either a .prg or a cartridge .crt) with a single invocation, without the need for any additional build steps. So if you have a C source file “hello.c”, calling the compiler with “oscar64 hello.c” will generate a “hello.prg”, which can then be started with e.g. Vice. This whole program compile is not only convenient by eliminating the need for a makefile, it also allows the compiler to apply more complex optimizations. This is one of the advantages of cross compiling on a computer that is orders of magnitude more powerful in speed and memory size than the target system.
Programming games for a micro computer such as the C64 involves frequent direct access to memory or IO registers. In BASIC this would be done with peek() or poke, in assembler with absolute addressing. In C the natural way is to use pointers:
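    static char * const Screen = (char *)0x400;   // start of screen memory
    for (char i = 0; i < 40; i++)                 // 40 columns in the first line
        Screen[i] = 81;                           // screen code 81 is a filled circle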
This loop will fill the first line of screen memory with circles.
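The same idea can be wrapped in a small helper that fills any text row. This is a minimal sketch: `fill_row`, `SCREEN_COLS` and `SCREEN_ROWS` are hypothetical names, not part of the Oscar64 library, and the buffer is passed in as a pointer so that the code also runs off the C64.

```c
#define SCREEN_COLS 40
#define SCREEN_ROWS 25

/* Fill one text row with a screen code. 'screen' points at the start of
   a 40x25 character buffer; on a C64 with the default memory setup that
   would be (char *)0x0400. Hypothetical helper for illustration. */
void fill_row(char *screen, char row, char code)
{
    char *line = screen + row * SCREEN_COLS;   /* start of the requested row */

    for (char i = 0; i < SCREEN_COLS; i++)
        line[i] = code;
}
```

Calling `fill_row((char *)0x0400, 0, 81)` on the C64 would reproduce the loop above.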
For IO registers it is easier to use the predefined structures in the library with e.g:
    #include <c64/vic.h>
    vic.color_back = 2;
    vic.color_border = 3;
There are often three ways to access hardware components with Oscar64: you can poke directly with a pointer, use the registers defined in the header files, or use higher level functions defined in those header files, e.g. for sprite manipulation:
void vic_sprxy(byte s, int x, int y);
Similar library methods exist for reading user input from joystick, keyboard, paddles or mouse.
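A higher level function for sprite positions is useful because the VIC-II splits a sprite's x coordinate: the low 8 bits live in a per-sprite register and the ninth bit lives in a shared register at $D010. The sketch below shows that split on a stand-in struct so it can run anywhere. `SpriteRegs` and `set_sprite_xy` are hypothetical names, and this is not the code of the library's `vic_sprxy`.

```c
/* Hypothetical stand-in for the first 17 VIC-II registers, used here so
   the logic can run on any machine. */
struct SpriteRegs
{
    struct { unsigned char x, y; } pos[8];   /* $D000-$D00F: low 8 bits of x, and y */
    unsigned char msbx;                      /* $D010: bit n holds bit 8 of sprite n's x */
};

/* Store a 9-bit x and an 8-bit y for sprite s (0..7). */
void set_sprite_xy(struct SpriteRegs *regs, unsigned char s, int x, int y)
{
    regs->pos[s].x = (unsigned char)(x & 0xff);
    regs->pos[s].y = (unsigned char)(y & 0xff);

    if (x & 0x100)
        regs->msbx |= (unsigned char)(1 << s);    /* x >= 256: set this sprite's bit */
    else
        regs->msbx &= (unsigned char)~(1 << s);   /* x < 256: clear it */
}
```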
The next important question is how to get assets such as fonts, bitmaps or sprites into your program. Oscar64 supports the #embed preprocessor command, which translates binary files into comma separated numbers that can then easily be used as constant literal initializers:
    const char MilitaryFont[] = {
        #embed "../Resources/militaryfont.bin"
    };
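To picture what the compiler sees after such a translation, here is a hand-written equivalent. The eight bytes are made-up sample data (one 8x8 character drawn as a filled circle), not the content of any real font file, and `SampleGlyph` and `glyph_size` are hypothetical names.

```c
/* What the array looks like once a binary file has been turned into
   comma separated numbers. Sample data only. */
const unsigned char SampleGlyph[] = {
    0x3c, 0x7e, 0xff, 0xff, 0xff, 0xff, 0x7e, 0x3c
};

/* The array size follows from the data, as with any initializer list. */
unsigned glyph_size(void)
{
    return (unsigned)sizeof(SampleGlyph);
}
```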
Oscar64 goes even further by providing compression and parsers for common C64 resource files such as Charpad or Spritepad files.
    static const char BeltChars[] = {
        #embed ctm_chars lzo "belt.ctm"
    };
Another frequent element in C64 game programming is the raster interrupt. In Oscar64 one can declare a function as “interrupt”, which will ensure that it saves all the zero page locations it uses for processing. A higher level concept is implemented in the rasterirq library, which provides a kind of copper. The library does the sorting and interrupt handling, and the user of the library simply has to specify the raster lines and the IO registers to change. This library is also the base for the sprite multiplexer library.
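The copper idea can be pictured as a table of "at this raster line, write this value to this register" entries that is kept sorted by line. The following is a conceptual sketch in plain C under that assumption. `RasterWrite` and `sort_raster_table` are hypothetical names and are not the rasterirq library's API.

```c
/* One entry of a hypothetical raster table: at 'line', write 'value'
   to the IO register at address 'reg'. */
struct RasterWrite
{
    unsigned line;           /* raster line, 0..311 on a PAL machine */
    unsigned reg;            /* register address, e.g. 0xd021 */
    unsigned char value;     /* value to store there */
};

/* Insertion sort by raster line, so an interrupt handler could walk the
   table top to bottom once per frame. */
void sort_raster_table(struct RasterWrite *t, unsigned char n)
{
    for (unsigned char i = 1; i < n; i++)
    {
        struct RasterWrite key = t[i];
        unsigned char j = i;

        while (j > 0 && t[j - 1].line > key.line)
        {
            t[j] = t[j - 1];   /* shift later entries down */
            j--;
        }
        t[j] = key;
    }
}
```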
The final topic I want to touch here is the library of game specific samples/tutorials on github: https://github.com/drmortalwombat/OscarTutorials
Besides the .prg or .crt, Oscar64 generates various other files during compilation that help with debugging and profiling. The .map file provides a breakdown of size and placement of all functions and global variables to get a rough overview of memory usage. A more detailed listing, showing the code size of each source line, is generated as a .csz file with the -gp option. This file maps every source line of your program to a memory address and a size in bytes.
The other way around, mapping memory addresses to source, is provided by the .asm file, which includes line number references if built with the -g or -gp option. Making use of this file requires some 6502 knowledge to understand which code sequences the compiler generated and why.
When using Vice for emulation and debugging, the .lbl file will come in handy, providing symbols for the program and thus allowing some level of symbolic debugging. It can be loaded into Vice with the -moncommands command line argument.
The compiler tries to generate optimal code, but it is limited by the rules of the C language, the 6502 and the source code of your game. If you think the generated code is not up to par, there are various ways to improve this. The first step is to check the optimizer levels. The default level is -O1, which already does a lot of local optimizations but is limited in the amount of automatic inlining it performs. The main optimizer level -O2 inlines most functions where this appears to be beneficial in size and speed. It also performs many inter procedural optimizations. The most aggressive mode -O3 does a lot of loop unrolling and also inlines functions even if code size may increase.
Even more can be achieved by helping the compiler. Picking the right datatype helps a lot, so if a value fits into a byte, use a char instead of an int. The compiler does static value range analysis to predict the potential value range of each integer expression, but it cannot know the range when loading a value from memory or getting it as a function argument. The __assume statement can tell the compiler the potential range of a value, e.g. for screen coordinates:
    void put(char x, char y)
    {
        __assume(x < 40 && y < 25);
        ...
Another potential use of __assume is to mark unreachable code with __assume(false) to prevent code generation for it.
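Both uses can be tried in a small portable sketch. The wrapper macro `ASSUME`, the switch `USE_OSCAR64_ASSUME` and the functions `cell_offset` and `dir_step` are hypothetical names introduced for this example. The switch is something you would define yourself when building with Oscar64, not a macro the compiler provides.

```c
/* ASSUME wraps the compiler hint so the file also builds elsewhere.
   USE_OSCAR64_ASSUME is a hypothetical, user-defined switch. */
#ifdef USE_OSCAR64_ASSUME
#define ASSUME(cond) __assume(cond)
#else
#define ASSUME(cond) ((void)0)
#endif

/* Offset of a character cell in a 40x25 text screen. */
unsigned cell_offset(unsigned char x, unsigned char y)
{
    ASSUME(x < 40 && y < 25);   /* arguments come from the caller, so state their range */
    return y * 40u + x;
}

enum Dir { DIR_LEFT, DIR_RIGHT };

/* Horizontal step for a direction. */
signed char dir_step(enum Dir d)
{
    switch (d)
    {
    case DIR_LEFT:  return -1;
    case DIR_RIGHT: return 1;
    }
    ASSUME(0);   /* every valid direction is handled above; 0 plays the role of false */
    return 0;    /* kept so the function stays valid C without the hint */
}
```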
Another programmer controlled optimization hint is loop unrolling. This can be controlled with a #pragma to either fully unroll (aka speedcode), unroll into chunks of n or unroll in a way that the index variable becomes a single byte and thus fits into either x or y register.
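The last variant is easiest to see when written out by hand. The sketch below clears a 1000 byte screen twice: once with a 16 bit index, and once split into four chunks so that the index stays below 256. It only illustrates the transformation. The pragma spelling that asks Oscar64 to do this is documented in the compiler manual and not reproduced here, and the function names are hypothetical.

```c
#define SCREEN_SIZE 1000   /* 40 columns x 25 rows */

/* Straightforward version: the index runs to 999 and needs 16 bits. */
void clear_plain(unsigned char *screen, unsigned char code)
{
    for (unsigned i = 0; i < SCREEN_SIZE; i++)
        screen[i] = code;
}

/* Hand-unrolled into four chunks of 250: the index now stays below 256
   and fits a single byte, which suits an 8-bit index register. */
void clear_chunked(unsigned char *screen, unsigned char code)
{
    for (unsigned char i = 0; i < 250; i++)
    {
        screen[i] = code;
        screen[i + 250] = code;
        screen[i + 500] = code;
        screen[i + 750] = code;
    }
}
```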
The final topic I want to cover here is memory layout. The common layout of an array of structs places one complete struct after another in memory. This has two downsides when accessing individual array elements: first, calculating the address requires a multiplication, and second, the access then needs indirect addressing and actual pointer arithmetic. A struct of short arrays, on the other hand, would only need absolute indexed addressing. Oscar64 supports the easy and transparent translation of an array of structs into a struct of arrays with the __striped storage qualifier.
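The two layouts can be compared side by side when the struct of arrays is written out by hand. This is a sketch with made-up names (`Enemy`, `Enemies`, `hit_aos`, `hit_soa`). With __striped, as described above, the source would keep the array of structs form while the data is stored like the second layout.

```c
#define NUM_ENEMIES 16

/* Array of structs: element i starts at base + i * sizeof(struct Enemy),
   so reaching a field means a multiplication plus pointer arithmetic. */
struct Enemy
{
    unsigned char x, y, energy;
};
struct Enemy enemies_aos[NUM_ENEMIES];

/* Struct of arrays, written by hand: each field is its own byte array,
   so field i sits at a fixed base plus i. */
struct Enemies
{
    unsigned char x[NUM_ENEMIES];
    unsigned char y[NUM_ENEMIES];
    unsigned char energy[NUM_ENEMIES];
};
struct Enemies enemies_soa;

/* Same operation on both layouts (hypothetical game logic). */
void hit_aos(unsigned char i, unsigned char damage)
{
    enemies_aos[i].energy -= damage;
}

void hit_soa(unsigned char i, unsigned char damage)
{
    enemies_soa.energy[i] -= damage;
}
```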
–Finally, as a farewell, we asked DrMortalWombat to give us reasons for using Oscar64. His answer was:
–So why use C to write games for the C64?
- If you are new to the C64 but an old hand with C you have a less painful entry
- If you are new to the C64 and new to C this is the ideal entry to one of the main ancestors of most modern languages
- If you are an old hand on the C64 and a great 6502 assembler hacker, you can save a lot of development time by using a higher level language
- If you are a proficient C64 developer but suck at 6502 assembly you can let the compiler do the tricky part
–We can only thank DrMortalWombat for answering our questions. Thank you very much!
Links:
- You can go to DMW's Itch.io at THIS link
- You can visit his GitHub at THIS link
- You can visit our article dedicated to his games at THIS link
- Spanish version: Oscar64 con DrMortalWombat [ES]