Aeon Emulator Blog

September 28, 2009

Performance Considerations – Going Unsafe

Filed under: Aeon, Performance — Greg Divis @ 8:44 pm

I like the unsafe keyword in C#. Don’t get me wrong, I don’t like to use it, but I like that it’s called “unsafe.” Not only is it “unsafe” in that it might cause an access violation outright, but using it throws any guarantee of type-safety in the Common Language Runtime right out the window. In an unsafe block of code, there’s nothing stopping you from munging data in your process’s address space just like you could in good old C++. Even worse, using unsafe code means you’re probably messing around with native pointers, the size and behavior of which may vary depending on what platform the program is running on. On x86 and x64 platforms the only significant difference in pointers (at least as they relate here) is their size, but on other platforms there can be even bigger issues with endianness and pointer alignment.

Unsafe code is initially disabled for new C# projects in Visual Studio (with good reason), but there are a couple of cases where it’s incredibly useful: interoperability with native code, and spot optimization. Aeon uses unsafe code for both cases, though I tried to avoid using it as much as possible.

Conventional Memory

If you’re unfamiliar with the “real-mode” architecture of an x86 processor, count yourself lucky. I’ll be going into lots more detail about this later on, but for now you just need to know that it has 1 MB of addressable RAM. Memory on a PC may be byte-addressable, but that doesn’t necessarily mean you always want to treat it as an array of bytes. Some instructions operate on bytes, others on 16-bit words, and others on 32-bit doublewords. Modeling RAM as a .NET Byte array would be possible, but clumsy and slower than I’d like for anything but byte-value access. In this case, I just want to set aside a block of memory and store arbitrary values in it, because that’s exactly what it’s modeling.

Fortunately, unsafe allows me to access memory using pointer arithmetic and a specified word size, just like in C++. Consider the following code snippet from Aeon’s PhysicalMemory.GetUInt32 method:

    return *(uint*)(rawView + addr);

In this snippet, rawView is a byte pointer to the allocated block of emulated RAM, and addr is the byte offset of the address to read. Emulated video RAM works in a similar fashion.
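To put that one-liner in context, here is a minimal sketch of what a class built around it could look like. This is not Aeon’s actual PhysicalMemory class: the EmulatedMemory name, the constructor, and every member other than the GetUInt32 read are assumptions made for illustration. It needs unsafe code enabled for the project (the AllowUnsafeBlocks setting), does no bounds checking, and relies on the host being little-endian.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch, NOT Aeon's real PhysicalMemory class.
public sealed unsafe class EmulatedMemory : IDisposable
{
    private byte* rawView;   // start of the emulated RAM block

    public EmulatedMemory(int sizeInBytes)
    {
        rawView = (byte*)Marshal.AllocHGlobal(sizeInBytes).ToPointer();
        // AllocHGlobal does not promise zeroed memory, so clear it here.
        for (int i = 0; i < sizeInBytes; i++)
            rawView[i] = 0;
    }

    // The same bytes can be read at whatever width an instruction needs.
    // No bounds checks: a bad addr corrupts or crashes the process.
    public byte GetByte(uint addr) { return rawView[addr]; }
    public ushort GetUInt16(uint addr) { return *(ushort*)(rawView + addr); }
    public uint GetUInt32(uint addr) { return *(uint*)(rawView + addr); }

    public void SetByte(uint addr, byte value) { rawView[addr] = value; }
    public void SetUInt32(uint addr, uint value) { *(uint*)(rawView + addr) = value; }

    public void Dispose()
    {
        if (rawView != null)
        {
            Marshal.FreeHGlobal(new IntPtr(rawView));
            rawView = null;
        }
    }
}
```

Writing a doubleword with SetUInt32 and reading it back with GetByte or GetUInt16 at the same address returns the low-order part of the value, which is the behavior an emulated little-endian machine expects.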


So far everything is pretty basic, but there is one other area I used unsafe code aside from emulated memory – the CPU. The x86 processor architecture has a fairly small number of registers, but to maintain backward compatibility with 16-bit and even 8-bit instructions, some of them are a little strange. Take, for instance, the AL, AH, AX, and EAX registers – each of these registers refers to a different part of the same 32-bit value: EAX is the full 32 bits, AX is the low 16 bits of EAX, AL is the low byte of AX, and AH is the high byte of AX.


So why does this matter? Since the x86 is a little-endian architecture, any arbitrary 32-bit value in memory may be treated like this collection of registers. In the constructor for Aeon’s Processor class, I just allocate a small block of memory on a native heap, initialize a bunch of pointers to the different registers within that block, and then I can use properties to wrap this:

    /// <summary>
    /// Gets or sets the value of the EAX register.
    /// </summary>
    public int EAX
    {
        get { return *(int*)PAX; }
        set { *(int*)PAX = value; }
    }

This has the benefit of automatically keeping the values of AL, AH, AX, and EAX in sync every time one of them changes without the need for any explicit masking and shifting, but it also allows the registers to be accessed efficiently using an index. When decoding instructions, operands which refer to registers use three bits as a register identifier. By also initializing an array of pointers with these identifiers as indices, I can get a pointer to any decoded register very quickly:

    return wordRegisterPointers[rmCode];
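The pieces described above can be sketched together roughly as follows. This is an illustration under assumptions, not Aeon’s Processor class: the RegisterFile name, the block field, byteRegisterPointers, and the Get/Set helpers are all hypothetical, and only wordRegisterPointers and the shape of the EAX property come from the snippets in this post. The register numbering follows the standard x86 encoding (0 = AX, 1 = CX, 2 = DX, 3 = BX, 4 = SP, 5 = BP, 6 = SI, 7 = DI for words; 0-3 = AL, CL, DL, BL and 4-7 = AH, CH, DH, BH for bytes). Like any code of this kind it needs unsafe enabled and a little-endian host.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch, NOT Aeon's real Processor class.
public sealed unsafe class RegisterFile : IDisposable
{
    private const int RegisterCount = 8;
    private byte* block;   // 8 general registers, 4 bytes each
    private readonly ushort*[] wordRegisterPointers = new ushort*[RegisterCount];
    private readonly byte*[] byteRegisterPointers = new byte*[RegisterCount];

    public RegisterFile()
    {
        block = (byte*)Marshal.AllocHGlobal(RegisterCount * 4).ToPointer();
        for (int i = 0; i < RegisterCount * 4; i++)
            block[i] = 0;

        for (int code = 0; code < RegisterCount; code++)
        {
            // Word register n is the low 16 bits of 32-bit slot n.
            wordRegisterPointers[code] = (ushort*)(block + code * 4);
            // Byte codes 0-3 are the low byte of slots 0-3 (AL, CL, DL, BL);
            // codes 4-7 are the second byte of the same slots (AH, CH, DH, BH).
            byteRegisterPointers[code] = block + (code & 3) * 4 + (code >> 2);
        }
    }

    public int EAX
    {
        get { return *(int*)block; }
        set { *(int*)block = value; }
    }

    public ushort AX
    {
        get { return *wordRegisterPointers[0]; }
        set { *wordRegisterPointers[0] = value; }
    }

    // Index-based access, as an instruction decoder would use it.
    public ushort GetWordRegister(int rmCode) { return *wordRegisterPointers[rmCode]; }
    public void SetWordRegister(int rmCode, ushort value) { *wordRegisterPointers[rmCode] = value; }
    public byte GetByteRegister(int rmCode) { return *byteRegisterPointers[rmCode]; }
    public void SetByteRegister(int rmCode, byte value) { *byteRegisterPointers[rmCode] = value; }

    public void Dispose()
    {
        if (block != null)
        {
            Marshal.FreeHGlobal(new IntPtr(block));
            block = null;
        }
    }
}
```

Because every view points into the same four bytes, writing AH through byte code 4 changes what EAX and AX return without any extra bookkeeping.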

Creating my own backing store for what are essentially members of a class is not something I make a habit out of, and I am not advocating this as “faster” or “better” than using normal C# fields or auto-properties to store data. In fact, in almost all cases doing what I’ve done here is a terrible abuse of unsafe code in C#. However, given the performance advantages in this case, I believe it’s justified, and it’s really nice to have this kind of power while still having all the advantages of C#’s abstraction.

Target Architecture

The whole reason I’ve been able to get away with most of these unsafe “enhancements” is that I’ve limited the processors I want Aeon to run on to x86 and x64 architectures. What I’m really doing is taking advantage of the fact that the emulated processor and the host processor are both byte-addressable and little-endian. I could certainly work around this and, indeed, all of this used to be implemented as some bitwise logic in the register get/set properties. However, placing that much logic inside properties tended to prevent their inlining by the JIT compiler (I’ll get more into JIT issues next time), so the instruction decoding phase was significantly slower.
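For comparison, the portable approach with bitwise logic in the properties might look something like the sketch below. The original implementation is not shown in this post, so this is only a guess at its general shape: the PortableRegisters class is hypothetical, and it uses a uint backing field to keep the masks free of sign-extension surprises.

```csharp
// Hypothetical sketch of the masking-and-shifting alternative; it makes
// no assumptions about host endianness and needs no unsafe code.
public sealed class PortableRegisters
{
    private uint eax;

    public uint EAX
    {
        get { return eax; }
        set { eax = value; }
    }

    public ushort AX
    {
        get { return (ushort)(eax & 0xFFFF); }
        set { eax = (eax & 0xFFFF0000) | value; }
    }

    public byte AL
    {
        get { return (byte)(eax & 0xFF); }
        set { eax = (eax & 0xFFFFFF00) | value; }
    }

    public byte AH
    {
        get { return (byte)((eax >> 8) & 0xFF); }
        set { eax = (eax & 0xFFFF00FF) | ((uint)value << 8); }
    }
}
```

Every sub-register access here pays for a mask and sometimes a shift, and looking a register up by its decoded index would need a switch or similar dispatch on top of that, which is the extra logic the pointer-based version avoids.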

