Loading BASIC programs from cassette tape to the Sega Master System

Monday, 16th August 2021

After I posted about the pattern filling modes on Twitter I was alerted to the BBC Micro Bot website which hosts a gallery of programs that produce impressive graphical output from very short BASIC programs (short enough to fit in a Tweet!)

I tried a few of them out but unfortunately ran into problems with a lot of them that use various tricks to reduce the original program text length by embedding non-ASCII characters directly into the body of the program. My usual approach to prepare programs was either to copy and paste them into an emulator (my emulator strips out non-ASCII characters on pasting, as these can't be mapped to keystrokes) or to save them to a file and transfer that serially, and my own editor's tokeniser and BBC BASIC for Windows both had a habit of mangling the non-ASCII characters.

TV showing a picture of a shell generated from a BASIC program

The BBC Micro Bot website does not provide a way to download the BASIC programs directly but does provide a share link that allows you to export a disk image or play back the program from a virtual tape cassette. I don't have a a disk drive attached to my Master System, but the tape option seemed promising...

The basic cassette tape interface

My initial approach to loading programs into BASIC from external storage was to connect my Cambridge Z88 computer running PC Link 2 or EazyLink to the Master System with an RS-232 serial cable. This works well for me, but isn't very practical for others who might not already own a Z88, and I do want this to be an easy project for people to replicate!

Being able to load from tape would significantly lower the barrier to entry, if the cassette tape interface circuit can be constructed easily. You wouldn't even need an actual tape player; the drive belts have gone in mine (I tried to make a test recording on mine and it just unspooled the tape into the bowels of the machine) so for the time being I've been testing loading from PlayUEF, the web-based UEF player that's also found on the BBC Micro Bot website. This can be run from a browser and it works well on my mobile phone and makes loading programs from a web link an absolute doddle. Perfect!

The challenge, then, is how to get that audio data into the Master System in the first place and whether it's got enough grunt to be able to decode it in software. Using the same tape format as the BBC Micro seemed like it would be a good choice so I read this article on the Acorn cassette format on BeebWiki. The highest frequency audio signal is a sine wave at 2400Hz. As we'll need to handle both halves of each cycle we'll need to sample the signal at least 4800 times per second. As the Master System has a CPU clock of around 3.58MHz, that gives us 3.58×106/4800≈756 CPU cycles for each half of the cycle which should be more than sufficient.

Breadboard with the prototype tape interface on it

We still need to get the audio signal into the Master System somehow! I did a bit of digging online and couldn't find too much information about suitable electrical interfaces. My best lead was the circuit diagram for the BBC Micro's tape interface, however that requires several op-amps and I only have a single LM741 in my parts bin, which isn't going to do a very good job on a single 5V supply. In the short term I've used the circuit above which is a slightly modified version of this common circuit used to convert S/PDIF signals to TTL levels using an unbuffered hex inverter as an amplifier. It's not the most stable circuit and has a habit of oscillating when not fed an input signal but the +5V pullup on the output from the console inhibits this (putting a pull-up or pull-down on any of the inputs changes the mark:space ratio of the pulses). That a load on the output affects what's happening on the input is definitely not a good indication of a properly-functioning circuit, but it'll have to do for now.

Logic analyser trace of the audio signal converted to square pulses

I don't know how well it'll respond to a real tape cassette player, but the signal from my PC's sound card or my mobile phone produces very clean square pulses so it should be enough to start experimenting with until I can come up with more stable design.

Decoding the tape signal in software

To decode the signal we'll need to know whether the tape is outputting silence, a 1200Hz tone or a 2400Hz tone. I'm handling this with a simple routine that samples the current level of the input, then loops around waiting for the input level to change state, counting the number of times it takes to go around the loop before the transition. If the counter overflows before the state changes then it's assumed that the tape is outputting silence, but if the state changes before that we can check the value the counter reached to see if it got to a small value (a high frequency signal from the tape) or a large value (a low frequency signal from the tape).

To determine a suitable value for the thresholds I wrote a test program that simply called this routine in a loop 256 times, storing its results in memory, before dumping the results to the screen once all 256 values had been found. By playing a section of the tape in the middle of a data block (so there would be a good mixture of 1200Hz and 2400Hz tones) I got the rather pleasing result that 1200Hz tones appeared to make the counter reach $20 (32) and 1200Hz tones made the counter reach $10 (16), so I put the threshold value between them at 24 as an initial test.

Once we know the frequency of the tape signal at a particular point we can use this to determine the current bit – a "0" bit is a complete 1200Hz cycle (so we'd see two instances of the counter reaching >16) and a "1" bit is two compete 2400Hz cycles (so we'd see four instances of the counter <16). Knowing the bit on its own is not very useful, though, so I added byte decoding which waits for a start bit (0), receives and stores 8 data bits, then checks for a stop bit (1). I could then dump this data to the screen:

A data dump from the tape to the screen

At this point the data isn't completely correctly decoded, but it always produces the same results on every run-through which to my mind was a good sign. The text displayed at the bottom of the screen shows some recognisable fragments of the original BASIC program (as the program is tokenised the keywords are missing). The main problem I was having was down to synchronisation; as the article on BeebWiki states "an odd number of 2400Hz cycles can and does occur" and these occasional cycles were throwing me out of sync. The fix was to add an extra check when handling 2400Hz; even though four 2400Hz half-waves are expected, if a 1200Hz half-wave is detected instead it's assumed that we've gone out of sync and the bit should be decoded as a complete 1200Hz pulse instead. After making this change more recognisable text started coming through, so I was able to move on to decoding the data blocks.

Decoding data blocks on tape

Files on tape are broken up into multiple data blocks, each with a header describing the block (e.g. file name, block number, data length) followed by a variable amount of data (up to 256 bytes) containing the data itself. To load a file from tape each incoming block has to be handled. If the block number is 0 then that's the start of the file, so the filename is checked. If this matches the supplied filename (or the supplied filename was the empty string "") then the loader switches from "Searching" to "Loading" mode and the data we just received is stored in memory. From that point on each incoming block is appended, as long as the block name and number match what we expect (i.e. the same filename as before but the block number should be the previous block number + 1). This continues until the "end of file" block flag is set. A CRC-16 of the received data is also checked after each block to ensure that the data has been received successfully. If there are any errors (the block name/number doesn't match what was expected, or the CRC-16 check fails) then an error message is printed and you can rewind the tape a bit to try receiving the block again.

This doesn't sound too complicated, but I've not had too much luck implementing this successfully. My code is currently a bit of a mess and doesn't properly maintain the "Loading" or "Searching" states which means that if an error occurs at any point then it can get a bit confused trying to resolve it (to the point where it's easier to just abort the load and start again from the beginning). However, the tape block issues are not the only problem...

Conflicting BASIC program formats

The programs I'm trying to load are designed for the BBC Micro, where programs are stored in the Acorn or "Wilson" (after Sophie Wilson) format. However, BBC BASIC (Z80) was written by Richard Russell and therefore uses the "Russell" format. Both have a lot in common, fortunately but the way each line is stored in memory differs (this is covered in more detail in the "Program format" article on BeebWiki). Fortunately the formats are similar enough that it's possible to convert them reasonably trivially in-memory.

My initial plan was to leave the loaded program as it is, and to provide a star command (e.g. *FCONVERT, named after the FCONVERT.BBC program) to convert from the "Wilson" format to the "Russell" format. Unfortunately, after loading the file BBC BASIC (Z80) checks the program and applies its own fixes to it, notably writing the appropriate terminator on the end of the program. As far as I can tell it does this by following the program along, line by line, until it finds what it believes is the end. In the "Russell" format the first byte of the line is its length, and in the "Wilson" format it's a carriage return (value 13). What I think this means is that BBC BASIC (Z80) thinks that the second line of an Acorn program should occur 13 bytes in, and as this isn't likely to be the case it ends up writing its terminator 13 bytes in to end the first line. This prevents me from fixing loaded programs after the fact, as they've already had part of their contents overwritten.

Instead, I've added a check to the program after I've loaded it but before I've returned control to BBC BASIC. If I am able to read the program from start to end in "Wilson" format then I apply the conversion to "Russell" format automatically. In practice this means that both file formats load correctly, but I don't particularly like the way that if you LOAD a "Wilson"-format file then SAVE it back it'll be saved as a "Russell"-format file.

There are still some differences in the formats and the way they're tokenised but usually that can be fixed by loading the problematic lines into the line editor (*EDIT) and pressing Return without making any changes. This forces them to be retokenised and has fixed some of the issues I've run into.

Another issue is that "Wilson" format BASIC files support a line number 0, but "Russell" format ones do not; the file will be loaded and can be run, but if you try to LIST it then it'll stop when it reaches line 0 (which is usually the first one!) so programs appear to be empty. This is not something I think can (or indeed should) be fixed automatically but if necessary you can change the line number for the first line from 0 to 1 with ?(PAGE+1)=1 so there is at least one solution.

The video above shows a couple of programs being loaded from the BBC Micro Bot website onto the Master System. Even with the terrible audio to digital interface it gets the job done! The 7905=7905 messages at the end of each block during loading show the CRC-16 calculated for the received data followed by the expected CRC-16, as they match it indicates the data is getting through successfully.

Emulating the tape interface

As alluded to above my current loading system works well enough when everything is OK but it gets a bit confused when there's a fault. I never really planned out how to handle this and the code's a bit of a spaghetti mess in places so I drew out a flowchart of a more robust loader, which itself is also a bit of a mess but hopefully one that will make loading tapes more user-friendly and robust.

Unfortunately, doing any work on this interface has been very tedious as every time I make a change I need to try it on real hardware – I don't know of any Master System emulators that include tape support! I've been using my own Cogwheel emulator throughout which is itself not a very good emulator (there are no proper debugging tools, for example) but I can at least bolt on weird features I need like PS/2 keyboard emulation, so I thought it would be a good use of my time to add tape interface emulation. I started by writing a UEF (Unified Emulator Format) parser that could turn a UEF image file into a 4800 baud bitstream that could be fed into the Master System's controller port. For some reason my files end up being slightly different lengths to the sound files generated by PlayUEF – nothing major, but a handful of seconds over the length of a roughly 7 minute file. I tried loading the generated bitstream as a 4800Hz sample rate file into Audacity and it didn't quite line up with the audio file generated by PlayUEF either, but after poring over the specifications for the UEF format I was unable to figure out where the discrepancy lay.

Undeterred I pressed ahead and added this simple cassette player to the emulator, which happily seems to work well in spite of the reported timing differences:

Cassette recorder interface in the Cogwheel emulator

One thing that's notable there is that the program CIRCLE2 appears in the catalogue on-screen. When I first added the cassette interface emulation it was missing, but it would not be detected on real hardware either, which made me quite happy! Being able to more rapidly make and test changes to the tape loader code I was able to experiment and found that the tape loader seemed to miss the synchronisation byte ($2A) sent at the start of block headers on the CIRCLE2 program. This seemed to be due to going out of sync with the bit stream in the period between detecting the carrier tone and the first start bit, so I changed the code to demand exactly two 1200Hz half-waves as the start bit rather than using my more lax code that decodes the contents of the main bit stream that allows for extra 2400Hz pulses. This ensures that the start bit is always properly synchronised, and after making this change CIRCLE2 appeared in the catalogue as it should. I then reflashed the ROM chip on the cartridge and tried on real hardware, and that now picks up CIRCLE2. Being able to fix a problem that appears on hardware by replicating and resolving the same issue in software emulation justifies the few hours I spent adding the cassette tape emulation!

Unfortunately, I'm down to under a hundred bytes of free program space on the 32KB ROM I've been developing with, and I still have a lot of features I'd like to add (including the more resilient tape loader). I've been trying to put this moment off as long as possible as I'm going to have to use bank switching to map in additional memory banks, and that means a lot of difficult decisions about what code lives on which ROM pages to minimise the amount of switching that's required. Conventionally slot 2 is used as the slot to map different banks into but at the moment that's where I've mapped in the cartridge RAM to extend BASIC's memory. On the plus side very little of my code needs direct access to BASIC's memory (I mostly use my own data storage or the stack which lives at the top end of memory, far above slot 2) so this would seem like a good fit but I currently allocate storage space in BASIC's free memory for some operations (e.g. as temporary space when copying data around the screen during scrolling operations) so I'll need to change that code to use memory allocated elsewhere. It's not the most glamourous part of the project, but it'll need to be done if I am to add all the features I'd like.

Better drawing, editing, memory and emulation for BBC BASIC on the Sega Master System

Tuesday, 3rd August 2021

I have continued to work on the Sega Master System version of BBC BASIC, and it's feeling much more like a practical version of BASIC than something that was just holding together to run one specific program!

One of the key improvements is to standardise the handling of the different graphics modes. Previously I was using a coordinate system where (0,0) is in the top left corner and all drawing operations were carried out using physical device coordinates (so the screen was treated as being 256 pixels wide and 192 pixels tall). I have now moved the origin (0,0) into the bottom left corner with the Y axis pointing up and scale down all coordinates by 5, effectively making the screen 1280 logical units wide and 960 units tall. This isn't 100% compliant, as the BBC Micro and other versions of BASIC treat the screen as being 1024 units tall, but dividing by 5⅓ is considerably trickier and it would result in the graphics being squished further vertically, so I think using a logical resolution of 1280×960 is an acceptable compromise. I've added some VDU statements to allow you to move the graphics origin as well as define custom graphics and text viewports, so you can control where on the screen graphics and text appear and how they are clipped/scroll.

Screenshot of the Mandelbaum program output
The output of the "Mandelbaum" program following these changes

I have also changed the default palette to more closely match the one used by the BBC Micro and other versions of BBC BASIC. This isn't too difficult when using the Master System's "Mode 4" as that has a user-definable palette, but the legacy TMS9918A modes have a fixed palette so I've tried to match the default palette as sensibly as I can to the TMS9918A palette. It's possible to change the logical to physical palette mappings under Master System "Mode 4" via a VDU command which writes directly to the CRAM (you can either remap one of the 16 logical entries to one of the stock 16 colours, or supply an RGB value directy to select additional colours) which allows for neat tricks like palette cycling, but the TMS9918A modes currently only let you change the current text/drawing colour, not amend the palette as that's fixed in hardware.

I've also added filled/outlined circle and axis-aligned ellipse PLOT commands using some code written by Darren "qarnos" Cubitt which I originally used in the TI-83 Plus version of BBC BASIC. This code is very capable and fully accepted 16-bit coordinates for its inputs, however it was also originally designed to output to a 96×64 pixel screen so the final plotting was done with 8-bit coordinates ranging from -128..127. Fortunately the Master System's screen also fits in 8-bit coordinates at 256 pixels wide but that's not quite enough information as you also need to be able to tell if a particular point is off-screen (less than zero or greater than 255); simply clipping it against those boundaries will result in a vertical smear on the left or right edge of the screen when drawing outlines. Fortunately I was able to figure out how to modify his code to add some extra clipping status flags to ensure that ellipses were clipped and displayed correctly on any part of the screen.

Screenshot showing filled circles     Screenshot showing filled cones
Filled circles and filled/outlined ellipses make these drawings possible

The only graphics operation exposed by the mode-specific drivers before was a simple "set pixel" routine. This is fine for lines but quite slow for filling shapes so graphics mode drivers can now supply a "fill horizontal span" routine for faster shape-filling. If the routine is left unimplemented a stub routine is provided that fills the span using the "set pixel" routine.

I also added a rectangle-filling PLOT command, which is perhaps not the most exciting-sounding graphics operation but it is used to clear the screen so it is at least useful. More interesting is a triangle-filling routine, something I've never enjoyed writing!

Usually I get very bogged down in the idea that the pixels affected when you fill a triangle should exactly fit within the pixels that are outlined by drawing lines between each of the triangle's points, no more and no less. This can be a bit difficult when the triangle is filled by tracing its edges from top to bottom and drawing horizontal spans between them. If the edge is "steep" (its overall height is greater than or equal to its width) then this isn't too bad, as there's only one X coordinate for each Y coordinate where a pixel would have been plotted. However, when the edge is "shallow" (its overall width is greater than its width) there are going to be certain Y coordinates where the line drawn would have had multiple pixels plotted. In that case, where is the boundary of the horizontal span?

The cop-out answer I've used in the past has been to set up three buffers the total height of the screen and to "draw" the three lines first using the same line-drawing algorithm as the line PLOTting command, keeping track of the minimum and maximum X coordinate for each Y coordinate. When it's time to fill the triangle the minimum and maximum X coordinate for each edge can be determined based on the current Y coordinate and a span drawn between them for perfect triangles. On the TI-83 Plus this takes up four bytes per line (minimum and maximum 16-bit values) for a 64 pixel tall screen, with three buffers for the three lines that comes to 4×64×3=768 bytes, pretty bad. On the Sega Master System that would be 4×192×3=2304 bytes, totally unacceptable on a machine with only 8KB total work RAM!

Screenshot of a 3D sphere rendered by BBC BASIC
Each face of this 3D sphere is filled by two triangles.

I've instead simply done my best to interpolate from one X coordinate to the other when working my way down the triangle and filling scanlines, doing a bit of extra pre-incrementing and fudging of initial error values depending on whether it's the top half or bottom half of the triangle. My test was to draw the outline of a triangle in bright white and then to fill a dark blue triangle over the top, if any bright white pixels were visible around the outside this indicated a gap. I mostly got it filled, but I then tried my test program on a BBC Micro emulator and found the BBC Micro exhibited similar gaps so I don't think I'm doing too badly! The above screenshot of the 3D sphere was rendered using this triangle-filling code.

Screenshot showing buggy BASIC code with a PRONT instead of a PRINT statement

I've also been working on improving the line editor. This is called by BASIC when it's asking for a line of text input from you. Previously I'd only implemented adding characters to the end of the line and pressing backspace to remove characters from the end of the line; if you'd typed in a long piece of code and made a mistake at the start you'd need to backspace all the way to the mistake, correct it, then re-type the rest of the line. Now you can use the cursor keys (or home/end) to move around within the line, insert or overwrite new characters (toggled by pressing the insert key on the keyboard) at the cursor position or backspace/delete characters mid-line as required. It sounds like a small thing but it was quite a lot of code to get right and makes a big difference for usability!

Another feature I've added to aid modifying lines of code is the *EDIT command. This takes a line number as an argument and then brings up the program line you've requested in the line editor, ready for editing. The way this works is a bit sneaky, as it's not natively implemented by BBC BASIC! The trick that makes it work is the ability to override two routines, OSLINE (which BASIC calls when it wants to display the line editor) and OSWRCH (which BASIC calls when it wants to output a character).

When a valid *EDIT <line> command is entered, OSLINE is overridden with the first custom routine. This routine doesn't ask for a line of input, but instead copies L.<line> to the input buffer, overrides OSWRCH with a routine that captures data to RAM, overrides OSLINE with a second custom routine, then returns. BASIC therefore thinks you've typed in a LIST statement so it dutifully starts outputting the line via OSWRCH, but this has been overridden to write the characters to RAM rather than the screen. When it's done this BASIC then calls OSLINE again for the next line of input, which brings up the second custom OSLINE handler. This pulls the data from RAM previously captured by the OSWRCH handler, copies it to the OSLINE buffer, and dispays it on-screen as if you'd just typed it in. It then restores the original OSLINE and OSWRCH handlers before jumping back into the stock OSLINE routine, so you can continue editing the line that you'd requested via *EDIT.

A modified Monopoly cartridge, used to run BBC BASIC with an 8KB RAM expansion.

All of this hopefully makes entering programs via the keyboard less cumbersome. Of course, not having to type in programs in full every time you wanted to run them would be even better, and an attempt to give BASIC more memory provides another way to load programs in a somewhat roundabout manner.

The photograph above shows the modified Monopoly cartridge that I'm now using to test BBC BASIC on real hardware instead of the modified After Burner cartridge I was using before. The advantage of Monopoly is that it has an additional 8KB RAM on board, which is used to save games in progress. Sega's cartridge mapper allows for on-cartridge RAM to be mapped into the address range $8000..$BFFF, immediately below the main work RAM which is at $C000..$DFFF. If present, then, BASIC's memory range (which runs from PAGE to HIMEM) can be extended by enabling the save RAM and moving PAGE down.

$8000..$BFFF is a 16KB range, though, and I mentioned that Monopoly has an 8KB RAM. This is true, and what it means is that the 8KB cartridge RAM is accessible from $8000..$9FFF but is then repeated ("mirrored") from $A000..$BFFF. The cartridge RAM detection therefore has to check two things: firstly that there is RAM in the first place (which can be verified by modifying memory at $8000 and seeing if those values stick) and how big it is (which can be checked by writing to $8000 and seeing if that has also modified the data at $8000+<RAM size>). Once presence of any RAM has been determined during startup, RAM mirroring is checked in 1KB, 2KB, 4KB and 8KB offsets. If any RAM mirroring is detected, then it's assumed the RAM is the size that was being checked at the time, however if not it's assumed that the RAM is the full 16KB. At this point, PAGE (which is $C000 in a stock machine with no cartridge RAM) is moved backwards by the size of the detected cartridge RAM, e.g. to $A000 for an 8KB RAM and $8000 for a 16KB RAM. This results in 16KB or 24KB total available to BBC BASIC, a considerable upgrade from the plain 8KB work RAM!

An added bonus of this is that when you type in programs they grow upwards in memory from PAGE. As PAGE now starts within your cartridge memory, and that cartridge memory retains its contents courtesy of a backup battery, it means that if your entered program is smaller than the size of your cartridge memory you can restore it when you switch the console back on by typing OLD.

That's very well for life on real hardware, but I continue to do most of my development testing in an emulator. I'm having to use my own emulator as there aren't any other Master System emulators that also include a PS/2 keyboard emulator, but I did end up running into a very weird bug. Certain trigonometric functions in BBC BASIC were producing very wrong values, resulting in very odd-looking output.

Screenshot showing wonkily-rendered cubeScreenshot showing wonkily-rendered Mandelbaum setScreenshot showing wonkily-plotted sphere
These shapes are on the wonk

Even though the Z80 emulator at the heart of the program passed ZEXDOC (an instruction tester that checks documented functionality) I remembered that it had failed some aspect of ZEXALL which checks the undocumented flags too. I re-ran ZEXALL and found the problem was with my implementation of the bit instruction, so I worked on fixing that including emulation of the Z80's internal temporary memptr/WZ register to ensure that bit n,(hl) set bits 3 and 5 of the flag register appropriately. ZEXALL now passes, but unsurprisingly it didn't fix my problem (as I didn't really think that BBC BASIC would rely on undocumented functionality!)

I ended up isolating certain exact values that when plugged into SIN() or COS() would produce incorrect results. I then dug out my old CP/M emulator and tried plugging those values into the generic CP/M version of BBC BASIC (which has none of my code in it!) and that produced the same, incorrect, results confirming that the issue was definitely in my Z80 emulation and not in some flaw of the BBC BASIC port.

After tracing through the code and dumping out debug values at certain points and seeing where it differed to a known good reference emulator I found that the fault occurred in FMUL, and from there I found that at some point the carry flag (which is either rotated or added to registers with RR or ADC instructions) was being lost. RR looked fine but digging into my implementation for the 16-bit ADC I found the culprit: the code is the same for ADD and ADC, but in ADC if the carry flag is set then the second operand is incremented by one first. This produces the correct numeric answer, but if the second operand was $FFFF on entry to the function then adding a carry of 1 to it would cause it to overflow back to 0. As this happens before any of the rest of the calculations are made it means that the final value of the carry flag was calculated based on op1+0 instead of op1+$10000, hence the loss of the carry flag.

Fortunately, fixing this fixed the wonky output demonstrated above and I feel slightly more confident in my emulation! Now I can continue working on BBC BASIC, I've had an idea for a screen mode that has a reduced number of colours but will hopefully let you draw anywhere on the screen without running out of tiles causing odd corruption (as happens in the usual 16-colour mode 4) and without attribute clash (as happens in the TMS9918A Graphics II mode)...

Cogwheel 1.0.3.0 beta 3

Monday, 24th August 2009

I managed to break save states in the last build of Cogwheel (attempting to load a save state would fail, not being able to set a property). I've marked the offending read-only property with [StateNotSaved] and made the loader slightly more robust in Cogwheel 1.0.3.0 beta 3. It's beta 3, not 2, because I uploaded 2 and then noticed another issue - you couldn't change the controller mappings! This is something that must have been broken for ages, but either nobody noticed or they just didn't care to report it. Oh well, that's been fixed now. For some reason Google don't let you re-upload files, so beta 3 it has to be.

Phantasy Star save screen

Another addition is this build is preliminary support for persistent cartridge RAM. Some games, such as Phantasy Star (pictured above) let you save your progress in the game onto battery-backed RAM built into the cartridge. If you come back to the game later you should now be able to continue your progress without needing to manually save the entire emulator state.

I've had reports of rather bizarre crashes bringing one poor user's machine to its knees. I'm at a loss to establish why; I've tried the emulator on four machines (two Vista, two XP) and although one of the machines displays a white screen instead of the emulator output (no pixel shader 2.0 support on its Radeon 9000) the software trundles along just fine otherwise (I can at least hear the game music!) The one notable difference between my machines and his machine is that he's using a 64-bit version of Windows, and all of the ones I have access to run 32-bit Windows. To see if this is the issue, I've changed the configuration to x86 (I've encountered strange bugs with .NET code using unmanaged 32-bit code on 64-bit Windows) to see if this will remedy issues, but if anyone has any bright ideas I'd be interested to hear them.

Cogwheel 1.0.3.0 beta 1

Thursday, 20th August 2009

I've released a beta version of Cogwheel 1.0.3.0 in the hope of getting some feedback. My main concern is with the new 3D glasses code, so I'd be very grateful if you could install the emulator and run this ROM in it. The ROM simply alternates between showing a red screen for the left eye and a cyan one for the right eye (the emulator defaults to only showing the left eye, so you'll just see red for the moment). If you select a different 3D glasses mode (Options, 3D glasses, Row interleaved) you should end up with something like this:

3D glasses test - row interleaved

If you drag the around the desktop the lines should appear fixed on the spot (as long as you drag it slowly enough to allow it to repaint), and if you resize it the entire form should always be covered in lines one pixel apart. The same should apply to the column interleaved and chequerboard interleaved modes.

I've also added VGM recording and VGM playback. VGM playback is handled by bundling Maxim's SMS VGM player.

VGM player

The console's region (Japanese or Export) and video standard (NTSC or PAL) are now user-configurable via the Emulation menu. The YM2413 (FM sound) emulation has been converted to straight C# (it used to be a P/Invoked native DLL). Drag-and-drop support has been added to aid in loading ROMs, save-states and VGMs.

There have been a number of internal optimisations, fixes and tweaks (such as per-scanline backdrop colours), but nothing too major (compatibility is roughly the same as it was). If you do find any bugs, please report them!

New 3D renderer in Cogwheel

Monday, 10th August 2009

I have written a new 3D-compatible renderer for Cogwheel. It holds two textures, one for each eye, and uses one of a number of different effect file techniques to mix the two views.

Row-interleaved 3D

Based on the interlacing work from the previous entry, the first technique is one that uses interleaved rows. I'm not really sure if there's a good way to convert texture coordinates into device coordinates, so am passing in the viewport height as a parameter and hoping that floating point errors don't trip me up (they haven't, yet).

float4 RowInterleavedPixelShader(VertexPositionTexture input) : COLOR0 {
	float row = input.Texture.y * ViewportHeight * 0.5f;
	if (abs(round(row) - row) < 0.1f) {
		return tex2D(LeftEyeSampler, input.Texture);
	} else {
		return tex2D(RightEyeSampler, input.Texture);
	}
}

Alternate pixel centres may also pose a problem in the future. If anyone had any recommendations, suggestions or warnings on the way I'm detecting the evenness or oddness of a particular "scanline" then I'd appreciate hearing them!

Colunn-interleaved 3D Chequerboard-interleaved 3D

I have also added two other interleaving modes; one in columns and another in a chequerboard pattern. I included these two as I've seen that some 3D LCD panels use a column interleaving pattern (I suppose that with a lenticular lens in front of such a panel you may not even need 3D glasses) and apparently Sharp have displays that use the chequerboard pattern.

I have also taken advantage of pixel shaders to create colour and monochrome anaglyphs (previously calculated in software), though neither look as good as the above full-colour modes for shutter glasses or similar hardware.

There are a few issues I need to sort out first before I can release this; for example, there's no way to set whether the first row/column/pixel is for the left or right eye. More problematic is the removal of support for non power-of-two textures; the Master System's 256×192 display is fine, but the Game Gear's 160×144 display gets rounded up to 192 pixels wide (and yes, I know that's not a power of two) on my video card. I also mean to give Promit's SlimTune profiler a look to see if I can optimise some of the less efficient pieces of my code. The C# version of emu2413 is probably a good candidate, being a "dumb" translation from the original macro-heavy C.

C# emu2413

Thursday, 8th January 2009

This is fairly embarrassing; somebody sent me an email that was flagged as spam which I accidentally deleted. So if you sent me an email and I haven't replied, I'm not deliberately being rude; could you send it again? embarrass.gif



After encountering strange crashes (not .NET exceptions, full out crashes) with emu2413 I decided to port it to straight C# instead from its existing C incarnation (emu2413.h.cs and emu2413.c.cs). Even though the original was macro-heavy it was relatively simple to port, and so there's no dependency on an unmanaged DLL to generate FM sound any more. However, the C# version is significantly slower (Cogwheel now takes about 50% extra CPU time when FM sound is enabled), possibly due to many extraneous method calls that were macros in the original.

However, the emulator still crashes when FM sound is enabled. And I have no idea why, as it only happens in Release mode and outside the IDE. The Debug build works fine inside and outside the IDE, and Release mode works fine when run within the IDE. sad.gif

Controller input updates to Cogwheel

Monday, 5th January 2009

I hope you all had a good Christmas and New Year period!

I received an Xbox 360 controller for Christmas, so have done a bit of work on Cogwheel to add support for it. (You can download a copy of the latest version 1.0.2.0 with SlimDX here).

The first issue to deal with was the D-pad on the Xbox 360 controller. When treated as a conventional joystick or DirectInput device the D-pad state is returned via the point-of-view (POV) hat. The joystick input source class couldn't raise events generated by the POV hat so support for that had to be added. This now allows other controllers that used the POV hat for slightly bizarre reasons (eg the faceplate buttons on the PlayStation controller when using PPJoy) to work too.

The second issue was the slightly odd way that the Xbox 360's DirectInput driver returns the state of the triggers - as a single axis, with one trigger moving the axis in one direction, the other trigger moving it in the other. You cannot differentiate between both triggers being held and both being released, as both states return 0. To get around this, I've added support for XInput devices, where all buttons and triggers operate independently.

The Xbox 360 controller now shows up twice in the UI - once as an XInput device and again as a conventional joystick. Fortunately, you can check if a device is an XInput device by the presence of IG_ in its device ID. Here's some C# code that can be used to check with a joystick is an XInput device or not.

using System.Globalization;
using System.Management;
using System.Text.RegularExpressions;

namespace CogwheelSlimDX.JoystickInput {
    
    /// <summary>
    /// Provides methods for retrieving the state from a joystick.
    /// </summary>
    public class Joystick {

        /* ... */

        /// <summary>
        /// Gets the vendor identifier of the <see cref="Joystick"/>.
        /// </summary>
        public ushort VendorId { get; private set; }

        /// <summary>
        /// Gets the product identifier of the <see cref="Joystick"/>.
        /// </summary>
        public ushort ProductId { get; private set; }

        /* ... */

        /// <summary>
        /// Determines whether the device is an XInput device or not. Returns true if it is, false if it isn't.
        /// </summary>
        public bool IsXInputDevice {
            get {
                var ParseIds = new Regex(@"([VP])ID_([\da-fA-F]{4})"); // Used to grab the VID/PID components from the device ID string.

                // Iterate over all PNP devices.
                using (var QueryPnp = new ManagementObjectSearcher(@"\\.\root\cimv2", string.Format("Select * FROM Win32_PNPEntity"), new EnumerationOptions() { BlockSize = 20 })) {
                    foreach (var PnpDevice in QueryPnp.Get()) {

                        // Check if the DeviceId contains the tell-tale "IG_".
                        var DeviceId = (string)PnpDevice.Properties["DeviceID"].Value;
                        if (DeviceId.Contains("IG_")) {

                            // Check the VID/PID components against the joystick's.
                            var Ids = ParseIds.Matches(DeviceId);
                            if (Ids.Count == 2) {
                                ushort? VId = null, PId = null;
                                foreach (Match M in Ids) {
                                    ushort Value = ushort.Parse(M.Groups[2].Value, NumberStyles.HexNumber);
                                    switch (M.Groups[1].Value) {
                                        case "V": VId = Value; break;
                                        case "P": PId = Value; break;
                                    }
                                }
                                if (VId.HasValue && this.VendorId == VId && PId.HasValue && this.ProductId == PId) return true;
                            }
                        }
                    }
                }
                return false;
            }
        }

        /* ... */
    }
}

When the joysticks are enumerated they are only added to the input manager if they are not XInput devices.



To round up the entry, here's a screenshot of a minesweeper clone I've been working on in BBC BASIC.

2009.01.03.01.gif

You can view/download the code here and it will run in the shareware version of BBC BASIC for Windows. The code has been deliberately uglified (cramming multiple statements onto a single line, few comments, trimmed whitespace) to try and keep it within the shareware version's 8KB limit as this is a good limit to keep in mind for the TI-83+ version too.

Sega Master System emulation in Silverlight

Monday, 15th December 2008

I've had to quickly learn Silverlight for work recently, which has been an interesting experience. I've had to write new code, which is fine but doesn't really excite me as far as Silverlight is concerned - it doesn't really matter which language new code is developed in, as long as it gets the job done.

What does interest me more is that Silverlight is ".NET in your browser", and I'm a big fan of .NET technology with a handful of .NET-based projects under my belt. Silverlight therefore gives me the opportunity to run these projects within the browser, which is a fun idea. smile.gif

To this end, I've turned Cogwheel, a Sega 8-bit system emulator, into a Silverlight application. It took about an hour and a half, which was not as bad as I'd expected! (Skip to the bottom for instructions for the demo).

Raster graphics

Silverlight's raster graphics support is somewhat lacking. You can display raster graphics in Image elements, but - as far as I can see - that's about it. If you wish to generate and display images dynamically via primitive pixel-pushing, you're out of luck as far as Silverlight's class library is concerned.

Thankfully, Ian Griffiths has developed a class named PngGenerator that can speedily encode a PNG from an array of Colors that can then be displayed in an Image. Cogwheel's rasteriser returns pixel data as an array of integers so there's a small amount of overhead to convert these but other than that it's easy to push pixels, albeit in a fairly roundabout manner.

Render loop

The render loop is based around an empty Storyboard that invokes an Action every time it completes then restarts itself.

using System;
using System.Windows;
using System.Windows.Media.Animation;

namespace Cogwheel.Silverlight {

    public static class RenderLoop {

        public static void AttachRenderLoop(this FrameworkElement c, Action update) {
            var Board = new Storyboard();
            c.Resources.Add("RenderLoop", Board);
            Board.Completed += (sender, e) => {
                if (update != null) update();
                Board.Begin();
            };
            Board.Begin();
        }

        public static void DetachRenderLoop(this FrameworkElement c) {
            var Board = (Storyboard)c.Resources["RenderLoop"];
            Board.Stop();
            c.Resources.Remove("RenderLoop");
        }
    
    }
}

I'm not sure if this is the best way to do it, but it works well enough and is easy to use - just grab any FrameworkElement (in my case the Page UserControl) and call AttachRenderLoop:

private void UserControl_Loaded(object sender, RoutedEventArgs e) {
    this.UserControlRoot.AttachRenderLoop(() => {
        /* Update/render loop in here. */
    });
}

Missing .NET framework class library features

This is the big one; Silverlight does not cover the entire .NET framework class library, and so bits of it are missing. Fortunately this can be resolved, the difficulty depending on how you want the functionality of the original app to be affected.

Missing types you're not interested in.

These are the easiest to deal with, and this includes attributes and interfaces that the existing code uses that you're not especially interested in. For example, Cogwheel uses some of .NET's serialisation features for save states - a feature I wasn't intending on implementing in the Silverlight version. The [Serializable] and [NonSerialized] attributes are not available in Silverlight, nor is the IDeserializationCallback interface. To get the project to compile some dummy types were created.

namespace System {

    class SerializableAttribute : Attribute { }
    class NonSerializedAttribute : Attribute { }
    
    interface IDeserializationCallback {
        void OnDeserialization(object sender);
    }

}

Missing types or methods that you don't mind partially losing.

Cogwheel features some zip file handling code that uses System.IO.Compression.DeflateStream, a class not available in Silverlight. Rather than remove the zip classes entirely (which would require modifications to other files that relied on them) it was easier to use conditional compilation to skip over the DeflateStream where required.

switch (this.Method) {
    case CompressionMethod.Store:
        CompressingStream = CompressedStream;
        break;
    
    #if !SILVERLIGHT
    case CompressionMethod.Deflate:
        CompressingStream = new DeflateStream(CompressedStream, CompressionMode.Compress, true);
        break;
    #endif
    
    default:
        throw new NotSupportedException();
}

Missing instance methods.

C# 3.0 adds support for extension methods - user-defined methods that can be used to extend the functionality of existing classes that you cannot modify directly. Silverlight is missing a number of instance methods on certain classes, such as string.ToLowerInvariant();. By using extension methods the missing methods can be restored.

namespace System {

    public static class Extensions {
        public static string ToLowerInvariant(this string s) { return s.ToLower(CultureInfo.InvariantCulture); }
        public static string ToUpperInvariant(this string s) { return s.ToUpper(CultureInfo.InvariantCulture); }
    }

}

Missing static methods.

These are the most work to fix as extension methods only work on instance methods, not static methods. This requires a change at the place the method is called as well as the code for the method itself.

I've got around this by creating new static classes with Ex appended to the name then using using to alias the types. For example, Silverlight lacks the Array.ConvertAll method.

namespace System {

    static class ArrayEx {
        public static TOut[] ConvertAll<TIn, TOut>(TIn[] input, Func<TIn, TOut> fn) {
            TOut[] result = new TOut[input.Length];
            for (int i = 0; i < input.Length; i++) {
                result[i] = fn(input[i]);
            }
            return result;
        }
    }

}

First, a replacement method is written with Ex appended to the class name. Secondly, any file that contains a reference to the method has this added to the top:

#if SILVERLIGHT
using ArrayEx = System.ArrayEx;
#else
using System.IO.Compression;
using ArrayEx = System.Array;
#endif

Finally, anywhere in the code that calls Array.ConvertAll is modified to call ArrayEx.ConvertAll instead. When compiling for Silverlight it calls the new routine, otherwise it calls the regular Array.Convert.

Demo

The links below launch the emulator with the selected ROM image.

To run your own ROM image, click on the folder image in the bottom-right corner of the browser window to bring up a standard open file dialog.

Zip files are not handled correctly, but if you type *.* into the filename box, right-click a zip file, pick Open, then select the ROM from inside that it should work (it does on Vista at any rate).

The cursor keys act as you'd expect; Ctrl or Z is button 1/Start; Alt, Shift or X is 2 (Alt brings up the menu in IE). Space is Pause if you use an SMS ROM and Start if you use a Game Gear ROM. Keys don't work at all in Opera for some reason, but they should work fine in IE 8 and Firefox 3. You may need to click on the application first!

Issues

There are a number of issues I have yet to address. Performance is an obvious one; it's a little choppy even with 100% usage of one the cores on a Core 2 Duo. Sound is missing, and I'm not sure what Opera's doing with keys. Other than that, I thought it was a fun experiment. smile.gif

Once I've tidied it up a bit I'll merge the source code with the existing source repository.

SC-3000 keyboard and a final release

Sunday, 20th April 2008

The latest addition to Cogwheel is SC-3000 keyboard emulation.

SC3000Keyboard.png

The SC-3000 was a home computer with similar hardware to the SG-1000 console, with the main addition of a keyboard. Software cartridges could add, for example, BASIC programming capabilities.

Due to lack of time and motivation, and the fact that the emulator is pretty much as good as I'm going to get it at this moment in time, I've removed the beta label and uploaded the latest version to its website.

3D glasses and CPU cycle counting

Monday, 7th April 2008

I reintroduced joystick support to the emulator front-end over the weekend, using a much more sensible input manager. The control editor form uses the same interface design for keyboard and joysticks - you click the button you wish to edit, it appears pressed, you press the key (or joystick button) you wish to bind to it and it pops back out again. It'll also check to see if you've moved the joystick in a particular direction on any of its reported axes.

Another feature I added was better support for games that used the 3D glasses, using a simple red-cyan anaglyph output blender.

MazeHunter3D.png    PoseidonWars3D.png

BladeEagle3D.png    SpaceHarrier3D.png

If the memory address range that the glasses respond to has been written to within the last three frames, it switches to the blending mode.

Irritatingly, with one glaring bug fix I managed to lower overall compatibility. Some instructions weren't being timed correctly (or at all) meaning that the emulator was executing too many instructions for the number of clock cycles it was asked to run. During the time each video scanline is run, 228 CPU cycles are executed. Upping this to 248 cycles (just for experimentation) fixes all known bugs - including the long-standing flickering pixels at the top of the screen in Cosmic Spacehead and display corruption in GP Rider. (However, digitised speech plays at a noticably higher pitch).

I'm not entirely sure why this is - it could be that some instructions are claiming to take too long, it could be yet another interrupt problem. To call this mildly frustrating is a bit of an understatement!

ColecoVision and TMS9918 Multicolor

Thursday, 3rd April 2008

One notable gap in my TMS9918 (video) emulation was its "Multicolor" mode. This mode broke the screen down into 4×4 pixel squares, resulting in a 64×48 grid. Each cell could be assigned a unique colour, giving you a crude bitmapped video mode.

No Master System or SG-1000 software used this mode to my knowledge, which reduced the likelihood of it being supported at all (if I could't test it, how could I emulate it?) I was tipped off that some ColecoVision software made use of it, so set about emulating the ColecoVision.

ColecoVision.png

The ColecoVision hardware is very similar to the SG-1000 in terms of what is inside the case - a Z80 CPU, TMS9918 video and SN76489 sound. The memory map (as in, which address ranges map to which memory devices) is different, as is the I/O map (as in, the I/O ports that the various hardware components are connected to). To handle this case, the emulator now has a Family field, which can be set to Sega or ColecoVision. This controls which of the different mappings it uses for memory and hardware I/O.

Another difference is the presence of a BIOS ROM. The Sega Master System and Sega Game Gear consoles had the option of BIOS ROMs, but all these did was very basic initialisation and header/checksum checking. The ColecoVision, however, has an 8KB BIOS ROM that offers a lot of functionality to the programmer, so this must be present to run most ColecoVision games.

ColecoVisionControllers.png

The controllers are also quite different. As well as the typical eight-direction joystick and two fire buttons, it added a 12-key keypad (0-9, * and #). This many keys is making my InputManager class look thoroughly idiotic, so that will certainly need a rewrite.

Apart from that, it's pretty simple. RAM is 1KB instead of 8KB; the sound generator uses the standard 15-bit wide shift register (instead of the 16-bit wide one used in the SMS); the video display processor interrupt output is connected to NMI rather than INT.

With those differences applied, it's easy to add the few lines of code to emulate the Multicolor video mode.

SmurfPaintNPlayWorkShop.png
Smurf Paint 'n' Play Workshop

ColecoVision emulation has not been tested at all thoroughly, so chances are it doesn't work very well; Multicolor emulation has been tested with precisely one game!

You can download the latest build of Cogwheel from its website, featuring the new ColecoVision emulation.

SaveStates and Key Names

Monday, 31st March 2008

It's a good feeling when issue 1 is finally marked as Fixed - in this case it was another interrupt-related bug. The IFF1 flag was being used to mask non-maskable interrupts; I don't think it should and it hasn't seem to have broken anything just yet by making non-maskable interrupts truly non-maskable.

I have also changed the savestate format to something that will be a little more backwards compatible rather than a plain BinaryFormatter dump of the entire emulator state. The data is now saved in what is basically an INI file with C#-style attributes and CSS-style url() syntax for binary data.

[Type(BeeDevelopment.Sega8Bit.Hardware.VideoDisplayProcessor)]
VideoRam=Dump(Video\VideoRam.bin)
Address=49165
WaitingForSecond=False
AccessMode=ColourRamWrite
ReadBuffer=32
Registers=Dump(Video\Registers.bin)
System=Ntsc
SupportsMode4=True
; ... snip ...

Other changes include a faked fullscreen mode (ie, a maximised borderless window) and an option to retain the aspect ratio of the game, so games no longer appear stretched on widescreen monitors. The sound emulation is a bit better, but still a little noisy in certain games with spurious beeps or buzzes.

One minor problem was on the key configuration control panel. Converting members of the Keys enumeration into displayable strings can be done via .ToString(), but this results in names that do not match the user's keyboard layout (such as OemQuestion for /, OemTilde for # and even Oemcomma for , - note the lowercase c, all on a UK keyboard).

With a prod in the right direction from jpetrie, here's a snippet that can be used to convert Keys into friendly key name strings:

#region Converting Keys into human-readable strings.

/// <summary>
/// Converts a <see cref="Keys"/> value into a human-readable string describing the key.
/// </summary>
/// <param name="key">The <see cref="Keys"/> to convert.</param>
/// <returns>A human-readable string describing the key.</returns>
public static string GetKeyName(Keys key) {

	// Convert the virtual key code into a scancode (as required by GetKeyNameText).
	int Scancode = MapVirtualKey((int)key, MapVirtualKeyMode.MAPVK_VK_TO_VSC);

	// If that returned 0 (failure) just use the value returned by Keys.ToString().
	if (Scancode == 0) return key.ToString();

	// Certain keys end up being mapped to the number pad by the above function,
	// as their virtual key can be generated by the number pad too.
	// If it's one of the known number-pad duplicates, set the extended bit:
	switch (key) {
		case Keys.Insert:
		case Keys.Delete:
		case Keys.Home:
		case Keys.End:
		case Keys.PageUp:
		case Keys.PageDown:
		case Keys.Left:
		case Keys.Right:
		case Keys.Up:
		case Keys.Down:
		case Keys.NumLock:
			Scancode |= 0x100;
			break;
	}

	// Perform the conversion:
	StringBuilder KeyName = new StringBuilder("".PadRight(32));
	if (GetKeyNameText((Scancode << 16), KeyName, KeyName.Length) != 0) {
		return KeyName.ToString();
	} else {
		return key.ToString();
	}
}

/// <summary>
/// Retrieves a string that represents the name of a key.
/// </summary>
/// <param name="lParam">Specifies the second parameter of the keyboard message (such as <c>WM_KEYDOWN</c>) to be processed.</param>
/// <param name="lpString">Pointer to a buffer that will receive the key name.</param>
/// <param name="size">Specifies the maximum length, in TCHAR, of the key name, including the terminating null character. (This parameter should be equal to the size of the buffer pointed to by the lpString parameter).</param>
/// <returns>The length of the returned string.</returns>
[DllImport("user32.dll")]
static extern int GetKeyNameText(int lParam, StringBuilder lpString, int size);

/// <summary>
/// Translates (maps) a virtual-key code into a scan code or character value, or translates a scan code into a virtual-key code.
/// </summary>
/// <param name="uCode">Specifies the virtual-key code or scan code for a key. How this value is interpreted depends on the value of the <paramref name="uMapType"/> parameter.</param>
/// <param name="uMapType">Specifies the translation to perform. The value of this parameter depends on the value of the <paramref name="uCode"/> parameter.</param>
/// <returns>Either a scan code, a virtual-key code, or a character value, depending on the value of <paramref="uCode"/> and <paramref="uMapType"/>. If there is no translation, the return value is zero.</returns>
[DllImport("user32.dll")]
static extern int MapVirtualKey(int uCode, MapVirtualKeyMode uMapType);

enum MapVirtualKeyMode {
	/// <summary>uCode is a virtual-key code and is translated into a scan code. If it is a virtual-key code that does not distinguish between left- and right-hand keys, the left-hand scan code is returned. If there is no translation, the function returns 0.</summary>
	MAPVK_VK_TO_VSC = 0,
	/// <summary>uCode is a scan code and is translated into a virtual-key code that does not distinguish between left- and right-hand keys. If there is no translation, the function returns 0.</summary>
	MAPVK_VSC_TO_VK = 1,
	/// <summary>uCode is a virtual-key code and is translated into an unshifted character value in the low-order word of the return value. Dead keys (diacritics) are indicated by setting the top bit of the return value. If there is no translation, the function returns 0.</summary>
	MAPVK_VK_TO_CHAR = 2,
	/// <summary>uCode is a scan code and is translated into a virtual-key code that distinguishes between left- and right-hand keys. If there is no translation, the function returns 0.</summary>
	MAPVK_VSC_TO_VK_EX = 3,
	MAPVK_VK_TO_VSC_EX = 4,
}

#endregion

It uses P/Invoke and the Win32 API so is only suitable for use on Windows.

Fun with IThumbnailProvider

Friday, 28th March 2008

Note: I have been informed that the code below no longer works in Windows 7 due to changes in the way IThumbnailProvider operates. It is recommended that you use unmanaged code instead of the managed solution presented below.



I have started releasing Cogwheel binaries on its project page, so if you'd like a look at the project but can't be bothered to check out and build the source yourself you can now give it a whirl.

One of the newer additions is a savestate mechanism; this is a very lazy bit of code on my behalf as all it does currently is serialise the entire emulator to a file using the BinaryFormatter. This resulted in savestates weighing in at about 6MB; by marking certain private fields (such as look-up tables in the Z80 emulator) as [NonSerialized] it was down to 2MB. To squash it down to the current ~250KB size the savestate is compressed using the zip file classes I've written to handle loading ROMs from zips.

Whilst this is going to change soon (I'm currently working this on an simple INI file serialiser, so the savestate files will be compatible with later releases of the software) I decided to experiment with the idea of dumping extra data into the savestate - namely, a screenshot.

SavedStateThumbnails.jpg

The screenshot is simply saved as Screenshot.png in the root of the savestate's zip archive. Creating a thumbnailer is extremely easy under Vista, and as the thumbnailer runs out-of-process you can use .NET code! Here's a quick and dirty run-down of how to make them if you decide to write one yourself.



Setting up the project

Create a new class library project in Visual Studio, then go switch to its project properties editor. On the Application tab, set Target Framework to something sensible (I currently try and keep everything at .NET 2.0 level), then click on the Assembly Information button and tick the Make assembly COM-Visible box.

Finally, move to the Signing tab, and tick the box marked Sign the assembly. From the drop-down box, pick New, which will create a new key file and add it to the project (this is required later for COM registration).

Add the COM interface wrappers

This is a simple copy and paste job! Just bung this in a source file somewhere:

using System;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;

namespace Thumbnailer {

	/// <summary>
	/// Defines the format of a bitmap returned by an <see cref="IThumbnailProvider"/>.
	/// </summary>
	public enum WTS_ALPHATYPE {
		/// <summary>
		/// The bitmap is an unknown format. The Shell tries nonetheless to detect whether the image has an alpha channel.
		/// </summary>
		WTSAT_UNKNOWN = 0,
		/// <summary>
		/// The bitmap is an RGB image without alpha. The alpha channel is invalid and the Shell ignores it.
		/// </summary>
		WTSAT_RGB = 1,
		/// <summary>
		/// The bitmap is an ARGB image with a valid alpha channel.
		/// </summary>
		WTSAT_ARGB = 2,
	}

	/// <summary>
	/// Exposes a method for getting a thumbnail image.
	/// </summary>
	[ComVisible(true), Guid("e357fccd-a995-4576-b01f-234630154e96"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
	public interface IThumbnailProvider {
		/// <summary>
		/// Retrieves a thumbnail image and alpha type. 
		/// </summary>
		/// <param name="cx">The maximum thumbnail size, in pixels. The Shell draws the returned bitmap at this size or smaller. The returned bitmap should fit into a square of width and height <paramref name="cx"/>, though it does not need to be a square image. The Shell scales the bitmap to render at lower sizes. For example, if the image has a 6:4 aspect ratio, then the returned bitmap should also have a 6:4 aspect ratio.</param>
		/// <param name="hBitmap">When this method returns, contains a pointer to the thumbnail image handle. The image must be a device-independent bitmap (DIB) section and 32 bits per pixel. The Shell scales down the bitmap if its width or height is larger than the size specified by cx. The Shell always respects the aspect ratio and never scales a bitmap larger than its original size.</param>
		/// <param name="bitmapType">Specifies the format of the output bitmap.</param>
		void GetThumbnail(int cx, out IntPtr hBitmap, out WTS_ALPHATYPE bitmapType);
	}

	/// <summary>
	/// Provides a method used to initialize a handler, such as a property handler, thumbnail provider, or preview handler, with a file stream.
	/// </summary>
	[ComVisible(true), Guid("b824b49d-22ac-4161-ac8a-9916e8fa3f7f"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
	public interface IInitializeWithStream {
		/// <summary>
		/// Initializes a handler with a file stream.
		/// </summary>
		/// <param name="stream">Pointer to an <see cref="IStream"/> interface that represents the file stream source.</param>
		/// <param name="grfMode">Indicates the access mode for <paramref name="stream"/>.</param>
		void Initialize(IStream stream, int grfMode);
	}

}

(You may wish to set the namespace to something more appropriate). As you can see, most of that source file is documentation.

Create your thumbnailer class

First thing you'll need to do here is to generate a GUID for your thumbnailer; this is so that when you register your thumbnailer Windows will know which COM object to create an instance of which it can then call to generate a thumbnail (the GUID of your thumbnailer is attached to the extension of the file via standard file associations - more on that later).

Your thumbnailer class should implement two interfaces; IThumbnailProvider (obviously!) and IInitializeWithStream. Here's a skeleton class for the thumbnailer:

using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using System.IO;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;

namespace Thumbnailer {
	
	[ComVisible(true), ClassInterface(ClassInterfaceType.None)]
	[ProgId("YourApp.ThumbnailProvider"), Guid("YOUR-GUID-IN-HERE")]
	public class ThumbnailProvider : IThumbnailProvider, IInitializeWithStream {

		#region IInitializeWithStream

		private IStream BaseStream { get; set; }

		public void Initialize(IStream stream, int grfMode) {
			this.BaseStream = stream;
		}

		#endregion
		
		#region IThumbnailProvider

		public void GetThumbnail(int cx, out IntPtr hBitmap, out WTS_ALPHATYPE bitmapType) {

			hBitmap = IntPtr.Zero;
			bitmapType = WTS_ALPHATYPE.WTSAT_UNKNOWN;
			
			try {
			
				// Thumbnailer code in here...
			
			} catch { } // A dirty cop-out.

		}
		
		#endregion
	}
}

You will probably want to set the ProgId to something meaningful, and make sure you set the GUID to the one you just generated.

What will happen is that Windows will first initialise your object by calling IInitializeWithStream.Initialize(), passing in an IStream. The above implementation stores the IStream in a member property for future reference.

Windows will then call IThumbnailProvider.GetThumbnail(). cx is the maximum size of the thumbnail (width and height) you should return; Windows will scale your thumbnail down if you return one that is too large. Do not scale your thumbnail up to match this value; it is perfectly valid to return one that is smaller than the requested value. Also; do not scale your thumbnail up to a square: you should return it at the same aspect ratio of the source image.

For the moment, and for the sake of testing, here's a snippet that will create a bright red thumbnail using GDI+:

using (var Thumbnail = new Bitmap(cx, cx)) {
	using (var G = Graphics.FromImage(Thumbnail)) {
		G.Clear(Color.Red);
	}
	hBitmap = Thumbnail.GetHbitmap();
}

Registration

If you compile your class library at this point you should end up with a single DLL. You need to register this DLL using the command-line tool RegAsm.exe that comes with the .NET framework.

Open an elevated command prompt (you need admin rights for this bit) and set the working directory to the output directory of your DLL. Now, invoke the following command:

%windir%\Microsoft.NET\Framework\v2.0.50727\RegAsm /codebase YourThumbnailer.dll
That's half of the battle; the last bit boils down to conventional file associations.

Run the registry editor, and open the HKEY_CLASSES_ROOT key. You will see a list of keys representing file extensions; find one (or create a new one) to match the extension that you wish to attach your thumbnailer to. Under that create a new key named shellex, and under that create another key named {e357fccd-a995-4576-b01f-234630154e96}. Set its (Default) value to {YOUR-GUID-IN-HERE} - yes, the GUID you created earlier. That should look something like this:

  • HKEY_CLASSES_ROOT
    • .yourextension
      • shellex
        • {e357fccd-a995-4576-b01f-234630154e96} = {YOUR-GUID-IN-HERE}

That's it! smile.gif You may need to log out then in again (and/or reboot and/or just kill all Explorer instances and restart them) for Explorer to catch on if nothing seems to be working.

A final note: IStream to Stream

The only final hitch is that IStream is not the same as our beloved .NET Stream. I use the following snippet to dump all of the contents of an IStream into an array of bytes (which can then be converted to a stream using new MemoryStream(byte[]) if need be).

private byte[] GetStreamContents() {

	if (this.BaseStream == null) return null;

	System.Runtime.InteropServices.ComTypes.STATSTG statData;
	this.BaseStream.Stat(out statData, 1);

	byte[] Result = new byte[statData.cbSize];

	IntPtr P = Marshal.AllocCoTaskMem(Marshal.SizeOf(typeof(UInt64)));
	try {
		this.BaseStream.Read(Result, Result.Length, P);
	} finally {
		Marshal.FreeCoTaskMem(P);
	}
	return Result;
}

This, naturally, is not a good idea if you're thumbnailing very large files, as it dumps the entire thing into memory!

For more information, take a look at MSDN: Thumbnail Providers, which includes useful information (including how to change the overlay icon in the bottom-right of your thumbnails or the adornments).

The COM wrappers and GetStreamContents() snippet are based on this uberdemo article.


PhantasyStarTranslation.png

Finally, another screenshot; you can now load IPS patch files directly into Cogwheel using the advanced ROM load dialog - which can be useful for translations.

Sound, at long last.

Friday, 7th March 2008

I have finally got around to adding sound to Cogwheel using Ianier Munoz's waveOut API wrapper.

RistarSoundTest.png

The technique used is fairly simple. I start with a sound buffer that is a multiple a number of video frames in length (1/60th of a second is one frame) - four seems a good number. This buffer needs to be periodically topped up with sound samples (every four frames in the above example).

I run the emulator for one frame, then generate a frame's worth of audio. I add these samples to a queue. The sound callback then periodically dequeues these samples and appends them to its buffer.

// This is called once every video frame.
// 735 samples at 44100Hz = 1/60th second.
// (Multiplied by two for stereo).
this.Emulator.RunFrame();
short[] Buffer = new short[735 * 2];
this.Emulator.Sound.CreateSamples(Buffer);
this.GeneratedSoundSamples.Enqueue(Buffer);

The important thing is that the sound is always generated after the video frame (and thus after any hardware writes). I log writes to the sound hardware over the period of a frame (along with the number of CPU cycles that have elapsed), then space them out when generating the sound samples so that they play in synch. My previous problems were caused by the sound emulation trying to "look ahead" past what had already been generated.

However, there is a potential problem with this - as the video and sound emulation are not locked in synch with eachother, there are two cases that could crop up:

  1. The emulator runs faster than 60Hz, generating too many sound samples.
  2. The emulator runs slower than 60Hz, not generating enough sound samples.

The first is the easiest to deal with. In most instances you'd want a couple of extra frames of sound data left in the queue after topping up the sound buffer, in case in the next period not enough are generated. However, if I notice that the queue is longer than entire sound buffer after topping it up, I clear it completely. This would make the sound a little choppy, but so far this hasn't happened in my tests.

The latter is a little more complex. If I just left it the sound buffer would have gaps in it, causing noticable pops (this I have noticed in some of the more processor-intensive games). To cover up the gaps, I generate enough extra frames of sound data to fill the gap. As no sound hardware writes are made, this has the effect of extending any tones that were currently playing, so the sound will play back slightly out of time. However, slightly out of time by a few 60ths of a second is a better solution than a pop.

// This is called when the sound buffer needs topping up.
// That's about once every four frames.
private void SoundBufferFiller(IntPtr data, int size) {

	// Temporary buffer to store the generated samples.
	short[] Generated = new short[size / 2];

	for (int i = 0; i < Generated.Length; i += 735 * 2) {
		if (this.GeneratedSoundSamples.Count > 0) {
			// We've already queued up some sound samples.
			Array.Copy(this.GeneratedSoundSamples.Dequeue(), 0, Generated, i, 735 * 2);
		} else {
			// Erk, we're out of samples... force generate some more and use those instead.
			// (This avoids popping).
			short[] Temp = new short[735 * 2];
			this.Emulator.Sound.CreateSamples(Temp);
			Array.Copy(Temp, 0, Generated, i, 735 * 2);
		}
	}

	// Copy to the sound buffer.
	Marshal.Copy(Generated, 0, data, size / 2);
	
	// If too many samples are being generated (FPS > 60Hz) then make sure it doesn't go out of control.
	while (this.GeneratedSoundSamples.Count > this.SoundBufferSizeInFrames) this.GeneratedSoundSamples.Dequeue();

}

Joysticks and Game Genie codes

Monday, 3rd March 2008

The biggest update to the emulator relates to input.

ConfigureControls.png

I've added a new control panel that lets you customise key bindings. For the keyboard, you simply click on the button you wish to customise, then press the key on your keyboard you wish to bind. As you can probably tell from the screenshot, I've also added joystick support (via the Multimedia Joystick Functions with a bit of P/Invoke for ease) which means that with a simple adapter and a free driver you can use original SMS pads with the emulator.

I haven't added support for the POV hat, which needs doing as the PPJoy PlayStation controller driver exposes the d-pad as a POV hat.

GameGenieEditor.png

I've also added an interface for adding Game Genie codes. The Game Genie was a cheating device that could be used to patch memory.

I have also started reintroducing (albeit in a slightly buggy fashion) multiple memory devices. Previously I've just been emulating a cartridge ROM and work RAM; the real consoles had multiple memory devices including a card slot, ROM BIOS and expansion slots. By emulating this system one can boot with the original BIOS, and (for example) not insert a cartridge to watch the BIOS animation or play its integrated game.

For ease, there's a quick-load ROM menu option that will load anything into the cartridge slot and disable all others for quick game playing. For those who wish to play with the hardware more, there's an "advanced load" dialog that will let you pick what goes in which slot, and also force various options (such as locale, hardware model, video system and so on) instead of the automatically guessed options. I'd also like it to be able to pick patch files (such as translations) so you don't need to use an external tool. Currently this dialog only supports cartridge ROM and BIOS ROM loading, though.

Cogwheel on Google Code

Sunday, 24th February 2008

On bakery2k1's suggestion, I took a look at my sprite collision flag code. It only checked for collisions between sprites appearing in the foreground - sprites that were hidden behind a background tile weren't checked, which caused the Fantastic Dizzy bug.

FantasticDizzyFixedSpriteCollision.png

I have decided to give Google's free project hosting a whirl. As usual, they can't get it working in Opera flaming.gif (no big surprises there) and it's horrifically slow, but I can't really complain given that it's free, and the issue tracking is pretty handy.

Gauntlet.png

Another game that has been fixed is the conversion of that arcade classic, Gauntlet. This was partially due to buggy controller port emulation, but a long-standing bug has been the lack of the y-scrolling inhibition flag. Games could set a flag on the VDP that would prevent the rightmost 64 pixels from scrolling vertically - this is useful to create a locked status bar on the right for a vertical scrolling game, for example.

SmsVdpTest1.png SmsVdpTest2.png

I'm slowly getting there with FluBBa's VDP tester...

/INTerminable Interrupt Problems

Friday, 22nd February 2008

WCLBGolf1.png WCLBGolf2.png
World Class Leaderboard Golf

Well, I've just about got it the way it was before with the new interrupt code; most games work again, but the few that have always given problems (such as Desert Speedtrap) still don't work properly. sad.gif I think this is the stage where I have to start trawling through disassemblies to try and work out why they're not working.

CosmicSpacehead.png
Cosmic Spacehead - one of the few games to use the 256×224 mode.

One problem I still haven't got to the bottom of is with the Dizzy games. They either reset when starting a new game or lock up. I'm hoping this is a problem with my implementation of the Codemasters mapper. I guess that The Excellent Dizzy Collection has a simple front-end game selection screen that switches to the requisite ROM page for the selected game, then jumps to the start of that page - in my case this jumps back to the initial Codemasters screen.

I'm not sure where Fantastic Dizzy's problems originate. It looks like an interrupt problem (maybe the palette should be switched when the VDP has finished with the status bar at the top?), but could also be related to the other Codemasters problems.

Interestingly, Micro Machines (1 and 2) and Cosmic Spacehead - both using the Codemasters ROM mapper - seem to work fine.

Interrupts: A Fresh Start

Thursday, 21st February 2008

I gave in and rewrote all of the Z80's interrupt emulation from scratch, finding some rather horrible bugs in the existing implementation as I went.

Some of the highlights included non-maskable interrupts ignoring the state of the IFF1 flag (this flag is automatically cleared when an interrupt is serviced, and is used to prevent the interrupt handler from being called again before it has finished) and the RETN instruction not copying the state of the IFF2 flag back to IFF1. When non-maskable interrupts are serviced, the state of IFF1 is copied to IFF2 before it gets cleared, the idea being that if you use RETN interrupts are automatically re-enabled on exit of the NMI ISR. (Contrast this with maskable interrupts, where both flags are cleared, and you need an explicit EI to re-enable them).

The HALT instruction (executes NOPs until an interrupt is requested or the CPU is reset) was also completely incorrectly (and bizarrely) implemented. The rewrite just sets a Halted property, which prevents the CPU from fetching or executing any instructions. The interrupt-triggering code simply resets this property.

This has fixed numerous bugs (I'm not sure when they were introduced, as it was all working a while back). It's gone from "not working at all" to "just about working", but some games or demos that rely on precise interrupt timing don't work properly.

HicolorDemoCorruption.png DesertSpeedtrapMisaligned.png
Game Gear Hicolor Demo and Desert Speedtrap

Both problems in the above screenshots relate to line-based interrupts from the VDP (Video Display Processor). Some other games simply hang at startup. sad.gif

Loading ROMs

Wednesday, 20th February 2008

OutRunEuropa1.png OutRunEuropa3.png
OutRun Europa

I've been adding a series of new features to Cogwheel. I wrote a basic zip file class a while back (only supporting store and deflate compression - the majority of zip files use deflate), and have added a Utility namespace with methods for help with loading ROMs.

For example, there is a method that is passed a filename by reference, and it'll return an array of bytes of the loaded file. If the passed filename was a zip file, by any chance, it'll search for ROMs inside the zip, and modify the filename (so it might end up as C:\Path\SomeFile.zip\Game.sms).

Once this is done, the CRC-32 checksum of the file is calculated then looked up against a database of known ROM dumps. I'm currently using the SMS Checker .romdata files as the database source. These contain information about known bad dumps, and can be used to strip headers and footers, patch bytes and remove redundant overdump data to end up with a valid ROM image.

Finally, the mapper is detected. Currently, I have three mappers - RAM (which is just 64KB of RAM), Standard (the standard Sega Master System ROM mapper) and Codemasters. I use the RAM mapper for all SG-1000 and SC-3000 games (I'm not sure what they're meant to use, here, but some games - such as The Castle - don't work with the standard SMS mapper).

The Codemasters mapper uses a different method to swap pages, which means that the conventional SMS BIOS checksumming code fails to swap in pages correctly and thus doesn't check the entire cartridge. The games therefore have their own checksumming routines and store their own checksum in a Codemasters-specific header; this makes detecting Codemasters games fairly easy, as all you need to do is check for the extra Codemasters checksum.

As the BIOS doesn't check the checksum, the cartridge provides code to do so itself. You can run it by holding 1 and 2 down as the game boots.

ExcellentDizzyCollectionChecksum.png
The Excellent Dizzy Collection
MicroMachinesChecksum.png
Micro Machines

I seem to have broken interrupts somewhere along the line; most noticable are non-maskable interrupts (as generated by the Pause button) which crash the emulator (it appears that once they fire, interrupts are never re-enabled). I'm not sure what's causing this, but hopefully it won't be too unpleasant to fix!

Game Gear LCD Scaling

Thursday, 14th February 2008

The Game Gear's hardware is very similar indeed to the Master System's - so similar that you can play Master System games on a Game Gear via a special adapter. Some Game Gear games were just the Master System ROMs in a Game Gear cartridge. smile.gif

That said, the Game Gear's LCD is only 160×144, and the Master System has a resolution of 256×192 (or 256×224 or 256×240, but those modes were very rarely used). In Game Gear mode, this resolution is simply cropped. In Master System mode something more interesting has to be done to scale this down to fit the LCD.

I won't bore you with the details here, but will refer you to a post here on the subject. Using the research, I added a mode to the VDP emulator that would mimic this scaling.

pop_comparison.jpg

not_only_words_comparison.jpg

This is old news, of course. I have nothing especially new to report, but I have pretty much entirely rewritten the emulator now (apart from the Z80 emulator, which was fairly well designed). The code was absolutely horrible, and so I've redesigned it to be a lot more flexible and intuitive to use as a library. I've rewritten the standard memory mapper, I/O mapping, and VDP (though I did copy the rasterisation and timing stuff from the old VDP code and cleaned it a little); there's a lot more to do (so far I haven't even reintroduced joypad input) but at least it'll be nicer for me to work with. smile.gif

In the meantime, here are some ugly screenshots of what Master System games look like on the Game Gear LCD.

RoadRashMasterGear.png

BlackBelt.png

PrinceOfPersia.png

RoboCop.png

Necromancy

Friday, 8th February 2008

After seeing Scet and Drilian's work on their respective emulator projects I decided I needed to do something with the stagnating Cogwheel source on my hard disk drive.

The only ROM I have tested where I can't find an explanation for a bug is the Game Gear Garfield: Caught in the Act game. Like many games, when left at the title screen it'll run a demo loop of the game in action. At one point Garfield would walk to the left of the screen, jump over a totem pole, shunt it to the right and use it as a way to jump out of a pit. However, in Cogwheel he would not jump far enough to the left, and not clearing the totem pole he'd just walk back to the right and not have anything to jump out of the pit on.

I remembered a post on MaxCoderz discussing a long-standing tradition of thinking that when a JP <condition>, <address> failed the instruction took a single clock cycle. You can see this misreported here, for example. This document, on the other hand, claims it always takes 10 clock cycles - and most importantly of all, the official user manual backs this up.

Garfield.png

So, Garfield can now get out of his pit. The user interface has changed (again) - I'm now using SlimDX to dump pixels to a Panel, which seems to be the least hassle distribution-wise and doesn't throw LoaderLockExceptions.

Parallel-Port SMS Control Pad

Monday, 15th January 2007

I've been wanting to attach an SMS control pad to my PC (and be able to use it to play games with) for a while, so put in an order from those excellent chaps at Rapid for the parts needed.

The joypad (as I've now learned from disassembly) is very primitive - 6 normally-open switches, each connected between a pin on the DE-9 connector and ground. The accepted layout adapter uses the 25-pin parallel port, connecting ground to pin 18, power to pin 1 (not that the control pad uses this pin) and 7 further connections from D0 to D6 for the buttons.

sms_pad.jpg
Master System Control Pad and a poorly-soldered DB-25 to DE-9 adapter.

I had been assured that the data lines on parallel ports (D0..D7) were pulled up, and so the layout seemed easy enough - D0..D6 will return highs normally, and when a button is pressed it is connected to ground.

Unfortunately, for whatever reason the data lines on the parallel port on my PC are not pulled up, at least not in any way that I can find to control. However, if you set the lines to be outputs (using bit 5 of the control register), set them all high, then flip them to inputs, they'll read as highs for a while until they float (slowly) back low again. I've used this to my advantage, and so have this:

/// <summary>Flags corresponding to which buttons are pressed.</summary>
[Flags]
public enum Buttons {
    None = 0x00,
    Up = 0x01,
    Down = 0x02,
    Left = 0x04,
    Right = 0x08,
    Button1 = 0x10,
    Button2 = 0x20,
    All = 0x3F,
}   

// Retrieve the status of the port.
private Buttons GetRawStatus() {
    // Set D0..D7 as outputs.
    Output(this.BaseAddress + 2, 0x00);
    // Set them high:
    Output(this.BaseAddress + 0, 0xFF);
    // Set D0..D7 as inputs.
    Output(this.BaseAddress + 2, 0x20);
    // Retrieve, invert and mask the data lines.
    return (Buttons)(~(Input(this.BaseAddress + 0)) & (int)Buttons.All);
}

This works very well, with one small problem: nothing is debounced, so pressing any button causes 10 or so press/release actions to be detected until the contacts settle. Therefore, the exposed method for retrieving the status is this:

/// <summary>Gets the status of the buttons from the connected SMS joypad.</summary>
/// <returns>The status of the buttons.</returns>
public Buttons GetStatus() {
    if (!this.Debounced) {
        return GetRawStatus();
    } else {
        Buttons Last = GetRawStatus();
        Buttons Current;
        int MaximumIterations = 100;
        while (((Current = GetRawStatus()) != Last) && (MaximumIterations-- > 0)) {
            Last = Current;
            Thread.Sleep(0);
        }
        return Current;
    }
}

For some strange reason, this doesn't quite work; after a while (or rebooting, or reading/writing the EPP registers) the port starts reading nothing but zeroes again. Running another piece of software that uses the parallel port fixes it.

One missing feature of the emulator was support for the SMS pause button. This button is attached to the Z80's non-maskable interrupt line, so pressing it results in the CPU pushing the program counter to the stack then jumping to $0066.

For most games the pause button just pauses the game, but for some others it will display a menu - such as in Psycho Fox, which lets you use the items you have collected to change animal or use a power-up.

psycho_fox_menu.png
Psycho Fox's in-game menu

One major long-standing bug in the emulator has been interrupt handling by the CPU. I think I've (finally!) got it, though it's still not entirely perfect. How I've set it up now is that a flag is set - IntPending or NmiPending, depending on whether the maskable or non-maskable interrupt pin has been modified - when the interrupt is requested, and cleared when it's been handled.

japanese_bios_1.gif
japanese_bios_2.gif
Japanese Master System BIOS

I have updated the memory emulation to better support BIOS ROMs. Initially, the "Majesco" Game Gear BIOS and some of the "Snail Maze" SMS BIOS worked (though the SMS BIOS would display "Software Error" on several games). I've tested a few of them and they seem to work pretty well.

hang_on_safari_hunt.png
Hang On and Safari Hunt

Whilst the Japanese BIOS has (in my opinion) the best final effect, it's the M404 prototype BIOS that has the best effect overall:

prototype_bios.gif

Sega BASIC

Monday, 8th January 2007

mdi.png

I have returned to the MDI view for this project which makes life easier when it comes to putting in debugging tools. For the moment all there is is a palette and tile viewer, but I hope to add some helpful utilities - and work out a way to tie the new Brass and an emulator together for debugging assembly programs.

I've extended the SG-1000 emulation some way towards the SC-3000 (Sega Game 1000 vs. Sega Computer 3000) which has mainly involved adding keyboard emulation. The result is the ability to run Sega BASIC, and so you get the obligatory program that everyone writes sooner or later...

obligatory.png

There is no tape drive emulation, and I'd like to ideally emulate the SF-7000's floppy drive and disk BASIC, but don't quite follow the datasheets I've read thus far (or rather, how everything is tied together).

In the first screenshot you might have noticed that the video output window has "Altered Beast (h) [A]" in the caption. I've added ROM library support, and am currently using the data from the .romdata files arranged by Maxim for his SMS Checker utility. As well as identify a name, there is a more important reason to use the database - it can identify dodgy dumps and be used to correct them (in this case, Altered Beast has an additional header - hence the (h) - that needs to be removed).

SG-1000

Wednesday, 3rd January 2007

I've added some support for the SG-1000, Sega's first? home video game console.

sg_1000_sega.gif

The Master System's VDP is a modified TMS9918, and so most Master System games run in its extended 'Mode 4' setting. That was the only video mode, therefore, that I'd emulated in any form.

For some reason, the older computer games are, the more charm they seem to have to me (maybe because the first games I played would have been on a BBC Micro, which certainly looked a lot more primitive than the Master System games I've been attempting to emulate thus far). I dug out TI's TMS9918 documentation - the differences are quite significant! Tiles are monochrome (though you can pick which foreground and background colour is in use - to some extent), the palette is fixed to 16 pre-defined colours (one of which being 'transparent') and sprite sizes and collisions are handled differently. Various features found in the Master System's VDP (scrolling, line-based interrupts) also appear to be missing from the 'vanilla' TMS9918, but I'm not sure whether or not they make an appearance in the SMS variation of the VDP or not, along with the original TMS9918 limitations.

flipper.png
Flipper

At any rate, the emulator now has a 'SG-1000' mode. The only differences at the moment are that the TMS9918 palette is used and line interrupts are disabled, so you can still (for example) use mode 4 on it.

drol.png choplifter.png hustle_chummy.png championship_loderunner.png space_invaders.png hero.png elevator_action.png zaxxon.png

From first to last: Drol, Choplifter, Hustle Chummy, Championship LodeRunner, Space Invaders, H.E.R.O., Elevator Action, and Zaxxon.

All but one of the SG-1000 games I had ran - and that was The Castle. According to meka.nam, this has an extra 8KB of onboard RAM. Whilst doing some research into the SG-1000 and the TMS9918, I found a forum post by Maxim stating "The Castle runs nicely on my RAM cart :)". Enter the new ROM mapper option to complement Standard, Codemasters and Korean - it's the RAM mapper, which simply represents the entire Z80 address range with a 64KB RAM.

the_castle_1.png the_castle_2.png

That seems to have done the trick!

I mentioned that the palette was different in the SMS VDP and the original TMS9918 - here's an example (SMS on the left, SG-1000 on the right):

music_station_sms.png music_station_sg_1000.png

I'm assuming this is the result of truncating the TMS9918 palette to the SMS 6-bit palette without needing to implement two palette modes on the new VDP. Another comparison is between two versions of Wonder Boy - SMS on the left again, SG-1000 on the right:

wonder_boy_sms.png wonder_boy_sg_1000.png

VDP Interrupts

Wednesday, 20th December 2006

road_rash_2.png

The VDP can generate two different types of CPU interrupt.

The first, and easiest, is the frame interrupt, which is requested when an entire frame has been generated. This is requested, therefore, at a regular 60Hz in NTSC regions and 50Hz in PAL regions - it's a useful timer to synchronise your game to.

The second, and more complex, is the line interrupt. This interrupt is requested when a user-definable number of scanlines have been displayed. An internal counter is decremented each active line (and one more just after), and when it overflows it resets to the value held in a VDP register and requests the interrupt (so 0 would request an interrupt every line, 1 every other line and so on). For every other line outside the active display area, the counter is reset to the contents of the VDP register.

Both interrupt types can be enabled or disabled by defined bits held in the VDP registers.

(The above should be loosely correct, the below is a little more uncertain).

Once an interrupt is requested, a flag for said interrupt is set. The flag is not reset until the VDP control port is read, so you must read the VDP control port if you expect any further interrupts.

To differentiate between line and frame interrupts you can check the value read from the control port. If the most significant bit is set, a frame interrupt (at least) was requested. Reading the vertical counter port (which returns the current scanline's vertical position) will also let you know where you are.

Something is a little wonky with my vertical counting code, as all lines end up being one too large. For the moment I'm subtracting one before returning the value (and waiting one extra scanline before triggering the frame interrupt) which is a horrible solution, but for the moment it has fixed a number of games that weren't working at all before.

earthworm_1.gif earthworm_2.gif

Earthworm Jim, which relies on line interrupts to switch on zoomed sprites to dipslay the status bar at the bottom, now plays. It's missing some graphics on the title screens, though.

For some reason, rebuilding my Z80 emulator (which is in a different project) fixed some other interrupt-related glitches, so I have a sneaking suspicion most of my earlier problems are related to using an out-of-date DLL.

road_rash_1.gif road_rash_3.png road_rash_4.png

Road Rash highlighted another bug. The VDP can draw doubled sprites - that is, when a particular bit is set it will draw sprites as 8x16 pixels, stacking two consecutive sprite tiles on top of each other. Road Rash uses this mode, but also uses odd sprite indices (odd as opposed to even, not strange). The VDP will only take even indices, so a line of code to clear the least significant bit if using doubled sprites fixed that.

Still no sound, though.

Real 3D on classic 2D hardware

Monday, 18th December 2006

My existing implementation of the standard mapper used the values $1FFC..$1FFF in RAM as the paging registers. This is incorrect; from what I can now tell the paging registers are only updated if you write to the $FFFC..$FFFF range, and isn't anything to do with the RAM anyway (the fact that they end up in RAM is a side-effect, not the main way of doing things).

Updating my code to handle this, rather than using the values in RAM, fixed some games; notably Phantasy Star and Space Harrier.

ps_1.gif ps_2.gif ps_3.gif

Thanks to Maxim's post, I fixed the Codemasters mapping for the Excellent Dizzy Collection:

dizzy.gif

A fun bit of hardware is the 3D glasses. These LCD shutter glasses could be used with certain games, and are controlled by writing to the memory range $FFF8..$FFFB. The least-significant bit controls which shutter is open and which is shut, and by alternating frames rapidly in time with the shutters you can display two views; one for the left eye, one for the right.

I've added four different 3D glasses mode. No effect doesn't do anything special, so you just get an unsightly flickery view. Frozen displays frames for one eye only, giving you a 2D view of the 3D game.

To recreate the 3D view on your PC, though, there are two methods. The first method is the standard red-green anaglyph view, for use with those red-green glasses:

anaglyph_1.gif anaglyph_2.gif

Unfortunately, on top of the usual shimmery view you get from anaglyphs, this is made even worse by the fact that I don't remove the existing colour information from the frames. This looks nicer without glasses, but looks really quite horrible when viewed (as intended) through the glasses. For example, the red text is invisible through the green filter, but strongly visible through the red filter, making it shimmer.

anaglyph_grey_1.gif anaglyph_grey_2.gif

By sacrificing the colour and only taking the luminance of the orignal frames, you get an anaglyph that remains stable when viewed.

Fortunately, there is an easy way to retain full colour information in a 3D image, which is to use a stereo pair.

poseiden.gif

The image on the left is to be viewed by your right eye, and the one on the right by your left eye. If you cross your eyes until the two images overlap, and concentrate on the middle one, you should be able to focus on a 3D image.

space_harrier.gif

maze_1.gif

maze_2.gif

I tried adding sound emulation back in, but the timing has confused me once more. If you can help, I've posted the relevant thread here.

Turning Japanese

Friday, 15th December 2006

dizzy.gif

The Z80 CPU uses 16-bit addressing, which limits it to a 64KB address space. As well as fitting the program ROM into this space, we need to fit in the machine's 8KB RAM, limiting us even further.

To get around this limitation, the memory is broken down into a series of windows ("frames"), and you can change what is visible in some of these windows.

The memory range $C000..$DFFF can be used to address the 8KB RAM. This is mirrored from $E000..$FFFF; that is, reads and writes to $E000 work as if you were reading and writing to $C000.

There are three other windows; from $0000..$3FFF, $4000..$7FFF and $8000..$BFFF. By changing the contents of RAM (addresses $FFFC..$FFFF) you can adjust what is accessible from these memory ranges.

The lower 1KB cannot be paged out - this is because when the device boots the contents of RAM (and thus paging registers) are undefined, and you need something fixed in place to set up the machine correctly.

I had already implemented most of the above, with two omissions. The first was that certain cartridges contained their own additional RAM chips (accessible by setting flag bits in paging register $FFFC) that could be used as saved game area. Ys appears to use this area, and so didn't work without it. That was the garbled screenshot I posted earlier:

ys_1.gif ys_2.gif

The Flash now works too.

flash_1.gif flash_2.gif

The second omission was that Codemasters games use a different mapping system to the standard one. Fortunately it is significantly simpler; the 32KB from $0000..$7FFF is locked, and the 16KB from $8000..$BFFF is offset based on a value previously written to address $8000. ($C000..$FFFF is 8KB work RAM as usual).

codemasters.gif

The Excellent Dizzy Collection as seen at the top of this post doesn't entirely work; when you pick a game it resets. Some other emulators do this too. The Game Gear games Cosmic Spacehead and Micro Machines (1 and 2) work well, though.

cosmic.gif mm1.gif mm2.gif

After some fiddling around with the VDP emulation and interrupts, some more games work (but I'm not sure why they now work):

desert_speedtrap.gif asterix.gif indy.gif

Other parts of the emulator have been improved. It can now be set to either domestic (Japanese) or export (everyone else) modes, which affects some games in a variety of ways - from translated text, to removing the Mark III bumper screen, to changing the title of the game.

psychic_jp.gif psychic_en.gif

For Game Gear games, detecting whether it's a Japanese machine or not is simple - test the 6th bit returned when you read port $00. If it's set, you've got an export machine.

However, port $00 is Game Gear-specific - the Master System hardware has a peculiarity that is used to detect region. There are two ports for controllers, and each controller has up, down, left, right, TL, TR and TH lines. You can set the TR and TH lines to be inputs or outputs (and the output level if configured as output) by writing to the I/O control port. If you set them to outputs and try and read from them, on export hardware you get the values you are telling them to output. On Japanese machines, however, they return 0s regardless of the chosen output level.

zillion_jp.gif zillion_en.gif

A more fun addition is Game Genie support. A simple call to Emulator.AddGameGenieCode("3A0-21C-2A2"); and Sonic can stand here all day:

game_genie.gif

Of course, the technical information above is my interpretation of what I've gathered so far from various documents from far brainier chaps than I, and the fact that all I have is a broken emulator indicates that it is probably complete rubbish. smile.gif

Kids, just say No to bilinear filtering

Monday, 11th December 2006

ristar_1.gif

The old layout of the emulator was a bit of a mess.

The Z80 emulator needed to sit inside a class of your design that inherited from IHardwareController (and so exposed methods to handle hardware devices) and another that inherited from IMemoryDevice that exposed methods that could be used to read and write to memory.

As you might imagine, this got a little messy what with the various nested classes needing to pass references to eachother around. The new design is much more straightforwards - you build a class that inherits from Z80A (which is the basic Z80 emulator) and override the four functions for reading from/writing to memory addresses/hardware ports.

ristar_2.gif

Further to that, the Master System emulator itself has been shunted out to its own class library, meaning that the code isn't mixed up with interface code. This way I can easily have two different front-ends - an MDI Windows GUI for example, like the one I've posted screenshots of before, with debugging tools and so on, and an XNA interface that runs full-screen, taking input from a connected joypad.

alex_kidd.gif

For the moment I have the blurry and slow solution which is just to dump the output Bitmap onto a form. You can probably guess as much from the pathetic framerates reported on the above screenshots.

alien_syndrome.gif

Overall compatibility is still increasing slowly. Some programs that used to work now don't, of course. It goes both ways.

The VDP emulation has been significantly rewritten. A lot of the main scanline rasteriser is still the same as it was, though, but slight fixes have been made there (including picking the correct backdrop colour - an x & 0xF + y versus (x & 0xF) + y bug - and sorting out the second column) but I still have a lot of interrupt-related bugs to iron out.

tails_adventure.gif garfield.gif superman.gif greendog.gif dragon_crystal.gif

Just as a test, I cobbled together an 'alternative' interface that uses that console code I posted here a while back.

pinball.gif vfa.gif faceball.gif

It's a sad thing that the above runs at a decent framerate, whereas the graphical versions run at half the rate of they should. At least, that is running in Debug mode within the IDE...

sonic.gif james_pond.gif

Checking the Release build would always be a better plan. That's still lousy performance in my book, though.


Having written the above, I decided to look at the official Game Gear documents once again, and reread the section on ROM banking. I had not realised that for 1MBit cartridges, area 0 and area 1 ($0000..$3FFF and $4000..$7FFF respectively) were fixed, and only area 2 ($8000..$BFFF) could be changed.

Fixing areas 0 and 1 fixed a number of the smaller games:

zillion.gif fantasy_zone_3.gif ghost_house.gif alex_kidd_mir_1.gif alex_kidd_mir_2.gif fantasy_zone_1_1.gif fantasy_zone_1_2.gif wonder_boy.gif
monaco_gp.gif aerial_assault.gif pengo.gif woody_pop.gif

It turns out that if I resize the window so the display is at 1:1 pixel scaling, I get ~73FPS on my machine and - most importantly - only 3% CPU usage. If I make the window even twice as large as it was, it drops to about 30FPS and 100% CPU usage. DirectX beckons...

New Z80 emulator

Wednesday, 6th December 2006

sonic_gg_1.png

One chap I cannot thank enough is CoBB for all his hard work in the Z80 field.

I've rewritten the Z80 emulation from scratch; this time it uses an expanded switch block (the 'manual' way) to decode instructions. Rather than write every combination of instructions out by hand, the code making up the switch blocks up is generated by another program, reading instruction information from a table copied from an Excel spreadsheet.

At the cost of a significantly larger assembly (from 40KB to 140KB) I now get a 100% speed increase (from ~50MHz to 100MHz).

I still can't pass the port of ZEXALL I'm testing with (and the same instructions too - not bad for a 100% rewrite to end up with exactly the same bugs), but after comparing some of my offending tables against CoBB's ones I've isolated some of the hiccoughs. The only instruction group test I fail is, naturally, the one that takes the longest to execute - getting a hardware, or indeed emulator, comparison takes well over an hour.

Anyhow, rewriting the Z80 emulation seems to have been the right thing to do. As you might have guessed from the picture at the top of this entry, Sonic now runs.

sonic_gg_2.png sonic_gg_3.png

My VDP (Video Display Processor) emulation is still rather rough-and-ready (I've really been concentrating on the Z80 bit) and the second column of background tiles is not updated correctly, so I apologise in advance for the distortion! It only appears in SMS mode (the display is cropped in Game Gear mode).

sonic_sms_1.png sonic_sms_2.png

The fill colour (in the left column here) is incorrect in most games too.

sonic_2_1.png sonic_2_2.png sonic_2_3.png sonic_2_4.png sonic_2_5.png

I rather preferred Sonic 2, but maybe that's because you can pick up dropped rings and I'm rather lousy at it otherwise.

wb_1.png wb_2.png wb_3.png

I accused the previous Wonder Boy III shot of not reading the start button. Somehow I also failed to notice the missing sprites (clouds and main part of the castle), which was part of the main problem. It now appears to play well.

vfa_1.png vfa_2.png vfa_3.png vfa_4.png

Support for zoomed sprites seems to be missing in some emulators (at least, the versions of Dega and Pastorama I have to hand), but I use them so implemented them to let my programs work when testing - it's nice to see a commerical game use them too!

gunstar_1.png gunstar_2.png

Gunstar Heroes runs, but flickers and jumps (not the visible sprite 'flicker' in those screenshots) - some bug in my CPU interrupt handling or VDP interrupt generation.

psycho_fox.png the_flash.png

Psycho Fox runs insanely fast - even faster than The Flash as seen to the right - making an already difficult game to control virtually impossible. I'm really not sure what could be causing that, but the enemy sprites sometimes flicker up and back down quickly, so it could be one of the remaining CPU bugs.

fantasy_zone_2.png maze_walker.png

Fantasy Zone demonstrates both the blanking colour bug (that left column should be green) and the distorted second column bug, but plays well. I tried to get a better screenshot of Maze Walker, but not handling 3D glasses makes looking at the screen a rather unpleasant experience (it flickers the left and right eye views in quick succession; the 3D glasses had two LCD shutters that opened and closed in sequence with the images on-screen).

I might add a Dega-esque red/green anaglyph filter, but I find those rather unpleasant to look at so might provide a stereo pair view.

In any case, the most important SMS game now runs -

bios_1.png bios_2.png bios_3.png

I shall refrain from using the Terry Pratchett quote (just this once).

ys_junk.png

Some games look like the above, which makes me happy - it's the ones that do nothing at all that worry me. What with the CPU bugs, dodgy emulation of the mapper, missing (important!) hardware ports and hackish VDP, it's surprising anything runs. It's getting better.

EDIT: How are there so many spelling errors? Fantasy Start as opposed to Fantasy Zone, Video Display Hardware as the expansion of the VDP acronym... I should not write these so late at night.

Compatibility increases further...

Thursday, 23rd November 2006

wonder_boy_1.png

I've added better memory emulation (that is, handling ROM paging, RAM mirroring and enabling a BIOS or not). I wouldn't dare say "more accurate", as that might indicate that something about it is partially accurate. smile.gif

I've isolated one of the biggest problems - and that's programs getting caught in a loop waiting for an interrupt that is never triggered.

The source of these interrupts is the VDP. It can generate two kinds of interrupt - on a line basis (where you can configure it to fire an interrupt every X scanlines) or on a frame basis (where it fires an interrupt at the end of the active display).

Two bits amongst the VDP's own registers control whether the interrupt fires or not. On top of that, there are two internal flags that are set when either of the interrupts fire. They are reset when the VDP's control port is read.

Charles MacDonald's VDP documentation
Bit 5 of register $01 acts like a on/off switch for the VDP's IRQ line. As long as bit 7 of the status flags [this is the frame interrupt pending flag] is set, the VDP will assert the IRQ line if bit 5 of register $01 is set, and it will de-assert the IRQ line if the same bit is cleared.

Bit 4 of register $00 acts like a on/off switch for the VDP's IRQ line. As long as the line interrupt pending flag is set, the VDP will assert the IRQ line if bit 4 of register $00 is set, and it will de-assert the IRQ line if the same bit is cleared.

The way I've chosen to emulate this is as a boolean storing the IRQ status, and to call a function, UpdateIRQ(), every time something that could potentially change the status of it happens.

private bool IRQ = false;
private void UpdateIRQ() {
    bool newIRQ = (LineInterruptEnabled && LineInterruptPending) || (FrameInterruptEnabled && FrameInterruptPending);
    if (newIRQ && !IRQ) this.CPU.Interrupt(true, 0x00);
    IRQ = newIRQ;
}

This detects a rising edge. Falling edge hasn't worked so well. Well, truth be told, neither work very well. So, hitting the I key in the emulator performs a dummy read of the control port which usually 'unblocks' the program.

I'm hoping that my problem is related to this similar one, as the fix seems pretty straightforwards.

The SEGA logo at the top of this entry is not the only evidence of working commerical software...

wonder_boy_2.png wonder_boy_3.png

Not reading the Start button is a common affliction.

faceball_2000.png

Here's the Game Gear DOOM for Scet. smile.gif

marble_madness_1.png marble_madness_2.png marble_madness_3.png marble_madness_4.png marble_madness_5.png

Marble Madness appears to be fully playable, though needs prompting from my I key every so often.

columns_1.png columns_2.png

Columns can't find the start button either, but the demo mode works.

pinball_dreams.png

ys.png desert_strike.png

Getting as far as a title screen is still an achievement in my books.

My luck can't hold out forever, but the homebrew scene is still providing plenty of screenshots...

bock_2002_1.png bock_2002_2.png bock_2002_3.png

Maxim's Bock's Birthday 2002 has been immensely useful for sorting out ROM paging issues.

bock_2003_1.png bock_2003_2.png

Bock's Birthday 2003 demonstrates the interrupt bug, but appears to run healthily otherwise.

hicolor_1.png hicolor_2.png hicolor_3.png

Chris Covell's Hicolor Demo also demonstrates this bug, needing a prod before each new image is displayed.

digger_chan_1.png digger_chan_2.png

Aypok's Digger Chan appears to play fine.

zoop_1.png zoop_2.png nibbles.png

Martin Konrad's Zoop 'em Up and GG Nibbles games run, Zoop 'em Up seems to have collision detection issues (aluop-related bug).

The bug in the Z80 core still eludes me - how aluop with an immediate value passes and aluop with a register fails is rather worrying. At least part of the flags problem has been resolved - the cannon in Bock's KunKun & KokoKun still don't fire, but at the switches now work so you can complete levels.

Screenshots galore

Tuesday, 21st November 2006

I downloaded a collection of homebrew releases from SMS Power! for testing. Here are the obligatory screenshots.

Sega Master System

dc_evolution.png good_advice.png maxim_chip_8_1.png maxim_chip_8_2.png maxim_chip_8_3.png maxim_chip_8_4.png maxim_chip_8_5.png maxim_fantasy_zoop.png sms_power_7th.png tetris.png vik_snd.png violation_1.png violation_2.png violation_3.png violation_4.png

Copyright Violation is displayed incorrectly - the text should be on top of the stars, but that VDP feature is as yet unsupported.

Sega Game Gear

win_gg.png win_gg_paint.png win_gg_text.png win_gg_sea.png win_gg_pic.png sms_power_gg.png

Yes, Windows and DOOM were ported to the Game Gear. That second screenshot is DOOM's automap. wink.gif

There are still lots of ROMs that won't go - that could be due to poor VDP emulation, or it could be due to ROM paging errors (the emulated memory is just straight 64KB RAM), or could be down to the remaining bugs in the Z80 emulation.

Sega: Enter the Pies

Monday, 20th November 2006

The Z80 core is now a bit more accurate - ZEXALL still reports a lot of glitches, and this is even a specially modified version of ZEXALL that masks out the two undocumented flags.

The VDP (Video Display Processor - the graphics hardware) has been given an overhaul and is now slightly more accurate. A lot more software runs now. I have also hacked in my PSG emulator (that's the sound chip) from my VGM player. It's not timed correctly (as nothing is!) but it's good enough for testing.

picohertz_tween.png

picohertz_wormhole.png

Picohertz, the demo I have been working on (on and off) now runs correctly. The hole in the Y in the second screenshot is caused by the 8 sprite per scanline limit. The first screenshot shows off sprite zooming (whereby each sprite is zoomed to 200% the original size). The background plasma is implemented as a palette shifting trick.

fire_track.png

fire_track_paused.png

Fire Track runs and is fully playable. The second shot shows a raster effect (changing the horizontal scroll offset on each scanline.

Seeing as I understand the instructions that my programs use (and the results of them), and have my own understanding of parts of the hardware, it's not really surprising that the programs I've written work perfectly, but ones written by others don't, as they might (and often do) rely on tricks and results that I'm not aware of, or on hardware that I haven't implemented accurately enough. At least I do not need to emulate any sort of OS to run these programs!

SMS Power! has been an amazing resource in terms of hardware documentation and homebrew ROMs. I've been using the entries to the 2006 coding competition to test the emulator.

kunkun_kokokun_title.png kunkun_kokokun_game.png

Bock's game, KunKun & KokoKun, nearly works. The cannon don't fire, which would make the game rather easy if it wasn't for the fact that the switch to open the door doesn't work either. I suspect that a CPU flag isn't being set correctly as the result to an operation somewhere.

pong.png

Haroldoop's PongMaster is especially interesting as it was not written in Z80 assembly, but C. It's also one of the silkiest-smooth pong games I've come across.

an!mal_paws_text.png an!mal_paws.png

An!mal/furrtek's Paws runs, but something means that the effect doesn't work correctly (the 'wavy' bit should only have one full wave in it, not two - it appears my implementation of something doubles the frequency of the wave). The music sounds pretty good, though.

columns.png

Sega's Columns gets the furthest of any official software - the Sega logo fades in then out.

sega_enterpies.png

I do like the idea that Sega is an ENTERPIεS. (From the Game Gear BIOS). (I believe this is a CPU bug).

frogs.png

Charles Doty's Frogs is a bit of a conundrum. The right half of the second frog is missing due to the 8 sprites per scanline limitation of the VDP. However, Meka, Emukon, Dega and now my emulator draw the rightmost frog's tongue (and amount if it showing) differently, as well as whether the frog is sitting or leaping. There's a lot of source for such a static program (it doesn't do anything in any emulator I've tried it on, nor on hardware). Dega is by far the strangest, as the tongue moves in and out rapidly. I'm really not sure what's meant to be happening here.

Here are the results of ZEXALL so far.

Z80 instruction exerciser

ld hl,(nnnn).................OK
ld sp,(nnnn).................OK
ld (nnnn),hl.................OK
ld (nnnn),sp.................OK
ld <bc,de>,(nnnn)............OK
ld <ix,iy>,(nnnn)............OK
ld <ix,iy>,nnnn..............OK
ld (<ix,iy>+1),nn............OK
ld <ixh,ixl,iyh,iyl>,nn......OK
ld a,(nnnn) / ld (nnnn),a....OK
ldd<r> (1)...................OK
ldd<r> (2)...................OK
ldi<r> (1)...................OK
ldi<r> (2)...................OK
ld a,<(bc),(de)>.............OK
ld (nnnn),<ix,iy>............OK
ld <bc,de,hl,sp>,nnnn........OK
ld <b,c,d,e,h,l,(hl),a>,nn...OK
ld (nnnn),<bc,de>............OK
ld (<bc,de>),a...............OK
ld (<ix,iy>+1),a.............OK
ld a,(<ix,iy>+1).............OK
shf/rot (<ix,iy>+1)..........OK
ld <h,l>,(<ix,iy>+1).........OK
ld (<ix,iy>+1),<h,l>.........OK
ld <b,c,d,e>,(<ix,iy>+1).....OK
ld (<ix,iy>+1),<b,c,d,e>.....OK
<inc,dec> c..................OK
<inc,dec> de.................OK
<inc,dec> hl.................OK
<inc,dec> ix.................OK
<inc,dec> iy.................OK
<inc,dec> sp.................OK
<set,res> n,(<ix,iy>+1)......OK
bit n,(<ix,iy>+1)............OK
<inc,dec> a..................OK
<inc,dec> b..................OK
<inc,dec> bc.................OK
<inc,dec> d..................OK
<inc,dec> e..................OK
<inc,dec> h..................OK
<inc,dec> l..................OK
<inc,dec> (hl)...............OK
<inc,dec> ixh................OK
<inc,dec> ixl................OK
<inc,dec> iyh................OK
<inc,dec> iyl................OK
ld <bcdehla>,<bcdehla>.......OK
cpd<r>.......................OK
cpi<r>.......................OK
<inc,dec> (<ix,iy>+1)........OK
<rlca,rrca,rla,rra>..........OK
shf/rot <b,c,d,e,h,l,(hl),a>.OK
ld <bcdexya>,<bcdexya>.......OK
<rrd,rld>....................OK
<set,res> n,<bcdehl(hl)a>....OK
neg..........................OK
add hl,<bc,de,hl,sp>.........OK
add ix,<bc,de,ix,sp>.........OK
add iy,<bc,de,iy,sp>.........OK
aluop a,nn...................   CRC:04d9a31f expected:48799360
<adc,sbc> hl,<bc,de,hl,sp>...   CRC:2eaa987f expected:f39089a0
bit n,<b,c,d,e,h,l,(hl),a>...OK
<daa,cpl,scf,ccf>............   CRC:43c2ed53 expected:9b4ba675
aluop a,(<ix,iy>+1)..........   CRC:a7921163 expected:2bc2d52d
aluop a,<ixh,ixl,iyh,iyl>....   CRC:c803aff7 expected:a4026d5a
aluop a,<b,c,d,e,h,l,(hl),a>.   CRC:60323322 expected:5ddf949b
Tests complete



The aluop (add/adc/sub/sbc/and/xor/or/cp) bug seems to be related to the parity/overflow flag (all other documented flags seem to be generating the correct CRC). daa hasn't even been written yet, so that would be the start of the problems with the daa,cpl,scf,ccf group. adc and sbc bugs are probably related to similar bugs as the aluop instructions.

The biggest risk is that my implementation is so broken it can't detect the CRCs correctly. I'd hope not.

In terms of performance; when running ZEXALL, a flags-happy program, I get about ~60MHz speed in Release mode on a 2.4GHz Pentium 4. When ZEXALL is finished, and it's just looping around on itself, I get ~115MHz.

The emulator has not been programmed in an efficient manner, rather a simple and clear manner. All memory access is done by something that implements the IMemoryDevice controller (with two methods - byte ReadByte(ushort address) and void WriteByte(ushort address, byte data)) and all hardware access is done by something that implements the IHardwareController interface (also exposing two methods - byte ReadDevice(byte port) and void WriteDevice(byte port, byte data)).

Most of the Z80's registers can be accessed via an index which makes up part of an opcode. You'd have thought that the easiest way to represent this would be, of course, an array. However, it's not so simple - one of the registers, index 6, is (HL) - which means "whatever HL is pointing to". I've therefore implemented this with two methods - byte GetRegister(int index) and void SetRegister(int index, byte value).

Life isn't even that simple, though, as by inserting a prefix in front of the opcode you can change the behaviour of the CPU - instead of using HL, it'll use either IX or IY, two other registers. In the case of (HL) it becomes even hairier - it'll not simply substitute in (IX) or (IY), it'll substitute in (IX+d), where d is a signed displacement byte that is inserted after the original opcode.

To sort this out, I have three RegisterCollections - one that controls the "normal" registers (with HL), one for IX and one for IY. After each opcode and prefix is decoded, a variable is set to make sure that the ensuing code to handle each instruction works on the correct RegisterCollection.

The whole emulator is implemented in this simplified and abstracted manner - so I'm not too upset with such lousy performance.

I'm really not sure how to implement timing in the emulator. There's the easy timing, and the not-so-easy timing.

The easy timing relates the VDP speed. On an NTSC machine that generates 262 scanlines (60Hz), on a PAL machine that generates 313 scanlines (50Hz). That's 15720 or 15650 scanlines per second respectively.

According to the official Game Gear manual, the CPU clock runs at 3.579545MHz. I don't know if this differs with the SMS, or whether it's different on NTSC or PAL devices (the Game Gear is fixed to NTSC, as it never needs to output to a TV, having an internal LCD).

I interpret this as meaning that the CPU needs to be run for 227.7 or 228.7 cycles per scanline. That way, my main loop looks a bit like this:

if (Hardware.VDP.VideoStandard == VideoStandardType.NTSC) {
    for (int i = 0; i < 262; ++i) {
        CPU.FetchExecute(228); 
        Hardware.VDP.RasteriseLine();
    }
} else {
    for (int i = 0; i < 313; ++i) {
        CPU.FetchExecute(229);
        Hardware.VDP.RasteriseLine();
    }
}

The VDP raises an event when it enters the vertical blank area, so the interface can capture this and so present an updated frame.

The timing is therefore tied to the refresh rate of the display.

super_gg.png

Here's the fictional Super Game Gear, breezing along at 51MHz. The game runs just as smoothly as it would at 3MHz, though - as the game's timing is tied to waiting for the vertical blank.

Actually, I tell a lie - as Fire Track polls the vertical counter, rather than waiting for an interrupt, it is possible for it to poll this counter so fast (at an increased clock rate) that it hasn't changed between checks. That way "simple" effects run extra fast, but the game (that has a lot of logic code) runs at the same rate.

This works. The problem is caused by sound.

With the video output, I have total control of the rasterisation. However, with sound, I have to contend with the PC's real hardware too! I'm using the most excellent FMOD Ex library, and a simple callback arrangement, whereby when it needs more data to output it requests some in a largish chunk.
If I emulate the sound hardware "normally", that is updating registers when the CPU asks them to be updated, by the time the callback is called they'll have changed a number of times and the granularity of sound updates will be abysmal.

A solution might be to have a render loop like this:

for (int i = 0; i < 313; ++i) {
    CPU.FetchExecute(229);
    Hardware.VDP.RasteriseLine();
    Hardware.PSG.RenderSomeSamples(1000);
}

However, this causes its own problems. I'd have to ensure that I was generating exactly the correct number of samples - if I generated too few I'd end up with crackles and pops in the audio as I ran out of data when the callback requested some, or I'd end up truncating data (which would also crackle) if I generated too much.

My solution thus far has been a half-way-house - I buffer all PSG register updates to a Queue, logging the data written and how many CPU cycles had been executed overall when the write was attempted. This way, when the callback is run, I can run through the queued data, using the delay between writes to ensure I get a clean output.

As before, this has a problem if the timing isn't correct - rather than generate pops or crackles, it means that the music would play at an inconsistent rate.

Of course, the "best" solution would be to use some sort of latency-free audio solution - MIDI, for example, or ASIO. If I timed it, as with everything else, to scanlines I'd end up with a 64µs granularity - which is larger than a conventional 44.1kHz sample (23µs), so PWM sound might not work very well.

chip_8.png

Incidentally, this is not the first emulator I have written - I have written the obligatory Chip-8 emulator, for TI-83 calculator and PC. Being into hardware, but not having the facilities to hand to dabble in hardware as much as I'd like to, an emulator provides a fun middle-ground between hardware and software.

wormhole.gif

And now for something completely different

Friday, 17th November 2006

A long time ago I thought I'd try my hand at this emulation malarkey.

cogwheel.gif

The hardware is the Z80-based Sega Master System and Game Gear.

Due to the design, it was a huge, messy, poorly written series of hacks that could just about produce the above result if you didn't breathe too hard.

After finding this document, I had a bit of a blast at rearranging the Z80 core and timing the video hardware a bit more correctly. Here's that old Fire Track project:

title.png
All this does is run the emulation for 100 scanlines on a Timer's tick, hence the low reported clock speed.

opening_effect.png
The Game Gear's LCD cropped the output screen, so you wouldn't see the junk to the sides of the display here.

level_screen_without_sprite.png
My implementation of the VDP doesn't support sprites yet.

Of course, these are the most exciting screenshots:

cogwheel_sdsc.gif

zexall.png

Two versions of ZEXALL, one that displays the results on the SMS screen, the other in an SDSC debug console.

One day I shall purge all of these CRC errors. One day! But not now, as I don't have the time to put any work on this (I stole a couple hours for the above), and in spare time (hah!) I should really focus on the Latenite software. Which, whilst it doesn't provide such pretty screenshots, is a great deal more useful.

Subscribe to an RSS feed that only contains items with the Cogwheel tag.

FirstLast RSSSearchBrowse by dateIndexTags