Reverse engineering Atari 8-bit video
I apologize for the lack of updates recently -- I have a bunch of stuff in the pipeline, but it isn't ready for release yet, and in the last couple of days, I haven't been in much of a mood to work on VirtualDub. When that happens, I usually try to find something else to work on, and this weekend, that ended up being my Atari 8-bit system emulator instead. Specifically, there were some reports of compatibility problems in the video emulation which I wanted to check out.
Well, about ten minutes into that process, I ended up hitting a bad bug in the thunk allocator in VirtualDub's system library, which I had to fix and integrate from Altirra over to the VirtualDub 1.9.x branch. Sigh. So much for getting away completely.
About two hours into the process, I still had sufficiently little information about what was happening in the real Atari hardware that I ended up having to reverse engineer it from the specs. Thanks to the folks from the Atari Historical Society (now called the Atari History Museum, I believe), PDF scans of the original GTIA and POKEY specifications are now available, including gate level diagrams. (I haven't been able to find a document for ANTIC. If someone knows of one, let me know.) The problem with these scans is that they're very low quality and hard to read, so it takes a lot of head scratching to figure out what's going on. I'm not a chip designer, either, so basically I have to wing it with what I know about basic digital logic and try to match it up with what I know about the hardware. This can be a bit difficult and frustrating when the logic is encoded in three layers of NOR gates and the scan isn't good enough to indicate whether a signal is active high or active low.
Anyway, I'm definitely a bit late in learning about the details of the Atari 8-bit's GTIA chip, but I learned enough in the process that I couldn't find anywhere else -- or, at least, not in one place -- that I figured I'd just dump it all here in case anyone's interested. There are an amazing number of nuances to trying to emulate a chip exactly, far beyond what you'd get from reading the official documentation.
What the GTIA does
The GTIA chip's job is to convert video information into a video signal. It deals with colors, players and missiles (sprites), and a playfield (main image). There are also a couple of miscellaneous functions crammed into it, like paddles, that I won't cover here. What it doesn't do, however, is read from memory, which is the job of the other chip it's designed to be paired with, ANTIC. Therefore, the memory stream that GTIA deals with has already been cooked somewhat by ANTIC, which simplifies its job. On the output side, GTIA primarily generates a 160x192 pixel resolution display, with some ability to do 80x192 with more colors or 320x192 with fewer, with a palette chosen from a set of 256 colors (16 hues * 16 shades). Yes, this is old technology. If you were trying to view XXX material on your 8-bit Atari, you were probably using a fair amount of imagination.
There are five main sections I can identify in the GTIA schematic that pertain to the video path:
- ANTIC interface decoding
- Player/missile sequencing
- Priority control
- Collision detection
- Video encoding
I haven't dug much into the P/M sequencing and the collision detection is mostly simple replicated logic, so I'm going to be concentrating on the other three areas.
Data path overview
Starting at the top, the ANTIC interface consists of three wires, AN2-AN0. All playfield video data flows through this interface, whether it be text or graphics, 40-pixel or 320-pixel resolution. The bus transfers three bits of data every color clock, or one pixel at a time at 160 resolution. ANTIC does the work of decoding text modes through the character table and translating colors based on the display list mode, so those largely don't matter as far as GTIA's concerned. All that it receives is either a horizontal/vertical blank signal, 4-bit pixels at 80 resolution, ~2-bit (5 level) pixels at 160 resolution, or 1-bit pixels at 320 resolution. Most video modes on the Atari work in the 160 mode, so GTIA decodes four of the ANx codes to four signals PF0-PF3, one for each playfield. Either one of the bits is set at a time, indicating a particular playfield color is active, or all are off, indicating background. These signals flow up to the priority logic for merging with the sprites.
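To make the 160-resolution decode a bit more concrete, here is a minimal Python sketch. The function name `decode_an` and the tuple layout are made up for this example. Only the codes 000 and 100-111 come from the encodings discussed in this post; lumping every other code into a single "control" flag is an assumption of the sketch, not something read off the schematic.

```python
def decode_an(an):
    """Decode a 3-bit AN2-AN0 value into playfield select lines.

    Returns (pf0, pf1, pf2, pf3, control). At most one PFx is set;
    all four clear with control False means background.
    """
    an &= 0b111
    if an & 0b100:
        # Codes 100..111 select PF0..PF3 via the low two bits.
        index = an & 0b011
        pf = [i == index for i in range(4)]
        return (*pf, False)
    if an == 0b000:
        # Background: no playfield line is active.
        return (False, False, False, False, False)
    # Remaining codes carry blanking/control information (not modeled here).
    return (False, False, False, False, True)
```

For example, `decode_an(0b101)` sets only the PF1 line, and `decode_an(0b000)` leaves all lines clear.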
The priority logic is next, and its job is to determine what is on top from all of the layers, both from the playfield, and also from the sprites. There are four playfields, four players, and four missiles. A lot of the priority is fixed -- the playfields are always sorted the same relative to each other, and the players and missiles are always sorted the same. What you get to choose, based on four bits in the PRIOR register, is how the playfields sort relative to the players/missiles. You can put the players on top of the playfield, below the playfield, split the sprites with the playfield, or split the playfield with the sprites. The priority logic takes all of these, mixes them together, and outputs the one layer that's on top. In a few specific cases, it can output more than one, indicating that they should be mixed.
The output logic is last, and consists of the palette registers and the output circuitry. The output bits from the priority logic flow through the palette registers, which turn signals into colors. This is where BAK turns into the black background, PF2 becomes blue, and the player signal P0 becomes yellow. The palette value consists of seven bits: a four bit hue and a three bit luminance. The hue goes to an oscillator that produces the color signal, and the luminance goes out through a resistor bank to be converted to a luminance level. These eventually go out to produce what is now a rather dated but nostalgic computer image on your TV.
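As a small illustration of the seven-bit palette value, here is a Python sketch that splits a color register byte into its hue and luminance parts. The name `split_color` is hypothetical, and the bit layout (hue in bits 7-4, luminance in bits 3-1, bit 0 unused) is the usual programmer-visible register layout rather than something stated above, so treat it as an assumption of the sketch.

```python
def split_color(value):
    """Split a color register byte into (hue, luminance).

    Assumed layout: hue in bits 7-4, three luminance bits in bits 3-1,
    bit 0 not stored by the palette register.
    """
    hue = (value >> 4) & 0x0F
    luma = (value >> 1) & 0x07
    return hue, luma
```

With this layout, `split_color(0x94)` gives hue 9 and luminance level 2 out of 0-7.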
That about covers the basic, documented graphics capability... well, almost.
40 column and special GTIA modes
320 pixel mode, or 40-column mode as it is called in the docs, is where the weirdness starts. Most of the logic in the GTIA works in 160 resolution (3.58MHz) and is too slow for 320 resolution, so the 40-column mode actually involves a bit of a bypass. 40 column mode is activated by ANTIC during horizontal blank, and during the active region it overrides the playfield outputs to force PF2. The two bits AN1-AN0 that are normally decoded into the playfield are instead serialized into a 40 column bitstream that bypasses most of the priority logic and goes to the encoding section. There, the bitstream selectively replaces the luminance output with the value from PF1, regardless of playfield or player/missile. This bypass is why 40 column mode acts strangely in some ways compared to other modes. I was surprised to find special logic for collision detection, though. I knew from getting Race in Space working that collisions do work in 320 mode and that they return PF2, but I didn't know why. Well, there's logic in GTIA just for 320 mode that ORs together the two bits and outputs a special PF2C line to the collision detector, which is separate from the PF2 line to priority that is forced on. Whaddaya know.
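Here is a rough Python sketch of the 40 column bypass as described above. The function names are invented for the example, and the color byte layout (hue in the upper nibble, luminance in bits 3-1) is an assumption carried in from the register format rather than from the schematic.

```python
def hires_output(base_color, pf1_color, bit):
    """Apply one bit of the 40 column bitstream to the output color.

    base_color is whatever the priority logic selected (normally PF2,
    but possibly a player or missile). When the bit is set, only the
    luminance is replaced with PF1's luminance; the hue is untouched.
    """
    if bit:
        return (base_color & 0xF0) | (pf1_color & 0x0E)
    return base_color & 0xFE


def pf2c(an1, an0):
    """Collision-only PF2 signal: the OR of the two bits in a color clock."""
    return bool(an1 or an0)
```

Note that `pf2c` works on the pair of bits for a whole color clock, which is why a collision in this sketch is reported at 160 resolution even though the display is at 320.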
The GTIA also has three special display modes not present in its predecessor, the CTIA. The three modes are 16 direct colors / 1 luminance, 9 indirect colors, and 16 direct luminances / 1 color. Each of the modes has its own quirks, so more on that later. One interesting thing about the GTIA modes, though, is that they piggyback on top of the 320 pixel mode. ANTIC has no idea that GTIA is configured for an 80 pixel, 16 color mode instead of a 320 pixel mode, so it just keeps sending playfield bits on AN1-AN0, which the GTIA reinterprets in paired form as colors. The GTIA, on the other hand, has to reconfigure its logic a bit. One of the side effects of enabling a GTIA mode is that it forces the 40 column mode flip flop off so that the circuitry on the output side doesn't whack the luminance of the output. Well, the side effect of this is that if you turn the GTIA special mode back off in the middle of the scan line, that flip flop doesn't reset, and that leads to the known "pseudo mode E" bug in GTIA. ANTIC mode E is a 160 pixel, 4 color mode that is encoded by ANTIC like this:
- 00 -> 000: background
- 01 -> 100: PF0
- 10 -> 101: PF1
- 11 -> 110: PF2
However, because ANTIC is actually in mode F, it gets encoded this way instead:
- 00 -> 100: PF0
- 01 -> 101: PF1
- 10 -> 110: PF2
- 11 -> 111: PF3
...and you get PF0-PF3 instead of BAK + PF0-PF2. This also explains an old mystery from my childhood, where I once tried creating a lower resolution Graphics 11 display by using Graphics 7 as the base. Graphics 7, or ANTIC mode D, is a 160 resolution, 4 color display with half vertical resolution. If you try doing this, you get messed up colors. Well, more than ten years later, I know the reason: like ANTIC mode E, mode D also outputs BAK + PF0-PF2, and the result is that you can't get an encoding of 11 on AN0-AN1 and only 9 of the 16 colors are available. The only three modes that work are the ones that use the hires encoding, which are ANTIC modes 2, 3, and F. Only took about... 15 years or so to figure that out.
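The "9 of 16 colors" result can be checked with a few lines of Python. This is only a sketch: the two tables are the mode E and mode F encodings listed above, `reachable_gtia_values` is a made-up helper, and the choice of which color clock supplies the upper two bits of the GTIA pixel is an assumption (it doesn't affect the count).

```python
# 2-bit source pixel -> 3-bit AN2-AN0 code, as listed in this post.
MODE_E = {0b00: 0b000, 0b01: 0b100, 0b10: 0b101, 0b11: 0b110}
MODE_F = {0b00: 0b100, 0b01: 0b101, 0b10: 0b110, 0b11: 0b111}


def reachable_gtia_values(encoding):
    """Return the set of 4-bit GTIA pixel values an encoding can produce.

    The GTIA modes pair the AN1-AN0 bits from two consecutive color
    clocks; AN2 is not part of the pixel value.
    """
    values = set()
    for first in encoding.values():
        for second in encoding.values():
            hi = first & 0b11   # assumed: first color clock is the high pair
            lo = second & 0b11
            values.add((hi << 2) | lo)
    return values
```

With the mode E style encoding, both background (000) and PF0 (100) put 00 on AN1-AN0 and nothing ever puts 11 there, so each half of the pixel has only three possible values and `len(reachable_gtia_values(MODE_E))` is 9. The mode F encoding reaches all 16.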
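I'm going to divert a bit now and talk about the priority logic, which as I said, determines which playfield or sprite is on top. The original problem I was dealing with was trying to figure out exactly what the bits PRIOR[3:0] do. These four bits control the priority, and according to the official docs, you're supposed to set exactly one of the bits. The four official priority settings are as follows, listed from highest priority to lowest:
- 0001: P0 P1 P2 P3 PF0 PF1 PF2 PF3/P5 BAK
- 0010: P0 P1 PF0 PF1 PF2 PF3/P5 P2 P3 BAK
- 0100: PF0 PF1 PF2 PF3/P5 P0 P1 P2 P3 BAK
- 1000: PF0 PF1 P0 P1 P2 P3 PF2 PF3/P5 BAK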
...where P0-P3 represent either players 0-3 or missiles 0-3, PF0-3 represent playfields 0-3, and P5 represents the magic fifth player (more on that later). In addition, there's a note in the docs saying that if you set more than one bit, you'll get black where the settings conflict. For the most part, all of this is true, except there's one thing wrong and one thing missing.
First, the error. It pertains to the fifth player, or P5 above. The fifth player is a mode enabled by PRIOR where the four two-bit missiles take on the color of the seldom used playfield 3 instead of using the individual player colors, thus allowing you to move the missiles together to make a fifth 8-bit player. The way this is actually implemented in the priority logic is simply to OR the missiles into PF3 instead of P0-P3, which means that it acts just like PF3. This is the only way that more than one playfield signal can be set, and as it turns out, PF3 wins. The one case that matters is the PRIOR[3:0]=1000 mode, where the players split the playfield. This leads to the weird result that the fifth player is covered by players unless PF0 or PF1 is underneath, in which case P5 shows up on top. Weird.
The missing part is what 0000 does. I had assumed that this would give a totally black display, but when I tried coding it as such, I was shocked to find that the OS itself uses this mode at times, and it most definitely doesn't give black. I worked around this in Altirra by making it equivalent to 0001. That was definitely wrong, though, as I found out soon afterward that the OS actually uses this mode. Well, it turns out that the way the logic is set up, this mode actually causes PF0 and PF1 to blend with P0 and P1, and PF2 and PF3 to blend with P2 and P3, adding 8 or 12 colors depending on whether multicolor player mode is enabled. This happens because if only the official four settings are considered, there are a number of simplifications that can be and were made. The exact logic used for priority is as follows, where * is AND, + is OR, / is NOT, and MULTI is the multicolor player enable (PRI23 = PRI2 + PRI3, PF23 = PF2 + PF3, P01 = P0 + P1, etc):
SP0 = P0 * /(PF01*PRI23) * /(PRI2*PF23)
SP1 = P1 * /(PF01*PRI23) * /(PRI2*PF23) * (/P0 + MULTI)
SP2 = P2 * /P01 * /(PF23*PRI12) * /(PF01*/PRI0)
SP3 = P3 * /P01 * /(PF23*PRI12) * /(PF01*/PRI0) * (/P2 + MULTI)
SF0 = PF0 * /(P23*PRI0) * /(P01*PRI01) * /SF3
SF1 = PF1 * /(P23*PRI0) * /(P01*PRI01) * /SF3
SF2 = PF2 * /(P23*PRI03) * /(P01*/PRI2) * /SF3
SF3 = PF3 * /(P23*PRI03) * /(P01*/PRI2)
SB = /P01 * /P23 * /PF01 * /PF23
The maximum number of colors that can be produced is 24: the nine palette colors, two from multicolor player mode, 12 from mixing playfields and players, and one from black.
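The equations above translate almost directly into code. Here is a Python sketch of that translation; the function name `gtia_priority` and the set-of-names return value are my own choices for the example, and the inputs are assumed to already have missiles (and the fifth player, if enabled) ORed into the right lines.

```python
def gtia_priority(players, playfields, prior, multi=False):
    """Evaluate the priority equations for one pixel.

    players:    (P0, P1, P2, P3) truthy/falsy inputs
    playfields: (PF0, PF1, PF2, PF3) truthy/falsy inputs
    prior:      PRIOR[3:0] as an integer
    multi:      multicolor player enable
    Returns the set of output signals that are active.
    """
    p0, p1, p2, p3 = (bool(x) for x in players)
    pf0, pf1, pf2, pf3 = (bool(x) for x in playfields)
    pri0, pri1, pri2, pri3 = (bool(prior & (1 << i)) for i in range(4))

    # Shared OR terms (PRI23 = PRI2 + PRI3, and so on).
    pri01, pri12 = pri0 or pri1, pri1 or pri2
    pri23, pri03 = pri2 or pri3, pri0 or pri3
    p01, p23 = p0 or p1, p2 or p3
    pf01, pf23 = pf0 or pf1, pf2 or pf3

    sp0 = p0 and not (pf01 and pri23) and not (pri2 and pf23)
    sp1 = (p1 and not (pf01 and pri23) and not (pri2 and pf23)
           and (not p0 or multi))
    sp2 = (p2 and not p01 and not (pf23 and pri12)
           and not (pf01 and not pri0))
    sp3 = (p3 and not p01 and not (pf23 and pri12)
           and not (pf01 and not pri0) and (not p2 or multi))

    # SF3 is computed first because it kills the other playfields.
    sf3 = pf3 and not (p23 and pri03) and not (p01 and not pri2)
    sf0 = pf0 and not (p23 and pri0) and not (p01 and pri01) and not sf3
    sf1 = pf1 and not (p23 and pri0) and not (p01 and pri01) and not sf3
    sf2 = pf2 and not (p23 and pri03) and not (p01 and not pri2) and not sf3

    sb = not p01 and not p23 and not pf01 and not pf23

    signals = {"P0": sp0, "P1": sp1, "P2": sp2, "P3": sp3,
               "PF0": sf0, "PF1": sf1, "PF2": sf2, "PF3": sf3, "BAK": sb}
    return {name for name, on in signals.items() if on}
```

Some sample cases: with P0 over PF0, `prior=0b0001` returns `{"P0"}` and `prior=0b0100` returns `{"PF0"}`, matching the official settings. With `prior=0b0000` the same inputs return `{"P0", "PF0"}`, which is the blend, and with the conflicting setting `prior=0b0101` they return the empty set, which is the black case from the docs.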
One last thing I should note: none of this affects collision. Collision taps off the player/missile and playfield outputs before any of the priority logic, so fifth player mode never causes playfield 3 hits or disables missile collisions.
Graphics 10 (9 indirect colors)
Alright, now let's proceed to the 9 color mode, which is known to the OS as Graphics 10, and which is enabled by PRIOR[7:6] = 10. This mode requires 4 bits/pixel for 80 pixels wide at normal width, and allows access to all nine palette registers in the GTIA. This mode is implemented by decoding the four bit pixel value and either routing the result onto the regular PF0-PF3 playfield lines, or blanking those lines and setting the player bits in the priority logic instead. If the playfield colors are used, they will trigger playfield collisions as usual. If the player colors are used, though, they won't trigger player collisions because the signals go directly to the priority logic via a separate ADP3-ADP0 bus. The reuse of the player signals means that the playfield will sort above players whenever the player corresponding to the color it uses is higher according to the current priority setting.
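A tiny Python sketch of the 9 color decode might look like this. The ordering of the table (pixel values 0-3 selecting the player colors, 4-7 the playfield colors, 8 the background) follows the usual color register order and is an assumption here, not something read off the schematic; values 9-15 are deliberately left out because nothing above covers them. The names `GR10_REGISTERS` and `gr10_decode` are made up.

```python
# Assumed order: pixel values 0-8 select these nine palette registers.
GR10_REGISTERS = ["P0", "P1", "P2", "P3", "PF0", "PF1", "PF2", "PF3", "BAK"]


def gr10_decode(pixel):
    """Return (register, hits_playfield_collision) for pixel values 0-8.

    Playfield colors go out on the regular PF0-PF3 lines and so register
    playfield collisions. Player colors travel on the separate ADP3-ADP0
    bus straight to the priority logic and never reach collision detection.
    """
    if not 0 <= pixel <= 8:
        raise ValueError("pixel values above 8 are not covered by this sketch")
    name = GR10_REGISTERS[pixel]
    return name, name.startswith("PF")
```

So `gr10_decode(5)` reports PF1 with a playfield collision, while `gr10_decode(0)` reports the P0 color with no collision at all.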
Graphics 10 also has another known quirk. Because GTIA has to pair signals from ANTIC in order to collect four bits at a time, it can't compute the pixel until one color clock later than usual. I haven't figured out where in the schematic this occurs, but I'm pretty sure this is the reason for Gr.10 being pushed one color clock to the right. As I'll describe later, though, the other GTIA modes bypass some logic and recover one clock of latency, and thus aren't delayed. Many demos exploit the one cycle shift between these modes and flip between them at high speed in order to increase the apparent resolution.
Graphics 9 (1 color, 16 luminances)
Graphics 9 takes the hue from the background color and impresses one of 16 luminances on top of it, based on the incoming four-bit pixel. By "impress," I mean logical OR, so the background luminance should be 0. This mode is the only mode in which all 16 luminances are available, since the palette registers only have three luminance bits, and the only way in which 256 colors can be displayed. Otherwise, the lowest luminance bit is 0. The luminance travels over a side bus B3-B0 that bypasses the priority logic, which presumably saves one clock of latency and is why Gr.9 doesn't have the one color clock shift that Gr.10 does. When Gr.9 mode is active, the NRM signal in the GTIA goes low, and this causes PF0-PF3 and PF2C to blank out. This means that no playfield collisions ever register when Gr.9 is active.
There is one way in which Gr.9 interacts with the priority logic, though. The priority logic ORs P0-P3 together to form an exclusion signal that kills the L signal that signifies the luminance mode. This typically means that players and missiles are not affected by the merge of B3-B0 in the luma output circuitry. The exception, however, is the fifth player. Because the fifth player is merged into PF3 instead of P0-P3, it activates PF3 ahead of the playfield kill logic and also doesn't activate the L kill, meaning that the playfield luminance is ORed onto the fifth player.
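In code, the basic Gr.9 luminance merge is about as simple as it gets. Here is a Python sketch, with the hypothetical name `gr9_color` and the assumed register layout of hue in bits 7-4, luminance in the low nibble, and bit 0 of the background register not stored.

```python
def gr9_color(colbk, pixel):
    """Combine the background color with a 4-bit Gr.9 luminance.

    The luminance from the B3-B0 side bus is ORed onto the background
    register value, so a nonzero background luminance leaks into
    every pixel.
    """
    return (colbk & 0xFE) | (pixel & 0x0F)
```

For example, `gr9_color(0x90, 0x0F)` gives 0x9F, the brightest shade of hue 9, while a background of 0x94 turns pixel value 2 into 0x96 instead of 0x92 because of the OR.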
Graphics 11 (16 colors, 1 luminance)
Gr.11 takes the luminance from the background color and ORs one of 16 hues on top of it, based on the incoming four-bit pixel. Since the color registers only have seven bits, only eight luma levels are accessible and thus only 128 colors are possible. There is also another difference in that a value of zero also kills the luminance, resulting in black. This allows you to use the luma portion of the background register to set the color brightness without causing the background to be gray or white.
As with Gr.9, the C signal that controls Gr.11 is killed by P0-P3, the mode also affects the fifth player, and there is no one-clock delay like there is for Gr.10.
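The Gr.11 merge can be sketched the same way in Python. As before, `gr11_color` is a made-up name and the byte layout (hue in bits 7-4, luminance in bits 3-1) is an assumption. The zero case is modeled by clearing the luminance bits, which gives black when the background hue is 0.

```python
def gr11_color(colbk, pixel):
    """Combine the background color with a 4-bit Gr.11 hue.

    The hue is ORed onto the background register's hue bits and the
    luminance comes from the background register, except that a pixel
    value of zero also kills the luminance.
    """
    pixel &= 0x0F
    if pixel == 0:
        return colbk & 0xF0  # luminance forced off
    return (colbk & 0xFE) | (pixel << 4)
```

With the background register set to 0x06, pixel value 0 comes out as 0x00 (black) and pixel value 9 comes out as 0x96.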
One of the things that I wish I could tell from the schematic is which circuits use which clocks. This confused me at first when I noticed chains of inverters, until I realized that they were synchronously clocked and thus acting as delays. The delays in the various stages would determine the timing of display changes relative to when registers are changed, which would be useful for tuning the emulator. This is particularly important for the collision registers, which are program visible state.
I didn't bother decoding the player/missile logic. The GTIA seems not prone to the horrid sprite hardware abuse that the Atari 2600 TIA was famous for.
I still haven't been able to figure out where the playfield inputs are doubled up in the Gr.10 case. The regular ANx->PFx decoders are used for the playfield colors, which means that something has to either hold the outputs or the inputs for one cycle. Problem is, the only delay registers I can see are on the side path which only outputs to the ADPx and Bx buses.