Tetris in 8-bit Assembly
UP / DOWN / RIGHT / LEFT - Arrow Keys
START - Enter Key
SELECT - Right Shift
A & B - X & Z keys
Old Computers & Assembly
The best gaming console: Game Boy Color
Computers were cooler back when you could actually understand what they did.
The many layers of abstraction in modern computing have done two things:
1) Immensely decreased the total programming time from concept to application (arguably good).
2) Removed the need for technical and programmatic ingenuity (somewhat).
A quick peek into the assembly code of Pokemon Red reveals a world of hand-crafted optimizations:
Run-length encoding, sprite graphic compression, jump tables, etc.
Compare the 2MB game size of Pokemon Red vs the (gigantic) 14.4GB of Legend of Zelda: Breath of the Wild.
Is this Zelda game truly 7200 times better???
With the availability of inexpensive storage, general software has bloated to an incredible size.
Programmers have become lazy and now routinely include every conceivable package and library.
A journey into 8-bit assembly revives a certain lost art:
Manual pointer management, on-the-fly lossless decompression, hand-crafted math functions, etc.
Game Boy Assembly

The actual assembly function used in Pokemon Red to generate RNG values.
As an entry point into the world of assembly, I highly recommend learning the architecture of the original Game Boy CPU (which is very similar to the Z80).
A few good resources:
Whether or not you'll actually decide to create anything - that's up to you.
However, the knowledge gained from learning CPU architecture is invaluable for anyone working in computing.
It will help answer questions such as...
- Why does JavaScript suck at decimals? (This is well discussed elsewhere)
  (this has to do with decimal encoding into binary)
- Why does an array of strings in C require the strings to have a max length?
  (this has to do with having fixed-length memory offsets between array elements)
- Why are bitwise operations faster than modulo?
  var & 1 vs var % 2
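To make the first two questions concrete, here is a minimal sketch in JavaScript. The first part shows the rounding error that appears when a decimal fraction has no exact binary form. The second part imitates a C-style array of strings with a flat byte array, where every string gets a fixed-size slot so element i always starts at i * SLOT_SIZE. The names SLOT_SIZE, packStrings, and readString are made up for this illustration, and the sketch assumes plain ASCII text.

```javascript
// Part 1: 0.1 and 0.2 cannot be stored exactly in binary floating point,
// so a tiny rounding error shows up in the sum.
const sum = 0.1 + 0.2;
console.log(sum);                 // 0.30000000000000004
console.log(sum === 0.3);         // false
// 0.5 and 0.25 are powers of two, so they are stored exactly.
console.log(0.5 + 0.25 === 0.75); // true

// Part 2: a C-style array of strings laid out in flat memory.
// SLOT_SIZE is a hypothetical max length (including the 0 terminator).
const SLOT_SIZE = 8;

function packStrings(strings) {
  // Uint8Array is zero-filled, so unused bytes act as terminators.
  const memory = new Uint8Array(strings.length * SLOT_SIZE);
  strings.forEach((text, index) => {
    if (text.length >= SLOT_SIZE) {
      throw new Error(`"${text}" does not fit in a ${SLOT_SIZE}-byte slot`);
    }
    for (let j = 0; j < text.length; j++) {
      memory[index * SLOT_SIZE + j] = text.charCodeAt(j);
    }
  });
  return memory;
}

function readString(memory, index) {
  // The fixed slot size is what makes this address calculation possible.
  let address = index * SLOT_SIZE;
  let text = "";
  while (memory[address] !== 0) {
    text += String.fromCharCode(memory[address]);
    address++;
  }
  return text;
}

const packed = packStrings(["LEFT", "RIGHT", "DOWN"]);
console.log(readString(packed, 1)); // RIGHT
```

Without the fixed slot size, finding element i would mean walking through every earlier string first, which is exactly the kind of cost you notice when writing assembly by hand.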
While not strictly necessary, understanding the above will allow you to write cleaner, better-performing code.
(Note: modern compilers handle many of these optimizations for you, e.g. var / 8 becomes var >> 3 for unsigned integers, but the theory behind it remains useful.)
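As a quick sanity check of the bitwise equivalences, here is a small sketch. It only covers non-negative integers and powers of two, which is the case where the shortcuts hold; the function names are made up for the example.

```javascript
// Parity: the lowest bit of a non-negative integer is 1 exactly when it is odd.
function isOddModulo(n) { return n % 2 === 1; }
function isOddBitwise(n) { return (n & 1) === 1; }

// Dividing by 8 (2^3) is a right shift by 3 bits.
function divideBy8(n) { return n >> 3; }

// The remainder after dividing by 8 is just the lowest 3 bits.
function mod8(n) { return n & 7; }

// Compare against the "normal" arithmetic for every 8-bit value.
for (let n = 0; n < 256; n++) {
  console.assert(isOddBitwise(n) === isOddModulo(n));
  console.assert(divideBy8(n) === Math.floor(n / 8));
  console.assert(mod8(n) === n % 8);
}
```

On an 8-bit CPU with no divide instruction, the shift and the mask are single instructions, while a general modulo has to be built out of a loop.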
My Custom CPU: GB++
The GB++ instruction set.
After doing some basic development for the Game Boy Color (Hello World! and the like), I wanted something different.
In programming for the Game Boy Color, you're limited by a number of factors:
- Portability - The only way to run programs is either on an emulator, or using an EverDrive
- Restricted Memory Map - There are thousands of unusable addresses in memory, called "Echo RAM"
- Arbitrary constraints on OAM, the number of sprites, map size, etc.
And thus, as the usual "larger-than-expected-hobby-project", I decided to create a fully-functional CPU emulator based on the Game Boy Color.
I designed the CPU instruction set from scratch, increased the amount of system memory (WRAM, VRAM, and ROM), and implemented various other upgrades.
Best of all, I created the emulator in HTML/CSS/JavaScript, so running ROMs is as easy as opening the browser.
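For a feel of what a CPU emulator in JavaScript boils down to, here is a minimal fetch-decode-execute sketch. It is not the GB++ emulator: the opcodes, the single A register, and the run function are all hypothetical and chosen only to keep the example short.

```javascript
// Hypothetical opcodes for illustration only (not the real GB++ encoding).
const OP_NOP     = 0x00; // do nothing
const OP_LD_A_N  = 0x01; // LD A, n      : load the next byte into A
const OP_ADD_A_N = 0x02; // ADD A, n     : add the next byte to A (wraps at 8 bits)
const OP_LD_NN_A = 0x03; // LD (nn), A   : store A at a 16-bit address (low byte first)
const OP_HALT    = 0xFF; // stop execution

function run(rom) {
  const memory = new Uint8Array(0x10000); // 64 KB address space
  memory.set(rom, 0);                     // program is loaded at address 0
  const cpu = { a: 0, pc: 0, halted: false };

  while (!cpu.halted) {
    const opcode = memory[cpu.pc++];      // fetch
    switch (opcode) {                     // decode + execute
      case OP_NOP:
        break;
      case OP_LD_A_N:
        cpu.a = memory[cpu.pc++];
        break;
      case OP_ADD_A_N:
        cpu.a = (cpu.a + memory[cpu.pc++]) & 0xFF; // keep A within 8 bits
        break;
      case OP_LD_NN_A: {
        const low = memory[cpu.pc++];
        const high = memory[cpu.pc++];
        memory[(high << 8) | low] = cpu.a;
        break;
      }
      case OP_HALT:
        cpu.halted = true;
        break;
      default:
        throw new Error(`Unknown opcode ${opcode} at address ${cpu.pc - 1}`);
    }
  }
  return { cpu, memory };
}

// LD A, 250 / ADD A, 10 / LD (0xC000), A / HALT
const result = run([0x01, 250, 0x02, 10, 0x03, 0x00, 0xC0, 0xFF]);
console.log(result.cpu.a);            // 4 (260 wraps around to 4)
console.log(result.memory[0xC000]);   // 4
```

A real emulator adds more registers, flags, timing, and graphics on top, but the core loop keeps this shape.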
Creating Tetris in Assembly - Setup
I quickly realized that creating a full game in assembly requires much more than a CPU.
You actually need a whole host of other development tools.
The biggest supporting application I needed to create was an "Assembler", which does a bunch of different useful things:
- Converts human-readable assembly code into actual bytes (LD (HL), A into 0x19)
- Converts global variables into values at compile time (system_address_keypad_input into 0xFF00)
- Calculates the address offsets for JR, JP, and CALL instructions
- Encodes text strings into bytes
- And the list goes on...
This actually took me a couple weeks to make, and ended up being around 1500 lines of code.
My GB++ Assembler - in dark mode!
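The jump-offset bullet is the least obvious one, so here is a stripped-down two-pass sketch of the idea. Pass 1 records the address of every label; pass 2 emits bytes and turns each JR target into a signed 8-bit offset. Only the LD (HL), A = 0x19 encoding comes from the list above. The other opcodes, the instruction sizes, and the rule that the offset is measured from the end of the JR instruction are assumptions for the example (the last one is borrowed from the original Game Boy).

```javascript
// Opcode table. Only "LD (HL), A" is taken from the article;
// the NOP and JR encodings are hypothetical.
const OPCODES = {
  "NOP":        { opcode: 0x00, size: 1 },
  "LD (HL), A": { opcode: 0x19, size: 1 },
  "JR":         { opcode: 0x18, size: 2 }, // opcode + signed offset byte
};

function mnemonicOf(line) {
  return line.startsWith("JR ") ? "JR" : line;
}

function assemble(lines) {
  // Pass 1: walk the program and record the address of every label.
  const labels = {};
  let address = 0;
  for (const line of lines) {
    if (line.endsWith(":")) {
      labels[line.slice(0, -1)] = address;
      continue;
    }
    const entry = OPCODES[mnemonicOf(line)];
    if (!entry) throw new Error(`Unknown instruction: ${line}`);
    address += entry.size;
  }

  // Pass 2: emit bytes, resolving label names into relative offsets.
  const bytes = [];
  for (const line of lines) {
    if (line.endsWith(":")) continue;
    if (mnemonicOf(line) === "JR") {
      const name = line.slice(3).trim();
      if (!(name in labels)) throw new Error(`Unknown label: ${name}`);
      // Offset is relative to the address right after this 2-byte instruction.
      const offset = labels[name] - (bytes.length + 2);
      if (offset < -128 || offset > 127) throw new Error(`JR target too far: ${name}`);
      bytes.push(OPCODES.JR.opcode, offset & 0xFF); // two's complement byte
    } else {
      bytes.push(OPCODES[line].opcode);
    }
  }
  return bytes;
}

const program = ["loop:", "NOP", "LD (HL), A", "JR loop"];
console.log(assemble(program)); // [ 0, 25, 24, 252 ]
```

The two passes are needed because a jump can point at a label that is defined later in the file, so addresses have to be known before any offsets are written.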
The next challenge was dealing with graphics.
Game Boy graphics are encoded in a series of 8 pixel by 8 pixel "tiles" (read about this here).
So, I created a drawing tool that lets me "paint" with my cursor and outputs the hexadecimal encoding:
The GB++ Tile Tool (like MS Paint but way worse)
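For reference, here is a sketch of how a tile can be turned into bytes. It assumes the original Game Boy's 2-bits-per-pixel layout: each row of 8 pixels becomes 2 bytes, the first holding the low bit of every pixel and the second holding the high bit, with the leftmost pixel in bit 7. Whether GB++ uses exactly this layout is an assumption, and encodeTile and toHex are names invented for the example.

```javascript
// pixels: 8 rows of 8 color indices, each 0..3.
// Returns 16 bytes (2 per row) in the assumed Game Boy 2bpp layout.
function encodeTile(pixels) {
  const bytes = [];
  for (const row of pixels) {
    let lowBits = 0;
    let highBits = 0;
    for (let x = 0; x < 8; x++) {
      const bit = 7 - x;                       // leftmost pixel goes in bit 7
      lowBits |= (row[x] & 1) << bit;          // low bit of the color index
      highBits |= ((row[x] >> 1) & 1) << bit;  // high bit of the color index
    }
    bytes.push(lowBits, highBits);
  }
  return bytes;
}

// Format bytes as space-separated two-digit hex, ready to paste into source.
function toHex(bytes) {
  return bytes.map(b => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

// A tile made of vertical stripes cycling through all four colors.
const stripes = Array.from({ length: 8 }, () => [0, 1, 2, 3, 0, 1, 2, 3]);
console.log(toHex(encodeTile(stripes)));
// 55 33 55 33 55 33 55 33 55 33 55 33 55 33 55 33
```

At 16 bytes per tile, it is easy to see how graphics add up quickly in a ROM measured in kilobytes.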
There were actually a few other smaller tools I created as well (a Run-Length Encoding tool, a tool for drawing tile maps, etc.), but these were mostly single-day projects.
Finally, with my development suite complete, I could actually begin programming.
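Run-length encoding is simple enough to show in a few lines. This is a generic sketch, not the format my tool (or Pokemon Red) actually uses: it assumes the simplest possible scheme of (count, value) byte pairs, with runs capped at 255 so the count fits in one byte.

```javascript
// Encode an array of bytes as (count, value) pairs.
function rleEncode(bytes) {
  const encoded = [];
  let i = 0;
  while (i < bytes.length) {
    const value = bytes[i];
    let run = 1;
    // Extend the run while the value repeats, up to the 1-byte maximum.
    while (i + run < bytes.length && bytes[i + run] === value && run < 255) {
      run++;
    }
    encoded.push(run, value);
    i += run;
  }
  return encoded;
}

// Expand (count, value) pairs back into the original bytes.
function rleDecode(encoded) {
  const bytes = [];
  for (let i = 0; i + 1 < encoded.length; i += 2) {
    for (let n = 0; n < encoded[i]; n++) {
      bytes.push(encoded[i + 1]);
    }
  }
  return bytes;
}

const raw = [0, 0, 0, 0, 7, 7, 1];
const compressed = rleEncode(raw);
console.log(compressed);            // [ 4, 0, 2, 7, 1, 1 ]
console.log(rleDecode(compressed)); // [ 0, 0, 0, 0, 7, 7, 1 ]
```

This scheme shines on tile maps with long stretches of the same tile (think empty background) and can actually grow data that has no repeats, since every lone byte costs two.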
Creating Tetris in Assembly - Development
My Tetris game Title Screen.
Now the fun could begin.
Armed with my janky home-built development tools, I could dive into coding.
It's funny - you only realize how incredible modern programming languages and compilers are when you try to do anything productive without them.
By default, CPUs (8-bit ones) only know a few basic instructions: Add, Subtract, Conditionals, Bit Operations, etc.
I needed to create my own library of functions:
- Multiplication (used for calculating the Tetris score)
- Modulo (used for generating a random Tetromino)
- Copy strings in memory (think memcpy() in C)
- Access array elements by index
- (many more not listed)
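To give a flavor of the first two routines, here is how multiplication and modulo can be built from only the operations an 8-bit CPU has (add, subtract, shift, compare). This is a JavaScript sketch of the general techniques (shift-and-add and repeated subtraction), not a transcription of my assembly; multiply8 and modulo8 are names invented for the example.

```javascript
// Shift-and-add multiplication of two 8-bit values into a 16-bit result.
// Uses only shifts, adds, and bit tests, like a CPU without a MUL instruction.
function multiply8(a, b) {
  let result = 0;
  let addend = a & 0xFF;
  let multiplier = b & 0xFF;
  while (multiplier !== 0) {
    if (multiplier & 1) {
      result = (result + addend) & 0xFFFF; // add when the current bit is set
    }
    addend = (addend << 1) & 0xFFFF;       // next bit is worth twice as much
    multiplier >>= 1;                      // move on to the next bit
  }
  return result;
}

// Modulo by repeated subtraction: keep subtracting until what's left
// is smaller than the divisor. Slow for big quotients, but tiny in code size.
function modulo8(value, divisor) {
  if (divisor === 0) throw new Error("Division by zero");
  let remainder = value & 0xFF;
  while (remainder >= divisor) {
    remainder -= divisor;
  }
  return remainder;
}

console.log(multiply8(40, 3));  // 120
console.log(modulo8(200, 7));   // 4, e.g. picking one of 7 Tetrominoes
```

The multiply loop runs at most 8 times regardless of the operands, while the modulo loop runs once per subtraction, which is fine when the input is a single byte.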
In all, what was supposed to be a 1-week project ended up taking over a month.
One of the largest issues proved to be debugging.
Here's why:
When you're writing a program in C++ and something isn't working, you can almost certainly assume it's your (the code's) fault.
Why?
You trust that the compiler / linker / assembler are doing their jobs properly.
However, if you're writing programs for your own custom-built CPU with your custom assembler, there are many potential points of failure:
- Option 1: The code has a bug.
- Option 2: The code is fine, but the assembler has a bug.
- Option 3: The assembled code is fine, but the CPU has a bug.
Thankfully, after many cycles of trial and error, the bugs (everywhere!) were eventually ironed out.
There were a few other modes I wanted to add to this game (2 player mode, no line clearing, etc.) but spending a full month on Tetris felt like enough.
When I have more time, I'll post the full assembly code with commentary.
Addendum: the Tetris ROM File
Programs created in assembly are incredibly small; this entire Tetris game is only 7,237 bytes.
For reference, the Wikipedia Logo is almost 16,000 bytes.
Here's a breakdown of code vs. graphics vs. data by the number of bytes:
- Code (all executable code): 1905 bytes, 26%
- Graphics (everything displayed on screen): 3958 bytes, 55%
- Data (gameplay lookup tables & data): 1374 bytes, 19%
I don't know what I expected these numbers to be, but it makes sense that "graphics" occupied more bytes than "code" and "data" combined
(there's a title screen, level select screen, full A-Z0-9 alphabet, etc.).