How to create a Game Boy sound in Python?

So how does a Game Boy console produce sounds?

It's difficult to find easily understandable information on the subject, so I did a little bit of googling. It turns out (according to this link) that a Game Boy originally had

two sound channels connected to the output terminals SO1 and SO2

I'm assuming those two channels are the left and right channels.

Also, a Game Boy could play

Quadrangular wave patterns with sweep and envelope functions, quadrangular wave patterns with envelope functions, voluntary wave pattern, white noise

These four sounds can be controlled independently and then mixed separately for each of the output terminals.

While searching for the term "quadrangular wave pattern", I found this link, and also this one, which goes into more detail about the Game Boy microprocessor.

Long story short, "quadrangular wave patterns" are square waves! So let's get to work and make some sounds with that information.

In this post, we're gonna do the following:

  • first, we investigate the unit sounds the Game Boy is capable of producing
  • we then move on to applying what we have learned to the synthesis of simple melodies, using the well-known "Nokia composer" format as our input

The Game Boy sound

First, let's import the tools we're going to use.

In [1]:
%matplotlib inline
from pylab import *
In [2]:
from IPython.display import Audio # awesome IPython tool to embed audio directly in the notebook
In [3]:
from scipy.signal import square

Basically, a Game Boy sound is a square wave. Let's plot this sort of pattern.

In [4]:
t = arange(0, 0.1, 1/10000.)
In [5]:
plot(t, square(2 * pi * 100 * t))
title("A square wave!")
ylim(-1.1, 1.1)

What does this square wave sound like?

In [6]:
Audio(data=square(2 * pi * 100 * t), rate=10000)  # rate matches the 10 kHz sampling of t

Sounds very much like a Game Boy to me!
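One detail worth keeping in mind when playing arrays back: the rate handed to Audio is the rate at which the samples are played, so it should match the step used to build t. The sketch below is plain arithmetic (the helper name perceived is made up for this illustration) and shows what a listener would get if the two rates differ.

```python
def perceived(freq, sample_rate, playback_rate, n_samples):
    """Return (heard frequency in Hz, heard duration in s) for a sampled tone.

    freq is the frequency used when generating the samples at sample_rate;
    playback_rate is the rate the player actually uses.
    """
    heard_freq = freq * playback_rate / float(sample_rate)
    heard_duration = n_samples / float(playback_rate)
    return heard_freq, heard_duration

# 0.1 s of a 100 Hz tone sampled at 10 kHz is 1000 samples
for playback_rate in (10000, 5000):
    print(playback_rate, perceived(100, 10000, playback_rate, 1000))
```

Playing the same samples at half the sampling rate lowers the tone by an octave and doubles its length.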

Below, we're experimenting with that sound and also with the duty cycle parameter, which is available on the Game Boy according to the documents we found.

In [7]:
from ipywidgets import interact, fixed  # formerly IPython.html.widgets
from IPython.display import display
In [28]:
def play_square_sound(freq, duration, duty_cycle = 0.5, sample_freq=10.e3, plot_signal=False):
    t = arange(0, duration, 1/sample_freq)
    s = square(2 * pi * freq * t, duty=duty_cycle)
    if plot_signal: 
        plot(t, s)
    display(Audio(s, rate=sample_freq))
In [29]:
interact(play_square_sound,
         freq=(10, 1000, 50),
         duration=(0.1, 0.5, 0.1),
         duty_cycle=(0.1, 0.9, 0.01))

Well, this sounds like a typical Game Boy sound to me!
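To make the duty cycle a bit more concrete, here is a minimal sketch that builds the square wave by hand with NumPy and measures the fraction of time the signal spends in its high state. The four duty values are an assumption based on my reading of the hardware documentation (12.5%, 25%, 50% and 75% for the pulse channels); the helper square_wave is hypothetical and only mimics what scipy.signal.square does.

```python
import numpy as np

def square_wave(freq, duration, duty_cycle=0.5, sample_freq=8000.0):
    """Square wave in {-1, +1}: high for the first duty_cycle fraction of each period."""
    t = np.arange(0, duration, 1 / sample_freq)
    phase = (freq * t) % 1.0  # position inside the current period, in [0, 1)
    return np.where(phase < duty_cycle, 1.0, -1.0)

# assumed Game Boy duty settings (not taken from the text above)
GB_DUTY_CYCLES = (0.125, 0.25, 0.5, 0.75)

high_fraction = {}
for duty in GB_DUTY_CYCLES:
    s = square_wave(100, 0.5, duty_cycle=duty)
    high_fraction[duty] = float(np.mean(s > 0))
    print("duty = {0}: high {1:.3f} of the time".format(duty, high_fraction[duty]))
```

The measured fraction should land close to the requested duty cycle, up to one sample per period of rounding.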

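We could improve this sound by adding an envelope effect. The type of envelope I'm using below is an exponential fade-in. Note that, despite its name, the envelope_duration parameter acts as a rate: the larger it is, the faster the sound reaches full volume.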

In [10]:
def play_square_sound_with_envelope(freq, duration, duty_cycle = 0.5, sample_freq=10.e3, envelope_duration=0.1):
    t = arange(0, duration, 1/sample_freq)
    s = square(2 * pi * freq * t, duty=duty_cycle) * (1 - exp(-t * envelope_duration))
    plot(t, s)
    display(Audio(s, rate=sample_freq))

interact(play_square_sound_with_envelope,
         freq=(10, 1000, 50),
         duration=(0.1, 0.5, 0.1),
         duty_cycle=(0.1, 0.9, 0.01),
         envelope_duration=(0.1, 100, 0.1))
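The exponential fade-in above is my own choice. As far as I understand the hardware documentation, the real envelope is coarser: the volume is a 4-bit value (16 levels) that moves up or down by one level at a fixed step period, a multiple of 1/64 s. The sketch below is only an approximation under those assumptions, and the function name gb_envelope and its parameters are invented for illustration.

```python
import numpy as np

def gb_envelope(t, initial_volume=15, step_time=1 / 64.0, increasing=False):
    """Stepped volume envelope with 16 levels, returned as a gain in [0, 1].

    step_time is the time between two volume steps (assumed to be n/64 s).
    """
    steps = np.floor(t / step_time).astype(int)  # number of steps elapsed
    if increasing:
        volume = initial_volume + steps
    else:
        volume = initial_volume - steps
    return np.clip(volume, 0, 15) / 15.0

sample_freq = 10.0e3
t = np.arange(0, 0.5, 1 / sample_freq)
env = gb_envelope(t)  # decays from full volume to silence in 15/64 s
print(env[0], env[-1], len(np.unique(env)))
```

Multiplying a square wave by env gives a note that fades out in audible steps instead of smoothly.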

Now, let's move on to playing some real sounds!

Playing melodies with the Game Boy

Now that we have explored the way the Game Boy generates sounds, we can apply our knowledge to the generation of simple melodies. Here, my test melody will be the classic Tetris melody. A simple way to get this melody in a code-friendly way is to download it from one of the innumerable "Nokia composer sites", whose format gives us the two things we need for synthesis: duration and pitch. A nice specification for the language can be found here.

The melody I'll use below comes from here.

In [11]:
tetris = "e6,8b,8c6,8d6,16e6,16d6,8c6,8b,a,8a,8c6,e6,8d6,8c6,b,8b,8c6,d6,e6,c6,a,2a,8p,d6,8f6,a6,8g6,8f6,e6,8e6,8c6,e6,8d6,8c6,b,8b,8c6,d6,e6,c6,a,a"

Parsing the melody string

To show you how we play this melody, I'll test the "decoding process" (from text to sound) in the lines below.

The first step is to isolate a note from the melody above. We do this by splitting at the commas:

In [12]:
note = tetris.split(",")[2]

We then construct two regular expressions that allow us to extract the duration (expressed as quarter, eighth, 16th notes, etc.) and the pitch (expressed using standard musical notation like "a", "b", "c", etc.) from the note.

In [13]:
import re
duration = re.compile("^[0-9]+")
pitch = re.compile(r"[\D]+[\d]*")  # raw string, so \D and \d reach the regex engine untouched

We test these regular expressions on our notes:

In [14]:
print(duration.findall(note))
print(pitch.findall(note))
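To see how the two patterns behave on the different shapes a note can take (no duration, a pause, a dotted note), here is a small self-contained check. The sample notes are taken from the melodies used in this post.

```python
import re

duration = re.compile(r"^[0-9]+")   # leading digits: the note length
pitch = re.compile(r"[\D]+[\d]*")   # non-digits, then an optional octave digit

samples = ["8c6", "e6", "8p", "8d#.6", "a#."]
matches = {}
for note in samples:
    matches[note] = (duration.findall(note), pitch.findall(note))
    print(note, matches[note])
```

A note without a leading number yields an empty duration match, and a dotted note keeps its "." inside the pitch match, so both cases need special handling later on.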

From this, we can recover the duration and the frequency, and build a sound signal. Here, our duration is '8', which corresponds to an eighth note, and the frequency should correspond to a 'c' in the 6th octave (1046.50 Hz, according to Wikipedia).

In [15]:
t_max = duration.findall(note)
t_max = 1/float(t_max[0])

This fraction needs to be multiplied by the duration of one measure, which is arbitrary: it is set by the tempo at which you play the piece of music. Next, we extract the pitch name and the octave of the note we want to play:

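In [17]:
if pitch.findall(note)[0][-1] in ["1", "2", "3", "4", "5", "6", "7"]:  # octave is given
    octave = ["1", "2", "3", "4", "5", "6", "7"].index(pitch.findall(note)[0][-1]) + 1
    height = pitch.findall(note)[0][:-1]
else:  # octave is not given: fall back to a default
    height = pitch.findall(note)[0]
    octave = 4
print("height= {0}, octave= {1}".format(height, octave))
height= c, octave= 6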

Finally, we can calculate the frequency and then generate the sound:

In [18]:
freq = 261.626 * 2 ** (["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"].index(height) / 12. + octave - 4)  # semitones above C4 (261.626 Hz)
In [30]:
play_square_sound(freq, 4 * t_max)  # 4 s per measure, i.e. 60 bpm in 4/4
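The frequency calculation is the usual equal-temperament relation: each semitone multiplies the frequency by 2 ** (1/12) and each octave doubles it. A minimal sketch of this as a reusable helper follows. The name note_frequency is made up here, and the reference is C4 at about 261.626 Hz, the same constant that play_melody uses further down.

```python
NOTE_NAMES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]
C4_HZ = 261.626  # reference pitch: middle C

def note_frequency(height, octave):
    """Frequency in Hz of a note name ('c', 'c#', ...) in the given octave."""
    semitone = NOTE_NAMES.index(height)  # semitones above the C of that octave
    return C4_HZ * 2 ** (semitone / 12.0 + octave - 4)

for height, octave in [("c", 6), ("a", 4), ("a", 5)]:
    print(height, octave, round(note_frequency(height, octave), 2))
```

Counting semitones from C keeps the octave number changing at C, so "a" in octave 4 comes out at about 440 Hz and "c" in octave 6 at about 1046.5 Hz.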

A function that parses the melody and generates a sound

If we apply the knowledge we gained here to a complete melody, it looks like this:

In [20]:
def play_melody(melody, sample_freq=10.e3, bpm=50):
    duration = re.compile(r"^[0-9]+")
    pitch = re.compile(r"[\D]+[\d]*")
    measure_duration = 4 * 60. / bpm  # usually it's 4/4 measures
    output = zeros((0,))
    for note in melody.split(','):
        # regexp matching
        duration_match = duration.findall(note)
        pitch_match = pitch.findall(note)
        # duration
        if len(duration_match) == 0:  # no duration given: default to a quarter note
            t_max = 1/4.
        else:
            t_max = 1/float(duration_match[0])
        if "." in pitch_match[0]:  # dotted note: 1.5 times as long
            t_max *= 1.5
            pitch_match[0] = "".join(pitch_match[0].split("."))
        t_max = t_max * measure_duration
        # pitch
        if pitch_match[0] == 'p':  # pause: square(0) is constant, so nothing is heard
            freq = 0
        else:
            if pitch_match[0][-1] in ["4", "5", "6", "7"]:  # octave is known
                octave = ["4", "5", "6", "7"].index(pitch_match[0][-1]) + 4
                height = pitch_match[0][:-1]
            else:  # octave is not known
                octave = 5
                height = pitch_match[0]
            freq = 261.626 * 2 ** ((["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"].index(height) / 12. + octave - 4))
        # generate sound
        t = arange(0, t_max, 1/sample_freq)
        wave = square(2 * pi * freq * t)
        # append to output
        output = hstack((output, wave))
    display(Audio(output, rate=sample_freq))
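The parsing part of this function can also be looked at on its own. The sketch below pulls it out into a hypothetical helper, parse_note, which returns the fraction of a measure, the pitch name and the octave. It follows the same rules as the loop above (default quarter note, default octave 5, a dot multiplies the length by 1.5), but it is an illustration and not a full parser for the Nokia composer format.

```python
import re

DURATION_RE = re.compile(r"^[0-9]+")
PITCH_RE = re.compile(r"[\D]+[\d]*")

def parse_note(note, default_denominator=4, default_octave=5):
    """Return (fraction_of_measure, height, octave); a pause gives ('p', None)."""
    duration_match = DURATION_RE.findall(note)
    token = PITCH_RE.findall(note)[0]
    if duration_match:
        fraction = 1.0 / float(duration_match[0])
    else:
        fraction = 1.0 / default_denominator
    if "." in token:  # dotted note
        fraction *= 1.5
        token = token.replace(".", "")
    if token == "p":  # pause: no pitch, no octave
        return fraction, "p", None
    if token[-1].isdigit():  # explicit octave
        return fraction, token[:-1], int(token[-1])
    return fraction, token, default_octave

for note in ["8c6", "e6", "8p", "8d#.6", "a#."]:
    print(note, parse_note(note))
```

Separating parsing from synthesis makes each half easy to check without having to listen to the result.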

So let's try this on our Tetris melody!

In [21]:
play_melody(tetris, sample_freq=20.e3, bpm=160)
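To get a feel for the note lengths this call produces, here is the duration arithmetic on its own: one 4/4 measure lasts 4 * 60 / bpm seconds, and a note written as "8" takes one eighth of that. The helper note_seconds is hypothetical and only restates the formula used inside play_melody.

```python
def note_seconds(denominator, bpm, dotted=False):
    """Length in seconds of a 1/denominator note at the given tempo (4/4 measures)."""
    measure_duration = 4 * 60.0 / bpm
    seconds = measure_duration / denominator
    if dotted:  # a dot adds half of the note's length
        seconds *= 1.5
    return seconds

for denominator in (4, 8, 16):
    print("1/{0} note at 160 bpm: {1} s".format(denominator, note_seconds(denominator, 160)))
```

At 160 bpm a measure lasts 1.5 s, so a quarter note takes 0.375 s and an eighth note half of that.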

Well, this sounds cool! Below, I've pasted a couple of other melodies for you to enjoy:

In [22]:
play_melody("d6,32p,c.6,32p,8a,8c6,8a#,16a,16g,f,c,8a,8c6,8g,8a,f,c,d,8d,8e,8g,8f,8e,8d,c,c,c", bpm=140)
In [23]:
play_melody("d#6,b,c#6,a#,16b,16g#,16a#,16b,16b,16g#,16a#,16b,c#6,g,d#6,16p,16g#,16a#,16b,c#6,16p,16b,16a#,g#,g,g#,16f,16g,16g#,16a#,8d#.6,32d#6,32p,32d#6,32p,32d#6,32p,16d6,16d#6,8f.6,16d6,8a#,8p,8f#6,8d#6,8f#,8g#,a#.,16p,16a#,8d#.6,16f6,16f#6,16f6,16d#6,16a#,8g#.,16b,8d#6,16f6,16d#6,8a#.,16b,16a#,16g#,16f,16f#,d#", bpm=63)