Understanding the Karplus-Strong Algorithm with Python (Synthetic Guitar Sounds Included)
This post was entirely written using the IPython notebook. Its content is BSD-licensed. You can see a static view or download this notebook with the help of nbviewer at 20170419_KarplusStrongAlgorithm.ipynb.
In this blog post, we will implement the Karplus-Strong algorithm for guitar sound synthesis.
What is the Karplus-Strong algorithm?
The Karplus-Strong algorithm, named after its two creators, was originally published in 1983 in the following paper:
Karplus, Kevin, and Alex Strong. “Digital Synthesis of Plucked-String and Drum Timbres.” Computer Music Journal 7, no. 2 (1983): 43–55.
As the paper states, it is a simplified digital instrument that allows one to control pitch, amplitude and decay time. What is so interesting about the algorithm is that it yields realistic sounds even though it is very simple.
How does the algorithm work?
To answer that question, we will first briefly go over how wavetable synthesis works.
Wavetable synthesis
Wavetable synthesis was invented in the 1970s. It is based on the following idea: suppose you have an array of points describing a wave.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
t = np.linspace(0, 1, num=100)
wavetable = np.sin(2 * np.pi * t)  # one period of a sine wave
plt.plot(t, wavetable, '-o')
What happens if we imagine a pointer going through our wavetable at different speeds and picking the closest point it finds to generate a new waveform with it?
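Before animating this, let's isolate the core operation: given a pointer position (wrapped back into the interval [0, 1]), pick the nearest wavetable point. The helper below is my own illustration, not code from the original post:

def nearest_sample(wavetable, t, position):
    """Return the wavetable value closest to a pointer position, wrapping at 1."""
    index = np.argmin(np.abs(t - (position % 1)))
    return wavetable[index]

The animation below applies exactly this lookup to two pointers moving at different speeds.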
from moviepy.editor import VideoClip
from moviepy.video.io.bindings import mplfig_to_npimage
duration = 2
pointer1 = 1
pointer2 = 2
points1 = []
points2 = []
fig, axes = plt.subplots(2, 1)
def make_frame(time):
    # Top panel: the wavetable with the two moving pointers.
    ax = axes[0]
    ax.clear()
    ax.plot(t, wavetable)
    ax.vlines(pointer1 * time / duration, -1, 1, color='red', label='slow pointer')
    ax.vlines(pointer2 * time / duration % 1, -1, 1, color='blue', label='fast pointer')
    for pointer, color, points in zip([pointer1, pointer2], ['red', 'blue'], [points1, points2]):
        # Pick the wavetable point closest to the (wrapped) pointer position.
        arg = np.argmin(np.abs(t - (pointer * time / duration % 1)))
        ax.plot(t[arg], wavetable[arg], 'o', color=color)
        points.append(wavetable[arg])
    ax.set_xlim(0, 1)
    ax.legend(loc='lower left')
    # Bottom panel: the waveforms generated by each pointer so far.
    ax2 = axes[1]
    ax2.clear()
    ax2.plot(points1, '-or')
    ax2.plot(points2, '-ob')
    ax2.set_xlim(0, 41)
    ax2.set_ylim(-1, 1)
    return mplfig_to_npimage(fig)
animation = VideoClip(make_frame, duration=duration)
plt.close(fig)
animation.ipython_display(fps=20, loop=True, autoplay=True)
As we can see from the above animation, the fast pointer goes through the waveform twice as fast as the slow pointer. However, when it reaches the end of the wave, it simply starts over at the beginning. Using this method, we can easily derive several new waveforms from the original wavetable: we loop over the table and wrap around to the beginning whenever we reach the end. Fun fact: this is how a Game Boy generates sounds.
Let's write a function that implements the wavetable synthesis principle.
def synthesize(sampling_speed, wavetable, n_samples):
    """Synthesizes a new waveform from an existing wavetable."""
    samples = []
    current_sample = 0
    while len(samples) < n_samples:
        # Advance the pointer and wrap around at the end of the table.
        current_sample += sampling_speed
        current_sample = current_sample % wavetable.size
        samples.append(wavetable[current_sample])
    return np.array(samples)
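As a side note, the same resampling can be written without an explicit Python loop. Here is a vectorized sketch of my own (not from the original post) that produces the same output for integer sampling speeds:

def synthesize_vectorized(sampling_speed, wavetable, n_samples):
    """Vectorized equivalent of synthesize: compute all pointer positions at once."""
    indices = (np.arange(1, n_samples + 1) * sampling_speed) % wavetable.size
    return wavetable[indices]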
Let's see what the outputs look like.
sample1 = synthesize(1, wavetable, 300)
sample2 = synthesize(2, wavetable, 300)
plt.plot(sample1)
plt.plot(sample2)
plt.xlabel('sample number')
What we see is that the two waveforms look quite similar, except that the second one has twice the frequency of the first: the fast pointer completes two periods in the time the slow pointer completes one.
Let's generate some sounds at interesting frequencies. We first generate a new wavetable designed for a sampling rate of $f_s$ = 8000 Hz: if the table holds one period in $f_s$ samples, then stepping through it $k$ samples at a time completes $k$ loops per second, i.e. a frequency of $k$ Hz.
fs = 8000
t = np.linspace(0, 1, num=fs)
wavetable = np.sin(2 * np.pi * t)
plt.plot(t, wavetable, '-o')
Now, let's generate a sinusoid at 220 Hz and one at 440 Hz.
sample1 = synthesize(220, wavetable, 2 * fs)
sample2 = synthesize(440, wavetable, 2 * fs)
We can listen to them using the IPython rich output:
from IPython.display import Audio
Audio(sample1, rate=fs)
Audio(sample2, rate=fs)
Indeed, the sampling scheme works: sample2 has a higher frequency than sample1 (one octave up, since 440 Hz = 2 × 220 Hz).
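Besides listening, we can check the pitch quantitatively by locating the peak of the magnitude spectrum. The following helper is my own addition, not part of the original post; it only uses numpy:

def dominant_frequency(samples, fs):
    """Estimate the fundamental as the peak of the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1 / fs)
    return freqs[np.argmax(spectrum)]

print(dominant_frequency(sample1, fs))  # should print a value close to 220.0
print(dominant_frequency(sample2, fs))  # should print a value close to 440.0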
One of the strengths of this type of wavetable synthesis is that you can easily move to more complex sounds, with harmonics, while still using the same model. For instance, we can use a triangle shape:
wavetable = t * (t < 0.5) + (-(t - 1)) * (t >= 0.5)
plt.plot(t, wavetable, '-o')
Again, let's sample two sounds:
sample1 = synthesize(220, wavetable, 2 * fs)
sample2 = synthesize(440, wavetable, 2 * fs)
Audio(sample1, rate=fs)
Audio(sample2, rate=fs)
Or a more complex, sine-based wavetable:
def make_sine_wavetable(n_samples, amps, phases, freqs):
    """Makes a wavetable from a sum of sines."""
    t = np.linspace(0, 1, num=n_samples)
    wavetable = np.zeros_like(t)
    for amp, phase, freq in zip(amps, phases, freqs):
        # Each component contributes a sine plus a weaker sine one octave above.
        wavetable += amp * np.sin(2 * np.pi * freq * t + phase) + \
                     amp / 2 * np.sin(2 * np.pi * 2 * freq * t + phase)
    return wavetable
wavetable = make_sine_wavetable(t.size, [0.1, 0.5, 0.8, 0.3],
                                [0, 0.3, 0.4, 0.7],
                                [1, 2.1, 3, 4.3])
plt.plot(t, wavetable, '-o')
sample1 = synthesize(220, wavetable, 2 * fs)
sample2 = synthesize(440, wavetable, 2 * fs)
Audio(sample1, rate=fs)
Audio(sample2, rate=fs)
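Finally, if you want to keep any of these sounds outside the notebook, you can write them to a WAV file. Here is a minimal sketch using scipy's wavfile writer (my addition, not from the original post; the file names are arbitrary):

from scipy.io import wavfile

def save_wav(filename, samples, rate):
    """Scale to 16-bit PCM and write a WAV file."""
    scaled = np.int16(samples / np.abs(samples).max() * 32767)
    wavfile.write(filename, rate, scaled)

save_wav('wavetable_220.wav', sample1, fs)
save_wav('wavetable_440.wav', sample2, fs)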