Building a PyQt application to record your microphone and plot its spectrum
I've always been interested in audio processing. At the recent SciPy 2015 conference, two talks involving "live" spectrograms were part of the highlights (the first concerning Bokeh, a great Python graphics backend for your browser, and the second by the VisPy team, an impressive OpenGL Python rendering toolkit for science). This piqued my interest and I decided to devote a little time to writing a simple app for live audio processing. I first tried to start from the VisPy example, but as it is part of a pull request that's not yet merged into the main branch, I decided to start from scratch, reusing some of the code provided by the VisPy example and combining it with my existing knowledge of matplotlib and PyQt.
The app I want to code should do the following things:
- record a sound from my microphone continuously
- plot the waveform and the spectrum (frequency representation) of the current waveform
The final result can be seen on YouTube:
from IPython.display import YouTubeVideo
YouTubeVideo('1XxR9U_aUog')
This blog post is organized as follows:
- we'll first look at the Microphone class that records sound and explain its behaviour
- in a second step, we'll design a Qt application that plots the data from the microphone
Continuously recording sound from the microphone using PyAudio
It is actually a little bit tricky to record sound from the microphone with Python. In particular, I first tried doing this in pure PyQt, but failed. It seems to be possible, but since I couldn't find working example code for it on the internet, I abandoned that approach.
Inspired by the VisPy talk, I decided to reuse their implementation of a continuously recording microphone using PyAudio. The PyAudio bindings are great because they let you use most devices on most platforms. On Mac, I haven't been able to install them under Python 3, but I managed to find a conda package compatible with Python 2.7. So the following code needs to run under Python 2.7 on my machine.
# class taken from the SciPy 2015 Vispy talk opening example
# see https://github.com/vispy/vispy/pull/928
import pyaudio
import threading
import atexit
import numpy as np

class MicrophoneRecorder(object):
    def __init__(self, rate=4000, chunksize=1024):
        self.rate = rate
        self.chunksize = chunksize
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=pyaudio.paInt16,
                                  channels=1,
                                  rate=self.rate,
                                  input=True,
                                  frames_per_buffer=self.chunksize,
                                  stream_callback=self.new_frame)
        self.lock = threading.Lock()
        self.stop = False
        self.frames = []
        atexit.register(self.close)

    def new_frame(self, data, frame_count, time_info, status):
        # called by PyAudio from a separate thread for every new chunk
        data = np.fromstring(data, 'int16')
        with self.lock:
            self.frames.append(data)
            if self.stop:
                return None, pyaudio.paComplete
        return None, pyaudio.paContinue

    def get_frames(self):
        # hands over the accumulated chunks and empties the internal buffer
        with self.lock:
            frames = self.frames
            self.frames = []
        return frames

    def start(self):
        self.stream.start_stream()

    def close(self):
        with self.lock:
            self.stop = True
            self.stream.close()
            self.p.terminate()
This class works as a microphone recorder by wrapping functions provided by PyAudio (as shown in the PyAudio examples). On instantiation, it creates a PyAudio object that uses a callback to feed an internal buffer, self.frames, with audio chunks that always have the same length (as defined by chunksize). The internal buffer can then be accessed (and emptied) using MicrophoneRecorder.get_frames.
One of the tricky things about this class is that the recording actually runs in a separate thread. So to make sure it stops correctly, a cleanup call to close is registered with atexit at instantiation. Unfortunately, this also makes the class unrunnable in the IPython notebook (so I won't instantiate it here).
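To make the intended usage concrete, here is a minimal sketch of my own (to be run as a standalone script, not in the notebook; the half-second pause is an arbitrary choice that lets a couple of chunks accumulate):
import time

mic = MicrophoneRecorder()
mic.start()
time.sleep(0.5)  # let the callback thread accumulate a few chunks
frames = mic.get_frames()  # returns the buffered chunks and empties the buffer
print(len(frames))      # number of chunks recorded so far
print(frames[0].shape)  # each chunk is an int16 array of length chunksize
# no explicit close() needed: the atexit hook takes care of cleanup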
Once we have a mechanism for getting the audio, we can design the graphical part of the app: a GUI written using PyQt.
The audio GUI written using PyQt
The audio GUI's function is easy to describe: it should fetch the accumulated output from the microphone recorder at regular intervals and plot the corresponding signal. The plotting should comprise both a time series and a frequency spectrum computed with numpy.fft.
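As a quick standalone illustration (not part of the app) of what numpy.fft provides here, the following sketch feeds a synthetic 440 Hz tone, standing in for a microphone chunk, through the same rfft/rfftfreq calls the app will use, with the recorder's default sampling parameters:
import numpy as np

rate, chunksize = 4000, 1024  # same defaults as the recorder
t = np.arange(chunksize) / float(rate)
chunk = (32767 * np.sin(2 * np.pi * 440. * t)).astype(np.int16)

spectrum = np.abs(np.fft.rfft(chunk))          # modulus of the FFT
freqs = np.fft.rfftfreq(chunksize, 1. / rate)  # matching frequency axis
print(freqs[np.argmax(spectrum)])              # close to 440 Hz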
For the plotting, I used matplotlib. Although it is in general not suited for real-time graphical interfaces, it works in this case: we don't redraw the whole figure at each call to the plotting routine, but only update lines that have already been instantiated, so the updating is quick enough for our purposes.
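Here is a minimal standalone sketch of that line-update pattern (with an interactive backend you would see the sine wave move):
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = np.linspace(0, 1, 100)
line, = ax.plot(x, np.zeros_like(x))  # the line is instantiated only once
ax.set_ylim(-1, 1)

for phase in np.linspace(0, 2 * np.pi, 10):
    line.set_data(x, np.sin(2 * np.pi * x + phase))  # only update the line's data
    fig.canvas.draw()  # much cheaper than rebuilding the whole figure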
Interfacing a matplotlib plot with PyQt is done using a specific class, whose code is below:
from matplotlib.backends.backend_qt4agg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.backends.backend_qt4agg import NavigationToolbar2QT as NavigationToolbar
import matplotlib.pyplot as plt

class MplFigure(object):
    def __init__(self, parent):
        self.figure = plt.figure(facecolor='white')
        self.canvas = FigureCanvas(self.figure)
        self.toolbar = NavigationToolbar(self.canvas, parent)
This just creates an object that has a handle to a matplotlib canvas, which will be used to plot our stuff.
The previous object is incorporated into a pretty classic user interface with a checkbox and a slider.
- the checkbox allows for automatic scaling of the spectrum, which is useful if you're recording sounds at different amplitudes and are not sure of the output level beforehand
- on the other hand, if you uncheck the box, you can control the normalization of the spectrum using a fixed gain provided by the slider
import numpy as np
import matplotlib.pyplot as plt
from PyQt4 import QtGui, QtCore

class LiveFFTWidget(QtGui.QWidget):
    def __init__(self):
        QtGui.QWidget.__init__(self)

        # customize the UI
        self.initUI()

        # init class data
        self.initData()

        # connect slots
        self.connectSlots()

        # init MPL widget
        self.initMplWidget()

    def initUI(self):
        hbox_gain = QtGui.QHBoxLayout()
        autoGain = QtGui.QLabel('Auto gain for frequency spectrum')
        autoGainCheckBox = QtGui.QCheckBox(checked=True)
        hbox_gain.addWidget(autoGain)
        hbox_gain.addWidget(autoGainCheckBox)

        # reference to checkbox
        self.autoGainCheckBox = autoGainCheckBox

        hbox_fixedGain = QtGui.QHBoxLayout()
        fixedGain = QtGui.QLabel('Manual gain level for frequency spectrum')
        fixedGainSlider = QtGui.QSlider(QtCore.Qt.Horizontal)
        hbox_fixedGain.addWidget(fixedGain)
        hbox_fixedGain.addWidget(fixedGainSlider)

        self.fixedGainSlider = fixedGainSlider

        vbox = QtGui.QVBoxLayout()
        vbox.addLayout(hbox_gain)
        vbox.addLayout(hbox_fixedGain)

        # mpl figure
        self.main_figure = MplFigure(self)
        vbox.addWidget(self.main_figure.toolbar)
        vbox.addWidget(self.main_figure.canvas)

        self.setLayout(vbox)

        self.setGeometry(300, 300, 350, 300)
        self.setWindowTitle('LiveFFT')
        self.show()

        # timer for callbacks, taken from:
        # http://ralsina.me/weblog/posts/BB974.html
        timer = QtCore.QTimer()
        timer.timeout.connect(self.handleNewData)
        timer.start(100)
        # keep reference to timer
        self.timer = timer

    def initData(self):
        mic = MicrophoneRecorder()
        mic.start()

        # keeps reference to mic
        self.mic = mic

        # computes the parameters that will be used during plotting
        self.freq_vect = np.fft.rfftfreq(mic.chunksize,
                                         1./mic.rate)
        self.time_vect = np.arange(mic.chunksize, dtype=np.float32) / mic.rate * 1000

    def connectSlots(self):
        pass

    def initMplWidget(self):
        """creates initial matplotlib plots in the main window and keeps
        references for further use"""
        # top plot
        self.ax_top = self.main_figure.figure.add_subplot(211)
        self.ax_top.set_ylim(-32768, 32768)
        self.ax_top.set_xlim(0, self.time_vect.max())
        self.ax_top.set_xlabel(u'time (ms)', fontsize=6)

        # bottom plot
        self.ax_bottom = self.main_figure.figure.add_subplot(212)
        self.ax_bottom.set_ylim(0, 1)
        self.ax_bottom.set_xlim(0, self.freq_vect.max())
        self.ax_bottom.set_xlabel(u'frequency (Hz)', fontsize=6)

        # line objects
        self.line_top, = self.ax_top.plot(self.time_vect,
                                          np.ones_like(self.time_vect))
        self.line_bottom, = self.ax_bottom.plot(self.freq_vect,
                                                np.ones_like(self.freq_vect))

    def handleNewData(self):
        """handles the asynchronously collected sound chunks"""
        # gets the latest frames
        frames = self.mic.get_frames()

        if len(frames) > 0:
            # keeps only the last frame
            current_frame = frames[-1]
            # plots the time signal
            self.line_top.set_data(self.time_vect, current_frame)
            # computes and plots the fft signal
            fft_frame = np.fft.rfft(current_frame)
            if self.autoGainCheckBox.checkState() == QtCore.Qt.Checked:
                fft_frame /= np.abs(fft_frame).max()
            else:
                fft_frame *= (1 + self.fixedGainSlider.value()) / 5000000.
                #print(np.abs(fft_frame).max())
            self.line_bottom.set_data(self.freq_vect, np.abs(fft_frame))

            # refreshes the plots
            self.main_figure.canvas.draw()
The refreshing part of the app is handled with a QTimer that fires 10 times a second and refreshes the GUI by calling the handleNewData method. That method gets the latest frame from the microphone, plots the time series, computes the Fourier transform, and plots its modulus. Note that with the default settings (rate=4000, chunksize=1024) a chunk spans 256 ms, so the timer will often fire before a new chunk is available (hence the length check) and will sometimes find several accumulated frames, of which only the most recent is kept.
The whole app can be run using a standard PyQt entry point, along the lines of this minimal sketch:
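import sys

# standard PyQt4 boilerplate: create the application, instantiate the
# widget (which shows itself in __init__) and start the event loop
app = QtGui.QApplication(sys.argv)
window = LiveFFTWidget()
sys.exit(app.exec_())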
Demo time
I made a little video of the app working live, available on YouTube:
YouTubeVideo('1XxR9U_aUog')
Conclusions
In this post, we learned how to code a little app for live sound recording using PyAudio, doing a little audio processing with numpy, and displaying the data using matplotlib embedded in the PyQt framework.