Building a PyQt application to record your microphone and plot its spectrum

I've always been interested in audio processing. At the recent SciPy 2015 conference, two talks involving "live" spectrograms were part of the highlights (the first concerning Bokeh, a great Python graphics backend for your browser, and the second by the VisPy team, an impressive OpenGL Python rendering toolkit for science). This piqued my interest and I've decided to devote a little time to writing a simple app for live audio processing. I first tried to start with the example from VisPy, but as it is part of a pull request that hasn't yet been merged into the main branch, I decided to start from scratch, reusing some of the code provided by the VisPy example and combining it with my existing knowledge of matplotlib and PyQt.

The app I want to code should do the following things:

  • record sound from my microphone continuously
  • plot the current waveform and its spectrum (frequency representation)

The final result can be seen on YouTube:

In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo('1XxR9U_aUog')
Out[1]:

This blog post is organized as follows:

  • we'll first look at the MicrophoneRecorder class that records sound and explain its behaviour
  • in a second step, we'll design a Qt application that plots the data from the microphone

Continuously recording sound from the microphone using PyAudio

It is actually a little bit tricky to record sound from the microphone with Python. In particular, I first tried doing this in pure PyQt, but failed. It seems to be possible, but as I did not manage to find working code for it on the internet, I abandoned that approach.

Inspired by the VisPy talk, I decided to take over their implementation of a continuously recording microphone using PyAudio. The PyAudio bindings are great because they let you use most audio devices on most platforms. On Mac, I haven't been able to install them for Python 3, but I managed to find a conda package compatible with Python 2.7. So the following code needs to run under Python 2.7 on my machine.

In [2]:
# class taken from the SciPy 2015 Vispy talk opening example 
# see https://github.com/vispy/vispy/pull/928
import pyaudio
import threading
import atexit
import numpy as np

class MicrophoneRecorder(object):
    def __init__(self, rate=4000, chunksize=1024):
        self.rate = rate
        self.chunksize = chunksize
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=pyaudio.paInt16,
                                  channels=1,
                                  rate=self.rate,
                                  input=True,
                                  frames_per_buffer=self.chunksize,
                                  stream_callback=self.new_frame)
        self.lock = threading.Lock()
        self.stop = False
        self.frames = []
        atexit.register(self.close)

    def new_frame(self, data, frame_count, time_info, status):
        # callback run by PyAudio (in its own thread) each time a new
        # chunk of chunksize samples is available
        data = np.fromstring(data, 'int16')
        with self.lock:
            self.frames.append(data)
            if self.stop:
                return None, pyaudio.paComplete
        return None, pyaudio.paContinue

    def get_frames(self):
        # returns the accumulated chunks and empties the internal buffer
        with self.lock:
            frames = self.frames
            self.frames = []
            return frames

    def start(self):
        self.stream.start_stream()

    def close(self):
        # tells the callback to stop, then releases the PyAudio resources
        with self.lock:
            self.stop = True
        self.stream.close()
        self.p.terminate()

This class works as a microphone recorder by wrapping functions provided by PyAudio (as shown in the PyAudio examples). On instantiation it creates a PyAudio object that uses a callback to feed an internal buffer, self.frames, with audio chunks that always have the same length (as defined by chunksize).

The internal buffer can then be accessed (and emptied) using MicrophoneRecorder.get_frames.

One of the tricky things about this class is that the PyAudio callback actually runs in a separate thread. So to make sure the recording stops correctly, the close method is registered with atexit at instantiation. Unfortunately, this also makes the class unrunnable in the IPython notebook (so I won't instantiate it here).
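
As an aside, here is a minimal sketch of how the class can be polled from a plain Python script (this is my own illustration, not part of the VisPy example); note that we let atexit take care of the cleanup:

import time

if __name__ == '__main__':
    mic = MicrophoneRecorder()
    mic.start()
    time.sleep(0.5)            # let the callback accumulate a few chunks
    frames = mic.get_frames()  # drains the internal buffer
    if len(frames) > 0:
        print("got %d chunks of %d samples each" % (len(frames), frames[0].size))
    # no explicit close() needed: it was registered with atexit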

Once we have a mechanism for getting the audio, we can design the graphical part of the app: a GUI written using PyQt.

The audio GUI written using PyQt

The audio GUI's function is easy to describe: it should fetch the accumulated output from the microphone recorder at regular intervals and plot the corresponding signal. The plotting should comprise both a time series and a frequency spectrum computed with numpy.fft.
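
To make the spectrum computation concrete, here is a quick self-contained check (my own example) of the numpy.fft functions we will rely on, applied to a synthetic 440 Hz tone sampled like our microphone chunks:

import numpy as np

rate, chunksize = 4000, 1024           # same values as the recorder
t = np.arange(chunksize) / float(rate)
signal = np.sin(2 * np.pi * 440. * t)  # synthetic 440 Hz tone

spectrum = np.abs(np.fft.rfft(signal))         # modulus of the one-sided FFT
freqs = np.fft.rfftfreq(chunksize, 1. / rate)  # matching frequency axis
print(freqs[np.argmax(spectrum)])              # ~440 Hz, within one frequency bin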

It turns out that I do the plotting with matplotlib. Although matplotlib is in general not suited for real-time graphical interfaces, it works in this case. The reason is that we don't redraw the whole figure at each update, but only change the data of lines that we have already instantiated. So in fact the updating is quick enough for our needs.
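
This pattern can be demonstrated outside Qt as well; here is a minimal sketch (my own illustration) in plain matplotlib:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 100)
fig, ax = plt.subplots()
line, = ax.plot(x, np.zeros_like(x))  # the line is instantiated once
ax.set_ylim(-1, 1)

for phase in np.linspace(0, 2 * np.pi, 20):
    line.set_data(x, np.sin(2 * np.pi * x + phase))  # cheap data update
    fig.canvas.draw()
    plt.pause(0.05)  # gives the GUI event loop time to repaint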

Interfacing a matplotlib plot with PyQt is done using a specific class, whose code is below:

In [3]:
from matplotlib.backends.backend_qt4agg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.backends.backend_qt4agg import NavigationToolbar2QT as NavigationToolbar
import matplotlib.pyplot as plt

class MplFigure(object):
    def __init__(self, parent):
        self.figure = plt.figure(facecolor='white')
        self.canvas = FigureCanvas(self.figure)
        self.toolbar = NavigationToolbar(self.canvas, parent)

This just creates an object that has a handle to a matplotlib canvas (and its toolbar), which we will use to plot our data.

The previous object is incorporated into a pretty classic user interface with a checkbox and a slider.

  • the checkbox enables automatic scaling of the spectrum, which is useful if you're recording sounds at different amplitudes and are not sure of the output level beforehand
  • on the other hand, if you uncheck the box, you can control the normalization of the spectrum using a fixed gain set by the slider

In [4]:
import numpy as np
import matplotlib.pyplot as plt
from PyQt4 import QtGui, QtCore

class LiveFFTWidget(QtGui.QWidget):
    def __init__(self):
        QtGui.QWidget.__init__(self)
        
        # customize the UI
        self.initUI()
        
        # init class data
        self.initData()       
        
        # connect slots
        self.connectSlots()
        
        # init MPL widget
        self.initMplWidget()
        
    def initUI(self):

        hbox_gain = QtGui.QHBoxLayout()
        autoGain = QtGui.QLabel('Auto gain for frequency spectrum')
        autoGainCheckBox = QtGui.QCheckBox(checked=True)
        hbox_gain.addWidget(autoGain)
        hbox_gain.addWidget(autoGainCheckBox)
        
        # reference to checkbox
        self.autoGainCheckBox = autoGainCheckBox
        
        hbox_fixedGain = QtGui.QHBoxLayout()
        fixedGain = QtGui.QLabel('Manual gain level for frequency spectrum')
        fixedGainSlider = QtGui.QSlider(QtCore.Qt.Horizontal)
        hbox_fixedGain.addWidget(fixedGain)
        hbox_fixedGain.addWidget(fixedGainSlider)

        self.fixedGainSlider = fixedGainSlider

        vbox = QtGui.QVBoxLayout()

        vbox.addLayout(hbox_gain)
        vbox.addLayout(hbox_fixedGain)

        # mpl figure
        self.main_figure = MplFigure(self)
        vbox.addWidget(self.main_figure.toolbar)
        vbox.addWidget(self.main_figure.canvas)
        
        self.setLayout(vbox) 
        
        self.setGeometry(300, 300, 350, 300)
        self.setWindowTitle('LiveFFT')    
        self.show()
        # timer for callbacks, taken from:
        # http://ralsina.me/weblog/posts/BB974.html
        timer = QtCore.QTimer()
        timer.timeout.connect(self.handleNewData)
        timer.start(100)
        # keep reference to timer        
        self.timer = timer
        
    def initData(self):
        mic = MicrophoneRecorder()
        mic.start()  

        # keeps reference to mic        
        self.mic = mic
        
        # computes the parameters that will be used during plotting
        self.freq_vect = np.fft.rfftfreq(mic.chunksize, 
                                         1./mic.rate)
        self.time_vect = np.arange(mic.chunksize, dtype=np.float32) / mic.rate * 1000
                
    def connectSlots(self):
        pass
    
    def initMplWidget(self):
        """creates initial matplotlib plots in the main window and keeps 
        references for further use"""
        # top plot
        self.ax_top = self.main_figure.figure.add_subplot(211)
        self.ax_top.set_ylim(-32768, 32768)
        self.ax_top.set_xlim(0, self.time_vect.max())
        self.ax_top.set_xlabel(u'time (ms)', fontsize=6)

        # bottom plot
        self.ax_bottom = self.main_figure.figure.add_subplot(212)
        self.ax_bottom.set_ylim(0, 1)
        self.ax_bottom.set_xlim(0, self.freq_vect.max())
        self.ax_bottom.set_xlabel(u'frequency (Hz)', fontsize=6)
        # line objects
        self.line_top, = self.ax_top.plot(self.time_vect,
                                          np.ones_like(self.time_vect))
        self.line_bottom, = self.ax_bottom.plot(self.freq_vect,
                                                np.ones_like(self.freq_vect))

    def handleNewData(self):
        """ handles the asynchroneously collected sound chunks """        
        # gets the latest frames        
        frames = self.mic.get_frames()
        
        if len(frames) > 0:
            # keeps only the last frame
            current_frame = frames[-1]
            # plots the time signal
            self.line_top.set_data(self.time_vect, current_frame)
            # computes and plots the fft signal            
            fft_frame = np.fft.rfft(current_frame)
            if self.autoGainCheckBox.checkState() == QtCore.Qt.Checked:
                fft_frame /= np.abs(fft_frame).max()
            else:
                fft_frame *= (1 + self.fixedGainSlider.value()) / 5000000.
                #print(np.abs(fft_frame).max())
            self.line_bottom.set_data(self.freq_vect, np.abs(fft_frame))            
            
            # refreshes the plots
            self.main_figure.canvas.draw()

The refreshing part of the app is handled with a QTimer that fires 10 times a second and refreshes the GUI each time by calling the handleNewData method. That method gets the latest frame from the microphone, plots the time series, computes the Fourier transform, and plots its modulus.

The whole app can be run using the following commands:

import sys

if __name__ == "__main__":
    app = QtGui.QApplication(sys.argv)
    window = LiveFFTWidget()
    sys.exit(app.exec_())

Demo time

I made a little video of the app working live, available on YouTube:

In [2]:
YouTubeVideo('1XxR9U_aUog')
Out[2]:

Conclusions

In this post, we learned how to code a little app that records live sound using PyAudio, does a little audio processing with numpy, and displays the data using matplotlib embedded in a PyQt interface.
