Smile Recognition Using OpenCV and scikit-learn

In this post, we're going to dabble a bit in machine learning and face recognition to predict whether an image from a live webcam shows a smiling subject or not. First, we use an existing dataset, the "Olivetti faces dataset", and classify each of its 400 faces into one of two categories: smiling or not smiling. Then, we train a support vector classifier on this dataset to predict whether a face depicts a smiling person or not. We do this using the awesome sklearn machine learning library for Python. Finally, we integrate this classifier into a live loop that uses OpenCV to capture a frame from our webcam, extract a face, and annotate the image with the result of the machine learning prediction.

In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo("mc3XGJaDEMc")
Out[1]: (embedded YouTube video)

Please note that I use Python 2.7 in this post due to OpenCV incompatibility issues with Python 3 on my system.
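If you want to check what you are running before following along, the snippet below prints the interpreter and OpenCV versions (cv2.__version__ is OpenCV's standard version attribute):

import sys
import cv2  # the OpenCV Python bindings

# this post assumes Python 2.7; cv2 must be importable for the live webcam part
print sys.version
print cv2.__version__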

Part 1: training a classifier on the existing faces dataset

Loading the faces dataset

We load the dataset using standard sklearn functions below.

In [2]:
%matplotlib inline
from pylab import *
In [3]:
from sklearn import datasets
In [4]:
faces = datasets.fetch_olivetti_faces()

The faces dataset consists of 400 images depicting 40 subjects in a variety of poses and expressions: eyes open or closed, smiling or not. The data can be accessed through the keys below:

In [5]:
faces.keys()
Out[5]:
['images', 'data', 'target', 'DESCR']
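Here, images contains the faces as 64 by 64 pixel arrays, data contains the same pixels flattened into one feature vector per image, and target contains the identifier of the subject (one of 40 people) in each image. A quick look at the shapes confirms this:

# 400 grayscale images of 64x64 pixels
print faces.images.shape  # (400, 64, 64)
# the same pixels, flattened into one 4096-dimensional vector per image
print faces.data.shape    # (400, 4096)
# one subject label (0 to 39) per image
print faces.target.shape  # (400,)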

One can plot a selection of images from the dataset.

In [6]:
for i in range(10):
    face = faces.images[i]  # each image is already a 64x64 array
    subplot(1, 10, i + 1)
    imshow(face, cmap='gray')
    axis('off')

Producing the smile training data

Now that the dataset is loaded, we will build a cheap user interface to classify the 400 images into two categories:

  • smiling face
  • not smiling face

The class below stores the outcomes of the user's classification in a dictionary and keeps track of which image to display next when a button in the GUI is pressed.

In [7]:
from IPython.html.widgets import interact, ButtonWidget
from IPython.display import display, clear_output
In [8]:
class Trainer:
    def __init__(self):
        self.results = {}          # maps image index (as a string) to True/False
        self.imgs = faces.images   # the 400 face images
        self.index = 0             # index of the image currently displayed

    def increment_face(self):
        """Advance the index to the next face without a recorded result."""
        if self.index + 1 >= len(self.imgs):
            return self.index
        else:
            # skip over faces that have already been classified
            while str(self.index) in self.results:
                print self.index
                self.index += 1
            return self.index

    def record_result(self, smile=True):
        """Record whether the currently displayed face is smiling."""
        self.results[str(self.index)] = smile

We first instantiate the class and then design a user interface using two buttons from the awesome IPython.html.widgets tools (in newer Jupyter versions, these widgets live in the standalone ipywidgets package, where ButtonWidget is simply called Button).

In [9]:
trainer = Trainer()
In [10]:
button_smile = ButtonWidget(description='smile')
button_no_smile = ButtonWidget(description='sad face')

def display_face(face):
    clear_output()
    imshow(face, cmap='gray')
    axis('off')

def update_smile(b):
    trainer.record_result(smile=True)
    trainer.increment_face()
    display_face(trainer.imgs[trainer.index])

def update_no_smile(b):
    trainer.record_result(smile=False)
    trainer.increment_face()
    display_face(trainer.imgs[trainer.index])

button_no_smile.on_click(update_no_smile)
button_smile.on_click(update_smile)

display(button_smile)
display(button_no_smile)
display_face(trainer.imgs[trainer.index])

Saving and loading the temporary results from the classification

Because I classified the 400 faces over two sessions, I had to save the intermediate results and reload them later. Both saving and loading are shown in the cells below.

Loading training dataset

In [11]:
import json
In [12]:
results = json.load(open('results.xml'))  # the file holds JSON despite its .xml extension
In [13]:
trainer.results = results
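After restoring the results, the trainer still points at image 0, which was already classified during the first session. A small sketch that reuses the class's own skipping logic to resume the GUI at the first unclassified face:

# skip every face that already has a recorded result and redraw the image
trainer.increment_face()
display_face(trainer.imgs[trainer.index])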

Saving training dataset

In [14]:
#with open('results.xml', 'w') as f:
#    json.dump(trainer.results, f)

Visualizing the training set data

Now that our input data is ready, we can plot a simple statistic about our training set: how many faces are smiling and how many are not?

In [15]:
yes, no = (sum([trainer.results[x] == True for x in trainer.results]), 
            sum([trainer.results[x] == False for x in trainer.results]))
bar([0, 1], [no, yes])
ylim(0, max(yes, no))
xticks([0.4, 1.4], ['no smile', 'smile']);
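As a quick sanity check, once the classification is complete, the two counts together should cover the whole dataset:

# every one of the 400 images should have received exactly one label
assert yes + no == len(faces.images)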

Below, we display the images of all the faces that we classified as smiling; a similar grid for the "non-smiling" faces is sketched after the cell below.

In [16]:
smiling_indices = [int(i) for i in results if results[i] == True]
In [17]:
fig = plt.figure(figsize=(12, 12))
fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
for i in range(len(smiling_indices)):
    # plot the images in a matrix of 20x20
    p = fig.add_subplot(20, 20, i + 1)
    p.imshow(faces.images[smiling_indices[i]], cmap=plt.cm.bone)
    
    # label the image with its classification and index
    p.text(0, 14, "smiling")
    p.text(0, 60, str(i))
    p.axis('off')
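The same kind of grid can be produced for the non-smiling faces; a minimal sketch that mirrors the cell above:

non_smiling_indices = [int(i) for i in results if results[i] == False]

fig = plt.figure(figsize=(12, 12))
fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
for i in range(len(non_smiling_indices)):
    # plot the images in a matrix of 20x20
    p = fig.add_subplot(20, 20, i + 1)
    p.imshow(faces.images[non_smiling_indices[i]], cmap=plt.cm.bone)

    # label the image with its classification and index
    p.text(0, 14, "not smiling")
    p.text(0, 60, str(i))
    p.axis('off')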