Applying the Watershed Transform to a Cuneiform Tablet
This post was entirely written using the IPython notebook. Its content is BSD-licensed. You can see a static view or download this notebook with the help of nbviewer at 20160830_SegmentingAnImageWithWatershed.ipynb.
This week, I've been watching the SciPy 2016 scikit-image tutorial on YouTube. This tutorial prompted me to write a small application of image segmentation using the watershed algorithm. This post is essentially based on the tutorial material found on GitHub and on the official scikit-image documentation.
Goal of this post¶
The goal of this post is to clearly explain the different steps used to provide a useful segmentation of an image using the watershed algorithm.
To this end, we're going to use an image from Wikipedia related to one of my current interests: cuneiform tablets.
As the image above shows, several steps are needed before we get a clean image of the signs:
- isolate the tablet from the background
- pick out individual cuneiform characters from the tablet
- threshold these to produce a final image
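Before diving in, the whole pipeline can be sketched on a bundled sample image (`skimage.data.coins` here, as a stand-in for the tablet); note that the seed thresholds below are assumptions for the sketch, not tuned values:

```python
import numpy as np
from skimage import data, feature, filters

try:
    from skimage.segmentation import watershed  # newer scikit-image releases
except ImportError:
    from skimage.morphology import watershed    # older releases

gray = data.coins() / 255.0                     # sample grayscale image in [0, 1]
edges = feature.canny(gray)                     # boolean edge map
elevation = filters.gaussian(edges.astype(float), sigma=5)

# seeds: 1 = confident background, 2 = confident foreground, 0 = undecided
t = filters.threshold_otsu(elevation)
seeds = np.zeros_like(gray, dtype=int)
seeds[elevation < t / 5] = 1
seeds[elevation > t] = 2

labels = watershed(elevation, seeds)            # flood the elevation map from the seeds
mask = labels == 2                              # foreground mask
```

The rest of the post walks through these same steps on the tablet image, one at a time.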
Let's get started!
Loading the image¶
To work on the image, we first need to retrieve it as an array of pixels. Luckily for us, scikit-image can download an image for you if you hand a URL to the imread function:
from skimage import io
img = io.imread("https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Tablet_with_Cuneiform_Inscription_LACMA_M.79.106.2_%284_of_4%29.jpg/800px-Tablet_with_Cuneiform_Inscription_LACMA_M.79.106.2_%284_of_4%29.jpg")
Let's check the properties of this image:
img.shape
(531, 800, 3)
It's a 531 by 800 pixel image, with an RGB tuple defining the color of each pixel.
img.dtype
dtype('uint8')
The colors are defined by integers (values between 0 and 255), which we will convert to floats while working on them.
from skimage import img_as_float
img = img_as_float(img)
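As a quick check of what `img_as_float` does: uint8 values are rescaled from [0, 255] to floats in [0, 1]:

```python
import numpy as np
from skimage import img_as_float

arr = np.array([[0, 128, 255]], dtype=np.uint8)
f = img_as_float(arr)  # divides by 255, yielding float64 values in [0, 1]
```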
Segmenting the background¶
To cut out the background, we're going to apply the watershed algorithm while initializing the seeds for the algorithm with some foreground and some background pixels.
This works in the following way:
- create the seeds using a contour filter of the tablet image
- perform the segmentation
Creating the seeds¶
We first create a gray image, from which we will compute edges.
from skimage import color
gray = color.rgb2gray(img)
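`rgb2gray` computes a weighted sum of the three color channels, with weights that sum to one, so white stays white and black stays black; in miniature:

```python
import numpy as np
from skimage import color

px = np.array([[[1.0, 1.0, 1.0]],
               [[0.0, 0.0, 0.0]]])   # one white and one black pixel
g = color.rgb2gray(px)               # weighted sum of the three channels
```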
Let's display the gray image:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams['image.cmap'] = 'gray'
plt.rcParams['image.interpolation'] ='nearest'
plt.style.use('seaborn-muted')
plt.imshow(gray)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x8667d68>
Now, we filter it using an edge filter.
from skimage import feature, filters
elevation_map = feature.canny(gray)
Let's display this:
plt.imshow(elevation_map)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0xbc48828>
The Canny filter found some edges both inside and outside the tablet. To make the interior stand out as a whole, we smooth the edge map with a Gaussian filter:
filled_elevation_map = filters.gaussian(elevation_map, sigma=5)
plt.imshow(filled_elevation_map)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0xc051a90>
We now initialize the seeds by thresholding the previous image:
threshold = filters.threshold_otsu(filled_elevation_map)
threshold
0.13830405859398809
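Otsu's method picks the threshold that best separates a bimodal intensity histogram; a minimal illustration on synthetic data:

```python
import numpy as np
from skimage import filters

# two well-separated intensity clusters
sample = np.concatenate([np.full(500, 0.1), np.full(500, 0.9)])
t = filters.threshold_otsu(sample)  # lands between the two clusters
```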
This allows us to make seeds for the background (1) and the tablet (2):
import numpy as np
seeds = np.zeros_like(gray)
seeds[filled_elevation_map < threshold / 5] = 1
seeds[filled_elevation_map > threshold] = 2
This yields three types of pixels:
- 0: to be assigned by the watershed algorithm
- 1: background
- 2: foreground
plt.imshow(seeds)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0xc2bcb38>
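On a toy array, the two cuts produce exactly these three classes (the threshold value here is just an assumption mimicking the Otsu value found above):

```python
import numpy as np

toy_values = np.array([0.01, 0.05, 0.20, 0.50])
toy_threshold = 0.14  # roughly the Otsu threshold found above

toy_seeds = np.zeros_like(toy_values)
toy_seeds[toy_values < toy_threshold / 5] = 1  # clearly background
toy_seeds[toy_values > toy_threshold] = 2      # clearly foreground
# 0.05 falls in between and stays 0: left for the watershed to decide
```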
Performing the watershed¶
Let's now flood the image:
from skimage.morphology import watershed
segmentation = watershed(filled_elevation_map, seeds)
What does the result look like?
plt.imshow(segmentation)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0xcefd128>
Great! We can recognize our tablet!
Finally, we can extract the tablet itself using the segmentation mask:
mask = (segmentation == 2)[:, :, np.newaxis]  # add a channel axis for broadcasting
tablet = mask * img
plt.imshow(tablet)
<matplotlib.image.AxesImage at 0xdaceba8>
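Adding the trailing channel axis is what lets the (H, W) mask broadcast against the (H, W, 3) image; in miniature:

```python
import numpy as np

toy_rgb = np.ones((2, 2, 3))                     # tiny all-white RGB image
toy_labels = np.array([[1, 2],
                       [2, 1]])                  # fake segmentation result
toy_mask = (toy_labels == 2)[:, :, np.newaxis]   # shape (2, 2, 1)
cut = toy_mask * toy_rgb                         # broadcasts over the 3 channels
```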
Cleaning the characters¶
Now that we have our tablet, we can start working on the characters themselves. To work with tablets, Assyriologists often first produce a transcription such as this one:
The transcription can be helpful for reading a tablet when the original is not available anymore, or quite simply for getting a clearer picture of the signs, as tablets are difficult to read.
To obtain a sort of automated transcription, let's apply a local histogram equalization to the tablet, to remove the artifacts caused by the lighting.
from skimage import img_as_ubyte
from skimage.morphology import disk
equalized = img_as_ubyte(color.rgb2gray(tablet))  # rank filters expect an integer image
selem = disk(30)
equalized = filters.rank.equalize(equalized, selem=selem)
plt.imshow(equalized)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0xdb86a90>
Let's have a look at the resulting histogram:
plt.hist(equalized.ravel(), bins=256);
plt.ylim(0, 2000);
Ideally, we would like to pick some seed points at the bottom of the incised strokes and some on the tablet surface, and run the watershed between them.
seeds = np.zeros_like(equalized)
seeds[equalized < 10] = 1                          # bottom of the strokes
seeds[(equalized > 140) & (equalized < 200)] = 2   # tablet surface
seeds[~mask[:, :, 0]] = 3                          # outside the tablet
plt.imshow(seeds)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0xeba2278>
edges = filters.sobel(equalized)
ws = watershed(edges, seeds)
plt.imshow(ws)
<matplotlib.image.AxesImage at 0xef5a240>
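The Sobel filter responds at intensity transitions and is flat elsewhere, which is what makes it a good elevation map for the watershed; a minimal check on a synthetic step edge:

```python
import numpy as np
from skimage import filters

step = np.zeros((5, 5))
step[:, 3:] = 1.0              # a vertical step edge
response = filters.sobel(step)
# zero in flat regions, positive on the columns flanking the step
```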
Finally, we can produce our cleaned image and compare it to the original:
import scipy.ndimage as ndi
cleaned = np.ones_like(ws)
# signs
cleaned[(ws == 1)] = 0
cleaned = ndi.binary_opening(cleaned)
# border of tablet
from skimage import feature
cleaned[feature.canny(mask[:, :, 0]) == 1] = 0
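The binary opening above (an erosion followed by a dilation) is what removes isolated speckles while keeping larger connected regions; a toy illustration on made-up data:

```python
import numpy as np
import scipy.ndimage as ndi

a = np.zeros((5, 8), dtype=bool)
a[1:4, 1:4] = True   # a solid 3x3 block
a[2, 6] = True       # an isolated speckle
opened = ndi.binary_opening(a)
# the block's core survives the opening, the lone pixel is gone
```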
Let's compare both images in full size:
fig, ax = plt.subplots(1, 2, figsize=(16, 10))
ax[0].imshow(tablet)
ax[1].imshow(cleaned)
plt.tight_layout()
We can also apply some zoom on this:
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
def zoom(ax):
    ax.set_xlim(300, 500)
    ax.set_ylim(300, 100)
ax[0].imshow(tablet)
zoom(ax[0])
ax[1].imshow(cleaned)
zoom(ax[1])
plt.tight_layout()
Conclusions¶
In this blog post, we've worked through a couple of standard image segmentation steps using scikit-image. We've applied them to a cuneiform tablet inscription available on Wikipedia and obtained a simplified representation of it, essentially by applying the watershed algorithm, first to separate the tablet from the background and then to pick out the signs.
Ideas for further work are numerous. The final image is meant to simplify the analysis of the original tablet, which is difficult to read due to lighting issues, and it does reach that goal. So the question is: would it be possible to single out each sign and perform a more accurate automatic transcription?