How to get SegNet running in year 2020 (or 2021)

Filip Drapejkowski
Nov 13, 2020

SegNet is a deep neural network that performs semantic segmentation, with the most common use case being automotive segmentation of the visible road and the objects around it.

Semantic segmentation of elements around a road.

The authors provided not only the publication, but also their source code, datasets, trained weights, and a tutorial. However, following these materials won’t result in working code in 2020/2021. Their code uses a modified fork of the Caffe deep learning framework. Caffe is probably the best-written and clearest deep learning framework that has ever been designed, but unfortunately it’s rarely maintained and it takes some effort to make it work.

Here is the list of official resources:

Project webpage: http://mi.eng.cam.ac.uk/projects/segnet/
Paper: https://arxiv.org/abs/1511.00561
Tutorial: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html
Code: https://github.com/alexgkendall/caffe-segnet
Weights and network definitions: https://github.com/alexgkendall/SegNet-Tutorial/

Below are the steps I took to debug the problems that appeared during the build process (CPU version, Ubuntu 16.04).

Building the custom caffe version

git clone https://github.com/alexgkendall/caffe-segnet
cd caffe-segnet

In general, Caffe is written in C++ (although it supports Python via pycaffe), so we need to build the project. We can use CMake to generate a makefile for us, or manually adjust the existing Makefile and the corresponding Makefile.config. Let’s go with the second option.

cp Makefile.config.example Makefile.config
make all -j

Ok, we’re done. Just kidding, I got a lot of errors and probably so did you. The problems might show up in a different order for you, so scroll down to the relevant section if needed. After each step, rebuild to apply changes:

make clean
make all -j
#if your computer hangs, use -j4 or -j2 to limit concurrency

Moving from GPU to CPU.

Ignore this section (obviously) if you are using your shiny gaming PC or a devbox with 8 fancy GPUs.

Probably you got one of the following errors, or something related to cuda or cudnn:

make: /usr/local/cuda/bin/nvcc: Command not found
fatal error: cublas_v2.h: No such file or directory

Uncomment line 8 in Makefile.config by removing the leading #:

# CPU_ONLY := 1
# becomes:
CPU_ONLY := 1
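
If you would rather script this change than edit the file by hand, here is a minimal Python sketch. The function name enable_cpu_only is my own, the sample text is made up for illustration, and the sketch assumes the line looks like the # CPU_ONLY := 1 entry from Makefile.config.

```python
import re


def enable_cpu_only(config_text):
    """Return config_text with a commented-out 'CPU_ONLY := 1' line uncommented."""
    # Match a line that is only '#', optional spaces, and 'CPU_ONLY := 1'.
    pattern = re.compile(
        r"^[ \t]*#[ \t]*(CPU_ONLY[ \t]*:=[ \t]*1)[ \t]*$", re.MULTILINE
    )
    return pattern.sub(r"\1", config_text)


# In-memory sample for illustration; for the real file, read Makefile.config,
# pass its text through enable_cpu_only(), and write the result back.
sample = "# USE_CUDNN := 1\n\n# CPU_ONLY := 1\n"
print(enable_cpu_only(sample))
```

Other commented-out switches are left alone, so running it twice does no harm.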

HDF5 related issues

fatal error: hdf5.h: No such file or directory
compilation terminated

Let’s make sure the dependencies are there:

sudo apt-get install libhdf5-10
sudo apt-get install libhdf5-serial-dev
sudo apt-get install libhdf5-dev
sudo apt-get install libhdf5-cpp-11
# output in my case:
#libhdf5-dev is already the newest version

Still the same error. Let’s locate our file and add its directory to the CPATH environment variable:

find /usr -iname "*hdf5.h*"
# got:
# /usr/include/hdf5/serial/hdf5.h
# /usr/include/opencv2/flann/hdf5.h
# Let's try the first one.
export CPATH="/usr/include/hdf5/serial/"
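
The same search-then-export idea can be sketched in Python. The helper names find_header_dirs and cpath_value are hypothetical, and unlike the export above this version keeps any CPATH value you already had instead of overwriting it.

```python
import os


def find_header_dirs(root, header="hdf5.h"):
    """Return a sorted list of directories under root that contain the header."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if header in filenames:
            hits.append(dirpath)
    return sorted(hits)


def cpath_value(dirs, existing=""):
    """Join directories into a CPATH string, keeping any existing value last."""
    parts = list(dirs) + ([existing] if existing else [])
    return os.pathsep.join(parts)


# Build the value for the directory chosen above; to search for candidates,
# call find_header_dirs("/usr/include") and pick the 'serial' one by hand.
print("export CPATH=" + cpath_value(["/usr/include/hdf5/serial"]))
```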

A new error, which means we are making progress!

/usr/bin/ld: cannot find -lhdf5_hl
/usr/bin/ld: cannot find -lhdf5
collect2: error: ld returned 1 exit status
Makefile:518: recipe for target '.build_release/lib/libcaffe.so' failed

Another article promised a solution, noting that Caffe doesn’t expect the ‘serial’ suffix in file names. The compiled .so files weren’t in the path Caffe referenced; when in doubt, search with find!

find /usr -iname "*hdf5.so"
# got: /usr/lib/x86_64-linux-gnu/hdf5/serial

For me that directory is full of symlinks, but let’s add a few more. This shouldn’t ruin anything, because we are adding files instead of modifying existing ones. I had to use sudo :-/

sudo ln -s /usr/lib/x86_64-linux-gnu/libhdf5_serial.so /usr/lib/x86_64-linux-gnu/libhdf5.so
sudo ln -s /usr/lib/x86_64-linux-gnu/libhdf5_serial_hl.so /usr/lib/x86_64-linux-gnu/libhdf5_hl.so
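
If you want to see what would be linked before touching /usr/lib, a small Python sketch like the one below can do a dry run first. The function names are my own, and the library names are simply the ones from the two ln -s commands; check that they match what find reported on your machine.

```python
import os

# (existing library, link to create), taken from the two ln -s commands.
PAIRS = [
    ("libhdf5_serial.so", "libhdf5.so"),
    ("libhdf5_serial_hl.so", "libhdf5_hl.so"),
]


def plan_links(lib_dir):
    """Return (source, link) pairs where the source exists and the link does not."""
    plan = []
    for src_name, link_name in PAIRS:
        src = os.path.join(lib_dir, src_name)
        link = os.path.join(lib_dir, link_name)
        if os.path.exists(src) and not os.path.lexists(link):
            plan.append((src, link))
    return plan


def create_links(lib_dir):
    """Create the planned symlinks; needs write access (root for /usr/lib)."""
    for src, link in plan_links(lib_dir):
        os.symlink(src, link)
        print("linked", link, "->", src)


# Dry run: only report what would be created, nothing is modified.
print(plan_links("/usr/lib/x86_64-linux-gnu"))
```

Because existing links are skipped, calling create_links a second time is harmless.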

Types mismatch error

Caffe uses a Dtype template parameter so that it can work with floating-point numbers of different precision. Due to inconsistent use of types, we got:

src/caffe/layers/contrastive_loss_layer.cpp:56:30: error: no matching function for call to ‘max(double, float)’

Let’s just fix the code in a text editor by casting float to Dtype (which is double in our case):

nano src/caffe/layers/contrastive_loss_layer.cpp
# line 54
loss += std::max(margin - dist_sq_.cpu_data()[i], Dtype(0.0));
# replace with:
loss += std::max(Dtype(margin - dist_sq_.cpu_data()[i]), Dtype(0.0));
# line 56:
Dtype dist = std::max(margin - sqrt(dist_sq_.cpu_data()[i]), Dtype(0.0));
# replace with:
Dtype dist = std::max(Dtype(margin - sqrt(dist_sq_.cpu_data()[i])), Dtype(0.0));

Rebuild and voilà!

Finally got a successful build.

Getting a working example in python (pycaffe)

git clone https://github.com/alexgkendall/SegNet-Tutorial
cd SegNet-Tutorial/Scripts/
python webcam_demo.py

Error messages you may have received (in no particular order, since not all of them are very descriptive):

ImportError: dynamic module does not define init function (init_caffe)
ImportError: cannot import name '_validate_lengths'
TabError: inconsistent use of tabs and spaces in indentation
Message type "caffe.LayerParameter" has no field named "bn_param".
ImportError: No module named builtins

A few things:

  • Let’s not use the built-in camera, because a) a road is more interesting for SegNet and b) it’s slow on CPU.
  • There are indentation bugs (tabs mixed with spaces).
  • We need to move from GPU to CPU.
  • Let’s use a version of their demo script that I modified in advance, so it works right away.
  • Let’s get the required outdated dependencies first (feel free to use a virtual environment to avoid messing up your other projects).
  • I’ve used Python 2.7, because a) the authors used it and b) I didn’t want to ruin the up-to-date dependencies that I regularly use and I was too lazy to use a virtual environment. I’ve modified the original code so that it should also run under Python 3.

# downgrade numpy
pip2 install numpy==1.15
# people suggest scikit-image <0.15, but it didn’t work for me
# (quote the specifier, otherwise the shell treats < as a redirect)
pip2 install "scikit-image<0.12"
# make sure pycaffe is visible for our python (adjust the path)
export PYTHONPATH=~/caffe-segnet/python
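
Before running the demo, it can save time to check that the interpreter you are about to use actually sees these modules. This is a minimal sketch; can_import is a hypothetical helper, and the module list is just what the demo script imports.

```python
import importlib


def can_import(module_name):
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# Modules the demo script relies on; 'caffe' is found via PYTHONPATH.
for name in ("numpy", "cv2", "caffe"):
    status = "ok" if can_import(name) else "MISSING (check pip / PYTHONPATH)"
    print("%-8s %s" % (name, status))
```

Run it with the same interpreter as the demo (python2 in my case), since each Python installation has its own set of packages.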

Here is the modified demo script; save it as single_image.py:

from __future__ import print_function  # print() behaves the same under Python 2 and 3
import argparse
import time

import cv2
import numpy as np
import caffe
parser = argparse.ArgumentParser()
parser.add_argument('--model', type=str, required=True)
parser.add_argument('--weights', type=str, required=True)
parser.add_argument('--colours', type=str, required=True)
parser.add_argument('--img', type=str, required=True)
args = parser.parse_args()
net = caffe.Net(args.model, args.weights, caffe.TEST)
# CPU-only build; on a GPU build use caffe.set_mode_gpu() instead
caffe.set_mode_cpu()
input_shape = net.blobs['data'].data.shape
output_shape = net.blobs['argmax'].data.shape
label_colours = cv2.imread(args.colours).astype(np.uint8)
cv2.namedWindow("Input")
cv2.namedWindow("SegNet")
start = time.time()
frame = cv2.imread(args.img)
end = time.time()
print('%30s' % 'Read image in ', str((end - start)*1000), 'ms')
start = time.time()
frame = cv2.resize(frame, (input_shape[3],input_shape[2]))
input_image = frame.transpose((2,0,1))
input_image = np.asarray([input_image])
end = time.time()
print('%30s' % 'Resized image in ', str((end - start)*1000), 'ms')
start = time.time()
out = net.forward_all(data=input_image)
end = time.time()
print('%30s' % 'Executed SegNet in ', str((end - start)*1000), 'ms')
start = time.time()
segmentation_ind = np.squeeze(net.blobs['argmax'].data)
segmentation_ind_3ch = np.resize(segmentation_ind,(3,input_shape[2],input_shape[3]))
segmentation_ind_3ch = segmentation_ind_3ch.transpose(1,2,0).astype(np.uint8)
segmentation_rgb = np.zeros(segmentation_ind_3ch.shape, dtype=np.uint8)
cv2.LUT(segmentation_ind_3ch,label_colours,segmentation_rgb)
segmentation_rgb = segmentation_rgb.astype(float)/255
end = time.time()
print('%30s' % 'Processed results in ', str((end - start)*1000), 'ms\n')
cv2.imshow("Input", frame)
cv2.imshow("SegNet", segmentation_rgb)
cv2.waitKey(0)
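# Save next to the input image; scale the [0, 1] floats back to 8-bit for imwrite
out_path = args.img + "_segmentation.png"
cv2.imwrite(out_path, np.rint(segmentation_rgb * 255).astype(np.uint8))
print("Saved image to ", out_path)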
cv2.destroyAllWindows()
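
The colour step in the script is a lookup table: every class index predicted by the network selects one colour from the palette image. Here is a pure-Python sketch of that idea with a made-up three-class palette. The real palette comes from the --colours image, and the script does the same mapping with cv2.LUT on a 3-channel array.

```python
# Hypothetical palette for illustration: class index -> (B, G, R) colour.
PALETTE = [
    (128, 128, 128),  # class 0
    (128, 64, 128),   # class 1
    (0, 0, 192),      # class 2
]


def colourise(label_map, palette):
    """Replace every class index in a 2-D label map with its palette colour."""
    return [[palette[label] for label in row] for row in label_map]


# A tiny 2x3 "segmentation" holding class indices instead of pixels.
labels = [
    [0, 0, 1],
    [2, 1, 1],
]
for row in colourise(labels, PALETTE):
    print(row)
```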

We still need the model architecture definition (.prototxt file) and the trained weights (.caffemodel file). If you use the wrong pair, you will end up with random noise or a “Failed to parse NetParameter” error.

It took some guesswork (these files are stored separately), but luckily I’ve found a working pair:

wget https://raw.githubusercontent.com/alexgkendall/SegNet-Tutorial/master/Example_Models/segnet_model_driving_webdemo.prototxt
wget http://mi.eng.cam.ac.uk/~agk34/resources/SegNet/segnet_weights_driving_webdemo.caffemodel
wget http://mi.eng.cam.ac.uk/~agk34/demo_segnet/demo_images_wild/img3.jpg
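
If wget is not available, the same three downloads can be sketched with Python's standard library. The helper names local_name and download_missing are my own, and I have not checked whether these URLs will stay online.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

# The three URLs from the wget commands.
URLS = [
    "https://raw.githubusercontent.com/alexgkendall/SegNet-Tutorial/master/Example_Models/segnet_model_driving_webdemo.prototxt",
    "http://mi.eng.cam.ac.uk/~agk34/resources/SegNet/segnet_weights_driving_webdemo.caffemodel",
    "http://mi.eng.cam.ac.uk/~agk34/demo_segnet/demo_images_wild/img3.jpg",
]


def local_name(url):
    """File name to save under: the last path component of the URL."""
    return url.rstrip("/").rsplit("/", 1)[-1]


def download_missing(urls, dest_dir="."):
    """Download each URL into dest_dir unless the file is already there."""
    for url in urls:
        target = os.path.join(dest_dir, local_name(url))
        if os.path.exists(target):
            print("already present: " + target)
            continue
        print("downloading " + url)
        urlretrieve(url, target)


# Nothing is fetched here; call download_missing(URLS) to start the downloads.
print([local_name(u) for u in URLS])
```

Skipping files that already exist means an interrupted run can be restarted, but delete any partially downloaded file first.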

Ok, let’s run it!

~/caffe-segnet/SegNet-Tutorial/Scripts$ python2 single_image.py --model segnet_model_driving_webdemo.prototxt --weights segnet_weights_driving_webdemo.caffemodel --colours camvid12.png --img img3.jpg

It works for me:

That’s enough for today. In the future I might try to figure out how to run it with OpenCV 4 on Ubuntu 20.04 — but I am not there yet.
