Facial Keypoint Detection with Neural Networks

By Ajay Bhargava

Nose Tip Detection

This project used the IMM Face Database, a set of 240 annotated facial images.

The first step was to detect the nose feature using a convolutional neural network. Example images from the dataset and their annotated nose feature are shown below.

face
Example Annotated Image
face
Example Annotated Image

Using the first 192 images as the training set and the final 48 images as the validation set, I trained a CNN with 3 convolutional layers, each with 20 hidden channels. The training and validation losses across the 20 epochs are shown below.

face
Loss Curve
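
As a rough illustration of the setup described above, here is a minimal PyTorch sketch of a 3-layer CNN with 20 channels per layer and the 192/48 split by index. The class name NoseNet, the 3x3 kernels, the pooling layers, the grayscale 60x80 input, and the MSE loss are all assumptions made for the example; the writeup does not specify them.

```python
import torch
import torch.nn as nn


class NoseNet(nn.Module):
    """Hypothetical nose-tip regressor: 3 conv layers, 20 channels each."""

    def __init__(self, channels: int = 20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),  # grayscale input assumed
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Adaptive pooling keeps the head independent of the (assumed) input size.
        self.pool = nn.AdaptiveAvgPool2d((4, 4))
        self.head = nn.Linear(channels * 4 * 4, 2)  # (x, y) of the nose tip

    def forward(self, x):
        x = self.pool(self.features(x))
        return self.head(torch.flatten(x, start_dim=1))


# Split by index: the first 192 images train, the final 48 validate.
indices = list(range(192 + 48))
train_idx, val_idx = indices[:192], indices[192:]

model = NoseNet()
batch = torch.zeros(4, 1, 60, 80)       # hypothetical batch of 60x80 grayscale images
pred = model(batch)                     # one (x, y) pair per image
loss = nn.MSELoss()(pred, torch.zeros(4, 2))  # MSE is an assumed choice of loss
```

Splitting by index rather than at random matches the "first 192, final 48" description, though it also means the validation set contains only the last few subjects in the dataset.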

The model performed fairly well. Shown below are 2 example outputs where the model performed well, followed by the 2 outputs where it performed worst. The red dots mark where the model predicted the nose, and the blue dots mark the actual annotated point.

face
Predicted Nose
face
Predicted Nose
face
Predicted Nose
face
Predicted Nose

A possible explanation for the poor predictions is the facial structure and orientation of the people in those images, both of which affect how well the model is able to predict the nose location.

Full Facial Keypoints Detection

With the nose tip detection working, the model can be expanded to predict all of the facial keypoints for an image. There are 58 keypoints annotated in the dataset. Some examples of the images and their corresponding annotations are shown below.

face
Example Annotated Image
face
Example Annotated Image

The convolutional neural network I trained had 5 convolutional layers, each with between 20 and 30 channels and a kernel size of 5. These were followed by two fully connected layers that produce an output of size 58x2.
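
To make the architecture above concrete, here is a minimal PyTorch sketch with 5 convolutional layers (kernel size 5, 20 to 30 channels) and two fully connected layers whose output is reshaped to 58x2. The class name KeypointNet, the exact channel progression, the pooling layers, the hidden size of 256, and the 120x160 grayscale input are assumptions for illustration, not the author's actual settings.

```python
import torch
import torch.nn as nn


class KeypointNet(nn.Module):
    """Hypothetical full-face keypoint regressor: 5 conv layers + 2 FC layers."""

    def __init__(self, num_keypoints: int = 58):
        super().__init__()
        self.num_keypoints = num_keypoints
        widths = [1, 20, 24, 28, 30, 30]  # assumed progression within 20-30 channels
        layers = []
        for i, (c_in, c_out) in enumerate(zip(widths[:-1], widths[1:])):
            layers += [nn.Conv2d(c_in, c_out, kernel_size=5, padding=2), nn.ReLU()]
            if i < 3:  # downsample after the first three conv layers only
                layers.append(nn.MaxPool2d(2))
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d((4, 4))
        self.fc1 = nn.Linear(widths[-1] * 4 * 4, 256)   # hidden size is an assumption
        self.fc2 = nn.Linear(256, num_keypoints * 2)

    def forward(self, x):
        x = self.pool(self.features(x))
        x = torch.relu(self.fc1(torch.flatten(x, start_dim=1)))
        # Reshape the flat output into one (x, y) pair per keypoint.
        return self.fc2(x).view(-1, self.num_keypoints, 2)


keypoint_model = KeypointNet()
keypoints = keypoint_model(torch.zeros(3, 1, 120, 160))  # hypothetical 120x160 input
```

The final reshape is what turns the 116 raw outputs of the last fully connected layer into the 58x2 layout of the annotations.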

The loss across the epochs is shown below.

face
Loss Curve

Some examples of successful and unsuccessful detections are shown below. The unsuccessful ones may be caused by unusual poses, orientations, or facial features.

face
Predicted Keypoints (pretty good)
face
Predicted Keypoints (pretty good)
face
Predicted Keypoints (not as good)
face
Predicted Keypoints (not as good)

The following are some examples of the learned filters from the first convolutional layer:

face
Filters

Training on a Larger Set

The examples above use only a small set of data for training and testing. Training a larger model on a larger set of data should yield better results. This section uses a dataset of annotated faces from the Intelligent Behavior Understanding Group (iBug) at Imperial College London. It contains over 6000 annotated faces, covering a wide range of angles and backgrounds. The following are some example images with their annotations from the dataset:

face
Example Annotated Image
face
Example Annotated Image
face
Example Annotated Image

The model I used in this section was a premade architecture in PyTorch: ResNet-50. This is a large convolutional network that is 50 layers deep. It was only slightly altered to accept images of the correct size and to output the correct number of points for the facial annotations.

The loss for training this model is shown below.

face
Loss Graph

Below are some results from the model, both on images from the dataset and on my own images.

face
Predicted Facial Keypoints
face
Predicted Facial Keypoints
face
Predicted Facial Keypoints

face
Me
face
George Michael Bluth (poor performance)
face
My Sunshine

Pixelwise Classification

Another task that neural networks can accomplish is pixelwise classification: predicting how likely each pixel is to be a keypoint on the face. I again used the ResNet-50 model for this task.

A 2D Gaussian with a standard deviation of 12 was placed around each keypoint. These Gaussians were summed to generate heatmaps. Results are shown below, both on images from the dataset and on my own.

face
Predicted Heatmap
face
Predicted Heatmap
face
Predicted Heatmap

face
Predicted Heatmap of me
face
Predicted Heatmap of Lucas
face
Predicted Heatmap of Aaron Judge
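
The heatmap construction described in this section (a 2D Gaussian with standard deviation 12 around each keypoint, summed over keypoints) can be sketched in NumPy as follows. The function name keypoint_heatmap, the unnormalized Gaussian with a peak of 1, and the example image size and keypoint coordinates are assumptions for illustration.

```python
import numpy as np


def keypoint_heatmap(keypoints, height, width, sigma=12.0):
    """Sum one unnormalized 2D Gaussian per (x, y) keypoint into a single map."""
    ys, xs = np.mgrid[0:height, 0:width]  # per-pixel row and column coordinates
    heatmap = np.zeros((height, width), dtype=np.float64)
    for x, y in keypoints:
        sq_dist = (xs - x) ** 2 + (ys - y) ** 2
        heatmap += np.exp(-sq_dist / (2.0 * sigma ** 2))
    return heatmap


# Hypothetical example: two keypoints on a 64x64 image.
heatmap = keypoint_heatmap([(30.0, 20.0), (50.0, 40.0)], height=64, width=64)
```

Because the Gaussians are summed, nearby keypoints reinforce each other, so values in the map can exceed 1 where keypoints cluster.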