The Lowdown on Face Recognition Technology

Facial recognition is a direct application of machine learning that is being widely deployed to consumers, industry and law enforcement agencies, with potential benefits for daily life as well as serious privacy concerns. Facial recognition models now exceed human performance on benchmark tests, but real-world implementation remains problematic in some cases.

With roots in the Eigenfaces approach developed at MIT in the 1990s, facial recognition saw its first successful large-scale implementation in 2014 with Facebook's DeepFace program, which achieved human-level accuracy in lab conditions. Since then, larger training datasets, GPUs and rapid advances in neural network architectures have further improved facial recognition performance across a richer set of contexts, enabling more reliable real-world implementation.

Applications of facial recognition fall into two categories: verification (authentication) and identification (recognition). In both scenarios, a set of known subjects is first enrolled in the system (the gallery), and at test time a new subject (the probe) is presented. Face verification computes a one-to-one similarity between a gallery image and the probe to determine whether the two images show the same subject. It is a biometric authentication solution used, for instance, in the face-based login feature on the iPhone X and in border controls at airports. HSBC and Ticketmaster are currently considering using face verification in their mobile applications. Face identification, on the other hand, computes one-to-many similarity to find the probe among a gallery of pre-identified people. Its main application is matching unlabeled photos with known profiles. It is used, among others, by law enforcement agencies to single out persons of interest in crowds.
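The difference between one-to-one verification and one-to-many identification can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's system: it assumes each face has already been converted to a numeric embedding vector by some model, and the function names, the toy three-dimensional vectors and the 0.8 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(probe, enrolled, threshold=0.8):
    """One-to-one: does the probe match this single enrolled embedding?"""
    return cosine_similarity(probe, enrolled) >= threshold

def identify(probe, gallery, threshold=0.8):
    """One-to-many: return the best-matching gallery name, or None if no
    gallery entry is similar enough to the probe."""
    best_name, best_score = None, -1.0
    for name, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# Toy gallery: real embeddings typically have hundreds of dimensions.
gallery = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.2]}
probe = [0.8, 0.2, 0.1]        # resembles "alice"
stranger = [0.0, 0.1, 0.95]    # resembles nobody in the gallery

print(verify(probe, gallery["alice"]))
print(identify(probe, gallery))
print(identify(stranger, gallery))
```

In this sketch the threshold controls the trade-off between false matches and missed matches. Identification has to reject probes that belong to nobody in the gallery, which is why `identify` can return `None`.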

Facial recognition technology can also be used to infer human characteristics and behaviors such as emotions, age or health. In a recent controversial study from Stanford University, a person's sexual orientation was predicted with 81% accuracy using facial analytics methods applied to a dataset extracted from Tinder.

The global facial recognition market is split between consumer goods, industrial applications and law enforcement, and is expected to reach USD 9 billion by 2022 according to Allied Market Research and Report Buyer. Major players in the biometric solutions market include Safran (FR), NEC Corporation (JP), Cognitec (DE) and Face++ (CN).

But facial recognition is not a biometric identification tool like the others. "You can delete cookies. You can change browsers. And you can leave your smartphone at home, but you can't delete your face, and you can't leave it at home," says facial recognition expert Alvaro Bedoya, executive director of Georgetown Law's Center on Privacy & Technology, in a recent interview with USA Today. Facial recognition is a biometric tool that can be used without the subject's consent.

A growing number of civil liberties and privacy organizations, including the ACLU, Human Rights Watch, the Electronic Frontier Foundation, and Big Brother Watch in the UK, are warning that facial recognition can violate civil liberties and civil rights. Forty organizations have sent a coalition letter to Amazon regarding its facial recognition system, Rekognition, demanding that Amazon stop allowing governments to use it. Amazon introduced Rekognition in 2016 as part of its Amazon Web Services cloud business. Facebook is also facing a class-action lawsuit in California over its use of facial recognition under the Illinois Biometric Information Privacy Act, while six of the ten first-page results for a Google search on "Facebook Face Recognition" are about turning off the face recognition feature, suggesting public distrust of the technology.

The technology has been around for many years and scores highly on standardized datasets. However, real-world conditions pose a particular set of challenges. For instance, changes in pose can make two images of the same person differ more than images of two different people. Variations in illumination, expression and age, as well as occlusions such as glasses or headwear, also hinder identification. Frontal photos of subjects are not always available, and using photos taken from other angles adds further alignment steps to the process. The difficulty of generalizing lab experiments to live crowds is illustrated by the recent use of facial recognition by the UK Metropolitan Police during festivals, where over 95% of the matches were false positives.
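One factor that can contribute to such a high share of false matches is the base rate: when very few people in a crowd are actually on a watchlist, even a small per-face error rate produces more false alerts than true ones. The arithmetic below is a sketch with entirely hypothetical numbers (crowd size, watchlist count and error rates are assumptions chosen for illustration, not figures from the Metropolitan Police deployment).

```python
def false_alert_share(crowd_size, on_watchlist, true_positive_rate, false_positive_rate):
    """Fraction of all alerts that point at someone who is NOT on the watchlist."""
    true_alerts = on_watchlist * true_positive_rate
    false_alerts = (crowd_size - on_watchlist) * false_positive_rate
    return false_alerts / (true_alerts + false_alerts)

# Hypothetical scenario: 100,000 faces scanned, 20 of them on the watchlist,
# 90% of watchlisted faces detected, 0.5% of other faces wrongly flagged.
share = false_alert_share(
    crowd_size=100_000,
    on_watchlist=20,
    true_positive_rate=0.90,
    false_positive_rate=0.005,
)
print(f"Share of alerts that are false positives: {share:.1%}")
```

Under these assumed inputs roughly 18 alerts are correct and about 500 are not, so most alerts are false even though the system flags only one in 200 innocent faces.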

To be reliable, facial identification requires large training datasets and powerful matching models. Google and Facebook have access to large proprietary datasets built from the photos people upload to their platforms. Large datasets are also publicly available. The Labeled Faces in the Wild (LFW) dataset, released in 2007, contains 13K images of 6K people. MS-Celeb-1M is currently the largest public facial recognition dataset for celebrities, containing 10M images of the top 100K celebrities, while MegaFace includes 4.7M photos of 670K different individuals in its training set, with 1 million distractors.

Overall, facial recognition is a three-step process: localization, normalization and recognition. The system starts by localizing and contouring faces in images. Normalization then aligns the original photos to bring them closer to a frontal view. The recognition module is finally applied to these repositioned faces. A variation on the normalization step augments the target space by generating several representations of a frontal photo in order to simulate different poses. One such augmentation technique reconstructs a 3D model from a 2D image, generates variations in pose, and projects these variations back into 2D.
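As a rough illustration of the normalization step, the sketch below computes a 2D similarity transform (rotation, uniform scale and translation) that moves two detected eye positions onto fixed target positions in the output image. This is a deliberately simple stand-in for the alignment described above, not the 3D technique itself. The function names, the eye coordinates and the target positions are hypothetical, and a real pipeline would obtain landmarks from a face detector and then warp the pixels with the resulting transform.

```python
import math

def alignment_transform(left_eye, right_eye,
                        target_left=(30.0, 40.0), target_right=(70.0, 40.0)):
    """Return (a, b, tx, ty) such that the mapping
    (x, y) -> (a*x - b*y + tx, b*x + a*y + ty)
    sends the detected eyes onto the target eye positions."""
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    tdx, tdy = target_right[0] - target_left[0], target_right[1] - target_left[1]
    scale = math.hypot(tdx, tdy) / math.hypot(dx, dy)   # resize to target eye distance
    angle = math.atan2(tdy, tdx) - math.atan2(dy, dx)   # rotate eye line to target line
    a, b = scale * math.cos(angle), scale * math.sin(angle)
    tx = target_left[0] - (a * left_eye[0] - b * left_eye[1])
    ty = target_left[1] - (b * left_eye[0] + a * left_eye[1])
    return a, b, tx, ty

def apply_transform(transform, point):
    """Map one (x, y) point through the similarity transform."""
    a, b, tx, ty = transform
    x, y = point
    return (a * x - b * y + tx, b * x + a * y + ty)

# Hypothetical landmarks from a tilted face in the source image.
left_eye, right_eye = (120.0, 95.0), (180.0, 125.0)
t = alignment_transform(left_eye, right_eye)
aligned_left = apply_transform(t, left_eye)
aligned_right = apply_transform(t, right_eye)
print(aligned_left, aligned_right)
```

After the transform both eyes sit on a horizontal line at a fixed distance, so the recognition model always sees faces in a comparable position and scale.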

Since the 1990s, facial recognition has moved from handcrafted, local-feature-based methods to optimized deep-learning models. The Facebook DeepFace model, evaluated on the LFW dataset, was the first model to reach human-level performance. Classic convolutional neural network (CNN) architectures such as AlexNet, VGGNet, GoogLeNet and ResNet are widely used as baseline models in facial recognition. These models are then adapted to facial recognition with activation functions and loss functions specifically designed to promote discrimination and generalization. Face++ and FaceNet are other neural network systems designed specifically for facial recognition, while MegaFace serves as a large-scale benchmark for such models.
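One well-known example of a loss function designed to promote discrimination is the triplet loss used by FaceNet. It pulls an anchor embedding towards another image of the same person (the positive) and pushes it away from an image of a different person (the negative) until the two distances differ by at least a margin. The sketch below is a plain-Python illustration on toy two-dimensional vectors. In practice the loss is computed over batches of learned embeddings inside a deep-learning framework, and the example vectors here are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero when the negative is farther from the anchor than the positive
    by at least `margin` (in squared distance); positive otherwise."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)

anchor = [0.0, 0.0]
positive = [0.1, 0.0]        # same identity, close to the anchor
easy_negative = [1.0, 0.0]   # different identity, already far away
hard_negative = [0.2, 0.0]   # different identity, still too close

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
print(easy, hard)
```

Only the hard triplet yields a non-zero loss, so training effort concentrates on identities the model still confuses.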

Current challenges in facial recognition include achieving robustness to cross-pose and cross-age face variations, matching photo-sketches instead of real photos, handling low-resolution photos, and resisting occlusions, makeup and spoofing techniques.
