Last Updated: November 08, 2024
Submission: incode_000
Submission Date: September 05, 2023
Developer: Incode Technologies Inc (US)

Accuracy by Dataset

Principal evaluation was performed over four datasets, briefly described below. Further details can be found in the NIST report.

  • Visa Images were collected at United States consular facilities in Mexico. They have pixel dimensions 252x300 with a mean interocular distance of 69 pixels. Most are live captures although some are photographs of printed photographs. These images were used in NIST’s Performance of Automated Age Estimation Algorithms study.
  • Mugshots were collected post-arrest by law enforcement in the United States. They have reasonable compliance with the ANSI/NIST ITL 1-2011 Type 10 standard’s subject acquisition profile levels 10 - 20 for frontal images. The images are JPEG compressed and vary in pixel dimensions, with 480x640 being the most common.
  • Application Images were collected during interviews at immigration offices in the United States. The images are 300x300 pixels and have a uniform (typically white) background. Eyeglasses were removed prior to capture. The images are generally conformant to the ISO/IEC 19794-5 Face Image Data standard.
  • Border Crossing Images were acquired using a webcam operated by an immigration officer. The images are 240x300 with a mean interocular distance of 38 pixels. They frequently suffer from quality-related problems such as off-angle head orientations and poor frontal illumination.

Table Summary

The table below summarizes the performance of the age estimation algorithm. It measures accuracy using the following metrics.
  • Mean Absolute Error (MAE): The average (absolute) difference between the estimated age and the actual age. Lower values are more desirable.
  • Acc(x): The proportion of face images for which the estimated age is within x years of the actual age. Higher values are more desirable.
  • FPR(x, y): The proportion of x year-olds whose ages are estimated to be at least y years old. When x < y this is a false positive rate.
  • FNR(x, y): The proportion of x year-olds who are estimated to be less than y years old. When x > y this is a false negative rate.
  • Failure to Process (FTP): The proportion of images for which an age estimate could not be calculated.
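
As a rough illustration of the definitions above, the metrics can be sketched in Python as follows. This is not the NIST implementation. The function names, the use of None to mark a failure to process, the exclusion of failed images from the accuracy metrics, the inclusive reading of "within x years", and the treatment of an "x year-old" as anyone whose actual age rounds down to x are all assumptions made for this sketch.

```python
import math
from typing import List, Optional, Sequence, Tuple

Estimate = Optional[float]  # None marks a failure to process (assumed convention)


def _processed(actual: Sequence[float], estimated: Sequence[Estimate]) -> List[Tuple[float, float]]:
    """Return (actual, estimate) pairs for images that produced an estimate."""
    return [(a, e) for a, e in zip(actual, estimated) if e is not None]


def mae(actual: Sequence[float], estimated: Sequence[Estimate]) -> float:
    """Mean absolute difference between estimated and actual age."""
    pairs = _processed(actual, estimated)
    return sum(abs(e - a) for a, e in pairs) / len(pairs)


def acc(actual: Sequence[float], estimated: Sequence[Estimate], x: float) -> float:
    """Proportion of estimates within x years of the actual age (inclusive, assumed)."""
    pairs = _processed(actual, estimated)
    return sum(abs(e - a) <= x for a, e in pairs) / len(pairs)


def _estimates_for_age(actual, estimated, x: int) -> List[float]:
    """Estimates for subjects whose actual age rounds down to x (assumed meaning)."""
    return [e for a, e in _processed(actual, estimated) if math.floor(a) == x]


def fpr(actual, estimated, x: int, y: float) -> float:
    """Proportion of x year-olds estimated to be at least y years old."""
    ests = _estimates_for_age(actual, estimated, x)
    return sum(e >= y for e in ests) / len(ests)


def fnr(actual, estimated, x: int, y: float) -> float:
    """Proportion of x year-olds estimated to be less than y years old."""
    ests = _estimates_for_age(actual, estimated, x)
    return sum(e < y for e in ests) / len(ests)


def ftp(estimated: Sequence[Estimate]) -> float:
    """Proportion of images for which no age estimate was produced."""
    return sum(e is None for e in estimated) / len(estimated)


# Made-up values for illustration only; they are not taken from the evaluation.
actual = [16.2, 16.8, 25.1, 25.9, 30.0]
estimated = [18.0, 15.5, 24.0, None, 33.5]
print(mae(actual, estimated), acc(actual, estimated, 2), ftp(estimated))
```

Sweeping y over a range of challenge ages for a fixed x would give the kind of threshold curve described in the Figures section.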

For more detailed definitions of the error metrics see Section 2 of the NIST report.

Figures

The plots below show error rates as a function of threshold (i.e., challenge age). The grey vertical line shows the actual age (16 or 25 years).


Demographics

Accuracy is assessed across different ages, sexes, and races. A person’s broad region of birth is also considered.

Accuracy by Age

The plots below show the Mean Absolute Error (MAE) distribution across age intervals for each dataset.


False positive and false negative rates for persons of different ages when the challenge age is fixed. The results shown are for Visa images.


Sex Effects

False Positives vs. Challenge Age

False positive rates for 16 year olds at different challenge ages and for different datasets.

False Negatives vs. Challenge Age

False negative rates for 25 year olds at different challenge ages and for different datasets.


Region Effects

A person’s geographic region of birth is often a strong indicator of a person’s ethnicity. The plots below show error rates for images of persons from six broad geographic regions:
  • East Africa: Ethiopia, Kenya, Somalia, Sudan, Tanzania
  • West Africa: Nigeria, Liberia, Sierra Leone, Benin, Ghana, Mali, Senegal, Togo
  • East Europe: Poland, Ukraine, Russia, Hungary, Romania, Czechia
  • East Asia: Korea, China, Japan, Taiwan
  • South East Asia: Cambodia, Indonesia, Malaysia, Thailand, Vietnam
  • South Asia: Afghanistan, India, Myanmar, Nepal, Pakistan, Bangladesh

For further details see Section 5.4 of the NIST report.


The Gini coefficient is commonly used in economics to quantify income and wealth inequality in a population. Here, it is used to express variability in accuracy across geographic regions. Lower values are more desirable as they correspond to more equitable outcomes.
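
For readers unfamiliar with the statistic, the sketch below computes a Gini coefficient using the standard mean-absolute-difference formula, G = sum over i and j of |v_i - v_j| divided by (2 n^2 mean). Whether the NIST report uses exactly this form, and which per-region quantity it is applied to, is not stated here. The function name and the sample values are assumptions for illustration only.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient via the mean absolute difference between all pairs."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0  # all values are zero, so there is no inequality to measure
    total_abs_diff = sum(abs(a - b) for a in values for b in values)
    return total_abs_diff / (2 * n * n * mean)


# Hypothetical per-region error values (made up, not results from the evaluation).
region_errors = [3.1, 3.4, 2.9, 3.8, 3.3, 3.6]
print(gini(region_errors))
```

A value of 0 means every region has the same error. Values closer to 1 mean the error is concentrated in a few regions.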


Full Error Distributions


Histograms of Errors


Cumulative Distribution of Errors


Kalina Everyday

The artist Noah Kalina has taken a portrait of himself nearly every day since 2000. NIST purchased a subset of these photos, which were collected with a fixed two-handed selfie technique that introduces only small variations in head orientation, focus, camera distance, and margin around the head. Additional sources of variation include changes in hairstyle, illumination, and background.

The dotted line is the best linear fit of the estimates. The black line represents what would be perfect age estimates.
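
The page does not say how the best linear fit is computed. A minimal sketch, assuming ordinary least squares of estimated age against actual age, is shown below. The function name and sample values are hypothetical. Perfect estimates would give a slope of 1 and an intercept of 0, which is the black line.

```python
from typing import Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up ages for illustration: every estimate is two years too old.
actual_ages = [20, 25, 30, 35]
estimated_ages = [22, 27, 32, 37]
slope, intercept = linear_fit(actual_ages, estimated_ages)
print(slope, intercept)
```

A slope below 1 would indicate that the estimates age more slowly than the subject does. A non-zero intercept with a slope of 1 would indicate a constant offset.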


Effect of Image Resolution

Only available for Mugshots

I thought that
  • age estimates for lower resolution images would be biased young. Finer facial details (e.g., wrinkles) wouldn’t be visible.
  • there’d be more variation for lower resolution images

Neither appears to be happening.