Questionable Observer Detection Competition 2011

The intent of the Questionable Observer Detection Competition (QuODC) 2011 is to measure the performance of face clustering algorithms in detecting people who appear frequently in a collection of videos showing crowds.



Given a set of videos capturing crowds watching a series of events, determine which individuals appear in more than v videos. Individuals who appear in more than v videos are called questionable observers, whereas those who appear in at most v videos are called casual observers. For this competition, v is set to 1. In other words, the objective is to distinguish the questionable observers, who appear in more than one video, from the casual observers, who appear in exactly one video, without the benefit of an existing database of faces with known labels.


The lack of labeling information means that this is a face clustering problem as opposed to a traditional face recognition problem.  First, faces must be detected or tracked in the videos, and then the faces must be grouped together to form a clustering.  As depicted in the diagram below, clusters that contain patterns from more than a single video should be reported as questionable observer clusters.
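The reporting rule described above can be sketched in a few lines of Python. This is only an illustration, not part of the competition tooling: the function name, the (cluster, video) input format and the sample identifiers are all assumptions made for the example.

```python
from collections import defaultdict


def questionable_clusters(assignments, v=1):
    """Return the IDs of clusters whose patterns come from more than v videos.

    assignments: iterable of (cluster_id, video_id) pairs, one per face pattern.
    The input format is hypothetical; the competition does not prescribe it.
    """
    videos_per_cluster = defaultdict(set)
    for cluster_id, video_id in assignments:
        videos_per_cluster[cluster_id].add(video_id)
    # A cluster is reported when its patterns span more than v distinct videos.
    return sorted(c for c, vids in videos_per_cluster.items() if len(vids) > v)


# Hypothetical cluster and video identifiers.
example = [
    ("c1", "video01"),
    ("c1", "video07"),
    ("c2", "video03"),
    ("c2", "video03"),
]
print(questionable_clusters(example))
```

Here cluster c1 spans two videos and would be reported, while c2 holds two patterns from the same video and would not.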







The questionable observer detection problem can arise in the following applications:

•  detecting a serial offender, accomplice or potential informant who often returns to related crime scenes that were captured on video;

•  determining if an unknown person shows up at voting stations multiple times during an election; or

•  finding popular actors in a set of movies using little manual processing.


Performance Metrics


In the ideal clustering, all of the face patterns that correspond to a particular identity would be assigned to the same cluster, and each identity would have only a single corresponding cluster. The performance metrics for the questionable observer detection problem, as shown in the table below, are meant to evaluate clusterings in terms of how close they come to this ideal and how well they facilitate the detection of questionable observers.






Self-Organization Rate (SOR)

The SOR metric essentially acts as a classification rate that penalizes clusters with marginal majorities. The SOR varies within [0.0, 1.0], such that the ideal clustering would have an SOR of 1.0 and a poorly organized clustering would have an SOR near 0.0. When computing this metric, each cluster is treated as though it represents the individual whose face patterns make up the majority of its constituent patterns. All face patterns in a cluster that are associated with any identity other than the one the cluster represents are counted as misclassifications. If no more than half of the patterns in a cluster are associated with the person the cluster represents, that cluster does not have a clear majority and so its contribution to the SOR is discounted. Specifically, the SOR is given by:
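A form consistent with the terms defined below, written under the assumption that the misclassification counts are summed over all pairs of distinct identities a and b, is:

```latex
\mathrm{SOR} = 1 - \frac{\sum_{a \neq b} n_{ab} + n_{e}}{n}
```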






Here, n_ab denotes the number of times one identity was mistaken for another in terms of the cluster assignments,

n_e represents the number of patterns that are assigned to a cluster in which no single individual corresponds to more than half of its patterns, and

n represents the total number of patterns.
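As a rough illustration of how these terms combine, the sketch below computes an SOR value under the assumption that the metric equals one minus the fraction of patterns that are either misclassified or assigned to a cluster without a clear majority. The function name and the (cluster, identity) input format are hypothetical.

```python
from collections import Counter, defaultdict


def self_organization_rate(patterns):
    """Compute an SOR value, assuming SOR = 1 - (sum of n_ab + n_e) / n.

    patterns: iterable of (cluster_id, identity) pairs, one per face pattern.
    """
    clusters = defaultdict(Counter)
    for cluster_id, identity in patterns:
        clusters[cluster_id][identity] += 1

    n = 0            # total number of patterns
    n_mistaken = 0   # sum of the n_ab terms
    n_e = 0          # patterns in clusters with no clear majority
    for counts in clusters.values():
        size = sum(counts.values())
        n += size
        majority = counts.most_common(1)[0][1]
        if 2 * majority > size:
            # Clear majority: every other pattern counts as a misclassification.
            n_mistaken += size - majority
        else:
            # No identity holds more than half of the cluster's patterns.
            n_e += size
    if n == 0:
        raise ValueError("at least one pattern is required")
    return 1.0 - (n_mistaken + n_e) / n


# Hypothetical example: c1 has one stray pattern, c2 has no clear majority.
sample = [
    ("c1", "A"), ("c1", "A"), ("c1", "A"), ("c1", "B"),
    ("c2", "C"), ("c2", "D"),
    ("c3", "E"), ("c3", "E"),
]
print(self_organization_rate(sample))
```

In this example one pattern is misclassified and two fall into a cluster without a clear majority, so three of the eight patterns are penalized.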


False Negative Rate (FNR)


False Positive Rate (FPR)


A false negative occurs when none of the clusters that represent someone who is a questionable observer contain face patterns from more than one video, whereas a false positive occurs when any cluster that represents a casual observer has face patterns from more than a single video. The FNR and FPR indicate the frequency of these two error types, i.e.
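The standard definitions of these two rates, assumed here to be the ones intended, are:

```latex
\mathrm{FNR} = \frac{FN}{FN + TP}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}
```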











•  FN is the number of false negatives,

•  FP is the number of false positives,

•  TP is the number of true positives, and

•  TN is the number of true negatives.
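The sketch below shows one way these counts could be tallied per identity, using the standard rate definitions FN / (FN + TP) and FP / (FP + TN). Several choices are assumptions made for illustration: the function name, the (cluster, identity, video) input format, treating the most frequent identity in a cluster as the one it represents, and returning 0.0 when a rate's denominator is zero.

```python
from collections import Counter, defaultdict


def detection_error_rates(patterns, questionable_ids):
    """Return (FNR, FPR) computed over identities.

    patterns: iterable of (cluster_id, identity, video_id), one per face pattern.
    questionable_ids: identities known from ground truth to be questionable.
    """
    members = defaultdict(Counter)
    videos = defaultdict(set)
    identities = set()
    for cluster_id, identity, video_id in patterns:
        members[cluster_id][identity] += 1
        videos[cluster_id].add(video_id)
        identities.add(identity)

    # An identity is flagged when a cluster that represents it spans >1 video.
    flagged = set()
    for cluster_id, counts in members.items():
        represented = counts.most_common(1)[0][0]  # assumed: most frequent identity
        if len(videos[cluster_id]) > 1:
            flagged.add(represented)

    questionable = identities & set(questionable_ids)
    casual = identities - questionable
    tp = len(questionable & flagged)
    fn = len(questionable - flagged)
    fp = len(casual & flagged)
    tn = len(casual - flagged)
    fnr = fn / (fn + tp) if fn + tp else 0.0  # assumed convention for empty classes
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fnr, fpr


# Hypothetical data: A is detected, B is split across clusters and missed,
# C is wrongly flagged because a pattern of D from another video joined its cluster.
sample = [
    ("c1", "A", "v1"), ("c1", "A", "v2"),
    ("c2", "B", "v1"), ("c3", "B", "v3"),
    ("c4", "C", "v2"), ("c4", "C", "v2"), ("c4", "D", "v3"),
    ("c5", "E", "v3"),
]
fnr, fpr = detection_error_rates(sample, {"A", "B"})
print(fnr, fpr)
```

With this data one of the two questionable observers is missed and one of the three casual observers is flagged.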






The dataset for this competition, the ND-QO-Flip Crowd Video Database, is available for download. This dataset consists of 14 crowd video clips recorded around the University of Notre Dame campus over a seven-month period with a Flip camcorder. It contains 90 subjects, five of whom appear in multiple videos and should be detected as questionable observers. Each video has a resolution of 640x480 and a frame rate of 30 frames per second, contains 25 to 40 seconds of footage, was compressed using H.264, and captures a crowd of four to 12 people. The dataset includes ground truth information describing the videos in which the subjects appear and their initial positions in the video frames. This ground truth can be used to compute the performance metrics presented above.




The objective of the QuODC is to establish the performance of your algorithm(s) on the ND-QO-Flip Crowd Video Database in terms of the SOR, FNR and FPR.   You should include three items with your submission:


1.    A text file or Word document called “summary” that includes the names of the individuals involved in the development of your algorithm, their institutional affiliation(s), the name of your algorithm, a brief description of how it operates and a summary of the results it obtained with respect to the SOR, FNR and FPR metrics.

2.    A directory named “data”, which contains the detected or tracked face images used during clustering.

3.    A CSV result file titled “clustering” that describes the cluster assignments made by the algorithm, with each line containing the name of a face image, the unique identifier of the cluster to which it was assigned, and the University of Notre Dame subject ID of the person contained in the face image.


These items should be submitted in a zip or tar.gz file.
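As an illustration of the “clustering” result file described in item 3, the sketch below reads such a file into tuples. It assumes comma-separated fields in the order listed above and no header row; the image names, cluster identifiers and subject ID strings in the sample are made up for the example.

```python
import csv
import io


def read_clustering(lines):
    """Parse clustering rows of (face image name, cluster ID, subject ID).

    lines: any iterable of text lines, such as an open file object.
    Raises ValueError if a row does not contain exactly three fields.
    """
    rows = []
    for record in csv.reader(lines):
        if not record:
            continue  # skip blank lines
        image_name, cluster_id, subject_id = (field.strip() for field in record)
        rows.append((image_name, cluster_id, subject_id))
    return rows


# Hypothetical file contents; real names and IDs come from your own pipeline
# and from the dataset's ground truth.
sample = io.StringIO(
    "face_0001.png,3,S001\n"
    "face_0002.png,3,S001\n"
    "face_0003.png,7,S042\n"
)
rows = read_clustering(sample)
print(rows)
```

The same function can be given an open file, for example one opened with open(path, newline="").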


Performance will be compared against the approaches described in the paper cited below:


Barr, Jeremiah R. and Bowyer, Kevin W. and Flynn, Patrick J. “Detecting Questionable Observers Using Face Track Clustering.” Proceedings of the 2011 IEEE Workshop on Applications of Computer Vision.


The evaluation results will be discussed in a report delivered at the International Joint Conference on Biometrics (IJCB). Your results must be submitted on or before September 16th, 2011. Please email Jeremiah Barr to register, using “QuODC 2011” as the subject line.




•  September 16th, 2011 - Submission of test results is due.

•  Oct. 11-13, 2011 - Final report will be delivered at the IJCB.



•  Kevin Bowyer

•  Patrick Flynn

•  Jeremiah Barr


Computer Vision and Research Lab

Department of Computer Science & Engineering

University of Notre Dame

384 Fitzpatrick Hall
Notre Dame, IN 46556




The dataset was acquired with the support of the Central Intelligence Agency, the Biometrics Task Force and the Technical Support Working Group through US Army contract W91CRB-08-C-0093.