Automatic face recognition has recently achieved impressive progress, surpassing human performance and scaling up to millions of faces. While recent methods mostly process still images, video provides additional information in terms of varying poses, facial expressions and illumination. Face recognition in video hence presents opportunities for improvement, but also challenges due to, e.g., difficult imaging conditions such as motion blur. In this challenge, participants aim to improve the performance of face recognition in video by aggregating single-frame pre-trained CNN face descriptors.
To facilitate future video face recognition research, we release a dataset of YouTube videos with different resolutions, video quality, head rotations, etc. The dataset contains 20,000 tracks of 1,500 unique identities, annotated with 5 facial landmarks.
To take part in the competition, click "Join in" and authorize through GitLab. At the first authorization, you will be prompted to create a team (the "Create team" button). When you create a team, a repository will be created automatically and the first commit with the baseline will be made.
We assume we are given a set of reference images with faces $I_{ref}$. We also assume access to face tracks obtained by running face detection and tracking algorithms on video streams. Each track is represented by a sequence of ten face images $I_{track}^i,\, i=1,\ldots,10$, uniformly sampled along the track. The goal is to recognize people in video by comparing the sets of images in face tracks $S_{track}=\{I_{track}^i\}_{i=1}^{10}$ to reference images $I_{ref}$.
To compare pairs of face images, we provide a face CNN $f$ that generates $L_2$-normalized descriptors $d=f(I)$ of size 512. The network $f$ has a modified ResNet34 architecture and is pre-trained on the MSCeleb1M face dataset such that the distance $||d^i - d^j||_2^2$ between the resulting descriptors is minimized for images $i,j$ of the same person and maximized if images $i,j$ belong to different people.
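For illustration, the descriptor distance used throughout this page can be computed as follows; the randomly generated descriptors below merely stand in for real CNN outputs:

```python
import numpy as np

# Two L2-normalized 512-d descriptors (random stand-ins for d = f(I))
d_i = np.random.randn(512); d_i /= np.linalg.norm(d_i)
d_j = np.random.randn(512); d_j /= np.linalg.norm(d_j)

# Squared L2 distance between the descriptors
dist = np.sum((d_i - d_j) ** 2)

# For unit-norm descriptors this equals 2 - 2 * cosine similarity
assert np.isclose(dist, 2.0 - 2.0 * np.dot(d_i, d_j))
```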
Participants should design a function $g(S_{track})\rightarrow d_{track}$ that generates an $L_2$-normalized face track descriptor $d_{track}\in\mathbb{R}^{512}$ aggregating information from multiple face images in the track. The descriptor $d_{track}$ should minimize the distance $$D^{i,j} = ||d_{track}^i - d_{ref}^j||_2^2 = ||g(S_{track}^i) - f(I_{ref}^j)||_2^2$$ if the face track $i$ and the reference face image $j$ belong to the same person, and maximize $D^{i,j}$ otherwise.
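As a starting point, a simple choice of $g$ is to average the per-frame descriptors and re-normalize. This is only a sketch, not necessarily the provided baseline, and the 10 × 512 input layout is assumed:

```python
import numpy as np

def aggregate_track(frame_descriptors: np.ndarray) -> np.ndarray:
    """Aggregate per-frame descriptors (assumed shape: 10 x 512) into a single
    L2-normalized track descriptor via mean pooling."""
    d_track = frame_descriptors.mean(axis=0)
    return d_track / np.linalg.norm(d_track)
```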
The performance will be measured by evaluating the distances $D^{i,j}$ for all pairs of tracks $i$ and reference images $j$ in the test set. Given positive and negative pairs $(S_{track},I_{ref})$ for the same and different people respectively, submissions to the VFR challenge will be ranked by maximizing the True Positive Rate at a False Positive Rate of $10^{-6}$ (TPR@FPR=1e-6) on the ROC curve. In the (unlikely) case of identical TPR, the methods will be further ranked by minimizing the average distance $D^{i,j}$ over positive pairs $(S^i_{track},I^j_{ref})$ in the test set.
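Below is a hedged sketch of how TPR@FPR can be computed locally with scikit-learn; the official score is produced by the organizers' evaluation, and the function and variable names here are illustrative:

```python
import numpy as np
from sklearn.metrics import roc_curve

def tpr_at_fpr(distances: np.ndarray, labels: np.ndarray, fpr_target: float = 1e-6) -> float:
    """labels: 1 for positive (same person) pairs, 0 for negative pairs.
    distances: squared L2 distances D^{i,j}; smaller means more similar,
    so the negated distance is used as the similarity score."""
    fpr, tpr, _ = roc_curve(labels, -distances)
    ok = fpr <= fpr_target
    # Largest TPR among thresholds whose FPR does not exceed the target
    return float(tpr[ok].max()) if ok.any() else 0.0
```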
Results in the VFR public leaderboard will be evaluated on 50% of the test set. The remaining 50% of the test set is reserved to determine the final results of the challenge. Participants are requested to submit the evaluation script eval.py, which should accept the input files test_df.csv and track_order_df.csv defining the description and the order of the tracks, respectively. To evaluate submissions, the participants' code will be executed on the VFR challenge server. More details are available in the How to submit section.
To develop solutions, participants are allowed to use the provided CNN $f$ as well as any other functions extracting information from the input images. To encourage original and resource-friendly solutions, and to avoid ensembles of many models, we restrict the evaluation time to 15 min on servers with the following hardware configuration:
- OS: Ubuntu 18.04
- CPU: i7-8700 (6 cores)
- RAM: 16 GB
- GPU: 2080Ti
- Execution time: 15 min
In case of questions regarding the task description, evaluation measures or other competition details, please consult our FAQ and Discussion sections. You can also ask questions in the chat.
Train data
| File name | Description | Size (rows × columns) |
|---|---|---|
| train_df.csv | Paths to the original and warped face images, the face box coordinates, the coordinates of the 5 facial landmarks, and other meta information | 155340 × 21 |
| train_gt_df.csv | Meta information of the ground-truth (GT) images used to compute recognition accuracy | 4675 × 18 |
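For a first look at the data, the training metadata can be loaded with pandas. This is a minimal sketch; the exact column names are not listed here, so the snippet simply prints them:

```python
import pandas as pd

# Load the training metadata and inspect its structure
train_df = pd.read_csv("train_df.csv")
print(train_df.shape)              # expected: (155340, 21)
print(train_df.columns.tolist())   # inspect the available meta-information fields
print(train_df.head())
```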
To submit a solution, you will need to add it to a GitLab repository.
If you have not previously worked with Git, we recommend using SmartGit; see the instructions here.
Your repository will contain two branches: master and leaderboard. The leaderboard branch is protected, and you can add changes there only via a merge request.
Important note on submissions: you can submit only 3 times per day.
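If you prefer the plain git command line to SmartGit, the submission flow is roughly the following. This is only a sketch: replace the repository placeholder with your team repository, and create the merge request from master to leaderboard in the GitLab web UI:

```bash
git clone <your-team-repository-url>   # replace with your team repository
cd <repository-name>
# work on the master branch, then commit and push your changes
git add eval.py requirements.txt
git commit -m "Update solution"
git push origin master
# finally, open a merge request master -> leaderboard in the GitLab web UI
```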
Your solution should meet the following requirements: the launch script should be named eval.py and should take four required arguments: the paths to the CSV files test_df.csv and track_order_df.csv, the path to test_descriptors.npy, and the path to the saved descriptors agg_descriptors.npy.
If you have your own solution (other than eval.py), you can use a custom run.sh run script (see the sketch below). Note that if you want to use a custom GPU solution, the id of the video card will be 0.
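A minimal run.sh sketch, under the assumption that it receives the same four arguments as eval.py and simply forwards them to your own entry point (my_solution.py is a hypothetical name):

```bash
#!/bin/bash
# Forward the four expected arguments to a custom entry point (hypothetical name).
python3 my_solution.py "$1" "$2" "$3" "$4"
```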
1. File name
Rename the main script (which is responsible for training and predictions) to eval.py.
2. Reading the data
Configure eval.py to read data from the CSV files.
When running the script, the command line should take 4 arguments: the paths to test_df.csv, track_order_df.csv and test_descriptors.npy, and the fourth argument is the path where the aggregated descriptors will be saved.
Example:
python3 eval.py test_df.csv track_order_df.csv test_descriptors.npy agg_descriptors.npy
3. Saving results
Prediction results should be saved in an npy file agg_descriptors.npy in the directory of eval.py.
Example:
import numpy as np
np.save('agg_descriptors.npy', results)
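Putting the three steps together, a skeleton of eval.py might look like the following. This is only a sketch: the layout of test_descriptors.npy (one 512-d descriptor per frame, with every 10 consecutive rows forming one track in the order given by track_order_df.csv) is an assumption, and the mean-pooling aggregation merely stands in for your own method:

```python
import sys

import numpy as np
import pandas as pd


def main():
    # Four required command-line arguments (see the example above)
    test_df_path, track_order_path, descriptors_path, output_path = sys.argv[1:5]

    test_df = pd.read_csv(test_df_path)             # per-image metadata (columns assumed)
    track_order_df = pd.read_csv(track_order_path)  # defines the order of tracks in the output

    # Assumption: one L2-normalized 512-d descriptor per frame,
    # with every 10 consecutive rows forming one track in the required order.
    descriptors = np.load(descriptors_path)
    per_track = descriptors.reshape(-1, 10, descriptors.shape[1])

    # Baseline aggregation: mean pooling followed by L2 re-normalization
    agg = per_track.mean(axis=1)
    agg /= np.linalg.norm(agg, axis=1, keepdims=True)

    # Save the aggregated track descriptors to the requested output path
    np.save(output_path, agg)


if __name__ == "__main__":
    main()
```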
In the repository you will find the file .gitlab-ci.yml. It is already configured to send solutions to the leaderboard.
Important: do not change the contents of the .gitlab-ci.yml file. If you need to do this, contact the support team.
The winners will share a cash pool of 500,000 RUB and will receive desktops with a depth camera and a VPU neural network module from our partner Intel. The prizes will be awarded at the annual Machines Can See conference, held in Moscow on 25 June 2019.
1st place - 250,000 RUB + Intel NUC L10 (CPU: i7) + depth camera Realsense D415 + Neural Compute Stick 2
2nd place - 150,000 RUB + Intel NUC L10 (CPU: i5) + depth camera Realsense D415 + Neural Compute Stick 2
3rd place - 100,000 RUB + Intel NUC L10 (CPU: i5) + depth camera Realsense D415 + Neural Compute Stick 2
Competition winners or their authorized representatives will be required to attend the Machines Can See 2019 conference to receive prizes.
Prize winners will also be requested to:
In case of questions regarding the task description, evaluation measures or other competition details, please consult our FAQ and Discussion sections.
Q: When is the competition being held?
A: The competition is open from April 29 until June 22, 23:59 MSK, 2019. Winners will be announced at the Machines Can See conference.
Q: What are the ranking criteria?
A: The methods will be ranked according to TPR at FPR=1e-6. Details
Q: How to take part in the competition?
A: Choose "Log in" on the competition site. You will be able to create a team once registered at GitLab.
Q: How to register for the Machines Can See conference?
A: Participants of the competition are automatically registered to the Machines Can See conference.
Q: Is it possible to register for the competition after its starting date?
A: Yes, the registration will be open until the last day of the competition.
Q: What should I do to join a team to take part in the contest?
A: Please contact the support team.
Q: I’m not able to participate in the conference. Can I still take part in the competition?
A: At least one member from each top-ranked team should participate and present results of the team at the conference.
Q: How to access the repository?
A: Once a team is created, the repository should appear on the projects tab in GitLab.
Q: How to make a submission?
A: Submission details can be found here.
Q: When will my results appear on the leaderboard?
A: The progress status will appear in the CI / CD section of GitLab. The processing time will depend on the time it takes to train and test your solution.
Q: What should I do if the submission status is indicated as "Done" but my results do not appear on the leaderboard?
A: Check the output in the jobs section for errors. If there are no errors and the script terminated successfully, the problem could be caused by network or hardware issues. Please restart the submission or send the name of your team and the commit identifier to the support team.
Q: How to install packages for Python (pip packages)?
A: Add the list of packages to the file requirements.txt in the standard way.
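For example, requirements.txt could contain entries such as the following (the package choices and versions are purely illustrative):

```
numpy==1.16.2
pandas==0.24.2
scikit-learn==0.20.3
```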
Q: What environment will be used?
A: A Docker image based on nvidia/cuda:9.0-cudnn7-runtime.
Q: How to install apt packages?
A: Add apt packages to the file apt-packages.txt, one package per line.
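For example (the package names are purely illustrative):

```
libsm6
libxext6
```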
Q: Any advice on getting started?
A: In the repository you will see the eval.py file, which contains the baseline model. The get_scores.py file lets you check your result locally (it computes the main metric of the competition).
You can also ask questions in the chat.