Final Report
1. The testing dataset
The testing dataset includes 36 images (28 NIR-RGB image pairs and 8 NIR images). We use the NIR-RGB image pairs for quantitative evaluation. For qualitative evaluation, we manually selected 16 (out of the 36) visual outputs that cover diverse scene categories and complexities, and invited judges to rank the quality of each image.
The testing dataset can be found via the link: https://drive.google.com/file/d/1VrKvbqAOVwWdkOmaVgENlmxyP_Ah--bh/view?usp=sharing
2. The assessment criteria
The following assessment criteria are adopted for evaluation, comprising two parts: quantitative measures (50%) and qualitative measures (50%):

i. Quantitative measures (50%):
   i.1 Peak Signal-to-Noise Ratio (PSNR, 15%)
   i.2 Structural Similarity (SSIM, 15%)
   i.3 Angular Error (AE, 20%)

ii. Qualitative measures (50%):
   ii.1 Color realism and vividness (25%)
   ii.2 Texture and boundary fidelity (25%)
We have adopted a ranking-based scoring system. For each assessment criterion, we rank the teams by their performance: the best- through worst-performing teams earn 5, 4, 3, 2, and 1 points, respectively, for that assessment category. The earned points are weighted and accumulated over all assessment criteria to give each team's final score.
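As an illustration, the weighted-ranking calculation can be sketched as follows (a minimal Python sketch; the function and variable names are ours, and the per-criterion rankings are taken from the detailed evaluation results in Section 3):

```python
# Sketch of the ranking-based scoring scheme described above.
# Weights of the five assessment criteria (they sum to 1.0).
WEIGHTS = {"PSNR": 0.15, "SSIM": 0.15, "AE": 0.20,
           "color": 0.25, "texture": 0.25}

def final_scores(rankings):
    """rankings: {criterion: list of teams ordered best to worst}.

    The best- to worst-ranked teams earn 5, 4, 3, 2, 1 points per
    criterion; points are weighted and accumulated into final scores.
    """
    scores = {}
    for criterion, ordered in rankings.items():
        for position, team in enumerate(ordered):
            points = 5 - position  # 5 for rank 1, 4 for rank 2, ...
            scores[team] = scores.get(team, 0.0) + WEIGHTS[criterion] * points
    return scores

# Per-criterion rankings read off the detailed evaluation results
# reproduce the published final scores.
RANKINGS = {
    "PSNR":    ["A*StarTrek", "Amazing Grace", "IPL", "NPU_CIAIC", "Media Lab Warriors"],
    "SSIM":    ["A*StarTrek", "NPU_CIAIC", "IPL", "Amazing Grace", "Media Lab Warriors"],
    "AE":      ["A*StarTrek", "Amazing Grace", "NPU_CIAIC", "IPL", "Media Lab Warriors"],
    "color":   ["NPU_CIAIC", "Amazing Grace", "A*StarTrek", "IPL", "Media Lab Warriors"],
    "texture": ["NPU_CIAIC", "Amazing Grace", "Media Lab Warriors", "A*StarTrek", "IPL"],
}
scores = final_scores(RANKINGS)  # e.g. NPU_CIAIC -> 4.00, A*StarTrek -> 3.75
```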
For the quantitative measures, the output images are compared against the ground-truth images, and the respective statistics (i.e., PSNR, SSIM, and AE) are calculated and ranked.
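For reference, PSNR and AE can be computed as in the NumPy sketch below (assuming 8-bit RGB images; SSIM is typically computed with an off-the-shelf implementation such as `skimage.metrics.structural_similarity`):

```python
import numpy as np

def psnr(gt, pred, peak=255.0):
    """Peak Signal-to-Noise Ratio (dB) between two images."""
    mse = np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def angular_error(gt, pred, eps=1e-8):
    """Mean per-pixel angular error (degrees) between RGB color vectors."""
    gt = gt.reshape(-1, 3).astype(np.float64)
    pred = pred.reshape(-1, 3).astype(np.float64)
    cos = np.sum(gt * pred, axis=1) / (
        np.linalg.norm(gt, axis=1) * np.linalg.norm(pred, axis=1) + eps)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean()
```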
For the qualitative measures, we carry out human subjective ranking. For the final five shortlisted teams, we invited 16 independent judges to evaluate the colorization outputs for 16 images from all five teams. For each image, each judge chooses the top 3 best-quality outputs based on the criteria specified in ii.1 and ii.2, and the corresponding top 3 teams receive 3, 2, and 1 vote points, respectively, from that judge. The vote points earned over all images and all judges are accumulated; the team winning the most total vote points gets 5 points, and the other teams get 4, 3, 2, and 1 points based on their ranking.
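The vote-point tallying can be sketched as follows (illustrative Python; the ballot format and function names are our assumptions, not the organizers' actual tooling):

```python
from collections import Counter

def tally_votes(ballots):
    """ballots: iterable of (image_id, judge_id, top3), where top3 lists
    a judge's three best teams for that image, best first.
    The top-3 picks earn 3, 2, and 1 vote points, respectively."""
    votes = Counter()
    for _image, _judge, top3 in ballots:
        for points, team in zip((3, 2, 1), top3):
            votes[team] += points
    return votes

def ranking_points(votes, teams):
    """Convert accumulated vote points into 5, 4, 3, ... ranking points."""
    ordered = sorted(teams, key=lambda t: votes[t], reverse=True)
    return {team: 5 - i for i, team in enumerate(ordered)}
```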
3. Detailed evaluation results
We received registrations from 15 international teams, with participants from universities, academic institutes, and the AI industry. In the first phase (by 20 September 2020), we shortlisted the 5 top-performing teams to participate in the second round of evaluation (by 20 November 2020).
We list the detailed evaluation results for all shortlisted teams. The output images from all participating teams can be found via the link: https://drive.google.com/file/d/1B6YV_USayVXqB6JE0ckzB_riBebD1hhk/view?usp=sharing
Quantitative measures:

| | A*StarTrek | Amazing Grace | IPL | Media Lab Warriors | NPU_CIAIC |
|---|---|---|---|---|---|
| PSNR (dB) | 20.67 | 19.24 | 17.50 | 14.26 | 17.39 |
| Earned points (weight 15%) | 5 | 4 | 3 | 1 | 2 |
| SSIM | 0.68 | 0.59 | 0.60 | 0.57 | 0.61 |
| Earned points (weight 15%) | 5 | 2 | 3 | 1 | 4 |
| AE (degrees) | 3.97 | 4.59 | 5.22 | 5.61 | 4.69 |
| Earned points (weight 20%) | 5 | 4 | 2 | 1 | 3 |

Human-subjective ranking:

| | A*StarTrek | Amazing Grace | IPL | Media Lab Warriors | NPU_CIAIC |
|---|---|---|---|---|---|
| Color realism & vividness (vote points) | 325 | 354 | 204 | 183 | 469 |
| Earned points (weight 25%) | 3 | 4 | 2 | 1 | 5 |
| Texture & boundary fidelity (vote points) | 249 | 406 | 196 | 253 | 425 |
| Earned points (weight 25%) | 2 | 4 | 1 | 3 | 5 |

Final scores:

| | A*StarTrek | Amazing Grace | IPL | Media Lab Warriors | NPU_CIAIC |
|---|---|---|---|---|---|
| Total points earned | 3.75 | 3.70 | 2.05 | 1.50 | 4.00 |
4. The final team standings

| Final Ranking | Team Name | Affiliation | Final Score | Team Members |
|---|---|---|---|---|
| 1 | NPU_CIAIC | Northwestern Polytechnical University, China; Shanghai Shengyao Intelligent Technology Co., Ltd, China; Blueye Intelligence, Zhenjiang, China | 4.00 | Longbin Yan, Xiuheng Wang, Min Zhao, Shumin Liu, Jie Chen |
| 2 | A*StarTrek | Institute of High Performance Computing, A*STAR, Singapore; Institute for Infocomm Research, A*STAR, Singapore | 3.75 | Zaifeng Yang, Zhenghua Chen |
| 3 | Amazing Grace | Xi'an University of Posts & Telecommunications, Xi'an, China; Xidian University, Xi'an, China | 3.70 | Tian Sun, Cheolkon Jung |
| 4 | IPL | Politecnico di Torino, Italy | 2.05 | Diego Valsesia, Giulia Fracastoro, Enrico Magli |
| 5 | Media Lab Warriors | Xidian University, Xi'an, China | 1.50 | Lu Liu, Cheolkon Jung, FengQiao Wang |