TY - GEN
T1 - THEMIS: a fair evaluation platform for computer vision competitions
AU - Cai, Zinuo
AU - Yuan, Jianyong
AU - Hua, Yang
AU - Song, Tao
AU - Wang, Hao
AU - Xue, Zhengui
AU - Hu, Ningxin
AU - Ding, Jonathan
AU - Ma, Ruhui
AU - Haghighat, Mohammad Reza
AU - Guan, Haibing
PY - 2021/8/27
AB - It has become increasingly thorny for computer vision competitions to preserve fairness when participants intentionally fine-tune their models against the test datasets to improve their performance. To mitigate such unfairness, competition organizers restrict the training and evaluation process of participants' models. However, such restrictions introduce massive computation overheads for organizers and potential intellectual property leakage for participants. Thus, we propose Themis, a framework that trains a noise generator jointly with organizers and participants to prevent intentional fine-tuning by protecting test datasets from surreptitious manual labeling. Specifically, with the carefully designed noise generator, Themis adds noise to perturb test sets without twisting the performance ranking of participants' models. We evaluate the validity of Themis with a wide spectrum of real-world models and datasets. Our experimental results show that Themis effectively enforces competition fairness by precluding manual labeling of test sets and preserving the performance ranking of participants' models.
DO - 10.24963/ijcai.2021/83
M3 - Conference contribution
AN - SCOPUS:85125445835
T3 - IJCAI Proceedings
SP - 599
EP - 605
BT - Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021)
A2 - Zhou, Zhi-Hua
PB - International Joint Conferences on Artificial Intelligence
T2 - 30th International Joint Conference on Artificial Intelligence 2021
Y2 - 19 August 2021 through 27 August 2021
ER -