TY - GEN
T1 - Crowd-in-the-loop: A hybrid approach for annotating semantic roles
T2 - 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017
AU - Wang, Chenguang
AU - Akbik, Alan
AU - Chiticariu, Laura
AU - Li, Yunyao
AU - Xia, Fei
AU - Xu, Anbang
N1 - Publisher Copyright:
© 2017 Association for Computational Linguistics.
PY - 2017
Y1 - 2017
N2 - Crowdsourcing has proven to be an effective method for generating labeled data for a range of NLP tasks. However, multiple recent attempts to use crowdsourcing to generate gold-labeled training data for semantic role labeling (SRL) reported only modest results, indicating that SRL is perhaps too difficult a task to be effectively crowdsourced. In this paper, we postulate that while producing SRL annotation does in general require expert involvement, a large subset of SRL labeling tasks is in fact appropriate for the crowd. We present a novel workflow in which we employ a classifier to identify difficult annotation tasks and route each task to either experts or crowd workers according to its difficulty. Our experimental evaluation shows that the proposed approach reduces the workload for experts by over two-thirds, and thus significantly reduces the cost of producing SRL annotation with little loss in quality.
AB - Crowdsourcing has proven to be an effective method for generating labeled data for a range of NLP tasks. However, multiple recent attempts to use crowdsourcing to generate gold-labeled training data for semantic role labeling (SRL) reported only modest results, indicating that SRL is perhaps too difficult a task to be effectively crowdsourced. In this paper, we postulate that while producing SRL annotation does in general require expert involvement, a large subset of SRL labeling tasks is in fact appropriate for the crowd. We present a novel workflow in which we employ a classifier to identify difficult annotation tasks and route each task to either experts or crowd workers according to its difficulty. Our experimental evaluation shows that the proposed approach reduces the workload for experts by over two-thirds, and thus significantly reduces the cost of producing SRL annotation with little loss in quality.
UR - https://www.scopus.com/pages/publications/85059902091
U2 - 10.18653/v1/d17-1205
DO - 10.18653/v1/d17-1205
M3 - Conference contribution
AN - SCOPUS:85059902091
T3 - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
SP - 1913
EP - 1922
BT - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
PB - Association for Computational Linguistics (ACL)
Y2 - 9 September 2017 through 11 September 2017
ER -