Verification of a Dataset for Korean Machine Reading Comprehension with Numerical Discrete Reasoning over Paragraphs
DOI: http://dx.doi.org/10.30630/joiv.6.2-2.1120
References
C. A. Perfetti, N. Landi, and J. Oakhill. 2005. The Acquisition of Reading Comprehension Skill. In M. J. Snowling and C. Hulme (Eds.), The Science of Reading: A Handbook, pages 227–247. Blackwell Publishing.
N. K. Duke and P. D. Pearson. 2009. Effective Practices for Developing Reading Comprehension. Journal of Education, 189(1-2):107–122.
P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2383–2392, Austin, Texas. Association for Computational Linguistics.
S. Lim, M. Kim, and J. Lee. 2018. KorQuAD: Korean QA Dataset for Machine Comprehension. In Proceedings of the Conference of the Korea Information Science Society, pages 539–541.
J. Devlin, et al. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805.
C. Raffel, et al. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21(140):1–67.
A. Vaswani, et al. 2017. Attention Is All You Need. In Advances in Neural Information Processing Systems 30.
G. Kim, et al. 2022. AI Student: A Machine Reading Comprehension System for the Korean College Scholastic Ability Test. Mathematics, 10(9):1486.
G. Kim, et al. 2020. Automatic Extraction of Named Entities of Cyber Threats Using a Deep Bi-LSTM-CRF Network. International Journal of Machine Learning and Cybernetics, 11(10):2341–2355.
A. Radford, et al. 2019. Language Models Are Unsupervised Multitask Learners. OpenAI Blog, 1(8):9.
G. Kim, et al. 2021. Enhancing Korean Named Entity Recognition With Linguistic Tokenization Strategies. IEEE Access, 9:151814–151823.
D. Dua, Y. Wang, P. Dasigi, G. Stanovsky, S. Singh, and M. Gardner. 2019. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2368–2378, Minneapolis, Minnesota. Association for Computational Linguistics.
Q. Ran, Y. Lin, P. Li, J. Zhou, and Z. Liu. 2019. NumNet: Machine Reading Comprehension with Numerical Reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2474–2484, Hong Kong, China. Association for Computational Linguistics.
A. W. Yu, et al. 2018. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. arXiv preprint arXiv:1804.09541.
M. Hu, Y. Peng, Z. Huang, and D. Li. 2019. A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1596–1606, Hong Kong, China. Association for Computational Linguistics.
K. Chen, et al. 2020. Question Directed Graph Attention Network for Numerical Reasoning over Text. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6759–6768, Online. Association for Computational Linguistics.
A. Saha, S. Joty, and S. C. Hoi. 2021. Weakly Supervised Neuro-Symbolic Module Networks for Numerical Reasoning. arXiv preprint arXiv:2101.11802.
S. Reddy, D. Chen, and C. D. Manning. 2019. CoQA: A Conversational Question Answering Challenge. Transactions of the Association for Computational Linguistics, 7:249–266.
Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. Cohen, R. Salakhutdinov, and C. D. Manning. 2018. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2369–2380, Brussels, Belgium. Association for Computational Linguistics.
S. Zhang, et al. 2018. ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension. arXiv preprint arXiv:1810.12885.
J. Welbl, P. Stenetorp, and S. Riedel. 2018. Constructing Datasets for Multi-hop Reading Comprehension Across Documents. Transactions of the Association for Computational Linguistics, 6:287–302.
A. Magueresse, V. Carles, and E. Heetderks. 2020. Low-Resource Languages: A Review of Past Work and Future Challenges. arXiv preprint arXiv:2006.07264.
G. Kim, et al. 2021. Reading Comprehension Requiring Discrete Reasoning Over Paragraphs for Korean. In Proceedings of the Annual Conference on Human and Language Technology.
K. Clark, et al. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555.
J. Park. 2020. KoELECTRA: Pretrained ELECTRA Model for Korean. GitHub. https://github.com/monologg/KoELECTRA