LREC 2022 Program - Remote Sessions

 

                         Remote Papers
                         Session: R1 - Applications involving LRs and Evaluation (including applications in specific domains)
   Predicting the Proficiency Level of Nonnative Hebrew Authors
[Paper] [Video]
Isabelle Nguyen1 and Shuly Wintner2
1Humboldt-Universität zu Berlin, 2University of Haifa
   Trends, Limitations and Open Challenges in Automatic Readability Assessment Research
[Paper] [Video]
Sowmya Vajjala
National Research Council
   HateCheckHIn: Evaluating Hindi Hate Speech Detection Models
[Paper] [Video]
Mithun Das1, Punyajoy Saha2, Binny Mathew2, Animesh Mukherjee3
1Indian Institute of Technology Kharagpur, India, 2Indian Institute of Technology, Kharagpur, 3IIT Kharagpur
   Surfer100: Generating Surveys From Web Resources, Wikipedia-style
[Paper] [Poster] [Video]
Irene Li1, Alex Fabbri2, Rina Kawamura1, Yixin Liu3, Xiangru Tang1, Jaesung Tae1, Chang Shen1, Sally Ma1, Tomoe Mizutani1, Dragomir Radev1
1Yale University, 2Salesforce AI Research, 3Carnegie Mellon University
   MS-LaTTE: A Dataset of Where and When To-do Tasks are Completed
[Paper] [Video]
Sujay Kumar Jauhar1, Nirupama Chandrasekaran1, Michael Gamon1, Ryen White2
1Microsoft Research, 2Microsoft
   KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics
[Paper] [Video]
Saida Mussakhojayeva, Yerbolat Khassanov, Huseyin Atakan Varol
Nazarbayev University
   A Graph-Based Method for Unsupervised Knowledge Discovery from Financial Texts
[Paper] [Poster] [Video]
Joel Oksanen1, Abhilash Majumder2, Kumar Saunack2, Francesca Toni1, Arun Dhondiyal2
1Imperial College London, 2MSCI Inc.
   Leveraging Mental Health Forums for User-level Depression Detection on Social Media
[Paper] [Poster] [Video]
Sravani Boinepelli1, Tathagata Raha1, Harika Abburi1, Pulkit Parikh1, Niyati Chhaya2, Vasudeva Varma1
1International Institute of Information Technology, Hyderabad, 2Adobe Research
   Classifying Implant-Bearing Patients via their Medical Histories: a Pre-Study on Swedish EMRs with Semi-Supervised GanBERT
[Paper] [Poster] [Video]
Benjamin Danielsson1, Marina Santini2, Peter Lundberg1, Yosef Al-Abasse1, Arne Jonsson1, Emma Eneling1, Magnus Stridsman1
1Linköping University, 2RISE, Research Institutes of Sweden. Division: Digital Systems
   Standardisation of Dialect Comments in Social Networks in View of Sentiment Analysis: Case of Tunisian Dialect
[Paper] [Poster] [Video]
Saméh Kchaou1, Rahma Boujelbane2, Emna Fsih3, Lamia Hadrich-Belguith4
1MIRACLE TUNISIA, 2FSEGS, 3Faculty of Economics and Management of Sfax, 4ANLP Research Group, MIRACL Lab, FSEGS, Sfax University
   EnsyNet: A Dataset for Encouragement and Sympathy Detection
[Paper] [Video]
Tiberiu Sosea and Cornelia Caragea
University of Illinois at Chicago
   Preliminary Results on the Evaluation of Computational Tools for the Analysis of Quechua and Aymara
[Paper] [Poster] [Video]
Marcelo Yuji Himoro1 and Antonio Pareja-Lora2
1Universidad Nacional de Educación a Distancia, 2Universidad de Alcalá (UAH) / FITISPos (UAH) / ATLAS (UNED) / DMEG (UdG)
   A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus
[Paper] [Slides] [Video]
Siddhant Arora1, Henry Hosseini2, Christine Utz3, Vinayshekhar Bannihatti Kumar4, Tristan Dhellemmes5, Abhilasha Ravichander6, Peter Story6, Jasmine Mangat7, Rex Chen6, Martin Degeling3, Thomas Norton8, Thomas Hupperich2, Shomir Wilson9, Norman Sadeh6
1Student at Carnegie Mellon University, 2University of Münster, 3Ruhr University Bochum, 4AWS AI, 5Institute for Software Research, Carnegie Mellon University, 6Carnegie Mellon University, 7University of Massachusetts Amherst, 8Fordham University School of Law, 9Pennsylvania State University
   MeSHup: Corpus for Full Text Biomedical Document Indexing
[Paper] [Poster] [Video]
Xindi Wang1, Robert E. Mercer2, Frank Rudzicz3
1University of Western Ontario, 2The University of Western Ontario, 3St Michael's Hospital; University of Toronto, Department of Computer Science
                         Session: R2 - Corpora and Annotation
   Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding
[Paper] [Video]
Yanjun Gao1, Dmitriy Dligach2, Timothy Miller3, Samuel Tesch1, Ryan Laffin1, Matthew Churpek1, Majid Afshar1
1University of Wisconsin Madison, 2Loyola University, 3Boston Children's Hospital and Harvard Medical School
   KC4MT: A High-Quality Corpus for Multilingual Machine Translation
[Paper] [Poster] [Video]
Vinh Nguyen1, Ha Nguyen2, Huong Le3, Thai Nguyen1, Tan Bui2, Luan Pham2, Anh Phan3, Cong Nguyen4, Viet Tran5, Anh Tran6
1VNU - UET, 2Project KC4.0, 3Hanoi University of Science and Technology, 4congnhm@vnu.edu.vn, 5University of Economic and Technical Industries, 6Thai Binh University
   Developing A Multilabel Corpus for the Quality Assessment of Online Political Talk
[Paper] [Video]
Kokil Jaidka
National University of Singapore
   BILinMID: A Spanish-English Corpus of the US Midwest
[Paper] [Poster] [Video]
Irati Hurtado
University of Illinois at Urbana-Champaign
   One Document, Many Revisions: A Dataset for Classification and Description of Edit Intents
[Paper] [Poster] [Video]
Dheeraj Rajagopal1, Xuchao Zhang2, Michael Gamon3, Sujay Kumar Jauhar3, Diyi Yang4, Eduard Hovy5
1Carnegie Mellon University, 2NEC Labs America, 3Microsoft Research, 4Georgia Institute of Technology, 5CMU
   CTAP for Chinese: A Linguistic Complexity Feature Automatic Calculation Platform
[Paper] [Poster] [Video]
Yue Cui1, Junhui Zhu2, Liner Yang2, Xuezhi Fang2, Xiaobin Chen3, Yujie Wang4, Erhong Yang5
1Beijing Language and Culture University, 2Beijing Language and Culture University, 3Tübingen Universität, 4Beijing Jiaotong University, 5Beijing Language and Culture University
   A Corpus for Suggestion Mining of German Peer Feedback
[Paper] [Poster] [Video]
Dominik Pfütze, Eva Ritz, Julius Janda, Roman Rietsche
University of St.Gallen
   CLGC: A Corpus for Chinese Literary Grace Evaluation
[Paper] [Poster] [Video]
Yi Li, Dong Yu, Pengyuan Liu
Beijing Language and Culture University
   Anonymising the SAGT Speech Corpus and Treebank
[Paper] [Video]
Özlem Çetinoğlu1 and Antje Schweitzer2
1IMS, University of Stuttgart, 2Institute for Natural Language Processing, University of Stuttgart
   Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction
[Paper] [Poster] [Video]
Daisuke Suzuki1, Yujin Takahashi1, Ikumi Yamashita1, Taichi Aida1, Tosho Hirasawa1, Michitaka Nakatsuji1, Masato Mita2, Mamoru Komachi1
1Tokyo Metropolitan University, 2CyberAgent Inc.
   Enhanced Distant Supervision with State-Change Information for Relation Extraction
[Paper] [Poster] [Video]
Jui Shah1, Dongxu Zhang2, Sam Brody3, Andrew McCallum4
1University of Massachusetts Amherst, 2University of Massachusetts, Amherst, 3Bloomberg, 4UMass Amherst
   The Hebrew Essay Corpus
[Paper] [Poster] [Video]
Chen Gafni, Anat Prior, Shuly Wintner
University of Haifa
   Design and Evaluation of the Corpus of Everyday Japanese Conversation
[Paper] [Poster] [Video]
Hanae Koiso1, Haruka Amatani2, Yasuharu Den3, Yuriko Iseki2, Yuichi Ishimoto4, Wakako Kashino2, Yoshiko Kawabata2, Ken'ya Nishikawa2, Yayoi Tanaka2, Yasuyuki Usuda5, Yuka Watanabe2
1The National Institute for Japanese Language and Linguistics, 2National Institute for Japanese Language and Linguistics, 3Graduate School of Humanities, Chiba University, 4Institute of Technologists, 5NINJAL
   Developing Language Resources and NLP Tools for the North Korean Language
[Paper] [Video]
Arda Akdemir1, Yeojoo Jeon1, Tetsuo Shibuya2
1University of Tokyo, 2The University of Tokyo
   Developing a Dataset of Overridden Information in Wikipedia
[Paper] [Poster] [Video]
Masatoshi Tsuchiya and Yasutaka Yokoi
Toyohashi University of Technology
   BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language
[Paper] [Poster] [Video]
Bernardo Consoli1, Henrique dos Santos2, Ana Helena Ulbrich2, Renata Vieira3, Rafael Bordini4
1Pontifícia Universidade Católica do Rio Grande do Sul, 2Institute for Artificial Intelligence in Healthcare, 3University of Évora, 4Pontifical Catholic University of Rio Grande do Sul
   Universal Grammatical Dependencies for Portuguese with CINTIL Data, LX Processing and CLARIN support
[Paper] [Poster] [Video]
António Branco, João Ricardo Silva, Luís Gomes, João António Rodrigues
University of Lisbon
   CWID-hi: A Dataset for Complex Word Identification in Hindi Text
[Paper] [Poster] [Video]
Gayatri Venugopal1, Dhanya Pramod2, Ravi Shekhar3
1Symbiosis Institute of Computer Studies and Research, Symbiosis International University, 2Symbiosis Centre for Information Technology, Symbiosis International University, 3Queen Mary University of London
   Automatic Classification of Russian Learner Errors
[Paper] [Poster] [Video]
Alla Rozovskaya
Queens College, City University of New York
   Annotation of metaphorical expressions in the Basic Corpus of Polish Metaphors
[Paper] [Poster] [Video]
Elżbieta Hajnicz
Institute of Computer Science, Polish Academy of Sciences
   ChiMST: A Chinese Medical Corpus for Word Segmentation and Medical Term Recognition
[Paper] [Video]
Yuanhe Tian1, Han Qin2, Fei Xia3, Yan Song4
1Department of Linguistics, University of Washington, 2The Chinese University of Hong Kong (Shenzhen), 3University of Washington, 4CUHK-SZ
   Building a Synthetic Biomedical Research Article Citation Linkage Corpus
[Paper] [Poster] [Video]
Sudipta Singha Roy1 and Robert E. Mercer2
1University of Western Ontario, 2The University of Western Ontario
   Dataset Construction for Scientific-Document Writing Support by Extracting Related Work Section and Citations from PDF Papers
[Paper] [Poster] [Video]
Keita Kobayashi1, Kohei Koyama2, Hiromi Narimatsu3, Yasuhiro Minami1
1The University of Electro-Communications, 2The University of Electro-Communications, 3NTT Communication Science Laboratories
   RuPAWS: A Russian Adversarial Dataset for Paraphrase Identification
[Paper] [Poster] [Video]
Nikita Martynov1, Irina Krotova1, Varvara Logacheva2, Alexander Panchenko3, Olga Kozlova1, Nikita Semenov1
1MTS AI, 2Skolkovo Institute of Science and Technology, 3Skolkovo Institute of Science and Technology
   Atril: an XML Visualization System for Corpus Texts
[Paper] [Video]
Andressa Rodrigues Gomide, Conceição Carapinha, Cornelia Plag
Universidade de Coimbra
   MASALA: Modelling and Analysing the Semantics of Adpositions in Linguistic Annotation of Hindi
[Paper] [Poster] [Video]
Aryaman Arora, Nitin Venkateswaran, Nathan Schneider
Georgetown University
   Universal Dependencies for Punjabi
[Paper]
Aryaman Arora
Georgetown University
   TeSum: Human-Generated Abstractive Summarization Corpus for Telugu
[Paper] [Video]
Ashok Urlana1, Nirmal Surange1, Pavan Baswani1, Priyanka Ravva2, Manish Shrivastava1
1International Institute of Information Technology Hyderabad, 2IIIT Hyderabad
   A Corpus of Simulated Counselling Sessions with Dialog Act Annotation
[Paper] [Video]
John Lee, Haley Fong, Lai Shuen Judy Wong, Chun Chung Mak, Chi Hin Yip, Ching Wah Larry Ng
City University of Hong Kong
                         Session: R3 - Dialogue, Conversational Systems, Chatbots, Human-Robot Interaction
   Interactive Evaluation of Dialog Track at DSTC9
[Paper] [Poster] [Video]
Shikib Mehri1, Yulan Feng1, Carla Gordon2, Seyed Hossein Alavi3, David Traum4, Maxine Eskenazi1
1Carnegie Mellon University, 2USC Institute for Creative Technologies, 3University of British Columbia, 4University of Southern California Institute for Creative Technologies
   HADREB: Human Appraisals and (English) Descriptions of Robot Emotional Behaviors
[Paper] [Video]
Josue Torres-Fonsesca and Casey Kennington
Boise State University
   Dialogue Collection for Recording the Process of Building Common Ground in a Collaborative Task
[Paper] [Video]
Koh Mitsuda1, Ryuichiro Higashinaka2, Yuhei Oga3, Sen Yoshida4
1NTT, 2Nagoya University/NTT, 3University of Tsukuba, 4Nippon Telegraph and Telephone Corp.
   Collection and Analysis of Travel Agency Task Dialogues with Age-Diverse Speakers
[Paper] [Poster] [Video]
Michimasa Inaba1, Yuya Chiba2, Ryuichiro Higashinaka3, Kazunori Komatani4, Yusuke Miyao4, Takayuki Nagai4
1The University of Electro-Communications, 2NTT Corporation, 3Nagoya University, 4Osaka University
   Strategy-level Entrainment of Dialogue System Users in a Creative Visual Reference Resolution Task
[Paper] [Poster] [Video]
Deepthi Karkada1, Ramesh Manuvinakurike2, Maike Paetzel-Prüsmann3, Kallirroi Georgila4
1Intel Corporation, 2Intel Labs, 3University of Potsdam, 4University of Southern California Institute for Creative Technologies
   MMChat: Multi-Modal Chat Dataset on Social Media
[Paper] [Poster] [Video]
Yinhe Zheng1, Guanyi Chen2, Xin Liu3, Jian Sun1
1Alibaba Group, 2Utrecht University, 3SRC-B
   E-ConvRec: A Large-Scale Conversational Recommendation Dataset for E-Commerce Customer Service
[Paper] [Poster] [Video]
Meihuizi Jia1, Ruixue Liu2, Peiying Wang2, Yang Song2, Zexi Xi2, Haobin Li2, Xin Shen2, Meng Chen2, Jinhui Pang1, Xiaodong He3
1School of Computer Science, Beijing Institute of Technology, 2JD AI, 3JD AI Research
   SHONGLAP: A Large Bengali Open-Domain Dialogue Corpus
[Paper] [Poster] [Video]
Syed Monsur1, Sakib Chowdhury1, Md Fatemi1, Shafayat Ahmed2
1Celloscope Ltd., 2Virginia Polytechnic Institute and State University
   A Comparison of Praising Skills in Face-to-Face and Remote Dialogues
[Paper] [Poster] [Video]
Toshiki Onishi1, Asahi Ogushi1, Yohei Tahara1, Ryo Ishii2, Atsushi Fukayama2, Takao Nakamura2, Akihiro Miyata1
1Nihon University, 2NTT Corporation
   Comparing Approaches to Language Understanding for Human-Robot Dialogue: An Error Taxonomy and Analysis
[Paper] [Video]
Ada Tur1 and David Traum2
1Los Altos High School, 2University of Southern California Institute for Creative Technologies
   SPORTSINTERVIEW: A Large-Scale Sports Interview Benchmark for Entity-centric Dialogues
[Paper] [Video]
Hanfei Sun, Ziyuan Cao, Diyi Yang
Georgia Institute of Technology
   EmoInHindi: A Multi-label Emotion and Intensity Annotated Dataset in Hindi for Emotion Recognition in Dialogues
[Paper] [Poster] [Video]
Gopendra Vikram Singh1, Priyanshu Priya2, Mauajama Firdaus3, Asif Ekbal1, Pushpak Bhattacharyya4
1IIT Patna, 2Indian Institute of Technology Patna, 3University of Alberta, 4Indian Institute of Technology Bombay and Patna
                         Session: R4 - Digital Humanities and Cultural Heritage
   The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts
[Paper] [Video]
Krishnapriya Vishnubhotla, Adam Hammond, Graeme Hirst
University of Toronto
   Who’s in, who’s out? Predicting the Inclusiveness or Exclusiveness of Personal Pronouns in Parliamentary Debates
[Paper] [Video]
Ines Rehbein1 and Josef Ruppenhofer2
1University of Mannheim, 2Institute for German Language
   A Language Modelling Approach to Quality Assessment of OCR'ed Historical Text
[Paper] [Poster] [Video]
Callum Booth1, Robert Shoemaker2, Robert Gaizauskas2
1The University of Sheffield, 2University of Sheffield
   Identifying Copied Fragments in an 18th Century Dutch Chronicle
[Paper] [Poster] [Video]
Roser Morante1, Eleanor Smith2, Lianne Wilhelmus1, Alie Lassche3, Erika Kuijpers1
1VU Amsterdam, 2University of Antwerp, 3University of Leiden
   A Study of Distant Viewing of ukiyo-e prints
[Paper] [Poster] [Video]
Konstantina Liagkou1, John Pavlopoulos2, Ewa Machotka2
1Athens University of Economics and Business, 2Stockholm University
   CCTAA: A Reproducible Corpus for Chinese Authorship Attribution Research
[Paper] [Video]
Haining Wang and Allen Riddell
Indiana University Bloomington
   An automatic model and Gold Standard for translation alignment of Ancient Greek
[Paper] [Poster] [Video]
Tariq Yousef1, Chiara Palladino2, Farnoosh Shamsian1, Anise d’Orange Ferreira3, Michel Ferreira dos Reis3
1University of Leipzig, 2Furman University, 3Universidade Estadual Paulista (UNESP)
                         Session: R5 - Discourse and Pragmatics
   Rhetorical Structure Approach for Online Deception Detection: A Survey
[Paper] [Video]
Francielle Vargas1, Jonas D'Alessandro2, Zohar Rabinovich3, Fabrício Benevenuto4, Thiago Pardo1
1University of São Paulo, 2Federal University of Minas Gerais, 3University of Southern California, 4Federal University of Minas Gerais (UFMG)
   TYPIC: A Corpus of Template-Based Diagnostic Comments on Argumentation
[Paper] [Poster] [Video]
Shoichi Naito1, Shintaro Sawada2, Chihiro Nakagawa2, Naoya Inoue3, Kenshi Yamaguchi1, Iori Shimizu2, Farjana Sultana Mim1, Keshav Singh1, Kentaro Inui4
1Tohoku University, 2Osaka Prefecture University, 3Japan Advanced Institute of Science and Technology, 4Tohoku University / Riken
                         Session: R6 - Evaluation and Validation Methodologies
   Towards Speaker Verification for Crowdsourced Speech Collections
[Paper] [Video]
John Mendonca1, Rui Correia2, Mariana Lourenço2, João Freitas3, Isabel Trancoso4
1INESC-ID/Instituto Superior Técnico, 2, 3Defined.ai, 4INESC-ID / IST Univ. Lisbon
   Align-smatch: A Novel Evaluation Method for Chinese Abstract Meaning Representation Parsing based on Alignment of Concept and Relation
[Paper] [Poster] [Video]
Liming Xiao, Bin Li, Zhixing Xu, Kairui Huo, Minxuan Feng, Junsheng Zhou, Weiguang Qu
Nanjing Normal University
   Dynamic Human Evaluation for Relative Model Comparisons
[Paper] [Poster] [Video]
Thórhildur Thorleiksdóttir1, Cedric Renggli1, Nora Hollenstein2, Ce Zhang1
1ETH Zürich, 2University of Copenhagen
   Please, Don't Forget the Difference and the Confidence Interval when Seeking for the State-of-the-Art Status
[Paper] [Video]
Yves Bestgen
Université catholique de Louvain
   PCR4ALL: A Comprehensive Evaluation Benchmark for Pronoun Coreference Resolution in English
[Paper] [Video]
Xinran Zhao1, Hongming Zhang2, Yangqiu Song2
1Hong Kong University of Science and Technology, 2HKUST
   Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task
[Paper] [Poster] [Video]
Mikhail Lepekhin1 and Serge Sharoff2
1MIPT, 2University of Leeds
   What do we really know about State of the Art NER?
[Paper] [Poster] [Video]
Sowmya Vajjala1 and Ramya Balasubramaniam2
1National Research Council, 2Novisto
   ProQE: Proficiency-wise Quality Estimation dataset for Grammatical Error Correction
[Paper] [Poster] [Video]
Yujin Takahashi1, Masahiro Kaneko2, Masato Mita3, Mamoru Komachi1
1Tokyo Metropolitan University, 2Tokyo Institute of Technology, 3CyberAgent Inc.
   Evaluation of Off-the-shelf Speech Recognizers on Different Accents in a Dialogue Domain
[Paper] [Poster] [Video]
Divya Tadimeti1, Kallirroi Georgila2, David Traum2
1USC Institute for Creative Technologies, 2University of Southern California Institute for Creative Technologies
   Sentence Pair Embeddings Based Evaluation Metric for Abstractive and Extractive Summarization
[Paper] [Video]
Ramya Akula and Ivan Garibay
University of Central Florida
   On "Human Parity" and "Super Human Performance" in Machine Translation Evaluation
[Paper] [Video]
Thierry Poibeau
LATTICE (CNRS & ENS/PSL)
   Evaluation Benchmarks for Spanish Sentence Representations
[Paper] [Poster] [Video]
Vladimir Araujo1, Andrés Carvallo1, Souvik Kundu2, José Cañete3, Marcelo Mendoza4, Robert E. Mercer5, Felipe Bravo-Marquez6, Marie-Francine Moens7, Alvaro Soto8
1Pontificia Universidad Católica de Chile, 2University of Western Ontario, 3Universidad de Chile, 4Universidad Técnica Federico Santa María, 5The University of Western Ontario, 6University of Chile, 7KU Leuven, 8PUC
                         Session: R7 - Information Extraction and Information Retrieval (including NER, QA, Text Mining, Document Classification, Text Categorisation)
   UMUTextStats: A linguistic feature extraction tool for Spanish
[Paper] [Video]
José Antonio García-Díaz1, Pedro José Vivancos-Vicente2, Ángela Almela1, Rafael Valencia-García1
1Universidad de Murcia, 2Vócali Sistemas Inteligentes S.L.
   Problem-solving Recognition in Scientific Text
[Paper] [Video]
Kevin Heffernan1 and Simone Teufel2
1University of Cambridge, 2Cambridge University
   HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method
[Paper] [Poster] [Video]
Yuxiang Zhang and Hayato Yamana
Waseda University
   HyperBox: A Supervised Approach for Hypernym Discovery using Box Embeddings
[Paper] [Video]
Maulik Parmar1 and Apurva Narayan2
1Independent Researcher, 2The University of British Columbia
   Extracting Space Situational Awareness Events from News Text
[Paper] [Poster] [Video]
Zhengnan Xie1, Alice Kwak1, Enfa George1, Laura Dozal1, Hoang Van1, Moriba Jah2, Roberto Furfaro1, Peter Jansen1
1University of Arizona, 2University of Texas at Austin
   PerCQA: Persian Community Question Answering Dataset
[Paper] [Video]
Naghme Jamali1, Yadollah Yaghoobzadeh2, Heshaam Faili2
1School of Computer Science, Institute for Research in Fundamental Sciences, 2School of Electrical and Computer Engineering, College of Engineering, University of Tehran
   GrASP: A Library for Extracting and Exploring Human-Interpretable Textual Patterns
[Paper] [Video]
Piyawat Lertvittayakumjorn1, Leshem Choshen2, Eyal Shnarch3, Francesca Toni1
1Imperial College London, 2IBM, Hebrew University Jerusalem Israel, 3IBM Research
   Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing
[Paper] [Video]
Zhaoxin Luo and Michael Zhu
Purdue University, Department of Statistics
   Korean-Specific Dataset for Table Question Answering
[Paper] [Poster] [Video]
Changwook Jun, Jooyoung Choi, Myoseop Sim, Hyun Kim, Hansol Jang, Kyungkoo Min
LG AI Research
   GerCCT: An Annotated Corpus for Mining Arguments in German Tweets on Climate Change
[Paper] [Poster] [Video]
Robin Schaefer and Manfred Stede
University of Potsdam
   Budget Argument Mining Dataset Using Japanese Minutes from the National Diet and Local Assemblies
[Paper] [Poster] [Video]
Yasutomo Kimura1, Hokuto Ototake2, Minoru Sasaki3
1Otaru University of Commerce / RIKEN AIP, 2Fukuoka University, 3Ibaraki University
   Context-based Virtual Adversarial Training for Text Classification with Noisy Labels
[Paper] [Poster] [Video]
Do-Myoung Lee1, Yeachan Kim2, Chang gyun Seo3
1Korea University, 2Deargen Inc., 3GC Company
   FinMath: Injecting a Tree-structured Solver for Question Answering over Financial Reports
[Paper] [Poster] [Video]
Chenying Li1, Wenbo Ye2, Yilun Zhao3
1Northeastern University, 2Zhejiang University, 3Yale University
   HeadlineCause: A Dataset of News Headlines for Detecting Causalities
[Paper] [Poster] [Video]
Ilya Gusev1 and Alexey Tikhonov2
1Moscow Institute of Physics and Technology, 2Yandex
   Incorporating Zoning Information into Argument Mining from Biomedical Literature
[Paper] [Poster] [Video]
Boyang Liu1, Viktor Schlegel2, Riza Batista-Navarro3, Sophia Ananiadou2
1The University of Manchester, 2University of Manchester, 3Department of Computer Science, The University of Manchester
   MAKED: Multi-lingual Automatic Keyword Extraction Dataset
[Paper] [Video]
Yash Verma1, Anubhav Jangra2, Sriparna Saha3, Adam Jatowt4, Dwaipayan Roy5
1Indian Institute of Science Education and Research, Kolkata, 2Google Research, 3Indian Institute of Technology Patna, 4University of Innsbruck, 5Indian Institute of Science Education and Research
   From Examples to Rules: Neural Guided Rule Synthesis for Information Extraction
[Paper] [Poster] [Video]
Robert Vacareanu1, Marco A. Valenzuela-Escárcega2, George Caique Gouveia Barbosa2, Rebecca Sharp2, Gustave Hahn-Powell2, Mihai Surdeanu2
1Technical University of Cluj-Napoca, 2University of Arizona
   Enhancing Relation Extraction via Adversarial Multi-task Learning
[Paper] [Video]
Han Qin1, Yuanhe Tian2, Yan Song3
1The Chinese University of Hong Kong (Shenzhen), 2Department of Linguistics, University of Washington, 3CUHK-SZ
   Query Obfuscation by Semantic Decomposition
[Paper] [Poster] [Video]
Danushka Bollegala1, Tomoya Machide2, Ken-ichi Kawarabayashi2
1University of Liverpool/Amazon, 2National Institute of Informatics
   TWEET-FID: An Annotated Dataset for Multiple Foodborne Illness Detection Tasks
[Paper] [Video]
Ruofan Hu1, Dongyu Zhang2, Dandan Tao3, Thomas Hartvigsen3, Hao Feng4, Elke Rundensteiner4
1MS, 2PhD Candidate, 3Doctor, 4Professor
   Named Entity Recognition to Detect Criminal Texts on the Web
[Paper] [Video]
Paweł Skórzewski1, Mikołaj Pieniowski1, Grazyna Demenko2
1Adam Mickiewicz University in Poznań, 2Adam Mickiewicz University
   Task-Driven and Experience-Based Question Answering Corpus for In-Home Robot Application in the House3D Virtual Environment
[Paper] [Video]
Zhuoqun Xu1, Liubo Ouyang1, Yang Liu2
1Hunan University, 2Samsung Research China - Beijing
   ELRC Action: Covering Confidentiality, Correctness and Cross-linguality
[Paper] [Poster] [Video]
Tom Vanallemeersch1, Arne Defauw1, Sara Szoc1, Alina Kramchaninova1, Joachim Van den Bogaert2, Andrea Lösch3
1CrossLang, 2CrossLang NV, 3Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH
   RadQA: A Question Answering Dataset to Improve Comprehension of Radiology Reports
[Paper] [Poster] [Video]
Sarvesh Soni1, Meghana Gudala1, Atieh Pajouhi1, Kirk Roberts2
1UTHealth, SBMI, 2University of Texas Health Science Center at Houston
   Knowledge Graph - Deep Learning: A Case Study in Question Answering in Aviation Safety Domain
[Paper] [Poster] [Video]
Ankush Agarwal1, Raj Gite1, Shreya Laddha2, Pushpak Bhattacharyya1, Satyanarayan Kar3, Asif Ekbal4, Prabhjit Thind3, Rajesh Zele1, Ravi Shankar3
1IIT Bombay, 2Indian Institute of Technology Bombay, 3Honeywell, 4IIT Patna
   A Bayesian Topic Model for Human-Evaluated Interpretability
[Paper] [Poster] [Video]
Justin Wood, Corey Arnold, Wei Wang
UCLA
                         Session: R8 - Knowledge Discovery/Representation
   A Large Interlinked Knowledge Graph of the Italian Cultural Heritage
[Paper] [Video]
Stefano Faralli1, Andrea Lenzi2, Paola Velardi3
1University of Rome Sapienza, 2Sapienza University of Rome, 3University Sapienza Roma
   Training on Lexical Resources
[Paper] [Poster] [Video]
Kenneth Church1, Xingyu Cai2, Yuchen Bian3
1Baidu, USA, 2Baidu USA LLC, 3Baidu Research USA
   Challenging the Assumption of Structure-based embeddings in Few- and Zero-shot Knowledge Graph Completion
[Paper] [Poster] [Video]
Filip Cornell1, Chenda Zhang2, Jussi Karlgren3, Sarunas Girdzijauskas4
1KTH Royal Institute of Technology, 2KTH, 3Spotify, 4KTH - Royal Institute of Technology
   Open Terminology Management and Sharing Toolkit for Federation of Terminology Databases
[Paper] [Video]
Andis Lagzdiņš, Uldis Siliņš, Toms Bergmanis, Mārcis Pinnis, Artūrs Vasiļevskis, Andrejs Vasiļjevs
Tilde
   RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification
[Paper] [Poster] [Video]
Annika Marie Schoene1, Nina Dethlefs2, Sophia Ananiadou3
1The University of Manchester, 2University of Hull, 3University of Manchester
                         Session: R9 - Language Resource Infrastructures, Standards for LRs, Metadata, Policy issues, Ethics, Legal Issues
   Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR
[Paper] [Video]
Nina Markl and Stephen McNulty
University of Edinburgh
   Masader: Metadata Sourcing for Arabic Text and Speech Data Resources
[Paper] [Poster] [Video]
Zaid Alyafeai1, Maraim Masoud2, Mustafa Ghaleb1, Maged Al-shaibani1
1KFUPM, 2Independent Researcher
   Linghub2: Language Resource Discovery Tool for Language Technologies
[Paper] [Poster] [Video]
Cécile Robin1, Gautham Suresh2, Víctor Rodriguez-Doncel3, John P. McCrae4, Paul Buitelaar5
1Insight Centre for Data Analytics, 2Data Science Institute, National University of Ireland, Galway, 3Universidad Politecnica de Madrid, 4Insight Center for Data Analytics, National University of Ireland Galway, 5National University of Ireland Galway
                         Session: R10 - Language Resources and Evaluation for Psycho-linguistics, Cognitive Linguistics and Linguistic Theories
   CxLM: A Construction and Context-aware Language Model
[Paper] [Video]
Yu-Hsiang Tseng, Cing-Fang Shih, Pin-Er Chen, Hsin-Yu Chou, Mao-Chang Ku, Shu-Kai Hsieh
Graduate Institute of Linguistics, National Taiwan University
   The Lexometer: A Shiny Application for Exploratory Analysis and Visualization of Corpus Data
[Paper] [Poster] [Video]
Oufan Hai, Matthew Sundberg, Katherine Trice, Rebecca Friedman, Scott Grimm
University of Rochester
   TallVocabL2Fi: A Tall Dataset of 15 Finnish L2 Learners’ Vocabulary
[Paper] [Poster] [Video]
Frankie Robertson1, Li-Hsin Chang2, Sini Söyrinki1
1University of Jyväskylä, 2University of Turku
   CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts
[Paper] [Poster] [Video]
Muskan Garg1, Chandni Saxena2, Sriparna Saha3, Veena Krishnan4, Ruchi Joshi5, Vijay Mago6
1University of Florida, 2The Chinese University of Hong Kong, 3Indian Institute of Technology Patna, 4University of Petroleum And Energy Studies, 5Amity University Rajasthan, 6Lakehead University
   How Does the Experimental Setting Affect the Conclusions of Neural Encoding Models?
[Paper] [Video]
Xiaohan Zhang1, Shaonan Wang2, Chengqing Zong1
1Institute of Automation, Chinese Academy of Sciences, 2National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
   SPADE: A Big Five-Mturk Dataset of Argumentative Speech Enriched with Socio-Demographics for Personality Detection
[Paper] [Poster] [Video]
Elma Kerz1, Yu Qiao1, Sourabh Zanwar1, Daniel Wiechmann2
1RWTH Aachen University, 2Institute for Logic Language and Computation
                         Session: R11 - Less-Resourced/Endangered Languages
   Progress in Multilingual Speech Recognition for Low Resource Languages Kurmanji Kurdish, Cree and Inuktut
[Paper] [Video]
Vishwa Gupta1 and Gilles Boulianne2
1Computer Research Institute of Montreal, 2CRIM - Centre de recherche informatique de Montréal
   Efficient Entity Candidate Generation for Low-Resource Languages
[Paper] [Video]
Alberto Garcia-Duran1, Akhil Arora2, Robert West1
1EPFL, 2DLAB, EPFL
   What a Creole Wants, What a Creole Needs
[Paper] [Video]
Heather Lent1, Kelechi Ogueji2, Miryam de Lhoneux1, Orevaoghene Ahia3, Anders Søgaard1
1University of Copenhagen, 2University of Waterloo, 3Masakhane
   Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities
[Paper] [Poster] [Video]
Alexander Gutkin, Cibu Johny, Raiomond Doctor, Lawrence Wolf-Sonkin, Brian Roark
Google
   Predicting Embedding Reliability in Low-Resource Settings Using Corpus Similarity Measures
[Paper] [Video]
Jonathan Dunn, Haipeng Li, Damian Sastre
University of Canterbury
   Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
[Paper] [Poster] [Video]
Idris Abdulmumin1, Satya Ranjan Dash2, Musa Dawud3, Shantipriya Parida4, Shamsuddeen Muhammad5, Ibrahim Ahmad6, Subhadarshi Panda7, Ondřej Bojar8, Bashir Galadanci9, Bello Bello10
1Ahmadu Bello University, Zaria, 2KIIT University, 3School of Computer Applications, KIIT University, 4Silo AI, 5Bayero University, Kano, 6Department of Information Technology, Bayero University, Kano, 7Graduate Center CUNY, 8Charles University, MFF UFAL, 9Department of Software Engineering, Bayero University, Kano, 10Department of Computer Science, Bayero University, Kano
   A Survey of Machine Translation Tasks on Nigerian Languages
[Paper] [Poster] [Video]
Ebelechukwu Nwafor1 and Anietie Andy2
1Villanova University, 2University of Pennsylvania
   Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset
[Paper] [Poster] [Video]
Tiezheng Yu1, Rita Frieske1, Peng Xu1, Samuel Cahyawijaya2, Cheuk Tung Yiu1, Holy Lovenia1, Wenliang Dai3, Elham J. Barezi4, Qifeng Chen2, Xiaojuan Ma3, Bertram Shi5, Pascale Fung3
1The Hong Kong University of Science and Technology, 2HKUST, 3Hong Kong University of Science and Technology, 4Department of Computer Science and Engineering, Hong Kong University of Science and Technology, 5ECE/HKUST
   Survey on Thai NLP Language Resources and Tools
[Paper] [Poster] [Video]
Ratchakrit Arreerard1, Stephen Mander1, Scott Piao2
1School of Computing and Communications, Lancaster University, 2Lancaster University
   LaoPLM: Pre-trained Language Models for Lao
[Paper] [Poster] [Video]
Nankai Lin, Yingwen Fu, Chuwei Chen, Ziyu Yang, Shengyi JIANG
Guangdong University of Foreign Studies
   The Maaloula Aramaic Speech Corpus (MASC): From Printed Material to a Lemmatized and Time-Aligned Corpus
[Paper] [Poster] [Video]
Ghattas Eid1, Esther Seyffarth1, Ingo Plag2
1Heinrich Heine University Düsseldorf, 2Heinrich-Heine-Universität Düsseldorf
   VIMQA: A Vietnamese Dataset for Advanced Reasoning and Explainable Multi-hop Question Answering
[Paper] [Poster] [Video]
Khang Le1, Hien Nguyen1, Tung Le Thanh2, Minh Nguyen3
1Japan Advanced Institute of Science and Technology, 2University of Science, VNU-HCM, 3JAIST
   Language Identification for Austronesian Languages
[Paper] [Video]
Jonathan Dunn and Wikke Nijhof
University of Canterbury
   A Mapudüngun FST Morphological Analyser and its Web Interface
[Paper] [Poster] [Video]
Andrés Chandía
Universitat Pompeu Fabra
   Improving Large-scale Language Models and Resources for Filipino
[Paper] [Poster] [Video]
Jan Christian Blaise Cruz1 and Charibeth Cheng2
1Samsung Research Philippines (SRPH), 2De La Salle University
   Thirumurai: A Large Dataset of Tamil Shaivite Poems and Classification of Tamil Pann
[Paper] [Poster] [Video]
Shankar Mahadevan1, Rahul Ponnusamy2, Prasanna Kumar Kumaresan3, Prabakaran Chandran4, Ruba Priyadharshini5, Sangeetha S6, Bharathi Raja Chakravarthi7
1Thiagarajar College of Engineering, 2Master degree, IIITMK College, 3Student, IIITMK College, 4Mu Sigma Inc., 5ULTRA Arts and Science College, 6National Institute of Technology, 7Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
   Generating Monolingual Dataset for Low Resource Language Bodo from old books using Google Keep
[Paper] [Poster] [Video]
Sanjib Narzary1, Maharaj Brahma1, Mwnthai Narzary1, Gwmsrang Muchahary1, Pranav Singh1, Apurbalal Senapati1, Sukumar Nandi2, Bidisha Som2
1Central Institute of Technology Kokrajhar, 2Indian Institute of Technology Guwahati
   AsNER - Annotated Dataset and Baseline for Assamese Named Entity recognition
[Paper] [Video]
Dhrubajyoti Pathak, Sukumar Nandi, Priyankoo Sarmah
Indian Institute of Technology Guwahati
   GeezSwitch: Language Identification in Typologically Related Low-resourced East African Languages
[Paper] [Video]
Fitsum Gaim, Wonsuk Yang, Jong Park
Korea Advanced Institute of Science and Technology
   Handwritten Paleographic Greek Text Recognition: A Century-Based Approach
[Paper] [Video]
Paraskevi Platanou1, John Pavlopoulos2, Georgios Papaioannou3
1Postgraduate, 2Adjunct Professor, 3Associate Professor
   Quality Control for Crowdsourced Bilingual Dictionary in Low-Resource Languages
[Paper] [Video]
Hiroki Chida1, Yohei Murakami1, Mondheera Pituxcoosuvarn2
1Ritsumeikan University, 2Kyoto University
   An Inflectional Database for Gitksan
[Paper] [Poster] [Video]
Bruce Oliver1, Clarissa Forbes2, Changbing Yang3, Farhan Samir4, Edith Coates1, Garrett Nicolai1, Miikka Silfverberg1
1University of British Columbia, 2Independent, 3University of Colorado Boulder, 4University of Toronto
   PyCantonese: Cantonese Linguistics and NLP in Python
[Paper] [Video]
Jackson Lee1, Litong Chen2, Charles Lam3, Chaak Ming Lau4, Tsz-Him Tsui1
1Independent Researcher, 2Wheaton College, 3Hang Seng University of Hong Kong, 4Education University of Hong Kong
   Afaan Oromo Hate Speech Detection and Classification on Social Media
[Paper] [Video]
Teshome Mulugeta Ababu1 and Michael Melese Woldeyohannis2
1Dire Dawa University Institute of Technology, 2Addis Ababa University, Addis Ababa, Ethiopia
                         Session: R12 - Lexicons (also WordNet, FrameNet, Multimodal and Sign Language lexicons, etc.)
   Cross-lingual Linking of Automatically Constructed Frames and FrameNet
[Paper] [Poster] [Video]
Ryohei Sasano
Nagoya University
   Aligning the Romanian Reference Treebank and the Valence Lexicon of Romanian Verbs
[Paper] [Video]
Ana-Maria Barbu1, Verginica Barbu Mititelu2, Cătălin Mititelu3
1“Iorgu Iordan – Al. Rosetti” Institute of Linguistics, University of Bucharest, 2RACAI, 3“Iorgu Iordan – Al. Rosetti” Institute of Linguistics
   PortiLexicon-UD: a Portuguese Lexical Resource according to Universal Dependencies Model
[Paper] [Video]
Lucelene Lopes1, Magali Duran2, Paulo Fernandes3, Thiago Pardo4
1USP - ICMC, 2Universidade de São Paulo, 3Merrimack College, 4University of São Paulo
                         Session: R13 - Multilinguality and Machine Translation (including Speech-to-Speech translation)
   Extended Parallel Corpus for Amharic-English Machine Translation
[Paper] [Poster] [Video]
Andargachew Mekonnen Gezmu1, Andreas Nürnberger1, Tesfaye Bayu Bati2
1Otto-von-Guericke Universität Magdeburg, 2Hawassa University
   Low-resource Neural Machine Translation: Benchmarking State-of-the-art Transformer for Wolof<->French
[Paper] [Poster] [Video]
Cheikh M. Bamba Dione1, Alla LO2, Elhadji Mamadou Nguer3, Sileye Ba4
1University of Bergen, 2Université Gaston Berger, 3Virtual University of Senegal, 4L'Oréal Research and Innovation
   Criteria for Useful Automatic Romanization in South Asian Languages
[Paper] [Poster] [Video]
Isin Demirsahin1, Cibu Johny2, Alexander Gutkin2, Brian Roark3
1Google AI, 2Google, 3Google Research
   BERTology for Machine Translation: What BERT Knows about Linguistic Difficulties for Translation
[Paper] [Poster] [Video]
Yuqian Dai, Marc Kamps, Serge Sharoff
University of Leeds
   CVSS Corpus and Massively Multilingual Speech-to-Speech Translation
[Paper] [Video]
Ye Jia1, Michelle Tadmor Ramanovich1, Quan Wang2, Heiga Zen1
1Google, 2Google Inc.
   JParaCrawl v3.0: A Large-scale English-Japanese Parallel Corpus
[Paper] [Poster] [Video]
Makoto Morishita1, Katsuki Chousa2, Jun Suzuki3, Masaaki Nagata4
1NTT Communication Science Laboratories, 2NTT, 3Tohoku University / RIKEN Center for AIP, 4NTT Corporation
   Learning How to Translate North Korean through South Korean
[Paper] [Poster] [Video]
Hwichan Kim1, Sangwhan Moon2, Naoaki Okazaki2, Mamoru Komachi1
1Tokyo Metropolitan University, 2Tokyo Institute of Technology
   FGraDA: A Dataset and Benchmark for Fine-Grained Domain Adaptation in Machine Translation
[Paper] [Poster] [Video]
Wenhao Zhu1, Shujian Huang1, Tong Pu1, Pingxuan Huang2, Xu Zhang3, Jian Yu3, Wei Chen3, Yanfeng Wang3, Jiajun CHEN4
1National Key Laboratory for Novel Software Technology, Nanjing University, 2University of Michigan, 3Sogou, 4Nanjing University
   SansTib, a Sanskrit - Tibetan Parallel Corpus and Bilingual Sentence Embedding Model
[Paper] [Poster] [Video]
Sebastian Nehrdich
University Hamburg
   VISA: An Ambiguous Subtitles Dataset for Visual Scene-aware Machine Translation
[Paper] [Poster] [Video]
Yihang Li, Shuichiro Shimizu, Weiqi Gu, Chenhui Chu, Sadao Kurohashi
Kyoto University
   A Benchmark Dataset for Multi-Level Complexity-Controllable Machine Translation
[Paper] [Poster] [Video]
Kazuki Tani1, Ryoya Yuasa1, Kazuki Takikawa2, Akihiro Tamura1, Tomoyuki Kajiwara2, Takashi Ninomiya2, Tsuneo Kato1
1Doshisha University, 2Ehime University
   gaHealth: An English–Irish Bilingual Corpus of Health Data
[Paper] [Poster] [Video]
Séamus Lankford1, Haithem Afli2, Órla Ní Loinsigh1, Andy Way1
1Dublin City University, 2Munster Technological University
   Translation Memories as Baselines for Low-Resource Machine Translation
[Paper] [Poster] [Video]
Rebecca Knowles1 and Patrick Littell2
1National Research Council Canada, 2National Research Council of Canada
                         Session: R14 - Multimodality and Cross-modality (including Sign Languages, Vision and other modalities) and Multimedia
   N24News: A New Dataset for Multimodal News Classification
[Paper] [Poster] [Video]
Zhen Wang, Xu Shan, Xiangxie Zhang, Jie Yang
Delft University of Technology
   MultiSubs: A Large-scale Multimodal and Multilingual Dataset
[Paper] [Video]
Josiah Wang1, Josiel Figueiredo2, Lucia Specia1
1Imperial College London, 2Federal University of Mato Grosso
   CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition
[Paper] [Poster] [Video]
Wenliang Dai1, Samuel Cahyawijaya2, Tiezheng Yu3, Elham J. Barezi4, Peng Xu3, Cheuk Tung YIU3, Rita Frieske3, Holy Lovenia3, Genta Winata5, Qifeng Chen2, Xiaojuan Ma1, Bertram Shi6, Pascale Fung1
1Hong Kong University of Science and Technology, 2HKUST, 3The Hong Kong University of Science and Technology, 4Department of Computer Science and Engineering, Hong Kong University of Science and Technology, 5Bloomberg, 6ECE/HKUST
   Multimodal Negotiation Corpus with Various Subjective Assessments for Social-Psychological Outcome Prediction from Non-Verbal Cues
[Paper] [Video]
Nobukatsu Hojo, Satoshi Kobashikawa, Saki Mizuno, Ryo Masumura
NTT
   MMDAG: Multimodal Directed Acyclic Graph Network for Emotion Recognition in Conversation
[Paper] [Video]
Shuo Xu, Yuxiang Jia, Changyong Niu, Hongying Zan
Zhengzhou University
   Automatic Gloss-level Data Augmentation for Sign Language Translation
[Paper] [Video]
Jin Yea Jang1, Han-Mu Park2, Saim Shin2, Suna Shin3, Byungcheon Yoon3, Gahgene Gweon1
1Seoul National University, 2KETI, 3Korea Nazarene University
   Image Description Dataset for Language Learners
[Paper] [Poster] [Video]
Kento Tanaka1, Taichi Nishimura1, Hiroaki Nanjo1, Keisuke Shirai1, Hirotaka Kameko1, Masatake Dantsuji2
1Kyoto University, 2Kyoto Tachibana University
   The Multimodal Annotation Software Tool (MAST)
[Paper] [Video]
Bruno Cardoso and Neil Cohn
Tilburg University
   A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning
[Paper] [Video]
Gerald Schwiebert, Cornelius Weber, Leyuan Qu, Henrique Siqueira, Stefan Wermter
University of Hamburg
   Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers
[Paper] [Poster] [Video]
Muskan Garg1, Seema Wazarkar2, Muskaan Singh3, Ondřej Bojar4
1University of Florida, 2Thapar Institute of Engineering and Technology, 3UFAL,Charles University, 4Charles University, MFF UFAL
   Cross-lingual and Multilingual CLIP
[Paper] [Poster] [Video]
Fredrik Carlsson1, Philipp Eisen2, Faton Rekathati3, Magnus Sahlgren4
1Research Institute of Sweden, 2Depict, 3National Library of Sweden, 4AI Sweden
   BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
[Paper] [Poster] [Video]
Mohammad Faiyaz Khan1, S.M. Sadiq-Ur-Rahman Shifath1, Md Saiful Islam2
1Shahjalal University of Science and Technology, 2University of Alberta
   SSR7000: A Synchronized Corpus of Ultrasound Tongue Imaging for End-to-End Silent Speech Recognition
[Paper] [Video]
Naoki Kimura, Zixiong Su, Takaaki Saeki, Jun Rekimoto
The University of Tokyo
                         Session: R15 - Natural Language Generation (including Summarization)
   A Simple Yet Effective Corpus Construction Method for Chinese Sentence Compression
[Paper] [Video]
Yang Zhao1, Hiroshi Kanayama2, Issei Yoshida2, Masayasu Muraoka2, Akiko Aizawa3
1IBM Research - Tokyo, Japan, 2IBM Research - Tokyo, 3National Institute of Informatics
   JADE: Corpus for Japanese Definition Modelling
[Paper] [Video]
Han Huang1, Tomoyuki Kajiwara2, Yuki Arase1
1Osaka University, 2Ehime University
   Unraveling the Mystery of Artifacts in Machine Generated Text
[Paper] [Video]
Jiashu Pu, Ziyi Huang, Yadong Xi, Guandan Chen, Weijie Chen, Rongsheng Zhang
NetEase Fuxi Lab
   Logic-Guided Message Generation from Raw Real-Time Sensor Data
[Paper] [Video]
Ernie Chang1, Alisa Kovtunova2, Stefan Borgwardt2, Vera Demberg1, Kathryn Chapman1, Hui-Syuan Yeh3
1Saarland University, 2TU Dresden, 3LISN/CNRS & Université Paris Saclay
   The Bull and the Bear: Summarizing Stock Market Discussions
[Paper] [Poster] [Video]
Ayush Kumar1, Dhyey Jani1, Jay Shah1, Devanshu Thakar1, Varun Jain2, Mayank Singh2
1Indian Institute of Technology, Gandhinagar, 2IIT Gandhinagar
   Combination of Contextualized and Non-Contextualized Layers for Lexical Substitution in French
[Paper] [Poster] [Video]
Kévin Espasa1, Emmanuel Morin2, Olivier Hamon3
1University of Nantes, Syllabs, 2University of Nantes, 3Syllabs
   SuMe: A Dataset Towards Summarizing Biomedical Mechanisms
[Paper] [Poster] [Video]
Mohaddeseh Bastan1, Nishant Shankar1, Mihai Surdeanu2, Niranjan Balasubramanian1
1Stony Brook University, 2University of Arizona
   CATAMARAN: A Cross-lingual Long Text Abstractive Summarization Dataset
[Paper] [Video]
Zheng Chen and Hongyu Lin
University of Electronic Science and Technology of China
                         Session: R16 - Opinion Mining, Sentiment Analysis, Emotion Recognition/Generation
   Emotion analysis and detection during COVID-19
[Paper] [Video]
Tiberiu Sosea1, Chau Pham2, Alexander Tekle3, Cornelia Caragea1, Junyi Jessy Li3
1University of Illinois at Chicago, 2Colgate University, 3University of Texas at Austin
   Cross-lingual Emotion Detection
[Paper] [Poster] [Video]
Sabit Hassan1, Shaden Shaar2, Kareem Darwish3
1University of Pittsburgh, 2Cornell University, 3aiXplain Inc.
   DirectQuote: A Dataset for Direct Quotation Extraction and Attribution in News Articles
[Paper] [Video]
Yuanchi Zhang and Yang Liu
Tsinghua University
   VaccineLies: A Natural Language Resource for Learning to Recognize Misinformation about the COVID-19 and HPV Vaccines
[Paper] [Poster] [Video]
Maxwell Weinzierl1 and Sanda Harabagiu2
1The University of Texas at Dallas, 2University of Texas at Dallas
   Tackling Irony Detection using Ensemble Classifiers
[Paper] [Video]
Christoph Turban and Udo Kruschwitz
University of Regensburg
   Automatic Construction of an Annotated Corpus with Implicit Aspects
[Paper] [Poster] [Video]
Aye Aye Mar and Kiyoaki Shirai
Japan Advanced Institute of Science and Technology
   A Multimodal Corpus for Emotion Recognition in Sarcasm
[Paper] [Poster] [Video]
Anupama Ray1, Shubham Mishra2, Apoorva Nunna3, Pushpak Bhattacharyya4
1IBM Research, 2Indian Institute of Technology Bombay, 3Department of Computer Science and Engineering, IIT Bombay, 4Indian Institute of Technology Bombay and Patna
   Annotation of Valence Unfolding in Spoken Personal Narratives
[Paper] [Video]
Aniruddha Tammewar1, Franziska Braun2, Gabriel Roccabruna1, Sebastian Bayerl3, Korbinian Riedhammer2, Giuseppe Riccardi1
1University Of Trento, 2Technische Hochschule Nürnberg Georg Simon Ohm, 3TH-Nürnberg
   A Large-Scale Japanese Dataset for Aspect-based Sentiment Analysis
[Paper] [Video]
Yuki Nakayama, Koji Murakami, Gautam Kumar, Sudha Bhingardive, Ikuko Hardaway
Rakuten Institute of Technology
   A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain
[Paper] [Video]
Haruya Suzuki1, Yuto Miyauchi1, Kazuki Akiyama1, Tomoyuki Kajiwara1, Takashi Ninomiya1, Noriko Takemura2, Yuta Nakashima2, Hajime Nagahara2
1Ehime University, 2Osaka University
   Complementary Learning of Aspect Terms for Aspect-based Sentiment Analysis
[Paper] [Video]
Han Qin1, Yuanhe Tian2, Fei Xia3, Yan Song4
1The Chinese University of Hong Kong (Shenzhen), 2Department of Linguistics, University of Washington, 3University of Washington, 4CUHK-SZ
   Deep One-Class Hate Speech Detection Model
[Paper] [Video]
Saugata Bose and Dr. Guoxin Su
University of Wollongong, Australia
   Opinions in Interactions: New Annotations of the SEMAINE Database
[Paper] [Poster] [Video]
Valentin Barriere1, Slim Essid2, Chloé Clavel3
1Joint Research Center, 2Télécom ParisTech, 3LTCI, Telecom-Paris, Institut Polytechnique de Paris
   Pars-ABSA: a Manually Annotated Aspect-based Sentiment Analysis Benchmark on Farsi Product Reviews
[Paper]
Taha Shangipour Ataei1, Kamyar Darvishi1, Soroush Javdan2, Behrouz Minaei-Bidgoli1, Sauleh Eetemadi1
1Computer Engineering Department, Iran University of Science and Technology, 2School of Computer Science, Carleton University
   HindiMD: A Multi-domain Corpora for Low-resource Sentiment Analysis
[Paper] [Poster] [Video]
Mamta1, Asif Ekbal2, Pushpak Bhattacharyya3, Tista Saha4, Alka Kumar4, Shikha Srivastava4
1Indian Institute of Technology Patna, 2IIT Patna, 3Indian Institute of Technology Bombay and Patna, 4CDOT
   Sentiment Analysis of Homeric Text: The 1st Book of Iliad
[Paper] [Poster] [Video]
John Pavlopoulos1, Alexandros Xenos2, Davide Picca3
1Stockholm University, 2Athens University of Economics and Business, 3University of Lausanne
                         Session: R17 - Parsing, Tagging, Grammar, Syntax, Morphology
   The Persian Dependency Treebank Made Universal
[Paper] [Poster] [Video]
Pegah Safari1, Mohammad Sadegh Rasooli2, Amirsaeid Moloodi3, Alireza Nourian4
1Shahid Beheshti University, 2University of Pennsylvania, 3Shiraz University at Iran, 4Iran University of Science and Technology
   GujMORPH - A Dataset for Creating Gujarati Morphological Analyzer
[Paper] [Video]
Jatayu Baxi and Brijesh Bhatt
Dharmsinh Desai University
   Informal Persian Universal Dependency Treebank
[Paper] [Poster] [Video]
Roya Kabiri, Simin Karimi, Mihai Surdeanu
University of Arizona
   Automatic Correction of Syntactic Dependency Annotation Differences
[Paper] [Video]
Andrew Zupon, Andrew Carnie, Michael Hammond, Mihai Surdeanu
University of Arizona
   Building Large-Scale Japanese Pronunciation-Annotated Corpora for Reading Heteronymous Logograms
[Paper] [Poster] [Video]
Fumikazu Sato1, Naoki Yoshinaga2, Masaru Kitsuregawa3
1The University of Tokyo / National Diet Library, 2Institute of Industrial Science, the University of Tokyo, 3Univ. of Tokyo
                         Session: R18 - Semantics (including Distributional Semantics, Word Sense Disambiguation, Coreference, etc.)
   StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands
[Paper] [Video]
Won Ik Cho1, Sangwhan Moon2, Jongin Kim3, Seokmin Kim3, Nam Soo Kim3
1Department of Electrical and Computer Engineering and INMC, Seoul National University, 2Tokyo Institute of Technology, 3Seoul National University
   Syntax-driven Approach for Semantic Role Labeling
[Paper] [Video]
Yuanhe Tian1, Han Qin2, Fei Xia3, Yan Song4
1Department of Linguistics, University of Washington, 2The Chinese University of Hong Kong (Shenzhen), 3University of Washington, 4CUHK-SZ
   HerBERT Based Language Model Detects Quantifiers and Their Semantic Properties in Polish
[Paper] [Video]
Marcin Woliński1, Bartłomiej Nitoń1, Witold Kieraś1, Jakub Szymanik2
1Institute of Computer Science, Polish Academy of Sciences, 2University of Amsterdam
   Lexical Resource Mapping via Translations
[Paper] [Video]
Hongchang Bao, Bradley Hauer, Grzegorz Kondrak
University of Alberta
   Unsupervised Attention-based Sentence-Level Meta-Embeddings from Contextualised Language Models
[Paper] [Video]
Keigo Takahashi1 and Danushka Bollegala2
1Tokyo Metropolitan University, 2University of Liverpool/Amazon
                         Session: R19 - Social Media Processing
   Identification of Fine-Grained Location Mentions in Crisis Tweets
[Paper] [Video]
Sarthak Khanal, Maria Traskowsky, Doina Caragea
Kansas State University
   HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection
[Paper] [Video]
Francielle Vargas1, Isabelle Carvalho1, Fabiana Rodrigues de Góes1, Thiago Pardo1, Fabrício Benevenuto2
1University of São Paulo, 2Federal University of Minas Gerais (UFMG)
   MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare
[Paper] [Video]
Shaoxiong Ji1, Tianlin Zhang2, Luna Ansari1, Jie Fu3, Prayag Tiwari4, Erik Cambria5
1Aalto University, 2The University of Manchester, 3Mila, University of Montreal, 4University of Padova, 5Nanyang Technological University
   Leveraging Hashtag Networks for Multimodal Popularity Prediction of Instagram Posts
[Paper] [Poster] [Video]
Yu Yun Liao
National Taiwan University
   Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis
[Paper] [Poster] [Video]
Hang Jiang1, Yining Hua2, Doug Beeferman3, Deb Roy1
1MIT, 2Harvard Medical School, 3MIT Media Lab
   Did that happen? Predicting Social Media Posts that are Indicative of what happened in a scene: A case study of a TV show
[Paper] [Video]
Anietie Andy1, Reno Kriz1, Sharath Chandra Guntuku1, Derry Tanti Wijaya2, Chris Callison-Burch1
1University of Pennsylvania, 2Boston University
   HashSet - A Dataset For Hashtag Segmentation
[Paper] [Video]
Prashant Kodali1, Akshala Bhatnagar2, Naman Ahuja1, Manish Shrivastava3, Ponnurangam Kumaraguru3
1IIIT Hyderabad, 2IIIT Delhi, 3International Institute of Information Technology Hyderabad
   Using Convolution Neural Network with BERT for Stance Detection in Vietnamese
[Paper] [Poster] [Video]
Oanh Tran1, Anh Phung2, Bach Ngo2
1International School, Vietnam National University, Hanoi, 2Posts and Telecommunications Institute of Technology, Vietnam
   Annotation-Scheme Reconstruction for "Fake News" and Japanese Fake News Dataset
[Paper] [Poster] [Video]
Taichi Murayama1, Shohei Hisada2, Makoto Uehara2, Shoko Wakamiya3, Eiji ARAMAKI4
1ISIR, Osaka University, 2NARA Institute of Science and Technology, 3NAIST, 4NAIST, Japan
   RoBERTuito: a pre-trained language model for social media text in Spanish
[Paper] [Poster] [Video]
Juan Manuel Pérez1, Damián Ariel Furman2, Laura Alonso Alemany3, Franco M. Luque4
1CONICET, Universidad de Buenos Aires, 2Universidad De Buenos Aires, 3Universidad Nacional de Cordoba, 4Universidad Nacional de Córdoba and CONICET
                         Session: R20 - Speech Resources and Processing (including Phonetic Databases, Phonology, Prosody)
   Construction of Responsive Utterance Corpus for Attentive Listening Response Production
[Paper] [Video]
Koichiro Ito1, Masaki Murata2, Tomohiro Ohno3, Shigeki Matsubara4
1Graduate School of Informatics, Nagoya University, 2Department of Information and Computer Engineering, National Institute of Technology, Toyota College, 3Tokyo Denki University, 4Nagoya University
   Speak: A Toolkit Using Amazon Mechanical Turk to Collect and Validate Speech Audio Recordings
[Paper] [Video]
Christopher Song1, David Harwath2, Tuka Alhanai3, James Glass4
1Johns Hopkins University, 2The University of Texas at Austin, 3NYUAD, 4Massachusetts Institute of Technology
   ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
[Paper] [Poster] [Video]
Holy Lovenia1, Samuel Cahyawijaya2, Genta Winata3, Peng Xu1, Yan Xu4, Zihan Liu4, Rita Frieske1, Tiezheng Yu1, Wenliang Dai4, Elham J. Barezi5, Qifeng Chen2, Xiaojuan Ma4, Bertram Shi6, Pascale Fung4
1The Hong Kong University of Science and Technology, 2HKUST, 3Bloomberg, 4Hong Kong University of Science and Technology, 5Department of Computer Science and Engineering, Hong Kong University of Science and Technology, 6ECE/HKUST
   A Romanization System and WebMAUS Aligner for Arabic Varieties
[Paper] [Poster] [Video]
Jalal Al-Tamimi1, Florian Schiel2, Ghada Khattab3, Navdeep Sokhey4, Djegdjiga Amazouz5, Abdulrahman Dallak6, Hajar Moussa7
1Université Paris Cité, CNRS, Laboratoire de linguistique formelle, 2Bavarian Archive for Speech Signals, 3Newcastle University, 4Virginia Polytechnic Institute and State University, 5Laboratoire de Phonétique et Phonologie (LPP)-CNRS, Université Sorbonne Nouvelle, 6Newcastle University, Newcastle upon Tyne, 7King Abdul-Aziz University
   BembaSpeech: A Speech Recognition Corpus for the Bemba Language
[Paper] [Poster] [Video]
Claytone Sikasote1 and Antonios Anastasopoulos2
1University of Zambia, 2George Mason University
   BehanceCC: A ChitChat Detection Dataset For Livestreaming Video Transcripts
[Paper] [Poster] [Video]
Viet Lai1, Amir Pouran Ben Veyseh1, Franck Dernoncourt2, Thien Huu Nguyen1
1University of Oregon, 2Adobe Research
   Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection
[Paper] [Video]
Sheng Li1, Jiyi Li2, Qianying Liu3, Zhuo Gong4
1National Institute of Information and Communications Technology (NICT), Advanced Speech Technology Laboratory, 2University of Yamanashi, 3Kyoto University, 4The University of Tokyo
   A new European Portuguese corpus for the study of Psychosis through speech analysis
[Paper] [Video]
Maria Forjó1, Daniel Neto2, Alberto Abad1, H. Sofia Pinto1, Joaquim Gago3
1INESC-ID/Instituto Superior Técnico, University of Lisbon, 2Serviço de Saúde da Região Autónoma da Madeira, 3Nova Medical School/Centro Hospitalar de Lisboa Ocidental
   Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks
[Paper] [Poster] [Video]
Aghilas SINI, Damien Lolive, Nelly Barbot, Pierre Alain
Univ Rennes, CNRS, IRISA
   Multilingual Transfer Learning for Children Automatic Speech Recognition
[Paper] [Poster] [Video]
Thomas Rolland1, Alberto Abad1, Catia Cucchiarini2, Helmer Strik2
1INESC-ID, 2CLST, Radboud University Nijmegen
   BehanceQA: A New Dataset for Identifying Question-Answer Pairs in Video Transcripts
[Paper] [Poster] [Video]
Amir Pouran Ben Veyseh1, Viet Lai1, Franck Dernoncourt2, Thien Huu Nguyen1
1University of Oregon, 2Adobe Research
                         Session: R21 - Statistical Methods and Machine Learning for Language Technologies (including Language Models)
   Bidirectional Skeleton-Based Isolated Sign Recognition using Graph Convolutional Networks
[Paper] [Video]
Konstantinos M. Dafnis1, Evgenia Chroni1, Carol Neidle2, Dimitri Metaxas3
1Rutgers University, 2Boston University, 3Rutgers Univ.
   Deep learning-based end-to-end spoken language identification system for domain-mismatched scenario
[Paper] [Poster] [Video]
Woohyun Kang1, Md Jahangir Alam2, Abderrahim Fathan3
1Computer Research Institute of Montreal, 2Computer Research Institute of Montreal (CRIM), 3Centre de Recherche en Informatique de Montréal (CRIM)
   Handwritten Character Generation using Y-Autoencoder for Character Recognition Model Training
[Paper] [Poster] [Video]
Tomoki Kitagawa, Chee Siang Leow, Hiromitsu Nishizaki
University of Yamanashi
   Attention is All you Need for Robust Temporal Reasoning
[Paper] [Poster] [Video]
Lis Kanashiro Pereira
Ochanomizu University
   PoliBERTweet: A Pre-trained Language Model for Analyzing Political Content on Twitter
[Paper] [Poster] [Video]
Kornraphop Kawintiranon and Lisa Singh
Georgetown University
   Modeling the Impact of Syntactic Distance and Surprisal on Cross-Slavic Text Comprehension
[Paper] [Poster] [Video]
Irina Stenger, Philip Georgis, Tania Avgustinova, Bernd Möbius, Dietrich Klakow
Saarland University
   BERTifying Sinhala - A Comprehensive Analysis of Pre-trained Language Models for Sinhala Text Classification
[Paper] [Poster] [Video]
Vinura Dhananjaya, Piyumal Demotte, Surangika Ranathunga, Sanath Jayasena
University of Moratuwa
   Pre-training and Evaluating Transformer-based Language Models for Icelandic
[Paper] [Poster] [Video]
Jón Guðnason and Hrafn Loftsson
Reykjavik University
                         End of Program