LREC 2022 - Accepted Papers

ID	Title	Authors
5	JADE: Corpus for Japanese Definition Modelling	Han Huang, Tomoyuki Kajiwara and Yuki Arase
7	HateBR: Large Expert Annotated Corpus of Brazilian Instagram Comments for Abusive Language Detection	Francielle Alves Vargas, Isabelle Carvalho, Fabiana Rodrigues de Góes, Thiago Alexandre Salgueiro Pardo and Fabrício Benevenuto
8	A Survey on the Rhetorical Approach for Online Deception Detection	Francielle Alves Vargas and Jonas D'Alessandro
9	Potential Idiomatic Expression (PIE)-English: Corpus for Classes of Idioms	Tosin Adewumi, Roshanak Vadoodi, Aparajita Tripathy, Konstantina Nikolaido, Foteini Liwicki and Marcus Liwicki
11	KC4MT: A High-Quality Corpus for Multilingual Machine Translation	Vinh Van Nguyen, Ha Tien Nguyen, Huong Thanh Le, Thai Phuong Nguyen, Tan Van Bui, Luan Nghia Pham, Anh Tuan Phan, Cong Hoang-Minh Nguyen, Viet Hong Tran and Anh Huu Tran
17	Align-smatch: A Novel Evaluation Method for Chinese Abstract Meaning Representation Parsing based on Alignment of Concept and Relation	Liming Xiao, Bin Li, Zhixing Xu, Kairui Huo, Minxuan Feng, Junsheng Zhou and Weiguang Qu
18	KazNERD: Kazakh Named Entity Recognition Dataset	Rustem Yeshpanov, Yerbolat Khassanov and Huseyin Atakan Varol
21	Towards Building a Spoken Dialogue System for Argument Exploration	Annalena Bea Aicher, Nadine Gerstenlauer, Isabel Feustel, Wolfgang Minker and Stefan Ultes
24	Interactive Evaluation of Dialog Track at DSTC9	Shikib Mehri, Yulan Feng, Carla Gordon, Seyed Hossein Alavi, David Traum and Maxine Eskenazi
26	FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue	Chanjun Park, Yoonna Jang, Seolhwa Lee, Sungjin Park and Heuiseok Lim
29	Connecting a French Dictionary from the Beginning of the 20th Century to Wikidata	Pierre Nugues
32	StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands	Won Ik Cho, Sangwhan Moon, Jong In Kim, Seokmin Kim and Nam Soo Kim
33	ViHealthBERT: Pre-trained Language Models for Vietnamese in Health Text Mining	Nguyen Phuc Minh, Vu Hoang Tran, Vu Hoang, Huy Duc Ta, Trung Huu Bui and Steven Quoc Hung Truong
34	Dynamic Human Evaluation for Relative Model Comparisons	Thórhildur Thorleiksdóttir, Cedric Renggli, Nora Hollenstein and Ce Zhang
35	Fine-Grained Error Analysis and Fair Evaluation of Labeled Spans	Katrin Ortmann
36	Towards Evaluation of Cross-document Coreference Resolution Models using Datasets with Diverse Annotation Schemes	Anastasia Zhukova, Felix Hamborg and Bela Gipp
37	N24News: A New Dataset for Multimodal News Classification	Zhen Wang, Xu Shan, Xiangxie Zhang and Jie Yang
38	SHARE: A Lexicon of Harmful Expressions by the Spanish Population	Flor Miriam Plaza-del-Arco, Ana Belén Parras Portillo, Pilar López Úbeda, Beatriz Botella Gil and María-Teresa Martín-Valdivia
40	Evaluating Gender Bias in Speech Translation	Marta R. Costa-jussà, Christine Basta and Gerard I. Gállego
41	x-enVENT: A Corpus of Event Descriptions with Experiencer-specific Emotion and Appraisal Annotations	Enrica Troiano, Laura Ana Maria Oberlaender, Maximilian Wegge and Roman Klinger
42	Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities	Alexander Gutkin, Cibu Johny, Raiomond Doctor, Lawrence Wolf-Sonkin and Brian Roark
43	CLeLfPC: a Large Open Multi-Speaker Corpus of French Cued Speech	Brigitte Bigi, Maryvonne Zimmermann and Carine André
44	Unsupervised Embeddings with Graph Auto-Encoders for Multi-domain and Multilingual Hate Speech Detection	Gretel Liz De la Peña Sarracén and Paolo Rosso
45	Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR)	Amelie Wührl and Roman Klinger
46	Analysis and Prediction of NLP Models via Task Embeddings	Damien Sileo and Marie-Francine Moens
48	Progress in Multilingual Speech Recognition for Low Resource Languages Kurmanji Kurdish, Cree and Inuktut	Vishwa Gupta and Gilles Boulianne
49	Self-Contained Utterance Description Corpus for Japanese Dialog	Yuta Hayashibe
50	Multitask Learning for Grapheme-to-Phoneme Conversion of Anglicisms in German Speech Recognition	Julia Maria Pritzen, Michael Gref, Dietlind Zühlke and Christoph Andreas Schmidt
51	Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition	Elena V. Epure and Romain Hennequin
52	Frustratingly Easy Performance Improvements for Cross-lingual Transfer: A Tale on BERT and Segment Embeddings	Rob van der Goot, Max Müller-Eberstein and Barbara Plank
54	Developing A Multilabel Corpus for the Quality Assessment of Online Political Talk	Kokil Jaidka
55	The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses	Bashar Alhafni, Nizar Habash and Houda Bouamor
56	Predicting the Proficiency Level of Nonnative Hebrew Authors	Isabelle Nguyen and Shuly Wintner
57	The GINCO Training Dataset for Web Genre Identification of Documents Out in the Wild	Taja Kuzman, Peter Rupnik and Nikola Ljubešić
59	SSR7000: A Synchronized Corpus of Ultrasound Tongue Imaging for End-to-End Silent Speech Recognition	Naoki Kimura and Zixiong Su
62	HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method	Yuxiang Zhang and Hayato Yamana
65	Mitigating Dataset Artifacts in Natural Language Inference Through Automatic Contextual Data Augmentation and Learning Optimization	Michail Mersinias and Panagiotis Valvis
67	Modeling Dutch Medical Texts for Detecting Functional Categories and Levels of COVID-19 Patients	Jenia Kim, Stella Verkijk, Edwin Geleijn, Marieke van der Leeden, Carel Meskers, Caroline Meskers, Sabina van der Veen, Piek T.J.M. Vossen and Guy Widdershoven
68	Extended Parallel Corpus for Amharic-English Machine Translation	Andargachew Mekonnen Gezmu, Andreas Nürnberger and Tesfaye Bayu Bati
69	LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild	David Gimeno-Gómez and Carlos-D. Martínez-Hinarejos
70	Surfer100: Generating Surveys From Web Resources, Wikipedia-style	Irene Li, Alex Fabbri, Rina Kawamura, Yixin Liu, Xiangru Tang, Jaesung Tae, Chang Shen, Sally Ma, Tomoe Mizutani and Dragomir Radev
72	Unraveling the Mystery of Artifacts in Machine Generated Text	Jiashu Pu, Ziyi Huang, Yadong Xi, Guandan Chen, Weijie Chen and Rongsheng Zhang
73	HyperBox: A Supervised Approach for Hypernym Discovery using Box Embeddings	Parmar Maulik and Apurva Narayan
74	An Annotated Arabic-English Bilingual Writer Corpus: Guidelines, Processes, and Insights	Nizar Habash and David Palfreyman
75	TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP	Steven Moran, Christian Bentz, Olga Sozinova, Ximena Gutierrez-Vasques and Tanja Samardzic
76	Predicting Embedding Reliability in Low-Resource Settings Using Corpus Similarity Measures	Jonathan Dunn, Haipeng Li and Damian Sastre
78	Angry or Sad? Emotion Annotation for Extremist Content Characterisation	Valentina Dragos, Delphine Battistelli, Aline Etienne and Yolène Constable
79	Identification of Multiword Expressions in Tweets for Hate Speech Detection	Nicolas Zampieri, Carlos Ramisch, Irina Illina and Dominique Fohr
80	Please, Don't Forget the Difference and the Confidence Interval when Seeking for the State-of-the-Art Status	Yves Bestgen
81	Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning	Mike Zhang, Kristian Nørgaard Jensen and Barbara Plank
83	PCR4ALL: A Comprehensive Evaluation Benchmark for Pronoun Coreference Resolution in English	Xinran Zhao, Hongming Zhang and Yangqiu Song
85	Aggregating Hierarchical Dialectal Data for Arabic Dialect Classification	Nurpeiis Baimukan, Nizar Habash and Houda Bouamor
86	Investigating User Radicalization: A Novel Dataset for Identifying Fine-Grained Temporal Shifts in Opinion	Flora Sakketou, Allison Claire Lahnala, Liane Vogel and Lucie Flek
90	A Pragmatics-Centered Evaluation Framework for Natural Language Understanding	Damien Sileo, Philippe Muller, Tim Van de Cruys and Camille Pradel
92	Metaphor annotation for German	Markus Egg and Valia Kordoni
93	A Framenet and Frame Annotator for German Social Media	Eckhard Bick
94	Cross-lingual Emotion Detection	Sabit Hassan, Shaden Shaar and Kareem Darwish
96	CoRoSeOf - An Annotated Corpus of Romanian Sexist and Offensive Tweets	Diana Constantina Hoefels, Çağrı Çöltekin and Irina Diana Mădroane
99	Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation	Idris Abdulmumin, Satya Ranjan Dash, Musa Abdullahi Dawud, Shantipriya Parida, Shamsuddeen Hassan Muhammad, Ibrahim Sa’id Ahmad, Subhadarshi Panda, Ondřej Bojar, Bashir Shehu Galadanci and Bello Shehu Bello
102	Are Embedding Spaces Interpretable? Results of an Intrusion Detection Evaluation on a Large French Corpus	Thibault Prouteau, Nicolas J. Dugué, Nathalie Camelin and Sylvain Meignier
104	DirectQuote: A Dataset for Direct Quotation Extraction and Attribution in News Articles	Yuanchi Zhang and Yang Liu
105	BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions	Nayla Escribano, Jon Ander Gonzalez, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre and Rodrigo Agerri
106	A Survey of Machine Translation Tasks on Nigerian Languages	Ebelechukwu C. Nwafor and Anietie Andy
108	Multimodal Pipeline for Collection of Misinformation Data from Telegram	Jose Angel Sosa and Serge Sharoff
110	Representing the Toddler Lexicon: Which Corpus and Semantic Metric?	Jennifer M. Weber and Eliana Colunga
111	Multi-Task Learning for Cross-Lingual Abstractive Summarization	Sho Takase and Naoaki Okazaki
112	Extracting Space Situational Awareness Events from News Text	Zhengnan Xie, Alice Saebom Kwak, Enfa George, Laura W. Dozal, Hoang Van, Moriba Jah, Roberto Furfaro and Peter Jansen
113	PerCQA: Persian Community Question Answering Dataset	Naghme Jamali, Yadollah Yaghoobzadeh and Heshaam Faili
114	The Persian Dependency Treebank Made Universal	Pegah Safari, Mohammad Sadegh Rasooli, Amirsaeid Moloodi and Alireza Nourian
115	Privacy-Preserving Graph Convolutional Networks for Text Classification	Ivan Habernal and Timour Igamberdiev
118	LeSpell - A Multi-Lingual Benchmark Corpus of Spelling Errors to Develop Spellchecking Methods for Learner Language	Marie Bexte, Ronja Laarmann-Quante, Andrea Horbach and Torsten Zesch
122	Turkish Universal Conceptual Cognitive Annotation	Necva Bölücü and Burcu Can
124	Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR	Nina Markl and Stephen Joseph McNulty
125	Semantic Role Labelling for Dutch Law Texts	Roos M. Bakker, Romy A.N. van Drie, Maaike de Boer, Robert van Doesburg and Tom van Engers
130	MS-LaTTE: A Dataset of Where and When To-do Tasks are Completed	Sujay Kumar Jauhar, Nirupama Chandrasekaran, Michael Gamon and Ryen White
131	The Subject Annotations of the Danish Parliament Corpus (2009-2017) v.2 and their Evaluation through Multilabel Classification	Costanza Navarretta and Dorte Haltrup Hansen
132	Causal Investigation of Public Opinion during the COVID-19 Pandemic via Social Media Text	Michael Jantscher and Roman Kern
133	Subjective Text Complexity Assessment for German	Laura Seiffe, Fares Kallel, Roland Roller, Sebastian Möller and Babak Naderi
134	CRASS: A Novel Data Set and Benchmark to Test Counterfactual Reasoning of Large Language Models	Jörg Frohberg and Frank Binder
137	Querying Interaction Structure: Approaches to Overlap in Spoken Language Corpora	Elena Frick, Thomas Schmidt and Henrike Helmer
138	Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset	Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi and Pascale Fung
141	HADREB: Human Appraisals and (English) Descriptions of Robot Emotional Behaviors	Josue Torres-Fonsesca and Casey Kennington
143	English Language Spelling Correction as an Information Retrieval Task Using Wikipedia Search Statistics	Kyle Goslin and Markus Hofmann
144	Survey on Thai NLP Language Resources and Tools	Ratchakrit Arreerard, Stephen Mander and Scott Piao
146	MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare	Shaoxiong Ji, Tianlin Zhang, Luna Ansari, Jie Fu, Prayag Tiwari and Erik Cambria
147	KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics	Saida Mussakhojayeva, Yerbolat Khassanov and Huseyin Atakan Varol
148	Neural Machine Translation for Low-resource African Languages: Benchmarking State-of-the-art Transformer for Wolof	Cheikh M. Bamba Dione, Alla Lo, Elhadji Mamadou Nguer and Sileye Ba
151	Criteria for Useful Automatic Romanization in South Asian Languages	Isin Demirsahin, Cibu Johny, Alexander Gutkin and Brian Roark
153	Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction	Mehdi Mirzapour, Waleed Ragheb, Mohammad Javad Saeedizade, Kevin Cousot, Helene Jacquenet, Lawrence Carbon and Mathieu Lafourcade
154	Identifying Tension in Holocaust Survivors’ Interview: Code-switching/Code-mixing as Cues	Xinyuan Xia, Lu Xiao, Kun Yang and Yueyue Wang
155	Placing multi-modal, and multi-lingual Data in the Humanities Domain on the Map: the Mythotopia Geo-tagged Corpus	Voula Giouli, Anna Vacalopoulou, Nikolaos Sidiropoulos, Christina Flouda, Athanasios Doupas, Giorgos Giannopoulos, Nikos Bikakis, Vassilis Kaffes and Gregory Stainhaouer
156	Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task	Mikhail Lepekhin and Serge Sharoff
157	LaoPLM: Pre-trained Language Models for Lao	Nankai Lin, Yingwen Fu, Chuwei Chen, Ziyu Yang and Shengyi Jiang
158	Leveraging Hashtag Networks for Multimodal Popularity Prediction of Instagram Posts	Yu Yun Liao
159	Generating Questions from Wikidata Triples	Kelvin Han, Thiago Castro Ferreira and Claire Gardent
160	The IARPA BETTER Program Abstract Task Four New Semantically Annotated Corpora from IARPA’s BETTER Program	Timothy Mckinnon and Carl Rubino
162	Investigating Active Learning Sampling Strategies for Extreme Multi Label Text Classification	Lukas Fromme, Katsiaryna Mirylenka, Jonas Kuhn and Jasmina Bogojeska
164	Identification of Fine-Grained Location Mentions in Crisis Tweets	Sarthak Khanal, Maria Traskowsky and Doina Caragea
166	An Architecture of resolving a multiple link path in a standoff-style data format to enhance the mobility of language resources	Kazushi Ohya
168	DiaBiz – an Annotated Corpus of Polish Call Center Dialogs	Piotr Pęzik, Gosia Krawentek, Sylwia Karasińska, Paweł Wilk, Paulina Rybińska, Anna Cichosz, Angelika Peljak-Łapińska, Mikołaj Deckert and Michał Adamczyk
169	ArMATH: a Dataset for Solving Arabic Math Word Problems	Reem Ali Alghamdi, Zhenwen Liang and Xiangliang Zhang
171	BILinMID: A Spanish-English Corpus of the US Midwest	Irati Hurtado
173	Efficiently and Thoroughly Anonymizing a Transformer Language Model for Dutch Electronic Health Records: a Two-Step Method	Stella Verkijk and Piek T.J.M. Vossen
175	Aligning Images and Text with Semantic Role Labels for Fine-Grained Cross-Modal Understanding	Abhidip Bhattacharyya, Cecilia Mauceri, Martha Palmer and Christoffer Heckman
176	VaccineLies: A Natural Language Resource for Learning to Recognize Misinformation about the COVID-19 and HPV Vaccines	Maxwell Weinzierl and Sanda Harabagiu
178	CoFiF Plus: A French Financial Narrative Summarisation Corpus	Nadhem Zmandar, Tobias Daudert, Sina Ahmadi, Mahmoud El-Haj and Paul Rayson
179	EENLP: Cross-lingual Eastern European NLP Index	Alexey Tikhonov, Alex Malkhasov, Andrey Manoshin, George-Andrei Dima, Réka Cserháti, Md. Sadek Hossain Asif and Matt Sárdi
180	Cross-lingual Linking of Automatically Constructed Frames and FrameNet	Ryohei Sasano
181	DialCrowd 2.0: A Quality-Focused Dialog System Crowdsourcing Toolkit	Jessica Huynh, Ting-Rui Chiang, Jeffrey Bigham and Maxine Eskenazi
182	BERTology for Machine Translation: What BERT knows about linguistic difficulties for translation	Yuqian Dai, Marc de Kamps and Serge Sharoff
183	Evaluation Benchmarks for Spanish Sentence Representations	Vladimir Araujo, Andrés Carvallo, Souvik Kundu, José Cañete, Marcelo Mendoza, Robert E. Mercer, Felipe Bravo-Marquez, Marie-Francine Moens and Alvaro M. Soto
184	Samrómur Children: An Icelandic Speech Corpus	Carlos Daniel Hernandez Mena, David Erik Mollberg, Michal Borský and Jón Guðnason
185	GrASP: A Library for Extracting and Exploring Human-Interpretable Textual Patterns	Piyawat Lertvittayakumjorn, Leshem Choshen, Eyal Shnarch and Francesca Toni
189	CrudeOilNews: An Annotated Crude Oil News Corpus for Event Extraction	Meisin Lee, Lay-Ki Soon, Eu Gene Siew and Ly Fie Sugianto
190	Wiktextract: Wiktionary as Machine-Readable Structured Data	Tatu Ylonen
191	One Document, Many Revisions: A Dataset for Classification and Description of Edit Intents	Dheeraj Rajagopal, Xuchao Zhang, Michael Gamon, Sujay Kumar Jauhar, Diyi Yang and Eduard Hovy
192	LaVA – Latvian Language Learner corpus	Roberts Darģis, Ilze Auziņa, Inga Kaija, Kristīne Levāne-Petrova and Kristīne Pokratniece
195	CVSS Corpus and Massively Multilingual Speech-to-Speech Translation	Ye Jia, Michelle Tadmor Ramanovich, Quan Wang and Heiga Zen
196	A Graph-Based Method for Unsupervised Knowledge Discovery from Financial Texts	Joel Oksanen, Abhilash Majumder, Kumar Saunack, Francesca Toni and Arun Dhondiyal
197	A Corpus of German Citizen Contributions in Mobility Planning: Supporting Evaluation Through Multidimensional Classification	Julia Romberg, Laura Mark and Tobias Escher
198	Tackling Irony Detection using Ensemble Classifiers	Christoph Peter Turban and Udo Kruschwitz
199	A Large Interlinked Knowledge Graph of the Italian Cultural Heritage	Stefano Faralli, Andrea Lenzi and Paola Velardi
200	Claim Extraction and Law Matching for COVID-19-related Legislation	Niklas Dehio, Malte Ostendorff and Georg Rehm
201	A Brief Survey on Textual Dialogue Corpora	Hugo Gonçalo Oliveira, Patrícia Ferreira, Daniel Martins, Catarina Silva and Ana Alves
203	Identification and Analysis of Personification in Hungarian: The PerSECorp project	Gábor Simon
204	Organizing and Improving a Database of French Word Formation Using Formal Concept Analysis	Nyoman Juniarta, Olivier Bonami, Nabil Hathout, Fiammetta Namer and Yannick Toussaint
206	Fine-tuning vs From Scratch: Do Vision & Language Models Have Similar Capabilities on Out-of-Distribution Visual Question Answering?	Kristian Nørgaard Jensen and Barbara Plank
207	Few-Shot Learning for Argument Aspects of the Nuclear Energy Debate	Lena Jurkschat, Gregor Wiedemann, Maximilian Heinrich, Mattes Ruckdeschel and Sunna Torge
208	A Systematic Study Reveals Unexpected Interactions in Pre-Trained Neural Machine Translation	Ashleigh Isabella Richardson and Janet Wiles
209	Empirical Analysis of Noising Scheme based Synthetic Data Generation for Automatic Post-editing	Hyeonseok Moon, Chanjun Park, Seolhwa Lee, Jaehyung Seo, Jungseob Lee, Sugyeong Eo and Heuiseok Lim
215	BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment	Luis Lebron, Yvette Graham, Kevin McGuinness, Konstantinos Kouramas and Noel E. O'Connor
218	CTAP for Chinese: A Linguistic Complexity Feature Automatic Calculation Platform	Yue Cui, Junhui Zhu, Liner Yang, Xuezhi Fang, Xiaobin Chen, Yujie Wang and Erhong Yang
221	EPIC UdS - Creation and Applications of a Simultaneous Interpreting Corpus	Heike Przybyl, Ekaterina Lapshinova-Koltunski, Katrin Menzel, Stefan Fischer and Elke Teich
223	The EuroPat Corpus: A Parallel Corpus of European Patent Data	Kenneth Heafield, Elaine Farrow, Jelmer van der Linde, Gema Ramírez-Sánchez and Dion Wiggins
224	KIMERA: Injecting Domain Knowledge into Vacant Transformer Heads	Benjamin Winter, Alexei Figueroa Rosero, Alexander Löser, Felix Alexander Gers and Amy Siu
225	GujMORPH - A Dataset for Creating Gujarati Morphological Analyzer	Jatayu Baxi and Brijesh Bhatt
227	The Maaloula Aramaic Speech Corpus (MASC): From Printed Material to a Lemmatized and Time-Aligned Corpus	Ghattas Eid, Esther Seyffarth and Ingo Plag
228	Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing	Zhaoxin Luo and Michael Zhu
229	FQuAD2.0: French Question Answering and Learning When You Don't Know	Quentin Heinrich, Gautier Viaud and Wacim Belblidia
230	German Light Verb Constructions in Business Process Models	Kristin Kutzner and Ralf Laue
231	GerEO: A Large-Scale Resource on the Syntactic Distribution of German Experiencer-Object Verbs	Johanna Poppek, Simon Masloch and Tibor Kiss
232	Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset	Svetla Koeva, Ivelina Stoyanova and Jordan Kralev
233	Evaluating Transformer Language Models on Arithmetic Operations Using Number Decomposition	Matteo Muffo, Aldo Cocco and Enrico Bertino
235	Leveraging Mental Health Forums for User-level Depression Detection on Social Media	Sravani Boinepelli, Tathagata Raha, Harika Abburi, Pulkit Parikh, Niyati Chhaya and Vasudeva Varma
239	Modality Alignment between Deep Representations for Effective Video-and-Language Learning	Hyeongu Yun, Yongil Miles Kim and Kyomin Jung
240	VIMQA: A Vietnamese Dataset for Advanced Reasoning and Explainable Multi-hop Question Answering	Khang Nguyen Le, Hien Dieu Nguyen, Tung Le Thanh and Minh Nguyen
244	The Lexometer: A Shiny Application for Exploratory Analysis and Visualization of Corpus Data	Oufan Hai, Matthew Sundberg, Katherine Trice, Rebecca Friedman and Scott Grimm
245	Holistic Evaluation of Automatic TimeML Annotators	Mustafa Ocal, Adrian Perez, Antonela Radas and Mark Finlayson
248	Construction of Responsive Utterance Corpus for Attentive Listening Response Production	Koichiro Ito, Masaki Murata, Tomohiro Ohno and Shigeki Matsubara
250	A Corpus for Suggestion Mining of German Peer Feedback	Dominik Pfütze, Eva Ritz, Julius Janda and Roman Rietsche
252	Domain Mismatch Doesn't Always Prevent Cross-lingual Transfer Learning	Daniel Edmiston, Phillip Keung and Noah A. Smith
253	PhysNLU: A Language Resource for Evaluating Natural Language Understanding and Explanation Coherence in Physics	Jordan C. Meadows, Zili Zhou and André Freitas
255	Overlooked Data in Typological Databases: What Grambank Teaches Us About Gaps in Grammars	Jakob Lesage, Hannah J. Haynie, Hedvig Skirgård, Tobias Weber and Alena Witzlack-Makarevich
257	Cross-Lingual Knowledge Transfer for Clinical Phenotyping	Jens-Michalis Papaioannou, Paul Grundmann, Betty van Aken, Athanasios Samaras, Ilias Kyparissidis, George Giannakoulas, Felix Gers and Alexander Loeser
258	NorDiaChange: Diachronic Semantic Change Dataset for Norwegian	Andrey Kutuzov, Samia Touileb, Petter Mæhlum, Tita Ranveig Enstad and Alexandra Wittemann
259	Speak: A Toolkit Using Amazon Mechanical Turk to Collect and Validate Speech Audio Recordings	Christopher Song, David Harwath, Tuka Alhanai and James Glass
261	CLGC: A Corpus for Chinese Literary Grace Evaluation	Yi Li, Dong Yu and Pengyuan Liu
264	The Multilingual Microblog Translation Corpus: Improving and Evaluating Translation of User-Generated Text	Paul McNamee and Kevin Duh
265	Constructing A Dataset of Support and Attack Relations in Legal Arguments in Court Judgements using Linguistic Rules	Basit Ali, Sachin Pawar, Girish Palshikar and Rituraj Singh
268	Training on Lexical Resources	Kenneth Ward Church, Xingyu Cai and Yuchen Bian
269	Evaluating Pre-training Objectives for Low-Resource Translation into Morphologically Rich Languages	Prajit Dhar, Arianna Bisazza and Gertjan van Noord
270	Priming Ancient Korean Neural Machine Translation	Chanjun Park, Seolhwa Lee, Jaehyung Seo, Hyeonseok Moon, Sugyeong Eo and Heuiseok Lim
271	Distilling the Knowledge of Romanian BERTs Using Multiple Teachers	Andrei-Marius Avram, Darius Catrina, Dumitru-Clementin Cercel, Mihai Dascalu, Traian Rebedea, Vasile Pais and Dan Ioan Tufis
274	Sign Language Production With Avatar Layering: A Critical Use Case over Rare Words	Jung-Ho Kim, Eui Jun Hwang, Sukmin Cho, Du Hui Lee and Jong Park
275	Aspect-Based Emotion Analysis and Multimodal Coreference: A Case Study of Customer Comments on Adidas Instagram Posts	Luna De Bruyne, Akbar Karimi, Orphee De Clercq, Andrea Prati and Veronique Hoste
277	PoS Tagging, Lemmatization and Dependency Parsing of West Frisian	Wilbert Heeringa, Gosse Bouma, Martha Hofman, Eduard Drenth, Jan Wijffels and Hans Van de Velde
279	Language Identification for Austronesian Languages	Jonathan Dunn and Wikke Nijhof
280	ArMIS - The Arabic Misogyny Corpus with Annotator Subjective Disagreements	Dina Almanea and Massimo Poesio
281	A Generalizing Approach to Protest Event Detection in German Local News	Gregor Wiedemann, Jan Matti Dollbaum, Sebastian Haunss, Priska Daphi and Larissa Daria Meier
282	Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis	Hang Jiang, Yining Hua, Doug Beeferman and Deb Roy
283	Hong Kong: Longitudinal and Synchronic Characterisations of Protest News between 1998 and 2020	Arya D. McCarthy and Giovanna Maria Dora Dore
287	A Named Entity Recognition Corpus for Vietnamese Biomedical Texts to Support Tuberculosis Treatment	Uyen T.P. Phan, Phuong N.V. Nguyen and Nhung Nguyen
290	HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French	Amalia Todirascu, Rodrigo Wilkens, Eva Rolin, Thomas François, Delphine Bernhard and Núria Gala
295	The Badalona Corpus - An Audio, Video and Neuro-Physiological Conversational Dataset	Philippe Blache, Salomé Antoine, Dorina De Jong, Lena-Marie Huttner, Emilia Kerr, Thierry Legou, Eliot Maës and Clément François
298	Extracting Age-Related Stereotypes from Social Media Texts	Kathleen C. Fraser, Svetlana Kiritchenko and Isar Nejadgholi
300	Multi-source Multi-domain Sentiment Analysis with BERT-based Models	Gabriel Roccabruna, Steve Azzolin and Giuseppe Riccardi
302	Misspelling Semantics in Thai	Pakawat Nakwijit and Matthew Purver
303	Large-Scale Hate Speech Detection with Cross-Domain Transfer	Cagri Toraman, Furkan Şahinuç and Eyup Halit Yilmaz
305	Did that happen? Predicting Social Media Posts that are Indicative of what happened in a scene: A case study of a TV show	Anietie Andy, Reno Kriz, Sharath Chandra Guntuku, Derry Tanti Wijaya and Chris Callison-Burch
309	ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation	Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Yan Xu, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi and Pascale Fung
310	Deep learning-based end-to-end spoken language identification system for domain-mismatched scenario	Woohyun Kang, Md Jahangir Alam and Abderrahim Fathan
311	Informal Persian Universal Dependency Treebank	Roya Kabiri, Simin Karimi and Mihai Surdeanu
313	Towards Speech-only Opinion-level Sentiment Analysis	Annalena Aicher, Alisa Gazizullina, Aleksei Gusev, Yuri Matveev and Wolfgang Minker
314	Annotating Interruption in Dyadic Human Conversation	Liu Yang, Catherine Achard and Catherine Pelachaud
315	Personalized Filled-pause Generation with Group-wise Prediction Models	Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi and Hiroshi Saruwatari
317	Evaluating the Effects of Embedding with Speaker Identity Information in Dialogue Summarization	Yuji Naraki, Tetsuya Sakai and Yoshihiko Hayashi
318	A Mapudüngun FST Morphological Analyser and its Web Interface	Andrés Chandía
319	Annotation-Scheme Reconstruction for "Fake News" and Japanese Fake News Dataset	Taichi Murayama, Shohei Hisada, Makoto Uehara, Shoko Wakamiya and Eiji Aramaki
320	Korean-Specific Dataset for Table Question Answering	Changwook Jun, Jooyoung Choi, Myoseop Sim, Hyun Kim, Hansol Jang and Kyungkoo Min
321	The Norwegian Parliamentary Speech Corpus	Per Erik Solberg and Pablo Ortiz
322	Development of a Benchmark Corpus to Support Entity Recognition in Job Descriptions	Thomas Alexander Fleming Green, Diana Maynard and Chenghua Lin
323	Multilingual and Multimodal Learning for Brazilian Portuguese	Júlia Yumi Araújo Sato, Helena Caseli and Lucia Specia
324	NyLLex: A Novel Resource of Swedish Words Annotated with Reading Proficiency Level	Daniel Holmer and Evelina Rennes
327	Perceived Text Quality and Readability in Extractive and Abstractive Summaries	Julius Monsen and Evelina Rennes
328	Making a Semantic Event-type Ontology Multilingual	Zdenka Uresova, Karolina Zaczynska, Peter Bourgonje, Eva Fučíková, Georg Rehm and Jan Hajic
329	Towards Speaker Verification for Crowdsourced Speech Collections	John Mendonca, Rui Correia, Mariana Lourenço, João Freitas and Isabel Trancoso
331	AiRO - an Interactive Learning Tool for Children at Risk of Dyslexia	Peter Juel Henrichsen and Stine Fuglsang Engmose
332	LibriS2S: A German-English Speech-to-Speech Translation Corpus	Pedro Jeuris and Jan Niehues
333	Slovene SuperGLUE Benchmark: Translation and Evaluation	Aleš Žagar and Marko Robnik-Šikonja
334	RaFoLa: A Rationale-Annotated Corpus for Detecting Indicators of Forced Labour	Erick Andres Mendez Guzman, Viktor Schlegel and Riza Batista-Navarro
335	JParaCrawl v3.0: A Large-scale English-Japanese Parallel Corpus	Makoto Morishita, Katsuki Chousa, Jun Suzuki and Masaaki Nagata
336	TallVocabL2Fi: An Extensive Mapping of 15 Finnish L2 Learners’ Vocabulary	Frankie Robertson, Li-Hsin Chang and Sini Söyrinki
338	A Speech Recognizer for Frisian/Dutch Council Meetings	Martijn Bentum, Louis ten Bosch, Henk van den Heuvel, Simone Wills, Domenique van der Niet, Jelske Dijkstra and Hans Van de Velde
339	CLISTER: A Corpus for Semantic Textual Similarity in French Clinical Narratives	Nicolas Hiebel, Olivier Ferret, Karën Fort and Aurélie Névéol
341	Increasing CMDI’s Semantic Interoperability with schema.org	Nino Meisinger, Thorsten Trippel and Claus Zinn
342	DiHuTra: a Parallel Corpus to Analyse Differences between Human Translations	Ekaterina Lapshinova-Koltunski, Maja Popović and Maarit Koponen
344	Measuring Uncertainty in Translation Quality Evaluation (TQE)	Serge Gladkoff, Irina Sorokina, Lifeng Han and Alexandra Alekseeva
345	HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professional Post-Editing Towards More Effective MT Evaluation	Serge Gladkoff and Lifeng Han
346	A Distant Supervision Corpus for Extracting Biomedical Relationships Between Chemicals, Diseases and Genes	Dongxu Zhang, Sunil Mohan, Michaela Torkar and Andrew McCallum
347	Collection and Analysis of Travel Agency Task Dialogues with Age-Diverse Speakers	Michimasa Inaba, Yuya Chiba, Ryuichiro Higashinaka, Kazunori Komatani, Yusuke Miyao and Takayuki Nagai
348	Learning How to Translate North Korean through South Korean	Hwichan Kim, Sangwhan Moon, Naoaki Okazaki and Mamoru Komachi
350	Elderly Conversational Speech Corpus with Cognitive Impairment Test and Pilot Dementia Detection Experiment Using Acoustic Characteristics of Speech in Japanese Dialects	Meiko Fukuda, Ryota Nishimura, Maina Umezawa, Kazumasa Yamamoto, Yurie Iribe and Norihide Kitaoka
353	A Spoken Drug Prescription Dataset in French for Spoken Language Understanding	Ali Can Kocabiyikoglu, François Portet, Prudence Gibert, Hervé Blanchon, Jean-Marc Babouchkine and Gaëtan Gavazzi
355	Quality and Efficiency of Manual Annotation: Pre-annotation Bias	Marie Mikulová, Milan Straka, Jan Štěpánek, Barbora Štěpánková and Jan Hajic
356	Creating a Basic Language Resource Kit for Faroese	Annika Simonsen, Sandra Saxov Lamhauge, Iben Nyholm Debess and Peter Juel Henrichsen
357	Anonymising the SAGT Speech Corpus and Treebank	Özlem Çetinoğlu and Antje Schweitzer
358	Speech Resources in the Tamasheq Language	Marcely Zanon Boito, Fethi Bougares, Florentin Barbier, Souhir Gahbiche, Loïc Barrault, Mickael Rouvier and Yannick Estève
360	Language Technologies for the Creation of Multilingual Terminologies. Lessons Learned from the SSHOC Project	Federica Gamba, Francesca Frontini, Daan Broeder and Monica Monachini
362	ProDial - An Annotated Proactive Dialogue Act Corpus for Conversational Assistants using Crowdsourcing	Matthias Kraus, Nicolas Wagner and Wolfgang Minker
363	Automatic Construction of Annotated Corpus with Implicit Aspect	Aye Aye Mar and Kiyoaki Shirai
365	Automatic Detection of Stigmatizing Uses of Psychiatric Terms on Twitter	Véronique MORICEAU, Farah Benamara and Abdelmoumene Boumadane
366	KIND: an Italian Multi-Domain Dataset for Named Entity Recognition	Alessio Palmero Aprosio and Teresa Paccosi
367	Transformer versus LSTM Language Models trained on Uncertain ASR Hypotheses in Limited Data Scenarios	Imran Sheikh, Emmanuel Vincent and Irina Illina
368	Argument Similarity Assessment in German for Intelligent Tutoring: Crowdsourced Dataset and First Experiments	Xiaoyu Bai and Manfred Stede
369	Semantic Relations between Text Segments for Semantic Storytelling: Annotation Tool - Dataset - Evaluation	Michael Raring, Malte Ostendorff and Georg Rehm
371	GerCCT: An Annotated Corpus for Mining Arguments in German Tweets on Climate Change	Robin Schaefer and Manfred Stede
373	The Causal News Corpus: Annotating Causal Relations in Event Sentences from News	Fiona Anting Tan, Ali Hürriyetoğlu, Tommaso Caselli, Nelleke Oostdijk, Tadashi Nomoto, Hansi Hettiarachchi, Iqra Ameer, Onur Uca, Farhana Ferdousi Liza and Tiancheng Hu
374	FGraDA: A Dataset and Benchmark for Fine-Grained Domain Adaptation in Machine Translation	Wenhao Zhu, Shujian Huang, Tong Pu, Pingxuan Huang, Xu Zhang, Jian Yu, Wei Chen, Yanfeng Wang and Jiajun CHEN
375	"Beste Grüße, Maria Meyer" — Pseudonymization of Privacy-Sensitive Information in Emails	Elisabeth Eder, Michael Wiegand, Ulrike Krieg-Holz and Udo Hahn
377	Budget Argument Mining Dataset Using Japanese Minutes from the National Diet and Local Assemblies	Yasutomo Kimura, Hokuto Ototake and Minoru Sasaki
381	A Dataset of Offensive German Language Tweets Annotated for Speech Acts	Melina Plakidis and Georg Rehm
382	Samrómur: Crowd-sourcing large amounts of data	Staffan J. S. Hedström, David Erik Mollberg, Ragnheiður Þórhallsdóttir and Jón Guðnason
383	Trends, Limitations and Open Challenges in Automatic Readability Assessment Research	Sowmya Vajjala
385	UMUTextStats: A linguistic feature extraction tool for Spanish	José Antonio García-Díaz, Pedro José Vivancos-Vicente, Ángela Almela and Rafael Valencia-García
387	Criteria for the Annotation of Implicit Stereotypes	Wolfgang Schmeisser-Nieto, Montserrat Nofre and Mariona Taulé
388	What do we really know about State of the Art NER?	Sowmya Vajjala and Ramya Balasubramaniam
389	Common Phone: A Multilingual Dataset for Robust Acoustic Modelling	Philipp Klumpp, Tomas Arias, Paula Andrea Pérez-Toro, Elmar Noeth and Juan Rafael Orozco-Arroyave
390	An Annotated Corpus of Textual Explanations for Clinical Decision Support	Roland Roller, Aljoscha Burchardt, Nils Feldhus, Laura Seiffe, Klemens Budde, Simon Ronicke and Bilgin Osmanodja
391	Towards an Open-Source Dutch Speech Recognition System for the Healthcare Domain	Cristian Tejedor-García, Berrie van der Molen, Henk van den Heuvel, Arjan van Hessen and Toine Pieters
392	Work Hard, Play Hard: Collecting Acceptability Annotations through a 3D Game	Federico Bonetti, Elisa Leonardelli, Daniela Trotta, Raffaele Guarasci and Sara Tonelli
394	A Dataset for Speech Emotion Recognition in Greek Theatrical Plays	Maria Moutti, Sofia Eleftheriou, Panagiotis Koromilas and Theodoros Giannakopoulos
396	Aesop's fable "The North Wind and the Sun" Used as a Rosetta Stone to Extract and Map Spoken Words in Under-resourced Languages	Elena Knyazeva, Philippe Boula de Mareüil and Frédéric Vernier
397	Logic-Guided Message Generation from Raw Real-Time Sensor Data	Ernie Chang, Alisa Kovtunova, Stefan Borgwardt, Vera Demberg, Kathryn Chapman and Hui-Syuan Yeh
398	Mutual Gaze and Linguistic Repetition in a Multimodal Corpus	Anais Murat, Maria Koutsombogera and Carl Vogel
400	BERTrade: Using Contextual Embeddings to Parse Old French	Loïc Grobol, Mathilde Regnault, Pedro Javier Ortiz Suarez, Benoît Sagot, Laurent Romary and Benoit Crabbé
402	The VoxWorld Platform for Multimodal Embodied Agents	Nikhil Krishnaswamy, William Pickard, Brittany Cates, Nathaniel Blanchard and James Pustejovsky
406	MemoSen: A Multimodal Dataset for Sentiment Analysis of Memes	Eftekhar Hossain, Omar Sharif and Mohammed Moshiul Hoque
407	Multilingual Open Text 1.0: Public Domain News in 44 Languages	Chester Palen-Michel, June Kim and Constantine Lignos
409	Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding	Yanjun Gao, Dmitriy Dligach, Timothy Miller, Samuel Tesch, Ryan Laffin, Matthew M. Churpek and Majid Afshar
411	MultiSubs: A Large-scale Multimodal and Multilingual Dataset	Josiah Wang, Josiel Figueiredo and Lucia Specia
412	Downstream Task Performance of BERT Models Pre-Trained Using Automatically De-Identified Clinical Data	Thomas Vakili, Anastasios Lamproudis, Aron Henriksson and Hercules Dalianis
413	Ethical Issues in Language Resources and Language Technology – Tentative Categorisation	Pawel Kamocki and Andreas Witt
415	A Unified Approach to Entity-Centric Context Tracking in Social Conversations	Ulrich Rückert, Srinivas Kumar Sunkara, Abhinav Rastogi, Sushant Prakash and Pranav Khaitan
416	Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction	Daisuke Suzuki, Yujin Takahashi, Ikumi Yamashita, Taichi Aida, Tosho Hirasawa, Michitaka Nakatsuji, Masato Mita and Mamoru Komachi
418	TweetTaglish: A Dataset for Investigating Tagalog-English Code-Switching	Megan Herrera, Ankit Aich and Natalie Parde
421	Context-based Virtual Adversarial Training for Text Classification with Noisy Labels	Do-Myoung Lee, Yeachan Kim and Chang gyun Seo
424	CoVERT: A Corpus of Fact-checked Biomedical COVID-19 Tweets	Isabelle Mohr, Amelie Wührl and Roman Klinger
425	SDS-200: A Swiss German Speech to Standard German Text Corpus	Michel Plüss, Manuela Hürlimann, Marc André Cuny, Alla Stöckli, Nikolaos Kapotis, Julia Hartmann, Malgorzata Anna Ulasik, Christian Scheller, Yanick Schraner, Amit Jain, Jan Deriu, Mark Cieliebak and Manfred Vogel
428	LARD: Large-scale Artificial Disfluency Generation	Tatiana Passali, Thanassis Mavropoulos, Grigorios Tsoumakas, Georgios Meditskos and Stefanos Vrochidis
430	Proficiency Matters Quality Estimation in Grammatical Error Correction	Yujin Takahashi, Masahiro Kaneko, Masato Mita and Mamoru Komachi
432	Reading Time and Vocabulary Rating in the Japanese Language: Large-Scale Japanese Reading Time Data Collection Using Crowdsourcing	Masayuki Asahara
434	CoQAR: Question Rewriting on CoQA	Quentin Brabant, Gwénolé Lecorvé and Lina M. Rojas Barahona
435	Developing a Spell and Grammar Checker for Icelandic using an Error Corpus	Hulda Óladóttir, Þórunn Arnardóttir, Anton Ingason and Vilhjálmur Þorsteinsson
436	SansTib, a Sanskrit - Tibetan Parallel Corpus and Bilingual Sentence Embedding Model	Sebastian Nehrdich
438	User Interest Modelling in Argumentative Dialogue Systems	Annalena Bea Aicher, Nadine Gerstenlauer, Wolfgang Minker and Stefan Ultes
440	Audiobook Dialogues as Training Data for Conversational Style Synthetic Voices	Liisi Piits, Hille Pajupuu, Heete Sahkai, Rene Altrov, Liis Ermus, Kairi Tamuri, Indrek Hein, Meelis Mihkla, Indrek Kiissel, Egert Männisalu, Kristjan Suluste and Jaan Pajupuu
441	Every time I fire a conversational designer, the performance of the dialog system goes down	Giancarlo A. Xompero, Michele Mastromattei, Samir Salman, Cristina Giannone, Andrea Favalli, Raniero Romagnoli and Fabio Massimo Zanzotto
442	A Romanization System and WebMAUS Aligner for Arabic Varieties	Jalal Al-Tamimi, Florian Schiel, Ghada Khattab, Navdeep Sokhey, Djegdjiga Amazouz, Abdulrahman Dallak and Hajar Moussa
443	A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus	Siddhant Arora, Henry Hosseini, Christine Utz, Vinayshekhar Bannihatti Kumar, Tristan O. Dhellemmes, Abhilasha Ravichander, Peter Story, Jasmine Mangat, Rex Chen, Martin Degeling, Thomas Norton, Thomas Hupperich, Shomir Wilson and Norman Sadeh
445	Jojajovai: A Parallel Guarani-Spanish Corpus for MT Benchmarking	Luis Chiruzzo, Santiago Góngora, Aldo Alvarez, Gustavo Giménez-Lugo, Marvin Agüero-Torales and Yliana Rodríguez
447	CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts	Muskan Garg, Chandni Saxena, Sriparna Saha, Veena Krishnan, Ruchi Joshi and Vijay Mago
448	A Comprehensive Evaluation and Correction of the TimeBank Corpus	Mustafa Ocal, Antonela Radas, Jared Hummer and Mark Finlayson
450	Automatic Correction of Syntactic Dependency Annotation Differences	Andrew Zupon, Andrew Carnie, Michael Hammond and Mihai Surdeanu
451	Towards Modelling Self-imposed Filter Bubbles in Argumentative Dialogue Systems	Annalena Bea Aicher, Wolfgang Minker and Stefan Ultes
452	Russian Jeopardy! Data Set for Question-Answering Systems	Elena Mikhalkova and Alexander A. Khlyupin
457	The TalkMoves Dataset: K-12 Mathematics Lesson Transcripts Annotated for Teacher and Student Discursive Moves	Abhijit Suresh, Jennifer Jacobs, Charis Harty, Margaret Perkoff, James H. Martin and Tamara Sumner
458	Evaluating Multilingual Sentence Representation Models in a Real Case Scenario	Rocco Tripodi, Rexhina Blloshmi and Simon Levis Sullam
459	FinMath: Injecting a Tree-structured Solver for Question Answering over Financial Reports	Chenying Li, Wenbo Ye and Yilun Zhao
460	Improving Large-scale Language Models and Resources for Filipino	Jan Christian Blaise Cruz and Charibeth Cheng
461	The CRECIL Corpus: a Novel Data Set for Dialogue Character Relation Extraction on Chinese Multi-party Dialogue	Yuru Jiang, Yang Xu, Yuhang Zhan, Weikai He, Yilin Wang, Zixuan Xi, Meiyun Wang, Xinyu Li, Yu Li and Yanchao Yu
462	SenticNet 7: A Commonsense-based Neurosymbolic AI Framework for Explainable Sentiment Analysis	Erik Cambria, Qian Liu, Sergio Decherchi, Frank Xing and Kenneth Kwok
463	Enhanced Distant Supervision with State-Change Information for Relation Extraction	Jui Shah, Dongxu Zhang, Sam Brody and Andrew McCallum
465	Thematic Fit Bits: Annotation Quality and Quantity Interplay for Event Participant Representation	Yuval Marton and Asad Sayeed
466	CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition	Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu, Cheuk Tung YIU, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram Shi and Pascale Fung
467	Handwritten Character Generation using Y-Autoencoder for Character Recognition Model Training	Tomoki Kitagawa, Chee Siang Leow and Hiromitsu Nishizaki
468	Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization	Alex Mei, Anisha Kabir, Rukmini Bapat, John N. Judge, Tony Sun and William Yang Wang
472	Multimodal Negotiation Corpus with Various Subjective Assessments for Social-Psychological Outcome Prediction from Non-Verbal Cues	Nobukatsu Hojo, Satoshi Kobashikawa, Saki Mizuno and Ryo Masumura
474	An Empirical Study on the Overlapping Problem of Open-Domain Dialogue Datasets	Yuqiao Wen, Guoqing Luo and Lili Mou
475	Curras + Baladi: Towards a Levantine Corpus	Karim Al-Haff, Mustafa Jarrar, Tymaa Hammouda and Fadi Zaraket
477	Wojood: Nested Arabic Named Entity Corpus and Recognition using BERT	Mustafa Jarrar, Mohammed Khalilia and Sana Ghanem
478	The Hebrew Essay Corpus	Chen Gafni, Anat Prior and Shuly Wintner
479	Nunc profana tractemus. Detecting Code-Switching in a Large Corpus of 16th Century Letters	Martin Volk, Lukas Fischer, Patricia Scheurer, Bernard Silvan Schroffenegger, Raphael Schwitter, Phillip Ströbel and Benjamin Suter
481	Exploring Transformers for Ranking Portuguese Semantic Relations	Hugo Gonçalo Oliveira
482	Data Expansion Using WordNet-based Semantic Expansion and Word Disambiguation for Cyberbullying Detection	Md Saroar Jahan, Djamila Romaissa Beddiar, Mourad Oussalah and Muhidin Mohamed
484	DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations	Merel Scholman, Tianai Dong, Frances Yung and Vera Demberg
485	The Bahrain Corpus: A Multi-genre Corpus of Bahraini Arabic	Dana Abdulrahim, Go Inoue, Latifa Shamsan, Salam Khalifa and Nizar Habash
488	Constructing a Lexical Resource of Russian Derivational Morphology	Lukáš Kyjánek, Olga Lyashevskaya, Anna Nedoluzhko, Daniil Vodolazsky and Zdeněk Žabokrtský
489	Thirumurai: A Large Dataset of Tamil Shaivite Poems and Classification of Tamil Pann	Shankar G. Mahadevan, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Ruba Priyadharshini, Sangeetha S, Bharathi Raja Chakravarthi and Prabakaran Chandran
490	A Linguistically Motivated Test Suite to Semi-Automatically Evaluate German-English Machine Translation Output	Vivien Macketanz, Eleftherios Avramidis, Aljoscha Burchardt, He Wang, Renlong Ai, Shushen Manakhimova, Ursula Strohriegel, Sebastian Möller and Hans Uszkoreit
492	Combining ELECTRA and Adaptive Graph Encoding for Frame Identification	Fabio Tamburini
494	How Does the Experimental Setting Affect the Conclusions of Neural Encoding Models?	Xiaohan Zhang, Shaonan Wang and Chengqing Zong
495	Towards a new Ontology for Sign Languages	Thierry Declerck
497	‘Am I the Bad One’? Predicting the Moral Judgement of the Crowd Using Pre-trained Language Models	Areej Alhassan, Jinkai Zhang and Viktor Schlegel
498	HeadlineCause: A Dataset of News Headlines for Detecting Causalities	Ilya Gusev and Alexey Tikhonov
501	Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish	Ariel Ekgren, Amaru Cuba Gyllensten, Evangelia Gogoulou, Alice Heiman, Severine Verlinden, Joey Öhman, Fredrik Carlsson and Magnus Sahlgren
502	MMDAG: Multimodal Directed Acyclic Graph Network for Emotion Recognition in Conversation	Shuo Xu, Yuxiang Jia, Changyong Niu and Hongying Zan
503	Know Better – A Clickbait Resolving Challenge	Benjamin Hättasch and Carsten Binnig
504	OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation	Sangwhan Moon, Won Ik Cho, Hye Joo Han, Naoaki Okazaki and Nam Soo Kim
506	ChiSense-12: An English Sense-Annotated Child-Directed Speech Corpus	Francesco Cabiddu, Lewis Bott, Gary Jones and Chiara Gambi
508	A Multimodal Corpus for Emotion Recognition in Sarcasm	Anupama Ray, Apoorva Nunna and Pushpak Bhattacharyya
509	Automatic Gloss-level Data Augmentation for Sign Language Translation	Jin Yea Jang, Han-Mu Park, Saim Shin, Suna Shin, Byungcheon Yoon and Gahgene Gweon
510	Making People Laugh like a Pro: Analysing Humor Through Stand-Up Comedy	Beatrice Turano and Carlo Strapparava
512	Annotation Study of Japanese Judgments on Tort for Legal Judgment Prediction with Rationales	Hiroaki Yamada, Takenobu Tokunaga, Ryutaro Ohara, Keisuke Takeshita and Mihoko Sumida
513	Testing Focus and Non-at-issue Frameworks with a Question-under-Discussion-Annotated Corpus	Christoph Hesse, Maurice Langner, Ralf Klabunde and Anton Benz
517	Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet	Alexandre Diniz da Costa, Mateus Coutinho Marim, Ely Matos and Tiago Timponi Torrent
518	HateCheckHIn: Evaluating Hindi Hate Speech Detection Models	Mithun Das, Punyajoy Saha, Binny Mathew and Animesh Mukherjee
520	Using a Knowledge Base to Automatically Annotate Speech Corpora and to Identify Sociolinguistic Variation	Yaru WU, Fabian Suchanek, Ioana Vasilescu, Lori Lamel and Martine Adda-Decker
521	Aligning the Romanian Reference Treebank and the Valence Lexicon of Romanian Verbs	Ana-Maria Barbu, Verginica Barbu Mititelu and Cătălin Mititelu
523	Bidirectional Skeleton-Based Isolated Sign Recognition using Graph Convolution Networks and Transfer Learning	Konstantinos M. Dafnis, Evgenia Chroni, Carol Neidle and Dimitri Metaxas
527	Masader: Metadata Sourcing for Arabic Text and Speech Data Resources	Zaid Alyafeai, Maraim Masoud, Mustafa Ghaleb and Maged S. Al-shaibani
529	Automating Idea Unit Segmentation and Alignment for Assessing Reading Comprehension via Summary Protocol Analysis	Marcello Gecchele, Hiroaki Yamada, Takenobu Tokunaga, Yasuyo Sawaki and Mika Ishizuka
531	Incorporating Zoning Information into Argument Mining from Biomedical Literature	Boyang Liu, Viktor Schlegel, Riza Batista-Navarro and Sophia Ananiadou
533	NomVallex: A Valency Lexicon of Czech Nouns and Adjectives	Veronika Kolářová and Anna Vernerová
535	A Methodology for Building a Diachronic Dataset of Semantic Shifts and its Application to QC-FR-Diac-V1.0, a Free Reference for French	David Kletz, Philippe Langlais, François Lareau and Patrick Drouin
537	MAKED: Multi-lingual Automatic Keyword Extraction Dataset	Yash Verma, Anubhav Jangra, Sriparna Saha, Adam Jatowt and Dwaipayan Roy
539	Tracking Textual Similarities in Neo-Latin Drama Networks	Andrea Peverelli, Marieke van Erp and Jan Bloemendal
540	Valet: Rule-Based Information Extraction for Rapid Deployment	Dayne Freitag, John Cadigan, Robert Sasseen and Paul Kalmar
541	From Examples to Rules: Neural Guided Rule Synthesis for Information Extraction	Robert Vacareanu, Marco A. Valenzuela-Escárcega, George Caique Gouveia Barbosa, Rebecca Sharp and Mihai Surdeanu
542	Cross-lingual Transfer of Monolingual Models	Evangelia Gogoulou, Ariel Ekgren and Magnus Sahlgren
544	Cross-lingual Approaches for the Detection of Adverse Drug Reactions in German from a Patient's Perspective	Lisa Raithel, Philippe Thomas, Roland Roller, Oliver Sapina, Sebastian Möller and Pierre Zweigenbaum
546	How's Business Going Worldwide? A Multilingual Annotated Corpus for Business Relation Extraction	Hadjer Khaldi, Farah Benamara, Camille Pradel, Grégoire Sigel and Nathalie Aussenac-Gilles
547	Strategy-level Entrainment of Dialogue System Users in a Creative Visual Reference Resolution Task	Deepthi Karkada, Ramesh Manuvinakurike, Maike Paetzel-Prüsmann and Kallirroi Georgila
549	ALIGNMEET: A Comprehensive Tool for Meeting Annotation, Alignment, and Evaluation	Peter Polák, Muskaan Singh, Anna Nedoluzhko and Ondřej Bojar
550	Generating Extended and Multilingual Summaries with Pre-trained Transformers	Rémi Calizzano, Malte Ostendorff, Qian Ruan and Georg Rehm
551	BembaSpeech: A Speech Recognition Corpus for the Bemba Language	Claytone Sikasote and Antonios Anastasopoulos
554	Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection	Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Dusko Vitas, Mihailo Skoric and Milica Ikonić Nešić
555	Extracting Linguistic Knowledge from Speech: A Study of Stop Realization in 5 Romance Languages	Yaru WU, Mathilde Hutin, Ioana Vasilescu, Lori Lamel and Martine Adda-Decker
556	Problem-solving Recognition in Scientific Text	Kevin Heffernan and Simone Teufel
559	Phone Inventories and Recognition for Every Language	Xinjian Li, Florian Metze, David R. Mortensen, Alan W Black and Shinji Watanabe
561	Introducing Frege to Fillmore: A FrameNet Dataset that Captures both Sense and Reference	Levi Remijnse, Piek T.J.M. Vossen, Antske Fokkens and Sam Titarsolej
562	TZOS: an Online Terminology Database Aimed at Working on Basque Academic Terminology Collaboratively	Izaskun Aldezabal, Jose Mari Arriola and Arantxa Otegi
563	NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis	Shamsuddeen Hassan Muhammad, David Adelani, Anuoluwapo Aremu and Idris Abdulmumin
564	Enriching Grammatical Error Correction Resources for Modern Greek	Katerina Korre and John Pavlopoulos
565	Leveraging a Bilingual Dictionary to Learn Wolastoqey Word Representations	Diego Thorn Bear and Paul Cook
567	GECO-MT: The Ghent Eye-tracking Corpus of Machine Translation	Toon Colman, Margot Fonteyne, Joke Daems, Nicolas Dirix and Lieve Macken
570	A Universal Dependencies Treebank of Ancient Hebrew	Daniel G. Swanson and Francis Tyers
571	Annotation of Valence for Spoken Personal Narratives	Aniruddha Tammewar, Franziska Braun, Gabriel Roccabruna, Sebastian Peter Bayerl, Korbinian Riedhammer and Giuseppe Riccardi
573	Offensive language detection in Hebrew: can other languages help?	Marina Litvak, Natalia Vanetik, Chaya Liebeskind, Omar Hmdia and Rizek Abu Madeghem
574	Conversational Analysis of Daily Dialog Data using Polite Emotional Dialogue Acts	Chandrakant Bothe and Stefan Wermter
575	Evaluation of Transfer Learning for Polish with a Text-to-Text Model	Aleksandra Chrabrowa, Łukasz Dragan, Karol Grzegorczyk, Dariusz Kajtoch, Mikołaj Koszowski, Robert Mroczkowski and Piotr Rybak
577	Design and Evaluation of the Corpus of Everyday Japanese Conversation	Hanae Koiso, Haruka Amatani, Yasuharu Den, Yuriko Iseki, Yuichi Ishimoto, Wakako Kashino, Yoshiko Kawabata, Ken'ya Nishikawa, Yayoi Tanaka, Yasuyuki Usuda and Yuka Watanabe
580	Evaluation of HTR models without Ground Truth Material	Phillip Benjamin Ströbel, Martin Volk, Simon Clematide, Raphael Schwitter, Tobias Hodel and David Schoch
581	Emotion analysis and detection during COVID-19	Tiberiu Sosea, Chau Thi Minh Pham, Alexander Tekle, Cornelia Caragea and Junyi Jessy Li
582	A Unifying View On Task-oriented Dialogue Annotation	Vojtěch Hudeček, Leon-Paul Schaub, Daniel Stancl, Patrick Paroubek and Ondřej Dušek
583	Dilated Convolutional Neural Networks for Lightweight Diacritics Restoration	Bálint Csanády and András Lukács
584	A Simple Yet Effective Corpus Construction Method for Chinese Sentence Compression	Yang Zhao, Hiroshi Kanayama, Issei Yoshida, Masayasu Muraoka and Akiko Aizawa
585	Development of a Multilingual CCG Treebank via Universal Dependencies Conversion	Tu-Anh Tran and Yusuke Miyao
590	Attention is All you Need for Robust Temporal Reasoning	Lis Kanashiro Pereira
595	VISA: An Ambiguous Subtitles Dataset for Visual Scene-aware Machine Translation	Yihang Li, Shuichiro Shimizu, Weiqi Gu, Chenhui Chu and Sadao Kurohashi
596	JaMIE: A Pipeline Japanese Medical Information Extraction System with Novel Relation Annotation	Fei Cheng, Shuntaro Yada, Ribeka Tanaka, Eiji ARAMAKI and Sadao Kurohashi
597	Syntactic-driven Approach for Semantic Role Labeling	Yuanhe Tian, Han Qin, Fei Xia and Yan Song
598	Image Description Dataset for Language Learners	Kento Tanaka, Taichi Nishimura, Hiroaki Nanjo, Keisuke Shirai, Hirotaka Kameko and Masatake Dantsuji
599	Automating Horizon Scanning in Future Studies	Tatsuya Ishigaki, Suzuko Nishino, Sohei Washino, Hiroki Igarashi, Yukari Nagai, Yuichi Washida and Akihiko Mural
600	A Benchmark Dataset for Multi-Level Complexity-Controllable Machine Translation	Kazuki Tani, Ryoya Yuasa, Kazuki Takikawa, Akihiro Tamura, Tomoyuki Kajiwara, Takashi Ninomiya and Tsuneo Kato
601	HashSet - A Dataset For Hashtag Segmentation	Prashant Kodali, Akshala Bhatnagar, Naman Ahuja, Manish Shrivastava and Ponnurangam Kumaraguru
602	BehanceCC: A ChitChat Detection Dataset For Livestreaming Video Transcripts	Viet Dac Lai, Amir Pouran Ben Veyseh, Franck Dernoncourt and Thien Huu Nguyen
603	Towards the Detection of a Semantic Gap in the Chain of Commonsense Knowledge Triples	Yoshihiko Hayashi
604	IRAC: A Domain-specific Annotated Corpus of Implicit Reasoning in Arguments	Keshav Singh, Naoya Inoue, Farjana Sultana Mim, Shoichi Naito and Kentaro Inui
605	QT30: A Corpus of Argument and Conflict in Broadcast Debate	Annette Hautli-Janisz, Zlata Kikteva, Wassiliki Siskou, Kamila Gorska, Ray Becker and Chris Reed
606	Validity, Agreement, Consensuality and Annotated Data Quality	Anaëlle Baledent, Yann Mathet, Antoine Widlöcher, Christophe Couronne and Jean-Luc Manguin
607	Enhanced Entity Annotations for Multilingual Corpora	Michael Strobl, Amine Trabelsi and Osmar Zaïane
612	A Hmong Corpus with Elaborate Expression Annotations	David R. Mortensen, Xinyu Zhang, Chenxuan Cui and Katherine J. Zhang
617	ELAL: An Emotion Lexicon for the Analysis of Alsatian Theatre Plays	Delphine Bernhard and Pablo Ruiz Fabo
619	Classifying Implant-Bearing Patients via their Medical Histories: a Pre-Study on Swedish EMRs with Semi-Supervised GanBERT	Marina Santini
620	A Multi-source Graph Representation of the Movie Domain for Recommendation Dialogues Analysis	Antonio Origlia, Martina Di Bratto, Maria Di Maro and Sabrina Mennella
621	Improving Event Duration Question Answering by Leveraging Existing Temporal Information Extraction Data	Felix Giovanni Virgo, Fei Cheng and Sadao Kurohashi
622	Enhancing Relation Extraction via Adversarial Multi-task Learning	Han Qin, Yuanhe Tian and Yan Song
623	Universal Dependencies for Western Sierra Puebla Nahuatl	Robert Pugh, Marivel Huerta Mendez, Mitsuya Sasaki and Francis Tyers
625	The Automatic Extraction of Linguistic Biomarkers as a Viable Solution for the Early Diagnosis of Mental Disorders	Gloria Gagliardi and Fabio Tamburini
626	MuLVE, A Multi-Language Vocabulary Evaluation Data Set	Anik Jacobsen, Salar Mohtaj and Sebastian Möller
627	Generating Artificial Texts as Substitution or Complement of Training Data	Vincent Claveau, Antoine Chaffin and Ewa Kijak
630	A Comparative Cross Language View On Acted Databases Portraying Basic Emotions Utilising Machine Learning	Felix Burkhardt, Anabell Hacker, Uwe Reichel, Hagen Wierstorf, Florian Eyben and Björn W. Schuller
631	ALEXSIS: A Dataset for Lexical Simplification in Spanish	Daniel Ferrés and Horacio Saggion
632	Integrating a Phrase Structure Corpus Grammar and a Lexical-Semantic Network: the HOLINET Knowledge Graph	Jean-Philippe Prost
633	The Universal Anaphora Scorer	Juntao Yu, Sopan Khosla, Nafise Sadat Moosavi, Silviu Paun, Sameer Pradhan and Massimo Poesio
634	Cross-lingual and Cross-domain Transfer Learning for Automatic Term Extraction from Low Resource Data	Amir Hazem, Merieme Bouhandi, Florian Boudin and Beatrice Daille
636	Nkululeko: A Tool For Rapid Speaker Characteristics Detection	Felix Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben and Björn W. Schuller
637	Give me your Intentions, I’ll Predict your Actions: A Two-level Classification of Speech Acts for Crisis Management in Social Media	Enzo Laurenti, Nils Bourgon, Farah Benamara, Alda Mari, Véronique MORICEAU and Camille Courgeon
638	Placing M-Phasis on the Plurality of Hate: A Feature-Based Corpus of Hate Online	Dana Ruiter, Liane Reiners, Ashwin Geet D'Sa, Thomas Kleinbauer, Dominique Fohr, Irina Illina, Dietrich Klakow, Christian Schemer and Angeliki Monnier
639	Animacy Denoting German Nouns: Annotation and Classification	Manfred Klenner and Anne Göhring
640	Using Linguistic Typology to Enrich Multilingual Lexicons: the Case of Lexical Gaps in Kinship	Temuulen Khishigsuren, Gábor Bella, Khuyagbaatar Batsuren, Abed Alhakim Freihat, Nandu Chandran Nair, Amarsanaa Ganbold, Hadi Khalilia, Yamini Chandrashekar and Fausto Giunchiglia
641	An Analysis of Dialogue Act Sequence Similarity Across Multiple Domains	Ayesha Enayet and Gita Sukthankar
643	Scaling up Discourse Quality Annotation for Political Science	Neele Falk and Gabriella Lapesa
644	GGPONC 2.0 - The German Clinical Guideline Corpus for Oncology: Curation Workflow, Annotation Policy, Baseline NER Taggers	Florian Borchert, Christina Lohr, Luise Modersohn, Jonas Witt, Thomas Langer, Markus Follmann, Matthias Gietzelt, Bert Arnrich, Udo Hahn and Matthieu-P. Schapranow
647	Inducing Discourse Marker Inventories from Lexical Knowledge Graphs	Christian Chiarcos
650	Evaluation of Off-the-shelf Speech Recognizers on Different Accents in a Dialogue Domain	Divya Tadimeti, Kallirroi Georgila and David Traum
651	Impact Analysis of the Use of Speech and Language Models Pretrained by Self-Supervision for Spoken Language Understanding	Salima Mdhaffar, Valentin Pelloin, Antoine Caubrière, Gaëlle Laperriere, Sahar Ghannay, Bassam Jabaian, Nathalie Camelin and Yannick Estève
652	RUSAVIC Corpus: Russian Audio-Visual Speech in Cars	Denis Ivanko, Alexandr Axyonov, Dmitry Ryumin, Alexey Kashevnik and Alexey Karpov
654	Named Entity Recognition in Estonian 19th Century Parish Court Records	Siim Orasmaa, Kadri Muischnek, Kristjan Poska and Anna Edela
655	The Spoken Language Understanding MEDIA Benchmark Dataset in the Era of Deep Learning: data updates, training and evaluation tools	Gaëlle Laperrière, Valentin Pelloin, Antoine Caubrière, Salima Mdhaffar, Nathalie Camelin, Sahar Ghannay, Bassam Jabaian and Yannick Estève
658	Challenging the Assumption of Structure-based embeddings in Few- and Zero-shot Knowledge Graph Completion	Filip Cornell, Chenda Zhang, Jussi Karlgren and Sarunas Girdzijauskas
659	Dialogue Collection for Recording the Process of Building Common Ground in a Collaborative Task	Koh Mitsuda, Ryuichiro Higashinaka, Yuhei Oga and Sen Yoshida
660	Polar Quantification of Actor Noun Phrases for German	Anne Göhring and Manfred Klenner
661	Evaluating Pretraining Strategies for Clinical BERT Models	Anastasios Lamproudis, Aron Henriksson and Hercules Dalianis
662	Czech Dataset for Cross-lingual Subjectivity Classification	Pavel Přibáň and Josef Steinberger
664	Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection	Sheng Li, Jiyi Li, Qianying Liu and Zhuo Gong
665	Story Trees: Representing Documents using Topological Persistence	Pantea Haghighatkhah, Antske Fokkens, Pia Sommerauer, Bettina Speckmann and Kevin Verbeek
667	Towards Latvian WordNet	Peteris Paikens, Mikus Grasmanis, Agute Klints, Ilze Lokmane, Lauma Pretkalniņa, Laura Rituma, Madara Stāde and Laine Strankale
670	COPA-SSE: Semi-structured Explanations for Commonsense Reasoning	Ana Brassard, Benjamin Heinzerling, Pride Kavumba and Kentaro Inui
671	Dataset of Student Solutions to Algorithm and Data Structure Programming Assignments	Fynn Schröder, Marcus Soll, Louis Kobras, Melf Johannsen, Peter Kling and Chris Biemann
672	Corpus for Automatic Structuring of Legal Documents	Prathamesh Kalamkar, Aman Tiwari, Astha Agarwal, Saurabh Karn, Smita Gupta, Vivek Raghavan and Ashutosh Modi
673	Constrained Language Models for Interactive Poem Generation	Andrei Popescu-Belis, Àlex R. Atrio, Valentin Minder, Aris Xanthos, Gabriel Luthier, Simon Mattei and Antonio Rodriguez
674	D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science Research	Jan Philip Wahle, Terry Ruas, Saif Mohammad and Bela Gipp
675	Constructing a Culinary Interview Dialogue Corpus with Video Conferencing Tool	Taro Okahisa, Ribeka Tanaka, Takashi Kodama, Yin Jou Huang and Sadao Kurohashi
676	Query Obfuscation by Semantic Decomposition	Danushka Bollegala, Tomoya Machide and Ken-ichi Kawarabayashi
678	A First Corpus of AZee Discourse Expressions	Camille Challant and Michael Filhol
680	A Large-Scale Japanese Dataset for Aspect-based Sentiment Analysis	Yuki Nakayama, Koji Murakami, Gautam Kumar, Sudha Bhingradive and Ikuko Hardaway
683	SciPar: A Collection of Parallel Corpora from Scientific Abstracts	Dimitrios Roussis, Vassilis Papavassiliou, Prokopis Prokopidis, Stelios Piperidis and Vassilis Katsouros
685	Building Sentiment Lexicons for Mainland Scandinavian Languages Using Machine Translation and Sentence Embeddings	Peng Liu, Cristina Marco and Jon Atle Gulla
686	Constructing Parallel Corpora from COVID-19 News using MediSys Metadata	Dimitrios Roussis, Vassilis Papavassiliou, Sokratis Sofianopoulos, Prokopis Prokopidis and Stelios Piperidis
687	Investigating Independence vs. Control: Agenda-Setting in Russian News Coverage on Social Media	Annerose Eichel, Gabriella Lapesa and Sabine Schulte im Walde
688	Developing Language Resources and NLP Tools for the North Korean Language	Arda Akdemir, Yeojoo Jeon and Tetsuo Shibuya
690	The Construction and Evaluation of the LEAFTOP Dataset of Automatically Extracted Nouns in 1480 Languages	Gregory Baker and Diego Molla
692	CxLM: A Construction and Context-aware Language Model	Yu-Hsiang Tseng, Cing-Fang Shih, Pin-Er Chen, Hsin-Yu Chou, Mao-Chang Ku and Shu-Kai HSIEH
694	HeLI-OTS, Off-the-shelf Language Identifier for Text	Tommi Jauhiainen, Heidi Jauhiainen and Krister Lindén
696	Tracing Syntactic Change in the Scientific Genre: Two Universal Dependency-parsed Diachronic Corpora of Scientific English and German	Marie-Pauline Krielke, Luigi Talamo, Mahmoud Fawzi and Jörg Knappen
698	Developing a Dataset of Overridden Information in Wikipedia	Masatoshi Tsuchiya and Yasutaka Yokoi
699	From Pattern to Interpretation. Using Colibri Core to Detect Translation Patterns in the Peshitta.	Mathias Coeckelbergs
700	How Much Context Span is Enough? Examining Context-Related Issues for Document-level MT	Sheila Castilho
701	Study of Overlaps and Gender Annotations in the Context of Broadcast Media	Martin Lebourdais, Marie Tahon, Antoine LAURENT, Sylvain Meignier and Anthony Larcher
702	Clarifying Implicit and Underspecified Phrases in Instructional Text	Talita Anthonio, Anna Sauer and Michael Roth
703	JGLUE: Japanese General Language Understanding Evaluation	Kentaro Kurihara, Daisuke Kawahara and Tomohide Shibata
704	Enriching Epidemiological Thematic Features For Disease Surveillance Corpora Classification	Edmond Menya, Mathieu Roche, Roberto Interdonato and Dickson Owuor
705	Speech Aerodynamics Database: Tools and Visualisation	Shi YU, Clara Ponchard, Roland Trouville, Sergio Hassid and Didier Demolin
706	ParCorFull2.0: a Parallel Corpus Annotated with Full Coreference	Ekaterina Lapshinova-Koltunski, Pedro Augusto Ferreira, Elina Lartaud and Christian Hardmeier
707	Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition	Rodolfo Zevallos, Luis Camacho and Nelsi Melgarejo
708	Using the LARA Little Prince to compare human and TTS audio quality	Elham Akhlaghi, Ingibjörg Iða Auðunardóttir, Anna Bączkowska, Branislav Bédi, Hakeem Beedar, Harald Berthelsen, Cathy Chua, Catia Cucchiarini, Hanieh Habibi, Ivana Horváthová, Junta Ikeda, Christèle Maizonniaux, Neasa Ní Chiaráin, Chadi Raheb, Manny Rayner, John Sloan, Nikos Tsourakis and Chunlin Yao
710	Cyberbullying Classifiers are Sensitive to Model-Agnostic Perturbations	Chris Emmery, Ákos Kádár, Grzegorz Chrupała and Walter Daelemans
712	The Chinese Causative-Passive Homonymy Disambiguation: an adversarial Dataset for NLI and a Probing Task	Shanshan Xu and Katja Markert
719	Writing System and Speaker Metadata for 2,800+ Language Varieties	Daan van Esch, Tamar Lucassen, Sebastian Ruder, Isaac Caswell and Clara Rivera
721	Do we Name the Languages we Study? The #BenderRule in LREC and ACL articles	Fanny Ducel, Karën Fort, Gaël Lejeune and Yves Lepage
722	MMChat: Multi-Modal Chat Dataset on Social Media	Yinhe Zheng, Guanyi Chen, Xin Liu and Jian Sun
723	Borrowing or Codeswitching? Annotating for Finer-Grained Distinctions in Language Mixing	Elena Alvarez-Mellado and Constantine Lignos
724	Conversational Speech Recognition Needs Data? Experiments with Austrian German	Julian Linke, Philip N. Garner, Gernot Kubin and Barbara Schuppler
728	GLoHBCD: A Naturalistic German Dataset for Language of Health Behaviour Change on Online Support Forums	Selina Meyer and David Elsweiler
729	Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri?	Olivier Ferret
730	The Multimodal Annotation Software Tool (MAST)	Bruno Cardoso and Neil Cohn
733	Standardisation of Dialect Comments in Social Networks in View of Sentiment Analysis: Case of Tunisian Dialect	Saméh Kchaou, Rahma Boujelbane, Emna Fsih and Lamia Hadrich-Belguith
734	Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages	Silvia Severini, Ayyoob ImaniGooghari, Philipp Dufter and Hinrich Schütze
735	The PALMA Corpora of African Varieties of Portuguese	Tjerk Hagemeijer, Amália Mendes, Rita Gonçalves, Catarina Cornejo, Raquel Madureira and Michel Généreux
738	A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain	Haruya Suzuki, Yuto Miyauchi, Kazuki Akiyama, Tomoyuki Kajiwara, Takashi Ninomiya, Noriko Takemura, Yuta Nakashima and Hajime Nagahara
740	A Learning-Based Dependency to Constituency Conversion Algorithm for the Turkish Language	Büşra Marşan, Oğuz K. Yıldız, Aslı Kuzgun, Neslihan Cesur, Arife B. Yenice, Ezgi Sanıyar, Oğuzhan Kuyrukçu, Bilge N. Arıcan and Olcay Taner Yıldız
741	Explainable Tsetlin Machine Framework for Fake News Detection with Credibility Score Assessment	Bimal Bhattarai, Ole-Christoffer Granmo and Lei Jiao
742	The Robotic Surgery Procedural Framebank	Marco Bombieri, Marco Rospocher, Simone Paolo Ponzetto and Paolo Fiorini
744	UgChDial: A Uyghur Chat-based Dialogue Corpus for Response Space Classification	Zulipiye Yusupujiang and Jonathan Ginzburg
745	Compiling a Suitable Level of Sense Granularity in a Lexicon for AI Purposes: The Open Source COR Lexicon	Bolette Pedersen, Nathalie Carmen Hau Sørensen, Sanni Nimb, Sussi Olsen, Ida Flørke and Thomas Troelsgård
746	PATATRA and PATAFreq: two French databases for the documentation of within-speaker variability in speech	Cécile Fougeron, Nicolas Audibert, Cedric Gendrot, Estelle Chardenon and Louise Wohmann
749	A Speculative and Tentative Common Ground Handling for Efficient Composition of Uncertain Dialogue	Saki Sudo, Kyoshiro Asano, Koh Mitsuda, Ryuichiro Higashinaka and Yugo Takeuchi
750	ELF22: A Context-based Counter-Trolling Dataset to Combat Internet Trolls	Huije Lee, Young Ju NA, Hoyun Song, Jisu Shin and Jong Park
751	A Multi-Party Dialogue Ressource in French	Maria Boritchev and Maxime Amblard
753	GRhOOT: Ontology of Rhetorical Figures in German	Ramona Kühn, Jelena Mitrović and Michael Granitzer
754	Unmasking the Myth of Effortless Big Data - Building an Open Source Multi-lingual Infrastructure from Scratch	Linda Wiechetek, Katri Hiovain-Asikainen, Inga Lill Sigga Mikkelsen, Sjur Moshagen, Flammie A. Pirinen, Trond Trosterud and Børre Gaup
757	Generating Monolingual Dataset for Low Resource Language Bodo from old books using Google Keep	Sanjib Narzary, Maharaj Brahma, Mwnthai Narzary, Gwmsrang Muchahary, Pranav Kumar Singh, Apurbalal Senapati, Sukumar Nandi and Bidisha Som
759	A Thesaurus-based Sentiment Lexicon for Danish: The Danish Sentiment Lexicon	Sanni Nimb, Sussi Olsen, Bolette Pedersen and Thomas Troelsgård
760	PAGnol: An Extra-Large French Generative Model	Julien Launay, E.L. Tommasone, Baptiste Pannier, François Boniface, Amélie Chatelain, Alessandro Cappelli, Iacopo Poli and Djamé Seddah
762	ClinIDMap: Clinical IDs Mapping for Data Interoperability	Elena Zotova, Montse Cuadros and German Rigau
763	Hate Speech Dynamics Against African descent, Roma and LGBTQI Communities in Portugal	Paula Carvalho, Bernardo Cunha, Raquel Santos, Fernando Batista and Ricardo Ribeiro
764	Multi-Aspect Transfer Learning for Detecting Low Resource Mental Disorders on Social Media	Ana Sabina Uban, Berta Chulvi and Paolo Rosso
766	Polysemy in Spoken Conversations and Written Texts	Aina Garí Soler, Matthieu Labeau and Chloé Clavel
767	SLäNDa version 2.0: Improved and Extended Annotation of Narrative and Dialogue in Swedish Literature	Sara Stymne and Carin Östman
769	Bicleaner AI: Bicleaner Goes Neural	Jaume Zaragoza-Bernabeu, Gema Ramírez-Sánchez, Marta Bañón and Sergio Ortiz Rojas
770	A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications	Vijini Liyanage, Davide Buscaldi and Adeline Nazarenko
771	The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition	Jonathan Mukiibi, Ali Hussein, Joshua Meyer, Andrew Katumba and Joyce Nakatumba-Nabende
774	Multidimensional Coding of Multimodal Languaging in Multi-Party Settings	Christophe Parisse, Marion Blondel, Stéphanie Caët, Claire Danet, Coralie Vincent and Aliyah Morgenstern
775	The Copenhagen Corpus of Eye Tracking Recordings from Natural Reading of Danish Texts	Nora Hollenstein, Maria Barrett and Marina Björnsdóttir
778	Far-Field Speaker Recognition Benchmark Derived From The DiPCo Corpus	Mickael Rouvier and Mohammad Mohammadamini
779	Pro-TEXT: an Annotated Corpus of Keystroke Logs	Aleksandra Miletic, Christophe Benzitoun, Georgeta Cislaru and Santiago Herrera-Yanez
780	Querying a Dozen Corpora and a Thousand Years with Fintan	Christian Chiarcos, Christian Fäth and Maxim Ionov
781	Standard German Subtitling of Swiss German TV content: the PASSAGE Project	Jonathan David Mutal, Pierrette Bouillon, Johanna Gerlach and Veronika Haberkorn
783	Sentence Pair Embeddings Based Evaluation Metric for Abstractive and Extractive Summarization	Ramya Akula and Ivan Garibay
784	ArCovidVac: Analyzing Arabic Tweets About COVID-19 Vaccination	Hamdy Mubarak, Sabit Hassan, Shammur Absar Chowdhury and Firoj Alam
785	Generating Textual Explanations for Machine Learning Models Performance: A Table-to-Text Task	Isaac Ampomah, James Burton, Amir Enshaei and Noura Al Moubayed
786	BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language	Bernardo Consoli, Henrique D. P. dos Santos, Ana Helena D. P. S. Ulbrich, Renata Vieira and Rafael H. Bordini
787	Evolving Large Text Corpora: Four Versions of the Icelandic Gigaword Corpus	Steinþór Steingrímsson, Starkaður Barkarson and Hildur Hafsteinsdóttir
789	TANDO: A Corpus for Document-level Machine Translation	Harritxu Gete, Thierry Etchegoyhen, David Ponce, Gorka Labaka, Nora Aranberri, Ander Corral, Xabier Saralegi, Igor Ellakuria and Maite Martin
792	MOTIF: Contextualized Images for Complex Words to Improve Human Reading	Xintong Wang, Florian Schneider, Özge Alacam, Prateek Chaudhury and Chris Biemann
796	BaSCo: An Annotated Basque-Spanish Code-Switching Corpus for Natural Language Understanding	Maia Aguirre, Laura García-Sardiña, Manex Serras, Ariane Méndez and Jacobo López
797	Open Terminology Management and Sharing Toolkit for Federation of Terminology Databases	Andis Lagzdiņš, Uldis Siliņš, Toms Bergmanis, Mārcis Pinnis, Artūrs Vasiļevskis and Andrejs Vasiļjevs
799	Constructing Distributions of Variation in Referring Expression Type from Corpora for Model Evaluation	T. Mark Ellison and Fahime Same
801	Using Convolution Neural Network with BERT for Stance Detection in Vietnamese	Oanh Thi Tran, Anh Cong Phung and Bach Xuan Ngo
802	MT4All: Machine Translation For All	Ona de Gibert Bonet, Iakes Goenaga, Olatz Perez-de-Viñaspre, Jordi Armengol-Estapé, Carla Parra Escartín, Marina Sanchez, Mārcis Pinnis, Gorka Labaka and Maite Melero
804	Out-of-Domain Evaluation of Finnish Dependency Parsing	Jenna Kanerva and Filip Ginter
805	Efficient Entity Candidate Generation for Low-Resource Languages	Alberto Garcia-Duran, Akhil Arora and Robert West
806	Exploring Text Recombination for Automatic Narrative Level Detection	Nils Reiter, Judith Sieker, Svenja Guhr, Evelyn Gius and Sina Zarrieß
808	The Bull and the Bear: Summarizing Stock Market Discussions	Ayush Kumar, Dhyey Jani, Jay Shah, Devanshu Thakar, Varun Jain and Mayank Singh
811	Analysis of Dialogue in Human-Human Collaboration in Minecraft	Takuma Ichikawa and Ryuichiro Higashinaka
812	Linghub2: Language Resource Discovery Tool for Language Technologies	Cécile Robin, Gautham Vadakkekara Suresh, Víctor Rodriguez-Doncel, John P. McCrae and Paul Buitelaar
813	Data Collection for Empirically Determining the Necessary Information for Smooth Handover in Dialogue	Sanae Yamashita and Ryuichiro Higashinaka
814	The Index Thomisticus Treebank as Linked Data in the LiLa Knowledge Base	Francesco Mambrini, Marco Passarotti, Giovanni Moretti and Matteo Pellegrini
816	Extracting and Analysing Metaphors in Migration Media Discourse: towards a Metaphor Annotation Scheme	Ana Zwitter Vitez, Mojca Brglez, Marko Robnik Šikonja, Tadej Škvorc, Andreja Vezovnik and Senja Pollak
819	Language Resources to Support Language Diversity – the ELRA Achievements	Valérie Mapelli, Victoria Arranz, Khalid Choukri and Hélène Mazo
820	BasqueGLUE: Natural Language Understanding Benchmark for Basque	Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri and Aitor Soroa
823	Afaan Oromo Hate Speech Detection and Classification on Social Media	Teshome Ababu and Michael Melese
824	Challenges with Sign Language Datasets for Sign Language Recognition and Translation	Mirella De Sisto, Vincent Vandeghinste, Santiago Egea Gómez, Mathieu De Coster and Dimitar Shterionov
826	Semi-automatically Annotated Learner Corpus for Russian	Anisia Katinskaia, Maria Lebedeva, Jue Hou and Roman Yangarber
827	Complementary Learning of Aspect Terms for Aspect-based Sentiment Analysis	Han Qin, Yuanhe Tian, Fei Xia and Yan Song
829	A Low-Cost Motion Capture Corpus in French Sign Language for Interpreting Iconicity and Spatial Referencing Mechanisms	Clémence Mertz, Vincent BARREAUD, Thibaut Le Naour, Damien Lolive and Sylvie Gibet
830	COVID-19 Mythbusters in World Languages	Mana Ashida, Jin-Dong Kim and Lee J. Seunghun
833	Singlish Where Got Rules One? Constructing a Computational Grammar for Singlish	Siew Yeng Chow and Francis Bond
835	DDisCo: A Discourse Coherence Dataset for Danish	Linea Flansmose Mikkelsen, Oliver Kinch, Anders Jess Pedersen and Ophélie Lacroix
836	UniMorph 4.0: Universal Morphology	Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Yustinus Ghanggo Ate, Maria Ryskina, Kyle Gorman, Sabrina J. Mielke, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Abbott Lane, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Zahroh Nuriah, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Mohit Raj, Daria Rodionova, Anastasia Yemelina, Dorina Lakatos, Hilaria Cruz, Botond Barta, Gábor Szolnok, Judit Ács, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Elena Budianskaya, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, fausto giunchiglia, David Yarowsky, Ryan Cotterell, Reut Tsarfaty and Ekaterina Vylomova
838	At the Intersection of NLP and Sustainable Development: Exploring the Impact of Demographic-Aware Text Representations in Modeling Value on a Corpus of Interviews	Goya van Boven, Stephanie Hirmer and Costanza Conforti
840	AGILe: The First Lemmatizer for Ancient Greek Inscriptions	Evelien de Graaf, Silvia Stopponi, Jasper K. Bos, Saskia Peels-Matthey and Malvina Nissim
841	On the Multilingual Capabilities of Very Large-Scale English Language Models	Jordi Armengol-Estapé, Ona de Gibert Bonet and Maite Melero
843	Evaluation of Transfer Learning and Domain Adaptation for Analyzing German-Speaking Job Advertisements	Ann-Sophie Gnehm, Eva Bühlmann and Simon Clematide
844	How to be FAIR when you CARE: The DGS Corpus as a Case Study of Open Science Resources for Minority Languages	Marc Schulder and Thomas Hanke
845	AppReddit: a Corpus of Reddit Posts Annotated for Appraisal	Marco Antonio Stranisci, Simona Frenda, Eleonora Ceccaldi, Valerio Basile, Rossana Damiano and Viviana Patti
846	Evaluating Tokenizers Impact on OOVs Representation with Transformers Models	Alexandra Benamar, Cyril Grouin, Meryl Bothua and Anne Vilnat
847	Textinator: an Internationalized Tool for Annotation and Human Evaluation in Natural Language Processing and Generation	Dmytro Kalpakchi and Johan Boye
848	Creating a Data Set of Abstractive Summaries of Turn-labeled Spoken Human-Computer Conversations	Virginia Meijer and Iris Hendrickx
850	The slurk Interaction Server Framework: Better Data for Better Dialog Models	Jana Götze, Maike Paetzel-Prüsmann, Wencke Liermann, Tim Diekmann and David Schlangen
851	Spanish Datasets for Sensitive Entity Detection in the Legal Domain	Ona de Gibert Bonet, Aitor García Pablos, Montse Cuadros and Maite Melero
853	AsNER - Annotated Dataset and Baselines for Assamese Named Entity recognition	Dhrubajyoti Pathak, Sukumar Nandi and Priyankoo Sarmah
855	Deep One-Class Hate Speech Detection Model	Saugata Bose and Guoxin Su
860	Evaluating Subtitle Segmentation for End-to-end Generation Systems	Alina Karakanta, François Buet, Mauro Cettolo and François Yvon
864	E-ConvRec: A Large-Scale Conversational Recommendation Dataset for E-Commerce Customer Service	Meihuizi Jia, Ruixue Liu, Peiying Wang, Yang Song, Zexi Xi, Haobin Li, Xin Shen, Meng Chen, Jinhui Pang and Xiaodong He
865	Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training	Merel Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty and Vera Demberg
867	Building large multilingual conversational corpora for diversity-aware language science and technology	Andreas Liesenfeld and Mark Dingemanse
869	ConvTextTM: An Explainable Convolutional Tsetlin Machine Framework for Text Classification	Bimal Bhattarai, Ole-Christoffer Granmo and Lei Jiao
870	EnsyNet: A Dataset for Encouragement and Sympathy Detection	Tiberiu Sosea and Cornelia Caragea
872	The Tembusu Treebank: An English Learner Treebank	Luís Morgado da Costa, Francis Bond and Roger V. P. Winder
873	Automatic Normalisation of Early Modern French	Rachel Bawden, Jonathan Poinhos, Eleni Kogkitsidou, Philippe Gambette, Benoît Sagot and Simon Gabay
874	Italian NLP for Everyone: Resources and Models from EVALITA to the European Language Grid	Valerio Basile, Cristina Bosco, Michael Fell, Viviana Patti and Rossella Varvara
875	From FreEM to D’AlemBERT: a Large Corpus and a Language Model for Early Modern French	Simon Gabay, Pedro Javier Ortiz Suarez, Alexandre BARTZ, Alix Chagué, Rachel Bawden, Philippe Gambette and Benoît Sagot
876	Barch: an English Dataset of Bar Chart Summaries	Iza Škrjanec, Muhammad Salman Edhi and Vera Demberg
877	PoliBERTweet: A Pre-trained Language Model for Analyzing Political Content on Twitter	Kornraphop Kawintiranon and Lisa Singh
878	Enhancing Deep Learning with Embedded Features for Arabic Named Entity Recognition	Ali L. Hatab, Caroline Sabty and Slim Abdennadher
880	A Survey of Multilingual Models for Automatic Speech Recognition	Hemant Yadav and Sunayana Sitaram
881	gaHealth: An English–Irish Bilingual Corpus of Health Data	Seamus Lankford, Haithem Afli, Órla Ní Loinsigh and Andy Way
882	Building a Dataset for Automatically Learning to Detect Questions Requiring Clarification	Ivano Lauriola, Kevin Small and Alessandro Moschitti
884	Identifying Draft Bills Impacting Existing Legislation: a Case Study on Romanian	Corina Ceausu and Sergiu Nisioi
885	SHONGLAP: A Large Bengali Open-Domain Dialogue Corpus	Syed Mostofa Monsur, Sakib Chowdhury, Md Shahrar Fatemi and Shafayat Ahmed
886	Evaluating Sampling-based Filler Insertion for Spontaneous TTS	Siyang Wang, Joakim Gustafson and Éva Székely
887	CyberAgressionAdo-v1: a Dataset of Annotated Online Aggressions in French Collected through a Role-playing Game	Anaïs Ollagnier, Elena Cabrio, Serena Villata and Catherine Blaya
890	HerBERT Based Language Model Detects Quantifiers and Their Semantic Properties in Polish	Marcin Woliński, Bartłomiej Nitoń, Witold Kieraś and Jakub Szymanik
891	Sense and Sentiment	Francis Bond and Merrick Choo
892	What a Creole Wants, What a Creole Needs	Heather Lent, Kelechi Ogueji, Miryam de Lhoneux, Orevaoghene Ahia and Anders Søgaard
893	CATs are Fuzzy PETs: A Corpus and Analysis of Potentially Euphemistic Terms	Martha Gavidia, Patrick Lee, Anna Feldman and Jing Peng
894	KSoF: The Kassel State of Fluency Dataset – A Therapy Centered Dataset of Stuttering	Sebastian Peter Bayerl, Alexander Wolff von Gudenberg, Florian Hönig, Elmar Noeth and Korbinian Riedhammer
896	A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification.	Rémi Uro, David Doukhan, Albert Rilliard, Laetitia Larcher, Anissa-Claire Adgharouamane, Marie Tahon and Antoine Laurent
897	The ALPIN Sentiment Dictionary: Austrian Language Polarity in Newspapers	Thomas Kolb, Sekanina Katharina, Bettina Manuela Johanna Kern, Julia Neidhardt, Tanja Wissik and Andreas Baumann
900	A Comparison of Praising Skills in Face-to-Face and Remote Dialogues	Toshiki Onishi, Asahi Ogushi, Yohei Tahara, Ryo Ishii, Atsushi Fukayama, Takao Nakamura and Akihiro Miyata
902	EZCAT: an Easy Conversation Annotation Tool	Gaël Guibon, Luce Lefeuvre, Matthieu Labeau and Chloé Clavel
903	MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases	Louis Raphaël Théo Martin, Angela Fan, Éric de la Clergerie, Antoine Bordes and Benoît Sagot
905	COSMOS: Experimental and Comparative Studies of Concept Representations in Schoolchildren	Jeanne Villaneau and Farida SAID
906	PortiLexicon-UD: a Portuguese Lexical Resource according to Universal Dependencies Model	Lucelene Lopes, Magali Sanches Duran, Paulo Fernandes and Thiago Alexandre Salgueiro Pardo
908	A Language Modelling Approach to Quality Assessment of OCR'ed Historical Text	Callum Booth, Robert Shoemaker and Robert Gaizauskas
910	Text Classification and Prediction in the Legal Domain	Minh-Quoc Nghiem, Paul Baylis, André Freitas and Sophia Ananiadou
911	LuxemBERT: Simple and Practical Data Augmentation in Language Model Pre-Training for Luxembourgish	Cedric Lothritz, Bertrand Lebichot, Kevin Allix, Lisa Veiber, TEGAWENDE BISSYANDE, Jacques Klein, Andrey Boytsov, Clément Lefebvre and Anne Goujon
912	Modeling the Impact of Syntactic Distance and Surprisal on Cross-Slavic Text Comprehension	Irina Stenger
913	Universal Grammatical Dependencies for Portuguese with CINTIL Data, LX Processing and CLARIN support	António Branco, João Ricardo Silva, Luís Gomes and João António Rodrigues
915	Towards a Cleaner Document-Oriented Multilingual Crawled Corpus	Julien Abadji, Pedro Javier Ortiz Suarez, Laurent Romary and Benoît Sagot
918	Telling a Lie: Analyzing the Language of Information and Misinformation during Global Health Events	Ankit Aich and Natalie Parde
919	SCAI-QReCC Shared Task on Conversational Question Answering	Svitlana Vakulenko, Johannes Kiesel and Maik Fröbe
920	The CLAMS Platform at Work: Processing Audiovisual Data from the American Archive of Public Broadcasting	Marc Verhagen, Kelley Lynch, Kyeongmin Rim and James Pustejovsky
921	TArC: Tunisian Arabish Corpus, First complete release	Elisa Gugliotta and Marco Dinarelli
922	Preliminary Results on the Evaluation of Computational Tools for the Analysis of Quechua and Aymara	Marcelo Yuji Himoro and Antonio Pareja-Lora
923	Modeling Noise in Paraphrase Detection	Teemu Eemeli Vahtola, Eetu Sjöblom, Jörg Tiedemann and Mathias Creutz
926	ISO based Discourse Markers Annotated Multilingual Corpus	Purificação Moura Silvano, Mariana K. Damova, Giedrė Valūnaitė Oleškevičienė, Chaya Liebeskind, Christian Chiarcos, Dimitar Trajanov, Ciprian-Octavian Truică, Elena-Simona Apostol and Anna Baczkowska
928	MentSum: A Resource for Exploring Summarization of Mental Health Online Posts	Sajad Sotudeh, Nazli Goharian and Zachary Young
933	Misogyny and Aggressiveness Tend to Come Together and Together We Address Them	Arianna Muti, Alberto Barrón-Cedeño and Francesco Fernicola
934	ACT2: A multi-disciplinary semi-structured dataset for importance and purpose classification of citations	Suchetha Nambanoor Kunnath, Valentin Stauber, Ronin Wu, David Pride, Viktor Botev and Petr Knoth
935	Building Dataset for Grounding of Formulae — Annotating Coreference Relations Among Math Identifiers	Takuto Asakura, Yusuke Miyao and Akiko Aizawa
936	PerPaDa: A Persian Paraphrase Dataset based on Implicit Crowdsourcing Data Collection	Salar Mohtaj, Fatemeh Tavakkoli and Habibollah Asghari
937	BEA-Base: A Benchmark for ASR of Spontaneous Hungarian	Peter Mihajlik, Andras Balog, Tekla Etelka Graczi, Anna Kohari, Balázs Tarján and Katalin Mady
938	Cross-Level Semantic Similarity for Serbian Newswire Texts	Vuk Batanović and Maja Miličević Petrović
941	Opinions in Interactions: New Annotations of the SEMAINE Database	Valentin Barriere, Slim Essid and Chloé Clavel
943	Effectiveness of Data Augmentation and Pretraining for Improving Neural Headline Generation in Low-Resource Settings	Matej Martinc, Syrielle Montariol, Lidia Pivovarova and Elaine Zosa
944	Dataset for Complex Word Identification in Hindi	Gayatri Venugopal, Dhanya Pramod and Ravi Shekhar
945	Corpus Design for Studying Linguistic Nudges in Human-Computer Spoken Interactions	Natalia Kalashnikova, Serge Pajak, Fabrice Le Guel, Ioana Vasilescu, Gemma Serrano and Laurence Devillers
950	SNuC: The Sheffield Numbers Spoken Language Corpus	Emma Barker, Jon Barker, Robert Gaizauskas, Ning Ma and Monica Lestari Paramita
951	The Norwegian Dialect Corpus Treebank	Andre Kåsen, Kristin Hagen, Anders Nøklestad, Joel Priestly, Per Erik Solberg and Dag Trygve Truslew Haug
952	Using Semantic Role Labeling to Improve Neural Machine Translation	Reinhard Rapp
954	TWEET-FID: An Annotated Dataset for Multiple Foodborne Illness Detection Tasks	Ruofan Hu, Dongyu Zhang, Dandan Tao, Thomas Hartvigsen, Hao Feng and Elke Rundensteiner
955	Introducing the Welsh Text Summarisation Dataset and Baseline Systems	Ignatius Ezeani, Mahmoud El-Haj, Jonathan Morris and Dawn Knight
956	A Systematic Approach to Derive a Refined Speech Corpus for Sinhala	Disura Warusawithana, Nilmani Kulaweera, Lakshan Weerasinghe and Buddhika Karunarathne
959	Language Patterns and Behaviour of the Peer Supporters in Multilingual Healthcare Conversational Forums	Ishani Mondal, Kalika Bali, Mohit Jain, Monojit Choudhury, Jacki O'Neill, Millicent Ochieng and Kagnoya Awori
960	Assessing Multilinguality of Publicly Accessible Websites	Rinalds Vīksna, Inguna Skadiņa, Raivis Skadiņš and Andrejs Vasiļjevs
964	Identifying Copied Fragments in a 18th Century Dutch Chronicle	Roser Morante, Eleanor L. T. Smith, Lianne Wilhelmus, Alie Lassche and Erika Kuijpers
965	Frame Shift Prediction	Zheng Xin Yong, Patrick D. Watson, Tiago Timponi Torrent, Oliver Czulo and Collin F. Baker
966	A Semi-Automated Live Interlingual Communication Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking	Tomasz Korybski, Elena Davitti, Constantin Orasan and Sabine Braun
969	Named Entity Recognition to Detect Criminal Texts on the Web	Paweł Skórzewski, Mikołaj Pieniowski and Grazyna Demenko
971	IndoUKC: A Concept Centered Indian Multilingual Lexical Resource	Nandu Chandran Nair, Rajendran S. Velayuthan, Yamini Chandrashekar, Gábor Bella and Fausto Giunchiglia
972	Rosetta-LSF: an Aligned Corpus of French Sign Language and French for Text-to-Sign Translation	Elise Bertin-Lemée, Annelies Braffort, Camille Challant, Claire Danet, Boris Dauriac, Michael Filhol, Emmanuella Martinod and Jérémie Segouat
973	A Study of Distant Viewing of ukiyo-e prints	John Pavlopoulos and Ewa Machotka
975	Elvis vs. M. Jackson: Who has More Albums? Classification and Identification of Elements in Comparative Questions	Meriem Beloucif, Seid Muhie Yimam, Steffen Stahlhacke and Chris Biemann
977	Decorate the Examples: A Simple Method of Prompt Design for Biomedical Relation Extraction	Hui-Syuan Yeh, Thomas Lavergne and Pierre Zweigenbaum
978	Automatic Classification of Russian Learner Errors	Alla Rozovskaya
979	SPADE: A Big Five-Mturk Dataset of Argumentative Speech Enriched with Socio-Demographics for Personality Detection	Elma Kerz, Yu Qiao, Sourabh Zanwar and Daniel Wiechmann
980	Resources and Experiments on Sentiment Classification for Georgian	Nicolas Stefanovitch, Jakub Piskorski and Sopho Kharazi
982	Applying Automatic Text Summarization for Fake News Detection	Philipp Hartl and Udo Kruschwitz
983	Effectiveness of French Language Models on Abstractive Dialogue Summarization Task	Yongxin Zhou, François Portet and Fabien Ringeval
984	Comparing Annotated Datasets for Named Entity Recognition in English Literature	Rositsa Ivanova, Marieke van Erp and Sabrina Kirrane
985	Towards the Construction of a WordNet for Old English	Fahad Khan, Francisco J. Minaya Gómez, Rafael Cruz González, Harry Diakoff, Javier E. Diaz Vera, John P. McCrae, Ciara O'Loughlin, William Michael Short and Sander Stolk
986	Do Transformer Networks Improve the Discovery of Inference Rules from Text?	Mahdi Rahimi and Mihai Surdeanu
987	On the Robustness of Cognate Generation Models	Winston Wu and David Yarowsky
990	Detecting Multiple Transitions in Literary Texts	Nuette Heyns and Menno van Zaanen
991	NEmo: an Affective Dataset of Gun Violence News	Carley Reardon, Sejin Paik, Ge Gao, Meet Parekh, Yanling Zhao, Lei Guo, Margrit Betke and Derry Tanti Wijaya
995	Task-Driven and Experience-Based Question Answering Corpus for In-Home Robot Application in the House3D Virtual Environment	Zhuoqun Xu, Liubo Ouyang and Yang Liu
996	Combination of Contextualized and Non-Contextualized Layers for Lexical Substitution in French	Kévin Espasa, Emmanuel Morin and Olivier Hamon
997	BERTifying Sinhala - A Comprehensive Analysis of Pre-trained Language Models for Sinhala Text Classification	Vinura Dhananjaya, Piyumal Demotte, Surangika Ranathunga and Sanath Jayasena
998	TBD3: A Thresholding-Based Dynamic Depression Detection from Social Media for Low-Resource Users	Hrishikesh Kulkarni, Sean MacAvaney, Nazli Goharian and Ophir Frieder
999	Multi-lingual Transfer Learning for Children Automatic Speech Recognition	Thomas Rolland, Alberto Abad, Catia Cucchiarini and Helmer Strik
1001	Entity Linking over Nested Named Entities for Russian	Natalia Loukachevitch, Pavel Braslavski, Vladimir Ivanov, Tatiana Batura, Suresh Manandhar, Artem Shelmanov and Elena Tutubalina
1004	MuLD: The Multitask Long Document Benchmark	George Hudson and Noura Al Moubayed
1005	RefCo and its Checker: Improving Language Documentation Corpora's Reusability Through a Semi-Automatic Review Process	Herbert Lange and Jocelyn Aznar
1006	The ManDi Corpus: A Spoken Corpus of Mandarin Regional Dialects	Liang Zhao and Eleanor Chodroff
1009	Spoken Language Treebanks in Universal Dependencies: an Overview	Kaja Dobrovoljc
1011	A new European Portuguese corpus for the study of Psychosis through speech analysis	Maria Forjó, Daniel Neto, Alberto Abad, H. Sofia Pinto and Joaquim Gago
1014	Annotation of metaphorical expressions in the Basic Corpus of Polish Metaphors	Elżbieta Hajnicz
1015	HiNER: A large Hindi Named Entity Recognition Dataset	Rudra Murthy, Pallab Bhattacharjee, Rahul Sharnagat, Jyotsana Khatri, Diptesh Kanojia and Pushpak Bhattacharyya
1016	MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset	Marina Fomicheva, Shuo Sun, Erick Fonseca, Chrysoula Zerva, Frédéric Blain, Vishrav Chaudhary, Francisco Guzmán, Nina Lopatina, Lucia Specia and André F. T. Martins
1017	SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications	Sayontan Ghosh, Amanpreet Singh, Alex Merenstein, Wei Su, Scott A. Smolka, Erez Zadok and Niranjan Balasubramanian
1018	Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks	Aghilas SINI, Damien Lolive, Nelly Barbot and Pierre Alain
1019	A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning	Gerald Schwiebert, Cornelius Weber, Leyuan Qu, Henrique Siqueira and Stefan Wermter
1020	Sentence Selection Strategies for Distilling Word Embeddings from BERT	Yixiao Wang, Zied Bouraoui, Luis Espinosa Anke and Steven Schockaert
1021	IgboBERT Models: Building and Training Transformer Models for the Igbo Language	Chiamaka Chukwuneke, Ignatius Ezeani, Paul Rayson and Mahmoud El-Haj
1023	RRGparbank: A Parallel Role and Reference Grammar Treebank	Tatiana Bladier, Kilian Evang, Valeria Generalova, Laura Kallmeyer, Robin Möllemann, Natalia Moors, Rainer Osswald and Simon Petitjean
1024	ChiMST: A Chinese Medical Corpus for Word Segmentation and Medical Term Recognition	Yuanhe Tian, Han Qin, Fei Xia and Yan Song
1025	A (Psycho-)Linguistically Motivated Scheme for Annotating and Exploring Emotions in a Genre Diversified Corpus	Aline Etienne, Delphine Battistelli and Gwénolé Lecorvé
1026	Negation Detection in Dutch Spoken Human-Computer Conversations	Tom Sweers, Iris Hendrickx and Helmer Strik
1027	A Study on the Ambiguity in Human Annotation of German Oral History Interviews for Perceived Emotion Recognition and Sentiment Analysis	Michael Gref, Nike Matthiesen, Sreenivasa Hikkal Venugopala, Shalaka Satheesh, Aswinkumar Vijayananth, Duc Bach Ha, Sven Behnke and Joachim Köhler
1028	Pars-ABSA: a Manually Annotated Aspect-based Sentiment Analysis Benchmark on Farsi Product Reviews	Taha Shangipour Ataei, Kamyar Darvishi, Soroush Javdan, Behrouz Minaei-Bidgoli and Sauleh Eetemadi
1029	DiaWUG: A Dataset for Diatopic Lexical Semantic Variation in Spanish	Gioia Baldissin, Dominik Schlechtweg and Sabine Schulte im Walde
1030	Quantification Annotation in ISO 24617-12, Second Draft	Harry Bunt, Maxime Amblard, Johan Bos, Karën Fort, Bruno Guillaume, Philippe de Groote, Chuyuan Li, Pierre Ludmann, Michel Musiol, Siyana Pavlova, Guy Perrier and Sylvain Pogodalla
1031	»textklang« – Towards a Multi-Modal Exploration Platform for German Poetry	Nadja Schauffler, Toni Bernhart, Andre Blessing, Gunilla Eschenbach, Markus Gärtner, Kerstin Jung, Anna Kinder, Julia Koch, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu, Lorenz Wesemann and Jonas Kuhn
1033	The ComMA Dataset V0.2: Annotating Aggression and Bias in Multilingual Social Media Discourse	Ritesh Kumar, Shyam Ratan, Siddharth Singh, Enakshi Nandi, Laishram Niranjana Devi, Akash Bhagat, Yogesh Dawer, Bornini Lahiri, Akanksha Bansal and Atul Kr. Ojha
1034	CEPOC: The Cambridge Exams Publishing Open Cloze dataset	Mariano Felice, Shiva Taslimipoor, Øistein E. Andersen and Paula Buttery
1035	My Case, For an Adposition: Lexical Polysemy of Adpositions and Case Markers in Finnish and Latin	Daniel Chen and Mans Hulden
1037	Latvian National Corpora Collection – Korpuss.lv	Baiba Saulite, Roberts Darģis, Normunds Gruzitis, Ilze Auzina, Kristīne Levāne-Petrova, Lauma Pretkalniņa, Laura Rituma, Peteris Paikens, Arturs Znotins, Laine Strankale, Kristīne Pokratniece, Ilmārs Poikāns, Guntis Barzdins, Anda Baklāne and Valdis Saulespurēns
1038	A Warm Start and a Clean Crawled Corpus - A Recipe for Good Language Models	Vésteinn Snæbjarnarson, Haukur Barri Símonarson, Pétur Orri Ragnarsson, Svanhvít Lilja Ingólfsdóttir, Haukur Jónsson, Vilhjalmur Thorsteinsson and Hafsteinn Einarsson
1041	RoomReader: A Multimodal Corpus of Online Multiparty Conversational Interactions	Justine Reverdy, Sam O'Connor Russell, Louise Duquenne, Diego Garaialde, Benjamin R. Cowan and Naomi Harte
1042	Dialogue Corpus Construction Considering Modality and Social Relationships in Building Common Ground	Yuki Furuya, Koki Saito, Kosuke Ogura, Koh Mitsuda, Ryuichiro Higashinaka and Kazunori Takashio
1043	A Deep Transfer Learning Method for Cross-Lingual Natural Language Inference	Dibyanayan Bandyopadhyay, Arkadipta De, Baban Gain, Tanik Saikh and Asif Ekbal
1045	I still have Time(s): Extending HeidelTime for German Texts	Andy Luecking, Manuel Stoeckel, Giuseppe Abrami and Alexander Mehler
1046	The Brooklyn Multi-Interaction Corpus for Analyzing Variation in Entrainment Behavior	Andreas Weise, Matthew McNeill and Rivka Levitan
1047	XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond	Francesco Barbieri, Luis Espinosa Anke and Jose Camacho-Collados
1049	Quevedo: Annotation and Processing of Graphical Languages	Antonio F. G. Sevilla, Alberto Díaz Esteban and José María Lahoz-Bengoechea
1050	Morphological Complexity of Children Narratives in Eight Languages	Gordana Hržica, Chaya Liebeskind, Kristina Š. Despot, Olga Dontcheva-Navratilova, Laura Kamandulytė-Merfeldienė, Sara Košutar, Matea Kramarić and Giedrė Valūnaitė Oleškevičienė
1051	OpenEL: An Annotated Corpus for Entity Linking and Discourse in Open Domain Dialogue	Wen Cui, Leanne Rolston, Marilyn Walker and Beth Ann Hockey
1053	EXPRES Corpus for A Field-specific Automated Exploratory Study of L2 English Expert Scientific Writing	Ana-Maria Bucur, Madalina Chitez, Valentina Muresan, Andreea Dinca and Roxana Rogobete
1054	Bootstrapping Text Anonymization Models with Distant Supervision	Anthi Papadopoulou, Pierre Lison, Lilja Øvrelid and Ildikó Pilán
1055	Natural Questions in Icelandic	Vésteinn Snæbjarnarson and Hafsteinn Einarsson
1056	Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers	Muskan Garg, Seema Wazarkar, Muskaan Singh and Ondřej Bojar
1057	RED v2: Enhancing RED Dataset for Multi-Label Emotion Detection	Alexandra Ciobotaru, Mihai Vlad Constantinescu, Liviu P. Dinu and Stefan Dumitrescu
1058	An Evaluation Framework for Legal Document Summarization	Ankan Mullick, Abhilash Nandy, Manav Nitin Kapadnis, Sohan Patnaik, Raghav R and Roshni Kar
1059	PLOD: An Abbreviation Detection Dataset for Scientific Documents	Leonardo Zilio, Hadeel Saadany, Prashant K. Sharma, Diptesh Kanojia and Constantin Orăsan
1061	HindiMD: A Multi-domain Corpora for Low-resource Sentiment Analysis	Mamta ., Asif Ekbal, Pushpak Bhattacharyya, Shikha Srivastava, Alka Kumar and Tista Saha
1063	Korean Language Modeling via Syntactic Guide	Hyeondey Kim, Seonhoon Kim, Inho Kang, Nojun Kwak and Pascale Fung
1064	LeConTra: A Learner Corpus of English-to-Dutch News Translation	Bram Vanroy and Lieve Macken
1065	QA4IE: A Quality Assurance Tool for Information Extraction	Rafael Jimenez Silva, Kaushik Gedela, Alex Marr, Bart Desmet, Carolyn Rose and Chunxiao Zhou
1067	WiC-TSV-de: German Word-in-Context Target-Sense-Verification Dataset and Cross-Lingual Transfer Analysis	Anna Breit, Artem Revenko and Narayani Blaschke
1069	Building a Synthetic Biomedical Research Article Citation Linkage Corpus	Sudipta Singha Roy and Robert E. Mercer
1070	A New Dataset for Topic-Based Paragraph Classification in Genocide-Related Court Transcripts	Miriam Schirmer, Udo Kruschwitz and Gregor Donabauer
1072	Comparing approaches to language understanding for human-robot dialogue: an error taxonomy and analysis	Ada D. Tur and David Traum
1073	The Hindi-Telugu Parallel Corpus	Vandan Vasantlal Mujadia and Dipti Sharma
1075	Building an Endangered Language Resource in the Classroom: Universal Dependencies for Kakataibo	Roberto Zariquiey, Claudia Alvarado, Ximena Echevarría, Luisa Gomez, Rosa Gonzales, Mariana Illescas, Sabina Oporto, Frederic Blum, Arturo Oncevay and Javier Vera
1077	Annotating Attribution in Czech News Server Articles	Barbora Hladka, Jiří Mírovský, Matyáš Kopp and Václav Moravec
1079	Features of Perceived Metaphoricity on the Discourse Level: Abstractness and Emotionality	Prisca Piccirilli and Sabine Schulte im Walde
1085	SPORTSINTERVIEW: A Large-Scale Sports Interview Benchmark for Entity-centric Dialogues	Hanfei Sun, Ziyuan Cao and Diyi Yang
1087	The Search for Agreement on Logical Fallacy Annotation of an Infodemic	Claire Bonial, Austin Blodgett, Taylor Hudson, Stephanie M. Lukin, Jeffrey Micher, Douglas Summers-Stay, Peter Sutor and Clare Voss
1088	German Parliamentary Corpus (GerParCor)	Giuseppe Abrami, Mevlüt Bagci, Leon Hammerla and Alexander Mehler
1089	Universal Proposition Bank 2.0	Ishan Jindal, Alexandre Rademaker, Michał Ulewicz, Huyen Nguyen, Ha Linh, Khoi-Nguyen Tran, Huaiyu Zhu and Yunyao Li
1090	Evaluating Methods for Extraction of Aspect Terms in Opinion Texts in Portuguese - the Challenges of Implicit Aspects	Mateus Machado and Thiago Alexandre Salgueiro Pardo
1091	Building a Multilingual Taxonomy of Olfactory Terms with Timestamps	Stefano Menini, Teresa Paccosi, Serra Sinem Tekiroğlu and Sara Tonelli
1092	Xposition: An Online Multilingual Database of Adpositional Semantics	Luke Gessler, Nathan Schneider, Joseph C. Ledford and Austin Blodgett
1094	TYPIC: A Corpus of Template-Based Diagnostic Comments on Argumentation	Shoichi Naito, Shintaro Sawada, Chihiro Nakagawa, Naoya Inoue, Kenshi Yamaguchi, Iori Shimizu, Farjana Sultana Mim, Keshav Singh and Kentaro Inui
1095	Introducing the CURLICAT Corpora: Seven-language Domain Specific Annotated Corpora from Curated Sources	Tamás Váradi, Bence Nyéki, Svetla Koeva, Marko Tadić, Vanja Štefanec, Maciej Ogrodniczuk, Bartłomiej Nitoń, Piotr Pęzik, Verginica Barbu Mititelu, Elena Irimia, Maria Mitrofan, Vasile Păiș, Dan Tufiș, Radovan Garabík, Simon Krek and Andraž Repar
1097	TUSC: Emotion Word Usage in Tweets from US and Canada	Krishnapriya Vishnubhotla and Saif M. Mohammad
1098	The Norwegian Colossal Corpus: A Text Corpus for Training Large Norwegian Language Models	Per E. Kummervold, Freddy Wetjen and Javier de la Rosa
1100	Dataset Construction for Scientific-Document Writing Support by Extracting Related Work Section and Citations from PDF Papers	Keita Kobayashi, Kohei Koyama, Hiromi Narimatsu and Yasuhiro Minami
1101	Assessing the Quality of an Italian Crowdsourced Idiom Corpus: the Dodiom Experiment	Giuseppina Morza, Raffaele Manna and Johanna Monti
1102	Annotating Arguments in a Corpus of Opinion Articles	Gil Rocha, Luís Trigo, Henrique Lopes Cardoso, Rui Sousa-Silva, Paula Carvalho, Bruno Martins and Miguel Won
1104	GeezSwitch: Language Identification in Typologically Related Low-resourced East African Languages	Fitsum Gaim, Wonsuk Yang and Jong C. Park
1106	Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised?	Boshko Koloski, Senja Pollak, Blaž Škrlj and Matej Martinc
1107	MHE: Code-Mixed Corpora for Similar Language Identification	Priya Rani, John P. McCrae and Theodorus Fransen
1108	DeepREF: A Framework for Optimized Deep Learning-based Relation Classification	Igor Nascimento, Rinaldo Lima, Adrian-Gabriel Chifu, Bernard Espinasse and Sébastien Fournier
1109	Cross-lingual and Multilingual CLIP	Fredrik Carlsson, Philipp Eisen, Faton Rekathati and Magnus Sahlgren
1110	Building Large-Scale Japanese Pronunciation-Annotated Corpora for Reading Heteronymous Logograms	Fumikazu Sato, Naoki Yoshinaga and Masaru Kitsuregawa
1111	The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts	Krishnapriya Vishnubhotla, Adam Hammond and Graeme Hirst
1112	Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts	Debjoy Saha, Shravan Nayak and Timo Baumann
1113	The Speed-Vel Project: a Corpus of Sound and Aerodynamic Data to Measure Droplets Emission During Speech Interaction in a Context of Covid-19 Contamination	Francesca Carbone, Gilles Bouchet, Alain Ghio, Thierry Legou, Carine André, Muriel Lalain, Sabrina Kadri, Caterina Petrone, Federica Procino and Antoine Giovanni
1115	A Turkish Hate Speech Dataset and Detection System	Fatih Beyhan, Buse Çarık, İnanç Arın, Ayşecan Terzioğlu, Berrin Yanikoglu and Reyyan Yeniterzi
1116	RuPAWS: A Russian Adversarial Dataset for Paraphrase Identification	Nikita Martynov, Irina Krotova, Varvara Logacheva, Alexander Panchenko, Olga Kozlova and Nikita Semenov
1117	Crowdsourcing Kazakh-Russian Sign Language: DailySigners-50	Medet Mukushev, Aidyn Ubingazhibov, Aigerim Kydyrbekova, Alfarabi Imashev, Vadim Kimmelman and Anara Sandygulova
1118	Klexikon: A German Dataset for Joint Summarization and Simplification	Dennis Aumiller and Michael Gertz
1119	CCTAA: A Reproducible Corpus for Chinese Authorship Attribution Research	Haining Wang and Allen Riddell
1120	Medical Crossing: a Cross-lingual Evaluation of Clinical Entity Linking	Anton M. Alekseev, Zulfat Miftahutdinov, Elena Tutubalina, Artem Shelmanov, Vladimir Ivanov, Vladimir Kokh, Alexander Nesterov, Manvel Avetisian, Andrei V. Chertok and Sergey I. Nikolenko
1121	Towards Universal Segmentations: UniSegments 1.0	Zdeněk Žabokrtský, Niyati Bafna, Jan Bodnár, Lukáš Kyjánek, Emil Svoboda, Magda Ševčíková and Jonáš Vidra
1125	A Study in Contradiction: Data and Annotation for AIDA Focusing on Informational Conflict in Russia-Ukraine Relations	Jennifer Tracey, Ann Bies, Jeremy Getman, Kira Griffitt and Stephanie Strassel
1126	MeSHup: Corpus for Full Text Biomedical Document Indexing	Xindi Wang, Robert E. Mercer and Frank Rudzicz
1127	Pre-Training Language Models for Identifying Patronizing and Condescending Language: An Analysis	Carla Perez Almendros, Luis Espinosa Anke and Steven Schockaert
1129	Attention Understands Semantic Relations	Anastasia Chizhikova, Sanzhar Murzakhmetov, Oleg Serikov, Tatiana Shavrina and Mikhail Burtsev
1130	Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation	Samhita Honnavalli, Aesha Parekh, Lily Ou, Sophie Groenwold, Sharon Levy, Vicente Ordonez and William Yang Wang
1134	Bazinga! A Dataset for Multi-Party Dialogues Structuring	Paul Lerner, Juliette Bergoënd, Camille Guinaudeau, Hervé Bredin, Benjamin Maurice, Sharleyne Lefevre, Martin Bouteiller, Aman Berhe, Léo Galmant, Ruiqing Yin and Claude Barras
1135	The Ellogon Web Annotation Tool: Annotating Moral Values and Arguments	Alexandros Fotios Ntogramatzis, Anna Gradou, Georgios Petasis and Marko Kokol
1136	ELITR Minuting Corpus: A Novel Dataset for Automatic Minuting from Multi-Party Meetings in English and Czech	Anna Nedoluzhko, Muskaan Singh, Marie Hledíková, Tirthankar Ghosal and Ondřej Bojar
1138	A New Dataset for Identifying Question-Answer Pairs in Video Transcripts	Viet Dac Lai, Amir Pouran Ben Veyseh, Franck Dernoncourt and Thien Huu Nguyen
1139	Investigating the Relationship Between Romanian Financial News and Closing Prices from the Bucharest Stock Exchange	Ioan-Bogdan Iordache, Ana Sabina Uban, Catalin Stoean and Liviu P. Dinu
1141	BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset	Mohammad Faiyaz Khan, S.M. Sadiq-Ur-Rahman Shifath and Md Saiful Islam
1142	OCR for Handwritten Paleographic Greek Text: A Century-Based Approach	Paraskevi Platanou, John Pavlopoulos and Georgios Papaioannou
1143	WeCanTalk: A New Multi-language, Multi-modal Resource for Speaker Recognition	Karen Jones, Kevin Walker, Christopher Caruso, Jonathan Wright and Stephanie Strassel
1145	EmoWOZ: A Large-Scale Corpus and Labelling Scheme for Emotion Recognition in Task-Oriented Dialogue Systems	Shutong Feng, Nurul Lubis, Christian Geishauser, Hsien-chin Lin, Michael Heck, Carel van Niekerk and Milica Gasic
1148	Life is not Always Depressing: Exploring the Happy Moments of People Diagnosed with Depression	Ana-Maria Bucur, Adrian Cosma and Liviu P. Dinu
1149	Knowledge Graph Question Answering Leaderboard: A Community Resource to Prevent a Replication Crisis	Aleksandr Perevalov, Xi Yan, Liubov Kovriguina, Longquan Jiang, Andreas Both and Ricardo Usbeck
1150	SuMe: A Dataset Towards Summarizing Biomedical Mechanisms	Mohaddeseh Bastan, Nishant Shankar, Mihai Surdeanu and Niranjan Balasubramanian
1152	Reflections on 30 Years of Language Resource Development and Sharing	Christopher Cieri, Mark Liberman, Sunghye Cho, Stephanie Strassel, James Fiumara and Jonathan Wright
1153	Exploring Data Augmentation Strategies for Hate Speech Detection in Roman Urdu	Ubaid Azam, Hammad Rizwan and Asim Karim
1154	Translation Memories as Baselines for Low-Resource Machine Translation	Rebecca Knowles and Patrick Littell
1156	Incorporating LIWC in Neural Networks to Improve Human Trait and Behavior Analysis in Low Resource Scenarios	Isil Yakut Kilic and Shimei Pan
1157	RoBERTuito: a pre-trained language model for social media text in Spanish	Juan Manuel Pérez, Damián Ariel Furman, Laura Alonso Alemany and Franco M. Luque
1158	Re-train or Train from Scratch? Comparing Pre-training Strategies of BERT in the Medical Domain	Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne and Pierre Zweigenbaum
1159	Atril: an XML Visualization System for Corpus Texts	Andressa Rodrigues Gomide, Conceição Carapinha and Cornelia Plag
1160	Detecting Optimism in Tweets using Knowledge Distillation and Linguistic Analysis of Optimism	Ștefan Cobeli, Ioan-Bogdan Iordache, Shweta Yadav, Cornelia Caragea, Liviu P. Dinu and Dragoș Iliescu
1161	A Free/Open-Source Morphological Analyser and Generator for Sakha	Sardana Ivanova, Jonathan N. Washington and Francis Tyers
1164	Complex Labelling and Similarity Detection in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings	Thibault Charmet, Benoît Sagot and Rachel Bawden
1166	Cyrillic-MNIST: a Cyrillic Version of the MNIST Dataset	Bolat Tleubayev, Zhanel Zhexenova and Anara Sandygulova
1169	CorefUD 1.0: Coreference Meets Universal Dependencies	Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, Amir Zeldes and Daniel Zeman
1171	NerKor+Cars-OntoNotes++	Attila Novák and Borbála Novák
1172	Universal Semantic Annotator: the First Unified API for WSD, SRL and Semantic Parsing	Riccardo Orlando, Simone Conia, Stefano Faralli and Roberto Navigli
1173	Finnish Hate-Speech Detection on Social Media Using CNN and FinBERT	Md Saroar Jahan, Mourad Oussalah and Nabil Arhab
1174	CAMIO: A Corpus for OCR in Multiple Languages	Michael Arrigo and Stephanie Strassel
1175	Hollywood Identity Bias Dataset: A Context Oriented Bias Analysis of Movie Dialogues	Sandhya Singh, Prapti Roy, Nihar Ranjan Sahoo, Niteesh Kumar Reddy Mallela, Himanshu Gupta, Pushpak Bhattacharyya, Milind Savagaonkar, Nidhi Sultan, Roshni Ramnani, Anutosh Maitra and Shubhashis Sengupta
1176	Unifying Morphology Resources with OntoLex-Morph. A Case Study in German	Christian Chiarcos, Christian Fäth and Maxim Ionov
1177	Simple TICO-19: A Dataset for Joint Translation and Simplification of COVID-19 Texts	Matthew Shardlow and Fernando Alva-Manchego
1178	Using Sentence-level Classification Helps Entity Extraction from Material Science Literature	Ankan Mullick, Shubhraneel Pal, Tapas Nayak, Seung-Cheol Lee, Satadeep Bhattacharjee and Pawan Goyal
1180	Lexical Resource Mapping via Translations	Hongchang Bao, Bradley Hauer and Grzegorz Kondrak
1181	A Cross-document Coreference Dataset for Longitudinal Tracking across Radiology Reports	Surabhi Datta, Hio Cheng Lam, Atieh Pajouhi, Sunitha Mogalla and Kirk Roberts
1183	Unsupervised Attention-based Sentence-Level Meta-Embeddings from Contextualised Language Models	Keigo Takahashi and Danushka Bollegala
1184	A Twitter Corpus for Named Entity Recognition in Turkish	Buse Çarık and Reyyan Yeniterzi
1188	Using Wiktionary to Create Specialized Lexical Resources and Datasets	Lenka Bajčetić and Thierry Declerck
1190	Annotating Verbal Multiword Expressions in Arabic: Assessing the Validity of a Multilingual Annotation Procedure	Najet Hadj Mahamed, Chrifa Ben Khelil, Agata Savary, Iskandar Keskes, Jean-Yves Antoine and Lamia Hadrich-Belguith
1192	Adapting Language Models When Training on Privacy-Transformed Data	Tugtekin Turan, Dietrich Klakow, Emmanuel Vincent and Denis Jouvet
1194	A New Dataset for Identifying Misinformation Spreaders and Political Bias	Flora Sakketou, Joan Plepi, Riccardo Cervero, Paolo Rosso and Lucie Flek
1195	Camel Treebank: An Open Multi-genre Arabic Dependency Treebank	Nizar Habash, Muhammed AbuOdeh, Dima Taji, Reem Faraj, Jamila El Gizuli and Omar Kallas
1197	Pre-training and Evaluating Transformer-based Language Models for Icelandic	Jón Daðason and Hrafn Loftsson
1201	MTLens: Machine Translation Output Debugging	Shreyas Sharma, Kareem Darwish, Lucas Aguiar Pavanelli, Thiago Castro Ferreira, Mohamed Al-Badrashiny, Kamer Ali Yuksel and Hassan Sawaf
1202	gaBERT — an Irish Language Model	James Barry, Joachim Wagner, Lauren Cassidy, Alan Cowap, Teresa Lynn, Abigail Walsh, Mícheál J. Ó Meachair and Jennifer Foster
1204	Embeddings models for Buddhist Sanskrit	Ligeia Lugli, Matej Martinc, Andraž Pelicon and Senja Pollak
1206	Cross-Lingual Link Discovery for Under-Resourced Languages	Michael Rosner, Sina Ahmadi, Elena-Simona Apostol, Julia Bosque-Gil, Christian Chiarcos, Milan Dojchinovski, Katerina Gkirtzou, Jorge Gracia, Dagmar Gromann, Chaya Liebeskind, Giedrė Valūnaitė Oleškevičienė, Gilles Sérasset and Ciprian-Octavian Truică
1207	An Expanded Finite-State Transducer for Tsuut’ina Verbs	Joshua Holden, Christopher Cox and Antti Arppe
1208	ELRC Action: Covering Confidentiality, Correctness and Cross-linguality	Tom Vanallemeersch, Arne Defauw, Sara Szoc, Alina Kramchaninova, Joachim Van den Bogaert and Andrea Lösch
1209	FABRA: French Aggregator-Based Readability Assessment toolkit	Rodrigo Wilkens, David Alfter, Xiaoo Wang, Alice Pintarde, Anais Tack, Kevin P. Yancey and Thomas François
1211	ALBETO and DistilBETO: Lightweight Spanish Language Models	José Cañete, Sebastian Donoso, Felipe Bravo-Marquez, Andrés Carvallo and Vladimir Araujo
1212	Building Comparable Corpora for Assessing Multi-Word Term Alignment	Omar Adjali, Emmanuel Morin and Pierre Zweigenbaum
1213	LPAttack: A Feasible Annotation Scheme for Capturing Logic Pattern of Attacks in Arguments	Farjana Sultana Mim, Naoya Inoue, Shoichi Naito, Keshav Singh and Kentaro Inui
1214	RadQA: A Question Answering Dataset to Improve Comprehension of Radiology Reports	Sarvesh Soni, Meghana Gudala, Atieh Pajouhi and Kirk Roberts
1216	Quality Control for Crowdsourced Bilingual Dictionary in Low-Resource Languages	Hiroki Chida and Yohei Murakami
1218	Multilingual Pragmaticon: Database of Discourse Formulae	Anton Buzanov, Polina Bychkova, Arina Molchanova, Anna Postnikova and Daria Ryzhova
1219	STAPI: An Automatic Scraper for Extracting Iterative Title-Text Structure from Web Documents	Nan Zhang, Shomir Wilson and Prasenjit Mitra
1220	Knowledge Graph - Deep Learning: A Case Study in Question Answering in Aviation Safety Domain	Ankush Agarwal, Raj Gite, Shreya Laddha, Pushpak Bhattacharyya, Satyanarayan Kar, Asif Ekbal, Prabhjit Thind, Rajesh Zele and Ravi Shankar
1222	A corpus of Hindi adposition and case semantics	Aryaman Arora, Nitin Venkateswaran and Nathan Schneider
1223	An automatic model and Gold Standard for translation alignment of Ancient Greek	Tariq Yousef, Chiara Palladino, Farnoosh Shamsian, Anise d’Orange Ferreira and Michel Ferreira dos Reis
1228	Mean Machine Translations: On Gender Bias in Icelandic Machine Translations	Agnes Sólmundsdóttir, Dagbjört Guðmundsdóttir, Lilja Björk Stefánsdóttir and Anton Ingason
1229	Annotating the Sentiment of Homeric Text	John Pavlopoulos, Alexandros Xenos and Davide Picca
1230	Who’s in, who’s out? Predicting the Inclusiveness or Exclusiveness of Personal Pronouns in Parliamentary Debates	Ines Rehbein and Josef Ruppenhofer
1231	VoxCommunis: A Corpus for Cross-linguistic Phonetic Analysis	Emily Ahn and Eleanor Chodroff
1232	Challenging the Transformer-based models with a Classical Arabic dataset: Quran and Hadith	Shatha Altammami and Eric Atwell
1233	On the Impact of Temporal Representations on Metaphor Detection	Giorgio Ottolina, Matteo Luigi Palmonari, Manuel Vimercati and Mehwish Alam
1234	Abstract Meaning Representation for Gesture	Richard Brutti, Lucia Donatelli, Kenneth Lai and James Pustejovsky
1235	ELTE Poetry Corpus: a Machine Annotated Database of Canonical Hungarian Poetry	Péter Horváth, Péter Kundráth, Balázs Indig, Zsófia Fellegi, Eszter Szlávich, Tímea Borbála Bajzát, Zsófia Sárközi-Lindner, Bence Vida, Aslihan Karabulut, Mária Timári and Gábor Palkó
1236	Annotation of Communicative Functions of Short Feedback Tokens in Switchboard	Carol Figueroa, Adaeze Adigwe, Magalie Ochs and Gabriel Skantze
1238	On “Human Parity” and “Super Human Performance” in Machine Translation Evaluation	Thierry Poibeau
1243	Universal Dependencies for Punjabi	Aryaman Arora
1245	A STEP towards Interpretable Multi-Hop Reasoning: Bridge Phrase Identification and Query Expansion	Fan Luo and Mihai Surdeanu
1246	Collecting Visually-Grounded Dialogue with A Game Of Sorts	Bram Willemsen, Dmytro Kalpakchi and Gabriel Skantze
1249	Question Modifiers in Visual Question Answering	William John Britton, Somdeb Sarkhel and Deepak Venugopal
1254	HAWP: a Dataset for Hindi Arithmetic Word Problem Solving	Harshita Sharma, Pruthwik Mishra and Dipti Sharma
1256	A Dataset of Offensive Language in Kosovo Social Media	Adem Ajvazi and Christian Hardmeier
1262	Exploring Paraphrase Generation and Entity Extraction for Multimodal Dialogue System	Eda Okur, Saurav Sahay and Lama Nachman
1263	BD-SHS: A Benchmark Dataset for Learning to Detect Online Bangla Hate Speech in Different Social Contexts	Nauros Romim, Mosahed Ahmed, Md Saiful Islam, Arnab Sen Sharma, Hriteshwar Talukder and Mohammad Ruhul Amin
1264	A Whole-Person Function Dictionary for the Mobility, Self-Care and Domestic Life Domains: a Seedset Expansion Approach	Ayah Zirikly, Bart Desmet, Julia Porcino, Jonathan Camacho Maldonado, Pei-Shu Ho, Rafael Jimenez Silva and Maryanne Sacco
1267	Leveraging Pre-trained Language Models for Gender Debiasing	Nishtha Jain, Declan Groves, Lucia Specia and Maja Popović
1268	A Bayesian Topic Model for Human-Evaluated Interpretability	Justin Wood, Corey Arnold and Wei Wang
1269	TeSum: Human-Generated Abstractive Summarization Corpus for Telugu	Priyanka Ravva, Ashok Urlana, Nirmal Surange, Pavan Baswani and Manish Shrivastava
1270	Question Generation and Answering for exploring Digital Humanities collections	Frederic Bechet, Elie Antoine, Jérémy Auguste and Géraldine Damnati
1272	IceBATS: An Icelandic Adaptation of the Bigger Analogy Test Set	Steinþór Steingrímsson, Hjalti Daníelsson, Steinunn Rut Friðriksdóttir and Einar Sigurdsson
1274	DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries	Jayetri Bardhan, Anthony Colas, Kirk Roberts and Daisy Zhe Wang
1276	The Bulgarian Event Corpus: Overview and Initial NER Experiments	Petya Osenova, Kiril Simov, Iva Marinova and Melania Berbatova
1281	The Open Cantonese Wordnet Corpus: Enriching Linguistic Representation in the Cantonese Wordnet	Ut Seong Sio and Luís Morgado da Costa
1282	Evaluating Retrieval for Multi-domain Scientific Publications	Nancy Ide, Keith Suderman, Jingxuan Tu, Marc Verhagen, Shanan Peters, John Lawson, Andrew Borg and James Pustejovsky
1285	The Engage Corpus: A Social Media Dataset for Text-Based Recommender Systems	Daniel Cheng, Kyle Yan, Phillip Keung and Noah A. Smith
1286	An Inflectional Database for Gitksan	Bruce Oliver, Clarissa Forbes, Changbing Yang, Farhan Samir, Edith Coates, Garrett Nicolai and Miikka Silfverberg
1287	Transfer Learning Methods for Domain Adaptation in Technical Logbook Datasets	Farhad Akhbardeh, Marcos Zampieri, Cecilia Ovesdotter Alm and Travis Desell
1290	PyCantonese: Cantonese Linguistics and NLP in Python	Jackson L. Lee, Litong Chen, Charles Lam, Chaakming Lau and Tsz-Him Tsui
1292	Development of Automatic Speech Recognition for the Documentation of Cook Islands Māori	Rolando Coto-Solano, Sally Akevai Nicholas, Samiha Datta, Victoria Quint, Piripi Wills, Emma Ngakuravaru Powell, Liam Koka'ua, Syed Tanveer and Isaac Feldman
1293	RU-ADEPT: Russian Anonymized Dataset with Eight Personality Traits	C. Anton Rytting, Valerie Novak, James R. Hull, Victor M. Frank, Paul Rodrigues, Jarrett Lee and Laurel Miller-Sims
1294	CATAMARAN: A Cross-lingual Long Text Abstractive Summarization Dataset	Zheng Chen and Hongyu Lin
1295	A Corpus for Commonsense Inference in Story Cloze Test	Bingsheng Yao, Ethan Joseph, Julian Lioanag and Mei Si
1296	Dataset and Baseline for Automatic Student Feedback Analysis	Missaka H.M. Herath, Kushan R.A. Chamindu, Hashan Maggona Kumbakarage Maduwantha and Surangika Ranathunga
1297	EmoInHindi: A Multi-label Emotion and Intensity Annotated Dataset in Hindi for Emotion Recognition in Dialogues	Gopendra Vikram Singh, Priyanshu Priya, Mauajama Firdaus, Asif Ekbal and Pushpak Bhattacharyya
1301	BeSt: The Belief and Sentiment Corpus	Jennifer Tracey, Owen Rambow, Claire Cardie, Adam Dalton, Hoa Trang Dang, Mona Diab, Bonnie Dorr, Louise Guthrie, Magdalena Markowska, Smaranda Muresan, Vinodkumar Prabhakaran, Samira Shaikh and Tomek Strzalkowski
1302	RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification	Annika Marie Schoene, Nina Dethlefs and Sophia Ananiadou
1304	A Corpus of Simulated Counselling Sessions with Dialog Act Annotation	John S. Y. Lee, Haley Fong, Lai Shuen Judy Wong, Chun Chung Mak, Chi Hin Yip and Ching Wah Larry Ng
