Industry Track Programme

The Industry Track took place on the morning of Wednesday 22 June 2022, from 9:30 to 12:30.

All presentations are available in the programme below.


Start time  End time  Organization  Presenter  Title  Description
09:30 09:40 ELRA Bente Maegaard and Khalid Choukri Introduction to the Industry Track  
09:40 10:00 European Commission Philippe Gelin Language Data Space  
10:00 10:20 Google  Daan van Esch  Language Technology Inclusivity at Google and Beyond Language technology has historically only been available in a small number of the world's languages, and while this remains the case today, exciting research advances are already starting to change the picture significantly: for example, Google Translate recently launched 24 new languages without any parallel data whatsoever, distilled from a 1000-language massively multilingual machine translation model. This talk will touch upon some of the most promising recent trends, and outline what the next couple of years may look like in terms of making language technology more inclusive.
10:20 10:30 CJK Jack Halpern ArabLEX: Arabic Full-Form Lexicon with 530 million entries

The most comprehensive Arabic computational full-form lexicon ever created, covering over 530 million inflected, conjugated, declined, and cliticized wordforms. Ideal for NLP applications such as MT, NER, and morphological analysis, and especially for speech technology, such as training ASR and TTS models.

More on ArabLEX and DiaLEX.

10:30 10:40 Vicomtech Thierry Etchegoyhen From Under-resourced to Large-scale Industrial Deployment: Machine Translation of Basque The technological support for the Basque language can still be described as weak or fragmentary overall. In recent years, significant efforts have been made to provide high-quality machine translation for this language. We describe the main steps that have led to large-scale industrial deployments of machine translation services that are having a significant impact on the digital presence of Basque.
10:40 10:50 ChapsVision Sophie Ulrich    
10:50 11:10 Coffee Break      
11:10 11:30 Amazon Jimmy Kunzman LREC 2022 Marseille Select On-device Spoken Language Understanding Topics Applied research on compute-constrained, on-device spoken language understanding systems attracts considerable interest because it enables an Alexa experience when there is no internet connectivity, or when serving the request locally improves the Alexa experience for our customers in their homes or cars. We will briefly touch on topics such as dynamic adaptation and personalization, tightly integrated speech understanding, and small-footprint ASR systems in the context of rapidly evolving neural speech processing technologies.
11:30 11:40 CEA Dalila Guessoum Application of NLP to cosmetics Introduction to the natural language processing technologies of CEA List and their domain adaptation for extracting toxicological information on cosmetic ingredients from scientific documentation.

11:40 11:50 Emvista Cédric Lopez Dealing with meaning representations Emvista is a software publisher. Created in Montpellier in 2018 and backed by its R&D team in Natural Language Processing, the company offers innovative products based on state-of-the-art technologies. Its flagship product, an email management assistant, helps users manage their email more efficiently by identifying the relevant information in the mass of messages received (for example, requests for information and actions), and offers many other features such as automatic email forwarding and classification. Emvista also offers many other text analysis services, such as opinion and emotion analysis, the anonymization of sensitive texts, and entity extraction. All of this is based on a technology that structures text content using a meaning representation, which will be presented during this talk.
11:50 12:00 Vocapia Claude Barras Towards adaptive, multi-domain speech transcription systems Recent advances in DNN models allow for more flexible solutions in automatic speech transcription, and we will share some feedback from Vocapia Research on the topic. Cross-domain capabilities, covering telephone conversations, broadcast speech, podcasts, or noisier situations, can now be achieved with a single global model. Another area of progress is the growing capacity for users with little linguistic expertise to adapt pre-trained models to their specific needs and applications. Finally, the challenge of addressing new languages and dialects, often low-resourced and affected by code-switching, can be handled thanks to universal multilingual phonetic models and adaptation that requires less annotated data. Nevertheless, relevant linguistic data remains, as before, the key to any successful project in the field.
12:00 12:10 Cerence Rainer Gruhn AI for a World in Motion

Cerence provides conversational AI for the automotive and mobility industries. After a brief introduction to the company, this presentation will discuss the challenges of creating motorcycle driver databases to enable speech control of car navigation systems by two-wheeler drivers. We will look at the distortions induced by Bluetooth headsets and helmets.


12:10 12:20 Orange Lina Rojas Research in NLP at Orange  
Important dates
  • 5 November 2021: Submission of proposals for workshops and tutorials
  • 17 January 2022: Submission of proposals for oral and poster papers
  • 5 April 2022: Notification of acceptance for oral and poster/demo papers
  • 6 May 2022: Final Submission of accepted oral and poster/demo papers
  • 21-22-23 June 2022: Main Conference
  • 20-24-25 June 2022: Workshops & Tutorials