Magali Paquot will give an invited online talk on 30 October 2023 (18:00 Madrid, 17:00 London) on "The Core Metadata Schema for L2 data" as part of the event "Corpus linguistics and applied linguistics research 2023", organized by Universidad de Murcia. You can register here: https://umurcia.zoom.us/webinar/register/WN_a6Wkw7llSG2HrvJ9yIGKvQ
Here's an abstract:
The Core Metadata Schema for L2 data: Collaborative efforts towards improved data findability, metadata quality and study comparability in L2 research
The Core Metadata Schema for L2 data consists of a comprehensive set of variables that encapsulate crucial information about L2 data. It is organized into several sections, each describing a specific aspect of a learner corpus. These include administrative details (e.g. authors or license), corpus design, text-related variables, learner-related variables, in-built annotation (e.g. details about manual or automatic annotation), information about annotators or transcribers (e.g. native language or language repertoire) and task-related details (e.g. instructions, time constraints) (Paquot et al., 2023). It is the result of extensive collaboration between learner corpus compilers at the Centre for English Corpus Linguistics (UCLouvain, Belgium) and EURAC Research (Bolzano, Italy), and a research data infrastructure expert who is a member of CLARIN's metadata taskforce (König et al., 2022; Frey et al., 2023).
In this presentation, I will discuss the underlying rationale for the development of such a resource and present its second version. This will give me the opportunity to clarify how we have tried to involve learner corpus researchers in this initiative and to reiterate our hope that the learner corpus research (LCR) community will collaborate with us to refine the schema and align it with the evolving needs of the field.