2024-02-20

New preprint

A new preprint with Frederic Blum, Nathan Hill, and Cristian Juárez has appeared online at Open Research Europe and is awaiting open peer review. The study is titled "Grouping sounds into evolving units for the purpose of historical language comparison" (DOI: 10.12688/openreseurope.16839.1).

Computer-assisted approaches to historical language comparison have made great progress during the past two decades. Scholars can now routinely use computational tools to annotate cognate sets, align words, and search for regularly recurring sound correspondences. However, computational approaches still suffer from a very rigid sequence model of the form part of the linguistic sign, in which words and morphemes are segmented into fixed sound units which cannot be modified. In order to bring the representation of sound sequences in computational historical linguistics closer to the research practice of scholars who apply the traditional comparative method, we introduce improved sound sequence representations in which individual sound segments can be grouped into evolving sound units in order to capture language-specific sound laws more efficiently. We illustrate the usefulness of this enhanced representation of sound sequences in concrete examples and complement it by providing a small software library that allows scholars to convert their data from forms segmented into sound units to forms segmented into evolving sound units and vice versa.
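
To give a rough idea of what converting between the two representations can look like, here is a minimal sketch in Python. It is not the software library mentioned in the abstract: the function names `group` and `ungroup`, the use of a dot to join grouped segments, and the example form are all assumptions made purely for illustration.

```python
# Illustrative sketch only. The function names and the "." grouping
# marker are assumptions, not the API of the library from the preprint.

def group(segments, sizes, sep="."):
    """Join individual sound segments into larger units.

    `segments` is a list of sound segments, `sizes` lists how many
    consecutive segments go into each unit.
    """
    if sum(sizes) != len(segments):
        raise ValueError("group sizes must cover all segments exactly")
    units = []
    start = 0
    for size in sizes:
        units.append(sep.join(segments[start:start + size]))
        start += size
    return units


def ungroup(units, sep="."):
    """Split grouped units back into individual sound segments."""
    return [segment for unit in units for segment in unit.split(sep)]


# Hypothetical form: treat the rhyme "a ŋ" as one evolving unit.
segments = ["t", "a", "ŋ"]
grouped = group(segments, [1, 2])
restored = ungroup(grouped)
```

In this sketch, the form segmented as "t a ŋ" becomes "t a.ŋ" after grouping, and ungrouping restores the original three segments, so the conversion works in both directions.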

In addition, we were informed that two papers have been accepted for the COLING-LREC conference in Torino in May this year. One study, with Robert Forkel and Guillaume Ségerer, is titled "Linguistic Survey of India and Polyglotta Africana: Two Retrostandardized Digital Editions of Large Historical Collections of Multilingual Wordlists" and presents two CLLD datasets, the LSI and the Polyglotta Africana. The other study, with Michele Pulini, is titled "First Steps Towards the Integration of Resources on Historical Glossing Traditions in the History of Chinese: A Collection of Standardized Fǎnqiè Spellings from the Guǎngyùn".