New Paper and Update to PySEM
With the publication of Concepticon 2.6.0 (https://concepticon.clld.org) last week, prepared with Annika Tjuka and Robert Forkel as main collaborators, the PySEM package (https://pypi.org/project/pysem) has also been updated to version 0.5, which contains the data from the most recent Concepticon version.
In addition, our paper on supervised phonological reconstruction has now appeared. This study, joint work with Nathan Hill and Robert Forkel, offers a new, straightforward framework for phonological reconstruction and word prediction, which can serve as a fast baseline for future studies devoted to the task. The study can be found here.
Computational approaches in historical linguistics have been increasingly applied during the past decade and many new methods that implement parts of the traditional comparative method have been proposed. Despite these increased efforts, there are not many easy-to-use and fast approaches for the task of phonological reconstruction. Here we present a new framework that combines state-of-the-art techniques for automated sequence comparison with novel techniques for phonetic alignment analysis and sound correspondence pattern detection to allow for the supervised reconstruction of word forms in ancestral languages. We test the method on a new dataset covering six groups from three different language families. The results show that our method yields promising results while at the same time being not only fast but also easy to apply and expand.
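To give a rough feel for what "supervised reconstruction" from aligned word forms can look like, here is a minimal toy sketch in Python. It is not the implementation from the paper: it assumes the cognate sets are already aligned, it treats each alignment column as a sound correspondence pattern, and it simply memorizes the most frequent proto-sound seen for each pattern, whereas the framework described above uses richer alignment information and a trainable classifier. The two languages, the word forms, and the function names `train` and `reconstruct` are all invented for illustration.

```python
from collections import Counter, defaultdict

GAP = "-"  # gap symbol used in the (hypothetical) alignments


def train(aligned_sets):
    """Count which proto-sound goes with which alignment column.

    aligned_sets: list of (reflex_rows, proto_row) pairs, where all rows
    of one pair are aligned, i.e. have the same length.
    """
    counts = defaultdict(Counter)
    for reflex_rows, proto_row in aligned_sets:
        # zip(*reflex_rows) walks through the alignment column by column
        for column, proto_sound in zip(zip(*reflex_rows), proto_row):
            counts[column][proto_sound] += 1
    return counts


def reconstruct(counts, reflex_rows, unknown="?"):
    """Predict a proto-form from aligned reflexes, one column at a time."""
    proto = []
    for column in zip(*reflex_rows):
        if column in counts:
            sound = counts[column].most_common(1)[0][0]
        else:
            sound = unknown  # pattern never seen in training
        if sound != GAP:
            proto.append(sound)
    return proto


# Invented training data: language A keeps p and t, language B has f and d.
training = [
    ([["p", "a", "t"], ["f", "a", "d"]], ["p", "a", "t"]),
    ([["t", "i", "p"], ["d", "i", "f"]], ["t", "i", "p"]),
]
model = train(training)

# An unseen cognate set built from patterns that occurred in training.
predicted = reconstruct(model, [["p", "i", "t"], ["f", "i", "d"]])
print(" ".join(predicted))
```

A lookup table like this fails on any correspondence pattern it has not seen before (it returns "?" here), which is one reason a real system needs a classifier that can generalize across partially matching patterns.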