Annotation of Sound Change and Phonetic Alignments in EDICTOR
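Language name: Mandarin (Běijīng)
Glottocode: beij1234
Doculect identifier: BeijingMandarin
Concept: rain (alternatively written the rain or rain (noun))
Concepticon ID: 658
Concepticon gloss: RAIN (PRECIPITATION)
Value: ni44; ʑe44 (the entry exactly as it appears in your data)
Form: the value ni44; ʑe44 thus becomes the two forms ni44 and ʑe44
Segments: n i ⁴⁴ and ʑ e ⁴⁴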
Typical linguistic data as illustrated in Wu et al. (2020).
Long-table format, required for the EDICTOR as illustrated in Wu et al. (2020).
The EDICTOR (List 2017) is a web-based tool for editing, analysing, and publishing etymological data. It is available in Version 1.0. The tool can be accessed via the website at https://digling.org/edictor/. All that is needed to use the tool is a web browser (Firefox, Safari, Chrome). Offline usage is also possible, but requires changing the web browser settings. The tool is file-based: the input is not a database structure, but a plain tab-separated text file (such as a single sheet exported from a spreadsheet editor in long-table format). The data formats are identical to those used by LingPy, which allows for close interaction between automatic analysis and manual refinement.
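The step from a raw cell to long-table rows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (ID, DOCULECT, CONCEPT, VALUE, FORM, TOKENS) follow common LingPy-style conventions and are not confirmed by the text above, the helper names are hypothetical, and the segmentation is deliberately naive (one token per character, tone digits grouped and written as superscripts).

```python
import csv
import io

# Map ASCII tone digits to superscript digits (44 -> ⁴⁴).
SUPERSCRIPT = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")


def split_value(value):
    """Split a raw cell such as 'ni44; ʑe44' into individual forms."""
    return [form.strip() for form in value.split(";") if form.strip()]


def segment(form):
    """Naive segmentation: one token per character, tone digits grouped."""
    tokens, tone = [], ""
    for char in form:
        if char.isdigit():
            tone += char
            continue
        if tone:
            tokens.append(tone.translate(SUPERSCRIPT))
            tone = ""
        tokens.append(char)
    if tone:
        tokens.append(tone.translate(SUPERSCRIPT))
    return " ".join(tokens)


def to_long_table(doculect, concept, value):
    """Return one row per form contained in the raw value."""
    return [
        {"DOCULECT": doculect, "CONCEPT": concept, "VALUE": value,
         "FORM": form, "TOKENS": segment(form)}
        for form in split_value(value)
    ]


def write_tsv(rows):
    """Serialize rows as tab-separated text with a running ID column."""
    fields = ["ID", "DOCULECT", "CONCEPT", "VALUE", "FORM", "TOKENS"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for idx, row in enumerate(rows, start=1):
        writer.writerow({"ID": idx, **row})
    return buffer.getvalue()


rows = to_long_table("BeijingMandarin", "rain", "ni44; ʑe44")
table = write_tsv(rows)
print(table)
```

Real data usually needs a proper segmentation routine that handles affricates, diacritics, and diphthongs; the character-by-character split here is only enough for the simple example.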
The EDICTOR structure is modular, consisting of different panels that allow for:
File formats are straightforward:
Cognates can be assigned to words using two rudimentary operations:
The Cognates panel is synchronized with the wordlist panel: when assigning cognates within meaning slots, the wordlist panel is automatically filtered to show only the words under consideration.
The Cognates panel also allows users to align directly the words that were assigned to the same cognate set.
Partial cognates can be assigned to words provided that the data is segmented into morphemes. The assignment follows a very intuitive schema:
Morpheme Glosses
Morpheme Glosses Panel (new in EDICTOR 2.0)
In order to check how well a given transcription was carried out, the Phonology panel can be used. It lists all phonemes for a given language (proper segmentation is required) and their frequency of occurrence. The expert can thus check the correctness of very rare phonemes or weird characters. Since the Phonology panel links to the Wordlist panel, experts can quickly find the words containing specific phonemes and correct them or inspect them. In addition, an IPA chart can be displayed to check the structural properties of the sound system of a given language.
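The kind of check the Phonology panel supports can be approximated offline. The following is a minimal sketch, assuming rows with a DOCULECT field and a space-segmented TOKENS field (both column names are assumptions); the third row is invented purely for illustration, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def phoneme_frequencies(rows):
    """Count how often each segment occurs per doculect."""
    inventory = defaultdict(Counter)
    for row in rows:
        inventory[row["DOCULECT"]].update(row["TOKENS"].split())
    return inventory


def rare_segments(counter, max_count=1):
    """Return segments occurring at most max_count times, sorted."""
    return sorted(seg for seg, n in counter.items() if n <= max_count)


rows = [
    {"DOCULECT": "BeijingMandarin", "TOKENS": "n i ⁴⁴"},
    {"DOCULECT": "BeijingMandarin", "TOKENS": "ʑ e ⁴⁴"},
    {"DOCULECT": "BeijingMandarin", "TOKENS": "n e ⁴⁴"},  # hypothetical row
]

freqs = phoneme_frequencies(rows)
for segment, count in freqs["BeijingMandarin"].most_common():
    print(segment, count)
print("rare:", rare_segments(freqs["BeijingMandarin"]))
```

Segments that occur only once are good candidates for manual inspection, since they are often typos or inconsistent transcriptions.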
Quasi-Deprecated and Superseded by the Morpheme Glosses Panel
This panel was an early attempt to handle morpheme glosses. It offers some potentially interesting features, such as bipartite word family graphs, but will otherwise not help you much to advance your data, so I recommend ignoring it.
Quasi-Deprecated and Superseded by the Correspondence Patterns Panel
If alignments are provided for a given dataset, one can use the Correspondences Panel of the EDICTOR to compare the frequency of sound correspondences between language pairs. In this way, errors in cognate assignment or alignment analyses can be quickly corrected and a general idea regarding regular sound correspondences can be derived. Sound correspondences can also be defined for a given context. The context either needs to be supplied by the user in an additional column, or it can be computed automatically, based on the idea of prosodic strings (List 2014), which assign each sound a value based on its prosodic weight.
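Counting pairwise correspondences from alignments is straightforward to sketch. The example below assumes each cognate set is given as a mapping from language name to an aligned sequence of equal length, with "-" as the gap symbol; the languages A and B and their forms are invented for illustration, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_correspondences(cognate_sets):
    """Count aligned segment pairs for every pair of languages."""
    counts = Counter()
    for alignment in cognate_sets:
        # Sort by language name so each pair is always counted in one order.
        languages = sorted(alignment.items())
        for (lang_a, seq_a), (lang_b, seq_b) in combinations(languages, 2):
            for seg_a, seg_b in zip(seq_a, seq_b):
                if seg_a == "-" and seg_b == "-":
                    continue  # a column of two gaps carries no information
                counts[(lang_a, lang_b, seg_a, seg_b)] += 1
    return counts


# Hypothetical aligned cognate sets for two invented languages.
cognate_sets = [
    {"A": ["p", "a", "t"], "B": ["f", "a", "t"]},
    {"A": ["p", "i", "-"], "B": ["f", "i", "s"]},
]

counts = pairwise_correspondences(cognate_sets)
for (lang_a, lang_b, seg_a, seg_b), n in counts.most_common():
    print(f"{lang_a}:{seg_a} ~ {lang_b}:{seg_b}  {n}")
```

Correspondences that occur only once across many cognate sets are worth a second look, as they may point to a wrong cognate judgment or a misaligned segment.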
You can also inspect correspondence patterns across all the languages in your sample, provided, again, that your data is completely aligned. A first method for correspondence pattern identification was proposed by List (2019), but the algorithm is time-consuming and therefore only available in Python: you have to analyze your data with the algorithm first and then load the resulting file into the EDICTOR to inspect it properly. The EDICTOR itself offers a very simple greedy solution that you can use for quick data inspection and that usually shows the most frequent patterns in your data.
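To give an idea of what a greedy approach can look like, here is a minimal sketch. It is neither the EDICTOR's actual implementation nor the algorithm of List (2019); it only assumes that each alignment column is a tuple with one cell per language, that "Ø" marks a language with no reflex in the cognate set, and that two columns are compatible when they never disagree in a language where both are filled. All names and data are hypothetical.

```python
MISSING = "Ø"


def compatible(pattern, column):
    """True if pattern and column never conflict where both are filled."""
    return all(p == c or MISSING in (p, c) for p, c in zip(pattern, column))


def merge(pattern, column):
    """Fill missing cells of the pattern with values from the column."""
    return tuple(c if p == MISSING else p for p, c in zip(pattern, column))


def greedy_patterns(columns):
    """Assign each column to the first compatible pattern, densest first."""
    ordered = sorted(
        columns,
        key=lambda col: sum(cell != MISSING for cell in col),
        reverse=True,
    )
    patterns = []  # each entry: [merged pattern, list of member columns]
    for column in ordered:
        for entry in patterns:
            if compatible(entry[0], column):
                entry[0] = merge(entry[0], column)
                entry[1].append(column)
                break
        else:
            patterns.append([column, [column]])
    # Most frequent patterns first.
    return sorted(patterns, key=lambda entry: len(entry[1]), reverse=True)


# Hypothetical alignment columns for three languages.
columns = [
    ("p", "f", "p"),
    ("p", "f", MISSING),
    (MISSING, "f", "p"),
    ("t", "t", "t"),
    ("p", "p", MISSING),
]

patterns = greedy_patterns(columns)
for pattern, members in patterns:
    print(" ".join(pattern), len(members))
```

Because the first compatible pattern always wins, the result depends on the processing order, which is why a greedy pass is suitable for quick inspection but not as a final analysis.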
Inspecting how the cognate sets are distributed in your data is very useful for getting a direct impression of certain aspects of subgrouping. You can easily do this with the help of the Cognate Sets panel of the EDICTOR. In addition, you can export your data to the Nexus format from here and use the file in biological software packages, such as SplitsTree, to infer a quick tree or network.
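What such a Nexus export contains can be illustrated with a small sketch that encodes cognate sets as a binary presence/absence matrix. The exact layout the EDICTOR writes may differ; the function name, the input structure (a mapping from doculect to a set of cognate set IDs), and the three invented languages are assumptions.

```python
def cognates_to_nexus(cognates):
    """Build a binary presence/absence Nexus matrix from cognate set IDs."""
    all_sets = sorted({cogid for ids in cognates.values() for cogid in ids})
    width = max(len(name) for name in cognates)
    lines = [
        "#NEXUS",
        "",
        "BEGIN DATA;",
        f"DIMENSIONS NTAX={len(cognates)} NCHAR={len(all_sets)};",
        'FORMAT DATATYPE=STANDARD SYMBOLS="01" MISSING=? GAP=-;',
        "MATRIX",
    ]
    for name in sorted(cognates):
        # One character per cognate set: 1 if the doculect has a reflex.
        row = "".join("1" if cogid in cognates[name] else "0"
                      for cogid in all_sets)
        lines.append(f"{name.ljust(width)} {row}")
    lines += [";", "END;"]
    return "\n".join(lines)


# Hypothetical cognate set assignments for three invented languages.
cognates = {"A": {1, 2}, "B": {1, 3}, "C": {2, 3}}
nexus = cognates_to_nexus(cognates)
print(nexus)
```

This sketch treats every absent cognate set as 0; it does not distinguish a truly absent reflex from a missing data point, which would have to be coded as ?. Doculect names containing spaces or brackets would also need quoting.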
Templates can be used to develop first questionnaires that can then be filled out with the help of the EDICTOR. Template functionality is still rudimentary in the EDICTOR. Users can select among different concept lists (Swadesh, Blust, etc.) and also merge multiple concept lists. More fine-grained operations (the intersection of concept lists, or mergers that take concept similarity into account) are not yet implemented, but are currently being developed for the Concepticon, where they will be available with the next official release (planned for 2017).
The EDICTOR has a database backend that allows data to be stored automatically on a server. To support this, databases need to be explicitly created, and there is no official way to do this at the moment: users who wish to use the database backend need to ask me to set up a database and create passwords for them. The interface for databases in the Customize menu lets users select the items of a given database that they want to inspect. The selection is stored in a link that users can then bookmark and use whenever they want to work on the data.
Users can customize a large part of the EDICTOR's default settings. This is done with the help of specific URLs that users can bookmark to call the tool in their preferred view. For example, if one wants to see the alignments immediately when loading a file, this can be specified. If users prefer to use their own IPA keyboard instead of SAMPA conversion, this is also possible. More possibilities for customization will be added in the future, but the EDICTOR already offers a great deal of flexibility.
The EDICTOR can also be used as a convenient interface to publish data in a nice form on the web. As it is purely text-based, all that is needed is to clone the EDICTOR software and host it on a server of one's choice. Then, by using customized URLs one can present a given dataset in read-only mode. In this way, users cannot edit the data, but they can use the interactive possibilities to inspect it.
Thanks to all!