Reconciling Classical and Computational Approaches in Historical Linguistics
The Cross-Linguistic Data Formats initiative (Forkel et al. 2016, http://cldf.clld.org) comes along with:
Forkel et al., Nature Scientific Data, to appear
[...] it is a well known fact that certain types of morphemes are relatively stable. Pronouns and numerals, for example, are occasionally replaced either by other forms from the same language or by borrowed elements, but such replacement is rare. The same is more or less true of other everyday expressions connected with concepts and experiences common to all human groups or to the groups living in a given part of the world during a given epoch. (Swadesh 1950: 157)
I, thou, he, we, ye, one, two, three, four, five, six, seven, eight, nine, ten, hundred, all, animal, ashes, back, bad, bark, belly, big, [...] this, tongue, tooth, tree, warm, water, what, where, white, who, wife, wind, woman, year, yellow. (ibid.: 161)
Forthcoming Concepticon version (1.1) features:
Dataset | Transcription System | Sounds |
---|---|---|
GLD (Ruhlen 2008) | NAPA (modified) | 600+ (?) |
Phoible (Moran et al. 2015) | IPA (specified) | 2000+ |
GLD (Starostin 2015) | UTS | ? |
ASJP (Wichmann et al. 2016) | ASJP Code | 700+ |
PBase (Mielke 2008) | IPA (specified) | 1000+ |
Wikipedia | IPA (unspecified) | ? |
JIPA | IPA (norm?) | 800+ |
In | NFD | Confusables | Alias | Out |
---|---|---|---|---|
ã (U+00E3) | a (U+0061) ◌̃ (U+0303) | | | ã |
a (U+0061) : (U+003A) | | a (U+0061) ː (U+02D0) | | aː |
ʦ (U+02A6) | | | t (U+0074) s (U+0073) | ts |
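
The three normalization steps in the table above (NFD decomposition, replacement of confusable characters, resolution of aliases) can be sketched in a few lines of Python. This is only a minimal illustration: the lookup tables `CONFUSABLES` and `ALIASES` and the function `normalize_sound` are hypothetical names, and they cover only the three rows shown, not any real transcription dataset.

```python
import unicodedata

# Hypothetical lookup tables covering only the rows of the table above.
CONFUSABLES = {":": "\u02d0"}  # ASCII colon -> IPA length mark (ː)
ALIASES = {"\u02a6": "ts"}     # ligature ʦ -> plain t + s


def normalize_sound(sound):
    """Apply NFD, then confusable replacement, then alias resolution."""
    # Step 1: canonical decomposition, e.g. ã (U+00E3) -> a + U+0303.
    decomposed = unicodedata.normalize("NFD", sound)
    # Step 2: replace look-alike characters by their intended counterparts.
    deconfused = "".join(CONFUSABLES.get(ch, ch) for ch in decomposed)
    # Step 3: map alternative spellings of a sound to one spelling.
    return "".join(ALIASES.get(ch, ch) for ch in deconfused)
```

Calling `normalize_sound("a:")` on the second row's input returns the vowel followed by the proper length mark (U+02D0) rather than the ASCII colon.
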
In | Identifier |
---|---|
ã | nasalized unrounded open front vowel |
aː | long unrounded open front vowel |
ts | voiceless alveolar affricate stop |
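
One way to derive such identifiers is to split a normalized sound into a base symbol and its modifiers, then join the feature names. The sketch below assumes two hypothetical tables, `BASES` and `MODIFIERS`, that contain only the sounds from the table above; the names and the function `describe` are illustrative and not taken from any existing library.

```python
import unicodedata

# Hypothetical feature tables, limited to the three sounds shown above.
BASES = {
    "a": "unrounded open front vowel",
    "ts": "voiceless alveolar affricate stop",  # wording as in the table
}
MODIFIERS = {
    "\u0303": "nasalized",  # combining tilde
    "\u02d0": "long",       # IPA length mark
}


def describe(sound):
    """Return a feature-based identifier for a (normalized) sound."""
    sound = unicodedata.normalize("NFD", sound)
    # Everything that is not a known modifier belongs to the base symbol.
    base = "".join(ch for ch in sound if ch not in MODIFIERS)
    features = [MODIFIERS[ch] for ch in sound if ch in MODIFIERS]
    # Raises KeyError for bases that are missing from the toy table.
    return " ".join(features + [BASES[base]])
```

Because the input is decomposed first, the precomposed ã (U+00E3) and the sequence a + U+0303 receive the same identifier.
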
LingPy:
* Python library for quantitative tasks in historical linguistics
* offers many methods for sequence comparison (phonetic alignment, cognate detection), phylogenetic reconstruction, ancestral state reconstruction, etc.
* can read CLDF files
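
To give a rough idea of what sequence comparison for cognate detection involves, the following sketch compares two segmented word forms with a plain normalized Levenshtein distance and a threshold. This is deliberately not LingPy's API or its actual algorithms, which are considerably more refined; the function names, the threshold of 0.5, and the simplified transcriptions are assumptions made for illustration.

```python
def edit_distance(seq_a, seq_b):
    """Levenshtein distance between two sequences of sound segments."""
    previous = list(range(len(seq_b) + 1))
    for i, seg_a in enumerate(seq_a, start=1):
        current = [i]
        for j, seg_b in enumerate(seq_b, start=1):
            cost = 0 if seg_a == seg_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def normalized_distance(seq_a, seq_b):
    """Edit distance divided by the length of the longer sequence."""
    longest = max(len(seq_a), len(seq_b))
    return edit_distance(seq_a, seq_b) / longest if longest else 0.0


def probably_cognate(seq_a, seq_b, threshold=0.5):
    """Crude cognate judgment: similar enough under the given threshold."""
    return normalized_distance(seq_a, seq_b) <= threshold


# Simplified, illustrative transcriptions of German "Hand" and English "hand".
german = ["h", "a", "n", "t"]
english = ["h", "æ", "n", "d"]
```

With these inputs, `edit_distance(german, english)` counts two substitutions. A flat cost of 1 per substitution ignores that a/æ and t/d are phonetically close, which is exactly the kind of information that dedicated phonetic alignment methods take into account.
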
List, Greenhill, and Gray, PLOS ONE, 2017
The EDICTOR is a web-based tool for editing, analysing, and publishing etymological data. It is available as a prototype in Version 0.1 and will be further developed in the project "Computer-Assisted Language Comparison" (2017-2021). The tool can be accessed at http://edictor.digling.org or downloaded and used offline. All that is needed is a web browser (Firefox, Safari, Chrome); offline usage is currently restricted to Firefox. The tool is file-based: the input is not a database structure but a plain tab-separated text file (such as a single sheet exported from a spreadsheet editor). The data formats are identical to those used by LingPy, which allows close interaction between automatic analysis and manual refinement.
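
A plain tab-separated wordlist of this kind can be processed with nothing more than Python's standard `csv` module. The sketch below is only an illustration: the column names (`ID`, `DOCULECT`, `CONCEPT`, `FORM`, `COGID`) and the three sample rows are assumptions chosen for the example, not a specification of the file format.

```python
import csv
import io

# Invented sample data; column names are assumptions for this sketch.
SAMPLE = (
    "ID\tDOCULECT\tCONCEPT\tFORM\tCOGID\n"
    "1\tGerman\thand\thant\t1\n"
    "2\tEnglish\thand\thænd\t1\n"
    "3\tRussian\thand\truka\t2\n"
)


def read_wordlist(handle):
    """Read a tab-separated wordlist into a dict keyed by integer ID."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {int(row["ID"]): row for row in reader}


def group_by_cognate_set(wordlist):
    """Collect the language varieties that share a cognate-set identifier."""
    groups = {}
    for row in wordlist.values():
        groups.setdefault(row["COGID"], []).append(row["DOCULECT"])
    return groups


wordlist = read_wordlist(io.StringIO(SAMPLE))
cognate_sets = group_by_cognate_set(wordlist)
```

Keeping one word form per row, with the cognate judgment in its own column, is what makes it easy to alternate between automatic analysis and manual correction of the same file.
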
The EDICTOR structure is modular, consisting of different panels that allow for:
Key | Concept | Russian | German | ... |
---|---|---|---|---|
1.1 | world | mir, svet | Welt | ... |
1.21 | earth, land | zemlja | Erde, Land | ... |
1.212 | ground, soil | počva | Erde, Boden | ... |
1.420 | tree | derevo | Baum | ... |
1.430 | wood | derevo | Holz | ... |
CLICS (List et al. 2014) was an online database of synchronic lexical associations ("colexifications") in 221 language varieties of the world.
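
Colexifications can be read directly off a concept list like the one above: whenever one form in a language expresses two concepts, those concepts are colexified. The following sketch uses the Russian and German columns of that table; the data structure and the function name `colexifications` are hypothetical and do not reflect how CLICS itself is implemented.

```python
from collections import defaultdict
from itertools import combinations

# Rows taken from the concept table above (Russian and German columns).
ROWS = [
    ("world", {"Russian": "mir, svet", "German": "Welt"}),
    ("earth, land", {"Russian": "zemlja", "German": "Erde, Land"}),
    ("ground, soil", {"Russian": "počva", "German": "Erde, Boden"}),
    ("tree", {"Russian": "derevo", "German": "Baum"}),
    ("wood", {"Russian": "derevo", "German": "Holz"}),
]


def colexifications(rows):
    """Map each colexified concept pair to the (language, form) evidence."""
    concepts_by_form = defaultdict(set)
    for concept, entries in rows:
        for language, cell in entries.items():
            # A cell may list several synonyms separated by commas.
            for form in cell.split(","):
                concepts_by_form[(language, form.strip())].add(concept)
    found = {}
    for (language, form), concepts in concepts_by_form.items():
        # Every pair of concepts sharing one form counts as a colexification.
        for pair in combinations(sorted(concepts), 2):
            found.setdefault(pair, []).append((language, form))
    return found


result = colexifications(ROWS)
```

For these five rows, the sketch finds Russian derevo linking "tree" and "wood", and German Erde linking "earth, land" and "ground, soil".
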
Database of Cross-Linguistic Colexifications (CLICS):
Problems of CLICS¹
Basic ideas for CLICS²
Results for CLICS² (List et al. 2018)
CLICS², Linguistic Typology, List et al. 2018
Lexibank is planned to function as
Thank you all!