Introduction
Chinese Historical Phonology
Chinese Historical Phonology (音韵学 yīnyùnxué) is the classical discipline
developed by Chinese scholars long before the arrival of modern linguistic
theories (mostly based on synchronic grammar).
Chinese Historical Phonology investigates the development and the historical stages of the Chinese language, in particular:
- the diversification of the dialects (as a rather modern part of the discipline),
- the pronunciation of the Chinese variety encoded in the rhyme books (600 AD) and rhyme tables (ca. 1100 AD), also known as Middle Chinese, and
- the pronunciation of the ancient poems, especially the Book of Odes (詩經 Shījīng, ca. 600 BC), and ancient characters, based on the structural analysis of phonophoric characters.
Achievements of Chinese Historical Phonology:
- linguistic reconstruction of Old Chinese pronunciation based on rhyme pattern analysis of the Book of Odes and character structure analysis of phonophoric characters
- early phonological classification of Chinese speech sounds
- classification of Chinese dialects based on their individual divergence from Middle Chinese pronunciation
Digital Historical Linguistics
What is Digital Historical Linguistics?
- So far, the term has barely been used.
- It seems desirable to distinguish research in Computational Historical Linguistics, which implies a closeness to NLP approaches, from research that makes active use of computers without relying on them completely.
- In this sense, Digital Historical Linguistics can be seen as the idea of having a discipline that integrates classical, qualitative approaches and computational, quantitative approaches in linguistics.
Computer-Assisted Language Comparison
- data in linguistics is steadily increasing
- our methods reach their practical limits, as they are tedious to apply
- we need to take computational methods into account
- but computational methods are not very accurate and may yield wrong results
Computer-assisted language comparison (CALC) can be seen as one aspect of Digital Historical Linguistics, with the latter covering a broader range than CALC, as not all problems in historical linguistics are problems of language comparison.
The general strategy nevertheless also holds for my vision of Digital Historical Linguistics: avoid black-box approaches and bridge the gap between quantitative and qualitative approaches, while fostering the development of new quantitative and qualitative methods that focus on linguistic questions (as opposed to engineering or NLP questions). In this view, Digital Historical Linguistics is a discipline that uses digital approaches to address scientific questions relevant to the field of linguistics.
Why «Digital»?
«Measure what is measurable, and make measurable what is not so.» (quote apparently falsely attributed to Galileo Galilei, see Kleinert 2009)
Why «Digital»? Juggling Lessons...
Lessons from Juggling (and Galilei):
- It does not hurt to try to measure something.
- Even if you fail to measure some phenomenon, you will learn something about it.
- Restricting your view of a problem with some kind of model may restrict you at first, but it may also encourage you to look at aspects of the phenomenon you had ignored so far.
→ It does not hurt to try, but it may hurt not to try, even if it does not seem to hurt while you are not trying.
Digital Chinese Historical Phonology
Problems of Chinese Historical Phonology:
- The discipline was always data-driven, but so far, the interpretation, even of large datasets, was done manually.
- Scholars have assembled large collections of data, but they are not available in computer-readable form.
- Collaboration and interdisciplinary approaches in the field are still rare.
Potential of Digital Chinese Historical Phonology:
- Digital approaches may help to offer a new perspective on our data.
- Digital techniques may even further increase the data basis, especially when helping to unify the data and analyses proposed by different scholars.
- Techniques for exploratory data analysis may help scholars to develop new hypotheses.
Rhyme Notation
Excursus: Notation of Music
- When comparing traditions of musical notation across times and cultures, it is clear that none of the notation techniques can faithfully render the music as it was perceived when people originally created it.
- Despite this general problem of faithful notation, people across times and cultures have tried to develop systems that would freeze the music they heard or played in a durable medium.
- Comparing the notation systems for music that are currently in use, we can also say that people have indeed succeeded, at least to some degree, in capturing the ephemeral with the help of their notation systems.
- Linguistics faces a similar problem of notation, given that we want to represent speech in a durable medium.
- What is surprising, however, is that the linguistic practice of notating speech is often less strict than in music, showing much more variation and much less comparability: despite the efforts of the IPA, there is huge variation in how linguists actually use the IPA (Anderson et al. forthcoming, https://clts.clld.org).
- While the high degree of variation in linguistics may also relate to the higher degree of variation in languages in general, it may be helpful for linguists to look at different systems for notation in different cultures and practices, in order to improve the techniques by which we try to represent speech.
From Notation to Annotation
- We can roughly say that notation serves to reflect a specific practice in a different, usually visual, medium, while annotation aims to add information, such as some kind of analysis or interpretation.
- We can distinguish two basic techniques for annotation (when dealing with texts): stand-off and inline, with stand-off annotation representing the analysis independent of the text, by indexing its words, for example, and inline-annotation representing the analysis in the text itself (Eckart 2012).
- It is not clear to me at this point, whether the distinction between notation and annotation is useful after all, as I need to read more about the topic in general.
Annotation of Rhyme Judgments
- Although rhyme analysis plays a crucial role in the reconstruction of Old Chinese phonology, the field has not yet developed a standardized annotation framework for rhyme judgments applied to Ancient Chinese texts.
Wang Li's annotation (1980)
Baxter's annotation (1992)
Karlgren's annotation (1950)
Starostin's annotation (1989)
Behr's annotation A (2008)
Behr's annotation B (2008)
Problems resulting from missing standards
- We have huge problems in comparing different analyses on rhyme judgments.
- We have problems in digitizing different analyses on rhyme judgments in order to make them comparable.
- We have only a few contributions in which scholars actually publish their analyses, given that the annotation is so tedious to prepare.
A Framework for Rhyme Judgments
Main ideas:
- Guiding principle: "Simple things should be simple, complex things should be possible" (a maxim usually attributed to Alan Kay, in the spirit of the Zen of Python)
- Simplicity: allow for a framework that can be realized in simple spreadsheet editors (like Excel or LibreOffice)
- Exhaustiveness: allow for a framework that captures many aspects we already know will be important for rhyme annotation.
- Flexibility: allow for a framework that can be easily lifted to more complex annotations, even when initial ones were lacking certain aspects.
Learning from wordlist annotation and CLDF
In order to achieve all these goals, we draw largely from our experience with the enhanced
annotation and computer-assisted manipulation of wordlists in historical linguistics (Hill
and List 2017) and their subsequent inclusion into the CLDF specifications.
Basic structure
- Table format, with first row serving as header, and content per cell in a specific column being standardized.
- Python API for analyzing and checking the data, which also supports conversion across formats.
- Examples of best practice to help scholars create data that follows our format specifications.
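A minimal sketch of what such a table and a basic check could look like. It is not the API of the framework itself: the column names (ID, POEM, STANZA, LINE, RHYMEIDS), the function `read_rhyme_table`, and the convention that 0 marks a non-rhyming word are assumptions made for illustration, and the sample lines are invented.

```python
import csv
import io

# Hypothetical column names; the actual specification may differ.
REQUIRED = ["ID", "POEM", "STANZA", "LINE", "RHYMEIDS"]


def read_rhyme_table(handle):
    """Read a tab-separated table whose first row serves as the header."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    rows = []
    for row in reader:
        # Assumed convention: one rhyme identifier per word,
        # with 0 marking a word that does not rhyme.
        ids = [int(x) for x in row["RHYMEIDS"].split()]
        if len(ids) != len(row["LINE"].split()):
            raise ValueError(
                "row {0}: expected one rhyme ID per word".format(row["ID"]))
        row["RHYMEIDS"] = ids
        rows.append(row)
    return rows


# Invented sample data, only to show the shape of the table.
sample = (
    "ID\tPOEM\tSTANZA\tLINE\tRHYMEIDS\n"
    "1\tExample\t1\tthe cat sat on the mat\t0 1 0 0 0 1\n"
    "2\tExample\t1\tand wore a funny hat\t0 0 0 0 1\n"
)
rows = read_rhyme_table(io.StringIO(sample))
for row in rows:
    print(row["ID"], row["LINE"], row["RHYMEIDS"])
```

Because the table is plain tab-separated text, the same file can be edited in a spreadsheet editor and checked by a script.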
Examples
Example 1: Wang's (1980) judgments in our format
Example 2: Providing alignments of Wang's (1980) judgments
A Simplified Format
- In addition, we offer a simplified format that allows scholars to prepare a dataset in an initial form, which can later be converted to the extended format and edited there.
- This format mimics how poems are displayed in normal documents and makes extensive use of inline annotations.
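The conversion from an inline-annotated poem to a table can be sketched as follows. The inline syntax used here (a rhyme label in square brackets after the rhyme word, blank lines between stanzas) and the output column names are assumptions made for this illustration, not the syntax of the simplified format itself.

```python
import re

# Hypothetical inline syntax: "word[label]" marks a rhyme word.
TAG = re.compile(r"^(.+?)\[(\w+)\]$")


def simplified_to_rows(text):
    """Convert an inline-annotated poem into one table row per line."""
    rows = []
    stanzas = [s for s in text.strip().split("\n\n") if s.strip()]
    for s_idx, stanza in enumerate(stanzas, start=1):
        for l_idx, line in enumerate(stanza.strip().split("\n"), start=1):
            words, labels = [], []
            for token in line.split():
                match = TAG.match(token)
                words.append(match.group(1) if match else token)
                # "0" marks a word without a rhyme label.
                labels.append(match.group(2) if match else "0")
            rows.append({
                "STANZA": s_idx,
                "LINE_IN_STANZA": l_idx,
                "LINE": " ".join(words),
                "RHYMES": " ".join(labels),
            })
    return rows


# Invented text, only to show the conversion.
poem = """
the cat[a] sat on the mat[a]
and wore a funny hat[a]

the dog[b] sat on a log[b]
"""
table = simplified_to_rows(poem)
for row in table:
    print(row)
```

The poem stays readable in its inline form, while the resulting rows keep text and rhyme labels in separate columns that can be edited further.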
Example 1: Wang's (1980) judgments in the simplified format
Example 2: Song «Песня для Цоя» from Zoopark in our simplified annotation.
Visualization of Patterns
- Annotation is not everything, but it allows us to use programming solutions to quickly visualize the patterns in the data.
- One very straightforward case is the visualization of general rhyme schemes along with the rhyme words inside a stanza.
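As a rough illustration, a rhyme scheme can be derived from the annotated rhyme labels of a stanza in a few lines. The function name `rhyme_scheme`, the use of `None` for lines without a rhyme word, and the sample stanza are assumptions of this sketch.

```python
from string import ascii_lowercase


def rhyme_scheme(final_labels):
    """Map the line-final rhyme labels of one stanza to a scheme like 'abab'.

    Labels are arbitrary identifiers; None marks a line without a rhyme
    word and is rendered as '-'. The sketch supports at most 26 distinct
    rhymes per stanza.
    """
    letters = {}
    scheme = []
    for label in final_labels:
        if label is None:
            scheme.append("-")
            continue
        if label not in letters:
            # Assign letters in order of first appearance.
            letters[label] = ascii_lowercase[len(letters)]
        scheme.append(letters[label])
    return "".join(scheme)


# Invented stanza: (line-final word, rhyme label)
stanza = [("night", 7), ("day", 3), ("light", 7), ("way", 3), ("stone", None)]
scheme = rhyme_scheme([label for _, label in stanza])
for (word, _), letter in zip(stanza, scheme):
    print(letter, word)
```

Printing the scheme letter next to each rhyme word gives a plain-text version of the kind of display described above.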
Example 1: Song «Leto» and its rhyme pattern structure
Example 2: Poem «Zwielicht» (Eichendorff) and its rhyme pattern structure
Example 3: Dylan's «I want you»
Example 4: «Yuèliàng dàibiǎo wǒ de xīn»
Example 5: Silvio Rodriguez «Te doy una canción»
Two more examples on visualization techniques
Analysis of Patterns
- We can not only visualize patterns conveniently in our framework, but also analyze them in multiple ways (many of which we have not yet developed or even thought of).
- A straightforward analysis is the comparison of alternative rhyme judgments by different scholars.
- Simple statistics on a given collection, such as the number of stanzas or the number of rhyme words, are equally straightforward.
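Simple statistics of this kind can be sketched as follows, assuming that each annotated rhyme word is available as a tuple of poem, stanza, word, and rhyme label. This row layout, the function name `collection_stats`, and the data are invented for illustration.

```python
from collections import Counter

# Invented rows: (poem, stanza, rhyme word, rhyme label).
rows = [
    ("P1", 1, "night", "a"), ("P1", 1, "light", "a"),
    ("P1", 1, "day", "b"), ("P1", 1, "way", "b"),
    ("P1", 2, "stone", "c"), ("P1", 2, "alone", "c"),
    ("P2", 1, "sea", "a"), ("P2", 1, "free", "a"),
]


def collection_stats(rows):
    """Count poems, stanzas, rhyme words, and rhyme groups in a collection."""
    poems = {poem for poem, _, _, _ in rows}
    stanzas = {(poem, stanza) for poem, stanza, _, _ in rows}
    # A rhyme group is the set of words sharing a label within one stanza.
    groups = Counter((poem, stanza, label) for poem, stanza, _, label in rows)
    return {
        "poems": len(poems),
        "stanzas": len(stanzas),
        "rhyme_words": len(rows),
        "rhyme_groups": len(groups),
        "largest_group": max(groups.values()) if groups else 0,
    }


stats = collection_stats(rows)
for key, value in stats.items():
    print(key, value)
```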
Analysis of Patterns: Baxter vs. Wang
- Both authors (Baxter 1992 and Wáng 1980) describe the same data but analyze it independently.
- So far, we do not know how similar or different the rhyme judgments of different scholars are.
- But we would like to know, as the judgments have a huge impact on the reconstruction.
- Of the 1070 stanzas common to both, 175 differ between Wáng and Baxter, which amounts to 15.9%.
- Applying enhanced measures that also assess partial similarity between stanzas and general trends, we find 97% similarity between Baxter's and Wáng's rhyme judgments.
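The contrast between counting stanzas that differ at all and measuring partial similarity can be illustrated with a small sketch. The partial measure used here is plain pairwise agreement (the Rand index) over the rhyme words of a stanza; it is a stand-in chosen for illustration and not necessarily the enhanced measure behind the figures above. The judgments are invented toy data.

```python
from itertools import combinations


def same_partition(a, b):
    """True if two label sequences group the rhyme words identically."""
    return all((a[i] == a[j]) == (b[i] == b[j])
               for i, j in combinations(range(len(a)), 2))


def pair_agreement(a, b):
    """Rand index: share of word pairs on which both analyses agree."""
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 1.0
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Invented judgments of two scholars for the same three stanzas;
# label names are arbitrary, only the grouping matters.
scholar_1 = [["a", "a", "b", "b"], ["a", "b", "a", "b"], ["a", "a", "a", "a"]]
scholar_2 = [["x", "x", "y", "y"], ["a", "b", "a", "c"], ["a", "a", "a", "a"]]

pairs = list(zip(scholar_1, scholar_2))
exact = sum(same_partition(a, b) for a, b in pairs) / len(pairs)
partial = sum(pair_agreement(a, b) for a, b in pairs) / len(pairs)
print("identical stanzas: {0:.2f}".format(exact))
print("mean pair agreement: {0:.2f}".format(partial))
```

In the toy data, one of three stanzas differs, yet the two analyses disagree on only one word pair in that stanza, so the partial score stays high. This is the same kind of gap as between the two figures reported above.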
[Figure: side-by-side panels for Wáng (1980) and Baxter (1992)]
Summary
- Annotation can unleash very powerful forces in scientific research, and its importance is far too often neglected.
- Rhyme annotation offers many possibilities, for Digital Chinese Linguistics specifically and for Digital Linguistics in general, for analyses that have not been carried out so far and that could help to address questions that have not yet been investigated.
- In the future, we need to work on increasing the number of examples, in order to provide more illustrations for the usefulness of our framework, and for annotation in general.
Literature
- List, J.-M., N. Hill, and C. Forster (2018): Towards a standardized annotation of rhyme judgments in Chinese historical phonology (and beyond). [Draft article under review]