…is not as clear as I had thought.
According to the standard reconstruction of Proto-Indo-European (PIE), the language had three series of stops. One of the series is thought to have consisted of voiceless unaspirated stops: *p, *t, *ḱ, *k, and *kʷ. Another is thought to have consisted of voiced unaspirated stops: *b, *d, *ǵ, *g and *gʷ. And the other is thought to have consisted of voiced aspirated stops: *bʰ, *dʰ, *ǵʰ, *gʰ and *gʷʰ. These series were preserved in this form in Sanskrit, although Sanskrit also innovated a fourth series of voiceless aspirated stops out of clusters consisting of voiceless stops followed by laryngeals. In Proto-Germanic, however, the situation is different. The PIE voiceless unaspirated stops have become voiceless fricatives; cf. Proto-Germanic *þū (> English thou) and Sanskrit tvám ‘you (singular)’. The PIE voiced unaspirated stops have become voiceless unaspirated stops; cf. Proto-Germanic *twō (> English two) and Sanskrit dvā́ ‘two’. And the PIE voiced aspirated stops have become voiced unaspirated stops; cf. Proto-Germanic *meduz (> English mead) and Sanskrit mádhu ‘honey’.
I had always assumed that the change went something like this. First, the voiceless unaspirated stops fricativised, retaining their lack of voice and aspiration and becoming voiceless fricatives. Changes of stops into fricatives are common and unremarkable; phonologists disagree on whether this is due to a natural tendency towards lenition (weakening) or due to assimilation to neighbouring phonemes which are more sonorous, but there is no dispute that such a change can be phonetically motivated. The change can be written formally in terms of distinctive features as follows.
[-continuant, -voice] > [+continuant]
Second, the voiced unaspirated stops devoiced, retaining their lack of frication and aspiration and becoming voiceless unaspirated stops. This change would be unusual if it occurred on its own. However, the previous change had left the language with no voiceless unaspirated stops, only voiced unaspirated stops and voiced aspirated stops. The [±voice] feature which had been used to distinguish the three original series was now redundant. For obstruents the unmarked value of this feature is [-voice] (voicelessness); that is, obstruents tend to be voiceless unless something forces them to be voiced. Therefore, it was natural for the voiced unaspirated stops to be devoiced. The change can be written formally in terms of distinctive features as follows.
[-continuant, +voice, -spread glottis] > [-voice]
Third, the voiced aspirated stops deaspirated, retaining their voicing and lack of frication and becoming voiced unaspirated stops. This change was made more likely by the fact that the previous change left the language with two series of stops, one of which was voiceless unaspirated and one of which was voiced aspirated; the two features [±voice] and [±spread glottis] were therefore redundant against each other. [-spread glottis] is the unmarked value of the [±spread glottis] feature on stops, so it was natural to resolve this by deaspirating the voiced aspirated stops (although devoicing the voiced aspirated stops would have worked just as well). The change can be written formally in terms of distinctive features as follows.
[-continuant, +spread glottis] > [-spread glottis]
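The three rules above can be run mechanically on the three PIE series. What follows is only a minimal sketch: the series labels `T`, `D` and `Dh`, the rule names and the helper functions are all names I have made up for illustration, and each segment is reduced to just the three features used in the rules.

```python
from typing import Dict, List, Tuple

Features = Dict[str, bool]
# A rule is (features that must match, features to overwrite).
Rule = Tuple[Features, Features]

# The three PIE stop series; "T", "D", "Dh" are shorthand labels for this sketch.
PIE_SERIES: Dict[str, Features] = {
    "T": {"continuant": False, "voice": False, "spread_glottis": False},
    "D": {"continuant": False, "voice": True, "spread_glottis": False},
    "Dh": {"continuant": False, "voice": True, "spread_glottis": True},
}

# [-continuant, -voice] > [+continuant]
FRICATIVISE: Rule = ({"continuant": False, "voice": False}, {"continuant": True})
# [-continuant, +voice, -spread glottis] > [-voice]
DEVOICE: Rule = (
    {"continuant": False, "voice": True, "spread_glottis": False},
    {"voice": False},
)
# [-continuant, +spread glottis] > [-spread glottis]
DEASPIRATE: Rule = (
    {"continuant": False, "spread_glottis": True},
    {"spread_glottis": False},
)


def apply_rule(segment: Features, rule: Rule) -> Features:
    """Return a copy of the segment, changed only if it matches the rule's condition."""
    condition, change = rule
    if all(segment[name] == value for name, value in condition.items()):
        return {**segment, **change}
    return dict(segment)


def apply_rules(series: Dict[str, Features], rules: List[Rule]) -> Dict[str, Features]:
    """Apply each rule in turn to every series, in the order given."""
    result = {label: dict(segment) for label, segment in series.items()}
    for rule in rules:
        result = {label: apply_rule(segment, rule) for label, segment in result.items()}
    return result


def describe(segment: Features) -> str:
    """Spell out a feature bundle in words."""
    voicing = "voiced" if segment["voice"] else "voiceless"
    aspiration = "aspirated" if segment["spread_glottis"] else "unaspirated"
    manner = "fricative" if segment["continuant"] else "stop"
    return f"{voicing} {aspiration} {manner}"


if __name__ == "__main__":
    germanic = apply_rules(PIE_SERIES, [FRICATIVISE, DEVOICE, DEASPIRATE])
    for label, segment in germanic.items():
        print(label, "->", describe(segment))
```

One thing the sketch makes visible is that the order matters: if `DEVOICE` is applied before `FRICATIVISE`, the `T` and `D` series end up with identical features, i.e. they merge.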
But there are two questions I have about this account.
1. Why did the second change involve the voiced unaspirated stops devoicing, rather than the voiced aspirated ones? The redundancy of the [±voice] feature could have been resolved either way. In fact, why didn’t both kinds of stop devoice? Since [-voice] is unmarked for obstruents there is nothing stopping this from happening.
2. Why did the third change involve the voiced aspirated stops deaspirating rather than devoicing? Since the [±voice] and [±spread glottis] features were redundant against each other devoicing would have worked just as well as a means of resolving the redundancy.
Now, sound change is not a deterministic process, so perhaps the answers to these questions are just that out of all of the different ways the redundancies in question could be resolved, these were the ways that were chosen, more or less at random. I am satisfied with this as the answer to question 2. In fact, with respect to question 2 it seems like deaspiration would be a more likely occurrence than devoicing because it is much more common for languages to distinguish stops using the [±voice] feature than it is for them to distinguish stops using the [±spread glottis] feature; contrasts of voice are therefore probably favoured over contrasts of aspiration (although this is only a tendency, and there are plenty of languages like Mandarin Chinese where [±spread glottis] is distinctive but [±voice] is not).
But I am less satisfied with this as an answer to question 1. As I mentioned above, the redundancy of the [±voice] feature could have been solved in three different ways:
1. devoicing of the voiced unaspirated stops, resulting in a contrast between voiceless unaspirated stops and voiced aspirated stops.
2. devoicing of the voiced aspirated stops, resulting in a contrast between voiced unaspirated stops and voiceless aspirated stops.
3. devoicing of both kinds of stop, resulting in a contrast between voiceless unaspirated stops and voiceless aspirated stops.
There are languages with a contrast between voiced unaspirated stops and voiceless aspirated stops, as would result from option 2. English is such a language. There are also languages with a contrast between voiceless unaspirated stops and voiceless aspirated stops, as would result from option 3. Mandarin Chinese is such a language. But I know of no language which has a contrast between voiceless unaspirated stops and voiced aspirated stops, as would result from option 1. Yet option 1 seems to have been the option that was taken. This is odd.
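The three options can also be enumerated mechanically. This is just an illustrative sketch: the labels `D` and `Dh`, the `OPTIONS` table and the `resolve` function are hypothetical names of mine, and only the two features relevant at this stage are tracked.

```python
from typing import Dict, FrozenSet

# The two stop series left once the voiceless stops have fricativised.
# "D" and "Dh" are shorthand labels introduced for this sketch.
REMAINING: Dict[str, Dict[str, bool]] = {
    "D": {"voice": True, "spread_glottis": False},
    "Dh": {"voice": True, "spread_glottis": True},
}

# Which series devoice under each of the three options listed above.
OPTIONS: Dict[int, FrozenSet[str]] = {
    1: frozenset({"D"}),
    2: frozenset({"Dh"}),
    3: frozenset({"D", "Dh"}),
}


def describe(stop: Dict[str, bool]) -> str:
    """Spell out a stop's voicing and aspiration in words."""
    voicing = "voiced" if stop["voice"] else "voiceless"
    aspiration = "aspirated" if stop["spread_glottis"] else "unaspirated"
    return f"{voicing} {aspiration}"


def resolve(option: int) -> Dict[str, str]:
    """Return the contrast that results from devoicing the series the option names."""
    outcome = {}
    for label, stop in REMAINING.items():
        changed = dict(stop)
        if label in OPTIONS[option]:
            changed["voice"] = False
        outcome[label] = describe(changed)
    return outcome


if __name__ == "__main__":
    for option in sorted(OPTIONS):
        print(option, resolve(option))
```

Only `resolve(1)` yields the pairing of voiceless unaspirated with voiced aspirated stops, which is the contrast that seems to be unattested.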
I think there are phonetic reasons why we would expect options 2 or 3 to be favoured over option 1. If you examine the articulatory mechanisms which are used to produce voiced aspirated stops, you can see them as half-voiced stops, closer to voiceless stops than voiced unaspirated stops (but still voiced). If you think about voiced aspirated stops in this way, option 1 is weird, because it involves change of the voiced unaspirated (i.e. fully voiced) stops directly into voiceless unaspirated stops without passing through the intermediate stage where they would be voiced aspirated (i.e. half-voiced) and end up merging with the voiced aspirated stops. If the characterisation of voiced aspirated stops as half-voiced already makes sense to you, you can skip the next few paragraphs, because I’m now going to try and explain why this is an accurate characterisation.
The first thing that I want to explain is what voiced aspirated stops are. In terms of distinctive features, they are parallel to voiceless aspirated stops. Voiced aspirated stops are [+voice] and [+spread glottis], voiceless aspirated stops are [-voice] and [+spread glottis]. But the meaning of [+spread glottis] is different in the two cases. As a feature of voiceless stops, [+spread glottis] corresponds to increased duration of the period during which the vocal folds are prevented from vibrating (normally by keeping the vocal folds apart from each other, hence the name of the feature, although reducing the airflow is also an option). The interval between the release of a stop and the beginning of vocal fold vibration in order to voice the following voiced phoneme is called the voice onset time (VOT). For voiceless unaspirated stops, the VOT is close to 0, while for voiceless aspirated stops the VOT is larger, so that there is an audible period after the stop has been released where air flows through the glottis but the vocal folds do not vibrate. This results in a sound being produced during this period which is in fact exactly [h], the voiceless glottal continuant (although speakers of languages which have aspirated stops don’t usually perceive the [h] as a separate sound, instead perceiving it as part of the preceding stop).
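To make the definition concrete, here is a small sketch that computes VOT from two time stamps and uses it to separate the two kinds of voiceless stop. The function names are mine, and the 30 ms cut-off is purely an assumption for illustration: the text only says "close to 0" versus "larger", and the actual boundary varies between languages and speakers.

```python
# Hypothetical cut-off between "close to 0" and "larger" VOT, in seconds.
# This value is an assumption made for illustration, not a measured boundary.
ASPIRATION_THRESHOLD_S = 0.030


def voice_onset_time(release_s: float, voicing_onset_s: float) -> float:
    """VOT: time from the release of the stop to the start of vocal fold vibration."""
    return voicing_onset_s - release_s


def classify_voiceless_stop(vot_s: float, threshold_s: float = ASPIRATION_THRESHOLD_S) -> str:
    """Label a voiceless stop as aspirated or unaspirated from its VOT."""
    if vot_s < 0:
        # Voicing began before the release, so this is not a voiceless stop at all.
        raise ValueError("negative VOT: the stop is already voiced at release")
    if vot_s >= threshold_s:
        return "voiceless aspirated"
    return "voiceless unaspirated"


if __name__ == "__main__":
    # Release at 100 ms, voicing starts at 170 ms: a long lag.
    print(classify_voiceless_stop(voice_onset_time(0.100, 0.170)))
    # Release at 100 ms, voicing starts at 105 ms: a short lag.
    print(classify_voiceless_stop(voice_onset_time(0.100, 0.105)))
```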
During the production of voiced stops, the vocal folds are already vibrating (that’s what it means for a stop to be voiced). So it is impossible for voiced stops to be aspirated if aspiration is defined as having a positive VOT. Instead, [+spread glottis] as a feature of voiced stops corresponds to the vocal folds being held further apart than is normal for voiced stops, roughly speaking. The vocal folds are still close enough that they vibrate during the production of voiced aspirated stops, so such stops are not completely voiceless, but they are closer to voiceless than voiced unaspirated stops. The kind of voice that accompanies voiced aspirated stops is called breathy voice, as opposed to the modal voice that accompanies voiced unaspirated stops. It might help to look at the following diagram, which illustrates the relationship between the degree of closure of the glottis and different kinds of voicing. The diagram is adapted from Gordon & Ladefoged (2001).
(I should note that talking about the degree of closure of the glottis as if this was a scalar variable is an oversimplification. When the vocal folds vibrate, what happens is that the glottis alternates between a state where it is more or less fully open (as when a voiceless sound is being produced) and a state where it is more or less fully closed (as when a glottal stop is being produced). Closure occurs due to tension from the laryngeal muscles and opening occurs due to pressure from the flow of air through the trachea; closure results in buildup of air below the glottis, resulting in increased pressure, while opening allows air to flow through a greater area, resulting in decreased pressure, and this is why the alternation occurs. For a given rate of flow of air, there is a maximal tension above which opening cannot occur and a minimal tension below which closure cannot occur, and in between these two extremes there is an optimal tension which results in maximal vibration; this tension is approached during the production of modally-voiced sounds. If the tension is below the optimal tension but above the minimal tension, the result is a breathy-voiced sound. If the tension is above the optimal tension but below the maximal tension, the result is a creaky-voiced sound. Alternatively, creaky-voiced sounds can be produced by having the glottis completely closed at one end, with modal voice at the other end, and breathy-voiced sounds can be produced by having the glottis open so that the vocal folds do not vibrate at one end, with modal voice at the other end. But regardless of how these sounds are produced, they sound the same, so the distinction is not important. Either way, it is still accurate to say that breathy-voiced sounds are in a position between voiceless sounds and modally-voiced sounds.)
It would be helpful to see how voiced aspirated stops behave with respect to sound change in attested languages. Unfortunately, voiced aspirated stops are rare, which limits the number of available examples. As far as I know, voiced aspirated stops are mainly found in the Indo-Aryan languages of South Asia and the Nguni languages of South Africa. In the Indo-Aryan languages the voiced aspirated stops have been inherited from PIE, or at least Vedic Sanskrit (depending on what you believe about the nature of the PIE stops), and most of them seem to have preserved them unchanged. Sinhala and Kashmiri have no voiced aspirated stops, but I don’t know and can’t find any information on what happened to them in these languages. So it seems that the voiced aspirated stops have been stable in these languages. That suggests the rarity of voiced aspirated stops is probably more due to the infrequency of sound changes that would make them phonemic rather than inherent instability. However, the mutual influence of these languages upon each other within the South Asian linguistic area might have helped preserve the voiced aspirated stops; the fact that the two most peripheral Indo-Aryan languages do not have them is perhaps suggestive that this has been the case. What about the Nguni languages? These are a tight-knit group, probably having a common origin within the last millennium, and their closest relatives such as Tswana have no voiced aspirated stops. So their voiced aspirated stops are of more recent vintage. Interestingly, Traill, Khumalo & Fridjhon (1987) have found that the Zulu voiced aspirated stops are actually voiceless, with the breathy voice occurring after the release on the following vowel. This seems like it could be the first step on a change of voiced aspirated stops into voiceless aspirated stops. But I don’t think any of this evidence is of much use in making the case that Grimm’s Law is weird.
My case primarily rests on the idea that voiced aspirated stops are intermediate between voiceless and modally-voiced stops on the basis of how they are produced.
If the changes as described above are odd, maybe we should consider the possibility that the changes described by Grimm’s Law were of a different nature.
Perhaps a minor amendment can solve the problem. It is universally agreed that the Proto-Germanic voiced stops had voiced fricative allophones. It is not totally clear which environments the stops occurred in and which environments the fricatives occurred in, but they were all definitely stops after nasals and when geminate, and fricatives after vowels and diphthongs. There are three different ways this situation might have come to be.
1. The PIE voiced aspirated stops might have turned into voiced unaspirated stops first and then acquired fricative allophones in certain environments.
2. The PIE voiced aspirated stops might have turned into voiced unaspirated fricatives first and then acquired stop allophones in certain environments.
3. The PIE voiced aspirated stops might have turned into voiced unaspirated fricatives in certain environments and voiced unaspirated stops in others.
If we suppose that number 2 is the accurate description of what happened, then it is possible that the fricativisation of the PIE voiced aspirated stops occurred before the devoicing of the PIE voiced unaspirated stops. This devoicing would then be perfectly natural because the PIE voiced unaspirated stops would be the only stops remaining in the language, so the marked [+voice] feature would be dropped from them. The voiced aspirated stops would probably have become voiced aspirated fricatives (i.e. breathy-voiced fricatives) initially and then these fricatives would have become modally-voiced since there would be no need for them to contrast with modally-voiced fricatives. Is it plausible that the voiceless unaspirated and voiced aspirated stops would have fricativised, but not the voiced unaspirated stops? What do these two kinds of stop have in common that the third kind lacks? If we think of voiced aspirated stops as half-voiced stops, we can describe the change as affecting all of the stops which were not fully voiced. The change is especially plausible, however, if we suppose that the PIE voiceless unaspirated stops had become aspirated before the changes described by Grimm’s Law took place. In that case, the change would affect the aspirated stops and not affect the unaspirated stops. Fricativisation of aspirated stops but not unaspirated stops is a very well-attested sound change; it happened in Greek, for example. The sequence of changes would be as follows:
[-continuant, -voice] > [+spread glottis]
[-continuant, +spread glottis] > [+continuant]
[-continuant, +voice] > [-voice]
[+continuant, +voice] > [-continuant]
(The last change would have occurred only in some environments; there are also conditioned exceptions to some of the other changes.)
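The revised sequence can be checked in the same mechanical way. This is a rough sketch under several assumptions: the labels `T`, `D`, `Dh` and all function and rule names are mine; the `MODALISE` step is not one of the four formal rules but my rendering of the remark that the breathy-voiced fricatives would have become modally voiced; and the conditioning environment of the last change is reduced to a single boolean.

```python
from typing import Dict, List, Tuple

Features = Dict[str, bool]
Rule = Tuple[Features, Features]  # (features that must match, features to overwrite)

# "T", "D", "Dh" are shorthand labels for the three PIE series (an assumption of this sketch).
PIE_SERIES: Dict[str, Features] = {
    "T": {"continuant": False, "voice": False, "spread_glottis": False},
    "D": {"continuant": False, "voice": True, "spread_glottis": False},
    "Dh": {"continuant": False, "voice": True, "spread_glottis": True},
}

ASPIRATE: Rule = ({"continuant": False, "voice": False}, {"spread_glottis": True})
FRICATIVISE: Rule = ({"continuant": False, "spread_glottis": True}, {"continuant": True})
DEVOICE: Rule = ({"continuant": False, "voice": True}, {"voice": False})
# Not in the formal list: breathy-voiced fricatives become modally voiced (from the prose).
MODALISE: Rule = ({"continuant": True, "voice": True}, {"spread_glottis": False})
# Applies only in some environments (e.g. after nasals).
HARDEN: Rule = ({"continuant": True, "voice": True}, {"continuant": False})


def apply_rules(series: Dict[str, Features], rules: List[Rule]) -> Dict[str, Features]:
    """Apply each rule in order to every series that matches its condition."""
    result = {label: dict(segment) for label, segment in series.items()}
    for condition, change in rules:
        for segment in result.values():
            if all(segment[name] == value for name, value in condition.items()):
                segment.update(change)
    return result


def derive(stop_environment: bool) -> Dict[str, Features]:
    """Run the whole sequence; HARDEN applies only in stop-favouring environments."""
    rules = [ASPIRATE, FRICATIVISE, DEVOICE, MODALISE]
    if stop_environment:
        rules.append(HARDEN)
    return apply_rules(PIE_SERIES, rules)


def remaining_stops(series: Dict[str, Features]) -> List[str]:
    """Labels of the series that are still stops."""
    return [label for label, segment in series.items() if not segment["continuant"]]


if __name__ == "__main__":
    print(remaining_stops(apply_rules(PIE_SERIES, [ASPIRATE, FRICATIVISE])))
    print(derive(stop_environment=True))
    print(derive(stop_environment=False))
```

After the first two changes the only series still consisting of stops is `D`, which is the point of the reordering: at that stage devoicing simply drops a marked feature from the sole remaining stop series.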
Is there any other reason to think the PIE voiceless unaspirated stops might have become aspirated in Proto-Germanic before fricativising? Well, the reflexes of the Proto-Germanic voiceless stops are aspirated in the North Germanic languages and English, and have become affricates in some positions in German, which suggests that they were originally aspirated; the lack of aspiration in Dutch can probably be attributed to French influence. That suggests the Proto-Germanic voiceless stops were already aspirated. Of course, these voiceless stops are the reflexes of the PIE voiced unaspirated stops, not the PIE voiceless unaspirated stops. But perhaps the rule aspirating voiceless stops was persistent in Proto-Germanic, so that it applied to both the PIE voiceless unaspirated stops before they fricativised and the PIE voiced unaspirated stops after they were devoiced. The rule seems to have persisted into German, because German went through its own kind of replay of Grimm’s Law in which the Proto-Germanic voiceless stops became affricates or fricatives and the Proto-Germanic voiced stops were devoiced. This second consonant shift was never fully completed in most German dialects; in Standard German, for example, Proto-Germanic *b and *g were not devoiced in word-initial position. However, *d was devoiced (cf. English daughter, German Tochter) and modern Standard German /t/ is aspirated, so, for example, Tochter is pronounced [ˈtʰɔxtɐ].
I think this is a satisfactory solution to the problem. The idea that the PIE voiced aspirated stops became fricatives first is not a new one; in fact, it is probably the favoured scenario. But I have never seen it justified in this way, and Ringe (2006) suggests that the voiced aspirates changed into both stops and fricatives depending on the environment (number 3 above), which is incompatible with the scenario I have proposed here.
Finally, I think I should mention that all of this reasoning has been done on the assumption that PIE had voiceless unaspirated, voiced unaspirated and voiced aspirated stops. If you subscribe to an alternative hypothesis about the nature of the PIE stops, such as the glottalic theory, Grimm’s Law might have to be explained in a completely different way. But despite it not being as easy as it might appear at first glance, it does seem that the standard hypothesis is capable of explaining Grimm’s Law.
Whether it can explain Verner’s Law is another matter. I have always thought it a little odd that the voiceless fricatives were voiced after unaccented syllables but not after accented syllables. It is not obvious how accent and voice can affect each other. But I’ll discuss this, perhaps, in another post.
Gordon, M., & Ladefoged, P. (2001). Phonation types: a cross-linguistic overview. Journal of Phonetics, 29(4), 383-406.
Ringe, D. (2006). From Proto-Indo-European to Proto-Germanic: A Linguistic History of English: Volume I. Oxford University Press.
Traill, A., Khumalo, J. S., & Fridjhon, P. (1987). Depressing facts about Zulu. African Studies, 46(2), 255-274.