Vowel-initial and vowel-final roots in Proto-Indo-European

A remarkable feature of Proto-Indo-European (PIE) is the restrictiveness of the constraints on its root structure. It is generally agreed that all PIE roots were monosyllabic, containing a single underlying vowel. In fact, the vast majority of the roots are thought to have had the same underlying vowel, namely *e. (Some scholars reconstruct a small number of roots with underlying *a rather than *e; others do not, and reconstruct underlying *e in every PIE root.) It is also commonly supposed that every root had at least one consonant on either side of its vowel; in other words, that there were no roots which began or ended with a vowel (Fortson 2004: 71).

I have no dispute with the first of these constraints; though it is very unusual, it is not too difficult to understand in connection with the PIE ablaut system, and the Semitic languages are similar with their triconsonantal, vowel-less roots. However, I think the other constraint, the one against vowel-initial and vowel-final roots, is questionable. In order to talk about it with ease and clarity, it helps to have a name for it: I’m going to call it the trisegmental constraint, because it amounts to the constraint that every PIE root contains at least three segments: the vowel, a consonant before the vowel, and a consonant after the vowel.

The first thing that might make one suspicious of the trisegmental constraint is that it isn’t actually attested in any IE language, as far as I know. English has vowel-initial roots (e.g. ask) and vowel-final roots (e.g. fly); so do Latin, Greek and Sanskrit (cf. S. aj- ‘drive’, G. ἀγ- ‘lead’, L. ag- ‘do’, and L. dō-, G. δω-, S. dā-, all meaning ‘give’). And for much of the early history of IE studies, nobody suspected the constraint’s existence: the PIE roots meaning ‘drive’ and ‘give’ were reconstructed as *aǵ- and *dō-, respectively, with an initial vowel in the case of the former and a final vowel in the case of the latter.

It was only with the development of the laryngeal theory that the reconstruction of the trisegmental constraint became possible. The initial motivation for the laryngeal theory was to simplify the system of ablaut reconstructed for PIE. I won’t go into the motivation in detail here; it’s one of the most famous developments in IE studies so a lot of my readers are probably familiar with it already, and it’s not hard to find descriptions of it. The important thing to know, if you want to understand what I’m talking about here, is that the laryngeal theory posits the existence of three consonants in PIE which are called laryngeals and written *h1, *h2 and *h3, and that these laryngeals can be distinguished by their effects on adjacent vowels: *h2 turns adjacent underlying *e into *a and *h3 turns adjacent underlying *e into *o. In all of the IE languages other than the Anatolian languages (which are all extinct, and records of which were only discovered in the 20th century), the laryngeals are elided pretty much everywhere, and their presence is only discernible from their effects on adjacent segments. Note that as well as changing the quality of (“colouring”) underlying *e, they also lengthen preceding vowels. And between consonants, they are reflected as vowels, but as different vowels in different languages: in Greek *h1, *h2, *h3 become ε, α, ο respectively, in Sanskrit all three become i, and in the other languages all three generally become a.

So, the laryngeal theory allowed the old reconstructions *aǵ- and *dō- to be replaced by *h2éǵ- and *deh3- respectively, which conform to the trisegmental constraint. In fact every root reconstructed with an initial or final vowel by the 19th century IEists could be reconstructed with an initial or final laryngeal instead. Concrete support for some of these new reconstructions with laryngeals came from the discovery of the Anatolian languages, which preserved some of the laryngeals in some positions as consonants. For example, the PIE word for ‘sheep’ was reconstructed as *ówis on the basis of the correspondence between L. ovis, G. ὄϊς, S. áviḥ, but the discovery of the Cuneiform Luwian cognate ḫāwīs confirmed without a doubt that the root must have originally begun with a laryngeal (although it is still unclear whether that laryngeal was *h2, preceding *o, or *h3, preceding *e).

There are also indirect ways in which the presence of a laryngeal can be evidenced. Most obviously, if a root exhibits the irregular ablaut alternations in the early IE languages which the laryngeal theory was designed to explain, then it should be reconstructed with a laryngeal in order to regularize the ablaut alternation in PIE. In the case of *h2eǵ-, for example, there is an o-grade derivative of the root, *h2oǵmos ‘drive’ (n.), which can be reconstructed on the evidence of Greek ὄγμος ‘furrow’ (Ringe 2006: 14). This shows that the underlying vowel of the root must have been *e, because (given the laryngeal theory) the PIE ablaut system did not involve alternations of *a with *o, only alternations of *e, *ō or ∅ (that is, the absence of the segment) with *o. But this underlying *e is reflected as if it was *a in all the e-grade derivatives of *h2eǵ- attested in the early IE languages (e.g. in the 3sg. present active indicative forms S. ájati, G. ἄγει, L. agit). In order to account for this “colouring” we must reconstruct *h2 next to the *e. Similar considerations allow us to be reasonably sure that *deh3- also contained a laryngeal, because the e-grade root is reflected as if it had *ō (S. dádāti, G. δίδωσι) and the zero-grade root in *dh3tós ‘given’ exhibits the characteristic reflex of interconsonantal *h3 (S. -ditáḥ, G. δοτός, L. datus).
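The laryngeal effects invoked in these arguments (colouring of adjacent *e, lengthening of a preceding vowel, and vocalization between consonants) can be sketched as a toy rewrite procedure. This is only an illustration under simplifying assumptions: the function name `reflex`, the segment-list input and the plain spelling (g for *ǵ, no accent marks) are invented for the example, and no sound changes other than the laryngeal ones are modelled.

```python
LARYNGEALS = {"h1", "h2", "h3"}
VOWELS = set("aeiou")
COLOUR = {"h2": "a", "h3": "o"}  # *h1 leaves adjacent *e unchanged
LONG = {"a": "ā", "e": "ē", "i": "ī", "o": "ō", "u": "ū"}
# Reflexes of a laryngeal standing between two consonants, per branch.
VOCALIZED = {
    "greek": {"h1": "e", "h2": "a", "h3": "o"},
    "sanskrit": {"h1": "i", "h2": "i", "h3": "i"},
    "other": {"h1": "a", "h2": "a", "h3": "a"},
}


def reflex(segments, language="other"):
    """Apply the laryngeal effects to a list of PIE segments (toy model)."""
    segs = list(segments)

    # Step 1: colouring. Underlying *e next to *h2 or *h3 changes quality.
    for i, seg in enumerate(segs):
        if seg != "e":
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(segs) and segs[j] in COLOUR:
                segs[i] = COLOUR[segs[j]]
                break

    # Step 2: each laryngeal is vocalized, lengthens a preceding vowel,
    # or is simply lost.
    out = []
    for i, seg in enumerate(segs):
        if seg not in LARYNGEALS:
            out.append(seg)
            continue
        after_consonant = i > 0 and segs[i - 1] not in VOWELS
        before_consonant = i + 1 < len(segs) and segs[i + 1] not in VOWELS
        if after_consonant and before_consonant:
            out.append(VOCALIZED[language][seg])
        elif i > 0 and segs[i - 1] in VOWELS:
            out[-1] = LONG[out[-1]]
        # otherwise (e.g. word-initially before a vowel) it leaves no trace
    return "".join(out)


print(reflex(["h2", "e", "g"]))                      # ag
print(reflex(["d", "e", "h3"]))                      # dō
print(reflex(["d", "h3", "t", "o", "s"], "greek"))   # dotos
```

Under these assumptions the sketch turns *h2eǵ- into ag-, *deh3- into dō-, and *dh3tós into dotos, ditos or datos depending on the branch, which matches the laryngeal part of the reflexes cited above.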

But in many cases there does not seem to be any particular evidence for the reconstruction of the initial or final laryngeal other than the assumption that the trisegmental constraint existed. For example, *h1éḱwos ‘horse’ could just as well be reconstructed as *éḱwos, and indeed this is what Ringe (2006) does. Likewise, there is no positive evidence that the root *muH- of *muHs ‘mouse’ (cf. S. mūṣ, G. μῦς, L. mūs) contained a laryngeal: it could just as well be *mū-. Both of the roots *(h1)éḱ- and *muH/ū- are found, as far as I know, in these stems only, so there is no evidence for the existence of the laryngeal from ablaut. It is true that PIE has no roots that can be reconstructed as ending in a short vowel, and this could be seen as evidence for at least a constraint against vowel-final roots, because if all the apparent vowel-final roots actually had a vowel + laryngeal sequence, that would explain why the vowel appears to be long. But this is not the only possible explanation: there could just be a constraint against roots containing a light syllable. This seems like a very natural constraint. Although the circumstances aren’t exactly the same—because English roots appear without inflectional endings in most circumstances, while PIE roots mostly didn’t—the constraint is attested in English: short unreduced vowels like that of cat never appear in root-final (or word-final) position; only long vowels, diphthongs and schwa can appear in word-final position, and schwa does not appear in stressed syllables.
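To make the difference between the two candidate constraints concrete, here is a small sketch that checks a root shape against each of them. The function names and the representation of roots as lists of segments are assumptions made for illustration, and *u is treated as the root vowel of *muH- for simplicity.

```python
SHORT_VOWELS = set("aeiou")
LONG_VOWELS = set("āēīōū")
VOWELS = SHORT_VOWELS | LONG_VOWELS


def trisegmental(root):
    """True if the root has exactly one vowel with a segment on each side."""
    vowel_positions = [i for i, seg in enumerate(root) if seg in VOWELS]
    return len(vowel_positions) == 1 and 0 < vowel_positions[0] < len(root) - 1


def no_light_syllable(root):
    """True if the root does not end in a short vowel."""
    return root[-1] not in SHORT_VOWELS


candidates = {
    "*muH-": ["m", "u", "H"],
    "*mū-": ["m", "ū"],
    "*h1eḱ-": ["h1", "e", "ḱ"],
    "*eḱ-": ["e", "ḱ"],
    "*de- (hypothetical)": ["d", "e"],
}
for label, root in candidates.items():
    print(label, trisegmental(root), no_light_syllable(root))
```

On this toy formulation *mū- and *eḱ- violate the trisegmental constraint but satisfy the weaker constraint against light syllables; only a root ending in a short vowel, like the hypothetical *de-, would violate both.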

It could be argued that the trisegmental constraint simplifies the phonology of PIE, and therefore it should be assumed to exist pending the discovery of positive evidence that some root does begin or end with a vowel. It simplifies the phonology in the sense that it reduces the space of phonological forms which can conceivably be reconstructed. But I don’t think this is the sense of “simple” which we should be using to decide which hypotheses about PIE are better. I think a reconstructed language is simpler to the extent that it is synchronically not unusual, and that the existence of whatever features it has that are synchronically unusual can be justified by explanations of features in the daughter languages by natural linguistic changes (in other words, both synchronic unusualness and diachronic unusualness must be taken into account). The trisegmental constraint seems to me synchronically unusual, because I don’t know of any other languages that have something similar, although I have not made any systematic investigation. And as far as I know there are no features of the IE languages which the trisegmental constraint helps to explain.

(Perhaps a constraint against vowel-initial roots, at least, would be more natural if PIE had a phonemic glottal stop, because people, or at least English and German speakers, tend to insert subphonemic glottal stops before vowels immediately preceded by a pause. Again, I don’t know if there are any cross-linguistic studies which support this. The laryngeal *h1 is often conjectured to be a glottal stop, but it is also often conjectured to be a glottal fricative; I don’t know if there is any reason to favour either conjecture over the other.)

I think something like this disagreement over what notion of simplicity is most important in linguistic reconstruction underlies some of the other controversies in IE phonology. For example, take the question of whether PIE had phonemic *a and *ā: the “Leiden school” says it didn’t, accepting the conclusions of Lubotsky (1989); most other IEists say it did. The Leiden school reconstruction certainly reduces the space of phonological forms which can be reconstructed in PIE and therefore might be better from a falsifiability perspective. Kortlandt (2003) makes this point with respect to a different (but related) issue, the sound changes affecting initial laryngeals in Anatolian:

My reconstructions … are much more constrained [than the ones proposed by Melchert and Kimball] because I do not find evidence for more than four distinct sequences (three laryngeals before *-e- and neutralization before *-o-) whereas they start from 24 possibilities (zero and three laryngeals before three vowels *e, *a, *o which may be short or long, cf. Melchert 1994: 46f., Kimball 1999: 119f.). …

Any proponent of a scientific theory should indicate the type of evidence required for its refutation. While it is difficult to see how a theory which posits *H2 for Hittite h- and a dozen other possible reconstructions for Hittite a- can be refuted, it should be easy to produce counter-evidence for a theory which allows no more than four possibilities … The fact that no such counter-evidence has been forthcoming suggests that my theory is correct.
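The arithmetic behind Kortlandt’s “24 possibilities” versus “four distinct sequences” can be checked by enumerating the combinations he lists. This is just a counting sketch; the labels used for the onsets, vowels and lengths are my own shorthand, with `Ho` standing for a neutralized laryngeal before *o.

```python
from itertools import product

# Melchert/Kimball starting point, as Kortlandt describes it:
# zero or one of three laryngeals, before three vowels, short or long.
onsets = ["", "h1", "h2", "h3"]
vowels = ["e", "a", "o"]
lengths = ["short", "long"]
unconstrained = list(product(onsets, vowels, lengths))

# Kortlandt's own inventory: three laryngeals before *e,
# plus a single neutralized laryngeal before *o.
constrained = ["h1e", "h2e", "h3e", "Ho"]

print(len(unconstrained), len(constrained))  # 24 4
```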

Of course the problem with the Leiden school reconstruction is that for a language to lack phonemic low vowels is very unusual. Arapaho apparently lacks phonemic low vowels, but it’s the only attested example I’ve heard of. But … I don’t have any direct answer to Kortlandt’s concerns about non-falsifiability. My own and other linguists’ concerns about the unnaturalness of a lack of phonemic low vowels also seem valid, but I don’t know how to resolve these opposing concerns. So until I can figure out a solution to this methodological problem, I’m not going to be very sure about whether PIE had phonemic low vowels and, similarly, whether the trisegmental constraint existed.

References

Fortson, B., 2004. Indo-European language and culture: An introduction. Oxford University Press.

Kortlandt, F., 2003. Initial laryngeals in Anatolian. Orpheus 13-14 [Gs. Rikov] (2003-04), 9-12.

Lubotsky, A., 1989. Against a Proto-Indo-European phoneme *a. The New Sound of Indo-European. Essays in Phonological Reconstruction. Berlin–New York: Mouton de Gruyter, pp. 53–66.

Ringe, D., 2006. A Linguistic History of English: Volume I, From Proto-Indo-European to Proto-Germanic. Oxford University Press.


8 responses to “Vowel-initial and vowel-final roots in Proto-Indo-European”

  1. James Matisoff:

    While some Aslian languages have open major syllables (e.g. Semoq Beri tu ‘breast,’ cɔ ‘dog,’ ti ‘hand’), others do not permit open syllables in word-final position (Temiar, Jah Hut).

    Geoffrey Benjamin:

    My own published versions of the words here are tuh, cɔh, and teh or thih. I’m pretty sure that all Aslian languages have only consonants word-finally. [When in doubt, subject the word to copyfixation, whereupon the -h, -ʔ, or -k will be heard clearly in the middle of the word. By this test, as I remark later, some of Asmah’s Kentag Bong forms are true reduplications, not copyfixations.] The same applies throughout to word initials, which are also never vowels in Aslian.

    (source)

  2. The Aslian languages don’t allow words to start or end with vowels.

  3. Really engaging post as usual. I appreciate you providing a run-down of the laryngeal theory of PIE phonology, as I remembered reading about it but had forgotten the gist of it.

    Here are a couple of stray comments (possibly naive, and not directly relevant to the main point of your essay). First of all, while the trisegmental constraint seems very unusual, the feature of having only one vowel phoneme also seems very unusual. I believe that among languages spoken today, only a couple of Caucasian languages come close to it. For this reason, my first reaction to the laryngeal theory is to be very skeptical and to look for alternate explanations for vowel differences in IE languages. This could include the temporary explanation that vowel phonemes tend to be very volatile and for the time being there might not be any determinable law that governs it for the early history of IE families. However, I’m sure I vastly underestimate the amount of evidence out there for the laryngeal theory.

    Secondly, I don’t remember much of anything very specific about PIE grammar but do know that PIE is believed to have been highly inflected. This is not mentioned much in your post, but it gives me a vague idea of how the trisegmental constraint (at least the requirement of ending in a consonant) might not be so implausible as it seems at first glance. Presumably, not much earlier in the development of PIE, inflective suffixes were formed by particles being affixed to roots; in other words, it seems that most highly inflected languages were recently much more analytic. It seems possible to me that something akin to a trisegmental constraint might have come about as a result of particles becoming affixed to roots. More concretely speaking, maybe roots that ended in a vowel had that vowel elided in the presence of a suffix, and maybe this caused no problem because of a paucity of minimal pairs with roots that differed only in the ending vowels. I wouldn’t venture to seriously conjecture this, of course, without at least studying PIE grammar beforehand; for instance, as far as I remember, there are no PIE words that consist of the bare root with no ending, but I could be wrong.

    • It’s not quite accurate to say that PIE had only one vowel phoneme. It did have only one vowel phoneme (/e/) which could appear as the underlying vowel in a root. However, other vowel phonemes (at least /o/) appeared on the surface and contrasted with each other in similar or identical phonological environments, e.g. *dóm ‘house’ (voc. sg.) ~ *dém ‘house’ (loc. sg.). The conditioning for ablaut is morphological, not phonological. It’s even to some degree lexical: for example, as far as I know, there’s no systematic difference between root nouns that happen to have e-grade and accent on the root in the strong cases, zero-grade and accent on the ending in the weak cases (amphikinetics) and root nouns that happen to have o-grade and accent on the root in the strong cases, e-grade and accent on the root in the weak cases (acrostatics). Theoretically you could have an amphikinetic root noun such as *ten- ~ *tn- (nom. sg. *téns, gen. sg. *tnés) and an acrostatic root noun such as *tón- ~ *tén- (nom. sg. *tóns, gen. sg. *téns), whose nom. sg. forms would comprise a minimal pair in exactly the same morphological environment. You could probably even analyse the acrostatic nouns as having underlying *o, rather than *e, if you wanted. It’s by no means certain that the PIE vowel-alternation phenomena we lump together as “ablaut” all share the same origin (for example the ablaut of the thematic vowel has basically nothing in common with the ablaut of root vowels, other than that it involves an alternation between *e and *o), and in some cases it may not be the e-grade which we would most naturally identify as the underlying grade if we knew how the alternation actually developed.

      Also, /j/ and /w/, though normally analysed as consonants, did appear frequently as vowels [i] and [u] on the surface, and there was some analogical morphological conditioning which interfered with the phonologically-conditioned syllabification rules (the word-final sequences /-im/ and /-um/, appearing in the acc. sg. forms of i-stems and u-stems, were syllabified as [im] and [um], not as [jm̥] and [wm̥] as expected by the usual syllabification rules; adding the nasal present stem-forming infix to roots of the form *Ce(y/w)C- where C is an obstruent resulted in unaccented stems of the form *C(i/u)-n-C-, not *C(y/w)-n̥-C- as expected). There might even have been a near-minimal pair for /i/ and /j/: *néwios ‘new’ (cf. Sanskrit návyas, scanned as a trisyllable in the Rigveda, and Welsh newydd) and *ályos ‘other’ (cf. Welsh eil).

      Undoubtedly it’s true that most instances of PIE [i] and [u] can be derived from underlying /j/ and /w/ by phonological conditioning, and that’s probably unusual. It may actually be common for [j] and [w] to be allophones of /i/ and /u/, because sound changes that resyllabify high vowels as semivowels in more consonant-y positions, and vice versa, are very common (e.g. i, u > j, w / C_V). I think French can be analysed in this way, as lacking phonemic /j/ and /w/. But for the phonemes to pattern as consonants /j/ and /w/ rather than vowels /i/ and /u/, as they clearly do in PIE given that /i/ and /u/ do not appear as underlying vowels in roots, is probably unusual. However… it’s not quite as self-evidently unusual as if PIE actually didn’t use the high region of the vowel space contrastively at all.

      Moreover, I can think of a natural way in which it could have come about as a consequence of the development of the PIE ablaut system. Let’s say that pre-PIE had the very non-unusual three-vowel /i/, /u/, /a/ system, which could all be the underlying vowels of roots, as well as consonants /j/ and /w/. Here are three types of root structure possible in this pre-PIE: CaC-; CiC- and CuC-; CajC- and CawC-. Accent falls on the root when an ending without underlying accent is added, like nom. sg. *-as, but if the ending has underlying accent, like gen. sg. *-ás, this takes precedence. So we have nom. sg. CáCas, gen. sg. CaCás. Now a syncope rule elides /a/ in unaccented syllables. We get:

      • nom. sg. CáCs, gen. sg. CCás in the CaC- roots;
      • nom. sg. C(í/ú)Cs, gen. sg. C(i/u)Cás in the C(i/u)C- roots;
      • nom. sg. Cá(j/w)Cs, gen. sg. C(i/u)Cás in the Ca(j/w)C- roots.
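The derivation above can be sketched as a short procedure: place the accent, delete unaccented /a/, then let /j/ and /w/ surface as [i] and [u] when they are no longer next to a vowel. This is a minimal sketch under my own assumptions: the function name `derive`, the use of t and k as stand-ins for the consonants C, and the simple rule for syllabifying glides are illustrative choices, not part of any established formalization.

```python
VOWELS = {"a", "i", "u"}
ACCENTED = {"a": "á", "i": "í", "u": "ú"}
SYLLABIC = {"j": "i", "w": "u"}  # glides surface as vowels away from a vowel


def derive(root, ending, ending_accented):
    """Return the surface form of root + ending in the hypothetical pre-PIE."""
    segs = list(root + ending)

    # Accent: on the ending's vowel if the ending is underlyingly accented,
    # otherwise on the root vowel.
    if ending_accented:
        accent = len(root) + next(i for i, s in enumerate(ending) if s in VOWELS)
    else:
        accent = next(i for i, s in enumerate(root) if s in VOWELS)

    # Syncope: unaccented /a/ is deleted.
    kept = [(s, i == accent) for i, s in enumerate(segs)
            if s != "a" or i == accent]

    # Syllabification: a glide with no vowel on either side becomes a vowel.
    out = []
    for k, (s, accented) in enumerate(kept):
        after_vowel = k > 0 and kept[k - 1][0] in VOWELS
        before_vowel = k + 1 < len(kept) and kept[k + 1][0] in VOWELS
        if s in SYLLABIC and not (after_vowel or before_vowel):
            s = SYLLABIC[s]
        out.append(ACCENTED[s] if accented else s)
    return "".join(out)


for root in ["tak", "tik", "tuk", "tajk", "tawk"]:
    nom = derive(root, "as", ending_accented=False)
    gen = derive(root, "as", ending_accented=True)
    print(f"{root}-: nom. sg. {nom}, gen. sg. {gen}")
```

With these stand-in roots, the gen. sg. of tajk- and of tik- come out identical (both tikás), so the sketch also reproduces the merger of the two root types in forms with an accented ending.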

      In the CaC- and Ca(j/w)C- roots, we can already see the basic e ~ ∅ alternation attested in PIE, modulo a later a > e change. (It seems likely that something like this scenario is responsible for the PIE e ~ ∅ alternation, given the strong correlation of e-grade with accent and zero-grade with lack of accent; the tricky part is figuring out how the o-grade figures into the scenario.) The C(i/u)C- roots are still what we would identify as non-ablauting roots. It’s possible that some of them survived in this form (e.g. *bʰuh2- ‘become’, which Ringe says should be reconstructed as non-ablauting with invariant *-u-). But in the gen. sg. and other forms where the ending was accented, they would be indistinguishable from Ca(j/w)C- roots, and it seems plausible that most of them would have their nom. sg. forms analogically remodelled as Cá(j/w)Cs on the model of the Ca(j/w)C- roots (and likewise for other forms with the accent on the root). This would eliminate the vast majority of cases of underlying /i/ and /u/ in roots.

      I would imagine something similar (reanalyses and analogical changes in connection with the development of the ablaut system) has happened to bring about the absence of underlying *o in roots, although as I alluded to above it’s not even clear to me how far the absence of underlying *o in roots is a property of the language itself, and not just how we choose to analyse it.

      So yeah, these peculiarities of the PIE vowel system, while unusual, do seem to me to “fit in” with the ablaut system in a way that the trisegmental constraint or the lack of *a don’t, and that’s why they don’t bother me too much.

      I’d like to respond to some of the other points you raised in your comment, but I think I’ve spent long enough on this comment already, so maybe later 🙂

    • First of all, while the trisegmental constraint seems very unusual, the feature of having only one vowel phoneme also seems very unusual. I believe that among languages spoken today, only a couple of Caucasian languages come close to it.

      You’re thinking of NWC — NEC languages tend to have large vowel inventories, and SC languages are about average. The typical NWC vowel inventory is something like /aː ɜ ɨ/, but /aː/ is probably secondary, leaving a binary vowel contrast in the proto-language of ‘low’ vs. ‘high’. (IIRC NWC also has pervasive ablaut.) However, NWC doesn’t have syllabic resonants the way IE does, so a better comparison is Nuxalk, which can be analyzed as only having one vowel, /a/, since the phonetic vowels are [a i u], /j w/ are present in the consonantal inventory, and almost anything can be syllabic.

      Another language that can be analyzed as having only one vowel is Moloko, but you have to posit word-level ‘prosodies’. In Moloko, the vowels in a word all have to agree in frontness (front/central/back), and there’s a binary high/low opposition where the high vowel is epenthetic, leaving you with the one unpredictable vowel /a/ and the two word-level prosodies /ʲ ʷ/. The prosodies also affect consonants, but this can just as well be analyzed as affricates like /ts/ palatalizing around front vowels.

      Of course, none of this is particularly applicable to PIE, except for the NWC example: it seems that /aː/ developed from coalescence of the low vowel and a laryngeal.

    • For this reason, my first reaction to the laryngeal theory is to be very skeptical and to look for alternate explanations for vowel differences in IE languages. This could include the temporary explanation that vowel phonemes tend to be very volatile and for the time being there might not be any determinable law that governs it for the early history of IE families. However, I’m sure I vastly underestimate the amount of evidence out there for the laryngeal theory.

      One of the posts I’d like to do at some point is an explanation of the laryngeal theory and the evidence for it. Partly for myself as well as anybody reading my blog, because I still don’t have a very strong grasp of the issues involved. Most of what I’ve read about the laryngeal theory has been from modern textbooks which probably paint a somewhat anachronistic and simplified picture of the thought process behind it, as descriptions of theoretical advances in textbooks generally do; so I’d like to read some of the primary literature before making any post about it (e.g. Werner Winter’s Evidence for Laryngeals).

      Secondly, I don’t remember much of anything very specific about PIE grammar but do know that PIE is believed to have been highly inflected. This is not mentioned much in your post, but it gives me a vague idea of how the trisegmental constraint (at least the requirement of ending in a consonant) might not be so implausible as it seems at first glance. Presumably, not much earlier in the development of PIE, inflective suffixes were formed by particles being affixed to roots; in other words, it seems that most highly inflected languages were recently much more analytic. It seems possible to me that something akin to a trisegmental constraint might have come about as a result of particles becoming affixed to roots. More concretely speaking, maybe roots that ended in a vowel had that vowel elided in the presence of a suffix, and maybe this caused no problem because of a paucity of minimal pairs with roots that differed only in the ending vowels. I wouldn’t venture to seriously conjecture this, of course, without at least studying PIE grammar beforehand; for instance, as far as I remember, there are no PIE words that consist of the bare root with no ending, but I could be wrong.

      There are a few morphological environments in which PIE roots can appear without an ending: second-person singular imperatives and vocative and locative singulars. But it’s true that these are reasonably marginal environments, so I do think it’s likely that a constraint against vowel-final roots could have come about due to PIE’s suffixing morphology in something like the way you describe. I’d still maintain that there’s no very definite proof of this constraint having existed, so it’s probably still worth not reconstructing root-final laryngeals just for the sake of the constraint.

  4. David Marjanović

    But in many cases there does not seem to be any particular evidence for the reconstruction of the initial or final laryngeal other than the assumption that the trisegmental constraint existed. For example, *h₁éḱwos ‘horse’ could just as well be reconstructed as *éḱwos, and indeed this is what Ringe (2006) does.

    More recently, however, evidence for specifically *h₁ in that word has surfaced: the h- of Greek híppos. I warmly recommend reading the whole paper, which explains a whole lot.

    Likewise, there is no positive evidence that the root *muH- of *muHs ‘mouse’ (cf. S. mūṣ, G. μῦς, L. mūs) contained a laryngeal

    Indeed not. Two alternatives have been proposed: 1) the word is underlyingly *mus, which is too short for a stressed word (but found in the Latin diminutive musculus!), so the vowel was lengthened; 2) the root is *mus-, so that adding the animate nom. sg. *-s would result in **muss, which isn’t allowed because long consonants aren’t allowed, so Szemerényi’s law kicks in and moves the length to the vowel.

    Lengthening of stressed monosyllables explains a bunch of alternations, e.g. the 2nd sg. pronoun *tu ~ *tū (separate reflexes of both versions are found in Old English alone).
