That’s OK, but this’s not OK?

Here’s something peculiar I noticed the other day about the English language.

The word is (the third-person singular present indicative form of the verb be) can be ‘contracted’ with a preceding noun phrase, so that it is reduced to an enclitic form -‘s. This can happen after pretty much any noun phrase, no matter how syntactically complex:

(1) he’s here

/(h)iːz ˈhiːə/[1]

(2) everyone’s here

/ˈevriːwɒnz ˈhiːə/

(3) ten years ago’s a long time

/ˈtɛn ˈjiːəz əˈgəwz ə ˈlɒng ˈtajm/

However, one place where this contraction can’t happen is immediately after the proximal demonstrative this. This is strange, because it can certainly happen after the distal demonstrative that, and one wouldn’t expect these two very similar words to behave so differently:

(4) that’s funny
/ˈðats ˈfʊniː/

(5) *this’s funny

There is a complication here which I’ve kind of skirted over, though. Sure, this’s funny is unacceptable in writing. But what would it sound like, if it was said in speech? Well, the -’s enclitic form of is can actually be realized on the surface in a couple of different ways, depending on the phonological environment. You might already have noticed that it’s /-s/ in example (4), but /-z/ in examples (1)-(3). This allomorphy (variation in phonological form) is reminiscent of the allomorphy in the plural suffix: cats is /ˈkats/, dogs is /ˈdɒgz/, horses is /ˈhɔːsɪz/. In fact the distribution of the /-s/ and /-z/ realizations of -‘s is exactly the same as for the plural suffix: /-s/ appears after voiceless non-sibilant consonants and /-z/ appears after vowels and voiced non-sibilant consonants. The remaining environment, the environment after sibilants, is the environment in which the plural suffix appears as /-ɪz/. And this environment turns out to be exactly the same environment in which -’s is unacceptable in writing. Here are a couple more examples:

(6) *a good guess’s worth something (compare: the correct answer’s worth something)

(7) *The Clash’s my favourite band (compare: Pearl Jam’s my favourite band)
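The distribution just described can be put as a small function. This is only a sketch: the segment sets are a deliberately incomplete inventory, and the name `suffix_allomorph` and its `post_sibilant` parameter are made up for illustration, not part of any standard analysis.

```python
# Hypothetical, simplified segment classes (not a full inventory of English).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def suffix_allomorph(final_segment, post_sibilant="ɪz"):
    """Predict the surface form of the plural suffix after a stem
    ending in final_segment."""
    if final_segment in SIBILANTS:
        return post_sibilant
    if final_segment in VOICELESS_NON_SIBILANTS:
        return "s"
    # Vowels and voiced non-sibilant consonants.
    return "z"


for word, final in [("cat", "t"), ("dog", "g"), ("horse", "s"),
                    ("that", "t"), ("this", "s")]:
    print(word, "-" + suffix_allomorph(final))
```

If -’s really followed the plural suffix in every environment, the last line of output is what we would expect to hear for this’s.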

Now, if -‘s obeys the same rules as the plural suffix then we’d expect it to be realized as /-ɪz/ in this environment. However… this is exactly the same sequence of segments that the independent word is is realized as when it is unstressed. One might therefore suspect that in sentences like (8) below, the morpheme graphically represented as the independent word is is actually the enclitic -‘s; it just happens to be realized in the same way as the independent word is, and is therefore not distinguished from it in writing. (Or, perhaps it would be more elegant to say that the contrast between enclitic and independent word is neutralized in this environment.)

(8) The Clash is my favourite band

Well, this is (*this’s) a very neat explanation, and if you do a Google search for “this’s” that’s pretty much the explanation you’ll find given to the various other confused people who have gone to websites like English Stack Exchange to ask why this’s isn’t a word. Unfortunately, I think it can’t be right.

The problem is, there are some accents of English, including mine, which have /-əz/ rather than /-ɪz/ in the allomorph of the plural suffix that occurs after sibilants, while at the same time pronouncing unstressed is as /ɪz/ rather than /əz/. (There are minimal pairs, such as peace is upon us /ˈpiːsɪz əˈpɒn ʊz/ and pieces upon us /ˈpiːsəz əˈpɒn ʊz/.) If the enclitic form of is does occur in (8) then we’d expect it to be realized as /əz/ in these accents, just like the plural suffix would be in the same environment. This is not what happens, at least in my own accent: (8) can only have /ɪz/. Indeed, it can be distinguished from the minimally contrastive NP (9):

(9) The Clash as my favourite band

In fact this problem exists in more standard accents of English as well, because is is not the only word ending in /-z/ which can end a contraction. The third-person singular present indicative of the verb have, has, can also be contracted to -‘s, and it exhibits the expected allomorphy between voiceless and voiced realizations:

(10) it’s been a while /ɪts ˈbiːn ə ˈwajəl/

(11) somebody I used to know’s disappeared /ˈsʊmbɒdiː aj ˈjuːst tə ˈnəwz dɪsəˈpijəd/

But, like is, it does not contract after sibilants, at least in writing, although it may drop the initial /h-/ whenever it’s unstressed:

(12) this has gone on long enough /ˈðɪs (h)əz gɒn ɒn lɒng əˈnʊf/

I am not a native speaker of RP, so, correct me if I’m wrong. But I would be very surprised if any native speaker of RP would ever pronounce has as /ɪz/ in sentences like (12).

What’s going on? I actually do think the answer given above—that this’s isn’t written because it sounds exactly the same as this is—is more or less correct, but it needs elaboration. Such an answer can only be accepted if we in turn accept that the plural -s, the reduced -‘s form of is and the reduced -‘s form of has do not all exhibit the same allomorph in the environment after sibilants. The reduced form of is has the allomorph /-ɪz/ in all accents, except in those such as Australian English in which unstressed /ɪ/ merges with schwa. The reduced form of has has the allomorph /-əz/ in all accents. The plural suffix has the allomorph /-ɪz/ in some accents, but /-əz/ in others, including some in which /ɪ/ is not merged completely with schwa and in particular is not merged with schwa in the unstressed pronunciation of is.

Introductory textbooks on phonology written in the English language are very fond of talking about the allomorphy of the English plural suffix. In pretty much every treatment I’ve seen, it’s assumed that /-z/ is the underlying form, and /-s/ and /-əz/ are derived by phonological rules of voicing assimilation and epenthesis respectively, with the voicing assimilation crucially coming after the epenthesis (otherwise we’d have an additional allomorph /-əs/ after voiceless sibilants, while /-əz/ would only appear after voiced sibilants). This is the best analysis when the example is taken in isolation, because positing an epenthesis rule allows the phonological rules to be assumed to be productive across the entire lexicon of English. If, instead, /-əz/ were taken as the underlying form and a fully productive deletion rule were posited to derive the other allomorphs from it, then it would be impossible to account for the pronunciation of a word like Paulas (‘multiple people named Paula’) with /-əz/ on the surface, whose underlying form would be exactly the same, phonologically, as Pauls (‘multiple people named Paul’). (This example only works if your plural suffix post-sibilant allomorph is /-əz/ rather than /-ɪz/, but a similar example could probably be exhibited in the other case.) One could appeal to the differing placement of the morpheme boundary but this is unappealing.
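To see why the ordering in the textbook analysis matters, here is a rough sketch of the two rules applied in both orders. The segment sets, the function names and the toy stems are assumptions made for the illustration; the epenthesis rule is stated as "insert schwa between two sibilants", which is one way of reading the textbook rule.

```python
# Hypothetical, simplified segment classes.
SIBILANTS = {"s", "z", "ʃ", "ʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ"}


def epenthesis(segs):
    """Insert schwa between a stem-final sibilant and a sibilant suffix."""
    if len(segs) >= 2 and segs[-1] in SIBILANTS and segs[-2] in SIBILANTS:
        return segs[:-1] + ["ə", segs[-1]]
    return segs


def assimilation(segs):
    """Devoice the suffix /z/ immediately after a voiceless segment."""
    if len(segs) >= 2 and segs[-1] == "z" and segs[-2] in VOICELESS:
        return segs[:-1] + ["s"]
    return segs


def derive(stem, rules):
    """Attach underlying /-z/ to the stem and apply the rules in order."""
    segs = stem + ["z"]
    for rule in rules:
        segs = rule(segs)
    return "".join(segs)


stems = {"cat": ["k", "a", "t"], "dog": ["d", "ɒ", "g"], "horse": ["h", "ɔː", "s"]}
for gloss, stem in stems.items():
    textbook = derive(stem, [epenthesis, assimilation])
    reversed_order = derive(stem, [assimilation, epenthesis])
    print(gloss, textbook, reversed_order)
```

With epenthesis first, the schwa separates the suffix from the voiceless sibilant before assimilation gets a chance to apply; in the reversed order the suffix is devoiced first and horses comes out with the unattested /-əs/.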

However, the assumption that a single epenthesis rule operating between sibilants is productive across the entire English lexicon has to be given up, because ‘s < is and ‘s < has have different allomorphs after sibilants! Either they are accounted for by two different lexically-conditioned epenthesis rules (which is a very unappealing model) or the allomorphs with the vowels are actually the underlying ones, and the allomorphs without the vowels are produced by a not phonologically-conditioned but at least (sort of) morphologically-conditioned deletion rule that elides fully reduced unstressed vowels (/ə/, /ɪ/) before word-final obstruents. This rule only applies in inflectional suffixes (e.g. lettuce and orchid are immune), and even there it does not apply unconditionally because the superlative suffix -est is immune to it. But this doesn’t bother me too much. One can argue that the superlative is kind of a marginal inflectional category, when you put it in the company of the plural, the possessive and the past tense.

A nice thing about the synchronic rule I’m proposing here is that it’s more or less exactly the same as the diachronic rule that produced the whole situation in the first place. The Old English nom./acc. pl., gen. sg., and past endings were, respectively, -as, -es, -aþ and -ede. In Middle English final schwa was elided unconditionally in absolute word-final position, while in word-final unstressed syllables where it was followed by a single obstruent it was gradually eliminated by a process of lexical diffusion from inflectional suffix to inflectional suffix, although “a full coverage of the process in ME is still outstanding” (Minkova 2013: 231). Even the superlative suffix was reduced to /-st/ by many speakers for a time, but eventually the schwa-ful form of this suffix prevailed.

I don’t see this as a coincidence. My inclination, when it comes to phonology, is to see the historical phonology as essential for understanding the present-day phonology. Synchronic phonological alternations are for the most part caused by sound changes, and trying to understand them without reference to these old sound changes is… well, you may be able to make some progress but it seems like it’d be much easier to make progress more quickly by trying to understand the things that cause them—sound changes—at the same time. This is a pretty tentative paragraph, and I’m aware I’d need a lot more elaboration to make a convincing case for this stance. But this is where my inclination is headed.

[1] The transcription system is the one which I prefer to use for my own accent of English.


Minkova, D. 2013. A Historical Phonology of English. Edinburgh University Press.

A language with no word-initial consonants

I was having a look at some of the squibs in Linguistic Inquiry today, which are often fairly interesting (and have the redeeming quality that, when they’re not interesting, they’re at least short), and there was an especially interesting one in the April 1970 (second ever) issue by R. M. W. Dixon (Dixon 1970) which I’d like to write about for the benefit of those who can’t access it.

In Olgolo, a variety of Kunjen spoken on the Cape York Peninsula, there appears to have been a sound change that elided consonants in initial position. That is, not just consonants of a particular variety, but all consonants. As a result of this change, every word in the language begins with a vowel. Examples (transcriptions in IPA):

  • *báma ‘man’ > áb͡ma
  • *míɲa ‘animal’ > íɲa
  • *gúda ‘dog’ > úda
  • *gúman ‘thigh’ > úb͡man
  • *búŋa ‘sun’ > úg͡ŋa
  • *bíːɲa ‘aunt’ > íɲa
  • *gúyu ‘fish’ > úyu
  • *yúgu ‘tree, wood’ > úgu

(Being used to the conventions of Indo-Europeanists, I’m a little disturbed by the fact that Dixon doesn’t identify the linguistic proto-variety to which the proto-forms in these examples belong, nor does he cite cognates to back up his reconstruction. But I presume forms very similar to the proto-forms are found in nearby Paman languages. In fact, I know for a fact that the Uradhi word for ‘tree’ is /yúku/ because Black (1993) mentions it by way of illustrating the remarkable Uradhi phonological rule which inserts a phonetic [k] or [ŋ] after every vowel in utterance-final position. Utterance-final /yúku/ is by this means realized as [yúkuk] in Uradhi.)

(The pre-stopped nasals in some of these words [rather interesting segments in and of themselves, but fairly widely attested, see the Wikipedia article] have arisen due to a sound change occurring before the word-initial consonant elision sound change, which pre-stopped nasals immediately after word-initial syllables containing a stop or *w followed by a short vowel. This would have helped mitigate the loss of contrast resulting from the word-initial consonant elision sound change a little, but only a little, and between e.g. the words for ‘animal’ and ‘aunt’ homophony was not averted because ‘aunt’ had an originally long vowel [which was shortened in Olgolo by yet another sound change].)
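The three sound changes can be sketched as functions over lists of segments and run over the proto-forms listed above. Everything here is an illustrative assumption rather than Dixon's own formalization: the function names, the segment classes, the restriction of pre-stopping to the two nasals attested in the examples, and the placement of vowel shortening last (all that is needed is that it follows pre-stopping).

```python
import unicodedata

VOWELS = set("aiu")
STOPS_AND_W = {"b", "d", "g", "w"}           # hypothetical trigger class
PRESTOPPED = {"m": "b͡m", "ŋ": "g͡ŋ"}          # only the outcomes attested above


def is_vowel(seg):
    # Strip accents and length marks before checking the base letter.
    return unicodedata.normalize("NFD", seg)[0] in VOWELS


def pre_stop(word):
    """Pre-stop a nasal right after an initial stop/*w plus short vowel."""
    if (len(word) >= 3 and word[0] in STOPS_AND_W and is_vowel(word[1])
            and not word[1].endswith("ː") and word[2] in PRESTOPPED):
        return word[:2] + [PRESTOPPED[word[2]]] + word[3:]
    return word


def elide_initial(word):
    """Drop a word-initial consonant, whatever it is."""
    return word[1:] if word and not is_vowel(word[0]) else word


def shorten(word):
    """Shorten long vowels."""
    return [seg.replace("ː", "") for seg in word]


def derive(word):
    for change in (pre_stop, elide_initial, shorten):
        word = change(word)
    return word


proto = {
    "man": ["b", "á", "m", "a"],
    "animal": ["m", "í", "ɲ", "a"],
    "thigh": ["g", "ú", "m", "a", "n"],
    "sun": ["b", "ú", "ŋ", "a"],
    "aunt": ["b", "íː", "ɲ", "a"],
}
for gloss, word in proto.items():
    print(gloss, "".join(word), ">", "".join(derive(word)))
```

Running the pipeline shows ‘animal’ and ‘aunt’ falling together, as described above: neither form meets the conditions for pre-stopping, so nothing is left to keep them apart once the initial consonant and the vowel length are gone.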

Dixon says Olgolo is the only language he’s heard of in which there are no word-initial consonants, although it’s possible that more have been discovered since 1970. However, there is a caveat to this statement: there are monoconsonantal prefixes that can be optionally added to most nouns, so that they have an initial consonant on the surface. There are at least four of these prefixes, /n-/, /w-/, /y-/ and /ŋ-/; however, every noun seems to only take a single one of these prefixes, so we can regard these forms as lexically-conditioned allomorphs of a single prefix. The conditioning is in fact more precisely semantic: roughly, y- is added to nouns denoting fish, n- is added to nouns denoting other animals, and w- is added to nouns denoting various inanimates. The prefixes therefore identify ‘noun classes’ in a sense (although these are probably not noun classes in a strict sense because Dixon gives no indication that there are any agreement phenomena which involve them). The prefix ŋ- was only seen on one word, /ɔ́jɟɔba/ ~ /ŋɔ́jɟɔba/ ‘wild yam’, and might be added to all nouns denoting fruits and vegetables, given that most Australian languages with noun classes have a noun class for fruits and vegetables, but there were no other such nouns in the dataset (Dixon only noticed the semantic conditioning after he left the field, so he didn’t have a chance to elicit any others). It must be emphasized, however, that these prefixes are entirely optional, and every noun which can have a prefix added to it can also be pronounced without the prefix. In addition some nouns, those denoting kin and body parts, appear to never take a prefix, although possibly this is just a limitation of the dataset given that their taking a prefix would be expected to be optional in any case. And words other than nouns, such as verbs, don’t take these prefixes at all.

Dixon hypothesizes that the y- and n- prefixes are reduced forms of /úyu/ ‘fish’ and /íɲa/ ‘animal’ respectively, while w- may be from /úgu/ ‘tree, wood’ or just an “unmarked” initial consonant (it’s not clear what Dixon means by this). These derivations are not unquestionable (for example, how do we get from /-ɲ-/ to /n-/ in the ‘animal’ prefix?). But it’s very plausible that the prefixes do originate in this way, even if the exact antecedent words are difficult to identify, because similar origins have been identified for noun class prefixes in other Australian languages (Dixon 1968, as cited by Dixon 1970). Just intuitively, it’s easy to see how nouns might come to be ever more frequently replaced by compounds of the dependent original noun and a term denoting a superset; cf. English koala ~ koala bear, oak ~ oak tree, gem ~ gemstone. In English these compounds are head-final but in other languages (e.g. Welsh) they are often head-initial, and presumably this would have to be the case in pre-Olgolo in order for the head elements to grammaticalize into noun class prefixes. The fact that the noun class prefixes are optional certainly suggests that the system is very much incipient, and still developing, and therefore of recent origin.

It might therefore be very interesting to see how the Olgolo language has changed after a century or so; we might be able to examine a noun class system as it develops in real time, with all of our modern equipment and techniques available to record each stage. It would also be very interesting to see how quickly this supposedly anomalous state of every word beginning with a vowel (in at least one of its freely-variant forms) is eliminated, especially since work on Australian language phonology since 1970 has established many other surprising findings about Australian syllable structure, including a language where the “basic” syllable type appears to be VC rather than CV (Breen & Pensalfini 1999). Indeed, since Dixon wrote this paper 46 years ago Olgolo might have changed considerably already. Unfortunately, it might have changed in a somewhat more disappointing way. None of the citations of Dixon’s paper recorded by Google Scholar seem to examine Olgolo any further, and the documentation on Kunjen (the variety which includes Olgolo as a subvariety) recorded in the Australian Indigenous Languages Database isn’t particularly overwhelming. I can’t find a straight answer as to whether Kunjen is extinct today or not (never mind the Olgolo variety), but Dixon wasn’t optimistic about its future in 1970:

It would be instructive to study the development of Olgolo over the next few generations … Unfortunately, the language is at present spoken by only a handful of old people, and is bound to become extinct in the next decade or so.


Black, P. 1993 (post-print). Unusual syllable structure in the Kurtjar language of Australia. Retrieved from on 26 September 2016.

Breen, G. & Pensalfini, R. 1999. Arrernte: A Language with No Syllable Onsets. Linguistic Inquiry 30 (1): 1-25.

Dixon, R. M. W. 1968. Noun Classes. Lingua 21: 104-125.

Dixon, R. M. W. 1970. Olgolo Syllable Structure and What They Are Doing about It. Linguistic Inquiry 1 (2): 273-276.

The insecurity of relative chronologies

One of the things historical linguists do is reconstruct relative chronologies: statements about whether one change in a language occurred before another change in the language. For example, in the history of English there was a change which raised the Middle English (ME) mid back vowel /oː/, so that it became high /uː/: boot, pronounced /boːt/ in Middle English, is now pronounced /buːt/. There was also a change which caused ME /oː/ to be reflected as short /ʊ/ before /k/ (among other consonants), so that book is now pronounced as /bʊk/. There are two possible relative chronologies of these changes: either the first happens before the second, or the second happens before the first. Now, because English has been well-recorded in writing for centuries, because these written records of the language often contain phonetic spellings, and because they also sometimes communicate observations about the language’s phonetics, we can date these changes quite precisely. The first probably began in the thirteenth century and continued through the fourteenth, while the second took place in the seventeenth century (Minkova 2015: 253-4, 272). In this particular case, then, no linguistic reasoning is needed to infer the relative chronology. But much of if not most of the time in historical linguistics, we are not so lucky, and are dealing with the history of languages for which written records in the desired time period are much less extensive, or completely nonexistent. Relative chronologies can still be inferred under these circumstances; however, it is a methodologically trickier business. In this post, I want to point out some complications associated with inferring relative chronologies under these circumstances which I’m not sure historical linguists are always aware of.

Let’s begin by thinking again about the English example I gave above. If English was an unwritten language, could we still infer that the /oː/ > /uː/ change happened before the /oː/ > /ʊ/ change? (I’m stating these changes as correspondences between Middle English and Modern English sounds—obviously if /oː/ > /uː/ happened first then the second change would operate on /uː/ rather than /oː/.) A first answer might go something along these lines: if the /oː/ > /uː/ change in quality happens first, then the second change is /uː/ > /ʊ/, so it’s one of quantity only (long to short). On the other hand, if /oː/ > /ʊ/ happens first we have a shift of both quantity and quality at the same time, followed by a second shift of quality. The first scenario is simpler, and therefore more likely.

Admittedly, it’s only somewhat more likely than the other scenario. It’s not absolutely proven to be the correct one. Of course we never have truly absolute proofs of anything, but I think there’s a good order of magnitude or so of difference between the likelihood of /oː/ > /uː/ happening first, if we ignore the evidence of the written records and accept this argument, and the likelihood of /oː/ > /uː/ happening first once we consider the evidence of the written records.

But in fact we can’t even say it’s more likely, because the argument is flawed! The /uː/ > /ʊ/ change would involve some quality adjustment, because /ʊ/ is a little lower and more central than /uː/.[1] Now, in modern European languages, at least, it is very common for minor quality differences to exist between long and short vowels, and for lengthening and shortening changes to involve the expected minor shifts in quality as well (if you like, you can think of persistent rules existing along the lines of /u/ > /ʊ/ and /ʊː/ > /uː/, which are automatically applied after any lengthening or shortening rules to “adjust” their outputs). We might therefore say that this isn’t really a substantive quality shift; it’s just a minor adjustment concomitant with the quantity shift. But sometimes, these quality adjustments following lengthening and shortening changes go in the opposite direction from what might be expected based on etymology. For example, when /ʊ/ was affected by open syllable lengthening in Middle English, it became /oː/, not /uː/: OE wudu > ME wood /woːd/. This is not unexpected, because the quality difference between /uː/ and /ʊ/ is (or, more accurately, can be) such that /ʊ/ is about as close in quality to /oː/ as it is to /uː/. Given that /ʊ/ could lengthen into /oː/ in Middle English, it is hardly unbelievable that /oː/ could shorten into /ʊ/ as well.

I’m not trying to say that one should go the other way here, and conclude that /oː/ > /ʊ/ happened first. I’m just trying to argue that without the evidence of the written records, no relative chronological inference can be made here—not even an insecure-but-best-guess kind of relative chronological inference. To me this is surprising and somewhat disturbing, because when I first started thinking about it I was convinced that there were good intrinsic linguistic reasons for taking the /oː/ > /uː/-first scenario as the correct one. And this is something that happens with a lot of relative chronologies, once I start thinking about them properly.
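The indeterminacy is easy to see if both chronologies are written out and run over the two example words. This is a toy sketch: the function names are invented, the shortening environment is reduced to "before /k/" (leaving out the other consonants), and forms are plain strings in the transcription used above.

```python
def raising_first(form):
    """Chronology A: /oː/ > /uː/ everywhere, then /uː/ > /ʊ/ before /k/."""
    form = form.replace("oː", "uː")
    return form.replace("uːk", "ʊk")


def shortening_first(form):
    """Chronology B: /oː/ > /ʊ/ before /k/, then remaining /oː/ > /uː/."""
    form = form.replace("oːk", "ʊk")
    return form.replace("oː", "uː")


for gloss, middle_english in [("boot", "boːt"), ("book", "boːk")]:
    print(gloss, middle_english,
          raising_first(middle_english),
          shortening_first(middle_english))
```

Both chronologies map the Middle English forms onto the same modern ones, so the correspondences alone cannot decide between them; only a judgement about which individual changes are more natural could, and that is exactly what is in doubt here.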

Let’s now go to an example where there really is no written evidence to help us, and where my questioning of the general relative-chronological assumption might have real force. In Greek, the following two very well-known generalizations about the reflexes of Proto-Indo-European (PIE) forms can be made:

  1. The PIE voiced aspirated stops are reflected in Greek as voiceless aspirated stops in the general environment: PIE *bʰéroh₂ ‘I bear’ > Greek φέρω, PIE *dʰéh₁tis ‘act of putting’ > Greek θέσις ‘placement’, PIE *ǵʰáns ‘goose’ > Greek χήν.
  2. However, in the specific environment before another PIE voiced aspirated stop in the onset of the immediately succeeding syllable, they are reflected as voiceless unaspirated stops: PIE *bʰeydʰoh₂ ‘I trust’ > Greek πείθω ‘I convince’, PIE *dʰédʰeh₁mi ‘I put’ > Greek τίθημι. This is known as Grassmann’s Law. PIE *s (which usually became /h/ elsewhere) is elided in the same environment: PIE *segʰoh₂ ‘I hold’ > Greek ἔχω ‘I have’ (note the smooth breathing diacritic).

On the face of it, the fact that Grassmann’s Law produces voiceless unaspirated stops rather than voiced ones seems to indicate that it came into effect only after the sound change that devoiced the PIE voiced aspirated stops. For otherwise, the deaspiration of these voiced aspirated stops due to Grassmann’s Law would have produced voiced unaspirated stops at first, and voiced unaspirated stops inherited from PIE, as in PIE *déḱm̥ ‘ten’ > Greek δέκα, were not devoiced.
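This standard argument can be sketched by running both orderings over schematic forms. The forms are heavily simplified (no laryngeals, no accents, and the Greek outcome of ‘ten’ stands in for an inherited plain voiced stop), the function names are invented, and deaspiration is assumed to do nothing except remove the aspiration. For these disyllabic toy forms, "a later aspirate in the word" stands in for "an aspirate in the onset of the next syllable".

```python
DEVOICE = {"bʰ": "pʰ", "dʰ": "tʰ", "gʰ": "kʰ"}


def devoice_aspirates(word):
    """Voiced aspirated stops become voiceless aspirated stops."""
    return [DEVOICE.get(seg, seg) for seg in word]


def grassmann(word):
    """Deaspirate an aspirate when another aspirate follows in the word."""
    out = []
    for i, seg in enumerate(word):
        later_aspirate = any(s.endswith("ʰ") for s in word[i + 1:])
        if seg.endswith("ʰ") and later_aspirate:
            out.append(seg[:-1])   # strip the aspiration, nothing else
        else:
            out.append(seg)
    return out


def devoicing_first(word):
    return grassmann(devoice_aspirates(word))


def grassmann_first(word):
    return devoice_aspirates(grassmann(word))


forms = {
    "trust": ["bʰ", "e", "y", "dʰ", "o"],
    "bear": ["bʰ", "e", "r", "o"],
    "ten": ["d", "e", "k", "a"],
}
for gloss, word in forms.items():
    print(gloss, "".join(devoicing_first(word)), "".join(grassmann_first(word)))
```

Under these assumptions only the devoicing-first ordering gives the voiceless initial of πείθω; the other ordering leaves a plain voiced stop which, like the one in δέκα, has no later change to devoice it.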

However, if we think more closely about the phonetics of the segments involved, this is not quite as obvious. The PIE voiced aspirated stops could surely be more accurately described as breathy-voiced stops, like their presumed unaltered reflexes in modern Indo-Aryan languages. Breathy voice is essentially a kind of voice which is closer to voicelessness than voice normally is: the glottis is more open (or less tightly closed, or open at one part and not at another part) than it is when a modally voiced sound is articulated. Therefore it does not seem out of the question for breathy-voiced stops to deaspirate to voiceless stops if they are going to be deaspirated, in much the same way as ME /ʊ/ became /oː/ when it lengthened. Granted, I don’t know of any attested parallels for such a shift. And in Sanskrit, in which a version of Grassmann’s Law also applies, breathy-voiced stops certainly deaspirate to voiced stops: PIE *dʰédʰeh₁mi ‘I put’ > Sanskrit dádhāmi. So Grassmann’s Law in Greek certainly has to be different in nature (and probably an entirely separate innovation) from Grassmann’s Law in Sanskrit.[2]

Another example of a commonly-accepted relative chronology which I think is highly questionable is the idea that Grimm’s Law comes into effect in Proto-Germanic before Verner’s Law does. To be honest, I’m not really sure what the rationale is for thinking this in the first place. Ringe (2006: 93) simply asserts that “Verner’s Law must have followed Grimm’s Law, since it operated on the outputs of Grimm’s Law”. This is unilluminating: certainly Verner’s Law only operates on voiceless fricatives in Ringe’s formulation of it, but Ringe does not justify his formulation of Verner’s Law as applying only to voiceless fricatives. In general, sound changes will appear to have operated on the outputs of a previous sound change if one assumes in the first place that the previous sound change comes first: the key to justifying the relative chronology properly is to think about what alternative formulations of each sound change are required in order to make the alternative chronology work (such alternative formulations can almost always be devised), and establish the high relative unnaturalness of the sound changes thus formulated compared to the sound changes as formulable under the relative chronology which one wishes to justify.

If the PIE voiceless stops at some point became aspirated (which seems very likely, given that fricativization of voiceless stops normally follows aspiration, and given that stops immediately after obstruents, in precisely the same environment that voiceless stops are unaspirated in modern Germanic languages, are not fricativized), then Verner’s Law, formulated as voicing of obstruents in the usual environments, followed by Grimm’s Law formulated in the usual manner, accounts perfectly well for the data. A Wikipedia editor objects, or at least raises the objection, that a formulation of the sound change so that it affects the voiceless fricatives, specifically, rather than the voiceless obstruents as a whole, would be preferable—but why? What matters is the naturalness of the sound change—how likely it is to happen in a language similar to the one under consideration—not the sizes of the categories in phonetic space that it refers to. Some categories are natural, some are unnatural, and this is not well correlated with size. Both fricatives and obstruents are, as far as I am aware, about equally natural categories.
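Here is a rough sketch of the two chronologies side by side. The forms for ‘father’ and ‘brother’ are drastically simplified (no laryngeals, no vowel changes, the accent given as the index of the accented vowel), Verner’s Law is reduced to "voice a non-initial obstruent when the preceding vowel is unaccented", and the reflexes of the voiced aspirates are written as voiced fricatives. All names and mappings are assumptions made for the illustration.

```python
VOWELS = {"a", "e", "i", "o", "u"}

# Standard chronology: Grimm's Law, then Verner's Law on voiceless fricatives.
GRIMM = {"p": "f", "t": "θ", "k": "x", "b": "p", "d": "t", "g": "k",
         "bʰ": "β", "dʰ": "ð", "gʰ": "ɣ"}
VERNER_FRICATIVES = {"f": "β", "θ": "ð", "x": "ɣ", "s": "z"}

# Alternative: aspiration, Verner's Law on voiceless obstruents, then Grimm.
ASPIRATION = {"p": "pʰ", "t": "tʰ", "k": "kʰ"}
VERNER_OBSTRUENTS = {"pʰ": "bʰ", "tʰ": "dʰ", "kʰ": "gʰ", "s": "z"}
GRIMM_ALT = {"pʰ": "f", "tʰ": "θ", "kʰ": "x", "b": "p", "d": "t", "g": "k",
             "bʰ": "β", "dʰ": "ð", "gʰ": "ɣ"}


def apply_map(segs, mapping):
    """Apply a set of simultaneous segment replacements."""
    return [mapping.get(seg, seg) for seg in segs]


def verner(segs, accent, voicing):
    """Voice non-initial obstruents whose preceding vowel is unaccented."""
    out = list(segs)
    for i, seg in enumerate(segs):
        if i == 0 or seg not in voicing:
            continue
        preceding_vowels = [j for j in range(i) if segs[j] in VOWELS]
        if preceding_vowels and preceding_vowels[-1] != accent:
            out[i] = voicing[seg]
    return out


def grimm_first(segs, accent):
    return verner(apply_map(segs, GRIMM), accent, VERNER_FRICATIVES)


def verner_first(segs, accent):
    aspirated = apply_map(segs, ASPIRATION)
    return apply_map(verner(aspirated, accent, VERNER_OBSTRUENTS), GRIMM_ALT)


examples = {
    "father": (["p", "a", "t", "e", "r"], 3),
    "brother": (["bʰ", "r", "a", "t", "e", "r"], 2),
}
for gloss, (segs, accent) in examples.items():
    print(gloss, "".join(grimm_first(segs, accent)),
          "".join(verner_first(segs, accent)))
```

On these toy forms the two orderings give identical outputs, which is the point: the data do not choose between them, only the relative naturalness of the individual changes can.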

I do have one misgiving with the Verner’s Law-first scenario, which is that I’m not aware of any attested sound changes involving intervocalic voicing of aspirated stops. Perhaps voiceless aspirated stops voice less easily than voiceless unaspirated stops. But Verner’s Law is not just intervocalic voicing, of course: it also interacts with the accent (precisely, it voices obstruents only after unaccented syllables). If one thinks of it as a matter of the association of voice with low tone, rather than of lenition, then voicing of aspirated stops might be a more believable possibility.

My point here is not so much about the specific examples; I am not aiming to actually convince people to abandon the specific relative chronologies questioned here (there are likely to be points I haven’t thought of). My point is to raise these questions in order to show at what level the justification of the relative chronology needs to be done. I expect that it is deeper than many people would think. It is also somewhat unsettling that it relies so much on theoretical assumptions about what kinds of sound changes are natural, which are often not well-established.

Are there any relative chronologies which are very secure? Well, there is another famous Indo-European sound law associated with a specific relative chronology which I think is secure. This is the “law of the palatals” in Sanskrit. In Sanskrit, PIE *e, *a and *o merge as a; but PIE *k/*g/*gʰ and *kʷ/*gʷ/*gʷʰ are reflected as c/j/h before PIE *e (and *i), and k/g/gh before PIE *a and *o (and *u). The only credible explanation for this, as far as I can see, is that an earlier sound change palatalizes the dorsal stops before *e and *i, and then a later sound change merges *e with *a and *o. If *e had already merged with *a and *o by the time the palatalization occurred, then the palatalization would have to occur before *a, and it would have to be sporadic: and sporadic changes are rare, but not impossible (this is the Neogrammarian hypothesis, in its watered-down form). But what really clinches it is this: that sporadic change would have to apply to dorsal stops before a set of instances of *a which just happened to be exactly the same as the set of instances of *a which reflect PIE *e, rather than *a or *o. This is astronomically unlikely, and one doesn’t need any theoretical assumptions to see this.[3]
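The asymmetry can be made concrete with schematic syllables. This is a toy sketch with invented function names: only *k stands in for the dorsal stops, and c is the Sanskrit palatal.

```python
def palatalize(form):
    """Dorsal stop becomes palatal before a front vowel."""
    return form.replace("ke", "ce").replace("ki", "ci")


def merge_vowels(form):
    """*e and *o merge with *a."""
    return form.replace("e", "a").replace("o", "a")


syllables = ["ke", "ka", "ko", "ki"]

palatalization_first = [merge_vowels(palatalize(s)) for s in syllables]
merger_first = [palatalize(merge_vowels(s)) for s in syllables]

print(palatalization_first)
print(merger_first)
```

With palatalization first, the c/k contrast preserves a trace of the lost *e. With the merger first, no regular rule can tell the instances of a that come from *e apart from the others, so the attested distribution could only arise by the astronomically unlikely coincidence described above.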

Now the question I really want to answer here is: what exactly are the relevant differences in this relative chronology that distinguish it from the three more questionable ones I examined above, and allow us to infer it with high confidence (based on the unlikelihood of a sporadic change happening to appear conditioned by an eliminated contrast)? It’s not clear to me what they are. Something to do with how the vowel merger counterbleeds the palatalization? (I hope this is the correct relation. The concepts of (counter)bleeding and (counter)feeding are very confusing for me.) But I don’t think this is referring to the relevant things. Whether two phonological rules / sound changes (counter)bleed or (counter)feed each other is a function of the natures of the phonological rules / sound changes; but when we’re trying to establish relative chronologies we don’t know what the natures of the phonological rules / sound changes are! That has to wait until we’ve established the relative chronologies. I think that’s why I keep failing to compute whether there is also a counterbleeding in the other relative chronologies I talked about above: the question is non-well-formed. (In case you can’t tell, I’m starting to mostly think aloud in this paragraph.) What we do actually know are the correspondences between the mother language and the daughter language[4], so an answer to the question should state it in terms of those correspondences. Anyway, I think it is best to leave it here, for my readers to read and perhaps comment with their ideas, providing I’ve managed to communicate the question properly; I might make another post on this theme sometime if I manage to work out (or read) an answer that satisfies me.

Oh, but one last thing: is establishing the security of relative chronologies that important? I think it is quite important. For a start, relative chronological assumptions bear directly on assumptions about the natures of particular sound changes, and that means they affect our judgements of which types of sound changes are likely and which are not, which are of fundamental importance in historical phonology and perhaps of considerable importance in non-historical phonology as well (under e.g. the Evolutionary Phonology framework of Blevins 2004).[5] But perhaps even more importantly, they are important in establishing genetic linguistic relationships. Ringe & Eska (2014) emphasize in their chapter on subgrouping how much less likely it is for languages to share the same sequence of changes than the same unordered set of changes, and so how the establishment of secure relative chronologies is our saving grace when it comes to establishing subgroups in cases of quick diversification (where there might be only a few innovations common to a given subgroup). This seems reasonable, but if the relative chronologies are insecure and questionable, we have a problem (and the sequence of changes they cite as establishing the validity of the Germanic subgroup certainly contains some questionable relative chronologies—for example they have all three parts of Grimm’s Law in succession before Verner’s Law, but as explained above, Verner’s Law could have come before Grimm’s; the third part of Grimm’s Law may also have not happened separately from the first).

[1] This quality difference exists in present-day English for sure—modulo secondary quality shifts which have affected these vowels in some accents—and it can be extrapolated back into seventeenth-century English with reasonable certainty using the written records. If we are ignoring the evidence of the written records, we can postulate that the quality differentiation between long /uː/ and short /ʊ/ was even more recent than the /uː/ > /ʊ/ shift (which would now be better described as an /uː/ > /u/ shift). But the point is that such quality adjustment can happen, as explained in the rest of the paragraph.

[2] There is a lot of literature on Grassmann’s Law, a lot of it dealing with relative chronological issues and, in particular, the question of whether Grassmann’s Law can be considered a phonological rule that was already present in PIE. I have no idea why one would want to (there are certainly PIE forms inherited in Germanic that appear to have been unaffected by Grassmann’s Law, as in PIE *bʰeydʰ- > English bide), but I’ve hardly read any of this literature. My contention here is only that the generally-accepted relative chronology of Grassmann’s Law and the devoicing of the PIE voiced aspirated stops can be contested.

[3] One should bear in mind some subtleties though: for example, *e and *a might have gotten very, very phonetically similar, so that they were almost merged, before the palatalization occurred. If one wants to rule out that scenario, one has to appeal again to the naturalness of the hypothesized sound changes. But as long as we are talking about the full merger of *e and *a we can confidently say that it occurred after palatalization.

[4] Actually, in practice we don’t know these with certainty either, and the correspondences we postulate to some extent are influenced by our postulations about the natures of sound changes that have occurred and their relative chronologies… but I’ve been assuming they can be established more or less independently throughout these posts, and that seems a reasonable assumption most of the time.

[5] I realize I’ve been talking about phonological changes throughout this post, but obviously there are other kinds of linguistic changes, and relative chronologies of those changes can be established too. How far the discussion in this post applies outside of the phonological domain I will leave for you to think about.


Blevins, J. 2004. Evolutionary phonology: The emergence of sound patterns. Cambridge University Press.

Minkova, D. 2013. A historical phonology of English. Edinburgh University Press.

Ringe, D. 2006. A linguistic history of English: from Proto-Indo-European to Proto-Germanic. Oxford University Press.

Ringe, D. & Eska, J. F. 2013. Historical linguistics: toward a twenty-first century reintegration. Cambridge University Press.

Animacy and the meanings of ‘in front of’ and ‘behind’

The English prepositions ‘in front of’ and ‘behind’ behave differently in an interesting way depending on whether they have animate or inanimate objects.

To illustrate, suppose there are two people—let’s call them John and Mary—who are standing colinear with a ball. Three parts of the line can be distinguished: the segment between John’s and Mary’s positions (let’s call it the middle segment), the ray with John at its endpoint (let’s call it John’s ray), and the ray with Mary at its endpoint (let’s call it Mary’s ray). Note that John may be in front of or behind his ray, or at the side of it, depending on which way he faces; likewise with Mary, although, let’s assume that Mary is either in front of or behind her ray. What determines whether John describes the position of the ball, relative to Mary, as “in front of Mary” or “behind Mary”? First, note that it doesn’t matter which way John is facing. The relevant parameters are the way Mary is facing, and whether the ball is on the middle segment or Mary’s ray. So there are four different situations to consider:

  1. The ball is on the middle segment, and Mary is facing the middle segment. In this case, John can say, “Mary, the ball is in front of you.” But if he said, “Mary, the ball is behind you,” that statement would be false.
  2. The ball is on the middle segment, and Mary is facing her ray. In this case, John can say, “Mary, the ball is behind you.” But if he said, “Mary, the ball is in front of you,” that statement would be false.
  3. The ball is on Mary’s ray, and Mary is facing her ray. In this case, John can say, “Mary, the ball is in front of you.” But if he said, “Mary, the ball is behind you,” that statement would be false.
  4. The ball is on Mary’s ray, and Mary is facing the middle segment. In this case, John can say, “Mary, the ball is behind you.” But if he said, “Mary, the ball is in front of you,” that statement would be false.

So, the relevant variable is whether the ball’s position, and the position towards which Mary is facing, match up: if Mary faces the part of the line the ball is on, it’s in front of her, and if Mary faces away from the part of the line the ball is on, it’s behind her.
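The rule just stated is small enough to write down as a function. Here is a minimal Python sketch of it; the labels "middle" and "ray" and the function name are my own shorthand for the middle segment and Mary's ray, introduced only for this illustration.

```python
# Sketch of the animate-object rule. The labels "middle" (the middle segment)
# and "ray" (Mary's ray) are shorthand invented for this example.
PARTS = {"middle", "ray"}

def describe_relative_to_animate(ball_on: str, mary_faces: str) -> str:
    """Return the preposition John can truthfully use about the ball and Mary."""
    if ball_on not in PARTS or mary_faces not in PARTS:
        raise ValueError("arguments must be 'middle' or 'ray'")
    # Mary faces the part of the line the ball is on -> "in front of";
    # she faces away from it -> "behind".
    return "in front of" if ball_on == mary_faces else "behind"

# Enumerate the four situations listed above.
for ball_on in ("middle", "ray"):
    for mary_faces in ("middle", "ray"):
        result = describe_relative_to_animate(ball_on, mary_faces)
        print(f"ball on {ball_on}, Mary faces {mary_faces}: {result} Mary")
```

Note that John's own orientation is not a parameter at all, which is the point made above.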

This all probably seems very obvious and trivial. But consider what happens if we replace Mary with a lamppost. A lamppost doesn’t have a face; it doesn’t even have clearly distinct front and back sides. So one of the parameters here—the way Mary is facing—has disappeared. But one has also been added—because now the way that John is facing is relevant. So there are still four situations:

  1. The ball is on the middle segment, and John is facing the middle segment. In this case, John can say, “The ball is in front of the lamppost.”
  2. The ball is on the middle segment, and John is facing his ray. In this case, I don’t think it really makes sense for John to say either, “The ball is in front of the lamppost,” or, “The ball is behind the lamppost,” unless he is implicitly taking the perspective of some other person who is facing the middle segment. The most he can say is, “The ball is between me and the lamppost.”
  3. The ball is on Mary’s (or rather, the lamppost’s) ray, and John is facing the middle segment. In this case, John can say, “The ball is behind the lamppost.”
  4. The ball is on Mary’s (or rather, the lamppost’s) ray, and John is facing his ray. In this case, I don’t think it really makes sense for John to say either, “The ball is in front of the lamppost,” or, “The ball is behind the lamppost,” unless he is implicitly taking the perspective of some other person who is facing the middle segment. The most he can say is, “The ball is behind me, and past the lamppost.”

A preliminary hypothesis: it seems that the prepositions ‘in front of’ and ‘behind’ can only be understood with reference to the perspective of a (preferably) animate being who has a face and a back, located on opposite sides of their body. If the object is animate, then this being is the object. The preposition ‘in front of’ means ‘on the ray extending from [the object]’s face’. The preposition ‘behind’ means ‘on the ray extending from [the object]’s back’. But if the object is inanimate, then … well, it seems to me that there are two analyses you could make:

  • The definitions just become completely different. The prepositions ‘in front of’ and ‘behind’ now presuppose that the object is on the ray extending from the speaker’s face. If the subject (the referent of the noun to which the prepositional phrase is attached, e.g. the ball above) is between the speaker and the object, it’s in front of the object. Otherwise (given the presupposition), it’s behind the object.
  • If the speaker is facing the object, the speaker imagines that the object has a face and a back and is looking back at the speaker. Then the regular definitions apply, so ‘in front of’ means ‘on the ray extending from [the object]’s face, i.e. on the ray extending from [the speaker]’s back or on the middle segment’, and ‘behind’ means ‘on the ray extending from [the object]’s back, i.e. on the ray extending from [the speaker]’s face but not on the middle segment’. On the other hand, if the speaker isn’t facing the object, then (for some reason) they fail to imagine the object as having a face and a back.

The first analysis feels more intuitively correct to me, when I think about what ‘in front of’ and ‘behind’ mean with inanimate objects. But the second analysis makes the same predictions, does not require the postulation of separate definitions in the animate-object and inanimate-object cases and goes some way towards explaining the presupposition that the object is on the ray extending from the speaker’s face (though it does not explain it completely, because it is still puzzling to me why the speaker imagines in particular that the object is facing the speaker, and why no such imagination takes place when the speaker does not face the object). Perhaps it should be preferred, then, although I definitely don’t intuitively feel like phrases like ‘in front of the lamppost’ are metaphors involving an imagination of the lamppost as having a face and a back.
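To check the claim that the two analyses make the same predictions, here is a rough Python sketch that encodes each of them and compares them over the four lamppost situations. The labels "middle" and "far" (for the lamppost's ray), the function names, and the use of None for "neither preposition is usable" are all assumptions of the sketch, not anything established.

```python
from typing import Optional

def animate_rule(ball_on: str, object_faces: str) -> str:
    # Regular definition: "in front of" iff the object faces the ball's part of the line.
    return "in front of" if ball_on == object_faces else "behind"

def separate_definitions(ball_on: str, speaker_faces_object: bool) -> Optional[str]:
    # First analysis: a different definition, with a presupposition that
    # the object lies on the ray extending from the speaker's face.
    if not speaker_faces_object:
        return None  # presupposition failure
    return "in front of" if ball_on == "middle" else "behind"

def imagined_face(ball_on: str, speaker_faces_object: bool) -> Optional[str]:
    # Second analysis: the speaker imagines the object looking back at them,
    # i.e. facing the middle segment, and then the regular definition applies.
    if not speaker_faces_object:
        return None  # the speaker fails to imagine a face and a back
    return animate_rule(ball_on, object_faces="middle")

for ball_on in ("middle", "far"):
    for facing in (True, False):
        a = separate_definitions(ball_on, facing)
        b = imagined_face(ball_on, facing)
        print(ball_on, facing, a, b, a == b)
```

The two functions agree in every case, so the choice between the analyses has to be made on grounds other than their predictions for these situations.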

Now, I’ve been talking above like all animate objects have a face and a back and all inanimate objects don’t, but this isn’t quite the case. Although the prototypical members of the categories certainly correlate in this respect, there are inanimate objects like cars, which can be imagined as having a face and a back, and certainly at least have distinct front and back sides. (It’s harder to think of examples of animates that don’t have a front and a back. Jellyfish, perhaps—but if a jellyfish is swimming towards you, you’d probably implicitly imagine its front as being the side closer to you. Given that animates are by definition capable of movement, perhaps animates necessarily have fronts and backs in this sense.)

With respect to these inanimate objects, I think they can be regarded either as animates/faced-and-backed beings or as inanimates/unfaced-and-unbacked beings, with free variation as to whether they are so regarded. I can imagine John saying, “The ball is in front of the car,” if John is facing the boot of the car and the ball is in between him and the boot. But I can also imagine him saying, “The ball is behind the car.” He’d really have to say something more specific to make it clear where the ball is. This is much like how non-human animates are sometimes referred to as “he” or “she” and sometimes referred to as “it”.

The reason I started thinking about all this was that I read a passage in Claude Hagège’s 2010 book, Adpositions. Hagège gives the following three example sentences in Hausa:

(1) ƙwallo ya‐na gaba-n Audu
ball 3SG.PRS.S‐be in.front.of-3SG.O Audu
‘the ball is in front of Audu’

(2) ƙwallo ya‐na baya-n Audu
ball 3SG.PRS.S‐be behind-3SG.O Audu
‘the ball is behind Audu’

(3) ƙwallo ya‐na baya-n telefo
ball 3SG.PRS.S‐be behind-3SG.O telephone
‘the ball is in front of the telephone’ (lit. ‘the ball is behind the telephone’)

He then writes (I’ve adjusted the numbers of the examples; emphasis original):

If the ball is in front of someone whom ego is facing, as well as if the ball is behind someone and ego is also behind this person and the ball, Hausa and English both use an Adp [adposition] with the same meaning, respectively “in front of” in (1), and “behind” in (2). On the contrary, if the ball is in front of a telephone whose form is such that one can attribute this set a posterior face, which faces ego, and an anterior face, oriented in the opposite direction, the ball being between ego and the telephone, then English no longer uses the intrinsic axis from front to back, and ignores the fact that the telephone has an anterior and a posterior face: it treats it as a human individual, in front of which the ball is, whatever the face presented to the ball by the telephone, hence (3). As opposed to that, Hausa keeps to the intrinsic axis, in conformity to the more or less animist conception, found in many African cultures and mythologies, which views objects as spatial entities possessing their own structure. We thus have, here, a case of animism in grammar.

I don’t entirely agree with Hagège’s description here. I think a telephone is part of the ambiguous category of inanimate objects that have clearly distinct fronts and backs, and which can therefore be treated either way with respect to ‘in front of’ and ‘behind’. It might be true that Hausa speakers show a much greater (or a universal) inclination to treat inanimate objects like this in the manner of animates, but I’m not convinced from the wording here that Hagège has taken into account the fact that there might be variation on this point within both languages. And even if there is a difference, I would caution against assuming it has any correlation with religious differences (though it’s certainly a possibility which should be investigated!)

But it’s an interesting potential cross-linguistic difference in adpositional semantics. And regardless, I’m glad to have read the passage because it’s made me aware of this interesting complexity in the meanings of ‘in front of’ and ‘behind’, which I had never noticed before.

Vowel-initial and vowel-final roots in Proto-Indo-European

A remarkable feature of Proto-Indo-European (PIE) is the restrictiveness of the constraints on its root structure. It is generally agreed that all PIE roots were monosyllabic, containing a single underlying vowel. In fact, the vast majority of the roots are thought to have had the same underlying vowel, namely *e. (Some scholars reconstruct a small number of roots with underlying *a rather than *e; others do not, and reconstruct underlying *e in every PIE root.) It is also commonly supposed that every root had at least one consonant on either side of its vowel; in other words, that there were no roots which began or ended with the vowel (Fortson 2004: 71).

I have no dispute with the first of these constraints; though it is very unusual, it is not too difficult to understand in connection with the PIE ablaut system, and the Semitic languages are similar with their triconsonantal, vowel-less roots. However, I think the other constraint, the one against vowel-initial and vowel-final roots, is questionable. In order to talk about it with ease and clarity, it helps to have a name for it: I’m going to call it the trisegmental constraint, because it amounts to the constraint that every PIE root contains at least three segments: the vowel, a consonant before the vowel, and a consonant after the vowel.

The first thing that might make one suspicious of the trisegmental constraint is that it isn’t actually attested in any IE language, as far as I know. English has vowel-initial roots (e.g. ask) and vowel-final roots (e.g. fly); so do Latin, Greek and Sanskrit (cf. S. aj- ‘drive’, G. ἀγ- ‘lead’, L. ag- ‘do’, and L. dō-, G. δω-, S. dā-, all meaning ‘give’). And for much of the early history of IE studies, nobody suspected the constraint’s existence: the PIE roots meaning ‘drive’ and ‘give’ were reconstructed as *aǵ- and *dō-, respectively, with an initial vowel in the case of the former and a final vowel in the case of the latter.

It was only with the development of the laryngeal theory that the reconstruction of the trisegmental constraint became possible. The initial motivation for the laryngeal theory was to simplify the system of ablaut reconstructed for PIE. I won’t go into the motivation in detail here; it’s one of the most famous developments in IE studies so a lot of my readers are probably familiar with it already, and it’s not hard to find descriptions of it. The important thing to know, if you want to understand what I’m talking about here, is that the laryngeal theory posits the existence of three consonants in PIE which are called laryngeals and written *h1, *h2 and *h3, and that these laryngeals can be distinguished by their effects on adjacent vowels: *h2 turns adjacent underlying *e into *a and *h3 turns adjacent underlying *e into *o. In all of the IE languages other than the Anatolian languages (which are all extinct, and records of which were only discovered in the 20th century), the laryngeals are elided pretty much everywhere, and their presence is only discernible from their effects on adjacent segments. Note that as well as changing the quality of (“colouring”) underlying *e, they also lengthen preceding vowels. And between consonants, they are reflected as vowels, but as different vowels in different languages: in Greek *h1, *h2, *h3 become ε, α, ο respectively, in Sanskrit all three become i, and in the other languages all three generally become a.
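For readers who find rules easier to follow as code, here is a toy Python sketch of the colouring and lengthening effects just described. It is deliberately simplified and rests on assumptions of my own: roots are written as plain strings with "h1", "h2", "h3" for the laryngeals, accents are omitted, only laryngeals standing next to a vowel are handled (the language-specific interconsonantal reflexes are left out), and the function name is invented for the example.

```python
import re

# Which vowel an adjacent underlying *e surfaces as next to each laryngeal.
COLOUR = {"h1": "e", "h2": "a", "h3": "o"}
LONG = {"e": "ē", "a": "ā", "o": "ō"}

def apply_laryngeals(root: str) -> str:
    """Toy model: colour *e next to a laryngeal, then lengthen and elide."""
    # Step 1: colouring. *h2 turns adjacent *e into *a, *h3 turns it into *o;
    # *h1 leaves *e unchanged. Underlying *o is not affected.
    for laryngeal, vowel in COLOUR.items():
        root = root.replace(laryngeal + "e", laryngeal + vowel)
        root = root.replace("e" + laryngeal, vowel + laryngeal)
    # Step 2: a laryngeal directly after a vowel lengthens that vowel and is lost.
    root = re.sub(r"([eao])h[123]", lambda m: LONG[m.group(1)], root)
    # Step 3: a laryngeal directly before a vowel is simply lost.
    root = re.sub(r"h[123](?=[eaoēāō])", "", root)
    return root

for form in ("h2eǵ", "deh3", "h2oǵmos"):
    print(form, "->", apply_laryngeals(form))
```

Under these assumptions the sketch turns h2eǵ into aǵ and deh3 into dō, while the o-grade form h2oǵmos just loses its laryngeal without any change of vowel quality.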

So, the laryngeal theory allowed the old reconstructions *aǵ- and *dō- to be replaced by *h2éǵ- and *deh3- respectively, which conform to the trisegmental constraint. In fact every root reconstructed with an initial or final vowel by the 19th century IEists could be reconstructed with an initial or final laryngeal instead. Concrete support for some of these new reconstructions with laryngeals came from the discovery of the Anatolian languages, which preserved some of the laryngeals in some positions as consonants. For example, the PIE word for ‘sheep’ was reconstructed as *ówis on the basis of the correspondence between L. ovis, G. ὄϊς, S. áviḥ, but the discovery of the Cuneiform Luwian cognate ḫāwīs confirmed without a doubt that the root must have originally begun with a laryngeal (although it is still unclear whether that laryngeal was *h2, preceding *o, or *h3, preceding *e).

There are also indirect ways in which the presence of a laryngeal can be evidenced. Most obviously, if a root exhibits the irregular ablaut alternations in the early IE languages which the laryngeal theory was designed to explain, then it should be reconstructed with a laryngeal in order to regularize the ablaut alternation in PIE. In the case of *h2eǵ-, for example, there is an o-grade derivative of the root, *h2oǵmos ‘drive’ (n.), which can be reconstructed on the evidence of Greek ὄγμος ‘furrow’ (Ringe 2006: 14). This shows that the underlying vowel of the root must have been *e, because (given the laryngeal theory) the PIE ablaut system did not involve alternations of *a with *o, only alternations of *e, *ō or ∅ (that is, the absence of the segment) with *o. But this underlying *e is reflected as if it were *a in all the e-grade derivatives of *h2eǵ- attested in the early IE languages (e.g. in the 3sg. present active indicative forms S. ájati, G. ἄγει, L. agit). In order to account for this “colouring” we must reconstruct *h2 next to the *e. Similar considerations allow us to be reasonably sure that *deh3- also contained a laryngeal, because the e-grade root is reflected as if it had *ō (S. dádāti, G. δίδωσι) and the zero-grade root in *dh3tós ‘given’ exhibits the characteristic reflex of interconsonantal *h3 (S. -ditáḥ, G. δοτός, L. datus).

But in many cases there does not seem to be any particular evidence for the reconstruction of the initial or final laryngeal other than the assumption that the trisegmental constraint existed. For example, *h1éḱwos ‘horse’ could just as well be reconstructed as *éḱwos, and indeed this is what Ringe (2006) does. Likewise, there is no positive evidence that the root *muH- of *muHs ‘mouse’ (cf. S. mūṣ, G. μῦς, L. mūs) contained a laryngeal: it could just as well be *mū-. Both of the roots *(h1)éḱ- and *muH/ū- are found, as far as I know, in these stems only, so there is no evidence for the existence of the laryngeal from ablaut. It is true that PIE has no roots that can be reconstructed as ending in a short vowel, and this could be seen as evidence for at least a constraint against vowel-final roots, because if all the apparent vowel-final roots actually had a vowel + laryngeal sequence, that would explain why the vowel appears to be long. But this is not the only possible explanation: there could just be a constraint against roots containing a light syllable. This seems like a very natural constraint. Although the circumstances aren’t exactly the same—because English roots appear without inflectional endings in most circumstances, while PIE roots mostly didn’t—the constraint is attested in English: short unreduced vowels like that of cat never appear in root-final (or word-final) position; only long vowels, diphthongs and schwa can appear in word-final position, and schwa does not appear in stressed syllables.

It could be argued that the trisegmental constraint simplifies the phonology of PIE, and therefore it should be assumed to exist pending the discovery of positive evidence that some root does begin or end with a vowel. It simplifies the phonology in the sense that it reduces the space of phonological forms which can conceivably be reconstructed. But I don’t think this is the sense of “simple” which we should be using to decide which hypotheses about PIE are better. I think a reconstructed language is simpler to the extent that it is synchronically not unusual, and that the existence of whatever features it has that are synchronically unusual can be justified by explanations of features in the daughter languages by natural linguistic changes (in other words, both synchronic unusualness and diachronic unusualness must be taken into account). The trisegmental constraint seems to me synchronically unusual, because I don’t know of any other languages that have something similar, although I have not made any systematic investigation. And as far as I know there are no features of the IE languages which the trisegmental constraint helps to explain.

(Perhaps a constraint against vowel-initial roots, at least, would be more natural if PIE had a phonemic glottal stop, because people, or at least English and German speakers, tend to insert subphonemic glottal stops before vowels immediately preceded by a pause. Again, I don’t know if there are any cross-linguistic studies which support this. The laryngeal *h1 is often conjectured to be a glottal stop, but it is also often conjectured to be a glottal fricative; I don’t know if there is any reason to favour either conjecture over the other.)

I think something like this disagreement over what notion of simplicity is most important in linguistic reconstruction underlies some of the other controversies in IE phonology. For example, the question of whether PIE had phonemic *a and *ā: the “Leiden school” says it didn’t, accepting the conclusions of Lubotsky (1989); most other IEists say it did. The Leiden school reconstruction certainly reduces the space of phonological forms which can be reconstructed in PIE and therefore might be better from a falsifiability perspective. Kortlandt (2003) makes this point with respect to a different (but related) issue, the sound changes affecting initial laryngeals in Anatolian:

My reconstructions … are much more constrained [than the ones proposed by Melchert and Kimball] because I do not find evidence for more than four distinct sequences (three laryngeals before *-e- and neutralization before *-o-) whereas they start from 24 possibilities (zero and three laryngeals before three vowels *e, *a, *o which may be short or long, cf. Melchert 1994: 46f., Kimball 1999: 119f.). …

Any proponent of a scientific theory should indicate the type of evidence required for its refutation. While it is difficult to see how a theory which posits *H2 for Hittite h- and a dozen other possible reconstructions for Hittite a- can be refuted, it should be easy to produce counter-evidence for a theory which allows no more than four possibilities … The fact that no such counter-evidence has been forthcoming suggests that my theory is correct.
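The figures 24 and 4 in the quoted passage are simple counts, and a few lines of Python make the arithmetic explicit. The labels used here (an empty string for "no laryngeal", "H" for the neutralized laryngeal before *o) are my own notation for the sketch, not Kortlandt's.

```python
from itertools import product

# The unconstrained space: zero or one of three laryngeals, before one of
# three vowels, each of which may be short or long.
onsets = ["", "h1", "h2", "h3"]
vowels = ["e", "a", "o"]
lengths = ["short", "long"]
unconstrained = list(product(onsets, vowels, lengths))
print(len(unconstrained))  # 24

# The constrained space as described in the quote: three laryngeals
# before *e, plus a single neutralized laryngeal ("H") before *o.
constrained = [("h1", "e"), ("h2", "e"), ("h3", "e"), ("H", "o")]
print(len(constrained))  # 4
```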

Of course the problem with the Leiden school reconstruction is that for a language to lack phonemic low vowels is very unusual. Arapaho apparently lacks phonemic low vowels, but it’s the only attested example I’ve heard of. But … I don’t have any direct answer to Kortlandt’s concerns about non-falsifiability. My own and other linguists’ concerns about the unnaturalness of a lack of phonemic low vowels also seem valid, but I don’t know how to resolve these opposing concerns. So until I can figure out a solution to this methodological problem, I’m not going to be very sure about whether PIE had phonemic low vowels and, similarly, whether the trisegmental constraint existed.


Fortson, B., 2004. Indo-European language and culture: An introduction. Oxford University Press.

Kortlandt, F., 2003. Initial laryngeals in Anatolian. Orpheus 13-14 [Gs. Rikov] (2003-04), 9-12.

Lubotsky, A., 1989. Against a Proto-Indo-European phoneme *a. The New Sound of Indo–European. Essays in Phonological Reconstruction. Berlin–New York: Mouton de Gruyter, pp. 53–66.

Ringe, D., 2006. A Linguistic History of English: Volume I, From Proto-Indo-European to Proto-Germanic. Oxford University Press.

The Duke of York gambit in diachronic linguistics


Pullum (1976) discusses a phenomenon he evocatively calls the “Duke of York gambit”—the postulation of a derivation of the form A → B → A, which takes the underlying structure A “up to the top of the hill” into a different form B and then takes it “down again” into A on the surface (usually in a more restricted environment, otherwise the postulation of this derivation would not be able to explain anything). Such derivations are called “Duke of York derivations”.

As an illustrative example, consider the case of word-final devoicing in Dutch. Like many other languages, Dutch distinguishes its voiceless and voiced stop phonemes only in non-word-final position. In word-final position, voiceless stops are found exclusively, so that, for example, goed, the cognate of English good, is pronounced [ɣut] in isolation. But morphologically related words like goede ‘good one’, pronounced [ɣudə], seem to indicate that the segment written d is in fact underlyingly /d/ and it becomes [t] by a phonological rule that word-final obstruents become voiceless. We therefore have a derivation /d/ → [t]. Now, in fast, connected speech, goed is not always pronounced [ɣut]. Before a word that begins with a voiced obstruent such as boek ‘book’, it may be pronounced with [d]: goed boek [ɣudbuk]. Some linguists like Brink (1974) have therefore proposed a second phonological rule that grants word-final obstruents the voicing of the obstruent beginning the next word (if there is such an obstruent) in fast, connected speech. This rule applies after the first phonological rule that devoices word-final obstruents, so that the pronunciation [d] of the d in goed boek is derived from underlying /d/ by two steps: /d/ → /t/ → [d]. This is a Duke of York derivation.
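The two-step derivation can be traced mechanically. Below is a minimal Python sketch of it, assuming a much-simplified obstruent inventory and rough IPA strings for the words; the pairing table, the function names and the string representation are all assumptions made for illustration, not a serious model of Dutch phonology.

```python
# Simplified voiced/voiceless obstruent pairs (an assumption of this sketch).
VOICED_TO_VOICELESS = {"b": "p", "d": "t", "ɣ": "x", "v": "f", "z": "s"}
VOICELESS_TO_VOICED = {v: k for k, v in VOICED_TO_VOICELESS.items()}

def final_devoicing(word: str) -> str:
    """Rule 1: a word-final obstruent becomes voiceless."""
    last = word[-1]
    return word[:-1] + VOICED_TO_VOICELESS.get(last, last)

def voicing_assimilation(word: str, next_word: str) -> str:
    """Rule 2 (fast speech): a word-final obstruent takes on the voicing
    of an obstruent that begins the next word."""
    last, first = word[-1], next_word[0]
    if first in VOICED_TO_VOICELESS:      # next word begins with a voiced obstruent
        last = VOICELESS_TO_VOICED.get(last, last)
    elif first in VOICELESS_TO_VOICED:    # next word begins with a voiceless obstruent
        last = VOICED_TO_VOICELESS.get(last, last)
    return word[:-1] + last

def derive(word: str, next_word=None) -> list:
    """Apply rule 1, then rule 2 if there is a following word; keep every stage."""
    stages = [word, final_devoicing(word)]
    if next_word is not None:
        stages.append(voicing_assimilation(stages[-1], next_word))
    return stages

print(derive("ɣud"))          # goed in isolation
print(derive("ɣud", "buk"))   # goed boek
print(derive("ɣudə"))         # goede: the d is not word-final
```

In the second call the final segment goes d, then t, then d again, which is exactly the Duke of York shape A → B → A.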

Many linguists, as Pullum documents, find Duke of York gambits like this objectionable. They question others’ analyses on the grounds that they postulate Duke of York derivations to take place, and they decide between analyses of their own by disfavouring those which involve Duke of York gambits. In this particular case, an objection is reasonable enough: why not simply propose that in fast, connected speech, the words in phrases run into one another and become unitary words? In that case, the rule devoicing word-final obstruents would not apply to the d in goed boek in the first place because it would not be in word-final position; the word-final segment would be the k in boek.

Yet Pullum finds no principled reason to disfavour analyses involving Duke of York gambits just because they involve Duke of York gambits. Clearly some linguists find something unsavoury about such analyses: in the quotes in Pullum’s paper, we can find descriptions of them as “to be viewed with some suspicion” (Brame & Bordelois 1973: 115), “rather suspicious” (White 1973: 159), “theoretically quite illegitimate” (Hogg 1973: 10), “hardly an attractive solution” (Chomsky & Halle 1968: 270), “clearly farcical” (Smith 1973: 33), and “extremely implausible” (Johnson 1974: 98). (See Pullum’s paper for the full references.) But none of them articulate the problem explicitly. If an analysis can be replaced by a simpler one with equal or greater explanatory power, that’s one thing: that would be a problem by the well-established principle of Ockham’s Razor. But a Duke of York gambit does not necessarily make an analysis more complex than the alternative in any well-defined way. Even with the Dutch example above, the greater simplicity of the Duke of York gambit-less solution proposed can be questioned: is it really simpler to propose a process of allegro word unification (at the fairly deep underlying level at which the word-final devoicing rule must apply) which we might be able to do without otherwise?

Pullum mentions some other examples where a Duke of York gambit might even seem part of the obviously preferable analysis. In Nootka, according to Campbell (1973), there is a phonological rule of assimilation that turns /k/ into /kʷ/ immediately after /o/, and there is another phonological rule of word-final delabialization that turns /kʷ/ into /k/ in word-final position. And, in word-final position, the sequence /-ok/ appears and the sequence /-okʷ/ does not appear. If the word-final delabialization rule applies before the assimilation rule, then we would expect instead to find the sequence /-okʷ/ to the exclusion of /-ok/ in word-final position. The only possible analysis, if the rules are to be ordered, is to have assimilation before word-final delabialization: but this means that word-final /-ok/ undergoes the Duke of York derivation /-ok/ → /-okʷ/ → /-ok/. And the use of a model with ordered rules is not essential here, because a Duke of York derivation is obtained even in a very natural model with unordered rules: if we say that rules apply at any point that they can apply but they only apply at most once, then we again have that word-final /-ok/ is susceptible to the assimilation rule (but not the delabialization rule), so /-ok/ becomes /-okʷ/, but then /-okʷ/ is susceptible to the delabialization rule (but not the assimilation rule), so /-okʷ/ becomes /-ok/. One could propose that the assimilation rule is restricted in its application to the non-word-final environment. But this is peculiar: why should a progressive assimilation rule pay any heed to whether there is a word boundary after the assimilating segment? Any way of accounting for such a restriction could easily involve making complicated assumptions which would make the Duke of York gambit analysis preferable by Ockham’s Razor.
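The effect of the two possible rule orderings is easy to verify with a small Python sketch. The bare string "ok" stands in for any word ending in /-ok/, and "oka" is a made-up non-final case; the regular expressions and function names are my own simplifications of the two rules as described above.

```python
import re

def assimilation(form: str) -> str:
    # /k/ -> /kʷ/ immediately after /o/ (a k that is already labialized is skipped).
    return re.sub(r"ok(?!ʷ)", "okʷ", form)

def final_delabialization(form: str) -> str:
    # /kʷ/ -> /k/ in word-final position.
    return re.sub(r"kʷ$", "k", form)

def derive(form: str, rules) -> list:
    """Apply the rules once each, in the given order, recording every stage."""
    stages = [form]
    for rule in rules:
        stages.append(rule(stages[-1]))
    return stages

# Assimilation first: the attested outcome, via a Duke of York derivation.
print(derive("ok", [assimilation, final_delabialization]))
# Delabialization first: predicts unattested word-final /-okʷ/.
print(derive("ok", [final_delabialization, assimilation]))
# A hypothetical non-final case: only assimilation has any effect.
print(derive("oka", [assimilation, final_delabialization]))
```

Only the first ordering ends in /-ok/, and it gets there by passing through /-okʷ/.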


Now, Pullum discusses only synchronic derivations in his paper. But diachronic derivations can also of course be Duke of York derivations. It is interesting, then, to consider how we should evaluate diachronic analyses that postulate Duke of York derivations. Such analyses are favoured or disfavoured for different reasons than synchronic analyses, so, even if one accepts Pullum’s conclusion that synchronic Duke of York gambits are unobjectionable in and of themselves, the situation could conceivably be different for diachronic Duke of York gambits.

My first intuition is that there is even less reason to object to Duke of York gambits in the diachronic context. After all, diachronic analyses deal with changes that we can actually see happening, over the course of years or decades or centuries, and observe the intermediate stages of. (Of course, this is only the case in practice for a very small subset of the diachronic change that we are interested in; until time travel is invented nobody can go and observe the real-time development of languages like Proto-Indo-European.) It is not inconceivable that a change might be “undone” on a short time-scale and it seems inevitable that some changes will be undone on longer time-scales. There is some very strong evidence for such long-term Duke of York derivations having happened in a diachronic sense. The history of English provides a nice example. In Old English, front vowels were “broken” in certain environments (e.g. before h): *æ became ea, *e became eo, and *i became io. We do not, of course, know with absolute certainty exactly how these segments were pronounced, unbroken or broken, but it is at least fairly certain that unbroken *æ, *e and *i were pronounced as [æ], [e] and [i] or vowels of very similar quality. The broken vowels remained largely unchanged throughout the Old English period, except that io was everywhere replaced by eo. But by the Middle English period they had been once again “unbroken” to a and e respectively; the only eventual change was to pre-Old English broken *i, which eventually became Middle English e. There may or may not have been minor changes in the pronunciations of these letters in the meantime ([æ] to [a], [e] to [ɛ], [i] to [ɪ]), but these seem scarcely large enough for this sequence of changes to not count as a diachronic Duke of York derivation.

But there are indeed linguists who appear to object to the postulation of diachronic Duke of York derivations, just like the linguists Pullum mentions. Cercignani (1972) seems to rely on such an objection in his questioning of the hypothesis that Proto-Germanic *ē became *ā in stressed syllables in pre-Old English and pre-Old Frisian. The relevant facts here are as follows.

  1. The general reflexes of late Proto-Indo-European *ē in initial syllables in the Germanic languages are exemplified by the following example: Proto-Indo-European *dʰéh1tis ‘act of putting’ (cf. Greek θέσις; Sanskrit dádhāti and Greek τίθημι for the root *dʰeh1 ‘put’) ↣ Proto-Germanic *dēdiz ‘deed’ (with -d- [< Proto-Indo-European *-t-] levelled in from the Proto-Indo-European oblique stem *dʰh1téy-) > Gothic -deþs in missadeþs ‘misdeed’, Old Norse dáð, Old English (West Saxon) dǣd, Old English (non-West Saxon) and Old Frisian dēd, Old Saxon dād and Old High German tāt. One can see that Gothic, Old English (non-West Saxon) and Old Frisian reflect the vowel’s presumed original mid quality, Old Norse, Old Saxon and Old High German have shifted it to a low vowel, and Old English (West Saxon) is intermediate, having shifted it to a near-low front vowel. Length is preserved in every case (Gothic e is a long vowel, it’s just not marked with a macron diacritic because Gothic has no short /e/ phoneme). It is reasonable to reconstruct *ē for Proto-Germanic, reflecting the original Proto-Indo-European quality, and to assume that the shifts have taken place at a post-Proto-Germanic date.
  2. In Old English and Old Frisian, Proto-Germanic *ē is reflected as ō if it was immediately before an underlying nasal (including nasals before *h and *hʷ, which were allophonically elided in Proto-Germanic) in Proto-Germanic: Proto-Germanic *mēnō̄ ‘moon’ (cf. Old Saxon and Old High German māno; Gothic mena with -a [< Proto-Germanic *-a] levelled in from the stem *mēnan-; Old Norse máni with -i levelled in from nouns ending in -i < *-ija [with *-a levelled in as in Gothic] ← Proto-Germanic *-ijō̄) > Old English, Old Frisian mōna.
  3. In Old English, Proto-Germanic *ē is reflected as ā immediately before w: Proto-Germanic *sēgun ‘saw’ (3pl.) (cf. Old Norwegian and Old Swedish ságu, Old English [non-West Saxon] and Old Frisian sēgon; Gothic seƕun and Old High German sāhun with Gothic -ƕ- and Old High German -h- [< Proto-Germanic *-hʷ-] levelled in from the infinitive and present stems *sehʷ- and *sihʷ- and the past sg. stem *sahʷ-; Old Saxon sāwun with -w- [< Proto-Germanic *-w-] levelled in from the past subj. stem *sēwī-) ↣ Old English (West Saxon) sāwon with -w- < Proto-Germanic *-w- levelled in as in Old Saxon.

The question is in which languages the shift from *ē to *ā reflected in Old Norse, Old Saxon, Old High German and (partially, at least) in Old English (West Saxon) took place. Cercignani argues that it took place only in the languages it is reflected in, with Old English and Old Frisian being partially or totally unaffected by this shift. Let us call this the restriction hypothesis. Other linguists propose that it took place in every Germanic language other than Gothic, including Old English and Old Frisian, and that later shifts are responsible for the reflection of Proto-Germanic *ē as ǣ or ē in Old English and Old Frisian. Let us call this the extension hypothesis (because it postulates a more extensive area for the *ē > *ā shift to take place in than the restriction hypothesis). The derivation *ē > *ā > ē which must have taken place in Old English (non-West Saxon) and Old Frisian if the extension hypothesis is to be accepted is, of course, a Duke of York derivation, and it is clear that Cercignani regards this as a major strike against the extension hypothesis.

The restriction hypothesis certainly appears simpler and, therefore, preferable at first glance. However, there are various pieces of evidence that complicate matters, most obviously points 2 and 3 above. If Proto-Germanic *ē became *ā in pre-Old English and pre-Old Frisian before shifting back to a higher quality, then we can explain the reflection of Proto-Germanic *ē as ō when nasalized or immediately before a nasal as the result of a shift *ā > *ō in this environment (paralleled by the present-day shift /ɑ̃/ > [ɔ̃] in some French dialects). This is more believable than a direct shift *ē > *ō and arguably simpler than a two-step shift *ē > *ā > *ō occurring exclusively in this nasal environment. Likewise, one might argue that the postulation of a slight restriction on the environment of the *ā-fronting sound change in Old English, allowing for retention of *ā before *w, is simpler than the postulation of an entirely separate sound change shifting *ē to *ā before *w in Old English. Neither of these arguments is at all conclusive, but they might be sufficient to make the reader adjust their estimations of the two hypotheses’ probabilities a little in favour of the extension hypothesis. As far as I can tell, the thrust of Cercignani’s argument is that, even if the consideration of points 2 and 3 does make the restriction hypothesis more complicated than it seems at first glance, the postulation of Duke of York derivations is preposterous enough that the restriction hypothesis is still by far the preferable one. Naturally I, not thinking that Duke of York derivations are necessarily preposterous, disagree.
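The chain of changes that the extension hypothesis assumes can be laid out as a rough Python sketch. It tracks only the vowel and the sound that immediately follows it, and it covers only the two Old English outcomes; the function name `reflex_of_long_e`, the dialect labels and the reduction of each environment to a single following segment are assumptions made for illustration.

```python
NASALS = {"m", "n"}


def reflex_of_long_e(next_segment, dialect):
    """Trace Proto-Germanic *ē under the extension hypothesis.

    next_segment: the sound immediately after the vowel (a simplification)
    dialect: "West Saxon" or "non-West Saxon"
    Returns the successive stages of the vowel as a list.
    """
    stages = ["ē", "ā"]              # shared *ē > *ā shift
    if next_segment in NASALS:
        stages.append("ō")           # *ā > *ō before a nasal
    elif next_segment == "w":
        pass                         # *ā retained before w, no fronting
    else:
        stages.append("ǣ")           # *ā-fronting everywhere else
        if dialect == "non-West Saxon":
            stages.append("ē")       # later raising *ǣ > ē
    return stages


# Environments taken from the examples in points 1 to 3 above.
for label, segment, dialect in [
    ("deed", "d", "West Saxon"),
    ("deed", "d", "non-West Saxon"),
    ("moon", "n", "West Saxon"),
    ("saw (3pl.)", "w", "West Saxon"),
]:
    print(label, dialect, " > ".join(reflex_of_long_e(segment, dialect)))
```

Only the non-West Saxon path through the general environment returns to its starting point, which is the Duke of York derivation at issue; the nasal and pre-w outcomes fall out of the same intermediate *ā stage without any extra rule acting on *ē directly.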

In any case there is some more conclusive evidence for the extension hypothesis not mentioned by Cercignani, but mentioned by Ringe (2014: 13). The Proto-Germanic distal demonstrative and interrogative locative adverbs ‘there’ and ‘where’ can be reconstructed as *þar and *hʷar on the basis of Gothic þar and ƕar and Old Norse þar and hvar. Further support for these reconstructions comes from the fact that they can be transparently derived from the Proto-Germanic distal demonstrative and interrogative stems *þa- and *hʷa- by the addition of a locative suffix *-r (also found on other adverbs such as *aljar ‘elsewhere’ [cf. Gothic aljar, Old English ellor] ← *alja- ‘other’ + *-r). But in the West Germanic languages, the reflexes are as if they contained Proto-Germanic *ē: Old English (West Saxon) þǣr and hwǣr, Old English (non-West Saxon) þēr and hwēr, Old Frisian thēr and hwēr, Old Saxon thār and hwār, Old High German dār and wār. The simplest way to explain this is to propose that there has been an irregular lengthening of these words to *þār and *hwār in Proto-West Germanic, and that the *-ā- in these words was raised in Old English and Old Frisian by the same changes that raised *ā < Proto-Germanic *ē. Proponents of the restriction hypothesis must propose an irregular raising as well as a lengthening in these words, which is perhaps less believable (one can imagine adverbs with the sense ‘here’ and ‘there’ being lengthened due to contrastive emphasis—Ringe alludes to “heavy deictic stress”, which may be the same thing, although he doesn’t explain the term) and, most importantly, one must propose that this irregular raising only happens in Old English and Old Frisian, with the identity of the reflexes of Proto-Germanic *a in these words with the reflexes of Proto-Germanic *ē in stressed syllables existing entirely by coincidence. 
It is true that Proto-Germanic short *a in stressed syllables became *æ in Old English and Old Frisian, so if we propose that the irregular lengthening occurred after this change as an areal innovation among the West Germanic languages, we can account for Old English (West Saxon) þǣr and hwǣr; but this does not account for Old English (non-West Saxon) þēr and hwēr and Old Frisian thēr and hwēr, which have to be accounted for by an irregular raising.

To me this additional evidence seems fairly decisive. In that case, with the extension hypothesis accepted, we have a nice example of a diachronic Duke of York derivation which we know must have run its full course in a fairly short time, because we can date the Proto-Northwest Germanic *ē > *ā shift and the Old English *ǣ > ē shift (fed by the *ā > *ǣ shift, whose date is irrelevant here because it must have occurred in between these two) with reasonable precision. Ringe (p. 12), citing Grønvik (1998), says that the *ē > *ā shift is “attested from the second half of the 2nd century AD”. This is presumably based on runic evidence. As for the *ǣ > ē shift, it was one of the very early Old English sound changes in the dialects it took place in, being attested already in apparent completion in the oldest Old English texts (which date to the 8th century AD). The fact that it is shared with Old Frisian also suggests an early date. We can therefore say that there were at most five or six centuries between the two shifts, and quite likely considerably less.


To summarize: though they may seem somehow untidy, Duke of York derivations, whether diachronic or synchronic, are not intrinsically implausible. The simplest hypothesis that accounts for the data should always be preferred, but this is not always the hypothesis that avoids the Duke of York gambit. On the diachronic side of things, Duke of York derivations can certainly take place over many centuries—which nobody would dispute—but they can also take place over periods of just a few centuries, as evidenced by the history of Proto-Germanic *ē in Old English and Old Frisian.


Brink, D., 1974. Characterizing the natural order of application of phonological rules. Lingua, 34(1), pp. 47-72.

Campbell, L., 1973. Extrinsic order lives. Bloomington, IN: Indiana University Linguistics Club Publications.

Cercignani, F., 1972. Indo-European ē in Germanic. Zeitschrift für vergleichende Sprachforschung, 86(1. H), pp. 104-110.

Grønvik, O., 1998. Untersuchungen zur älteren nordischen und germanischen Sprachgeschichte. Lang.

Pullum, G. K., 1976. The Duke of York gambit. Journal of Linguistics, 12(01), pp. 83-102.

Ringe, D. & Taylor, A., 2014. A Linguistic History of English: Volume II, The Development of Old English. OUP.

The Kra-Dai languages of Hainan

One of my favourite blogs on the Internet is Martin Lewis’s GeoCurrents, a consistently high-quality and information-dense blog about geography, especially geopolitics, cultural geography and economic geography. As a student in linguistics I’m especially interested in the posts about linguistic geography (which comes under cultural geography), but almost every GeoCurrents post is interesting, and tells me lots of things I didn’t already know. As an example post for interested readers which touches on cultural, linguistic and ethnic geography and the history of agriculture, I recommend The Lost World of the Sago Eaters. Unfortunately, Martin Lewis recently announced that he was going to have to stop making any more posts until, at least, next June. So I thought it might be a good idea to try and do some posts in the style of GeoCurrents on this blog—introducing the reader to some region of the world and telling them whatever interesting things I can find out about this region and the people who live there.

For this particular post, I’ve decided to write about the island of Hainan, and in particular the Kra-Dai languages which are spoken there, which are in my opinion pretty interesting for several reasons. Hainan is an island off the southern coast of China, in the South China Sea. If you look along the southern Chinese coast, you can see Taiwan off the southeast, and then, further towards the west, not far from the Indochinese peninsula, and just across a strait from a little peninsula jutting out to the south, there’s another island of similar size: that’s Hainan. Politically, it’s part of the People’s Republic of China, and for over two thousand years it has generally been a possession of the various Chinese states that have existed in that time. That makes it far more Chinese in terms of age than Taiwan, which was not settled by Han Chinese until the 17th century, was actually claimed by the Netherlands and Spain before China, and was under Japanese rule for most of the first half of the 20th century. However, Hainan has always been on the periphery of China culturally and economically, as well as geographically. Interested readers are referred to Michalk (1986) for an overview of the island’s history.

Below I’ve included a map of the languages of Hainan, based mainly on Steven Huffman’s language maps.[1] One remarkable feature of the island’s linguistic geography which you can see from this map is that languages of four out of the five language families of Southeast Asia are spoken on it: Sino-Tibetan, Hmong-Mien, Kra-Dai and Austronesian. Only Austro-Asiatic is absent, although an Austro-Asiatic language (Vietnamese) is spoken on the nearby island of Bạch Long Vĩ, which is politically part of Vietnam. That’s quite impressive for an island not much larger than Sicily.


All of these languages are interesting and worthy of discussion, but for the sake of not giving myself too much to write about I’m going to focus in this post on those belonging to the Kra-Dai family: Be, Li, Cunhua and Jiamao. Because it’s relevant, I will also discuss the Austronesian language, Huihui, a little. These Kra-Dai languages are, most likely, the “indigenous” languages of the island, in the sense that they, or direct ancestors of them, were spoken on the island before the others. The Chinese language was obviously brought to Hainan by the Chinese settlers arriving mostly in the second millennium AD; the Mun language is closely related to (and sometimes considered the same as) the Kim Mun language spoken by some people of the Yao ethnicity in the mainland Chinese provinces of Guangxi and Hunan, and therefore these Mun-speakers are probably recent arrivals as well.

The most widespread and probably the most well-known language spoken on Hainan other than Chinese is the Li language, which is spoken in the mountainous interior of Hainan. Chinese sources use the name Li 黎 “black” to refer to the frequently-rebellious indigenous people of Hainan as early as the time of the Song Dynasty (960–1279). Of course, it can’t be assumed that this name refers to exactly the same group of people as the modern name Li does. But the geographic location of the Li-speakers—in the most inaccessible parts of the island, with Chinese settlers occupying the more habitable coastal lowlands—and their language’s phylogenetic position within Kra-Dai (we’ll talk about this more below) does strongly suggest that their language was the main language spoken on Hainan before Chinese settlement. Linguists sometimes use the name Hlai instead, which is presumably based on a native self-appellation. They also sometimes speak of the “Hlai languages” rather than the “Hlai language”, because, much like Chinese, the various Hlai “dialects” are actually highly divergent and often mutually unintelligible. This again suggests an antiquity to the presence of the Li language on Hainan—there must have been plenty of time for these dialects to differentiate from one another. Norquest (2007) has attempted a reconstruction of Proto-Hlai; you can look at his dissertation to get an idea of how different these dialects are from each other.

Two of the Li dialects, Cunhua and Nadouhua, have a special status. They are not much more distinct from neighbouring Li dialects than any other Li dialects are from the dialects neighbouring them. However, the speakers of these dialects are classified by the Chinese government as members of the Han Chinese ethnicity rather than the Li ethnicity, and they themselves identify more with the Han Chinese than the Li. Speakers of other Li dialects also refer to Han Chinese and Cunhua or Nadouhua speakers by the same name, Moi. Cunhua and Nadouhua do have far more borrowings from Chinese than the other Li dialects, but according to Norquest (2007) their basic vocabulary is mostly of Li origin, which indicates that they should be regarded as Li dialects heavily influenced by Chinese, rather than Chinese dialects influenced by Li or mixed languages. The influence from Chinese is probably due to the fact that the speakers of these dialects live in the coastal lowlands, where contact with Chinese settlers is greater, not in the mountains as do the speakers of the other Li dialects. It is also likely that the speakers have significant Han Chinese ancestry as well as Li ancestry, but I don’t know if any genetic studies have been done. In any case, because of their different ethnic status Cunhua and Nadouhua are often regarded as comprising a separate language from Li, usually referred to as Cunhua or Cun after the more well-known of the two dialects (Cunhua has many times more speakers than Nadouhua). This is reflected on the map above.

Another Li “dialect” is special because it is in the opposite situation to Cunhua and Nadouhua: its speakers do not have a separate ethnic identity from the Li, but the language is clearly divergent and may not even be genetically a Li language at all. This is Jiamao, which is also shown as a distinct language on the map above. Less than half of its lexicon appears to be of Li origin—that is, more than half of its words cannot be identified as similar to words in other Li dialects. Moreover—and more significantly—linguists have been unable to establish regular sound correspondences between the Jiamao words that do look similar to those in other Li dialects, and those Li dialects. In the words of Thurgood (1992a):

The Jiamao tones do not correspond with the tones of Proto-Hlai at all. The Jiamao initials and finals correspond, but with a pervasive, unsystematic irregularity that raised more questions than it answered. The Jiamao initials often have two relatively-frequent unconditioned reflexes, with other less-frequent reflexes thrown in apparently randomly. The more comparative work that was done, the more obvious it became that a comparative approach was not going to explain the “extreme (and apparently unsystematic) aberrancy” of Jiamao.

Some information given to Thurgood by a Chinese linguist, Ni Dabai (it’s not clear where Ni Dabai got the information from), gave him an idea as to why this might be the case. Ni Dabai said that the Jiamao were originally Muslims, and they arrived in two waves, the first in 986 AD and 988 AD and the second in 1486. Thurgood concluded from this that the Jiamao were originally speakers of an Austro-Asiatic language, who migrated to Hainan and thus ended up in close contact with Li speakers. The Jiamao ignored the tone of the Li words they borrowed, and instead decided which tone to pronounce them with based on their initial consonants; this explains the apparently random tone correspondences. And they borrowed words in several strata; this explains the one-to-many correspondences among the non-tonal segments.

I’m not entirely sure how Thurgood gets straight to “they must have been Austro-Asiatic speakers” from “they were originally Muslims,” though. Unfortunately the copy of Thurgood’s paper that I can access online is inexplicably cut off after the fourth page, so I don’t know if he elaborates on the scenario later on in the paper. I’m not aware of any Austro-Asiatic-speaking ethnic group whose members are mostly Muslim. My understanding is that most of the Muslims in Southeast Asia are the Malays, and their close relatives, the Chams, who speak Austronesian languages. To my uninformed, non-Southeast Asian expert, not-having-access-to-the-full-Thurgood-paper self, the Chams seem like the obvious candidates. The Cham kingdom (Champa), situated in what is now southern Vietnam, was for a millennium and a half an integral part of the political landscape of continental Southeast Asia. Its history is one of constant conflict with the Vietnamese kingdom to its north, in which it tended to be something of the underdog. The Vietnamese sacked the Cham capital in 982, 1044, 1068, 1069 (clearly, the 11th century wasn’t a good time for Champa), 1252, 1446, and 1471; after the last and most catastrophic sacking in 1471, the Vietnamese emperor finally annexed the capital and reduced Champa to a rump state occupying only what were originally just its southern regions. Then these regions, too, were chipped away over the next few centuries, and Champa finally vanished from the map in 1832. Some Cham still live in these regions, but they are no longer the dominant ethnic group there, having mostly either been massacred or fled, mostly to Cambodia in the west but also, in relatively small numbers, to Hainan in the east. This is how the Austronesian language you can see on the map, Huihui, ended up being spoken in Hainan. Huihui is simply an old-fashioned Chinese word for “Muslim”,[2] and the speakers of Huihui are indeed Muslims.
The Huihui call themselves and their language Tsat (which is cognate to Cham). According to Thurgood (1992b), the Tsat came to Hainan after the sacking of 982, and were mostly merchants who had established connections in the area, which explains their Muslim faith (most Cham at the time were Hindu, but much of the merchant class was Muslim; the Cham only became majority-Muslim during the 15th century, which is about the same time that the Malays converted). More Chams might have migrated to join the Tsat after the subsequent sackings.

Now, the dates Ni Dabai gave for the waves of Jiamao settlement—986 AD, 988 AD, 1486—are just a few years after the sackings of 982 AD and 1471 AD respectively, and that suggests to me that Jiamao, like Huihui, may have a Cham origin. But whereas the Cham origin of Huihui explains most everything about it, there are still a lot of unanswered questions with respect to Jiamao even if we accept that it has a Cham origin. Most obviously, what would have led them to take up residence in the highlands of the southeast, rather than the southern coast where Cham traders would have established the most contacts, and to assimilate so much into the Li culture that they gave up Islam (they are now animists like the Li and Be) and extensively relexified their language with Li loanwords?
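The date comparison is easy to check mechanically. The short Python sketch below pairs each of Ni Dabai’s arrival dates with the most recent sacking before it, using only the years given above; the helper name `previous_sacking` is made up for the example.

```python
from bisect import bisect_right

# Years in which the Cham capital was sacked, as listed in the text.
SACKINGS = [982, 1044, 1068, 1069, 1252, 1446, 1471]
# Arrival dates for the Jiamao reported by Ni Dabai.
ARRIVALS = [986, 988, 1486]


def previous_sacking(year):
    """Return the latest sacking year not after `year`, or None if there is none."""
    index = bisect_right(SACKINGS, year)
    return SACKINGS[index - 1] if index else None


for year in ARRIVALS:
    sacking = previous_sacking(year)
    print(f"arrival {year}: previous sacking {sacking}, gap {year - sacking} years")
```

The gaps come out as 4, 6 and 15 years, so the first wave follows the sacking of 982 very closely and the second follows 1471 somewhat less closely.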

Then there’s the problem of the actual linguistic evidence. Norquest in his dissertation examined the Jiamao lexicon and found a grand total of… 2 possible words of Austronesian origin (ɓaŋ˥ ɓɯa˩ ‘butterfly’ and pəj˦ ‘pig’; cf. Proto-Austronesian *qari-baŋbaŋ and *babuy), and none of Austro-Asiatic or of any other identifiable origin, apart from Li. He therefore regards the language as a provisional language isolate. Now, I don’t know how well Norquest knows Austronesian and Austro-Asiatic. He doesn’t explicitly rule out a connection with either of those families; he’s more concerned with simply listing the non-Li Jiamao vocabulary than identifying its origin. So it’s not impossible that Jiamao’s non-Li vocabulary is from one of the main Southeast Asian families, but this is certainly something on which more research needs to be done. I have included below some of the Jiamao and Proto-Hlai words for various body parts, to illustrate the difference; this data is taken from Norquest’s dissertation.

Proto-Hlai    Jiamao      Sense
*dʱəŋ         pʰan1       ‘face’
*ʋaːɦ         vet10       ‘shoulder’
*kʰiːn        tɯːn1       ‘arm’
*ɦaːŋ         tsʰɔːŋ1     ‘chin’

In any case, I assume Thurgood had a good reason for proposing the Austro-Asiatic connection (I just can’t figure out by myself what that reason would be). Another caveat to bear in mind here is that Ni Dabai’s information might be incorrect—even if the story of Jiamao being descended from Muslim immigrants arriving in 986 AD, 988 AD and 1486 isn’t completely false, it could be wrong in some details: perhaps they were Hindus rather than Muslims, and perhaps the dates are inaccurate. In short, it’s a mystery. But an interesting one, don’t you think? It’s just a shame that there has been so little investigation into it, so far—Thurgood’s not-wholly-accessible paper and Norquest’s dissertation are the only two papers I can find which go into any detail about Jiamao.


Moving on… there is one other Kra-Dai language spoken on Hainan, which is completely different, both linguistically and ethnically, from Li. The Be language constitutes a branch of Kra-Dai of its own, and it does not appear to be much more closely related to the Li languages than it is to other Kra-Dai languages. The subgrouping of the branches of the Kra-Dai family is not particularly certain (as usual for language families; subgrouping is a hard problem in linguistics). Wikipedia gives a nice overview, and I’ve included below a tree adapted from Blench (2013), which appears to be just the Edmondson and Solnit classification mentioned in the Wikipedia article. As you can see, Be is often considered the closest relative of the Tai branch (the one that contains the one Kra-Dai language most people have heard of, Thai, the official language of Thailand). In fact, Norquest in his dissertation mentions that it shows the greatest lexical similarity with the Northern Tai subgroup specifically, meaning it might actually be a Tai language; unfortunately, this cannot be verified until more comparative work on Kra-Dai languages is done (no full reconstruction of Proto-Tai or Proto-Northern Tai is yet available).

This suggests that Be is a more recent arrival on Hainan than Li, because it must have arrived after or close to the time that the Tai subgroup separated from the other Kra-Dai languages, whereas Li could have split off straight from Proto-Kra-Dai. Shintani (1991) has some phonological evidence which he says supports this: the Hainanese dialect of Chinese has undergone a sound change s > t (that is, s in other Chinese dialects corresponds to Hainanese t), and the Be language reflects this sound change in borrowings from Chinese such as tuan “garlic” (cf. Mandarin suan). That means it must have borrowed these words from Hainanese, and Shintani takes this as indicating that Be speakers arrived on Hainan after Chinese settlers were established on the island (that would be no earlier than the time of the Song Dynasty of 960-1279). But I don’t quite follow this inference: couldn’t the Be have arrived first, and borrowed these words only after the Chinese arrived?

That a Tai-speaking group might have migrated to Hainan in the historical period is not implausible, however. Although the political prominence of Thai in modern times might lead you to think otherwise, the Tai languages originated in southern China—more precisely, in the area of the modern provinces of Guizhou and Guangxi, probably extending into adjacent regions of Yunnan and Vietnam as well—and were restricted to that region for much of the historical period. Around 1000 AD, some of them began to migrate to the southwest, perhaps to escape Chinese political domination, although this doesn’t seem like a complete explanation—though the Chinese population in the area has surely been growing over time, they had held the political power since long before 1000 AD. (Also, plenty of Tai-speaking peoples remained in their homeland—in fact, the Tai-speaking Zhuang people still comprise over a quarter of the population of Guangxi). These migrations continued for the next couple of centuries, and by the 13th century the familiar Tai kingdoms of the historical record were being established (Sukhothai in the central part of modern Thailand; Lanna in the northern part of modern Thailand; the Shan states in the eastern part of modern Burma; and Ahom way over in the Brahmaputra valley just east of modern Bangladesh). The Lao people of Laos established their kingdom, known then as Lan Xang “[land of the] million elephants”, in the following century. Over the centuries these evolved into the modern Tai states of Thailand and Laos. Now, if the Tai migrated to the southwest because they wished to leave southern China (rather than being attracted by some particular feature of the southwest), we could positively expect some of them to take the alternative route to the direct south and end up on Hainan. Perhaps this, then, is the origin of the Be.

There is an alternative scenario I can think of which is probably less plausible, but a bit more exciting. Maybe the Be have always been on Hainan—or at least, they have been there as long as the Li have. Be being part of or most closely related to the Tai branch isn’t incompatible with this hypothesis. There’s a useful heuristic in linguistics that a region where a language family is most diverse is likely to be its place of origin, because the longer the presence of a speech variety in a given area, the more time it has to diversify into divergent but genetically related daughters. It’s a heuristic, not a rule, so exceptions are possible, and in fact one of the obvious ways an exception could arise is if external pressure repeatedly pushes speakers of languages in the family into a particular small cul-de-sac region (a “refugium”), which is what would have happened in Hainan in the scenario described in the above paragraph. And of course, the diversity of Kra-Dai in Hainan, with just two independent branches represented, isn’t that much greater than anywhere else (there are four independent branches in Guangxi, namely Kra, Lakkia, Kam-Sui and Tai, and by including an adjacent region of Guangdong the remaining Biao branch can be included as well; of course, Guangxi is a lot bigger than Hainan, and depending on how deep you imagine some of the proposed subgroups are, your perception of each region’s diversity might be altered). But I don’t think it’s ludicrous to think that the Kra-Dai languages, or at least a sub-clade of them excluding Kra, might have originated on Hainan. They might have differentiated first into a southern variety (pre-Li) and a northern variety; a first wave of migration onto the mainland, by the speakers of the northern variety, would have brought about the split between Proto-Lakkia-Biao-Kam-Sui and Proto-Be-Tai; and a second wave would have brought about the split between Proto-Tai and Be.
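The diversity heuristic amounts to counting independent branches per region, which can be sketched in Python as follows. The branch lists restate the figures in the paragraph above; treating every named branch as equally independent, and ignoring the size of each region, are simplifying assumptions, and the function name `rank_by_diversity` is made up for the example.

```python
# Kra-Dai branches represented in each region, as given in the text.
BRANCHES_BY_REGION = {
    "Hainan": {"Li", "Be"},
    "Guangxi": {"Kra", "Lakkia", "Kam-Sui", "Tai"},
    "Guangxi + adjacent Guangdong": {"Kra", "Lakkia", "Kam-Sui", "Tai", "Biao"},
}


def rank_by_diversity(regions):
    """Sort regions from the most to the fewest branches represented."""
    return sorted(regions.items(), key=lambda item: len(item[1]), reverse=True)


for region, branches in rank_by_diversity(BRANCHES_BY_REGION):
    names = ", ".join(sorted(branches))
    print(f"{region}: {len(branches)} branches ({names})")
```

On a raw count the mainland comes out ahead of Hainan, so any case for a Hainan homeland has to rest on how deep the splits are rather than on how many branches there are.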

This is especially interesting to consider in the light of the Austro-Tai hypothesis, one of the most plausible macrofamily proposals floating around. Essentially it proposes a genetic relationship between the Kra-Dai languages and the Austronesian languages, although opinions among proponents differ as to whether Kra-Dai is coordinate to Austronesian (that is, Proto-Kra-Dai and Proto-Austronesian share a common ancestor, but neither is the ancestor of the other) or subordinate to Austronesian (that is, Proto-Austronesian is the ancestor of Proto-Kra-Dai). Sagart (2004) is of the opinion that it is subordinate. If Kra-Dai is subordinate to Austronesian then the possibility arises that Austronesians migrated to Hainan, just as they migrated to essentially all of the islands in Southeast Asia and Oceania (plus Madagascar!). Unfortunately, the facts do not seem friendly to this neat hypothesis: nobody, so far as I know, goes so far as to say that Kra-Dai is subordinate to Malayo-Polynesian (the subgroup of Austronesian which includes all of the Austronesian languages outside of Taiwan), and the Austronesians probably hadn’t developed their island-hopping habits so extensively at the point where they were still in Taiwan. The more likely scenario, if the Austro-Tai hypothesis is correct, is that Proto-Kra-Dai was the result of a migration from Taiwan onto mainland China; and in order to reconcile this with the Hainan homeland hypothesis we’d have to propose a migration onto Hainan and then multiple migrations back out again, which is kind of untidy. So, for various reasons, I don’t really think the Hainan homeland hypothesis is likely to be correct. I’d say it’s more likely that the homeland of the Kra-Dai languages is on the mainland, somewhere in Guangxi. But it’s not impossible.


  1. Huffman’s maps do not always make it clear which language is spoken within a given boundary; in order to identify the languages spoken in scattered pockets in the northern part of the Li-speaking area and to the north and east of that area, I had to refer to the wonderful but not entirely reliable map at Muturzikin. Unfortunately the boundaries on Muturzikin’s map are not entirely the same as those on Huffman’s, and even on Muturzikin’s map, it is sometimes not entirely clear what language is spoken within a particular boundary, so I have had to make some guesses in identifying all of these pockets as Mun-speaking.
  2. The modern Chinese word for “Muslim” is Musilin, but the unreduplicated word Hui, which strictly speaking refers only to Chinese Muslims, is often colloquially used to refer to Muslims of any nationality.


Blench, R., 2013. The prehistory of the Daic (Tai-Kadai) speaking peoples and the hypothesis of an Austronesian connection. In Unearthing Southeast Asia’s past: Selected Papers from the 12th International Conference of the European Association of Southeast Asian Archaeologists (Vol. 1, pp. 3-15).

Michalk, D.L., 1986. Hainan Island: A brief historical sketch. Journal of the Hong Kong Branch of the Royal Asiatic Society, pp.115-143.

Norquest, P.K., 2007. A phonological reconstruction of Proto-Hlai. ProQuest.

Sagart, L., 2004. The higher phylogeny of Austronesian and the position of Tai-Kadai. Oceanic Linguistics, 43(2), pp.411-444.

Shintani, T., 1991. Preglottalized consonants in the languages of Hainan Island, China. Journal of Asian and African Studies, (41), pp.1-10.

Thurgood, G., 1992. The aberrancy of the Jiamao dialect of Hlai: speculation on its origins and history. Southeast Asian Linguistics Society I, pp.417-433.

Thurgood, G., 1992b. From Atonal to Tonal in Utsat (A Chamic Language of Hainan). In Proceedings of the Eighteenth Annual Meeting of the Berkeley Linguistics Society: Special Session on the Typology of Tone Languages (pp. 145-146).

The perfect pathway

Anybody who knows French or German will be familiar with the fact that the constructions in these languages described as “perfects” tend to be used in colloquial speech as simple pasts[1] rather than true perfects. This can be illustrated by the fact that the English sentence (1) is ungrammatical, whereas the French and German sentences (2) and (3) are perfectly grammatical.

  1. *I have left yesterday.
  2. Je suis parti hier.
    I am leave-PTCP yesterday
    “I left yesterday.”
  3. Ich habe gestern verlassen.
    I have-1SG yesterday leave-PTCP
    “I left yesterday.”

The English perfect is a true perfect, referring to a present state which is the result of a past event. So, for example, the English sentence (4) is paraphrased by (5).

  4. I have left.
  5. I am in the state of not being present resulting from having left.

As it is specifically present states which are referred to by perfects, it makes no sense for a verb in the perfect to be modified by an adverb of past time like ‘yesterday’. That’s why (1) is ungrammatical. In order for ‘yesterday’ to modify the verb in (1), the verb would have to refer to a past state resulting from an event further in the past; the appropriate category for such a verb is not the perfect but rather the pluperfect or past perfect, which is formed in the same way as the perfect in English except that the auxiliary verb have takes the past tense. It’s perfectly fine for adverbs of past time to modify the main verbs of pluperfect constructions; cf. (6).

  6. I had left yesterday.

If the French and German “perfects” were true perfects like the English perfect, (2) and (3) would have to be ungrammatical too, and as they are not in fact ungrammatical we can conclude that these “perfects” are not true perfects. (Of course one could also conclude this from asking native speakers about the meaning of these “perfects”, and one has to take this step to be able to conclude that they are in fact simple pasts; the above is just a neat way of demonstrating their non-true perfect nature via the medium of writing.)

French and German verbs do have simple past forms which have a distinctive inflection; for example, partis and verließ are the first-person singular inflected simple past forms of the verbs meaning ‘leave’ in sentences (2) and (3) respectively, corresponding to the first-person singular present forms pars and verlasse. But these inflected simple past forms are not used in colloquial speech; their function has been taken over by the “perfect”. If you take French or German lessons you are taught how to use the “perfect” before you are taught how to use the simple past, because the “perfect” is more commonly used; it’s the other way round if you take English lessons, because in English the simple past is not restricted to literary speech, and is more common than the perfect as it has a more basic meaning.

The French and German “perfects” were originally true perfects even in colloquial speech, just as in English. So how did this change in meaning from perfect to simple past occur? One way to understand it is as a simple case of generalization. The perfect is a kind of past; if one were to translate (4) into a language such as Turkish which does not have any sort of perfect construction, but does have a distinction between present and past tense, one would translate it as a simple past, as in (7).

  7. Ayrıldım.
    “I left / have left.”

The distinction in meaning between the perfect and the simple past is rather subtle, so it is not hard to imagine the two meanings being confused with each other frequently enough that the perfect came eventually to be used with the same meaning as the simple past. This could have been a gradual process. After all, it is often more or less a matter of arbitrary perspective whether one chooses to focus on the state of having done something, and accordingly use the perfect, or on the doing of the thing itself, and accordingly use the simple past. Here’s an example: if somebody tells you to look up the answer to a question which was raised in a discussion of yours with them, and you go away and look up the answer, and then you meet this person again, you might say either “I looked up the answer” or “I’ve looked up the answer”. At least to me, neither utterance seems any more expected in that situation than the other. French and German speakers may have tended over time to err more and more on the side of focusing on the state, so that the perfect construction became more and more common, and this would encourage reanalysis of the meaning of the perfect as the same as that of the simple past.

But it might help to put this development in some further context. It’s not only in French and German that this development from perfect to simple past has occurred. In fact, it seems to be pretty common. Well, I don’t know about other families, but it is definitely common among the Indo-European (IE) languages. There is, in fact, evidence that the development occurred in the history of English, during the development of Proto-Germanic from Proto-Indo-European (PIE). (This means German has undergone the development twice!) I’ll talk a little bit about this pre-Proto-Germanic development, because it’s a pretty interesting one, and it ties in with some of the other cases of the development attested from IE languages.

PIE (or at least a late stage of it; we’ll talk more about that issue below) distinguished three different aspect categories, which are traditionally called the “present”, “aorist” and “perfect”. The names of these aspects do not have their usual meanings—if you know about the distinction between tense and aspect, you probably already noticed that “present” is normally the name of a tense, rather than an aspect. (Briefly, tense is an event or state’s relation in time to the speech act, aspect is the structure of the event on the timeline without any reference to the speech act; for example, aspect includes things like whether the event is completed or not. But this isn’t especially important to our discussion.) The better names for the “present” and “aorist” aspects are imperfective and perfective, respectively. The difference between them is the same as that between the French imperfect and the French simple past: the perfective (“aorist”) refers to events as completed wholes and the imperfective (“present”) refers to other events, such as those which are iterated, habitual or ongoing. Note that present events cannot be completed yet and therefore can only be referred to by imperfectives (“presents”). But past events can be referred to by either imperfectives or perfectives. So, although PIE did distinguish two tenses, present and past, in addition to the three aspects, the distinction was only made in the imperfective (“present”, although that name is getting especially confusing here) aspect because the perfective (“aorist”) aspect entailed past tense. The past tense of the imperfective aspect is called the imperfect rather than the past “present” (I guess even IEists would find that terminology too ridiculous).

So what was the meaning of the PIE “perfect”? Well, the PIE “perfect” is reflected as a true perfect in Classical Greek. The system of Classical Greek, with the imperfect, aorist and true perfect all distinguished from one another, was more or less the same as that of modern literary French. However, according to Ringe (2006: 25, 155), the “perfect” in the earlier Greek of Homer’s poems is better analyzed as a simple stative, referring to a present state without any implication of this state being the result of a past event. Now, I’m not sure exactly what the grounds for this analysis are. Ringe doesn’t elaborate on it very much and the further sources it refers to (Wackernagel 1904; Chantraine 1927) are in German and French, respectively, so I can’t read them very easily. The thing is, every state has a beginning, which can be seen as an event whose result is the state, and thus every simple stative can be seen as a perfect. English does distinguish simple statives from perfects (predicative adjectives are stative, as are certain verbs in the present tense, such as “know”). The difference seems to me to be something to do with how salient the event that begins the state—the state’s inception—is. Compare sentences (8) and (9), which have more or less the same meaning except that the state’s inception is more salient in (9) (although still not as salient as it is in (10)).

  8. He is dead.
  9. He has died.
  10. He died.

But I don’t know if there are any more concrete diagnostic tests that can distinguish a simple stative from a perfect. Homeric and Classical Greek are extinct languages, and it seems like it would be difficult to judge the salience of inceptions of states in sentences of these languages without having access to native speaker intuitions.

It is perhaps the case that some states are crosslinguistically more likely than others to be referred to by simple statives, rather than perfects. Perhaps the change was just that the “perfect” came to be used more often to refer to states that crosslinguistically tend to be referred to by perfects. Ringe (2006: 155) says:

… a large majority of the perfects in Classical Attic are obvious innovations and have meanings like that of a Modern English perfect; that is, they denote a past action and its present result. We find ἀπεκτονέναι /apektonénai/ ‘to have killed’, πεπομφέναι /pepompʰénai/ ‘to have sent’, κεκλοφέναι /keklopʰénai/ ‘to have stolen’, ἐνηνοχέναι /enęːnokʰénai/ ‘to have brought’, δεδωκέναι /dedǫːkénai/ ‘to have given’, γεγραφέναι /gegrapʰénai/ ‘to have written’, ἠχέναι /ęːkʰénai/ ‘to have led’, and many dozens more. Most are clearly new creations, but a few appear to be inherited stems that have acquired the new ‘resultative’ meaning, such as λελοιπέναι /leloipʰénai/ ‘to have left behind’ and ‘to be missing’ (the old stative meaning).

These newer perfects could still be glossed as simple statives (‘to be a thief’ instead of ‘to have stolen’, etc.) but the states they refer to do seem to me to be ones which inherently tend to involve a salient reference to the inception of the state.

There is a pretty convincing indication that the “perfect” was a simple stative at some point in the history of Greek: some Greek verbs whose meanings are conveyed by lexically stative verbs or adjectives in English, such as εἰδέναι ‘to know’ and δεδιέναι ‘to be afraid of’, only appear in the perfect and pluperfect. These verbs are sometimes described as using the perfect in place of the present and the pluperfect in place of the imperfect, although at least in Homeric Greek their appearance in only the perfect and pluperfect is perfectly natural in respect of their meaning and does not need to be treated as a special case. These verbs continued to appear only in the perfect and pluperfect during the Classical period, so they do not tell us anything about when the Greek “perfect” became a true perfect.

Anyway, it is on the basis of the directly attested meaning of the “perfect” in Homeric Greek that the PIE “perfect” is reconstructed as a simple stative. Other IE languages do preserve relics of the simple stative meaning which add to the evidence for this reconstruction. There are in fact relics of the simple stative meaning in the Germanic languages which have survived, to this day, in English. These are the “preterite-present” or “modal” verbs: can, dare, may, must, need, ought, shall and will. Unlike other English verbs, these verbs do not take an -s ending in the third person singular (dare and need can take this ending, but only when their complements are to-infinitives rather than bare infinitives). Apart from will (which has a slightly more complicated history), the preterite-present verbs are precisely those whose presents are reflexes of PIE “perfects” rather than PIE “presents” (although some of them have unknown etymologies). It is likely that they were originally verbs that appeared only in the perfect, like Greek εἰδέναι ‘to know’.[2]

Most of the PIE “perfects”, however, ended up as the simple pasts of Proto-Germanic strong verbs. (That’s why the preterite-present verbs are called preterite-presents: “preterite” is just another word for “past”, and the presents of preterite-present verbs are inflected like the pasts of other verbs.) Presumably these “perfects” underwent the whole two-step development from simple stative to perfect to simple past. There was plenty of time for this to occur: remember that the Germanic languages are unattested before 100 AD, and the development of the true perfect in Greek had already occurred by 500 BC. Just as the analytical simple pasts of colloquial French and German, which are the reflexes of former perfects, have completely replaced the older inflected simple pasts, so the PIE “perfects” completely replaced the PIE “aorists” in Proto-Germanic. According to Ringe (2006: 157) there is absolutely no trace of the PIE “aorist” in any Germanic language. Proto-Germanic also lost the PIE imperfective-perfective opposition, and again the simple pasts reflecting the PIE “perfects” completely replaced the PIE imperfects, with a single exception. This was the verb *dōną ‘to do’, whose past stem *ded- is a reflex of the PIE present stem *dʰédeh₁ ‘put’. Admittedly, the development of this verb as a whole is somewhat mysterious (it is not clear where its present stem comes from; proposals have been put forward, but Ringe 2006: 160 finds none of them convincing) but given its generic meaning and probable frequent use it is not surprising to find it developing in an exceptional way. One reason we can be quite sure it was used very frequently is that the *ded- stem is the same one which is thought to be reflected in the past tense endings of Proto-Germanic weak verbs. There’s a pretty convincing correspondence between the Gothic weak past endings and the Old High German (OHG) past endings of tuon ‘to do’:

| Number | Person | Past of Gothic waúrkjan ‘to make’ | Past of OHG tuon ‘to do’ |
| Singular | First-person | waúrhta ‘I made’ | tëta ‘I did’ |
| Singular | Second-person | waúrhtēs ‘you (sg.) made’ | tāti ‘you (sg.) did’ |
| Singular | Third-person | waúrhta ‘(s)he made’ | tëta ‘(s)he did’ |
| Plural | First-person | waúrhtēdum ‘we made’ | tāti ‘we did’ |
| Plural | Second-person | waúrhtēduþ ‘you (pl.) made’ | tātīs ‘you (pl.) did’ |
| Plural | Third-person | waúrhtēdun ‘they made’ | tāti ‘they did’ |

Note that Proto-Germanic *ē is reflected as ē in Gothic but ā in the other Germanic languages, so the alternation between -t- and -tēd- at the start of each ending in Gothic corresponds exactly, phonologically and morphologically, to the alternation between the stems tët- and tāt- in OHG.

The pasts of Germanic weak verbs must have originally been formed by an analytical construction with a syntax similar to that of the English, French and German perfect constructions, involving the auxiliary verb *dōną ‘to do’ in the past tense (probably in a sense of ‘to make’) and probably the past participle of the main verb. As pre-Proto-Germanic had SOV word order, the auxiliary verb could then be reinterpreted as an ending on the past participle, which would take us (with a little haplology) from (11) to (12).

  11. *Ek wēpną wurhtą dedǭ.
    I weapon wrought-NSG made-1SG
    “I wrought a weapon” (lit. “I made a weapon wrought”)
  12. *Ek wēpną wurht(ąd)edǭ
    I weapon wrought-1SG
    “I wrought a weapon”

(The past of waúrht- is glossed here by the archaic ‘wrought’ to distinguish it from ded- ‘make’, although ‘make’ is the ideal gloss for both verbs. I should probably have just used a verb other than waúrkjan in the example to avoid this confusion, but oh well.)

Why couldn’t the pasts of weak verbs have been formed from PIE “perfects”, like those of strong verbs? The answer is that the weak verbs were those that did not have perfects in PIE to use as pasts. Many PIE verbs never appeared in one or more of the three aspects (“present”, “aorist” and “perfect”). I already mentioned the verbs like εἰδέναι < PIE *weyd- ‘to know’ which only appeared in the perfect in Greek, and probably in PIE as well. One very significant and curious restriction in this vein was that all PIE verbs which were derived from roots by the addition of a derivational suffix appeared only in the present aspect. There is no semantic reason why this restriction should have existed, and it is therefore one of the most convincing indications that PIE did not originally have morphological aspect marking on verbs. Instead, aspect was marked by the addition of derivational suffixes. There must have been a constraint on the addition of multiple derivational suffixes to a single root (perhaps because it would mess up the ablaut system, or perhaps just because it’s a crosslinguistically common constraint), and that would account for this curious restriction. Other indications that aspect was originally marked by derivational suffixes in PIE are the fact that the “present”, “aorist” and “perfect” stems of each PIE verb do not have much of a consistent formal relation to one another (there are some consistencies, e.g. all verbs which have a perfect stem form it by reduplication of the initial syllable, although *weyd- ‘know’, which has no present or aorist stem, is not reduplicated; but the general rule is one of inconsistency); there is no single present or aorist suffix, for example, and one pretty much has to learn each stem of each verb off by heart. Also, I think I’ve read, although I can’t remember where I read it, that aspect is still marked (wholly, or largely) by derivational suffixes only in Hittite.

The class of derived verbs naturally expanded over time, while the class of basic verbs became smaller. The inability of derived verbs to have perfect stems is therefore perhaps the main reason why it was necessary to use an alternative strategy for forming the pasts of some verbs in Proto-Germanic, and thus to create a new class of weak verbs separate from the strong verbs.

So that’s the history of the PIE “perfect” in Germanic (with some tangential, but hopefully interesting elaboration). A similar development occurred in Latin. A few PIE “perfects” were preserved in Latin as statives, just like the Germanic preterite-presents (meminisse ‘to remember’, ōdisse ‘to hate’, nōvisse ‘to recognize, to know (someone)’); the others became simple pasts. But I don’t know much about the details of the developments in Latin.


We’ve seen evidence from Indo-European languages that there’s a kind of developmental pathway going on: statives develop into perfects, and perfects develop into simple pasts. In order for the first step to occur there has to be some kind of stative category, and it looks like this might be a relatively uncommon feature: most of the languages I’ve seen have a class of lexically stative verbs or tend to use entirely different syntax for events and states (e.g. verbs for events, adjectives for states). (English does a bit of both.) The existence of the stative category in PIE might be associated with the whole aspectual system’s recent genesis via morphologization of derivational suffixes. Of course the second part of the pathway can occur on its own, as it did in French and German after perfects were innovated via an analytical construction. It is also possible for simple pasts to be innovated straight away via analytical constructions, as we saw with the Germanic weak past inflection.

It would be interesting to hear if there are any other examples of developments occurring along this pathway, or, even more interestingly, examples where statives, perfects or simple pasts have developed or have been developed in completely different ways, from non-Indo-European languages (or Indo-European languages that weren’t mentioned here).


  1. ^ I’m using the phrase “simple past” here to refer to the past tense without the additional meaning of the true perfect (that of a present state resulting from the past event). In French the simple past can be distinguished from the imperfect as well as the perfect: the simple past refers to events as completed wholes (and is therefore said to have perfective aspect), while the imperfect refers either to iterated or habitual events, or to part of an event without the entailment that the event was completed (and is therefore said to have imperfective aspect). The perfect also refers to events as completed wholes, but it also refers to the state resulting from the completion of such events, more or less at the same time (arguably the state is the more primary reference). In colloquial French, the perfect is used in place of the simple past, so that no distinction is made between the simple past and perfect (and the merged category takes the name of the simple past), but the distinction from the imperfect is preserved. Thus the “simple past” in colloquial French is a little different from the “simple past” in colloquial German; German does not distinguish the imperfect from the simple past in either its literary or colloquial varieties. The name “aorist” can be used to refer to a simple past category like the one in literary French, i.e., a simple past which is distinct from both the perfect and the imperfect.
  2. ^ Of course, εἰδέναι appears in the pluperfect as well as the perfect, but the Greek pluperfect was an innovative formation, not inherited from PIE, and there is no reason to think Proto-Germanic ever had a pluperfect. The Proto-Germanic perfect might well have referred to a state of indeterminate tense resulting from a past event, in which case verbs in the perfect probably could be modified with adverbs of past time like ‘yesterday’. It is a curious thing that the present and past tenses were not distinguished in the PIE “perfect”; there is no particular reason why they should not have been (simple stative meaning is perfectly compatible with both tenses, cf. English “know” and “knew”) and it is therefore perhaps an indication that tense distinction was a recent innovation in PIE, which had not yet had time to spread to aspects other than the imperfective (“present”). The nature of the endings distinguishing the present and past tense is also suggestive of this; for example the first-person, second-person and third-person singular endings are *-mi, *-si and *-ti respectively in the present and *-m, *-s and *-t respectively in the past, so the present endings can be derived from the past endings by the addition of an *-i element. This *-i element has been hypothesised to have originally been a particle indicating present tense; it’s called the hic et nunc (‘here and now’) particle. I don’t know how the other endings are accounted for though.


Ringe, D., 2006. From Proto-Indo-European to Proto-Germanic (A Linguistic History of English, Vol. 1). Oxford University Press.

Dirichlet’s approximation theorem

The definition of rational numbers is usually expressed as follows.

Definition 1 For every real number {x}, {x} is rational if and only if there are integers {p} and {q} such that {q \ne 0} and {x = p/q}.

Remark 1 For every pair of integers {p} and {q} such that {q \ne 0}, {p/q = (p'/\gcd(p, q))/(|q|/\gcd(p, q))}, where {p' = p} if {q > 0} and {p' = -p} if {q < 0}. Therefore, the definition which is the same as Definition 1 except that {q} is required to be positive and {p} is required to be coprime to {q} is equivalent to Definition 1.

However, there’s a slightly different way one can express the definition, which uses the fact that the equations {x = p/q} and {q x = p} are equivalent.

Definition 2 For every real number {x}, {x} is rational if and only if there is a nonzero integer {q} such that {q x} is an integer.

Remark 2 The definition which is the same as Definition 2 except that {q} is required to be positive and {q x} is required to be coprime to {q} is equivalent to Definition 2.

The nice thing about Definition 2 is that it immediately brings to mind the following algorithm for verifying that a real number {x} is rational: iterate through the positive integers in ascending order, and for each positive integer {q} check whether {q x} is an integer. (It’s assumed that it is easy to check whether an arbitrary real number is an integer.) If it is an integer, stop the iteration. The algorithm terminates if and only if {x} is rational. The algorithm is obviously not very useful if it is actually used by a computer to check for rationality: one obvious problem is that it cannot verify irrationality; it can only falsify it. But it is useful as a guide to thought. Mathematical questions are often easier to think about if they are understood in terms of processes, rather than in terms of relationships between static objects.
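For concreteness, here is a minimal Python sketch of the algorithm just described. It is only an illustration, and it assumes the input is given exactly as a `fractions.Fraction` (a computer can't hold an arbitrary real number, so an irrational input can't really be represented here). The function name `smallest_denominator` and the optional `max_q` cut-off are my own inventions for the example, not anything standard.

```python
from fractions import Fraction
from itertools import count


def smallest_denominator(x, max_q=None):
    """Return the least positive integer q such that q * x is an integer.

    x is assumed to be an exact Fraction. If max_q is given, the search
    gives up and returns None once q exceeds it (a hypothetical safety
    valve; the algorithm in the text has no such cut-off).
    """
    for q in count(1):  # q = 1, 2, 3, ... in ascending order
        if max_q is not None and q > max_q:
            return None
        if (q * x).denominator == 1:  # q * x is an integer
            return q
```

For example, `smallest_denominator(Fraction(3, 4))` returns 4, since 3/4, 6/4 and 9/4 are not integers but 12/4 is.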

In particular, there’s a natural way in which some irrational numbers can be said to be “closer to rational” than others, in terms of this algorithm. If {x} is irrational, then none of the terms in the sequence {\langle x, 2 x, 3 x, \dotsc \rangle} are integers. But how close to integers are the terms? The closer they are to integers, the closer to rational {x} can be said to be.

But how is the closeness of the integers to the terms of the sequence to be measured? There are different ways this can be done. Perhaps the most natural way to start off with is to measure it by the minimum of the distances of the terms from the closest integers to them—that is, the minimum of the set {\{|q x - p|: p \in \mathbf Z, q \in \mathbf N\}}. Of course, this minimum may not even exist—it may be possible to make {|q x - p|} arbitrarily small by choosing appropriate integers {p} and {q} such that {q > 0}. So the first question to answer is this: for which values of {x} does the minimum exist?

The answer to this question is given by Dirichlet’s approximation theorem.

Theorem 3 (Dirichlet’s approximation theorem) For every real number {x} and every positive integer {n}, there are integers {p} and {q} such that {q > 0} and

\displaystyle  |q x - p| < \frac 1 n. \ \ \ \ \ (1)

Proof: First, let us define some notation. For every real number {x}, let {[x]} denote the greatest integer less than or equal to {x} and let {\{x\}} denote {x - [x]}. Note that the inequality {0 \le \{x\} < 1} always holds.

Now, suppose {x} is a real number and {n} is a positive integer. The {n + 1} real numbers 0, {\{x\}}, {\{2 x\}}, … and {\{n x\}} are all in the half-open interval {I = [0, 1)}. This interval can be partitioned into the {n} sub-intervals {I_1 = [0, 1/n)}, {I_2 = [1/n, 2/n)}, … and {I_n = [1 - 1/n, 1)}, each of length {1/n}. These {n + 1} real numbers are distributed among these {n} sub-intervals, and since there are more real numbers than sub-intervals at least one of the sub-intervals contains more than one of the real numbers. That is, there are distinct integers {r} and {s} such that {0 \le r \le n}, {0 \le s \le n}, and {\{r x\}} and {\{s x\}} are in the same sub-interval and hence {|\{r x\} - \{s x\}| < 1/n}. Or, equivalently:

\begin{array}{rcl} 1/n &>& |\{r x\} - \{s x\}| \\  &=& |(r x - [r x]) - (s x - [s x])| \\  &=& |(r - s) x - ([r x] - [s x])|, \end{array}

so if we let {q = r - s} and {p = [r x] - [s x]} we have {|q x - p| < 1/n}. And {r} and {s} can be chosen so that {r > s} (swapping them if necessary), and hence {q} is positive. \Box
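The proof is constructive, so it can be turned directly into a search procedure. The following Python sketch follows the pigeonhole argument step by step. It assumes {x} is given as an exact `fractions.Fraction` so that the fractional parts are computed without rounding error, and the name `dirichlet_pair` is just a label chosen for this illustration.

```python
from fractions import Fraction
from math import floor


def dirichlet_pair(x, n):
    """Return integers (p, q) with 1 <= q <= n and |q*x - p| < 1/n.

    x is assumed to be an exact Fraction and n a positive integer.
    """
    seen = {}  # sub-interval index k -> multiplier s with {s x} in [k/n, (k+1)/n)
    for r in range(n + 1):
        frac = r * x - floor(r * x)  # the fractional part {r x}, in [0, 1)
        k = floor(frac * n)          # index of the sub-interval containing {r x}
        if k in seen:
            s = seen[k]              # s < r, and {s x}, {r x} share a sub-interval
            q = r - s
            p = floor(r * x) - floor(s * x)
            return p, q
        seen[k] = r
    # n + 1 numbers cannot occupy n sub-intervals without a collision.
    raise AssertionError("unreachable by the pigeonhole principle")
```

Because the loop visits {r} in ascending order, the stored {s} is always smaller than {r}, so {q = r - s} comes out positive automatically, and it is at most {n}.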

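Dirichlet’s approximation theorem says that for every real number {x}, {|q x - p|} can be made arbitrarily small by choosing appropriate integers {p} and {q} such that {q > 0}. Hence, if {x} is irrational, the minimum of the set {\{|q x - p|: p \in \mathbf Z, q \in \mathbf N\}} does not exist: its members are all positive, but they come arbitrarily close to 0. (If {x} is rational, the minimum is trivially 0.)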

It may not be immediately obvious from the way in which it has been presented here why Dirichlet’s approximation theorem is called an “approximation theorem”. The reason is that if the inequality {|q x - p| < 1/n} is divided through by {q} (which produces an equivalent inequality, given that {q} is positive), the result is

\displaystyle  \left| x - \frac p q \right| < \frac 1 {n q}. \ \ \ \ \ (2)

So Dirichlet’s approximation theorem can also be interpreted as saying that for every real number {x} and every positive integer {n}, it is possible to find a rational approximation {p/q} to {x} (where {p} and {q} are integers and {q > 0}) whose error is less than {1/(nq)}. In fact, this is how the theorem is usually presented. When it’s presented in this way, Dirichlet’s approximation theorem can be seen as an addendum to the fact that for every positive integer {n}, it is possible to find a rational approximation {p/q} to {x} whose error is less than {1/n}; that is, every real number has rational approximations with arbitrarily small error. (This is very easily proven: it’s really just another way of expressing the fact that the set of all rational numbers, {\mathbf Q}, is dense in the set of all real numbers, {\mathbf R}.) After obtaining that result, one might naturally think, “well, in this sense all real numbers are equally well approximable by rational numbers, but perhaps if I make the condition more strict by adding a factor of {1/q} into the quantity the error has to be less than, I can uncover some interesting differences in the rational approximability of different real numbers.” But the relevance of Dirichlet’s approximation theorem can also be understood in a more direct way, and that’s what I wanted to show with this post.

Of course putting this extra factor in doesn’t lead to the discovery of any interesting differences in the rational approximability of different real numbers. In order to get to the interesting differences, you have to add in yet another factor of {1/q}. A real number {x} is said to be well approximable if and only if for every positive integer {n}, there are integers {p} and {q} such that {q > 0} and

\displaystyle  |q x - p| < \frac 1 {n q}, \ \ \ \ \ (3)

or, equivalently,

\displaystyle  \left| x - \frac p q \right| < \frac 1 {n q^2}. \ \ \ \ \ (4)

Otherwise, {x} is said to be badly approximable. Some real numbers are well approximable, and some are badly approximable.

There is in fact a very neat characterisation of the distinction in terms of continued fractions. The irrational numbers that are well approximable are precisely those that have arbitrarily large terms in their continued fraction expansion. For example, {e} is well approximable because its continued fraction expansion is

\displaystyle  [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, \dotsc]

(Note that the pattern only appears from the third term onwards, so it’s really {(e - 2)/(3 - e)} that has the interesting continued fraction expansion.) Every positive even number appears in this continued fraction expansion, so there are arbitrarily large terms. The irrational numbers that are badly approximable, on the other hand, are those that have a maximum term in their continued fraction expansion. They include all quadratic irrational numbers (since those numbers have continued fraction expansions which are eventually periodic), as well as others. For example, the real number with the continued fraction expansion

\displaystyle  [1, 2, 1, 1, 2, 1, 1, 1, 2, \dotsc]

is badly approximable. This distinction is the topic of the final year project I’m currently doing for my mathematics course at university.
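If you want to check the expansion of {e} given above for yourself, here is a minimal sketch. The function name `continued_fraction` is my own, and the trick of replacing {e} by a nearby rational number is an assumption of this sketch rather than anything canonical. It expands a rational number exactly, and applies that to a partial sum of the series for {e} which is close enough to {e} that the first eleven terms agree.

```python
from fractions import Fraction
from math import factorial

def continued_fraction(x, max_terms):
    """Return up to max_terms terms of the continued fraction expansion
    of the rational number x (a Fraction), using exact arithmetic."""
    terms = []
    for _ in range(max_terms):
        a = x.numerator // x.denominator  # integer part (floor) of x
        terms.append(a)
        x -= a                            # fractional part
        if x == 0:                        # the expansion of a rational terminates
            break
        x = 1 / x                         # invert and repeat
    return terms

# Partial sum of e = 1/0! + 1/1! + 1/2! + ...; the error is below 1/30!,
# far too small to disturb the first eleven terms of the expansion.
e_approx = sum(Fraction(1, factorial(k)) for k in range(31))

print(continued_fraction(e_approx, 11))
# prints [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1]
```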

I guess it would be possible to motivate the well approximable-badly approximable distinction in a similar way: note that a real number {x} is rational if and only if there is a positive integer {q} such that {q^2 x} is an integer divisible by {q}, and then go on to say that the closeness to rationality of an irrational number {x} can be judged by how close the terms {q^2 x} of the sequence {\langle x, 4 x, 9 x, \dotsc \rangle} are to integers that are multiples of {q}. The well approximable numbers would be those for which there exist terms of the sequence arbitrarily close to such integers. Of course, this is a lot more contrived.

The evolutionary approach to syntax


I’ve been reading an interesting book lately called Evolutionary Syntax, which is by Ljiljana Progovac. I picked it up because, judging by the title, I thought it might be something in a similar vein to Juliette Blevins’ Evolutionary Phonology. However, it’s not quite the same thing. I’ve not read Blevins’ book yet, but as I understand it, it’s about how patterns in sound change can be used to explain patterns in the phonology of the world’s languages. So it examines the evolution of phonology as it has taken place since the development of the mature human language faculty, and it’s more concerned with identifying the general tendencies of phonological evolution than describing the particular changes that languages have gone through. Progovac’s book, on the other hand, is about the evolution of syntax during the maturation of the human language faculty. This was a case of directed evolution: it’s reasonable to assume that there was a gradual trend towards more and more complex syntactic structures, and that’s the view Progovac takes. It’s also something which it makes sense to talk about as operating on human language as a whole, rather than on specific languages. So, roughly, while Blevins’ question is how phonology can evolve, Progovac’s question is how syntax did evolve.

Another important difference between these two evolutionary processes is the source of the mutations that allow them to occur. For Blevins it’s the errors that learners of a new language make in reproducing the language’s grammar from the speech they hear, while for Progovac it’s straightforward genetic variation: she thinks the evolution of syntax was driven by natural selection.

Anyway, it wasn’t a disappointment to me that the book wasn’t about exactly what I thought it would be; I still found it very interesting. (A more direct counterpart of Evolutionary Phonology for syntax would be interesting as well, though.) It was a little surprising to me that I did, because syntax has never been one of my favourite subfields of linguistics. Another book I’m trying to get through at the moment, because I need to for one of my classes, is David Adger’s Core Syntax, which does the standard thing of attempting to analyse syntax from a purely synchronic perspective, and I’m finding that one pretty dull and hard to get through. Part of the reason for this is that I have a particular interest in evolutionary processes; I like to see everything from an evolutionary perspective if it can possibly be seen that way. But another part of it is that I suspect theoretical approaches in linguistics that don’t make a lot of use of a diachronic perspective aren’t going to have much luck. Language is something that is enormously variable over time, space, social context, and other things, and when language is considered as removed from the context of variation, it seems likely that it will no longer be possible (or at least it may be less possible) to make sense of a lot of its characteristics.

I don’t by any means have a great deal of familiarity with theoretical linguistics, so don’t take my opinion here too seriously. But I do think my opinion is backed up by one part of Evolutionary Syntax. In chapter 5, Progovac outlines how the phenomenon of islandhood can be analysed using her evolutionary framework. Her analysis is rather different from the mainstream approach that I’ve been learning about at university, but the explanations it yields are somewhat more convincing to me. I’ll try to explain her analysis in the rest of this post. First, though, I should explain what islandhood is, and what about it needs explaining.


Consider the following two sentences:

(1) John loves Mary.

(2) Who does John love?

The meanings of these two sentences are similar. Both of them involve a proposition of the form ‘John loves x’. In (1), the x is Mary, and the sentence is an assertion of the proposition’s truth. In (2), the x is a dummy variable, and the sentence is a question asking for a description of some x such that the proposition ‘John loves x’ is true. So which word in (2) (if any) represents the dummy variable x? The most natural assumption is that it is the word who. But the position of who in (2) is very different from the position of Mary in (1), even though the meaning ‘Mary’ and the dummy variable x are in the same semantic “position” in both sentences.

There are some languages, like French, in which it would seem that a straightforward analysis of the translation of who (qui) as the dummy variable x is entirely possible. Cf. sentence (3) below, in which qui comes after the verb.

(3) Jean aime qui?

Perhaps, then, the who in (2) does represent the dummy variable, but for some reason the word is “moved” from its expected position after loves to the front of the sentence. The use of the verb “move” here reflects a particular conceptualization of how sentences are formed, where there is more than one level of structure—there’s an underlying structure where the word is not moved, and a surface structure where it is. There are syntactic theories that make this intuitive notion of “movement” more precise, but others analyze (2) in a way that does not involve anything the theory’s proponents like to call “movement”. I’m going to call the thing that explains the difference between (1) and (2) “movement” here, just so it has a name, without implying any particular analysis.

Now, islandhood is the phenomenon of movement being disallowed, for some reason, in certain syntactic environments. Consider, for example, the following examples from Progovac’s book (p. 133). (I’ve added answers to each question in order to help you parse the “expected” meaning.)

(4) *What did Bill reject the accusation that John stole? (cf. Bill rejected the accusation that John stole the jewellery.)

(5) *Which book did Bill visit the store that had in stock? (cf. Bill visited the store that had Crime and Punishment in stock.)

The stars at the start of these sentences indicate that they’re ungrammatical: that is, they violate the syntactic constraints of English. They’re supposed to give you the same instinctive “this is wrong” feeling as sentences like I speaks English correctly or I speak correctly English or I speak the English correctly. Note also that the problem isn’t in the meaning of the words in these sentences. Some sentences, like Colourless green ideas sleep furiously (Chomsky’s famous example) feel wrong for this reason, but it’s quite easy to see what meanings these ungrammatical sentences “should” have. That’s how you know the problem is syntactic, rather than semantic.

The specific problem with sentences (4) and (5) is that the wh-phrases in each of them have been moved out of subordinate clauses that are attached to the object of the main clause. For some reason, movement out of this environment is forbidden. In fact, it is forbidden out of all subordinate clauses that are attached to nouns. Note, however, that movement is permitted out of a subordinate clause if the subordinate clause is not attached to any noun, and is the object of the main clause. In fact, it’s permitted out of an arbitrarily deeply nested sequence of clauses as long as each clause is the object of the previous one, as illustrated by sentences (6), (7) and (8).

(6) Who does John think Mary loves? (cf. John thinks Mary loves Bill.)

(7) Who does John think Mary thinks Bill loves? (cf. John thinks Mary thinks Bill loves Susan.)

(8) Who does John think Mary thinks Bill thinks Susan loves? (cf. John thinks Mary thinks Bill thinks Susan loves Paul.)

Environments out of which movement is forbidden are called wh-islands, because wh-phrases are “stranded” within them. Islandhood is the phenomenon of the existence of wh-islands.

It may be the case you don’t consider both of the sentences (4) and (5) ungrammatical. This is not particularly unusual—judgements of islandhood seem to vary quite a lot between individual speakers of a language (and, sometimes, between different times for the same individual). My friend Darcey brought up a good example of this a while ago: sentence (9) below.

(9) *What do you wonder who fixed? (cf. I wonder who fixed the computer.)

She thought this was a perfectly comprehensible and grammatical sentence, but the vast majority of people, including me, would not agree. The problem with it is that the interrogative DP what is being moved out of a subordinate clause which also contains an interrogative DP (who) that has been moved to the front of the clause. (These clauses are known as indirect questions.) Movement out of this environment seems to be generally forbidden for most English speakers. The contrast between sentences (10) and (11) below illustrates this.

(10) What do you know John fixed? (cf. I know John fixed the computer.)

(11) *What do you know who fixed? (cf. I know who fixed the computer.)

There is another way to see the effects of islandhood, in case you find it hard to judge the grammaticality of sentences. Sometimes, the only way to explain why a sentence is not ambiguous is by appealing to islandhood. Consider sentence (12) below.

(12) When did you wonder who fixed the computer? (cf. I wondered, last night, who fixed the computer.)

I have phrased the answer here carefully, because the sentence I wondered who fixed the computer last night is ambiguous. In this sentence, the adverbial phrase last night could also refer to the time the fixing took place, rather than the time the wondering took place. To put it in terms of phrase structure, last night could be contained within the subordinate clause beginning with who fixed the computer, rather than being outside of it. Perhaps the best way to help you see the ambiguity, if you can’t already see it, is to put some guiding brackets in the sentence:

(13) I wondered [who fixed the computer] last night.

(14) I wondered [who fixed the computer last night].

Now, here’s the funny thing. If we take sentence (14), replace last night by when and move the when to the front we get (12), right? But that suggests (14) is a possible answer to (12). And for me, that isn’t true: the ambiguity isn’t present in (12) at all. For me, (12) can only be interpreted as asking about the time the wondering took place, not the time the fixing took place. You might disagree. In particular, I suspect Darcey might disagree, given that she thought (9) was grammatical. But I know at least one other person agrees because my Introduction to Syntax lecturer, when I took that course last year, gave us the problem of explaining the unambiguity of the sentence When did you wonder whether he disappeared? (which is structurally parallel to (12)) as an exercise.

(If there were any students who did consider that sentence ambiguous, they must have found that problem really confusing.)

For people like me who find (9) ungrammatical, there’s an obvious explanation for this situation in terms of islandhood. Going from (14) to (12) involves the movement of when out of a subordinate clause which begins with an interrogative NP, so it is forbidden since indirect questions are wh-islands. But going from (13) to (12) involves no such thing, since last night is not part of the subordinate clause beginning with who in that sentence. I don’t know how else the unambiguity of (12) could be explained; that’s why I think that Darcey and others with different intuitions on the grammaticality of (9) might have different intuitions on the ambiguity of (12).

Now that was a bit of an aside, but I thought you might find it interesting to see a different way in which the phenomenon of islandhood manifests, and maybe it helps a little if you are finding these grammaticality judgements too subjective.

Anyway, the really intriguing question about islandhood is, why is islandhood a thing? Or, similarly, what distinguishes wh-islands from non-wh-islands? Why is the relatively simple six-word sentence (9) ungrammatical, but the horrendously complex sentence (8), involving movement out of a triply-embedded clause, is perfectly fine?


Before I talk about possible answers to this question, first, I should mention the other island environments in English. We’ve already seen that subordinate clauses which are attached to nouns or which begin with interrogative DPs are islands. A famous PhD dissertation by Ross (1967), which was the first comprehensive investigation of islandhood in English, identified the following additional islands:

  1. Sentential subjects (i.e., subordinate clauses in subject position)
  2. DP specifiers (i.e., noun phrases that are attached to nouns via the possessive clitic ’s)
  3. Coordinated noun phrases (i.e., noun phrases that are attached to noun phrases via coordinating conjunctions)

These are illustrated by the example sentences below.

(15) *What does that he is denying make it worse? (cf. That he is denying his mistake makes it worse.)

(16) *Whose does John love daughter? (cf. John loves Mary’s daughter.)

(17) *Who does John love Mary and? (cf. John loves Mary and Alice.)

The question is: why are the wh-islands the members of this particular set of environments, rather than some other set?

The traditional, Chomskyan answer relies on a fundamental syntactic principle called Subjacency, which was proposed just in order to answer this question. There’s a nice exposition of this approach in chapter 12 of this online syntax textbook by Santorini & Kroch (2007), which I encourage you to read if you’re interested in the details. But I’ll try to give a brief explanation here. Roughly, the idea is that certain phrases comprise barriers to movement, and that if a phrase is moved from one position to another in a sentence, it can cross the boundaries of at most one barrier to movement. It’s fine if it crosses the boundary of just one barrier to movement, but any more than that, and the sentence becomes ungrammatical. So, for example, the ungrammaticality of (9), and other sentences involving movement out of an indirect question, is a result of IPs (inflectional phrases—these roughly correspond to clauses) being barriers to movement. Sentence (2) (Who does John love?) is grammatical because the word who in this sentence only has to cross the boundary of one IP. But in (9), what has to cross the boundary of two IPs, so (9) is ungrammatical. Another kind of phrase which is a barrier to movement is the DP (determiner phrase, roughly the noun phrase). This accounts for the ungrammaticality of sentences (4) and (5), in which the movement is out of subordinate clauses that are attached to objects in the main clause.

But wait, doesn’t that mean sentences (6), (7) and (8) are ungrammatical too? After all, they involve movement across two, three and four IP boundaries, respectively. Well, the crucial thing to understand here is that it’s possible for the same phrase to move multiple times. The principle of Subjacency only forbids crossing multiple barriers to movement in a single movement; crossing multiple barriers in multiple movements is fine. Here’s a diagram showing the phrase structure of sentence (6).


Whoever said syntax was complicated?

There’s obviously a lot of stuff going on here, but you can see that the DP who starts in the position labelled 4 (in the complement position of the VP loves), moves to the position labelled 1 (in the specifier position of the CP (that) Mary loves), crossing only one IP boundary in the process, then moves to its final position (in the specifier position of the CP containing the whole sentence), again crossing only one IP boundary in the process. In order for this two-step movement to be possible it is crucial that there is an empty node in the phrase structure tree, such as 1 in the one above, to act as a “landing site” for the moving interrogative DP. How do we know that this empty node exists? I can’t give a full justification of the underlying assumptions here, but I can give you a good reason to think that the word that goes in the node labelled C, rather than occupying the position within the CP but outside and to the left of the C’ (which is called the specifier position), which is the position moved interrogative DPs occupy. Consider the first couplet of the Canterbury Tales:

(18) Whan that Aprill, with his shoures soote / The droghte of March hath perced to the roote

Here we have a CP (Whan that Aprill…) which begins with an interrogative phrase followed by that, and the only way we can fit both of them in is to suppose that the interrogative phrase goes in the specifier position of the CP and that goes in the C node. It’s not unreasonable to assume that clauses introduced by interrogative NPs in modern English are structured in the same way, only with the that absent.

So we can get around Subjacency if there are landing sites that can be used to perform the movement in smaller, non-multiple-barrier-crossing chunks. The reason this doesn’t apply in (9) is that the who in the subordinate clause has already moved into the landing site where the what would otherwise go, and there is no other possible landing site. (If what moved before who then it could use the landing site, but generally it is assumed that movement that is constrained to deeper levels of the phrase structure tree happens first.) What about (4) and (5)? In these sentences it is possible to move the interrogative NP into the specifier position of the CP which contains the subordinate clause, because only one IP boundary needs to be crossed to do that. But moving it again into the specifier position of the CP which contains the whole sentence would involve crossing both an IP and a DP, and as we said above, DPs are barriers to movement as well as IPs, so this would violate Subjacency.
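The bookkeeping involved here can be sketched as a toy model. To be clear about what is assumed: the names `BARRIERS` and `obeys_subjacency` are hypothetical, and the lists of boundaries crossed in each step are written out by hand from the descriptions above (including the non-barrier VP and CP labels, which are my own guess at the intermediate nodes), not computed from real phrase structure trees. All the sketch does is check that no single movement step crosses more than one barrier.

```python
# The two kinds of phrase treated as barriers to movement in the text.
BARRIERS = {"IP", "DP"}

def obeys_subjacency(steps):
    """steps holds one list per movement step, naming the phrase boundaries
    crossed in that step. Subjacency is obeyed if every individual step
    crosses at most one barrier; non-barrier boundaries are ignored."""
    return all(
        sum(label in BARRIERS for label in step) <= 1
        for step in steps
    )

# (6) Who does John think Mary loves?
# Two steps via the embedded landing site, one IP crossed in each.
sentence_6 = [["VP", "IP"], ["CP", "VP", "IP"]]

# (9) *What do you wonder who fixed?
# The landing site is occupied by "who", so "what" has to go in one step,
# crossing both IPs at once.
sentence_9 = [["VP", "IP", "CP", "VP", "IP"]]

# (4) *What did Bill reject the accusation that John stole?
# The first step is fine, but the second crosses a DP and an IP together.
sentence_4 = [["VP", "IP"], ["CP", "DP", "VP", "IP"]]

for name, steps in [("(6)", sentence_6), ("(9)", sentence_9), ("(4)", sentence_4)]:
    print(name, "grammatical" if obeys_subjacency(steps) else "ungrammatical")
```

The point of writing it this way is that the total number of barriers crossed never matters, only the number crossed within any one step.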

Now, this analysis is reasonably successful, but it’s not without problems. For example, assuming that DPs are barriers to movement as well as IPs predicts the ungrammaticality of some sentences that strike most people as grammatical, such as (19) below.

(19) Who did you take a picture of?

There are ways of dealing with this, and you can read about them in the textbook chapter linked to above. It’s always possible to propose ever more complex principles in order to capture the islandhood conditions more precisely. But the more complex the principles are, the less satisfying the explanation is as an improvement over simply listing the wh-island environments in an unsophisticated way, as we noted that we could do at the start of this section.

Even if Subjacency was able to account for everything, there would still be something of a mystery here. A question remains: why does Subjacency exist in the form that it does? That’s something which the Chomskyan approach doesn’t really attempt to answer.


It’s a question we’d like to have an answer to, though. Subjacency is a rather complex, specific principle. If Subjacency blocked movement across barriers in general, I’d be happier with it. But blocking movement across two barriers but not one? That’s just weird, and it seems like something that needs an explanation.

Note that Subjacency is supposed to be a universal principle. Even though I’ve only been talking about English here, many other languages have much the same set of island environments, and often where there are apparent differences, Chomskyans would argue that this is due to the misidentification of structures in different languages that are actually not identical. Also, the fact that most children successfully acquire the same grammaticality judgements with regard to islandhood suggests that the principle is part of the innate language faculty—it’s hard to imagine the necessary sentences being uttered often enough to enable a child to learn by example alone. (See Baker 2010 for more on the assumption of universality.) But it’s hard to see how Subjacency, if it’s part of the innate language faculty, could have evolved by natural selection. As Lightfoot (1991), quoted by Progovac, says:

Subjacency has many virtues, but […] it could not have increased the chances of having fruitful sex.

However, if we start from the conception of syntax as an evolved system, a different approach to the whole problem naturally presents itself.

As we said above, one of the basic operations of syntax is movement. In the jargon of syntax this operation is often referred to as Move. It’s called an “operation” because the idea is that when a sentence is formed in the mind, it first consists of an unordered set of words, and then syntactic operations are applied to that set of words in order to give it structure, so that the words can eventually be arranged in a linear order and uttered in that order. The other important syntactic operation is Merge, which combines pairs of words into units called constituents, and also combines pairs of sub-constituents into super-constituents, thus organizing the words into a binary tree. Move applies afterwards, moving words from one node in the tree to another under certain conditions. If Subjacency exists, then it blocks the application of Move in certain conditions.
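For the programmers reading, Merge can be pictured with a very small sketch. This is only an illustration under my own assumptions: the names `Constituent`, `merge` and `linearize` are made up, the tree here already encodes left-to-right order (which glosses over the step from unordered set to linear order), and Move is not modelled at all.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Constituent:
    """A node of the binary tree: exactly two sub-parts."""
    left: "Node"
    right: "Node"

# A node is either a single word or a constituent built by merge.
Node = Union[str, Constituent]

def merge(a, b):
    """Combine two words or constituents into a larger constituent."""
    return Constituent(a, b)

def linearize(node):
    """Read the words off the tree from left to right."""
    if isinstance(node, str):
        return [node]
    return linearize(node.left) + linearize(node.right)

# "loves" and "Mary" are merged first; the result is then merged with "John".
tree = merge("John", merge("loves", "Mary"))
print(" ".join(linearize(tree)))  # prints John loves Mary
```

Because `merge` accepts constituents as well as words, the output of one application can feed another, which is what gives the tree its unbounded depth.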

Now, it’s not too difficult to see why natural selection would enable the Move operation to evolve. Language with Move is more expressive than language without Move. But who’s to say the whole of Move appeared all at once? It seems more likely that it would have evolved gradually, being first applicable only in a particular environment or set of environments, and having its applicability gradually expanded over time by analogy. And it could well be the case that there are some environments that it never became applicable in. These would be exactly the wh-islands.

Let’s not get ahead of ourselves: so far, this does not constitute any sort of answer to our question. What we want to know is how to distinguish wh-islands from non-wh-island environments, and all we’ve done in the above paragraph is shown that we can reformulate the question as “Why didn’t Move get generalized to be possible out of the wh-island environments?” It could still end up being the case that the wh-island environments can be characterised by a condition such as Subjacency, in which case that condition would still be a useful thing to talk about. But framing the question this way makes it clear that we probably shouldn’t expect this to be the case. Impossibility of movement is the original, hence default state. Possibility of movement is the innovative state. And analogical generalization goes from like to like. The second construction on which Move was able to operate would have been a construction very similar to the first; the third would have been similar to the second; and so on, throughout the whole set of non-wh-islands. Hence it’s the non-wh-islands that should be expected to form a natural class, not the wh-islands.

So there’s one way in which the evolutionary perspective has been helpful: it’s allowed us to get a better idea of the kind of answer we should be looking for. But what would be even better is if we could actually find an answer, using this approach.

Progovac doesn’t have a complete answer here, but she does have a fairly promising sketch of one. The key insight it relies on is the idea that some syntactic constructions are more archaic than others. She identifies four approximate stages of syntactic evolution, which are listed below.

1. The one-word or holophrastic stage, in which all utterances consist of a single word, with no internal structure whatsoever. Multiple words may be uttered in succession, but there is no higher-level structure, only a string of isolated words. The words convey a set of concepts to the listener and the listener has to rely solely on pragmatics to work out how the concepts compose to form a statement about the world. Nim Chimpsky, a chimpanzee who researchers tried to teach (signed) language to, never got past this stage: an example utterance of his was “Give orange me give eat orange me eat orange give me eat orange give me you”.

One-word utterances are still possible in modern languages: “Fire!” Children also often pass through a one-word stage when they are learning to speak, although it’s not clear how far this can be attributed to linguistic constraints, as opposed to physical or general cognitive ones.

2. The two-word or paratactic stage. In this stage utterances consist of at most two words, which are linked by an operation called Conjoin. The two words are of equal status within the resulting constituent; there are no heads or complements. Conjoined constituents cannot themselves be Conjoined, so there is no recursion. Within a string of utterances, the separate utterances (each corresponding to a single Conjoined constituent) are identifiable via prosodic cues such as pitch rise-fall patterns. The interpretation of each constituent is less dependent on pragmatics. Simple, two-word intransitive sentences could already be uttered at this stage, and there might have already been a rudimentary noun-verb distinction. However, in order to convey more complex relations between concepts, inference from pragmatics would still be necessary.

Paratactic constructions still exist in mature human language, but, apart from simple intransitive clauses, they are marginal. Some notable examples are agentive verb-noun compounds (pickpocket, scarecrow), orders (Everybody out!), and the construction which involves a non-case-marked noun followed by a verb and is uttered with an exaggerated rise in pitch on each word, used to convey incredulity at the idea that the statement could be true: Me, a liar?! Adjunction (roughly, the addition of “optional” phrases that add extra information such as adjectives and adverbs) is also a little parataxis-like. When an adjunct combines with a phrase, the resulting phrase is of exactly the same type, in syntactic terms, as the original phrase—one can substitute “black dog” into more or less every sentence that contains “dog”, for example, and preserve grammaticality. This is in contrast to the combination of heads with complements, which results in entirely new phrases with different syntactic distributions.

3. The three-word or coordinate stage. This is similar to the previous stage, except that function words emerge for the first time, in the role of “linkers”: they come in between two Conjoined words in order to mark the conjunction. This makes it easier to identify constituent boundaries, backing up the prosodic cues which are relied on in the paratactic stage. Different linkers may have come to be used in order to add different shades of meaning (cf. and and but in English), but otherwise there is little change in expressive power.

4. The categorical / hierarchical stage. At this stage, different categories of phrases become identifiable for the first time, on the basis of the content words and linkers they contain. For example, John eat might be identifiable as a verb phrase (John is eating) and John dog might be identifiable as a noun phrase (John’s dog). This facilitates the substitution of words for phrases, which allows hierarchical, recursive structure to emerge for the first time (John dog eat = John’s dog is eating), and at this point language reaches its full expressive power.

The identification of these stages is not meant to imply that they were strictly separated; they would have blended into each other to a considerable degree. At the start of each stage the new structural possibilities would have been made use of in relatively few constructions, and as the stage progressed the new kind of structure would have become more and more dominant, just as in the scenario for the evolution of Move described above. So there would always be some constructions which were more “integrated” into the current stage than others. And this applies in just the same way to the categorical / hierarchical stage, which is the stage human language is at today. Progovac thinks that many modern syntactic constructions are “fossils” which have not been fully integrated into the categorical / hierarchical stage. Now, the Move operation is a feature of the categorical / hierarchical stage, and its successful application probably requires the presence of certain structural characteristics which are often not present in these fossil constructions. Hence, islandhood. The wh-island environments should be precisely those environments that involve archaic structure, for a certain level of archaicness.

The prohibition on movement out of coordinate structures has an obvious explanation, using this approach. Coordinate structures are relics of the coordinate stage and have not been fully integrated into the categorical / hierarchical stage. (There is little reason, for example, to believe that a phrase like ham and cheese is structured as either ham [and cheese] or [ham and] cheese; it may be best analyzed as a phrase with three direct sub-phrases which each contain a single word.) The same goes for the prohibition on movement out of subordinate clauses attached to nouns. These subordinate clauses are adjuncts, and hence not integrated into the categorical / hierarchical stage to the same extent as subordinate clauses that are verb complements, for the reasons alluded to above.

Of course, there are other wh-island environments: indirect questions, sentential subjects, and DP specifiers. These environments do not involve particularly archaic structure; in fact, all of these environments involve recursive, hierarchical embedding of structures, which is characteristic of the categorical / hierarchical stage and impossible at lower stages. But perhaps by identifying finer degrees of integratedness, it would be possible to explain why these environments are wh-islands as well. For example, specifiers might be less integrated, in some sense, than complements, which would account for both the sentential subject and DP specifier island constraints. (According to the theory I was taught, verb subjects occupy VP specifiers in the underlying structure and move into IP specifiers on the surface.) But a more in-depth treatment of the subject is needed here than is given in Evolutionary Syntax. Further investigation into exactly how the Move operation might have developed might yield helpful insights here.

The evolutionary approach may also prove helpful in understanding variation in the set of environments that are wh-islands. As described above, the categorical / hierarchical stage probably did not arise suddenly but rather in a gradual manner, with constructions becoming more and more integrated over time at varying rates. This trend towards greater integration could well be continuing to this day; after all, a language that allows movement out of, say, indirect questions has slightly more expressive power than one that doesn’t. It would be interesting to see whether certain kinds of wh-islands are more likely to be non-wh-islands in a minority of individuals’ grammars than others, and whether this likelihood correlates with the extent to which the wh-island environment has a typical categorical / hierarchical phrase structure. To me, indirect questions seem like the most integrated wh-islands that we’ve examined in this post, and hence the most difficult to explain under Progovac’s approach—and they’re the ones for which we’ve seen that there is some variation between individual speakers.

So Progovac’s approach is definitely in need of a lot of further elaboration. But it does explain some of the wh-islands fairly well, and it seems like it might be the right approach to take in explaining the others. By the way, the ideas it makes use of—the different stages of syntactic evolution, and the existence of fossil constructions and varying degrees of integratedness—aren’t just used in Chapter 5, the one that deals with the phenomenon of islandhood, but throughout the whole book. They’re its central ideas. So if you found these ideas interesting, I suggest you check Evolutionary Syntax out.