The Duke of York gambit in diachronic linguistics


Pullum (1976) discusses a phenomenon he evocatively calls the “Duke of York gambit”—the postulation of a derivation of the form A → B → A, which takes the underlying structure A “up to the top of the hill” into a different form B and then takes it “down again” into A on the surface (usually in a more restricted environment, otherwise the postulation of this derivation would not be able to explain anything). Such derivations are called “Duke of York derivations”.

As an illustrative example, consider the case of word-final devoicing in Dutch. Like many other languages, Dutch distinguishes its voiceless and voiced stop phonemes only in non-word-final position. In word-final position, voiceless stops are found exclusively, so that, for example, goed, the cognate of English good, is pronounced [ɣut] in isolation. But morphologically related words like goede ‘good one’, pronounced [ɣudə], seem to indicate that the segment written d is in fact underlyingly /d/ and it becomes [t] by a phonological rule that word-final obstruents become voiceless. We therefore have a derivation /d/ → [t]. Now, in fast, connected speech, goed is not always pronounced [ɣut]. Before a word that begins with a voiced obstruent such as boek ‘book’, it may be pronounced with [d]: goed boek [ɣudbuk]. Some linguists like Brink (1974) have therefore proposed a second phonological rule that grants word-final obstruents the voicing of the obstruent beginning the next word (if there is such an obstruent) in fast, connected speech. This rule applies after the first phonological rule that devoices word-final obstruents, so that the pronunciation [d] of the d in goed boek is derived from underlying /d/ by two steps: /d/ → /t/ → [d]. This is a Duke of York derivation.
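The two-step analysis can be made concrete with a toy model. The sketch below is only an illustration under simplifying assumptions: words are plain strings with one character per segment, the obstruent table is a small hypothetical inventory, and the function names (`final_devoicing`, `voicing_assimilation`, `derive`, `is_duke_of_york`) are invented here and are not taken from Brink or Pullum.

```python
# Toy model of the two ordered rules described above (not a full phonology of Dutch).
# Assumption: one character per segment; the obstruent pairs are a simplified inventory.
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ɣ": "x"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}


def final_devoicing(word):
    """Rule 1: a word-final voiced obstruent becomes voiceless."""
    last = word[-1]
    return word[:-1] + DEVOICE.get(last, last)


def voicing_assimilation(word, next_word):
    """Rule 2 (fast speech): a word-final obstruent takes the voicing of a
    following word-initial obstruent, if there is one."""
    if not next_word:
        return word
    last, first = word[-1], next_word[0]
    if first in DEVOICE:   # next word begins with a voiced obstruent
        return word[:-1] + VOICE.get(last, last)
    if first in VOICE:     # next word begins with a voiceless obstruent
        return word[:-1] + DEVOICE.get(last, last)
    return word


def derive(underlying, next_word=None, fast_speech=False):
    """Apply rule 1, then (in fast speech only) rule 2; return every stage."""
    trace = [underlying, final_devoicing(underlying)]
    if fast_speech:
        trace.append(voicing_assimilation(trace[-1], next_word))
    return trace


def is_duke_of_york(trace):
    """True if the derivation ends where it began after passing through a different form."""
    return (len(trace) >= 3 and trace[0] == trace[-1]
            and any(stage != trace[0] for stage in trace[1:-1]))


isolation = derive("ɣud")
connected = derive("ɣud", next_word="buk", fast_speech=True)
print(" → ".join(isolation))   # ɣud → ɣut
print(" → ".join(connected))   # ɣud → ɣut → ɣud
print(is_duke_of_york(connected))  # True
```

In this model the ordering is hard-coded in `derive`: devoicing always runs first, so the [d] in goed boek can only be reached by way of an intermediate [t].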

Many linguists, as Pullum documents, find Duke of York gambits like this objectionable. They question others’ analyses on the grounds that they postulate Duke of York derivations, and they decide between analyses of their own by disfavouring those which involve Duke of York gambits. In this particular case, an objection is reasonable enough: why not simply propose that in fast, connected speech, the words in phrases run into one another and become unitary words? In that case, the rule devoicing word-final obstruents would not apply to the d in goed boek in the first place because it would not be in word-final position; the word-final segment would be the k in boek.

Yet Pullum finds no principled reason to disfavour analyses involving Duke of York gambits just because they involve Duke of York gambits. Clearly some linguists find something unsavoury about such analyses: in the quotes in Pullum’s paper, we can find descriptions of them as “to be viewed with some suspicion” (Brame & Bordelois 1973: 115), “rather suspicious” (White 1973: 159), “theoretically quite illegitimate” (Hogg 1973: 10), “hardly an attractive solution” (Chomsky & Halle 1968: 270), “clearly farcical” (Smith 1973: 33), and “extremely implausible” (Johnson 1974: 98). (See Pullum’s paper for the full references.) But none of them articulate the problem explicitly. If an analysis can be replaced by a simpler one with equal or greater explanatory power, that’s one thing: that would be a problem by the well-established principle of Ockham’s Razor. But a Duke of York gambit does not necessarily make an analysis more complex than the alternative in any well-defined way. Even with the Dutch example above, the greater simplicity of the Duke of York gambit-less solution proposed can be questioned: is it really simpler to propose a process of allegro word unification (at the fairly deep underlying level at which the word-final devoicing rule must apply) which we might be able to do without otherwise?

Pullum mentions some other examples where a Duke of York gambit might even seem part of the obviously preferable analysis. In Nootka, according to Campbell (1973), there is a phonological rule of assimilation that turns /k/ into /kʷ/ immediately after /o/, and there is another phonological rule of word-final delabialization that turns /kʷ/ into /k/ in word-final position. And, in word-final position, the sequence /-ok/ appears and the sequence /-okʷ/ does not appear. If the word-final delabialization rule applies before the assimilation rule, then we would expect instead to find the sequence /-okʷ/ to the exclusion of /-ok/ in word-final position. The only possible analysis, if the rules are to be ordered, is to have assimilation before word-final delabialization: but this means that word-final /-ok/ undergoes the Duke of York derivation /-ok/ → /-okʷ/ → /-ok/. And the use of a model with ordered rules is not essential here, because a Duke of York derivation is obtained even in a very natural model with unordered rules: if we say that rules apply at any point that they can apply but they only apply at most once, then we again have that word-final /-ok/ is susceptible to the assimilation rule (but not the delabialization rule), so /-ok/ becomes /-okʷ/, but then /-okʷ/ is susceptible to the delabialization rule (but not the assimilation rule), so /-okʷ/ becomes /-ok/. One could propose that the assimilation rule is restricted in its application to the non-word-final environment. But this is peculiar: why should a progressive assimilation rule pay any heed to whether there is a word boundary after the assimilating segment? Any way of accounting for such a restriction could easily involve making complicated assumptions which would make the Duke of York gambit analysis preferable by Ockham’s Razor.
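The effect of rule ordering in the Nootka case can be checked mechanically. The following is a rough sketch, not a description of Campbell’s or Pullum’s formalism: forms are lists of segments (so that kʷ counts as one segment), the inputs are bare segment sequences rather than real Nootka words, and all function names are hypothetical.

```python
def assimilate(segs):
    """Assimilation: /k/ becomes /kʷ/ immediately after /o/."""
    return ["kʷ" if seg == "k" and i > 0 and segs[i - 1] == "o" else seg
            for i, seg in enumerate(segs)]


def delabialize_final(segs):
    """Word-final delabialization: /kʷ/ becomes /k/ in word-final position."""
    if segs and segs[-1] == "kʷ":
        return segs[:-1] + ["k"]
    return list(segs)


def apply_in_order(segs, rules):
    """Ordered model: apply each rule exactly once, in the given order."""
    trace = ["".join(segs)]
    for rule in rules:
        segs = rule(segs)
        trace.append("".join(segs))
    return trace


def apply_unordered_once(segs, rules):
    """Unordered model: a rule applies whenever it can change the form,
    but each rule applies at most once."""
    trace = ["".join(segs)]
    remaining = list(rules)
    changed = True
    while changed:
        changed = False
        for rule in remaining:
            out = rule(segs)
            if out != segs:
                segs = out
                remaining.remove(rule)
                trace.append("".join(segs))
                changed = True
                break
    return trace


final_ok = ["o", "k"]  # the word-final sequence /-ok/ only, not a full word

print(apply_in_order(final_ok, [assimilate, delabialize_final]))
# ['ok', 'okʷ', 'ok']   (attested outcome, via a Duke of York derivation)
print(apply_in_order(final_ok, [delabialize_final, assimilate]))
# ['ok', 'ok', 'okʷ']   (wrong outcome: /-okʷ/ does not occur word-finally)
print(apply_unordered_once(final_ok, [delabialize_final, assimilate]))
# ['ok', 'okʷ', 'ok']   (the unordered model also goes up the hill and down again)
```

Both models that yield the attested word-final /-ok/ pass through /-okʷ/ on the way; the only ordering that avoids the round trip predicts the unattested form.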


Now, Pullum discusses only synchronic derivations in his paper. But diachronic derivations can also of course be Duke of York derivations. It is interesting, then, to consider how we should evaluate diachronic analyses that postulate Duke of York derivations. Such analyses are favoured or disfavoured for different reasons than synchronic analyses, so, even if one accepts Pullum’s conclusion that synchronic Duke of York gambits are unobjectionable in of themselves, the situation could conceivably be different for diachronic Duke of York gambits.

My first intuition is that there is even less reason to object to Duke of York gambits in the diachronic context. After all, diachronic analyses deal with changes that we can actually see happening, over the course of years or decades or centuries, and whose intermediate stages we can observe. (Of course, this is only the case in practice for a very small subset of the diachronic change that we are interested in; until time travel is invented nobody can go and observe the real-time development of languages like Proto-Indo-European.) It is not inconceivable that a change might be “undone” on a short time-scale and it seems inevitable that some changes will be undone on longer time-scales. There is some very strong evidence for such long-term Duke of York derivations having happened in a diachronic sense. The history of English provides a nice example. In Old English, front vowels were “broken” in certain environments (e.g. before h): *æ became ea, *e became eo, and *i became io. We do not, of course, know with absolute certainty exactly how these segments were pronounced, unbroken or broken, but it is at least fairly certain that unbroken *æ, *e and *i were pronounced as [æ], [e] and [i] or vowels of very similar quality. The broken vowels remained largely unchanged throughout the Old English period, except that io was everywhere replaced by eo. But by the Middle English period they had been once again “unbroken” to a and e respectively; the only eventual change was to pre-Old English broken *i, which eventually became Middle English e. There may or may not have been minor changes in the pronunciations of these letters in the meantime ([æ] to [a], [e] to [ɛ], [i] to [ɪ]), but these seem scarcely large enough for this sequence of changes to not count as a diachronic Duke of York derivation.

But there are indeed linguists who appear to object to the postulation of diachronic Duke of York derivations, just like the linguists Pullum mentions. Cercignani (1972) seems to rely on such an objection in his questioning of the hypothesis that Proto-Germanic *ē became *ā in stressed syllables in pre-Old English and pre-Old Frisian. The relevant facts here are as follows.

  1. The general reflexes of late Proto-Indo-European *ē in initial syllables in the Germanic languages are exemplified by the following example: Proto-Indo-European *dʰéh₁tis ‘act of putting’ (cf. Greek θέσις; Sanskrit dádhāti and Greek τίθημι for the root *dʰeh₁ ‘put’) ↣ Proto-Germanic *dēdiz ‘deed’ (with -d- [< Proto-Indo-European *-t-] levelled in from the Proto-Indo-European oblique stem *dʰh₁téy-) > Gothic -deþs in missadeþs ‘misdeed’, Old Norse dáð, Old English (West Saxon) dǣd, Old English (non-West Saxon) and Old Frisian dēd, Old Saxon dād and Old High German tāt. One can see that Gothic, Old English (non-West Saxon) and Old Frisian reflect the vowel’s presumed original mid quality, Old Norse, Old Saxon and Old High German have shifted it to a low vowel, and Old English (West Saxon) is intermediate, having shifted it to a near-low front vowel. Length is preserved in every case (Gothic e is a long vowel; it’s just not marked with a macron diacritic because Gothic has no short /e/ phoneme). It is reasonable to reconstruct *ē for Proto-Germanic, reflecting the original Proto-Indo-European quality, and to assume that the shifts have taken place at a post-Proto-Germanic date.
  2. In Old English and Old Frisian, Proto-Germanic *ē is reflected as ō if it was immediately before an underlying nasal (including nasals before *h and *hʷ, which were allophonically elided in Proto-Germanic) in Proto-Germanic: Proto-Germanic *mēnō̄ ‘moon’ (cf. Old Saxon and Old High German māno; Gothic mena with -a [< Proto-Germanic *-a] levelled in from the stem *mēnan-; Old Norse máni with -i levelled in from nouns ending in -i < *-ija [with *-a levelled in as in Gothic] ← Proto-Germanic *-ijō̄) > Old English, Old Frisian mōna.
  3. In Old English, Proto-Germanic *ē is reflected as ā immediately before w: Proto-Germanic *sēgun ‘saw’ (3pl.) (cf. Old Norwegian and Old Swedish ságu, Old English [non-West Saxon] and Old Frisian sēgon; Gothic seƕun and Old High German sāhun with Gothic -ƕ- and Old High German -h- [< Proto-Germanic *-hʷ-] levelled in from the infinitive and present stems *sehʷ- and *sihʷ- and the past sg. stem *sahʷ-; Old Saxon sāwun with -w- [< Proto-Germanic *-w-] levelled in from the past subj. stem *sēwī-) ↣ Old English (West Saxon) sāwon with -w- < Proto-Germanic *-w- levelled in as in Old Saxon.

The question is in which languages the shift from *ē to *ā reflected in Old Norse, Old Saxon, Old High German and (partially, at least) in Old English (West Saxon) took place. Cercignani argues that it took place only in the languages it is reflected in, with Old English and Old Frisian being partially or totally unaffected by this shift. Let us call this the restriction hypothesis. Other linguists propose that it took place in every Germanic language other than Gothic, including Old English and Old Frisian, and that later shifts are responsible for the reflection of Proto-Germanic *ē as ǣ or ē in Old English and Old Frisian. Let us call this the extension hypothesis (because it postulates a more extensive area for the *ē > *ā shift to take place in than the restriction hypothesis). The derivation *ē > *ā > ē which must have taken place in Old English (non-West Saxon) and Old Frisian if the extension hypothesis is to be accepted is, of course, a Duke of York derivation, and it is clear that Cercignani regards this as a major strike against the extension hypothesis.

The restriction hypothesis certainly appears simpler and, therefore, preferable at first glance. However, there are various pieces of evidence that complicate matters, most obviously points 2 and 3 above. If Proto-Germanic *ē became *ā in pre-Old English and pre-Old Frisian before shifting back to a higher quality, then we can explain the reflection of Proto-Germanic *ē as ō when nasalized or immediately before a nasal as the result of a shift *ā > *ō in this environment (paralleled by the present-day shift /ɑ̃/ > [ɔ̃] in some French dialects). This is more believable than a direct shift *ē > *ō and arguably simpler than a two-step shift *ē > *ā > *ō occurring exclusively in this nasal environment. Likewise, one might argue that the postulation of a slight restriction on the environment of the *ā-fronting sound change in Old English, allowing for retention of *ā before *w, is simpler than the postulation of an entirely separate sound change shifting *ē to *ā before *w in Old English. Neither of these arguments is at all conclusive, but they might be sufficient to make the reader adjust their estimations of the two hypotheses’ probabilities a little in favour of the extension hypothesis. As far as I can tell, the thrust of Cercignani’s argument is that, even if the consideration of points 2 and 3 does make the restriction hypothesis more complicated than it seems at first glance, the postulation of Duke of York derivations is preposterous enough that the restriction hypothesis is still by far the preferable one. Naturally I, not thinking that Duke of York derivations are necessarily preposterous, disagree.
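To see how the extension hypothesis covers points 1 to 3 with a single chain of changes, here is a minimal sketch for Old English. It tracks only the stressed vowel and the segment immediately following it; the function name, the dialect labels and the relative ordering of the nasal rounding and the fronting are assumptions made for illustration, not claims from Cercignani or Ringe.

```python
NASALS = {"n", "m"}


def extension_hypothesis(following, dialect):
    """Trace stressed Proto-Germanic *ē into Old English under the extension
    hypothesis. `following` is the next segment; `dialect` is either
    "West Saxon" or "non-West Saxon" (labels chosen for this sketch)."""
    trace = ["ē"]
    trace.append("ā")          # *ē > *ā outside Gothic
    if following in NASALS:
        trace.append("ō")      # *ā > *ō before a nasal (point 2)
        return trace
    if following == "w":
        return trace           # *ā escapes fronting before *w (point 3)
    trace.append("ǣ")          # general fronting *ā > *ǣ
    if dialect == "non-West Saxon":
        trace.append("ē")      # further raising *ǣ > ē
    return trace


for following, dialect in [("d", "West Saxon"), ("d", "non-West Saxon"),
                           ("n", "West Saxon"), ("w", "West Saxon")]:
    stages = extension_hypothesis(following, dialect)
    print(f"before {following} ({dialect}): " + " > ".join(stages))
# before d (West Saxon): ē > ā > ǣ
# before d (non-West Saxon): ē > ā > ǣ > ē
# before n (West Saxon): ē > ā > ō
# before w (West Saxon): ē > ā
```

Only the non-West Saxon outcome before a non-nasal, non-w segment ends where it started, which is the Duke of York derivation at issue; the other three outcomes fall out of the same intermediate *ā.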

In any case there is some more conclusive evidence for the extension hypothesis not mentioned by Cercignani, but mentioned by Ringe (2014: 13). The Proto-Germanic distal demonstrative and interrogative locative adverbs ‘there’ and ‘where’ can be reconstructed as *þar and *hʷar on the basis of Gothic þar and ƕar and Old Norse þar and hvar. Further support for these reconstructions comes from the fact that they can be transparently derived from the Proto-Germanic distal demonstrative and interrogative stems *þa- and *hʷa- by the addition of a locative suffix *-r (also found on other adverbs such as *aljar ‘elsewhere’ [cf. Gothic aljar, Old English ellor] ← *alja- ‘other’ + *-r). But in the West Germanic languages, the reflexes are as if they contained Proto-Germanic *ē: Old English (West Saxon) þǣr and hwǣr, Old English (non-West Saxon) þēr and hwēr, Old Frisian thēr and hwēr, Old Saxon thār and hwār, Old High German dār and wār. The simplest way to explain this is to propose that there has been an irregular lengthening of these words to *þār and *hwār in Proto-West Germanic, and that the *-ā- in these words was raised in Old English and Old Frisian by the same changes that raised *ā < Proto-Germanic *ē. Proponents of the restriction hypothesis must propose an irregular raising as well as a lengthening in these words, which is perhaps less believable. (One can imagine adverbs with the sense ‘here’ and ‘there’ being lengthened due to contrastive emphasis; Ringe alludes to “heavy deictic stress”, which may be the same thing, although he doesn’t explain the term.) Most importantly, they must propose that this irregular raising only happens in Old English and Old Frisian, with the identity of the reflexes of Proto-Germanic *a in these words with the reflexes of Proto-Germanic *ē in stressed syllables existing entirely by coincidence. It is true that Proto-Germanic short *a in stressed syllables became *æ in Old English and Old Frisian, so if we propose that the irregular lengthening occurred after this change as an areal innovation among the West Germanic languages, we can account for Old English (West Saxon) þǣr and hwǣr; but this does not account for Old English (non-West Saxon) þēr and hwēr and Old Frisian thēr and hwēr, which have to be accounted for by an irregular raising.

To me this additional evidence seems fairly decisive. In that case, with the extension hypothesis accepted, we have a nice example of a diachronic Duke of York derivation which we know must have run its full course in a fairly short time, because we can date the Proto-Northwest Germanic *ē > *ā shift and the Old English *ǣ > ē shift (fed by the *ā > *ǣ shift, whose date is irrelevant here because it must have occurred in between these two) with reasonable precision. Ringe (p. 12), citing Grønvik (1998), says that the *ē > *ā shift is “attested from the second half of the 2nd century AD”. This is presumably based on runic evidence. As for the *ǣ > ē shift, it was one of the very early Old English sound changes in the dialects it took place in, being attested already in apparent completion in the oldest Old English texts (which date to the 8th century AD). The fact that it is shared with Old Frisian also suggests an early date. We can therefore say that there were at most five or six centuries between the two shifts, and quite likely considerably less.


To summarize: though they may seem somehow untidy, Duke of York derivations, whether diachronic or synchronic, are not intrinsically implausible. The simplest hypothesis that accounts for the data should always be preferred, but this is not always the hypothesis that avoids the Duke of York gambit. On the diachronic side of things, Duke of York derivations can certainly take place over many centuries—which nobody would dispute—but they can also take place over periods of just a few centuries, as evidenced by the history of Proto-Germanic *ē in Old English and Old Frisian.


Brink, D., 1974. Characterizing the natural order of application of phonological rules. Lingua, 34(1), pp. 47-72.

Campbell, L., 1973. Extrinsic order lives. Bloomington, IN: Indiana University Linguistics Club Publications.

Cercignani, F., 1972. Indo-European ē in Germanic. Zeitschrift für vergleichende Sprachforschung, 86(1), pp. 104-110.

Grønvik, O., 1998. Untersuchungen zur älteren nordischen und germanischen Sprachgeschichte. Peter Lang.

Pullum, G. K., 1976. The Duke of York gambit. Journal of Linguistics, 12(1), pp. 83-102.

Ringe, D. & Taylor, A., 2014. A Linguistic History of English: Volume II, The Development of Old English. OUP.


13 responses to “The Duke of York gambit in diachronic linguistics”

  1. A good concept, thanks for introducing me to this.

    The synchronic version of this reminds me of the process of rule inversion: a historical development *A > B can end up producing a morphophonological rule B → A, if the initial change has some exception environment, and later on state B gets reinterpreted as the default state. (A simple example is the palatal-velar rules posited already in ancient Sanskrit grammars.) A “Duke of York gambit” would seem to then represent an intermediate stage, where //A// can still be treated as the unmarked member, but the rule A → B has already become prominent enough to become low-ranked; this then requires creating ex nihilo a rule B → A to account for what are historically actually conditional retentions.

    As for the diachronic version: I have no doubt that such changes happen, but for some cases of supporting evidence like the Anglo-Frisian demonstratives, interdialectal loaning would seem to be another feasible explanation. After *ā > *ǣ, an innovative replacement *þar > *þār being introduced from Istvaeonic (or from other Germanic dialects further east) would have no choice but to be adopted as *þǣr. Within this approach, there may also be some complications for this issue from the existence of Germanic *ē₂; perhaps Anglo-Frisian *ǣ could even be considered simply archaic, both with respect to Gothic ē and rest-Germanic *ā. (Loanword evidence in Finnic seems to be compatible with this, but that would be too big of a topic for me to go into right now.)

    I additionally have wondered before if some number of what we think of as “conditioned retentions” are actually historical Duke of York developments. For a simple example, the Early Modern English sound change *æ > ɑ / w_ (was, waffle etc.) fails to take place before velars (wacky, wag etc.). This could be however also modelled by a sequence of three changes: 1) *æ acquires an allophone that we could mark *æᵏ before velars (phonetically e.g. a slight diphthong [æɐ] might be feasible); 2) the default allophone is backed after *w; 3) most varieties (but perhaps not all, given widespread æ-breaking or raising in American English before voiced velars?) merge *æᵏ back into [æ].

  2. David Marjanović

    in of themselves

    < in ‘n’ of themselves < in and of themselves 🙂

    In this particular case, an objection is reasonable enough: why not simply propose that in fast, connected speech, the words in phrases run into one another and become unitary words? In that case, the rule devoicing word-final obstruents would not apply to the d in goed boek in the first place because it would not be in word-final position; the word-final segment would be the k in boek.

    There are of course languages that lack word-final but have utterance-final devoicing. Trouble is, English is one, and it ends up devoicing a lot less than Dutch does. To avoid the Duke of York (who is he anyway?) we’d have to postulate ad hoc that Dutch utterances are a lot shorter than English ones…

    Doesn’t Dutch already have regressive voice assimilation? I spent a week in Holland (actual Holland 🙂 ) last month and thought I noticed it a lot – northern German has progressive devoicing instead and never takes a break from its syllable-final fortition.

    paralleled by the present-day shift /ɑ̃/ > [ɔ̃] in some French dialects

    Unrounded [ɑ̃] survives in Québec, but good luck finding it in France. What young people in Paris are doing these days is raising their rounded [ɒ̈̃] to [ɔ̃] and beyond, so that it becomes very similar to the [õ] that many dictionaries still transcribe as its 19th-century pronunciation [ɔ̃].

    perhaps Anglo-Frisian *ǣ could even be considered simply archaic, both with respect to Gothic ē and rest-Germanic *ā. (Loanword evidence in Finnic seems to be compatible with this, but that would be too big of a topic for me to go into right now.)

    That has been proposed, but it’s just the same thing in more precise notation: obviously, [æː] had to be an intermediate step on the way from [ɛː] to [aː], so it’s not surprising that Finnic has Pre-Northwest-Germanic loans with *ää.

    I think the simplest explanation is the Duke of York derivation:

    PIE [ɛː] > PGmc. [ɛː]
    1) > Gothic [eː]
    2) > Pre-NW-Gmc. [æː] > Proto-NW-Gmc. [aː] > Proto-W-Gmc. [aː] > Proto-Anglo-Frisian:
    a) [ɑː] before /w/
    b) [ɑ̃ː] > [ɒ̃ː] > [ɔ̃ː] > [õː] (reanalyzed as /oː/) before /n/
    c) [æː] elsewhere; retained as such in West Saxon, raised to [ɛː] ?> [eː] in the rest of Anglo-Frisian.

    By “Proto-” I mean the last common ancestor of the attested languages, the stage right at the first split.

    The developments 1) and 2) both happened on the way to Proto-Celtic, too: 1) in stressed syllables ( > PC [iː]), 2) in unstressed ones. The development 2b) has odd sporadic occurrences in High German, e.g. Mond (with [oː]!) “moon”. (The -d is one of those strange Early NHG “excrescent consonants” that apparently served to mark the end of the word, as words became emphasized more and more above syllables.)

    For a simple example, the Early Modern English sound change *æ > ɑ / w_ (was, waffle etc.) fails to take place before velars (wacky, wag etc.). This could be however also modelled by a sequence of three changes:

    I agree.

    • I was convinced it was just “in of themselves”, but apparently not 🙂

      I didn’t know about the Early NHG excrescent consonant phenomenon—interesting! I wouldn’t have thought it possible for a final /-t/ to be inserted out of nowhere due to emphasis. Are there any possible alternative explanations (rebracketing, or importation from related words, or something like that)?

  3. David Marjanović

    Test: are comments moderated, or is there an overzealous spam filter?

  4. David Marjanović

    Thought so.

    No words are related to “moon”. 🙂 Another example is Saft “sap, juice”. And for people without syllable-final fortition, like me, Mond really does end in /d/; nothing changes on the phonetic level in the plural Monde.

  5. David Marjanović

    Further examples: 1) jemand “some-/anybody” and niemand “nobody”, *je-/ne-mann- and then the excrescent d; 2) possibly -schaft “-ship”, but that’s Original Research as far as I know; 3) to some extent the Fugen-s which often goes between the components of compound nouns – it descends from a genitive ending, but goes on components that have never formed a genitive in -s, removes the syllable structure of the results even farther from the CV ideal, and apparently serves to mark the ends of phonological words. (Unlike in English, where e.g. -man and -land often get their vowel reduced, recognizable components always stay separate phonological words in German.)

    • David Marjanović

      Another is Axt “ax”. A day or two ago I encountered one with -cht and promptly forgot it again… 😦

  6. David Marjanović

    Yet further examples: PIE *ei > PGmc. *ī > mainstream German ei (still [iː] in Low German and High Alemannic, but independently ij in Dutch…). Also Gothic ei, which is a purely graphic trap.

  7. David Marjanović

    The simplest way to explain this is to propose that there has been an irregular lengthening of these words to *þār and *hwār in Proto-West Germanic

    Apparently, an even simpler way to explain “where”, “there” and “here” all at once is:

    1) stressed [ir] > [er], stressed [iz] > [ez] and stressed [ist] > [est] on the way to PGmc; this [e] remains distinct from /ɛ/
    2) monosyllabic words ending in /r/ are lengthened, giving us two more words with the extremely rare PGmc. /aː/*, and also giving us PGmc. [eː] as, apparently, a one-word allophone of /i/

    East Germanic:
    evidence for 1) erased on the way to Gothic, except in the one word where 2) happened as well

    Northwest Germanic:
    a) [eː] develops from several more sources, forming a new phoneme
    b) later, [e] develops two more sources (i-umlaut of /ɛ/ and of /a/) and either becomes a phoneme (OHG) or merges with /ɛ/

    Source: this paper on I tentatively confirm 1) in that Nest has [e] in my dialect, which has not redistributed the vowel qualities as Standard German has done, but largely keeps MHG /e/ as /e/ and has merged MHG /ɛ/ and /ɛː/ as, largely, /ɛ/ (phonemic vowel length is lost across the board).

    * Following Ringe (2006), /aː/ otherwise existed in a few verb forms because of contraction of earlier /aja/.

  8. David Marjanović

    Ooh, here’s a good Duke of York gambit: [ʔ] > [h] > [ʔ] in the Nāhuatl dialect of Ixhuatlancillo, in some environments.

  9. the Duke of York (who is he anyway?)

    The purely fictional hero of an English nursery rhyme:

    The noble Duke of York
    He had ten thousand men
    He marched them up to the top of the hill
    And marched them down again.


    However, I don’t think His Grace is at work in the vowel breaking/unbreaking example, because Standard ME is a West Midlands dialect, whereas Standard OE is a Wessex one, and not actually the ancestor of Standard ME. There was no breaking in Mercian OE (note that the OE in The Lord of the Rings is unbroken), and it’s reasonable to suppose that the mostly-undocumented ancestor of London ME was also unbroken.


