Pullum (1976) discusses a phenomenon he evocatively calls the “Duke of York gambit”—the postulation of a derivation of the form A → B → A, which takes the underlying structure A “up to the top of the hill” into a different form B and then takes it “down again” into A on the surface (usually in a more restricted environment, otherwise the postulation of this derivation would not be able to explain anything). Such derivations are called “Duke of York derivations”.
As an illustrative example, consider the case of word-final devoicing in Dutch. Like many other languages, Dutch distinguishes its voiceless and voiced stop phonemes only in non-word-final position. In word-final position, voiceless stops are found exclusively, so that, for example, goed, the cognate of English good, is pronounced [ɣut] in isolation. But morphologically related words like goede ‘good one’, pronounced [ɣudə], seem to indicate that the segment written d is in fact underlyingly /d/ and it becomes [t] by a phonological rule that word-final obstruents become voiceless. We therefore have a derivation /d/ → [t]. Now, in fast, connected speech, goed is not always pronounced [ɣut]. Before a word that begins with a voiced obstruent such as boek ‘book’, it may be pronounced with [d]: goed boek [ɣudbuk]. Some linguists like Brink (1974) have therefore proposed a second phonological rule that grants word-final obstruents the voicing of the obstruent beginning the next word (if there is such an obstruent) in fast, connected speech. This rule applies after the first phonological rule that devoices word-final obstruents, so that the pronunciation [d] of the d in goed boek is derived from underlying /d/ by two steps: /d/ → /t/ → [d]. This is a Duke of York derivation.
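The two-step derivation just described can be made concrete with a small sketch. It assumes a simplified inventory of obstruent pairs and treats each word as a string of single-character segments. The names `DEVOICE`, `VOICE`, `final_devoicing`, `voicing_assimilation` and `derive` are invented for illustration and are not taken from Brink or Pullum. The second rule is modelled only in its voicing direction, since after the first rule every word-final obstruent is already voiceless.

```python
# Hypothetical, simplified obstruent pairs (voiced -> voiceless).
DEVOICE = {"b": "p", "d": "t", "v": "f", "z": "s", "ɣ": "x"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}
VOICED_OBSTRUENTS = set(DEVOICE)


def final_devoicing(words):
    """Rule 1: a word-final voiced obstruent becomes voiceless."""
    return [w[:-1] + DEVOICE.get(w[-1], w[-1]) for w in words]


def voicing_assimilation(words):
    """Rule 2 (fast speech only): a word-final obstruent is voiced
    when the next word begins with a voiced obstruent."""
    out = list(words)
    for i in range(len(out) - 1):
        if out[i + 1][0] in VOICED_OBSTRUENTS:
            last = out[i][-1]
            out[i] = out[i][:-1] + VOICE.get(last, last)
    return out


def derive(words, fast=False):
    """Return every stage of the derivation, underlying form first."""
    stages = [list(words)]
    stages.append(final_devoicing(stages[-1]))
    if fast:
        stages.append(voicing_assimilation(stages[-1]))
    return stages


for stage in derive(["ɣud", "buk"], fast=True):
    print(" ".join(stage))
```

With the fast-speech rule switched on, the first word passes through ɣud, ɣut and ɣud again, which is the A → B → A shape; with `fast=False` the derivation stops at ɣut.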
Many linguists, as Pullum documents, find Duke of York gambits like this objectionable. They question others’ analyses on the grounds that they postulate Duke of York derivations, and they decide between analyses of their own by disfavouring those which involve Duke of York gambits. In this particular case, an objection is reasonable enough: why not simply propose that in fast, connected speech, the words in phrases run into one another and become unitary words? In that case, the rule devoicing word-final obstruents would not apply to the d in goed boek in the first place because it would not be in word-final position; the word-final segment would be the k in boek.
Yet Pullum finds no principled reason to disfavour analyses involving Duke of York gambits just because they involve Duke of York gambits. Clearly some linguists find something unsavoury about such analyses: in the quotes in Pullum’s paper, we can find descriptions of them as “to be viewed with some suspicion” (Brame & Bordelois 1973: 115), “rather suspicious” (White 1973: 159), “theoretically quite illegitimate” (Hogg 1973: 10), “hardly an attractive solution” (Chomsky & Halle 1968: 270), “clearly farcical” (Smith 1973: 33), and “extremely implausible” (Johnson 1974: 98). (See Pullum’s paper for the full references.) But none of them articulate the problem explicitly. If an analysis can be replaced by a simpler one with equal or greater explanatory power, that’s one thing: that would be a problem by the well-established principle of Ockham’s Razor. But a Duke of York gambit does not necessarily make an analysis more complex than the alternative in any well-defined way. Even with the Dutch example above, the greater simplicity of the Duke of York gambit-less solution proposed can be questioned: is it really simpler to propose a process of allegro word unification (at the fairly deep underlying level at which the word-final devoicing rule must apply) which we might be able to do without otherwise?
Pullum mentions some other examples where a Duke of York gambit might even seem part of the obviously preferable analysis. In Nootka, according to Campbell (1973), there is a phonological rule of assimilation that turns /k/ into /kʷ/ immediately after /o/, and there is another phonological rule of word-final delabialization that turns /kʷ/ into /k/ in word-final position. And, in word-final position, the sequence /-ok/ appears and the sequence /-okʷ/ does not appear. If the word-final delabialization rule applies before the assimilation rule, then we would expect instead to find the sequence /-okʷ/ to the exclusion of /-ok/ in word-final position. The only possible analysis, if the rules are to be ordered, is to have assimilation before word-final delabialization: but this means that word-final /-ok/ undergoes the Duke of York derivation /-ok/ → /-okʷ/ → /-ok/. And the use of a model with ordered rules is not essential here, because a Duke of York derivation is obtained even in a very natural model with unordered rules: if we say that rules apply at any point that they can apply but they only apply at most once, then we again have that word-final /-ok/ is susceptible to the assimilation rule (but not the delabialization rule), so /-ok/ becomes /-okʷ/, but then /-okʷ/ is susceptible to the delabialization rule (but not the assimilation rule), so /-okʷ/ becomes /-ok/. One could propose that the assimilation rule is restricted in its application to the non-word-final environment. But this is peculiar: why should a progressive assimilation rule pay any heed to whether there is a word boundary after the assimilating segment? Any way of accounting for such a restriction could easily involve making complicated assumptions which would make the Duke of York gambit analysis preferable by Ockham’s Razor.
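The Nootka argument can be checked mechanically. The sketch below is only an illustration under stated assumptions: a form is a list of segments (so that kʷ counts as one segment), the two rules are reduced to exactly the environments given above, and the function names `assimilate`, `delabialize`, `derive` and `derive_unordered` are invented here. The last function implements the unordered model in which a rule applies whenever it can, but at most once.

```python
def assimilate(segs):
    """/k/ -> /kʷ/ immediately after /o/."""
    return [
        "kʷ" if s == "k" and i > 0 and segs[i - 1] == "o" else s
        for i, s in enumerate(segs)
    ]


def delabialize(segs):
    """/kʷ/ -> /k/ in word-final position."""
    if segs and segs[-1] == "kʷ":
        return segs[:-1] + ["k"]
    return list(segs)


def derive(segs, rules):
    """Ordered model: apply the rules once each, in the given order."""
    stages = [list(segs)]
    for rule in rules:
        stages.append(rule(stages[-1]))
    return stages


def derive_unordered(segs, rules):
    """Unordered model: any rule that changes the form applies,
    but each rule applies at most once."""
    stages = [list(segs)]
    unused = list(rules)
    progress = True
    while progress:
        progress = False
        for rule in unused:
            new = rule(stages[-1])
            if new != stages[-1]:
                stages.append(new)
                unused.remove(rule)
                progress = True
                break  # restart the scan over the remaining rules
    return stages


final_ok = ["o", "k"]  # the word-final sequence /-ok/
print(derive(final_ok, [assimilate, delabialize]))
print(derive(final_ok, [delabialize, assimilate]))
print(derive_unordered(final_ok, [delabialize, assimilate]))
```

Ordering assimilation first yields ok, okʷ, ok, and so does the unordered model whichever way the rules are listed. Ordering delabialization first leaves the unattested okʷ on the surface.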
Now, Pullum discusses only synchronic derivations in his paper. But diachronic derivations can, of course, also be Duke of York derivations. It is interesting, then, to consider how we should evaluate diachronic analyses that postulate Duke of York derivations. Such analyses are favoured or disfavoured for different reasons than synchronic analyses, so, even if one accepts Pullum’s conclusion that synchronic Duke of York gambits are unobjectionable in and of themselves, the situation could conceivably be different for diachronic Duke of York gambits.
My first intuition is that there is even less reason to object to Duke of York gambits in the diachronic context. After all, diachronic analyses deal with changes that we can actually see happening over the course of years, decades or centuries, and whose intermediate stages we can observe. (Of course, this is only the case in practice for a very small subset of the diachronic change that we are interested in; until time travel is invented, nobody can go and observe the real-time development of languages like Proto-Indo-European.) It is not inconceivable that a change might be “undone” on a short time-scale, and it seems inevitable that some changes will be undone on longer time-scales. There is some very strong evidence that such long-term diachronic Duke of York derivations have happened. The history of English provides a nice example. In Old English, front vowels were “broken” in certain environments (e.g. before h): *æ became ea, *e became eo, and *i became io. We do not, of course, know with absolute certainty exactly how these segments were pronounced, unbroken or broken, but it is at least fairly certain that unbroken *æ, *e and *i were pronounced as [æ], [e] and [i] or vowels of very similar quality. The broken vowels remained largely unchanged throughout the Old English period, except that io was everywhere replaced by eo. But by the Middle English period ea and eo had once again been “unbroken”, to a and e respectively; the only lasting change was to pre-Old English broken *i, which eventually became Middle English e. There may or may not have been minor changes in the pronunciations of these letters in the meantime ([æ] to [a], [e] to [ɛ], [i] to [ɪ]), but these seem scarcely large enough to stop this sequence of changes from counting as a diachronic Duke of York derivation.
But there are indeed linguists who appear to object to the postulation of diachronic Duke of York derivations, just like the linguists Pullum mentions. Cercignani (1972) seems to rely on such an objection in his questioning of the hypothesis that Proto-Germanic *ē became *ā in stressed syllables in pre-Old English and pre-Old Frisian. The relevant facts here are as follows.
1. The general reflexes of late Proto-Indo-European *ē in initial syllables in the Germanic languages are illustrated by the following example: Proto-Indo-European *dʰéh₁tis ‘act of putting’ (cf. Greek θέσις; Sanskrit dádhāti and Greek τίθημι for the root *dʰeh₁ ‘put’) ↣ Proto-Germanic *dēdiz ‘deed’ (with -d- [< Proto-Indo-European *-t-] levelled in from the Proto-Indo-European oblique stem *dʰh₁téy-) > Gothic -deþs in missadeþs ‘misdeed’, Old Norse dáð, Old English (West Saxon) dǣd, Old English (non-West Saxon) and Old Frisian dēd, Old Saxon dād and Old High German tāt. One can see that Gothic, Old English (non-West Saxon) and Old Frisian reflect the vowel’s presumed original mid quality, that Old Norse, Old Saxon and Old High German have shifted it to a low vowel, and that Old English (West Saxon) is intermediate, having shifted it to a near-low front vowel. Length is preserved in every case (Gothic e is a long vowel; it is just not marked with a macron diacritic because Gothic has no short /e/ phoneme). It is reasonable to reconstruct *ē for Proto-Germanic, reflecting the original Proto-Indo-European quality, and to assume that the shifts have taken place at a post-Proto-Germanic date.
2. In Old English and Old Frisian, Proto-Germanic *ē is reflected as ō where it immediately preceded an underlying nasal in Proto-Germanic (including nasals before *h and *hʷ, which were allophonically elided in Proto-Germanic): Proto-Germanic *mēnō̄ ‘moon’ (cf. Old Saxon and Old High German māno; Gothic mena with -a [< Proto-Germanic *-a] levelled in from the stem *mēnan-; Old Norse máni with -i levelled in from nouns ending in -i < *-ija [with *-a levelled in as in Gothic] ← Proto-Germanic *-ijō̄) > Old English, Old Frisian mōna.
3. In Old English, Proto-Germanic *ē is reflected as ā immediately before w: Proto-Germanic *sēgun ‘saw’ (3pl.) (cf. Old Norwegian and Old Swedish ságu, Old English [non-West Saxon] and Old Frisian sēgon; Gothic seƕun and Old High German sāhun with Gothic -ƕ- and Old High German -h- [< Proto-Germanic *-hʷ-] levelled in from the infinitive and present stems *sehʷ- and *sihʷ- and the past sg. stem *sahʷ-; Old Saxon sāwun with -w- [< Proto-Germanic *-w-] levelled in from the past subj. stem *sēwī-) ↣ Old English (West Saxon) sāwon with -w- < Proto-Germanic *-w- levelled in as in Old Saxon.
The question is in which languages the shift from *ē to *ā reflected in Old Norse, Old Saxon, Old High German and (partially, at least) in Old English (West Saxon) took place. Cercignani argues that it took place only in the languages it is reflected in, with Old English and Old Frisian being partially or totally unaffected by this shift. Let us call this the restriction hypothesis. Other linguists propose that it took place in every Germanic language other than Gothic, including Old English and Old Frisian, and that later shifts are responsible for the reflection of Proto-Germanic *ē as ǣ or ē in Old English and Old Frisian. Let us call this the extension hypothesis (because it postulates a more extensive area for the *ē > *ā shift to take place in than the restriction hypothesis). The derivation *ē > *ā > ē which must have taken place in Old English (non-West Saxon) and Old Frisian if the extension hypothesis is to be accepted is, of course, a Duke of York derivation, and it is clear that Cercignani regards this as a major strike against the extension hypothesis.
The restriction hypothesis certainly appears simpler and, therefore, preferable at first glance. However, there are various pieces of evidence that complicate matters, most obviously points 2 and 3 above. If Proto-Germanic *ē became *ā in pre-Old English and pre-Old Frisian before shifting back to a higher quality, then we can explain the reflection of Proto-Germanic *ē as ō when nasalized or immediately before a nasal as the result of a shift *ā > *ō in this environment (paralleled by the present-day shift /ɑ̃/ > [ɔ̃] in some French dialects). This is more believable than a direct shift *ē > *ō and arguably simpler than a two-step shift *ē > *ā > *ō occurring exclusively in this nasal environment. Likewise, one might argue that the postulation of a slight restriction on the environment of the *ā-fronting sound change in Old English, allowing for retention of *ā before *w, is simpler than the postulation of an entirely separate sound change shifting *ē to *ā before *w in Old English. Neither of these arguments is at all conclusive, but they might be sufficient to make the reader adjust their estimates of the two hypotheses’ probabilities a little in favour of the extension hypothesis. As far as I can tell, the thrust of Cercignani’s argument is that, even if the consideration of points 2 and 3 does make the restriction hypothesis more complicated than it seems at first glance, the postulation of Duke of York derivations is preposterous enough that the restriction hypothesis is still by far the preferable one. Naturally I, not thinking that Duke of York derivations are necessarily preposterous, disagree.
In any case, there is some more conclusive evidence for the extension hypothesis not mentioned by Cercignani, but mentioned by Ringe (2014: 13). The Proto-Germanic distal demonstrative and interrogative locative adverbs ‘there’ and ‘where’ can be reconstructed as *þar and *hʷar on the basis of Gothic þar and ƕar and Old Norse þar and hvar. Further support for these reconstructions comes from the fact that they can be transparently derived from the Proto-Germanic distal demonstrative and interrogative stems *þa- and *hʷa- by the addition of a locative suffix *-r (also found on other adverbs such as *aljar ‘elsewhere’ [cf. Gothic aljar, Old English ellor] ← *alja- ‘other’ + *-r). But in the West Germanic languages, the reflexes are as if they contained Proto-Germanic *ē: Old English (West Saxon) þǣr and hwǣr, Old English (non-West Saxon) þēr and hwēr, Old Frisian thēr and hwēr, Old Saxon thār and hwār, Old High German dār and wār. The simplest way to explain this is to propose that there has been an irregular lengthening of these words to *þār and *hwār in Proto-West Germanic, and that the *-ā- in these words was raised in Old English and Old Frisian by the same changes that raised *ā < Proto-Germanic *ē. Proponents of the restriction hypothesis must propose an irregular raising as well as a lengthening in these words, which is perhaps less believable. (One can imagine adverbs with the sense ‘here’ and ‘there’ being lengthened due to contrastive emphasis; Ringe alludes to “heavy deictic stress”, which may be the same thing, although he doesn’t explain the term.) Most importantly, they must also propose that this irregular raising happened only in Old English and Old Frisian, so that the identity of the reflexes of Proto-Germanic *a in these words with the reflexes of Proto-Germanic *ē in stressed syllables is entirely coincidental.
It is true that Proto-Germanic short *a in stressed syllables became *æ in Old English and Old Frisian, so if we propose that the irregular lengthening occurred after this change as an areal innovation among the West Germanic languages, we can account for Old English (West Saxon) þǣr and hwǣr; but this does not account for Old English (non-West Saxon) þēr and hwēr and Old Frisian thēr and hwēr, which have to be accounted for by an irregular raising.
To me this additional evidence seems fairly decisive. In that case, with the extension hypothesis accepted, we have a nice example of a diachronic Duke of York derivation which we know must have run its full course in a fairly short time, because we can date the Proto-Northwest Germanic *ē > *ā shift and the Old English *ǣ > ē shift (fed by the *ā > *ǣ shift, whose date is irrelevant here because it must have occurred in between these two) with reasonable precision. Ringe (p. 12), citing Grønvik (1998), says that the *ē > *ā shift is “attested from the second half of the 2nd century AD”. This is presumably based on runic evidence. As for the *ǣ > ē shift, it was one of the very early Old English sound changes in the dialects it took place in, being attested already in apparent completion in the oldest Old English texts (which date to the 8th century AD). The fact that it is shared with Old Frisian also suggests an early date. We can therefore say that there were at most five or six centuries between the two shifts, and quite likely considerably less.
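The chain of changes that the extension hypothesis requires can be laid out as a short sketch. It is a deliberate simplification: it tracks only the stressed vowel and the segment that follows it, it models only the plain nasals m and n, and it assumes that the pre-nasal *ā > *ō shift came before the fronting of *ā. That ordering is my assumption and is not argued for above. The function name `extension_chain` and its parameters are invented for this illustration.

```python
NASALS = {"m", "n"}  # assumption: only plain nasals are modelled


def extension_chain(vowel, following, west_saxon):
    """Successive values of a stressed long vowel under the
    extension hypothesis.

    vowel      -- starting value: "ē" (Proto-Germanic) or "ā"
                  (e.g. from the irregular lengthening in 'there')
    following  -- the segment immediately after the vowel
    west_saxon -- True for West Saxon; False for non-West Saxon
                  Old English and Old Frisian
    """
    stages = [vowel]
    # 1. Northwest Germanic lowering: *ē > *ā.
    if stages[-1] == "ē":
        stages.append("ā")
    # 2. *ā > *ō before a nasal (ordering relative to step 3 is assumed).
    if stages[-1] == "ā" and following in NASALS:
        stages.append("ō")
    # 3. Fronting *ā > *ǣ, with *ā retained before *w
    #    (the text states this retention for Old English only).
    if stages[-1] == "ā" and following != "w":
        stages.append("ǣ")
    # 4. Raising *ǣ > ē outside West Saxon.
    if stages[-1] == "ǣ" and not west_saxon:
        stages.append("ē")
    return stages


examples = [
    ("deed, non-West Saxon", "ē", "d", False),
    ("deed, West Saxon", "ē", "d", True),
    ("moon", "ē", "n", True),
    ("saw (3pl.), West Saxon", "ē", "w", True),
    ("there, non-West Saxon", "ā", "r", False),
]
for label, vowel, following, ws in examples:
    print(label + ": " + " > ".join(extension_chain(vowel, following, ws)))
```

The first example comes out as ē > ā > ǣ > ē, the Duke of York derivation at issue. The last example enters the chain at the *ā stage and ends at ē through the same steps 3 and 4, which is why the extension hypothesis needs no separate irregular raising in ‘there’ and ‘where’.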
To summarize: though they may seem somehow untidy, Duke of York derivations, whether diachronic or synchronic, are not intrinsically implausible. The simplest hypothesis that accounts for the data should always be preferred, but this is not always the hypothesis that avoids the Duke of York gambit. On the diachronic side of things, Duke of York derivations can certainly take place over many centuries—which nobody would dispute—but they can also take place over periods of just a few centuries, as evidenced by the history of Proto-Germanic *ē in Old English and Old Frisian.
Brink, D., 1974. Characterizing the natural order of application of phonological rules. Lingua, 34(1), pp. 47-72.
Campbell, L., 1973. Extrinsic order lives. Bloomington, IN: Indiana University Linguistics Club Publications.
Cercignani, F., 1972. Indo-European ē in Germanic. Zeitschrift für vergleichende Sprachforschung, 86(1), pp. 104-110.
Grønvik, O., 1998. Untersuchungen zur älteren nordischen und germanischen Sprachgeschichte. Lang.
Pullum, G. K., 1976. The Duke of York gambit. Journal of Linguistics, 12(1), pp. 83-102.
Ringe, D. & Taylor, A., 2014. A Linguistic History of English: Volume II, The Development of Old English. OUP.