Some of the phonological history of English vowels, illustrated by failed rhymes in English folk songs


  • ModE = Modern English (18th century–present)
  • EModE = Early Modern English (16th–17th centuries)
  • ME = Middle English (12th–15th centuries)
  • OE = Old English (7th–11th centuries)
  • OF = Old French (9th–14th centuries)

All of this information is from the amazingly comprehensive book English Pronunciation, 1500–1700 (Volume II) by E. J. Dobson, published in 1968, which I will unfortunately have to return to the library soon.

The transcriptions of ModE pronunciations are not meant to reflect any one accent but to provide enough information to allow the pronunciation in any particular accent to be deduced given sufficient knowledge about the accent.

I use the acute accent to indicate primary stress and the grave accent to indicate secondary stress in phonetic transcriptions. I don’t like the standard IPA notation.

Oh, the holly bears a blossom
As white as the lily flower
And Mary bore sweet Jesus Christ
To be our sweet saviour
— “The Holly and the Ivy”, as sung by Shirley Collins and the Young Tradition

In ModE flower is [fláwr], but saviour is [séjvjər]; the two words don’t rhyme. But they rhymed in EModE, because saviour was pronounced with secondary stress on its final syllable, as [séjvjə̀wr], while flower was pronounced [flə́wr].

The OF suffix -our (often spelt -or in English, as in emperor and conqueror) was pronounced /-ur/; I don’t know if it was phonetically short or long, and I don’t know whether it had any stress in OF, but it was certainly borrowed into ME as long [-ùːr] quite regularly, and regularly bore a secondary stress. In general borrowings into ME and EModE seem to have always been given a secondary stress somewhere, in a position chosen so as to minimize the number of adjacent unstressed syllables in the word. The [-ùːr] ending became [-ə̀wr] by the Great Vowel Shift in EModE, and then would have become [-àwr] in ModE, except that it (universally, as far as I know) lost its secondary stress.
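The two outcomes can be traced step by step. What follows is only a rough sketch in Python, assuming a toy representation that is mine rather than Dobson’s: each syllable is a nucleus, a coda and a stress level, and the names Syllable, great_vowel_shift, lose_secondary_stress and lower_diphthong are hypothetical labels for the changes just described. The ME input [uːr] for flower is also an assumption, inferred from its EModE form.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Syllable:
    nucleus: str  # vowel or diphthong, written without stress marks
    coda: str
    stress: int   # 2 = primary, 1 = secondary, 0 = unstressed


def great_vowel_shift(s: Syllable) -> Syllable:
    # ME long [uː] becomes EModE [əw] as long as it carries some stress
    if s.nucleus == "uː" and s.stress > 0:
        return replace(s, nucleus="əw")
    return s


def lose_secondary_stress(s: Syllable) -> Syllable:
    # Nativization: secondary stress is dropped and the vowel reduces to [ə]
    if s.stress == 1:
        return replace(s, stress=0, nucleus="ə")
    return s


def lower_diphthong(s: Syllable) -> Syllable:
    # EModE [əw] becomes ModE [aw], but only where stress has survived
    if s.nucleus == "əw" and s.stress > 0:
        return replace(s, nucleus="aw")
    return s


def derive(s: Syllable) -> Syllable:
    # Apply the three changes in chronological order
    for change in (great_vowel_shift, lose_secondary_stress, lower_diphthong):
        s = change(s)
    return s


flower = Syllable("uː", "r", stress=2)         # assumed ME rhyme of flower
saviour_final = Syllable("uː", "r", stress=1)  # ME [-ùːr] from OF -our

# EModE stage: both syllables have [əw], so the words rhyme
print(great_vowel_shift(flower).nucleus, great_vowel_shift(saviour_final).nucleus)
# ModE stage: [aw] in flower, reduced [ə] in saviour
print(derive(flower).nucleus, derive(saviour_final).nucleus)
```

The ordering matters: because the loss of secondary stress is placed before the lowering of [əw], the suffix never gets the chance to reach [-àwr].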

English shows a consistent tendency for secondary stress to disappear over time. Native English words don’t generally have secondary stress, and you could see secondary stress as a sort of protection against the phonetic degradation brought about by English’s native vowel reduction processes, serving to prevent the word from getting too dissimilar from its foreign pronunciation too quickly. Eventually, however, the word (or really suffix, in this case, since saviour, emperor and conqueror all develop in the same way) gets fully nativized, which means loss of the secondary stress and concomitant vowel reduction. According to Dobson, words probably acquired their secondary stress-less variants more or less immediately after borrowing if they were used in ordinary speech at all, but educated speech betrays no loss of secondary stress until the 17th century (he’s speaking generally here, not just about the [-ə̀wr] suffix). Disyllabic words were quickest to lose their secondary stresses, trisyllabic words (such as saviour) a bit slower, and in words with more than three syllables secondary stress often survives to the present day. There are some dialect differences, too: the suffix -ary, as in necessary, is pronounced [-ɛ̀ri] in General American but [-əri] in RP, and often just [-ri] in more colloquial British English.

The pronunciation [-ə̀wr] is recorded as late as 1665 by Owen Price (The Vocal Organ). William Salesbury (1547–1567) spells the suffix as -wr in Welsh orthography, which could reflect a pronunciation [-ùːr] or [-ur]; the former would be the result of occasional failure of the Great Vowel Shift before final [r] as in pour, tour, while the latter would be the probable initial result of vowel reduction. John Hart (1551–1570) has [-urz] in governors. So the [-ə̀wr] pronunciation was in current use throughout the 17th century, although the reduced forms were already being used occasionally in Standard English during the 16th. Exactly when [-ə̀wr] became obsolete, I don’t know (because Dobson doesn’t cover the ModE period).

Bold General Wolfe to his men did say
Come lads and follow without delay
To yonder mountain that is so high
Don’t be down-hearted
For we’ll gain the victory
— “General Wolfe” as sung by the Copper Family

Our king went forth to Normandy
With grace and might of chivalry
The God for him wrought marvelously
Wherefore England may call and cry
— “Agincourt Carol” as sung by Maddy Prior and June Tabor

This is another case where loss of secondary stress is the culprit. The words victory, Normandy and chivalry are all borrowings of OF words ending in -ie /-i/. They would therefore have ended up having [-àj] in ModE, like cry, had it not been for the loss of the secondary stress. For the -y suffix this occurred quite early in everyday speech, already in late ME, but the secondarily stressed variants survived to be used in poetry and song for quite a while longer. Alexander Gil’s Logonomia Anglica (1619) explicitly remarks that pronouncing three-syllable, initially-stressed words ending in -y with [-ə̀j] is something that can be done in poetry but not in prose. Dobson says that apart from Gil’s, there are few mentions of this feature of poetic speech during the 17th century; we can perhaps take this as an indication that it was becoming unusual to pronounce -y as [-ə̀j] even in poetry. I don’t know exactly how long the feature lasted. But “General Wolfe” is a folk song whose earliest possible year of composition can be identified—1759, the date of General Wolfe’s death—so the feature seems to have been present well into the 18th century.

They’ve let him stand till midsummer day
Till he looked both pale and wan
And Barleycorn, he’s grown a beard
And so become a man
— “John Barleycorn” as sung by The Young Tradition

In ModE wan is pronounced [wɒ́n], with a different vowel from man [mán]. But wan used to have the same vowel as man; the influence of the preceding [w] resulted in rounding to an o-vowel. The origins of this change are traced by Dobson to the East of England during the 15th century. There is evidence of the change from the Paston Letters (a collection of correspondence between members of the Norfolk gentry between 1422 and 1509) and the Cely Papers (a collection of correspondence between wealthy wool merchants owning estates in Essex between 1475 and 1488); the Cely Papers only exhibit the change in the word was, but the change is more extensive in the Paston Letters and in fact seems to have applied after the other labial consonants [b], [f] and [v] too for these letters’ writers.

There is no evidence of the change in Standard English until 1617, when Robert Robinson in The Art of Pronunciation notes that was, wast (as in thou wast) and what have [ɒ́] rather than [á]. The initial restriction of the change to unstressed function words, as in the Cely Papers, suggests the change did indeed spread from the Eastern dialects. Later phoneticians during the 17th century record the [ɒ́] pronunciation in more and more words, but the change is not regular at this point; for example, Christopher Cooper (1687) has [ɒ́] in watch but not in wan. According to Dobson, relatively literary words such as wan and quality, not often used in everyday speech, did not reliably have [ɒ́] until the late 18th century.

Note that the change also applied after [wr] in wrath, and that words in which a velar consonant ([k], [g] or [ŋ]) followed the vowel were regular exceptions (cf. wax, wag, twang).
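Stated as a rule, the end result of the change is simple enough to write down. Here is a minimal sketch in Python, assuming simplified transcriptions without stress marks; the function name round_after_w and the VELARS tuple are my own hypothetical labels, and the sketch models only the regular outcome, not the word-by-word spread described above.

```python
# Consonants that block the rounding when they follow the vowel
VELARS = ("k", "g", "ŋ")


def round_after_w(word: str) -> str:
    """Round [a] to [ɒ] after [w] or [wr], unless a velar follows."""
    for i, seg in enumerate(word):
        if seg != "a":
            continue
        preceded_by_w = word[:i].endswith(("w", "wr"))
        followed_by_velar = word[i + 1:i + 2] in VELARS
        if preceded_by_w and not followed_by_velar:
            return word[:i] + "ɒ" + word[i + 1:]
    return word


# Rounding applies in wan, watch, wrath and quality,
# but is blocked in wax, wag and twang; man is unaffected
for w in ["wan", "watʃ", "wraθ", "kwalɪti", "waks", "wag", "twaŋ", "man"]:
    print(w, "->", round_after_w(w))
```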

I’ll go down in some lonesome valley
Where no man on earth shall e’er me find
Where the pretty little small birds do change their voices
And every moment blows blusterous winds
— “The Banks of the Sweet Primroses” as sung by the Copper Family

The expected ModE pronunciation of OE wind ‘wind’ would be [wájnd], resulting in homophony with find. Indeed, as far as I know, every other monosyllabic word with OE -ind has [-ájnd] in Modern English (mind, grind, bind, kind, hind, rind, …), resulting from an early ME sound change that lengthened final-syllable vowels before [nd] and various other clusters containing two voiced consonants at the same place of articulation (e.g. [-ld] as in wild).

It turns out that [wájnd] did use to be the pronunciation of wind for a long time. The OED entry for wind, written in the early 20th century, actually says that the word is still commonly taken to rhyme with [-ajnd] by “modern poets”; and Bob Copper and co. can be heard pronouncing winds as [wájndz] in their recording of “The Banks of the Sweet Primroses”. The [wínd] pronunciation reportedly became usual in Standard English only in the 17th century. It is hypothesized to be a result of backformation from the derivatives windy and windmill, in which lengthening never occurred because the [nd] cluster was not in word-final position. It is unlikely to be due to avoidance of homophony with the verb wind, because the words spent several centuries being homophonous without any issues arising.
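The contrast between wind and its derivatives can be made concrete. The following is a rough Python sketch under simplifying assumptions of my own: the lengthening is restricted to short [i] before word-final [nd], the Great Vowel Shift is collapsed into a single step from [iː] to ModE [aj], and the function names are hypothetical.

```python
def lengthen_before_final_nd(form: str) -> str:
    """Early ME: lengthen short [i] before word-final [nd]."""
    if form.endswith("ind"):
        return form[:-3] + "iːnd"
    return form


def great_vowel_shift(form: str) -> str:
    """Long [iː] ends up as the ModE diphthong [aj]."""
    return form.replace("iː", "aj")


def expected_modern(form: str) -> str:
    # Lengthening first, then the vowel shift
    return great_vowel_shift(lengthen_before_final_nd(form))


# wind and find come out alike; windy and windmill keep the short vowel
for w in ["wind", "find", "windi", "windmil"]:
    print(w, "->", expected_modern(w))
```

On this account the short vowel of ModE wind is not a regular outcome at all; it has to be borrowed from forms like windy where the cluster was never word-final.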

Meeting is pleasure but parting is a grief
And an inconstant lover is worse than a thief
A thief can but rob me and take all I have
But an inconstant lover sends me to the grave
— “The Cuckoo”, as sung by Anne Briggs

As the spelling suggests, the word have used to rhyme with grave. The word was confusingly variable in form in ME, but one of its forms was [haːvə] (rhyming with grave) and another one was [havə]. The latter could have been derived from the former by vowel reduction when the word was unstressed, but this is not the only possible source of it (e.g. another one would be analogy with the second-person singular form hast, where the a was in a closed syllable and therefore would have been short); there does not seem to be any consistent conditioning by stress in the forms recorded by 16th- and 17th-century phoneticians, who use both forms quite often. There are some who have conditioning by stress, such as Gil, who explicitly describes [hǽːv] as the stressed form and [hav] as the unstressed form. I don’t know how long [hǽːv] (and its later forms, [hɛ́ːv], [héːv], [héjv]) remained a variant usable in Standard English, but according to the Traditional Ballad Index, “The Cuckoo” is attested no earlier than 1769.

Now the day being gone and the night coming on
Those two little babies sat under a stone
They sobbed and they sighed, they sat there and cried
Those two little babies, they laid down and died
— “Babes in the Wood” as sung by the Copper Family

In EModE there was occasional shortening of stressed [ɔ́ː], so that it developed into ModE [ɒ́] rather than [ów] as normal. It is a rather irregular and mysterious process; examples of it which have survived into ModE include gone (< OE ġegān), cloth (< OE clāþ) and hot (< OE hāt). The 16th- and 17th-century phoneticians record many other words which once had variants with shortening that have not survived to the present day, such as both, loaf, rode, broad and groat. Dobson mentions that Elisha Coles (1675–1679) “knew some variant, perhaps ŏ in stone”; the verse from “Babes in the Wood” above would be additional evidence that stone was at some point pronounced by some people as [stɒ́n], thus rhyming with on. As far as I know, there is no way it could have been the other way round, with on having [ɔ́ː]; the word on has always had a short vowel.

“So come riddle to me, dear mother,” he said
“Come riddle it all as one
Whether I should marry with Fair Eleanor
Or bring the brown girl home” (× 2)

“Well, the brown girl, she has riches and land
Fair Eleanor, she has none
And so I charge you do my bidding
And bring the brown girl home” (× 2)
— “Lord Thomas and Fair Eleanor” as sung by Peter Bellamy

In “Lord Thomas and Fair Eleanor”, the rhymes on the final consonant are often imperfect (although the consonants are always phonetically similar). These two verses, however, are the only ones where the vowels aren’t the same in the modern pronunciation—and there’s good reason to think they were the same once.

The words one and none are closely related. The OE word for ‘one’ was ān; the OE word for ‘none’ was nān; the OE word for ‘not’ was ne; the second is simply the result of adding the third as a prefix to the first: ‘not one’.

OE ā normally becomes ME [ɔ́ː] and then ModE [ów] in stressed syllables. If it had done that in one and none, they’d be near-rhymes with home today, save for the difference in the final nasals’ places of articulation. Indeed, in only, which is a derivative of one with the -ly suffix added, we have [ów] in ModE. But the standard ModE pronunciations of one and none are [wʌ́n] and [nʌ́n] respectively. There are also variant forms [wɒ́n] and [nɒ́n] widespread across England. How did this happen? As usual, Dobson has answers.

The [nɒ́n] variant is the easiest one to explain, at least if we consider it in isolation from the others. It’s just the result of sporadic [ɔ́ː]-shortening before [n], as in gone (see above on the on–stone rhyme). As for [nʌ́n]—well, ModE [ʌ] is the ordinary reflex of short ME [u], but there is a sporadic [úː]-shortening change in EModE besides the sporadic [ɔ́ː]-shortening one. This change is quite common and reflected in many ModE words such as blood, flood, good, book, cook, wool, although I don’t think there are any where it happens before n. So perhaps [nɔ́ːn] underwent a shift to [nóːn] somehow during the ME period, which would become [núːn] by the Great Vowel Shift. As it happens there is some evidence for such a shift in ME from occasional rhymes in ME texts, such as hoom ‘home’ with doom ‘doom’ and forsothe ‘forsooth’ with bothe ‘both’ in the Canterbury Tales. However, there is especially solid evidence for it in the environment after [w], in which environment most instances of ME [ɔ́ː] exhibit raising that has passed into Standard English (e.g. who < OE hwā, two < OE twā, ooze < OE wāse; woe is an exception in ModE, although it, too, is listed as a homophone of woo occasionally by Early Modern phoneticians). Note that although all these examples happen to have lost the [w], presumably by absorption into the following [úː] after the Great Vowel Shift occurred, there are words such as womb with EModE [úː] which have retained their [w], and phoneticians in the 16th and 17th centuries record pronunciations of who and two with retained [w]. So if ME [ɔ́ːn] ‘one’ somehow became [wɔ́ːn], and then raising to [wóːn] occurred due to the [w], then this vowel would be likely to spread by analogy to its derivative [nɔ́ːn], allowing for the emergence of [wʌ́n] and [nʌ́n] in ModE. The ModE [wɒ́n] and [nɒ́n] pronunciations can be accounted for by assuming the continued existence of an un-raised [wɔ́ːn] variant in EModE alongside [wúːn].

As it happens there is a late ME tendency for [j] to be inserted before long mid front vowels and, a little less commonly, for [w] to be inserted before long mid back vowels. This glide insertion only happened in initial syllables, and usually only when the vowel was word-initial or the word began with [h]; but there are occasional examples before other consonants such as John Hart’s [mjɛ́ːn] for mean. The Hymn of the Virgin (uncertain date, 14th century), which is written in Welsh orthography and therefore more phonetically transparent than usual, evidences [j] in earth. John Hart records [j] in heal and here, besides mean, and [w] in whole (< OE hāl). 17th-century phoneticians record many instances of [j]- and [w]-insertion, giving spellings such as yer for ‘ere’, yerb for ‘herb’, wuts for ‘oats’ (this one also has shortening)—but they frequently condemn these pronunciations as “barbarous”. Christopher Cooper (1687) even mentions a pronunciation wun for ‘one’, although not without condemning it for its barbarousness. The general picture seems to be that glide insertion was widespread in dialects, and filtered into Standard English to some degree during the 16th century, but there was a strong reaction against it during the 17th century and it mostly disappeared—except, of course, in the word one, for which, according to Dobson, the [wʌ́n] pronunciation became normal around 1700. The [nʌ́n] pronunciation for ‘none’ is first recorded by William Turner in The Art of Spelling and Reading English (1710).
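To keep the competing pathways for one straight, it may help to lay them out as ordered lists of changes. This is a hedged sketch in Python rather than anything taken from Dobson: forms are plain strings without stress marks, every function name is a hypothetical label of mine, and the sporadic changes are modelled as steps that are simply included in or left out of a pathway by hand.

```python
def w_insertion(form: str) -> str:
    # Late ME glide insertion before a word-initial long mid back vowel
    return "w" + form if form.startswith("ɔː") else form


def raising_after_w(form: str) -> str:
    # ME [ɔː] is raised to [oː] after [w]
    return form.replace("wɔː", "woː")


def great_vowel_shift(form: str) -> str:
    # ME [oː] becomes [uː]
    return form.replace("oː", "uː")


def shorten_long_u(form: str) -> str:
    # Sporadic EModE shortening of [uː]
    return form.replace("uː", "u")


def unround_short_u(form: str) -> str:
    # Short [u] gives ModE [ʌ]
    return form.replace("u", "ʌ")


def shorten_long_open_o(form: str) -> str:
    # Sporadic EModE shortening of [ɔː], giving ModE [ɒ]
    return form.replace("ɔː", "ɒ")


def derive(form, changes):
    """Return every stage of the derivation, starting with the input."""
    stages = [form]
    for change in changes:
        form = change(form)
        stages.append(form)
    return stages


# Pathway to standard [wʌn]
standard = derive("ɔːn", [w_insertion, raising_after_w, great_vowel_shift,
                          shorten_long_u, unround_short_u])
# Pathway to the variant [wɒn], with no raising
variant = derive("ɔːn", [w_insertion, shorten_long_open_o])

print(" > ".join(standard))
print(" > ".join(variant))
```

The forms of none would follow the same steps after the first one, with the raised vowel supplied by analogy with one rather than by a preceding [w].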

Finally, I should mention that sporadic [úː]-shortening is also recorded as applying to home, resulting in the pronunciation [hʌ́m]; and Turner has this pronunciation, as do many English traditional dialects. So it’s possible that the rhyme in “Lord Thomas and Fair Eleanor” is due to this change having applied to home, rather than preservation of the conservative [-ówn] forms of one and none.

The perfect pathway

Anybody who knows French or German will be familiar with the fact that the constructions in these languages described as “perfects” tend to be used in colloquial speech as simple pasts¹ rather than true perfects. This can be illustrated by the fact that the English sentence (1) is ungrammatical, whereas the French and German sentences (2) and (3) are perfectly grammatical.

  1. *I have left yesterday.
  2. Je suis parti hier.
    I am leave-PTCP yesterday
    “I left yesterday.”
  3. Ich habe gestern verlassen.
    I have-1SG yesterday leave-PTCP
    “I left yesterday.”

The English perfect is a true perfect, referring to a present state which is the result of a past event. So, for example, the English sentence (4) is paraphrased by (5).

  4. I have left.
  5. I am in the state of not being present resulting from having left.

As it is specifically present states which are referred to by perfects, it makes no sense for a verb in the perfect to be modified by an adverb of past time like ‘yesterday’. That’s why (1) is ungrammatical. In order for ‘yesterday’ to modify the verb in (1), the verb would have to refer to a past state resulting from an event further in the past; the appropriate category for such a verb is not the perfect but rather the pluperfect or past perfect, which is formed in the same way as the perfect in English except that the auxiliary verb have takes the past tense. It’s perfectly fine for adverbs of past time to modify the main verbs of pluperfect constructions; cf. (6).

  6. I had left yesterday.

If the French and German “perfects” were true perfects like the English perfect, (2) and (3) would have to be ungrammatical too, and as they are not in fact ungrammatical we can conclude that these “perfects” are not true perfects. (Of course one could also conclude this from asking native speakers about the meaning of these “perfects”, and one has to take this step to be able to conclude that they are in fact simple pasts; the above is just a neat way of demonstrating their non-true perfect nature via the medium of writing.)

French and German verbs do have simple past forms which have a distinctive inflection; for example, partis and verließ are the first-person singular inflected simple past forms of the verbs meaning ‘leave’ in sentences (2) and (3) respectively, corresponding to the first-person singular present forms pars and verlasse. But these inflected simple past forms are not used in colloquial speech; their function has been taken over by the “perfect”. If you take French or German lessons you are taught how to use the “perfect” before you are taught how to use the simple past, because the “perfect” is more commonly used; it’s the other way round if you take English lessons, because in English the simple past is not restricted to literary speech, and is more common than the perfect as it has a more basic meaning.

The French and German “perfects” were originally true perfects even in colloquial speech, just as in English. So how did this change in meaning from perfect to simple past occur? One way to understand it is as a simple case of generalization. The perfect is a kind of past; if one were to translate (4) into a language such as Turkish which does not have any sort of perfect construction, but does have a distinction between present and past tense, one would translate it as a simple past, as in (7).

  7. Ayrıldım.
    “I left / have left.”

The distinction in meaning between the perfect and the simple past is rather subtle, so it is not hard to imagine the two meanings being confused with each other frequently enough that the perfect came eventually to be used with the same meaning as the simple past. This could have been a gradual process. After all, it is often more or less a matter of arbitrary perspective whether one chooses to focus on the state of having done something, and accordingly use the perfect, or on the doing of the thing itself, and accordingly use the simple past. Here’s an example: if somebody tells you to look up the answer to a question which was raised in a discussion of yours with them, and you go away and look up the answer, and then you meet this person again, you might say either “I looked up the answer” or “I’ve looked up the answer”. At least to me, neither utterance seems any more expected in that situation than the other. French and German speakers may have tended over time to err more and more on the side of focusing on the state, so that the perfect construction became more and more common, and this would encourage reanalysis of the meaning of the perfect as the same as that of the simple past.

But it might help to put this development in some further context. It’s not only in French and German that this development from perfect to simple past has occurred. In fact, it seems to be pretty common. Well, I don’t know about other families, but it is definitely common among the Indo-European (IE) languages. There is, in fact, evidence that the development occurred in the history of English, during the development of Proto-Germanic from Proto-Indo-European (PIE). (This means German has undergone the development twice!) I’ll talk a little bit about this pre-Proto-Germanic development, because it’s a pretty interesting one, and it ties in with some of the other cases of the development attested from IE languages.

PIE (or at least a late stage of it; we’ll talk more about that issue below) distinguished three different aspect categories, which are traditionally called the “present”, “aorist” and “perfect”. The names of these aspects do not have their usual meanings—if you know about the distinction between tense and aspect, you probably already noticed that “present” is normally the name of a tense, rather than an aspect. (Briefly, tense is an event or state’s relation in time to the speech act, aspect is the structure of the event on the timeline without any reference to the speech act; for example, aspect includes things like whether the event is completed or not. But this isn’t especially important to our discussion.) The better names for the “present” and “aorist” aspects are imperfective and perfective, respectively. The difference between them is the same as that between the French imperfect and the French simple past: the perfective (“aorist”) refers to events as completed wholes and the imperfective (“present”) refers to other events, such as those which are iterated, habitual or ongoing. Note that present events cannot be completed yet and therefore can only be referred to by imperfectives (“presents”). But past events can be referred to by either imperfectives or perfectives. So, although PIE did distinguish two tenses, present and past, in addition to the three aspects, the distinction was only made in the imperfective (“present”, although that name is getting especially confusing here) aspect because the perfective (“aorist”) aspect entailed past tense. The past tense of the imperfective aspect is called the imperfect rather than the past “present” (I guess even IEists would find that terminology too ridiculous).

So what was the meaning of the PIE “perfect”? Well, the PIE “perfect” is reflected as a true perfect in Classical Greek. The system of Classical Greek, with the imperfect, aorist and true perfect all distinguished from one another, was more or less the same as that of modern literary French. However, according to Ringe (2006: 25, 155), the “perfect” in the earlier Greek of Homer’s poems is better analyzed as a simple stative, referring to a present state without any implication of this state being the result of a past event. Now, I’m not sure exactly what the grounds for this analysis are. Ringe doesn’t elaborate on it very much and the further sources it refers to (Wackernagel 1904; Chantraine 1927) are in German and French, respectively, so I can’t read them very easily. The thing is, every state has a beginning, which can be seen as an event whose result is the state, and thus every simple stative can be seen as a perfect. English does distinguish simple statives from perfects (predicative adjectives are stative, as are certain verbs in the present tense, such as “know”). The difference seems to me to be something to do with how salient the event that begins the state—the state’s inception—is. Compare sentences (8) and (9), which have more or less the same meaning except that the state’s inception is more salient in (9) (although still not as salient as it is in (10)).

  8. He is dead.
  9. He has died.
  10. He died.

But I don’t know if there are any more concrete diagnostic tests that can distinguish a simple stative from a perfect. Homeric and Classical Greek are extinct languages, and it seems like it would be difficult to judge the salience of inceptions of states in sentences of these languages without having access to native speaker intuitions.

It is perhaps the case that some states are crosslinguistically more likely than others to be referred to by simple statives, rather than perfects. Perhaps the change was just that the “perfect” came to be used more often to refer to states that crosslinguistically tend to be referred to by perfects. Ringe (2006: 155) says:

… a large majority of the perfects in Classical Attic are obvious innovations and have meanings like that of a Modern English perfect; that is, they denote a past action and its present result. We find ἀπεκτονέναι /apektonénai/ ‘to have killed’, πεπομφέναι /pepompʰénai/ ‘to have sent’, κεκλοφέναι /keklopʰénai/ ‘to have stolen’, ἐνηνοχέναι /enęːnokʰénai/ ‘to have brought’, δεδωκέναι /dedǫːkénai/ ‘to have given’, γεγραφέναι /gegrapʰénai/ ‘to have written’, ἠχέναι /ęːkʰénai/ ‘to have led’, and many dozens more. Most are clearly new creations, but a few appear to be inherited stems that have acquired the new ‘resultative’ meaning, such as λελοιπέναι /leloipénai/ ‘to have left behind’ and ‘to be missing’ (the old stative meaning).

These newer perfects could still be glossed as simple statives (‘to be a thief’ instead of ‘to have stolen’, etc.) but the states they refer to do seem to me to be ones which inherently tend to involve a salient reference to the inception of the state.

There is a pretty convincing indication that the “perfect” was a simple stative at some point in the history of Greek: some Greek verbs whose meanings are conveyed by lexically stative verbs or adjectives in English, such as εἰδέναι ‘to know’ and δεδιέναι ‘to be afraid of’, only appear in the perfect and pluperfect. These verbs are sometimes described as using the perfect in place of the present and the pluperfect in place of the imperfect, although at least in Homeric Greek their appearance in only the perfect and pluperfect is perfectly natural in respect of their meaning and does not need to be treated as a special case. These verbs continued to appear only in the perfect and pluperfect during the Classical period, so they do not tell us anything about when the Greek “perfect” became a true perfect.

Anyway, it is on the basis of the directly attested meaning of the “perfect” in Homeric Greek that the PIE “perfect” is reconstructed as a simple stative. Other IE languages do preserve relics of the simple stative meaning which add to the evidence for this reconstruction. There are in fact relics of the simple stative meaning in the Germanic languages which have survived, to this day, in English. These are the “preterite-present” or “modal” verbs: can, dare, may, must, need, ought, shall and will. Unlike other English verbs, these verbs do not take an -s ending in the third person singular (dare and need can take this ending, but only when their complements are to-infinitives rather than bare infinitives). Apart from will (which has a slightly more complicated history), the preterite-present verbs are precisely those whose presents are reflexes of PIE “perfects” rather than PIE “presents” (although some of them have unknown etymologies). It is likely that they were originally verbs that appeared only in the perfect, like Greek εἰδέναι ‘to know’.²

Most of the PIE “perfects”, however, ended up as the simple pasts of Proto-Germanic strong verbs. (That’s why the preterite-present verbs are called preterite-presents: “preterite” is just another word for “past”, and the presents of preterite-present verbs are inflected like the pasts of other verbs.) Presumably these “perfects” underwent the whole two-step development from simple stative to perfect to simple past. There was plenty of time for this to occur: remember that the Germanic languages are unattested before 100 AD, and the development of the true perfect in Greek had already occurred by 500 BC. Just as the analytical simple pasts of colloquial French and German, which are the reflexes of former perfects, have completely replaced the older inflected simple pasts, so the PIE “perfects” completely replaced the PIE “aorists” in Proto-Germanic. According to Ringe (2006: 157) there is absolutely no trace of the PIE “aorist” in any Germanic language. Proto-Germanic also lost the PIE imperfective-perfective opposition, and again the simple pasts reflecting the PIE “perfects” completely replaced the PIE imperfects—with a single exception. This was the verb *dōną ‘to do’, whose past stem *ded- is a reflex of the PIE present stem *dʰédeh₁ ‘put’. Admittedly, the development of this verb as a whole is somewhat mysterious (it is not clear where its present stem comes from; proposals have been put forward, but Ringe 2006: 160 finds none of them convincing) but given its generic meaning and probable frequent use it is not surprising to find it developing in an exceptional way. One reason we can be quite sure it was used very frequently is that the *ded- stem is the same one which is thought to be reflected in the past tense endings of Proto-Germanic weak verbs. There’s a pretty convincing correspondence between the Gothic weak past endings and the Old High German (OHG) past endings of tuon ‘to do’:

                         Past of Gothic waúrkjan ‘to make’   Past of OHG tuon ‘to do’
Singular  First-person   waúrhta ‘I made’                    tëta ‘I did’
          Second-person  waúrhtēs ‘you (sg.) made’           tāti ‘you (sg.) did’
          Third-person   waúrhta ‘(s)he made’                tëta ‘(s)he did’
Plural    First-person   waúrhtēdum ‘we made’                tāti ‘we did’
          Second-person  waúrhtēduþ ‘you (pl.) made’         tātīs ‘you (pl.) did’
          Third-person   waúrhtēdun ‘they made’              tāti ‘they did’

Note that Proto-Germanic *ē is reflected as ē in Gothic but ā in the other Germanic languages, so the alternation between -t- and -tēd- at the start of each ending in Gothic corresponds exactly, phonologically and morphologically, to the alternation between the stems tët- and tāt- in OHG.

The pasts of Germanic weak verbs must have originally been formed by an analytical construction with a syntax similar to that of the English, French and German perfect constructions, involving the auxiliary verb *dōną ‘to do’ in the past tense (probably in a sense of ‘to make’) and probably the past participle of the main verb. As pre-Proto-Germanic had SOV word order, the auxiliary verb could then be reinterpreted as an ending on the past participle, which would take us (with a little haplology) from (11) to (12).

  (11) *Ek wēpną wurhtą dedǭ.
    I weapon wrought-NSG made-1SG
    “I wrought a weapon” (lit. “I made a weapon wrought”)
  (12) *Ek wēpną wurht(ąd)edǭ.
    I weapon wrought-1SG
    “I wrought a weapon”

(The past of waúrht- is glossed here by the archaic ‘wrought’ to distinguish it from ded- ‘make’, although ‘make’ is the ideal gloss for both verbs. I should probably have just used a verb other than waúrkjan in the example to avoid this confusion, but oh well.)

Why couldn’t the pasts of weak verbs have been formed from PIE “perfects”, like those of strong verbs? The answer is that the weak verbs were those that did not have perfects in PIE to use as pasts. Many PIE verbs never appeared in one or more of the three aspects (“present”, “aorist” and “perfect”). I already mentioned the verbs like εἰδέναι < PIE *weyd- ‘to know’ which only appeared in the perfect in Greek, and probably in PIE as well. One very significant and curious restriction in this vein was that all PIE verbs which were derived from roots by the addition of a derivational suffix appeared only in the present aspect. There is no semantic reason why this restriction should have existed, and it is therefore one of the most convincing indications that PIE did not originally have morphological aspect marking on verbs. Instead, aspect was marked by the addition of derivational suffixes. There must have been a constraint on the addition of multiple derivational suffixes to a single root (perhaps because it would mess up the ablaut system, or perhaps just because it’s a crosslinguistically common constraint), and that would account for this curious restriction. Other indications that aspect was originally marked by derivational suffixes in PIE are the fact that the “present”, “aorist” and “perfect” stems of each PIE verb do not have much of a consistent formal relation to one another (there are some consistencies, e.g. all verbs which have a perfect stem form it by reduplication of the initial syllable, although *weyd- ‘know’, which has no present or aorist stem, is not reduplicated; but the general rule is one of inconsistency); there is no single present or aorist suffix, for example, and one pretty much has to learn each stem of each verb off by heart. Also, I think I’ve read, although I can’t remember where I read it, that aspect is still marked (wholly, or largely) by derivational suffixes only in Hittite.

The class of derived verbs naturally expanded over time, while the class of basic verbs became smaller. The inability of derived verbs to have perfect stems is therefore perhaps the main reason why it was necessary to use an alternative strategy for forming the pasts of some verbs in Proto-Germanic, and thus to create a new class of weak verbs separate from the strong verbs.

So that’s the history of the PIE “perfect” in Germanic (with some tangential, but hopefully interesting elaboration). A similar development occurred in Latin. A few PIE “perfects” were preserved in Latin as statives, just like the Germanic preterite-presents (meminisse ‘to remember’, ōdisse ‘to hate’, nōvisse ‘to recognize, to know (someone)’); the others became simple pasts. But I don’t know much about the details of the developments in Latin.


We’ve seen evidence from Indo-European languages that there’s a kind of developmental pathway going on: statives develop into perfects, and perfects develop into simple pasts. In order for the first step to occur there has to be some kind of stative category, and it looks like this might be a relatively uncommon feature: most of the languages I’ve seen have a class of lexically stative verbs or tend to use entirely different syntax for events and states (e.g. verbs for events, adjectives for states). (English does a bit of both.) The existence of the stative category in PIE might be associated with the whole aspectual system’s recent genesis via morphologization of derivational suffixes. Of course the second part of the pathway can occur on its own, as it did in French and German after perfects were innovated via an analytical construction. It is also possible for simple pasts to be innovated straight away via analytical constructions, as we saw with the Germanic weak past inflection.

It would be interesting to hear if there are any other examples of developments occurring along this pathway, or, even more interestingly, examples where statives, perfects or simple pasts have developed or have been developed in completely different ways, from non-Indo-European languages (or Indo-European languages that weren’t mentioned here).


  1. ^ I’m using the phrase “simple past” here to refer to the past tense without the additional meaning of the true perfect (that of a present state resulting from the past event). In French the simple past can be distinguished from the imperfect as well as the perfect: the simple past refers to events as completed wholes (and is therefore said to have perfective aspect), while the imperfect refers either to iterated or habitual events, or to part of an event without the entailment that the event was completed (and is therefore said to have imperfective aspect). The perfect also refers to events as completed wholes, but it also refers to the state resulting from the completion of such events, more or less at the same time (arguably the state is the more primary reference). In colloquial French, the perfect is used in place of the simple past, so that no distinction is made between the simple past and perfect (and the merged category takes the name of the simple past), but the distinction from the imperfect is preserved. Thus the “simple past” in colloquial French is a little different from the “simple past” in colloquial German; German does not distinguish the imperfect from the simple past in either its literary or colloquial varieties. The name “aorist” can be used to refer to a simple past category like the one in literary French, i.e., a simple past which is distinct from both the perfect and the imperfect.
  2. ^ Of course, εἰδέναι appears in the pluperfect as well as the perfect, but the Greek pluperfect was an innovative formation, not inherited from PIE, and there is no reason to think Proto-Germanic ever had a pluperfect. The Proto-Germanic perfect might well have referred to a state of indeterminate tense resulting from a past event, in which case verbs in the perfect probably could be modified with adverbs of past time like ‘yesterday’. It is a curious thing that the present and past tenses were not distinguished in the PIE “perfect”; there is no particular reason why they should not have been (simple stative meaning is perfectly compatible with both tenses, c.f. English “know” and “knew”) and it is therefore perhaps an indication that tense distinction was a recent innovation in PIE, which had not yet had time to spread to aspects other than the imperfective (“present”). The nature of the endings distinguishing the present and past tense is also suggestive of this; for example the first-person, second-person and third-person singular endings are *-mi, *-si and *-ti respectively in the present and *-m, *-s and *-t respectively in the past, so the present endings can be derived from the past endings by the addition of an *-i element. This *-i element has been hypothesised to have originally been a particle indicating present tense; it’s called the hic et nunc (‘here and now’) particle. I don’t know how the other endings are accounted for though.


Ringe, D., 2006. From Proto-Indo-European to Proto-Germanic (A Linguistic History of English, Vol. 1). Oxford University Press.

A brief history of English kinship terminology

In modern standard English, the following basic kinship terms exist:

father, mother, uncle, aunt, cousin, brother, sister, nephew, niece, husband, wife, son, daughter.

Phrases consisting of multiple words and terms which are regularly derived from more basic words via prefixes like grand- or great- or suffixes like -in-law are not included in this list. cousin is included here on the basis of its sense of ‘first cousin, i.e. uncle or aunt’s child’, not its sense of ‘relative who is not a direct ancestor or descendant’. Gender-neutral terms like parent, sibling, spouse and child are not included because, when the gender of the referent is known, it is always preferable to use a gender-specific term in English, so these terms are not as basic as the gender-specific terms.

The English kinship terminology system is a perfect example of an Eskimo kinship terminology system. Eskimo kinship terminology is the kind of terminology expected in a bilateral society, where no distinction is made between patrilineal and matrilineal ancestry and where the emphasis is on the nuclear family.

In Old English, the system was different. Here are the basic kinship terms of Old English (from the Bosworth-Toller Anglo-Saxon Dictionary):

fæder ‘father’, fædera ‘paternal uncle’, faþu ‘paternal aunt’, mōdor ‘mother’, ēam ‘uncle, esp. maternal’, mōdriġe ‘aunt, esp. maternal’, brōþor ‘brother’, sweostor ‘sister’, nefa ‘nephew, grandson’, nift ‘niece, granddaughter’, swēor ‘father-in-law’, sweġer ‘mother-in-law’, tācor ‘husband’s brother’, sunu ‘son’, snoru ‘daughter-in-law’, dōhtor ‘daughter’, āþum ‘son-in-law, sister’s husband’.

Note that the precise meanings of the Old English kinship terms are difficult to identify, because the historical evidence is often incomplete, and also there was probably variation over time and space. So there might be some more obscure words, and additional senses to the words listed above, that have not been listed here. The above list, therefore, should be taken as a close but not exact approximation of the Old English kinship terminology system. With this caveat in mind, the following differences from modern standard English can be observed.

  • A distinction is made between paternal and maternal uncles and aunts. There were specific terms for paternal uncles and aunts, fædera and faþu respectively. The other two terms, ēam and mōdriġe, appear to have not referred exclusively to maternal uncles and aunts respectively but they were chiefly used in this sense.
  • The terms nefa and nift, chiefly meaning ‘nephew’ and ‘niece’ respectively, could also be used in the sense of ‘grandson’ or ‘granddaughter’, respectively. Note that unlike the terms for uncles and aunts, maternal and paternal nieces and nephews were not distinguished, although it was possible to use more specific derived terms like brōþordōhtor ‘brother’s daughter’.
  • It’s hard to find information on the Old English terminology for cousins; it seems that it isn’t well-attested, and people disagree about what distinctions were drawn. So I haven’t included any of it here. But according to Bosworth-Toller swēor ‘father-in-law’ could be used to refer to male cousins of some kind and mōdriġe ‘maternal aunt’ could be used to refer to female cousins of some kind. The use of swēor to mean ‘cousin’ is especially interesting because it may indicate that the Anglo-Saxons practised some kind of cousin marriage.
  • Like many languages, Old English lacked basic terms for ‘husband’ and ‘wife’; the words for ‘man’ and ‘woman’, wer or ceorl and wīf or cwēn respectively, were used instead.
  • Old English had basic terms for ‘father-in-law’ and ‘mother-in-law’: swēor and sweġer respectively. It also had basic terms for ‘son-in-law’ and ‘daughter-in-law’: āþum and snoru. However, āþum had an additional meaning of ‘sister’s husband’, and in this sense it translates modern standard English brother-in-law. But brother-in-law can also mean ‘husband’s brother’, and Old English had an entirely distinct word for this sense: tācor. As for ‘sister-in-law’, Old English does not appear to have had any basic terms for this, whether in the sense of ‘wife’s sister’ or ‘brother’s wife’.

The Old English kinship system does not fit neatly into any of Morgan’s classifications. It resembles the Eskimo kinship terminology of modern standard English in that paternal and maternal nephews and nieces are not distinguished; however, it does make a distinction between paternal and maternal uncles and aunts which is more typical of a Sudanese kinship terminology system. The Old English system might be seen as a system in a state of transition between a Sudanese system and an Eskimo system. The nonexistence of a basic term for ‘wife’s sister’ and the existence of a basic term for ‘husband’s brother’ might be taken as an indication that Old English society was patrilocal.

The Proto-Germanic kinship terminology system is of course even more difficult to know about, because the language is not attested in writing. However, based on the evidence of the older Germanic languages (Gothic, Old Norse, Old English, Old Frisian, Old Saxon, Old Dutch and Old High German), we can reconstruct an approximation of the system. The following list is based on information in Lehmann (2005-2007), A Grammar of Proto-Germanic and Ringe (2006), From Proto-Indo-European to Proto-Germanic.

*fadēr ‘father’ (c.f. Goth. fadar, ON faðir, OHG fater), *mōdēr ‘mother’ (c.f. Goth. mōdar, ON móðir, OHG muoter), *nefō̄ ‘nephew, grandson’ (c.f. ON nefe, OHG nevo), *niftiz ‘niece, granddaughter’ (c.f. ON nipt, OHG nift), *brōþēr ‘brother’ (c.f. Goth. brōþar, ON bróðir, OHG bruoder), *swestēr ‘sister’ (c.f. Goth. swistar, ON systir, OHG swester), *swehuraz ‘father-in-law’ (c.f. Old Swedish svēr, OHG swehur), *swegrū ‘mother-in-law’ (c.f. Goth. swaíhra, ON sværa, OHG swigar), *taikuraz ‘husband’s brother’ (c.f. OHG zeihhur), *sunuz ‘son’ (c.f. Goth. sunus, ON sunr, OHG sunu), *snuzō ‘daughter-in-law’ (c.f. OHG snura), *duhtēr ‘daughter’ (c.f. Goth. daúhtar, ON dóttir, OHG tohter), *aiþumaz ‘son-in-law, brother-in-law’ (c.f. OHG eidum).

Note that in Gothic, there were two other words for ‘father’ and ‘mother’ besides fadar and mōdar: atta and áiþei. The first of these has a PIE ancestor, *átta (c.f. Greek átta, Latin atta, both respectful terms of address for elderly men, and Hittite attas ‘father’). Ringe (2006) reconstructs *attō̄ for Proto-Germanic. However, áiþei is of unknown origin. The resemblance to *aiþaz ‘oath’ (which has a cognate in Old Irish ōeth, but no other Indo-European cognates, so it is probably a loanword from an unknown language that entered both Celtic and Germanic) is suggestive, but it could also be entirely unrelated. áiþei may also be related to *aiþumaz, which is also of unknown origin; it has no known cognates in any non-Germanic Indo-European languages, or indeed in any non-West Germanic language.

There were probably terms for uncles, aunts and cousins in Proto-Germanic as well, but they are difficult to reconstruct. On the basis of OE ēam and OHG ōheim we can reconstruct Proto-West Germanic *auhaimaz ‘maternal uncle’. This appears to be a contraction of a compound *awahaimaz formed from *awaz ‘uncle, grandfather’ (< Proto-Indo-European *h₂éwh₂os) + *haimaz ‘home’. But this is a strange compound, because compounds in the modern Germanic languages and in Proto-Indo-European are head-final (for example, elephant shrew refers to shrews that are like elephants, not elephants that are like shrews). An *awahaimaz is a kind of uncle, so this compound appears to be head-initial. I have no idea why this is the case. The choice of this compound to denote the maternal uncle is also interesting. If *awahaimaz is interpreted as ‘uncle who lives in the same home’ that suggests that the Proto-West Germanic speakers actually had a matrilocal society. In a patrilocal society, wives move into their husbands’ homes after marriage, leaving their brothers behind, so people tend to live in extended families with their paternal uncles rather than their maternal uncles. This might seem strange, because it is pretty clear that later Germanic society and earlier Proto-Indo-European society were patrilocal. But there is, in fact, a theory that societies in the process of state formation tend to pass through a temporary matrilocal stage. For more on this, see my post on Tumblr about matrilocal societies.

There are other indications that Proto-Germanic preserved a reflex of *h₂éwh₂os (maybe *awaz?). Old Norse had the words afi ‘grandfather’ and amma ‘grandmother’; amma is probably a nursery word, but Lehmann says afi is a reflex of *h₂éwh₂os (although I don’t know why the word has -f- rather than -v-). Apparently a dative singular form awōn ‘grandmother’ is attested from Gothic, too, which would correspond to nominative singular *awō. This might be the descendant of a feminine derivative, *awō (< PIE *h₂éwh₂ah₂, if it goes back as far as that), of *awaz in Proto-Germanic.

What about the other Old English words for uncles and aunts? Well, all of them lack cognates outside of West Germanic. fædera and mōdriġe are clearly derivatives of the words for ‘father’ and ‘mother’ respectively; they were probably originally adjectives meaning ‘paternal, i.e. of a father’ and ‘maternal, i.e. of a mother’ respectively. faþu also seems to be some kind of derivative of the word for ‘father’, although I don’t know what process would turn *fadēr into *faþō. Note the apparent Verner’s Law alternation!

Old English had a word mǣġ ‘relative’, which is not a kinship term as defined here. Its cognate in Old High German, māg, also means ‘relative’. However, in Old Norse mágr was a general term meaning ‘male relative by marriage, i.e. son-in-law, brother-in-law, father-in-law’, and in Gothic mēgs meant ‘son-in-law’ specifically. This word has no cognates in other Indo-European languages, and it is possible that it was a kinship term with the ON or Goth. meaning in Proto-Germanic; then again ‘relative’ might just as well be the original meaning, especially if *aiþumaz is Proto-Germanic.

As for Proto-Indo-European, there is even more uncertainty than with Proto-Germanic, but the following kinship terms can be reconstructed.

*ph₂tḗr ‘father’ (c.f. Tocharian B pācer, Sanskrit pitā́, Old Armenian hayr, Greek patḗr, Latin pater, Old Irish athair), *máh₂tēr ‘mother’ (c.f. Tocharian B mācer, Sanskrit mātā́, Old Armenian mayr, Greek mḗtēr, Lithuanian mótė, Old Church Slavonic mati, Latin māter, Old Irish máthair), *h₂éwh₂os ‘grandfather’ (c.f. Hittite ḫūḫḫas, Old Armenian haw, Latin avus), *bráh₂tēr ‘brother’ (c.f. Sanskrit bhrātā́, Old Armenian ełbayr, Greek phrátēr, Lithuanian brólis, Old Church Slavonic bratrŭ, Latin frāter, Old Irish bráthair), *swésōr ‘sister’ (c.f. Tocharian B ṣer, Sanskrit śvasā́, Lithuanian sesuõ, Old Church Slavonic sestra, Latin soror, Old Irish siur), *swéḱuros ‘father-in-law’ (c.f. Sanskrit śvaśura, Greek hekurós, Albanian vjehërr, Old Church Slavonic svekrŭ ‘husband’s father’, Latin socer), *sweḱrúh₂ ‘mother-in-law’ (c.f. Sanskrit śvaśrū́s, Greek hekurā́, Old Church Slavonic svekry, Latin socrus), *dayhₐwḗr ‘husband’s brother’ (c.f. Sanskrit devā́, Old Armenian taygr, Greek daḗr, Old Church Slavonic děverĭ, Latin lēvir), *yénh₂tēr ‘husband’s brother’s wife’ (c.f. Sanskrit yā́tṛ, Greek enátēr, Lithuanian jéntė, Old Church Slavonic jętry), *ǵh₂lōws ‘husband’s sister’ (c.f. Greek gálōs ‘sister-in-law’, Old Church Slavonic zŭlŭva ‘husband’s sister’, Latin glōs ‘husband’s sister’), *suHnús / *suHyús ‘son’ (c.f. Tocharian B soy, Sanskrit sūnú, Greek huiús, Lithuanian sūnùs, Old Church Slavonic synŭ), *népōts ‘grandson’ (c.f. Sanskrit nápāt, Greek anepsiós ‘cousin’, Albanian nip ‘grandson, nephew’, Old Church Slavonic netijĭ ‘nephew’, Latin nepōs ‘grandson, nephew’, Old Irish nïa ‘sororal nephew’), *snusós ‘daughter-in-law’ (c.f. Sanskrit snuṣā́, Old Armenian nu, Greek nuós, Latin nurus), *dʰugh₂tḗr ‘daughter’ (c.f. Tocharian B tkācer, Sanskrit duhitā́, Old Armenian dustr, Greek thugátēr, Lithuanian duktė̃, Old Church Slavonic dŭšti).

There were probably feminine counterparts to *h₂éwh₂os and *népōts in Proto-Indo-European, but they were formed as derivatives of the masculine terms. There are numerous indications that the society of the Proto-Indo-European speakers was patrilocal: *swéḱuros seems to have referred to a husband’s father only, not a wife’s father, there is a basic term for ‘husband’s brother’s wife’ but not ‘husband’s sister’s husband’, and it is uncertain whether there are reconstructable basic PIE terms for ‘wife’s brother’ or ‘wife’s sister’.

It looks to me like the evidence of kinship terminology suggests that English-speakers and their linguistic ancestors have been patrilocal for most of their history. That said, as mentioned above, Proto-West Germanic *awahaimaz suggests that there might have been a short matrilocal period around the Proto-Germanic period. This is far from conclusive evidence on its own, but there are also clues in Tacitus’s Germania that the Germanic peoples might have been to some degree matrilocal (or avunculocal), and if the Harris-Divale theory of matrilocality being related to external warfare during state formation is correct, this would be a prediction of that theory.

Words for men and women in Indo-European languages

There were quite a few words meaning ‘man’ in Old English (OE). However, mann, the ancestor of the modern English word man, wasn’t one of them. In the Bosworth-Toller Anglo-Saxon Dictionary the definition of mann is given as ‘human being of either sex’. It only started to be used to refer to male human beings in particular in late OE, from c. 1000 AD. The old sense survives in modern English, but it is no longer the primary one and it has become less common over time. The use of gender-neutral man is still fairly common in compounds like mankind, manmade and manslaughter. In fact, the word woman itself is descended from a compound in which man was used in the gender-neutral sense. One of the two main words for ‘woman’ in OE (along with cwēn, the ancestor of modern English queen) was wīf, the ancestor of modern English wife. The word was used in the sense of ‘wife’ already in OE, but its primary sense was ‘woman’ in OE, and this sense has survived in the compounds midwife and fishwife. Perhaps due to the increasing dominance of the sense of ‘wife’, the compound wīfmann (‘woman-person’) started to be used more often for ‘woman’ until the ‘woman’ sense of wife became extinct.

OE mann is a descendant of the reconstructed Proto-Germanic (PGmc) word *mann- (of uncertain ending). This appears as man in Old Frisian, Old Saxon, Old Dutch and Old High German, maðr in Old Norse and manna in Gothic. As with the OE word, these words originally meant ‘human being’ but later shifted to meaning ‘man’ specifically; the ‘human being’ sense survives as a secondary one in Icelandic and Faroese, but on the continent it has been completely replaced by derived words such as German Mensch. (Mensch is a descendant of Old High German mennisko. From mann an adjective was formed by adding the umlaut-inducing suffix -isk (cognate to English -ish), then this adjectivisation was undone again by adding a nominal ending -o, which would have made the word completely redundant if the meaning of the original noun man had not been changed.) PGmc *mann-, in turn, is probably the descendant of the Proto-Indo-European (PIE) word *mánus, which is also the ancestor of Proto-Slavic *mǫ̑žь ‘man, husband’ (> Russian muž ‘husband’) and Sanskrit mánuḥ ‘human being’. Different explanations have been proposed for the double *-nn- in the PGmc word; Ringe (2006)’s is that the PIE word had an oblique stem *mánw-, PIE *-nw- regularly became *-nn- in PGmc, and the form of the oblique stem was generalised. In the Hindu religion, Manu is the name of the progenitors of humanity, and in Tacitus’s Germania he mentions that ‘[the Germanic peoples] celebrate the god Tuisto, sprung from the earth, and his son Mannus, as the fathers and founders of their race’, which seems to me to strongly suggest that *mann- and mánuḥ share a common ancestor.

As for OE wīf, it is a descendant of PGmc *wībą, which appears as wīf in Old Frisian, Old Saxon and Old Dutch, wīb in Old High German and víf in Old Norse. In the continental Germanic languages the word has been replaced as the word for ‘woman, wife’ by descendants of PGmc *frawjǭ ‘lady’, such as Dutch vrouwe and German Frau. In Dutch and German wijf and Weib remain as words but have acquired a pejorative connotation because of the contrast with vrouwe and Frau; using the original word would imply that the woman is of low birth. The same kind of dynamic is responsible for the phenomenon in English where in public addresses (e.g. on bathroom doors) the words ladies and gentlemen are used in place of women and men. In Icelandic (and Faroese? I don’t have a good source for Faroese) the word survives, but is old-fashioned and restricted to poetic use; the usual word for ‘woman’ is kona. This word is a cognate of English queen; it is a descendant of PGmc *kwēniz via Old Norse kván. In Gothic, *kwēniz appears as qēns ‘wife’, but there seems to be no trace of this word in the continental West Germanic languages, and kván has died out in the continental North Germanic languages as well. In English, of course, the meaning of the word was specialised to mean a royal wife in particular, although the word can also be used to refer to a gay man and this might be a survival of the old sense of ‘woman’. PGmc *kwēniz is, in turn, a descendant of PIE *gʷḗn ‘woman’. This word is very widely attested in the Indo-European languages: it appears as Proto-Slavic *žena (> Russian žená), Old Irish ben, Ancient Greek gynḗ, Armenian kin, Sanskrit jániḥ ‘wife’ and Tocharian B śana (although no cognate survives in Latin). Ancient Greek gynḗ in particular appears in a few Greek-derived English words such as gynaecology, polygyny and misogyny. What about *wībą? It’s uncertain whether this word is a descendant of a PIE word (it might have been borrowed from some long-lost language in PGmc specifically; it might even be specific to Northwest Germanic since it does not appear in Gothic). A link has been proposed between it and Proto-Tocharian *kwäipe ‘feel shame’ (> Tocharian A kip, kwīp) via a change of meaning along the lines of ‘woman’ > ‘female genitalia’ > ‘shame’, but I think this change is too far-fetched, although the fact that *wībą was neuter, rather than feminine, is suggestive.

So what was the Old English word for ‘man’? The main one was wer. It started to die out in English in the late 13th century, but it survives in the compound werewolf (‘man-wolf’). The Proto-Germanic form of the word was *weraz, and it appears in Old Frisian, Old Saxon and Old High German as wer, Old Norse as verr and Gothic as waír, with the meaning ‘man’ in each case. However, the word has died out in all of the modern Germanic languages, except in Icelandic (and Faroese?) where it survives, not as the usual word for ‘man’, but as the poetic word ver. The word is also widely attested in Indo-European as a whole; its Proto-Indo-European form was *wiHrós, which appears as výras in Lithuanian, fear in Irish, gŵr ‘husband’ in Welsh, vir in Latin and vīrá in Sanskrit. A few English words, such as virile and virtue, are derived from the Latin form of the word.

The word vir didn’t survive in the Romance languages, either; it has been replaced by descendants of Latin homō ‘human being’. It’s interesting how this change parallels exactly the change in the Germanic languages, where *mann-, another word meaning ‘human being’, replaced *weraz as the word for ‘man’. The word homō can be seen in derived English words like human and hominid which are of Latin origin. However, Old English also had a direct cognate of homō: guma. In Old English, this word referred to male humans, specifically, so it was a synonym of wer; however, it was more of a poetic word, whereas wer was the everyday word for ‘man’. Both words are descendants of a derivative *dʰǵʰm̥mō of the PIE *dʰéǵʰōm ‘earth’ (in Germanic and Latin, the initial *dʰ was regularly lost, and *ǵʰ regularly became h in Latin) which meant ‘something from the earth’. The word guma has survived into modern English only via the Old English compound brȳdguma (‘bride-man’). This compound of course became modern English bridegroom (often shortened to groom), and its meaning has not changed. However, the insertion of the -r- in groom is an irregular development. What seems to have happened is that the word groom came into Middle English (from an unknown source) c. 1200 with the meaning ‘youth’. This was then confused with the -goom element in bridegoom and so the modern form of the word arose. As with wer, similar developments have occurred in all Germanic languages. The r-insertion is unique to English, but all of the other Germanic languages have lost their cognates of guma but retain it in a compound cognate to English bridegroom (e.g. German Bräutigam).

As well as *wiHrós, there is another widespread Indo-European word for ‘man’, which had the PIE form *h₂nḗr. This appears as njeri ‘human being’ in Albanian, Nerō (a personal name) in Latin, anḗr in Ancient Greek, and nára (this one also has a secondary sense of ‘human being’) in Sanskrit, and it also appears in the derivatives neart in Irish and Welsh nerth, both meaning ‘strength’. The Greek word anḗr had the oblique stem andr-, and this appears in many English words such as androgyny, polyandry, android and androgen, as well as in the personal name Andrew. It is tempting to link the Greek word for ‘human being’, ánthrōpos, to *h₂nḗr as well, but the presence of -th- rather than -d- in the word is unexplainable if this is the case. The real etymology of ánthrōpos is unknown. Given that the sense of ‘human being’ is attested in Sanskrit and Albanian for *h₂nḗr, it is possible that this was the original sense in PIE, too. Either way, it would have had a synonym in either *wiHrós or *mánus. This hypothesis has the advantage of not requiring a shift from the more specific sense of ‘man’ to the more general sense of ‘human being’; shifts in meaning more often increase specificity than generality.

Clearly the senses of ‘man’ and ‘human being’ are quite prone to confusion. I don’t know of any cases where a word has shifted directly in meaning from ‘human being’ to ‘woman’, or the other way around. I’d be interested to hear of examples if anybody has any. The similar shift ‘young human being’ to ‘woman’ seems like it could definitely be possible, though. The English word girl (which is of unknown origin, first appearing c. 1300) originally meant ‘child’; it was gender-neutral. Over time, it has come to refer specifically to female children. Since the 1500s it has been used to refer to young women as well, and since the 1800s it has sometimes been used to refer to all women, even elderly ones, although this usage has never become standard. So this word which originally meant ‘child’ may in the future have shifted its meaning to ‘woman’. A shift from ‘human being’ to ‘woman’ might be possible via this route, but it would require an initial shift of ‘human being’ to ‘child’. I don’t know whether such a shift is possible; I was going to say it was unlikely, but semantic shifts can happen in all sorts of weird ways, so I don’t really have any idea.

(note: a lot of this post is based on information gathered from Wiktionary and the Online Etymology Dictionary which are not entirely reliable sources. I tried to look up every word cited here in a dictionary specific to the language the word belonged to, to make sure I didn’t end up citing words with the wrong meaning, or citing words that didn’t actually exist. However, it’s hard to find freely available online English-language dictionaries for some of the more obscure languages like Faroese, so I wasn’t able to do this for every word; and given that this post ended up involving a lot of words from a lot of languages it’s quite possible that some errors in detail are present. The PGmc and PIE words cited have been checked via Ringe (2006), From Proto-Indo-European to Proto-Germanic.)

English words for mammal species, ordered by age

This is a list of English words referring to kinds of mammals, ordered by age. By ‘age’, I mean the earliest time at which the word was used in its current sense; for example, the word ‘deer’ is of Proto-Germanic vintage but it was originally used to refer to animals in general (like the modern German cognate Tier); the word was already used to mean ‘deer’ specifically in Old English, but the narrower sense only became the more usual one by the 15th century, so I have listed the word as being only 500 years old.

I have not included words referring to animals of specific sexes or ages, except for the words ‘cow’, ‘bull’, ‘steer’ and ‘ox’. I have also not included words referring to animals that I wouldn’t expect most people living in England to have heard of, unless they are of especially old vintage (like ‘onager’).

Proto-Indo-European period (4000 BC – 2500 BC): beaver, mouse, swine, hound, wolf

Proto-Germanic period (2500 BC – 100 AD): ape*, horse, cow, bull†, steer†, ox, elk, whale, cat, fox, bear, weasel, seal

(note: ‘cat’ was borrowed from Latin at the end of this period; ‘ape’ is probably late as well, although its origin is unknown)

Proto-West Germanic period (100 AD – 450 AD): hare, boar, sheep

Early Old English period (450 AD – 900 AD): shrew, ass, camel, tiger

Late Old English period (900 AD – 1100): rat, pig, dog

12th century (1100 – 1200): lion

13th century (1200 – 1300): dromedary, ounce, panther, leopard

14th century (1300 – 1400): squirrel, mole, bat, onager, rhinoceros, goat, dolphin, porpoise, lynx, hyena, polecat, elephant

15th century (1400 – 1500): monkey, baboon, porcupine, dormouse, hedgehog, hog, deer, reindeer, antelope, genet, marten

16th century (1500 – 1600): chinchilla, marmot, giraffe, buffalo, chamois, hippopotamus, civet, badger, armadillo, manatee

17th century (1600 – 1700): orangutan, guinea pig, woodchuck, lemming, muskrat, hamster, zebra, Bactrian, llama, peccary, moose, bison, gazelle, ibex, narwhal, jaguar, mongoose, jackal, skunk, wolverine, mink, raccoon, walrus, sealion, sloth, opossum, possum

18th century (1700 – 1800): chimpanzee, gibbon, lemur, rabbit, chipmunk, groundhog, donkey, tapir, alpaca, yak, gnu, beluga (whale), pangolin, ocelot, cougar, puma, cheetah, dingo, coyote, anteater, mammoth, wombat, kangaroo, platypus

19th century (1800 – 1900): gorilla, vole‡, gerbil, wildebeest, orca, meerkat, (red) panda, aardvark, dugong, bandicoot, koala, wallaby

20th century (1900 – 2000): (giant) panda

* The word ‘ape’ originally referred to both monkeys and apes (well, it first referred to monkeys, and then to apes; the Proto-Germanic speakers would not have been familiar with any ape species), and it is still used in this sense colloquially, so I have dated its origin accordingly; I couldn’t find any information on how early the word was used in the more specific sense.

† The words ‘bull’ and ‘steer’ were synonymous in Proto-Germanic, like the modern German cognates Bulle and Stier; I have dated their origin to Proto-Germanic, as if they were still synonyms, even though, strictly speaking, bulls are uncastrated and steers are castrated. I couldn’t find any information on how recently the meanings of these two words became specialised (it was post-Old English, at least), so it was easier to do it this way.

‡ The word ‘vole’ is a shortening of an older compound ‘volemouse’, in which the ‘vole-’ element had no independent meaning; I couldn’t find any information on it, but since the Dutch word for ‘vole’ is woelmuis and the German word for ‘vole’ is Wühlmaus, it seems likely that the compound goes back to the Proto-West Germanic period.