Etymologies of some English kinship terms

Abbreviations: PIE (Proto-Indo-European), PGmc (Proto-Germanic), OE (Old English), ME (Middle English), NE (New English, i.e. modern English).

father
From OE fæder. The sequence /-dər/ regularly became /-ðər/ after stressed vowels in early ME, which is why we have father rather than fader. The development of the stressed vowel, however, is irregular. The expected development would be into /aː/ by ME open syllable lengthening and then into /ej/ (as in face) by the Great Vowel Shift. However, in this word only the first change seems to have occured, so that the stressed vowel in father is the same as the one in palm and spa, rather than the one in face. In British English, the words rather (< OE hraþor) and lather (< OE lēaþor) have similarly resisted the Great Vowel Shift, although in American English the vowel in these words has been shortened, as it has in all dialects in gather (< OE gadorian) (slather has /a/, but is of unknown origin); that is, these words seem to have resisted the ME open syllable lengthening in the first place. Of course, we can’t rule out the alternatively possibility that they were lengthened and then subsequently shortened, perhaps as part of the same round of shortenings that resulted in short vowels in words like bread and blood. I don’t actually know of any words in -ather where both of the expected changes have taken place, so that the word is pronounced with /-ejðər/.

As for OE fæder, this is a regular development from PGmc *fadēr. And fadēr, in turn, is a regular development from PIE *ph2tḗr.

mother
From OE mōdor. See above on the change of -d- to -th-. The development of the stressed vowel is again irregular. The expected development would be for it to remain as /oː/ in ME and then to develop into /uː/ by the Great Vowel Shift. In this word, however, we have /ʌ/, the usual outcome of ME short /u/. The same outcome exists in brother (< OE brōþor), other (< OE ōþer) and smother (< OE smorian, with -th- inserted perhaps due to influence from the agentive form of the verb, *smorþor ‘suffocator’). There are a couple of words (blood and flood) where ME /oː/ become /uː/ by the Great Vowel Shift, but was subsequently shortened and changed into /ʌ/. However, this change was irregular; for example, it didn’t occur in food. Perhaps the same shortening occured in mother, brother, other and smother after the Great Vowel Shift. I can’t think of any words in -other which aren’t pronounced with /-ʌðər/, so perhaps the shortening was regular in this environment.

As for OE mōdor, this is from PGmc *mōdēr. The changes of unstressed vowels from PGmc to OE are very complicated and I don’t understand them very well. But I don’t know why *mōdēr became mōdor rather than mōder. Perhaps the preceding heavy syllable had something to do with it? PGmc *brōþēr and *duhtēr, with a heavy initial syllable, became brōþor and dōhtor respectively, while PGmc *swēster, also with a heavy initial syllable, had variants in both -er and -or, but PGmc *fadēr, with a light initial syllable, became fæder. The vowel -o- was regularly inserted in OE before postconsonantal r at the end of a word (c.f. OE wundor ‘wonder’ < PGmc *wundrą), so if the -e- in *mōder was dropped due to the preceding syllable being heavy, that would explain it.

PGmc *mōdēr, in turn, is from PIE *máh2tēr. The expected development of this word would be *mōþēr, but the accent appears to have been shifted to the suffix at some point, perhaps due to analogy with *ph2tḗr ‘father’ and *dʰugh2tḗr ‘daughter’, so that -d- occurs by Verner’s Law. Sanskrit mātā́ ‘mother’ shows accentuation on the suffix as well; it is on the basis of Ancient Greek mḗtēr ‘mother’ that accent on the initial syllable is reconstructed.

brother
From OE brōþor. See above on the pronunciation of OE ō as /ʌ/ in NE. OE brōþor is from PGmc *brōþēr. See above on the change of unstressed ē into o. And PGmc *brōþēr is a regular development from PIE *bʰráh2tēr.
sister
From OE sweostor. Variants of the OE word in -er and in swi-, swy- or swu- are also attested. The quality of the final unstressed vowel would not make any difference to the NE reflex, because all OE unstressed vowels merged into /ə/ in ME. However, variants in sweo- would have ended up as NE swester (rhymes with fester), and variants in swu- would have ended up as NE suster (rhymes with muster). So the modern form of the word must originate from a variant in swi- or swy-.

The PGmc form of the word was *swestēr, and the regular development of this in OE would have been *swester, or sweostor if the -e- was dropped due to the preceding syllable, as conjectured above. The other variants probably resulted from a combination of influence of the -w- on the following vowel (c.f. OE wudu ‘wood’ from PGmc *widuz) and influence of the Old Norse form of the word, systir.

PGmc *swestēr is from PIE *swésōr. The expected development of PIE *swésōr would be *swesōr (which would become OE *sweosor and NE *sweaser /swiːzər/), but presumably the word was influenced in PGmc by *bʰráh2tēr ‘brother’, *ph2tḗr ‘father’, *máh2tēr ‘mother’ and *dʰugh2tḗr ‘daughter’, which all ended in *-tḗr or *-tēr in PIE.

son
A regular development from OE sunu. ME open syllable lengthening did sometimes affect short /i/ and /u/ (c.f. week < OE wicu) but not usually. OE sunu, in turn, is a regular development from PGmc *sunuz. PGmc *sunuz is from PIE *suHnús. The expected development of PIE *suHnús would be *sūnus (which would become OE *sūns and NE *sunse /sʌns/). Apparently the accent was retracted to the initial syllable, so that the *-s became *-z by Verner’s Law, and the long vowel was shortened. The shortening of the long vowel also occurs in Italic and Celtic (those branches give no evidence with regards to the accent). Ringe (2006) attributes the change to “morphological resegmentations or reanalyses which yielded roots without a final laryngeal (or its reflex)”, which isn’t very enlightening. The reconstruction with the long vowel and final-syllable accent is based on Sanskrit sūnú.
daughter
A regular development from OE dōhtor, via ME doughter /dowxtər/; although ME /ow/ normally became NE /ow/ (as in boat), it regularly became /ɔː/ (as in thought) before /x/. OE dōhtor is from PGmc duhtēr. For some reason, the vowel was lowered to o and lengthened. The lowering apparently also occured in every other Germanic language, and the lengthening happened Old Norse as well (c.f. Gothic dauhtar /dɔxtar/, Old Norse dóttir, Old High German tohtar). But I have no idea why either change occured. As for the development of the unstressed vowel from ē to o, see above. PGmc duhtēr is from PIE *dʰugh2tḗr. The expected development of PIE *dʰugh2tḗr in PIE would be either *dukþēr or *dukuþēr, depending on whether you believe that interconsonantal laryngeals in non-initial syllables developed into zero or *u in PGmc (opinions differ). The form we actually see is the result of interference from the oblique stem of the word, *dʰuktr̥-ˊ, which became PGmc *duhtur-.
uncle
From Old French oncle ‘uncle’. Old English had two words for ‘uncle': fædera ‘father’s brother’ and ēam ‘mother’s brother’. The regular developments of these words into NE would have been fathere (rhymes with gather, due to trisyllabic laxing) and eam (rhymes with beam), respectively.
aunt
From Old French ante ‘aunt’. In southern England, Australia (presumably South Africa and New Zealand as well?), New England and Virginia the word is pronounced with /aː/, while in other areas of the world it is pronounced with /a/. I don’t know why this is the case. I also don’t know why it is spelled with au-.

Old English had two words for ‘aunt': faþu ‘father’s sister’ and modrige ‘mother’s sister’. The regular developments of these words into NE would have been fathe (rhymes with lathe) and mothery (pronounced in the same way as the derived word meaning ‘like a mother’), respectively.

nephew
From Old French neveu ‘nephew, grandson’. The word was originally spelt nevew, and pronounced accordingly. The origin of the spelling with -ph- is kind of a mystery, but perhaps it was due to influence from Latin nepōs ‘nephew, grandson’. A spelling pronunciation with /f/ subsequently emerged and became predominant in American English. The pronunciation with /f/ is now usual in British English as well, although some old-fashioned speakers still pronounce it with /v/.

The native word, neve ‘nephew, grandson’ (rhymes with Eve), survived into ME but is now obsolete. This word is a regular development of OE nefa. OE nefa is a regular development of PGmc *nefô. PGmc *nefô is from PIE *népōts. The expected development of PIE *népōts would be *nefōþs (probably, although I’m not aware of any final *-ts clusters which survived into PGmc), but the noun came to be declined in the same way as *gumô ‘man’, and the ending in the nom. sg. was changed accordingly.

niece
From Old French nece ‘niece, granddaughter’. The native word, nift ‘nephew, granddaughter’ (rhymes with lift), survived into ME but is now obsolete. This word is a regular development of OE nift. OE nift is a regular development of PGmc *niftiz. There does not appear to be a securely reconstructable feminine counterpart of PIE *népōts, although the PGmc form would go back to a PIE form *néptis, and the same form would give Latin neptis.

The origins of the songs on Joan Baez’s first album

Most of this information is taken from the Traditional Ballad Index (TBI). In particular the dates of earliest recordings, and the lists of regions where each song has been recorded, are taken from the TBI and may not be as early or as complete as they could be.

Silver Dagger
A traditional American folk ballad recorded from Appalachia, the Rocky Mountains, the Midwest, Southeast and South-Central United States, and the Canadian Maritimes, with the earliest date of recording being 1866. There is another traditional folk ballad with similar lyrics called “Drowsy Sleeper“ which has been recorded from Appalachia, the Mid-Atlantic, Midwest, Southeast and South-Central United States, New England, the Canadian Maritimes, Newfoundland and Scotland, with the earliest date of recording being 1830. Hence the ballad may ultimately have a Scottish origin.
East Virginia
A traditional American folk ballad recorded from Appalachia and the Southeast and Southwest United States), with the earliest date of recording being 1917.
Fare Thee Well (10,000 Miles)
A traditional English, Scottish and American folk ballad. There is a confusing variety of songs on this same theme; the closest one listed in the Traditional Ballad Index seems to be “Fare Thee Well, My Own True Love”, which has been recorded from Appalachia, the Midwest, Southeast, South-Central and Southwest United States, Southwest England and Aberdeenshire, with the earliest date of recording being 1867. However, the Index identifies this song by the inclusion of the line “Who will shoe your pretty little foot?”, which is actually not included in Joan Baez’s version. The song must be older, because the last stanza of Robert Burns’ “A Red, Red Rose” (1794) is clearly derived from the lyrics of this song. According to Lesley Nelson it was included in the Book of Roxburghe Ballads and dated to 1710 (the book was published in 1847, but the ballads were collected much earlier).
House of the Rising Sun
A traditional American folk ballad recorded from the South-Central United States, with the earliest date of recording being 1933.
All My Trials
A traditional American folk ballad recorded from the Southeast United States, with the earliest date of recording being 1961 according to the TBI. This is the date of the Pete Seeger recording, but Joan Baez had already released this song in 1960. It seems to have been picked up by the folk revival without having been recorded in any collections made earlier. A song called “The Tallest Tree in Paradise” recorded in 1954 has some similar lyrics and some completely different lyrics. The TBI mentions that a verse including the lines “If life were merchandise that money could buy / The rich would live and the poor would die” was found in a gravestone in Tysoe (Warwickshire) in 1798.
Wildwood Flower
A traditional American folk ballad recorded from Applachia and the Southeast and South-Central United States, with the earliest date of recording being 1928. This is the date of the Carter Family recording. The origin of this song has been traced to a song called “I’ll Twine ’Mid the Ringlets” published by the composer Joseph Philbrick Webster in 1860 with lyrics by Maud Irving. Maud Irving seems to have been a pseudonym used by a spiritualist poet called J. William Van Navee. Over time, as the song was passed down through the oral tradition, the nonsensical lines heard in the Carter Family version (“I’ll twine with my mingles”) must have evolved through mishearing—the song is thus a good illustration of the effect of the folk process.
Donna Donna
One of the two non-traditional songs on Joan Baez’s first album. It was written for the Aaron Zeitlin Yiddish-language play Esterke (1940-1941). The music was composed by Sholom Secunda.
John Riley
A traditional Scottish and American folk ballad recorded from Appalachia, the Mid-Atlantic, Midwest and Southeast United States and Aberdeenshire, with the earliest date of recording being 1845. But the theme of a lover who is unrecognised by his love after a long journey away at sea is an old one—it goes right back to the Odyssey.
Rake and Rambling Boy
A traditional English, Scottish, Irish and American folk ballad recorded from Appalachia, the Southeast, South-Central and Southwest United States, Ontario, Southwest and Southeast England, as well as East Anglia, Scotland and Ireland. The TBI gives “before 1830” as the earliest date of recording.
Little Moses
A traditional American folk ballad recorded from Apalachia and the South-Central United States, with the earliest date of recording being 1905. Of course, the story is much older, having come from the Bible.
Mary Hamilton
A traditional Scottish and American folk ballad recorded from the Scottish Lowlands, Appalachia, the Midwest, Southeast, South-Central and Southwest United States, New England and the Canadian Maritimes, with the earliest date of recording being 1790. The “four Marys” mentioned in the last stanza may be the historical “four Marys” who were ladies-in-waiting to Mary, Queen of Scots. However, none of the four Marys had the surname Hamilton, and there are alternative theories as to the historical events the song is connected to. It is possible that multiple events have contributed to the song, and much of the story could be completely made up.
Henry Martin
A traditional English, Welsh, Scottish and American folk ballad recorded across England and Wales and in Aberdeenshire, Appalachia, the Midwest, Northeast, Southeast, South-Central and Southwest United States, the Canadian Maritimes and Newfoundland. The TBI gives “before 1825” as the earliest date of recording.
El Preso Numero Nueve
The second of the two non-traditional songs on Joan Baez’s first album. It was written and composed by the Mexican singer-songwriter Roberto Cantoral and recorded by him with Antonio Cantoral as part of an act called the Hermanos Cantoral (that is, Cantoral Brothers, in Spanish). The Hermanos Cantoral were active from 1950 to 1954; I don’t know exactly when the song was written or first recorded.

The phonetic motivation for Grimm’s Law

…is not as clear as I had thought.

According to the standard reconstruction of Proto-Indo-European (PIE), the language had three series of stops. One of the series is thought to have consisted of voiceless unaspirated stops: *p, *t, *ḱ, *k, and *kʷ. Another is thought to have consisted of voiced unaspirated stops: *b, *d, *ǵ, *g and *gʷ. And the other is thought to have consisted of voiced aspirated stops: *bʰ, *dʰ, *ǵʰ, *gʰ and *gʷʰ. These series were preserved in this form in Sanskrit, although Sanskrit also innovated a fourth series of voiceless aspirated stops out of clusters consisting of voiceless stops followed by laryngeals. In Proto-Germanic, however, the situation is different. The PIE voiceless unaspirated stops have become voiceless fricatives; c.f. Proto-Germanic *þū (> English thou) and Sanskrit tvám ‘you (singular)’. The PIE voiced unaspirated stops have become voiceless unaspirated stops; c.f. Proto-Germanic *twō (> English two) and Sanskrit dvā́ ‘two’. And the PIE voiced aspirated stops have become voiced unaspirated stops; c.f. Proto-Germanic *meduz (> English mead) and Sanskrit mádhu ‘honey’.

I had always assumed that the change went something like this. First, the voiceless unaspirated stops fricativised, retaining their lack of voice and aspiration and becoming voiceless fricatives. Changes of stops into fricatives are common and unremarkable; phonologists disagree on whether this is due to a natural tendency towards lenition (weakening) or due to assimilation to neighbouring phonemes which are more sonorous, but there is no dispute that such a change can be phonetically motivated. The change can be written formally in terms of distinctive features as follows.

[-continuant, -voice] > [+continuant]

Second, the voiced unaspirated stops devoiced, retaining their lack of frication and aspiration and becoming voiceless unaspirated stops. This change would be unusual if it occured on its own. However, the previous change had left the language with no voiceless unaspirated stops, only voiced unaspirated stops and voiced aspirated stops. The [±voice] feature which had been used to distinguish the three original series was now redundant. For obstruents the unmarked value of this feature is [-voice] (voicelessness); that is, obstruents tend to be voiceless unless something forces them to be voiced. Therefore, it was natural for the voiced unaspirated stops to be devoiced. The change can be written formally in terms of distinctive features as follows.

[-continuant, +voice, -spread glottis] > [-voice]

Third, the voiced aspirated stops deaspirated, retaining their voicing and lack of frication and becoming voiced aspirated stops. This change was increased in likelihood due to the fact that the previous change left the language with two series of stops, one of which was voiceless unaspirated and one of which was voiced aspirated; the two features [±voice] and [±spread glottis] were therefore redundant against each other. [-spread glottis] is the unmarked value of the [±spread glottis] feature on stops, so it was natural to resolve this by deaspirating the voiced aspirated stops (although devoicing the voiced aspirated stops would have worked just as well). The change can be written formally in terms of distinctive features as follows.

[-continuant, +spread glottis] > [-spread glottis]

But there are two questions I have about this account.

  1. Why did the second change involve the voiced unaspirated stops devoicing, rather than the voiced aspirated ones? The redundancy of the [±voice] feature could have been resolved either way. In fact, why didn’t both kinds of stop devoice? Since [-voice] is unmarked for obstruents there is nothing stopping this from happening.
  2. Why did the third change involve the voiced aspirated stops deaspirating rather than devoicing? Since the [±voice] and [±spread glottis] features were redundant against each other devoicing would have worked just as well as a means of resolving the redundancy.

Now, sound change is not a deterministic process, so perhaps the answers to these questions are just that out of all of the different ways the redundancies in question could be resolved, these were the ways that were chosen, more or less at random. I am satisfied with this as the answer to question 2. In fact, with respect to question 2 it seems like deaspiration would be a more likely occurence than devoicing because it is much more common for languages to distinguish stops using the [±voice] feature than it is for them to distinguish stops using the [±spread glottis] feature; contrasts of voice are therefore probably favoured over contrasts of aspiration (although this is only a tendency, and there are plenty of languages like Mandarin Chinese where [±spread glottis] is distinctive but [±voice] is not).

But I am less satisfied with this as an answer to question 1. As I mentioned above, the redundancy of the [±voice] feature could have been solved in three different ways:

  1. devoicing of the voiced unaspirated stops, resulting in a contrast between voiceless unaspirated stops and voiced aspirated stops.
  2. devoicing of the voiced aspirated stops, resulting in a contrast between voiced unaspirated stops and voiceless aspirated stops.
  3. devoicing of both kinds of stop, resulting in a contrast between voiceless unaspirated stops and voiceless aspirated stops.

There are languages with a contrast between voiced unaspirated stops and voiceless aspirated stops, as would result from option 2. English is such a language. There are also languages with a contrast between voiceless unaspirated stops and voiceless aspirated stops, as would result from option 3. Mandarin Chinese is such a language. But I know of no language which has a contrast between voiceless unaspirated stops and voiced aspirated stops, as would result from option 1. Yet option 1 seems to have been the option that was taken. This is odd.

I think there are phonetic reasons why we would expect options 2 or 3 to be favoured over option 1. If you examine the articulatory mechanisms which are used to produce voiced aspirated stops, you can see them as half-voiced stops, closer to voiceless stops than voiced unaspirated stops (but still voiced). If you think about voiced aspirated stops in this way, option 1 is weird, because it involves change of the voiced unaspirated (i.e. fully voiced) stops directly into voiceless unaspirated stops without passing through the intermediate stage where they would be voiced aspirated (i.e. half-voiced) and end up merging with the voiced aspirated stops. If the characterisation of voiced aspirated stops as half-voiced already makes sense to you, you can skip the next few paragraphs, because I’m now going to try and explain why this is an accurate characterisation.

The first thing that I want to explain is what voiced aspirated stops are. In terms of distinctive features, they are parallel to voiceless aspirated stops. Voiced aspirated stops are [+voice] and [+spread glottis], voiceless aspirated stops are [-voice] and [+spread glottis]. But the meaning of [+spread glottis] is different in the two cases. As a feature of voiceless stops, [+spread glottis] corresponds to increased duration of the period during which the vocal folds are prevented from vibrating (normally by keeping the vocal folds apart from each other, hence the name of the feature, although reducing the airflow is also an option). The between the release of a stop and the beginning of vocal fold vibration in order to voice the following voiced phoneme is called the voice onset time (VOT). For voiceless unaspirated stops, the VOT is close to 0, while for voiceless aspirated stops the VOT is larger, so that there is an audible period after the stop has been released where air flows through the glottis but the vocal folds do not vibrate. This results in a sound being produced during this period which is in fact exactly [h], the voiceless glottal continuant (although speakers of languages which have aspirated stops don’t usually perceive the [h], instead perceiving it as part of the preceding stop).

During the production of voiced stops, the vocal folds are already vibrating (that’s what it means for a stop to be voiced). So it is impossible for voiced stops to be aspirated if aspiration is defined as having a positive VOT1. Instead, [+spread glottis] as a feature of voiced stops corresponds to the vocal folds being held further apart than is normal for voiced stops, roughly speaking. The vocal folds are still close enough that they vibrate during the production of voiced aspirated stops, so such stops are not completely voiceless, but they are closer to voiceless than voiced unaspirated stops. The kind of voice that accompanies voiced aspirated stops is called breathy voice, as opposed to the modal voice that accompanies voiced unaspirated stops. It might help to look at the following diagram, which illustrates the relationship between the degree of closure of the glottis and different kinds of voicing. The diagram is adapted from Gordon & Ladefoged (2001).

Voiceless sounds have the least glottal closure. The glottal stop has the most glottal closure (complete closure). Modally-voiced sounds have a degree of glottal closure midway between these two extremes. Breathy-voiced sounds have a degree of glottal closure between that of voiceless sounds and voiced sounds. Creaky-voiced sounds have a degree of glottal closure between that of voiced sounds and the glottal stop.

(I should note that talking about the degree of closure of the glottis as if this was a scalar variable is an oversimplification. When the vocal folds vibrate, what happens is that the glottis alternates between a state where it is more or less fully open (as when a voiceless sound is being produced) and a state where it is more or less fully closed (as when a glottal stop is being produced). Closure occurs due to tension from the laryngeal muscles and opening occurs due to pressure from the flow of air through the trachea; closure results in buildup of air below the glottis, resulting in increased pressure, while opening allows air to flow through a greater area, resulting in decreased pressure, and this is why the alternation occurs. For a given rate of flow of air, there is a maximal tension above which opening cannot occur and a minimal tension below which closure cannot occur, and in between these two extremes there is an optimal tension which results in maximal vibration; this tension is approached during the production of modally-voiced sounds. If the tension is below the optimal tension but above the minimal tension, the result is a breathy-voiced sound. If the tension is above the optimal tension but below the maximal tension, the result is a creaky-voiced sound. Alternatively, creaky-voiced sounds can be produced by having the glottis completely closed at one end, with modal voice at the other end, and breathy-voiced sounds can be produced by having the glottis open so that the vocal folds do not vibrate at one end, with modal voice at the other end. But regardless of how these sounds are produced, they sound the same, so the distinction is not important. Either way, it is still accurate to say that breathy-voiced sounds are in a position between voiceless sounds and modally-voiced sounds.)

It would be helpful to see how voiced aspirated stops behave with respect to sound change in attested languages. Unfortunately, voiced aspirated stops are rare. which limits the number of available examples. As far as I know, voiced aspirated stops are mainly found in the Indo-Aryan languages of South Asia and the Nguni languages of South Africa. In the Indo-Aryan languages the voiced aspirated stops have been inherited from PIE, or at least Vedic Sanskrit (depending on what you believe about the nature of the PIE stops), and most of them seem to have preserved them unchanged. Sinhala and Kashmiri have no voiced aspirated stops, but I don’t know and can’t find any information on what happened to them in these languages. So it seems that the voiced aspirated stops have been stable in these languages. That suggests the rarity of voiced aspirated stops is probably more due to the infrequency of sound changes that would make them phonemic rather than inherent instability. However, the mutual influence of these languages upon each other within the South Asian linguistic area might have helped preserve the voiced aspirated stops; the fact that the two most peripheral Indo-Aryan languages do not have them is perhaps suggestive that this has been the case. What about the Nguni languages? These are a tight-knit group, probably having a common origin within the last millennium, and their closest relatives such as Tswana have no voiced aspirated stops. So their voiced aspirated stops are of more recent vintage. Interestingly, Traill, Khumalo & Fridjhon (1987) have found that the Zulu voiced aspirated stops are actually voiceless, with the breathy voice occuring after the release on the following vowel. This seems like it could be the first step on a change of voiced aspirated stops into voiceless aspirated stops. But I don’t think any of this evidence is of much use in making the case that Grimm’s Law is weird. My case primarily rests on the idea that voiced aspirated stops are intermediate between voiceless and modally-voiced stops on the basis of how they are produced.

If the changes as described above are odd, maybe we should consider the possibility that the changes described by Grimm’s Law were of a different nature.

Perhaps a minor amendment can solve the problem. It is universally agreed that the Proto-Germanic voiced stops had voiced fricative allophones. It is not totally clear which environments the stops occured in and which environments the fricatives occured in, but they were all definitely stops after nasals and when geminate and fricatives after vowels and diphthongs. There are three different ways this situation might have come to be.

  1. The PIE voiced aspirated stops might have turned into voiced unaspirated stops first and then acquired fricative allophones in certain environments.
  2. The PIE voiced aspirated stops might have turned into voiced unaspirated fricatives first and then acquired stop allophones in certain environments.
  3. The PIE voiced aspirated stops might have turned into voiced unaspirated fricatives in certain environments and voiced unaspirated stops in others.

If we suppose that number 2 is the accurate description of what happened, then it is possible that the fricativisation of the PIE voiced aspirated stops occured before the devoicing of the PIE voiced unaspirated stops. This devoicing would then be perfectly natural because the PIE voiced unaspirated stops would be the only stops remaining in the language, so the marked [+voice] feature would be dropped from them. The voiced aspirated stops would probably have become voiced aspirated fricatives (i.e. breathy-voiced fricatives) initially and then these fricatives would have become modally-voiced since there would be no need for them to contrast with modally-voiced fricatives. Is it plausible that the voiceless unaspirated and voiced aspirated stops would have devoiced, but not the voiced unaspirated stops? What do these two kinds of stop have in common that the third stop lacks? If we think of voiced aspirated stops as half-voiced stops, we can describe the change as affecting all of the stops which were not fully voiced. The change is especially plausible, however, if we suppose that the PIE voiceless unaspirated stops had become aspirated before the changes described by Grimm’s Law took place. In that case, the change would affect the aspirated stops and not affect the unaspirated stops. Fricativisation of aspirated stops but not unaspirated stops is a very well-attested sound change; it happened in Greek, for example. The sequence of changes would be as follows:

[-continuant, -voice] > [+spread glottis]

[-continuant, +spread glottis] > [+continuant]

[-continuant, +voice] > [-voice]

[+continuant, +voice] > [-continuant]

(The last change would have occurred only in some environments; there are also conditioned exceptions to some of the other changes.)

Is there any other reason to think the PIE voiceless unaspirated stops might have become aspirated in Proto-Germanic before fricativising? Well, the reflexes of the Proto-Germanic voiceless stops are aspirated in the North Germanic languages and English, and have become affricates in some positions in German which suggests that they were originally aspirated; the lack of aspiration in Dutch can probably be attributed to French influence. That suggests the Proto-Germanic voiceless stops were already aspirated. Of course, these voiceless stops are the reflexes of the PIE voiced unaspirated stops, not the PIE voiceless unaspirated stops. But perhaps the rule that aspirated voiceless stops was persistent in Proto-Germanic, so that it applied to both the PIE voiceless unaspirated stops before they fricativised and the PIE voiced unaspirated stops after they were devoiced. The rule seems to have persisted into German, because German went through its own kind of replay of Grimm’s Law in which the Proto-Germanic voiceless stops became affricates or fricatives and the Proto-Germanic voiced stops were devoiced. This second consonant shift was never fully completed in most German dialects; in Standard German, for example, Proto-Germanic *b and *g were not devoiced in word-initial position. However, *d was devoiced (c.f. English daughter, German Tochter) and modern Standard German /t/ is aspirated, so, for example, Tochter is pronounced [ˈtʰɔxtɐ].

I think this is a satisfactory solution to the problem. The idea that the PIE voiced aspirated stops became fricatives first is not a new one, in fact it is probably the favoured scenario, but I have never seen it justified in this way, and Ringe (2006) suggests that the voiced aspirates changed into both stops and fricatives depending on the environment (number 3 above), which is incompatible with the scenario I have proposed here.

Finally, I think I should mention that all of this reasoning has been done on the assumption that PIE had voiceless unaspirated, voiced unaspirated and voiced aspirated stops. If you subscribe to an alternative hypothesis about the nature of the PIE stops, such as the glottalic theory, Grimm’s Law might have to be explained in a completely different way. But despite it not being as easy as it might appear at first glance, it does seem that the standard hypothesis is capable of explaining Grimm’s Law.

Whether it can explain Verner’s Law is another matter. I have always thought it a little odd that the voiceless fricatives were voiced after unaccented syllables but not after accented syllables. It is not obvious how accent and voice can affect each other. But I’ll discuss this, perhaps, in another post.

References

Gordon, M., & Ladefoged, P. (2001). Phonation types: a cross-linguistic overview. Journal of Phonetics, 29(4), 383-406.

Ringe, D. (2006). From Proto-Indo-European to Proto-Germanic: A Linguistic History of English: Volume I. Oxford University Press.

Traill, A., Khumalo, J. S., & Fridjhon, P. (1987). Depressing facts about Zulu. African Studies, 46(2), 255-274.

The origin of war (summary of Cannibals and Kings by Marvin Harris, chapter 4)

Previously: Chapter 2.

I meant to write about Chapter 3 next, which is about the origin of agriculture. Basically, Harris thinks that the origin of agriculture is ultimately a result of the climate change that occured at the end of the Ice Age; but the exact causal sequence is complicated, and it varies depending on which region you’re talking about. I was struggling to write a post about it, so instead I’m going to write about Chapter 4, which is about the origin of warfare. I might write something on Chapter 3 later.

Why do people wage war against each other? In order to start answering this question we first have to understand that warfare is a phenomenon that has varied significantly in its nature across time and space. Different kinds of wars may happen for different reasons. We also have to understand that there is a difference between proximal and distal causes. The First World War happened, in one sense, because of the assassination of Archduke Franz Ferdinand, and yet most people who ask why the First World War started are not satisfied with this answer, because on its own it does not explain why the archduke was assassinated nor why his assassination caused a war. In general, when one asks “why does X happen?” and one receives the answer “because Y” there remain two questions about the cause of X—“why Y?” and “why does Y cause X?”. Y is the more proximal cause here, while the cause of Y and the cause of the fact that Y causes Y are the more distal causes. There is a temptation with questions of causation to think in terms of trying to find a first cause, but there is a sense in which first causes do not exist; everything is caused by something else. Yet it is possible to get to a stage where the remaining distal causes are already understood, or can be taken for granted; at that point you can say that the question has been answered. A lot of literature has been written by historians to try to get us to that point with the question of what caused the First World War. However, I’m not going to discuss this literature very much here, because even the causes identified in this literature are too proximal; they are the causes of one specific war, while the question at hand is the cause of war in general. And the First World War is a very atypical example of a war. In fact, all wars between states are atypical, in a certain sense, so the causes of these wars are not going to be the main focus of our discussion. States have only existed for the last few thousand years. But behaviorally modern humans have existed for at least 30,000 years. For most of that time, all human societies had a foraging mode of substance and were organised into bands1. It is only in the last 10,000 years that societies started to emerge that had a farming mode of subsistence and were organised into tribes, chiefdoms or states, and for much of this time these societies were a minority. So out of all the societies that have ever existed, most have been organised into foraging bands. Therefore, let’s start by examining the causes of warfare between foraging bands specifically.

At this point we have to note that there is some dispute over whether warfare is characteristic of forager societies. The dispute has a long history; Hobbes famously argued that humanity’s “state of nature” was a “war of all against all”, while Rousseau argued for the opposite. Nobody denies that there are forager societies today which wage war. But there are also some forager societies which have never been observed waging war, such as the Andaman Islanders, the Yahgan of Patagonia and the Semai of Malaysia. These are a minority of forager societies today, but some anthropologists believe that during the Palaeolithic, all forager societies were like this, and the modern forager societies which wage war acquired the practice due to contact with farmers. Marvin Harris is not one of those anthropologists. He thinks warfare is a very old phenomenon. The dispute is to some extent politically charged, because those with anti-war inclinations would like to believe that the prospensity towards warfare is not innate to the human species. Of course, even if humanity did have some innate prospensity towards warfare that wouldn’t necessarily mean it couldn’t be suppressed by culture. In any case, the question of how old warfare is as a phenomenon is a factual one, to be settled by facts.

Unfortunately, there is no strong evidence for any stance when it comes to Palaeolithic warfare. We certainly don’t find obvious giveaways, like the walls, towers and moats of Jericho, or the Talheim Death Pit; none of the cave paintings depict warfare. Plenty of skeletons have been found showing signs of trauma, but this is not unambiguous evidence; the trauma could have been caused by hunting accidents, or one-off incidents that were not part of an organised campaign of violence, or it could have been inflicted after death (the Manus people of New Guinea, for example, sever their dead relatives’ heads to use as keepsakes, while the Fore people, also of New Guinea, used to smash holes in their dead relatives’ skulls in order to eat their brains).

Harris admits that warfare was probably less frequent and less deadly in the Palaeolithic. Forager societies are mobile, so there is less need for conflict over territory (although it might still arise when there is no new land to escape to that is sufficiently fertile). There is also less sense of shared identity and hence less potential for xenophobia. Bands are small units, with less than 100 individuals each, so members of different bands have to intermarry to avoid inbreeding, and people often move between bands, perhaps accompanied by relatives within the band, to meet up with relatives outside of the band. Recorded instances of wars between forager bands are perhaps best described not as wars between bands, but as the sum of one-on-one fights between members of different bands who decide to resolve their individual disputes. In order to illustrate this, let’s have a look at one example of such a war that was recorded by C. W. Hart and Arnold Pilling in the late 1920s. This war took place between two bands of Australian aboriginals who lived on the Tiwi Islands in northern Australia, the Tiklauila-Rangwila and Mandiiumbula bands. The original account is not available to me, but here’s Harris’s retelling of it:

The Tiklauila-Rangwila men were the instigators. They painted themselves white, formed a war party and advised the Mandiiumbula of their intentions. A time was set for a meeting. When the two groups had gathered, both sides exchanged a few insults and agreed to meet formally in an open space where there was plenty of room. As night fell … individuals from the two groups exchanged visits, since the war parties included relatives on both sides and no one regarded every member of the other group as an enemy. At dawn the two groups lined up on opposite sides of the clearing. Hostilities began with some old men shouting out their grievances at one another. Two or three individuals were singled out for special attention.

Hence when spears began to be thrown, they were thrown by individuals for reasons based on individual disputes.

Since the old men did most of the spear throwing, marksmanship tended to be highly inaccurate.

Not infrequently the person hit was some innocent non-combatant or one of the screaming old women who weaved through the fighting men, yelling obscenities at everybody, and whose reflexes for dodging spears were not as fast as those of the men … As soon as somebody was wounded, even a seemingly irrelevant old crone, fighting stopped immediately until the implications of this new incident could be assessed by both sides.

Harris thinks that a typical war between forager bands in the Palaeolithic would have been much like this war among the Tiwi people. It would have arisen as a result of individual disputes when people who shared disputes against members of the same band decided to team up with each other. That would have been the proximal cause of these wars. Yet the question remains: why did these disputes need to be resolved by violence?

Don’t get me wrong; I’m not expecting that foragers wouldn’t have resorted to violence when it was advantageous for them. (Although there are forager societies which appear to have had a philosophy of complete non-violence, such as the Moriori of the Chatham Islands; unfortunately such societies tend to be selected against for obvious reasons—look up what happened to the Moriori.) But war between forager bands is immensely costly, perhaps more so for the societies involved than war between states. Remember, forager bands are small. And war disproportionately affects the stronger members of the band, the adult males, who make the biggest contribution to the band’s continued survival. Note also the mention in the above account of the Tiwi war of the fact that the Tiklauila-Rangwila and Mandiiumbula war parties included relatives on both sides. These people were trying to kill their own relatives, and other people who they regularly interacted with. It wasn’t like a war between states were the two parties can dehumanize each other.

Why couldn’t the Tiklauila-Rangwila men have resolved their grievances by engaging in some kind of ritualised mock combat? Of course, we don’t know how serious these grievances were, and killing has a finality which no other method of conflict resolution can rival—but it would be rash to assume that war was the only thing which would work. In fact, this was apparently how the Moriori resolved their conflicts.

Why might this kind of warfare happen? Let’s try and think of some possible reasons.

  1. The need to foster solidarity. By working together to fight an external enemy, groups that wage war increase their internal cohesion and hence their capability for survival.
  2. The need for entertainment. Groups might start wars just because their members enjoy it. (OK, this might sound a little ridiculous to people brought up in the modern Western cultural tradition which tends to hammer home the “war is hell” message, but in other cultures war is glorified. If you haven’t been in a war, how do you know what it’s really like? And if you have been in a war and not enjoyed it, might others have had a different experience?)
  3. The need to satisfy an innate “urge to kill”. Humans, or perhaps just human males specifically, have an instinct which compels them to kill others, which has evolved in the usual way via natural selection; for whatever reason, humans with an inclination towards violence have proved more reproductively successful. Of course it is possible to suppress and moderate this innate drive, just as is possible with other innate drives (people go on hunger strikes, and die from them, for example), but it remains present and manifests itself in the absence of mitigation.
  4. The desire to control more resources.

The last explanation here, that war occurs when political units compete over resources, might strike you as the most sensible one here, and it probably is the main explanation for wars between states. But does it work as an explanation of the Tiwi war? If wars between forager bands are waged for this reason, we would expect wars to often result in one band gaining resources at the expense of the other band. The ethnographic evidence, however, suggests otherwise. Victorious warriors certainly gain social status, and sometimes they gain women which they have kidnapped from the enemy, but often they return only with trophies, such as the severed heads of the men they have slain. After all, these societies do not have the capability to store food or other valuables in large quantities. Territorial expansion is out of the question; bands are mobile, so they lack territory in the first place, and besides they lack the organisational capability to subjugate a population and extract tribute from it. And as we have seen from Chapter 2, forager societies control their population growth, so they would have no need for more Lebensraum most of the time. Of course forager societies would not have always been completely successful at maintaining a constant population, so perhaps wars would occur for reasons of Lebensraum sometimes, but it would not be the main cause of war.

So, perhaps one of the other three explanations given above is more appropriate? Well, let’s have a look at them one-by-one.

Harris’s main problem with the solidarity explanation is that it needs to be accompanied with an explanation of why, out of all the ways a group could increase its sense of togetherness, warfare would be favoured over other methods which do not suffer from the rather significant costs associated with warfare. Is it so difficult to foster intragroup solidarity? In modern Western societies, sport seems to be able to do this to a considerable degree. Maybe it doesn’t work as well as warfare would, but it’s not clear—and warfare doesn’t just need to be better at fostering intragroup solidarity than sport, it needs to be so much better that it is worth the cost. Nobody has been able to show that this is the case, and without that the solidarity explanation has no explanatory power.

As for entertainment: well, I tried to convince you above that it’s not totally obvious that war isn’t fun, but after a careful look at the evidence it seems fair to conclude that war really isn’t fun. Of course societies tend to bombard their members with messages telling them that war is a blast (and not just literally), or, failing that, that killing people somehow makes you a better, more noble person—dulce et decorum est pro patria mori, right? But the very fact that the message has to be hammered home so much kind of gives away that this attitude towards war is far from natural. Humans don’t need these societal messages to get them to do things they enjoy doing; in fact there are some things that humans enjoy very much and do very frequently even when the society they live in is constantly telling them that it makes them a bad person. Everybody knows about the conscientious objectors during the two World Wars. What you might not know is that their equivalents existed even in many pre-state societies which waged war. The Crow Indians, for example, allowed adult males to avoid fighting as long as they dressed themselves in women’s clothing and worked as servants to the warriors. Even the notoriously warlike Yanomamo of the Amazon rainforest have to prepare themselves for battles by taking drugs and performing special rituals. So this explanation, too, is not convincing.

The “killer instinct” explanation suffers from the same problem. If humans have an innate urge to kill people, why is it so hard to get them to do it? In any case, why would such an innate urge be maintained by natural selection in the first place? The cause which the instinct explanation names is too proximal.

So if none of these explanations are satisfactory, what is Harris’s explanation?

I mentioned above how warfare could occur due to the need of bands for Lebensraum after population growth, but this would be a rare occurence. But a related reason why war would be beneficial would be the fact that war helps decrease the average population density in a region. Obviously, the prospect of war is good motivation for bands to try and stay away from each other. In fact, the threat of attack might encourage bands to stay away entirely from certain regions where they would be especially vulnerable. In this way “no man’s lands” are created, and these might be crucial to ensuring that foragers do not deplete the supply of the animals and plants that they consume.

But Harris doesn’t think these are the only benefits of war. He also claims that war helped foragers to control their population growth. On the face of it, this claim might seem strange because the direct effect of war is mainly to reduce the number of adult males in the population. But as long as the amount of adult women remains more or less constant, the rate of population growth will remain the same, especially in openly polygynous societies. (Not all forager societies are polygynous, but the more warlike societies tend to be more polygynous, according to Harris.)

Harris’s answer is that warfare is part of a kind of package of practices which mutually reinforce each other and serve to control population growth, so that forager populations remain within the carrying capacity of their environment. These three practices are female infanticide, warfare and male supremacy (i.e., patriarchy).

Female infanticide, of course, directly limits the rate of population growth. That was the topic of Chapter 2. But female infanticide is costly for the mother. Not necessarily in moral terms, because cultures vary greatly in how they view the morality of infanticide. But pregnancy isn’t easy, and it is not hard to see why mothers would be reluctant to kill their newborn babies after going through all that effort. And it’s not just mothers who suffer the costs, because the whole band has to work together to supply the extra food needed to feed the developing baby (or the newborn baby, since infanticide can occur by neglect, not just by direct killing).

Harris proposes (as we’ll see when we get to the next chapter) that foragers’ reliance on the strongest members of their societies, the adult males, for military purposes is the cause of the development of male supremacist institutions and practices. One of these male supremacist practises is the favouring of male babies over female ones, and this encourages female infanticide. Female infanticide would be favoured to some extent already because of its effect on population growth, but male supremacy makes it possible for it to be favoured to a greater extent, resulting in a greater effect on population growth.

Even if you don’t find this causal link illuminating (it is elaborated upon in the next chapter), Harris points to evidence that it holds up. William Divale studied a number of band and tribe societies for which census records were available covering the time when the society was pacified by the occupying state. He found that the ratio of boys less than 14 years old to girls in the same age bracket was significantly higher (128:100, on average) before pacification than after it (the average figure for societies less than 25 years post-pacification was 113:100, and after 25 years it dropped to 106:100, more or less the global average, which was 105:100 when Harris wrote the book but is 107:100 today). The figures are restricted to this age bracket for the obvious reason that when these boys grew up, many of them were killed in battle, so that the sex ratio among the adults in the unpacified societies was actually more or less exactly equal, and it actually became more skewed towards boys after pacification.

So, in summary: according to Marvin Harris, the practice of warfare facilitates the survival of forager societies because it encourages the dispersal of bands and the creation of no man’s lands where the animals and plants that they feed on can take refuge, and also because it facilitates the development of male supremacist institutions and practices; in particular, it encourages female infanticide, which limits population growth and thereby prevents the resources in the environment from being depleted.

Footnotes

  1. ^ Anthropologists classify societies via their extent of organisation into four types, going from least to most complex: bands, tribes, chiefdoms and states.

A controversial maths question

The following question which appeared on an Edexcel GCSE maths paper (for readers outside of the UK, this is a test that would be taken at the end of high school, at the age of 15 or 16) has gone viral, with many students complaining about its difficulty.

There are n sweets in a bag. 6 of the sweets are orange. The rest of the sweets are yellow.

Hannah takes at random a sweet from the bag. She eats the sweet.

Hannah then takes at random another sweet from the bag. She eats the sweet.

The probability that Hannah eats two orange sweets is 1/3.

Show that n2n − 90 = 0.

I sympathise with the students complaining about this question’s difficulty. I don’t think it is an easy question for GCSE students. I found it easy to answer, but I’m studying for a Maths degree.

However, there are different kinds of difficulty. Sometimes a question is difficult because it involves the application of knowledge which is complex and/or less prominently featured in the syllabus, which makes acquiring this knowledge and keeping it in your head more difficult, and also makes it harder to realise when this knowledge needs to be applied. This was not one of those questions. The knowledge required to complete this question was very basic stuff which I expect most of the students complaining about it already knew. As far as I can tell, the following mathematical knowledge is required to answer this question:

  • If m of some n objects have a certain property then the probability that one of these n objects, picked at random, will have that certain property is m/n.
  • The combined probability of two events with probabilities x and y is xy (as long as the events are independent of each other, although you could get away with not understanding that part for this question).
  • Basic algebraic manipulation techniques, so that you can see that the equation (6/n)(5/(n − 1)) = 1/3
  • can be rearranged into n2n − 90 = 0. All this takes is knowing how to multiply the two fractions together on the left-hand side (new numerator is the product of the old numerators, new denominator is the product of the old denominators), then taking the reciprocals of both sides and bringing the 30 on the left-hand side over to the right-hand side (there are different ways you could describe this process).

Certainly for some of the students, the knowledge was what was at issue here, but I think the main challenge in this question was the successful application of this knowledge. Most of the students knew the three things listed above, but they failed to see that this knowledge could be used to answer the question.

Unfortunately for the students, failure to apply knowledge successfully is in many ways a much more serious failure than failure to possess the appropriate knowledge. If you fail due to lack of knowledge, there is an obvious step you can take to prevent further failure in the same situation: acquire the knowledge that you lack, by having somebody or something teach you it. Crucially, when you successfully learn something you not only end up knowing it, but you also know that you know it.

If, on the other hand, your problem is that you failed to apply your knowledge successfully, then it is much less clear what the next step you should take is. And, also, you never know whether you are capable of applying your knowledge in every situation where it might be useful, because there are usually a whole lot of different situations where it might be useful and it is impossible for you to be familiar with them all. This is why cramming for tests is not a good idea. Cramming may be the most efficient way to obtain the knowledge you need for a test (I don’t know whether this is actually true, but I don’t think it’s impossible) but it certainly won’t help you with applying your knowledge.

There is one relatively straightforward way to become better at applying your knowledge. If you remember how you applied a certain piece of knowledge in a previous situation, then, if the exact same situation occurs again, you won’t have any trouble, because you will just apply the knowledge in the same way again. In a similar way, the more similar the new situation is to the old one which you remember how to deal with, the more likely you are to be able to successfully apply your knowledge.

But, in a way, this is a trick. I am about to get a bit esoteric here, but bear with me. Let’s say your mind has two “departments” which work together to solve a problem. One of them is the Department of Knowledge-Recall, or the DKR for short, and the other is the Department of Knowledge-Application, or the DKA for short. I think that when you carry out the strategy above, what is happening in your mind is that the DKA is pre-emptively passing on the tough part of the work to the DKR for them to deal with. If you remember how you used a certain piece of knowledge (let’s call it X) to deal with a previous situation, then that memory, in itself, has become knowledge (let’s call it Y). The DKR has worked so that Y is easily recalled. When the situation comes up again, the DKA’s task is really easy. It’s looking at the most prominent pieces of knowledge that the DKR has made you recall, and it notices that Y has a kind of direct link to this situation. If the DKR hadn’t worked to make Y a piece of easily-recalled knowledge for you, the DKA would have to do more work, sifting through the knowledge that the DKR has made you recall, possibly asking the DKR for more, inspecting them more closely for any connection to the situation at hand.

There’s a good chance the above paragraph made no sense. But basically, I’m trying to say that familiarising yourself with the situations where you need to apply your knowledge works because it is a process of acquiring specialised knowledge about where you need to apply your knowledge—it is not truly applying your knowledge. But probably the more important point to make is that this process is inefficient. It would be much simpler if you could simply apply your knowledge to unfamiliar situations straight away. And it’s often impossible to familiarise yourself with every conceivable situation, because the range of conceivable situations is so vast.

Students taking tests try to carry out the familiarisation process by doing past papers. This is often effective because the range of questions that can be on a paper is often quite limited, so it really is possible to familiarise yourself with every situation where you might need to apply your knowledge. But this isn’t a good thing!

As I argued (somewhat incoherently) above, the skill of being able to apply your knowledge is only separate from the skill of being able to recall your knowledge if it includes the sub-skill of being able to apply your knowledge to unfamiliar situations. That particular sub-skill is one which is hard to improve. Arguably, this sub-skill is what we mean by “intelligence” (the word “intelligence” is used to refer to a lot of different things, but this might be the most prototypical thing referred to as “intelligence”). It’s certainly possible to get better at this sub-skill, but I don’t know if it is possible to get better through conscious effort.

But intelligence (by which I mean, the ability to apply your knowledge to unfamiliar situations) is often the main thing the tests that students take are trying to assess. After all, there are few vocations, if any, that a person cannot do better at by being more intelligent. You don’t always need intelligence to get to an acceptable level, sure, but intelligence always helps. The purpose of the GCSEs is not just to find out which students are more or less intelligent—they are also supposed to increase the amount of people who have useful knowledge—but it is one their main purposes.

That’s why I don’t think this question was unfair, as some students have been saying. Yes, it was quite different from anything that was on the past papers. But that was the whole point, to test people’s ability to apply their knowledge in unfamiliar situations. It is natural to be disappointed if you couldn’t answer the question and to complain about your disappointment, but saying that it was unfair is a step too far. It did what it was meant to do.

I think there is possibly a genuine problem here, though. If questions like this which strongly tested intelligence (as defined above) were usual on the GCSE papers, you wouldn’t expect this one to become such a topic of interest. Perhaps the GCSEs have suffered in the past from a lack of questions like this, which has affected students’ expectations of what these tests should be like. I should be clear that I have no idea whether this is true. I took my GCSEs in 2011, but I don’t remember what the questions were like in this respect.

PS: I’ve seen some people saying “this is easy, the answer is 10”. These people are making fools of themselves, because the answer is not 10, in fact that doesn’t even make any sense. The question is asking for a proof. (“Show that…”) It seems these people have just seen the quadratic equation at the end and assumed that the question was “Solve this equation” without actually reading it. Perhaps this is evidence that the question really isn’t easy. Or maybe these people just aren’t thinking about it very much.

A summary of Cannibals and Kings by Marvin Harris, chapter 2

Cannibals and Kings is an anthropology book by Marvin Harris aimed at a popular audience. Since I’m trying to read this for education rather than entertainment, I’m not reviewing it in the usual way, but instead I’m trying to understand what it is arguing for. Hopefully, this will be part of a series of posts where I summarise each chapter; I’m starting with chapter 2 because the first one is a general overview. I wrote a brief introduction to Marvin Harris and Cannibals and Kings in this post on Tumblr.

Somebody not familiar with the topic might be inclined to think of agriculture as an invention, like the steam engine or the light bulb. The question of the origin of agriculture would not seem well-posed to such a person. They would say that until the first farmer societies appeared 12,000 years ago in the Middle East, the idea of agriculture had simply never occured to anybody, and for this reason every society was a forager society. But once the first person had the idea (who happened to be in the Middle East, 12,000 years ago), people saw that the farmer lifestyle would be better for them, and therefore they adopted it. Neighbouring societies came into contact with the first farmer societies, came to the same realisation, and became agricultural themselves. Other societies came up with the idea independently too (but later than those Middle-Eastern pioneers), and passed the idea on to their own neighbours. In this way most of the societies in the world became agricultural, except for a few in places like Siberia where the environment made agriculture impractical, and in places like Australia where the societies were too isolated to be sufficiently exposed to the idea and happened not to come up with it themselves.

There are a lot of problems with this explanation. One of them is the assumption that the idea of agriculture was sufficiently unlikely to occur to people in forager societies that the first farmer societies only appeared 12,000 years ago. Humans are thought to have reached behavioral modernity around 40,000 years ago, so that’s 28,000 years where the idea of agriculture, if it occured to anybody, occured only to people who were unable to get it across to others, or to people in those places like Siberia where agriculture was impractical. Was the idea of agriculture really so inaccessible? Anthropologists have found that people in modern forager societies often have extensive knowledge of the natural world in which they live. Presumably, prehistoric foragers were the same. In particular, the mechanics of plant growth would probably not have been a mystery to them. And if they knew how plants grew, it doesn’t seem like a massive leap for them to have the idea of planting seeds and encouraging their growth in order to eat the plant once it was fully grown.

But the most fundamental problem with this explanation is the idea that the farmer lifestyle would be attractive to foragers. This is far from obvious. In fact, it appears that, at least in its primitive stages, farmer societies were in many respects less conducive to general well-being than forager societies. There are lots of points that could be made here, so let’s just focus on one metric by which forager societies have an advantage over farmer societies (at least the primitive ones): the amount of leisure time available, as opposed to time spent obtaining food. The modern San foragers of the Kalahari desert spend about three hours per adult per day hunting and gathering and have a diet rich in animal and plant protein. Modern Javanese peasants (as of 1977), on the other hand, spend about six hours a day working their farms and get much less protein for their efforts. Even modern workers still spend about four and a half hours a day earning the wages they need to obtain their food (assuming a 40-hour week), although they have access to an enormous range of different kinds of food, so the comparison is less straightforward. And, let’s not forget, the Bushmen live on the edge of the Kalahari desert, not one of the most hospitable environments in the world. Most prehistoric foragers would have lived in environments where food was easier to access.

That’s the empirical side of the argument. It is also possible to explain why it makes sense that the early farmer societies would have had a worse standard of living than forager societies. The reason is that there is a crucial difference in the nature of the means of production in forager and farmer societies. Foragers depend on the amount of resources present in the surrounding environment. It is impossible for them to increase the amount of food produced per unit area, because the more they hunt and gather, the more scarce the animals and plants that they sought become. On the other hand, farmers can increase the amount of food produced per unit area by planting more crops per unit area; there are limits on the amount of crops that can be grown per unit area too, but the limits are sufficiently high that the maximum amount of food that can be produced per unit area in a farmer society is much higher, and the early farmers would not have needed to reach it. Harris refers to increase in the amount of food produced per unit area as “intensification of production” (it’s a concept that re-occurs and is important throughout the book). The difference between forager and farmer societies can therefore be briefly summarised as this: foragers cannot intensify production, but farmers can.

Since foragers cannot intensify production, forager societies are motivated to maintain a constant population density. If the amount of foragers within a given area increases, the foragers in the area have to eat less on average. However, if the amount of farmers within a given area increases, the farmers in the area don’t have to eat less on average, because they have the option of increasing the amount of food produced in the area and thereby cancelling out the effect of the increased population density. As a result, farmers are not motivated to maintain a constant population density, and, in general, the population density in a farmer society tends to increase over time. But the intensification of production that farmers must carry out in order to accomodate the increase in population density necessitates an increase in workload.

One last question remains. How do forager societies maintain their constant population density? In the second chapter of the book, which is called “Murders in Eden”, Harris talks about some of the methods that they probably used. As the name of the chapter suggests, some of these might turn you off from the somewhat idyllic picture of forager life that has been painted so far.

Foragers had no access to effective methods of contraception. They had access to methods of abortion, and some of them were probably quite effective, but they tended to also be very effective at killing or seriously injuring the mother, so abortion was not an attractive option. But there was another way of getting rid of unwanted children which carried zero risk of harm to the mother: infanticide. Infanticide is, of course, morally objectionable to most people in my society, and probably yours as well, if you’re reading this. But other societies, both historical and modern, it is not; in these societies infants are considered to be non-persons, just as many people in my society consider foetuses to be non-persons. I think Harris would agree that a society’s nature generally determines its moral system, not the other way round. The necessity of infanticide in forager societies would cause these societies to define infants as non-persons. There would, therefore, be no reason for foragers to refrain from carrying out infanticide for moral reasons. The only reason infanticide would be disfavoured to some degree would be so that the effort of pregnancy, and, especially, the extra food that the mother would need to eat, would not be wasted. But the benefits of keeping the population constant would have outweighed these costs.

Another, less morally questionable method that was used was late weaning. After birth, women ovulate only after their body has built up enough fat reserves to allow the next baby to get enough food, at least according to Harris (I don’t know if this is an accepted fact or not). If they breastfeed, they expend a significant amount of calories feeding the baby, so that their fat reserves build up more slowly, and ovulation is delayed. (The same mechanics are behind the fact that menarche occurs earlier among more well-nourished populations.) In farmer societies, people usually consume enough carbohydrates that it is impossible to delay ovulation for more than a year or two, even if the baby is not weaned for the whole of this time. But foragers’ diets are much less carbohydrate-rich, so they can delay ovulation for much longer. And as long as ovulation has not occurred, impregnation is impossible. Studies of Bushmen women have found that, by putting off weaning, they often avoid getting pregnant for four years or more. That means that, within in the approximately 26-year-long span in which they are fertile (between the ages of 16 and 42, or close to those ages), there is only time for five or six pregnancies. Accounting for the effects of miscarriage and infant mortality, this might result in only three or four children who survive puberty on average, given no infanticide. In order to maintain constant population this number needs to be cut down to two, and this would have to be done via infanticide, but the average woman would only need to kill one or two babies during her lifetime. It still probably doesn’t sound like an ideal way to live to anybody reading this, but it’s not quite as bad as you might have thought. Again, though, it is important to understand that while this would be a moral cost for us, it would not have been a moral cost for foragers in which infanticide was an accepted practice.

Now that we have established that the naïve description of the origin of agriculture given above is incorrect, the next question to answer is this. In general, farming was not an attractive option for foragers. What were the particular circumstances in the Fertile Crescent 12,000 years ago, and in the other areas where agriculture originated independently, that made farming an attractive option in these particular times and places? This is the subject of the next chapter, “The Origin of Agriculture”, which I’ll write about in the next post.

English words for mammal species, ordered by age

This is a list of English words referring to kinds of mammals, ordered by age. By ‘age’, I mean the earliest time at which the word was used in its current sense; for example, the word ‘deer’ is of Proto-Germanic vintage but it was originally used to refer to animals in general (like the modern German cognate Tier); the word was already used to mean ‘deer’ specifically in Old English, but the wider sense only became the more usual one by the 15th century, so I have listed the word as being only 500 years old.

I have not included words referring to animals of specific sexes or ages, except for the words ‘cow’, ‘bull’, ‘steer’ and ‘ox’. I have also not included words referring to animals that I wouldn’t expect most people living in England to have heard of, unless they are of especially old vintage (like ‘onager’).

Proto-Indo-European period (4000 BC – 2500 BC): beaver, mouse, swine, hound, wolf

Proto-Germanic period (2500 BC – 100 AD): ape*, horse, cow, bull†, steer†, ox, elk, whale, cat, fox, bear, weasel, seal

(note: ‘cat’ was borrowed from Latin at the end of this period, ‘ape’ is probably late as well although its origin is unknown)

Proto-West Germanic period (100 AD – 450 AD): hare, boar, sheep

Early Old English period (450 AD – 900 AD): shrew, ass, camel, tiger

Late Old English period (900 AD – 1100): rat, pig, dog

12th century (1100 – 1200): lion

13th century (1200 – 1300): dromedary, ounce, panther, leopard

14th century (1300 – 1400): squirrel, mole, bat, onager, rhinoceros, goat, dolphin, porpoise, lynx, hyena, polecat, elephant

15th century (1400 – 1500): monkey, baboon, porcupine, dormouse, hedgehog, hog, deer, reindeer, antelope, genet, marten

16th century (1500 – 1600): chinchilla, marmot, giraffe, buffalo, chamois, hippopotamus, civet, badger, armadillo, manatee

17th century (1600 – 1700): orangutan, guinea pig, woodchuck, lemming, muskrat, hamster, zebra, Bactrian, llama, peccary, moose, bison, gazelle, ibex, narwhal, jaguar, mongoose, jackal, skunk, wolverine, mink, raccoon, walrus, sealion, sloth, opossum, possum

18th century (1700 – 1800): chimpanzee, gibbon, lemur, rabbit, chipmunk, groundhog, donkey, tapir, alpaca, yak, gnu, beluga (whale), pangolin, ocelot, cougar, puma, cheetah, dingo, coyote, anteater, mammoth, wombat, kangaroo, platypus

19th century (1800 – 1900): gorilla, vole‡, gerbil, wildebeest, orca, meerkat, (red) panda, aardvark, dugong, bandicoot, koala, wallaby

20th century (1900 – 2000): (giant) panda

* The word ‘ape’ originally referred to both monkeys and apes (well, it first referred to monkeys, and then to apes; the Proto-Germanic speakers would not have been familiar with any ape species), and it is still used in this sense colloquially, so I have dated its origin accordingly; I couldn’t find any information on how early the word was used in the more specific sense.

† The words ‘bull’ and ‘steer’ were synonymous in Proto-Germanic, like the modern German cognates Bulle and Stier; I have dated their origin to Proto-Germanic, as if they were still synonyms, even though, strictly speaking, bulls are uncastrated and steers are castrated. I couldn’t find any information on how recently the specialisation of the meanings of these two words was (it was post-Old English, at least), so it was easier to do it this way.

‡ The word ‘vole’ is a shortening of an older compound ‘volemouse’, in which the ‘vole-‘ element had no independent meaning; I couldn’t find any information on it, but since the Dutch word for ‘vole’ is woelmuis and the German word for ‘vole’ is Wühlmaus, it seems likely that the compound goes back to the Proto-West Germanic period.

A review of “The Buried Giant”, by Kazuo Ishiguro

I hadn’t read any of Kazuo Ishiguro’s books, or even heard of the man, before reading “The Buried Giant”. He’s one of the many literary authors who get a lot of attention from critics but not so much from the general public. But “The Buried Giant” may surprise readers used to the conventions of literary fiction, because in terms of its setting and plot it is much more like a fantasy novel. The story of “The Buried Giant” takes place in sub-Roman England, not long after the time of King Arthur. The knight Gawain is one of the central characters. In the story’s universe, magic exists and ogres and pixies are real creatures. One of the other central characters is even on a quest to slay a dragon. When it’s described in these terms it almost seems like Ishiguro was intentionally trying to come up with the most stereotypical sword-and-sorcery fantasy plot. Naturally, some critics have been a bit put off by this. But if you like this kind of thing, it might attract you to the book. It’s probably what made me decide to read the book in the first place.1

There is more to the plot than the especially stereotypical fantasy elements I listed above. In fact, the story’s main plotline is perhaps more the sort of thing that will interest fans of literary function. It follows an elderly couple, Axl and Beatrice, who are on a journey to meet their son who lives in another village. Since this is 1,500 years ago we’re talking about here, this is not quite as routine an occurence as it may sound, and it is made all the more interesting by the fact that they don’t actually remember anything about their son. They have even forgotten his name. In fact, they, along with, it appears, everyone else in the village, are suffering from a collective amnesia. They struggle even to remember events of only a few weeks ago. Memories do come back now and then, but only in fragments, and at unpredictable times. The two don’t remember how they met each other, for example. Indeed, although they still feel a strong bond of love between them, that love now lacks a past, and exists only in the present; how, then, can they be sure that it will last, or if it has any real existence at all? And, worryingly, the occasional fragments of memories of their time spent together that come back do not always seem like happy ones.

The realisation of the existence of this collective amnesia, referred to as “the mist” by the couple, is what compels Axl and Beatrice to set off on their journey, in case their memory of their son disappears entirely. But their journey also has a dual purpose, because they are also interested to know what the cause of “the mist” is, and whether it can be stopped—although they are somewhat apprehensive about whether their newly-recollected memories might put their relationship in danger.

So there’s a kind of personal, perhaps relatable story at the heart of “The Buried Giant”. There’s also another story which intertwines with this one, which is of a very different nature: this one is about people as collectives, rather than individuals; about the destinies of nations and the making of history. The more typical epic fantasy stuff, in other words. Not long after they begin their journey, Axl and Beatrice end up joining up with two Saxons, whose names are Wistan and Edwin. Axl and Beatrice are Celtic-speaking Britons, which are the dominant people in the region in which the events of the book take place; Wistan and Edwin speak Axl and Beatrice’s language with them, not the other way around. Of course, all that is set to change. I won’t go into any more detail than this, because I don’t want to spoil the twists in this story (there are a few of them).

I do think the first, personal story is the more interesting one. This is clear to me, because there are points in the book where the two stories separate, and at these points, when I was reading about what was happening to Wistan and Edwin, I was impatient to read on and find out what was going to happen next to Axl and Beatrice. Perhaps Ishiguro was able to make the first story more interesting since it’s the kind of story he’s more used to writing. I don’t mean to give the impression, however, that the second story was a complete bore; far from it. In fact, the second story definitely had a more satisfying resolution than the first one. The fact that there is a kind of mystery around Wistan and Edwin, which eventually gets resolved, provides the story with a natural endpoint. But the first story couldn’t be resolved so nicely. Of course, the natural way to end the story would be for us to find out more about Axl and Beatrice’s past and to get some answer as to whether their relationship will survive. But, even though “the mist” does disappear in the end (I hope that’s not too much of a spoiler—I mean, it’s predictable) we still don’t get told very much about their past. And with regards to whether they will stay together, the author does the thing where he ends the book just before it seems like there’s about to be some closure on the question, leaving you on a sort of cliffhanger, albeit not one which will ever be resolved (I mean, I don’t think literary authors generally do sequels). Clearly this was intentional, and the ending is supposed to be unsatisfying, but doesn’t stop me feeling a little bit annoyed with it. I don’t think the book would have suffered for having a more conventional, perhaps more uplifting ending.

One thing critics have complained about is the dialogue. For example, James Woods thinks this is self-evidently bad dialogue:

“Your news overwhelms us, Sir Gawain. … But first tell us of this beast you speak of. What is its nature and does it threaten us even as we stand here?”

Personally, I don’t really have a problem with this. I do agree that the dialogue sounds rather odd and un-modern in places, although it is never difficult to understand. But that’s all part of the fantasy aesthetic. I imagine there are lots of fantasy books where the dialogue is a lot more weird than in this book.

That said, the overall impression you get from critics’ reviews is that this is a decent book, but not close to a great one. And I pretty much agree with this overall impression. “The Buried Giant” is enjoyable, perfectly readable, but not outstanding at anything in particular. Its strong points are its plot and the endearing characters of Axl and Beatrice, and those are probably the aspects of the book that will stick with you. You won’t remember the book for, say, its prose quality, or its capability for intellectual stimulation, quite as much. I think it was worth my time to read this book, and I’ll recommend it to anyone who’s looking for something to read. But I won’t tell anybody they have to make room for this one if there are also lots of other books to read. I do think I might read another of Ishiguro’s books at some point.


  1. ^ I don’t think I can really call myself a fantasy fan, though, since I have only read one fantasy author. Admittedly, that author is Terry Pratchett, so I have read a lot of fantasy books, but they’re not very central examples of fantasy books.

Formal semantic analysis of natural language quantifiers

Natural language quantifiers are an interesting subset of words in that it is possible to define them formally using set theory, by taking them to be binary relations between sets. For example, here are the formal definitions of some English quantifiers.

  • “every” is the binary relation \forall between sets such that for every pair of sets A and B, \forall(A, B) if and only if A \subseteq B. For example, the sentence “every man is in the room” is true if and only if the set of all men is a subset of the set of everything in the room.
  • “some” (in the sense of “at least one”) is the binary relation \exists between sets such that for every pair of sets A and B, \exists(A, B) if and only if A \cap B is non-empty. For example, the sentence “some (at least one) man is in the room” is true if and only if the set of all men and the set of everything in the room have at least one member in common.
  • More generally, each natural number n (as an English word, in the sense of “at least n”) is the binary relation \exists n between sets such that for every pair of sets A and B, \exists n(A, B) if and only if |A \cap B| \ge n. (We are assuming here that the number is not interpreted to be exhaustive, so that the statement “two men are in the room” would still be seen as true in the case where three men are in the room.) For example, the sentence “two men are in the room” is true if and only if the set of all men and the set of everything in the room have at least two members in common.
  • “most” (in the sense of “more often than not”) is the binary relation M between sets such that for every pair of sets A and B, M(A, B) if and only if |A \cap B| \ge |A \setminus B|.

Admittedly, some natural language quantifiers, like “few” and “many”, cannot be satisfactorily defined in this way. But quite a lot of them can be, and I’m going to just focus on those that can be in the rest of the post. From now on you can take the term “natural language quantifiers” to refer specifically to those natural language quantifiers that can be given a formal definition as a binary relation between sets.

Now, once we have taken this approach to natural language quantifiers, an interesting question arises: which binary relations between sets correspond to natural language quantifiers? Clearly, no individual language could have natural language quantifiers corresponding to every single binary relation between sets, because there are infinitely many such binary relations, and only finitely many words in a given language. In fact, we can be quite sure that the vast majority of binary relations between sets will never correspond to natural language quantifiers in any language, because most of them are simply too obscure. Consider, for example, the binary relation R between sets such that for every pair of sets A and B, R(A, B) if and only if A is the set of all men and B is the set of all women. If this corresponded to an English quantifier, which might be pronounced, say, “blort”, then the sentence “blort men are women” would be true, and every other sentence of the form “blort Xs are Ys” would be false. I don’t know about you, but I can’t think of any circumstances under which such a word would be of any use in communication whatsoever.

Another problem with our supposed quantifier “blort” is that it can’t reasonably be called a quantifier, because its definition has absolutely nothing to do with quantities! You probably know what I mean here, but it’s worth trying to spell out exactly what it is (after all, the whole point of formal analysis of any subject is that trying to spell out exactly what you mean often leads to interesting new insights). It seems that the problem is to do with the objects and properties that are referred to in the definition of “blort”. Our definition of “blort” refers to the identities of the two arguments A and B—it includes the phrases “if A is the set of all men” and “if B is the set of all women”. But the definition of a quantifier should refer to quantities only, not identities. From the point of view of set theory, “quantity” is just another word for “cardinality”, which means the number of members a given set contains. So perhaps we should say that the definition of a natural language quantifier can only refer to the cardinalities of the arguments A and B. This is still not a proper formal definition, because we have not been specific about what it actually means for a definition to “refer” to the cardinalities of the arguments only. If we take the statement very literally, we could take it to mean that the definition of a natural language quantifier should be a string consisting only of the substrings “|A|” and “|B|” (with A and B replaced by whichever symbols you want to use to refer to the two arguments), interpreted in first-order logic. But that’s ridiculous, and not just because such a string would evaluate to a natural number rather than a truth value. In order to find out what the proper constraints on the string should be, let’s have a look again at the definitions we gave above.

  • For “every”, we have that \forall(A, B) if and only if A \subseteq B, or, equivalently, |A \setminus B| = 0.
  • For each natural number n, we have that \exists(A, B) if and only if |A \cap B| \ge n.
  • For “most”, we have that M(A, B) if and only if |A \cap B| \ge |A \setminus B|.

In order for these to count as quantifiers, our definition must allow us to compare the cardinalities as well as refer to them. We also need to refer to the cardinalities of combinations of the two arguments of A and B, such as A \cap B and A \setminus B, as well as |A| and |B|. And, although none of the definitions above involve the logical connectives \wedge (AND) and \vee (OR), we will need them for more complex quantifiers that are formed as phrases, such as “most but not all”.

The question of exactly which combinations of sets we need to refer to is quite an interesting one. Given our two arguments A and B, we can see all the possible combinations by drawing a Venn diagram:

A Venn diagram with two circles. The overlapping region is labelled A ∩ B, the region falling solely within the left circle is labelled A \ B, and the region falling solely within the right circle is labelled B \ A.

There are four disjoint regions in this Venn diagram, corresponding to the sets A \cap B, A \setminus B, B \setminus A and (not labelled, but we mustn’t forget it) U \setminus (A \cup B) (where U is the universal set). We also might need to refer to regions that are composed of two or more of these disjoint regions, but such regions can be referred to by using \cup to refer to the union of the disjoint regions.

But do we need to be able to refer to each of these disjoint regions? Note that in the definitions above, we only needed to refer to |A \setminus B| and |A \cap B|, not to |B \setminus A| and |U \setminus (A \cup B)|. In fact, it is thought that these are the only two disjoint regions that definitions of natural language quantifiers ever need to refer to. Quantifiers which can be defined without reference to |U \setminus (A \cup B)| are called extensional quantifiers, and quantifiers which can be defined without reference to |B \setminus A| are called conservative quantifiers. So now, if all this seems like pointless formalism to you, you might be relieved to see that we can make an actual falsifiable hypothesis:

Hypothesis 1. All natural language quantifiers are conservative and extensional.

To give you a better sense of exactly what it means for a quantifier to be conservative or extensional, let’s give some examples of quantifiers which are not conservative, and not extensional.

  • Let NE be the binary relation between sets such that for every pair of sets A and B, NE(A, B) if and only if |U \setminus (A \cup B)| = \emptyset. For example, if we suppose NE corresponds to an English quantifier “scrong”, the sentence “scrong men are in the room” is true if and only if everything which is not a man is in the room (it’s therefore identical in meaning to “every non-man is in the room”). “scrong” is conservative, but not extensional.
  • Let NC be the binary relation between sets such that for every pair of sets A and B, NC(A, B) if and only if |B \setminus A| = \emptyset. For example, if we suppose NC corresponds to an English quantifier “gewer”, the sentence “gewer men are in the room” is true if and only if there is nothing in the room which is not a man. “gewer” is extensional, but not conservative.

Wait a minute, though! I don’t know if you noticed, but “gewer” as defined above has exactly the same meaning as a real English word: “only”. The sentence “only men are in the room” means exactly the same thing as “gewer men are in the room”. (It’s true that we can say “only men are in the room” might just mean that there are no women in the room, not that there is nothing in the room that is not a man—there could be furniture, a table, etc. But “only” still has the same meaning there—it’s just that the universal set is taken to be the set of all people, not the set of all objects. In semantics, the universal set is understood to be a set containing every entity relevant to the current discourse context, not the set that contains absolutely everything.)

Does that falsify Hypothesis 1? Well… I said “only” was a word, but I didn’t say it was a quantifier. In fact, the people who propose Hypothesis 1 would analyse “only” as an adverb, rather than a quantifier. I guess this makes sense considering “only” has the “-ly” suffix. But that’s not proper evidence. Some people have argued that “only” cannot be a determiner (and hence cannot be a quantifier) based on syntactic evidence: “only” does not pattern like other determiners. The example I was given at university was the following sentence:

The girls only danced a tango.

Here, “only” occurs in front of the VP, rather than the NP, hence it must be a determiner.

I’m sure my lecturer could have given better evidence, but he was just pressed for time. But the obvious problem with this argument is that there is a well-known group of determiners which can appear in front of the VP, rather than the NP: the “floating” quan tifiers, such as “all”:

The girls all danced a tango.

Anyway, I remain not totally convinced that “only” is not a quantifier, and, taking a very brief look at the literature, it seems like a far-from-uncontroversial topic, with, for example, de Mey (1991) arguing that “only” is a determiner, after all (although I don’t really understand its argument, having not read the paper very carefully). Payne (2010) mentions that “only” should be seen as a kind of adverb-quantifier hybrid, which I guess is probably the best way to think about it, although it is kind of inconvenient if you’re trying to analyse these words in a formal semantic approach.

I wonder if there are any words in natural languages which have ever been analysed as non-extensional quantifiers. Google Scholar doesn’t turn up anything on the subject.

In any case, perhaps the following weakened statement of Hypothesis 1 is more likely to be true.

Hypothesis 1. In every natural language, the words that can be analysed as non-conservative or non-extensional quantifiers will exhibit atypical behaviour compared to the conservative and extensional quantifiers, so that it may be better to analyse them as adverbs.

Thoughts on “The Master and Margarita” by Mikhail Bulgakov

The Master and Margarita is a book by Mikhail Bulgakov, written from 1928 to 1940, but published only in 1967 due to Soviet censorship, long after the author’s death. It’s considered one of the best works of Soviet literature if not the best. I finished reading it last week, and, although I haven’t read any other works of Soviet literature, I can believe that it’s one of the best.

I’m more familiar with Russian novels of the 19th century: the works of Tolstoy, Dosteyevsky, etc. So it was interesting to see how a 20th-century Russian novel compared, although of course I don’t know how representative The Master and Margarita (let’s call it TMM, because I’m going to have to write it a lot) is of 20th-century Russian literature in general. There are some clear differences between TMM and Tolstoy and Dostoyevsky’s novels. For example, while Tolstoy and Dostoyevsky stuck to realism in their writing (even though the devil made an appearance as a character in The Brothers Karamazov, he could be explained as a hallucination), TMM is not very realistic at all. It’s arguably a fantasy novel. I mean, the premise of the story is that the devil, together with a band of demon sidekicks, has come to Moscow to cause trouble. One of the characters, who you might see on the cover if you buy the book, is a talking cat called Behemoth. At several points he threatens people wielding a Browning handgun. There are some memorably absurd lines involving, like the following:

“I challenge you to a duel!” screamed the cat, sailing over their heads on the swinging chandelier.

If this strikes you as a pretty amusing line, that’s probably intentional. Because another way in which TMM is quite different from your average Tolstoy or Dostoyevsky book is that it’s a comedy. A Russian kind of comedy, mind you; it can be categorised best as farce, although there are elements of black comedy too, to a lesser extent. Basically, if you find it funny when bizarre, horrible things happen to people, but they kinda deserve it, then you’ll probably enjoy the humour in TMM. Oh, and although a lot of it went over my head, since I wasn’t ever a Soviet citizen, there’s political satire in here too. That’s why Stalin wouldn’t allow it to be published, after all. There is at least one point where a more serious political message is conveyed, one more horrifying than amusing: I’m thinking of the chapter called “Nikanor Ivanovich’s Dream”. I’ll confess that I completely missed the point of this chapter when I first read it; I only realised its significance when I was reading about the book online and came across people saying that it was an ‘obvious’ allusion to the secret police’s interrogation methods. But, even if you take the chapter just as what it is on the surface, a description of a strange dream, it’s an unsettling read. In fact, I believe that this is one of the chapters that was still heavily censored for a time even after the book was published.

It was never my impression, though, that the political aspect of TMM was central to it. I’ve seen a lot of descriptions of the book along the lines of ‘satire of Soviet life’, and I guess you can interpret it that way, but that isn’t how I interpreted it. It seemed to me that the book was mostly about something else; it was more than just a comedy, and more than just a political satire or exposé along the lines of Solzhenitsyn’s The Gulag Archipelago. That said, I can’t really tell you what this extra something is. Like most good works of art, I don’t think it has or is supposed to have a straightforward message. So I won’t say anything more about what the deeper meaning of TMM is; that’s up to you to reckon for yourself if you read the book. But I can give you a brief overview of the plot (which I guess can be thought of as the meaning at the surface level).

The book has two parts. In Part 1 the reader is treated to a number of descriptions of the disastrous encounters various residents of Moscow have with Satan (going by the alias of “Professor Woland”) and his gang of demons. This is fairly entertaining, and it includes “Nikanor Ivanovich’s Dream” and no doubt other politically meaningful excerpts. But if this was all there was to the book, I’d be disappointed. Part 2 is the meat of the book, in my opinion. That’s where we are introduced to the titular character, Margarita. There are two titular characters, actually, since there’s the Master as well, who is Margarita’s boyfriend. But it’s Margarita who the narrative follows. We are briefly introduced to the Master in part 1, when he is living in an insane asylum (where a lot of the people who have had the misfortune to meet Professor Woland have ended up). He was a writer (he refuses to tell his name), who wrote a magnum opus about Pontius Pilate, the Roman governor of Judea in Jesus’s time, but the censors refused to publish it, he got into political trouble, and he ended up being sent to the madhouse (but not before he decided to burn his book’s entire manuscript). By the way, a writer, who wrote a great book that was censored—does that sound familiar? Yes, it’s not too hard to see that the Master is, at least on one level, a stand-in for Bulgakov himself. You might say that the book he was writing represents TMM itself, in which case I guess you could say that Bulgakov wrote a book about himself writing it1. The manuscript-burning episode even has a parallel in Bulgakov’s life: apparently, he set an early manuscript of TMM alight as well.

I forgot to mention, by the way, that there are actually some chapters in Part 1 which do not progress the main story, but instead give us a chapter of the Master’s lost manuscript. So there are actually two stories in parallel: the main story set in Moscow in the 1930s, and the Master’s story about Pontius Pilate, set in Jerusalem in 1 AD. The Master’s story is, as far as I know, a pretty faithful retelling of the same one readers of the gospels will be familiar with; although I couldn’t really say since I haven’t read the gospels myself. But it’s an interesting side-narrative. In my experience books that try this dual narrative technique often suffer from one of the narratives being less interesting than the other. I’m thinking of Ursula Le Guin’s The Dispossessed here; that’s a great book, but I was always more interested in what was happening on Urras than Annares. But TMM is not one of the books that suffers from this; I often found the Biblical sub-narrative more interesting than the main one. This sub-narrative continues into Part 2, as well.

Anyway, the Master had a girlfriend, Margarita, and he didn’t get a chance to tell her that he was being taken away from his home, nor did he opt to disappoint her with the news once he was free to tell her, so when Part 2 begins she has no idea where he is and fears that he has died. Then she encounters a mysterious man in a park, who is actually Azazello, one of Professor Woland’s goons. He’s been entrusted with the task of getting Margarita to carry out a certain errand, which Woland anticipates she will do in return for being reunited with her love, the Master. Azazello isn’t a very good salesman—his pitch isn’t very clear, and I don’t think calling someone a “stupid bitch” ever helps—but when he recite a passage from the Master’s lost book, Margarita realises what he has to offer, and agrees to do whatever he wants.

What follows is the most wonderful section of the book, in my opinion. Azazello gives Margarita a special cream which, when she rubs it on her body, restores her youthful beauty and grants her the powers of a witch. Then she hops on a broomstick (while still completely naked—witches don’t need clothes) and goes flying out of Moscow and far to the east (although not before stopping at the apartment of one of the critics who negatively reviewed the Master’s book and gleefully smashing everything in it). It’s the kind of section that you want to read as slowly as possible, savouring every new sentence since the prose is just such a pleasure to read. In the end she lands somewhere deep in Siberia, where there is no trace of human habitation, and bathes in a river. When she steps onto the bank she is greeted by a band of pipe-playing frogs, dancing water-sprites, curtsying fellow witches and a goat-legged man brings her champagne. Then after a short stay, she is returned to Moscow in a flying car driven by a crow.

In Moscow, Margarita is taken to Woland’s apartment and meets his demon crew. There, she learns what the errand is that she has agreed to doing. It turns out not to be anything especially terrible: all she has to do is host a ball on Walpurgisnacht for the denizens of Hell: murderers and poisoners, free-thinkers and adultresses, pimps and brothel-keepers, and the composer Johann Strauss. Although she finds it somewhat exhausting, she carries it out without trouble and Woland is satisfied. Again we are treated to some wonderful writing as Bulgakov tells us about the fantastic things at the ball, including a swimming pool filled with brandy and a troupe of accordion-playing polar bears. After this point, I probably shouldn’t go into too much detail about what happens, in case you don’t want the ending spoiled. But I will tell you that there is a sort of happy ending. Indeed, one of the interesting things about the book is that Satan is not portrayed as the ruthless trickster you might expect to him; he sticks to his word and gives Margarita what she wants. Even the various people who suffer as a result of his crew’s actions tend to have done something to deserve it first. And as for the heavenly counterparts of these demons, they are nowhere to be seen; well, Jesus appears in the Master’s story but he doesn’t really have much presence in the main narrative.

So, overall, what’s my opinion on the book? Well, it’s an entertaining, enjoyable story, for sure. Russians and Russia enthusiasts, especially, will find a lot that will appeal to them in the book, and perhaps will gain insight from reading it. The book can be appreciated in many different ways: at one time it’s comic, another time tragic, and other times intellectually stimulating. I’m not surprised that TMM is often considered the greatest Russian novel of the 20th century.

But would anyone consider it the greatest Russian novel ever? I don’t think so. You see, while I think TMM is a good book, perhaps even great, I do also feel that it is in a league below Tolstoy and Dostoyevsky’s best works. It may compare to Dostoyevsky’s lesser novels—Devils and The Idiot—but I remember War and Peace, Anna Karenina, Crime and Punishment, and The Brothers Karamazov having a much greater impact on me. Those four books got into my head; they stayed with me; and I feel like I learned something from reading them. But TMM was just a story to me. It was good to read; it entertained me for a while; but it didn’t give me anything substantive to keep with me. Now that it’s ended, I can forget about it. I may be exaggerating a little here—I mean, writing this review is probably going to prevent me from forgetting it for too long. And this is just my own personal reaction to TMM; you may react differently, because you value different things in your reading; and who knows, maybe I myself will have a different reaction to it if I re-read at at some later point in my life, when the things that I value have changed. I think one of the things I missed in TMM was the 19th-century practice of being fully open about what was going on in characters’ heads. They hadn’t invented that rule about showing, rather than telling, back then. In War and Peace, Tolstoy regularly outright declares what his characters are thinking and feeling (although he also backs it up as with the actions that the characters take—I think a failure to do this properly is the real sin being warned against in “show, don’t tell”, rather than telling per se). By doing this Tolstoy is able to express certain nuances and subtleties of thought and feeling which are hard to express in any other way, and it makes the work a lot richer. TMM is not at all like this; things happen to the protagonists and we are left to ourselves to deduce what they are thinking or feeling. I don’t think we are really supposed to know the characters’ feelings and thoughts very well in TMM. It’s not a character-driven story; it’s driven by the spectacle of the story itself. So I guess I would make a little amendment to my statement that TMM is in a league below the great 19th-century novels. It’s not inferior in a simple sense, but it’s a story of a different nature; a kind of story that, perhaps, is less capable of having a great impact on its readers than a character-driven story.

But making these comparisons is kind of like complaining about getting a £4,000,000 salary rather than £5,000,000. It would be better, indeed, quite significantly better to have a £5,000,000 salary, but the £4,000,000 salary is nothing to complain about. In the same way, I would have liked to read a 20th-century rival to War and Peace, but even so, TMM is better than 99% of the books that are out there. If you haven’t read it, I would definitely recommend it. It’s worth reading.

  1. ^ As a student of linguistics, I feel compelled to point out that something interesting is going on with this sentence, grammatically, to do with the limits of the power of reflexive pronouns. To put it in terms that my mathematically-inclined readers will understand, I mean that Bulgakov wrote a book B about the process of Bulgakov writing B. I couldn’t think of any better way to write it.