Monthly Archives: July 2015

Some facts about gender

One of the most interesting phenomena found in languages is gender. In its linguistic sense, gender refers to the phenomenon where nouns are divided into a number of different classes, which can be distinguished because words associated with nouns, like pronouns, determiners, adjectives and verbs, often appear in different forms depending on the gender of the nouns they are associated with (this is called gender agreement). For example, in German the word for ‘the’ is der when it is attached to masculine nouns like Mann ‘man’, die when it is attached to feminine nouns like Frau ‘woman’ and das when it is attached to neuter nouns like Kind ‘child’. The main reason gender is such an interesting phenomenon is probably that it is not at all obvious why it exists. Language is generally thought of as a means of communication, but it is hard to see how gender systems aid communication. Even if there might be some benefits, any explanation has to account for the high prevalence of gender systems in languages worldwide: in the WALS’s sample, 112 out of 257 languages, about 44%, make some kind of distinction between two or more genders.

Something people aren’t always aware of is that English has gender as well. In fact, it has three genders, like German: masculine, feminine and neuter. Admittedly, there are two differences between the gender system of English and the gender systems of languages like French and German which make gender a less prominent phenomenon in English.

Firstly, in English, gender agreement only occurs with pronouns. The words he, she and it are used to refer to males, females and non-gendered things, respectively, and using the wrong pronoun for a given referent is considered grammatically incorrect.1 On the other hand, in French and German gender agreement also occurs with determiners and adjectives, and in written French, as well as in both spoken and written Russian, gender agreement also occurs with verbs. Languages like English where gender agreement only occurs with pronouns are said to have pronominal gender systems. But pronominal gender systems are gender systems nonetheless. Remember that I said above that only 44% of the languages in the WALS’s sample distinguish two or more genders: well, English and other languages with pronominal gender systems are counted among that 44%, so a majority (56%) of the languages in the sample show no gender agreement, not even in pronouns. In fact, pronominal gender systems are quite rare, and it is likely that most of them are the result of not-quite-complete loss of an original, more extensive gender system. This is certainly the case for English. I think it’s quite possible that in the future, English will lose the last vestiges of its gender system as people switch to using they to refer to people without regard to gender in all circumstances, as they already tend to do when the gender of a person is unknown.

Secondly, English gender assignment corresponds almost exactly to the meaning of the referent. Perhaps the only exception is that ships are often referred to as she, but even this is optional. On the other hand, in French and German there are many things which do not have a gender but are classified as masculine or feminine. It’s not the case that speakers of these languages define gender in a different way from English speakers: French people do not actually think that curtains are male and tables are female, even though they say un rideau, not *une rideau and une table, not *un table. And in German, Gardine ‘curtain’ is feminine and Tisch ‘table’ is masculine, so it seems unlikely that the choice of genders for these objects is based on any inherent association of them with masculinity or femininity given that two neighbouring peoples sharing similar cultures have made the assignments in two completely different ways. The French and German masculine and feminine genders just contain a lot of other things besides males and females. In fact, in German, there is an example which goes the other way. Mädchen ‘girl’ should be feminine given the meaning2, but it is actually neuter: Germans say das Mädchen, rather than *die Mädchen.3

There is, however, an important difference between the assignment of Mädchen to the neuter gender and the assignment of Tisch and Gardine to the masculine and feminine genders respectively. The reason Mädchen is neuter is that it is formed from the word Magd ‘maiden’ by adding the diminutive suffix -chen, and there is a rule in German that says that every word formed by adding the suffix -chen is neuter. Many German suffixes are associated with a particular gender; for example, nouns in -heit, -keit and -schaft are always feminine, and nouns in -lein (which is another diminutive suffix) are always neuter. The associations of these suffixes with particular genders constitute a rule which overrides the rule that every word that refers to males is masculine and every word that refers to females is feminine. So, in German gender assignment is determined by (at least) two rules, which are applied in sequence.

  1. If the noun is formed with the suffixes -heit, -keit or -schaft, assign it to the feminine gender, and if the noun is formed with the suffixes -chen or -lein, assign it to the neuter gender. (Note: there are other suffixes which should be mentioned in this rule, as well, but I’m not intending to precisely describe German gender assignment here, just to show you the general outline of how the system works.)
  2. If the noun refers to males, assign it to the masculine gender. If the noun refers to females, assign it to the feminine gender.
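The ordering of the two rules can be sketched in a few lines of Python. This is only an illustration of the outline above, not a description of German: the function name `assign_gender` and the `referent_sex` parameter are made up for the example, the suffix list is the deliberately incomplete one from rule 1, and plain string matching is a crude stand-in for knowing how a word was actually formed (it will misfire on words that merely happen to end in the same letters).

```python
from typing import Optional

# Rule 1 (formal): suffixes mentioned in the post, deliberately incomplete.
FORMAL_RULES = [
    (("heit", "keit", "schaft"), "feminine"),
    (("chen", "lein"), "neuter"),
]


def assign_gender(noun: str, referent_sex: Optional[str] = None) -> Optional[str]:
    """Return a gender, or None when neither rule says anything.

    referent_sex is a hypothetical parameter: "male", "female" or None.
    """
    word = noun.lower()
    # Rule 1 is tried first, so it overrides the semantic rule.
    for suffixes, gender in FORMAL_RULES:
        if word.endswith(suffixes):  # assumption: spelling approximates morphology
            return gender
    # Rule 2 (semantic): only reached if no formal rule applied.
    if referent_sex == "male":
        return "masculine"
    if referent_sex == "female":
        return "feminine"
    return None


print(assign_gender("Mädchen", "female"))  # neuter: rule 1 wins over rule 2
print(assign_gender("Frau", "female"))     # feminine: rule 2
print(assign_gender("Mann", "male"))       # masculine: rule 2
print(assign_gender("Tisch"))              # None: these two rules are not enough
```

The `None` result for Tisch is the point of the next question: two rules like these leave most nouns unassigned.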

Note that the first rule is formal in nature (it refers to how the words are formed) while the second rule is semantic in nature (it refers to what the words mean). The existence of formal rules is responsible for another way in which English gender assignment is different from French and German assignment. In English, gender is a property of referents, not of the nouns themselves. But in French and German, gender is more a property of the nouns themselves. In these languages, it is possible for the same thing to be referred to by two different nouns of different genders. For example, in French, the word vélo is masculine and the word bicyclette is feminine; this is because bicyclette ends in the feminine diminutive suffix -ette.

These rules are not sufficient to assign every German noun to a gender. Of course, this is not meant to be a complete list. But there is an interesting question here: can a list of rules based on formal and semantic factors account for the gender assignment of every noun in German? Or are there some words whose gender assignment is simply arbitrary?

If you have any knowledge of German, you might find it hard to believe that gender assignment is not mostly arbitrary. People who know French would probably expect gender assignment in that language to be mostly arbitrary as well. However, quite a few studies have been carried out which have shown that gender in French is mostly predictable via phonological rules: that is, rules that take into account the sound of the word. For example, Tucker, Lambert & Rigault (1977) found that 94% of French nouns ending in the sound /ʒ/ (such as ménage ‘housekeeping’) are masculine. There are exceptions: orange ‘orange’ is feminine. But by using rules like this, Tucker, Lambert & Rigault were able to correctly determine the genders of 85% of the nouns in the Petit Larousse, a famous French dictionary. Since they did not take into account semantic (i.e. relating to the meanings of words) and morphological (i.e. relating to the composition of words from prefixes, suffixes, etc.) factors, and the phonological rules they found could probably be made more accurate, it is quite likely that the vast majority of French nouns are predictably gendered. Given this surprising result, it is possible that a lot more of German gender assignment is predictable than you might think. Köpcke & Zubin (1984) were able to find a large amount of regularity using similar techniques, although the rules appear to be more complex than the rules in French. Of course, the more complex the assignment rules are, the less useful they are for prediction, because it is difficult for learners to remember them all and apply them quickly. There is no bright line between gender being hard to predict and gender being completely unpredictable, since if you just have one rule for every word in the language saying “this word is masculine / feminine / neuter”, then that is still a set of rules.
The conclusion I would draw from results like those of Tucker, Lambert & Rigault is that French and German gender is more predictable than you might think, even though it is often not fully predictable in practice.
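To make concrete what it means for a phonological rule to “correctly determine” gender, here is a toy sketch of how such a rule could be scored. It is not the procedure Tucker, Lambert & Rigault used; the sample contains only the two nouns mentioned above, the IPA transcriptions are my own assumption, and the names `predict` and `accuracy` are invented for the example.

```python
from typing import Optional

# Toy sample: (spelling, assumed IPA transcription, actual gender).
SAMPLE = [
    ("ménage", "menaʒ", "masculine"),
    ("orange", "ɔʁɑ̃ʒ", "feminine"),
]


def predict(ipa: str) -> Optional[str]:
    """The single rule from the text: nouns ending in /ʒ/ are masculine."""
    return "masculine" if ipa.endswith("ʒ") else None


def accuracy(sample) -> float:
    """Share of correct predictions among the nouns the rule covers."""
    covered = [(predict(ipa), gender) for _, ipa, gender in sample
               if predict(ipa) is not None]
    if not covered:
        return 0.0
    return sum(guess == gender for guess, gender in covered) / len(covered)


print(accuracy(SAMPLE))  # 0.5 on this two-word sample
```

On a sample this small the figure is meaningless, of course; the 94% quoted above comes from scoring the same kind of rule against a whole dictionary.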

Few gender systems have been studied as much as the French and German gender systems have, so it is possible that we might find languages that have significantly more unpredictable gender assignment rules. But it would be surprising, since predictable assignment rules are a lot more convenient for learners. I think it’s more likely that in all languages that distinguish different genders, gender assignment is to a large degree predictable.

Another interesting example of a language with apparently unpredictable gender assignment is Ojibwa. In Ojibwa there are two genders which are called the animate and inanimate genders. These genders have nothing to do with sex (the word gender comes from the French word genre, which just means type; it can refer to any kind of distinction between nouns that is reflected in agreement). Nouns that denote people, animals, trees or supernatural beings are always animate. Most other nouns are inanimate. But there is a fairly large group of nouns that seem like they should be in the inanimate gender, but are actually animate. These include ekoːn ‘snow’, enank ‘star’, esseːmaː ‘tobacco’, mentaːmin ‘maize’, meskomin ‘raspberry’ and ekkikk ‘kettle’ (Bloomfield 1957). Now, some of these might be explainable as resulting from differences in which things are considered to possess a gender. For example, it is very common, cross-linguistically and cross-culturally, for celestial bodies such as stars to be identified with supernatural beings, who have a gender. And given that trees are considered animate in Ojibwa, it’s possible that other plants like tobacco and maize might be considered animate as well. But other examples like ekoːn ‘snow’ being animate are harder to explain. There is no generally-accepted explanation for the composition of the Ojibwa animate gender, but Black-Rogers (1982) has an interesting one. According to Black-Rogers, the Ojibwa lack a clear distinction between natural and supernatural abilities. They believe that even fairly mundane activities like beadwork are only possible because of powers that have been granted to humans via supernatural means. Inanimate objects, in particular, may be sources of power. Different speakers may disagree as to which objects have power, and power may be considered to come from different sources at different times. 
Black-Rogers proposed that when an object is considered to be a source of power speakers start assigning it to the animate gender. She was able to explain many of the problematic animate nouns by this means. For those she wasn’t able to explain, she suggests that objects assigned to the animate gender tend to stay there, so there may be animate nouns which refer to objects that were formerly considered to be sources of power, but no longer are today. So Ojibwa gender assignment is to some extent arbitrary from a synchronic perspective, but in diachronic perspective it can be completely explained by semantic factors. Black-Rogers’s explanation may or may not be true, but I brought up the example to show you that highly unpredictable gender assignments can also be influenced mainly by semantic factors, rather than by formal factors as in the case of French and German.

Another interesting question about gender is whether there are any languages where gender assignment is determined entirely by formal factors, so that semantic factors are irrelevant. Now, it’s true that in many languages formal rules can almost entirely predict gender assignment. In Hausa, for example, there is a very simple rule: nouns ending in -aa are feminine, and all other nouns are masculine. There are some exceptions to the rule, but they are few in number. However, semantic factors are not irrelevant. If they were, then we would not expect nouns referring to males to be masculine and nouns referring to females to be feminine, because there is no reason why nouns referring to females should end in -aa but other nouns should not. In fact, the vast majority of nouns referring to females end in -aa and are feminine. Historically, the correlation between the feminine gender and the -aa suffix did not exist, or was less strong. What happened was that a suffix -nyàa was used to form nouns denoting females, and this resulted in -aa being associated with nouns denoting females, so that all such nouns ended up having the suffix -aa added to them.

Hausa gender, then, is not determined only by formal factors, and in fact it seems that there are no languages where gender is determined only by formal factors. In general, gender distinctions seem to always be fundamentally based on semantic rules of the form “all nouns with meanings of type A are assigned to gender X” (so the gender X contains all nouns of type A, but not necessarily only nouns of type A). These rules give each gender an initial set of nouns called its “semantic core”. Then semantic associations and formal rules are sufficient to assign the vast majority of remaining nouns to one of the genders, and they may shift nouns within the semantic core of one gender to a different gender as well.

So, I’ve talked a bit about gender assignment and the interaction between formal and semantic factors here, but there are lots more interesting things to talk about with respect to gender in languages, such as: what kind of distinctions tend to be drawn? Male vs. female and animate vs. inanimate are very common; are there any others? What can borrowings tell us about gender assignment? What can we say about nouns which appear to have characteristics of multiple genders (like German Mädchen)? How do gender systems develop and change over time? How are gender systems acquired by language learners? If you’re interested, I recommend Gender (1991) by Greville Corbett. This post is based on the first few chapters of that book.


  1. ^ There is a known phenomenon where English speakers sometimes refer to things that would normally be referred to by it by he or she instead; for example a teenage boy told a surfer, referring to a wave: “Catch her at her height!” (Corbett 1991). But this occurs only in particular circumstances; it is clearly the usual pronoun used to refer to these things.
  2. ^ It is common, cross-linguistically and cross-culturally, for children to be non-gendered, and indeed Kind ‘child’ is neuter; however, Junge ‘boy’, and its older synonym Knabe are both masculine, so we would expect Mädchen to be feminine in parallel.
  3. ^ However, in colloquial German pronouns often agree with Mädchen as if it was feminine: Kennst du das Mädchen? Nein, ich kenne sie nicht, not … kenne es nicht.


Black-Rogers, Mary B. 1982. Algonquian Gender Revisited: Animate Nouns and Ojibwa ‘Power’ – an Impasse? Papers in Linguistics 15. 59-76.

Bloomfield, Leonard. 1957. Eastern Ojibwa: Grammatical Sketch, Texts and Word List. Ann Arbor: University of Michigan Press.

Corbett, G. G. 1991. Gender. Cambridge: Cambridge University Press.

Köpcke, K. M. & Zubin, D. 1984. Sechs Prinzipien für die Genuszuweisung im Deutschen: Ein Beitrag zur natürlichen Klassifikation. Linguistische Berichte 93: 26-50.

Tucker, G. R., Lambert, W. E. & Rigault, A. A. 1977. The French speaker’s skill with grammatical gender: An example of rule-governed behavior. The Hague: Mouton de Gruyter.


Etymologies of some English kinship terms

Abbreviations: PIE (Proto-Indo-European), PGmc (Proto-Germanic), OE (Old English), ME (Middle English), NE (New English, i.e. modern English).

From OE fæder. The sequence /-dər/ regularly became /-ðər/ after stressed vowels in early ME, which is why we have father rather than fader. The development of the stressed vowel, however, is irregular. The expected development would be into /aː/ by ME open syllable lengthening and then into /ej/ (as in face) by the Great Vowel Shift. However, in this word only the first change seems to have occurred, so that the stressed vowel in father is the same as the one in palm and spa, rather than the one in face. In British English, the words rather (< OE hraþor) and lather (< OE lēaþor) have similarly resisted the Great Vowel Shift, although in American English the vowel in these words has been shortened, as it has in all dialects in gather (< OE gadorian) (slather has /a/, but is of unknown origin); that is, these words seem to have resisted the ME open syllable lengthening in the first place. Of course, we can’t rule out the alternative possibility that they were lengthened and then subsequently shortened, perhaps as part of the same round of shortenings that resulted in short vowels in words like bread and blood. I don’t actually know of any words in -ather where both of the expected changes have taken place, so that the word is pronounced with /-ejðər/.

As for OE fæder, this is a regular development from PGmc *fadēr. And *fadēr, in turn, is a regular development from PIE *ph2tḗr.

From OE mōdor. See above on the change of -d- to -th-. The development of the stressed vowel is again irregular. The expected development would be for it to remain as /oː/ in ME and then to develop into /uː/ by the Great Vowel Shift. In this word, however, we have /ʌ/, the usual outcome of ME short /u/. The same outcome exists in brother (< OE brōþor), other (< OE ōþer) and smother (< OE smorian, with -th- inserted perhaps due to influence from the agentive form of the verb, *smorþor ‘suffocator’). There are a couple of words (blood and flood) where ME /oː/ became /uː/ by the Great Vowel Shift, but was subsequently shortened and changed into /ʌ/. However, this change was irregular; for example, it didn’t occur in food. Perhaps the same shortening occurred in mother, brother, other and smother after the Great Vowel Shift. I can’t think of any words in -other which aren’t pronounced with /-ʌðər/, so perhaps the shortening was regular in this environment.

As for OE mōdor, this is from PGmc *mōdēr. The changes of unstressed vowels from PGmc to OE are very complicated and I don’t understand them very well, so I don’t know why *mōdēr became mōdor rather than mōder. Perhaps the preceding heavy syllable had something to do with it? PGmc *brōþēr and *duhtēr, with a heavy initial syllable, became brōþor and dōhtor respectively, while PGmc *swestēr, also with a heavy initial syllable, had variants in both -er and -or, but PGmc *fadēr, with a light initial syllable, became fæder. The vowel -o- was regularly inserted in OE before postconsonantal r at the end of a word (c.f. OE wundor ‘wonder’ < PGmc *wundrą), so if the -e- in *mōder was dropped due to the preceding syllable being heavy, that would explain it.

PGmc *mōdēr, in turn, is from PIE *máh2tēr. The expected development of this word would be *mōþēr, but the accent appears to have been shifted to the suffix at some point, perhaps due to analogy with *ph2tḗr ‘father’ and *dʰugh2tḗr ‘daughter’, so that -d- occurs by Verner’s Law. Sanskrit mātā́ ‘mother’ shows accentuation on the suffix as well; it is on the basis of Ancient Greek mḗtēr ‘mother’ that accent on the initial syllable is reconstructed.

From OE brōþor. See above on the pronunciation of OE ō as /ʌ/ in NE. OE brōþor is from PGmc *brōþēr. See above on the change of unstressed ē into o. And PGmc *brōþēr is a regular development from PIE *bʰráh2tēr.

From OE sweostor. Variants of the OE word in -er and in swi-, swy- or swu- are also attested. The quality of the final unstressed vowel would not make any difference to the NE reflex, because all OE unstressed vowels merged into /ə/ in ME. However, variants in sweo- would have ended up as NE swester (rhymes with fester), and variants in swu- would have ended up as NE suster (rhymes with muster). So the modern form of the word must originate from a variant in swi- or swy-.

The PGmc form of the word was *swestēr, and the regular development of this in OE would have been *swester, or sweostor if the -e- was dropped due to the preceding syllable, as conjectured above. The other variants probably resulted from a combination of influence of the -w- on the following vowel (c.f. OE wudu ‘wood’ from PGmc *widuz) and influence of the Old Norse form of the word, systir.

PGmc *swestēr is from PIE *swésōr. The expected development of PIE *swésōr would be *swesōr (which would become OE *sweosor and NE *sweaser /swiːzər/), but presumably the word was influenced in PGmc by *bʰráh2tēr ‘brother’, *ph2tḗr ‘father’, *máh2tēr ‘mother’ and *dʰugh2tḗr ‘daughter’, which all ended in *-tḗr or *-tēr in PIE.

A regular development from OE sunu. ME open syllable lengthening did sometimes affect short /i/ and /u/ (c.f. week < OE wicu) but not usually. OE sunu, in turn, is a regular development from PGmc *sunuz. PGmc *sunuz is from PIE *suHnús. The expected development of PIE *suHnús would be *sūnus (which would become OE *sūns and NE *sunse /sʌns/). Apparently the accent was retracted to the initial syllable, so that the *-s became *-z by Verner’s Law, and the long vowel was shortened. The shortening of the long vowel also occurs in Italic and Celtic (those branches give no evidence with regard to the accent). Ringe (2006) attributes the change to “morphological resegmentations or reanalyses which yielded roots without a final laryngeal (or its reflex)”, which isn’t very enlightening. The reconstruction with the long vowel and final-syllable accent is based on Sanskrit sūnú.

A regular development from OE dōhtor, via ME doughter /dowxtər/; although ME /ow/ normally became NE /ow/ (as in boat), it regularly became /ɔː/ (as in thought) before /x/. OE dōhtor is from PGmc *duhtēr. For some reason, the vowel was lowered to o and lengthened. The lowering apparently also occurred in every other Germanic language, and the lengthening happened in Old Norse as well (c.f. Gothic dauhtar /dɔxtar/, Old Norse dóttir, Old High German tohtar). But I have no idea why either change occurred. As for the development of the unstressed vowel from ē to o, see above. PGmc *duhtēr is from PIE *dʰugh2tḗr. The expected development of PIE *dʰugh2tḗr in PGmc would be either *dukþēr or *dukuþēr, depending on whether you believe that interconsonantal laryngeals in non-initial syllables developed into zero or *u in PGmc (opinions differ). The form we actually see is the result of interference from the oblique stem of the word, *dʰuktr̥-ˊ, which became PGmc *duhtur-.

From Old French oncle ‘uncle’. Old English had two words for ‘uncle’: fædera ‘father’s brother’ and ēam ‘mother’s brother’. The regular developments of these words into NE would have been fathere (rhymes with gather, due to trisyllabic laxing) and eam (rhymes with beam), respectively.

From Old French ante ‘aunt’. In southern England, Australia (presumably South Africa and New Zealand as well?), New England and Virginia the word is pronounced with /aː/, while in other areas of the world it is pronounced with /a/. I don’t know why this is the case. I also don’t know why it is spelled with au-.

Old English had two words for ‘aunt’: faþu ‘father’s sister’ and modrige ‘mother’s sister’. The regular developments of these words into NE would have been fathe (rhymes with lathe) and mothery (pronounced in the same way as the derived word meaning ‘like a mother’), respectively.

From Old French neveu ‘nephew, grandson’. The word was originally spelt nevew, and pronounced accordingly. The origin of the spelling with -ph- is kind of a mystery, but perhaps it was due to influence from Latin nepōs ‘nephew, grandson’. A spelling pronunciation with /f/ subsequently emerged and became predominant in American English. The pronunciation with /f/ is now usual in British English as well, although some old-fashioned speakers still pronounce it with /v/.

The native word, neve ‘nephew, grandson’ (rhymes with Eve), survived into ME but is now obsolete. This word is a regular development of OE nefa. OE nefa is a regular development of PGmc *nefô. PGmc *nefô is from PIE *népōts. The expected development of PIE *népōts would be *nefōþs (probably, although I’m not aware of any final *-ts clusters which survived into PGmc), but the noun came to be declined in the same way as *gumô ‘man’, and the ending in the nom. sg. was changed accordingly.

From Old French nece ‘niece, granddaughter’. The native word, nift ‘niece, granddaughter’ (rhymes with lift), survived into ME but is now obsolete. This word is a regular development of OE nift. OE nift is a regular development of PGmc *niftiz. There does not appear to be a securely reconstructable feminine counterpart of PIE *népōts, although the PGmc form would go back to a PIE form *néptis, and the same form would give Latin neptis.

The origins of the songs on Joan Baez’s first album

Most of this information is taken from the Traditional Ballad Index (TBI). In particular the dates of earliest recordings, and the lists of regions where each song has been recorded, are taken from the TBI and may not be as early or as complete as they could be.

Silver Dagger
A traditional American folk ballad recorded from Appalachia, the Rocky Mountains, the Midwest, Southeast and South-Central United States, and the Canadian Maritimes, with the earliest date of recording being 1866. There is another traditional folk ballad with similar lyrics called “Drowsy Sleeper” which has been recorded from Appalachia, the Mid-Atlantic, Midwest, Southeast and South-Central United States, New England, the Canadian Maritimes, Newfoundland and Scotland, with the earliest date of recording being 1830. Hence the ballad may ultimately have a Scottish origin.
East Virginia
A traditional American folk ballad recorded from Appalachia and the Southeast and Southwest United States, with the earliest date of recording being 1917.
Fare Thee Well (10,000 Miles)
A traditional English, Scottish and American folk ballad. There is a confusing variety of songs on this same theme; the closest one listed in the Traditional Ballad Index seems to be “Fare Thee Well, My Own True Love”, which has been recorded from Appalachia, the Midwest, Southeast, South-Central and Southwest United States, Southwest England and Aberdeenshire, with the earliest date of recording being 1867. However, the Index identifies this song by the inclusion of the line “Who will shoe your pretty little foot?”, which is actually not included in Joan Baez’s version. The song must be older, because the last stanza of Robert Burns’ “A Red, Red Rose” (1794) is clearly derived from the lyrics of this song. According to Lesley Nelson it was included in the Book of Roxburghe Ballads and dated to 1710 (the book was published in 1847, but the ballads were collected much earlier).
House of the Rising Sun
A traditional American folk ballad recorded from the South-Central United States, with the earliest date of recording being 1933.
All My Trials
A traditional American folk ballad recorded from the Southeast United States, with the earliest date of recording being 1961 according to the TBI. This is the date of the Pete Seeger recording, but Joan Baez had already released this song in 1960. It seems to have been picked up by the folk revival without having been recorded in any collections made earlier. A song called “The Tallest Tree in Paradise” recorded in 1954 has some similar lyrics and some completely different lyrics. The TBI mentions that a verse including the lines “If life were merchandise that money could buy / The rich would live and the poor would die” was found on a gravestone in Tysoe (Warwickshire) in 1798.
Wildwood Flower
A traditional American folk ballad recorded from Appalachia and the Southeast and South-Central United States, with the earliest date of recording being 1928. This is the date of the Carter Family recording. The origin of this song has been traced to a song called “I’ll Twine ’Mid the Ringlets” published by the composer Joseph Philbrick Webster in 1860 with lyrics by Maud Irving. Maud Irving seems to have been a pseudonym used by a spiritualist poet called J. William Van Namee. Over time, as the song was passed down through the oral tradition, the nonsensical lines heard in the Carter Family version (“I’ll twine with my mingles”) must have evolved through mishearing; the song is thus a good illustration of the effect of the folk process.
Donna Donna
One of the two non-traditional songs on Joan Baez’s first album. It was written for the Aaron Zeitlin Yiddish-language play Esterke (1940-1941). The music was composed by Sholom Secunda.
John Riley
A traditional Scottish and American folk ballad recorded from Appalachia, the Mid-Atlantic, Midwest and Southeast United States and Aberdeenshire, with the earliest date of recording being 1845. But the theme of a lover who is unrecognised by his love after a long journey away at sea is an old one—it goes right back to the Odyssey.
Rake and Rambling Boy
A traditional English, Scottish, Irish and American folk ballad recorded from Appalachia, the Southeast, South-Central and Southwest United States, Ontario, Southwest and Southeast England, as well as East Anglia, Scotland and Ireland. The TBI gives “before 1830” as the earliest date of recording.
Little Moses
A traditional American folk ballad recorded from Appalachia and the South-Central United States, with the earliest date of recording being 1905. Of course, the story is much older, having come from the Bible.
Mary Hamilton
A traditional Scottish and American folk ballad recorded from the Scottish Lowlands, Appalachia, the Midwest, Southeast, South-Central and Southwest United States, New England and the Canadian Maritimes, with the earliest date of recording being 1790. The “four Marys” mentioned in the last stanza may be the historical “four Marys” who were ladies-in-waiting to Mary, Queen of Scots. However, none of the four Marys had the surname Hamilton, and there are alternative theories as to the historical events the song is connected to. It is possible that multiple events have contributed to the song, and much of the story could be completely made up.
Henry Martin
A traditional English, Welsh, Scottish and American folk ballad recorded across England and Wales and in Aberdeenshire, Appalachia, the Midwest, Northeast, Southeast, South-Central and Southwest United States, the Canadian Maritimes and Newfoundland. The TBI gives “before 1825” as the earliest date of recording.
El Preso Numero Nueve
The second of the two non-traditional songs on Joan Baez’s first album. It was written and composed by the Mexican singer-songwriter Roberto Cantoral and recorded by him with Antonio Cantoral as part of an act called the Hermanos Cantoral (that is, Cantoral Brothers, in Spanish). The Hermanos Cantoral were active from 1950 to 1954; I don’t know exactly when the song was written or first recorded.

The phonetic motivation for Grimm’s Law

…is not as clear as I had thought.

According to the standard reconstruction of Proto-Indo-European (PIE), the language had three series of stops. One of the series is thought to have consisted of voiceless unaspirated stops: *p, *t, *ḱ, *k, and *kʷ. Another is thought to have consisted of voiced unaspirated stops: *b, *d, *ǵ, *g and *gʷ. And the other is thought to have consisted of voiced aspirated stops: *bʰ, *dʰ, *ǵʰ, *gʰ and *gʷʰ. These series were preserved in this form in Sanskrit, although Sanskrit also innovated a fourth series of voiceless aspirated stops out of clusters consisting of voiceless stops followed by laryngeals. In Proto-Germanic, however, the situation is different. The PIE voiceless unaspirated stops have become voiceless fricatives; c.f. Proto-Germanic *þū (> English thou) and Sanskrit tvám ‘you (singular)’. The PIE voiced unaspirated stops have become voiceless unaspirated stops; c.f. Proto-Germanic *twō (> English two) and Sanskrit dvā́ ‘two’. And the PIE voiced aspirated stops have become voiced unaspirated stops; c.f. Proto-Germanic *meduz (> English mead) and Sanskrit mádhu ‘honey’.

I had always assumed that the change went something like this. First, the voiceless unaspirated stops fricativised, retaining their lack of voice and aspiration and becoming voiceless fricatives. Changes of stops into fricatives are common and unremarkable; phonologists disagree on whether this is due to a natural tendency towards lenition (weakening) or due to assimilation to neighbouring phonemes which are more sonorous, but there is no dispute that such a change can be phonetically motivated. The change can be written formally in terms of distinctive features as follows.

[-continuant, -voice] > [+continuant]

Second, the voiced unaspirated stops devoiced, retaining their lack of frication and aspiration and becoming voiceless unaspirated stops. This change would be unusual if it occurred on its own. However, the previous change had left the language with no voiceless unaspirated stops, only voiced unaspirated stops and voiced aspirated stops. The [±voice] feature which had been used to distinguish the three original series was now redundant. For obstruents the unmarked value of this feature is [-voice] (voicelessness); that is, obstruents tend to be voiceless unless something forces them to be voiced. Therefore, it was natural for the voiced unaspirated stops to be devoiced. The change can be written formally in terms of distinctive features as follows.

[-continuant, +voice, -spread glottis] > [-voice]

Third, the voiced aspirated stops deaspirated, retaining their voicing and lack of frication and becoming voiced unaspirated stops. This change was made more likely by the fact that the previous change left the language with two series of stops, one of which was voiceless unaspirated and one of which was voiced aspirated; the two features [±voice] and [±spread glottis] were therefore redundant against each other. [-spread glottis] is the unmarked value of the [±spread glottis] feature on stops, so it was natural to resolve this by deaspirating the voiced aspirated stops (although devoicing the voiced aspirated stops would have worked just as well). The change can be written formally in terms of distinctive features as follows.

[-continuant, +spread glottis] > [-spread glottis]
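
As a rough illustration of the three changes above, here is a minimal Python sketch that treats each stop series as a bundle of binary features and applies the rules once each, in the order given. The names (`PIE_SERIES`, `RULES`, `apply_in_order`, `describe`) are made up for this illustration, and only the dental series is used as a stand-in for all places of articulation.

```python
# Feature bundles for the three PIE stop series (dentals stand in for every
# place of articulation). The names here are illustrative, not standard.
PIE_SERIES = {
    "*t":  {"continuant": False, "voice": False, "spread_glottis": False},
    "*d":  {"continuant": False, "voice": True,  "spread_glottis": False},
    "*dʰ": {"continuant": False, "voice": True,  "spread_glottis": True},
}

# Each rule is (features a segment must have, features that get rewritten),
# mirroring the three formal statements in the text.
RULES = [
    ({"continuant": False, "voice": False}, {"continuant": True}),
    ({"continuant": False, "voice": True, "spread_glottis": False}, {"voice": False}),
    ({"continuant": False, "spread_glottis": True}, {"spread_glottis": False}),
]


def apply_rule(segment, rule):
    """Return a changed copy of the segment if it matches the rule's condition."""
    condition, change = rule
    if all(segment[feature] == value for feature, value in condition.items()):
        return {**segment, **change}
    return dict(segment)


def apply_in_order(segment, rules):
    """Apply every rule exactly once, in the order listed (no re-application)."""
    for rule in rules:
        segment = apply_rule(segment, rule)
    return segment


def describe(segment):
    """Turn a feature bundle back into a prose label."""
    voicing = "voiced" if segment["voice"] else "voiceless"
    aspiration = "aspirated" if segment["spread_glottis"] else "unaspirated"
    manner = "fricative" if segment["continuant"] else "stop"
    return f"{voicing} {aspiration} {manner}"


for name, features in PIE_SERIES.items():
    print(name, "->", describe(apply_in_order(features, RULES)))
```

Each rule is applied only once, so the stops created by the second rule are not fed back into the first one. That ordering is what keeps the three series distinct in this sketch.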

But there are two questions I have about this account.

  1. Why did the second change involve the voiced unaspirated stops devoicing, rather than the voiced aspirated ones? The redundancy of the [±voice] feature could have been resolved either way. In fact, why didn’t both kinds of stop devoice? Since [-voice] is unmarked for obstruents there is nothing stopping this from happening.
  2. Why did the third change involve the voiced aspirated stops deaspirating rather than devoicing? Since the [±voice] and [±spread glottis] features were redundant against each other devoicing would have worked just as well as a means of resolving the redundancy.

Now, sound change is not a deterministic process, so perhaps the answers to these questions are just that out of all of the different ways the redundancies in question could be resolved, these were the ways that were chosen, more or less at random. I am satisfied with this as the answer to question 2. In fact, with respect to question 2 it seems like deaspiration would be a more likely occurrence than devoicing because it is much more common for languages to distinguish stops using the [±voice] feature than it is for them to distinguish stops using the [±spread glottis] feature; contrasts of voice are therefore probably favoured over contrasts of aspiration (although this is only a tendency, and there are plenty of languages like Mandarin Chinese where [±spread glottis] is distinctive but [±voice] is not).

But I am less satisfied with this as an answer to question 1. As I mentioned above, the redundancy of the [±voice] feature could have been solved in three different ways:

  1. devoicing of the voiced unaspirated stops, resulting in a contrast between voiceless unaspirated stops and voiced aspirated stops.
  2. devoicing of the voiced aspirated stops, resulting in a contrast between voiced unaspirated stops and voiceless aspirated stops.
  3. devoicing of both kinds of stop, resulting in a contrast between voiceless unaspirated stops and voiceless aspirated stops.

There are languages with a contrast between voiced unaspirated stops and voiceless aspirated stops, as would result from option 2. English is such a language. There are also languages with a contrast between voiceless unaspirated stops and voiceless aspirated stops, as would result from option 3. Mandarin Chinese is such a language. But I know of no language which has a contrast between voiceless unaspirated stops and voiced aspirated stops, as would result from option 1. Yet option 1 seems to have been the option that was taken. This is odd.
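
The three options can be laid out mechanically. The following Python sketch starts from the two stop series left after fricativisation and devoices one or both of them; the names (`REMAINING`, `OPTIONS`, `contrast`) are hypothetical, and the example languages in `ATTESTED` are just the ones mentioned above, not the result of a typological survey.

```python
# The two stop series left after the first change, as (voice, spread glottis).
REMAINING = {"D": (True, False), "Dh": (True, True)}

# Which series each option devoices (numbering follows the list in the text).
OPTIONS = {1: {"D"}, 2: {"Dh"}, 3: {"D", "Dh"}}

# Example languages named in the text; None means none is known to the author.
ATTESTED = {1: None, 2: "English", 3: "Mandarin Chinese"}


def label(voice, spread_glottis):
    """Prose label for a stop with the given feature values."""
    voicing = "voiced" if voice else "voiceless"
    aspiration = "aspirated" if spread_glottis else "unaspirated"
    return f"{voicing} {aspiration}"


def contrast(option):
    """Return the two stop types that would contrast after the given option."""
    targets = OPTIONS[option]
    result = []
    for name, (voice, spread_glottis) in REMAINING.items():
        if name in targets:
            voice = False  # devoicing changes nothing except [voice]
        result.append(label(voice, spread_glottis))
    return sorted(result)


for option in OPTIONS:
    example = ATTESTED[option] or "no example known"
    print(option, " vs. ".join(contrast(option)), f"({example})")
```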

I think there are phonetic reasons why we would expect options 2 or 3 to be favoured over option 1. If you examine the articulatory mechanisms which are used to produce voiced aspirated stops, you can see them as half-voiced stops, closer to voiceless stops than voiced unaspirated stops (but still voiced). If you think about voiced aspirated stops in this way, option 1 is weird, because it involves change of the voiced unaspirated (i.e. fully voiced) stops directly into voiceless unaspirated stops without passing through the intermediate stage where they would be voiced aspirated (i.e. half-voiced) and end up merging with the voiced aspirated stops. If the characterisation of voiced aspirated stops as half-voiced already makes sense to you, you can skip the next few paragraphs, because I’m now going to try and explain why this is an accurate characterisation.

The first thing that I want to explain is what voiced aspirated stops are. In terms of distinctive features, they are parallel to voiceless aspirated stops. Voiced aspirated stops are [+voice] and [+spread glottis], voiceless aspirated stops are [-voice] and [+spread glottis]. But the meaning of [+spread glottis] is different in the two cases. As a feature of voiceless stops, [+spread glottis] corresponds to increased duration of the period during which the vocal folds are prevented from vibrating (normally by keeping the vocal folds apart from each other, hence the name of the feature, although reducing the airflow is also an option). The time between the release of a stop and the beginning of vocal fold vibration in order to voice the following voiced phoneme is called the voice onset time (VOT). For voiceless unaspirated stops, the VOT is close to 0, while for voiceless aspirated stops the VOT is larger, so that there is an audible period after the stop has been released where air flows through the glottis but the vocal folds do not vibrate. This results in a sound being produced during this period which is in fact exactly [h], the voiceless glottal continuant (although speakers of languages which have aspirated stops don’t usually perceive the [h], instead perceiving it as part of the preceding stop).

During the production of voiced stops, the vocal folds are already vibrating (that’s what it means for a stop to be voiced). So it is impossible for voiced stops to be aspirated if aspiration is defined as having a positive VOT. Instead, [+spread glottis] as a feature of voiced stops corresponds to the vocal folds being held further apart than is normal for voiced stops, roughly speaking. The vocal folds are still close enough that they vibrate during the production of voiced aspirated stops, so such stops are not completely voiceless, but they are closer to voiceless than voiced unaspirated stops. The kind of voice that accompanies voiced aspirated stops is called breathy voice, as opposed to the modal voice that accompanies voiced unaspirated stops. It might help to look at the following diagram, which illustrates the relationship between the degree of closure of the glottis and different kinds of voicing. The diagram is adapted from Gordon & Ladefoged (2001).

Voiceless sounds have the least glottal closure. The glottal stop has the most glottal closure (complete closure). Modally-voiced sounds have a degree of glottal closure midway between these two extremes. Breathy-voiced sounds have a degree of glottal closure between that of voiceless sounds and voiced sounds. Creaky-voiced sounds have a degree of glottal closure between that of voiced sounds and the glottal stop.
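
The ordering just described can be written down as a simple ordinal scale. This is only a sketch: the list `PHONATION_SCALE` and the helpers `is_between` and `steps_apart` are illustrative names, and the scale records order only, not measured glottal apertures.

```python
# Phonation types ordered from least to most glottal closure, following the
# description in the text. Only the ordering matters, not the distances.
PHONATION_SCALE = [
    "voiceless",
    "breathy voice",
    "modal voice",
    "creaky voice",
    "glottal stop",
]


def closure_rank(phonation):
    """Position of a phonation type on the scale (0 = least closure)."""
    return PHONATION_SCALE.index(phonation)


def is_between(candidate, first, second):
    """True if candidate lies strictly between the two other types."""
    low, high = sorted((closure_rank(first), closure_rank(second)))
    return low < closure_rank(candidate) < high


def steps_apart(first, second):
    """Number of steps along the scale separating two phonation types."""
    return abs(closure_rank(first) - closure_rank(second))


# Breathy voice (voiced aspirated stops) sits between voiceless and modal
# voice, so going from modal voice to voiceless means crossing it.
print(is_between("breathy voice", "voiceless", "modal voice"))
print(steps_apart("modal voice", "voiceless"), steps_apart("breathy voice", "voiceless"))
```

On this scale a modally voiced stop is two steps from voiceless while a breathy-voiced stop is only one, which is the sense in which voiced aspirated stops count as "half-voiced" in the argument above.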

(I should note that talking about the degree of closure of the glottis as if this was a scalar variable is an oversimplification. When the vocal folds vibrate, what happens is that the glottis alternates between a state where it is more or less fully open (as when a voiceless sound is being produced) and a state where it is more or less fully closed (as when a glottal stop is being produced). Closure occurs due to tension from the laryngeal muscles and opening occurs due to pressure from the flow of air through the trachea; closure results in buildup of air below the glottis, resulting in increased pressure, while opening allows air to flow through a greater area, resulting in decreased pressure, and this is why the alternation occurs. For a given rate of flow of air, there is a maximal tension above which opening cannot occur and a minimal tension below which closure cannot occur, and in between these two extremes there is an optimal tension which results in maximal vibration; this tension is approached during the production of modally-voiced sounds. If the tension is below the optimal tension but above the minimal tension, the result is a breathy-voiced sound. If the tension is above the optimal tension but below the maximal tension, the result is a creaky-voiced sound. Alternatively, creaky-voiced sounds can be produced by having the glottis completely closed at one end, with modal voice at the other end, and breathy-voiced sounds can be produced by having the glottis open so that the vocal folds do not vibrate at one end, with modal voice at the other end. But regardless of how these sounds are produced, they sound the same, so the distinction is not important. Either way, it is still accurate to say that breathy-voiced sounds are in a position between voiceless sounds and modally-voiced sounds.)
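
The tension-based picture in the note above can be sketched as a small classifier. Everything numeric here is an assumption made for illustration: the function name `phonation_type`, the `tolerance` window that counts as "close to optimal", and the example tension values are all invented, and real vocal fold behaviour depends on more than a single tension parameter.

```python
def phonation_type(tension, minimal, optimal, maximal, tolerance=0.1):
    """Classify phonation from laryngeal tension at a fixed rate of airflow.

    minimal: tension below which the glottis cannot close
    optimal: tension giving maximal vibration
    maximal: tension above which the glottis cannot open
    tolerance: assumed width of the 'close to optimal' window (modal voice)
    """
    if not minimal < optimal < maximal:
        raise ValueError("expected minimal < optimal < maximal")
    if tension < minimal:
        return "voiceless"      # the glottis stays open, so no vibration
    if tension > maximal:
        return "glottal stop"   # the glottis stays closed, so no vibration
    if abs(tension - optimal) <= tolerance:
        return "modal voice"
    return "breathy voice" if tension < optimal else "creaky voice"


# Arbitrary illustrative thresholds: minimal=1.0, optimal=5.0, maximal=9.0.
for tension in (0.0, 3.0, 5.0, 7.0, 10.0):
    print(tension, phonation_type(tension, 1.0, 5.0, 9.0))
```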

It would be helpful to see how voiced aspirated stops behave with respect to sound change in attested languages. Unfortunately, voiced aspirated stops are rare, which limits the number of available examples. As far as I know, voiced aspirated stops are mainly found in the Indo-Aryan languages of South Asia and the Nguni languages of South Africa. In the Indo-Aryan languages the voiced aspirated stops have been inherited from PIE, or at least Vedic Sanskrit (depending on what you believe about the nature of the PIE stops), and most of these languages seem to have preserved them unchanged. Sinhala and Kashmiri have no voiced aspirated stops, but I don’t know and can’t find any information on what happened to them in these languages. So it seems that the voiced aspirated stops have been stable in most Indo-Aryan languages. That suggests the rarity of voiced aspirated stops is probably due more to the infrequency of sound changes that would make them phonemic than to inherent instability. However, the mutual influence of these languages upon each other within the South Asian linguistic area might have helped preserve the voiced aspirated stops; the fact that the two most peripheral Indo-Aryan languages do not have them perhaps suggests that this has been the case. What about the Nguni languages? These are a tight-knit group, probably having a common origin within the last millennium, and their closest relatives such as Tswana have no voiced aspirated stops. So their voiced aspirated stops are of more recent vintage. Interestingly, Traill, Khumalo & Fridjhon (1987) have found that the Zulu voiced aspirated stops are actually voiceless, with the breathy voice occurring after the release on the following vowel. This seems like it could be the first step on a change of voiced aspirated stops into voiceless aspirated stops. But I don’t think any of this evidence is of much use in making the case that Grimm’s Law is weird. My case primarily rests on the idea that voiced aspirated stops are intermediate between voiceless and modally-voiced stops on the basis of how they are produced.

If the changes as described above are odd, maybe we should consider the possibility that the changes described by Grimm’s Law were of a different nature.

Perhaps a minor amendment can solve the problem. It is universally agreed that the Proto-Germanic voiced stops had voiced fricative allophones. It is not totally clear which environments the stops occurred in and which environments the fricatives occurred in, but they were definitely all stops after nasals and when geminate, and fricatives after vowels and diphthongs. There are three different ways this situation might have come to be.

  1. The PIE voiced aspirated stops might have turned into voiced unaspirated stops first and then acquired fricative allophones in certain environments.
  2. The PIE voiced aspirated stops might have turned into voiced unaspirated fricatives first and then acquired stop allophones in certain environments.
  3. The PIE voiced aspirated stops might have turned into voiced unaspirated fricatives in certain environments and voiced unaspirated stops in others.

If we suppose that number 2 is the accurate description of what happened, then it is possible that the fricativisation of the PIE voiced aspirated stops occurred before the devoicing of the PIE voiced unaspirated stops. This devoicing would then be perfectly natural because the PIE voiced unaspirated stops would be the only stops remaining in the language, so the marked [+voice] feature would be dropped from them. The voiced aspirated stops would probably have become voiced aspirated fricatives (i.e. breathy-voiced fricatives) initially and then these fricatives would have become modally-voiced since there would be no need for them to contrast with modally-voiced fricatives. Is it plausible that the voiceless unaspirated and voiced aspirated stops would have fricativised, but not the voiced unaspirated stops? What do these two kinds of stop have in common that the third kind lacks? If we think of voiced aspirated stops as half-voiced stops, we can describe the change as affecting all of the stops which were not fully voiced. The change is especially plausible, however, if we suppose that the PIE voiceless unaspirated stops had become aspirated before the changes described by Grimm’s Law took place. In that case, the change would affect the aspirated stops and not affect the unaspirated stops. Fricativisation of aspirated stops but not unaspirated stops is a very well-attested sound change; it happened in Greek, for example. The sequence of changes would be as follows:

[-continuant, -voice] > [+spread glottis]

[-continuant, +spread glottis] > [+continuant]

[-continuant, +voice] > [-voice]

[+continuant, +voice] > [-continuant]

(The last change would have occurred only in some environments; there are also conditioned exceptions to some of the other changes.)
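
To check that this ordering gives the expected outcomes, here is a minimal Python sketch that runs the dental series through the four changes above. Several things in it are simplifying assumptions and not part of the formal list: the shift from breathy to modal voice in the new fricatives is added as its own step (the prose describes it, but its exact timing is not specified), the environments for the last change are reduced to the two given earlier for the stop allophones ("after nasal" and "geminate"), and the other conditioned exceptions are ignored. All names are made up for this illustration.

```python
PIE = {
    "*t":  {"continuant": False, "voice": False, "spread_glottis": False},
    "*d":  {"continuant": False, "voice": True,  "spread_glottis": False},
    "*dʰ": {"continuant": False, "voice": True,  "spread_glottis": True},
}

# (name, condition, change, environments); environments=None means "everywhere".
RULES = [
    ("aspiration", {"continuant": False, "voice": False},
     {"spread_glottis": True}, None),
    ("fricativisation", {"continuant": False, "spread_glottis": True},
     {"continuant": True}, None),
    ("devoicing", {"continuant": False, "voice": True},
     {"voice": False}, None),
    # Assumed extra step taken from the prose: breathy fricatives become modal.
    ("breathy to modal", {"continuant": True, "voice": True, "spread_glottis": True},
     {"spread_glottis": False}, None),
    # Simplified conditioning for the last change (an assumption).
    ("hardening", {"continuant": True, "voice": True},
     {"continuant": False}, {"after nasal", "geminate"}),
]


def derive(segment, environment):
    """Apply each rule once, in order, skipping rules barred by the environment."""
    for _name, condition, change, environments in RULES:
        if environments is not None and environment not in environments:
            continue
        if all(segment[feature] == value for feature, value in condition.items()):
            segment = {**segment, **change}
    return segment


for name, features in PIE.items():
    for environment in ("after vowel", "after nasal"):
        print(name, environment, derive(features, environment))
```

By the time devoicing applies, *dʰ is already a fricative, so the only voiced stops left for it to affect are the reflexes of the PIE voiced unaspirated series.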

Is there any other reason to think the PIE voiceless unaspirated stops might have become aspirated in Proto-Germanic before fricativising? Well, the reflexes of the Proto-Germanic voiceless stops are aspirated in the North Germanic languages and English, and have become affricates in some positions in German; the lack of aspiration in Dutch can probably be attributed to French influence. That suggests the Proto-Germanic voiceless stops were already aspirated. Of course, these voiceless stops are the reflexes of the PIE voiced unaspirated stops, not the PIE voiceless unaspirated stops. But perhaps the rule aspirating voiceless stops was persistent in Proto-Germanic, so that it applied both to the PIE voiceless unaspirated stops before they fricativised and to the PIE voiced unaspirated stops after they were devoiced. The rule seems to have persisted into German, because German went through its own kind of replay of Grimm’s Law in which the Proto-Germanic voiceless stops became affricates or fricatives and the Proto-Germanic voiced stops were devoiced. This second consonant shift was never fully completed in most German dialects; in Standard German, for example, Proto-Germanic *b and *g were not devoiced in word-initial position. However, *d was devoiced (cf. English daughter, German Tochter) and modern Standard German /t/ is aspirated, so, for example, Tochter is pronounced [ˈtʰɔxtɐ].

I think this is a satisfactory solution to the problem. The idea that the PIE voiced aspirated stops became fricatives first is not a new one; in fact it is probably the favoured scenario. But I have never seen it justified in this way, and Ringe (2006) suggests that the voiced aspirates changed into both stops and fricatives depending on the environment (number 3 above), which is incompatible with the scenario I have proposed here.

Finally, I think I should mention that all of this reasoning has been done on the assumption that PIE had voiceless unaspirated, voiced unaspirated and voiced aspirated stops. If you subscribe to an alternative hypothesis about the nature of the PIE stops, such as the glottalic theory, Grimm’s Law might have to be explained in a completely different way. But despite it not being as easy as it might appear at first glance, it does seem that the standard hypothesis is capable of explaining Grimm’s Law.

Whether it can explain Verner’s Law is another matter. I have always thought it a little odd that the voiceless fricatives were voiced after unaccented syllables but not after accented syllables. It is not obvious how accent and voice can affect each other. But I’ll discuss this, perhaps, in another post.


Gordon, M., & Ladefoged, P. (2001). Phonation types: a cross-linguistic overview. Journal of Phonetics, 29(4), 383-406.

Ringe, D. (2006). From Proto-Indo-European to Proto-Germanic: A Linguistic History of English: Volume I. Oxford University Press.

Traill, A., Khumalo, J. S., & Fridjhon, P. (1987). Depressing facts about Zulu. African Studies, 46(2), 255-274.

The origin of war (summary of Cannibals and Kings by Marvin Harris, chapter 4)

Previously: Chapter 2.

I meant to write about Chapter 3 next, which is about the origin of agriculture. Basically, Harris thinks that the origin of agriculture is ultimately a result of the climate change that occurred at the end of the Ice Age; but the exact causal sequence is complicated, and it varies depending on which region you’re talking about. I was struggling to write a post about it, so instead I’m going to write about Chapter 4, which is about the origin of warfare. I might write something on Chapter 3 later.

Why do people wage war against each other? In order to start answering this question we first have to understand that warfare is a phenomenon that has varied significantly in its nature across time and space. Different kinds of wars may happen for different reasons. We also have to understand that there is a difference between proximal and distal causes. The First World War happened, in one sense, because of the assassination of Archduke Franz Ferdinand, and yet most people who ask why the First World War started are not satisfied with this answer, because on its own it does not explain why the archduke was assassinated nor why his assassination caused a war. In general, when one asks “why does X happen?” and one receives the answer “because Y” there remain two questions about the cause of X: “why Y?” and “why does Y cause X?”. Y is the more proximal cause here, while the cause of Y and the cause of the fact that Y causes X are the more distal causes. There is a temptation with questions of causation to think in terms of trying to find a first cause, but there is a sense in which first causes do not exist; everything is caused by something else. Yet it is possible to get to a stage where the remaining distal causes are already understood, or can be taken for granted; at that point you can say that the question has been answered. A lot of literature has been written by historians to try to get us to that point with the question of what caused the First World War. However, I’m not going to discuss this literature very much here, because even the causes identified in this literature are too proximal; they are the causes of one specific war, while the question at hand is the cause of war in general. And the First World War is a very atypical example of a war. In fact, all wars between states are atypical, in a certain sense, so the causes of these wars are not going to be the main focus of our discussion. States have only existed for the last few thousand years. But behaviorally modern humans have existed for at least 30,000 years. For most of that time, all human societies had a foraging mode of subsistence and were organised into bands1. It is only in the last 10,000 years that societies started to emerge that had a farming mode of subsistence and were organised into tribes, chiefdoms or states, and for much of this time these societies were a minority. So out of all the societies that have ever existed, most have been organised into foraging bands. Therefore, let’s start by examining the causes of warfare between foraging bands specifically.

At this point we have to note that there is some dispute over whether warfare is characteristic of forager societies. The dispute has a long history; Hobbes famously argued that humanity’s “state of nature” was a “war of all against all”, while Rousseau argued for the opposite. Nobody denies that there are forager societies today which wage war. But there are also some forager societies which have never been observed waging war, such as the Andaman Islanders, the Yahgan of Patagonia and the Semai of Malaysia. These are a minority of forager societies today, but some anthropologists believe that during the Palaeolithic, all forager societies were like this, and the modern forager societies which wage war acquired the practice due to contact with farmers. Marvin Harris is not one of those anthropologists. He thinks warfare is a very old phenomenon. The dispute is to some extent politically charged, because those with anti-war inclinations would like to believe that the propensity towards warfare is not innate to the human species. Of course, even if humanity did have some innate propensity towards warfare that wouldn’t necessarily mean it couldn’t be suppressed by culture. In any case, the question of how old warfare is as a phenomenon is a factual one, to be settled by facts.

Unfortunately, there is no strong evidence for any stance when it comes to Palaeolithic warfare. We certainly don’t find obvious giveaways, like the walls, towers and moats of Jericho, or the Talheim Death Pit; none of the cave paintings depict warfare. Plenty of skeletons have been found showing signs of trauma, but this is not unambiguous evidence; the trauma could have been caused by hunting accidents, or one-off incidents that were not part of an organised campaign of violence, or it could have been inflicted after death (the Manus people of New Guinea, for example, sever their dead relatives’ heads to use as keepsakes, while the Fore people, also of New Guinea, used to smash holes in their dead relatives’ skulls in order to eat their brains).

Harris admits that warfare was probably less frequent and less deadly in the Palaeolithic. Forager societies are mobile, so there is less need for conflict over territory (although it might still arise when there is no new land to escape to that is sufficiently fertile). There is also less sense of shared identity and hence less potential for xenophobia. Bands are small units, with less than 100 individuals each, so members of different bands have to intermarry to avoid inbreeding, and people often move between bands, perhaps accompanied by relatives within the band, to meet up with relatives outside of the band. Recorded instances of wars between forager bands are perhaps best described not as wars between bands, but as the sum of one-on-one fights between members of different bands who decide to resolve their individual disputes. In order to illustrate this, let’s have a look at one example of such a war that was recorded by C. W. Hart and Arnold Pilling in the late 1920s. This war took place between two bands of Australian aboriginals who lived on the Tiwi Islands in northern Australia, the Tiklauila-Rangwila and Mandiiumbula bands. The original account is not available to me, but here’s Harris’s retelling of it:

The Tiklauila-Rangwila men were the instigators. They painted themselves white, formed a war party and advised the Mandiiumbula of their intentions. A time was set for a meeting. When the two groups had gathered, both sides exchanged a few insults and agreed to meet formally in an open space where there was plenty of room. As night fell … individuals from the two groups exchanged visits, since the war parties included relatives on both sides and no one regarded every member of the other group as an enemy. At dawn the two groups lined up on opposite sides of the clearing. Hostilities began with some old men shouting out their grievances at one another. Two or three individuals were singled out for special attention.

Hence when spears began to be thrown, they were thrown by individuals for reasons based on individual disputes.

Since the old men did most of the spear throwing, marksmanship tended to be highly inaccurate.

Not infrequently the person hit was some innocent non-combatant or one of the screaming old women who weaved through the fighting men, yelling obscenities at everybody, and whose reflexes for dodging spears were not as fast as those of the men … As soon as somebody was wounded, even a seemingly irrelevant old crone, fighting stopped immediately until the implications of this new incident could be assessed by both sides.

Harris thinks that a typical war between forager bands in the Palaeolithic would have been much like this war among the Tiwi people. It would have arisen as a result of individual disputes when people who shared disputes against members of the same band decided to team up with each other. That would have been the proximal cause of these wars. Yet the question remains: why did these disputes need to be resolved by violence?

Don’t get me wrong; I’m not expecting that foragers wouldn’t have resorted to violence when it was advantageous for them. (Although there are forager societies which appear to have had a philosophy of complete non-violence, such as the Moriori of the Chatham Islands; unfortunately such societies tend to be selected against for obvious reasons. Look up what happened to the Moriori.) But war between forager bands is immensely costly, perhaps more so for the societies involved than war between states. Remember, forager bands are small. And war disproportionately affects the stronger members of the band, the adult males, who make the biggest contribution to the band’s continued survival. Note also the mention in the above account of the Tiwi war of the fact that the Tiklauila-Rangwila and Mandiiumbula war parties included relatives on both sides. These people were trying to kill their own relatives, and other people who they regularly interacted with. It wasn’t like a war between states where the two parties can dehumanize each other.

Why couldn’t the Tiklauila-Rangwila men have resolved their grievances by engaging in some kind of ritualised mock combat? Of course, we don’t know how serious these grievances were, and killing has a finality which no other method of conflict resolution can rival—but it would be rash to assume that war was the only thing which would work. In fact, this was apparently how the Moriori resolved their conflicts.

Why might this kind of warfare happen? Let’s try and think of some possible reasons.

  1. The need to foster solidarity. By working together to fight an external enemy, groups that wage war increase their internal cohesion and hence their capability for survival.
  2. The need for entertainment. Groups might start wars just because their members enjoy it. (OK, this might sound a little ridiculous to people brought up in the modern Western cultural tradition which tends to hammer home the “war is hell” message, but in other cultures war is glorified. If you haven’t been in a war, how do you know what it’s really like? And if you have been in a war and not enjoyed it, might others have had a different experience?)
  3. The need to satisfy an innate “urge to kill”. Humans, or perhaps just human males specifically, have an instinct which compels them to kill others, which has evolved in the usual way via natural selection; for whatever reason, humans with an inclination towards violence have proved more reproductively successful. Of course it is possible to suppress and moderate this innate drive, just as is possible with other innate drives (people go on hunger strikes, and die from them, for example), but it remains present and manifests itself in the absence of mitigation.
  4. The desire to control more resources.

The last explanation, that war occurs when political units compete over resources, might strike you as the most sensible one, and it probably is the main explanation for wars between states. But does it work as an explanation of the Tiwi war? If wars between forager bands were waged for this reason, we would expect wars to often result in one band gaining resources at the expense of the other band. The ethnographic evidence, however, suggests otherwise. Victorious warriors certainly gain social status, and sometimes they gain women whom they have kidnapped from the enemy, but often they return only with trophies, such as the severed heads of the men they have slain. After all, these societies do not have the capability to store food or other valuables in large quantities. Territorial expansion is out of the question; bands are mobile, so they lack territory in the first place, and besides they lack the organisational capability to subjugate a population and extract tribute from it. And as we have seen from Chapter 2, forager societies control their population growth, so they would have no need for more Lebensraum most of the time. Of course forager societies would not have always been completely successful at maintaining a constant population, so perhaps wars would occur for reasons of Lebensraum sometimes, but it would not be the main cause of war.

So, perhaps one of the other three explanations given above is more appropriate? Well, let’s have a look at them one-by-one.

Harris’s main problem with the solidarity explanation is that it needs to be accompanied with an explanation of why, out of all the ways a group could increase its sense of togetherness, warfare would be favoured over other methods which do not suffer from the rather significant costs associated with warfare. Is it so difficult to foster intragroup solidarity? In modern Western societies, sport seems to be able to do this to a considerable degree. Maybe it doesn’t work as well as warfare would, but it’s not clear—and warfare doesn’t just need to be better at fostering intragroup solidarity than sport, it needs to be so much better that it is worth the cost. Nobody has been able to show that this is the case, and without that the solidarity explanation has no explanatory power.

As for entertainment: well, I tried to convince you above that it’s not totally obvious that war isn’t fun, but after a careful look at the evidence it seems fair to conclude that war really isn’t fun. Of course societies tend to bombard their members with messages telling them that war is a blast (and not just literally), or, failing that, that killing people somehow makes you a better, more noble person—dulce et decorum est pro patria mori, right? But the very fact that the message has to be hammered home so much kind of gives away that this attitude towards war is far from natural. Humans don’t need these societal messages to get them to do things they enjoy doing; in fact there are some things that humans enjoy very much and do very frequently even when the society they live in is constantly telling them that it makes them a bad person. Everybody knows about the conscientious objectors during the two World Wars. What you might not know is that their equivalents existed even in many pre-state societies which waged war. The Crow Indians, for example, allowed adult males to avoid fighting as long as they dressed themselves in women’s clothing and worked as servants to the warriors. Even the notoriously warlike Yanomamo of the Amazon rainforest have to prepare themselves for battles by taking drugs and performing special rituals. So this explanation, too, is not convincing.

The “killer instinct” explanation suffers from the same problem. If humans have an innate urge to kill people, why is it so hard to get them to do it? In any case, why would such an innate urge be maintained by natural selection in the first place? The cause which the instinct explanation names is too proximal.

So if none of these explanations are satisfactory, what is Harris’s explanation?

I mentioned above how warfare could occur due to the need of bands for Lebensraum after population growth, but this would be a rare occurrence. A related, more general benefit of war is that it helps decrease the average population density in a region. Obviously, the prospect of war is good motivation for bands to try and stay away from each other. In fact, the threat of attack might encourage bands to stay away entirely from certain regions where they would be especially vulnerable. In this way “no man’s lands” are created, and these might be crucial to ensuring that foragers do not deplete the supply of the animals and plants that they consume.

But Harris doesn’t think these are the only benefits of war. He also claims that war helped foragers to control their population growth. On the face of it, this claim might seem strange, because the direct effect of war is mainly to reduce the number of adult males in the population, and as long as the number of adult women remains more or less constant, the rate of population growth will remain the same, especially in openly polygynous societies. (Not all forager societies are polygynous, but the more warlike societies tend to be more polygynous, according to Harris.)

Harris’s answer is that warfare is part of a kind of package of three practices which mutually reinforce each other and serve to control population growth, so that forager populations remain within the carrying capacity of their environment. The three practices are female infanticide, warfare and male supremacy (i.e., patriarchy).

Female infanticide, of course, directly limits the rate of population growth. That was the topic of Chapter 2. But female infanticide is costly for the mother. Not necessarily in moral terms, because cultures vary greatly in how they view the morality of infanticide. But pregnancy isn’t easy, and it is not hard to see why mothers would be reluctant to kill their newborn babies after going through all that effort. And it’s not just mothers who suffer the costs, because the whole band has to work together to supply the extra food needed to feed the developing baby (or the newborn baby, since infanticide can occur by neglect, not just by direct killing).

Harris proposes (as we’ll see when we get to the next chapter) that foragers’ reliance on the strongest members of their societies, the adult males, for military purposes is the cause of the development of male supremacist institutions and practices. One of these male supremacist practices is the favouring of male babies over female ones, and this encourages female infanticide. Female infanticide would be favoured to some extent already because of its effect on population growth, but male supremacy makes it possible for it to be favoured to a greater extent, resulting in a greater effect on population growth.

Even if you don’t find this causal link convincing (it is elaborated upon in the next chapter), Harris points to evidence that it holds up. William Divale studied a number of band and tribe societies for which census records were available covering the time when the society was pacified by the occupying state. He found that the ratio of boys under 14 years old to girls in the same age bracket was significantly higher (128:100, on average) before pacification than after it (the average figure for societies less than 25 years post-pacification was 113:100, and after 25 years it dropped to 106:100, more or less the global average, which was 105:100 when Harris wrote the book but is 107:100 today). The figures are restricted to this age bracket for the obvious reason that when these boys grew up, many of them were killed in battle, so that the sex ratio among the adults in the unpacified societies was more or less equal, and it actually became more skewed towards males after pacification.

So, in summary: according to Marvin Harris, the practice of warfare facilitates the survival of forager societies because it encourages the dispersal of bands and the creation of no man’s lands where the animals and plants that they feed on can take refuge, and also because it facilitates the development of male supremacist institutions and practices; in particular, it encourages female infanticide, which limits population growth and thereby prevents the resources in the environment from being depleted.


  1. ^ Anthropologists classify societies by their degree of organisation into four types, going from least to most complex: bands, tribes, chiefdoms and states.