That’s OK, but this’s not OK?

Here’s something peculiar I noticed the other day about the English language.

The word is (the third-person singular present indicative form of the verb be) can be ‘contracted’ with a preceding noun phrase, so that it is reduced to an enclitic form -‘s. This can happen after pretty much any noun phrase, no matter how syntactically complex:

(1) he’s here

/(h)iːz ˈhiːə/[1]

(2) everyone’s here

/ˈevriːwɒnz ˈhiːə/

(3) ten years ago’s a long time

/ˈtɛn ˈjiːəz əˈgəwz ə ˈlɒng ˈtajm/

However, one place where this contraction can’t happen is immediately after the proximal demonstrative this. This is strange, because it can certainly happen after the distal demonstrative that, and one wouldn’t expect these two very similar words to behave so differently:

(4) that’s funny
/ˈðats ˈfʊniː/

(5) *this’s funny

There is a complication here which I’ve kind of glossed over, though. Sure, this’s funny is unacceptable in writing. But what would it sound like, if it were said in speech? Well, the -’s enclitic form of is can actually be realized on the surface in a couple of different ways, depending on the phonological environment. You might already have noticed that it’s /-s/ in example (4), but /-z/ in examples (1)-(3). This allomorphy (variation in phonological form) is reminiscent of the allomorphy in the plural suffix: cats is /ˈkats/, dogs is /ˈdɒgz/, horses is /ˈhɔːsɪz/. In fact the distribution of the /-s/ and /-z/ realizations of -’s is exactly the same as for the plural suffix: /-s/ appears after voiceless non-sibilant consonants and /-z/ appears after vowels and voiced non-sibilant consonants. The remaining environment, the environment after sibilants, is the environment in which the plural suffix appears as /-ɪz/. And this environment turns out to be exactly the same environment in which -’s is unacceptable in writing. Here are a couple more examples:

(6) *a good guess’s worth something (compare: the correct answer’s worth something)

(7) *The Clash’s my favourite band (compare: Pearl Jam’s my favourite band)
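
The distribution of the plural suffix described above can be sketched in a few lines of Python. This is only an illustration: the function name, the segment inventories and the post_sibilant parameter are my own assumptions for the sketch, and the inventories are deliberately small rather than a full description of English.

```python
# Stem-final segments, in the broad transcription used in this post.
# Both sets are illustrative assumptions, not exhaustive inventories.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def suffix_allomorph(final_segment: str, post_sibilant: str = "ɪz") -> str:
    """Return the surface form of the plural suffix after a stem-final segment.

    post_sibilant is the allomorph used after sibilants: "ɪz" in some
    accents, "əz" in others.
    """
    if final_segment in SIBILANTS:
        return post_sibilant
    if final_segment in VOICELESS_NON_SIBILANTS:
        return "s"
    # Everything else: vowels and voiced non-sibilant consonants.
    return "z"


# cats, dogs, horses
for stem_final in ["t", "g", "s"]:
    print(stem_final, "->", suffix_allomorph(stem_final))
```

The same function gives the /-s/ of that’s and the /-z/ of everyone’s. What it would predict for -’s after a sibilant is exactly the question the rest of this post is about.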

Now, if -‘s obeys the same rules as the plural suffix then we’d expect it to be realized as /-ɪz/ in this environment. However… this is exactly the same sequence of segments that the independent word is is realized as when it is unstressed. One might therefore suspect that in sentences like (8) below, the morpheme graphically represented as the independent word is is actually the enclitic -‘s, it just happens to be realized the same as the independent word is and therefore not distinguished from it in writing. (Or, perhaps it would be more elegant to say that the contrast between enclitic and independent word is neutralized in this environment.)

(8) The Clash is my favourite band

Well, this is (*this’s) a very neat explanation, and if you do a Google search for “this’s” that’s pretty much the explanation you’ll find given to the various other confused people who have gone to websites like English Stack Exchange to ask why this’s isn’t a word. Unfortunately, I think it can’t be right.

The problem is, there are some accents of English, including mine, which have /-əz/ rather than /-ɪz/ in the allomorph of the plural suffix that occurs after sibilants, while at the same time pronouncing unstressed is as /ɪz/ rather than /əz/. (There are minimal pairs, such as peace is upon us /ˈpiːsɪz əˈpɒn ʊz/ and pieces upon us /ˈpiːsəz əˈpɒn ʊz/.) If the enclitic form of is does occur in (8) then we’d expect it to be realized as /əz/ in these accents, just like the plural suffix would be in the same environment. This is not what happens, at least in my own accent: (8) can only have /ɪz/. Indeed, it can be distinguished from the minimally contrastive NP (9):

(9) The Clash as my favourite band

In fact this problem exists in more standard accents of English as well, because is is not the only word ending in /-z/ which can end a contraction. The third-person singular present indicative of the verb have, has, can also be contracted to -‘s, and it exhibits the expected allomorphy between voiceless and voiced realizations:

(10) it’s been a while /ɪts ˈbiːn ə ˈwajəl/

(11) somebody I used to know’s disappeared /ˈsʊmbɒdiː aj ˈjuːst tə ˈnəwz dɪsəˈpijəd/

But like is it does not contract, at least in writing, after sibilants, although it may drop the initial /h-/ whenever it’s unstressed:

(12) this has gone on long enough /ˈðɪs (h)əz gɒn ɒn lɒng əˈnʊf/

I am not a native speaker of RP, so, correct me if I’m wrong. But I would be very surprised if any native speaker of RP would ever pronounce has as /ɪz/ in sentences like (12).

What’s going on? I actually do think the answer given above—that this’s isn’t written because it sounds exactly the same as this is—is more or less correct, but it needs elaboration. Such an answer can only be accepted if we in turn accept that the plural -s, the reduced -‘s form of is and the reduced -‘s form of has do not all exhibit the same allomorph in the environment after sibilants. The reduced form of is has the allomorph /-ɪz/ in all accents, except in those such as Australian English in which unstressed /ɪ/ merges with schwa. The reduced form of has has the allomorph /-əz/ in all accents. The plural suffix has the allomorph /-ɪz/ in some accents, but /-əz/ in others, including some in which /ɪ/ is not merged completely with schwa and in particular is not merged with schwa in the unstressed pronunciation of is.

Introductory textbooks on phonology written in the English language are very fond of talking about the allomorphy of the English plural suffix. In pretty much every treatment I’ve seen, it’s assumed that /-z/ is the underlying form, and /-s/ and /-əz/ are derived by phonological rules of voicing assimilation and epenthesis respectively, with the voicing assimilation crucially coming after the epenthesis (otherwise we’d have an additional allomorph /-əs/ after voiceless sibilants, while /-əz/ would only appear after voiced sibilants). This is the best analysis when the example is taken in isolation, because positing an epenthesis rule allows the phonological rules to be assumed to be productive across the entire lexicon of English. The alternative would be to take /-əz/ as the underlying form and derive /-z/ and /-s/ by a rule deleting the schwa. But if such a fully productive deletion rule were posited, then it would be impossible to account for the pronunciation of a word like Paulas (‘multiple people named Paula’) with /-əz/ on the surface, whose underlying form would be exactly the same, phonologically, as Pauls (‘multiple people named Paul’). (This example only works if your plural suffix post-sibilant allomorph is /-əz/ rather than /-ɪz/, but a similar example could probably be exhibited in the other case.) One could appeal to the differing placement of the morpheme boundary but this is unappealing.
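
To make the point about rule ordering concrete, here is a minimal sketch of the textbook analysis in Python. The function names, the representation of words as lists of segments and the small segment inventories are all assumptions made for the illustration; the only things taken from the analysis itself are the underlying form /-z/ and the two rules.

```python
# Illustrative segment classes (not a full inventory of English).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}


def epenthesis(segments):
    """Insert schwa between a stem-final sibilant and a sibilant suffix."""
    if len(segments) >= 2 and segments[-2] in SIBILANTS and segments[-1] in SIBILANTS:
        return segments[:-1] + ["ə", segments[-1]]
    return segments


def assimilation(segments):
    """Devoice a final /z/ that immediately follows a voiceless segment."""
    if len(segments) >= 2 and segments[-1] == "z" and segments[-2] in VOICELESS:
        return segments[:-1] + ["s"]
    return segments


def derive(stem, rules):
    """Attach underlying /-z/ to the stem and apply the rules in order."""
    form = list(stem) + ["z"]
    for rule in rules:
        form = rule(form)
    return "".join(form)


horse = ["h", "ɔː", "s"]
print(derive(horse, [epenthesis, assimilation]))  # epenthesis first
print(derive(horse, [assimilation, epenthesis]))  # assimilation first
```

With epenthesis applied first, horse comes out with /-əz/. With the order reversed, the /z/ is devoiced before the schwa is inserted, and the result is the unattested /-əs/ after a voiceless sibilant.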

However, the assumption that a single epenthesis rule operating between sibilants is productive across the entire English lexicon has to be given up, because ‘s < is and ‘s < has have different allomorphs after sibilants! Either they are accounted for by two different lexically-conditioned epenthesis rules (which is a very unappealing model) or the allomorphs with the vowels are actually the underlying ones, and the allomorphs without the vowels are produced by a not phonologically-conditioned but at least (sort of) morphologically-conditioned deletion rule that elides fully reduced unstressed vowels (/ə/, /ɪ/) before word-final obstruents. This rule only applies in inflectional suffixes (e.g. lettuce and orchid are immune), and even there it does not apply unconditionally because the superlative suffix -est is immune to it. But this doesn’t bother me too much. One can argue that the superlative is kind of a marginal inflectional category, when you put it in the company of the plural, the possessive and the past tense.
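
The second option, with the vowel-ful allomorphs as underlying, might be sketched as follows. The underlying forms are the ones proposed in this post for my own accent; everything else is an assumption made for the sketch, in particular the function name, the segment sets, and the decision to model the deletion rule as simply failing to apply after sibilants.

```python
# Illustrative segment classes (not a full inventory of English).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}

# Underlying forms, each with its own vowel (plural as in my accent).
UNDERLYING = {
    "plural": ["ə", "z"],
    "is": ["ɪ", "z"],
    "has": ["ə", "z"],
}


def attach(stem, morpheme):
    """Attach a suffix or enclitic to a stem given as a list of segments."""
    suffix = list(UNDERLYING[morpheme])
    if stem[-1] not in SIBILANTS:
        # Deletion: the reduced vowel is elided (assumed blocked after sibilants).
        suffix = suffix[1:]
        # Voicing assimilation of the remaining /z/.
        if stem[-1] in VOICELESS:
            suffix = ["s"]
    return "".join(list(stem) + suffix)


this = ["ð", "ɪ", "s"]
print(attach(this, "is"))   # vowel of is survives
print(attach(this, "has"))  # vowel of has survives
print(attach(["ð", "a", "t"], "is"))
```

On this sketch, this + is and this + has keep their different vowels after the sibilant, while that + is loses its vowel and surfaces with /-s/. The restriction of the rule to inflectional suffixes is not modelled here; the function simply assumes it is only ever called with one.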

A nice thing about the synchronic rule I’m proposing here is that it’s more or less exactly the same as the diachronic rule that produced the whole situation in the first place. The Old English nom./acc. pl., gen. sg., and past endings were, respectively, -as, -es, -aþ and -ede. In Middle English final schwa was elided unconditionally in absolute word-final position, while in word-final unstressed syllables where it was followed by a single obstruent it was gradually eliminated by a process of lexical diffusion from inflectional suffix to inflectional suffix, although “a full coverage of the process in ME is still outstanding” (Minkova 2013: 231). Even the superlative suffix was reduced to /-st/ by many speakers for a time, but eventually the schwa-ful form of this suffix prevailed.

I don’t see this as a coincidence. My inclination, when it comes to phonology, is to see the historical phonology as essential for understanding the present-day phonology. Synchronic phonological alternations are for the most part caused by sound changes, and trying to understand them without reference to these old sound changes is… well, you may be able to make some progress but it seems like it’d be much easier to make progress more quickly by trying to understand the things that cause them—sound changes—at the same time. This is a pretty tentative paragraph, and I’m aware I’d need a lot more elaboration to make a convincing case for this stance. But this is where my inclination is headed.

[1] The transcription system is the one which I prefer to use for my own accent of English.


The perfect pathway

Anybody who knows French or German will be familiar with the fact that the constructions in these languages described as “perfects” tend to be used in colloquial speech as simple pasts[1] rather than true perfects. This can be illustrated by the fact that the English sentence (1) is ungrammatical, whereas the French and German sentences (2) and (3) are perfectly grammatical.

  1. *I have left yesterday.
  2. Je suis parti hier.
    I am leave-PTCP yesterday
    “I left yesterday.”
  3. Ich habe gestern verlassen.
    I have-1SG yesterday leave-PTCP
    “I left yesterday.”

The English perfect is a true perfect, referring to a present state which is the result of a past event. So, for example, the English sentence (4) is paraphrased by (5).

  4. I have left.
  5. I am in the state of not being present resulting from having left.

As it is specifically present states which are referred to by perfects, it makes no sense for a verb in the perfect to be modified by an adverb of past time like ‘yesterday’. That’s why (1) is ungrammatical. In order for ‘yesterday’ to modify the verb in (1), the verb would have to refer to a past state resulting from an event further in the past; the appropriate category for such a verb is not the perfect but rather the pluperfect or past perfect, which is formed in the same way as the perfect in English except that the auxiliary verb have takes the past tense. It’s perfectly fine for adverbs of past time to modify the main verbs of pluperfect constructions; cf. (6).

  6. I had left yesterday.

If the French and German “perfects” were true perfects like the English perfect, (2) and (3) would have to be ungrammatical too, and as they are not in fact ungrammatical we can conclude that these “perfects” are not true perfects. (Of course one could also conclude this from asking native speakers about the meaning of these “perfects”, and one has to take this step to be able to conclude that they are in fact simple pasts; the above is just a neat way of demonstrating their non-true perfect nature via the medium of writing.)

French and German verbs do have simple past forms which have a distinctive inflection; for example, partis and verließ are the first-person singular inflected simple past forms of the verbs meaning ‘leave’ in sentences (2) and (3) respectively, corresponding to the first-person singular present forms pars and verlasse. But these inflected simple past forms are not used in colloquial speech; their function has been taken over by the “perfect”. If you take French or German lessons you are taught how to use the “perfect” before you are taught how to use the simple past, because the “perfect” is more commonly used; it’s the other way round if you take English lessons, because in English the simple past is not restricted to literary speech, and is more common than the perfect as it has a more basic meaning.

The French and German “perfects” were originally true perfects even in colloquial speech, just as in English. So how did this change in meaning from perfect to simple past occur? One way to understand it is as a simple case of generalization. The perfect is a kind of past; if one were to translate (4) into a language such as Turkish which does not have any sort of perfect construction, but does have a distinction between present and past tense, one would translate it as a simple past, as in (7).

  7. Ayrıldım.
    “I left / have left.”

The distinction in meaning between the perfect and the simple past is rather subtle, so it is not hard to imagine the two meanings being confused with each other frequently enough that the perfect came eventually to be used with the same meaning as the simple past. This could have been a gradual process. After all, it is often more or less a matter of arbitrary perspective whether one chooses to focus on the state of having done something, and accordingly use the perfect, or on the doing of the thing itself, and accordingly use the simple past. Here’s an example: if somebody tells you to look up the answer to a question which was raised in a discussion of yours with them, and you go away and look up the answer, and then you meet this person again, you might say either “I looked up the answer” or “I’ve looked up the answer”. At least to me, neither utterance seems any more expected in that situation than the other. French and German speakers may have tended over time to err more and more on the side of focusing on the state, so that the perfect construction became more and more common, and this would encourage reanalysis of the meaning of the perfect as the same as that of the simple past.

But it might help to put this development in some further context. It’s not only in French and German that this development from perfect to simple past has occurred. In fact, it seems to be pretty common. Well, I don’t know about other families, but it is definitely common among the Indo-European (IE) languages. There is, in fact, evidence that the development occurred in the history of English, during the development of Proto-Germanic from Proto-Indo-European (PIE). (This means German has undergone the development twice!) I’ll talk a little bit about this pre-Proto-Germanic development, because it’s a pretty interesting one, and it ties in with some of the other cases of the development attested from IE languages.

PIE (or at least a late stage of it; we’ll talk more about that issue below) distinguished three different aspect categories, which are traditionally called the “present”, “aorist” and “perfect”. The names of these aspects do not have their usual meanings—if you know about the distinction between tense and aspect, you probably already noticed that “present” is normally the name of a tense, rather than an aspect. (Briefly, tense is an event or state’s relation in time to the speech act, aspect is the structure of the event on the timeline without any reference to the speech act; for example, aspect includes things like whether the event is completed or not. But this isn’t especially important to our discussion.) The better names for the “present” and “aorist” aspects are imperfective and perfective, respectively. The difference between them is the same as that between the French imperfect and the French simple past: the perfective (“aorist”) refers to events as completed wholes and the imperfective (“present”) refers to other events, such as those which are iterated, habitual or ongoing. Note that present events cannot be completed yet and therefore can only be referred to by imperfectives (“presents”). But past events can be referred to by either imperfectives or perfectives. So, although PIE did distinguish two tenses, present and past, in addition to the three aspects, the distinction was only made in the imperfective (“present”, although that name is getting especially confusing here) aspect because the perfective (“aorist”) aspect entailed past tense. The past tense of the imperfective aspect is called the imperfect rather than the past “present” (I guess even IEists would find that terminology too ridiculous).

So what was the meaning of the PIE “perfect”? Well, the PIE “perfect” is reflected as a true perfect in Classical Greek. The system of Classical Greek, with the imperfect, aorist and true perfect all distinguished from one another, was more or less the same as that of modern literary French. However, according to Ringe (2006: 25, 155), the “perfect” in the earlier Greek of Homer’s poems is better analyzed as a simple stative, referring to a present state without any implication of this state being the result of a past event. Now, I’m not sure exactly what the grounds for this analysis are. Ringe doesn’t elaborate on it very much and the further sources it refers to (Wackernagel 1904; Chantraine 1927) are in German and French, respectively, so I can’t read them very easily. The thing is, every state has a beginning, which can be seen as an event whose result is the state, and thus every simple stative can be seen as a perfect. English does distinguish simple statives from perfects (predicative adjectives are stative, as are certain verbs in the present tense, such as “know”). The difference seems to me to be something to do with how salient the event that begins the state—the state’s inception—is. Compare sentences (8) and (9), which have more or less the same meaning except that the state’s inception is more salient in (9) (although still not as salient as it is in (10)).

  8. He is dead.
  9. He has died.
  10. He died.

But I don’t know if there are any more concrete diagnostic tests that can distinguish a simple stative from a perfect. Homeric and Classical Greek are extinct languages, and it seems like it would be difficult to judge the salience of inceptions of states in sentences of these languages without having access to native speaker intuitions.

It is perhaps the case that some states are crosslinguistically more likely than others to be referred to by simple statives, rather than perfects. Perhaps the change was just that the “perfect” came to be used more often to refer to states that crosslinguistically tend to be referred to by perfects. Ringe (2006: 155) says:

… a large majority of the perfects in Classical Attic are obvious innovations and have meanings like that of a Modern English perfect; that is, they denote a past action and its present result. We find ἀπεκτονέναι /apektonénai/ ‘to have killed’, πεπομφέναι /pepompʰénai/ ‘to have sent’, κεκλοφέναι /keklopʰénai/ ‘to have stolen’, ἐνηνοχέναι /enęːnokʰénai/ ‘to have brought’, δεδωκέναι /dedǫːkénai/ ‘to have given’, γεγραφέναι /gegrapʰénai/ ‘to have written’, ἠχέναι /ęːkʰénai/ ‘to have led’, and many dozens more. Most are clearly new creations, but a few appear to be inherited stems that have acquired the new ‘resultative’ meaning, such as λελοιπέναι /leloipʰénai/ ‘to have left behind’ and ‘to be missing’ (the old stative meaning).

These newer perfects could still be glossed as simple statives (‘to be a thief’ instead of ‘to have stolen’, etc.) but the states they refer to do seem to me to be ones which inherently tend to involve a salient reference to the inception of the state.

There is a pretty convincing indication that the “perfect” was a simple stative at some point in the history of Greek: some Greek verbs whose meanings are conveyed by lexically stative verbs or adjectives in English, such as εἰδέναι ‘to know’ and δεδιέναι ‘to be afraid of’, only appear in the perfect and pluperfect. These verbs are sometimes described as using the perfect in place of the present and the pluperfect in place of the imperfect, although at least in Homeric Greek their appearance in only the perfect and pluperfect is perfectly natural in respect of their meaning and does not need to be treated as a special case. These verbs continued to appear only in the perfect and pluperfect during the Classical period, so they do not tell us anything about when the Greek “perfect” became a true perfect.

Anyway, it is on the basis of the directly attested meaning of the “perfect” in Homeric Greek that the PIE “perfect” is reconstructed as a simple stative. Other IE languages do preserve relics of the simple stative meaning which add to the evidence for this reconstruction. There are in fact relics of the simple stative meaning in the Germanic languages which have survived, to this day, in English. These are the “preterite-present” or “modal” verbs: can, dare, may, must, need, ought, shall and will. Unlike other English verbs, these verbs do not take an -s ending in the third person singular (dare and need can take this ending, but only when their complements are to-infinitives rather than bare infinitives). Apart from will (which has a slightly more complicated history), the preterite-present verbs are precisely those whose presents are reflexes of PIE “perfects” rather than PIE “presents” (although some of them have unknown etymologies). It is likely that they were originally verbs that appeared only in the perfect, like Greek εἰδέναι ‘to know’.[2]

Most of the PIE “perfects”, however, ended up as the simple pasts of Proto-Germanic strong verbs. (That’s why the preterite-present verbs are called preterite-presents: “preterite” is just another word for “past”, and the presents of preterite-present verbs are inflected like the pasts of other verbs.) Presumably these “perfects” underwent the whole two-step development from simple stative to perfect to simple past. There was plenty of time for this to occur: remember that the Germanic languages are unattested before 100 AD, and the development of the true perfect in Greek had already occurred by 500 BC. Just as the analytical simple pasts of colloquial French and German, which are the reflexes of former perfects, have completely replaced the older inflected simple pasts, so the PIE “perfects” completely replaced the PIE “aorists” in Proto-Germanic. According to Ringe (2006: 157) there is absolutely no trace of the PIE “aorist” in any Germanic language. Proto-Germanic also lost the PIE imperfective-perfective opposition, and again the simple pasts reflecting the PIE “perfects” completely replaced the PIE imperfects, with a single exception. This was the verb *dōną ‘to do’, whose past stem *ded- is a reflex of the PIE present stem *dʰédeh₁ ‘put’. Admittedly, the development of this verb as a whole is somewhat mysterious (it is not clear where its present stem comes from; proposals have been put forward, but Ringe 2006: 160 finds none of them convincing) but given its generic meaning and probable frequent use it is not surprising to find it developing in an exceptional way. One reason we can be quite sure it was used very frequently is that the *ded- stem is the same one which is thought to be reflected in the past tense endings of Proto-Germanic weak verbs. There’s a pretty convincing correspondence between the Gothic weak past endings and the Old High German (OHG) past endings of tuon ‘to do’:

| | | Past of Gothic waúrkjan ‘to make’ | Past of OHG tuon ‘to do’ |
|---|---|---|---|
| Singular | First-person | waúrhta ‘I made’ | tëta ‘I did’ |
| | Second-person | waúrhtēs ‘you (sg.) made’ | tāti ‘you (sg.) did’ |
| | Third-person | waúrhta ‘(s)he made’ | tëta ‘(s)he did’ |
| Plural | First-person | waúrhtēdum ‘we made’ | tātum ‘we did’ |
| | Second-person | waúrhtēduþ ‘you (pl.) made’ | tātut ‘you (pl.) did’ |
| | Third-person | waúrhtēdun ‘they made’ | tātun ‘they did’ |

Note that Proto-Germanic *ē is reflected as ē in Gothic but ā in the other Germanic languages, so the alternation between -t- and -tēd- at the start of each ending in Gothic corresponds exactly, phonologically and morphologically, to the alternation between the stems tët- and tāt- in OHG.

The pasts of Germanic weak verbs must have originally been formed by an analytical construction with a syntax similar to that of the English, French and German perfect constructions, involving the auxiliary verb *dōną ‘to do’ in the past tense (probably in a sense of ‘to make’) and probably the past participle of the main verb. As pre-Proto-Germanic had SOV word order, the auxiliary verb could then be reinterpreted as an ending on the past participle, which would take us (with a little haplology) from (11) to (12).

  11. *Ek wēpną wurhtą dedǭ.
    I weapon wrought-NSG made-1SG
    “I wrought a weapon” (lit. “I made a weapon wrought”)
  12. *Ek wēpną wurht(ąd)edǭ
    I weapon wrought-1SG
    “I wrought a weapon”

(The past of waúrht- is glossed here by the archaic ‘wrought’ to distinguish it from ded- ‘make’, although ‘make’ is the ideal gloss for both verbs. I should probably have just used a verb other than waúrkjan in the example to avoid this confusion, but oh well.)

Why couldn’t the pasts of weak verbs have been formed from PIE “perfects”, like those of strong verbs? The answer is that the weak verbs were those that did not have perfects in PIE to use as pasts. Many PIE verbs never appeared in one or more of the three aspects (“present”, “aorist” and “perfect”). I already mentioned the verbs like εἰδέναι < PIE *weyd- ‘to know’ which only appeared in the perfect in Greek, and probably in PIE as well. One very significant and curious restriction in this vein was that all PIE verbs which were derived from roots by the addition of a derivational suffix appeared only in the present aspect. There is no semantic reason why this restriction should have existed, and it is therefore one of the most convincing indications that PIE did not originally have morphological aspect marking on verbs. Instead, aspect was marked by the addition of derivational suffixes. There must have been a constraint on the addition of multiple derivational suffixes to a single root (perhaps because it would mess up the ablaut system, or perhaps just because it’s a crosslinguistically common constraint), and that would account for this curious restriction. Other indications that aspect was originally marked by derivational suffixes in PIE are the fact that the “present”, “aorist” and “perfect” stems of each PIE verb do not have much of a consistent formal relation to one another (there are some consistencies, e.g. all verbs which have a perfect stem form it by reduplication of the initial syllable, although *weyd- ‘know’, which has no present or aorist stem, is not reduplicated; but the general rule is one of inconsistency); there is no single present or aorist suffix, for example, and one pretty much has to learn each stem of each verb off by heart. Also, I think I’ve read, although I can’t remember where I read it, that aspect is still marked (wholly, or largely) by derivational suffixes only in Hittite.

The class of derived verbs naturally expanded over time, while the class of basic verbs became smaller. The inability of derived verbs to have perfect stems is therefore perhaps the main reason why it was necessary to use an alternative strategy for forming the pasts of some verbs in Proto-Germanic, and thus to create a new class of weak verbs separate from the strong verbs.

So that’s the history of the PIE “perfect” in Germanic (with some tangential, but hopefully interesting elaboration). A similar development occurred in Latin. A few PIE “perfects” were preserved in Latin as statives, just like the Germanic preterite-presents (meminisse ‘to remember’, ōdisse ‘to hate’, nōvisse ‘to recognize, to know (someone)’); the others became simple pasts. But I don’t know much about the details of the developments in Latin.


We’ve seen evidence from Indo-European languages that there’s a kind of developmental pathway going on: statives develop into perfects, and perfects develop into simple pasts. In order for the first step to occur there has to be some kind of stative category, and it looks like this might be a relatively uncommon feature: most of the languages I’ve seen have a class of lexically stative verbs or tend to use entirely different syntax for events and states (e.g. verbs for events, adjectives for states). (English does a bit of both.) The existence of the stative category in PIE might be associated with the whole aspectual system’s recent genesis via morphologization of derivational suffixes. Of course the second part of the pathway can occur on its own, as it did in French and German after perfects were innovated via an analytical construction. It is also possible for simple pasts to be innovated straight away via analytical constructions, as we saw with the Germanic weak past inflection.

It would be interesting to hear if there are any other examples of developments occurring along this pathway, or, even more interestingly, examples where statives, perfects or simple pasts have developed or have been developed in completely different ways, from non-Indo-European languages (or Indo-European languages that weren’t mentioned here).


  1. ^ I’m using the phrase “simple past” here to refer to the past tense without the additional meaning of the true perfect (that of a present state resulting from the past event). In French the simple past can be distinguished from the imperfect as well as the perfect: the simple past refers to events as completed wholes (and is therefore said to have perfective aspect), while the imperfect refers either to iterated or habitual events, or to part of an event without the entailment that the event was completed (and is therefore said to have imperfective aspect). The perfect also refers to events as completed wholes, but it also refers to the state resulting from the completion of such events, more or less at the same time (arguably the state is the more primary reference). In colloquial French, the perfect is used in place of the simple past, so that no distinction is made between the simple past and perfect (and the merged category takes the name of the simple past), but the distinction from the imperfect is preserved. Thus the “simple past” in colloquial French is a little different from the “simple past” in colloquial German; German does not distinguish the imperfect from the simple past in either its literary or colloquial varieties. The name “aorist” can be used to refer to a simple past category like the one in literary French, i.e., a simple past which is distinct from both the perfect and the imperfect.
  2. ^ Of course, εἰδέναι appears in the pluperfect as well as the perfect, but the Greek pluperfect was an innovative formation, not inherited from PIE, and there is no reason to think Proto-Germanic ever had a pluperfect. The Proto-Germanic perfect might well have referred to a state of indeterminate tense resulting from a past event, in which case verbs in the perfect probably could be modified with adverbs of past time like ‘yesterday’. It is a curious thing that the present and past tenses were not distinguished in the PIE “perfect”; there is no particular reason why they should not have been (simple stative meaning is perfectly compatible with both tenses, cf. English “know” and “knew”) and it is therefore perhaps an indication that tense distinction was a recent innovation in PIE, which had not yet had time to spread to aspects other than the imperfective (“present”). The nature of the endings distinguishing the present and past tense is also suggestive of this; for example the first-person, second-person and third-person singular endings are *-mi, *-si and *-ti respectively in the present and *-m, *-s and *-t respectively in the past, so the present endings can be derived from the past endings by the addition of an *-i element. This *-i element has been hypothesised to have originally been a particle indicating present tense; it’s called the hic et nunc (‘here and now’) particle. I don’t know how the other endings are accounted for though.


Some facts about gender

One of the most interesting phenomena found in languages is gender. In its linguistic sense, gender refers to the phenomenon where nouns are divided into a number of different classes which can be distinguished because words associated with nouns, like pronouns, determiners, adjectives and verbs, often appear in different forms depending on the gender of the nouns they are associated with (this is called gender agreement). For example, in German the word for ‘the’ is der when it is attached to masculine nouns like Mann ‘man’, die when it is attached to feminine nouns like Frau ‘woman’ and das when it is attached to neuter nouns like Kind ‘child’. The main reason gender is such an interesting phenomenon is probably that it is not at all obvious why it exists. Language is generally thought of as a means of communication, but it is hard to see how gender systems aid communication. Even if there might be some benefits, any explanation has to account for the high prevalence of gender systems in languages worldwide: in the WALS’s sample, 112 out of 257 languages, about 44%, make some kind of distinction between two or more genders.

Something people aren’t always aware of is that English has gender as well. In fact, it has three genders, like German: masculine, feminine and neuter. Admittedly, there are two differences between the gender system of English and the gender systems of languages like French and German which make gender a less prominent phenomenon in English.

Firstly, in English, gender agreement only occurs with pronouns. The words he, she and it are used to refer to males, females and non-gendered things, respectively, and using the wrong pronoun for a given referent is considered grammatically incorrect.1 On the other hand, in French and German gender agreement also occurs with determiners and adjectives, and in written French and in both spoken and written Russian, gender agreement also occurs with verbs. Languages like English where gender agreement only occurs with pronouns are said to have pronominal gender systems. But pronominal gender systems are gender systems nonetheless. Remember that I said above that only about 44% of the languages in the WALS’s sample distinguish two or more genders: English and the other languages with pronominal gender systems are counted among that 44%, so a majority (56%) of the languages in the sample show no gender agreement at all, not even in pronouns. In fact, pronominal gender systems are quite rare, and it is likely that most of them are the result of not-quite-complete loss of an original, more extensive gender system. This is certainly the case for English. I think it’s quite possible that in the future, English will lose the last vestiges of its gender system as people switch to using they to refer to people without regard to gender in all circumstances, as they already tend to do when the gender of a person is unknown.

Secondly, English gender assignment corresponds almost exactly to the meaning of the referent. Perhaps the only exception is that ships are often referred to as she, but even this is optional. On the other hand, in French and German there are many things which do not have a gender but are classified as masculine or feminine. It’s not the case that speakers of these languages define gender in a different way from English speakers: French people do not actually think that curtains are male and tables are female, even though they say un rideau, not *une rideau and une table, not *un table. And in German, Gardine ‘curtain’ is feminine and Tisch ‘table’ is masculine, so it seems unlikely that the choice of genders for these objects is based on any inherent association of them with masculinity or femininity given that two neighbouring peoples sharing similar cultures have made the assignments in two completely different ways. The French and German masculine and feminine genders just contain a lot of other things besides males and females. In fact, in German, there is an example which goes the other way. Mädchen ‘girl’ should be feminine given the meaning2, but it is actually neuter: Germans say das Mädchen, rather than *die Mädchen.3

There is, however, an important difference between the assignment of Mädchen to the neuter gender and the assignment of Tisch and Gardine to the masculine and feminine genders respectively. The reason Mädchen is neuter is that it is formed from the word Magd ‘maiden’ by adding the diminutive suffix -chen, and there is a rule in German that says that every word formed by adding the suffix -chen is neuter. Many German suffixes are associated with a particular gender; for example, nouns in -heit, -keit and -schaft are always feminine, and nouns in -lein (which is another diminutive suffix) are always neuter. The associations of these suffixes with particular genders constitute a rule which overrides the rule that every word that refers to males is masculine and every word that refers to females is feminine. So, in German gender assignment is determined by (at least) two rules, which are applied in sequence.

  1. If the noun is formed with the suffixes -heit, -keit or -schaft, assign it to the feminine gender, and if the noun is formed with the suffixes -chen or -lein, assign it to the neuter gender. (Note: there are other suffixes which should be mentioned in this rule, as well, but I’m not intending to precisely describe German gender assignment here, just to show you the general outline of how the system works.)
  2. If the noun refers to males, assign it to the masculine gender. If the noun refers to females, assign it to the feminine gender.
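
As a rough illustration, the two ordered rules above can be sketched in Python. This is only a sketch under some loud assumptions: the function name assign_gender and the referent_sex parameter are made up for the example, the suffix table contains only the five suffixes mentioned above, and checking how a word is spelled at the end is a crude stand-in for checking whether it was actually formed with a suffix (Kuchen ‘cake’ ends in the letters chen without containing the diminutive suffix, so this check would get it wrong).

```python
from typing import Optional

# Rule 1 (formal): suffix -> gender.
# Only the suffixes named in the text; the real list is longer.
SUFFIX_GENDER = {
    "heit": "feminine",
    "keit": "feminine",
    "schaft": "feminine",
    "chen": "neuter",
    "lein": "neuter",
}


def assign_gender(noun: str, referent_sex: Optional[str] = None) -> Optional[str]:
    """Apply the formal rule first, then the semantic rule.

    referent_sex is a hypothetical annotation: "male", "female" or None.
    Returns None when neither rule applies.
    """
    lowered = noun.lower()

    # Rule 1: a matching suffix decides the gender and overrides the meaning.
    # (Assumption: spelling is used as a proxy for morphological structure.)
    for suffix, gender in SUFFIX_GENDER.items():
        if lowered.endswith(suffix):
            return gender

    # Rule 2: otherwise fall back on the meaning of the noun.
    if referent_sex == "male":
        return "masculine"
    if referent_sex == "female":
        return "feminine"

    # Neither rule says anything about this noun.
    return None


print(assign_gender("Mädchen", "female"))  # the formal rule wins
print(assign_gender("Frau", "female"))     # the semantic rule applies
print(assign_gender("Tisch"))              # not covered by either rule
```

Because the formal rule is checked first, Mädchen comes out neuter even though it refers to a female, while a noun like Tisch falls through both rules.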

Note that the first rule is formal in nature (it refers to how the words are formed) while the second rule is semantic in nature (it refers to what the words mean). The existence of formal rules is responsible for another way in which English gender assignment is different from French and German gender assignment. In English, gender is a property of referents, not of the nouns themselves. But in French and German, gender is more a property of the nouns themselves. In these languages, it is possible for the same thing to be referred to by two different nouns of different genders. For example, in French, the word vélo is masculine and the word bicyclette is feminine; this is because bicyclette ends in the feminine diminutive suffix -ette.

These rules are not sufficient to assign every German noun to a gender. Of course, this is not meant to be a complete list. But there is an interesting question here: can a list of rules based on formal and semantic factors account for the gender assignment of every noun in German? Or are there some words whose gender assignment is simply arbitrary?

If you have any knowledge of German, you might find it hard to believe that gender assignment is not mostly arbitrary. People who know French would probably expect gender assignment in that language to be mostly arbitrary as well. However, quite a few studies have been carried out which have shown that gender in French is mostly predictable via phonological rules: that is, rules that take into account the sound of the word. For example, Tucker, Lambert & Rigault (1977) found that 94% of French nouns ending in the sound /ʒ/ (such as ménage ‘housekeeping’) are masculine. There are exceptions: orange ‘orange’ is feminine. But by using rules like this, Tucker, Lambert & Rigault were able to correctly determine the genders of 85% of the nouns in the Petit Larousse, a famous French dictionary. Since they did not take into account semantic (i.e. relating to the meanings of words) and morphological (i.e. relating to the composition of words from prefixes, suffixes, etc.) factors, and the phonological rules they found could probably be made more accurate, it is quite likely that the vast majority of French nouns are predictably gendered. Given this surprising result, it is possible that a lot more of German gender assignment is predictable than you might think. Köpcke & Zubin (1984) were able to find a large amount of regularity using similar techniques, although the rules appear to be more complex than the rules in French. Of course, the more complex the assignment rules are, the less useful they are for prediction, because it is difficult for learners to remember them all and apply them quickly. There is no bright line between gender being hard to predict and gender being completely unpredictable, since if you just have one rule for every word in the language saying “this word is masculine / feminine / neuter”, then that is still a set of rules.
The conclusion I would draw from results like those of Tucker, Lambert & Rigault is that French and German gender is more predictable than you might think, even though it is often not fully predictable in practice.
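
To make the idea of a phonological rule with listed exceptions concrete, here is a minimal Python sketch. It is not Tucker, Lambert & Rigault’s actual procedure: the names FINAL_SOUND_RULES, EXCEPTIONS and predict_gender are invented for the example, the rule table contains only the single /ʒ/ rule mentioned above, and the broad transcriptions passed in are my own illustrative ones.

```python
from typing import Optional

# Hypothetical rule table: final sound -> predicted gender.
# Only the /ʒ/ entry comes from the text; a real table would have many more.
FINAL_SOUND_RULES = {"ʒ": "masculine"}

# Hypothetical exception list: words whose gender the rules get wrong.
EXCEPTIONS = {"orange": "feminine"}


def predict_gender(word: str, transcription: str) -> Optional[str]:
    """Predict a noun's gender from its final sound.

    Listed exceptions are checked first; otherwise the last symbol of the
    transcription is looked up in the rule table. Returns None when no
    rule covers the word.
    """
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    return FINAL_SOUND_RULES.get(transcription[-1])


print(predict_gender("ménage", "menaʒ"))  # covered by the /ʒ/ rule
print(predict_gender("orange", "ɔʁɑ̃ʒ"))   # listed exception
print(predict_gender("table", "tabl"))    # no rule in this tiny table
```

The longer the exception list has to be relative to the rule table, the closer the system gets to the limiting case of one rule per word.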

Few gender systems have been studied as much as the French and German gender systems have, so it is possible that we might find languages that have significantly more unpredictable gender assignment rules. But it would be surprising, since predictable assignment rules are a lot more convenient for learners. I think it’s more likely that in all languages that distinguish different genders, gender assignment is to a large degree predictable.

Another interesting example of a language with apparently unpredictable gender assignment is Ojibwa. In Ojibwa there are two genders which are called the animate and inanimate genders. These genders have nothing to do with sex (the word gender comes from the French word genre, which just means type; it can refer to any kind of distinction between nouns that is reflected in agreement). Nouns that denote people, animals, trees or supernatural beings are always animate. Most other nouns are inanimate. But there is a fairly large group of nouns that seem like they should be in the inanimate gender, but are actually animate. These include ekoːn ‘snow’, enank ‘star’, esseːmaː ‘tobacco’, mentaːmin ‘maize’, meskomin ‘raspberry’ and ekkikk ‘kettle’ (Bloomfield 1957). Now, some of these might be explainable as resulting from differences in which things are considered to be animate. For example, it is very common, cross-linguistically and cross-culturally, for celestial bodies such as stars to be identified with supernatural beings, which are always animate. And given that trees are considered animate in Ojibwa, it’s possible that other plants like tobacco and maize might be considered animate as well. But other examples like ekoːn ‘snow’ being animate are harder to explain. There is no generally-accepted explanation for the composition of the Ojibwa animate gender, but Black-Rogers (1982) has an interesting one. According to Black-Rogers, the Ojibwa lack a clear distinction between natural and supernatural abilities. They believe that even fairly mundane activities like beadwork are only possible because of powers that have been granted to humans via supernatural means. Inanimate objects, in particular, may be sources of power. Different speakers may disagree as to which objects have power, and power may be considered to come from different sources at different times.
Black-Rogers proposed that when an object is considered to be a source of power, speakers start assigning it to the animate gender. She was able to explain many of the problematic animate nouns by this means. For those she wasn’t able to explain, she suggested that objects assigned to the animate gender tend to stay there, so there may be animate nouns which refer to objects that were formerly considered to be sources of power, but no longer are today. So Ojibwa gender assignment is to some extent arbitrary from a synchronic perspective, but from a diachronic perspective it can be completely explained by semantic factors. Black-Rogers’s explanation may or may not be true, but I brought up the example to show you that highly unpredictable gender assignments can also be influenced mainly by semantic factors, rather than by formal factors as in the case of French and German.

Another interesting question about gender is whether there are any languages where gender assignment is determined entirely by formal factors, so that semantic factors are irrelevant. Now, it’s true that in many languages formal rules can almost entirely predict gender assignment. In Hausa, for example, there is a very simple rule: nouns ending in -aa are feminine, and all other nouns are masculine. There are some exceptions to the rule, but they are few in number. However, semantic factors are not irrelevant. If they were, then we would not expect nouns referring to males to be masculine and nouns referring to females to be feminine, because there is no reason why nouns referring to females should end in -aa but other nouns should not. In fact, the vast majority of nouns referring to females end in -aa and are feminine. Historically, the correlation between the feminine gender and the -aa suffix did not exist, or was less strong. What happened was that a suffix -nyàa was used to form nouns denoting females, and this resulted in -aa being associated with nouns denoting females, so that all such nouns ended up having the suffix -aa added to them.

Hausa gender, then, is not determined only by formal factors, and in fact it seems that there are no languages where gender is determined only by formal factors. In general, gender distinctions seem to always be fundamentally based on semantic rules of the form “all nouns with meanings of type A are assigned to gender X” (so the gender X contains all nouns of type A, but not necessarily only nouns of type A). These rules give each gender an initial set of nouns called its “semantic core”. Then semantic associations and formal rules are sufficient to assign the vast majority of remaining nouns to one of the genders, and they may shift nouns within the semantic core of one gender to a different gender as well.

So, I’ve talked a bit about gender assignment and the interaction between formal and semantic factors here, but there are lots more interesting things to talk about with respect to gender in languages, such as: what kind of distinctions tend to be drawn? Male vs. female and animate vs. inanimate are very common; are there any others? What can borrowings tell us about gender assignment? What can we say about nouns which appear to have characteristics of multiple genders (like German Mädchen)? How do gender systems develop and change over time? How are gender systems acquired by language learners? If you’re interested, I recommend Gender (1991) by Greville Corbett. This post is based on the first few chapters of that book.


  1. ^ There is a known phenomenon where English speakers sometimes use he or she to refer to things that would normally be referred to as it; for example a teenage boy told a surfer, referring to a wave: “Catch her at her height!” (Corbett 1991). But this occurs only in particular circumstances; the usual pronoun used to refer to these things is clearly it.
  2. ^ It is common, cross-linguistically and cross-culturally, for children to be non-gendered, and indeed Kind ‘child’ is neuter; however, Junge ‘boy’ and its older synonym Knabe are both masculine, so we would expect Mädchen to be feminine in parallel.
  3. ^ However, in colloquial German pronouns often agree with Mädchen as if it was feminine: Kennst du das Mädchen? Nein, ich kenne sie nicht, not … kenne es nicht.


/a/ in foreign language pronunciation teaching

Pronunciation guides for English speakers learning another language often instruct you to pronounce that language’s ‘a’ sound like the ‘a’ in ‘father’. Now, the problem with pronunciation instructions like this is that in English especially, people have different dialects and will pronounce vowels differently. There are lots of examples I could give to show how this complicates things, but I’ll just focus on the case of ‘father’.

Sounds conventionally transcribed with ‘a’ in the Latin alphabet are usually open vowels, but may differ in their frontness or backness. The IPA provides three symbols for ‘a’ sounds: æ, a and ɑ (plus ɐ, which is not completely open, so I’m not considering it in this discussion).

æ is prototypically the symbol for a front vowel, with some slight closing. The symbol reflects this, in that it represents the ‘a’ sound closest to an e. ɑ is the back counterpart, although it doesn’t imply any closing. It gets a bit more confusing with a, which doesn’t really have any specific prototype. It can be used to refer to any open vowel in the middle.

Within the range of sounds covered by a, there are two useful points. One is like æ, but without the slight closing. The other is the most common vowel sound across the world’s languages: a central, open vowel. To disambiguate, the centralisation diacritic is sometimes used for the second sound, giving ä. But this level of precision is rarely used, and when you see that a language has a phoneme pronounced [a] you can’t be certain whether this is a front open vowel or a central open vowel.

Now, in English, we have the full range of different ‘a’ sounds. Quite universally in American and Australian English, and traditionally in British English, an ‘a’ as in ‘cat’ is pronounced [æ]. And this is the symbol used to transcribe the phoneme when talking about all English dialects.

Most English dialects have another ‘a’-like sound: the one found in ‘father’. This is always, as far as I know, further back than the ‘a’ in ‘cat’. In most North American English, this is merged with the ‘o’ sound of ‘cot’. Both are transcribed with [ɑ].

So, from a North American point of view, the instruction to pronounce ‘a’ in, say, Japanese (which has the usual central [ä]) as the ‘a’ in ‘father’ is pretty sensible, if not perfect–[ɑ] is closer to the centre than [æ] is. Especially since in many American dialects, these two phonemes are undergoing a shift where [æ] gets even closer and fronter to approach [e], and [ɑ] moves to the front and becomes pronounced more like [ä]–for these speakers ‘father’ is the perfect example.

But for non-North American dialects, it’s not perfect. For example, I, being from England, pronounce ‘father’ as [fɑːðə]–with a long [ɑ]. If someone told me to pronounce the ‘a’ in Japanese like I do in ‘father’, I might end up pronouncing ‘katakana’ as [kɑːtɑːkɑːnɑː]. Which would sound comically wrong, and take much longer to say than the actual pronunciation of [kätäkänä] (I assume so anyway, I don’t know the details of Japanese phonology).

For people like me who have an [ɑ] that’s always long, which includes most speakers from England, Australia and New Zealand, [æ] would probably be a closer approximation to the [ä] sound. Although at some point, we’d just have to accept that this is a sound with no proper equivalent in most varieties of English.

Note ‘most’. Because it actually gets worse–in Northern England, Wales, Ireland and Scotland*, the ‘a’ in ‘cat’ is pronounced as [a], with varying degrees of backness–for all speakers it has none of the slight closing of [æ], and for some it’s a fully central [ä]. Plus, a full [æ] pronunciation in England is now either old-fashioned or vernacular–even in southern England, many people use the northern [a] sound. So for these speakers, there’s actually a really good approximation of [ä] in their native accent, but you’re telling them to use a different phoneme that’s much less like it!

* Although for many speakers in Scotland, the ‘a’ in ‘cat’ and ‘father’ are not distinguished, so telling them to use ‘father’ works well enough but you could just as well tell them to use the vowel in ‘cat’.

I guess the lesson to learn is: don’t rely on approximations of foreign phonemes in terms of your native language. Listen to the language’s speakers, and use the sound they use.