Tag Archives: semantics

Truth-uncertainty and meaning-uncertainty

Epistemic status: just a half-baked idea, which ought to be developed into something more complete, but since I’m probably not going to do that anytime soon I figured I’d publish it now just to get it out there.

Consider a statement such as (1) below.

(1) Cats are animals.

I’m used to interpreting statements such as (1) using a certain method which I’m going to call the “truth-functional method”. Its key characteristic is, as suggested by the name, that statements are supposed to be interpreted as truth functions, so that a hypothetical being which knew everything (had perfect information) would be able to assign a truth value—true or false—to every statement. There are two problems which prevent truth values being assigned straightforwardly to statements in practice.

The first is that nobody has perfect information. There is always some uncertainty of the sort which I’m going to call “truth-uncertainty”. Therefore, it’s often (or maybe even always) impossible to determine a statement’s truth value exactly. All one can do is have a “degree of belief” in the statement, though this degree of belief may be meaningfully said to be “close to truth” or “close to falsth”[1] or equally far from both. People disagree about how exactly degrees of belief should be thought about, but there’s a very influential school of thought (the Bayesian school of thought) which holds that degrees of belief are best thought about as probabilities, obeying the laws of probability theory. So, for a given statement and a given amount of available information, the goal for somebody practising the truth-functional method is to assign a degree of belief to the statement. At least inside the Bayesian school, there has been a lot of thought about how this process should work, so that truth-uncertainty is the relatively well-understood sort of uncertainty.

But there’s a second problem, which is that often (maybe even always) it’s unclear exactly what the statement means. To be more exact (the preceding sentence was an exemplification of itself), when you hear a statement, it’s often unclear exactly which truth function the statement is supposed to be interpreted as; and depending on which truth function it’s interpreted as, the degree of belief you assign to it will be different. This is the problem of meaning-uncertainty, and it seems to be rather less well-understood. Indeed, it’s probably not conventional to think about it as an uncertainty problem at all in the same way as truth-uncertainty. In the aforementioned scenario where you hear the statement carrying the meaning-uncertainty being made by somebody else, the typical response is to ask the statement-maker to clarify exactly what they mean (to operationalize, to use the technical term). There is of course an implicit assumption here that the statement-maker will always have a unique truth function in their mind when they make their statement; meaning-uncertainty is a problem that exists only on the receiving end, due to imperfect linguistic encoding. If the statement-maker doesn’t have a unique truth function in mind, and they don’t care to invent one, then their statement is taken as content-free, and not engaged with.

I wonder if this is the right approach. My experience is that meaning-uncertainty exists not only on the receiving end, but also very much on the sending end too; I very often find myself saying things but not knowing quite what I would mean by them, but nevertheless feeling that they ought to be said, that making these statements does somehow contribute to the truth-seeking process. Now I could just be motivatedly deluded about the value of my utterances, but let’s run with the thought. One thing that makes me particularly inclined towards this stance is that sometimes I find myself resisting operationalizing my statements, like there’s something crucial being lost when I operationalize and restrict myself to just one truth function. If you draw the analogy with truth-uncertainty, operationalization is like just saying whether a statement is true or false, rather than giving the degree of belief. Now one of the great virtues of the Bayesian school of thought (although it would be shared by any similarly well-developed school of thought on what degrees of belief are exactly) is arguably that, by making it clearer exactly what degrees of belief are, it seems to make people a lot more comfortable with thinking about degrees of belief rather than just true vs. false, and thus dealing with truth-uncertainty. Perhaps, then, what’s needed is some sort of well-developed concept of “meaning distributions”, analogous to degrees of belief, that will allow everybody to get comfortable dealing with meaning-uncertainty. Or perhaps this analogy is a bad one; that’s a possibility.
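To make the analogy a little more concrete, here is a toy sketch in Python of what a “meaning distribution” might look like. Everything in it is an assumption made for illustration: the three candidate readings of statement (1), the weights, the degrees of belief, and above all the choice to combine them by a weighted average, which is just one possible way of doing it and not a worked-out theory.

```python
# Hypothetical candidate readings of "Cats are animals". Each maps to
# (weight that this reading is the intended one, degree of belief if it is).
# All readings and numbers are invented for illustration.
readings = {
    "every cat is a member of Animalia": (0.6, 0.99),
    "cats are typically animate, living things": (0.3, 0.95),
    "cats are beasts, as opposed to people": (0.1, 0.70),
}


def overall_belief(readings):
    """Average the per-reading degrees of belief, weighted by the meaning distribution."""
    total_weight = sum(weight for weight, _ in readings.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("meaning weights should sum to 1")
    return sum(weight * belief for weight, belief in readings.values())


def operationalize(readings):
    """Collapse the meaning distribution to its single most heavily weighted reading."""
    return max(readings, key=lambda reading: readings[reading][0])


print(round(overall_belief(readings), 3))  # 0.949
print(operationalize(readings))            # every cat is a member of Animalia
```

On this picture, operationalizing throws away the two less likely readings, much as saying just “true” throws away the difference between a degree of belief of 0.99 and one of 0.7.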

Aside 1. Just as truth-uncertainty almost always exists to some degree, I’m fairly sure meaning-uncertainty almost always exists to some degree; operationalization is never entirely complete. There’s a lot of meaning-uncertainty in statement (1), for example, and it doesn’t seem to completely go away no matter how much you operationalize.

Aside 2. The concept of meaning-uncertainty doesn’t seem to be as necessarily tied up with the truth-functional model to me as that of truth-uncertainty; one can imagine statements being modelled as some other sort of thing, but you’d still have to deal with exactly which example of the other sort of thing any given statement was, so there’d still be meaning-uncertainty of a sort. For example, even if you don’t see ought-statements as truth-functional, as opposed to is-statements, you can still talk about the meaning-uncertainty of an ought-statement, if not its truth-uncertainty.

Aside 3. Another way of dealing with meaning-uncertainty might be to go around the problem, and interpret statements using something other than the truth-functional method.


[1] I’m inventing this word by analogy with “truth” because I get fed up with always having to decide whether to use “falsehood” or “falsity”.

Animacy and the meanings of ‘in front of’ and ‘behind’

The English prepositions ‘in front of’ and ‘behind’ behave differently in an interesting way depending on whether they have animate or inanimate objects.

To illustrate, suppose there are two people—let’s call them John and Mary—who are standing colinear with a ball. Three parts of the line can be distinguished: the segment between John’s and Mary’s positions (let’s call it the middle segment), the ray with John at its endpoint (let’s call it John’s ray), and the ray with Mary at its endpoint (let’s call it Mary’s ray). Note that John may be in front of or behind his ray, or at the side of it, depending on which way he faces; likewise with Mary, although, let’s assume that Mary is either in front of or behind her ray. What determines whether John describes the position of the ball, relative to Mary, as “in front of Mary” or “behind Mary”? First, note that it doesn’t matter which way John is facing. The relevant parameters are the way Mary is facing, and whether the ball is on the middle segment or Mary’s ray. So there are four different situations to consider:

  1. The ball is on the middle segment, and Mary is facing the middle segment. In this case, John can say, “Mary, the ball is in front of you.” But if he said, “Mary, the ball is behind you,” that statement would be false.
  2. The ball is on the middle segment, and Mary is facing her ray. In this case, John can say, “Mary, the ball is behind you.” But if he said, “Mary, the ball is in front of you,” that statement would be false.
  3. The ball is on Mary’s ray, and Mary is facing her ray. In this case, John can say, “Mary, the ball is in front of you.” But if he said, “Mary, the ball is behind you,” that statement would be false.
  4. The ball is on Mary’s ray, and Mary is facing the middle segment. In this case, John can say, “Mary, the ball is behind you.” But if he said, “Mary, the ball is in front of you,” that statement would be false.

So, the relevant variable is whether the ball’s position, and the position towards which Mary is facing, match up: if Mary faces the part of the line the ball is on, it’s in front of her, and if Mary faces away from the part of the line the ball is on, it’s behind her.
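The rule for animate objects can be sketched as a small Python function. The one-dimensional coordinates are my own modelling assumption (John at 0, Mary at 1, so the middle segment is the interval from 0 to 1 and Mary’s ray is everything past 1), and the function name is hypothetical.

```python
def describe_animate(ball: float, obj: float, obj_facing: int) -> str:
    """Pick the preposition for an animate object standing on a line.

    ball, obj: positions on the line.
    obj_facing: +1 if the object faces the increasing direction, -1 otherwise.
    """
    if obj_facing not in (1, -1):
        raise ValueError("obj_facing must be +1 or -1")
    if ball == obj:
        raise ValueError("ball and object are at the same position")
    ball_side = 1 if ball > obj else -1
    # The ball is in front exactly when it lies on the side the object faces.
    return "in front of" if ball_side == obj_facing else "behind"


MARY = 1.0  # John is assumed to stand at 0.0

four_situations = [
    (0.5, -1),  # 1. ball on the middle segment, Mary facing the middle segment
    (0.5, +1),  # 2. ball on the middle segment, Mary facing her ray
    (2.0, +1),  # 3. ball on Mary's ray, Mary facing her ray
    (2.0, -1),  # 4. ball on Mary's ray, Mary facing the middle segment
]
for ball, facing in four_situations:
    print(describe_animate(ball, MARY, facing))
# in front of
# behind
# in front of
# behind
```

Note that John’s position and orientation do not appear among the parameters at all, which is the point of the first observation above.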

This all probably seems very obvious and trivial. But consider what happens if we replace Mary with a lamppost. A lamppost doesn’t have a face; it doesn’t even have clearly distinct front and back sides. So one of the parameters here—the way Mary is facing—has disappeared. But one has also been added—because now the way that John is facing is relevant. So there are still four situations:

  1. The ball is on the middle segment, and John is facing the middle segment. In this case, John can say, “The ball is in front of the lamppost.”
  2. The ball is on the middle segment, and John is facing his ray. In this case, I don’t think it really makes sense for John to say either, “The ball is in front of the lamppost,” or, “The ball is behind the lamppost,” unless he is implicitly taking the perspective of some other person who is facing the middle segment. The most he can say is, “The ball is between me and the lamppost.”
  3. The ball is on Mary’s (or rather, the lamppost’s) ray, and John is facing the middle segment. In this case, John can say, “The ball is behind the lamppost.”
  4. The ball is on Mary’s (or rather, the lamppost’s) ray, and John is facing his ray. In this case, I don’t think it really makes sense for John to say either, “The ball is in front of the lamppost,” or, “The ball is behind the lamppost,” unless he is implicitly taking the perspective of some other person who is facing the middle segment. The most he can say is, “The ball is behind me, and past the lamppost.”
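The four lamppost situations can be sketched the same way. Again the coordinates are an assumption (John at 0, the lamppost at 1) and the function name is hypothetical. The function returns None where neither preposition seems to apply, and also for ball positions outside the four situations listed, which this sketch does not try to cover.

```python
from typing import Optional


def describe_inanimate(ball: float, obj: float, speaker: float,
                       speaker_facing: int) -> Optional[str]:
    """Pick the preposition for an inanimate object, or None if neither applies.

    speaker_facing: +1 if the speaker faces the increasing direction, -1 otherwise.
    """
    toward_obj = 1 if obj > speaker else -1
    if speaker_facing != toward_obj:
        return None  # speaker is not facing the object (situations 2 and 4)
    if min(speaker, obj) < ball < max(speaker, obj):
        return "in front of"  # ball lies between speaker and object
    if (ball - obj) * toward_obj > 0:
        return "behind"  # ball lies past the object, as seen by the speaker
    return None  # any other position is not covered by the four situations


JOHN, LAMPPOST = 0.0, 1.0
print(describe_inanimate(0.5, LAMPPOST, JOHN, +1))  # in front of
print(describe_inanimate(0.5, LAMPPOST, JOHN, -1))  # None
print(describe_inanimate(2.0, LAMPPOST, JOHN, +1))  # behind
print(describe_inanimate(2.0, LAMPPOST, JOHN, -1))  # None
```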

A preliminary hypothesis: it seems that the prepositions ‘in front of’ and ‘behind’ can only be understood with reference to the perspective of a (preferably) animate being who has a face and a back, located on opposite sides of their body. If the object is animate, then this being is the object. The preposition ‘in front of’ means ‘on the ray extending from [the object]’s face’. The preposition ‘behind’ means ‘on the ray extending from [the object]’s back’. But if the object is inanimate, then … well, it seems to me that there are two analyses you could make:

  • The definitions just become completely different. The prepositions ‘in front of’ and ‘behind’ now presuppose that the object is on the ray extending from the speaker’s face. If the subject (the referent of the noun to which the prepositional phrase is attached, e.g. the ball above) is between the speaker and the object, it’s in front of the object. Otherwise (given the presupposition), it’s behind the object.
  • If the speaker is facing the object, the speaker imagines that the object has a face and a back and is looking back at the speaker. Then the regular definitions apply, so ‘in front of’ means ‘on the ray extending from [the object]’s face, i.e. on the ray extending from [the speaker]’s back or on the middle segment’, and ‘behind’ means ‘on the ray extending from [the object]’s back, i.e. on the ray extending from [the speaker]’s face but not on the middle segment’. On the other hand, if the speaker isn’t facing the object, then (for some reason) they fail to imagine the object as having a face and a back.

The first analysis feels more intuitively correct to me, when I think about what ‘in front of’ and ‘behind’ mean with inanimate objects. But the second analysis makes the same predictions, does not require the postulation of separate definitions in the animate-object and inanimate-object cases and goes some way towards explaining the presupposition that the object is on the ray extending from the speaker’s face (though it does not explain it completely, because it is still puzzling to me why the speaker imagines in particular that the object is facing the speaker, and why no such imagination takes place when the speaker does not face the object). Perhaps it should be preferred, then, although I definitely don’t intuitively feel like phrases like ‘in front of the lamppost’ are metaphors involving an imagination of the lamppost as having a face and a back.
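As a quick check that the two analyses really do agree on the four lamppost situations, here is a sketch that codes each one separately and compares them. The coordinates (John at 0, the lamppost at 1) and the function names are assumptions of mine, and the comparison is deliberately limited to those four situations; it says nothing about ball positions the discussion above does not consider.

```python
from typing import Optional


def analysis_1(ball: float, obj: float, speaker: float,
               speaker_facing: int) -> Optional[str]:
    """Separate definitions: presuppose the speaker faces the object."""
    toward_obj = 1 if obj > speaker else -1
    if speaker_facing != toward_obj:
        return None  # presupposition fails
    between = min(speaker, obj) < ball < max(speaker, obj)
    return "in front of" if between else "behind"


def analysis_2(ball: float, obj: float, speaker: float,
               speaker_facing: int) -> Optional[str]:
    """Imagined face: the object is pictured as looking back at the speaker."""
    toward_obj = 1 if obj > speaker else -1
    if speaker_facing != toward_obj:
        return None  # the speaker fails to imagine a face and a back
    imagined_facing = -toward_obj
    ball_side = 1 if ball > obj else -1
    # From here on this is just the rule for animate objects.
    return "in front of" if ball_side == imagined_facing else "behind"


JOHN, LAMPPOST = 0.0, 1.0
four_situations = [(0.5, +1), (0.5, -1), (2.0, +1), (2.0, -1)]
agree = all(
    analysis_1(ball, LAMPPOST, JOHN, facing) == analysis_2(ball, LAMPPOST, JOHN, facing)
    for ball, facing in four_situations
)
print(agree)  # True
```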

Now, I’ve been talking above like all animate objects have a face and a back and all inanimate objects don’t, but this isn’t quite the case. Although the prototypical members of the categories certainly correlate in this respect, there are inanimate objects like cars, which can be imagined as having a face and a back, and certainly at least have distinct front and back sides. (It’s harder to think of examples of animates that don’t have a front and a back. Jellyfish, perhaps—but if a jellyfish is swimming towards you, you’d probably implicitly imagine its front as being the side closer to you. Given that animates are by definition capable of movement, perhaps animates necessarily have fronts and backs in this sense.)

With respect to these inanimate objects, I think they can be regarded both as animates/faced-and-backed beings or inanimates/unfaced-and-unbacked beings, with free variation as to whether they are so regarded. I can imagine John saying, “The ball is in front of the car,” if John is facing the boot of the car and the ball is in between him and the boot. But I can also imagine him saying, “The ball is behind the car.” He’d really have to say something more specific to make it clear where the ball is. This is much like how non-human animates are sometimes referred to as “he” or “she” and sometimes referred to as “it”.

The reason I started thinking about all this was that I read a passage in Claude Hagège’s 2010 book, Adpositions. Hagège gives the following three example sentences in Hausa:

(1) ƙwallo ya‐na gaba-n Audu
ball 3SG.PRS.S‐be in.front.of-3SG.O Audu
‘the ball is in front of Audu’

(2) ƙwallo ya‐na baya-n Audu
ball 3SG.PRS.S‐be behind-3SG.O Audu
‘the ball is behind Audu’

(3) ƙwallo ya‐na baya-n telefo
ball 3SG.PRS.S‐be behind-3SG.O telephone
‘the ball is in front of the telephone’ (lit. ‘the ball is behind the telephone’)

He then writes (I’ve adjusted the numbers of the examples; emphasis original):

If the ball is in front of someone whom ego is facing, as well as if the ball is behind someone and ego is also behind this person and the ball, Hausa and English both use an Adp [adposition] with the same meaning, respectively “in front of” in (1), and “behind” in (2). On the contrary, if the ball is in front of a telephone whose form is such that one can attribute this set a posterior face, which faces ego, and an anterior face, oriented in the opposite direction, the ball being between ego and the telephone, then English no longer uses the intrinsic axis from front to back, and ignores the fact that the telephone has an anterior and a posterior face: it treats it as a human individual, in front of which the ball is, whatever the face presented to the ball by the telephone, hence (3). As opposed to that, Hausa keeps to the intrinsic axis, in conformity to the more or less animist conception, found in many African cultures and mythologies, which views objects as spatial entities possessing their own structure. We thus have, here, a case of animism in grammar.

I don’t entirely agree with Hagège’s description here. I think a telephone is part of the ambiguous category of inanimate objects that have clearly distinct fronts and backs, and which can therefore be treated either way with respect to ‘in front of’ and ‘behind’. It might be true that Hausa speakers show a much greater (or a universal) inclination to treat inanimate objects like this in the manner of animates, but I’m not convinced from the wording here that Hagège has taken into account the fact that there might be variation on this point within both languages. And even if there is a difference, I would caution against assuming it has any correlation with religious differences (though it’s certainly a possibility which should be investigated!).

But it’s an interesting potential cross-linguistic difference in adpositional semantics. And regardless, I’m glad to have read the passage because it’s made me aware of this interesting complexity in the meanings of ‘in front of’ and ‘behind’, which I had never noticed before.

Words for men and women in Indo-European languages

There were quite a few words meaning ‘man’ in Old English (OE). However, mann, the ancestor of the modern English word man, wasn’t one of them. In the Bosworth-Toller Anglo-Saxon Dictionary the definition of mann is given as ‘human being of either sex’. It only started to be used to refer to male human beings in particular in late OE, from c. 1000 AD. The old sense survives in modern English, but it is no longer the primary one and it has become less common over time. The use of gender-neutral man is still fairly common in compounds like mankind, manmade and manslaughter. In fact, the word woman itself is descended from a compound in which man was used in the gender-neutral sense. One of the two main words for ‘woman’ in OE (along with cwēn, the ancestor of modern English queen) was wīf, the ancestor of modern English wife. The word was used in the sense of ‘wife’ already in OE, but its primary sense was ‘woman’ in OE, and this sense has survived in the compounds midwife and fishwife. Perhaps due to the increasing dominance of the sense of ‘wife’, the compound wīfmann (‘woman-person’) started to be used more often for ‘woman’ until the ‘woman’ sense of wife became extinct.

OE mann is a descendant of the reconstructed Proto-Germanic (PGmc) word *mann- (of uncertain ending). This appears as man in Old Frisian, Old Saxon, Old Dutch and Old High German, maðr in Old Norse and manna in Gothic. As with the OE word, these words originally meant ‘human being’ but later shifted to mean ‘man’ specifically; the ‘human being’ sense survives as a secondary one in Icelandic and Faroese, but on the continent it has been completely replaced by derived words such as German Mensch. (Mensch is a descendant of Old High German mennisko. From mann an adjective was formed by adding the umlaut-inducing suffix -isk (cognate to English -ish), then this adjectivisation was undone again by adding a nominal ending -o, which would have made the word completely redundant if the meaning of the original noun man had not been changed.) PGmc *mann-, in turn, is probably the descendant of the Proto-Indo-European (PIE) word *mánus, which is also the ancestor of Proto-Slavic *mǫ̑žь ‘man, husband’ (> Russian muž ‘husband’) and Sanskrit mánuḥ ‘human being’. Different explanations have been proposed for the double *-nn- in the PGmc word; Ringe’s (2006) is that the PIE word had an oblique stem *mánw-, PIE *-nw- regularly became *-nn- in PGmc, and the form of the oblique stem was generalised. In the Hindu religion, Manu is the name of the progenitors of humanity, and Tacitus mentions in his Germania that ‘[the Germanic peoples] celebrate the god Tuisto, sprung from the earth, and his son Mannus, as the fathers and founders of their race’, which seems to me to strongly suggest that *mann- and mánuḥ share a common ancestor.

As for OE wīf, it is a descendant of PGmc *wībą, which appears as wīf in Old Frisian, Old Saxon and Old Dutch, wīb in Old High German and víf in Old Norse. In the continental Germanic languages the word has been replaced as the word for ‘woman, wife’ by descendants of PGmc *frawjǭ ‘lady’, such as Dutch vrouwe and German Frau. In Dutch and German wijf and Weib remain as words but have acquired a pejorative connotation because of the contrast with vrouwe and Frau; using the original word would imply that the woman is of low birth. The same kind of dynamic is responsible for the phenomenon in English where in public addresses (e.g. on bathroom doors) the words ladies and gentlemen are used in place of women and men. In Icelandic (and Faroese? I don’t have a good source for Faroese) the word survives, but is old-fashioned and restricted to poetic use; the usual word for ‘woman’ is kona. This word is a cognate of English queen; it is a descendant of PGmc *kwēniz via Old Norse kván. In Gothic, *kwēniz appears as qēns ‘wife’, but there seems to be no trace of this word in the continental West Germanic languages, and kván has died out in the continental North Germanic languages as well. In English, of course, the meaning of the word was specialised to mean a royal wife in particular, although the word can also be used to refer to a gay man and this might be a survival of the old sense of ‘woman’. PGmc *kwēniz is, in turn, a descendant of PIE *gʷḗn ‘woman’. This word is very widely attested in the Indo-European languages: it appears as Proto-Slavic *žena (> Russian žená), Old Irish , Ancient Greek gynḗ, Armenian kin, Sanskrit jániḥ ‘wife’ and Tocharian B śana (although no cognate survives in Latin). Ancient Greek gynḗ in particular appears in a few Greek-derived English words such as gynaecology, polygyny and misogyny. What about *wībą? It’s uncertain whether this word is a descendant of a PIE word (it might have been borrowed from some long-lost language in PGmc specifically; it might even be specific to Northwest Germanic since it does not appear in Gothic). A link has been proposed between it and Proto-Tocharian *kwäipe ‘feel shame’ (> Tocharian A kip, kwīp) via a change of meaning along the lines of ‘woman’ > ‘female genitalia’ > ‘shame’, but I think this change is too far-fetched. Although the fact that *wībą was neuter, rather than feminine, is suggestive.

So what was the Old English word for ‘man’? The main one was wer. It started to die out in English in the late 13th century, but it survives in the compound werewolf (‘man-wolf’). The Proto-Germanic form of the word was *weraz, and it appears in Old Frisian, Old Saxon and Old High German as wer, Old Norse as verr and Gothic as waír, with the meaning ‘man’ in each case. However, the word has died out in all of the modern Germanic languages, except in Icelandic (and Faroese?) where it survives, not as the usual word for ‘man’, but as the poetic word ver. The word is also widely attested in Indo-European as a whole; its Proto-Indo-European form was *wiHrós, which appears as výras in Lithuanian, fear in Irish, gŵr ‘husband’ in Welsh, vir in Latin and vīrá in Sanskrit. A few English words, such as virile and virtue, are derived from the Latin form of the word.

The word vir didn’t survive in the Romance languages, either; it has been replaced by descendants of Latin homō ‘human being’. It’s interesting how this change parallels exactly the change in the Germanic languages, where *mann-, another word meaning ‘human being’, replaced *weraz as the word for ‘man’. The word homō can be seen in derived English words like human and hominid which are of Latin origin. However, Old English also had a direct cognate of homō: guma. In Old English, this word referred to male humans, specifically, so it was a synonym of wer; however, it was more of a poetic word, whereas wer was the everyday word for ‘man’. Both words are descendants of a derivative *dʰǵʰm̥mō of the PIE *dʰéǵʰōm ‘earth’ (in Germanic and Latin, the initial *dʰ was regularly lost, and *ǵʰ regularly became h in Latin) which meant ‘something from the earth’. The word guma has survived into modern English only via the Old English compound brȳdguma (‘bride-man’). This compound of course became modern English bridegroom (often shortened to groom), and its meaning has not changed. However, the insertion of the -r- in groom is an irregular development. What seems to have happened is that the word groom came into Middle English (from an unknown source) c. 1200 with the meaning ‘youth’. This was then confused with the -goom element in bridegoom and so the modern form of the word arose. As with wer, similar developments have occurred in all Germanic languages. The r-insertion is unique to English, but all of the other Germanic languages have lost their cognates of guma but retain it in a compound cognate to English bridegroom (e.g. German Bräutigam).

As well as *wiHrós, there is another widespread Indo-European word for ‘man’, which had the PIE form *h₂nḗr. This appears as njeri ‘human being’ in Albanian, Nerō (a personal name) in Latin, anḗr in Ancient Greek, and nára (this one also has a secondary sense of ‘human being’) in Sanskrit, and it also appears in the derivatives neart in Irish and Welsh nerth, both meaning ‘strength’. The Greek word anḗr had the oblique stem andr-, and this appears in many English words such as androgyny, polyandry, android and androgen, as well as in the personal name Andrew. It is tempting to link the Greek word for ‘human being’, ánthrōpos, to *h₂nḗr as well, but the presence of -th- rather than -d- in the word is unexplainable if this is the case. The real etymology of ánthrōpos is unknown. Given that the sense of ‘human being’ is attested in Sanskrit and Albanian for *h₂nḗr, it is possible that this was the original sense in PIE, too. Either way, it would have had a synonym in either *wiHrós or *mánus. This hypothesis has the advantage of not requiring a shift from the more specific sense of ‘man’ to the more general sense of ‘human being’; shifts in meaning more often increase specificity rather than generality.

Clearly the senses of ‘man’ and ‘human being’ are quite prone to confusion. I don’t know of any cases where a word has shifted directly in meaning from ‘human being’ to ‘woman’, or the other way around. I’d be interested to hear of examples if anybody has any. The similar shift ‘young human being’ to ‘woman’ seems like it could definitely be possible, though. The English word girl (which is of unknown origin, first appearing c. 1300) originally meant ‘child’; it was gender-neutral. Over time, it has come to refer specifically to female children. Since the 1500s it has been used to refer to young women as well, and since the 1800s it has sometimes been used to refer to all women, even elderly ones, although this usage has never become standard. So this word which originally meant ‘child’ may in the future have shifted its meaning to ‘woman’. A shift from ‘human being’ to ‘woman’ might be possible via this route, but it would require an initial shift of ‘human being’ to ‘child’. I don’t know whether such a shift is possible; I was going to say it was unlikely, but semantic shifts can happen in all sorts of weird ways, so I don’t really have any idea.

(note: a lot of this post is based on information gathered from Wiktionary and the Online Etymology Dictionary which are not entirely reliable sources. I tried to look up every word cited here in a dictionary specific to the language the word belonged to, to make sure I didn’t end up citing words with the wrong meaning, or citing words that didn’t actually exist. However, it’s hard to find freely available online English-language dictionaries for some of the more obscure languages like Faroese, so I wasn’t able to do this for every word; and given that this post ended up involving a lot of words from a lot of languages it’s quite possible that some errors in detail are present. The PGmc and PIE words cited have been checked via Ringe (2006), From Proto-Indo-European to Proto-Germanic.)

Formal semantic analysis of natural language quantifiers

Natural language quantifiers are an interesting subset of words in that it is possible to define them formally using set theory, by taking them to be binary relations between sets. For example, here are the formal definitions of some English quantifiers.

  • “every” is the binary relation \forall between sets such that for every pair of sets A and B, \forall(A, B) if and only if A \subseteq B. For example, the sentence “every man is in the room” is true if and only if the set of all men is a subset of the set of everything in the room.
  • “some” (in the sense of “at least one”) is the binary relation \exists between sets such that for every pair of sets A and B, \exists(A, B) if and only if A \cap B is non-empty. For example, the sentence “some (at least one) man is in the room” is true if and only if the set of all men and the set of everything in the room have at least one member in common.
  • More generally, each natural number n (as an English word, in the sense of “at least n”) is the binary relation \exists n between sets such that for every pair of sets A and B, \exists n(A, B) if and only if |A \cap B| \ge n. (We are assuming here that the number is not interpreted to be exhaustive, so that the statement “two men are in the room” would still be seen as true in the case where three men are in the room.) For example, the sentence “two men are in the room” is true if and only if the set of all men and the set of everything in the room have at least two members in common.
  • “most” (in the sense of “more often than not”) is the binary relation M between sets such that for every pair of sets A and B, M(A, B) if and only if |A \cap B| \ge |A \setminus B|.
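Since these definitions are just relations between sets, they translate almost word for word into code. Here is a minimal sketch using Python’s built-in sets; the function names and the example sets are hypothetical. Note that “most” is coded with \ge exactly as defined above, so it comes out true when exactly half of A is in B.

```python
def every(a: set, b: set) -> bool:
    """every(A, B) holds iff A is a subset of B."""
    return a <= b


def some(a: set, b: set) -> bool:
    """some(A, B) holds iff A and B share at least one member."""
    return len(a & b) > 0


def at_least(n: int):
    """Build the relation for the numeral n, read as 'at least n'."""
    def relation(a: set, b: set) -> bool:
        return len(a & b) >= n
    return relation


def most(a: set, b: set) -> bool:
    """most(A, B) as defined above: |A ∩ B| >= |A \\ B|."""
    return len(a & b) >= len(a - b)


# Invented example: three men, two of whom are in the room.
men = {"Al", "Bo", "Cy"}
in_room = {"Al", "Bo", "lamp"}

print(every(men, in_room))        # False
print(some(men, in_room))         # True
print(at_least(2)(men, in_room))  # True
print(at_least(3)(men, in_room))  # False
print(most(men, in_room))         # True
```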

Admittedly, some natural language quantifiers, like “few” and “many”, cannot be satisfactorily defined in this way. But quite a lot of them can be, and I’m going to just focus on those that can be in the rest of the post. From now on you can take the term “natural language quantifiers” to refer specifically to those natural language quantifiers that can be given a formal definition as a binary relation between sets.

Now, once we have taken this approach to natural language quantifiers, an interesting question arises: which binary relations between sets correspond to natural language quantifiers? Clearly, no individual language could have natural language quantifiers corresponding to every single binary relation between sets, because there are infinitely many such binary relations, and only finitely many words in a given language. In fact, we can be quite sure that the vast majority of binary relations between sets will never correspond to natural language quantifiers in any language, because most of them are simply too obscure. Consider, for example, the binary relation R between sets such that for every pair of sets A and B, R(A, B) if and only if A is the set of all men and B is the set of all women. If this corresponded to an English quantifier, which might be pronounced, say, “blort”, then the sentence “blort men are women” would be true, and every other sentence of the form “blort Xs are Ys” would be false. I don’t know about you, but I can’t think of any circumstances under which such a word would be of any use in communication whatsoever.
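For what it’s worth, “blort” is perfectly easy to write down as a relation between sets, which shows that being definable in this way is not enough to make something a plausible quantifier. In this sketch the two small sets standing in for “all men” and “all women” are invented.

```python
# Invented stand-ins for the set of all men and the set of all women.
MEN = frozenset({"Al", "Bo"})
WOMEN = frozenset({"Di", "Eve"})


def blort(a, b) -> bool:
    """blort(A, B) holds iff A is the set of all men and B is the set of all women."""
    return a == MEN and b == WOMEN


print(blort(MEN, WOMEN))         # True:  "blort men are women"
print(blort(WOMEN, MEN))         # False: "blort women are men"
print(blort(MEN, {"Cy", "Dee"})) # False, although this set is the same size as WOMEN
```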

Another problem with our supposed quantifier “blort” is that it can’t reasonably be called a quantifier, because its definition has absolutely nothing to do with quantities! You probably know what I mean here, but it’s worth trying to spell out exactly what it is (after all, the whole point of formal analysis of any subject is that trying to spell out exactly what you mean often leads to interesting new insights). It seems that the problem is to do with the objects and properties that are referred to in the definition of “blort”. Our definition of “blort” refers to the identities of the two arguments A and B—it includes the phrases “if A is the set of all men” and “if B is the set of all women”. But the definition of a quantifier should refer to quantities only, not identities. From the point of view of set theory, “quantity” is just another word for “cardinality”, which means the number of members a given set contains. So perhaps we should say that the definition of a natural language quantifier can only refer to the cardinalities of the arguments A and B. This is still not a proper formal definition, because we have not been specific about what it actually means for a definition to “refer” to the cardinalities of the arguments only. If we take the statement very literally, we could take it to mean that the definition of a natural language quantifier should be a string consisting only of the substrings “|A|” and “|B|” (with A and B replaced by whichever symbols you want to use to refer to the two arguments), interpreted in first-order logic. But that’s ridiculous, and not just because such a string would evaluate to a natural number rather than a truth value. In order to find out what the proper constraints on the string should be, let’s have a look again at the definitions we gave above.

  • For “every”, we have that \forall(A, B) if and only if A \subseteq B, or, equivalently, |A \setminus B| = 0.
  • For each natural number n, we have that \exists_n(A, B) if and only if |A \cap B| \ge n.
  • For “most”, we have that M(A, B) if and only if |A \cap B| \ge |A \setminus B|.

In order for these to count as quantifiers, our definition must allow us to compare the cardinalities as well as refer to them. We also need to refer to the cardinalities of combinations of the two arguments A and B, such as A \cap B and A \setminus B, in addition to |A| and |B| themselves. And, although none of the definitions above involve the logical connectives \wedge (AND) and \vee (OR), we will need them for more complex quantifiers that are formed as phrases, such as “most but not all”.
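To make these definitions concrete, here is a minimal Python sketch that treats each quantifier as a function from two finite sets to a truth value. The function names and the toy sets are invented for illustration; “most” follows the definition in the list above, and “most but not all” is read as “most” AND NOT “every”, which is only one plausible reading of the phrase.

```python
def every(a, b):
    # "every A is B": A is a subset of B, i.e. |A \ B| = 0
    return len(a - b) == 0


def at_least(n):
    # Returns the quantifier "at least n As are Bs": |A ∩ B| >= n
    def quantifier(a, b):
        return len(a & b) >= n
    return quantifier


def most(a, b):
    # "most", as defined in the list above: |A ∩ B| >= |A \ B|
    return len(a & b) >= len(a - b)


def most_but_not_all(a, b):
    # A phrasal quantifier built from AND and NOT (an assumed reading)
    return most(a, b) and not every(a, b)


# Toy sets (hypothetical data, for illustration only).
cats = {"tom", "felix", "garfield"}
animals = {"tom", "felix", "garfield", "rex"}
orange_things = {"garfield", "felix", "a carrot"}

print(every(cats, animals))                   # True
print(at_least(2)(cats, orange_things))       # True
print(most(cats, orange_things))              # True
print(most_but_not_all(cats, orange_things))  # True
print(most_but_not_all(cats, animals))        # False
```

Note that every function only ever looks at the sizes of A \cap B and A \setminus B, never at which particular elements the sets contain.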

The question of exactly which combinations of sets we need to refer to is quite an interesting one. Given our two arguments A and B, we can see all the possible combinations by drawing a Venn diagram:

A Venn diagram with two circles. The overlapping region is labelled A ∩ B, the region falling solely within the left circle is labelled A \ B, and the region falling solely within the right circle is labelled B \ A.

There are four disjoint regions in this Venn diagram, corresponding to the sets A \cap B, A \setminus B, B \setminus A and (not labelled, but we mustn’t forget it) U \setminus (A \cup B) (where U is the universal set). We also might need to refer to regions that are composed of two or more of these disjoint regions, but such regions can be referred to by using \cup to refer to the union of the disjoint regions.
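As a quick sanity check, the four regions can be computed for any two finite sets, and we can confirm that they are pairwise disjoint and together make up the whole universal set. This is only a sketch: the function name venn_regions, the region labels and the example sets are made up for illustration, and A and B are assumed to be subsets of U.

```python
from itertools import combinations


def venn_regions(a, b, universe):
    # Split the universe into the four disjoint regions of the Venn diagram.
    # Assumes a and b are subsets of universe.
    return {
        "a_and_b": a & b,              # A ∩ B
        "a_only": a - b,               # A \ B
        "b_only": b - a,               # B \ A
        "neither": universe - (a | b)  # U \ (A ∪ B)
    }


universe = set(range(10))
a = {0, 1, 2, 3}
b = {2, 3, 4, 5, 6}
parts = venn_regions(a, b, universe)

# No two regions share an element.
assert all(not (x & y) for x, y in combinations(parts.values(), 2))

# Together, the regions cover the universal set.
assert set().union(*parts.values()) == universe

# Larger regions are unions of the disjoint ones, e.g. A and B themselves.
assert parts["a_and_b"] | parts["a_only"] == a
assert parts["a_and_b"] | parts["b_only"] == b
```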

But do we need to be able to refer to each of these disjoint regions? Note that in the definitions above, we only needed to refer to |A \setminus B| and |A \cap B|, not to |B \setminus A| and |U \setminus (A \cup B)|. In fact, it is thought that these are the only two disjoint regions that definitions of natural language quantifiers ever need to refer to. Quantifiers which can be defined without reference to |U \setminus (A \cup B)| are called extensional quantifiers, and quantifiers which can be defined without reference to |B \setminus A| are called conservative quantifiers. So now, if all this seems like pointless formalism to you, you might be relieved to see that we can make an actual falsifiable hypothesis:

Hypothesis 1. All natural language quantifiers are conservative and extensional.

To give you a better sense of exactly what it means for a quantifier to be conservative or extensional, let’s give some examples of quantifiers which are not conservative, and not extensional.

  • Let NE be the binary relation between sets such that for every pair of sets A and B, NE(A, B) if and only if |U \setminus (A \cup B)| = 0. For example, if we suppose NE corresponds to an English quantifier “scrong”, the sentence “scrong men are in the room” is true if and only if everything which is not a man is in the room (it’s therefore identical in meaning to “every non-man is in the room”). “scrong” is conservative, but not extensional.
  • Let NC be the binary relation between sets such that for every pair of sets A and B, NC(A, B) if and only if |B \setminus A| = 0. For example, if we suppose NC corresponds to an English quantifier “gewer”, the sentence “gewer men are in the room” is true if and only if there is nothing in the room which is not a man. “gewer” is extensional, but not conservative.
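One way to make the two properties testable, in the sense they are defined above, is to model a quantifier as a function of the four region cardinalities and check by brute force whether its truth value ever changes when only one of those cardinalities is varied. The sketch below does this for small cardinalities. The encoding, the function names and the bound are all assumptions made for illustration, and passing the check for a small bound is evidence rather than proof.

```python
from itertools import product

# A quantifier is modelled as a function of four cardinalities:
#   both    = |A ∩ B|
#   a_only  = |A \ B|
#   b_only  = |B \ A|
#   neither = |U \ (A ∪ B)|


def every(both, a_only, b_only, neither):
    return a_only == 0


def most(both, a_only, b_only, neither):
    return both >= a_only


def scrong(both, a_only, b_only, neither):  # the relation NE
    return neither == 0


def gewer(both, a_only, b_only, neither):  # the relation NC
    return b_only == 0


def ignores(quantifier, position, bound=4):
    # True if, for all cardinalities below `bound`, changing only the
    # argument at `position` never changes the truth value.
    for counts in product(range(bound), repeat=4):
        for value in range(bound):
            varied = list(counts)
            varied[position] = value
            if quantifier(*counts) != quantifier(*varied):
                return False
    return True


def is_conservative(quantifier, bound=4):
    # Conservative: the truth value does not depend on |B \ A|.
    return ignores(quantifier, 2, bound)


def is_extensional(quantifier, bound=4):
    # Extensional: the truth value does not depend on |U \ (A ∪ B)|.
    return ignores(quantifier, 3, bound)


for q in (every, most, scrong, gewer):
    print(q.__name__, is_conservative(q), is_extensional(q))
# every True True
# most True True
# scrong True False
# gewer False True
```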

Wait a minute, though! I don’t know if you noticed, but “gewer” as defined above has exactly the same meaning as a real English word: “only”. The sentence “only men are in the room” means exactly the same thing as “gewer men are in the room”. (It’s true that “only men are in the room” might just mean that there are no women in the room, not that there is nothing in the room that is not a man; there could be furniture, a table, etc. But “only” still has the same meaning there. It’s just that the universal set is taken to be the set of all people, not the set of all objects. In semantics, the universal set is understood to be a set containing every entity relevant to the current discourse context, not the set that contains absolutely everything.)
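The effect of the universal set can be sketched in code as well. In the hypothetical example below, both arguments are restricted to the universal set before “only” (that is, “gewer”) is evaluated; the entity names and the helper only_in_context are invented for illustration.

```python
def only(a, b):
    # "only As are Bs" (the relation NC): nothing in B lies outside A
    return len(b - a) == 0


def only_in_context(a, b, universe):
    # Restrict both arguments to the entities relevant to the discourse.
    return only(a & universe, b & universe)


men = {"alan", "bob"}
women = {"carol"}
furniture = {"table", "chair"}
in_the_room = {"alan", "bob", "table"}

everything = men | women | furniture
people = men | women

# With all objects in the universal set, the table counts against "only".
print(only_in_context(men, in_the_room, everything))  # False

# With only people in the universal set, the table is simply ignored.
print(only_in_context(men, in_the_room, people))      # True
```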

Does that falsify Hypothesis 1? Well… I said “only” was a word, but I didn’t say it was a quantifier. In fact, the people who propose Hypothesis 1 would analyse “only” as an adverb, rather than a quantifier. I guess this makes sense considering “only” has the “-ly” suffix. But that’s not proper evidence. Some people have argued that “only” cannot be a determiner (and hence cannot be a quantifier) based on syntactic evidence: “only” does not pattern like other determiners. The example I was given at university was the following sentence:

The girls only danced a tango.

Here, “only” occurs in front of the VP, rather than the NP, hence it cannot be a determiner.

I’m sure my lecturer could have given better evidence, but he was just pressed for time. The obvious problem with this argument, though, is that there is a well-known group of determiners which can appear in front of the VP, rather than the NP: the “floating” quantifiers, such as “all”:

The girls all danced a tango.

Anyway, I remain not totally convinced that “only” is not a quantifier, and, taking a very brief look at the literature, it seems like a far-from-uncontroversial topic, with, for example, de Mey (1991) arguing that “only” is a determiner after all (although I don’t really understand the argument, having not read the paper very carefully). Payne (2010) mentions that “only” should be seen as a kind of adverb-quantifier hybrid, which I guess is probably the best way to think about it, although it is kind of inconvenient if you’re trying to analyse these words in a formal semantic approach.

I wonder if there are any words in natural languages which have ever been analysed as non-extensional quantifiers. Google Scholar doesn’t turn up anything on the subject.

In any case, perhaps the following weakened statement of Hypothesis 1 is more likely to be true.

Hypothesis 1 (weakened). In every natural language, the words that can be analysed as non-conservative or non-extensional quantifiers will exhibit atypical behaviour compared to the conservative and extensional quantifiers, so that it may be better to analyse them as adverbs.