06 November 2014

Second-Language Reciprocity

Here is a fascinating infographic presentation posted at Lovely Little Lexemes (hat tip to Mrs. B!). A curious (and probably unique) relationship can be observed between the United Kingdom and Poland: the most common second language in the UK is Polish, and the most common second language in Poland is English.


07 October 2014

Two Is Company, Four Is a Party

Neuter nouns with the suffix *-wr̥/*-w(e)n- are relatively rare in most branches of Indo-European. The only group where they can be found in great numbers is Anatolian. In Hittite, the suffix productively formed verbal nouns (names of actions), but there are also examples of nouns that had become independent lexical units, no longer bound to a particular verb paradigm. They had usually acquired a concrete meaning (referring to a thing or substance rather than an abstraction). One such noun is Hitt. pahhur/pahhuen ‘fire’, evidently an ancient word, preserved in many branches of the family and showing evidence of archaic vowel alternations and mobile stress: nom/acc.sg. *páh₂wr̥, gen.sg. *ph₂wéns, etc. It may be etymologically connected with the verb *pah₂- ‘guard, protect’, but it’s doubtful whether even the speakers of Hittite were still aware of any such connection: the semantic distance between the verb and its derivative was already too great.

Outside Anatolian, the suffix does not play any major role. The nouns that contain it are scattered remnants of a Proto-Indo-European pattern of word-formation. Their attestation is very uneven. They are quite well represented in Sanskrit and Greek, but only isolated examples are found elsewhere (the ‘fire’ word, which became part of Indo-European basic vocabulary sufficiently early, is exceptionally well attested). Here are a few typical *-wr̥/*-w(e)n- nouns evidently connected with known verb roots:

  1. *h₂árh₃-wr̥, gen. *h₂r̥h₃-wén-s  ‘arable land’ (root *h₂arh₃- ‘till, plough’);
  2. *snéh₁-wr̥, gen. *sn̥h₁-wén-s ‘string, sinew’ (root *(s)neh₁- ‘spin, twist’);
  3. *séǵʰ-wr̥, gen. *sǵʰ-wén-s ‘steadfastness’ (root *seǵʰ- ‘conquer, take possession of; hold, own’);
  4. *h₁éd-wr̥, gen. *h₁d-wén-s ‘food’ (root *h₁ed- ‘eat’).

Their reflexes in the historically documented languages rarely display the whole range of vowel, consonant and stress variations, most of which were levelled out analogically in prehistoric times. Still, these alternations are reconstructible thanks to the fact that different fragments of the pattern have been preserved in different languages. They can be reassembled into a complete picture like the pieces of a jigsaw puzzle or the disarticulated skeleton of a fossil animal.

[Image caption: “Got wheels?” A four-wheeled toy from the Cucuteni-Trypillian culture, early fourth millennium BC.]

Neuters of this kind formed collectives by inserting a lengthened *ō into the suffix. The collective of a count noun denotes simply a set of objects (a collective plural), while the collective of a mass noun like ‘fire’ denotes a particular quantity or sample of the thing in question (‘a fire, a burning mass’). This became one of the derivational mechanisms by which Indo-European mass nouns could be transformed into count nouns. The accent was commonly shifted to the suffix in the process, causing the reduction of the root vowel: *páh₂wōr (collective) > *ph₂wṓr > *pwṓr (a countable neuter with its own case forms such as gen.sg. *p(h₂)un-és). Still later, the distinction between the original mass noun and its collective could be blurred and abandoned, the younger form ousting the older and serving in both functions (‘fire’ or ‘a fire’). The archaic Proto-Indo-European form *páh₂wr̥ is unambiguously preserved only in Anatolian, while the remaining Indo-European languages show reflexes of *pwṓr or its further modified descendants.

Now we can view the reconstruction *kʷét-wr̥ in this light. Supposing it was derived from our hypothetical verb root *kʷet- ‘group into pairs’, the original meaning of *kʷétwr̥ (as a nomen actionis) would be something like ‘pairing’, and its collective *kʷétwōr would mean ‘a particular result of pairing, a complete set organised into pairs’. In the Proto-Indo-European world, there were many “natural” sets of things conceptualised as consisting of two pairs: human hands and feet; fore and rear legs of animals; the wheels of a wagon; the four directions, whether cardinal (east and west, north and south) or relative (forward and backwards, left and right); paired organs of perception (two eyes and two ears). This could have provided sufficient motivation for treating ‘4’ as the prototypical case of an “even collective”. An interesting parallel can be seen in the “fraternal” numeral systems widespread in Amazonia. In the languages that employ them, the numeral ‘4’ is derived from an expression meaning ‘each has a brother/companion/spouse’. At a more primitive stage, preserved in the Dâw language, there are only three “exact” lexical numerals, ‘1’, ‘2’, and ‘3’. The values from 4 to 10 are described as ‘even’ (‘has a brother’) or ‘odd’ (‘has no brother’). The precise value can’t be expressed linguistically, but the words ‘even’ and ‘odd’ can be supplemented by clarifying hand gestures:
Dâw speakers indicate ‘four’ by holding the fingers of one hand separated into two blocks; for ‘five’, they add the thumb; for ‘six’, they place the second thumb against the first to make a third pair; and so on until for ‘ten’ all fingers are grouped into five pairs, the thumbs together.
[Epps 2006: 265]
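The even/odd stage described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the English glosses stand in for the actual Dâw words (which are not quoted here), the function names are hypothetical, and the gesture is reduced to a count of finger pairs plus at most one unpaired finger.

```python
# Toy model of the "fraternal" system described above (assumed simplification).
# English glosses are placeholders for the Dâw expressions.
EXACT = {1: "one", 2: "two", 3: "three"}


def describe(n: int) -> str:
    """Return the spoken description: exact for 1-3, even/odd for 4-10."""
    if n in EXACT:
        return EXACT[n]
    if 4 <= n <= 10:
        return "has a brother" if n % 2 == 0 else "has no brother"
    raise ValueError("outside the range covered by the system")


def gesture(n: int) -> tuple:
    """Return (finger pairs, unpaired fingers) for the clarifying gesture."""
    if not 4 <= n <= 10:
        raise ValueError("gestures are described only for 4 to 10")
    return divmod(n, 2)


for value in range(1, 11):
    print(value, describe(value))
```

The spoken word alone leaves 4, 6, 8 and 10 indistinguishable; only the pair count returned by `gesture` tells them apart, which is the point of the passage quoted from Epps.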
Once established as a concrete numeral (rather than part of an even-odd tally system), *kʷétwōr (or *kʷətwṓr) was interpreted as an ordinary neuter plural, and – like the numerals ‘1’, ‘2’, and ‘3’ – formally an adjective, inflected not only for case but also for gender. This resulted in the analogical creation of the animate plural in *-wor-es (and the periphrastic feminine ‘four females’, soon univerbated and phonetically mutilated in the process). Note that if the adjective had been formed directly from the verbal noun *kʷétwr̥/*kʷ(ə)twén-, its animate plural would probably have ended up as *kʷet-won-es. In addition to the Greek and Vedic words for ‘fat’, already discussed, compare Greek peîrar (gen. -atos) ‘boundary’ < *pér-wr̥/*pr̥-w(e)n- versus the Homeric adjective a-peírōn (animate) ‘boundless, endless’ < *n̥-per-wōn.

All this suggests that the word *kʷétwr̥ (coll. *kʷétwōr) was transparently derived from a verb root and adopted as a cardinal numeral at a rather late date, perhaps in “Core Indo-European” (the non-Anatolian part of the family) rather than in Proto-Indo-European proper. It is a well-known fact that Anatolian has a different word for ‘4’, *meju- (Hittite meu-/meyau-, Luwian māwa-). Since the jury is still out on whether Hittite kutruwa(n)- ‘witness’ has anything to do with the numeral ‘4’*), we should seriously consider the possibility that the familiar reconstruction *kʷetwores is not Proto-Indo-European at all but represents a “dialectal” innovation which replaced its older synonym in the common ancestor of Tocharian and the extant branches of the family.

If this were a journal article rather than a blog post, I would now be obliged to account for every puzzling irregularity in the branch-specific reflexes of *kʷetwores and its variants. I will spare my visitors such excruciating details, but if anyone is really interested in discussing them, welcome to the Comments section.

And now back to other matters – next time.

*) A witness in court could be denoted as ‘the fourth man’ (beside the two contracting parties and the judge).


Epps, Patience. 2006. “Growing a numeral system: The historical development of numerals in an Amazonian language family”. Diachronica 23(2): 259-288. [a preprint version is available here]

02 October 2014

Only Connect: The Strange Triangle

The Latin adjective triquetrus ‘triangular’ (neuter -um, feminine -a) is baffling. It’s obviously a compound, and it obviously contains the compositional form of the numeral ‘three’, *tri-. What else it contains is anything but obvious. Unfortunately, it’s the only specimen of its kind. The mysterious element -quetrus does not occur in any other Latin compound. It looks as if it could have something to do with quattuor ‘four’. When ‘four’ occurs as the first part of a compound, it has the shape quadru/i-. This form must somehow go back to *kʷətwr̥-, its metathetic variant *kʷətru-, or a hybrid combination of both, but the voicing of the *t is odd, not to say perverse, because its exact opposite, *dr > tr, was a regular change in the prehistory of Latin. The word ‘four’ is evidently such a fickle fellow that it just can’t resist breaking some established rules. For greater inconsistency, the adverbial numeral quater ‘four times’, which in other IE languages (and presumably in Latin as well) derives from *kʷ(e)twr̥-s ~ *kʷ(e)tru-s, shows no voicing. We see a voiced stop again, though, in the denominal verb quadrō ‘to square; put in order, arrange’ and a few related words such as quadra ‘square piece or slice, plinth, dining table, etc.’ and quadrātus ‘square (n. and adj.)’.

[Image caption: “Some connections are impossible.”]

The second part of triquetrus doesn’t simply reflect *kʷetru- (or *kʷatru- < *kʷətru-), because the word is a second-declension o-stem, which means that its pre-form ended in *-tro- rather than *-tru-. The form *kʷetro- (or possibly *kʷatro-, since pre-Latin *a would have merged with *e in this position) does not otherwise occur as a variant of ‘4’ in Latin, but since we are dealing with a capricious word-family, it’s hard to rule out a connection. If it does mean ‘four’, however, why’s that? A triangle has three sides, it has three angles, but has it got three “fours”? It would not be strange if a word for the right angle had something to do with squares or rectangles, and therefore indirectly with the numeral ‘4’, but a triangle can have at most one right angle, certainly not as many as three (the Penrose tribar, shown on the right, would be an exception if it could exist in ordinary Euclidean space).

Can external cognates help? It’s tempting to compare triquetrus with Old English þrifeoþor (sometimes glossed as ‘triangular’ in reference books such as Bosworth and Toller’s Anglo-Saxon Dictionary). It has been suggested earlier by one of the commenters on this blog [Douglas G. Kilday] that the Old English word is a loan from (unattested) Gaulish *petros ‘corner’ (< *kʷetros), which became Germanic *feþra- after the operation of Grimm’s Law. This tantalising suggestion, however, can’t be correct. The word þrifeoþor appears in Old English glossaries (Corpus, Erfurt, and Épinal) three times (spelt ðrifeoðor, trifoedur, ðrifedor), and is translated into Latin as triquadrum. One might think that triquadrum is a distortion of triquetrum caused by “folk etymology” (the mistaken identification of the second part as the compositional form of ‘4’), but in fact it’s no such thing. Old English authors took the adjective triquadrus from Orosius, a Christian priest and scholar from the Roman province of Gallaecia (today’s Galicia, Spain). Orosius, active in the first decades of the 5th century, was the author of several enormously influential works, including Historiae Adversus Paganos, with a chapter on the geography of the world. Here is the relevant passage (Book 1, Chapter 2; emphasis added):
Maiores nostri orbem totius terrae, oceani limbo circumsaeptum, triquadrum statuere eiusque tres partes Asiam Europam et Africam uocauerunt, quamuis aliqui duas hoc est Asiam ac deinde Africam in Europam accipiendam putarint.
[Our elders made a threefold division of the world, which is surrounded on its periphery by the Ocean. Its three parts they named Asia, Europe, and Africa. Some authorities, however, have considered them to be two, that is, Asia, and Africa and Europe, grouping the last two as one continent.]
The epithet triquadrus refers to “the circle of all the earth” (orbis totius terrae = the world). Orosius certainly doesn’t mean that the Earth is a triangular circle, or that it has three corners. He means that the landmass of the world (as he knew it) is tripartite, divided by most ancient geographers into three continents (in this context, quadra means ‘part, division, area’, not literally a square). Anglo-Saxon translators coined a calque, mechanically replacing Latin quadr- with feoþor- < *kʷetwr̥-, the compositional form of Old English fēower ‘four’. Þrifeoþor was never intended to mean ‘triangular’. Its second member is the same feoþor- (= Late West Saxon fiþer-, fyþer-) that we find as the first element in numerous Old English compounds, e.g. fiþerfēte ‘four-footed’ (= Latin quadrupēs).

External support for *kʷetro- thus evaporates, but triquetrus still has to be explained somehow. I would suggest that its second element is a derivative of *kʷet- ‘join pairwise’ with the instrumental suffix *-tro-. When the suffix was added to a root ending in a dental stop, the last segment of the root was dropped already in Proto-Indo-European (this process is known as “the metron rule”). Thus we get *métrom (Greek métron ‘measure’) from *méd-trom (*med- ‘allot, mete out’), and *h₁étrom (Vedic átra- ‘nourishment’) from *h₁éd-trom (*h₁éd- ‘eat’). The noun *kʷétrom < *kʷet-trom would be ‘something that holds a pair of things together’, hence ‘joint, connection’ or the like. There were several Proto-Indo-European roots with similar meanings, and accordingly several nearly synonymous nouns for things like woodworking joints; joint itself comes (via French) from Latin iunctus ‘connected’ (the root here is *jeug-, as in yoke). Tri-quetrus (< *tri-kʷetro-) is built exactly like tri-angulus (a noun is used as the second member of a compound adjective without altering its stem class), and its etymological meaning is ‘having three connections (between pairs of sides)’.
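The effect of the metron rule on the examples above can be sketched in Python. This is a minimal illustration, not a model of PIE phonology: accents are left out, the function name is hypothetical, and the list of dental stops (*t, *d, *dʰ) is my own assumption about what counts as “a dental stop” for the purposes of the rule.

```python
# Sketch of the "metron rule": a root-final dental stop is dropped
# before the instrumental suffix *-tro- (here in its neuter form -trom).
DENTAL_STOPS = ("dʰ", "t", "d")  # assumed inventory; accents are ignored


def instrumental_noun(root: str, suffix: str = "trom") -> str:
    """Attach the suffix, deleting a root-final dental stop first."""
    for stop in DENTAL_STOPS:
        if root.endswith(stop):
            root = root[: -len(stop)]
            break
    return root + suffix


for root in ("med", "h₁ed", "kʷet"):
    print(root, "->", instrumental_noun(root))
```

A root that does not end in a dental stop passes through unchanged, so the rule affects only the boundary between the root and the suffix.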

The next post, in which I shall return to the numeral ‘four’ itself, will be the last in this series.


29 September 2014

Forgotten Derivatives and Their Sexual Implications

What kind of noun is čët? What is its relationship to our hypothetical verb root? One cannot avoid asking such questions when proposing an etymology. A word is more than a root; it has a derivational history. If you add an affix to a word, you may alter its lexical category and the meaning of the base. We already know a good deal about morphological processes in the Indo-European languages, which means that we can tell plausible relationships between possibly related words from unlikely ones.

Let R be a root morpheme. In Proto-Indo-European (and in many of the languages descended from it), a root consists of a consonantal skeleton with a slot where a vowel can be inserted. For example, the verb root *{w_rǵ} ‘make, work’ is normally quoted in the form *werǵ-, called its e-grade, symbolised as R(e). Here, the slot is occupied by the vowel *e. The same root also forms an o-grade, R(o), realised as *worǵ-, and a zero grade, R(z), in which the vowel slot remains empty. In that case, the liquid *r, sandwiched between two other consonants, has to play the role of a syllable nucleus, and the root becomes phonetically *wr̥ǵ- (in the traditional Indo-Europeanist notation, a tiny subscript ring marks a syllabic consonant).
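The slot-filling idea can be sketched in a few lines of Python. This is only a toy illustration: the underscore notation for the slot, the function names, and the rule that a liquid or nasal turns syllabic only when it directly follows the empty slot and stands between two consonants are simplifying assumptions of mine, not a full account of PIE syllabification.

```python
# Toy model of ablaut grades for a root written as a skeleton with a "_" slot.
RING_BELOW = "\u0325"  # combining ring below, marks a syllabic consonant
RESONANTS = "rlmn"     # liquids and nasals that can serve as a syllable nucleus


def full_grade(skeleton: str, vowel: str) -> str:
    """Fill the vowel slot with *e or *o."""
    return skeleton.replace("_", vowel, 1)


def zero_grade(skeleton: str) -> str:
    """Leave the slot empty; a resonant stranded between consonants turns syllabic."""
    left, right = skeleton.split("_", 1)
    if left and len(right) > 1 and right[0] in RESONANTS:
        return left + right[0] + RING_BELOW + right[1:]
    return left + right


def grades(skeleton: str) -> dict:
    """Return the three grades R(e), R(o) and R(z) of a root skeleton."""
    return {
        "R(e)": full_grade(skeleton, "e"),
        "R(o)": full_grade(skeleton, "o"),
        "R(z)": zero_grade(skeleton),
    }


print(grades("w_rǵ"))
```

Called on a skeleton such as `"j_ug"`, the same functions give the e-grade and the zero grade without adding a ring, because *u can already serve as a vowel.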

One of the largest and most productive classes of PIE nominals (nouns/adjectives) was that of the so-called thematic nouns (also known as o-stems). Their stem ended in the vowel *-o-, to which inflectional endings were attached. In the simplest case, the vowel was added directly to the root; in more complex cases it was part of a suffix (such as *-to-, *-no-, *-tero-, *-tlo-, etc.). Somewhat surprisingly, “simple thematic” nouns of the shape R(e)-o- were pretty rare in the protolanguage. The neuter action noun *wérǵ-o-m ‘work, activity’ is well supported by the agreement between Germanic *werka- (Old English weorc, German Werk) and Greek érgon; we also have Iranian (Avestan) varəza-, with the same stem (and meaning) but with masculine inflections. Very few such nouns, however, are truly old. More typically, the suffix *-o- was added to R(o), as in *wóiḱ-o- ‘house, dwelling’ (root *weiḱ- ‘enter, occupy’) and sometimes to R(z), as in *jug-ó- ‘yoke’ (root *jeug-, already mentioned in earlier posts).

Marc Greenberg (2001) doesn’t define the morphological status of his reconstruction *kʷet- (‘two’ > ‘pair, partner’). In some places in the article he treats it as if it were a root noun (with no suffixes), but the simplest form we actually find in Slavic is represented by Russ. čët (cf. dialectal Polish cot), which appears to reflect a thematic masculine noun *kʷet-o-s ‘even number’. How could it have originated? If *kʷet- was once a verb root (with the approximate meaning of ‘arrange in pairs, pair up’), *kʷet-o- makes sense as a kind of action noun that has acquired a resultative interpretation: by pairing objects together, you end up with an even number of them. (By the way, the verb root is not entirely conjectural: we can see it in Russian četáť ‘form pairs’.) The problem with *kʷet-o- is that it represents a rare type of stem, at least in terms of PIE morphology. Is it legitimate to posit it just like that?

On the other hand, *kʷet-o- needn’t go all the way back to PIE. The deverbal formation R(e)-o- has enjoyed increased productivity in Slavic. We even have doublets like R(o)-o- and R(e)-o-, where the o-grade variant is more conservative (and has more external cognates), while the e-grade seems to be a younger innovation (with a more restricted distribution). Thus, the root *tekʷ- ‘run, flow’ has produced Slavic *tekъ (as if from *tekʷ-o-s) ‘waterflow, leak, source’, which coexists with *tokъ (< *tokʷ-o-s) ‘stream, current, flux; (figuratively) course, sequence of events’. The former is an innovation directly connected with the Slavic verb *tekti ‘leak, flow’ (3sg. *tečetь < *tékʷ-e-ti), whereas the latter is a relict form which has drifted away from its etymological base, also semantically. Therefore, if *četъ is a relatively recent derivative of a Proto-Slavic verb, it wouldn’t be surprising if it had an o-grade cousin (possibly with a more “evolved” meaning).

As a matter of fact, Greenberg mentions *kotъ ‘offspring (of animals), litter’ and *kotiti (sę) ‘have young’ as possible members of the same word-family. A connection with the homophonous noun *kotъ ‘domestic cat’ (a European Wanderwort which spread with the introduction of cats) is folk-etymological: the verb may be used of cats, but also of mice, sheep, goats, roe deer, and a variety of other animals. It is used even in those Slavic languages that have a different word for ‘cat’ (e.g. Serbo-Croatian mačka). The verb *kotiti could be an “iterative/causative” built to the root *kʷet-. The structure of such secondary verbs is R(o)-éje/o- (the final vowel of the stem alternates depending on which conjugational ending is added). For example, the Slavic verb *gъnati (3sg. *ženetь) ‘drive on, drive away, rush’ has a corresponding o-grade iterative, *goniti (3sg. *gonitь) ‘chase, run after’. These forms ultimately reflect PIE *gʷʰén-/*gʷʰn- ‘slay, kill with blows’ (a root verb, somewhat restructured in Slavic) and its PIE iterative *gʷʰon-éje/o-. The verb *tekti (< *tékʷ-e/o-), mentioned above, forms a pair with the causative *točiti ‘cause to flow, (cause to) roll’ (< *tokʷ-éje/o-). Note also such English pairs as lie vs. lay, or sit vs. set, where the first member is a primary verb and the second is its causative (e.g. ‘lay’ = ‘cause to lie’).

[Image caption: “The consequences of forming a pair.” © gerald reiner]

The stem *kʷot-éje/o-, originally with middle-voice inflections (whose function was taken over by the reflexive/reciprocal pronoun *sę in Slavic), would mean ‘form a couple (together)’, hence ‘mate, have sex’, and eventually ‘reproduce, have young’. If so, *kotъ ‘litter’ is not a senior synonym of *četъ (with a hard-to-explain change of meaning), but more likely a separate verbal noun back-formed from *kotiti sę (the consequence of mating), on the analogy of formally similar denominal verbs: *agniti sę ‘yean’, *teliti sę ‘calve’, *žerbiti sę ‘foal’.

The feminine *četa can hardly be a collective (at any rate in the meaning ‘pair’). Not only because it refers to just two things, but also because collectives in *-ah₂ to o-stem masculines are an archaic formation in Indo-European (as opposed to neuter collectives, co-opted as ordinary plurals of neuter nouns and adjectives), and *četъ is unlikely to be sufficiently ancient. But Indo-European *-(a)h₂ was not only a collective suffix and a marker of femininity; it was also employed to coin (formally feminine) abstracts, including action nouns. Quite a few deverbal masculines in Slavic (and more generally in Balto-Slavic) have feminine synonyms like *čarъ ~ *čara ‘sorcery, enchantment’ or *-tokъ ~ *-toka ‘flow, course’, *-sěkъ ~ *-sěka ‘cutting’ (in compounds). Note the familiar morphological formations represented by Greek tómos ‘slice’ (result of cutting) versus tomḗ ‘cut’ (an instance of cutting) – a nice parallel to *četъ (resultative) vs. *četa (an individual instance of pairing).

In the first post of this series I suggested that the stem *kʷet-w(o)r- was originally a deverbal neuter of a familiar type. Before I develop this idea, let me briefly suggest one other possible trace of the root *kʷet-: the second member of the Latin compound triquetrus ‘triangular’. The next post will be about it.


26 September 2014

Twos and Troops: Sifting the Evidence

Jakobson’s remark about a possible connection between Russian čët and četýre is discussed in Blažek (1999: 212-213) and especially in Greenberg (2001). Both authors mention earlier, more sketchy treatments of the problem, and they both add more Slavic material to the Russian words originally listed by Jakobson (which were čët, čëtka ‘even number’, četá ‘pair, union’, and čeť ‘quarter’). Blažek also notes an interesting potential cognate in Ossetian, an Indo-European language spoken in the north-central Caucasus (Ossetian is the only living descendant of the Northeast Iranian languages once spoken by the Scytho-Sarmatian inhabitants of the Eurasian steppe belt). The word in question is cæd ‘pair of oxen yoked together’, as if from Proto-Iranian *čatā (the Digor dialect of Ossetian has preserved a more conservative disyllabic form of the word, cædæ).

Blažek does not follow up Jakobson’s suggestion (presumably because he favours a different etymology of ‘four’, proposed by Schmid 1989; see pp. 213, 215, 331 in Blažek’s book). Greenberg, however, regards it as convincing and develops it further. Like Blažek, he considers the predominantly South Slavic *četa ‘troop, military unit’ (hence Serbo-Croatian Četnici ‘Chetniks’) to be part of the word-family of čët, and tries to explain the accentual difference between the end-stressed word četá (< *četa̍) in Russian and the root-stressed South Slavic forms – Bulgarian čéta, Serbian/Croatian čȅta, Slovene čẹ́ta (< *čèta) – in order to defend their common origin.

According to Greenberg, the word ‘four’ is derived from the root *kʷet- meaning ‘two’ extended with a multiplicative suffix, so that *kʷet-wor- means ‘(two) groups of two, twice two’. Greenberg also speculates that Proto-Indo-European *kʷotero- ‘which (of two)?’ (Greek póteros, English whether) contains the same root. This is hardly a good idea, since there is no compelling reason to question the straightforward standard analysis of *kʷo-tero- as the interrogative pronoun *kʷo- plus *-tero-, the IE suffix of binary contrast. The semantic gap between ‘two’ and ‘military unit’ is bridged by Greenberg as follows: Slavic *četa originated as the collective (in *-ah₂) of a word meaning ‘two, pair’, and ‘multitude of pairs’ evolved into ‘troop, group, band (of soldiers)’.

[Image caption: “Arranged in pairs”]

There are serious problems with this derivation. First, (East/West) Slavic *četъ means ‘even number’, not ‘two’ or ‘pair’, while, on the contrary, the supposedly collective četá can mean ‘pair’ in Russian (beside some related meanings: ne četá, accompanied by a dative, means ‘not on a par with, superior to…’). What appears to be its exact cognate in Ossetian means ‘pair of oxen’, not, say, ‘herd of cattle’. Furthermore, while it’s true that the semantics of Russian četá covers not only ‘pair’ but also ‘troop’ (the latter attested already in Old Russian), we are probably dealing with a lexical merger between a native East Slavic word and a borrowing from Church Slavic (Czech četa ‘platoon’ is likewise a South Slavic loan, as are, ultimately, a number of similar “wandering words” in various neighbouring languages – Romanian, Hungarian, Albanian, and even Turkish). The non-attestation of intermediate meanings like ‘double column (of soldiers)’ makes it hard to justify the derivation of ‘troop’ from ‘pair’. Since the semantic difference is combined with a formal difference (conflicting accentuation), the etymology simply falls apart. It seems reasonable to conclude that the contrast between *četa̍ and *čèta is old and distinguishes two words of different origin (notwithstanding their merger in Russian). [See this comment, however.]

Jakobson’s final hypothetical relative of ‘four’, čeť ‘fourth part (of land), quarter’ (Old Russian četь ~ četъka), is in all likelihood a popular truncation of četverť (~ četvertka) < Proto-Slavic *četvьrtь ‘quarter’ < *kʷetwr̥-ti-, a noun corresponding to the widespread ordinal *kʷetwr̥-to- ‘fourth’. It is of course related to ‘four’, but in a rather trivial manner.

Etymological dictionaries often attempt to connect četa (in either sense) with the Slavic verb *čьtǫ (inf. *čisti) ‘count, reckon, read’, derived from PIE *kʷeit- ‘notice, recognise’. This verb has produced numerous derivatives in Slavic (e.g. *čislo ‘number’); some of them may be accidentally similar to members of the čët group both in form and in meaning, e.g. Old Czech čet ‘count, quantity’ (Modern Czech počet, with a prefix). Note, however, the gen.sg. čtu ~ čta. The disappearing root vowel reflects Proto-Slavic *ь (a reduced vowel continuing earlier short *i in the weak form of the root, *kʷit-). Despite their deceptive similarity, Russian četá (or čët) and Czech čet have different etymologies.

If we remove all the false or dubious cognates, we are left with just the initial material: *četъ ‘even number’, *četьnъ ‘even (of numbers)’ and *četa ‘pair’ – a word-family securely attested in East and West Slavic. We can safely add the Ossetian word (isolated in Iranian, as far as I know, but a perfect match for *četa, semantically and formally). There’s no evidence that the original meaning of the morpheme *čet- was ‘two’; nevertheless, it seems to have had something to do with arranging things in couples. Typologically, the Slavic “odd/even” terminology is parallel to what we have seen in Greek and Sanskrit, even if different lexical roots are involved. If so, one could expect *čet- to be semantically close to the familiar Indo-European roots *h₂ar- ‘fit together’ and *jeug- ‘yoke, connect’. I shall therefore tentatively assume that *čet- continues a verb root like *kʷet-, with the approximate meaning of ‘combine into pairs’. Let’s see if we can work from here – next time.


Václav Blažek. 1999. Numerals: Comparative–etymological analyses of numeral systems and their implications. Brno: Masarykova Univerzita v Brně.

Marc L. Greenberg. 2001. “Is Slavic četa an Indo-European archaism?”. International Journal of Slavic Linguistics and Poetics 43: 35-39.

21 September 2014

‘Four’: A Map

I didn’t plan it this way, but since the discussion of the etymology of ‘four’ has unfolded into a small saga in several acts, I have to organise it for convenience. Here is a map of the route:

  1. [Word of the Month: Proto-Indo-European ‘Four’]
  2. [Even and Odd]
  3. [The Name of the Game: Jakobson Reads Vasmer]
  4. [Twos and Troops: Sifting the Evidence]
  5. [Forgotten Derivatives and Their Sexual Implications]
  6. [Only Connect: The Strange Triangle]
  7. [Two Is Company, Four Is a Party] NEW!
The End

The Name of the Game: Jakobson Reads Vasmer

With the vast and reliable etymological material put into circulation by Vasmer, a number of new questions naturally arises. I should like to dwell on some particulars.
Roman Jakobson (1955) *) 

The Slavs played at “even and odd” too. In Polish the game used to be called cetno licho (or cetno i licho). The noun licho is still used as a mild euphemism for ‘devil’. Czego chcesz, do licha? means “What the heck do you want?” Polish also has the adjective lichy ‘poor, inferior, in bad shape’. Historically, licho is a neuter form of lichy, substantivised centuries ago, when the adjective had a wider range of meaning, including ‘mean, evil’; licho was therefore ‘something wicked’. The phrase cetno i licho survives on the fringes of literary Polish (people are at best vaguely aware that it refers to some old game of chance), but cetno no longer occurs on its own, and has no obvious relatives in the modern Polish lexicon.

[Image caption: “The man who read Vasmer's dictionary”]

A few hundred years ago (most examples come from 16th-century texts) cetno and licho could mean, respectively, ‘even number’ and ‘odd number’. Though often contrasted with each other, they were not yet harnessed together into a fixed phrase. Cetnem (instr.sg.) or w cetnie (loc.sg.) meant ‘(occurring) in even numbers’; likewise lichem and w lichu ‘in odd numbers’. This usage has been completely forgotten.

Licho and lichy go back to Proto-Slavic *lixъ ‘strange, irregular, rogue’. In the modern Slavic languages it usually has pejorative connotations (‘bad, lacking, defective, lonely’, etc.); it can also mean ‘excessive, superfluous’. The meaning of Russian lixój, however, ranges – somewhat schizophrenically – from ‘bad, sinister, hard’ to ‘daring, valiant’ (the common ancestor was ‘extraordinary’, whether in a positive or a negative sense). Like semantically similar words in other languages (Greek perittós, English odd), *lixъ developed the arithmetical meaning of ‘odd’, which survives here and there in the Slavic branch. For example, in Czech liché číslo means ‘odd number’. As for its origin, *lixъ < *leikʷ-so-, from the widespread Proto-Indo-European root *leikʷ- ‘leave, abandon’.

So much for licho. Where does cetno come from? The Russian term for “even and odd” is čët i néčet. Čët means ‘even number’ (= čëtnoe čisló); néčet is its antonym. The adjective čëtnyj ‘even’ (of a number) is closely related to Polish cetno. Russian č normally corresponds to Polish cz, but some regional varieties of Polish have merged the affricate cz /tʂ/ with c /ts/ for centuries, and the standard language has borrowed a number of dialectal pronunciations of this kind.

On the combined evidence of Polish and East Slavic forms we can reconstruct Proto-Slavic *četъ (n.) and *četьnъ (adj.). Russian also has the noun četá ‘pair, couple’, which is formally and semantically close to them. There are several other Slavic words that might or might not be related to *četъ, but it’s wiser at this stage to exclude more difficult material so as to avoid the risk of contaminating a reliable set of cognates with spurious ones.

Back in the 1950s, as successive volumes of Max Vasmer’s monumental Russisches etymologisches Wörterbuch were published in Heidelberg, the great linguist Roman Jakobson (then at Harvard University) read the entire dictionary (I mean, actually read it like a novel, page by page), jotting down comments on entries that attracted his attention. Those marginalia were published as a journal article (see the reference below) and reprinted in Jakobson’s Selected Writings (Volume II: Word and Language). With regard to čët and its relatives, Jakobson remarked that they “seem to be archaic relics of the same word family as četýre” (the Russian reflex of the Indo-European numeral ‘four’). Having devoted one sentence to the matter, he moved on to the next entry that had caught his eye, čex ‘Czech’. The idea that čët and četýre are somehow related has been picked up by several other authors, but hitherto published attempts to analyse *kʷetwor- in this light have the usual flaws of “root etymologies”: too little attention to morphological details, and too much imaginative semantics. Nevertheless, I think Jakobson’s idea is worth salvaging, so I’ll review those previous attempts and try to see if I can do any better.

*) Roman Jakobson. 1955. “While reading Vasmer’s dictionary”. Word 11: 611-617.

[link to a digitalised Russian translation of Vasmer's dictionary]

[to be continued]
