19 July 2013

Consider a Hammer

Figure 1: Co-opting a natural object
According to Wikipedia, “A hammer is a tool meant to deliver an impact to an object. The most common uses for hammers are to drive nails, fit parts, forge metal and break apart objects. Hammers are often designed for a specific purpose, and vary in their shape and structure.” Hammers have been shaped by the functions they typically perform. A heavy metal head fixed on a light handle stores kinetic energy before the blow is delivered. The length, cross-section and shape of the handle are ergonomically adapted to human handgrip and typical working conditions. There are functionally motivated differences between, say, a light claw hammer used for driving and removing nails and a heavy-duty sledgehammer used for tearing down walls.

Figure 2: Putting a handle on it
The ancestors of all hammers were natural cobbles used as hammerstones by Palaeolithic humans (as well as earlier hominins). They carried out some of the same functions as modern hammers, albeit less efficiently. There was no handle (its function was played by the user’s arm), and hammerstones used for different purposes had the same general shape, differing mostly in size and weight. Small gradual improvements and  occasional major inventions (a wooden handle, the use of bronze or steel instead of stone) transformed the primitive tool visible in Figure 1 into a more sophisticated version (Figure 2), and finally into a fully streamlined  modern hammer (Figure 3).

Figure 3: Shaped by its functions
Of course a hammer can be used for many other purposes besides pounding nails into things or splitting hard objects. It can serve as a makeshift paperweight, a percussion instrument (as in Penderecki’s De Natura Sonoris No. 2), an improvised weapon, and even as a ritual or ceremonial object – for example, the emblem of a smithing god. Such accidental functions do not normally influence the evolution of hammers. If a type of hammer acquires a historically stable secondary function (e.g. removing nails), you can see the characteristic adaptations (a flattened and rounded claw), copied and perfected by new generations of hammer manufacturers. Ad hoc functions have no such consequences. Nobody modifies the shape of a hammer to make it a better paperweight. It’s only when a hammer is regularly recruited for a new task that adaptation begins to shape it in a new way. This may lead to the emergence of highly specialised hammers (such as the judge’s gavel or the doctor’s knee mallet).

The first hammers were not designed by anyone. Our distant ancestors learnt to select naturally formed stones. Then they learnt to improve their shape, fix them on a handle to optimise energy transmission, etc. The functional features of a hammer are those that have been consistently selected for in the past. It’s always possible to use a tool in an unconventional way, but such occasional applications don’t explain why the tool looks the way it does. Some features (for example, the colour of the handle) are free to vary. They are non-adaptive, devoid of functional importance.

I hope you can see how this hammer analogy can be applied to linguistic structures. That’s what will be done in the next post.

18 July 2013

Who Benefits from Language Change?

Since functionalism treats language as a tool designed and perfected by humans to serve their needs, it understands function as a purpose-oriented property of linguistic structures: it is a way of achieving a communicative aim by linguistic means. Language is fine-tuned to optimise communication, which means, among other things, that the natural conflict between the speaker’s needs (encoding and sending linguistic messages at a low cost) and the listener’s needs (receiving and decoding messages without unnecessary effort) must be resolved. Languages maintain a delicate balance between ease of production and ease of perception. For example, precise enunciation is expensive in terms of articulatory effort and neuromuscular control, but if the speaker tries to reduce this cost excessively by sacrificing precision, the result may be the listener’s failure to understand the message. Since having to repeat a sentence is usually costlier than saying it once with sufficient clarity, the speaker has to anticipate any undesirable difficulties at the listener’s end, and the tendency to favour ease of articulation is mitigated by those anticipations.

To whose benefit?
Artist: Matthew Martin
Language change can make life minimally easier for the speaker or the listener. Sound changes are often classified into “lenitions” (weakenings) and “fortitions” (strengthenings). Weakenings consist in reducing articulatory effort (and the acoustic prominence of speech sounds), while strengthenings involve increased effort (and acoustic prominence). In this dualist interpretation, weakenings are speaker-oriented, while strengthenings are listener-oriented. Any change has a purpose, and therefore a functional significance – all that needs to be determined is its orientation: cui bono?

Note, however, that an explanatory statement like ‘/t/-glottaling occurs in some accents of English because it is a speaker-friendly articulatory weakening’ is hard to falsify. Whatever happens to the phonetic realisation of /t/, you can always “explain” it in a circular fashion as an attempt to improve either ease of production or ease of perception. A change can’t be functionally neutral simply because there’s no place for such a thing in the functionalist view of language. It would be nice if we could predict when change will be driven by the speaker’s or the listener’s needs (or when nothing happens). If instead we identify the motivating factor after the fact, depending on the outcome, it’s an “either way I win” kind of game, where you can explain everything but predict nothing. Of course there are some characteristic cross-linguistic “hotspots” of change: weakenings are more likely in unstressed environments or syllable-finally; strengthenings happen more often under stress and syllable-initially. This kind of conditioning, however, is sensitive to the segmental and prosodic context rather than the needs of language users.

Then, there are classificatory problems. In non-rhotic varieties of English final or preconsonantal /r/ becomes vocalised. If preceded by a full vowel, it coalesces with it, causing the vowel to undergo lengthening and/or diphthongisation (e.g. /kard/ > /kɑːd/ ‘card’, /niːr/ > /nɪə/ ‘near’). Whose life is made easier by this change? Is it weakening, strengthening, or six of one and half a dozen of the other? Doesn’t the increased length/complexity of the syllable nucleus compensate for the consonant loss? What about the fact that the phonemic inventory of non-rhotic English may become larger and more complex as a result? If both the speaker and the listener lose something and gain something else at the same time, why bother changing anything? Why does this kind of change spread at all if there’s no clear net gain from it for anybody?

There are accents of American English where /æ/ is tensed, raised and diphthongised, becoming [eə]. This can be regarded as phonological reinforcement, and therefore a kind of strengthening. The vowel becomes more salient, which might benefit the listener. But in most varieties of American English the change is restricted to certain environments: some accents have it only before nasals, others before nasals and voiceless fricatives, and still others before nasals, voiceless fricatives, and voiced stops (often with lexical exceptions). Why is the presumed anticipation of the listener’s needs selective in this way?

Some “functions” are self-evident. It is obvious that the function of a word is to carry a lexical meaning and a syntactic role (sometimes more than one). There are no completely functionless words practically by definition. But what, for instance, is the function of the final /st/ in amongst (synonymous with among)? Whose convenience does it serve? If semantic change takes place, as when Old English cniht ‘boy’ developed into Middle English knyght ‘knight, nobleman’, how does one measure its impact on communication? If this particular shift was motivated by some functional pressure, I would like to hear the details.

In the next post I shall try to re-define function in such a way that it becomes less teleological and more distinguishable from accidental byproducts of linguistic evolution. Please be prepared to consider the possibility that language structure is not entirely rational, functional, or intelligently designed.

16 July 2013

Language as Clockwork

Proto-World was fun, wasn’t it? But there’s little I can add to the topic. If any readers of this blog would like to continue discussing mass comparison and global etymologies, they are welcome to do so in the comment boxes in that thread. Let’s change the perspective again and focus on linguistic microevolution. In the near future I would like to discuss the following things: the notion of “function” in linguistics, and two fundamental mechanisms of evolution: adaptive change and random drift.

Functional approaches to language emphasise the view of language as an instrument of human communication and social interaction. Therefore, functional factors such as people’s communicative needs (and in particular considerations of iconicity, economy, and ease of processing) are thought to exert influence on the course of language change: some changes are advantageous for effective communication and therefore encouraged by functional motivations, while others are deleterious and therefore discouraged. There is an understandable tendency among functionally inclined linguists to regard all elements of language as functional in some sense (like the interlocking parts of a carefully designed clockwork mechanism), and to insist that any explanation of language change should assume the form of a functionally motivated scenario (change happens for a “reason”). The idea that a language system can be to a large extent messy and basically functionless, and that much of language change is random and neutral (or as nearly neutral as matters) with respect to its users’ needs and goals, flies in the face of the tenets of functionalism, and so may seem provocative to many mainstream linguists. It will be defended here, but first I shall take a closer look at the fuzzy concept of “function” and the role it plays in linguistics. This is what the next post will be about.

05 July 2013

Global Water for the Last Time

I’m sorry for such a long break since the last post, but the end of the academic year is a busy time. Where were we? Ah, yes, the global etymon meaning ‘water’.

I analysed the Indo-European evidence in some detail to highlight the fact that, although Latin aqua has cognates here and there in Indo-European, its attestation is too weak to treat the word as reconstructible all the way back to Proto-Indo-European. It’s a regional word with uncertain affinities, and surely not the PIE ‘water’ word (there are better candidates for that status). Its story contains a moral: sheer similarity, even within an uncontroversial family, doesn’t mean anything by itself. There is an inherited verb root meaning ‘drink’ which looks tantalisingly similar to aqua (and was once regarded as related to it), but which has to be separated from it, given what we know today. Our improved understanding of some of the languages of the past (such as Hittite and the rest of the Anatolian clade) has forced us to abandon quite a few superficially promising etymologies. And it’s a good thing: it shows that etymologies are in principle falsifiable. All you need is a good model within which they can be evaluated.

Of course absence of evidence is not evidence of absence. It may conceivably happen that a word present in a protolanguage survives only in one language descended from it, or in a small cluster of related languages. In such cases, outgroup comparison may still enable us to recognise the word as inherited. We only need some secure external cognates and a consistent pattern of correspondences. We can’t, however, trust conclusions drawn only from the existence of vaguely similar words scattered across several families, especially if there is no pattern they could fit into because the researchers feel free to avoid real reconstructive work. If you look at Bengtson & Ruhlen (B&R)’s data, you will find many clear examples of “reaching down” (selecting isolated lookalikes and pretending they represent the families in question).

For example, words related to aqua are claimed to be present in Afro-Asiatic, while in fact all the proposed cognates come from two peripheral branches: Omotic (whose very membership in Afro-Asiatic is uncertain) and Cushitic (whose exact location in the AA family tree is anything but clear, but which is areally close to Omotic, so that borrowing between them is hard to rule out). The meaning of the suggested cognates is sometimes ‘water’ (but also ‘[to be] wet’, ‘drink’ or ‘drops of water’). But what about the Berber, Chadic, Egyptian and Semitic branches of Afro-Asiatic, where no such item occurs? What about alternative ‘water’ words which can be found in Cushitic and/or Omotic? By the way, putative cognates of aqua occur only in North Omotic. Afro-Asiatic is a big family, with about 300 extant members. With so many languages and “related meanings” to choose from, and with no formal controls, pseudo-cognates crop up inevitably. An Amerind Etymological Dictionary (Greenberg & Ruhlen 2007) lists no fewer than seventeen different etyma meaning ‘water’: *aqʷ’a/*uqʷ’a (of course), but also *man, *poi, *re, *si, *kʷati, *p’ak, *na, *ʔali, *pan, *tuna, *c’i, *kam ~ *kom, *to ~ *do, *kona, *xi, and *hobi (while we’re at it, there are also eight Amerind words for ‘dog’ and thirteen for ‘eye’). These forms are not real comparative reconstructions (their phonetic details are nowhere discussed or justified) and must be treated as approximate, which of course makes comparison as easy as pie, especially if semantics is given as much leeway as phonology.
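The inevitability of pseudo-cognates can be illustrated with a back-of-the-envelope calculation. The sketch below is only an illustration: the per-comparison probability `p_match` and the counts passed in are made-up assumptions, not estimates derived from B&R's data, and treating every comparison as an independent trial is a deliberate simplification.

```python
def chance_of_some_match(p_match, n_languages, n_meanings):
    """Probability that at least one chance lookalike turns up.

    Assumes every (language, acceptable meaning) pair is an independent
    trial with the same small probability p_match of resembling the
    target form. Both assumptions are simplifications.
    """
    trials = n_languages * n_meanings
    return 1.0 - (1.0 - p_match) ** trials


# Illustrative numbers only: a 1% chance per comparison, 300 languages,
# and 4 meanings accepted as "related" (water, wet, drink, drops of water).
print(chance_of_some_match(0.01, 300, 4))
```

Under these assumed numbers the result is very close to 1: with enough languages and enough semantic leeway, finding some lookalike is nearly guaranteed, so the mere existence of one proves nothing.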

Lost in distillation
[Source: Wikimedia]
If you don’t reconstruct past sound changes, how can you decide whether, e.g., French eau (pronounced /o/) is related to Spanish agua, or that both of them are related to Romanian apă? Note that these three modern Romance languages began to diverge less than two thousand years ago. Their modern ‘water’ words are already more different from the common ancestor (yes, Latin aqua) than the latter is from, say, some of the “Amerind” forms cited by B&R. Sound change may be rapid and dramatic. What, then, constitutes a “match” if you are comparing languages supposedly separated by 10,000 or 20,000 years of independent development, and if you can’t even be bothered to study systematic sound correspondences or morphological patterns? Ignorance helps you to see patterns that knowledge dispels at once. In Kove, one of the Austronesian languages of New Britain (in the Bismarck Archipelago), water is called eau. If we knew less than we do about the history of French (or Kove, for that matter), we might suspect a long-range connection, mightn’t we? Is Proto-Pama-Nyungan *nguku/i (which should replace B&R’s anachronistic “Proto-Australian” *gugu) related to Lat. aqua? Well, if I am shown a serious etymological proposal, with the relevant sound changes, morphological derivations and semantic shifts (if any) all spelt out, I’ll tell you what I think of it. Untestable guesswork hardly deserves to be discussed.

A “cognate” like “Proto-Central-Algonquian *akwā ‘from water’” may look impressive until one learns that the actual root, Proto-Algonquian *akw- (the *-ā came from the wrong segmentation of an Algonquian compound), means ‘ashore, out of the water’ (indicating location or direction rather than the place of origin) and that the real Algonquian ‘water’ term is *nepyi (for details, as well as for a full review of other Algonquian data cited by B&R, see Marc Picard 1998). But of course there are so many “Amerind” ‘water’ words that *nepyi could even be decomposed into more than one of them (e.g. *na + *poi).

Impressionistic comparison without any regard for methodological rigour will invariably produce the same outcome: a haphazard collection of words from, say, a dozen families and a few dozen languages (out of the world’s several thousand) which look vaguely similar and have vaguely similar meanings. How should one formulate a relationship proposal based on such evidence, so that other people could evaluate it? Surely not by listing the putative cognates and saying “look!” in the hope that the raw unanalysed evidence will speak for itself. But “global etymologists” do just that. They promise that someone, sometime, will carry out the actual comparative work, but they also claim that their data stand even without it. That’s wishful thinking, pure and simple.


10 June 2013

A Water Word that Wasn’t There


The last item on Bengtson & Ruhlen’s list of “global etymologies” is ʔAQ’WA ‘water’. What can hardly escape anybody’s attention is its uncanny similarity to one of those Latin words which are the common currency of our civilisation: aqua, as in aquarium, aqueduct, or BonAqua. One knows such words even without the benefit of a good classical education. Is it possible that an ancient “global” word survived virtually unchanged in Latin? 

To be sure, Bengtson and Ruhlen don’t actually reconstruct their global proto-words. They claim that the glosses offered in the article “are intended merely to characterize the most general meaning and phonological shape of each root”. Nevertheless, the “phonological shape” looks pretty specific, complete with such fancy details as an initial glottal stop, and a medial uvular ejective. Are those segments there because there is some solid evidence for them, or are they simply ornamental? Never mind. We shall look at the global data next time. Today let’s only examine the putative Indo-European reflex of ʔAQ’WA. We have already seen how the comparative method works, so let’s apply it again. 

Bengtson & Ruhlen cite the following forms to support the PIE reconstruction *akʷā-:
  • Anatolian: Hittite eku-, Luwian aku-, Palaic aḫ- ‘drink’ [somewhat sloppy and not quite correct, see below] 
  • Italic: Latin aqua ‘water’ 
  • Germanic: Gothic ahwa ‘river’ [found elsewhere in Germanic as well] 
  • Tocharian A yok- ‘drink’ [Toch. A and B, as a matter of fact] 
At first blush, the evidence looks impressive. The word (or at least its root) occurs in four branches of IE, including Anatolian and Tocharian. That should be enough to guarantee that we are dealing with a PIE lexical item. To be sure, the meaning ‘water’ occurs only in Latin; the Germanic cognate means ‘river’, and Anatolian and Tocharian only have a verb meaning ‘drink’. If the noun and the verb were related, it would be interesting to analyse the relationship and make sure that the meaning ‘water’ is indeed old and not derived within IE. That will not be necessary, however, because the words are not related in the first place. 

Hitt. 3sg. ekuzi ~ eukzi, 3pl. akuanzi ‘drink’ (+ Palaic ahu- and Cuneiform Luwian u-) may only reflect a root with a voiced consonant (a voiceless one would have become -kk-, not -k-, in Hittite). We can connect them via regular sound correspondences with Latin ēbrius ‘drunk’ and Greek nḗpʰō ‘be sober’ (= ‘not-drink’, with the IE negative particle *n(e)-). The Anatolian verb forms might go back to a plain root present *h₁égʷʰ-ti, *h₁gʷʰ-énti, but Tocharian AB yok- and the Latin adjective require a long vowel; the jury is still out on whether we should posit a PIE lengthened-grade root *h₁ēgʷʰ- or a reduplicated stem, *h₁é-h₁gʷʰ- (or even something still more complex). A couple of things seem clear, though. The root-final consonant is *gʷʰ, not *kʷ, and the initial laryngeal is *h₁ (the one that doesn’t colour an adjacent short vowel). This is enough to exclude any connection with aqua or its Germanic cognates. One might add that apart from *h₁egʷʰ- we also find the widespread perfective verb *poh₃(i)- ‘drink’ (also in Anatolian, with the meaning ‘swallow, gulp down’). As reflexes of *h₁egʷʰ- clearly refer to drunkenness at least in Latin and Greek, perhaps its original meaning was ‘get drunk’ (on something more intoxicating than water) rather than simply ‘drink’.

Not real water
We are left with Latin aqua ‘water’ and Germanic *axʷō ‘river’ (a perfect formal match combined with a difference in meaning). Possible traces of a Celtic word reconstructible as *akʷā are few and hardly substantial: they include several European river-names ending in -apa (which might or might not be a Gaulish cognate of aqua, not confirmed by any Gaulish text), and a single occurrence of -akua as part of a longer sequence in an unclear Celtiberian inscription, where the context doesn’t rule out the meaning ‘river’ (but neither does it demand such an interpretation). By contrast, Germanic *axʷō is abundantly attested (Goth. aƕa, Old High German and Old Saxon aha, Old Frisian ā ~ ē, Old English ēa, Old Norse á). All the reflexes mean ‘running water, stream, river’, which shows that PGmc. *axʷō was roughly synonymous with PIE *h₂ap-h₃on- and possibly replaced the latter term in the prehistory of Germanic. The word-family represented by English water, German Wasser and Gothic wato was not affected. In Latin, on the other hand, aqua completely ousted *wodr̥ ~ *udōr/*udn-, etc., and became the ordinary word for ‘water’ (including “tame” water for drinking or washing). 

Germanic also displays some interesting derivatives, such as *aujō ‘island; meadow-land’ from earlier *aɣʷjō < pre-Germanic *akʷjā́ (ON ey, OE īġ ~ īeġ). This word formed the first member of the OE compound īġ-lond > ModE island (which owes its mute s to false association with Old French isle, an unrelated but accidentally similar word derived from Latin īnsula). The compound, by the way, outcompeted the free-standing word: in Middle English the element ei ~ i ~ ie was common in placenames, but no longer in isolation. As regards its further derivatives, we have OE īġoþ ‘islet, small island’ (hence modern ait ~ eyot, used mostly with reference to the topography of the Thames). Finally, Germanic *ēɣ⁽ʷ⁾ijaz (cf. the ON ocean-giant Ægir, OE ǣġ(e) ‘island, sea, sea-coast’) may be related provided that the word is old enough to reflect some characteristic “special effects” of laryngeal colouring: Lat. a- and Gmc. *a- would together point to an initial *h₂a-, but *ē-, if cognate, would imply an old lengthened grade *h₂ē-, immune to the a-colouring effect of *h₂. All this is highly speculative, especially in the absence of any uncontroversial cognates of aqua outside Latin and Germanic. The IE reconstruction *h₂ákʷah₂ is often encountered in the linguistic literature. While not impossible, it is hardly warranted by the comparative evidence. Moreover, even if the word is genuinely old within IE, neither Latin nor Germanic can tell us if we should reconstruct an intervocalic *-kʷ- or *-ḱw-. If the latter, one might attempt to connect the ‘river/water’ word with the IE adjective meaning ‘swift, fast’ (traditionally reconstructed as *ōḱú-, with an initial *ō which conceals some puzzling combination of PIE vowels and laryngeals, not yet unravelled to everyone’s satisfaction). In that case, however, we must posit an evolutionary chain like ‘swift’ → ‘rapid current’ → ‘river’ → ‘water’ to account for the semantics.
If there’s any truth in this suggestion, the meaning ‘water’ is highly derived, and there was originally nothing aquatic about the PIE root that produced the Latin and Germanic terms. 

I have only touched upon the problems surrounding aqua and its kin. A full discussion would not change the bottom line: *akʷā (or any laryngeally revamped version thereof) is not a valid PIE reconstruction. The words we find in Germanic and Latin are regional, not common Indo-European. Their pedigree is uncertain; they may be loans from an unidentified pre-IE substrate (in which case their deeper history is unknowable for lack of data). If they are derived from an internal IE source, then in all likelihood the link with streams, rivers, and finally water as a substance is a late product of semantic evolution. The Anatolian and Tocharian words for ‘drinking’ belong to a totally different word-family despite their misleading resemblance. The famous Hittite phrase wātar⸗ma ekutteni ‘and you will drink water’ (part of the sentence that triggered Hrozný’s eureka experience) does contain a cognate of English water, but not one of Latin aqua.


05 June 2013

A Wiki-Wiki Interlude

This is not about water, but it is too good to miss.

Hawai‘ian phonology is simple, but its history is fascinating. Proto-Eastern Polynesian *k was shifted to a (phonemic) glottal stop /ʔ/ in Hawai‘ian (that is what the inverted comma in Hawai‘i stands for),  which left the coronal stop *t with a lot of free space to expand into (there were no other stops or fricatives articulated with the involvement of any part of the tongue). As a result, most of the allophones of *t migrated away from their original point of articulation, towards the soft palate, until *t basically changed into /k/, reaching the position vacated by the old shifted velar. To be more precise, today [k] is the main phonetic realisation of /k/ (former *t), but in some positions the pronunciation may still be [t], and in fact just about any non-labial and non-glottal obstruent (stop, fricative or affricate) may be employed as an allophone of /k/.

Thanks to this highly unusual place-of-articulation shift the Central East Polynesian adjective *witi ‘quick, lively’ became Hawai‘ian wiki (mind you, it can still be pronounced ['witi] or ['viti], but the shifted pronunciation ['wiki] brings it phonetically closer to English quick and increases the odds of its being picked up by an English-speaker). Thus was born one of the most successful linguistic replicators of today. For centuries the virus was more or less confined to its insular homeland, but in the mid-1990s it infected the mind of an American computer programmer visiting the islands. Before long, all major language communities had their Wikis. There is of course a Hawai‘ian one as well!
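The chain shift described above (*k > ʔ, with *t then moving into the vacated velar slot) can be sketched as a simultaneous substitution. This is a toy model only: it works on plain letter strings, ignores the allophonic variation mentioned above, and the function names are invented for illustration. The second function shows why the order of the two changes matters.

```python
# Simultaneous mapping: every proto-segment is looked up once, so an
# original *t that becomes k is not fed back into the *k > ʔ change.
CHAIN_SHIFT = str.maketrans({"k": "ʔ", "t": "k"})


def hawaiian_shift(proto_form):
    """Apply *k > ʔ and *t > k to a proto-form written in plain letters."""
    return proto_form.translate(CHAIN_SHIFT)


def wrong_order(proto_form):
    """Apply *t > k first and *k > ʔ second (the historically wrong order).

    The new k from *t is then swept along by the second change.
    """
    return proto_form.replace("t", "k").replace("k", "ʔ")


print(hawaiian_shift("witi"))  # wiki
print(wrong_order("witi"))     # wiʔi
```

Only the ordering in which the velar moves away first yields wiki; had *t shifted before *k, the two stops would have merged as a glottal stop.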

I want to thank Lara Prescott for bringing this beautiful infographic presentation to my attention, and I hasten to share it wiki-wiki.

01 June 2013

Wild Waters


I apologise in advance if what you find below is technical and hard to follow, but I am still talking of the comparative method. If you prefer something easy, I recommend mass comparison.

Old Indic ap- ‘water’ is a curious word. It is a feminine root noun (its stem is a bare root morpheme with no suffix), and Indo-European root nouns are generally interesting. They are primitive formations, inherited rather than borrowed, often charmingly irregular and likely to reveal some little secrets on close examination. To begin with, the declension of ap- is somewhat defective. Some of its case forms in the singular are not attested at all, and those that are occur exclusively in the archaic Vedic dialect, while Classical Sanskrit knows only plural forms. The stem has two variants, strong āp- (nom.pl.  ā́pas) and weak ap- (gen.sg. apás, loc.pl. apsú, etc.). A similar pattern can be seen in the Iranian languages, especially Avestan, where the nom.sg. āfš (< *āp-s) is preserved beside acc.sg. āpəm, nom.pl. āpō, contrasting with the weak stem of gen.sg. apō, gen.pl. apąm, etc. The pattern looks like a slightly reworked acrostatic paradigm, possibly *Hóp-/*Hép-, where *H is one of the PIE “laryngeals”. The original declension would have been like this:
  • nom.sg. *Hṓp-s
  • acc.sg. *Hóp-m̥
  • gen.sg. *Hép-s (→ *Hép-os → *Hep-ós, on the analogy of mobile stems)
  • nom.pl. *Hóp-es
etc.

One would expect the normal IE lengthening of the root vowel *o in the nom.sg.; in the acc.sg., voc.sg., and nom./voc. pl. the inherited *o would have occurred in an open syllable, a context in which it would have been affected by the Indo-Iranian lengthening known as Brugmann’s Law. In other case forms we presumably have something other than *o (so the laryngeal should be either the non-colouring *h₁ or the a-colouring *h₂). The presence of an initial laryngeal is demonstrated by vowel lengthening visible in compounds like Skt. dvīpá- ‘island’ < *dwi-Hp-ó- ‘with water on either side’. For reasons that will become clear in a moment, most specialists reconstruct the root as *{h₂ep-}, which, assuming an acrostatic paradigm, would have resulted in nom.sg. *h₂ṓps, gen.sg. *h₂áp(o)s, nom.pl. *h₂ópes. The Indo-Iranian word may mean not just ‘water’ (natural fresh water in lakes or rivers), but also the “celestial waters”, i.e. the sky, as well as “the Waters” personified as deities.

Outside of Indo-Iranian, we have a nice Tocharian cognate (Toch.A/B āp- f. ‘water, river’, with a vowel that could reflect *ō or *a), and a few more doubtful ones: Old Prussian ape ‘stream’, as if from *h₂ap-ijah₂, cf. Vedic ápya- ‘aquatic’ (similar words in Lithuanian and Latvian begin with u-, which makes comparison problematic). No forms with a reflex of *e are visible anywhere, which favours the reconstruction of *h₂ as the initial.

There are also a number of possibly related words in Italic, Celtic and Anatolian, which mean ‘river, stream’ and present some characteristic problems as a group. In Anatolian, we find Hittite hapas, Palaic hāpna-, Cuneiform Luwian hāpa/i- (all meaning ‘river’), and the Lycian verb χba(i)- ‘to water, irrigate’ (plus a cognate verb in Hittite, apparently borrowed from Luwian). Together, they would confirm the reconstruction of the initial laryngeal as *h₂ (*h₁ was not preserved in Anatolian, and word-initial *h₃ seems to have been lost in Lycian). Unfortunately, the medial stop in Anatolian cannot reflect *p, whose outcome would have been rendered as -pp-; a single spelling reflects a PIE voiced stop. That’s why the root underlying the Anatolian words is often reconstructed as *h₂abʰ-, not *h₂ap- (and not *h₂ab- either, since *b was vanishingly rare or even non-existent in PIE).
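The spelling argument can be expressed as a small check. This is a minimal sketch assuming simplified transliterations written with p, t, k and no sign boundaries; the function name is hypothetical, and `happas` is an invented contrast form, not an attested word.

```python
import re

# First stop between two vowels, optionally written double.
_MEDIAL_STOP = re.compile(r"[aeiuāēīū]([ptk])(\1?)[aeiuāēīū]")


def medial_stop_spelling(word):
    """Classify the first intervocalic stop as 'single' or 'geminate'.

    Returns None when the word has no stop between two vowels.
    A single spelling points to a PIE voiced stop, a geminate spelling
    to a voiceless one (the rule of thumb used in the text).
    """
    match = _MEDIAL_STOP.search(word)
    if match is None:
        return None
    return "geminate" if match.group(2) else "single"


print(medial_stop_spelling("hapas"))   # single
print(medial_stop_spelling("ekuzi"))   # single
print(medial_stop_spelling("happas"))  # geminate (invented form)
```

Both attested forms come out as single, which is why neither can continue a voiceless stop directly.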

The wild waters of one of the British Avons (Devon)
[hat tip: Simon and Fiona]
Latin amnis ‘river’ could reflect *h₂ap-ni- (with a regular nasal assimilation), but if related to Palaic hāpna-, it would be better analysed as *h₂abʰ-ni- (which would have yielded the same Latin outcome). This seems to be confirmed by the Celtic nasal stem *abon- (Old Irish aub < *abū < *abō(n) ‘river’) and its synonymous derivative *abonā (Welsh afon), known from a number of tautological hydronyms in Britain (the River Avon is literally ‘the River River’). It would seem, therefore, that we actually have two “watery” roots, *h₂ap-, found in Indo-Iranian and Tocharian (with possible trace attestation elsewhere), and *h₂abʰ- (less likely *h₂ab-) in Anatolian, Latin, and Celtic. The distribution is puzzling and the roots are suspiciously similar, but *p and *b(ʰ) do not vary freely in the same morpheme in PIE. Are the roots different and their similarity accidental? Or is it some kind of aberrant dialectal variation in the protolanguage? Such variation is often taken for granted by etymological dictionaries, but it’s clearly a case of relaxing the sound standards of comparison. It would be much nicer to be able to unify the etymologies without special pleading.

A possible connection between the two variants was suggested by Eric Hamp in 1972. PIE had a quasi-possessive suffix first described by Karl Hoffmann back in 1955 and named after him. The shape of the Hoffmann suffix is *-Hon-/*-Hn-. Hoffmann himself supposed that the initial laryngeal was *h₁ (probably = IPA [h]), but some identify it as *h₃. There’s little evidence either way, to be sure, but it has long been known that *h₃ may be responsible for voicing a preceding obstruent (hence the idea that *h₃ was a voiced fricative, IPA [ɣ] or the like). The best example is the reduplicated present stem *pí-ph₃-e/o- > *píbe/o- ‘drink’ (from the root *{peh₃(i)-}). Hamp proposed that *abon- reflected *h₂abh₃on- ‘having/carrying water’, i.e. *h₂ap- extended with the Hoffmann suffix. The Latin and Palaic forms would be analysable as derivatives of the same word: *h₂ab(h₃)n-o- ~ *h₂ab(h₃)n-i-.
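Hamp's derivation amounts to two steps: voicing of a stop directly before *h₃, followed by loss of the laryngeal. The sketch below models only those steps on plain strings. The function names are invented for illustration, morpheme boundaries are left out, and only plain p, t, k are handled (labiovelars and palatals are ignored).

```python
import re

VOICED = {"p": "b", "t": "d", "k": "g"}


def voice_before_h3(form):
    """Voice a plain voiceless stop that stands directly before h₃."""
    return re.sub(r"[ptk](?=h₃)", lambda m: VOICED[m.group(0)], form)


def drop_h3(form):
    """Remove h₃ once it has done its voicing work."""
    return form.replace("h₃", "")


# Root *h₂ap- plus Hoffmann suffix *-h₃on-, written without the hyphen.
print(voice_before_h3("h₂aph₃on"))         # h₂abh₃on
# Reduplicated present *pí-ph₃-e-, likewise without hyphens.
print(drop_h3(voice_before_h3("píph₃e")))  # píbe
```

Only the stop adjacent to *h₃ is affected, so the initial p of the reduplicated stem stays voiceless.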

But what about Hittite hapas, which does not seem to contain the Hoffmann suffix? Well, it may contain it after all. PIE *h₂abh₃on- would have become pre-Hittite *xaban- (*h₃ was lost word-medially in Anatolian). But there was a strong tendency in Hittite for animate n-stems to adopt a-stem inflections. The pivot of the change was the nom.sg., which lost its final *-n early (already in PIE) but acquired a secondary -s in Anatolian on the analogy of other types of animate stems; cf. *h₃ór-ō(n) ‘eagle’, acc.sg. *h₃ór-on-m̥ > Hitt. nom.sg. hāras, acc.sg. hāran-an (n-stem) → hāra-n (a-stem). Indeed, the Hittite ‘river’ word is attested several times with n-stem endings, which lends credence to the hypothesis that hapa- is an original n-stem (Proto-Anatolian *xábō(-s)/*xabn-), and is in fact an exact cognate of Old Irish aub.

Thus the reconstruction of the Hoffmann suffix as *-h₃on-/*-h₃n-, with a laryngeal that triggers voicing in a preceding voiceless segment, allows us to derive all the forms under discussion from one acrostatic root noun *h₂óp-/*h₂áp-. A slightly different alternative solution, also possible though more controversial, would be *h₂ā́p-/*h₂áp-, with an acrostatic *ā/a alternation (fundamental rather than due to laryngeal colouring; some Indo-Europeanists deny the existence of such a pattern). In either case the weak stem is *h₂ap-, and we really can’t know whether the Indo-Iranian long vowel in the strong cases reflects *o lengthened by Brugmann’s Law, or inherited *ā. I’ll tentatively accept the former possibility (without ruling out the latter). The root noun itself is attested securely but less widely than its most important derivative, *h₂ap-h₃on- > *h₂ab(h₃)on- ‘river’. On the whole, the analysis sketched above is weaker than the reconstruction of *wódr̥/*wédn-. Some linguists do not find the identification of the laryngeal in the Hoffmann suffix as *h₃ convincing, and are happy with the reconstruction of alternative roots (or root variants) for ‘water/river’. To my mind, Hamp’s solution is elegant and parsimonious (it prevents us from positing extra variants beyond necessity).

Note that the gender of *h₂ṓp-s/*h₂áp- is feminine in Indo-Iranian (animate in PIE terms), as opposed to the neuter (inanimate) gender of PIE *wódr̥/*wédn-. The distribution of both words and their derivatives (in both primary subfamilies of Indo-European, sometimes in one and the same branch, and without any geographical restrictions – from Ireland to India, Central Asia and Chinese Turkestan) guarantees protolanguage status for both of them. The gender difference, the mythological significance of Indo-Iranian *Hap- (not shared with *udan-), and the fact that *h₂ap- seems to have been preferentially used in other IE branches to derive words with the meaning ‘river, stream’, suggest that the words were not quite synonymous, and that the Indo-Europeans may have been like the modern Hopi Indians in having two separate concepts corresponding to English water: ‘tame water’ contained for human use (like Hopi kuuyi) versus ‘wild water’ as a natural force beyond human control (like Hopi paahu). It’s the latter kind that could be personified or even deified. Note the potential problem for long-range research: even “Swadesh” meanings are not necessarily as fundamental as we tend to imagine. If one wants to compare the IE ‘water’ terms with putative external cognates, the question arises which aspect of ‘water’ is more representative of H₂O. All right, then: which of the two do mass-comparatists mean when they talk of “the PIE word for water”? Surprisingly, neither, as we shall see next time.