Margaret Magnus Dissertation Intro

What's in a Word?
Studies in Phonosemantics

by
Margaret Magnus

Submitted to NTNU for evaluation for the degree 'Doctor Philosophiae'
4/20/01

Accepted for degree of Doctor Philosophiae
11/16/01
University of Trondheim
Trondheim, Norway
Lars Hellan, Catherine Chvany, Greg Carlson, Wim van Dommelen

Table of Contents

0. Abstract

1. Introduction

1.1 Conflicting Data

1.2 Overview of Major Results

1.3 Methods Employed

1.4 Brief Outline

2. Overview of the Phonosemantics Literature

2.1 The Beginnings of Phonosemantics

2.1.1 The Ancients

2.1.2 The 17th-19th Centuries

2.2 Prewar Phonosemantics -- Major Trends in the 20th Century

2.2.1 Maurice Grammont

2.2.2 Velemir Khlebnikov

2.2.3 Leonard Bloomfield

2.2.4 Psycholinguistic Experiments -- Sapir et al.

2.2.5 Otto Jespersen

2.2.6 Richard Paget

2.2.7 African Ideophones -- Doke et al.

2.2.8 John Rupert Firth

2.3 Structuralism -- Saussure

2.4 Postwar Phonosemantics

2.4.1 Dwight Bolinger

2.4.2 Ivan Fónagy

2.4.3 Hans Marchand

2.4.4 Suitbert Ertel

2.4.5 Gérard Genette

2.4.6 Roman Jakobson

2.4.7 Roger Williams Wescott

2.4.8 Richard Rhodes & John Lawler

2.4.9 Keith McCune

2.4.10 Yakov Malkiel

2.5 Research after 1990

3. Theoretical Preliminaries

3.1 Recapitulation of Basic Issues

3.2 Classification Systems

3.3 A Small Scale Example of the Phonosemantic Experiment

3.4 Overview of the Experiments to Be Conducted

4. Phonosemantic Experiments

4.1 Experiment 1 -- Classification First by Phoneme Sequence and then by Semantic Domain

4.1.1 Methodology

4.1.2 Example

4.1.3 Discussion of Findings

4.1.3.1 Overview

4.1.3.2 Semantic Domains of the Consonants

4.1.3.3 'Exceptional' Words and Concrete Noun Classes

4.1.3.4 The Senses of a Word

4.1.3.5 The Positional Effect

4.1.3.6 Summary of Results of Experiment 1 and Outline of Resultant Theories about Language

4.2 Experiment 2 -- Classification First by Phoneme Sequence, Subclassification by Semantic Domain and then Regrouping of Different Phonemes by Semantic Domain

4.2.1 Methodology

4.2.2 Example

4.2.3 Discussion of Findings

4.2.3.1 Evidence this Experiment Provides for the Major Theses in this Dissertation

4.2.3.2 Common Semantic Domains for /r/ in Second Position

4.2.3.3 Characterizations of the Phonetic Features

4.2.3.4 Characterizations of the Phonetic Features Sorted by Semantic Class

4.3 Experiment 3 -- Natural Classes for Arbitrary Sets of Words

4.3.1 Methodology

4.3.2 Example

4.3.3 Discussion of Findings

4.4 Experiment 4 -- Classify Words Containing a Phoneme Sequence X into a Classification Designed for Words Containing Phoneme Sequence Y

4.4.1 Methodology

4.4.2 Example

4.4.3 Discussion of Findings

4.5 Experiment 5 -- Monolingual Classification First by Semantic Domain, then by Phoneme -- Concrete Noun Classes -- Words Referring to Walking

4.5.1 Methodology

4.5.2 Example

4.5.3 Discussion of Findings

4.6 Experiment 6 -- Monolingual Classification First by Semantic Domain, then by Phoneme -- Classes Typical of Certain Phonetic Features -- The Bias in the Labials

4.6.1 Methodology

4.6.2 Example

4.6.3 Discussion of Findings

4.6.3.1. Tendency for Certain Semantic Classes to Have Disproportionately Many Labials

4.6.3.2. Tendency for Labials to Appear Disproportionately in Certain Semantic Classes

4.7 Experiment 7 -- Multi-Lingual Classification First by Semantic Domain, then by Phoneme -- Words Referring to Locations

4.7.1 Methodology

4.7.2 Example

4.7.3 Discussion of Findings

4.8 Experiment 8 -- Positional Iconism, Comparison of Similar Phonemes

4.8.1 Methodology

4.8.2 Example

4.8.3 Discussion of Findings

4.9 Experiment 9 -- Reverse Phoneme Order

4.9.1 Methodology

4.9.2 Example

4.9.3 Discussion of Findings

4.10 Experiment 10 -- Cross Linguistic Phonesthemes /str/

4.10.1 Methodology

4.10.2 Example

4.10.3 Discussion of Findings

4.11 Experiment 11 -- Invented Definitions for Nonsense Words

4.11.1 Methodology

4.11.2 Example

4.11.3 Discussion of Findings

4.12 Experiment 12 -- More Narrowly Limited Semantic Characterizations of Nonsense Words

4.12.1 Methodology

4.12.2 Example

4.12.3 Discussion of Findings

4.13 Experiment 13 -- Invented Words for a Given Definition

4.13.1 Methodology

4.13.2 Example

4.13.3 Discussion of Findings

4.14 Experiment 14 -- Invented Words to Describe Images

4.14.1 Methodology

4.14.2 Example

4.14.3 Discussion of Findings

5. Some Observations Regarding the Nature and Structure of Language

5.1 Introduction

5.1.1 Informal Overview of the Empirical Facts

5.1.2 The Paradox

5.2 The Structure of a Word

5.2.1 Structural Levels

5.2.2 Semantic Levels

5.2.2.1 Iconism

5.2.2.2 Classification

5.2.2.3 Reference

5.2.2.3.1 Reference in General

5.2.2.3.2 Concrete Nouns

5.2.3 Semantic Association

5.2.4 Semantic Relations and Subcategorization

5.3 How the Proposed Word Structure Accounts for the Empirical Facts

5.3.1 Phoneme Physics and Classification

5.3.2 Phonosemantic Association and Iconism

5.3.3 Phonosemantic Association and Natural Classes

5.3.4 Iconic Meaning and Syntagmatic Context

5.3.5 Senses and Phonesthemes

5.3.6 Basic Words and Senses

5.4 Ramifications of Phonosemantics for Issues in Linguistic Theory

5.5.1 The Function of Language and Abstract Semantic Representations

5.5.2 Semantic Primitives

5.5.3 Universals

5.5.4 A Possible Mechanism by which Sound Shifts Interact with Phonosemantics

5.5.5 Resolution to the Cratylian Paradox

5.6 Future Research

5.7 Concluding Remarks

Endnotes

Bibliography

Appendix I

Appendix II

Appendix III

Appendix IV

Appendix V

Appendix VI

Appendix VII

Appendix VIII

Appendix IX

Appendix X

Appendix XI

Appendix XII

Appendix XIII

Appendix XIV

0. Abstract

The notion that there is a regular correlation between the form of a word and its meaning is, of course, controversial. In this dissertation my intention has been to shed light on that controversy by conducting a variety of tests -- for the most part on a fairly large scale -- which quantify the extent of the correspondence between sound and meaning in words. I found in the course of this project that phonosemantic correlations were much more pervasive than I initially anticipated and certainly greater than is generally supposed in the linguistics literature. Furthermore, I cannot but see that these tests show that quite general natural laws are productively operative in language which account for most of the correlations observed. If further research indeed corroborates my findings, then it follows that the meaning of every word in every language is in part (only in part!) inherent in its form. The sign is therefore not wholly arbitrary, and it is not possible to devise an abstract representation of language which is entirely unrelated to the form of language itself. The most important results of the experiments in this dissertation seem to me to be these:

* I find that much confusion regarding linguistic iconism can be attributed to the assumption that 'word semantics' is best understood as 'word reference'. I believe these tests show this presumption to be unhelpful. If a word's meaning is analyzed into components -- only one of which is its referent -- it can be shown that some aspects of a word's meaning are arbitrary and others are not. It's therefore not the case that in some words or languages iconism holds more sway than in others. Rather since all words must have these requisite semantic components in order to function at all, the semantics of any word must be in part predictable from its form and in part not.

* Reference is essentially arbitrary. One cannot predict the referent of a word just by hearing it. In words with more concrete reference, the component of reference is more salient, and the iconic sound-meaning is consequently less salient. Therefore, the apparent effect of the sound-meaning is inversely proportional to the concreteness of the referent.

* Individual phonemes and phonetic features are meaning-bearing. They each have a unique semantics which can be identified by first measuring the semantic disproportions within phonologically defined classes of words and then the converse -- measuring the phonological disproportions within semantic classes. One finds in this way that every word which contains a given phoneme bears an element of meaning which is absent in words not containing this phoneme. One finds further that the effect of the phoneme-meaning varies with the position that the phoneme bears within the syllable. In addition, one finds that all phonemes which have a common phonetic feature also have a common element of meaning.

* It is important to distinguish types of sound-meaning correlations:

- The least fundamental kind of sound-meaning correlation is onomatopoeia. It does not concern me in this dissertation.

- The type of correlation which accounts for the 'phonesthemes' or disproportions between semantic classes and phonological form is most commonly called 'Clustering'. I refer to it also as Phonosemantic Association in order to emphasize that it is a side-effect of a natural and productive tendency in human psychology to associate any form with a coherent referent.

- The most fundamental and least salient type of linguistic iconism I will refer to as 'True Iconism', or the level on which form and content are one. This type of correlation is universal, productive in every word, non-arbitrary, and blind to all higher level linguistic distinctions such as referent, part of speech, semantic class and argument structure.

I believe this dissertation provides stronger evidence for these 4 findings than any I have come across anywhere in the existing literature.

1. Introduction

1.1 Conflicting Data

The basic thesis presented in this dissertation -- that there is some level of regular correlation between the phonetics of a word and its meaning -- is controversial. Though the presumption of 'arbitrariness of the sign' seems to have dominated linguistic science since the mid-1960's, this has not always been the case. Apart from Hjelmslev and de Saussure, many of what we think of as 'great' pre-War linguists (Bloomfield, Jespersen, Sapir, Firth) wrote works in support of the position that either the sound or the articulation of words has a synchronic, productive effect on their meaning. In The Sound Shape of Language, Jakobson and Waugh wrote, "Linguists have begun to turn their attention toward the immediate and autonomous significance of the constituents of the verbal sound shape in the life of language... One cannot but agree with Coseriu (1969) when he acclaims Georg von der Gabelentz (1840-1893) as a 'precursor of present day linguistics' and especially as a promoter of the fruitful ideas on sound symbolism." [1] The generativists did not, of course, end up following what Jakobson and Waugh perceived to be a rising interest in phonosemantics. To my knowledge, not a single phonosemantic work was written within the generative tradition, though many generative works do presuppose or explicitly claim the converse -- that the sign is completely arbitrary.

I believe it can be demonstrated that much of this controversy is due to a general failure within the field to come to an adequate understanding of what is meant by terms such as 'arbitrary' and 'word semantics' or 'meaning'. Specifically, 'meaning' has been largely limited to 'reference'. Clearly, one cannot predict the referent of a word from its form. Every word is of course arbitrary in this sense. I would only take issue with the presupposition that all word semantics can be reduced to reference.

One of the fundamental debates in linguistics -- and the primary debate which concerns me in this dissertation -- is most commonly known as the conventionalist/naturalist opposition. In my view, much of the uninteresting literature surrounding this debate can be traced back to two related false assumptions, one most commonly made by the naturalists, and the other by the conventionalists. In recent decades, conventionalism has been more in vogue, and consequently, throughout the latter part of this century, we seem for the most part to have been drawing the following conclusion:

The Conventionalist Overgeneralization
We cannot predict what referent a given sequence of phonemes will have in a given language. Therefore, there is no synchronic, productive correlation between the phonetics and the semantics of words whatsoever.

This reasoning fails on two counts. In the first place, the fact that no correlation between two phenomena has been found is not evidence that none exists. Existence of anything is much easier to prove than non-existence. Furthermore, this position presupposes that word semantics can be completely reduced to word reference -- an assumption that I will question deeply in the present work. The evidence provided in this dissertation suggests that certain aspects of a word's semantics can be predicted from its form, and others -- most notably and saliently the referent of the word -- cannot be.

The naturalists have drawn the converse conclusion based on the very same erroneous assumption -- that word semantics cannot be analyzed into identifiable components:

The Naturalist Overgeneralization
Some aspects of word semantics are derivable from phonetics, therefore all word semantics is derivable from phonetics.

In my view and in the view of most of the literature in phonosemantics dating back to Plato, neither of these positions is tenable. I believe the 14 experiments in this thesis show that word meanings are decomposable into various components, some of which are arbitrary and some not. Since no word can function without all these components, it follows that all word meanings are in part arbitrary and in part predictable from their form. Specifically, the referent determines what the word is. The sound does not directly affect what a word denotes, but what it connotes, not what it is, but what it is like. That is, just by hearing the sound 'brump' in a language, one cannot predict whether the word refers to a sound or an animal or a verb of motion. But if 'brump' refers to a verb of motion, it will involve an initial breaching of some kind of impediment and a sudden, forceful conclusion.

1.2 Overview of Major Results

In this section, I will make no attempt whatever to substantiate what I consider to be my most important results -- I am only trying to explain what the results are. The reader is asked to withhold judgement regarding their validity until the evidence from the 14 experiments has been considered. Each of them will be discussed briefly in turn in this section:

1. The Phonosemantic Hypothesis
2. The Arbitrary Nature of Reference
3. Word Semantics is Not Reducible to Reference
4. The Universal Character of Clustering or Semantic Association
5. The Universal Character of True Iconism

One result of the 14 experiments outlined in this dissertation is to provide evidence for the following strong thesis:

The Phonosemantic Hypothesis
In every language of the world, every word containing a given phoneme has some specific element of meaning which is lacking in words not containing that phoneme. In this sense, we can say that every phoneme is meaning-bearing. The meaning that the phoneme bears is rooted in its articulation.

I am not hereby implying that the semantics of every or even any word is wholly determined by its form -- it is not. In arguments for the Conventionalist or the Naturalist Overgeneralizations, word semantics is nearly always presupposed to be a sort of unanalyzed, amorphous blob vaguely identical to the word's referent. It's my contention that a word's semantics has a definite structure and that a word means more than what it refers to. I therefore deal with the overwhelming masses of apparent counterevidence to the Phonosemantic Hypothesis (the existence of dialects, regular sound change -- both synchronic and diachronic, paradigmatic and syntagmatic, the impossibility of predicting referents based on phonetic form, etc.) by analyzing the structure of word semantics into discrete components with identifiable functions. Having done this, I can then show that some of these components are arbitrary in nature and others are not. These counterexamples concern only the arbitrary aspects of the word's semantics -- primarily its referent.

Let me here briefly describe what I understand to be the relationship between reference and semantic classes. Words which share a common element of reference are said to fall in the same 'semantic class'. The more unique and unambiguous a word's referent, the more 'concrete' it is said to be, the fewer words share its narrowest semantic class. Semantic classes may be organized hierarchically. The word 'daffodil' is in a semantic class of its own, since there are no real synonyms for 'daffodil' in English. It also, however, falls in a wider semantic class of bulbs (i.e. the word 'daffodil' shares part of the referent of other bulb flowers, but also in part has a referent that is unique only to it), and in a yet wider class of flowers in general, etc. I do not think it is most profitable to assume that each word in a language has a unique referent. Rather I think each word has a unique meaning, but that words frequently share their referents with other 'synonymous' words. For example, although I think the word 'daffodil' does have a unique referent (i.e. no real synonyms, as is typical of Concrete Nouns), I think the senses of the words 'stamp', 'stomp' and 'tamp' which concern striking the foot against the ground all are most effectively viewed as sharing the same referent and differing semantically only by their various sound-meanings. The reason I think this is the best way to look at it, is that I believe that the semantic differences between these particular senses of 'stamp', 'tramp', 'stomp', 'step', 'tamp' and related words can be shown to correlate very nicely with the variations in their phonological form.

I am assuming that a single string of phonemes can have several different referents, commonly thought of as 'word senses'. I frequently use the term 'word' when I have in mind a single word 'sense', one of several possible referents. Thus, I am assuming the phoneme string 'stamp' has, among others, a different referent than the one which fits in this particular semantic class, namely that of a postage stamp. On the other hand, the word 'daffodil' has, as far as I know, a single referent in English, and furthermore, no other words in English share that referent entirely. The phoneme sequences 'stomp' and 'tamp' also, as far as I know, have a single referent, but it is not unique to them -- they share this referent between them. The phoneme sequences 'stamp', 'tramp' and 'step' all have several referents, only one of which is the same as that of 'stomp'. There is a great deal to be said about the structure of a word which I will not delve into much in the present work, for that would take me very far afield. Typically when the various referents of a single phoneme string are obviously related by, for example, hyponymy or metaphor, they are thought of as 'senses' of the same word. Terms like 'word' and 'sense' are not at all well-defined, unfortunately, but it's impossible not to use them. Let the reader know, therefore, that I am aware of potential misunderstandings that can arise because of this, and that I will try to avoid them by being explicit when necessary.

Summarizing, then:

The Arbitrariness of Reference and Semantic Classes
The referent of a word cannot be predicted from its form. The fewer exact synonyms that a word has (the smaller the set of words that share its referent exactly) the more 'concrete' its 'reference'. The salience of iconic meaning in a word is related inversely to the concreteness of its reference.

Word Semantics is Structured
Word semantics has a definite structure. 'Word semantics' cannot be reduced to 'word reference'. A word's semantics includes among other things its part of speech, its semantic class, its argument structure, the corresponding selectional restrictions, its referent and its phonological form. Some of these aspects of word semantics are 'arbitrary' in nature (in Saussure's sense) and others are not.

A very common objection to generalizations like the Phonosemantic Hypothesis is that one cannot in principle claim anything of such universal character without having examined every word in every language. I would actually state this objection even more strongly. One could not make such a universal claim as the Phonosemantic Hypothesis even after having studied every word in every language. Such universal claims cannot be made unless it can be shown that the relevant effects can be attributed to natural laws. For example, gravity is a natural law, and using it, one can predict that objects when dropped will fall to the ground on Mars; they will not float upward. One can make this prediction without having ever turned a telescope on Mars, because one has understood that gravity must apply to anything composed of matter, even to planets one has never examined. One cannot, however, predict how fast objects will fall to the ground on Mars without having somehow estimated its mass. Similarly, if it can be shown that linguistic iconism reflects a natural law, then we will be able to predict that form must to some degree affect the semantics of every word in every language. However, that effect will vary within certain parameters, and we will not therefore be able to predict exactly what the effect of sound on meaning will be for a given word in an arbitrary language without, for example, analyzing how concrete the word's referent is.

The position taken in much of the literature arguing for the arbitrariness of the sign is that such phonesthemic disproportions are mere side effects of etymological processes and say nothing significant about the nature of language itself. I will provide evidence here that the phonesthemic disproportions are indeed subject to natural laws and processes and therefore say a great deal about the psychology of speakers. Let me propose here one such natural law or universal process which I believe to be responsible for much of the data which will be presented here, and which if valid, would mean that at least one aspect of linguistic iconism is universal in nature:

Semantic Association
When semantic domain S is associated disproportionately frequently with phonological form X, then people will be inclined to associate semantic domain S with phonological form X productively.

Phonosemantic Association
When semantic domain S is associated disproportionately frequently with phoneme X, then people will be inclined to associate semantic domain S with phoneme X productively.

Phonosemantic Association is therefore a special case of Semantic Association. It is Semantic Association at the phoneme level. Semantic Association obviously does take place on the level of an entire word. A phoneme sequence in the form of a word occurs disproportionately frequently in a certain context, and a child learning languages then continues to use that word in that context productively. It is generally acknowledged that Semantic Association happens also on the level of the morpheme, i.e. that morphemes are meaning-bearing. One of the primary questions I ask in this dissertation may be phrased as, "How far down on the linguistic hierarchy does Semantic Association apply?" Virtually no linguist would claim that Semantic Association does not happen on the level of the word or the morpheme. Does it then happen on the level of the syllable? Bolinger, Rhodes, Lawler and McCune all provide evidence that Semantic Association occurs on levels lower than the syllable. (I'll try throughout not to clutter my exposition with specific dates, when the works I have in mind are easily recoverable from the bibliography.) Does it then occur on the level of the phoneme? The phonetic feature? The Phonosemantic Hypothesis is saying essentially that Semantic Association applies at least on the level of the phoneme. I will also provide evidence that Semantic Association goes down even to the level of the phonetic feature.

On reflection, I do not believe this to be such a strange proposal. Obviously, a certain semantic domain occurs disproportionately frequently in conjunction with a word or a morpheme. A child hears a word or morpheme in a given limited way and goes on to use it productively in that limited way. Why then should it be so strange to imagine that this process happens organically on the lower levels of the syllable and the phoneme? Why should not the child hear a phoneme and associate it as well with a limited context just as s/he does a word or a morpheme? Indeed, it makes little sense to me that a child would apply such a process down to the level of the morpheme, but somehow decide it should be applied no lower. It seems more likely that Semantic Association either is a universal tendency and applies everywhere equally, or it isn't a tendency at all, and it applies nowhere. Analogously, a natural law in physics is presumed to apply universally and identically in all space and time frames if it applies at all. Furthermore, it seems to me that if Semantic Association were not a universal tendency -- at least on the level of the word -- then there would be no way to learn to talk at all.

One aspect of this research which eluded me for a long time was the recognition that Phonosemantic Association is not identical with True Iconism. Von Humboldt already in the middle of the 19th Century distinguished three types of linguistic iconism. One was the least pervasive type known as onomatopoeia. It is limited to a precise function and a very small semantic domain -- to words which either refer to a sound or to something which makes a sound -- and I will not discuss it in this dissertation. Another is the Clustering or Phonosemantic Association I have just outlined. And the third, the most fundamental, most universal, completely predictable and least salient type of iconism, is what I call 'True Iconism' -- the level on which form literally is meaning. I will sometimes call 'True Iconism' simply 'Iconism' in contexts where I think it cannot be confused with Clustering.

Phonosemantic Association has an element of arbitrariness in it. If a fundamental word like 'house' in a given language begins with an /h/, then Phonosemantic Association will cause words with similar sound and meaning to cluster to it, so that the language ends up with disproportionately many 'house' and 'home' words starting with /h/: hacienda, hall, hangar, harem, haunt/s, haven, hearth, hive, hogan, hold, hole, hollow, home, host, hostel, hotel, house, hovel, hut, hutch. The Process of Phonosemantic Association is, as far as I can tell, universal and potentially affects any word. But whether or not the basic word for 'house' in a given language starts with /h/ is a matter of reference and is arbitrary. And whether a group of speakers will tend to cluster a nonsense word like 'bamp' in the semantic class of 'collision' words with 'bump' or in the semantic class of 'incline' words with 'ramp' turns out also to be in part (though not entirely) arbitrary. So Clustering is not blind to semantic classes, hence not entirely blind to reference, and hence not entirely predictable -- it has an element of arbitrariness.

But True Iconism is completely predictable and completely blind to reference. It does not affect what semantic class the word falls into, what its part of speech is, what its argument structure is or anything else. It is purely meaning-as-form. It cannot even be described as a 'tendency' or a 'process' the way Semantic Association is. It lies even deeper than that. One can therefore see True Iconism most clearly once one has abstracted away from all other aspects of word semantics and examines a class of word senses which effectively have the same referent and argument structure: {flit, flitter, float, flutter, fly} or {stamp, tramp, tamp, tromp, step, stomp, ...}. I am suggesting that what distinguishes word senses which are as similar as these from one another is basically how they sound. In the first class, the final /ur/ makes the movement repetitive, the short /i/ makes the movement quick and short. In the second class, a pre-final /m/ makes the contact with the ground heavy. A pre-vocalic /r/ makes the motion go forward, and so forth. Let me define here a bit more formally what I mean by True Iconism so I can refer to it later.

True Iconism
True Iconism is the level on which a word means what it is. Viewed from the perspective of 'parole', True Iconism is among the least salient aspect(s) of word semantics often masked or buried by other levels. From the perspective of 'langue', True Iconism is the most fundamental aspect of word semantics on top of which all other layers of semantics are superimposed. The form of a word does not directly affect what the word refers to, what its argument structure is, or any other aspect of its meaning. It only directly affects our understanding of what the word's referent is like, the word's connotation.

The form of a word does indirectly affect what a word refers to by Clustering. Clustering, in other words, is a process whereby words take on referents similar to the referents of similar sounding words which already exist in the language. It also causes a language to prefer borrowings that are compatible with the preexisting Clustering structure of the language, and if the borrowed word is not completely compatible, it tends to alter the word's meaning to make it compatible with the existing Clustering structure. To Iconism, on the other hand, reference is completely irrelevant.

1.3 Methods Employed

One finds in the literature two basic kinds of tests for sound-meaning correlations:

1. The existing vocabulary of a given language is classified according to both phonetic form and semantic domains to see whether certain phonemes are more or less prevalent in certain semantic domains than in others.

2. Informants are prompted with sounds, images, foreign words or nonsense words and asked to provide some kind of feedback based on their linguistic intuitions. These results are then examined to see if they display any sound-meaning patterns.

Tests of the first type tend to provide more specific data regarding the precise structure of word semantics than the second. However, no number of tests of type 1, regardless of coverage, can in principle prove that a universally productive natural law is involved. Tests of the second type can provide such evidence. Furthermore, tests of the first type tend to provide evidence for 'Clustering' or Phonosemantic Association, whereas the second type of test tends more readily to provide evidence for Iconism. Most of the tests outlined in this dissertation provide some evidence for both types of iconism.
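A test of the first type can be pictured as a small counting exercise. The sketch below is only an illustration under stated assumptions: the toy lexicon, its semantic labels and the function name `disproportion` are hypothetical and are not taken from the experiments. It compares how often a semantic domain occurs among words with a given initial consonant sequence with how often that domain occurs in the lexicon as a whole.

```python
# Hypothetical toy lexicon: (word, initial consonant sequence, semantic domain).
# The domain labels are illustrative assumptions, not the author's classification.
TOY_LEXICON = [
    ("glow", "gl", "light"),
    ("gleam", "gl", "light"),
    ("glitter", "gl", "light"),
    ("glove", "gl", "clothing"),
    ("shine", "sh", "light"),
    ("shoe", "sh", "clothing"),
    ("shirt", "sh", "clothing"),
    ("stomp", "st", "walking"),
    ("step", "st", "walking"),
    ("star", "st", "light"),
]


def disproportion(lexicon, onset, domain):
    """Ratio of a domain's share among words with `onset` to its share overall.

    A value above 1 means the domain is over-represented in that
    phonological class; a value below 1 means it is under-represented.
    """
    onset_words = [entry for entry in lexicon if entry[1] == onset]
    if not onset_words:
        raise ValueError(f"no words with onset {onset!r}")
    share_in_onset = sum(e[2] == domain for e in onset_words) / len(onset_words)
    share_overall = sum(e[2] == domain for e in lexicon) / len(lexicon)
    if share_overall == 0:
        raise ValueError(f"no words in domain {domain!r}")
    return share_in_onset / share_overall


print(disproportion(TOY_LEXICON, "gl", "light"))  # 1.5 for this toy data
```

With only ten invented entries the number itself means nothing. The point is the shape of the measurement, which becomes informative only when every word with the given phonological characterization has been classified.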

The first type of test consists in classifying words into phonesthemes. 'Phonestheme' is a term first coined by John Rupert Firth (1930) to refer to a sound sequence and a meaning with which it is frequently associated. An example of a phonestheme is the English /gl/ in initial position associated with indirect light:

Reflected or indirect light -- glare, gleam, glim, glimmer, glint, glisten, glister, glitter, gloaming, glow
Indirect Use of the Eyes -- glance, glaze/d, glimpse, glint
Reflecting Surfaces -- glacé, glacier, glair, glare, glass, glaze, gloss

These make up 19 of 46 of the words beginning in /gl/ in my active, monomorphemic vocabulary of English. (I'll discuss the other /gl/ words shortly.) Surely 'indirect light' is too narrow a semantic domain, and 41% too high a percentage to support a claim that the relationship between /gl/ and 'light' is completely arbitrary. Nor is it, of course, completely predictable. The hope is that by looking more carefully at phonesthemes and drawing our distinctions more finely, we will be able to determine just what is predictable and what is not. Let me here describe very generally how I arrive at the conclusions outlined in the preceding section.

* Phonemes are meaning-bearing: When one classifies all English monosyllables into phonesthemes, one finds that disproportionately many words containing, for example, /k/ refer to containers, lids, collisions, acquisition, sticking and the like, and disproportionately many words containing /t/ imply a goal without specification as to whether that goal is reached, and disproportionately many words containing /f/ involve 'flight'. And the disproportions are quite large. (Disproportions can, of course, only be quantified if one classifies all the words in a language with a given phonological characterization. If one's empirical base is incomplete, one cannot apply a quantitative method and can therefore make no substantive claims.) However, that in itself is not enough to show that phonemes are meaning-bearing, for /k/ doesn't 'mean' anything as simple as 'collision'. If one, however, then looks at all words referring to 'collisions', one finds that those containing /k/ are different in some identifiable way from those that do not contain /k/. These two types of data taken together, I believe, constitute very strong evidence that phonemes are meaning-bearing.

* The salience of sound-meaning in a word is inversely proportional to the concreteness of its referent: The basic evidence for this is that if one classifies a large set of words (like all the English monosyllables) into phonesthemes, one finds that about 3% don't fit in any phonestheme, and these 3% are always Concrete Nouns. That is to say, they do fit in one of the following Concrete Noun classes: people, titles, body parts, clothing, cloth, periods of time, games, animals, plants, plant parts, food, minerals, containers, vehicles, buildings, rooms, furniture, tools, weapons, musical instruments, colors, symbols, units of measurement.

* Semantic Association happens productively even on the level of the phoneme: One type of indirect evidence I will present that Phonosemantic Association is living and productive is the astounding generality of the phonesthemes as evidenced by tests of type 1. If there were no productive force maintaining this phonesthemic structure, then surely phonological shifts over the centuries would have long since disintegrated any discernible sound-meaning correlations in a language which has undergone as much change as English. Another more direct type of evidence involves tests of type 2 in which informants are asked to invent arbitrary definitions for nonsense words. When informants are asked to make up definitions for nonsense words beginning with 'gl-', a disproportionate percentage of their definitions concern reflected light or 'gluiness', just as a disproportionate percentage of /gl/ words in the English vocabulary concern reflected light or 'gluiness'. The informants therefore are productively 'Clustering' nonsense words with similar-sounding words in the existing vocabulary.

* The productive nature of True Iconism: One type of evidence for Iconism which has already been mentioned is to find a group of words which seem very similar in every way -- i.e. they have the same argument structure, part of speech, referent, etc. If one then compares these words, one finds that there is a quite regular and intuitively 'iconic' correspondence between their phoneme structure and their connotations. Another type of evidence for Iconism consists in comparing phonological forms with word semantics across languages. If words containing a phoneme sequence such as k-v-n or s-t-r are always limited to a narrow range of semantic classes across unrelated languages and vocabularies that are not cognate, then that sequence must have some universal meaning. Such tests provide evidence both for Clustering and for Iconism. Yet another type of evidence for Iconism is to ask naive informants to invent nonsense words to describe semi-abstract images. The words chosen for a given image are confined to a much more limited set of phonological forms than one would predict if the choice were purely arbitrary. Finally, phonesthemic classifications for a given phoneme resemble the phoneme's articulation. This would be a very strange coincidence indeed if there were no True Iconism active in language.
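As a rough illustration of the quantitative comparison mentioned in the first point above, the following Python sketch computes what share of a phonologically defined word class falls into one semantic domain, and how that share compares with a baseline share for the vocabulary as a whole. The function names and the baseline value are assumptions made purely for illustration; only the /gl/ counts (19 of 46) come from the text, and this is not the procedure actually used in the experiments.

```python
from fractions import Fraction


def domain_share(words_in_domain: int, words_total: int) -> Fraction:
    """Share of a fully classified word class that falls in one semantic domain."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    if not 0 <= words_in_domain <= words_total:
        raise ValueError("words_in_domain must lie between 0 and words_total")
    return Fraction(words_in_domain, words_total)


def disproportion(share_with_phoneme: Fraction, share_in_vocabulary: Fraction) -> Fraction:
    """Ratio of the two shares; a value above 1 means over-representation."""
    if share_in_vocabulary == 0:
        raise ValueError("baseline share must be non-zero")
    return share_with_phoneme / share_in_vocabulary


# Counts taken from the text: 19 of the 46 /gl/ words concern light or vision.
gl_share = domain_share(19, 46)
print(f"/gl/ share: {float(gl_share):.0%}")  # prints "/gl/ share: 41%"

# HYPOTHETICAL baseline, not from the source: the share of the whole
# vocabulary assumed to fall in the same semantic domain.
hypothetical_baseline = Fraction(1, 50)
ratio = disproportion(gl_share, hypothetical_baseline)
print(f"over-representation factor (illustrative only): {float(ratio):.1f}")
```

The sketch takes counts rather than word lists because, as noted above, a disproportion can only be quantified once every word with the given phonological characterization has been classified; the counts are the output of that classification, not a substitute for it.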

The initial experiments primarily test for Phonosemantic Association and the Phonosemantic Hypothesis, while the final experiments primarily test for Iconism; the intermediary experiments are evidence for both, with increasing emphasis on Iconism. The initial experiments offer more insight into the precise structure of word semantics, and the final experiments offer stronger evidence that these generalizations I have outlined are the result of productive natural laws, and are not solely explainable as historical artifacts. Since it is harder to see Iconism than Clustering, for the sake of ease of exposition, I present the Clustering experiments first.

1.4 Brief Outline

In chapter 2 of this dissertation, I will review several major works in what turns out to be a fairly extensive literature in phonosemantics. In chapter 3, I will outline the phenomenon in more detail as well as some theoretical preliminaries necessary to understanding the succeeding discussion.

In chapter 4, I present the data, methods and results for 14 experiments which yield positive evidence for a strong synchronic correlation between the phonological form and the semantics of words. I believe these to be repeatable experiments, in the sense that they can be applied with positive results by any native speaker to arbitrary phonemes, semantic classes and languages. If I am correct in this, then the results I present here submit to the fundamental requirement of all scientific claims, namely that they can in principle be falsified, but the results of repeatable experiments in fact support them. Indeed, the phonosemantic literature really consists in large part of a collection of hundreds of such varied experiments performed for languages worldwide and all yielding more or less the same conclusion with varying degrees of generality. Most of the tests presented in this work cover a large portion of the vocabulary and all have been applied to every word within a given semantic or phonological characterization. I believe I have been quite thorough in my coverage of the data so that I am in a position to quantify the results and draw conclusions from them. Most of the tests were applied to English; some were applied also to other languages; and in some cases, the language of the informant was irrelevant. The results of the tests are included in the Appendices. Each test is presented in the same order:

1. I describe in detail the method employed in the experiment.

2. For the sake of clarifying the discussion, I give an example of the results that appear in the relevant appendix. Hopefully, this will also make it possible to read through and understand the thesis without referring to the appendices.

3. I provide a detailed discussion of the results of the experiment, what I think the experiment shows, and what the consequences of the results are for linguistic science.

The concluding chapter 5 contains a theoretical discussion of all the results from all the tests taken as a whole. I also take up there some fundamental related issues in linguistics, such as semantic primitives, abstract semantic representations, linguistic universals, arbitrariness of the symbol, and the nature of semantic classes, and discuss how my findings affect my perspective on these issues.

*******

Note: Throughout the dissertation, I occasionally allow myself to describe the phoneme effects a little informally for the sake of clarity and ease of expression. For example, in the discussion of experiment 9, I write, "The combination /t//p/ is often off balance (tip, topple, trip, steep, stoop, stumped, tipple, tipsy, top (the toy))," rather than saying something like, "Words containing the consonants /t/ followed by /p/ often have an element of meaning which implies imbalance (tip, topple, trip, steep, stoop, stumped, tipple, tipsy, top (the toy))." I find this particular type of discussion is often facilitated by attributing a sort of poetic agency to the consonants themselves. But let the reader be aware that I am doing this consciously with the purpose of making the discussion easier to read.