Compound (linguistics)

This original article is about compounds in linguistics. Its aim is to provide properly sourced information about compounds, in part to support their treatment in dictionaries. The requirement for inline references at the level of individual sentences is taken more seriously than in most Wikipedia articles.

Definition

Compounds are easy to define approximately as words made from words. They are hard to define exactly, especially since it is difficult to distinguish compounds from phrases.[1][2]

The following definitions of "compound" can be found:

  • (1) A word composed of multiple words.[3][4]
  • (2) A word composed of multiple independent words or combining forms of words.[5]
  • (3) A word composed of multiple free morphemes.[6]
  • (4) A lexeme composed of multiple stems.[7]
  • (5) A noun, adjective or verb composed of multiple words or parts of words.[8]
  • (6) A sequence of multiple words that act as a single word.[9]
  • (7) A word or word sequence consisting of multiple parts that captures a specific concept, whether the parts are words or affixes.[10]

The definitions have different implications:

  • The definitions clearly requiring compounds to be words are (1), (2) and (3).
  • Definitions (2) and (4) are technical refinements of (1). "Lexeme" is a technical near-synonym of "word". The use of "stem" or "combining form" is required e.g. for Czech, where slunovrat is based on slunce and vrátit but composed of slun- and vrat-, which are word stems, not words.
  • Definition (5) seems broken: "noun, adjective or verb" has to include items written as multiple words in order to cover English open compounds, but if it does, it also covers phrases like "cat sitting on the mat", which is not a compound.
  • Definition (6) is unclear: it does not say what it means for something to "act as a single word". By saying "act", it allows compounds not to be single words.
  • Definition (7) requires compounds to capture a specific concept, which seems to suggest that compounds are not the sum of their parts. This cannot be so: many German compounds are the sum of their parts. Furthermore, it includes affixing under compounding; this makes sense for inflected languages: Czech vysokoškolský is a compound but requires the suffix -ský to be formed. A requirement for a compound proper, in contrast to a merely suffixed word such as lesní, could be that at least two of its parts are words. The case of vysokoškolský shows that definitions (1), (2), (3), (4), (5) and (6) are cross-linguistically inadequate: they work for English since hyphenated adjectival compounds take no suffix.

Demarcation

Compounds need to be distinguished from the following:

  • Affixed words, e.g. blueness. There may be ambiguity: is German "aufholen" made from the prefix "auf-" or the word "auf"? Is English "overcome" made from the prefix "over-" or the word "over"? Furthermore, a Merriam-Webster compound guide includes affixing under compounding.[10] Another source indicates that the distinction between compounding and affixing has been treated as problematic in the literature.[11]
  • Free non-compound phrases, e.g. green house (house that is green)[12] or cat that is on the mat. The phrase school bus traffic stop laws looks like a compound to some, but credentialed sources usually do not give such an example.
  • Full-sentence proverbs, e.g. all roads lead to Rome.
  • Phrasal verbs. Some non-credentialed sources give phrasal verbs "carry over" and "break up" as example compounds.[13][14] However, credentialed sources usually do not give such examples. On the other hand, when English phrasal verbs are considered to be single words, they meet the definitions of compounds. Still, sources usually do not define English phrasal verbs as words but rather as phrases.[15][16][17]

Part of speech

A compound's part of speech can be noun, adjective or verb.[18] Examples are "bus stop", "self-centered" and "windsurf".[18]

Detection criteria or tests

Compounds written with spaces present a special problem for detection.

The Cambridge Grammar of the English Language (CGEL) mentions stress, orthography, meaning, and productivity as playing a role in distinguishing compounds from non-compound phrases.[19] CGEL calls the non-compound noun phrases "composite nominals".[19] A further test is "coordination and modification": parts of non-compounds can "enter separately into relations of coordination and modification".[19]

Abdel Rahman Mitib Altakhaineh lists orthography, stress, modification, compositionality, displacement, insertion, referentiality, coordination, replacement of the second element by a pro-form, ellipsis, and inflection and linking element as tests.[11]

Livio Gaeta & Davide Ricca consider compounds to be morphological objects, independent of their lexical status.[20]

Wordhood

Since multiple sources define compounds as words, being a word is a criterion. However, a distinction needs to be made between "orthographic word"[21], "phonological word" and other notions of word.[19][22] Compounds are not necessarily orthographic words, as shown by the compound "high school". The term "lexical item" is broader than most notions of word, since it also covers proverbs.[22] Another notion is "morphological word".[21][23] While wordhood is usually a requirement, it is not a simple test but rather depends on a multitude of simpler tests.

Cross-linguistic uniformity or universality of notion

It may be difficult to arrive at a single universal cross-linguistic set of operational tests of compoundhood: "The first, very simple observation is that all languages examined here have morphological compounds. However, it turned out that the compounds in these languages do not all share the same defining properties. While lexical (compound) stress, headedness (either right or left), inseparability and debarment of word-internal inflection, recursiveness, and linking elements are generally considered essential criteria for the definition of compound, in particular from a German(ic) perspective, all of them also emerged as problematic in at least one language, or as non-existent. Thus, it seems that there is no universal definition of compound. Rather, as pointed out by Ralli (2013b: 184): 'What makes a compound morphological should be defined on a language-specific basis, since languages vary with respect to the realization of their morphological features and the use of morphologically-proper units.'"[1]

Unity among different linguists within the same language

Even within a single language, different treatments of compounds can be found in the literature, resulting in different classifications of candidate compounds as true compounds or not.

French is a language for which some linguists count the likes of pomme de terre as compounds.[24] Some linguists go so far as to claim French has no true compounding at all.[24]

The Italian linguistic tradition is divided over constructions such as zuppa di verdure.[25]

Spanish multi-word phrases león marino and paquete bomba were regarded as compounds by some but not others.[26]

Spelling or orthography

Words written solid or hyphenated are easier to recognize as compounds. Word sequences written with spaces present a problem: not every such sequence is a compound. For instance, "cat that is on the mat" is not a compound, whereas "high school" is a compound. Britannica's article on compounding gives no example of an open compound, implying it does not consider open compounds to be compounds.[27]

Spelling tests work well for some languages:

  • For German, all compounds are written without spaces, and writing them with spaces is a rare error.[28]
  • In Czech and Slovak, all compounds are spelled as one word, while syntactic phrases are spelled as separate words.[11]
  • Finnish: "As a general rule, Finnish compounds are written without space between the constituents"[29]
  • Greek: "Greek compounds display solid spelling, contrary to phrases, just as in German."[1]
  • In Polish, most compounds are spelled as one word without a hyphen, but there are exceptions such as Bośnia-Hercegowina and czarno-biały.[11]
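
As a rough illustration of why spelling is only a partial test, the following Python sketch sorts candidate items by the way they are written. The function name spelling_type and the three labels are assumptions made for this example and are not taken from any of the cited sources. An "open" result only means that the item contains a space; it does not decide whether the item is a compound or a free phrase.

```python
def spelling_type(item: str) -> str:
    """Classify the spelling of a candidate compound (hypothetical helper)."""
    text = item.strip()
    if " " in text:
        # Written with spaces: may be an open compound or a free phrase.
        return "open"
    if "-" in text:
        # Written with a hyphen, e.g. Polish czarno-biały.
        return "hyphenated"
    # Written solid, e.g. German Kürbissuppe.
    return "closed"


for example in ["Kürbissuppe", "czarno-biały", "high school", "cat that is on the mat"]:
    print(example, "->", spelling_type(example))
```

Both "high school" and "cat that is on the mat" come out as "open", which is exactly the case in which spelling gives no answer and further tests are needed.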

Composition: morphology vs. syntax

The name "compound" implies a composite object. However, both words and multi-word expressions are composite objects, the former made from morphemes (which include some words), the latter made from words. Two different kinds of composition are distinguished: morphological composition vs. syntactic composition.

To some extent, the distinction is unproblematic: "blueness" is a result of morphological composition while "the cat that is on the mat" is a result of syntactic composition. It is in the case of candidate open compounds such as "white house" vs. "White House" where the boundary becomes unclear in English.

"Compounds are the output of morphology, while MWEs [multi-word expressions] are the output of syntax. [...] The property of being morphological implies that an item is the output of some morphological schema or rule, which is different from a syntactic schema or rule."[1]

"in contrast to German it seems much more difficult to provide clear criteria for morphological compounds as opposed to MWEs in French, Spanish, and Italian."[1]

Phonology

English open compounds have a distinctive phonology.[27][30][31][12] Britannica distinguishes compounds from word groups or phrases by "stress, juncture, or vowel quality or by a combination of these".[27] However, while a great majority of English compounds written as single words stress the left component of the compound, a small minority of them stress the right-hand component instead.[32] There are also a number of double-stressed compounds.[33] Thus, for English, stress alone is not a universal criterion.

In Romance languages, "compounds and MWEs are basically stressed in the same way".[1]

Meaning and sum of parts

Some sources indicate that compounds are not the sum of their parts: their meaning cannot be derived from the meaning of their parts.[10][34][7] A Czech encyclopedia says compounds usually have a meaning different from the base words.[35] However, being more than the sum of parts is not a necessary condition: German compounds Tanzschule, Zirkusschule and many more are counterexamples, as are English compounds bookshop and appletree. Moreover, it is not a sufficient condition either: idiomatic proverbs are not compounds.

Separate inflection

Consisting of separately inflected parts is one test of non-compoundhood for highly inflected languages.[11] It works only for some of them:

The test has no value for English and Chinese.

The test fails for some languages:

  • In Spanish, some items considered compounds show separate inflection of parts.[11]
  • In Icelandic, there is compound-internal inflection.[1]

Linking element

The presence of a linking element may indicate compoundhood in some languages.[11] Thus, in German, Liebesbrief contains the linking element s.[11] However, this is not a necessary condition in German, as shown by Konzertreise.[11]
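
The two German examples can be made concrete with a small Python sketch that looks for a two-part split, optionally with a linking s between the parts. The word list LEXICON, the list LINKERS and the function split_with_linker are toy assumptions for this illustration only; real German uses more linking elements and a far larger vocabulary.

```python
# Toy word list, assumed for the example only.
LEXICON = {"liebe", "brief", "konzert", "reise"}
# Linking elements to try; the empty string means "no linking element".
LINKERS = ["", "s"]


def split_with_linker(compound: str):
    """Return all (first, linker, second) splits licensed by the toy lexicon."""
    word = compound.lower()
    results = []
    for i in range(1, len(word)):
        first = word[:i]
        if first not in LEXICON:
            continue
        rest = word[i:]
        for linker in LINKERS:
            # The remainder must start with the linker and continue with a known word.
            if rest.startswith(linker) and rest[len(linker):] in LEXICON:
                results.append((first, linker, rest[len(linker):]))
    return results


print(split_with_linker("Liebesbrief"))
print(split_with_linker("Konzertreise"))
```

Under these assumptions Liebesbrief is split with the linker s and Konzertreise with the empty linker, which mirrors the point that a linking element can signal a compound without being required for one.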

"(Native) linking elements, [...], do not exist in French and Italian."[1]

See also section Linking element examples.

Norms and prescriptions

Some sources for some languages prescribe compounds to be written without spaces:

  • Dutch: "Dutch orthography requires compounds to be written without an internal space."[37]
  • German: "Die Wörter Kürbissuppe, Zwiebelkuchen und Hairstudio werden nach deutschen Wortbildungsregeln zusammengeschrieben."[28]

Compound examples

Example compounds in various languages:

  • Ancient Greek: dermatology, democracy, pyromania, rhododendron[38], that is, δερματολογία, δημοκρατία, πυρομανία, ῥοδόδενδρον
  • Bulgarian: бензиностанция, бира-скара, пиле-грил, кафе-аперитив, фаст-фууд[39]
  • Chinese: 大褂儿[40]
  • Czech: zeměpis, olejomalba, vysokoškolský[35]
  • Danish: fyrværkerigrund, bankrådgivning, kulturkløft[41]
  • Dutch: jonggetrouwd, tandextractie, boerenzóon, koningszoon[37]
  • English: rowboat, high school, devil-may-care[3], crime-prone, grass-green, sky-blue, air-quote, dry-burn[42]
  • Estonian: lutipudel, riisipuder, noortööline[43]
  • Finnish: lentokoneonnettomuus[44], kesäyö, märkäpuku, metsäyhtiö[29]
  • French: timbre-poste, essuie-glace[24]
  • German: Kürbissuppe, Zwiebelkuchen, Hairstudio[28], Handelsvertrag, Affenhaus[44], Frischluft[36]
  • Greek: χαρτόκουτο[7], κεφαλόσκαλο, εθιμοτυπικός, κρυφοκοιτάζω[45]
  • Hebrew: beyt sefer[34] (בית ספר)
  • Hungarian: kisautó, kőkemény, városháza, tojásfehérje[46]
  • Icelandic: gufubátur, Norðausturatlantshafsfiskveiðinefndin[1]
  • Italian: pescecane, cavatappi, criminologo, trasporto latte, poeta pittore[47]
  • Latin: aequilibrium, multilateralis, carnivora[48]
  • Polish: czerwono-czarni, listopad, językoznawstwo[49], czcigodny, zmartwychwstały, drobnoustrój[50]
  • Russian: glubokomyslie[51], lesostep, zvukorežisser, senouborka[52]
  • Sanskrit: rājapūruṣāḥ, rāmakṛṣṇau[53]
  • Slovak: svetonázor[52]
  • Slovene, Slovenian: ȃvtocesta, vodomèt, očenàš[54]
  • Spanish: coliflor,[26] coche cama, bocacalle, telaraña[55]
  • Swedish: livbåt, livbåtsbesättning, flickebarn, människokärlek[44]

Non-compound examples

The following items are non-compound phrases:

  • Danish: røget laks, stor begivenhed[1]
  • Dutch: rode wijn, rijk versierd, koffie zetten[37]
  • English: piece of cake, dry cough, grass slug, hit the road, green card (card that is green), heavy smoker, kick the bucket[1]
  • German: weich wie Butter, schwarzer Tee, rotes Kraut, Spanisches Rohr, kalter Krieg[1]
  • Greek: psixrós pólemos, zóni asfalías[1]
  • Polish: kontrola jakości, karma dla zwierząt, numer telefonu, pasta do zębów[1]
  • Russian: novaja kniga, myľnaja opera, sredstva massovoj informacii[52]
  • Swedish: röda hund, hög hatt, ymnig grönska, duka bordet[1]

Linking element examples

Example compounds using a linking element:

Long compounds

Some languages tend to form long compounds, consisting of 3 or more word bases. Some examples:

Lists of compounds

A fairly extensive list of example English compounds is given in a non-native bachelor thesis written in English, sourced from English sources.[61]

Very long lists of compounds are available in Wiktionary categories such as Category:German compound terms. However, these are unreliable and subject to miscategorization.

Long compounds can be found in syllable-count categories such as Category:Finnish 11-syllable words, Category:German 9-syllable words, Category:Polish 9-syllable words and Category:Russian 11-syllable words. Not all of the members need to be compounds.

Neo-classical compounds

Some sources classify the likes of historiography, chromatography and immunological as "neo-classical compounds".[62] They are defined as "words consisting of two or more free morphemes (of Latin or Ancient Greek) which are bound, not free, in the modern language concerned, such as English biology."[62]

Treatment in dictionaries

Candidate compounds and multi-word phrases are treated in dictionaries as follows:

Machine translation

Translating closed compounds (those written solid, with no spaces or hyphens) is a relevant problem for machine translation from languages that form long compounds, such as German.[7] These languages form a huge number of transparent long closed compounds, for which it is impractical to maintain a translation dictionary. While breaking these compounds up into components is fairly easy for humans, it is non-trivial for machines. A sum-of-parts translation consists in breaking the compound into components and translating the components separately. An example of ambiguity is German "vereinbart", which is properly analyzed as a participle of "vereinbaren", but which a machine could analyze as Verein + Bart.[7] (However, even the machine could note that vereinbart is not capitalized and that it is therefore not a noun. Still, the principle remains.)
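
The ambiguity can be reproduced with a minimal dictionary-based splitter, sketched here in Python. This is not the method of the cited paper: the toy lexicon NOUNS and the functions split_compound and split_noun_compound are assumptions made for illustration, and linking elements are ignored. The second function adds the capitalization heuristic mentioned above, which would itself misfire on a sentence-initial word.

```python
# Toy lexicon of lower-cased German nouns, assumed for the example only.
NOUNS = {"verein", "bart", "kürbis", "suppe", "zwiebel", "kuchen"}


def split_compound(word, lexicon=NOUNS):
    """Return every way to segment `word` into two or more lexicon entries."""

    def segment(rest):
        # An empty remainder means the whole word has been consumed.
        if not rest:
            return [[]]
        found = []
        for i in range(1, len(rest) + 1):
            head = rest[:i]
            if head in lexicon:
                for tail in segment(rest[i:]):
                    found.append([head] + tail)
        return found

    return [parts for parts in segment(word.lower()) if len(parts) >= 2]


def split_noun_compound(word):
    """Split only capitalized tokens, since German nouns are capitalized."""
    if not word[:1].isupper():
        return []
    return split_compound(word)


print(split_compound("vereinbart"))        # naive splitter
print(split_noun_compound("vereinbart"))   # with the capitalization heuristic
print(split_noun_compound("Kürbissuppe"))
```

The naive splitter accepts the spurious analysis verein + bart, while the capitalization check rejects it and still splits Kürbissuppe.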

Early analyses

Some of the earlier analyses of compound vs. phrase are Kruisinga 1932, Bloomfield 1933, Bloch and Trager 1942, Trager and Smith 1951, Marchand 1960, Lees 1960, Zimmer 1971, and Quirk et al. 1972.[30]

Compound term

The phrase "compound term" can be found in reference to compounds in linguistics[76][77][78], but seems rare. One user of the term is Dimković-Telebaković[77], who includes "vertical take-off and landing aircraft" as an example, which would not be considered to be an English compound by many linguists.

References

  1. Compounds and multi-word expressions in the languages of Europe by Rita Finkbeiner and Barbara Schlücker, 2019
  2. Compounds or Phrases? - A Look at The Structure of Atypical Noun-Noun Combinations by Elna Arvidsson, 2020
  3. Template:R:MWO
  4. Template:R:Cambridge
  5. Template:R:AHD
  6. Adams, §3.1. (this is how the reference is given in Wikipedia)
  7. Language-independent Compound Splitting with Morphological Operations, 2011
  8. Template:R:Oxford Learner's Dictionaries
  9. Template:R:Macmillan
  10. A Comprehensive Guide to Forming Compounds, merriam-webster.com
  11. What is a compound? The main criteria for compoundhood by Abdel Rahman Mitib Altakhaineh, Al Ain University of Science and Technology
  12. Compound Nouns, englishclub.com
  13. Examples of Compound Words by Type, anonymous, yourdictionary.com
  14. Compound Words | Types and List of 1000+ Compound Words in English by The English Teacher (de facto anonymous), 2021
  15. Template:R:MWO
  16. Template:R:AHD
  17. Template:R:Macmillan
  18. Template:R:Macmillan
  19. Why do grammars claim that adjective+adjective is always a morphological compound and never a syntactic construction?, english.stackexchange.com
  20. Compounds as lexical units or morphological objects? by Livio Gaeta & Davide Ricca
  21. Word, encyclopedia.com
  22. What’s in a Word? by Jennifer A. Henderson, 2007
  23. Wordhood issues: Typology and grammaticalization by Tim Zingler, 2020
  24. Compounds and multi-word expressions in French by Kristel Van Goethem, 2018
  25. Compounding in Morphology by Pius ten Hacken, 2017
  26. Compounds and multi-word expressions in Spanish by Jesús Fernández-Domínguez, 2019
  27. Composition in Encyclopedia Britannica
  28. Komposition: Zusammenschreibung, Getrenntschreibung, Bindestrich in Duden
  29. Compounds and multi-word expressions in Finnish by Irma Hyvärinen, 2019
  30. Forestress and Afterstress by Arnold Zwicky, 1986
  31. What Is Compounding in the English Language? by Richard Nordquist, 2019, thoughtco.com
  32. Stress placement on phrases and compounds in English by Metin Yurtbaşı, 2017
  33. Stress in compounds: An experimental research by Pavol Stekauer, Julius Zimmermann, and Renáta Gregová, 2007
  34. Compounds: the View from Hebrew by Hagit Borer, 2008
  35. Kompozitum in czechency.org by Ivana Bozděchová and Roland Wagner, 2017
  36. Compounds and multi-word expressions in German by Barbara Schlücker, 2019
  37. Compounds and multi-word expressions in Dutch: Compounds and Multi-Word Expressions by Geert Booij, 2019
  38. §109. General Principles of Greek Compounds in Greek and Latin Roots: Part II – Greek by Peter Smith
  39. CONCEPTUAL METONYMY IN THE MEANING OF ENGLISH AND BULGARIAN NOMINAL COMPOUND by Tsveta Luizova-Horeva, 2012
  40. Chinese: A Language of Compound Words? by Giorgio Francesco Arcodia, 2007
  41. Compounding in Danish by Elena Krasnova, 2014
  42. Compounds and multi-word expressions in English: Compounds and Multi-Word Expressions by Laurie Bauer, 2019
  43. Acquisition of compounds in Estonian and Russian: Frequency, productivity, transparency and simplicity effect by Reili Argus, Victoria V. Kazakovskaya, 2013
  44. Compounds in Dictionary-Based Cross-Language Information Retrieval by Turid Hedlund, 2002
  45. Compounds and multi-word expressions in Greek: Compounds and Multi-Word Expressions by Maria Koliopoulou, 2019
  46. Compounds and multi-word expressions in Hungarian: Compounds and Multi-Word Expressions by Ferenc Kiefer, 2019
  47. Compounds and multi-word expressions in Italian: Compounds and Multi-Word Expressions by Francesca Masini, 2019
  48. §92. General Principles of Latin Compounds in Greek and Latin Roots: Part I – Latin by Peter Smith
  49. Złożenie, encyklopedia.interia.pl
  50. Compounds and multi-word expressions in Polish: Compounds and Multi-Word Expressions by Bozena Cetnarowska, 2019
  51. Compounds in English and Russian: A comparative analysis, a master's thesis, core.ac.uk
  52. Compounds and multi-word expressions in Russian: Compounds and Multi-Word Expressions by Ingeborg Ohnheiser, 2019
  53. Generation of Sanskrit Compounds by Pavankumar Satuluri and Amba Kulkarni, 2013
  54. A Short Reference Grammar of Standard Slovene by Marc L. Greenberg, 2006
  55. A Complete List of Compound Words in Spanish by Olga Put, 2021
  56. LINGUIST List 10.1477 Thu Oct 7 1999 Sum: Linking Elements in Compounds, Editor for this issue: Karen Milligan
  57. Weighted Finite-State Morphological Analysis of Finnish Compounding with HFST-LEXC by Krister Lind and Tommi Pirinen
  58. THE INTERPRETATION OF N+N AND V+N COMPOUNDS BY SPANISH HERITAGE SPEAKERS by Patricia Garza-González
  59. What’s in a Word? by Jennifer A. Henderson, 2007
  60. Our favourite compound nouns in European languages, 2019, britishcouncil.org
  61. Compound Nouns and Noun Phrases by Michaela Bartušová, 2017
  62. NEO-CLASSICAL COMPOUNDS IN STUDENT WRITING : A CORPUS-BASED STUDY by Simon Smith, 2012
  63. černokněžník in IJP
  64. černá díra in IJP
  65. sort hul in Den Danske Ordbog
  66. black hole in OneLook Dictionary Search
  67. Schwarztee in Duden
  68. schwarzer Tee in Duden
  69. schwarzes Loch in Duden
  70. schwarzes Loch in Digitales Wörterbuch der deutschen Sprache
  71. czarna dziura in Polish dictionaries at PWN
  72. černokňažník in Slovak dictionaries at slovnik.juls.savba.sk
  73. čierna diera in Slovak dictionaries at slovnik.juls.savba.sk
  74. svart hål in Svenska Akademiens ordbok (SAOB)
  75. svart hål in Svenska Akademiens ordlista (SAOL)
  76. Compound Terms and Their Multi-word Variants: Case of German and Russian Languages by Elizaveta Clouet & Béatrice Daille, 2014
  77. English Compound Terms in Air Traffic and Waterways Transport and Traffic Engineering and their Translation into Serbian by Gordana Dimković-Telebaković, 2019
  78. Contrastive conceptual analysis of noun compound terms in English, French and Spanish within a restricted, specialized domain by Jeanette Pugh, 2016

Further reading
