Linguistics

From Thesmotetai

Syntax

The study of how words (the smallest units of grammar that carry meaning and can stand alone, such as 'car') and morphemes (units smaller than words that carry meaning but cannot stand alone, such as the prefix 'un-') combine to form larger units of grammar, such as phrases and sentences. Syntax concerns itself with word order, grammatical relationships between words and morphemes, constituency (the hierarchical structure of sentences), agreement (when words change form to match their neighbors in a larger unit of grammar), crosslinguistic variation, and the relationship between form and meaning (semantics).

The word syntax comes from Ancient Greek: σύνταξις 'coordination,' which consists of σύν (syn), 'together,' and τάξις (táxis), 'ordering.' Language played a crucial role in Greek philosophy, with figures like Plato and Aristotle exploring the relationship between words, meaning, and reality. Aristotle's work on logic delved into the principles of constructing valid arguments, indirectly touching upon aspects of syntax. Classical Greek and Hellenistic thinkers made substantial contributions to the study of grammar; Dionysius Thrax (2nd century BCE) composed the Tékhnē Grammatikḗ (Art of Grammar), the first work of analytical linguistics focusing on Ancient Greek, which included discussions of parts of speech, morphology, and syntax.

In English, syntax is largely controlled by word order ('the girl loves the boy' versus 'the boy loves the girl'), whereas in many other languages case markers indicate these grammatical relationships. Latin shows this trait: word order is far less important, and 'the girl loves the boy' can be written in a variety of correct orders because the -um ending on the object (boy) stays constant. Puerum puella amat, amat puella puerum, amat puerum puella, and puella amat puerum are all correct. (Editor's note: puella is 'girl,' puerum is the object form of puer, 'boy,' and amat is 'loves.')
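The effect of case marking can be sketched in a few lines of Python. This is a minimal illustration, not a Latin parser: the three-word lexicon and the `analyze` function are assumptions made for this example, and they cover only the forms in the sentence above.

```python
# Toy lexicon: each surface form maps to (English gloss, grammatical role).
# Only the three forms from the example sentence are covered; this is an
# illustrative assumption, not a real Latin morphological analyzer.
LEXICON = {
    "puella": ("girl", "subject"),  # nominative singular
    "puerum": ("boy", "object"),    # accusative singular, marked by -um
    "amat": ("loves", "verb"),      # third person singular present
}


def analyze(sentence):
    """Return a role -> gloss mapping, ignoring word order entirely."""
    roles = {}
    for word in sentence.lower().split():
        gloss, role = LEXICON[word]
        roles[role] = gloss
    return roles


orders = [
    "puerum puella amat",
    "amat puella puerum",
    "amat puerum puella",
    "puella amat puerum",
]
analyses = [analyze(s) for s in orders]

# Every ordering yields the same analysis, because the roles are read
# from the word forms rather than from their positions.
print(all(a == analyses[0] for a in analyses))  # True
```

An English analogue would need word positions as input, since 'girl' and 'boy' carry no marking that distinguishes subject from object.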

Word Order

The sequence in which the subject (S), verb (V), and object (O) typically appear in sentences. 'Typically' is an important qualifier; while the dominant word order offers a foundational understanding, many languages exhibit flexibility in their word order, especially for emphasis, thematic structure, or in questions versus statements.

  • Subject-Object-Verb (SOV): Languages that typically use SOV order include Japanese, Korean, Turkish, and Latin. Around 40-45% of the world's languages are believed to use SOV as their default word order.
  • Subject-Verb-Object (SVO): Languages that typically use SVO order include English, Mandarin Chinese, Spanish, and Russian. Approximately 35-40% of languages are thought to use SVO as their primary order.
  • Verb-Subject-Object (VSO) and Verb-Object-Subject (VOS): These orders are less common. VSO is seen in Classical Arabic and Welsh, while VOS appears in languages like Malagasy and Fijian. Combined, VSO and VOS might account for around 10-15% of languages.
  • Object-Verb-Subject (OVS) and Object-Subject-Verb (OSV): These are the rarest word orders among languages. Examples include Hixkaryana for OVS and Xavante for OSV. Together, they constitute less than 1% of all languages.
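The idea of a "dominant" order can be made concrete as the most frequent order across a sample of clauses. The sketch below assumes the clauses have already been tagged by hand with S, V, and O; the function names and the tiny corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter


def clause_order(tagged_clause):
    """Read off the order of S, V, and O in one pre-tagged clause.

    tagged_clause is a list of (text, role) pairs; roles other than
    "S", "V", and "O" are ignored.
    """
    return "".join(role for _, role in tagged_clause if role in ("S", "V", "O"))


def dominant_order(corpus):
    """Return the most frequent clause order in a list of tagged clauses."""
    counts = Counter(clause_order(clause) for clause in corpus)
    return counts.most_common(1)[0][0]


plain = [("the girl", "S"), ("loves", "V"), ("the boy", "O")]
# A fronted object for emphasis: "The boy, the girl loves."
fronted = [("the boy", "O"), ("the girl", "S"), ("loves", "V")]

print(clause_order(plain))                      # SVO
print(clause_order(fronted))                    # OSV
print(dominant_order([plain, plain, fronted]))  # SVO
```

The fronted clause shows why 'typically' matters: a language can allow several orders while still having one that clearly predominates.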

Broken down by language family, the typical orders are as follows:

  • Indo-European predominantly features SVO (English, Spanish) and SOV (Hindi, Persian) orders.
  • Sino-Tibetan languages can show diversity in structure; Mandarin Chinese is notably SVO.
  • Turkic generally features SOV order, as seen in Turkish and Uzbek.
  • Uralic features a mix, but Finnish and Hungarian, for example, tend to favor SVO.
  • Afro-Asiatic languages like Arabic (VSO), Hebrew (SVO), and Amharic (SOV) show diversity.
  • Austronesian also has this variability; Indonesian (SVO) and Malagasy (VOS).
  • Dravidian primarily features SOV order, as in Tamil and Telugu.

Grammatical Relationships

The standard examples of grammatical functions from traditional grammar are subject, direct object, and indirect object.

  • The subject is the noun/pronoun about which a statement is made ("John" in "John gave an apple to Sally").
  • The direct object is the noun/pronoun being acted upon by the verb ("an apple" in "John gave an apple to Sally").
  • The indirect object is the recipient of the direct object ("Sally" in "John gave an apple to Sally").

Many modern theories of grammar acknowledge numerous additional types of relations (e.g. complement, specifier, predicative, et cetera). The role of grammatical relations in theories of grammar is greatest in dependency grammars, which propose dozens of distinct grammatical relations. Critics argue that overemphasis on certain grammatical models, such as word order (particularly from an Indo-European focus), may overlook the diversity and complexity of language structures worldwide.

  • Complement: A word or phrase that is necessary to complete the meaning of another part of the sentence. Without the complement, the idea expressed by the word it completes would feel incomplete.
    • Verb Complement: Provides essential information about the action or state described by the verb. For example, in "She gave her friend a gift," "her friend a gift" is a complement of the verb "gave" because it completes the action by specifying what was given and to whom.
    • Noun Complement: Completes the meaning of a noun. For instance, "The decision to leave early" includes "to leave early" as a complement of "decision," explaining what the decision is about.
    • Adjective Complement: Completes the meaning of an adjective. In "She is capable of winning," "of winning" is a complement of "capable," specifying in what way she is capable.
  • Specifier: A word that modifies or provides more specific information about another word, often relating to quantity, definiteness, or possession. Specifiers can be articles (the, a), possessive pronouns (his, her), demonstratives (this, that), and quantifiers (some, many).
  • Predicative: A predicative (or predicate) element relates to the subject or object by providing information about it, typically through a linking verb (such as "to be," "seem," "become"). Predicatives can be predicative adjectives or predicative nominatives (nouns or pronouns); they link the subject or object to a quality, identity, or condition.
    • Predicative Adjective: Describes the subject or object. In "The sky is blue," "blue" is a predicative adjective providing information about "the sky."
    • Predicative Nominative: Identifies or renames the subject or object. For example, in "Karen is a teacher," "a teacher" is a predicative nominative that identifies Karen's occupation.

Dependency grammars emphasize the idea that linguistic units are connected to each other by direct links or dependencies, forming a network of relations that structure the sentence. In this framework, each unit depends on a head (a central word it is connected to) and can have dependents (words that depend on it). Dependency grammars identify a wide range of specific grammatical relations to describe the types of dependencies that can exist, reflecting the nuanced ways words can relate to each other within a sentence.
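A dependency analysis can be represented as a list of head-dependent links. The sketch below encodes one possible analysis of "John gave an apple to Sally"; the relation labels and the treatment of "to Sally" are illustrative assumptions, since dependency frameworks differ on both.

```python
# Each entry is (dependent, head, relation). The root verb has head None.
# Labels and the attachment of "to" are one convention among several.
DEPENDENCIES = [
    ("John", "gave", "subject"),
    ("gave", None, "root"),
    ("an", "apple", "determiner"),
    ("apple", "gave", "direct object"),
    ("to", "gave", "indirect object marker"),
    ("Sally", "to", "prepositional complement"),
]


def dependents_of(head):
    """List the words that depend directly on the given head."""
    return [dep for dep, h, _ in DEPENDENCIES if h == head]


def path_to_root(word):
    """Follow head links upward from a word until the root is reached."""
    heads = {dep: h for dep, h, _ in DEPENDENCIES}
    path = [word]
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    return path


print(dependents_of("gave"))  # ['John', 'apple', 'to']
print(path_to_root("Sally"))  # ['Sally', 'to', 'gave']
```

Every word has exactly one head and the verb sits at the root, so the links form a tree without any intermediate phrase nodes.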

Some languages, often referred to as non-configurational, exhibit a high degree of flexibility in word order, with grammatical relationships often indicated through inflectional morphology instead. In many Australian Aboriginal languages, as in Latin and Classical Greek, the subject, object, and verb can appear in various orders without changing the fundamental meaning of the sentence.

In topic-prominent languages, such as Mandarin Chinese, the topic of the sentence (what the sentence is about, which can be the subject, object, or another element) comes first, and what is said about the topic follows. This structure emphasizes the topic-comment construction over the subject-predicate construction typical of subject-prominent languages like English.

In languages with rich morphological systems, such as agglutinative and fusional languages, grammatical roles are often marked by case endings or through agreement rather than strictly by word order. This allows for greater flexibility in sentence structure and can convey additional nuances.

Polysynthetic languages, which include many indigenous languages of the Americas, incorporate a high degree of information within single words through complex inflection. This can include the subject, object, verb, and additional modifiers and relational elements, making the concept of word order as applied in more analytic languages less directly relevant.

Inflection

A process of word formation where a word is modified to express different grammatical categories such as tense, case, voice, aspect, comparatives, person, number, gender, mood, animacy, agreement, and definiteness. When a verb is inflected, the process is called conjugation; when other word types (nouns, adjectives, adverbs, pronouns, determiners, participles, prepositions, postpositions, numerals, articles, et cetera) are inflected, it is called declension.

An inflection expresses grammatical categories with affixation (such as prefix, suffix, infix, circumfix, and transfix), apophony (as Indo-European ablaut, sound changes to the root word), or other modifications. For example, the Latin verb ducam, meaning "I will lead", includes the suffix -am, expressing person (first), number (singular), and tense-mood (future indicative or present subjunctive).

The inflected form of a word often contains both free morphemes (which can stand alone as words) and bound morphemes (which cannot). For example, the English word cars is a noun that is inflected for number to express the plural: the morpheme car is free because it could stand alone as a word, while the suffix -s is bound because it cannot. These two morphemes together form the inflected word cars.

Words that are never subject to inflection are said to be invariant; for example, the English verb must is an invariant item: it never takes a suffix or changes form to signify a different grammatical category. Its categories can be determined only from its context. Languages that have some degree of inflection are synthetic languages. These can be highly inflected (such as Latin, Greek, Biblical Hebrew, and Sanskrit), or slightly inflected (such as English, Dutch, Persian). Languages that use little inflection are also said to be 'analytic.' Analytic languages that do not make use of derivational morphemes, such as Standard Chinese, are said to be isolating.

Types of Inflections

Inflection often involves agreement between nouns and other parts of speech, such as adjectives, pronouns, and verbs.

Cases indicate the grammatical role of a noun or pronoun in a sentence, such as subject, object, or possession. Common cases include: Nominative - subject of the verb; Accusative - direct object of the verb; Dative - indirect object of the verb; Genitive - possessive relation; Locative - location; Instrumental - means by which an action is performed; Ablative - movement away from something; Vocative - direct address; Allative - movement toward something.

Number distinguishes between one (singular), two (dual), and more than two (plural) entities, with some languages having additional distinctions - some languages distinguish between 'a few' and 'many' (paucal) or may have a particular inflection for a collective quantity.

Gender in language can be a way to classify nouns and pronouns, often impacting agreement with adjectives, verbs, and pronouns; gender systems can include masculine, feminine, neuter, and/or common, among others.

Animacy can also play a role similar to gender in some languages, with animate or inanimate objects having distinct inflection markers.

Definiteness distinguishes between specific and non-specific objects ('the' versus 'a/an').

Partitive is used to express something being 'part of' some bigger or greater whole.

Comparison markers indicate comparative and superlative forms.

Many languages have unique declension categories not widely found elsewhere, reflecting specific semantic or syntactic distinctions important in those languages.

In English most nouns are inflected for number with the inflectional plural affix -s (as in "dog" → "dog-s"), and most English verbs are inflected for tense with the inflectional past tense affix -ed (as in "call" → "call-ed"). English also inflects verbs by affixation to mark the third person singular in the present tense (with -s), and the present participle (with -ing). English short adjectives are inflected to mark comparative and superlative forms (with -er and -est respectively). Despite the general regularization, English retains traces of its ancestry, with a minority of its words still using inflection by ablaut (sound change, mostly in verbs) and umlaut (a particular type of sound change, mostly in nouns), as well as long-short vowel alternation.

  • Write, wrote, written (marking by ablaut variation, and also suffixing in the participle); sing, sang, sung (ablaut); foot, feet (marking by umlaut variation); mouse, mice (umlaut); child, children (ablaut, and also suffixing in the plural)
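The split between regular affixation and inherited irregular forms can be sketched as a lookup table consulted before a default rule. The following is a deliberately simplified illustration: the function names are hypothetical, and English spelling adjustments (as in "buses," "tried," or "stopped") are ignored.

```python
# Irregular forms are stored as exceptions; everything else takes the
# regular suffix. Spelling rules (bus -> buses, try -> tried) are omitted.
IRREGULAR_PLURALS = {"foot": "feet", "mouse": "mice", "child": "children"}
IRREGULAR_PASTS = {"write": "wrote", "sing": "sang"}


def pluralize(noun):
    """Inflect a noun for plural number."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]  # umlaut or other inherited pattern
    return noun + "s"                   # regular affixation with -s


def past_tense(verb):
    """Inflect a verb for past tense."""
    if verb in IRREGULAR_PASTS:
        return IRREGULAR_PASTS[verb]    # ablaut
    if verb.endswith("e"):
        return verb + "d"               # bake -> baked
    return verb + "ed"                  # regular affixation with -ed


print(pluralize("dog"), pluralize("mouse"))    # dogs mice
print(past_tense("call"), past_tense("sing"))  # called sang
```

Checking the exception table first mirrors the way a small set of irregular forms overrides an otherwise productive rule.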

In Latin, there are five types of declension. Words that belong to the first declension usually end in -a and are usually feminine in gender, and thus can be said to share a common inflectional framework. In Old English, nouns are divided into two major categories of declension, strong and weak. Through the process of syncretism, the same inflection form may serve multiple grammatical functions in many languages, and context helps reveal the semantic details.

Inflection can be marked on a word in a variety of ways:

  • Affixation, or simply adding morphemes onto the word without changing the root;
  • Reduplication, repeating all or part of a word to change its meaning;
  • Alternation, exchanging one sound for another in the root (usually vowel sounds, as in the ablaut / umlaut in Germanic languages);
  • Suprasegmental variations, such as of stress, pitch or tone, where no sounds are added or changed but the intonation and relative strength of each sound are altered in a regular way.

Inflection and PIE Languages

Because the Proto-Indo-European (PIE) language was highly inflected, all of its descendants, such as Albanian, Armenian, English, German, Ukrainian, Russian, Persian, Kurdish, Italian, Irish, Spanish, French, Hindi, Marathi, Urdu, Bengali, and Nepali, are inflected to a greater or lesser extent. In general, older Indo-European languages such as Latin, Ancient Greek, Old English, Old Norse, Old Church Slavonic and Sanskrit are extensively inflected because of their temporal proximity to the PIE ancestor.

Deflexion has caused modern versions of some Indo-European languages that were previously highly inflected to be much less so; an example is English, as compared to Old English. In general, languages where deflexion occurs replace inflectional complexity with more rigid word order, which supplies the grammatical information that the lost inflections used to carry. However, the transition from synthetic (inflection-heavy) to analytic (relying more on word order and auxiliary words) is complex and can be influenced by multiple factors.

Most Slavic languages and some Indo-Aryan languages are exceptions to the general Indo-European deflexion trend, continuing to be highly inflected (in some cases acquiring additional inflectional complexity and grammatical genders, as in Czech and Marathi).

Subordination and Coordination

These fundamental mechanisms by which clauses are combined in languages allow for the expression of complex ideas and relationships between events, actions, and descriptions. These mechanisms contribute to the grammatical hierarchy of sentences, determining how clauses are structured and integrated.

Coordination joins two or more syntactic units of equal rank, such as words, phrases, or clauses, enabling them to function as a single unit. The coordinated elements can be seen as parallel or additive in nature.

  • Markers: often marked by coordinating conjunctions ("and," "or," "but"). However, languages may also use non-conjunctive means, such as intonation, punctuation (in written language), or even juxtaposition without explicit markers.
  • Symmetry and Balance: typically share the same syntactic type (noun with noun, clause with clause) and contribute equally to the meaning of the sentence.
  • Functions: allow for listing (enumeration), addition, choice, contrast, and other relational meanings between equally important ideas or actions.

Subordination involves embedding one clause (the subordinate or dependent clause) within another (the main or independent clause), creating a hierarchical relationship. The subordinate clause functions as a single unit within the larger structure, often specifying time, reason, condition, manner, or place.

  • Markers: Subordination is marked in various ways, including subordinating conjunctions ("because," "if," "when"), relative pronouns ("which," "that," "whom"), or specific verb forms (subjunctive moods, non-finite verb forms). Some languages employ specific particles or changes in word order to indicate subordination.
  • Dependency: The subordinate clause depends on the main clause for its interpretation, not being able to stand alone as a complete sentence. It typically provides additional information about an element in the main clause.
  • Types of Subordinate Clauses: These include adverbial clauses (indicating time, reason, condition, etc.), relative clauses (modifying nouns), and complement clauses (serving as the object or subject of a verb or complement of an adjective or noun).

In a grammatical hierarchy, independent clauses rank higher than dependent clauses. Subordination allows for the embedding of clauses within clauses, leading to potentially complex sentence structures. This embeddedness is key to expressing nuanced and layered relationships between actions, events, and descriptions. Coordination and subordination can interact within a single sentence, with coordinated structures containing subordinate clauses, or with subordinate clauses themselves being coordinated, contributing to the syntactic and semantic complexity of language.

Sentence and Clause Structure

Although the specific grammatical tools and markers used to combine clauses can vary widely among languages, the concepts of simple, compound, complex, and compound-complex sentences offer a useful framework for analyzing sentence types from a language-agnostic perspective.

Simple Sentences

A simple sentence consists of a single independent clause; it contains a subject and a predicate and expresses a complete thought. Simple sentences can be straightforward or can include multiple subjects, objects, or adjuncts, but they always revolve around a single main verb or verb phrase.

  • [Subject] [Verb] [Object].
  • The bird sings.

Compound Sentences

Compound sentences are formed by joining two or more independent clauses of equal status, meaning they could stand alone as simple sentences. The clauses in a compound sentence are typically connected by coordinating conjunctions (such as 'and,' 'but,' 'or') or by punctuation marks like semicolons in written language. The exact conjunctions or punctuation marks used can vary significantly across languages.

  • [Independent Clause] [Conjunction] [Independent Clause].
  • The bird sings and the cat meows.

Complex Sentences

Complex sentences consist of one independent clause and one or more dependent (or subordinate) clauses, which cannot stand alone as sentences and are linked to the independent clause by subordinating conjunctions (like 'because,' 'since,' 'although') or relative pronouns (like 'which'), depending on the language. The dependent clause adds information to the independent clause, specifying time, reason, condition, contrast, et cetera.

  • [Independent Clause] [Subordinating Conjunction] [Dependent Clause].
  • The bird sings because it is happy.

Compound-Complex Sentences

Compound-complex sentences combine elements of both compound and complex sentences; they include at least two independent clauses and one or more dependent clauses. These sentences are useful for conveying intricate relationships between ideas and combining multiple actions or situations.

  • [Independent Clause] [Conjunction] [Independent Clause] [Subordinating Conjunction] [Dependent Clause].
  • The bird sings, and the cat meows, because it is morning.
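The four sentence types reduce to a count of independent and dependent clauses. A minimal sketch of that classification follows, assuming the clauses have already been identified and counted; the function name is hypothetical.

```python
def sentence_type(independent, dependent):
    """Classify a sentence from its clause counts.

    independent: number of independent clauses (must be at least 1)
    dependent:   number of dependent (subordinate) clauses
    """
    if independent < 1 or dependent < 0:
        raise ValueError("a sentence needs at least one independent clause")
    if independent == 1 and dependent == 0:
        return "simple"
    if dependent == 0:
        return "compound"          # two or more independent clauses only
    if independent == 1:
        return "complex"           # one independent plus dependent clause(s)
    return "compound-complex"      # several independent plus dependent


print(sentence_type(1, 0))  # simple: "The bird sings."
print(sentence_type(2, 0))  # compound: "The bird sings and the cat meows."
print(sentence_type(1, 1))  # complex: "The bird sings because it is happy."
print(sentence_type(2, 1))  # compound-complex
```

The hard part in practice is identifying the clauses, which depends on language-specific markers; the classification itself is language-agnostic.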

Considerations Across Languages

While these categories are helpful for analyzing sentence structure, it's important to recognize that languages may employ unique mechanisms for combining clauses:

  • Differential markings: Some languages might use specific particles, verb forms, or word order changes to distinguish between independent and dependent clauses.
  • Conjunctions and connectors: The exact words or particles used to connect clauses can vary widely, and some languages may use non-conjunctional means to indicate clause relationships.
  • Ellipsis and inference: In some languages, elements of compound or complex sentences might be omitted if they can be inferred from context, making the sentence structure less explicit.

Syntactic Theories and Models of the 21st Century

Developed by Noam Chomsky, generative grammar emphasizes the innate linguistic capability of humans and the idea of a universal grammar underlying all languages. It has evolved through various iterations:

  • Government and Binding (GB) Theory was focused on the modular organization of syntax, including principles like X-bar theory, case theory, and theta theory.
  • Minimalist Program (MP) aims to explain language using the minimal necessary theoretical constructs and principles, focusing on the economy of derivation and representation.

Lexical-Functional Grammar (LFG) emphasizes the importance of lexical information in determining grammatical structure and includes a distinction between functional structure (f-structure) and constituent structure (c-structure).

Head-Driven Phrase Structure Grammar (HPSG) focuses on the lexicon and complex feature-based descriptions, with a strong emphasis on the role of the head in phrase structure and a commitment to a non-transformational view of syntax.

Categorial Grammar (CG) is based on the idea that syntactic categories can be understood in terms of functions that combine words and phrases into larger units. Modern variants include Combinatory Categorial Grammar (CCG) and Type Logical Grammar.

Dependency Grammar (DG) centers on the dependency relation between headwords and their dependents as the primary structure organizing sentences, differing from phrase structure grammars by not assuming constituent phrases as primary.

Construction Grammar (CxG) argues that knowledge of language is based on constructions, or form-meaning pairings, that range from specific idioms to general grammatical patterns. Variants include Cognitive Grammar and Radical Construction Grammar.

Tree Adjoining Grammar (TAG) is a highly lexicalized grammar framework that uses trees as the basic unit of syntax, allowing for more flexibility in describing cross-linguistic syntactic phenomena.

Role and Reference Grammar (RRG) integrates syntactic, semantic, and pragmatic information, focusing on the role of verb argument structure and the linking between semantic and syntactic roles.

Optimality Theory (OT) was originally developed for phonology but has since been applied to syntax. It proposes that surface forms result from the competition between constraints, rather than from transformational derivations.

Dynamic Syntax (DS) emphasizes the temporal dimension of syntax, modeling how sentences are processed in real-time, focusing on the incremental building of semantic representations.

Stochastic/Probabilistic Grammar models incorporate probabilities into grammatical descriptions, often used in computational linguistics to model language understanding and production processes based on statistical patterns.

Semantics

Put simply, semantics is the study of linguistic meaning. It examines how words come to have specific meanings and how the meaning of a complex unit of grammar depends on the meanings of its constituent parts. The term derives from the Greek verb sēmainō (“to mean” or “to signify”), through the adjective sēmantikos (“significant”).

Natural languages (those that develop 'in the wild' among human populations) are known for being unboundedly productive: there is no upper limit on the length, complexity, or number of grammatical expressions. Simpler expressions can be concatenated, relativized, complementized, et cetera to create ever-more-complex formulations. These complex expressions are both grammatically correct and meaningful.
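Unbounded productivity follows from recursion: a rule that can apply to its own output has no longest result. The sketch below illustrates this with complement clauses; the function, the names, and the frame "X says that ..." are assumptions chosen for the example.

```python
NAMES = ("John", "Sally")


def embed(clause, depth):
    """Wrap a clause in `depth` layers of "X says that ..." complements."""
    if depth == 0:
        return clause
    # Alternate speakers so the nested reports stay easy to read.
    speaker = NAMES[depth % 2]
    return f"{speaker} says that " + embed(clause, depth - 1)


print(embed("the bird sings", 0))
# the bird sings
print(embed("the bird sings", 2))
# John says that Sally says that the bird sings

# Each extra layer yields a longer sentence that is still grammatical.
lengths = [len(embed("the bird sings", d)) for d in range(5)]
print(lengths == sorted(set(lengths)))  # True: strictly increasing
```

Any limit on depth here comes from memory or patience rather than from the rule itself.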

In addition to compositionality (the dependence of a complex expression's meaning on the meanings of its parts), semantic theories must also account for the phenomenon of reference. Reference is the property by which many expressions connect to things in the world; that connection is often not evident from an expression's conventional meaning (its sense) alone.

In the philosophy of language, the distinction between sense and reference was an idea of the German philosopher and mathematician Gottlob Frege (On Sense and Reference; Über Sinn und Bedeutung), reflecting the two ways he believed a singular term could have meaning.

  • Reference (Bedeutung): The reference of a term is the object it refers to in the world. For example, the reference of the name "Venus" is the planet Venus itself.
  • Sense (Sinn): The sense of a term is the way the reference is presented, or the mode of presentation of the object. It involves the cognitive aspect of the term: how an individual conceives of the term's reference. For instance, "the morning star" and "the evening star" have the same reference (the planet Venus) but different senses, because they present Venus in different ways or contexts. The statement "the morning star is the evening star" is informative because it reveals that two expressions with different senses pick out the same object (Venus).

Meaning cannot simply be equated with the set of things an expression applies to, even across all possible worlds. The expressions triangular and trilateral, for example, are not synonymous, but there is no possible world in which they do not apply to exactly the same things. And the expression round square appears to be meaningful, but there is no possible world in which it applies to anything at all. Such examples are easy to multiply.

In semantics, truth is a property of statements that accurately represent the world; true statements are in accord with reality. Whether a statement is true usually depends on the relation between the statement and the rest of the world; the truth conditions of a statement are the way the world needs to be for the statement to be true. For example, it belongs to the truth conditions of the sentence "it is raining outside" that raindrops are falling from the sky. The sentence is true if it is used in a situation in which the truth conditions are fulfilled: if there is actually rain outside.
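One common way to picture truth conditions is as a function from a state of the world to a truth value. The sketch below follows that idea in a very reduced form; the dictionary representation of a "world" and its key name are assumptions made for this example.

```python
def it_is_raining_outside(world):
    """Truth conditions of "it is raining outside".

    The sentence is true in exactly those worlds where raindrops
    are falling from the sky.
    """
    return world["raindrops_falling"]


# Two hypothetical situations in which the sentence might be used.
rainy_day = {"raindrops_falling": True}
dry_day = {"raindrops_falling": False}

print(it_is_raining_outside(rainy_day))  # True
print(it_is_raining_outside(dry_day))    # False
```

The function stays the same across situations; only the world it is evaluated against changes, which is the sense in which truth depends on the relation between a statement and the world.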

The semiotic triangle is a model used to explain the relation between language, language users, and the world, represented in the model as Symbol, Thought / Reference, and Referent. The symbol is a linguistic signifier, either in its spoken or written form. The central idea of the model is that there is no direct relation between a linguistic expression and what it refers to; this is expressed in the diagram by the dotted line between symbol and referent.

The model holds instead that the relation between the two is mediated through a third component (the thought or reference). For example, the term apple stands for a type of fruit, but there is no direct connection between this string of letters and the corresponding physical object; the relation is only established indirectly through the mind of the language user. When language users see this string of symbols or hear this string of phonemes, it evokes a mental image or a concept, which establishes the symbol's connection to the physical referent. This process is only possible if the language user has previously learned the meaning of the symbol; the meaning of a specific symbol is governed by the conventions of each specific language.

Lexical relations describe how words stand to one another. Two words are synonyms if they share the same or a very similar meaning, like car and automobile, or buy and purchase. Antonyms have opposite meanings, as with alive and dead, or fast and slow. One term is a hyponym of another if the meaning of the first is included in the meaning of the second (ant is a hyponym of insect). A prototype is a hyponym that has the characteristic features of the type it belongs to (a robin is a prototypical bird, but a penguin is not). Hyponymy is closely related to meronymy, which describes the relation between part and whole; for instance, wheel is a meronym of car. Two words with the same pronunciation are homophones, like flour and flower, while two words that share both spelling and pronunciation but have unrelated meanings are homonyms, like the bank of a river and a bank as a financial institution. An expression is ambiguous if it has more than one possible meaning. The term polysemy is used if the different meanings of a word are closely related to one another, as with the word head: the topmost part of the human body, or the top-ranking person in an organization.
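Hyponymy and meronymy can be modeled as two separate sets of links between words. The sketch below uses a tiny hand-built vocabulary; the entries beyond those named in the text (such as "animal") and the function name are assumptions for illustration.

```python
# "kind of" links: each word points to its immediate superordinate term.
HYPERNYM = {
    "ant": "insect",
    "insect": "animal",
    "robin": "bird",
    "penguin": "bird",
    "bird": "animal",
}

# "part of" links are kept separate: a wheel is part of a car,
# not a kind of car.
PART_OF = {"wheel": "car"}


def is_hyponym(term, category):
    """True if `term` falls under `category`, directly or through a chain."""
    current = HYPERNYM.get(term)
    while current is not None:
        if current == category:
            return True
        current = HYPERNYM.get(current)
    return False


print(is_hyponym("ant", "insect"))  # True
print(is_hyponym("ant", "animal"))  # True, via insect
print(is_hyponym("wheel", "car"))   # False: meronymy, not hyponymy
```

Prototypicality is not captured here: robin and penguin are both plain hyponyms of bird, and ranking one as more typical would need extra information such as feature lists or speaker ratings.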

Morphology

The internal construction of words from morphemes; some languages have highly complex morphology, while others (like Vietnamese) have very little, or even none. In linguistic typology, languages are categorized based on how they use morphology to convey grammatical relationships.

Inflectional morphology involves the modification of a word to express different grammatical categories such as tense, mood, voice, aspect, person, number, gender, and case. Inflection does not change the basic category or meaning of a word but rather adjusts it to fit its role within a sentence.

Inflection conveys syntactic information necessary for the grammatical functioning of the word within a sentence. It helps in indicating relationships between different words in the sentence. The set of inflectional morphemes in a language is typically fixed or closed, meaning that new inflectional morphemes are rarely added to the language. Inflectional processes are generally productive, applying to nearly all words within the category they affect (though there can be irregular forms).

Derivational morphology involves creating a new word from an existing one by adding a prefix or suffix (or sometimes through other processes like internal change or compounding). Derivation alters the base word to create a new word with a related, but distinct, meaning (or it can place the word in a new grammatical category; noun to adjective, for example).

New derivational morphemes can be added to the language over time (the set is open), and the process can give rise to an extensive variety of new words. Derivational processes vary in productivity; some apply broadly across many words, while others are restricted to a limited set of cases.

A fundamental difference between inflection and derivation is that inflection adjusts a word's form to fit grammatical constraints without changing its core meaning or category, whereas derivation changes the word's meaning and often its category as well. Inflectional changes result in different inflected forms of the same word, whereas derivational changes result in entirely new derived words.

Analytic (Isolating) Languages

These languages tend to use a single morpheme (the smallest meaningful unit in a language) per word, meaning that they do not rely heavily on inflectional changes to indicate grammatical relationships. Instead, they often use word order, auxiliary words, and prepositions to convey grammatical relationships, tense, aspect, and other grammatical categories.

Synthetic Languages

These languages are characterized by a higher use of inflection or derivation to encode grammatical relationships within words. Synthetic languages can be further divided into:

Agglutinative: morphemes (prefixes, suffixes, infixes, or circumfixes) are added to a base word in a relatively straightforward manner, with each morpheme typically representing a single grammatical category (such as tense, number, case, or aspect). These morphemes are easily separable and keep largely consistent forms across different words. Because the affixes stack one after another, the grammatical makeup of a word can usually be read off morpheme by morpheme. Turkish, Finnish, Swahili, and Japanese are examples.

Fusional (Inflectional): morphemes combine several grammatical categories such as gender, number, tense, mood, or case into a single affix attached to a word. This means that a single inflectional ending or internal modification in a word can convey multiple grammatical details. These languages tend to have a rich inventory of morphological forms. A single verb form can encode person, number, tense, aspect, and mood simultaneously. Unlike agglutinative languages, where morphemes have clear boundaries, fusional languages often exhibit forms where it's challenging to segment the word into discrete morphemes due to the merging of grammatical categories within a single affix. Languages like Russian and Latin fit here.

Polysynthetic: words combine many morphemes, so that a single word can express what would require a full sentence in more analytic or even agglutinative languages. These languages often incorporate nouns into verbs, and a single word can encode subjects, objects, and relational information along with the action itself. While agglutinative languages have clear morpheme boundaries and functions, polysynthetic languages may feature morphemes that blend together more fluidly, with less clear-cut distinctions between them. A single affix in a polysynthetic language might serve multiple grammatical or semantic roles. Examples include Inuktitut, Yupik, and Mohawk.

Affixation

A core morphological process used across languages to modify the meaning of a word or to adjust it for grammatical congruity. This process involves adding an affix, a bound morpheme that cannot stand alone, to a base word (or root). The base can be a simple form or another morphologically complex form; affixes can be categorized by their position relative to the base and by the functions they perform.

Types of Affixes

  • Prefixes are added to the beginning of a word. They can alter the meaning of the word or adapt it for grammatical purposes. The English prefix "un-" can be added to "happy" to form "unhappy."
  • Suffixes attach to the end of a word. They are prevalent in indicating grammatical categories like tense, case, or number, as well as deriving new words. In English, adding "-ness" to "happy" creates "happiness."
  • Infixes are inserted within the body of a word. Infixation is less common in Indo-European languages but is found in other language families, such as Austronesian. In Tagalog, the infix "-um-" can be inserted into "sulat" (write) to form "sumulat" (wrote).
  • Circumfixes (also called confixes or ambifixes) attach to a base word in two parts, one at the beginning and one at the end. Circumfixation is seen in languages like German, where "ge-" and "-t" can enclose a verb stem to form a past participle, as in "gesagt" (said) from "sagen" (to say).
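
The four affix positions can be sketched in a few lines of Python. This is a toy illustration rather than a morphological analyzer: the function names are invented for this example, the spelling adjustment in add_suffix covers only the y-to-i case, add_infix simply places the infix before the first vowel (which is enough for "sumulat"), and the German example is built on the stem "sag-" rather than the infinitive "sagen."

```python
VOWELS = "aeiou"


def add_prefix(base: str, prefix: str) -> str:
    """Attach an affix before the base."""
    return prefix + base


def add_suffix(base: str, suffix: str) -> str:
    """Attach an affix after the base, with one toy spelling rule (y -> i)."""
    if len(base) > 1 and base[-1] == "y" and base[-2] not in VOWELS:
        base = base[:-1] + "i"
    return base + suffix


def add_infix(base: str, infix: str) -> str:
    """Insert an affix inside the base, immediately before its first vowel."""
    for i, ch in enumerate(base):
        if ch in VOWELS:
            return base[:i] + infix + base[i:]
    return infix + base  # no vowel found: fall back to prefixing


def add_circumfix(base: str, before: str, after: str) -> str:
    """Wrap the base in a two-part affix."""
    return before + base + after


print(add_prefix("happy", "un"))        # unhappy
print(add_suffix("happy", "ness"))      # happiness
print(add_infix("sulat", "um"))         # sumulat
print(add_circumfix("sag", "ge", "t"))  # gesagt
```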

Affixation serves two primary functions: derivational and inflectional.

  • Derivational Affixation creates new words with new meanings and often changes the grammatical category of the base. This process is crucial for expanding a language's vocabulary and can involve nuanced changes in meaning.
  • Inflectional Affixation modifies a word to fit grammatical requirements without changing the word's essential meaning or category. It indicates grammatical relationships and features such as tense, number, gender, mood, and case. Inflectional affixation is more uniform and limited compared to derivational affixation.

Affixation in Morphological Types

  • Agglutinative languages tend to have a clear-cut, one-to-one correspondence between affixes and grammatical functions, stacking multiple affixes onto a base to express compound grammatical concepts.
  • Fusional languages allow affixes to convey multiple grammatical categories simultaneously, and the form of the affix can change depending on the word it attaches to, leading to less transparent relationships between form and function.
  • Polysynthetic languages can utilize extensive affixation, incorporating a large number of affixes into single, highly complex words that can express a great deal of information.
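
The contrast between agglutinative and fusional affixation can be made concrete by counting how many grammatical categories each affix carries. The sketch below uses two illustrative words that are not taken from the text above: Turkish "evlerde" (in the houses) and Latin "amō" (I love). The segmentations and labels are simplified, and the helper names are hypothetical.

```python
# Each entry is (form, [lexical meaning or grammatical categories]).
TURKISH_EVLERDE = [
    ("ev", ["house"]),
    ("ler", ["plural"]),
    ("de", ["locative"]),
]

LATIN_AMO = [
    ("am", ["love"]),
    ("ō", ["1st person", "singular", "present", "active", "indicative"]),
]


def surface_form(morphemes):
    """Join the morpheme forms into the written word."""
    return "".join(form for form, _ in morphemes)


def segmented(morphemes):
    """Show the word with hyphens at morpheme boundaries."""
    return "-".join(form for form, _ in morphemes)


def categories_per_affix(morphemes):
    """Count the categories carried by each affix (everything after the root)."""
    return [len(categories) for _, categories in morphemes[1:]]


for word in (TURKISH_EVLERDE, LATIN_AMO):
    print(surface_form(word), segmented(word), categories_per_affix(word))
# evlerde ev-ler-de [1, 1]
# amō am-ō [5]
```

In the Turkish word each affix carries one category, while the single Latin ending carries five at once, which is the one-to-one versus one-to-many pattern described in the list.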

Reduplication

A morphological process involving the repetition of a whole or part of a word to convey grammatical functions or semantic nuances. This process can modify the meaning of the original word, indicate grammatical categories, or create new words, and can be partial or total, depending on whether part of the word or the entire word is repeated.

  • Total Reduplication: The entire word or root is repeated. This can serve various functions, such as pluralization, intensification, or aspectual distinctions.
    • In Indonesian, "orang" (person) can become "orang-orang" to indicate "people" or a plurality of persons.
  • Partial Reduplication: Only a segment of the word, such as a syllable or a set of phonemes, is duplicated. Partial reduplication can be used to denote diminutives, verb tenses, and other grammatical or semantic changes.
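    • In Tagalog, "sulat" (write) can be reduplicated as "susulat" (will write) by repeating its first syllable. (By contrast, "maganda," beautiful, is formed from "ganda," beauty, with the prefix "ma-," which is affixation rather than reduplication.)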
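
Both kinds of reduplication are simple string operations, which the following sketch illustrates with Indonesian "orang" → "orang-orang" (people) and Tagalog "sulat" → "susulat" (will write). The function names are invented for this example, and the partial version is a simplification: it copies everything up to and including the first vowel, which works for words that begin with a vowel or a single consonant but does not model real syllable structure.

```python
VOWELS = "aeiou"


def total_reduplication(word: str, separator: str = "-") -> str:
    """Repeat the whole word, joined by a separator."""
    return word + separator + word


def partial_reduplication(word: str) -> str:
    """Copy the word up to and including its first vowel and prefix the copy."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[: i + 1] + word
    return word  # no vowel found: leave the word unchanged


print(total_reduplication("orang"))    # orang-orang
print(partial_reduplication("sulat"))  # susulat
print(partial_reduplication("alis"))   # aalis
```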

Reduplication serves a variety of linguistic functions, which can differ widely among languages:

  • Grammatical Aspect: Indicating completed, habitual, ongoing, or iterative actions.
  • Tense: Marking past, present, or future tense in verbs.
  • Mood: Showing mood such as indicative, imperative, or subjunctive.
  • Number: Distinguishing singular from plural or dual.
  • Intensity or Degree: Signaling intensity, emphasis, or degree of an adjective or adverb.
  • Diminutive, Augmentative, or Pejorative Meaning: Modifying nouns or adjectives to convey size, importance, or emotional tone.

Compounding

A morphological process involving the combination of two or more lexemes (independent words) to form a new word. The resulting compound word typically acquires a meaning that is distinct from, but related to, the meanings of its constituent parts. Compounds can be formed from various combinations of parts of speech, including nouns, adjectives, verbs, and adverbs, among others.

  • Two nouns are combined to form a new noun. For example, "toothbrush" (tooth + brush) or "bookcase" (book + case).
  • An adjective and a noun are combined, usually to describe a specific type of the noun. Examples include "blackboard" (black + board) or "greenhouse" (green + house).
  • A verb and a noun are combined, often describing an action related to the noun. For instance, "pickpocket" (pick + pocket).
  • Two verbs are combined to describe actions that are done together or in sequence. This type is uncommon in English, though "stir-fry" and "sleepwalk" are examples, and is far more frequent in many South Asian languages, such as Hindi.

The compound often represents a concept or entity that is semantically unified, which means its overall meaning cannot always be directly inferred from the meanings of its parts. Compounding can involve morphological changes (such as the modification of roots) or phonological changes (such as stress shifts) to mark the compound status, or aid in pronunciation. Once words are compounded, the new formation behaves as a single unit syntactically. It takes on the grammatical roles as one word, and its internal structure is not affected by syntactic operations that affect phrases.

Compounds typically have a head, the part that determines the overall category and grammatical properties of the compound. The position of the head varies across languages; for example, in English, the head is usually on the right ("toothbrush" is a type of brush), whereas in many Romance languages it is typically on the left (Spanish "hombre rana," literally "man frog," is a frogman, a type of man).

Endocentric compounds have a head that provides the compound with its basic meaning and category (e.g., "blueberry" is a type of berry). Exocentric compounds lack a clear head, and the compound does not belong to the category of any of its parts (e.g., "pickpocket" is not a type of pocket). The rules governing compounding and the productivity of this process vary widely among languages; some are highly compounding, creating many new words through this process, while others use it much more sparingly.
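
One way to picture headedness is to store a compound as a list of parts plus an optional pointer to the head. The sketch below is only an illustration under that assumption; the Compound class and its field names are hypothetical, and an exocentric compound is modeled simply as one with no head.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Compound:
    parts: Tuple[Tuple[str, str], ...]  # (form, category) pairs
    category: str                       # category of the whole compound
    head_index: Optional[int] = None    # None marks an exocentric compound

    @property
    def form(self) -> str:
        """The written compound, with its parts joined together."""
        return "".join(form for form, _ in self.parts)

    @property
    def head(self) -> Optional[str]:
        """The form of the head, or None if the compound has no head."""
        if self.head_index is None:
            return None
        return self.parts[self.head_index][0]

    @property
    def is_endocentric(self) -> bool:
        return self.head_index is not None


toothbrush = Compound((("tooth", "N"), ("brush", "N")), "N", head_index=1)
pickpocket = Compound((("pick", "V"), ("pocket", "N")), "N")

print(toothbrush.form, toothbrush.head, toothbrush.is_endocentric)
# toothbrush brush True
print(pickpocket.form, pickpocket.head, pickpocket.is_endocentric)
# pickpocket None False
```

For a left-headed compound, head_index would be 0 instead of 1.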

Suppletion

Suppletion is an irregular way of forming word variants, such as different tenses of a verb or the comparative and superlative forms of an adjective. Unlike regular morphological processes that add, modify, or replace affixes while keeping the base of the word recognizable, suppletion involves using entirely different roots to express grammatical or semantic relations. This process results in word forms that cannot be predicted from the form of the base word and must be memorized as separate lexical entries.

Suppletion is highly irregular and constitutes an exception to the normal morphological rules of a language. The suppletive forms bear no phonological or morphological resemblance to their base forms. It is therefore relatively rare and typically affects a small set of commonly used words within a language: often verbs, adjectives, and pronouns. Despite its irregularity, suppletion serves regular grammatical functions, such as tense, number, or degree of comparison.

One of the best-known examples is the English verb "to go," whose past tense "went" was originally the past tense of a different verb, "wend" (Old English "wendan," to turn). Thus, "go" and "went" demonstrate suppletion. The adjectives "good," "better," and "best" also show suppletion: "better" and "best" do not derive from "good" through regular morphological processes. In many languages, personal pronouns exhibit suppletion. In English, for example, "I" and "me" are related through meaning and function but not through form.

Suppletion is found across a wide range of languages; it is widespread as a phenomenon even though it affects only a few words in any one language:

  • Latin: "To be" shows suppletion with "sum," "es," "est" (present) and "fui" (perfect).
  • Spanish: "To go" has the infinitive "ir," but the present stem is "v-," as in "voy" (I go), and the preterite stem is "fu-," as in "fui" (I went).
  • Russian: "To go" is "идти" (idti), with the present "иду" (idu, I go), but its past tense is built on an unrelated stem: "шёл" (šël, went).
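
Because suppletive forms cannot be predicted from the base, a simple model has to store them as separate entries and check those entries before applying any regular rule. The sketch below shows that lookup-then-rule pattern for English. It is deliberately minimal: the table and function names are invented for this example, the regular rules ignore consonant doubling and other spelling changes, and non-suppletive irregulars such as "sing"/"sang" are not handled.

```python
# Suppletive forms are listed explicitly because no rule can derive them.
SUPPLETIVE_PAST = {"go": "went"}
SUPPLETIVE_COMPARATIVE = {"good": "better", "bad": "worse"}


def past_tense(verb: str) -> str:
    """Return a stored suppletive form if there is one, else add -(e)d."""
    if verb in SUPPLETIVE_PAST:
        return SUPPLETIVE_PAST[verb]
    return verb + ("d" if verb.endswith("e") else "ed")


def comparative(adjective: str) -> str:
    """Return a stored suppletive form if there is one, else add -er."""
    if adjective in SUPPLETIVE_COMPARATIVE:
        return SUPPLETIVE_COMPARATIVE[adjective]
    return adjective + "er"


print(past_tense("walk"), past_tense("go"))      # walked went
print(comparative("tall"), comparative("good"))  # taller better
```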

Suppletion is often explained as the result of historical linguistic evolution, where words from different etymological sources merge into a single paradigm over time due to similar meanings or functions. Frequent use helps these forms survive: languages tend to regularize irregular forms, but very common words are heard and memorized often enough to resist that pressure, leaving only a few highly irregular suppletive forms. Linguists study suppletion to understand language change and the balance between irregularity and the drive for regular grammatical patterns.

Conversion

Also known as zero derivation or functional shift, conversion is a morphological process through which a word is reassigned to a new grammatical category without the addition of an affix or any change to its form. This process allows a word from one part of speech, such as a noun, to function as another part of speech, like a verb, simply through a shift in its grammatical context. Conversion allows languages to expand their lexical and functional capacity, often creating pairs of related words that differ in grammatical category but are identical in form.

Unlike other morphological processes, conversion involves no alteration to the word's spelling or phonetics; the change in grammatical function is indicated solely by the word's use in context. Conversion can occur in multiple directions, such as noun to verb, verb to noun, adjective to noun, et cetera. It is a highly productive means of word formation in many languages, allowing for the rapid creation of new words to meet communicative demands.

  • Noun to Verb: The use of a noun in a verbal context, thereby creating a verb with a related meaning. For example, "to bottle" (from "bottle," implying the action of putting something into bottles).
  • Verb to Noun: The use of a verb as a noun, often referring to the act or instance of performing the verb. An example might be "a run" (from "to run," referring to a single act of running).
  • Adjective to Verb: The transformation of an adjective into a verb, enabling the description of the action of becoming or making something have the quality of the adjective. For instance, "to empty" (from "empty," meaning to make something empty).
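
Since conversion leaves the form unchanged, a lexicon has to list more than one category for the same word, and only the context decides which one is in play. The sketch below is a crude illustration of that idea, not a part-of-speech tagger: the lexicon, the function name, and the two context cues (a preceding "to" for verbs, a preceding article for nouns) are all assumptions made for this example.

```python
# One form, several categories: the mark of conversion.
LEXICON = {
    "bottle": {"noun", "verb"},
    "run": {"verb", "noun"},
    "empty": {"adjective", "verb"},
}

ARTICLES = {"a", "an", "the"}


def category_in_context(previous_word: str, word: str):
    """Pick a category for `word` from the word before it, or None if unsure."""
    categories = LEXICON.get(word, set())
    if previous_word == "to" and "verb" in categories:
        return "verb"
    if previous_word in ARTICLES and "noun" in categories:
        return "noun"
    return None  # the cue is missing or does not match a listed category


print(category_in_context("to", "bottle"))  # verb
print(category_in_context("a", "bottle"))   # noun
print(category_in_context("a", "run"))      # noun
print(category_in_context("to", "empty"))   # verb
```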

Cliticization

A morphological and phonological process involving clitics, grammatical elements that possess syntactic characteristics of words but behave phonologically like affixes. Clitics are dependent on adjacent words for their pronunciation and often cannot stand alone, lacking the stress patterns typical of full words.

They require a host word to attach to, as they cannot independently bear stress; unlike affixes, which are bound to specific parts of speech, clitics can attach to various categories of words. They often encode grammatical functions such as negation, possession, articles, pronouns, or auxiliary verbs. Proclitics attach to the beginning of a host word; enclitics attach to the end of a host word; mesoclitics are inserted within a host word.

In Romance languages like Spanish and Italian, object pronouns often appear as clitics attached to verbs. In Italian, "lo" (it) can attach to "vedo" (I see) to form "lo vedo" (I see it). In English, the possessive "'s" is considered a clitic that can attach to noun phrases to indicate possession, as in "the king of Spain's crown." In Greek, the definite article can act as a proclitic, attaching to the beginning of the word it modifies. In French, "ne" in the negation "ne...pas" behaves like a clitic, attaching to auxiliary or main verbs to form negatives, as in "Je ne sais pas" (I don't know).

Distinguishing clitics from affixes can be challenging but important. Affixes are inherently bound morphemes that directly modify the meaning or grammatical category of the words they attach to and are restricted to specific lexical categories. Clitics, while also bound in pronunciation, retain a syntactic identity separate from the host and can often attach to a broader range of categories.

Grammatical Categories and Concepts

  • Cases in Languages
    • Nominative, Accusative, Ergative, and Others
    • Case Systems: Tripartite, Split Ergativity
  • Tense, Aspect, and Mood (TAM)
    • Tense: Past, Present, Future
    • Aspect: Imperfective, Perfective, Progressive
    • Mood: Indicative, Subjunctive, Imperative, Conditional
  • Voice and Valency
    • Active, Passive, Middle Voices
    • Valency Changing Operations: Causatives, Applicatives

Semantic Roles and Relations

  • Thematic Roles: Agent, Patient, Theme, Experiencer
  • Semantic Fields and Lexical Sets

Language Change and Evolution

  • Historical Linguistics and Language Change
    • Sound Changes, Analogical Changes
    • Grammaticalization and Language Contact
  • Dialectology and Sociolinguistics
    • Language Variation and Change
    • Social Factors in Language Change

Comparative Studies Across Language Families

  • Indo-European Languages
  • Afro-Asiatic Languages
  • Sino-Tibetan Languages
  • Language Isolates and Constructed Languages

Research Methods in Comparative Linguistics

  • Comparative Method and Reconstruction
  • Typological and Areal Linguistics
  • Corpus Linguistics and Computational Approaches

Current Trends and Future Directions

  • Interdisciplinary Approaches: Cognitive Linguistics
  • Endangered Languages and Language Revival
  • Universal Grammar and Language Acquisition