Linguistics

From Thesmotetai
Revision as of 19:09, 10 March 2024 by Jojo (talk | contribs) (Syntax)

Syntax

The study of how words (the smallest units of grammar that have meaning and can stand alone, such as 'car') and morphemes (units smaller than words that carry meaning but cannot stand alone, such as the prefix 'un-') combine to form larger units of grammar, such as phrases and sentences. Syntax concerns itself with word order, grammatical relationships between words and morphemes, constituency (the hierarchical structure of sentences), agreement (when words change form to match their neighbors in a larger combined unit of grammar), crosslinguistic variation, and semantics (the relationship between the form of a word and its meaning).

The word syntax comes from Ancient Greek: σύνταξις 'coordination,' which consists of σύν (syn), 'together,' and τάξις (táxis), 'ordering.' Language played a crucial role in Greek philosophy, with figures like Plato and Aristotle exploring the relationship between words, meaning, and reality. Aristotle's work on logic delved into the principles of constructing valid arguments, indirectly touching upon aspects of syntax. Classical Greek and Hellenistic thinkers made substantial contributions to the study of grammar; Dionysius Thrax (100s BCE) composed the Tékhnē Grammatikḗ (Art of Grammar), the first work of analytical linguistics focusing on Ancient Greek, which included discussions of parts of speech, morphology, and syntax.

In English, syntax is largely controlled by word order ('the girl loves the boy' versus 'the boy loves the girl'), whereas many other languages use case markers to indicate these grammatical relationships. Latin shows this trait: word order is far less important, and 'the girl loves the boy' can be written in a variety of correct orders because the -um ending on the object (boy) stays constant. Puerum puella amat, amat puella puerum, amat puerum puella, and puella amat puerum are all correct. (Editor's note: puella is 'girl,' puer(-um) is 'boy,' and amat is 'loves.')
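The Latin example can be sketched in a few lines of Python. This is a toy illustration, not a real morphological analyzer: the `LEXICON` table and the `assign_roles` and `all_orders` helpers are hypothetical names, and the lexicon covers only the three word forms quoted above. The point it shows is that grammatical roles are read off the word forms, so every ordering yields the same reading.

```python
from itertools import permutations

# Toy lexicon: each inflected form maps to (gloss, grammatical role).
# Only the three forms from the example are covered (an assumption of this sketch).
LEXICON = {
    "puella": ("girl", "subject"),  # nominative singular
    "puerum": ("boy", "object"),    # accusative singular, marked by -um
    "amat": ("loves", "verb"),      # third person singular present
}


def assign_roles(sentence):
    """Map each grammatical role to its gloss, ignoring word position."""
    roles = {}
    for word in sentence.split():
        gloss, role = LEXICON[word]
        roles[role] = gloss
    return roles


def all_orders(words):
    """Return every possible ordering of the given words as a sentence string."""
    return [" ".join(order) for order in permutations(words)]


# Analyze all six orderings of the three words.
readings = {s: assign_roles(s) for s in all_orders(["puella", "puerum", "amat"])}
for sentence, roles in readings.items():
    print(sentence, "->", roles)
```

An English equivalent could not be written this way, because 'girl' and 'boy' carry no case ending and the role has to be inferred from position.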

Word Order

The sequence in which the subject (S), verb (V), and object (O) typically appear in sentences. 'Typically' is an important qualifier: while the dominant word order offers a foundational understanding, many languages exhibit flexibility in their word order, especially for emphasis, thematic structure, or in questions versus statements.

  • Subject-Object-Verb (SOV): Languages that typically use SOV order include Japanese, Korean, Turkish, and Latin. Around 40-45% of the world's languages are believed to use SOV as their default word order.
  • Subject-Verb-Object (SVO): Languages that typically use SVO order include English, Mandarin Chinese, Spanish, and Russian. Approximately 35-40% of languages are thought to use SVO as their primary order.
  • Verb-Subject-Object (VSO) and Verb-Object-Subject (VOS): These orders are less common. VSO is seen in Classical Arabic and Welsh, while VOS appears in languages like Malagasy and Fijian. Combined, VSO and VOS might account for around 10-15% of languages.
  • Object-Verb-Subject (OVS) and Object-Subject-Verb (OSV): These are the rarest word orders among languages. Examples include Hixkaryana for OVS and Xavante for OSV. Together, they constitute less than 1% of all languages.

The separation by language family is as follows:

  • Indo-European predominantly features SVO (English, Spanish) and SOV (Hindi, Persian) orders.
  • Sino-Tibetan languages can show diversity in structure; Mandarin Chinese is notably SVO.
  • Turkic generally features SOV order, as seen in Turkish and Uzbek.
  • Uralic features a mix, but Finnish and Hungarian, for example, tend to favor SVO.
  • Afro-Asiatic languages like Arabic (VSO), Hebrew (SVO), and Amharic (SOV) show diversity.
  • Austronesian also has this variability; Indonesian (SVO) and Malagasy (VOS).
  • Dravidian primarily features SOV order, as in Tamil and Telugu.

Grammatical Relationships

The standard examples of grammatical functions from traditional grammar are subject, direct object, and indirect object.

  • The subject is the noun/pronoun about which a statement is made ('John' in 'John gave an apple to Sally').
  • The direct object is the noun/pronoun being acted upon by the verb ('an apple' in 'John gave an apple to Sally').
  • The indirect object is the recipient of the direct object ('Sally' in 'John gave an apple to Sally').

Many modern theories of grammar acknowledge numerous additional types of relations (e.g. complement, specifier, predicative, et cetera). The role of grammatical relations in theories of grammar is greatest in dependency grammars, which propose dozens of distinct grammatical relations. Critics argue that overemphasis on certain grammatical models, such as word order (particularly from an Indo-European focus), may overlook the diversity and complexity of language structures worldwide.

  • Complement: A word or phrase that is necessary to complete the meaning of another part of the sentence. Their main role is to complete the idea expressed by the word they complement. Without the complement, the idea would feel incomplete.
    • Verb Complement: Provides essential information about the action or state described by the verb. For example, in "She gave her friend a gift," "her friend" and "a gift" are complements of the verb "gave" because they complete the action by specifying to whom and what was given.
    • Noun Complement: Completes the meaning of a noun. For instance, "The decision to leave early" includes "to leave early" as a complement of "decision," explaining what the decision is about.
    • Adjective Complement: Completes the meaning of an adjective. In "She is capable of winning," "of winning" is a complement of "capable," specifying in what way she is capable.
  • Specifier: A word that modifies or provides more specific information about another word, often relating to quantity, definiteness, or possession. Specifiers can be articles (the, a), possessive pronouns (his, her), demonstratives (this, that), and quantifiers (some, many).
  • Predicative: A predicative (or predicate) element provides information about the subject or object, typically through a linking verb (such as "to be," "seem," or "become"). Predicatives can be predicative adjectives or predicative nominatives (nouns or pronouns); they link the subject or object to a quality, identity, or condition.
    • Predicative Adjective: Describes the subject or object. In "The sky is blue," "blue" is a predicative adjective providing information about "the sky."
    • Predicative Nominative: Identifies or renames the subject or object. For example, in "Karen is a teacher," "a teacher" is a predicative nominative that identifies Karen's occupation.

Dependency grammars emphasize the idea that linguistic units are connected to each other by direct links or dependencies, forming a network of relations that structure the sentence. In this framework, each unit depends on a head (a central word it is connected to) and can have dependents (words that depend on it). Dependency grammars identify a wide range of specific grammatical relations to describe the types of dependencies that can exist, reflecting the nuanced ways words can relate to each other within a sentence.
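The head-and-dependent structure can be made concrete with a small Python sketch that reuses the sentence 'John gave an apple to Sally.' The relation labels, the attachment of 'to' under 'Sally', and the helper names `dependents_of` and `path_to_root` are assumptions made for illustration; individual dependency frameworks differ on both labels and attachments.

```python
# Each entry is (dependent, head, relation); a head of None marks the root.
# Labels and attachments are illustrative assumptions, not a standard annotation.
DEPENDENCIES = [
    ("John", "gave", "subject"),
    ("gave", None, "root"),
    ("an", "apple", "determiner"),
    ("apple", "gave", "direct object"),
    ("to", "Sally", "case marker"),
    ("Sally", "gave", "indirect object"),
]


def dependents_of(head):
    """List the words that depend directly on the given head, in sentence order."""
    return [dep for dep, h, _ in DEPENDENCIES if h == head]


def path_to_root(word):
    """Follow head links from a word up to the root of the sentence."""
    heads = {dep: h for dep, h, _ in DEPENDENCIES}
    path = [word]
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    return path


print(dependents_of("gave"))   # direct dependents of the verb
print(path_to_root("an"))      # chain of heads from 'an' to the root
```

Every word has exactly one head, so the links form a tree rooted in the verb rather than a set of nested phrases.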

Some languages, often referred to as non-configurational, exhibit a high degree of flexibility in word order. Grammatical relationships are often indicated through inflectional morphology. In Australian Aboriginal languages, Latin, and Classical Greek, the subject, object, and verb can appear in various orders without changing the fundamental meaning of the sentence.

In topic-prominent languages, such as Mandarin Chinese, the topic of the sentence (what the sentence is about, which can be the subject, object, or another element) comes first, and what is said about the topic follows. This structure emphasizes the topic-comment construction over the subject-predicate construction typical of subject-prominent languages like English.

In languages with rich morphological systems, such as agglutinative and fusional languages, grammatical roles are often marked by case endings or through agreement rather than strictly by word order. This allows for greater flexibility in sentence structure and can convey additional nuances.

Polysynthetic languages, which include many indigenous languages of the Americas, incorporate a high degree of information within single words through complex inflection. This can include the subject, object, verb, and additional modifiers and relational elements, making the concept of word order as applied in more analytic languages less directly relevant.

Inflection

A process of word formation where a word is modified to express different grammatical categories such as tense, case, voice, aspect, comparatives, person, number, gender, mood, animacy, agreement, and definiteness. When a verb is inflected, it is called conjugation; when other word types (nouns, adjectives, adverbs, pronouns, determiners, participles, prepositions, postpositions, numerals, articles, et cetera) are inflected, it is called declension.

An inflection expresses grammatical categories with affixation (such as prefix, suffix, infix, circumfix, and transfix), apophony (as Indo-European ablaut, sound changes to the root word), or other modifications. For example, the Latin verb ducam, meaning "I will lead", includes the suffix -am, expressing person (first), number (singular), and tense-mood (future indicative or present subjunctive).

The inflected form of a word often contains both free morphemes (which can stand alone as words) and bound morphemes (which cannot). For example, the English word cars is a noun that is inflected for number to express the plural: the morpheme car is free because it could stand alone as a word, while the suffix -s is bound because it cannot. These two morphemes together form the inflected word cars.
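A minimal Python sketch of this split between free and bound morphemes follows. The `FREE` set, the `BOUND_SUFFIXES` table, and the `segment` function are hypothetical and deliberately tiny; real morphological analysis has to deal with spelling changes, ambiguity, and much larger lexicons.

```python
# Toy inventories (assumptions of this sketch, not a real lexicon).
FREE = {"car", "dog", "call"}
BOUND_SUFFIXES = {
    "s": "plural or third person singular",
    "ed": "past tense",
    "ing": "present participle",
}


def segment(word):
    """Split a word into a free root plus at most one bound suffix."""
    if word in FREE:
        return [(word, "free")]
    # Try longer suffixes first so that "-ing" is tested before "-s".
    for suffix in sorted(BOUND_SUFFIXES, key=len, reverse=True):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in FREE:
            return [(stem, "free"), ("-" + suffix, "bound")]
    return [(word, "unknown")]


print(segment("cars"))
print(segment("called"))
print(segment("car"))
```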

Words that are never subject to inflection are said to be invariant; for example, the English verb must is an invariant item: it never takes a suffix or changes form to signify a different grammatical category. Its categories can be determined only from its context. Languages that have some degree of inflection are synthetic languages. These can be highly inflected (such as Latin, Greek, Biblical Hebrew, and Sanskrit), or slightly inflected (such as English, Dutch, Persian). Languages that use little inflection are also said to be 'analytic.' Analytic languages that do not make use of derivational morphemes, such as Standard Chinese, are said to be isolating.

Types of Inflections

Inflection often involves agreement between nouns and other parts of speech, such as adjectives, pronouns, and verbs.

Cases indicate the grammatical role of a noun or pronoun in a sentence, such as subject, object, or possession. Common cases include: Nominative - subject of the verb; Accusative - direct object of the verb; Dative - indirect object of the verb; Genitive - possessive relation; Locative - location; Instrumental - means by which an action is performed; Ablative - movement away from something; Vocative - direct address; Allative - movement toward something.

Number distinguishes between one (singular), two (dual), and more than two (plural) entities. Some languages make additional distinctions, such as marking 'a few' as opposed to 'many' (paucal), or have a particular inflection for a collective quantity.

Gender can be a way to classify nouns and pronouns, often affecting agreement with adjectives, verbs, and pronouns; gender categories can include masculine, feminine, neuter, and common, among others.

Animacy can also play a role similar to gender in some languages, with animate or inanimate objects having distinct inflection markers.

Definiteness distinguishes between specific and non-specific objects ('the' versus 'a/an').

The partitive is used to express something being 'part of' some bigger or greater whole.

Comparison markers indicate comparative and superlative forms.

Many languages have unique declension categories not widely found elsewhere, reflecting specific semantic or syntactic distinctions important in those languages.

In English most nouns are inflected for number with the inflectional plural affix -s (as in "dog" → "dog-s"), and most English verbs are inflected for tense with the inflectional past tense affix -ed (as in "call" → "call-ed"). English also inflects verbs by affixation to mark the third person singular in the present tense (with -s), and the present participle (with -ing). English short adjectives are inflected to mark comparative and superlative forms (with -er and -est respectively). Despite the general regularization, English retains traces of its ancestry, with a minority of its words still using inflection by ablaut (sound change, mostly in verbs) and umlaut (a particular type of sound change, mostly in nouns), as well as long-short vowel alternation.

  • Write, wrote, written (marking by ablaut variation, and also suffixing in the participle); sing, sang, sung (ablaut); foot, feet (marking by umlaut variation); mouse, mice (umlaut); child, children (ablaut, and also suffixing in the plural)
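The contrast between regular affixation and the irregular forms listed above can be sketched as a lookup with a default rule. The function names `plural` and `past_tense` are hypothetical, the irregular tables hold only forms taken from the examples above, and spelling adjustments (such as -es plurals or consonant doubling) are left out on purpose.

```python
# Irregular forms taken from the examples above; everything else is treated as regular.
IRREGULAR_PLURALS = {"foot": "feet", "mouse": "mice", "child": "children"}
IRREGULAR_PAST = {"write": "wrote", "sing": "sang"}


def plural(noun):
    """Return the plural: a stored irregular form if one exists, otherwise noun + -s."""
    return IRREGULAR_PLURALS.get(noun, noun + "s")


def past_tense(verb):
    """Return the past tense: a stored irregular form if one exists, otherwise verb + -ed."""
    return IRREGULAR_PAST.get(verb, verb + "ed")


for noun in ["dog", "foot", "mouse"]:
    print(noun, "->", plural(noun))
for verb in ["call", "sing", "write"]:
    print(verb, "->", past_tense(verb))
```

The design mirrors the description in the text: the affix rule is the default, and the ablaut and umlaut forms are exceptions that have to be stored individually.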

In Latin, there are five types of declension. Words that belong to the first declension usually end in -a and are usually feminine in gender, and thus can be said to share a common inflectional framework. In Old English, nouns are divided into two major categories of declension, strong and weak. Through the process of syncretism, the same inflection form may serve multiple grammatical functions in many languages, and context helps reveal the semantic details.

Morphemes can be added to inflectional languages in a variety of ways:

  • Affixation, or simply adding morphemes onto the word without changing the root;
  • Reduplication, repeating all or part of a word to change its meaning;
  • Alternation, exchanging one sound for another in the root (usually vowel sounds, as in the ablaut / umlaut in Germanic languages);
  • Suprasegmental variations, such as of stress, pitch or tone, where no sounds are added or changed but the intonation and relative strength of each sound is altered regularly.

Inflection and PIE Languages

Because the Proto-Indo-European (PIE) language was highly inflected, all of its descendants, such as Albanian, Armenian, English, German, Ukrainian, Russian, Persian, Kurdish, Italian, Irish, Spanish, French, Hindi, Marathi, Urdu, Bengali, and Nepali, are inflected to a greater or lesser extent. In general, older Indo-European languages such as Latin, Ancient Greek, Old English, Old Norse, Old Church Slavonic and Sanskrit are extensively inflected because of their temporal proximity to the PIE ancestor.

Deflexion has caused modern versions of some Indo-European languages that were previously highly inflected to be much less so; an example is English, as compared to Old English. In general, languages where deflexion occurs replace inflectional complexity with more rigorous word order, which provides the lost inflectional details. However, the transition from synthetic (inflection-heavy) to analytic (relying more on word order and auxiliary words) is complex and can be influenced by multiple factors.

Most Slavic languages and some Indo-Aryan languages are an exception to the general Indo-European deflexion trend, continuing to be highly inflected (in some cases acquiring additional inflectional complexity and grammatical genders, as in Czech and Marathi).

Syntactic Theories and Models of the 21st Century

Developed by Noam Chomsky, generative grammar emphasizes the innate linguistic capability of humans and the idea of a universal grammar underlying all languages. It has evolved through various iterations:

  • Government and Binding (GB) Theory was focused on the modular organization of syntax, including principles like X-bar theory, case theory, and theta theory.
  • Minimalist Program (MP) aims to explain language using the minimal necessary theoretical constructs and principles, focusing on the economy of derivation and representation.

Lexical-Functional Grammar (LFG) emphasizes the importance of lexical information in determining grammatical structure and includes a distinction between functional structure (f-structure) and constituent structure (c-structure).

Head-Driven Phrase Structure Grammar (HPSG) focuses on the lexicon and complex feature-based descriptions, with a strong emphasis on the role of the head in phrase structure and a commitment to a non-transformational view of syntax.

Categorial Grammar (CG) is based on the idea that syntactic categories can be understood in terms of functions that combine words and phrases into larger units. Modern variants include Combinatory Categorial Grammar (CCG) and Type Logical Grammar.
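The idea of categories as functions can be sketched with the two basic application rules of categorial grammar. The class and function names below (`Functor`, `combine`) are hypothetical, and treating 'the girl' and 'the boy' as single lexical noun phrases is a simplification made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Functor:
    """A category that yields `result` once it combines with `argument`."""
    result: "Category"
    slash: str  # "/" looks for its argument on the right, "\\" on the left
    argument: "Category"


Category = Union[str, Functor]


def combine(left: Category, right: Category) -> Optional[Category]:
    """Apply forward or backward application; return None if neither rule fits."""
    # Forward application: X/Y followed by Y gives X.
    if isinstance(left, Functor) and left.slash == "/" and left.argument == right:
        return left.result
    # Backward application: Y followed by X\Y gives X.
    if isinstance(right, Functor) and right.slash == "\\" and right.argument == left:
        return right.result
    return None


NP, S = "NP", "S"
VP = Functor(S, "\\", NP)   # S\NP: needs a subject on its left
TV = Functor(VP, "/", NP)   # (S\NP)/NP: needs an object on its right

LEXICON = {"the girl": NP, "the boy": NP, "loves": TV}

verb_phrase = combine(LEXICON["loves"], LEXICON["the boy"])
sentence = combine(LEXICON["the girl"], verb_phrase)
print(sentence)
```

Here the transitive verb is a function that first consumes its object and then its subject, and the whole string reduces to the category S.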

Dependency Grammar (DG) centers on the dependency relation between headwords and their dependents as the primary structure organizing sentences, differing from phrase structure grammars by not assuming constituent phrases as primary.

Construction Grammar (CxG) argues that knowledge of language is based on constructions, or form-meaning pairings, that range from specific idioms to general grammatical patterns. Variants include Cognitive Grammar and Radical Construction Grammar.

Tree Adjoining Grammar (TAG) is a highly lexicalized grammar framework that uses trees as the basic unit of syntax, allowing for more flexibility in describing cross-linguistic syntactic phenomena.

Role and Reference Grammar (RRG) integrates syntactic, semantic, and pragmatic information, focusing on the role of verb argument structure and the linking between semantic and syntactic roles.

Optimality Theory (OT) was originally developed for phonology but has since been applied to syntax. It proposes that surface forms result from the competition between constraints, rather than from transformational derivations.

Dynamic Syntax (DS) emphasizes the temporal dimension of syntax, modeling how sentences are processed in real-time, focusing on the incremental building of semantic representations.

Stochastic/Probabilistic Grammar models incorporate probabilities into grammatical descriptions, often used in computational linguistics to model language understanding and production processes based on statistical patterns.
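As a rough illustration of attaching probabilities to grammar rules, the sketch below scores one derivation under a toy probabilistic context-free grammar. The rules and their probabilities are invented for this example, and `derivation_probability` is a hypothetical helper; real models estimate such numbers from corpora.

```python
from math import prod

# Toy rules: (left-hand side, right-hand side) -> probability.
# All probabilities are invented for illustration; for each left-hand side they sum to 1.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("the girl",)): 0.5,
    ("NP", ("the boy",)): 0.5,
    ("VP", ("V", "NP")): 1.0,
    ("V", ("loves",)): 1.0,
}


def derivation_probability(derivation):
    """Multiply the probabilities of all rules used in a derivation."""
    return prod(RULES[rule] for rule in derivation)


# Derivation of 'the girl loves the boy', one rule application per step.
DERIVATION = [
    ("S", ("NP", "VP")),
    ("NP", ("the girl",)),
    ("VP", ("V", "NP")),
    ("V", ("loves",)),
    ("NP", ("the boy",)),
]

probability = derivation_probability(DERIVATION)
print(probability)
```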

Semantics

Put simply, semantics is the study of linguistic meaning. It examines how words get their specific meanings and how the meaning of a complex unit of grammar depends on its constituent parts. The term derives from the Greek verb sēmainō (“to mean” or “to signify”), through the adjective sēmantikos (“significant”).

In the philosophy of language, the distinction between sense and reference was an idea of the German philosopher and mathematician Gottlob Frege (On Sense and Reference; Über Sinn und Bedeutung), reflecting the two ways he believed a singular term could have meaning.

  • Reference (Bedeutung): The reference of a term is the object it refers to in the world. For example, the reference of the name "Venus" is the planet Venus itself.
  • Sense (Sinn): The sense of a term is the way the reference is presented, or the mode of presentation of the object. It involves the cognitive aspect of the term: how an individual conceives of the term's reference. For instance, "the morning star" and "the evening star" have the same reference (the planet Venus) but different senses, because they refer to Venus in different ways or contexts. The statement "the morning star is the evening star" is informative because it reveals that two expressions with different senses refer to the same object (Venus).
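The distinction can be modeled loosely as a record with two separate fields. The `Term` class, the helper names, and the wording of the sense descriptions are assumptions made for this sketch; they are a simplification, not Frege's own formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    expression: str
    sense: str      # mode of presentation (illustrative wording)
    reference: str  # the object picked out in the world


morning_star = Term("the morning star", "bright body seen in the dawn sky", "Venus")
evening_star = Term("the evening star", "bright body seen in the dusk sky", "Venus")


def co_referential(a, b):
    """Two terms are co-referential when they pick out the same object."""
    return a.reference == b.reference


def informative_identity(a, b):
    """An identity statement is informative when the reference is shared but the senses differ."""
    return co_referential(a, b) and a.sense != b.sense


print(informative_identity(morning_star, evening_star))
print(informative_identity(morning_star, morning_star))
```

Under this model "the morning star is the evening star" counts as informative, while "the morning star is the morning star" does not.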

Natural languages (those that develop 'in the wild' among human populations) are known for being unboundedly productive: there is no upper limit on the length, complexity, or number of grammatical expressions. Simpler expressions can be concatenated, relativized, complementized, et cetera to create ever-more-complex formulations. These complex expressions are both grammatically correct and meaningful.
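Unbounded productivity can be illustrated with complementization: any sentence can be embedded under another clause, and the result can be embedded again. The `embed` function and the two framing clauses are hypothetical choices for this sketch.

```python
# Framing clauses that take a full sentence as their complement (illustrative choices).
FRAMES = ["John said that", "Sally thinks that"]


def embed(sentence, depth):
    """Wrap a sentence in `depth` layers of complement clauses."""
    for level in range(depth):
        sentence = f"{FRAMES[level % len(FRAMES)]} {sentence}"
    return sentence


base = "the girl loves the boy"
for depth in range(4):
    print(embed(base, depth))
```

Each added layer yields a longer sentence that is still grammatical, and nothing in the rule itself sets a maximum depth.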

Lexicography

Morphology

Pragmatics

Morphology: The Study of Word Structure

  • Morphological Typology
    • Inflectional vs. Derivational Morphology
    • Synthetic vs. Analytic Languages
  • Morphological Processes
    • Affixation, Reduplication, Compounding
    • Suppletion, Conversion, Cliticization

Syntax: The Study of Sentence Structure

  • Syntactic Theories and Models
    • Generative Grammar
    • Dependency Grammar
  • Sentence Types and Clause Structures
    • Simple, Compound, Complex, and Compound-Complex Sentences
    • Subordination and Coordination

Grammatical Categories and Concepts

  • Cases in Languages
    • Nominative, Accusative, Ergative, and Others
    • Case Systems: Tripartite, Split Ergativity
  • Tense, Aspect, and Mood (TAM)
    • Tense: Past, Present, Future
    • Aspect: Imperfective, Perfective, Progressive
    • Mood: Indicative, Subjunctive, Imperative, Conditional
  • Voice and Valency
    • Active, Passive, Middle Voices
    • Valency Changing Operations: Causatives, Applicatives

Advanced Topics in Morphosyntax

  • Agreement and Concord
    • Subject-Verb Agreement
    • Noun-Adjective Agreement
  • Word Order Typology
    • SOV, SVO, VSO, and Free Word Order
    • Word Order and Information Structure

Semantic Roles and Relations

  • Thematic Roles: Agent, Patient, Theme, Experiencer
  • Semantic Fields and Lexical Sets
  • Polysemy, Homophony, and Synonymy

Language Change and Evolution

  • Historical Linguistics and Language Change
    • Sound Changes, Analogical Changes
    • Grammaticalization and Language Contact
  • Dialectology and Sociolinguistics
    • Language Variation and Change
    • Social Factors in Language Change

Comparative Studies Across Language Families

  • Indo-European Languages
  • Afro-Asiatic Languages
  • Sino-Tibetan Languages
  • Language Isolates and Constructed Languages

Research Methods in Comparative Linguistics

  • Comparative Method and Reconstruction
  • Typological and Areal Linguistics
  • Corpus Linguistics and Computational Approaches

Current Trends and Future Directions

  • Interdisciplinary Approaches: Cognitive Linguistics
  • Endangered Languages and Language Revival
  • Universal Grammar and Language Acquisition