Lexical systems with systematic gaps: verbs of falling
Folia Linguistica. 2024. Vol. 58. No. 1. P. 191–226.
The paper focuses on the lexical typology of dimensional terms such as English long, deep, wide, etc. Compared to other semantic fields, this one is relatively well-studied; however, the present study is the first to approach it from the modern typological point of view. We propose a semantic map of dimensional terms, which outlines the possible and impossible colexification patterns in the domain. However, other regularities appear likely to exist, which cannot be captured by the model of semantic mapping. We discuss the potential restrictions on colexifications, and suggest explanations for them.
The article compares the qualities ‘sharp’ and ‘blunt’ in 20 languages. We show that they tend to be unequal, with bluntness defined negatively through sharpness. The two main oppositions in the domain are (1) the type of sharp object and (2) the sense through which the quality is primarily experienced. The first opposition divides objects into bladed (knives, etc.) and pointed (needles, etc.); the second contrasts touch with vision and translates into function (sharp/blunt instruments, etc.) versus shape (pointed/rounded features, etc.).
The paper presents a methodology for the automatic construction of lexical typological questionnaires for qualitative semantic domains (e.g. sharp, straight, thick, or smooth). Our algorithm is based on data from a monolingual corpus: it constructs a list of collocations for the corresponding lexemes, computes a vector representation for every collocation, clusters the vector space into semantically homogeneous groups, and extracts the three central elements from every cluster. We compare the resulting questionnaires against test data from semantic domains that have already been studied well manually. The algorithm produces high-quality results and can be used in the practice of lexical typological research.
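The clustering and extraction steps described above can be illustrated with a minimal sketch. It assumes that the collocation vectors are already available; the two-dimensional vectors, the collocations of sharp, the choice of k-means as the clustering method, and the helper names (kmeans, central_items) are all hypothetical and are not taken from the paper.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors serve as deterministic seeds."""
    centers = [tuple(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest center.
        assignment = [
            min(range(k), key=lambda c: dist(v, centers[c])) for v in vectors
        ]
        # Move every center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centers[c] = centroid(members)
    return assignment, centers


def central_items(items, vectors, k, top_n=3):
    """For each cluster, return the top_n items closest to its center."""
    assignment, centers = kmeans(vectors, k)
    result = []
    for c in range(k):
        members = sorted(
            (dist(v, centers[c]), item)
            for item, v, a in zip(items, vectors, assignment)
            if a == c
        )
        result.append([item for _, item in members[:top_n]])
    return result


# Hypothetical toy data: collocations of "sharp" with made-up 2-D vectors.
collocations = [
    "sharp knife", "sharp needle", "sharp blade", "sharp pin",
    "sharp razor", "sharp thorn", "sharp axe", "sharp spike",
]
toy_vectors = [
    (1.0, 0.1), (0.1, 1.0), (0.9, 0.2), (0.2, 0.9),
    (1.1, 0.0), (0.0, 1.1), (0.6, 0.3), (0.4, 0.8),
]

questionnaire = central_items(collocations, toy_vectors, k=2)
for group in questionnaire:
    print(group)
```

On this toy input the bladed and pointed collocations fall into separate clusters, and each cluster contributes three contexts to the questionnaire draft. Real collocation vectors would be high-dimensional, and the number of clusters would have to be chosen or estimated.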
The article studies the domain of wetness in 20 languages. In many of them the domain features two main words (e.g. German nass, feucht; Mongolian nojton, čijgleg; Moksha načkə, l’et’kə) and the difference between them tends to be described in terms of degree, i.e. ‘intensely’ versus ‘slightly wet’. Typological analysis shows that in each case the degree of humidity receives a specific interpretation depending on the noun that is being modified, so that the choice of a particular synonym is based not simply on the quantity of the fluid, but on the situation as a whole (including the source of moisture, intentional versus non-intentional event, etc.). We also discuss the additional factors relevant to the domain in the languages that have more than two words in it, that is, the additional words with a positive or a negative connotation, or moisture from contact with a liquid versus moisture absorbed from humid air.
The chapter outlines the goals of our project, points out the aspects that distinguish the vocabulary of qualities from other lexical domains, when viewed from a typological perspective, and introduces the methods of data collection and analysis we use in this project and in other related studies. It goes on to discuss the semantic parameters that motivate the lexical oppositions in various qualitative domains.
The paper outlines the basics of data collection, analysis and visualization under the frame-based approach to lexical typology and illustrates its methodology using data from cross-linguistic research on verbs of falling. The framework reveals several challenges to semantic map modelling that usually escape researchers’ attention: (1) the principles of establishing lexical comparative concepts; (2) effective ways of visualizing the opposition between direct and figurative meanings of lexical items; (3) the problem of the borderlines between semantic fields, which seem to be very subtle. The paper discusses these problems in detail, together with possible theoretical solutions and semantic modelling techniques that could overcome these bottlenecks.
Encyclopedia of Slavic Languages and Linguistics Online / Ed. by M. Greenberg, L. Grenoble. Brill, 2020.
The paper examines the properties of heavy as a perceptual concept, based on evidence from 11 languages. We demonstrate that the semantics of this concept is heterogeneous; lexemes of this field can be used in situations of at least three types: Lifting, Shifting and Weighing. These situations are either lexicalised as separate words or they converge in a single lexeme in various combinations following certain strategies. We also argue that different metaphorical extensions correspond to different situation types; this allows us to use analysis of metaphoric shifts as an additional instrument to establish the semantic structure of direct meanings.
Questionnaires constitute a crucial tool in linguistic typology and language description. By nature, a Questionnaire is both an instrument and a result of typological work: its purpose is to help the study of a particular phenomenon cross-linguistically or in a particular language, but the creation of a Questionnaire is in turn based on the analysis of cross-linguistic data. We attempt to ease linguists’ work by constructing lexical Questionnaires automatically prior to any manual analysis. A convenient Questionnaire format for revealing fine-grained semantic distinctions includes pairings of words with diagnostic contexts that trigger different lexicalizations across languages. Our method for constructing this type of Questionnaire relies on distributional vector representations of words and phrases, which serve as input to a clustering algorithm. As output, our system produces a compact prototype Questionnaire for cross-linguistic exploration of contextual equivalents of lexical items, with groups of three homogeneous contexts illustrating each usage. We provide examples of automatically generated Questionnaires based on 100 frequent adjectives of Russian, including veselyj ‘funny’, ploxoj ‘bad’, dobryj ‘kind’, bystryj ‘quick’, ogromnyj ‘huge’, krasnyj ‘red’, byvšij ‘former’, etc. Quantitative and qualitative evaluation of the Questionnaires confirms the viability of our method.
In this paper, we present an application for formal concept analysis (FCA) by showing how it can help construct a semantic map for a lexical typological study. We show that FCA captures typological regularities, so that concept lattices automatically built from linguistic data appear to be even more informative than traditional semantic maps. While sometimes this informativeness causes unreadability of a map, in other cases, it opens up new perspectives in the field, such as the opportunity to analyze the relationship between direct and figurative lexical meanings.
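As an illustration of what a concept lattice is built from, the sketch below enumerates the formal concepts of a tiny lexeme-by-meaning context. The lexemes, the meanings and the brute-force enumeration are assumptions made for this example only; the abstract does not describe the data or the implementation used in the paper, and practical FCA tools use more efficient algorithms.

```python
from itertools import combinations


def extent(context, attrs):
    """Objects (lexemes) that have every attribute (meaning) in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(context, objs, all_attrs):
    """Attributes shared by every object in objs (all of them if objs is empty)."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Brute force: close every attribute subset; feasible for small contexts only."""
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for r in range(len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), r):
            objs = extent(context, frozenset(combo))
            concepts.add((objs, intent(context, objs, all_attrs)))
    return concepts


# Hypothetical context: which meanings each (invented) lexeme can express.
context = {
    "lex1": {"fall_from_height", "topple"},
    "lex2": {"fall_from_height", "drip"},
    "lex3": {"topple"},
}

concepts = formal_concepts(context)
for objs, attrs in sorted(concepts, key=lambda c: (len(c[1]), sorted(c[1]))):
    print(sorted(objs), sorted(attrs))
```

Each printed pair is a maximal set of lexemes together with exactly the meanings they share. Ordering these pairs by inclusion of their lexeme sets yields the lattice, which can then be compared with a hand-drawn semantic map.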
The article deals with the methodology and techniques of lexical typological studies.
This paper deals with the typology of surface texture expressions, such as a slippery road, a smooth wooden board, rough hands, coarse or rough fabric. We discuss both their direct uses and metaphors formed with them, such as a slippery person, a smooth speech, a rugged captain. Our language sample includes 10 Uralic languages (Finnish, Estonian, Mari, Erzya, Moksha, Udmurt, Komi-Zyrjan, Hungarian, Khanty, Nenets), as well as 5 languages from other families (Russian, English, Spanish, Chinese, and Korean). The categorisation of these attributes includes primarily the division into visually perceived surfaces and surfaces perceived through physical contact. We discuss how much and in what ways the antonymic areas under observation are asymmetrical in their semantic features and combinability. One more focus in this research is to evaluate texture lexicon variation in an intragenetic study of a group of related languages in comparison with its variation across a broader sample of languages.
This paper elaborates on an approach to the cross-linguistic comparison of lexical (sub)systems, which is based on the differentiation of typologically relevant semantic domains. We illustrate this approach exploring the conceptualization of motion / being in liquid medium (aqua-motion), within which four general domains (SWIMMING, SAILING, DRIFTING and FLOATING) are recognized. Using this distinction, we propose a typology of aqua-motion systems that distinguishes between ‘rich’, ‘poor’ and ‘middle’ systems of aqua-motion expressions depending on the lexical contrasts that the language displays.
The semantic domain of pain seems to be unique in that, crosslinguistically, it includes few predicates that are specifically dedicated to pain (like hurt or ache); instead, the major part of the field is constituted by lexical units drawn from other semantic domains, which are applied to pain through processes of semantic derivation (like my eyes are burning, my throat is scratching). After discussing methodological considerations concerning data collection, the article first analyzes the semantic sources for pain predicates and addresses the issue of their typological consistency, based on data from over 20 languages. It is then demonstrated that the evolution of a pain meaning cannot be reduced to a merely semantic process, since the meaning shift may be accompanied by changes in the morphological, morphosyntactic and/or syntactic properties of the source verb. We suggest the term “re-branding” for complex meaning changes of this kind and discuss their theoretical relation to the well-established notions of metaphor and metonymy.
The paper presents a study in lexical typology. We focus on the semantic domain of pain as one of the most universal and complex areas of human experience. The predicates of unpleasant bodily sensations are compared in a sample of 23 languages. The collected material demonstrates that the use of pain verbs depends on a range of factors of different kinds. This heterogeneity of the data poses the problem of the cross-linguistic comparability of pain predicates. As a way to overcome this problem, we propose the construction of a typological database. The multidimensional classifications implemented in the database allow for various cross-linguistic generalizations about the conceptualization of pain and the human body, as well as about regularities of semantic shifts in different languages.