Questionnaires constitute a crucial tool in linguistic typology and language description. By nature, a Questionnaire is both an instrument and a result of typological work: its purpose is to support the study of a particular phenomenon cross-linguistically or in a particular language, but the creation of a Questionnaire is in turn based on the analysis of cross-linguistic data. We attempt to ease linguists’ work by constructing lexical Questionnaires automatically, prior to any manual analysis. A convenient Questionnaire format for revealing fine-grained semantic distinctions pairs words with diagnostic contexts that trigger different lexicalizations across languages. Our method for constructing this type of Questionnaire relies on distributional vector representations of words and phrases, which serve as input to a clustering algorithm. As output, our system produces a compact prototype Questionnaire for cross-linguistic exploration of contextual equivalents of lexical items, with groups of three homogeneous contexts illustrating each usage. We provide examples of automatically generated Questionnaires based on 100 frequent adjectives of Russian, including veselyj ‘funny’, ploxoj ‘bad’, dobryj ‘kind’, bystryj ‘quick’, ogromnyj ‘huge’, krasnyj ‘red’, byvšij ‘former’, etc. Quantitative and qualitative evaluation of the Questionnaires confirms the viability of our method.
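The pipeline described in this abstract (context vectors fed to a clustering algorithm, with three homogeneous contexts kept per usage) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration: the abstract does not name the clustering algorithm, so a small cosine-based k-means stands in for it; the three-dimensional vectors and the English contexts for an adjective glossed 'quick' are invented toy data, not the authors' material; and the names `build_questionnaire`, `kmeans` and `TOY_CONTEXTS` are hypothetical.

```python
import math

# Invented toy data: hypothetical contexts of an adjective glossed 'quick',
# each with a made-up 3-dimensional "distributional" vector.
TOY_CONTEXTS = {
    "quick train": [0.90, 0.10, 0.00],
    "quick car": [0.80, 0.20, 0.10],
    "quick horse": [0.85, 0.15, 0.05],
    "quick runner": [0.70, 0.30, 0.10],
    "quick answer": [0.10, 0.90, 0.20],
    "quick decision": [0.20, 0.80, 0.10],
    "quick reaction": [0.15, 0.85, 0.30],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Cosine-based k-means with deterministic farthest-point initialisation.

    This is a stand-in: the source does not say which algorithm is used.
    """
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Pick the vector least similar to the centroids chosen so far.
        farthest = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(list(farthest))
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: labels no longer change
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignment, centroids


def build_questionnaire(contexts, k, per_group=3):
    """Return one group per non-empty cluster: the `per_group` contexts
    most similar to that cluster's centroid."""
    phrases = list(contexts)
    vectors = [contexts[p] for p in phrases]
    assignment, centroids = kmeans(vectors, k)
    groups = []
    for c in range(k):
        members = [p for p, a in zip(phrases, assignment) if a == c]
        members.sort(key=lambda p: cosine(contexts[p], centroids[c]), reverse=True)
        if members:
            groups.append(members[:per_group])
    return groups


if __name__ == "__main__":
    for number, group in enumerate(build_questionnaire(TOY_CONTEXTS, k=2), start=1):
        print(f"Usage {number}: {', '.join(group)}")
```

Ranking cluster members by similarity to the centroid is one simple way to approximate "homogeneous" contexts; a real system would start from corpus-derived vectors and would need a principled way to choose the number of clusters.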
Publications from 2012 to 2022
The article examines the relationship between time and space in language on the basis of adjectives denoting high or low speed in Russian and other (mostly Slavic) languages. In physics, the notion of speed is defined in terms of time and space (distance per unit of time). It is argued, however, that speed in natural language is primarily a temporal concept involving the comparison of the temporal properties of a ‘target situation’ with those of a ‘norm’. Speed terms are shown to develop their own metaphors and metonymies, subsequently becoming connectors and intensifying markers. This argument has important theoretical implications insofar as it demonstrates that the domain of time is less dependent on space than the traditional view might indicate.
The semantic domain of pain seems to be unique in that, crosslinguistically, it includes few predicates that are specifically dedicated to pain (like hurt or ache); instead, the major part of the field is constituted by lexical units drawn from other semantic domains, which are applied to pain through processes of semantic derivation (like my eyes are burning, my throat is scratching). After discussing methodological considerations concerning data collection, the article first analyzes the semantic sources for pain predicates and addresses the issue of their typological consistency, based on data from over 20 languages. It is then demonstrated that the evolution of a pain meaning cannot be reduced to a merely semantic process, since the meaning shift may be accompanied by changes in the morphological, morphosyntactic and/or syntactic properties of the source verb. We suggest the term “re-branding” for complex meaning changes of this kind and discuss their theoretical relation to the well-established notions of metaphor and metonymy.