Analyzing the lexical gravity of a lemma

Concept: Each lexeme exerts a specific influence on which other words are selected in its environment. This influence is called lexical gravity. A word may affect the words chosen in its vicinity in two ways. First, semantically: for thematic reasons, semantically related words are preferred in the vicinity of certain words, and, as a consequence, other words occur there rarely or not at all. Second, syntactically/grammatically: words constrain their environment syntactically. A preposition, for instance, needs to be followed by a noun, a verbal noun, or a suffix pronoun.

Statistically, the lexical gravity of a word may be demonstrated as follows: If all citation co-texts of a particular lexeme (the node word) are aligned so that the node word itself occupies position 0, the words to the right of the node occupy positions +1, +2, etc., and the words to the left of the node positions -1, -2, etc. For each position, the type-token ratio (TTR) is the number of different lexemes (types) found in this position divided by the total number of text words (tokens) found in this position. The lower the TTR in a position, the lower the lexical variation in this position.
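The per-position count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by this function: the name positional_ttr, the representation of a citation as a (lemma list, node index) pair, and the English sample sentences are all hypothetical.

```python
from collections import defaultdict

def positional_ttr(citations, span=3):
    """Return {position: (types, tokens, ttr)} for positions -span..+span.

    citations: list of (lemmas, node_index) pairs (hypothetical format),
    where lemmas is the lemmatized co-text and node_index is the place
    of the node word in that list.
    """
    lemmas_at = defaultdict(list)
    for lemmas, node_index in citations:
        for offset in range(-span, span + 1):
            i = node_index + offset
            if 0 <= i < len(lemmas):          # skip positions outside the co-text
                lemmas_at[offset].append(lemmas[i])
    result = {}
    for offset in sorted(lemmas_at):
        tokens = len(lemmas_at[offset])       # all text words in this position
        types = len(set(lemmas_at[offset]))   # different lexemes in this position
        result[offset] = (types, tokens, types / tokens)
    return result

# Invented sample data: three co-texts of the node word "to".
citations = [
    (["the", "king", "goes", "to", "the", "temple"], 3),
    (["he", "comes", "to", "the", "city"], 2),
    (["she", "gives", "bread", "to", "the", "man"], 3),
]
table = positional_ttr(citations, span=2)
for position, (types, tokens, ttr) in table.items():
    print(f"{position:+d}: types={types} tokens={tokens} TTR={ttr:.2f}")
```

In this toy sample, position +1 is always filled by the same lexeme and therefore has as low a TTR as the node position itself, while positions -1 and +2 show full variation. With N citations, the plain TTR in position 0 is 1/N.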

The distribution of the TTR across the positions around the node word may be displayed as a graph. In position 0, which is by definition always occupied by the node word, the curve reaches its lowest point, near 0. Farther away from the node, in contrast, the TTR settles at a relatively homogeneous level typical of the general situation. Within the immediate vicinity of the node word, however, the curve dips in a pattern specific to each node word, showing the node word's restrictive influence on which lexemes are selected in its neighbourhood. (Note that, in the vicinity of the node word, the curve may also rise above its average level. This would be a case of what is known as 'negative' lexical gravity.)

The specific pattern of a word's lexical gravity may be used to adjust the size of the window span when running collocation analyses.
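One conceivable way of deriving a window span from a TTR curve is sketched below. It is a heuristic made up for illustration, not a procedure prescribed by this function: the name suggest_window, the parameters far_from and tolerance, and the sample values are all assumptions. The sketch only looks for a lowering of the curve and ignores 'negative' lexical gravity.

```python
def suggest_window(ttr, far_from=10, tolerance=0.9):
    """Suggest (left_span, right_span) from a {position: TTR} mapping.

    Hypothetical heuristic: the baseline is the mean TTR of all positions
    at least far_from steps away from the node (the mapping is assumed to
    contain such positions). Walking outward from the node, a position
    counts as being under the node word's influence while its TTR stays
    below tolerance * baseline.
    """
    far = [value for position, value in ttr.items() if abs(position) >= far_from]
    baseline = sum(far) / len(far)

    def reach(step):
        position = step
        while position in ttr and ttr[position] < tolerance * baseline:
            position += step
        return abs(position) - 1

    return reach(-1), reach(+1)

# Invented curve: flat at 0.8, with a dip next to the node word.
curve = {position: 0.8 for position in range(-15, 16)}
curve.update({0: 0.01, -1: 0.5, 1: 0.3, 2: 0.6, 3: 0.79})
left, right = suggest_window(curve)
print(left, right)  # 1 2
```

For this invented curve the sketch suggests a span of one position to the left and two to the right of the node word. The tolerance value decides how pronounced a dip has to be before it is counted and would need to be tuned to the material.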

Requesting a lemma's lexical gravity: When this function is called up, the number of the lemma used as the entry point is pre-set. This default may be changed as required. Using the drop-down list, users may choose to run the analysis for the top level of the lemma hierarchy (suggested) or just for individual lemmata. By checking the appropriate box, users may have the results displayed in numerical tables in addition to the graph. In most cases, the numerical results are not needed.

Results display: Results will be displayed as a graph showing the portion of the TTR curve within a range of 15 positions before and after the node word. The TTR is calculated bi-logarithmically. At the top of the graph, the number of node word citations (Belegstellen) this representation is based on is shown.

A numerical representation of the results includes a table indicating, for each position, the number of types, tokens, and the bi-logarithmic type-token ratio.
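A table of this kind might be produced as sketched below. The exact formula behind the bi-logarithmic TTR is not stated here; the sketch assumes the common definition log(types) / log(tokens). The function names bilog_ttr and format_table and the sample counts are hypothetical.

```python
import math

def bilog_ttr(types, tokens):
    """Assumed definition: log(types) / log(tokens).

    Returns None when tokens <= 1, because log(1) == 0 would make
    the ratio undefined.
    """
    if tokens <= 1:
        return None
    return math.log(types) / math.log(tokens)

def format_table(counts):
    """counts: {position: (types, tokens)}; returns a plain-text table."""
    lines = [f"{'pos':>4} {'types':>6} {'tokens':>7} {'log-TTR':>8}"]
    for position in sorted(counts):
        types, tokens = counts[position]
        ratio = bilog_ttr(types, tokens)
        shown = "n/a" if ratio is None else f"{ratio:.3f}"
        lines.append(f"{position:>+4d} {types:>6} {tokens:>7} {shown:>8}")
    return "\n".join(lines)

# Invented counts for three positions around a node word with 100 citations.
counts = {-1: (40, 100), 0: (1, 100), 1: (10, 100)}
print(format_table(counts))
```

Under this assumed definition, position 0 yields a value of exactly 0, since it contains a single type and log(1) is 0.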

