Analyzing word frequency distribution
Concept: Words occur with widely differing frequencies. Very few lexemes occur extremely often, while most lexemes occur only rarely.
This function determines and displays the word frequency distribution within the entire corpus or a partial corpus, with the option to examine the frequency distribution within
specific word classes.
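
The idea of a frequency distribution can be illustrated with a minimal Python sketch. It is not the tool's own implementation; the function name `frequency_spectrum` and the toy token list are assumptions made purely for illustration. The sketch counts how often each lexeme occurs and then counts how many lexemes fall into each frequency class.

```python
from collections import Counter

def frequency_spectrum(tokens):
    """Map each frequency class to the number of lexemes occurring that often.

    Hypothetical helper for illustration only.
    """
    lexeme_counts = Counter(tokens)          # lexeme -> number of occurrences
    return Counter(lexeme_counts.values())   # frequency -> number of lexemes

# Made-up token list, not real corpus data.
tokens = ["a", "a", "a", "a", "b", "b", "c", "d", "e"]
spectrum = frequency_spectrum(tokens)

for freq in sorted(spectrum):
    print(f"frequency {freq}: {spectrum[freq]} lexeme(s)")
```

Even in this tiny example, most lexemes sit in the lowest frequency class, while only one lexeme occurs often.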
When this function is called for the entire corpus, users can choose a text database from the drop-down list "Select text database"; the options are a (non-Demotic)
Egyptian corpus and a Demotic corpus of texts. When the function is called for part of the corpus, non-Demotic Egyptian is the default setting.
- The drop-down list "Run analysis for word category" restricts the procedure to specific word classes.
- Using the drop-down list "Run analysis for", users may have the analysis run for top-level lemmata within the hierarchy of the lemma list (suggested), or for individual lemmata only.
- The drop-down list "Results graph scaling" lets users choose a chart with a (suggested) bi-logarithmic (log-log) scale, a semi-logarithmic scale (only the Y-axis logarithmic), or a linear scale.
- A numerical printout of results may be obtained by using the check box "Print numerical results table?". In most cases, however, a printout will not be necessary.
Results will be represented in chart form, with the graph showing the distribution of word frequencies on the requested scale.
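
The three scaling options can be sketched as simple coordinate transformations. This is an illustration only: the function `scale_points`, the option strings, and the assignment of frequency to the X-axis and number of lexemes to the Y-axis are all assumptions, not taken from the tool itself.

```python
import math

def scale_points(spectrum, scaling="log-log"):
    """Return (x, y) chart points for a frequency spectrum.

    Assumed layout: x = frequency class, y = number of lexemes.
    scaling is one of "log-log", "semi-log" (only Y logarithmic), "linear".
    """
    if scaling not in ("log-log", "semi-log", "linear"):
        raise ValueError(f"unknown scaling: {scaling}")
    points = []
    for freq, n_lexemes in sorted(spectrum.items()):
        # X is logarithmic only on the log-log scale.
        x = math.log10(freq) if scaling == "log-log" else freq
        # Y is logarithmic on both the log-log and the semi-log scale.
        y = math.log10(n_lexemes) if scaling != "linear" else n_lexemes
        points.append((x, y))
    return points

# Made-up spectrum: 100 lexemes occur once, 10 occur ten times, 1 occurs a hundred times.
toy_spectrum = {1: 100, 10: 10, 100: 1}
loglog = scale_points(toy_spectrum, "log-log")
semilog = scale_points(toy_spectrum, "semi-log")
linear = scale_points(toy_spectrum, "linear")
```

On the log-log scale the made-up points fall on a straight line, which is why this scale suits distributions with a few very frequent and many rare lexemes.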
Numerical results will be presented in a table that lists the number of lexemes for each frequency class and shows what proportion the occurrences of these lexemes constitute relative to the entire
lexical inventory. These values are also displayed in cumulative form.
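
A table of this kind could be derived roughly as follows. The sketch is hedged: the function `results_table` and its column names are hypothetical, and because the exact definition of the proportion is not spelled out here, it reports both the share of lexemes and the share of occurrences as two plausible readings.

```python
def results_table(spectrum):
    """Build table rows from a mapping: frequency class -> number of lexemes.

    Hypothetical helper; column names are illustrative assumptions.
    """
    total_lexemes = sum(spectrum.values())
    total_tokens = sum(freq * n for freq, n in spectrum.items())
    rows = []
    cum_lexemes = 0
    cum_tokens = 0
    for freq in sorted(spectrum):
        n = spectrum[freq]
        cum_lexemes += n              # running count of lexemes
        cum_tokens += freq * n        # running count of occurrences
        rows.append({
            "frequency": freq,
            "lexemes": n,
            "lexeme_share": n / total_lexemes,
            "token_share": freq * n / total_tokens,
            "cum_lexeme_share": cum_lexemes / total_lexemes,
            "cum_token_share": cum_tokens / total_tokens,
        })
    return rows

# Made-up spectrum: three lexemes occur once, one occurs twice, one occurs four times.
rows = results_table({1: 3, 2: 1, 4: 1})
for row in rows:
    print(row)
```

The cumulative columns reach 1.0 in the last row, since by then every lexeme and every occurrence has been counted.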