
Hex – Keywords

The Hex application can search the Corpus of Czech Verse for texts containing a user-specified keyword or thematic word, or display all keywords and thematic words found in a user-specified set of texts.

Under the Keywords method, a lemma counts as a keyword of a poem if its frequency in the poem is statistically significantly higher than its frequency in the Corpus of Czech Verse as a whole. Significance is verified with both the χ² test with Yates' correction and the log-likelihood test. The user can choose whether the tests operate at the α = 0.001 significance level (i.e., with a 0.1% risk that a lemma whose high frequency is due to chance alone is declared a keyword) or at α = 0.01 (a 1% risk of the same). The user can also specify which parts of speech are excluded from the keyword analysis (by default, only nouns, adjectives, and verbs are included), set the minimum number of occurrences a lemma must have in the poem to be considered a keyword, and select the reference corpus (i.e., whether the values are compared with the whole corpus or only with the works of the given author).
For keyword analysis of your own texts, we recommend the KWords application, which was developed by the Institute of the Czech National Corpus (Faculty of Arts, Charles University) and which served as our inspiration for Hex.
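The keyword test described above can be sketched roughly as follows. This is only an illustration, not the code behind Hex: the function names, the layout of the 2×2 table (poem versus reference corpus, lemma versus all other tokens), and the rule that both tests must exceed the critical value are assumptions. The critical values are those of the χ² distribution with one degree of freedom.

```python
import math

# Critical values of the chi-square distribution with 1 degree of freedom.
CRITICAL = {0.01: 6.635, 0.001: 10.828}


def yates_chi2(a, b, c, d):
    """Chi-square with Yates' correction for the 2x2 table
    a = lemma in poem,      b = other tokens in poem,
    c = lemma in reference, d = other tokens in reference."""
    n = a + b + c + d
    den = (a + b) * (c + d) * (a + c) * (b + d)
    if den == 0:
        return 0.0
    # The correction subtracts n/2 from |ad - bc|, never going below zero.
    return n * max(abs(a * d - b * c) - n / 2, 0) ** 2 / den


def log_likelihood(a, b, c, d):
    """Log-likelihood ratio statistic (G2) for the same 2x2 table."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    g = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:  # empty cells contribute nothing
            g += observed * math.log(observed * n / (row_total * col_total))
    return 2 * g


def is_keyword(freq_poem, size_poem, freq_ref, size_ref, alpha=0.001, min_freq=1):
    """Hypothetical decision rule: the lemma must reach the minimal frequency,
    be relatively more frequent in the poem than in the reference corpus,
    and pass both tests at the chosen significance level (an assumption)."""
    if freq_poem < min_freq:
        return False
    if freq_poem / size_poem <= freq_ref / size_ref:
        return False
    table = (freq_poem, size_poem - freq_poem, freq_ref, size_ref - freq_ref)
    critical = CRITICAL[alpha]
    return yates_chi2(*table) > critical and log_likelihood(*table) > critical
```

For instance, under these assumptions `is_keyword(5, 100, 50, 100000)` asks whether a lemma occurring 5 times in a 100-token poem and 50 times in a 100,000-token reference corpus qualifies at α = 0.001.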

Under the Thematic Words (h-point) method, a lemma counts as a thematic word of a poem if its absolute frequency in the poem is higher than its rank in the poem's rank-frequency distribution (that is, it occurs in the poem more times than its frequency rank); cf. Čech-Popescu-Altmann 2014.
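The frequency-versus-rank criterion can be illustrated with a short sketch. It is not the Hex implementation: the function name is hypothetical, the input is assumed to be an already lemmatized poem, and ties in frequency are assumed to receive consecutive ranks in alphabetical order, which the description above does not specify.

```python
from collections import Counter


def thematic_words(lemmas):
    """Return the lemmas whose absolute frequency exceeds their rank
    in the rank-frequency distribution of the given poem."""
    counts = Counter(lemmas)
    # Rank 1 is the most frequent lemma; ties are broken alphabetically (assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        lemma
        for rank, (lemma, freq) in enumerate(ranked, start=1)
        if freq > rank
    ]


# Toy lemmatized "poem" (diacritics omitted for simplicity).
poem = ["noc"] * 4 + ["luna"] * 3 + ["srdce"] * 3 + ["sen"] * 2 + ["den"]
print(thematic_words(poem))
```

In this toy example, "noc" (frequency 4, rank 1) and "luna" (frequency 3, rank 2) pass the criterion, while "srdce" (frequency 3, rank 3) does not, because its frequency only equals its rank.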


This website was created with the support of the Czech Science Foundation as part of projects P406/11/1825 (The History and Theory of the 19th Century Czech Verse) and 17-01723S (Stylometric Analysis of Poetic Texts), and with support for the long-term conceptual development of a research institution (no. 68378068).
© 2018 Petr Plecháč