
Data


All data published here are licensed under CC BY-SA.

Recommended citation:
Plecháč, Petr – Kolár, Robert (2017). The Corpus of Czech Verse: Source Data. Available at: <http://versologie.cz>.

Verse Forms
The file dcm.xml (~ 180 MB) contains structured information about verse forms (metres, numbers of feet, endings, formulae) of individual lines incorporated in the Corpus of Czech Verse.
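
Because a file of this size may not load comfortably into memory in one piece, a streaming parser can be used for a first look at its structure. The following is a minimal sketch in Python, not part of the published data. It assumes only that the file is well-formed XML and makes no assumption about element names; the function name `count_tags` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Stream through an XML file and count how often each element tag occurs.

    `source` may be a file path or a binary file-like object.
    """
    counts = Counter()
    # iterparse yields each element once its closing tag has been read,
    # so the whole document never has to be parsed up front.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        # Drop the element's text, attributes and children to limit memory use.
        elem.clear()
    return counts
```

Assuming dcm.xml has been downloaded to the working directory, `count_tags("dcm.xml").most_common()` lists the element names with their counts, which is a reasonable starting point before writing code against the actual schema. The same approach should apply to the other XML files on this page.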

Fixed Verse Forms
The file forms.xml (~ 1 MB) contains a structured list of the CCV poems that were identified as realisations of fixed verse forms (sonnet, limerick, etc.).

Rhymes
The file gunstick.xml (~ 120 MB) contains a structured inventory of all rhymes found in the Corpus of Czech Verse.

Keywords
The file hex.xml (~ 100 MB) contains a structured inventory of key and thematic words in the CCV poems.

Frequency lists
The frequency lists of Czech poetry contain word-frequency data for the poetry incorporated in the Corpus of Czech Verse. They give the frequencies of lemmata and words for individual poetry collections, for each author's entire subcorpus, and for the whole Corpus of Czech Verse.



This website was created with the support of the Czech Science Foundation as part of the projects P406/11/1825 (The History and Theory of the 19th Century Czech Verse) and 17-01723S (Stylometric Analysis of Poetic Texts), and with institutional support for the long-term conceptual development of a research institution (no. 68378068).
© 2018 Petr Plecháč