Petr Plecháč :: Corpus of Czech Verse and Beyond ASEEES 2016
Versification RG, ICL CAS | Institute of the Czech National Corpus

Detection of rhymes

Data (gold standard)

Corpus of Czech Verse
Time span: 18th to 20th century
Number of lines: ~ 2 500 000
Annotation of rhymes: automatic with detailed manual check (P. Plecháč & R. Kolár)
Phonetic transcription: KVĚTA
Authors / grouping of data: ~ 300 authors / decades when born

Corpus of English Poetry
Time span: 16th to 20th century
Number of lines: ~ 95 000
Annotation of rhymes: manual (S. Reddy & K. Knight)
Phonetic transcription: Mary TTS
Authors / grouping of data: 32 authors / centuries

Corpus of French Poetry
Time span: 16th to 17th century
Number of lines: ~ 25 000
Annotation of rhymes: manual (S. Reddy & K. Knight)
Phonetic transcription: Mary TTS
Authors / grouping of data: 9 authors / centuries

Groups, Corpus of Czech Verse (decade of birth_author):
1740_melezinek
1740_vavak
1750_leska
1750_stach
1760_hek
1760_puchmajer
...

Groups, Corpus of English Poetry (centuries_author):
14-15_more
14-15_wyatt
15-16_constable
15-16_daniel
15-16_drayton
15-16_fletcher
15-16_griffin
15-16_jonson
15-16_lodge
15-16_lovelace
15-16_milton
15-16_shakespeare
15-16_sidney
15-16_smith
15-16_spenser
16-17_dryden
16-17_drayton
16-17_finch
16-17_pope
16-17_prior
16-17_swift
17-18_byron
17-18_coleridge
17-18_goldsmith
17-18_shelley
17-18_turner
17-18_wordsworth
18-19_brooke
18-19_crosland
18-19_housman
18-19_chesterton
18-19_kipling
18-19_thomas

Groups, Corpus of French Poetry (centuries_author):
14-15_guillet
14-15_margueritte
14-15_marot
14-15_rabelais
14-15_sceve
15-16_bellay
15-16_bertaut
15-16_labe
15-16_ronsard
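
The group labels listed above appear to follow a `period_author` pattern, where the period is a decade of birth (Czech) or a pair of centuries (English, French). As a minimal sketch, assuming that pattern holds for every label, they could be split and grouped as follows. The function names `parse_label` and `group_by_period` are hypothetical and are not taken from the corpora's tooling.

```python
from collections import defaultdict


def parse_label(label):
    """Split a label such as '1740_vavak' or '15-16_milton' into (period, author).

    Assumption: the first underscore separates the period from the author name.
    """
    period, author = label.split("_", 1)
    return period, author


def group_by_period(labels):
    """Collect author names under their period, preserving input order."""
    groups = defaultdict(list)
    for label in labels:
        period, author = parse_label(label)
        groups[period].append(author)
    return dict(groups)


# A few labels copied from the lists above.
sample = [
    "1740_melezinek",
    "1740_vavak",
    "1750_leska",
    "15-16_drayton",
    "16-17_drayton",
]
grouped = group_by_period(sample)
```

Note that an author can fall into more than one group: `drayton` is listed under both `15-16` and `16-17` in the English list, so a grouping keyed by period keeps both entries.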