Petr Plecháč :: Corpus of Czech Verse and Beyond ASEEES 2016

Versification RG, ICL CAS | Institute of the Czech National Corpus

Detection of rhymes

Training set: "vertical collocations"

High precision, low recall (completely unsupervised):

	CZECH	ENGLISH	FRENCH
PRECISION:	94.6 %	96.0 %	99.5 %
RECALL:	14.9 %	14.7 %	2.7 %
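
For reference, the two measures in the table can be computed as in the minimal sketch below. It assumes that rhymes are represented as pairs of line indices and that a manually annotated gold standard is available; the function name `precision_recall` and the toy data are hypothetical and are not taken from the talk.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of rhyme pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of detected pairs that are real rhymes.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real rhymes that were detected.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented): each pair holds the indices of two rhyming lines.
gold = {(0, 1), (2, 3), (4, 5), (6, 7)}
predicted = {(0, 1), (2, 5)}

p, r = precision_recall(predicted, gold)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

A detector that returns only a few very safe pairs scores like the figures above: most of its output is correct, but most real rhymes are missed.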

Rhyme-type / rhyme-token ratio

Does the size of the rhyme repertory correlate with the extent of inflection?

10 random samples w/o replacement; n = 1,000; repeated 10 times:

	CZECH	ENGLISH	FRENCH
MEAN:	0.9784 - 0.9826	0.9199 - 0.9295	0.9562 - 0.9644
ST.DEV.:	0.0028 - 0.0062	0.0043 - 0.0099	0.0037 - 0.0080

10,000 random samples w/ replacement; n = 10,000:

	CZECH	ENGLISH	FRENCH
MEAN:	0.8885	0.6881	0.8052
ST.DEV.:	0.0040	0.0029	0.0033
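
The sampling procedure behind such figures can be sketched as follows. This is an illustration under stated assumptions, not the code used for the talk: a rhyme token is assumed to be one occurrence of a rhyming word pair, a rhyme type a distinct pair, and the names `type_token_ratio` and `sampled_ratios` as well as the toy data are hypothetical.

```python
import random
import statistics


def type_token_ratio(rhymes):
    """Number of distinct rhyme pairs divided by the number of occurrences."""
    return len(set(rhymes)) / len(rhymes)


def sampled_ratios(rhymes, n, repetitions, replacement, seed=0):
    """Mean and standard deviation of the ratio over random samples of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ratios = []
    for _ in range(repetitions):
        if replacement:
            sample = rng.choices(rhymes, k=n)  # with replacement
        else:
            sample = rng.sample(rhymes, n)     # without replacement
        ratios.append(type_token_ratio(sample))
    return statistics.mean(ratios), statistics.stdev(ratios)


# Toy data (invented): four rhyme tokens, three rhyme types.
toy = [("love", "dove"), ("love", "dove"), ("night", "light"), ("sea", "free")]

print(type_token_ratio(toy))
print(sampled_ratios(toy, n=4, repetitions=5, replacement=False))
print(sampled_ratios(toy, n=4, repetitions=5, replacement=True))
```

Sampling with replacement can draw the same token more than once, which by itself lowers the ratio. This is one reason why the two tables are not directly comparable.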