02 APR 2024
PoeTree version 0.0.2
- PoeTree.sl added
- PoeTree.de enriched with Deutsches Lyrik Korpus
Dataset comprising over 330,000 poems / 89,000,000 tokens in ten languages (Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian, and Spanish). Each corpus has been deduplicated, enriched with Universal Dependencies, provided with additional metadata, and converted into a unified JSON structure.
Universal Dependencies
Wikidata
VIAF
16 OCT 2023
PoeTree version 0.0.1
Dataset comprising over 300,000 poems / 84,000,000 tokens in nine languages (Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, and Spanish). Each corpus has been deduplicated, enriched with Universal Dependencies, provided with additional metadata, and converted into a unified JSON structure.
Universal Dependencies
Wikidata
VIAF
Supported by the Czech Science Foundation (GA23-07727S)