PoeTree is a standardized collection of poetry corpora comprising over 330,000 poems in ten languages (Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian, Spanish). Each corpus has been deduplicated, enriched with Universal Dependencies annotation, supplemented with additional metadata, and converted into a unified JSON structure.
The latest version of the full JSON collection is available at
PoeTree is also accessible via and through and libraries.
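As a rough illustration of working with a local copy of the JSON collection, the sketch below counts JSON files per subdirectory and lists the top-level keys of a single file, which is a cheap way to inspect the unified structure before writing any real processing code. The directory layout (one subdirectory per corpus, containing `.json` files) and the helper names `count_json_files` and `top_level_keys` are assumptions made for this example; they are not taken from the PoeTree documentation, and no field names are assumed.

```python
import json
from collections import Counter
from pathlib import Path


def count_json_files(root):
    """Count *.json files under each immediate subdirectory of `root`.

    Assumption: the collection is unpacked as one subdirectory per corpus.
    Files lying directly in `root` are grouped under ".".
    """
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*.json"):
        rel = path.relative_to(root)
        group = rel.parts[0] if len(rel.parts) > 1 else "."
        counts[group] += 1
    return counts


def top_level_keys(path):
    """Return the sorted top-level keys of one JSON document.

    Returns an empty list if the document is not a JSON object.
    """
    with open(path, encoding="utf-8") as fh:
        doc = json.load(fh)
    return sorted(doc) if isinstance(doc, dict) else []
```

Calling `count_json_files("poetree")`, where `"poetree"` is a placeholder for wherever the collection was unpacked, returns a `Counter` mapping each subdirectory name to its number of JSON files. Running `top_level_keys` on any one of those files shows which fields are actually present.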
PoeTree size measured by number of poems
PoeTree coverage measured by number of poems