Here are the data sets, text corpora and publications we have produced and made available so far.

Text corpora and other data sets:

  1. UNESCO’s Standard-Setting Instruments
    • Standard-setting instruments, 1945-2019
      • This digital text corpus compiles the English-language texts of all Conventions, Declarations and Recommendations adopted by UNESCO’s General Conference (1945-2019).
      • Available for download on our GitHub repository “Legal Instruments”.
  2. The UNESCO Courier
    • Data sets:
      • Curated Courier article corpus, 1948-2020
        • This corpus consists of the texts of all articles published in the English-language edition of The UNESCO Courier between 1948 and 2020, and includes a comprehensive curated metadata index (document_index.csv).
        • Online at Zenodo.
        • POS-tagged and DTM versions of the curated article corpus are available in the project GitHub release.
      • Complete curated issue corpus, 1948-2020
        • This corpus compiles the complete text of all Courier issues (English-language edition), 1948-2020.
        • Online at Zenodo.
    • Analytical tools and supplementary materials:
      • Courier-Lab
        • Courier-Lab lets you explore the Courier text corpus with a variety of digital text analysis tools in a web-based Jupyter Notebook.
      • Quality control data and supplementary material
      • GitHub repository “Tagged Courier”
  3. Proceedings of the General Conference:

Project publications

B. Martin and F. Mohammadi Norén, “Nature and Culture in the Age of Environmental Crisis: Digital Analysis of a Global Debate in The UNESCO Courier, 1948-2020”, in A. Rockenberger, S. Gilbert and J. Tiemann, eds., DHNB2023 Conference Proceedings. Digital Humanities in the Nordic and Baltic Countries Publications 5, 1 (Oslo, 2023): 274-86. DOI:

Benjamin G. Martin, Fredrik Mohammadi Norén, Roger Mähler, Andreas Marklund and Oriane Martin, “The Curated UNESCO Courier 1.0: Annotated Corpora for Digital Research in the Global Humanities”, Journal of Open Humanities Data, 10: 20, pp. 1–13. DOI: