Liberating Data: Insights from Laurence Bénichou at the Berlin Training Workshop



Berlin, 11 November 2025. The recent Data Liberation Workshop, organised in collaboration with DEST (Distributed European School of Taxonomy of CETAF), the European Journal of Taxonomy (EJT), Plazi, and GBIF, brought together experts and trainees in Berlin for two intensive days of hands-on training on how to extract, structure, annotate and reuse biodiversity data from scientific literature. The training followed an online theoretical session held on 15 October 2025, which introduced participants to the state of biodiversity publishing and the global move toward data liberation.

At the centre of the workshop’s message was a simple but powerful idea: publishing biodiversity research is not enough if the data cannot be found, accessed, or reused.
This principle was highlighted throughout the training by Laurence Bénichou, Head of the Scientific Publishing Department at the Muséum national d’Histoire naturelle (MNHN), Co-founder and Liaison Officer of the European Journal of Taxonomy, and one of the course trainers.

Why liberating data matters

According to Bénichou, the purpose of the workshop was clear: to help participants understand how to extract data from publications and why this matters for biodiversity science. Scientific articles are rich in data, from taxonomic treatments to specimen information, figures, maps and occurrences, but much of this content remains locked inside PDFs.

Bénichou emphasised that the goal is for biodiversity data to be FAIR: Findable, Accessible, Interoperable and Reusable. “If the data are published but cannot be retrieved, reused or found,” she explained, “their value is lost. It is like writing a book and leaving it in a basement.”

Ensuring that data can be extracted as soon as a paper is published is essential for enabling access, reuse, and wide dissemination. The workshop was therefore designed to demonstrate both the extraction process and the tools available to retrieve these data once they have been liberated.

A two-part training: from theory to practice

The workshop combined an online theoretical introduction with an intensive in-person practical component.

The online session, held on 15 October, set the scene by exploring the current state of biodiversity literature, the need to make taxonomic data machine-readable, and the global momentum toward open and reusable data. This provided participants with the conceptual background necessary to understand why data liberation is becoming central to modern biodiversity research.

The following two days in Berlin put these ideas into practice. Through hands-on exercises, participants learned how to structure and annotate biodiversity publications, extract information from both born-digital and legacy literature, and work with real data using XML-first workflows and tools and platforms such as GoldenGATE, TreatmentBank, the Biodiversity Literature Repository (BLR), GBIF, Biodiversity PMC and others.

This practical immersion allowed trainees not only to understand the principles behind data liberation but also to directly experience the tools and workflows that make it possible.

The mission of the European Journal of Taxonomy

Bénichou also highlighted the unique role of the European Journal of Taxonomy (EJT) in fostering data liberation. Founded in 2011, EJT is a diamond open-access journal, owned by ten European natural history institutions and endorsed by CETAF. Neither authors nor readers pay fees, ensuring that taxonomic knowledge remains fully accessible to the scientific community.

For Bénichou, this community-driven model is essential. Because the journal is collectively owned, the community can jointly decide which data to extract, how to extract them, and where to make them accessible.

She described EJT as “the best journal in taxonomy”, highlighting its success and its significance as a collaborative platform for shaping the future of biodiversity publishing.

She encouraged more institutions and experts to engage with the journal: “Having the journal in the hands of the community is essential to serve its needs. We welcome people to join us and participate.”

A collaborative training team

In addition to Bénichou, the workshop was delivered by a diverse group of experts representing key organisations in biodiversity data infrastructure:

  • Chris Le Coquet: French publisher and desk editor for EJT; digital projects lead on FAIR biodiversity data at MNHN
  • Donat Agosti: Co-founder of Plazi, specialist in transforming literature into FAIR data
  • Julia Giora: Head of Learning & Engagement, Plazi
  • Emilie Pasche: Research associate at HES-SO Geneva & SIB, involved in Biodiversity PMC
  • Markus Döring: Botanist and biodiversity informatician, GBIF Secretariat

Their combined expertise ensured that trainees received a complete overview of workflows spanning the entire data lifecycle, from publication to global dissemination.
