
What the FIL Guadalajara debates reveal about metadata for academic books and how Thoth Open Metadata and SciELO Books respond to this challenge

by Amanda Ramalho and Toby Steiner

FIL Guadalajara 2025, one of the main international events in the Latin American publishing sector, took place in early December. Among the professional activities at the fair, the Encuentro de Editores Universitarios Iberoamericanos (Ibero-American University Publishers Meeting), in which Thoth Open Metadata participated, stood out: it brought together university publishers, researchers and representatives of publishing infrastructures to discuss the challenges and opportunities of academic publishing in the region.

Throughout the meeting, topics such as publishing communication, responsible science evaluation, multilingualism and the creation of the Ibero-American Council for University and Academic Publishing were addressed. In this context, one panel focused on presenting the results of the first phase of the Radiografías de la edición académica iberoamericana project, a collective initiative that seeks to map the production, circulation and visibility of academic books in the region.

When books have no data, they have no visibility

The analysis presented by the Radiografías project highlighted recurring structural issues related to editorial information management. Among the main problems identified are metadata inconsistency, lack of essential information, data fragmentation across multiple systems, and low interoperability between platforms.

Preliminary results point to three key challenges:

  1. Data fragmentation: the absence of standardised systems hinders the sharing, integration and discovery of academic content.
  2. Metadata inconsistency: high percentages of incomplete or missing information, such as authorship, institutional affiliations, types of work, persistent identifiers (DOI, ISBN), rights information and access channels.
  3. Low interoperability: the lack of common standards prevents books from circulating efficiently between libraries, indexers, open access platforms, and evaluation systems.

In practice, this scenario means that many academic books remain invisible to:

  • library discovery systems;
  • aggregators and indexing services;
  • scientific information systems;
  • and, above all, research evaluation processes.

The debate made it clear that, in the current context, the visibility of academic books does not depend merely on being published, but on the ability of their metadata to circulate in a structured, interoperable, and reusable manner across different scientific communication ecosystems.
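The kind of gap analysis described above can be approximated with a small completeness report. The following is a minimal sketch, not the Radiografías methodology: the field names and the sample records are hypothetical, chosen only to mirror the categories of missing information listed in the text.

```python
from collections import Counter

# Hypothetical field names mirroring the categories named in the text.
REQUIRED_FIELDS = [
    "title", "authors", "affiliations", "work_type",
    "doi", "isbn", "license", "access_url",
]


def is_missing(value):
    """Treat None, blank strings and empty collections as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, dict)):
        return len(value) == 0
    return False


def completeness_report(records, fields=REQUIRED_FIELDS):
    """Return, per field, the percentage of records in which it is missing."""
    if not records:
        return {name: 0.0 for name in fields}
    missing = Counter()
    for record in records:
        for name in fields:
            if is_missing(record.get(name)):
                missing[name] += 1
    return {name: round(100 * missing[name] / len(records), 1) for name in fields}


# Invented sample records, for illustration only.
sample = [
    {"title": "Book A", "authors": ["X"], "doi": "", "isbn": "9783161484100"},
    {"title": "Book B", "authors": [], "doi": "10.1234/example", "isbn": None},
]

report = completeness_report(sample)
for name, pct in report.items():
    print(f"{name}: {pct}% missing")
```

Running such a report at the publisher's end, before records leave for distributors or aggregators, is one way to catch gaps at the point of origin.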

Metadata as institutional infrastructure

One of the key points highlighted at the roundtable was the need to understand metadata as an institutional responsibility, rather than merely a technical or ancillary task. Decisions about how metadata is created, stored, and shared directly affect the autonomy of publishers, their long-term sustainability, and their ability to integrate with open scientific communication infrastructures.

From this perspective, open metadata does not represent an ideological position, but rather a governance choice. Keeping bibliographic records under institutional control allows for:

  • Ensuring data ownership and integrity
  • Reducing dependencies on proprietary systems
  • Ensuring consistency across different dissemination channels
  • Participating in cooperative, non-exclusive ecosystems

Beyond platforms: why interoperability matters

Publishing an open access book alone is not enough to guarantee its visibility and circulation, nor to ensure its discovery, if the associated data is incomplete, inconsistent, or isolated in technical silos.

What the Ibero-American publishing ecosystem increasingly demands are layers of infrastructure capable of:

  • Adopting widely used standards such as ONIX, MARC, and KBART.
  • Enabling automated data exchange through APIs.
  • Enabling simultaneous metadata flow to multiple destinations.

In this context, interoperability is no longer a differentiator but a basic requirement for the recognition of academic books.
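The idea of one record flowing to several destinations can be illustrated with a short sketch. It assumes a hypothetical in-house record (all values invented) and serialises it twice: as JSON for an API consumer, and as a tab-separated row whose column names are a small subset modelled on KBART. It is not a complete or validated KBART, ONIX, or MARC export.

```python
import csv
import io
import json

# Hypothetical master record; every value is invented for illustration.
record = {
    "title": "Example Monograph",
    "first_author": "Doe",
    "isbn_print": "9783161484100",
    "isbn_online": "",
    "url": "https://example.org/books/1",
    "publisher": "Example University Press",
    "published_online": "2025-01-15",
}

# Subset of column names modelled on KBART; a real file needs the full set.
KBART_COLUMNS = [
    "publication_title", "print_identifier", "online_identifier",
    "title_url", "first_author", "publisher_name", "publication_type",
    "date_monograph_published_online", "access_type",
]


def to_json(rec):
    """Serialise the record for a JSON-based API consumer."""
    return json.dumps(rec, ensure_ascii=False, sort_keys=True)


def to_kbart_like_tsv(records):
    """Write the records as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=KBART_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for rec in records:
        writer.writerow({
            "publication_title": rec["title"],
            "print_identifier": rec["isbn_print"],
            "online_identifier": rec["isbn_online"],
            "title_url": rec["url"],
            "first_author": rec["first_author"],
            "publisher_name": rec["publisher"],
            "publication_type": "monograph",
            "date_monograph_published_online": rec["published_online"],
            "access_type": "F",  # "F" marks free-to-read content in KBART
        })
    return buffer.getvalue()


json_text = to_json(record)
kbart_text = to_kbart_like_tsv([record])
print(json_text)
print(kbart_text)
```

Both outputs are derived from the same source record, so a correction made once propagates to every destination on the next export.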

Who bears responsibility for the data?

The data presented by Radiografías reveal a central weakness in Ibero-American academic publishing: the absence of structured and consistent open data at source.

In the publishing ecosystem, each actor plays a specific role. ISBN agencies record information, distributors share it, and aggregators index it. The quality of metadata, however, depends fundamentally on its creation at the point of origin: the publishers. It is at this point that the consistency, completeness, and standardisation of data can be ensured in a lasting way.

The high percentages of unavailable information identified by Radiografías (gaps in authorship, affiliations, persistent identifiers, and rights information, among others) are not always a sign of a lack of commitment on the part of publishers or of the absence of qualified teams. In many cases, they reflect structural limitations: publishers lack adequate infrastructure for managing metadata throughout the editorial workflow. Furthermore, data quality degrades as records pass through different intermediaries' systems, and certain intermediaries introduce artificial barriers to downstream reuse (e.g. by libraries) by applying restrictive copyright.

How Thoth Open Metadata and SciELO Books connect to this challenge

The discussions in Guadalajara pointed to the importance of solutions that integrate standards, governance, and the effective circulation of data as structural elements of academic publishing.

This is precisely where Thoth Open Metadata and SciELO Books connect, offering open and convergent infrastructures that respond to the same structural challenge: without consistent and open data, academic books do not circulate.

SciELO Books: a public infrastructure for academic books

For over 28 years, the SciELO Programme has consolidated a robust infrastructure for scientific journals, based on editorial standardisation, digital preservation, international interoperability and a complete metadata chain integrated into its bibliometric database. Over time, this experience has revealed a structural asymmetry: while journals had a consolidated infrastructure, academic books remained without an equivalent ecosystem. Until the early 2010s, academic books were published mainly in print format and, even when made available in PDF, this occurred without structured treatment of full texts or metadata capable of ensuring interoperability, traceability and international circulation. The result was fragmentation of information, low visibility and difficulty in inserting books into scientific discovery and evaluation systems.

In 2012, the SciELO Books collection was launched, adapting the same logic of standards, criteria and structured metadata already applied to journals to academic books. In 2025, a new phase began with the expansion to the SciELO Books Network, whose first collection under development was SciELO Books Mexico. During FIL Guadalajara 2025, the start of development of the SciELO Books Peru collection was also announced, expanding this model to a Latin American and Ibero-American scale.

Currently, SciELO Books operates on six key fronts, in cooperation with participating publishers:

  • Master record per book and per chapter
  • Production of accessible EPUBs, following accessibility and technical consistency requirements
  • DOI registration and management for books and chapters, with verification prior to publication
  • Review of consistency between ISBNs and the data presented in the final files
  • Automated export of metadata in formats such as ONIX and KBART
  • Digital preservation, ensuring long-term maintenance and availability
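One small, automatable part of an ISBN consistency review is checking that an ISBN-13 is well formed and that the value in the master record matches the one found in the final file. The sketch below is an illustration under that assumption and does not describe SciELO Books' actual workflow; the function names are hypothetical.

```python
def normalise_isbn(raw):
    """Strip the hyphens and spaces commonly used when printing ISBNs."""
    return raw.replace("-", "").replace(" ", "")


def is_valid_isbn13(raw):
    """Check length, digits, and the ISBN-13 check digit.

    Digits are weighted 1, 3, 1, 3, ... from left to right; the weighted
    sum, including the final check digit, must be a multiple of 10.
    """
    digits = normalise_isbn(raw)
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(
        int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits)
    )
    return total % 10 == 0


def isbn_matches(master_isbn, file_isbn):
    """True if the master record's ISBN is valid and equals the file's ISBN."""
    return (
        is_valid_isbn13(master_isbn)
        and normalise_isbn(master_isbn) == normalise_isbn(file_isbn)
    )


print(is_valid_isbn13("978-3-16-148410-0"))
print(is_valid_isbn13("978-3-16-148410-1"))
print(isbn_matches("978-3-16-148410-0", "9783161484100"))
```

A check like this only confirms that the number is internally consistent; whether the ISBN was actually assigned to that format still has to be verified against the publisher's own records.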

Thoth: fostering autonomy for publishers

At the same time, in the European context, the COPIM project reached a similar conclusion from a different perspective: many academic publishers, especially small and medium-sized ones, lacked open and sustainable tools to create, manage, and distribute metadata in a standardised way. The response to this challenge was the development of an open dissemination system now called Thoth, designed as an open, community-based infrastructure for the management and circulation of editorial metadata, without dependence on proprietary platforms.

Thoth offers:

  • Complete management: Metadata recorded in a standardised format, under the publisher's own control, ensuring complete and auditable access and sharing
  • Structured export in multiple formats: More than 15 formats available for free download, including ONIX, MARC, KBART, XML for Crossref, and custom platform-specific distribution schemes
  • Interoperability via open APIs: Libraries, repositories, distributors, institutional portals, and dashboards can consume metadata directly from the platform, without intermediaries, through two dedicated open APIs and an OAI-PMH endpoint
  • OMP plugin to export metadata directly to Thoth
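Because OAI-PMH is a standard harvesting protocol, a client request can be sketched without knowing anything specific about a given server. The example below is a minimal sketch: the base URL is a placeholder and not Thoth's actual endpoint, the sample response is invented, and `oai_dc` is used because it is the metadata format every OAI-PMH repository is required to support.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    present, it is sent together with the verb and nothing else.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def next_token(xml_text):
    """Return the resumption token from a response, or None on the last page."""
    root = ET.fromstring(xml_text)
    node = root.find(".//oai:resumptionToken", OAI_NS)
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


# Invented, heavily trimmed response used only to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

BASE_URL = "https://example.org/oai"  # placeholder, not a real endpoint
first_url = list_records_url(BASE_URL)
follow_up_url = list_records_url(BASE_URL, resumption_token=next_token(SAMPLE_RESPONSE))
print(first_url)
print(follow_up_url)
```

A harvester would repeat the follow-up request until `next_token` returns `None`, which signals that the full record set has been delivered.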

Thoth Open Metadata, the non-profit organisation responsible for developing and maintaining the Thoth system, also offers optional services such as automated distribution, website and catalogue hosting, creation of custom catalogues, provision of usage statistics across platforms, and integration with publishers' own systems.

Towards the end of Q1/2026, Thoth Open Metadata will be launching a complete overhaul of the system with a multilingual interface in Spanish and Portuguese, multilingual metadata, accessibility metadata, and improvements to APIs and ingestion and distribution flows.

Thoth contributes autonomy, portability, transparency, and scalability to publishers, positioning itself as a complementary infrastructure to existing publishing and dissemination platforms.

Building the future

The debates presented throughout this blog post converge on a central finding: the circulation of academic books increasingly depends on the ability of publishers to maintain effective control over the data describing their editorial production.

Having full access to one's own metadata — in an open, structured, exportable, and auditable form — is no longer a technical detail but has become a strategic element of academic publishing. When data exists only on third-party platforms, publishers lose visibility, traceability, and autonomy over how their books are represented in libraries, databases, and scientific discovery systems.

In this context, some questions become inevitable for university publishers:

  • Is it possible to export all metadata in standard formats, whenever necessary?
  • Is there access to the complete history of editorial records?
  • Is it possible to audit what information is being sent to libraries, aggregators, and indexing services?
  • Can the data be downloaded in its entirety, at any time, without dependence on specific suppliers?

The ability to answer these questions affirmatively is what defines, in practice, genuinely open metadata: not as an ideological position, but as an editorial and institutional necessity. In light of the diagnosis presented by the Radiografías de la edición académica iberoamericana project, three lines of action stand out as fundamental for academic books to circulate more widely, consistently, and sustainably:

  • Structured catalogue management: Each publisher needs a master record for each work, capable of linking editions, formats and distribution channels over time.
  • Truly open metadata: Data under institutional control, exportable, reusable and auditable, regardless of commercial platforms.
  • Adoption of shared standards: ONIX for the book trade, MARC for libraries, and KBART for knowledge bases are not optional: they are the lingua franca of the publishing ecosystem.
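The first of these lines of action, a master record that links editions, formats and distribution channels, can be pictured as a simple data structure. This is a minimal sketch with hypothetical class and field names and invented values; it does not reproduce the data model of Thoth, SciELO Books, or any other platform.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    """One concrete format of a work, e.g. paperback, PDF or EPUB."""
    format: str
    isbn: str
    channels: list = field(default_factory=list)


@dataclass
class Work:
    """Master record: one entry per work, linking all of its formats."""
    title: str
    doi: str
    edition: int = 1
    publications: list = field(default_factory=list)

    def add_publication(self, fmt, isbn, channels=()):
        """Attach a new format to the work and return it."""
        publication = Publication(fmt, isbn, list(channels))
        self.publications.append(publication)
        return publication

    def isbns_by_format(self):
        """Map each format to its ISBN."""
        return {p.format: p.isbn for p in self.publications}

    def all_channels(self):
        """Every distribution channel used by any format, sorted and de-duplicated."""
        return sorted({c for p in self.publications for c in p.channels})


# Invented values; the DOI and the PDF identifier are placeholders.
work = Work(title="Example Monograph", doi="10.1234/example")
work.add_publication("paperback", "9783161484100", ["bookshops"])
work.add_publication("pdf", "ISBN-PLACEHOLDER-PDF", ["repository", "library catalogue"])

print(work.isbns_by_format())
print(work.all_channels())
```

Keeping formats and channels attached to a single work record is what makes it possible to answer, at any time, which version of a book was sent where.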

The data produced by the Radiografías project offer an accurate diagnosis. The challenge now is collective and forward-looking: what infrastructure will be built so that, in the coming years, these gaps will cease to exist?

The editorial quality of academic books is directly linked to the quality of the data that describes them — and the quality of this data depends on the infrastructure choices made today.

DOI: https://doi.org/10.70950/sfsv8305


Header image by Amanda Ramalho, CC BY 4.0

UK registered social enterprise and Community Interest Company (CIC).

Company registration 14549556


Copyright © 2025 Thoth Open Metadata. Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 4.0 International license.