
Is it Possible to use Ethical AI in Today’s Audiobook Production?

Juliana Rueda, Founder of Miut and MiVoz, Specialist in audiobook production and AI-driven audio innovation

The concept of ethical AI has been circulating in the publishing industry since the rise of artificial intelligence in 2022. It is presented as a benchmark, almost like a promised land to which publishers should aspire. But what does it really mean in the practice of audiobook production? And is it truly achievable?

In early 2024, a publisher presented me with a very specific dilemma. They had a backlist of nonfiction titles they wanted to produce as audiobooks. The problem was the budget: They didn’t have sufficient resources to undertake a production with traditional human narration and were exploring AI-based solutions. However, the options they found raised doubts. Some offered results of questionable quality; others raised ethical concerns regarding the origin of the voices used or the conditions under which the systems had been trained.

As the founder of Miut, an audiobook production company for the Spanish-speaking market with over a decade of experience, I was asked whether I knew of a reliable alternative. I didn’t. That question led me to investigate the AI narration market more thoroughly.

What I found was an immature and alarming landscape. There were some alternatives where the origin of the voice was not transparent or where voices had been used without explicit consent. In others, compensation for narrators was unclear or insufficient. Alongside this supply, a growing demand was evident: Publishers needed economically sustainable models for certain segments of their catalog, and many narrators wanted to participate in the conversation about how to integrate this technology without losing control over their professional identity.

In that context, I created MiVoz, an AI-based audiobook production company whose production model relies on the use of Authorized Voice Replicas (AVR) from actors already established in the Spanish-language audiobook market. The intention was not to replace human narration, but to explore an intermediate and complementary approach that combined three principles: explicit consent, traceability, and quality standards comparable to those of traditional production.

AVRs are based on samples recorded by the narrator themselves, with specific contractual authorization to generate a digital replica of their voice. Unlike generic synthetic voices created from large anonymous databases, this model allows for the clear identification of the voice’s origin and the transparent establishment of previously agreed-upon terms of use and compensation.

During the technical testing process, it became clear that replicas of licensed voice actors offered superior naturalness and interpretive consistency compared to many standard synthetic voice solutions. However, technology alone does not guarantee a satisfactory result. Pre-production, editing, quality control, and human supervision remain essential to maintaining the standard that listeners associate with audiobooks as a high-quality cultural product.

Implementing this model has been a learning process. Not all initial decisions were perfect, and it has been necessary to adjust procedures, contracts, and workflows. Constant dialogue with narrators and other industry professionals proved key to defining a framework that balances innovation and protection.

But even with these precautions, one might ask whether we can claim to have achieved “ethical AI.” The answer is likely not definitive. More than a static state, ethics is an ongoing process of review. Even if a specific model establishes clear conditions for consent and compensation, it remains essential to examine the technological foundations upon which it is built. What data are the systems trained on? What transparency do technology providers offer? What audit mechanisms exist?

In this regard, industry initiatives such as the labeling guidelines for AI-narrated audiobooks launched in 2024 by professional associations in the United Kingdom represent an important step. The proper identification of AI-generated content not only contributes to transparency but also promotes consumer education and strengthens trust in the product.

Available data underscores the relevance of this debate. In a recent survey conducted by Audible among European listeners, 80% said they choose an audiobook based on the narrator, and 90% perceived the audiobook as “premium” content. These figures reflect the extent to which the voice is not an incidental element, but central to the listening experience.

Precisely for this reason, the AI models that are developed must take this reality into account: It is not merely a matter of generating audio, but of preserving the identity and interpretive quality that the public already recognizes and values. In this sense, alternatives based on authorized replicas of established actors may better align with listener expectations than generic solutions disconnected from the professional market.

If the industry opts for a race toward extreme cost-cutting, it risks eroding that perception of value. Technology can help expand the catalog and reach new audiences, but only if a clear commitment to quality and transparency is maintained.

The integration of AI into audiobook production is not a question of “if,” but of “how.” Denying it does not seem realistic; adopting it without criteria doesn’t either. The current moment offers a limited opportunity to define standards that protect narrators, provide security to publishers, and maintain listeners’ trust.

Perhaps the more appropriate question is not whether AI can be ethical, but under what conditions we are willing to consider it as such. The answer will depend on the industry’s ability to establish clear frameworks, hold technology providers accountable, and avoid opaque or improvised solutions.

The future of the audiobook will remain deeply tied to the voice. The question is how we decide to integrate these new tools into that ecosystem. The decisions we make now will help define not only business models but also the cultural perception of the format in the coming years.

At MiVoz, we believe that this transformation can and must be built on consent, quality, and transparency. It is not a matter of our AVRs replacing human narration. Quite the contrary: it is a question of integrating them responsibly into the audiobook ecosystem. Nor does it make sense to ignore current technology: it is here to stay, and the decisions we make now give us the opportunity to shape the industry we want to build in the future.