Five Publishers and Scott Turow Sue Meta and Mark Zuckerberg

Publishers and Scott Turow Accuse Meta of Pirating Books to Train Llama AI

Five leading publishers joined forces with bestselling author Scott Turow to file a class-action lawsuit against Meta Platforms and its CEO Mark Zuckerberg. The complaint, submitted Tuesday in the United States District Court for the Southern District of New York, charges the tech company with widespread copyright infringement.[1][2] It alleges Meta downloaded millions of unauthorized copies of books and journal articles from piracy websites to develop its Llama artificial intelligence models. The case highlights growing tensions between creators and tech firms over AI training data.

Key Players in the Legal Challenge

The plaintiffs include Elsevier Inc., Cengage Learning Inc., Hachette Book Group Inc., Macmillan Publishing Group LLC, and McGraw Hill LLC. These companies represent a mix of trade, educational, and academic publishing interests. Scott Turow, known for legal thrillers like Presumed Innocent, serves as a representative for individual authors alongside his publishing entity S.C.R.I.B.E. Inc.[1][2]

Together, they seek to represent a broad class of copyright holders whose works were allegedly used in the same way. The suit demands monetary damages, an injunction to halt further use of the materials, destruction of infringing copies, and a jury trial. The publishers argue that Meta's actions not only violated copyrights but also undermined potential licensing markets for AI training.[3]

Allegations of Systematic Piracy

Meta engineers reportedly turned to notorious piracy platforms when facing data shortages for Llama training. Sites such as Anna’s Archive, Library Genesis (LibGen), Sci-Hub, and Z-Library, along with the Books3 dataset of pirated books, provided the bulk of the content. The company also incorporated web-scraped datasets from sources including Common Crawl, which contained paywalled and pirated materials.[1][2]

During processing, Meta allegedly stripped copyright management information, such as copyright notices and author attributions, to conceal the works’ origins. According to the complaint, Llama models from version 1 through 3.1 memorized and regurgitated elements of these works. Examples cited include detailed summaries of Turow’s novels and style-mimicking outputs that could displace sales of the originals on platforms like Amazon.[1]

  • Verbatim excerpts from Cengage’s Calculus: Early Transcendentals.
  • Knockoff sequels to Turow’s Innocent.
  • Travel guides echoing Hachette authors’ voices.

Zuckerberg’s Hands-On Role

The complaint portrays Zuckerberg as a central figure in the decisions. As Meta’s founder, chairman, and controlling shareholder, he demanded solutions to training data gaps. Internal escalations to “MZ” led to abandoning licensing talks in favor of piracy, with one employee noting it preserved a “fair use strategy.”[2][3]

Turow described the conduct as “shameless, damaging and unjust behavior” in an email. He expressed outrage that Meta, among the world’s richest firms, used pirated copies of his and others’ books to generate competing content.[1] Maria A. Pallante, head of the Association of American Publishers, called for a “sustainable A.I. landscape” with protections for creators.[1]

Meta’s Defense and Wider Context

A Meta spokesperson countered that courts have recognized AI training on copyrighted material as fair use that fuels innovation. The company vowed to contest the suit vigorously.[4] This marks the first major publisher-led action against an AI developer, though an earlier lawsuit brought by authors against Meta failed in 2025.[3]

The dispute fits into a surge of litigation, including cases against OpenAI and Anthropic. Anthropic settled one for $1.5 billion last fall. As AI revenue projections soar, potentially into the trillions of dollars for Meta, these battles will shape how generative models gain access to content. The full complaint outlines the stakes for publishers and authors alike.[1]

About the author
Lucas Hayes