This is a post by Syn Ong, AI Policy Researcher at Authors Alliance.
Open access publishing has transformed the way research circulates. In principle, open access means that anyone, anywhere, can read and reuse scholarly work without financial, legal, or technical barriers. But in practice, many works labeled as “open” are quietly constrained by restrictions that limit how they can be used, especially by machines. Some of these restrictions are well known, such as the “NonCommercial” and “NoDerivatives” license terms that limit downstream uses. Increasingly, though, they also include fine-print Terms of Service that bar uses like text and data mining (TDM) or AI training. These additional constraints dilute the value of openness and conflict with its foundational definitions.
Creative Commons licenses have become the standard way to mark open works. But not all CC licenses support the same freedoms. Some publishers, such as Taylor & Francis and MIT Press, frequently apply CC BY-NC-ND licenses to books or articles described as “open.” This license bars others from adapting the work or using it commercially without permission, even though in some cases limited reuse or transformation could qualify as fair use under copyright law. For instance, a CC BY-NC-ND article generally cannot be translated, remixed, or used to train a machine learning model without separate authorization, though some research-driven applications might nonetheless fall within fair use.
A longstanding debate in the open-access community centers on exactly how “open” one must be to be “open access,” and many authors who choose NC/ND terms do so for compelling reasons – such as maintaining integrity of expression, avoiding commercial exploitation, or protecting sensitive material. Some NC/ND-licensed works still achieve substantial openness in access and impact. For example, MIT Press’s Direct to Open titles and UC Press’s Luminos series include CC BY-NC-ND books that are freely available online, widely cited, and used in classrooms, even if reuse is limited. These examples show that openness is not always all-or-nothing: access alone can meaningfully expand a work’s reach, though broader reuse remains constrained.
According to the Budapest Open Access Initiative, open access literature should be free to “read, download, copy, distribute, print, search, or link to… or use for any other lawful purpose.” Likewise, the Open Definition states that knowledge is open only if it can be “freely used, modified, and shared by anyone for any purpose.” Both frameworks make it clear that reusability is not a bonus feature of openness; it is essential. Works carrying ND or NC clauses may technically be open access under some publishers’ policies, but they fall short of these core principles.
More recently, a growing number of publishers have started imposing restrictions not through licenses, but through website Terms of Service. In 2024 and 2025, both Wiley and Elsevier added sweeping notices in their website footers: “All rights are reserved, including those for text and data mining, AI training, and similar technologies.” These restrictions apply even to articles released under permissive CC BY licenses. Similarly, the New England Journal of Medicine prohibits any use of its content for AI training in its legal terms, requiring explicit permission even when users have lawful access. MIT Press includes a “no AI training” clause within the downloadable PDFs of its open-access books, further muddying the waters. While such statements carry little legal force against users acting under open licenses or fair use, their presence conveys to users that the rights otherwise granted to them by CC licenses are actually much more limited.
That raises a serious problem. Creative Commons licenses include a “no additional restrictions” clause: licensors may not impose legal terms or technical barriers that block uses permitted by the license. A publisher that releases an article under CC BY but prohibits AI training is in direct conflict with that condition. Creative Commons’ own policy makes this clear: if you want to add extra restrictions, you shouldn’t call your license a Creative Commons license at all. When publishers add these restrictions anyway, they not only breach community norms but also mislead authors and readers about what is actually allowed.
Some publishers justify these restrictions as necessary protections against data scraping or AI misuse. But the result is that legitimate scholarly activities (like mining text to study linguistic trends or training models to identify bias in the literature) are chilled. The very same computational methods that help unlock insights from large bodies of research are often undermined by these restrictions, even though U.S. courts have recognized large-scale digitization and search/TDM as fair use in key cases. In an era of growing reliance on automated tools for research, these limitations are not just nuisances; they are obstacles to progress.
Authors have a role to play in preventing this. When publishing open access, authors should check not just the license offered, but also the publisher’s Terms of Service and any embedded file restrictions. A work labeled as CC BY shouldn’t come with hidden clauses that say “no AI” or “no TDM.” Institutions and funders can help by encouraging the use of open repositories and pushing back against licensing or contractual practices that conflict with open norms. Transparency matters. If additional restrictions are imposed, they should be clearly disclosed, and ideally, avoided altogether.
Open access was never meant to mean “read for free, but don’t touch.” Its promise was broader: to democratize knowledge and empower new uses of research, including uses we can’t predict. When publishers quietly reserve TDM and AI rights, or apply restrictive license terms by default, they close off precisely the kinds of innovation that openness was supposed to enable. Authors, readers, and research institutions should be wary of such barriers. If a work is called open, it should be open in fact, not just in form.

