Author Archives: Dave Hansen

Hachette v. IA Amicus Briefs: Highlight on Privacy and Controlled Digital Lending

Posted January 16, 2024

Photo by Matthew Henry on Unsplash

Over the holidays you may have read about the amicus brief we submitted in the Hachette v. Internet Archive case about library controlled digital lending (CDL), which we’ve been tracking for quite some time. Our brief was one of 11 amicus briefs filed that explained to the court the broader implications of the case. Internet Archive itself has a short overview of the others already (representing 20 organizations and 298 individuals–mostly librarians and legal experts). 

I thought it would be worthwhile to highlight some of the important issues identified by these amici that did not receive much attention earlier in the lawsuit. This post is about the reader’s privacy issues raised by several amici in support of Internet Archive and CDL. Later this week we’ll have another post focused on briefs and arguments about why the district court inappropriately construed Internet Archive’s lending program as “commercial.” 

Privacy and CDL 

One aspect of library lending that’s really special is the privacy that readers are promised when they check out a book. Most states have special laws that require libraries to protect readers’ privacy, something that libraries enthusiastically embrace (e.g., see the ALA Library Bill of Rights) as a way to help foster free inquiry and learning among readers. Unlike Amazon, which keeps and tracks detailed reader information when you buy an ebook (dates, times, what page you spent time on, what you highlighted), libraries strive to minimize the data they keep on readers to protect their privacy. This protects readers from data breaches and from third-party demands for that data. 

The brief from the Center for Democracy and Technology, Library Freedom Project, and Public Knowledge spends nearly 40 pages explaining why the court should consider reader privacy as part of its fair use calculus. Represented by Jennifer Urban and a team of students at the Samuelson Law, Technology and Public Policy Clinic at UC Berkeley Law (disclosure: the clinic represents Authors Alliance on some matters, and we are big fans of their work), the brief masterfully explains the importance of this issue. From their brief, below is a summary of the argument (edited down for length): 

The conditions surrounding access to information are important. As the Supreme Court has repeatedly recognized, privacy is essential to meaningful access to information and freedom of inquiry. But in ruling against the Internet Archive, the district court did not consider one of CDL’s key advantages: it preserves libraries’ ability to safeguard reader privacy. When employing CDL, libraries digitize their own physical materials and loan them on a digital-to-physical, one-to-one basis with controls to prevent redistribution or sharing. CDL provides extensive, interrelated benefits to libraries and patrons, such as increasing accessibility for people with disabilities or limited transportation, improving access to rare and fragile materials, facilitating interlibrary resource sharing–and protecting reader privacy. For decades, libraries have protected reader privacy, as it is fundamental to meaningful access to information. Libraries’ commitment is reflected in case law, state statutes, and longstanding library practices. CDL allows libraries to continue protecting reader privacy while providing access to information in an increasingly digital age. Indeed, libraries across the country, not just the Internet Archive, have deployed CDL to make intellectual materials more accessible. And while increasing accessibility, these CDL systems abide by libraries’ privacy protective standards. 

Commercial digital lending options, by contrast, fail to protect reader privacy; instead, they threaten it. These options include commercial aggregators—for-profit companies that “aggregate” digital content from publishers and license access to these collections to libraries and their patrons—and commercial e-book platforms, which provide services for reading digital content via e-reading devices, mobile applications (“apps”), or browsers. In sharp contrast to libraries, these commercial actors track readers in intimate detail. Typical surveillance includes what readers browse, what they read, and how they interact with specific content—even details like pages accessed or words highlighted. The fruits of this surveillance may then be shared with or sold to third parties. Beyond profiting from an economy of reader surveillance, these commercial actors leave readers vulnerable to data breaches by collecting and retaining vast amounts of sensitive reader data. Ultimately, surveilling and tracking readers risks chilling their desire to seek information and engage in the intellectual inquiry that is essential to American democracy. 

Readers should not have to choose to either forfeit their privacy or forgo digital access to information; nor should libraries be forced to impose this choice on readers. CDL provides an ecosystem where all people, including those with mobility limitations and print disabilities, can pursue knowledge in a privacy-protective manner. . . . 

An outcome in this case that prevents libraries from relying on fair use to develop and deploy CDL systems would harm readers’ privacy and chill access to information. But an outcome that preserves CDL options will preserve reader privacy and access to information. The district court should have more carefully considered the socially beneficial purposes of library-led CDL, which include protecting patrons’ ability to access digital materials privately, and the harm to copyright’s public benefit of disallowing libraries from using CDL. Accordingly, the district court’s decision should be reversed.

The court below considered CDL copies and licensed ebook copies as essentially equivalent and concluded that the CDL copies IA provided acted as substitutes for licensed copies. Authors Alliance’s amicus brief points out some of the ways that CDL copies actually differ significantly from licensed copies. It seems to me that this additional point about protection of reader privacy–and the protection of free inquiry that comes with it–is exactly the kind of distinguishing public benefit that the lower court should have considered but did not. 

You can read the full brief from the Center for Democracy and Technology, Library Freedom Project, and Public Knowledge here. 

Authors Alliance Amicus Briefs: Defending Free Expression from Trademark, Social Media, and Copyright Law Challenges

Posted December 8, 2023
Photo by Claire Anderson on Unsplash

The cases that threaten authors’ rights aren’t always obvious. You might have noticed in the last year that we’ve filed amicus briefs in some unusual ones—for example, a trademark lawsuit about squishy dog toys, or a case about YouTube recommendation algorithms. It’s often true that cases like these raise legal questions that extend well beyond their facts, and decisions in these cases can have unintended consequences for authors. As part of our mission of speaking up on behalf of authors for the public interest, we file amicus briefs in these cases to help courts craft better, more informed decisions that account for the interests of authors. 

In the last few weeks Authors Alliance has joined with several other organizations to file amicus briefs in three cases like these: 

Hermès v. Rothschild: Free Expression and Trademarks

The first is an amicus brief we joined in Hermès International v. Rothschild. Our brief is in support of Mason Rothschild, a digital artist who was sued by Hermès, creator of the famous “Birkin bag,” for allegedly infringing Hermès’s trademark. Rothschild created a series of NFTs mimicking the bag that he called “MetaBirkins,” which, he argues, comments on the brand, consumerism, luxury goods and so on. Rothschild lost in the court below, and the case is currently before the Second Circuit Court of Appeals. 

Our amicus brief—drafted by the Harvard Cyberlaw Clinic and joined by Authors Alliance, MSCHF, CTHDRL, Alfred Steiner, and Jack Butcher—argues that such uses are protected by the test announced in Rogers v. Grimaldi, a threshold test designed to protect First Amendment interests in the trademark context, allowing courts to quickly resolve trademark litigation when trademarks are used in expressive works unless there is “no artistic relevance to the underlying work whatsoever, or, if it has some artistic relevance, unless the [second work] explicitly misleads as to the source or the content of the work.” Our brief argues that Rogers remains good law after the United States Supreme Court’s recent decision in Jack Daniel’s Props. v. VIP Products and that a creator’s intent to sell their work (in this case selling NFTs) is not relevant when balancing trademark owners’ and creators’ rights. The amici we joined with represent artists, creators, and organizations that are concerned that a ruling in favor of Hermès will stifle creators’ ability to comment on popular brands and companies. 

Warner Chappell v. Nealy: Copyright Damages

This is a case before the U.S. Supreme Court raising questions about how far back courts can look when calculating monetary damages in copyright infringement lawsuits. The dispute in this case arose between Nealy, owner of an independent record label that released a number of albums in the 1980s, and Warner Chappell, who Nealy claims engaged in unauthorized reproduction and distribution of his works for years. Nealy claims he didn’t learn of the violation until 2016. He filed suit in 2018 and sought monetary damages for uses going back to 2008. Warner Chappell argues that the Copyright Act’s statute of limitations bars Nealy from recovering damages going that far back. 

The legal question in this case is whether under the Copyright Act’s “discovery accrual” statute of limitations rule, a copyright plaintiff can recover damages for acts that allegedly occurred more than three years before the filing of a lawsuit. Lower courts have held that for actually filing a lawsuit, the statute of limitations clock starts to run based on when a plaintiff discovers the alleged infringement. This case raises a related question of how far back courts should look when assessing damages—just three years from the date of filing the suit, or an indeterminate period of time as long as it was within three years of the plaintiff discovering the harm? 
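The two positions can be sketched as simple date logic. This is a deliberately simplified model of the competing rules as described above, not a statement of the law; the function names and the month-and-day values are hypothetical, since the post gives only the years 2008, 2016, and 2018.

```python
from datetime import date

LIMITATIONS_YEARS = 3  # the three-year period discussed in the post


def years_before(d, years):
    """Return the date `years` years before `d` (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def suit_is_timely(filed, discovered):
    """Discovery accrual: the clock starts when the plaintiff discovers the infringement."""
    return years_before(filed, LIMITATIONS_YEARS) <= discovered <= filed


def damages_reachable(act, filed, discovered, rule):
    """Can damages be recovered for an infringing act that occurred on `act`?

    rule="lookback":  only acts within three years before filing count.
    rule="discovery": any earlier act counts, as long as the suit itself is timely.
    """
    if act > filed or not suit_is_timely(filed, discovered):
        return False
    if rule == "lookback":
        return act >= years_before(filed, LIMITATIONS_YEARS)
    if rule == "discovery":
        return True
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical dates chosen only to match the years mentioned in the post.
filed = date(2018, 6, 1)
discovered = date(2016, 6, 1)
old_act = date(2008, 6, 1)
recent_act = date(2017, 6, 1)

for rule in ("lookback", "discovery"):
    print(rule,
          damages_reachable(old_act, filed, discovered, rule),
          damages_reachable(recent_act, filed, discovered, rule))
```

Under the lookback rule only the 2017 act is reachable; under the broader reading, the 2008 act is reachable too, which is the gap the case asks the Court to resolve.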

We joined an amicus brief with EFF, the American Library Association, and the Association of Research Libraries in support of Warner Chappell. EFF did most of the heavy lifting for this brief (thank you!), making the case that a damages regime that extends indeterminately into the past will stifle creativity and encourage copyright trolls. A three-year lookback period is enough. 

We’ve long argued that copyright’s damages regime needs to be reformed for authors. With statutory damage awards of up to $150,000 per work infringed, the specter of such crippling liability can chill even non-infringing and socially beneficial acts of authorship, dissemination, archiving, and curation. For authors, it takes little imagination to see how problematic it would be if opportunistic copyright litigants with flimsy claims could leverage decades-old acts (e.g., an image reproduced in a blog post, article, or book pursuant to fair use) to extract large damage awards spanning many years. If the court were to allow damages to reach back indeterminately, we argue that copyright trolls would be emboldened, hampering creativity and harming creators while providing them little forward-looking benefit for protection of their own works. 

NetChoice v. Paxton and Moody v. NetChoice: First Amendment and Online Platform Regulation

These cases have received a ton of attention, in part because they are so politically charged. The basis of the suits is a challenge brought by NetChoice and CCIA, two internet and technology industry groups, against laws passed in Texas and Florida that attempt to regulate how large social media websites moderate speech on their platforms. Each of those laws is ostensibly designed to protect the speech of users by limiting how platforms can remove or otherwise moderate their posts, and each was passed in response to accusations of political bias.

On their faces these laws sound appealing—authors along with many other users are frustrated with opaque decision making on platforms about why their posts may be taken down, demonetized, or deprioritized by platform algorithms. These are real problems, but in our view, the right solution is not government-dictated content moderation rules. Authors use a wide variety of online platforms and rely heavily on content moderation to ensure that their views are not drowned out by spam, lies, or trolls. 

For our amicus brief, we joined EFF along with the National Coalition Against Censorship, Woodhull Freedom Foundation, Fight for the Future, and the First Amendment Coalition. We argue as follows: The First Amendment right of social media publishers to curate and edit the user speech they publish, free from government mandates, results in a diverse array of forums for users, with unique editorial views and community norms. Although some internet users are understandably frustrated and perplexed by the process of “content moderation,” by which sites decide which users’ posts to publish, recommend, or amplify, it’s on the whole far best for internet users when the First Amendment protects the sites’ rights to make those curatorial decisions. This First Amendment right to be editorially diverse does not evaporate the moment a site reaches a certain state-determined level of popularity. But both Texas House Bill 20 (HB 20) and Florida Senate Bill 7072 (SB 7072) take those protections away and force popular sites to ignore their own rules and publish speech inconsistent with their editorial vision, distorting the marketplace of ideas.

Content moderation by online intermediaries is an already fraught process, and government interjection of itself into that process raises serious practical and First Amendment concerns. Inconsistent and opaque private content moderation is a problem for users. But it is one best addressed through self-regulation and regulation that doesn’t retaliate against the editorial process.

An Open Letter Regarding Copyright Reform on Behalf of South African Authors

Posted September 25, 2023
Photo by Jacques Nel on Unsplash

Today we are very pleased to share an open letter regarding copyright reform on behalf of South African authors. The letter is available here and is also available as a PDF (with names as of today) here.

The letter comes at a critical decision-making moment for South Africa’s Copyright Amendment Bill, which has been debated for years (read more here and here on our views). We believe it is important for lawmakers to hear from authors who support this bill, and in particular to hear from us about why we view its fair use provisions and author remuneration provisions so positively.

We welcome other South African authors to add their names to the letter to express their support. You can do so by completing this form.


Assessing the U.S. Copyright Small Claims Court After One Year

Posted September 18, 2023

Authors Alliance members will recall the series of posts we’ve made about the United States’s new copyright small claims court. Below is a post by Dave Hansen and Authors Alliance member Katie Fortney, based on a forthcoming article we recently posted assessing how this court has fared in its first year of operations. This post was originally published on the Kluwer Copyright Blog.

In June 2023 the U.S. Copyright Office celebrated the one-year anniversary of operations of the Copyright Claims Board (“CCB”), a novel small claims court housed within the agency with a budget request for $2.2 million in ongoing yearly costs. Though not entirely unique (e.g., the UK’s IP Enterprise Court has been described as filling a similar role since 2012), the CCB has been closely watched and hotly debated (see here, here, and here).

The CCB was preceded by years of argument about the benefits and risks of such a small claims court.  Proponents argued that the CCB would offer rightsholders a low-cost, efficient alternative to litigation in federal courts (which can easily cost over $100,000 to litigate), allowing small creators to more effectively defend their rights. Opponents feared that the CCB would foster abuse, encouraging frivolous lawsuits while creating a trap for unwary defendants.

We set out to assess these arguments in light of data on the CCB’s first year of operation. Our analysis is explored in more detail in our article here, forthcoming in the Journal of the Copyright Society of the USA, with the data used for the article available here. This post summarizes that article, which is itself based on an empirical review of the CCB’s first year of operations using data extracted from the CCB’s online filing system for the 487 claims filed with the court between June 2022 and June 2023.

How the CCB Works

To assess the work of the CCB, it’s first important to understand how the new court works. For claimants to successfully pursue a claim, they must first pass three hurdles:

  • their claim must be compliant, which means that it must include some key information regarding, e.g., ownership of a copyright, access to the work by the respondent in order to copy it, and substantial similarity between the allegedly infringing copy and the original;
  • their claim must also be properly served or delivered to the respondent, following the specific procedures that the Copyright Office has established;
  • the claimant must wait 60 days to see if the respondent decides to opt out of the proceedings (in which case the claimant can refile in the more expensive but more robust federal district court).

Once the opt-out window has passed, the proceeding becomes “active” and a scheduling order is issued. Then the parties can engage in discovery, have hearings and conferences, and eventually receive a final determination where the CCB may award damages.

CCB By the Numbers

In the CCB’s first year, 487 claims were filed. However, only 43 of them–less than 9%–had been issued scheduling orders and made it to the active phase by June 15, 2023.

Meanwhile, 302 cases had been closed, most of them dismissed without prejudice (meaning the case did not reach the merits and the claimant could choose to file again). The remaining claims were either awaiting review by the CCB, or waiting for an action from the claimant like filing an amended claim or filing proof of service.

Though the CCB gives claimants multiple opportunities to amend their complaint to fix problems with it (even offering detailed and helpful suggestions on how to fix those problems), over 150 claims were dismissed because the claimant did not file a proper claim. Failure to state facts sufficient to support Access and Substantial Similarity were common problems, showing up about 110 times each in CCB orders to amend (sometimes in the same order to amend). In some cases, however, there was no way to fix the complaint. For example, 35 claims were trying to pursue cases against foreign respondents, over whom the CCB has no jurisdiction. And over 100 claims were copyright infringement claims where the claimant hadn’t filed for copyright registration of the work allegedly infringed (a prerequisite to filing).

Claimants also had problems with service: 60 claims were dismissed in the first year because claimants didn’t file documentation showing that they’d accomplished valid service. Finally, opt-out (which some proponents of the CCB feared would undermine the court) is an important but much smaller pathway out of the CCB: it accounted for 35 dismissals.
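To put these counts on a common scale, here is a minimal sketch that expresses several of the figures above as a share of all 487 filings. The counts come from this post; the percentages other than the "less than 9%" figure are simple arithmetic added here for illustration, and the categories are neither exhaustive nor guaranteed to be mutually exclusive.

```python
# Figures as reported in the post (CCB claims filed June 2022 to June 2023).
TOTAL_CLAIMS = 487

counts = {
    "reached active phase": 43,
    "closed": 302,
    "dismissed for service problems": 60,
    "dismissed after opt-out": 35,
}


def share(count, total=TOTAL_CLAIMS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {TOTAL_CLAIMS} ({pct}%)")
```

On these numbers, roughly 62% of claims had been closed within the first year, while opt-outs accounted for about 7% of all filings.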

Because copyright is technical and complicated, it is perhaps not surprising that having a lawyer helps avoid dismissal: 90% of claims from represented claimants had been certified as compliant; for claims from self-represented claimants, only 46% were compliant. Self-represented claimants account for over 70% of claims filed, but only 40% of those that make it to the active phase.

Looking more closely at the claimants themselves, we do see that the CCB system is being used by aggressive and prolific copyright litigants, but we haven’t seen the volume of copyright-troll litigation seen in the past in federal district courts. This may be in part because the Copyright Office took these concerns seriously and created rules to discourage it, such as limiting the number of claims a plaintiff can file within one year. The number of repeat filers was low – only nine filers had more than five claims. Those include, however, 17 claims filed by Higbee and Associates (sometimes referred to as a “troll” though the label may not exactly fit), and 20 by David C. Deal (another known and aggressive serial copyright litigant). And the only case in which the CCB had issued an order was in favor of David Oppenheimer, who has separately filed more than 170 copyright suits in federal courts.

Because the process has been so slow, it’s difficult to evaluate how the CCB is working for respondents. Opponents of the CCB feared that its ability to make default determinations (issuing monetary awards when the respondent never shows up) could be a trap for the unwary. The CCB has issued only two such determinations so far (both in August 2023, for $3000 each), and only one final determination that wasn’t the result of a default, withdrawal, or settlement. So, it’s too early to tell how common defaults will be. However, they will continue to be an issue to watch: in the first year, respondents were as likely to end up on the path to default as they were to participate in a proceeding.

Our Takeaways and Conclusion

On the one hand, we haven’t seen rampant abuse of the system. To be sure, serial copyright litigants are actively using the CCB, but in numbers far fewer than previously seen even in federal district court. And damage awards have been modest.

However, it also seems that the CCB has not achieved its promised efficiency for small litigants–for most claimants the system seems to be too complicated and slow, with the CCB only issuing a final determination in a single case in its entire first year, and the vast majority of claims dismissed for failure to adequately comply with CCB rules. The CCB has already gone to great lengths to explain the process and to help claimants correct errors early in the process. It may be hard for the CCB to adjust its rules to lower barriers unless it is willing to sacrifice basic procedural safeguards for respondents (something we think it should not do). Despite the hope of advocates and legislators and the admirable efforts of those working at the CCB, the early results lead us to think that it may just be that complex copyright disputes are ill-suited for a self-service small claims tribunal.

Coalition Letter to Congress on Copyright and AI

Posted September 11, 2023

Photo by Chris Grafton on Unsplash

Earlier today Authors Alliance joined a broad coalition of public interest organizations, creators, academics, and others in a letter to members of Congress urging caution when considering proposals to revamp copyright law to address concerns about artificial intelligence. As we explained previously, we believe that copyright law currently has the appropriate tools needed to both protect creators and encourage innovation.  

As the letter states, the signatories share a common interest in ensuring that artificial intelligence meets its potential to enrich the American economy, empower creatives, accelerate the progress of science and useful arts, and expand humanity’s overall welfare. Many creators are already using AI to conduct innovative new research, address long-standing questions, and produce new creative works. Some, such as some of these artists, have used this technology for many years.

So our message is simple: existing copyright doctrine has evolved and adapted to accommodate many revolutionary technologies, and is well equipped to address the legitimate concerns of creators. Our courts are the proper forum to apply those doctrines to the myriad fact patterns that AI will present over the coming years and decades. 

You can read the full letter here. 

Current copyright law isn’t perfect, and we certainly believe creativity and innovation would benefit from some changes. However, we should be careful about reactionary, alarmist politics. It seldom makes for good law. Unfortunately, that’s what we’re seeing right now with AI, and we hope that Congress has the wisdom to see through it. 

We encourage our members to reach out to their Congressional representatives to express the need to tread carefully and, if they are using AI in their work, to explain how. We’d also be very happy to hear from you as we develop our own further policy communications to Congress and to agencies such as the U.S. Copyright Office. 

Open Access and University IP Policies in the United States

Posted August 18, 2023

Perhaps the most intuitive statement in the whole of the U.S. Copyright Act is this: “Copyright in a work protected under this title vests initially in the author . . . .” Of course authors are the owners of the copyright in their works. 

In practice, however, control over copyrighted works is often more complicated. When it comes to open access scholarly publishing, the story is particularly complicated because the default allocation of rights is often modified by a complex series of employment agreements, institutional open access policies, grant terms, relationships (often not well defined) between co-authors, and of course the publishing agreement between the author and the publisher. Because open access publishing is so dependent on those terms, it’s important to have a clear understanding of who holds what rights and how they can exercise them.

Work for Hire and the “Teacher Exception”

First, it’s important to figure out who owns rights in a work when it’s first created. For most authors, the answer is pretty straightforward. If you’re an independent creator, you as the author generally own all the rights under copyright. If co-authors create a joint work (e.g., co-author an article), each of them holds rights and can license that work to others on a nonexclusive basis, subject to an accounting to the other co-authors. 

If, however, you work for a company and create a copyrighted work in the scope of your employment (e.g., I’m writing this blog post as part of my work for Authors Alliance) then at least in the United States, the “work for hire” doctrine applies and, the law says, “the employer or other person for whom the work was prepared is considered the author.” For people who aren’t clearly employees, or who are commissioned to make copyrighted works, whether their work is considered “work for hire” can sometimes be complicated, as illustrated in the seminal Supreme Court case CCNV v. Reid, addressing work for hire in the context of a commissioned sculpture.  

For employees of colleges or universities who create scholarly works, the situation is a little more complicated because of a judicially developed exception to the work-for-hire doctrine known as the “teacher exception.” In a series of cases in the mid-20th Century, the courts articulated an exception, for teachers and educators, to the general rule that creative works produced within the scope of one’s employment are owned by the employer. Those cases each have their own peculiar facts, however, and most significantly, they predated the 1976 Copyright Act, which was a major overhaul of U.S. copyright law. Whether the “teacher exception” continues to survive as a judge-made doctrine is highly contested. Despite the massive number of copyrighted works authored by university faculty after the 1976 Act (well over a hundred million scholarly articles alone, not to mention books and other creative works), we have seen very few cases addressing this particular issue.  

There are a number of law review articles and books on the subject. Among the best, I think, is Professor Elizabeth Townsend-Gard’s thorough and worthwhile article. She concludes, based on a review of past and modern case law, that the continued survival of the teacher exception is tenuous at best: 

“The teacher exception was established under the 1909 act by case law, but because the 1976 act did not incorporate it, the “teacher exception” was subsumed by a work-for-hire doctrine that the Supreme Court’s definition of employment in CCNV v. Reid places teachers’ materials under the scope of employment. Thus the university-employers own their original creative works. No court has decided whether the “teacher exception” survived Reid, but the Seventh Circuit in Weinstein, decided two years before Reid, had already transferred the “teacher exception” from a case-based judge made law to one dictated by university policy.”

University Copyright and IP policies

Whatever the default initial allocation of copyright ownership, authors of all types must also understand how other agreements may modify control and exercise of copyright. These policies can be somewhat difficult to untangle because there actually may be layers of agreements or policies that cross reference each other and are buried deep within institutional policy handbooks. 

For academic authors, this collection of agreements typically includes something like an employee handbook or academic policy manual, which will include policies that all university employees must agree to as a condition of employment. Typically, that will include a policy on copyright or intellectual property. Regardless of whether the teacher exception or work-for-hire applies, these agreements can override that default allocation of rights and transfer them, either from the creator to the university or from the university to the creator. 

These policies differ significantly in the details, but most university IP policies choose to allocate all or substantially all rights under copyright to individual creators of scholarly works, notwithstanding the potential application of the work for hire doctrine. In other words, even though copyright in faculty scholarly works may initially be held by the university, through university policy those rights are mostly handed over to individual creators. The net effect is that most university IP policies treat faculty as the initial copyright holders even if the law isn’t clear that they actually are.

Some universities, like Duke University, say nothing about “work for hire” in their IP policies but merely “reaffirm[] its traditional commitment to the personal ownership of intellectual property rights in works of the intellect by their individual creators.” Others, like Ohio State, are similar, stating that copyright in scholarly works “remains” with their creators, but then also provide that “the university hereby assigns any of its copyrights in such works, insofar as they exist, to their creators,” which can act as a sort of savings clause to address circumstances in which there may be uncertainty about ownership by individual creators. 

Others, like Yale, are a little clearer about their stance on work-for-hire. Yale explains that “The law provides . . . that works created by faculty members in the course of their teaching and research, and works created by staff members in the course of their jobs, are the property of the University,” but then goes on to recognize that “[i]t is traditional at Yale and other universities, however, for books, articles and other scholarly writings by a faculty member to be deemed the property of the writer . . . . In recognition of that longstanding practice, the University disclaims ownership of works by faculty, staff, postdoctoral fellows and postdoctoral associates and students. . . .” Another example of a university taking a similar approach is the University of Michigan.

Carve outs and open access policies

Every university copyright or IP policy that I’ve seen includes some carve outs from the general rule that copyright will, one way or another, end up being held by individual creators. Almost universally, universities’ IP policies provide that the university will retain rights sufficient to satisfy grant obligations. Some universities’ IP policies simply provide that, for example, ownership shall be determined by the terms of the grant (see, for example, the University of California system policy). In other cases, however, university IP policy accomplishes compliance with grants by simply stating that all intellectual property of any kind (including copyright) created under a grant is owned by the university, full stop. This, therefore, gives the university sufficient authority to satisfy whatever grant obligations it may have. For example, the University of Texas system states that it will not assert ownership of copyright in scholarly works, but that proviso is subject to the limitation that “intellectual property resulting from research supported by a grant or contract with the government (federal and/or state) or an agency thereof is owned by the Board of Regents.” These kinds of broad ownership claw-backs raise some hard questions when it comes to publishing scholarly work. For example, when a UT author personally signs a publication agreement transferring copyright for an article that is the result of grant funding, do they actually hold the rights to make that transfer effective?

For open access, these grant clauses are important because they are the operative terms through which the university complies with funder open access requirements. Sometimes, these licensing clauses lie somewhat dormant, with funders holding but not necessarily exercising the full scope of their rights. For example, even prior to the recent OSTP open access announcement, the government already reserved, for every article or other copyrighted work produced under a federal grant, a broad “royalty-free, nonexclusive and irrevocable right to reproduce, publish, or otherwise use the work for Federal purposes, and to authorize others to do so.”

Some universities also retain a broad, non-exclusive license for themselves to make certain uses of faculty-authored scholarly work, even while providing that the creator owns the copyright. For example, Georgia Tech’s policy provides that individual creators own rights in scholarly works, but Georgia Tech retains a “fully paid up, universe-wide, perpetual, non-exclusive, royalty-free license to use, re-use, distribute, reproduce, display, and make derivative works of all scholarly and creative works for the educational, research, and administrative purposes of [Georgia Tech].” Others such as the University of Maryland are less specific, providing simply that although the individual creator owns rights to their work, “the University reserves the right at all times to exercise copyright in Traditional Scholarly Works as authorized under United States Copyright Law.” Those kinds of broad licenses would seem to give the university discretion to make use of scholarly work, including, I think, for open access uses should the university decide that such uses are desirable.

Finally, a growing number of universities have policies, enacted at the behest of faculty, that specifically provide rights to make faculty scholarship openly available. The “Harvard model” is probably the most common, or at least the most well known. These types of policies allocate a license to the university, to exercise on behalf of the individual creator, with the specific intent of making the work available free of charge. Often these policies will include special limitations (e.g., the university cannot sell access to the article) or allow for faculty to opt-out (often by seeking a waiver). 

Pre-existing licenses and publishing agreements

The maze of policies and agreements can matter a great deal for the legal mechanics of effectively publishing an article openly. Of course in the scenario where authors hold rights themselves, they can retain sufficient rights through their publishing contract so they can make their work openly available, typically either via “green open access” by posting their own article to an institutional repository, or by “gold open access” directly from the publisher (though these are sometimes accompanied by a hefty article processing fee). Tools like the SPARC open access addendum are wonderful negotiating tools to ensure authors retain sufficient rights to achieve OA.

That works sometimes, but often publishing contracts come with unacceptably restrictive strings attached. Individual authors publishing with journals and publishers that have great market power often have little ability to negotiate for the OA terms that they would prefer.

In these situations, a pre-existing license can be a major advantage for an author. For example, for authors who are writing under the umbrella of a Harvard-style open access policy, the negotiating imbalance with journals is leveled, at least in part, because the journal knows that the university has a pre-existing OA license and also knows that although those policies often permit waivers, it’s not as easy as just telling the author “no” to claw that license back. The same is true about other forms of university pre-existing licenses that could be used to make a work available openly, such as the general licenses retained by Georgia Tech or Maryland that I mentioned above. While these kinds of pre-existing licenses are seldom acknowledged in journal publishing agreements, sophisticated publishers with large legal teams are undoubtedly aware of them. Because of that, I think there are strong arguments that their publishing agreements with authors implicitly incorporate them (or, if not, good arguments that a publisher that does not recognize them is intentionally interfering with a pre-existing contractual relationship between author and their university). Funder mandates, made effective through university IP policies, take the scenario a step further and force the issue: either the journal acquiesces or it doesn’t publish the paper at all. There is often no waiver option. Of course there are other pathways that both funders and journals may be willing to accept – many funders are willing to support OA publishing fees, and many journals will happily accept OA license terms for a price.


Although the existing, somewhat messy, maze of institutional IP policies, publishing agreements, and OA policies can seem daunting, understanding their terms is important for authors who want to see their works made openly available. I’ll leave for another day to explore whether it’s a good thing that the rights situation is so complex. In many situations, rights thickets like these can be a real detriment to authors and access to their works. In this case the situation is at least nuanced such that authors are able to leverage pre-existing licenses to avoid negotiating away the bundle of rights they need to see their works made available openly. 

Prosecraft, text and data mining, and the law

Posted August 14, 2023

Last week you may have read about a website called Prosecraft, a site with an index of some 25,000 books that provided a variety of data about the texts (how long, how many adverbs, how much passive voice), along with a chart showing sentiment analysis of the works in its collection. It also displayed short snippets from the texts themselves: two paragraphs representing the most and least vivid from each text. Overall, it was a somewhat interesting tool, promoted to authors to help them better understand how their work compares to other published works.

The news cycle about Prosecraft centered on the campaign to get its creator, Benji Smith, to take the site down (he now has) based on allegations of copyright infringement. A Gizmodo story about it generated lots of attention, and it’s been written up extensively, for example here, here, here, and here.

It’s been written about enough that I won’t repeat the whole saga here. However, I think a few observations are worth sharing:

1) Don’t get your legal advice from Twitter (or whatever it’s called)

“Fair Use does not, by any stretch of the imagination, allow you to use an author’s entire copyrighted work without permission as a part of a data training program that feeds into your own ‘AI algorithm.’”  – Linda Codega, Gizmodo (a sentiment that was retweeted extensively)

Fair use actually allows quite a few situations where you can copy an entire work, including situations when you can use it as part of a data training program (and calling an algorithm “AI” doesn’t magically transform it into something unlawful). For example, way back in 2002 in Kelly v. Arriba Soft, the 9th Circuit concluded that it was fair use to make full copies of images found on the internet for the purpose of enabling web image search. Similarly, in AV ex rel Vanderhye v. iParadigms, the 4th Circuit in 2009 concluded that it was fair use to make full text copies of academic papers for use in a plagiarism detection tool.

Most relevant to prosecraft, in Authors Guild v. HathiTrust (2014) and Authors Guild v. Google (2015) the Second Circuit held that Google’s copying of millions of books for purposes of creating a massive search engine of their contents was fair use. Google produced full-text searchable databases of the works, and displayed short snippets containing whatever term the user had searched for (quite similar to prosecraft’s outputs). That functionality also enabled a wide range of computer-aided textual analysis, as the court explained:

The search engine also makes possible new forms of research, known as “text mining” and “data mining.” Google’s “ngrams” research tool draws on the Google Library Project corpus to furnish statistical information to Internet users about the frequency of word and phrase usage over centuries.  This tool permits users to discern fluctuations of interest in a particular subject over time and space by showing increases and decreases in the frequency of reference and usage in different periods and different linguistic regions. It also allows researchers to comb over the tens of millions of books Google has scanned in order to examine “word frequencies, syntactic patterns, and thematic markers” and to derive information on how nomenclature, linguistic usage, and literary style have changed over time. Authors Guild, Inc., 954 F.Supp.2d at 287. The district court gave as an example “track[ing] the frequency of references to the United States as a single entity (‘the United States is’) versus references to the United States in the plural (‘the United States are’) and how that usage has changed over time.”

While there are a number of generative AI cases pending (a nice summary of them is here) that I agree raise some additional legal questions beyond those directly answered in Google Books, the kind of textual analysis that Prosecraft offered seems remarkably similar to the kinds of things that the courts have already said are permissible fair uses.
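To make the court's "United States is" versus "United States are" example concrete, here is a minimal sketch of that kind of phrase-frequency analysis. The tiny corpus, the function name, and the variable names are all hypothetical and chosen only for illustration; this is not how Google's ngrams tool or Prosecraft was actually implemented.

```python
import re
from collections import Counter

# Hypothetical toy corpus of (publication year, text) pairs. A real study
# would draw on a large, lawfully assembled collection of books.
CORPUS = [
    (1850, "The United States are a young republic. The United States are growing."),
    (1850, "Some say the United States is one nation."),
    (1950, "The United States is a world power. The United States is changing."),
]

def phrase_counts_by_year(corpus, phrases):
    """Count case-insensitive occurrences of each phrase, grouped by year."""
    counts = {}
    for year, text in corpus:
        # Lowercase and collapse whitespace so capitalization and line
        # breaks do not hide matches.
        normalized = re.sub(r"\s+", " ", text.lower())
        year_counts = counts.setdefault(year, Counter())
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            year_counts[phrase] += len(re.findall(pattern, normalized))
    return counts

PHRASES = ["the United States is", "the United States are"]
RESULT = phrase_counts_by_year(CORPUS, PHRASES)

for year in sorted(RESULT):
    print(year, dict(RESULT[year]))
```

Note that the output is a table of counts per year. None of the underlying text is reproduced for the reader, which is part of why courts have treated this kind of analysis differently from copying a work in order to republish it.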

2) Text and data mining analysis has broad benefits

Not only is text mining fair use, it also yields some amazing insights that truly “promote the progress of Science,” which is what copyright law is all about. Prosecraft offered some pretty basic insights into published books – how long, how many adverbs, and the like. I can understand opinions being split on whether that kind of information is actually helpful for current or aspiring authors. But text mining can reveal so much more.

In the submission Authors Alliance made to the US Copyright Office three years ago in support of a Section 1201 Exemption permitting text data mining, we explained:

TDM makes it possible to sift through substantial amounts of information to draw groundbreaking conclusions. This is true across disciplines. In medical science, TDM has been used to perform an overview of a mass of coronavirus literature. Researchers have also begun to explore the technique’s promise for extracting clinically actionable information from biomedical publications and clinical notes. Others have assessed its promise for drawing insights from the masses of medical images and associated reports that hospitals accumulate.

In social science, studies have used TDM to analyze job advertisements to identify direct discrimination during the hiring process. It has also been used to study police officer body-worn camera footage, uncovering that police officers speak less respectfully to Black than to white community members even under similar circumstances.

TDM also shows great promise for drawing insights from literary works and motion pictures. Regarding literature, some 221,597 fiction books were printed in English in 2015 alone, more than a single scholar could read in a lifetime. TDM allows researchers to “‘scale up’ more familiar humanistic approaches and investigate questions of how literary genres evolve, how literary style circulates within and across linguistic contexts, and how patterns of racial discourse in society at large filter down into literary expression.” TDM has been used to “observe trends such as the marked decline in fiction written from a first-person point of view that took place from the mid-late 1700s to the early-mid 1800s, the weakening of gender stereotypes, and the staying power of literary standards over time.” Those who apply TDM to motion pictures view the technique as every bit as promising for their field. Researchers believe the technique will provide insight into the politics of representation in the Network era of American television, into what elements make a movie a Hollywood blockbuster, and into whether it is possible to identify the components that make up a director’s unique visual style [citing numerous letters in support of the TDM exemption from researchers].

3) Text and data mining is not new and it’s not a threat to authors

Text mining of the sort it seemed prosecraft employed isn’t some kind of new phenomenon. Marti Hearst, a professor at UC Berkeley’s iSchool explained the basics in this classic 2003 piece. Scores of computer science students experiment with projects to do almost exactly what prosecraft was producing in their courses each year. Textbooks like Matt Jockers’s Text Analysis with R for Students of Literature have been widely used and adopted all across the U.S. to teach these techniques. Our submissions during our petition for the DMCA exemption for text and data mining back in 2020 included 14 separate letters of support from authors and researchers engaged in text data mining research, and even more researchers are currently working on TDM projects. While fears over generative AI may be justified for some creators (and we are certainly not oblivious to the threat of various forms of economic displacement), it’s important to remember that text data mining on textual works is not the same as generative AI. On the contrary, it is a fair use that enriches and deepens our understanding of literature rather than harming the authors who create it.
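For a sense of how simple the statistics in a typical student project can be, here is a minimal sketch that computes a few descriptive numbers for a passage. The function name, the sample sentence, and especially the "words ending in -ly" adverb heuristic are assumptions made for illustration; Prosecraft's actual methods are not described in this post.

```python
import re

def basic_text_stats(text):
    """Return rough descriptive statistics for a passage of prose."""
    words = re.findall(r"[A-Za-z']+", text)
    # Crude assumption: treat words ending in "ly" as adverbs. This miscounts
    # words like "family" and misses adverbs like "often"; real tools
    # typically rely on a part-of-speech tagger instead.
    ly_words = [w for w in words if len(w) > 3 and w.lower().endswith("ly")]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "words": len(words),
        "sentences": len(sentences),
        "ly_words": len(ly_words),
        "ly_share": len(ly_words) / len(words) if words else 0.0,
    }

SAMPLE = "She quickly opened the door. He waited quietly, then slowly walked away."
STATS = basic_text_stats(SAMPLE)
print(STATS)
```

The result is a handful of numbers about the text (word count, sentence count, share of "-ly" words) rather than the text itself.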

The appropriation bill that would defund the OSTP open access memo

Posted July 27, 2023

A couple of weeks ago the U.S. House Appropriations Subcommittee on Commerce, Justice, and Science (CJS) released an appropriations bill containing language that would defund efforts to implement a federal, zero-embargo open access policy for federally funded research.

We think this is a fantastically bad idea. One of the most important developments in the movement for open access to scholarship came last year when Dr. Alondra Nelson, Director of the Office of Science and Technology Policy, issued a memorandum mandating that all federal agencies that sponsor research put in place policies to ensure immediate open access to published research, as well as access to research data. The agencies are at various stages of implementing the Nelson memo now, but work is well underway. This appropriations bill specifically targets those implementation efforts and would prevent any federal government expenditures from being used to further them. 

For the vast majority of scholarly works, the primary objective of the authors is to share their research as widely as possible. Open access achieves that for authors (if they can only get their publishers to agree). The work is already funded, already paid for. As you might imagine, those opposed to the memo are primarily publishers who have resisted adapting the business model they’ve built of putting a paywall in front of publicly-funded work, largely for profit. 

Thankfully, the CJS appropriations bill, one of twelve appropriation bills, is just a first crack at how to fund the government in the coming year. The Senate, of course, will have their say, as will the President. With the current division in Congress, combined with the upcoming recess (Congress will be on recess in August and reconvene in September), the smart bet is that none of these bills will be enacted in time for the federal government’s new fiscal year on October 1. Instead, a continuing resolution–funding the government under the status quo, as Congress frequently does–will likely be enacted as a stop gap until a compromise can be reached later in the year. 

It is important, however, that legislators understand that this attempt to defund OA efforts is majorly concerning, especially for authors, universities, and libraries that believe that federally funded research should be widely available on an open access basis. It’s a good moment to speak out. SPARC has put together a helpful issue page on this bill, complete with sample text for how to write to your representative or senator. 

As you’ll see if you read the proposed appropriations bill, it is loaded with politics. The relevant OSTP memo language is located amongst other clauses that defund the Biden administration’s efforts to implement diversity initiatives at various agencies, address gun violence, sue states over redistricting, and dozens of other hot-button issues as well. It’s pretty easy for an issue like access to science to get lost in the political shuffle, but we hope with some attention from authors and others, it won’t.

The Anti-Ownership Ebook Economy

Posted July 25, 2023

Earlier this month, the Engelberg Center on Innovation Law and Policy at NYU Law released a groundbreaking new report. The Anti-Ownership Ebook Economy: How Publishers and Platforms Have Reshaped the Way We Read in the Digital Age traces the history of ebooks and, through a series of interviews with publishers, platforms, librarians, and others, explains how the law and the markets have converged to produce the dysfunction we see today in the ebook marketplace.

The report focuses especially closely on the role of platform companies, such as Amazon, Apple and OverDrive, which now play an enormous role in controlling how readers interact with ebooks. “Just as platforms control our tweets, our updates, and the images that we upload, platforms can also control the books we buy, keeping tabs on how, when, and where we use them, and at times, modifying or even deleting their content at will.” 

Claire Woodcock

Last Friday, I spoke with one of the authors, Claire Woodcock, to learn a little bit more about the project and its goals: 

Q: What was your motivation to work on this project? 

A: My co-authors, Michael Weinberg, Jason Schultz, and Sarah Lamdan had all been working on this for well over a year [before] I joined. I knew Sarah from another story I’d written about an ebook platform that was prioritizing the platforming of disinformation last year, and she had approached me about this project. When I hopped on a call with the three of them, I believe it was Michael who posed the core question of this project: “Why can we not own, but only license ebooks?” 

I’ve thought about that question ever since. So my role in joining the project was to help talk to as many people as we could – publishers, librarians, platforms, and other stakeholders to try to understand why not. It seems like a simple question but there are so many convoluted reasons and we wanted to try to distill this down. 

Q: Many different people were interviewed for this project. Tell me about how that went. 

A: There was actually some hesitation to talk; I think a reason why was almost extreme fear of retaliation. So, it took a while to crack into learning about some of the different areas, especially with some publishers and platforms. I wish there was more of a willingness to engage on the part of some publishers, who would flat out tell me things like they weren’t authorized to talk about their company’s internal practices, or from platforms like OverDrive, who we sent our list of questions over to and never heard from again (until I ran into Steve Potash at the American Library Association’s Annual Conference). I’d have loved to hear more from them directly when I was actively conducting interviews.

Q: I noticed there weren’t many interviews with authors. Can you say why not? 

A: Authors weren’t as big of a focus because we realized, particularly in talking with several literary agents, that from a business and legal perspective authors don’t have much of a say in how their books are distributed. Contractually, they aren’t involved in downstream use. I think it would be really interesting to do a follow up with authors to get their perspective on how their books are licensed or sold online.

Q: The report contains a number of conclusions and recommendations. Which among them are your favorite? 

A: One of the most striking things I learned, and what stuck out to me the most when I went back and listened to the interviews, is the importance of market consolidation and lack of competition. OverDrive has roughly 95% of the ebook marketplace for libraries (and I know it’s different for academic publishing, for sure). The lack of competition in our society, especially in this area, makes it hard to speak up and speak out when a certain stakeholder has issues with the dominant market players. Because of that, looking at each of the groups of stakeholder types we spoke with, each could point to other groups causing the problem (it reminds me of the Spider-Man meme). And there are platforms and other publishers, mostly smaller, who want to make this work, but the major players are not doing that. It also stuck out that almost everyone we talked to talks about librarians as partners, but when we talk to the librarians, they say “they think we are partners, but we don’t feel like we have a seat at the table; decisions that impact us are often made without consulting us in a way that is transparent.”

Q: If you could do a follow up study, what additional big questions would you focus on? 

A: Lots of people talked about audiobooks. We were focused on ebooks, but the audiobook market is even more concentrated, and lots of people raised the issue that ebooks are only part of the issue. There is a version of this that is happening with audiobooks as well. I also think that the intersections of this market with television, platform streaming, and even other consumer goods like toys and other parts of the market are really interesting. What we’re seeing here, it’s a version of what’s happening in other creative industries. 

I also think it would be worth learning more about how libraries and others are working around the current issues. For example, lots of libraries ask for perpetual licenses, since they’re working within the current context and looking to contracts for assurances that, if something happens to the publisher’s platform or to the company, the license agreement will still be honored. But are those efforts actually effective? And, given the importance of licensing, it might also be interesting to explore how libraries are resourced to negotiate those agreements – for example, training and staff to negotiate. I think if libraries were better funded they would probably be able to better handle these challenges.

Authorship and Ebook Licensing: Introducing the Library Ebook Pledge

Posted July 12, 2023

Authors rarely have meaningful rights to say how their publisher licenses or distributes their book. 

A typical publishing contract will grant the publisher broad discretion to determine the format, price and sublicensing terms under which an author’s book is made available. It can be hard to negotiate for the right to have a say over those terms. Even contracts designed to prioritize authors’ rights, such as the Authors Guild model trade contract, don’t contemplate an author exercising much control over these matters, and leave most publication and distribution details “as Publisher determines.”

In many cases, ceding control can be OK as long as the interests of the publisher and author are tightly aligned. It’s why we recommend authors pay close attention to the mission and practices of their publisher before signing a contract. But even when a publisher purports to share the author’s interests, this could change in the future, and information the publisher provides about itself can be misleading.

Sometimes, it’s hard to see how those interests diverge until it’s too late. For example, recall last year when academic publisher Wiley decided to remove some 1,300 ebooks from online library collections. We quickly found that many authors of those books objected strongly, and joined us in a letter that outlined concerns and expressed dismay that Wiley, an academic publisher that supposedly prioritizes “access to knowledge,” would make such an aggressive and profit-maximizing decision. But under their contracts, those authors had no legal grounds to push back.

Library distribution in particular is an area of concern. Libraries provide an important way for authors to connect with readers, and provide a means of access to their books for many people who might otherwise never read them. Libraries also serve an important democratic function in supporting widespread learning that we all benefit from. We’ve written several times over the years about challenges that libraries face in licensing ebooks, and it’s why we’ve supported model state legislation to address the problem and also why we’ve supported models like controlled digital lending that allow for limited access outside of the licensing model.

In addition to basic economic concerns about gouging libraries on price (in some cases publishers have decided to charge libraries 10x the consumer list price for ebook access), some publishers have imposed a variety of other terms that we find unreasonable. This includes, for example, only offering ebooks to libraries through large bundles of content rather than title-by-title, which forces libraries to buy access to books that aren’t necessarily relevant for their community (a practice which also obfuscates and dilutes per-title sales and consequently author royalties). Or limiting access for use only on platforms controlled by the publisher, which can contain significant compromises for reader privacy. Perhaps the most frustrating is the flat refusal to deal – with some publishers refusing to sell some ebooks to libraries at all, in the hopes of driving some would-be library readers (likely a very small percentage of them) into buying a personal copy.

Introducing the Library Ebook Pledge

What libraries need to do their jobs in the digital environment isn’t all that complicated. For physical books, libraries have been successful in reaching readers because they have had clear rights to purchase, lend, and preserve. Publishers have limited, by contract, libraries’ ability to do those same activities with ebooks, but it doesn’t have to be that way. That’s why we’ve been pleased to work with Knowledge Rights 21 and Library Futures to outline twelve basic principles that represent a reasonable approach to ensuring that libraries can continue to do their jobs online.

We know that many publishers care deeply about the role of libraries in supporting research, education and learning. This Pledge, which can be viewed here, offers a way for those publishers to express their support and commitment to 21st century libraries, so libraries can provide meaningful preservation of and access to ebooks for their readers. We’re encouraged to see some publishers already signing on, and encourage others to do so as well.

We also think this pledge is a valuable tool for authors who care about access to their works. While negotiating for control over distribution can be a challenge, we are hopeful that authors can try to incorporate these principles into their contracts and use this pledge to ask publishers to publicly communicate their intent to license ebooks in ways that will account for the public interest.