
A Copyright Small Claims Update: Defaults and Failure to Opt Out

Posted February 1, 2024

For a few years now, we’ve been tracking the new copyright small claims court known as the Copyright Claims Board (CCB). My last update was in September, when I posted an overview of a paper I wrote with Katie Fortney summarizing data about the court’s first year of operations (thanks entirely to Katie for doing the hard work of extracting that data and sharing it in an easy-to-understand format).

As explained then, the CCB has been slow in processing cases; when I last wrote, it had entered a final judgment on the merits in only one case. It has now issued a total of 18 final determinations, about half of which are default determinations (cases where the respondent failed to appear or refused to participate in the CCB process). The facts of most of these cases are not very interesting, but two of the most recent caught my attention.

Oakes v. Heart of Gold Pageant System

The first case, Oakes v. Heart of Gold Pageant System Inc., highlights a concern that opponents of the CCB raised when it was being debated in Congress: that the CCB’s ability to make default determinations could be a trap for unwary defendants who don’t understand what the CCB is, what a case before it could mean for them, or what their rights are to opt out of a CCB proceeding.

The facts are unspectacular: Oakes, a professional photographer represented by Higbee & Associates, filed a CCB complaint against Heart of Gold and its owner, Angel Jameson, for using photographs taken by Oakes on Heart of Gold’s Facebook page and in materials for events it sponsored. Oakes originally filed the claim in July 2022 and then refiled it in August 2022 with some corrections. Oakes then provided the CCB with the required proof of service (proof that Oakes had adequately informed Heart of Gold and Jameson of the CCB claim) in October 2022. 

At this point, the ball was in Heart of Gold and Jameson’s court; she could either respond and defend her use, or (if done within 60 days of service) opt out of the CCB proceeding altogether. Unfortunately for her, she did neither, which resulted in a default determination against her for $4,500. 

We learn in the final determination a little more about Jameson’s lack of participation. As the CCB recounts in its final default determination: 

“At multiple points in this procedural history, Jameson has contacted the CCB, and after communicating with staff, has affirmed each time her intent to not participate in this proceeding.”

“Jameson initially contacted the Board in response to this Zoom link, expressing her disbelief that the Board is a government tribunal.”

“Jameson then sent another email in response to the First Default, requesting an ‘official day in court.’”

“In a subsequent call with CCB staff in March, Jameson indicated that she would not participate.”

“Shortly after the order scheduling the hearing, Jameson contacted the U.S. Copyright Office’s Public Information Office, who placed her in contact with CCB staff. In a follow-up call, CCB staff again explained the proceeding and Jameson again affirmed that she would not participate in the proceeding.”

Jameson missed her opportunity to opt out early in the case – she had a sixty-day window to do so, as defined by CCB regulations. So, her protests later were ineffective to opt out, even though it seems clear that she did not want her case to be heard by the CCB. 

Joe Hand Promotions v. Dawson 

A second default determination case offers a slightly different view of how the CCB treats defaults. The facts are similarly straightforward: Joe Hand is a company that “specializes in commercially licensing premier sporting events to commercial locations such as bars, restaurants, lounges, clubhouses, and similar establishments.” Joe Hand had obtained the exclusive right to sell pay-per-view access to a boxing event, “Deontay Wilder vs. Tyson Fury II,” to commercial establishments, including bars. Joe Hand provided evidence that a California bar, “Bottoms Up,” had shown the match without permission.

Joe Hand (a frequent filer with the CCB, with 33 cases to its name) ran into a problem in this case, however, because it didn’t actually file its case against Bottoms Up, but instead against the individual who is listed on the bar’s liquor license and ownership documents, Mary Dawson. Even in Dawson’s absence, the CCB was unwilling to rubber-stamp Joe Hand’s claims against her. The final determination explained:

“Beyond the conclusory and clearly boilerplate allegations in the Claim that Dawson (and now-dismissed respondent Giglio) ‘owned, operated, maintained, and controlled the commercial business known as Bottoms Up Bar & Grill’ and ‘had a right and ability to supervise the activities of the Establishment on the date of the Program and had an obvious and direct financial interest in the activities of the Establishment on the date of the Program’ (Dkt. 1), Claimant offers absolutely no information linking Respondent to the infringement.”

I will spare you the details, but the CCB went on to cite case after case explaining why courts have routinely rejected such boilerplate claims, and required plaintiffs to at least allege meaningful facts connecting an individual to an act of infringement.  Even in this default case where Dawson was not present to defend herself, the CCB put in the effort on her behalf. 


I have a few observations. In the first case, given that Jameson clearly did not want her case heard before the CCB, I think it would have been fair for the CCB to allow her a second chance to opt out. At least on the record we have available, there is no indication that the CCB offered her that chance.  Although the normal opt-out period extends only sixty days after service, the CCB opt-out regulations also state that “the Board may extend the 60-day period to opt out in exceptional circumstances and in the interests of justice.” 

It seems to me, given the newness of the CCB system, the small number of cases filed to date, and the relative lack of awareness among most people that the CCB is a legitimate government forum (Jameson expressed such doubt herself), the “interests of justice” may well dictate a more flexible approach at least at the outset of operations of the CCB. 

The CCB has demonstrated an extraordinary willingness to offer helpful guidance, flexibility, and multiple opportunities to claimants, and so respondents may have expected a similar approach to help them along through the process. At least in this case, we see a more stringent approach. An obvious takeaway for respondents, then, is to pay attention to notices about CCB claims and associated deadlines, and to opt out early in the process if they think they don’t want their case heard there.

The Dawson case, however, does show that the CCB isn’t willing to let claimants make unsubstantiated claims against absent respondents. Though Joe Hand is surely familiar with the process and it would have been easy for the CCB to accept its barebones allegations against Dawson as true, the CCB made the case itself–with ample legal support–that even claims against absent respondents require claimants to make a real case. 

Overall, these are just two cases, so I don’t want to read too much into them. But it’s already looking like a large portion of CCB cases will be defaults (10 out of the 18 final determinations to date, and more than half of the existing active cases are trending in that direction). So, it’s good to keep an eye on how the CCB will treat these types of cases, given the risks they pose for unwary and uninformed respondents.

Authors Alliance Submits Long-Form Comment to Copyright Office in Support of Petition to Expand Existing Text and Data Mining Exemption 

Posted January 29, 2024
Photo by Simona Sergi on Unsplash

Last month, Authors Alliance submitted detailed comments in response to the Copyright Office’s Notice of Proposed Rulemaking in support of our petition to expand the existing Digital Millennium Copyright Act (DMCA) exemptions that enable text and data mining (TDM) as part of this year’s §1201 rulemaking cycle.

To recap: our expansion petitions ask the Copyright Office to modify the existing TDM exemption so that researchers who assemble corpora of ebooks or films on which to conduct text and data mining are able to share those corpora with other academic researchers, where this second group of researchers qualifies under the exemption. Under the current exemption, academic researchers are only able to share their corpora with other qualified researchers for purposes of “collaboration and verification.” This simple change would eliminate the need for duplicative efforts to remove digital locks from ebooks and films, a time- and resource-intensive process, broadening the group of academic researchers who are able to use the exemption.

Our comment argues that the existing TDM exemption has begun to enable valuable digital humanities research and teaching, but that the proposed expansion would go much further towards enabling this research and helping TDM researchers reach their goals. The comment is accompanied by 13 letters of support from researchers, educators, and funding organizations, highlighting the research that has been done in reliance on the exemption, and explaining why this expansion is necessary. Our thanks go out to our stellar clinical team at UC Berkeley’s Samuelson Law, Technology & Public Policy Clinic—law students Mathew Cha and Zhudi Huang, and clinical supervisor Jennifer Urban—for writing and submitting this comment on our behalf. We are also grateful to our co-petitioners, the Library Copyright Alliance and American Association of University Professors, for their support on this comment. 

Ambiguity in “Collaboration”

One reason the expansion is necessary is the uncertainty over what constitutes “collaboration” under the existing exemption. Researchers have open questions about what level of individual contribution to a project would make researchers “collaborators” under the exemption. As our comment explains, collaboration can come in a number of different forms, from “formal collaborations under the auspice of a grant, [to] ad hoc collaborations that result from two teams discovering that they are working on similar material to the same ends, or even discussions at conferences between members of a loose network of scholars working on the same broad set of interests.” But it is not clear which of these activities is “collaboration” for the purposes of the exemption. And this uncertainty has had a chilling effect on the socially valuable research made possible by the exemption. 

Costly Corpora Creation 

Our comment also highlights the vast costs that go into creating a usable corpus for TDM research. Institutions whose researchers are conducting TDM research pursuant to the exemption must lawfully own the works in question, or license them through a license that is not time-limited. But these costs pale in comparison to the required computing resources—a cost which is compounded by the exemption’s strict security requirements—and human labor involved in bypassing technical protection measures and assembling a corpus. Moreover, it’s important to recognize that there is simply not a tremendous amount of grant funding or even institutional support available to TDM researchers. 

Because corpora are so costly to assemble and create, we believe it to be reasonable to permit researchers to share their corpora with researchers at other institutions who want to conduct independent TDM research on these corpora. As the exemption currently stands, researchers interested in pre-existing corpora must duplicate the efforts of the previous researchers, incurring massive costs along the way. We’ve already seen indications that these costs can lead researchers to avoid certain research questions and areas of study altogether. As our comment explains, this “duplicative circumvention” can be avoided by changing the language of the exemption to permit corpora sharing between qualified researchers at separate institutions. 

Equity Issues

Worse still, not all institutions are able to bear these expenses. Our comment explains how the current exemption’s prohibition on sharing beyond collaboration and verification (and the consequent duplication of prior labor) “create[s] barriers that can prevent smaller and less-well-resourced institutions from conducting TDM research at all.” This creates inequity in what types of institutions can support TDM projects, and what types of researchers can conduct them. The unfortunate result has been that large institutions that have “the resources to compensate and maintain technical staff and infrastructure” are able to support TDM research under the exemption, while smaller institutions are not.

Values of Corpora Sharing

Our comment explains how allowing limited sharing of corpora under the exemption would go a long way towards lowering barriers to entry for TDM research and ameliorating the equity issues described above. Since digital humanities is already an under-resourced field, the effects of enabling researchers to share their corpora with other academic researchers could be quite profound. 

Researchers who wrote letters in support of the petition described a multitude of exciting projects, and have built “a rich set of corpora to study, such as a collection of fiction written by African American writers, a collection of books banned in the United States, and a curated corpus of movies and television with an ‘emphasis on racial, ethnic, sexual, and gender diversity.’” Many of those who wrote letters in support of our petition recounted requests they’ve gotten from other researchers to use their corpora, and who were frustrated that the exemption’s prohibition on non-collaborative sharing and their limited capacity for collaboration prevented them from sharing these corpora. 

Allowing new researchers with new research questions to study these corpora could reveal new insights about these bodies of work. As we explain, “in the same way a single literary work or motion picture can evince multiple meanings based on the lens of analysis used, when different researchers study one corpus, they are able to pose different research questions and apply different methodologies, ultimately revealing new and original findings . . . . Enabling broader sharing and thus, increasing the number of researchers that can study a corpus, will allow a body of works to be better understood beyond the initial ‘limited set of research questions.’”

Fair Use

The 1201 rulemaking process for exemptions to DMCA § 1201’s prohibition on breaking digital locks requires that the proposed activity be a fair use. In the 2021 proceedings, the Office recognized TDM for research and teaching purposes as a fair use. Because the expansion we’re seeking is relatively minor, our comment explains that the types of uses we are asking the Office to permit researchers to make are also fair uses. Our comment explains that each of the four fair use factors favors fair use in the context of the proposed expansion. We further explain why the enhanced sharing the expansion would provide does not harm the market for the original works under factor four: because institutions must lawfully own (or license under a non-time-limited license) the works that their researchers wish to conduct TDM on, it makes no difference from a market standpoint whether researchers bypass technical protection measures themselves, or share another institution’s corpus. Copyright holders are not harmed when researchers at one institution share a corpus created by researchers at another institution, since both institutions must purchase the works in order to be eligible under the exemption.

What’s Next?

If there are parties that oppose our proposed expansion, they have until February 20th to submit opposition comments to the Copyright Office. Then, on March 19th, our reply comments to any opposition comments will be due. We will keep our readers and members apprised as the process continues to move forward.

Authors Alliance 2023 Annual Report

Posted January 23, 2024

Authors Alliance is pleased to share our 2023 annual report, where you can find highlights of our work in 2023 to promote laws, policies, and practices that enable authors to reach wide audiences. In the report, you can read about how we’re helping authors meet their dissemination goals for their works, representing their interests in the courts, and otherwise working to advocate for authors who write to be read. 

Click here to read the full report.

Hachette v. Internet Archive: Amicus Briefs on Non-Commercial Use

Posted January 19, 2024

Our last post highlighted one of the amicus briefs filed in the Hachette v. Internet Archive lawsuit, which made the point that controlled digital lending serves important privacy interests for library readers. Today I want to highlight a second new issue introduced on appeal and addressed by almost every amicus: the proper way to assess whether a given use is “non-commercial.”

“Non-commercial” use is important because the first fair use factor directs courts to assess “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.” Before the district court, neither Internet Archive (IA) nor the amici who filed in support of IA devoted much attention to whether IA’s use was commercial, I think because lending books for free to library patrons seemed to us such a paradigmatic example of non-commercial use. It came as a shock, therefore, when the district court concluded that “IA stands to profit” from its use and that the use was therefore commercial.

The Court’s reasoning was odd. While it recognized that IA “is a non-profit organization that does not charge patrons to borrow books and because private reading is noncommercial in nature,” the court concluded that because IA gains “an advantage or benefit from its distribution and use” of the works at issue, its use was commercial. Among the “benefits” that the court listed: 

  • IA exploits the Works in Suit without paying the customary price.
  • IA uses its Website to attract new members, solicit donations, and bolster its standing in the library community.
  • Better World Books also pays IA whenever a patron buys a used book from BWB after clicking on the “Purchase at Better World Books” button that appears on the top of webpages for ebooks on the Website.

Although almost every amicus addressed the problems with this approach to “non-commercial” use, three briefs in particular added important context, explaining both why the district court was wrong on the law and why its rule would have dramatically negative implications for other libraries and nonprofit organizations.

First, the Association of Research Libraries and the American Library Association, represented by Brandon Butler, make a forceful legal argument in their amicus brief about why the district court’s baseline formulation of commerciality (benefit without paying the customary price) was wrong: 

The district court’s determination that the Internet Archive (“IA”) was engaged in a “commercial” use for purposes of the first statutory factor is based on a circular argument that seemingly renders every would-be fair use “commercial” so long as the user benefits in some way from their use. This cannot be the law, and in the Second Circuit it is not. The correct standard is clearly stated in American Geophysical Union v. Texaco Inc., 60 F. 3d 913 (2d Cir. 1994), a case the district court ignored entirely.

ARL and ALA then go on to highlight numerous examples of appellate courts (including the Second Circuit) rejecting this approach, such as the Eleventh Circuit in the Georgia State e-reserves copyright lawsuit: “Of course, any unlicensed use of copyrighted material profits the user in the sense that the user does not pay a potential licensing fee, allowing the user to keep his or her money. If this analysis were persuasive, no use could qualify as ‘nonprofit’ under the first factor.”

Second was an amicus brief by law professor Rebecca Tushnet on behalf of Intellectual Property Law Scholars, explaining both why copyright law and fair use favor non-commercial uses, and how IA’s uses fall squarely within the public-benefit objectives of the law. The brief begins by highlighting the close connection between non-commercial use and the goals of copyright:

The constitutional goal of copyright protection is to “promote the progress of science and useful arts,” Art. I, sec. 8, cl. 8, and the first copyright law was “an act for the encouragement of learning,” Cambridge University Press v. Patton, 769 F.3d 1232, 1256 (11th Cir. 2014). This case provides an opportunity for this Court to reaffirm that vision by recognizing the special role that noncommercial, nonprofit uses play in supporting freedom of speech and access to knowledge.

The IP Professors Brief then goes on to highlight the many ways that Congress has indicated that library lending should be treated favorably because it furthers objectives of supporting learning, and how the court’s constrained reading of “non-commercial” is actually in conflict with how that term is used elsewhere in the Copyright Act (for example, Sections 111, 114, and 118 for non-commercial broadcasters, or Section 1008 for non-commercial consumers who copy music). The brief then makes a strong case not only that the district court was mistaken, but also that library lending should presumptively be treated as non-commercial.

Finally, we see the amicus brief from the Wikimedia Foundation, Creative Commons, and Project Gutenberg, represented by Jef Pearlman and a team of students at the USC IP & Technology Law Clinic. Their brief highlighted in detail the practical challenges that the district court’s approach to non-commercial use would pose for all sorts of online nonprofits. The brief explains how nonprofits that raise money will inevitably include donation buttons on pages with fair use content, rely on volunteer contributions, and engage in revenue-generating activities to support their work, which in some cases requires millions of dollars for technical infrastructure. The brief explains:

The district court defined “commercial” under the first fair use factor far too broadly, inextricably linking secondary uses to fundraising even when those activities are, in practice, completely unrelated. In evaluating what constitutes commercial use, the district court misapplied several considerations and ignored other critical considerations. As a result, the district court’s ruling threatens nonprofit organizations who make fair use of copyrighted works. Adopting the district court’s approach would threaten both the processes of nonprofit fundraising and the methods by which educational nonprofits provide their services.

Hachette v. IA Amicus Briefs: Highlight on Privacy and Controlled Digital Lending

Posted January 16, 2024

Photo by Matthew Henry on Unsplash

Over the holidays you may have read about the amicus brief we submitted in the Hachette v. Internet Archive case about library controlled digital lending (CDL), which we’ve been tracking for quite some time. Our brief was one of 11 amicus briefs filed that explained to the court the broader implications of the case. Internet Archive itself has a short overview of the others already (representing 20 organizations and 298 individuals–mostly librarians and legal experts). 

I thought it would be worthwhile to highlight some of the important issues identified by these amici that did not receive much attention earlier in the lawsuit. This post is about the reader privacy issues raised by several amici in support of Internet Archive and CDL. Later this week we’ll have another post focused on briefs and arguments about why the district court inappropriately construed Internet Archive’s lending program as “commercial.”

Privacy and CDL 

One aspect of library lending that’s really special is the privacy that readers are promised when they check out a book. Most states have special laws that require libraries to protect readers’ privacy, something that libraries enthusiastically embrace (see, e.g., the ALA Library Bill of Rights) as a way to help foster free inquiry and learning among readers. Unlike Amazon, which keeps and tracks detailed reader information when you buy an ebook (dates, times, what page you spent time on, what you highlighted), libraries strive to minimize the data they keep on readers to protect their privacy. This protects readers from data breaches and from third-party demands for that data.

The brief from the Center for Democracy and Technology, Library Freedom Project, and Public Knowledge spends nearly 40 pages explaining why the court should consider reader privacy as part of its fair use calculus. Represented by Jennifer Urban and a team of students at the Samuelson Law, Technology and Public Policy Clinic at UC Berkeley Law (disclosure: the clinic represents Authors Alliance on some matters, and we are big fans of their work), the brief masterfully explains the importance of this issue. From their brief, below is a summary of the argument (edited down for length): 

The conditions surrounding access to information are important. As the Supreme Court has repeatedly recognized, privacy is essential to meaningful access to information and freedom of inquiry. But in ruling against the Internet Archive, the district court did not consider one of CDL’s key advantages: it preserves libraries’ ability to safeguard reader privacy. When employing CDL, libraries digitize their own physical materials and loan them on a digital-to-physical, one-to-one basis with controls to prevent redistribution or sharing. CDL provides extensive, interrelated benefits to libraries and patrons, such as increasing accessibility for people with disabilities or limited transportation, improving access to rare and fragile materials, facilitating interlibrary resource sharing—and protecting reader privacy. For decades, libraries have protected reader privacy, as it is fundamental to meaningful access to information. Libraries’ commitment is reflected in case law, state statutes, and longstanding library practices. CDL allows libraries to continue protecting reader privacy while providing access to information in an increasingly digital age. Indeed, libraries across the country, not just the Internet Archive, have deployed CDL to make intellectual materials more accessible. And while increasing accessibility, these CDL systems abide by libraries’ privacy protective standards.

Commercial digital lending options, by contrast, fail to protect reader privacy; instead, they threaten it. These options include commercial aggregators—for-profit companies that “aggregate” digital content from publishers and license access to these collections to libraries and their patrons—and commercial e-book platforms, which provide services for reading digital content via e-reading devices, mobile applications (“apps”), or browsers. In sharp contrast to libraries, these commercial actors track readers in intimate detail. Typical surveillance includes what readers browse, what they read, and how they interact with specific content—even details like pages accessed or words highlighted. The fruits of this surveillance may then be shared with or sold to third parties. Beyond profiting from an economy of reader surveillance, these commercial actors leave readers vulnerable to data breaches by collecting and retaining vast amounts of sensitive reader data. Ultimately, surveilling and tracking readers risks chilling their desire to seek information and engage in the intellectual inquiry that is essential to American democracy. 

Readers should not have to choose to either forfeit their privacy or forgo digital access to information; nor should libraries be forced to impose this choice on readers. CDL provides an ecosystem where all people, including those with mobility limitations and print disabilities, can pursue knowledge in a privacy-protective manner. . . . 

An outcome in this case that prevents libraries from relying on fair use to develop and deploy CDL systems would harm readers’ privacy and chill access to information. But an outcome that preserves CDL options will preserve reader privacy and access to information. The district court should have more carefully considered the socially beneficial purposes of library-led CDL, which include protecting patrons’ ability to access digital materials privately, and the harm to copyright’s public benefit of disallowing libraries from using CDL. Accordingly, the district court’s decision should be reversed.

The court below considered CDL copies and licensed ebook copies as essentially equivalent and concluded that the CDL copies IA provided acted as substitutes for licensed copies. Authors Alliance’s amicus brief points out some of the ways that CDL copies actually differ significantly from licensed copies. It seems to me that this additional point about protection of reader privacy–and the protection of free inquiry that comes with it–is exactly the kind of distinguishing public benefit that the lower court should have considered but did not.

You can read the full brief from the Center for Democracy and Technology, Library Freedom Project, and Public Knowledge here. 

Licensing research content via agreements that authorize uses of artificial intelligence

Posted January 10, 2024
Photo by Hal Gatewood on Unsplash

This is a guest post by Rachael G. Samberg, Timothy Vollmer, and Samantha Teremi, professionals within the Office of Scholarly Communication Services at UC Berkeley Library. 

On academic and library listservs, there has emerged an increasingly fraught discussion about licensing scholarly content when scholars’ research methodologies rely on artificial intelligence (AI). Scholars and librarians are rightfully concerned that non-profit educational research methodologies like text and data mining (TDM) that can (but do not necessarily) incorporate usage of AI tools are being clamped down upon by publishers. Indeed, libraries are now being presented with content license agreements that prohibit AI tools and training entirely, irrespective of scholarly purpose. 

Conversely, publishers, vendors, and content creators—a group we’ll call “rightsholders” here—have expressed valid concerns about how their copyright-protected content is used in AI training, particularly in a commercial context unrelated to scholarly research. Rightsholders fear that their livelihoods are being threatened when generative AI tools are trained and then used to create new outputs that they believe could infringe upon or undermine the market for their works.

Within the context of non-profit academic research, rightsholders’ fears about allowing AI training, and especially non-generative AI training, are misplaced. Newly-emerging content license agreements that prohibit usage of AI entirely, or charge exorbitant fees for it as a separately-licensed right, will be devastating for scientific research and the advancement of knowledge. Our aim with this post is to empower scholars and academic librarians with legal information about why those licensing outcomes are unnecessary, and equip them with alternative licensing language to adequately address rightsholders’ concerns.

To that end, we will: 

  1. Explain the copyright landscape underpinning the use of AI in research contexts;
  2. Address ways that AI usage can be regulated to protect rightsholders, while outlining opportunities to reform contract law to support scholars; and 
  3. Conclude with practical language that can be incorporated into licensing agreements, so that libraries and scholars can continue to achieve licensing outcomes that satisfy research needs.

Our guidance is based on legal analysis as well as our views as law and policy experts working within scholarly communication. While your mileage or opinions may vary, we hope that the explanations and tools we provide offer a springboard for discussion within your academic institutions or communities about ways to approach licensing scholarly content in the age of AI research.

Copyright and AI training

As we have recently explored in presentations and posts, the copyright law and policy landscape underpinning the use of AI models is complex, and regulatory decision-making in the copyright sphere will have ramifications for global enterprise, innovation, and trade. A much-discussed group of lawsuits and a parallel inquiry from the U.S. Copyright Office raise important and timely legal questions, many of which we are only beginning to understand. But there are two precepts that we believe are clear now, and that bear upon the non-profit education, research, and scholarship undertaken by scholars who rely on AI models. 

First, as the UC Berkeley Library has explained in greater detail to the Copyright Office, training artificial intelligence is a fair use—and particularly so in a non-profit research and educational context. (For other similar comments provided to the Copyright Office, see, e.g., the submissions of Authors Alliance and Project LEND). Maintaining its continued treatment as fair use is essential to protecting research, including TDM. 

TDM refers generally to a set of research methodologies reliant on computational tools, algorithms, and automated techniques to extract revelatory information from large sets of unstructured or thinly-structured digital content. Not all TDM methodologies necessitate usage of AI models in doing so. For instance, the words that 20th century fiction authors use to describe happiness can be searched for in a corpus of works merely by using algorithms looking for synonyms and variations of words like “happiness” or “mirth,” with no AI involved. But to find examples of happy characters in those books, a researcher would likely need to apply what are called discriminative modeling methodologies that first train AI on examples of what qualities a happy character demonstrates or exhibits, so that the AI can then go and search for occurrences within a larger corpus of works. This latter TDM process involves AI, but not generative AI; and scholars have relied non-controversially on this kind of non-generative AI training within TDM for years. 
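
The simpler, keyword-based approach described above (the kind that involves no AI) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term list, the function name count_terms, and the two sample sentences are hypothetical and are not drawn from any actual study or corpus.

```python
import re
from collections import Counter

# Hypothetical seed vocabulary; a real study would build a much fuller list
# of synonyms and variations.
HAPPINESS_TERMS = {"happiness", "happy", "happily", "mirth", "joy", "joyful"}


def count_terms(text, terms):
    """Count how often each term appears as a whole word, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word in terms)


# Hypothetical two-work corpus standing in for a large collection of novels.
corpus = {
    "novel_a": "She laughed with mirth, and her happiness was plain to all.",
    "novel_b": "He was not happy. Joy had left the house long ago.",
}

for title, text in corpus.items():
    print(title, dict(count_terms(text, HAPPINESS_TERMS)))
```

Note that the second sample sentence is counted as a match for “happy” even though the character is not happy. That gap between matching words and recognizing a happy character is the reason a researcher might turn to a trained, non-generative model for the second kind of question.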

Previous court cases like Authors Guild v. HathiTrust, Authors Guild v. Google, and A.V. ex rel. Vanderhye v. iParadigms have addressed fair use in the context of TDM and confirmed that the reproduction of copyrighted works to create and conduct text and data mining on a collection of copyright-protected works is a fair use. These cases further hold that making derived data, results, abstractions, metadata, or analysis from the copyright-protected corpus available to the public is also fair use, as long as the research methodologies or data distribution processes do not re-express the underlying works to the public in a way that could supplant the market for the originals. 

For the same reasons that the TDM processes constitute fair use of copyrighted works in these contexts, the training of AI tools to do that text and data mining is also fair use. This is in large part because of the same transformativeness of the purpose (under Fair Use Factor 1) and because, just like “regular” TDM that doesn’t involve AI, AI training does not reproduce or communicate the underlying copyrighted works to the public (which is essential to the determination of market supplantation for Fair Use Factor 4). 

But, while AI training is no different from other TDM methodologies in terms of fair use, there is an important distinction to make between the inputs for AI training and generative AI’s outputs. The overall fair use of generative AI outputs cannot always be predicted in advance: The mechanics of generative AI models’ operations suggest that there are limited instances in which generative AI outputs could indeed be substantially similar to (and potentially infringing of) the underlying works used for training; this substantial similarity is possible typically only when a training corpus is rife with numerous copies of the same work. And a recent case filed by the New York Times addresses this potential similarity problem with generative AI outputs.  

Yet, training inputs should not be conflated with outputs: The training of AI models by using copyright-protected inputs falls squarely within what courts have already determined in TDM cases to be a transformative fair use. This is especially true when that AI training is conducted for non-profit educational or research purposes, as this bolsters its status under Fair Use Factor 1, which considers both transformativeness and whether the act is undertaken for non-profit educational purposes. 

Were a court to suddenly determine that training AI was not fair use, and AI training was subsequently permitted only on “safe” materials (like public domain works or works for which training permission has been granted via license), this would curtail freedom of inquiry, exacerbate bias in the nature of research questions able to be studied and the methodologies available to study them, and amplify the views of an unrepresentative set of creators given the limited types of materials available with which to conduct the studies.

The second precept we uphold is that scholars’ ability to access the underlying content to conduct fair use AI training should be preserved with no opt-outs from the perspective of copyright regulation. 

The fair use provision of the Copyright Act does not afford copyright owners a right to opt out of allowing other people to use their works in any other circumstance, for good reason: If content creators were able to opt out of fair use, little content would be available freely to build upon. Uniquely allowing fair use opt-outs only in the context of AI training would be a particular threat for research and education, because fair use in these contexts is already becoming an out-of-reach luxury even for the wealthiest institutions. What do we mean?

In the U.S., the prospect of “contractual override” means that, although fair use is statutorily provided for, private parties like publishers may “contract around” fair use by requiring libraries to negotiate for otherwise lawful activities (such as conducting TDM or training AI for research). Academic libraries are forced to pay significant sums each year to try to preserve fair use rights for campus scholars through the database and electronic content license agreements that they sign. This override landscape is particularly detrimental for TDM research methodologies, because TDM research often requires use of massive datasets with works from many publishers, including copyright owners who cannot be identified or who are unwilling to grant such licenses. 

So, if the Copyright Office or Congress were to enable rightsholders to opt-out of having their works fairly used for training AI for scholarship, then academic institutions and scholars would face even greater hurdles in licensing content for research. Rightsholders might opt out of allowing their work to be used for AI training fair uses, and then turn around and charge AI usage fees to scholars (or libraries)—essentially licensing back fair uses for research. 

Fundamentally, this undermines lawmakers’ public interest goals: It creates a risk of rent-seeking or anti-competitive behavior through which a rightsholder can demand additional remuneration or withhold granting licenses for activities generally seen as being good for public knowledge or that rely on exceptions like fair use. And from a practical perspective, allowing opt-outs from fair uses would impede scholarship by or for research teams who lack grant or institutional funds to cover these additional licensing expenses; penalize research in or about underfunded disciplines or geographical regions; and result in bias as to the topics and regions that can be studied. 

“Fair use” does not mean “unregulated” 

Although training AI for non-profit scholarly uses is fair use from a copyright perspective, we are not suggesting AI training should be unregulated. To the contrary, we support guardrails because training AI can carry risk. For example, researchers have been able to use generative AI like ChatGPT to solicit personal information by bypassing platform safeguards.

To address issues of privacy, ethics, and the rights of publicity (which govern uses of people’s voices, images, and personas), best practices, private ordering, and other regulations should be adopted. 

For instance, as to best practices, scholar Matthew Sag has suggested preliminary guidelines to avoid violations of privacy and the right of publicity. First, he recommends that AI platforms avoid training their large language models on duplicates of the same work. This would reduce the likelihood that the models could produce copyright-infringing outputs (due to memorization concerns), and it would also lessen the likelihood that any content containing potentially private or sensitive information would be output as a result of having been fed into the training process multiple times. Second, Sag suggests that AI platforms engage in “reinforcement learning through human feedback” when training large language models. This practice could cut down on privacy or rights of publicity concerns by involving human feedback at the point of training, instead of leveraging filtering at the output stage.  

Private ordering would rely on platforms or communities to implement appropriate policies governing privacy issues, rights of publicity, and ethical concerns. For example, the UC Berkeley Library has created policies and practices (called “Responsible Access Workflows”) to help it make decisions around whether—and how—special collection materials may be digitized and made available online. Our Responsible Access Workflows require review of collection materials across copyright, contracts, privacy, and ethics parameters. Through careful policy development, the Library applies an ethics of care approach to making collection content that raises ethical concerns available online. Even if content is not shared openly online, that doesn’t mean it’s unavailable to researchers for use in person; we simply have decided not to make that content available in digital formats with lower friction for use. We aim to provide transparent information about our decision-making, and researchers must make informed decisions about how to use the collections, whether or not they are using them in service of AI.

And finally, concerning regulations, countries like those in the EU have recently introduced an AI training framework that requires, among other things, the disclosure of source content, and the rights for content creators to opt out of having their works included in training sets except when the AI training is being done for research purposes by research organizations, cultural heritage institutions, and their members or scholars. United States agencies could consider implementing similar regulations here. 

But from a copyright perspective, and within non-profit academic research, fair use in AI training should be preserved without the opportunity to opt out for the reasons we discuss above. Such an approach regarding copyright would also be consistent with the distinction the EU has made for AI training in academic settings, as the EU’s Digital Single Market Directive bifurcates practices outside the context of scholarly research.

While we favor regulation that preserves fair use, it is also important to note that merely preserving fair use rights in scholarly contexts for training AI is not the end of the story in protecting scholarly inquiry. So long as the United States permits contractual override of fair uses, libraries and researchers will continue to be at the mercy of publishers aggregating and controlling what may be done with the scholarly record, even if authors dedicate their content to the public domain or apply a Creative Commons license to it. So in our view, the real work that should be done is pursuing legislative or regulatory arrangements like the approximately 40 other countries that have curtailed the ability of contracts to abrogate fair use and other limitations and exceptions to copyright within non-profit scholarly and educational uses. This is a challenging, but important, mission.

Licensing guidance in the meantime 

While the statutory, regulatory, and private governance landscapes are being addressed, libraries and scholars need ways to preserve usage rights for content when training AI as part of their TDM research methodologies. We have developed sample license language intended to address rightsholders’ key concerns while maintaining scholars’ ability to train AI in text and data mining research. We drafted this language to be incorporated into amendments to existing licenses that fail to address TDM, or into stand-alone TDM and AI licenses; however, it is easily adaptable into agreements-in-chief (and we encourage you to do so). 

We are certain our terms can continue to be improved upon over time or be tailored for specific research needs as methodologies and AI uses change. But in the meantime, we think they are an important step in the right direction.

With that in mind, it is important to understand that within contracts applying U.S. law, more specific language controls over general language in a contract. So, even if there is a clause in a license agreement that preserves fair use, if it is later followed by a TDM clause that restricts how TDM can be conducted (and whether AI can be used), then that more specific language governs TDM and AI usage under the agreement. This means that libraries and scholars must be mindful when negotiating TDM and AI clauses as they may be contracting themselves out of rights they would otherwise have had under fair use. 

So, how can a library or scholar negotiate sufficient AI usage rights while acknowledging the concerns of publishers? We believe publishers have attempted to curb AI usage because they are concerned about: (1) the security of their licensed products, and the fear that researchers will leak or release content behind their paywall; and (2) AI being used to create a competing product that could substitute for the original licensed product and undermine their share of the market. While these concerns are valid, they reflect longstanding fears over users’ potential generalized misuse of licensed materials in which they do not hold copyright. But publishers are already able to—and do—impose contractual provisions disallowing the creation of derivative products and systematically sharing licensed content with third parties, so additionally banning the use of AI in doing so is, in our opinion, unwarranted.

We developed our sample licensing language to precisely address these concerns by specifying in the grant of license that research results may be used and shared with others in the course of a user’s academic or non-profit research “except to the extent that doing so would substantially reproduce or redistribute the original Licensed Materials, or create a product for use by third parties that would substitute for the Licensed Materials.” Our language also imposes reasonable security protections in the research and storage process to quell fears of content leakage. 

Perhaps most importantly, our sample licensing language preserves the right to conduct TDM using “machine learning” and “other automated techniques” by expressly including these phrases in the definition for TDM, thereby reserving AI training rights (including as such AI training methodologies evolve), provided that no competing product or release of the underlying materials is made. 

The licensing road ahead

As legislation and standards around AI continue to develop, we hope to see express contractual allowance for AI training become the norm in academic licensing. Though our licensing language will likely need to adapt to and evolve with policy changes and research or technological advancements over time, we hope the sample language can now assist other institutions in their negotiations, and help set a licensing precedent so that publishers understand the importance of allowing AI training in non-profit research contexts. While a different legislative and regulatory approach may be appropriate in the commercial context, we believe that academic research licenses should preserve the right to incorporate AI, especially without additional costs being passed to subscribing institutions or individual users, as a fundamental element of ensuring a diverse and innovative scholarly record.

Authors Alliance Submits Amicus Brief to the Second Circuit in Hachette Books v. Internet Archive

Posted December 21, 2023
Photo by Dylan Dehnert on Unsplash

We are thrilled to announce that we’ve submitted an amicus brief to the Second Circuit Court of Appeals in Hachette Books v. Internet Archive—the case about whether controlled digital lending is a fair use—in support of the Internet Archive. Authored by Authors Alliance Senior Staff Attorney, Rachel Brooke, the brief reprises many of the arguments we made in our amicus brief in the district court proceedings and elaborates on why and how the lower court got it wrong, and why the case matters for our members and other authors who write to be read.

The Case

We’ve been writing about this case for years—since the complaint was first filed back in 2020. But to recap: a group of trade publishers sued the Internet Archive in federal court in the Southern District of New York over (among other things) the legality of its controlled digital lending (CDL) program. The publishers argued that the practice infringed their copyrights, and Internet Archive defended its project on the grounds that it was fair use. We submitted an amicus brief in support of IA and CDL (which we have long supported as a fair use) to the district court, explaining that copyright is about protecting authors, and many authors strongly support CDL.

The case finally went to oral argument before a judge in March of this year. Unfortunately, the judge ruled against Internet Archive, finding that each of the fair use factors favored the publishers. Internet Archive indicated that it planned to appeal, and we announced that we planned to support them in those efforts. Now, the case is before the Second Circuit Court of Appeals. After Internet Archive filed its opening brief last week, we (and other amici) filed our briefs in support of a reversal of the lower court’s decision.

Our Brief

Our amicus brief argues, in essence, that the district court judge failed to adequately consider the interests of authors. While the commercial publishers in the case did not support CDL, those publishers’ interests do not always align with authors’ and they certainly do not speak for all authors. We conducted outreach to authors, including launching a CDL survey, and uncovered a diversity of views on CDL—most of them extremely positive. We offered up these authors’ perspectives to show the court that many authors do support CDL, contrary to the representations of the publishers. Since copyright is about incentivizing new creation for the benefit of the public and protecting author interests, we felt these views were important for the Second Circuit to hear. 

We also sought to explain how the district court judge got it wrong when it comes to fair use. One of the key findings in the lower court decision was that loans of CDL scans were direct substitutes for loans of licensed ebooks. We explained that this is not the case: a CDL scan is not the same thing as an ebook; the two look different and have different functions and features. And CDL scans can be resources for authors conducting research in some key ways that licensed ebooks cannot. Out-of-print books and older editions of books are often available as CDL scans but not licensed ebooks, for example.

Another issue from the district court opinion that we addressed was the judge’s finding that IA’s use of the works in question was “commercial.” We strongly disagreed with this conclusion: borrowing a CDL scan from IA’s Open Library is free, and the organization—which is also a nonprofit—actually bears a lot of expenses related to digitization. Moreover, the publishers had failed to establish any concrete financial harm they had suffered as a result of IA’s CDL program. We discussed a recent lawsuit in the D.C. Circuit, ASTM v. PRO, to further push back on the district court’s conclusion on commerciality. 

You can read our brief for yourself here, or find it embedded at the bottom of this post. In the new year, you can expect another post or two with more details about our amicus brief and the other amicus briefs that have been, or soon will be, submitted in this case.

What’s Next?

Earlier this week, the publishers proposed that they file their own brief on March 15, 2024—91 days after Internet Archive filed its opening brief. The court’s rules stipulate that any amici supporting the publishers file their briefs within seven days of the publishers’ filing. Then, the parties can decide to submit reply briefs, and will notify the court of their intent to do so. Finally, the parties can choose to request oral argument, though the court might still choose to decide the case “on submission,” i.e., without oral argument. If the case does proceed to oral argument, a three-judge panel will hear from attorneys for each side before rendering its decision. We expect the process to extend into mid-2024, but it can take quite a while for appeals courts to actually hand down their decisions. We’ll keep our readers apprised of any updates as the case moves forward.


Authors Alliance Amicus Briefs: Defending Free Expression from Trademark, Social Media, and Copyright Law Challenges

Posted December 8, 2023
Photo by Claire Anderson on Unsplash

The cases that threaten authors’ rights aren’t always obvious. You might have noticed in the last year that we’ve filed amicus briefs in some unusual ones—for example, a trademark lawsuit about squishy dog toys, or a case about YouTube recommendation algorithms. It’s often true that cases like these raise legal questions that extend well beyond their facts, and decisions in these cases can have unintended consequences for authors. As part of our mission of speaking up on behalf of authors for the public interest, we file amicus briefs in these cases to help courts craft better, more informed decisions that account for the interests of authors. 

In the last few weeks Authors Alliance has joined with several other organizations to file amicus briefs in three cases like these: 

Hermès v. Rothschild: Free Expression and Trademarks

The first is an amicus brief we joined in Hermès International v. Rothschild. Our brief is in support of Mason Rothschild, a digital artist who was sued by Hermès, creator of the famous “Birkin bag,” for allegedly infringing Hermès’s trademark. Rothschild created a series of NFTs mimicking the bag that he called “metaBirkins,” which, he argues, comment on the brand, consumerism, luxury goods and so on. Rothschild lost at the court below and the case is currently before the Second Circuit Court of Appeals. 

Our amicus brief—drafted by the Harvard Cyberlaw Clinic and joined by Authors Alliance, MSCHF, CTHDRL, Alfred Steiner, and Jack Butcher—argues that such uses are protected by the test announced in Rogers v. Grimaldi, a threshold test designed to protect First Amendment interests in the trademark context, allowing courts to quickly resolve trademark litigation when trademarks are used in expressive works unless there is “no artistic relevance to the underlying work whatsoever, or, if it has some artistic relevance, unless the [second work] explicitly misleads as to the source or the content of the work.” Our brief argues that Rogers remains good law after the United States Supreme Court’s recent decision in Jack Daniel’s Props. v. VIP Products and that a creator’s intent to sell their work (in this case selling NFTs) is not relevant when balancing trademark owners’ and creators’ rights. The amici we joined with represent artists, creators, and organizations that are concerned that a ruling in favor of Hermès will stifle creators’ ability to comment on popular brands and companies. 

Warner Chappell v. Nealy: Copyright Damages

This is a case before the U.S. Supreme Court raising questions about how far back courts can look when calculating monetary damages in copyright infringement lawsuits. The dispute in this case arose between Nealy, owner of an independent record label that released a number of albums in the 1980s, and Warner Chappell, who Nealy claims engaged in unauthorized reproduction and distribution of his works for years. Nealy claims he didn’t learn of the violation until 2016. He filed suit in 2018 and sought monetary damages for uses going back to 2008. Warner Chappell argues that the Copyright Act’s statute of limitations bars Nealy from recovering for damages going that far back. 

The legal question in this case is whether under the Copyright Act’s “discovery accrual” statute of limitations rule, a copyright plaintiff can recover damages for acts that allegedly occurred more than three years before the filing of a lawsuit. Lower courts have held that for actually filing a lawsuit, the statute of limitations clock starts to run based on when a plaintiff discovers the alleged infringement. This case raises a related question of how far back courts should look when assessing damages—just three years from the date of filing the suit, or an indeterminate period of time as long as it was within three years of the plaintiff discovering the harm? 

We joined an amicus brief with EFF, the American Library Association, and the Association of Research Libraries in support of Warner Chappell. EFF did most of the heavy lifting for this brief (thank you!), making the case that a damages regime that extends indeterminately into the past will stifle creativity and encourage copyright trolls. A three-year lookback period is enough. 

We’ve long argued that copyright’s damages regime needs to be reformed for authors. With statutory damage awards of up to $150,000 per work infringed, the specter of such crippling liability can chill even non-infringing and socially beneficial acts of authorship, dissemination, archiving, and curation. For authors, it takes little imagination to see how problematic it would be if opportunistic copyright litigants with flimsy claims could leverage decades-old acts—e.g., an image reproduced in a blog post or article or book pursuant to fair use—to extract large damage awards spanning many years. If the court were to allow damages to reach back indeterminately, we argue that copyright trolls would be emboldened, hampering creativity and harming creators while providing them little forward-looking benefit for protection of their own works. 

NetChoice v. Paxton and Moody v. NetChoice: First Amendment and Online Platform Regulation

This case has received a ton of attention, in part because it is so politically charged. The basis of the suit is a challenge brought by NetChoice and CCIA, two internet and technology industry groups, against laws passed in Texas and Florida that attempt to regulate how large social media websites moderate speech on their platforms. Each of those laws is ostensibly designed to protect the speech of users by limiting how platforms can remove or otherwise moderate their posts, and each was passed in response to accusations of political bias.

On their faces these laws sound appealing—authors along with many other users are frustrated with opaque decision making on platforms about why their posts may be taken down, demonetized, or deprioritized by platform algorithms. These are real problems, but in our view, the right solution is not government-dictated content moderation rules. Authors use a wide variety of online platforms and rely heavily on content moderation to ensure that their views are not drowned out by spam, lies, or trolls. 

For our amicus brief, we joined EFF along with the National Coalition Against Censorship, Woodhull Freedom Foundation, Fight for the Future, and the First Amendment Coalition. We argue as follows: The First Amendment right of social media publishers to curate and edit the user speech they publish, free from government mandates, results in a diverse array of forums for users, with unique editorial views and community norms. Although some internet users are understandably frustrated and perplexed by the process of “content moderation,” by which sites decide which users’ posts to publish, recommend, or amplify, it’s on the whole far best for internet users when the First Amendment protects the sites’ rights to make those curatorial decisions. This First Amendment right to be editorially diverse does not evaporate the moment a site reaches a certain state-determined level of popularity. But both Texas House Bill 20 (HB 20) and Florida Senate Bill 7072 (SB 7072) take those protections away and force popular sites to ignore their own rules and publish speech inconsistent with their editorial vision, distorting the marketplace of ideas.

Content moderation by online intermediaries is an already fraught process, and government interjection of itself into that process raises serious practical and First Amendment concerns. Inconsistent and opaque private content moderation is a problem for users. But it is one best addressed through self-regulation and regulation that doesn’t retaliate against the editorial process.

Authors Alliance Releases New Legal Guide to Writing About Real People

Posted December 5, 2023

We are delighted to announce the publication of our brand new guide, the Authors Alliance Guide to Writing About Real People, a legal guide for authors writing nonfiction works about real people. The guide was written by students in two clinical teams at the UC Berkeley Samuelson Law and Public Policy Clinic—Lily Baggott, Jameson Davis, Tommy Ferdon, Alex Harvey, Emma Lee, and Daniel Todd—as well as clinical supervisors Jennifer Urban and Gabrielle Daley, along with Authors Alliance’s Senior Staff Attorney, Rachel Brooke. The guide was edited by Executive Director Dave Hansen and former Executive Director, Brianna Schofield. This long list of names is a testament to the fact that it took a village to create this guide, and we are so excited to finally share it with our members, allies, and any and all authors who need it. You can read and download our guide here.

On Thursday, we are hosting a webinar about our guide, where Authors Alliance staff will share more about what went into producing it, those who partnered with us or supported the guide, and the particulars of the guide’s contents. Sign up here!

The Writing About Real People guide covers several different legal issues that can arise for authors writing about real people in nonfiction books like memoirs, biographies, and other narrative nonfiction projects. The issues it addresses are “causes of action” (or legal theories someone might sue under) based on state law. The requirements and considerations involved vary from state to state, so the guide highlights trends and commonalities among states. Throughout the guide, we emphasize that even though these causes of action might sound scary, the First Amendment to the U.S. Constitution in most cases empowers authors to write freely about topics of their choosing. The causes of action in this guide are exceptions to that rule, and each of them is limited in its reach and scope by the First Amendment’s guarantees. 

False Statements and Portrayals

The first section in the Writing About Real People guide concerns false statements and portrayals. This encompasses two different causes of action: defamation and false light. 

You have probably heard of defamation: it’s one of the most common causes of action related to writing about a real person. Defamation occurs when someone makes a false statement about another person that injures that person’s reputation, and the statement is made with some degree of “fault.” The level of fault required turns on what kind of person the statement is made about. For public people (people with some renown or governmental authority), the speaker must have acted with “actual malice,” or reckless disregard as to whether the statement is true. For private people, the speaker need only have been negligent as to whether the statement was true, meaning that the speaker failed to take an ordinary amount of care in verifying the statement. An author might expose themselves to defamation liability if they write something untrue about another person in their published work and present it as factual, that statement injures the person’s reputation, and the author failed to take the requisite level of care to ensure that the statement was true. 

False light is similar to defamation, so similar that many states do not recognize false light as a separate cause of action. Where defamation concerns false statements represented as factual, false light concerns false portrayals. It can occur when a speaker creates a misleading impression about a subject, through implication or omission, for example. Like defamation, false light requires fault on the part of the speaker, and the public person/private person standards are the same as for defamation. 

Invasions of Privacy

The second section in the Writing About Real People guide concerns invasions of privacy, or violations of a person’s rights to privacy. This covers two related causes of action: intrusion on seclusion and public disclosure of private facts. 

Intrusion on seclusion occurs when someone intentionally intrudes on another’s private place or affairs in a way that is highly offensive, as judged from the perspective of an ordinary, reasonable person. For authors, intrusion on seclusion can arise when an author uses research or information-gathering methods that are invasive. This could include things like entering someone’s home without permission or digging through personal information like health or banking records without permission. Unlike the other causes of action in this guide, which arise when a work is actually published, intrusion on seclusion can be an issue for authors during the research and writing stages of their process.

Public disclosure of private facts occurs when someone makes private facts about a person public, when that disclosure is highly offensive and made with some degree of fault, and when the information disclosed doesn’t relate to a matter of public concern. Essentially, public disclosure of private facts liability exists to address situations where a speaker shares highly private information about a person that the public has no legitimate interest in knowing, and the subject suffers as a result. As with defamation and false light, the level of fault required for a speaker to be liable depends on whether the subject is a public or private person, and these levels are the same as for defamation (actual malice for public people, and negligence for private people). This means that authors have much more leeway to share private information about public people than about private people. And the “public concern” piece provides even more protection for speech about public people. 

Right of Publicity and Identity Rights

The third section in the Writing About Real People guide concerns the right of publicity and unauthorized use of identity. Violations of the right of publicity, or unauthorized uses of identity, can occur when someone uses another person’s identity in a way that is “exploitative” and derives a benefit from that use. Importantly for authors, this excludes merely writing about someone in a book, article, or other piece of writing. The right of publicity is mostly concerned with commercial uses, like using someone’s name or likeness to sell a product without permission, but it can also apply to non-commercial uses that are exploitative, like using someone’s identity to generate attention for a work. In most cases, the right of publicity involves uses of someone’s image or likeness rather than just evoking their identity in text, but this is not necessarily the case. This section might be informative for authors who want to use someone’s image on their book cover or evoke an identity in advertising, but most authors merely writing nonfiction text about a real person do not have to worry too much about the right of publicity. 

Practical Guidance

A final section in our guide covers practical guidance for authors on how to avoid legal liability for the causes of action discussed in the guide in ways that are simple to understand and implement. Using reliable research methods and sources, obtaining consent from subjects where that is practicable, and carefully documenting your research and sources can go a long way towards helping you avoid legal liability while still empowering you to write freely.

Authors Alliance Submits Comment to Copyright Office in Generative AI Notice of Inquiry

Posted November 3, 2023

We are pleased to announce that we have submitted a comment to the Copyright Office in response to its recent notice of inquiry (NOI) regarding how copyright law interacts with generative AI. In our comment, we shared our views on copyright and generative AI (which you can read about here) and the stories we heard from authors about how they are using generative AI to support their creative labors, research, and the mundane but important tasks involved in being a working author. The Office received over 10,000 comments in response to its NOI, showing the high level of interest in how copyright regulates AI-generated works and training data for generative AI. We hope the Office will appreciate our perspective as it considers policy interventions to address copyright issues involved in the use of generative AI by creators. You can read our full comment here, or at the bottom of this post. 

You can hear more about our comment, and about contributions from other commenters, at the Berkeley Center for Law and Technology virtual roundtable on Monday, November 13th, where Authors Alliance senior staff attorney Rachel Brooke will be a panelist. The event is free and open to the public, and you can sign up here. 


Since the Copyright Office issued an opinion letter on copyright in a graphic novel containing AI-generated images back in February, the debate about copyright and generative AI has grown to a near fever pitch. Authors Alliance has been engaged in these issues since the decision letter was released: we exist to support authors who want to leverage the tools available in the digital age to see their creations reach broad audiences and create innovative new works, and we see generative AI systems as one such tool that can support authors and authorship. We participated in the Copyright Office’s listening session on copyright issues in AI-generated textual works this spring, and were eager to further weigh in as the Copyright Office wades through the thorny issues involved. 

In late August, the Copyright Office issued a notice of inquiry, asking stakeholders to weigh in on a series of questions about copyright policy and generative AI. These were broken down into general questions, questions about training AI models, questions about transparency and recordkeeping, and various issues related to AI outputs—copyrightability, infringement, and labeling and identification. 

Our Comment

Our comment was devoted in large part to sharing the ways that authors are using generative AI systems and tools to support their creative labors and research. We heard from authors who used generative AI systems for ideation, late-stage editing, and generating text. We also learned that authors are using generative AI systems in ways we wouldn’t have anticipated, like creating books of prompts for other authors to use as inputs for generative AI systems. Generative AI has helped authors who don’t publish with conventional publishers create marketing copy and even generate book covers (despite the common adage, these are pretty important for attracting readers). We also heard from researchers using generative AI for literature reviews as well as to make their writing process more efficient so they can focus on doing the work of researching and innovating. Generative AI also has the potential to lower barriers to entry for scientific researchers who are not native English speakers but want to make contributions to scientific fields in which literature tends to be written in English. 

We also spent some time explaining our views on why the use of copyrighted materials in training datasets for AI models constitutes fair use and how fair use analysis applies when copyrighted materials are included in training datasets. The use of creative works in training datasets is a transformative one with a different purpose than the works themselves, regardless of whether the institutions that develop and deploy the models are commercial or nonprofit. And it’s highly unlikely that a generative AI system could harm the markets for the works in the training sets for the underlying models: a generative AI system is not a substitute for a book a reader is interested in reading, for example. We also explained that the market harm consideration (factor four in fair use analysis) should consider the effect of the use (using works as training data for AI models) on the market for the specific work in question (i.e., in an infringement action, the work that is alleged to have been infringed), and not the market for that author’s other works, similar works, or anything else.

Our comment also argued that new copyright legislation on AI—either to codify copyright’s human authorship requirement and explain how it applies to AI-generated content or to address other issues related to copyright and generative AI—is not warranted. AI systems, AI models, and the ways creators use them are still evolving. Copyright law is already highly flexible, having adapted to new technologies that weren’t anticipated when the copyright legislation itself was enacted. And legislating around nascent technologies can result in laws that are eventually ill-suited to deal with unexpected challenges that new technologies bring about (recall that the DMCA, which has faced a lot of criticism as a statute intended to regulate copyright online, was passed in 1998). We instead suggest that the Office stick with a “wait and see” approach as generative AI and how we use it continue to develop rather than recommending legislation to Congress. 

Next, we explained why a licensing system for copyrighted works in AI training data is neither desirable nor practicable. Because we consider the use of copyrighted works in training data to be a fair use, licenses are not necessary in the first place. We also explained the host of problems that either a compulsory licensing regime or a collective licensing scheme would bring about. The large size of datasets for training AI models makes it difficult to envision systematically seeking licenses for each and every copyrighted work in the training dataset, and the “orphan works problem” means that a majority of rightsholders might never be found. It’s also not clear who would administer licensing under a licensing regime, and we could not think of any appropriate party that exists or is likely to emerge. The Office’s past failed investigations into possible collective rights management organizations (or CMOs) only underscore this point. 

Finally, we echoed our support for the substantial similarity test as a way to handle generative AI outputs that look very similar to existing copyrighted works. The substantial similarity test has been around for decades and has been applied across the country in a variety of contexts. It seems to us to be a good way to approach the rare cases in which generative AI outputs are strikingly similar to copyrighted works (so-called “memorization”) such that a rightsholder might sue for infringement. 

What’s Next?

The same day we submitted our comment, the Biden Administration released an executive order on “Safe, Secure, and Trustworthy Artificial Intelligence,” directing federal agencies to take a variety of measures to ensure that the use of generative AI is not harmful to innovation, privacy, labor, and more. Then on Wednesday, representatives from a coalition of countries (including the U.S.) signed “The Bletchley Declaration” following an AI Safety Summit in the U.K., warning of the dangers of generative AI and pledging to work together to find solutions. All of this is to say that how public policy should regulate generative AI, and whether and how the law needs to change to accommodate it, is a live issue that continues to evolve every day. Dozens of lawsuits are pending about the interaction between copyright and the use of generative AI systems, and as these cases move through the courts, judges will have their opportunity to weigh in. As ever, we will keep our readers and members apprised of any new legal developments around copyright and generative AI.