
Assessing the U.S. Copyright Small Claims Court After One Year

Posted September 18, 2023

Authors Alliance members will recall the series of posts we’ve made about the United States’s new copyright small claims court. The below is a post by Dave Hansen and Authors Alliance member Katie Fortney, based on a forthcoming article we recently posted assessing how this court has fared in its first year of operations. This post was originally published on the Kluwer Copyright Blog.

In June 2023, the U.S. Copyright Office celebrated the one-year anniversary of operations of the Copyright Claims Board (“CCB”), a new small claims court housed within the agency with a budget request for $2.2 million in ongoing yearly costs. Though not entirely unique (e.g., the UK’s Intellectual Property Enterprise Court has been described as filling a similar role since 2012), the CCB has been closely watched and hotly debated (see here, here, and here).

The CCB was preceded by years of argument about the benefits and risks of such a small claims court.  Proponents argued that the CCB would offer rightsholders a low-cost, efficient alternative to litigation in federal courts (which can easily cost over $100,000 to litigate), allowing small creators to more effectively defend their rights. Opponents feared that the CCB would foster abuse, encouraging frivolous lawsuits while creating a trap for unwary defendants.

We set out to assess these arguments in light of data on the CCB’s first year of operation. Our analysis is explored in more detail in our article here, forthcoming in the Journal of the Copyright Society of the USA, and the data used for the article is available here. This post summarizes that article, which is based on an empirical review of the CCB’s first year of operations using data extracted from the CCB’s online filing system for the 487 claims filed with the court between June 2022 and June 2023.

How the CCB Works

To assess the work of the CCB, it’s first important to understand how the new court works. For claimants to successfully pursue a claim, they must first pass three hurdles:

  • their claim must be compliant, which means that it must include some key information regarding, e.g., ownership of a copyright, access to the work by the respondent in order to copy it, and substantial similarity between the allegedly infringing copy and the original;
  • their claim must also be properly served or delivered to the respondent, following the specific procedures that the Copyright Office has established;
  • the claimant must wait 60 days to see if the respondent decides to opt out of the proceedings (in which case the claimant can refile in the more expensive but more robust federal district court).

Once the opt-out window has passed, the proceeding becomes “active” and a scheduling order is issued. Then the parties can engage in discovery, have hearings and conferences, and eventually receive a final determination where the CCB may award damages.

CCB By the Numbers

In the first year of the CCB, 487 claims were filed. However, only 43 of these 487 claims (less than 9%) had been issued scheduling orders and made it to the active phase by June 15, 2023.

Meanwhile, 302 cases had been closed, most of them dismissed without prejudice (meaning the case did not reach the merits and the claimant could choose to file again). The remaining claims were either awaiting review by the CCB, or waiting for an action from the claimant like filing an amended claim or filing proof of service.
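For readers who want to check the proportions quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustrative calculation using the three figures stated in this post (487 claims filed, 43 active, 302 closed); the helper name `share_of_total` is our own invention and is not drawn from the article or its dataset.

```python
# Figures quoted in this post for the CCB's first year (as of June 15, 2023).
TOTAL_CLAIMS = 487   # claims filed
ACTIVE_CLAIMS = 43   # issued a scheduling order and reached the active phase
CLOSED_CLAIMS = 302  # closed, most of them dismissed without prejudice


def share_of_total(count, total=TOTAL_CLAIMS):
    """Return count as a percentage of total (hypothetical helper)."""
    return 100 * count / total


active_pct = share_of_total(ACTIVE_CLAIMS)
closed_pct = share_of_total(CLOSED_CLAIMS)

print(f"Active: {active_pct:.1f}% of claims filed")
print(f"Closed: {closed_pct:.1f}% of claims filed")
```

The active share comes out just under 9%, consistent with the figure quoted above. The sketch does not try to derive the number of pending claims, because the post does not say whether any claims are counted as both active and closed.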

Though the CCB gives claimants multiple opportunities to amend their complaint to fix problems with it (even offering detailed and helpful suggestions on how to fix those problems), over 150 claims were dismissed because the claimant did not file a proper claim. Failures to state facts sufficient to support Access and Substantial Similarity were common problems, each showing up about 110 times in CCB orders to amend (sometimes in the same order to amend). In some cases, however, there was no way to fix the complaint. For example, 35 claims were trying to pursue cases against foreign respondents, over whom the CCB has no jurisdiction. And over 100 claims were copyright infringement claims where the claimant hadn’t filed for copyright registration of the work allegedly infringed (a prerequisite to filing).

Claimants also had problems with service: 60 claims were dismissed in the first year because claimants didn’t file valid proof of service. Finally, opt-out (which some proponents of the CCB feared would undermine the court) is an important but much smaller pathway out of the CCB: it accounted for 35 dismissals.

Because copyright is technical and complicated, it may not be surprising to find that having a lawyer helps avoid dismissal: 90% of claims from represented claimants had been certified as compliant; for claims from self-represented claimants, only 46% were compliant. Unrepresented claimants account for over 70% of claims filed, but only 40% of those that make it to the active phase.

Looking more closely at the claimants themselves, we do see that the CCB system is being used by aggressive and prolific copyright litigants, but we haven’t seen the volume of copyright-troll litigation seen in the past in federal district courts. This may be in part because the Copyright Office took these concerns seriously and created rules to discourage such litigation, such as limiting the number of claims a claimant can file within one year. The number of repeat filers was low – only nine filers had more than five claims. Those include, however, 17 claims filed by Higbee and Associates (sometimes referred to as a “troll” though the label may not exactly fit), and 20 by David C. Deal (another known and aggressive serial copyright litigant). And the only case in which the CCB had issued an order was in favor of David Oppenheimer, who has separately filed more than 170 copyright suits in federal courts.

Because the process has been so slow, it’s difficult to evaluate how the CCB is working for respondents. Opponents of the CCB feared that its ability to make default determinations (issuing monetary awards when the respondent never shows up) could be a trap for the unwary. The CCB has issued only two such determinations so far (both in August 2023, for $3,000 each), and only one final determination that wasn’t the result of a default, withdrawal, or settlement. So, it’s too early to tell how common defaults will be. However, they will continue to be an issue to watch: in the first year, respondents were as likely to end up on the path to default as they were to participate in a proceeding.

Our Takeaways and Conclusion

On the one hand, we haven’t seen rampant abuse of the system. To be sure, serial copyright litigants are actively using the CCB, but in numbers far fewer than previously seen even in federal district court. And damage awards have been modest.

However, it also seems that the CCB has not achieved its promised efficiency for small litigants–for most claimants the system seems to be too complicated and slow, with the CCB only issuing a final determination in a single case in its entire first year, and the vast majority of claims dismissed for failure to adequately comply with CCB rules. The CCB has already gone to great lengths to explain the process and to help claimants correct errors early in the process. It may be hard for the CCB to adjust its rules to lower barriers unless it is willing to sacrifice basic procedural safeguards for respondents (something we think it should not do). Despite the hope of advocates and legislators and the admirable efforts of those working at the CCB, the early results lead us to think that it may just be that complex copyright disputes are ill-suited for a self-service small claims tribunal.

Coalition Letter to Congress on Copyright and AI

Posted September 11, 2023

Photo by Chris Grafton on Unsplash

Earlier today Authors Alliance joined a broad coalition of public interest organizations, creators, academics, and others in a letter to members of Congress urging caution when considering proposals to revamp copyright law to address concerns about artificial intelligence. As we explained previously, we believe that copyright law currently has the appropriate tools needed to both protect creators and encourage innovation.  

As the letter states, the signatories share a common interest in ensuring that artificial intelligence meets its potential to enrich the American economy, empower creatives, accelerate the progress of science and useful arts, and expand humanity’s overall welfare. Many creators are already using AI to conduct innovative new research, address long-standing questions, and produce new creative works. Some, such as some of these artists, have used this technology for many years.

So our message is simple: existing copyright doctrine has evolved and adapted to accommodate many revolutionary technologies, and is well equipped to address the legitimate concerns of creators. Our courts are the proper forum to apply those doctrines to the myriad fact patterns that AI will present over the coming years and decades. 

You can read the full letter here. 

Current copyright law isn’t perfect, and we certainly believe creativity and innovation would benefit from some changes. However, we should be careful about reactionary, alarmist politics. It seldom makes for good law. Unfortunately, that’s what we’re seeing right now with AI, and we hope that Congress has the wisdom to see through it. 

We encourage you, our members, to reach out to your own Congressional representative to express the need to tread carefully and, if you are using AI in your work, to explain how. We’d also be very happy to hear from you as we develop our own further policy communications to Congress and to agencies such as the U.S. Copyright Office.

Authors Alliance and Allies Petition to Renew and Expand Text Data Mining Exemption

Posted September 6, 2023
Photo by Alina Grubnyak on Unsplash

Authors Alliance is pleased to announce that in recent weeks, we have submitted petitions to the Copyright Office requesting that it recommend renewing and expanding the existing text data mining exemptions to DMCA liability. The expansion would make the current legal carve-out that enables text and data mining more flexible, so that researchers can share their corpora of works with other researchers who want to conduct their own text data mining research. On each of these petitions, we were joined by two co-petitioners, the American Association of University Professors and the Library Copyright Alliance. These were short filings (requesting changes and providing brief explanations) and will be the first of many in our efforts to obtain a renewal and expansion of the existing TDM exemptions.


The Digital Millennium Copyright Act (DMCA) includes a provision that forbids people from bypassing technical protection measures on copyrighted works. But it also implements a triennial rulemaking process whereby organizations and individuals can petition for temporary exemptions to this rule. The Office recommends an exemption when its proponents show that they, or those they represent, are “adversely affected in their ability to make noninfringing [fair] uses due to the prohibition on circumventing access controls.” Every three years, petitioners must ask the Office to renew existing exemptions in order for them to continue to apply. Petitioners can also ask the Office to recommend expanding an existing exemption, which requires the same filings and procedure as petitioning for a new exemption. 

Back in 2020, during the eighth of these triennial rulemakings, Authors Alliance—along with the Library Copyright Alliance and the American Association of University Professors—petitioned the Copyright Office to create an exemption to DMCA liability that would enable researchers to conduct text and data mining. Text and data mining is a fair use, and the DMCA prohibitions on bypassing DRM and similar technical protection measures made it difficult or even impossible for researchers to conduct text and data mining on in-copyright e-books and films. After a long process which included filing a comment in support of the exemption and an ex parte meeting with the Copyright Office, the Office ultimately recommended that the Librarian of Congress grant our proposed exemption (which she did). The Office also recommended that the exemption be split into two parts, with one exemption addressing literary works distributed electronically, and the other addressing films. 

While the ninth triennial rulemaking does not technically happen until 2024, petitions for renewals, expansions, and new exemptions have already been filed. 

Our Petitions

Back in early July, we made our first filings with the Copyright Office in the form of renewal petitions for both exemptions. For this step, proponents of current exemptions simply ask the Copyright Office to renew them for another three-year cycle, accompanied by a short explanation of whether and how the exemption is being used and a statement that neither law nor technology has changed such that the exemption is no longer warranted. Other parties are then given an opportunity to respond to or oppose renewal petitions. The Office recommends that exemption proponents who want to expand a current exemption also petition for its renewal, which is just what we did. In our renewal petitions, we explained how researchers are using the exemptions and how neither recent case law nor the continued availability of licensed TDM databases represents a change in the law or technology, making renewal of the TDM exemptions proper and justified. The renewal petitions follow a streamlined process, where they are generally simply granted unless the Office finds there to be “meaningful opposition” to a renewal petition, articulating a change in the law or facts. You can find our renewal petition for the literary works TDM exemption here, and our renewal petition for the film TDM exemption here.

But we also sought to expand the current exemptions, in two petitions submitted a few weeks back. In our expansion petitions, we proposed a simple change that we would like to see made to the current DMCA exemptions for text data mining. In the exemption’s current form, academic researchers can bypass technical protection measures to assemble a corpus on which to conduct TDM research, but they can only share it with other researchers for purposes of “collaboration and verification.” We asked the Office to permit these researchers to share their corpora with other researchers who want to use the corpus to conduct TDM research, but are not direct collaborators. However, this second group of researchers would still have to comply with the various requirements of the exemption, such as complying with security measures. Essentially, we seek to expand the sharing provision of the current exemption while leaving the other provisions intact. This is largely based on feedback we have received from those using the exemption and our understanding of how the regulation can be improved so that their desired noninfringing uses are no longer adversely affected by this limitation. You can find our expansion petition for the literary works TDM exemption here, and our expansion petition for the film TDM exemption here.

What’s Next?

The next step in the triennial rulemaking process is the Copyright Office issuing a notice of proposed rulemaking, where it will lay out its plan of action. While we do not have a set timeline for the notice of proposed rulemaking, during the last rulemaking cycle it happened in mid-October, meaning it is reasonable to expect the Office to issue this notice in the next two months or so. Then, there will be several rounds of comments in support of or in opposition to the proposals. Finally, the Office will issue a final recommendation, and the Librarian of Congress will issue a final rule. While the Librarian of Congress is not legally obligated to adopt the Copyright Office’s recommendations, the Librarian traditionally does. Based on the last cycle, we can expect a final rule to be issued around October 2024. So we are in for a long wait and a lot of work! We will keep our readers updated as the rulemaking moves forward.

Copyright and Generative AI: Our Views Today

Posted August 30, 2023
“Large copyright sign made of jigsaw puzzle pieces” by Horia Varlan is licensed under CC BY 2.0.

Authors Alliance readers will surely have noticed that we have been writing a lot about generative AI and copyright lately. Since the Copyright Office issued its decision letter on copyright registration in a graphic novel that included AI-generated images a few months back, many in the copyright community and beyond have struggled with the open questions around generative AI and copyright.

The Copyright Office has launched an initiative to study generative AI and copyright, and today issued a notice of inquiry to solicit input on the issues involved. The Senate Judiciary Committee has also held multiple hearings on IP rights in AI-generated works, including one last month focused on copyright. And of course there are numerous lawsuits pending over the legality of generative AI, based on theories ranging from copyright infringement to privacy to defamation. It’s also clear that there is little agreement about a one-size-fits-all rule for AI-generated works that applies across industries.

At Authors Alliance, we care deeply about access to knowledge because it supports free inquiry and learning, and we are enthusiastic about ways that generative AI can meaningfully further those ideals. In addition to all the mundane but important efficiency gains generative AI can assist with, we’ve already seen authors incorporate generative AI into their creative processes to produce new works. We’ve also seen researchers incorporate these tools to help make new discoveries. There are also some clear concerns: generative AI tools can, for example, make it easier to engage in fraud and deception and to perpetuate disinformation. There have been many calls for legal regulation of generative AI technologies in recent months, and we wanted to share our views on the copyright questions generative AI poses, recognizing that this is a still-evolving set of questions.

Copyright and AI

Copyright is at its core an economic regulation meant to provide incentives for creators to produce and disseminate new expressive works. Ultimately, its goal is to benefit the public by promoting the “progress of science,” as the U.S. Constitution puts it. Because of this, we think new technology should typically be judged by what it accomplishes with respect to those goals, and not by the incidental mechanical or technological means that it uses to achieve its ends. 

Within that context, we see generative AI as raising three separate and distinct legal questions. The first and perhaps most contentious is whether fair use should permit use of copyrighted works as training data for generative AI models. The second is how to treat generative AI outputs that are substantially similar to existing copyrighted works used as inputs for training data—in other words, how to navigate claims that generative AI outputs infringe copyright in existing works. The third question is whether copyright protection should apply to new outputs created by generative AI systems. It is important to consider these questions separately, and avoid the temptation to collapse them into a single inquiry, as different copyright principles are involved. In our view, existing law and precedent give us good answers to all three questions, though we know those answers may be unpalatable to different segments of a variety of content industries. 

Training Data and Fair Use

The first area of difficulty concerns the input stage of generative AI. Is the use of training data which includes copyrighted works a fair use, or does it infringe on a copyright owner’s exclusive rights in her work? The generative AI models used by companies like OpenAI and Stability AI (the company behind Stable Diffusion) are based on massive sets of training data. Much of the controversy around intellectual property and generative AI concerns the fact that these companies often do not seek permission from rights holders before training their models on works controlled by these rights holders (although some companies, like Adobe, are building generative AI models based on their own stock images, openly-licensed images, and public domain content). Furthermore, due to the size of the data sets and nature of their collection (often obtained via scraping websites), the companies that deploy these models do not make clear what works make up the training data. This question is controversial and highly debated in the context of written works, images, and songs. Some creators and creator communities in these areas have made calls for “consent, credit, and compensation” when their works are included in training data. The obstacle to that point of view is that, if the use of training data is a fair use, none of this is required, at least not by copyright.

We believe that the use of copyrighted works as training data for generative AI tools should generally be considered fair use. We base this view on our reading of numerous fair use precedents, including the Google Books and HathiTrust cases as well as others such as iParadigms. These and other cases support the idea that fair use allows for copying for non-expressive uses: copying done as an “intermediate step” in producing non-infringing content, for example by extracting non-expressive content such as patterns, facts, and data in or about the work. The notion that non-expressive (also called “non-consumptive”) uses do not infringe copyrights is based in large part on a foundational principle in copyright law: copyright protection does not extend to facts or ideas. If it did, copyright law would run the risk of limiting free expression and inhibiting the progress of knowledge rather than furthering it. Using in-copyright works to create a tool or model with a new and different purpose from the works themselves, which does not compete with those works in any meaningful way, is a prototypical fair use. Like the Google Books project (as well as text data mining), generative AI models use data (like copyrighted works) to produce information about the works they ingest, including abstractions and metadata, rather than replicating expressive text.

In addition, fair use of copyrighted works as training data for generative AI has several practical implications for the public utility of these tools. For example, without it, AI could be trained on only “safe materials,” like public domain works or materials specifically authorized for such use. Models already contain certain filters, often excluding hateful content or pornography from their training sets. However, a more general limit on copyrighted content (virtually all creative content published in the last one hundred years) would tend to amplify bias and the views of an unrepresentative set of creators.

Generative AI Outputs and Copyright Infringement

The feature that most distinguishes generative AI from technology in copyright cases that preceded it, such as Google Books and HathiTrust, is that generative AI ingests copyrighted works not only for the purpose of extracting data for analysis or search functionality, but also to use this extracted data to produce new content. Can content produced by a generative AI tool infringe on existing copyrights?

Some have argued that the use of training data in this context is not a fair use, and is not truly a “non-expressive use” because generative AI tools produce new works based on data from originals and because these new works could in theory serve as market competitors for works they are trained on. While it is a fair point that generative AI is markedly different from those earlier technologies because of these outputs, the point also conflates the question of inputs and outputs. In our view, using copyrighted works as inputs to develop a generative AI tool is generally not infringement, but this does not mean that the tool’s outputs can’t infringe existing copyrights.

We believe that while the use of inputs as training data is largely justifiable as fair use, it is entirely possible that certain outputs may cross the line into infringement. In some cases, a generative AI tool can fall into the trap of memorizing inputs such that it produces outputs that are essentially identical to a given input. While evidence to date indicates that memorization is rare, it does exist.

So how should copyright law address outputs that are essentially memorized copies of inputs? We think the law already has the tools it needs to address this. Where fair use does not apply, copyright’s “substantial similarity” doctrine is equipped to handle the question of whether a given output is similar enough to an input to be infringing. The substantial similarity doctrine is appropriately focused on protection of creative expression while also providing room for creative new uses that draw on unprotectable facts or ideas. Substantial similarity is nothing new: it has been a part of copyright infringement analysis for decades, and is used by federal courts across the country. And it may well be that standards, such as a set of “Best Practices for Copyright Safety for Generative AI” proposed by law professor Matthew Sag, will become an important measure for assessing whether companies offering generative AI have done enough to guard against the risk of their tools producing infringing outputs.

Copyright Protection of AI Outputs

A third major question is, what exactly is the copyright status of the outputs of generative AI programs: are they protected by copyright at all, and if so, who owns those copyrights? Under the Copyright Office’s recent registration guidance, the answer seems to be that there is no copyright protection in the outputs. This does not sit well with some generative AI companies or many creators who rely on generative AI programs in their own creative work. 

We generally agree with the Copyright Office’s recent guidance concerning the copyright status of AI-generated works, and believe that they are unprotected by copyright. This is based on the simple but enduring “human authorship” requirement in copyright law, which dates back to the late 19th century. In order to be protected by copyright, a work must be the product of a human author and contain a modicum of human creativity. Purely mechanical processes that occur without meaningful human creative input cannot generate copyrightable works. The Office has categorized generative AI models as this kind of mechanical tool: the output responds to the human prompt, but the human making the prompt does not have sufficient control over how the model works to make them an “author” of the output for the purposes of copyright law. The district court for D.C. recently issued a decision agreeing with this take in Thaler v. Perlmutter, a case that challenged the human authorship requirement in the context of generative AI. 

It’s interesting to note here that in the Copyright Office listening session on text-based works, participants nearly universally agreed that outputs should not be protected by copyright, agreeing with the Copyright Office’s guidance. Yet the other listening sessions had more of a diversity of views. In particular, the participants in the listening sessions on audiovisual works and sound recordings were concerned about this issue. In industries like the music and film industries, where earlier iterations of generative AI tools have long been popular (or are even industry norms), the prospect of being denied copyright protection in songs or films, simply due to the tools used, can understandably be terrifying for creators who want to make a profit from their works. On this front, we’re sympathetic. Creators who rely on their copyrights to defend and monetize their works should be permitted to use generative AI as a creative tool without losing that protection. While we believe that the human authorship requirement is sound, it would be helpful to have more clarity on the status of works that incorporate generative AI content. How much additional human creativity is needed to render an AI-generated work a work of human authorship, and how much can a creator use a generative AI tool as part of their creative process without forgoing copyright protection in the work they produce? The Copyright Office seems to be grappling with these questions as well and seeking to provide additional guidance, such as in a recent webinar with more in-depth registration guidance for creators relying on generative AI tools in their creative efforts.

Other Information Policy Issues Affecting Authors

Generative AI has generated questions in other areas of information policy beyond the copyright questions we discuss above. Fraudulent content or disinformation, the harm caused by deep fakes and soundalikes, defamation, and privacy violations are serious problems that ought to be addressed. Those uses do nothing to further learning, and actually pollute public discourse rather than enhance it. They can also cause real monetary and reputational harm to authors. 

In some cases, these issues can be addressed by information policy doctrines outside of copyright, and in others, they can be best handled by regulations or technical standards addressing development and use of generative AI models. A sound application of state laws such as defamation law, right of publicity laws, and various privacy torts could go a long way towards mitigating these harms. Some have proposed that the U.S. implement new legislation to enact a federal right of publicity. This would represent a major change in law and the details of such a proposal would be important. Right now, we are not convinced that this would serve creators better than the existing state laws governing the right of publicity. While it may take some time for courts to figure out how to adapt legal regimes outside of copyright to questions around generative AI, adapting the law to new technologies is nothing new. Other proposals call for regulations like labeling AI-generated content, which could also be reasonable as a tool to combat disinformation and fraudulent content. 

In other cases, creators’ interests could be protected through direct regulation of the development and use of generative AI models. For example, certain creators’ desire for consent, credit, and compensation when their works are included in training data sets for generative AI programs is an issue that could perhaps be addressed through regulation of AI models. As for consent, some have called for an opt-out system where creators could have their works removed from the training data, or the deployment of a “do not train” tag similar to the robots.txt “do not crawl” tag. As we explain above, under the view that training data is generally a fair use, this is not required by copyright law. But the view, held by many, that using copyrighted training data without some sort of recognition of the original creator is unfair may support arguments for other regulatory or technical approaches that would encourage attribution and pathways for distributing new revenue streams to creators.

Similarly, some have called for collective licensing legislation for copyrighted content used to train generative AI models, potentially as an amendment to the Copyright Act itself. We believe that this would not serve the creators it is designed to protect and we strongly oppose it. In addition to conflicting with the fundamental principles of fair use and copyright policy that have made the U.S. a leader in innovation and creativity, collective licensing at this scale would be logistically infeasible and ripe for abuse, and would tend to enrich established, mostly large rights holders while leaving out newer entrants. Similar efforts several years ago were proposed and rejected in the context of mass digitization based on similar concerns.  

Generative AI and Copyright Going Forward

What is clear is that the copyright framework for AI-generated works is still evolving, and just about everyone can agree on that. Like many individuals and organizations, our views may well shift as we learn more about the real-world impacts of generative AI on creative communities and industries. It’s likely that as these policy discussions continue to move forward and policymakers, advocacy groups, and the public alike grapple with the open questions involved, the answers will continue to develop. Changes in generative AI technology and the models involved may also influence these conversations. Today, the Copyright Office issued a notice of inquiry on the topic of copyright in AI-generated works. We plan to submit a comment sharing our perspective, and are eager to learn about the diversity of views on this important issue.

Copyright Protection in AI-Generated Works Update: Decision in Thaler v. Perlmutter

Posted August 24, 2023

Last week, the District Court for the District of Columbia announced a decision in Thaler v. Perlmutter, a case challenging copyright’s human authorship requirement in the context of a work produced by a generative AI program. This case is one of many lawsuits surrounding copyright issues in generative AI, and surely will not be the last we hear about the copyrightability of AI-generated works, and how this interacts with copyright’s human authorship requirement. In today’s post, we’ll provide a quick summary of the case and offer our thoughts about what this means for authors and other creators.


Back in 2018 (before the current public debate about copyright and generative AI had reached the fever pitch we see today), Dr. Stephen Thaler applied for copyright registration in a work of visual art produced by a generative AI system he created, called the Creativity Machine. Thaler sought to register his work as a computer-generated “work-made-for-hire” since he created the machine, which “autonomously” produced the work. After a lot of back and forth, the Copyright Office maintained its denial of the application, explaining that the human authorship requirement in copyright law foreclosed protection for the AI-generated work, since it was not the product of a human’s creativity.

Thaler then sued Shira Perlmutter, the Register of Copyrights, in the D.C. district court, asking the court to decide “whether a work autonomously generated by an AI system is copyrightable.” Judge Beryl A. Howell upheld the Copyright Office’s decision, explaining that under the plain language of the Copyright Act, “an original work of authorship” required that the author be a human “based on centuries of settled understanding” and a dictionary definition of “author.” She also cited the U.S. Constitution’s IP clause, which similarly mentions “authors and inventors,” and over a century of Supreme Court precedent to support this principle.

Thaler’s attorney has indicated that he will be appealing the ruling to the D.C. Circuit Court of Appeals, and it remains to be seen whether that court will affirm the ruling. 

Implications for copyright law

The headline takeaway from this ruling is that AI-generated art is not copyrightable because human authorship remains a requirement in copyright law. However, the ruling is actually more nuanced and contains a few subtle points worth highlighting. 

For one, this case tested not just the human authorship requirement but also the application of the work-for-hire doctrine in the context of generative AI. On one view of the issues, if Thaler created a machine capable of creating a work that would be copyrightable were it created by a human, there is a certain appeal in framing the work as one commissioned by Thaler. On this point, the court explained that since there was no copyright in the work in the first instance based on its failure to meet the human authorship requirement, this theory also did not hold water. In other words, a work-made-for-hire requires that the “hired” creator also be a human. 

It’s important to keep in mind that Thaler was in a sense testing the reach of the limited or “thin” copyright that can be granted in compilations of AI-generated work, or AI-generated work that a human has altered, thus endowing it with at least a modicum of human creativity as copyright requires. Thaler made no changes to the image produced by his Creativity Machine, and in fact, described the process to the Copyright Office as fully autonomous rather than responding to an original prompt (as is generally the case with generative AI). Thaler was not trying to get a copyright in the work in order to monetize it for his own livelihood, but—presumably—to explore the contours of copyright in computer-generated works. In other words, the case has some philosophical underpinnings (and in fact, Thaler has said in interviews that he believes his AI inventions to be sentient, a view that many of us tend to reject). But for creators using generative AI who seek to register copyrights in order to benefit from copyright protection, things are unlikely to be quite so clear-cut. And while she found the outcome to be fairly clear-cut in this case, Judge Howell observed:

“The increased attenuation of human creativity from the actual generation of the final work will prompt challenging questions regarding how much human input is necessary to qualify the user of an AI system as an ‘author’ of a generated work, the scope of the protection obtained over the resultant image, how to assess the originality of AI-generated works where the systems may have been trained on unknown pre-existing works, how copyright might best be used to incentivize creative works involving AI, and more.”

What does this all mean for authors? 

For authors who want to incorporate AI-generated text or images into their own work, the situation is a bit murkier than it was for Thaler. The case itself provides little in the way of information for human authors who use generative AI tools as part of their own creative processes. But while the Copyright Office’s registration guidance tells creators what they need to do to register their copyrights, this decision provides some insight about what will hold up in court. Courts can and do overturn agency actions in some cases (in this case, the judge could have overturned the Copyright Office’s denial of Thaler’s registration application had she found it to be “arbitrary and capricious”). So the Thaler case in many ways affirms what the Copyright Office has said so far about registrability of AI-generated works, indicating that the Office is on the right track in its approach to copyright in AI-generated works, at least for now. 

The Copyright Office has attempted to provide more detailed guidance on copyright in “AI-assisted” works, but a lot of confusion remains. One guideline the Office promulgated in a recent webinar on copyright registration in works containing AI-generated material is for would-be registrants to disclose the contribution of an AI system when its contribution is more than “de minimis,” i.e., when the AI-generated creation would be entitled to copyright protection if it were created by a human. This means that using an AI tool to sharpen an image doesn’t require disclosure, but using an AI tool to generate one part of an image does. An author will then receive copyright protection in only their contributions to the work and the changes they made to the AI-generated portions. As Thaler shows, an author must make some changes to an AI-generated work in order to receive any copyright protection at all in that work.

All of this means, broadly speaking, that the more an author changes an AI-generated work—such as by using tools like Photoshop to alter an image or by editing AI-generated text—the more likely it is that the work will be copyrightable, and, by the same token, the less “thin” any copyright protection in the work will be. While there are open questions about how much creativity is required from a human in order to transform an AI-generated work into a copyrightable work of authorship, this case has underscored that at least some creativity is required—and using an AI tool that you yourself developed to create the work does not cut it. 

The way Thaler framed his Creativity Machine as the creator of the work in question also shows that it is important to avoid anthropomorphizing AI systems—just as the court rejected the notion of an AI-generated work being a work-made-for-hire, a creative work with both generative AI and human contributions probably could not be registered as a “co-authored” work. Humans are predisposed to attribute human characteristics to non-humans, like our pets or even our cars, a phenomenon which we have seen repeatedly in the context of chatbots. Regardless, it’s important to remember that a generative AI program is a tool based on a model. And thinking of generative AI programs as creators rather than tools can distract us from the established and undisturbed principle in copyright law that only a human can be considered an author, and only a human can hold a copyright. 

Open Access and University IP Policies in the United States

Posted August 18, 2023

Perhaps the most intuitive statement in the whole of the U.S. Copyright Act is this: “Copyright in a work protected under this title vests initially in the author . . . .” Of course authors are the owners of the copyright in their works. 

In practice, however, control over copyrighted works is often more complicated. When it comes to open access scholarly publishing, the story is particularly complicated because the default allocation of rights is often modified by a complex series of employment agreements, institutional open access policies, grant terms, relationships (often not well defined) between co-authors, and of course the publishing agreement between the author and the publisher. Because open access publishing is so dependent on those terms, it’s important to have a clear understanding of who holds what rights and how they can exercise them.

Work for Hire and the “Teacher Exception”

First, it’s important to figure out who owns rights in a work when it’s first created. For most authors, the answer is pretty straightforward. If you’re an independent creator, you as the author generally own all the rights under copyright. If co-authors create a joint work (e.g., co-author an article), they both hold rights and can freely license that work to others, subject to an accounting to each other. 

If, however, you work for a company and create a copyrighted work in the scope of your employment (e.g., I’m writing this blog post as part of my work for Authors Alliance) then at least in the United States, the “work for hire” doctrine applies and, the law says, “the employer or other person for whom the work was prepared is considered the author.” For people who aren’t clearly employees, or who are commissioned to make copyrighted works, whether their work is considered “work for hire” can sometimes be complicated, as illustrated in the seminal Supreme Court case CCNV v. Reid, addressing work for hire in the context of a commissioned sculpture.  

For employees of colleges or universities who create scholarly works, the situation is a little more complicated because of a judicially developed exception to the work-for-hire doctrine known as the “teacher exception.” In a series of cases in the mid-20th century, the courts articulated an exception, for teachers and educators, to the general rule that creative works produced within the scope of one’s employment are owned by the employer. Those cases each have their own peculiar facts, however, and most significantly, they predated the 1976 Copyright Act, which was a major overhaul of U.S. copyright law. Whether the “teacher exception” continues to survive as a judge-made doctrine is highly contested. Despite the massive number of copyrighted works authored by university faculty after the 1976 Act (well over a hundred million scholarly articles alone, not to mention books and other creative works), we have seen very few cases addressing this particular issue.  

There are a number of law review articles and books on the subject. Among the best, I think, is Professor Elizabeth Townsend-Gard’s thorough and worthwhile article. She concludes, based on a review of past and modern case law, that the continued survival of the teacher exception is tenuous at best: 

“The teacher exception was established under the 1909 act by case law, but because the 1976 act did not incorporate it, the “teacher exception” was subsumed by a work-for-hire doctrine that the Supreme Court’s definition of employment in CCNV v. Reid places teachers’ materials under the scope of employment. Thus the university-employers own their original creative works. No court has decided whether the “teacher exception” survived Reid, but the Seventh Circuit in Weinstein, decided two years before Reid, had already transferred the “teacher exception” from a case-based judge made law to one dictated by university policy.”

University Copyright and IP policies

Whatever the default initial allocation of copyright ownership, authors of all types must also understand how other agreements may modify control and exercise of copyright. These policies can be somewhat difficult to untangle because there actually may be layers of agreements or policies that cross reference each other and are buried deep within institutional policy handbooks. 

For academic authors, this collection of agreements typically includes something like an employee handbook or academic policy manual, which will include policies that all university employees must agree to as a condition of employment. Typically, that will include a policy on copyright or intellectual property. Regardless of whether the teacher exception or work-for-hire applies, these agreements can override that default allocation of rights and transfer them, either from the creator to the university or from the university to the creator. 

These policies differ significantly in the details, but most university IP policies choose to allocate all or substantially all rights under copyright to individual creators of scholarly works, notwithstanding the potential application of the work for hire doctrine. In other words, even though copyright in faculty scholarly works may initially be held by the university, through university policy those rights are mostly handed over to individual creators. The net effect is that most university IP policies treat faculty as the initial copyright holders even if the law isn’t clear that they actually are.

Some universities, like Duke University, say nothing about “work for hire” in their IP policies but merely “reaffirm[] its traditional commitment to the personal ownership of intellectual property rights in works of the intellect by their individual creators.” Others, like Ohio State, are similar, stating that copyright in scholarly works “remains” with their creators, but then also provide that “the university hereby assigns any of its copyrights in such works, insofar as they exist, to their creators,” which can act as a sort of savings clause to address circumstances in which there may be uncertainty about ownership by individual creators. 

Others, like Yale, are a little clearer about their stance on work-for-hire. Yale explains that “The law provides . . . that works created by faculty members in the course of their teaching and research, and works created by staff members in the course of their jobs, are the property of the University,” but then goes on to recognize that “[i]t is traditional at Yale and other universities, however, for books, articles and other scholarly writings by a faculty member to be deemed the property of the writer . . . . In recognition of that longstanding practice, the University disclaims ownership of works by faculty, staff, postdoctoral fellows and postdoctoral associates and students. . . .” Another example of a university taking a similar approach is the University of Michigan.

Carve outs and open access policies

Every university copyright or IP policy that I’ve seen includes some carve outs from the general rule that copyright will, one way or another, end up being held by individual creators. Almost universally, universities’ IP policies provide that the university will retain rights sufficient to satisfy grant obligations. Some universities’ IP policies simply provide that, for example, ownership shall be determined by the terms of the grant (see, for example, the University of California system policy). In other cases, however, university IP policy accomplishes compliance with grants by simply stating that all intellectual property of any kind (including copyright) created under a grant is owned by the university, full stop. This, therefore, gives the university sufficient authority to satisfy whatever grant obligations it may have. For example, the University of Texas system states that it will not assert ownership of copyright in scholarly works, but that proviso is subject to the limitation that “intellectual property resulting from research supported by a grant or contract with the government (federal and/or state) or an agency thereof is owned by the Board of Regents.” These kinds of broad ownership claw-backs raise some hard questions when it comes to publishing scholarly work. For example, when a UT author personally signs a publication agreement transferring copyright for an article that is the result of grant funding, do they actually hold the rights to make that transfer effective? 

For open access, these grant clauses are important because they are the operative terms through which the university complies with funder open access requirements. Sometimes, these licensing clauses lie somewhat dormant, with funders holding but not necessarily exercising the full scope of their rights. For example, even prior to the recent OSTP open access announcement, the government already reserved, for every article or other copyrighted work produced under a federal grant, a broad “royalty-free, nonexclusive and irrevocable right to reproduce, publish, or otherwise use the work for Federal purposes, and to authorize others to do so.” 

Some universities also retain a broad, non-exclusive license for themselves to make certain uses of faculty-authored scholarly work, even while providing that the creator owns the copyright. For example, Georgia Tech’s policy provides that individual creators own rights in scholarly works, but Georgia Tech retains a “fully paid up, universe-wide, perpetual, non-exclusive, royalty-free license to use, re-use, distribute, reproduce, display, and make derivative works of all scholarly and creative works for the educational, research, and administrative purposes of [Georgia Tech].” Others such as the University of Maryland are less specific, providing simply that although the individual creator owns rights to their work, “the University reserves the right at all times to exercise copyright in Traditional Scholarly Works as authorized under United States Copyright Law.” Those kinds of broad licenses would seem to give the university discretion to make use of scholarly work, including, I think, for open access uses should the university decide that such uses are desirable.

Finally, a growing number of universities have policies, enacted at the behest of faculty, that specifically provide rights to make faculty scholarship openly available. The “Harvard model” is probably the most common, or at least the most well known. These types of policies allocate a license to the university, to exercise on behalf of the individual creator, with the specific intent of making the work available free of charge. Often these policies will include special limitations (e.g., the university cannot sell access to the article) or allow for faculty to opt out (often by seeking a waiver). 

Pre-existing licenses and publishing agreements

The maze of policies and agreements can matter a great deal for the legal mechanics of effectively publishing an article openly. Of course in the scenario where authors hold rights themselves, they can retain sufficient rights through their publishing contract so they can make their work openly available, typically either via “green open access” by posting their own article to an institutional repository, or by “gold open access” directly from the publisher (though these are sometimes accompanied by a hefty article processing fee). Tools like the SPARC open access addendum are wonderful negotiating tools to ensure authors retain sufficient rights to achieve OA.

That works sometimes, but often publishing contracts come with unacceptably restrictive strings attached. Individual authors publishing with journals and publishers that have great market power often have little ability to negotiate for the OA terms that they would prefer. 

In these situations, a pre-existing license can be a major advantage for an author. For example, for authors who are writing under the umbrella of a Harvard-style open access policy, the negotiating imbalance with journals is leveled, at least in part because the journal knows that the university has a pre-existing OA license and also knows that although those policies often permit waivers, it’s not as easy as just telling the author “no” to claw that license back. The same is true about other forms of university pre-existing licenses that could be used to make a work available openly, such as the general licenses I mention above that are retained by Georgia Tech or Maryland. While these kinds of pre-existing licenses are seldom acknowledged in journal publishing agreements, sophisticated publishers with large legal teams are undoubtedly aware of them. Because of that, I think there are strong arguments that their publishing agreements with authors implicitly incorporate them (or, if not, good arguments that a publisher that does not recognize them is intentionally interfering with a pre-existing contractual relationship between the author and their university). Funder mandates, made effective through university IP policies, take the scenario a step further and force the issue: either the journal acquiesces or it doesn’t publish the paper at all. There is often no waiver option. Of course there are other pathways that both funders and journals may be willing to accept – many funders are willing to support OA publishing fees, and many journals will happily accept OA license terms for a price. 


Although the existing, somewhat messy, maze of institutional IP policies, publishing agreements, and OA policies can seem daunting, understanding their terms is important for authors who want to see their works made openly available. I’ll leave for another day to explore whether it’s a good thing that the rights situation is so complex. In many situations, rights thickets like these can be a real detriment to authors and access to their works. In this case the situation is at least nuanced such that authors are able to leverage pre-existing licenses to avoid negotiating away the bundle of rights they need to see their works made available openly. 

Prosecraft, text and data mining, and the law

Posted August 14, 2023

Last week you may have read about a website called Prosecraft, a site with an index of some 25,000 books that provided a variety of data about the texts (how long, how many adverbs, how much passive voice), along with a chart showing sentiment analysis of the works in its collection. It also displayed short snippets from the texts themselves: two paragraphs representing the most and least vivid passages from each text. Overall, it was a somewhat interesting tool, promoted to authors as a way to better understand how their work compares to other published works. 

The news cycle about Prosecraft was about the campaign to get its creator Benji Smith to take the site down (he now has) based on allegations of copyright infringement. A Gizmodo story about it generated lots of attention, and it’s been written up extensively, for example here, here, here, and here.  

It’s been written about enough that I won’t repeat the whole saga here. However, I think a few observations are worth sharing:  

1) Don’t get your legal advice from Twitter (or whatever it’s called)

“Fair Use does not, by any stretch of the imagination, allow you to use an author’s entire copyrighted work without permission as a part of a data training program that feeds into your own ‘AI algorithm.’”  – Linda Codega, Gizmodo (a sentiment that was retweeted extensively)

Fair use actually allows quite a few situations where you can copy an entire work, including situations when you can use it as part of a data training program (and calling an algorithm “AI” doesn’t magically transform it into something unlawful). For example, way back in 2002 in Kelly v. Arriba Soft, the 9th Circuit concluded that it was fair use to make full copies of images found on the internet for the purpose of enabling web image search. Similarly, in AV ex rel Vanderhye v. iParadigms, the 4th Circuit in 2009 concluded that it was fair use to make full text copies of academic papers for use in a plagiarism detection tool.  

Most relevant to prosecraft, in Authors Guild v. HathiTrust (2014) and Authors Guild v. Google (2015) the Second Circuit held that Google’s copying of millions of books for purposes of creating a massive search engine of their contents was fair use. Google produced full-text searchable databases of the works, and displayed short snippets containing whatever term the user had searched for (quite similar to prosecraft’s outputs). That functionality also enabled a wide range of computer-aided textual analysis, as the court explained: 

The search engine also makes possible new forms of research, known as “text mining” and “data mining.” Google’s “ngrams” research tool draws on the Google Library Project corpus to furnish statistical information to Internet users about the frequency of word and phrase usage over centuries.  This tool permits users to discern fluctuations of interest in a particular subject over time and space by showing increases and decreases in the frequency of reference and usage in different periods and different linguistic regions. It also allows researchers to comb over the tens of millions of books Google has scanned in order to examine “word frequencies, syntactic patterns, and thematic markers” and to derive information on how nomenclature, linguistic usage, and literary style have changed over time. Authors Guild, Inc., 954 F.Supp.2d at 287. The district court gave as an example “track[ing] the frequency of references to the United States as a single entity (‘the United States is’) versus references to the United States in the plural (‘the United States are’) and how that usage has changed over time.”

While there are a number of generative AI cases pending (a nice summary of them is here) that I agree raise some additional legal questions beyond those directly answered in Google Books, the kind of textual analysis that Prosecraft offered seems remarkably similar to the kinds of things that the courts have already said are permissible fair uses. 
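
To give a rough sense of how simple this sort of analysis can be, here is a minimal Python sketch. It is my own illustration, not Prosecraft's or Google's code: the function names, the sample text, and the use of words ending in "-ly" as a crude stand-in for adverbs are all assumptions made for the example. It computes a word count, an approximate adverb count, and the frequency of the two phrases the district court mentioned.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_phrase(tokens, phrase):
    """Count how many times a multi-word phrase appears in a token list."""
    target = phrase.lower().split()
    n = len(target)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target
    )


def text_stats(text):
    """Return a few descriptive statistics about a text (hypothetical helper)."""
    tokens = tokenize(text)
    return {
        "word_count": len(tokens),
        # Assumption: words ending in "ly" are a rough proxy for adverbs.
        "ly_words": sum(1 for t in tokens if t.endswith("ly")),
        "united_states_is": count_phrase(tokens, "the United States is"),
        "united_states_are": count_phrase(tokens, "the United States are"),
    }


# Made-up sample text, used only to exercise the functions above.
sample = (
    "The United States are a union of states. "
    "Today the United States is usually treated as singular, "
    "and writers rarely say otherwise."
)
stats = text_stats(sample)
print(stats)
```

The output here is a handful of numbers about the text, not a copy of the text itself, which is part of why this kind of analysis looks so much like the uses courts have already approved.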

2) Text and data mining analysis has broad benefits

Not only is text mining fair use, it also yields some amazing insights that truly “promote the progress of Science,” which is what copyright law is all about. Prosecraft offered some pretty basic insights into published books – how long, how many adverbs, and the like. I can understand opinions being split on whether that kind of information is actually helpful for current or aspiring authors. But text mining can reveal so much more. 

In the submission Authors Alliance made to the US Copyright Office three years ago in support of a Section 1201 Exemption permitting text data mining, we explained:

TDM makes it possible to sift through substantial amounts of information to draw groundbreaking conclusions. This is true across disciplines. In medical science, TDM has been used to perform an overview of a mass of coronavirus literature. Researchers have also begun to explore the technique’s promise for extracting clinically actionable information from biomedical publications and clinical notes. Others have assessed its promise for drawing insights from the masses of medical images and associated reports that hospitals accumulate. 

In social science, studies have used TDM to analyze job advertisements to identify direct discrimination during the hiring process. It has also been used to study police officer body-worn camera footage, uncovering that police officers speak less respectfully to Black than to white community members even under similar circumstances.

TDM also shows great promise for drawing insights from literary works and motion pictures. Regarding literature, some 221,597 fiction books were printed in English in 2015 alone, more than a single scholar could read in a lifetime. TDM allows researchers to “‘scale up’ more familiar humanistic approaches and investigate questions of how literary genres evolve, how literary style circulates within and across linguistic contexts, and how patterns of racial discourse in society at large filter down into literary expression.” TDM has been used to “observe trends such as the marked decline in fiction written from a first-person point of view that took place from the mid-late 1700s to the early-mid 1800s, the weakening of gender stereotypes, and the staying power of literary standards over time.” Those who apply TDM to motion pictures view the technique as every bit as promising for their field. Researchers believe the technique will provide insight into the politics of representation in the Network era of American television, into what elements make a movie a Hollywood blockbuster, and into whether it is possible to identify the components that make up a director’s unique visual style [citing numerous letters in support of the TDM exemption from researchers].

3) Text and data mining is not new and it’s not a threat to authors

Text mining of the sort it seemed prosecraft employed isn’t some kind of new phenomenon. Marti Hearst, a professor at UC Berkeley’s iSchool, explained the basics in this classic 2003 piece. Each year, scores of computer science students experiment in their courses with projects that do almost exactly what prosecraft was producing. Textbooks like Matt Jockers’s Text Analysis with R for Students of Literature have been widely used and adopted all across the U.S. to teach these techniques. Our submissions during our petition for the DMCA exemption for text and data mining back in 2020 included 14 separate letters of support from authors and researchers engaged in text data mining research, and even more researchers are currently working on TDM projects. While fears over generative AI may be justified for some creators (and we are certainly not oblivious to the threat of various forms of economic displacement), it’s important to remember that text data mining on textual works is not the same as generative AI. On the contrary, it is a fair use that enriches and deepens our understanding of literature rather than harming the authors who create it.

Update: Consent Judgment in Hachette v. Internet Archive

Posted August 11, 2023

UPDATE: On Monday, August 14th, Judge Koeltl issued an order on the proposed judgement, which you can read here, and which this blog post has been updated to reflect. In his order, the judge adopted the definition of “Covered Book” suggested by the Internet Archive, limiting the permanent injunction subject to an appeal to only those books published by the four publisher plaintiffs that are available in ebook form.

After months of deadline extensions, there is finally news in Hachette Books v. Internet Archive, the case about whether Controlled Digital Lending is a fair use, which we have been covering since its inception over two years ago, and in which Authors Alliance filed an amicus brief in support of Internet Archive and CDL. On Friday, August 11th, attorneys for the Internet Archive and a group of publishers filed documents in federal court proposing “an appropriate procedure to determine the judgment to be entered in this case,” as Judge John G. Koeltl of the Southern District of New York requested.

In a letter to the court, both parties indicated that they had agreed to a permanent injunction, subject to an appeal by IA, “enjoining the Internet Archive [] from distributing the ‘Covered Books’ in, from or to the United States electronically.” This means that the Internet Archive has agreed to stop distributing within the U.S. the books in its CDL collection which are published by the plaintiff publishers in the case (Hachette Book Group, HarperCollins, Wiley, and Penguin Random House), and are currently available as ebooks from those publishers. The publishers must also send IA a catalog “identifying such commercially available titles (including any updates thereto in the Plaintiffs’ discretion), or other similar form of notification,” and “once 14 days have elapsed since the receipt of such notice[,]” IA will cease distributing CDL versions of these works under the proposed judgment.

Open Questions

Last week’s proposed judgment did leave an open question, which Judge Koeltl was asked to decide before issuing a final judgment: should IA be enjoined from distributing CDL versions of books published by the four publishers where those books are available in any form, or should it only be enjoined from distributing CDL versions of these books that are available as ebooks? This difference may seem subtle, but it’s actually really meaningful. 

The publishers asked for a broader definition, whereby any of their published works that remain in print in any form are off the table when it comes to CDL. The publishers explain in a separate letter to the court that they believe that it would be consistent with the judgment to ban the IA from loaning out CDL versions of any of the commercially available books they publish, whatever the format. They argue that it should be up to the publishers whether or not to issue an ebook edition of the work, and that even when they decide not to do so (based on an author’s wishes or other considerations), IA’s digitization and distribution of CDL scans is still infringement. 

On the other hand, the Internet Archive asked the judge to confine the injunction to books published by the four publishers that are available as ebooks, leaving it free to distribute CDL scans of the publishers’ books that are in print, but only available as print and/or audio versions. It argues that to forbid it from lending out CDL versions of books with no ebook edition available would go beyond the matters at issue in the case: the judge did not decide whether it would be a fair use to loan out CDL versions of books only available in print, because none of the works that the publishers based the suit upon were available only as print editions. Furthermore, IA explains that other courts have found that the lack of availability of a competing substitute (in this case, an ebook edition) weighs in favor of fair use under the fourth factor, which considers market competition and market harm.

It seems to me that the latter position is much more sensible. In addition to CDL scans of books only available as physical books not being at issue in the case, the fair use argument for this type of lending is quite different. One of the main thrusts of the judge’s decision in the case was his argument that CDL scans compete with ebooks, since they are similar products, but this logic does not extend to competition between CDL scans and print books. This is because the markets for digital versions of books and analogue versions of books are quite different. Some readers strongly prefer print versions of books, and some turn to electronic editions for reasons of disability, physical distance from libraries or bookstores, or simple preference. While we believe that IA’s CDL program is a fair use, its case is even stronger when it comes to CDL loans of books that are not available electronically. 

Then on Monday, August 14th, Judge Koeltl issued an order and final judgment in the case, agreeing with the Internet Archive and limiting the injunction to books published by the four publishers which are available in ebook form. Again, this may seem minor, but I actually see it as a substantial win, at least for now. While even the more limited injunction is a serious blow to IA’s controlled digital lending program, it does allow IA to continue to fill a gap in available electronic editions of works. The judge’s primary reasoning was that books not available as ebooks were beyond the scope of what was at issue in the case, but he also mentioned that the factor four analysis could have been different were there no ebook edition available.

Limitations of the Proposed Judgment

Importantly, the parties also stipulated that this injunction is subject to an appeal by the Internet Archive. This means that if the Internet Archive appeals the judgment (which it has indicated that it plans to do), and the appeals court overturns Judge Koeltl’s decision, for example by finding that IA’s CDL program is a fair use, IA may be able to resume lending out those CDL versions of books published by the plaintiffs which are also available as ebooks. The agreement also does not mean that IA has to end its CDL program entirely: neither books published by other publishers nor books published by the publisher plaintiffs that are not available as ebooks are covered under the judge’s order.

What’s Next?

The filing represents the first step towards the Internet Archive appealing the court’s judgment. As we’ve said before, Authors Alliance plans to write another amicus brief in support of the Internet Archive’s argument that Controlled Digital Lending is a fair use. Now that the judge has issued his final judgment, IA has 30 days to file a “notice of appeal” with the district court. Then, the case will receive a new docket in the Second Circuit Court of Appeals, and the various calendar and filing processes will begin anew under the rules of that court. We will of course keep our readers apprised of further developments in this case.

Federal Right of Publicity Takes Center Stage in Senate Hearing on AI

Posted July 28, 2023

The Authors Alliance found this write-up by Professor Jennifer Rothman at the University of Pennsylvania useful and wanted to share it with our readers. You can find Professor Rothman’s original post on her website, Rothman’s Roadmap to the Right of Publicity, here.

By Jennifer Rothman

On July 12th, the Senate Judiciary Committee’s Subcommittee on Intellectual Property held its second hearing about artificial intelligence (AI) and intellectual property, this one meant to focus expressly on “copyright” law. Although copyright was mentioned many times during the almost two-hour session and written testimony considered whether the use of unlicensed training data was copyright infringement, a surprising amount of the hearing focused not on copyright law, but instead on the right of publicity.

Both senators and witnesses spent significant time advocating for new legislation—a federal right of publicity or a federal anti-impersonation right (what one witness dubbed the FAIR Act). Discussion of such a federal law occupied more of the hearing than predicted and significantly more time than was spent parsing either existing copyright law or suggesting changes to copyright law.

In Senator Christopher Coons’s opening remarks, he suggested that a federal right of publicity should be considered to address the threat of AI to performers. At the start of his comments, Coons played an AI-generated song about AI set to the tune of New York, New York in the vocal style of Frank Sinatra. Notably, Coons highlighted that he had sought and received permission to use both the underlying copyrighted material and Frank Sinatra’s voice.

In addition to Senator Coons, Senators Marsha Blackburn and Amy Klobuchar expressly called for adding a federal right of publicity. Blackburn, a senator from Tennessee, highlighted the importance of name and likeness rights for the recording artists, songwriters, and actors in her state and pointed to the concerns raised by the viral AI-generated song “Heart on My Sleeve”. This song was created by a prompt to produce a song simulating a song created by and sung by Drake and The Weeknd. Ultimately, Universal Music Group got platforms, such as Apple Music and Spotify, to take the song down on the basis of copyright infringement claims. Universal alleged that the use infringed Drake and The Weeknd’s copyrighted music and sound recordings. The creation (and popularity!) of the song sent shivers through the music industry.

It therefore is no surprise that Jeffrey Harleston, General Counsel for Universal Music Group, advocated both in his oral and written testimony for a federal right of publicity to protect against “confusion, unfair competition[,] market dilution, and damage” to the reputation and career of recording artists if their voices or vocal styles are imitated in generative AI outputs. Karla Ortiz, a concept artist and illustrator, known for her work on Marvel films, also called for a federal right of publicity in her testimony. Her concerns were tied to the use of her name as a prompt to produce outputs trained on her art in her style and that could substitute for hiring her to create new works. Law Professor Matthew Sag supported adoption of a federal right of publicity to address the “hodgepodge” of state laws in the area.

Dana Rao, the Executive Vice President and General Counsel of Adobe, expressed support for a federal anti-impersonation right, which he noted had a catchy acronym—the FAIR Act. His written testimony on behalf of Adobe highlighted its support for such a law and gave the most details of what such a right might look like. Adobe suggested that such an anti-impersonation law would “offer[] artists protection against” direct economic competition of an AI-generated replication of their style and suggested that this law “would provide a right of action to an artist against those that are intentionally and commercially impersonating their work through AI tools. This type of protection would provide a new mechanism for artists to protect their livelihood from people misusing this new technology, without having to rely solely on copyright, and should include statutory damages to alleviate the burden on artists to prove actual damages, directly addressing the unfairness of an artist’s work being used to train an AI model that then generates outputs that displace the original artist.” Adobe was also open to adoption of “a federal right of publicity . . . to help address concerns about AI being used without permission to copy likenesses for commercial benefit.”

Although some of the testimony supporting a federal right of publicity suggested that many states already extend such protection, there was a consensus that a preemptive federal right could provide greater predictability, consistency, and protection. Senator Klobuchar and Universal Music’s Harleston emphasized the value of elevating the right of publicity to a federal “intellectual property” right. Notably, this would have the bonus of clarifying an open question of whether right of publicity claims are exempted from the Communications Decency Act’s § 230 immunity provision for third-party speech conveyed over internet platforms. (See, e.g. Hepp v. Facebook.)

Importantly, Klobuchar noted the overlap between concerns over commercial impersonation and concerns over deepfakes that are used to impersonate politicians and create false and misleading videos and images that pose a grave danger to democracy.

Of course, the proof is in the pudding. No specific legislation has been floated to my knowledge and so I cannot evaluate its effectiveness or pitfalls. Although the senators and witnesses who spoke about the right of publicity were generally supportive, the details of what such a law might look like were vague.

From the right-holders’ (or identity-holders’) perspective the scope of such a right is crucial. Many open questions exist. If preemptive in nature, how would such a statute affect longstanding state law precedents and the appropriation branch of the common law privacy tort that in many states is the primary vehicle for enforcing the right of publicity? When confronted with similar concerns over adopting a new “right of publicity” to replace New York’s longstanding right of privacy statute that protected against the misappropriation of a person’s name, likeness, and voice, New York legislators astutely recognized the danger of unsettling more than 100 years of precedents that had provided (mostly) predictable protection for individuals in the state. 

Another key concern is whether these rights will be transferable away from the underlying identity-holders. If they are, then a federal right of publicity will have a limited and potentially negative impact on the very people who are supposedly the central concern driving the proposed law. This very concern is central to the demands of SAG-AFTRA as part of its current strike. The actors’ union wants to limit the ability of studios and producers to record a person’s performance in one context and then use AI and visual effects to create new performances in different contexts. As I have written at length elsewhere, a right of publicity law (whether federal or otherwise) that does not limit transferability will make identity-holders more vulnerable to exploitation rather than protect them. (See, e.g., Jennifer E. Rothman, The Inalienable Right of Publicity, 100 Georgetown L.J. 185 (2012); Jennifer E. Rothman, What Happened to Brooke Shields was Awful. It Could Have Been Worse, Slate, April 2023.)

Professor Matthew Sag rightly noted the importance of allowing ordinary people—not just the famous or commercially successful—to bring claims for publicity violations. This is a position with which I wholeheartedly agree, but Sag, when pressed on remedies, suggested that there should not be statutory damages. Yet, such damages are usually the best and sometimes only way for ordinary individuals to be able to recover damages and to get legal assistance to bring such claims. In fact, what is often billed as California’s statutory right of publicity for the living (Cal. Civ. Code § 3344) was originally passed under the moniker “right of privacy” and was specifically adopted to extend statutory damages to plaintiffs who did not have external commercial value making damage recovery challenging. (See Jennifer E. Rothman, The Right of Publicity: Privacy Reimagined for a Public World (Harvard Univ. Press 2018)). Notably, Dana Rao of Adobe, recognizing this concern, specifically advocated for the adoption of statutory damages.

The free speech and First Amendment concerns raised by the creation of a federal right of publicity will turn on the specific scope and likely exceptions to such a law. Depending on the particulars, it may be that potential defendants stand more to gain by a preemptive federal law than potential plaintiffs do. If there are clear and preemptive exemptions to liability this will be a win for many repeat defendants in right of publicity cases who now have to navigate a wide variety of differing state laws. And if liability is limited to instances in which there is likely confusion as to participation or sponsorship, the right of publicity will be narrowed from its current scope in most states. (See Robert C. Post and Jennifer E. Rothman, The First Amendment and the Right(s) of Publicity, 130 Yale L.J. 86 (2020)).

In short, the fact that this hearing on “AI and Copyright” focused instead on the right of publicity supports my earlier take that the right of publicity may pose a significant legal roadblock for developers of AI. Separate from legal liability, AI developers should take seriously the ethical concerns of producing outputs that imitate real people in ways that confuse as to their participation in vocal or audiovisual performances, or in photographs.

The appropriation bill that would defund the OSTP open access memo

Posted July 27, 2023

A couple of weeks ago the U.S. House Appropriations Subcommittee on Commerce, Justice, and Science (CJS) released an appropriations bill containing language that would defund efforts to implement a federal, zero-embargo open access policy for federally funded research.

We think this is a fantastically bad idea. One of the most important developments in the movement for open access to scholarship came last year when Dr. Alondra Nelson, Director of the Office of Science and Technology Policy, issued a memorandum mandating that all federal agencies that sponsor research put in place policies to ensure immediate open access to published research, as well as access to research data. The agencies are at various stages of implementing the Nelson memo now, but work is well underway. This appropriations bill specifically targets those implementation efforts and would prevent any federal government expenditures from being used to further them. 

For the vast majority of scholarly works, the primary objective of the authors is to share their research as widely as possible. Open access achieves that for authors (if they can only get their publishers to agree). The work is already funded, already paid for. As you might imagine, those opposed to the memo are primarily publishers who have resisted adapting the business model they’ve built of putting a paywall in front of publicly-funded work, largely for profit. 

Thankfully, the CJS appropriations bill, one of twelve appropriation bills, is just a first crack at how to fund the government in the coming year. The Senate, of course, will have its say, as will the President. With the current division in Congress, combined with the upcoming recess (Congress will be on recess in August and reconvene in September), the smart bet is that none of these bills will be enacted in time for the federal government’s new fiscal year on October 1. Instead, a continuing resolution, funding the government under the status quo as Congress frequently does, will likely be enacted as a stopgap until a compromise can be reached later in the year.

It is important, however, that legislators understand that this attempt to defund OA efforts is deeply concerning, especially for authors, universities, and libraries that believe that federally funded research should be widely available on an open access basis. It’s a good moment to speak out. SPARC has put together a helpful issue page on this bill, complete with sample text for how to write to your representative or senator.

As you’ll see if you read the proposed appropriations bill, it is loaded with politics. The relevant OSTP memo language is located amongst other clauses that defund the Biden administration’s efforts to implement diversity initiatives at various agencies, address gun violence, sue states over redistricting, and dozens of other hot-button issues as well. It’s pretty easy for an issue like access to science to get lost in the political shuffle, but we hope with some attention from authors and others, it won’t.