Category Archives: Law and Policy

Copyright Office Holds Listening Session on Copyright Issues in AI-Generated Music and Sound Recordings

Posted June 2, 2023
Photo by Possessed Photography on Unsplash

Earlier this week, the Copyright Office convened a listening session on the topic of copyright issues in AI-generated music and sound recordings, the fourth in its listening session series on copyright issues in different types of AI-generated creative works. Authors Alliance participated in the first listening session on AI-generated textual works, and we wrote about the second listening session on AI-generated images here. The AI-generated music listening session participants included music industry trade organizations like the Recording Industry Association of America, Songwriters of North America, and the National Music Publishers’ Association; generative AI music companies like Boomy, Tuney, and Infinite Album; music labels like the Universal Music Group and Wixen; and individual musicians, artists, and songwriters. Streaming service Spotify and collective-licensing group SoundExchange also participated. 

Generative AI Tools in the Music Industry

Many listening session participants discussed the fact that some musical artists, such as Radiohead and Brian Eno, have been using generative AI tools as part of their work for decades. For those creators, generative AI music is nothing new, but rather an expansion of existing tools and techniques. What is new is the ease with which ordinary internet users without musical training can assemble songs using AI tools—programs like Boomy enable users to generate melodies and musical compositions, with options to overlay voices or add other sounds. Some participants sought to distinguish generative tools from so-called “assistive tools,” with the latter being more established for professional and amateur musicians. 

While some established artists themselves have long relied on assistive AI tools to create their works, AI-generated music has lowered barriers to entry for music creation significantly. Some take the view that this is a good thing, enabling more creation by more people who could not otherwise produce music. Others protest that those with musical talent and training are being harmed by the influx of new participants in music creation, as these types of songs flood the market. In my view, it’s important to remember that the purpose of copyright, furthering the progress of science and the useful arts, is served when more people can generate creative works, including music. Yet AI-generated music may already be at or past the point where it can be indistinguishable from works created by human artists without the use of these tools, at least to some listeners. It may be the case that, as at least one participant suggested, AI-generated audio works are somehow different from AI-generated textual works such that they may require different forms of regulation.

Right of Publicity and Name, Image, and Likeness

Although the topic of the listening session was federal copyright law, several participants discussed artists’ rights in both their identities and voices—aspects of the “right of publicity” or the related name, image, and likeness (“NIL”) doctrine. These rights are creatures of state law, rather than federal law, and allow individuals, particularly celebrities, to control what uses various aspects of their identities may be put to. In one well-known right of publicity case, Ford used a Bette Midler “sound alike” for a car commercial, which was found to violate her right of publicity. That case and others like it have popularized the idea that the right of publicity can cover voice. This is a particularly salient issue within the context of AI-generated music due to the rise of “soundalikes” or “voice cloning” songs that have garnered substantial popularity and controversy, such as the recent Drake soundalike, “Heart on My Sleeve.” Some worry that listeners could believe they are listening to the named musical artist when in fact they are listening to an imitation, potentially harming the market for that artist’s work. 

The representative from the Music Artists Coalition argued that the hodgepodge of state laws governing the right of publicity could be one reason why soundalikes have proliferated: different states have different levels of protection, and the lack of unified guidance on how these types of songs are governed under the law can create uncertainty as to how they will be regulated. And the representative from Controlla argued that copyright protection should be expanded to cover voice or identity rights. In my view, expanding the scope of copyright in this way is neither reasonable nor necessary as a matter of policy (and furthermore, would be a matter for Congress, and not the Copyright Office, to address), but it does show the breadth of the soundalike problem for the music industry.

Copyrightability of AI-Generated Songs

Several listening session participants argued for intellectual property rights in AI-generated songs, and others argued that the law should continue to center human creators. The Copyright Office’s recent guidance regarding copyright in AI-generated works suggests that the Office does not believe that there is any copyright in the AI-generated materials due to the lack of human authorship, but human selection, editing, and compilation can be protected. The representatives from companies with generative AI tools expressed a need for some form of copyright protection for the songs these programs produce, explaining that they cannot be effectively commercialized if they are not protected. In my view, this can be accomplished through protection for the songs as compilations of uncopyrightable materials or as original works, owing to human input and editing. Yet, as many listening session participants across these sessions have argued, the Copyright Office registration guidance does not make clear precisely how much human input or editing is needed to render an AI-generated work a protectable original work of authorship.

Licensing or Fair Use of AI Training Data

In contrast to the view taken by many during the AI-generated text listening session, none of the participants in this listening session argued outright that training generative AI programs on in-copyright musical works was fair use. Instead, much of the discussion focused on the need for a licensing scheme for audio materials used to train generative AI audio programs. Unlike the situations with many text and image-based generative AI programs, the representatives from generative AI music programs expressed an interest and willingness to enter into licensing agreements with music labels or artists. In fact, there is some evidence that licensing conversations are already taking place. 

The lack of fair use arguments during this listening session may be due to the particular participants, industry norms, or the “safety” of expressing this view in the context of the music industry. But regardless, it provides an interesting contrast to views around training data for text-generating programs like ChatGPT, which many (including Authors Alliance) have argued are fair uses. This is particularly remarkable since at least some of these programs, in our view, use the audio data they are trained on for a highly transformative purpose. Infinite Album, for example, allows users to generate “infinite music” to accompany video games. The music reacts to events in the video game—becoming more joyful and upbeat for victories, or sad for defeats—and can even work interactively for those streaming their games, where those watching the stream can temporarily influence the music. This seems like precisely the sort of “new and different purpose” that fair use contemplates, and similarly like a service that is unlikely to compete directly with individual songs and records.

Generative AI and Warhol Foundation v. Goldsmith

Many listening session participants discussed the interactions between how AI-generated music should be regulated under copyright law and the recent Supreme Court fair use decision in Warhol Foundation v. Goldsmith (you can read our coverage of that decision here), which also considered whether a particular use which could have been licensed was fair use. And some participants argued that the decision in Goldsmith makes it clear that training generative AI models (i.e., the input stage) is not a fair use under the law. It is not clear precisely how the decision will impact the fair use doctrine going forward, particularly as it applies to generative AI, and I think it is a stretch to call it a death knell for the argument that training generative AI models is a fair use. However, the Court did put a striking emphasis on the commerciality of the use in that case, deemphasizing the transformativeness inquiry somewhat. This could impact the fair use inquiry in the context of generative AI programs, as these programs tend overwhelmingly to be commercial, and the outputs they create can be and are being used for commercial purposes.

Supreme Court Issues Decisions in Warhol Foundation and Gonzalez

Posted May 19, 2023
Photo by Claire Anderson on Unsplash

Yesterday, the Supreme Court released two important decisions in Warhol Foundation v. Goldsmith and Gonzalez v. Google—cases that Authors Alliance has been deeply invested in, submitting amicus briefs to the Court in both cases. 

Warhol Foundation v. Goldsmith and Transformativeness

First, the Court issued its long-awaited opinion in Warhol Foundation v. Goldsmith, a case Authors Alliance has been following for years, and for which we submitted an amicus brief last summer. The case concerned a series of screen prints of the late musical artist Prince created by Andy Warhol, and asked whether the creation and licensing of one of the images, an orange screen print inspired by Goldsmith’s black and white photograph (which the Court calls “Orange Prince”), constituted fair use. After the Southern District of New York found for the Warhol Foundation on fair use grounds, the Second Circuit overturned the ruling, finding that the Warhol Foundation’s use constituted infringement. The sole question before the Supreme Court was whether the first factor in fair use analysis, the purpose and character of the use, favored a finding of fair use. 

To our disappointment, the Supreme Court’s majority agreed with the holding of the Second Circuit, finding that the purpose and character of Warhol’s use favored Goldsmith, such that it did not support a finding of fair use. This being said, the decision focused narrowly on the Warhol Foundation’s “commercial licensing of Orange Prince to Condé Nast,” expressing “no opinion as to the creation, display, or sale of any of the original Prince Series works.” Because the Court cabins its opinion, focusing specifically on the licensing of Orange Prince to Condé Nast rather than the creation of the entire Prince series, the decision is less likely to have a deleterious effect on the fair use doctrine generally than a broader decision would have.

Writing for the majority, Justice Sotomayor argued that Goldsmith’s photograph and the Prince screen print in question shared the same purpose, “portraits of Prince used to depict Prince in magazine stories about Prince.” Moreover, the Court found the use to be commercial, given that the screen print was licensed to Condé Nast. Justice Sotomayor explained that “if an original work and secondary use share the same or highly similar purposes, and the secondary use is commercial, the first fair use factor is likely to weigh against fair use, absent some other justification for copying.” Justice Sotomayor found that the two works shared the same commercial purpose, and therefore concluded that factor one favored Goldsmith. 

Justice Kagan, joined by Chief Justice Roberts, issued a strongly worded dissenting opinion. The dissent admonished the majority for its departure from Campbell’s “new meaning or message test,” an inquiry that Authors Alliance advocated for in our amicus brief. Justice Kagan further criticized the majority’s shifting focus towards commerciality, arguing that the fact that the use was a licensing transaction should not be given so much importance in the analysis. While Authors Alliance agrees with these points, we are less sure that the majority’s decision goes so far as to “constrain creative expression” or “threaten[] the creative process.” And while it’s uncertain what effect this case will have on the fair use doctrine more generally, one important takeaway is that the question of whether the use in question is commercial in nature—a consideration under the first factor—has been elevated to one of greater importance.

While we thought this case offered a good opportunity for the Court to affirm a more nuanced approach to transformative use, we much prefer the Supreme Court’s approach to the Second Circuit’s decision, and applaud the Court on confining its ruling to the narrow question at issue. The holding does not, in our view, radically alter the doctrine of fair use or disrupt the bulk of established case law. Moreover, some aspects of arguments we made in our brief—such as the notion that transformativeness is a matter of degree, not a binary—are present in the Court’s decision. This is a good thing, in our view, as it will allow for more nuanced consideration of a use’s character and purpose, and stands in contrast to the Second Circuit’s all-or-nothing view of transformativeness.

Gonzalez v. Google and the Missing Section 230

Also yesterday, the Court released its opinion in Gonzalez v. Google, a case that generated much attention because of its potential threat to Section 230, and another case in which Authors Alliance submitted an amicus brief. The case asked whether Google could be held liable under an anti-terrorism statute for harm caused by ISIS recruitment videos that YouTube’s algorithm recommended. In its per curiam decision (a unanimous one without a named Justice as author), the Court stated that Gonzalez’s complaint had failed to state a viable claim under the relevant anti-terrorism statute. Therefore, it did not reach the question of the applicability of Section 230 to the recommendations at issue. In other words, a case that generated tremendous concern about the Court disturbing Section 230 and harming the internet creators, communities, and services that rely on it ended up saying nothing at all about the statute.

Authors Alliance Joins Copyright Office Listening Session On Copyright in AI-Generated Literary Works

Posted April 20, 2023
Photo by Possessed Photography on Unsplash

Yesterday, I represented Authors Alliance in a Copyright Office listening session on copyright issues in AI-generated literary works, the first of two such sessions that the Office convened yesterday afternoon. I was pleased to be invited to share our views with the Office and participate in a rousing discussion among nine other stakeholders, representing a diverse group of industries and positions. Generative AI raises challenging legal questions, particularly for its skeptics, but it also presents some incredible opportunities for authors and other creators.

During the listening session, I emphasized the potential for generative AI programs (like OpenAI’s ChatGPT, Microsoft’s Bing AI, Jasper, and others) to support authorship in a number of different ways. For instance, generative AI programs support authors by increasing the efficiency of some of the practical aspects of being a working author aside from their writings. But more importantly, generative AI programs can actually help authors express themselves and create new works of authorship.

In the first category, generative AI programs can support authors by, for example, helping them create text for pitch letters to send to agents and editors, produce copy for their professional websites, and develop marketing strategies for their books. Making these activities more efficient frees up time for authors to focus on their writing, particularly for authors whose writing time is limited by other commitments. 

In the second category, generative AI has tremendous potential to help authors come up with new ideas for stories, develop characters, summarize their writings, and perform early stage edits of manuscripts. Moreover, and particularly for academic authors, generative AI can be an effective research tool for authors seeking to learn from a large corpus of texts. Generative AI programs can help authors research by providing short and simple summaries of complex issues, surveys of the landscape of various fields, or even guidance on what human works to turn to in their research. Authors Alliance is committed to protecting authors’ right to conduct research, and we see generative AI tools as a new, innovative, and efficient form of conducting this research. Making research easier helps authors save time, and has a particular benefit for authors with disabilities that make it difficult to travel to multiple libraries or otherwise rely on analog forms of research. 

These programs undoubtedly have the potential to serve as powerful creative tools that support authorship in these ways and more, but, when discussing the copyright implications of the programs and the works they produce, it’s important to remember just how new these technologies are. Because generative AI remains in its infancy, and the costs and benefits for different segments of the creative industry have yet to be seen, it seems to me to be sensible to preserve the development of these tools before crafting legal solutions to problems they might pose in the future. And in fact, in our view, U.S. copyright law already has the tools to deal with many of the legal challenges that these programs might pose. When generative AI outputs look too much like the copyrighted inputs they are trained on, the substantial similarity test can be used to assess claims of copyright infringement to vindicate an author’s exclusive rights in their works when those outputs do infringe.

In any case, in order for generative AI programs to be effective creative tools, it’s necessary that they are trained on large corpora. Narrowing the corpus of works the programs are trained on—through compulsory licensing or other mechanisms—can have disastrous effects. For example, research has shown that narrow data sets are more likely to produce racial and gender bias in AI outputs. In our view, the “input” step, where the programs are trained on a large corpus of works, is a fair use of these texts. And the holdings in Google Books and HathiTrust indicate that it is consistent with fair use to build large corpora of works, including works that remain protected by copyright, for applications such as computational research and information discovery. Additionally, the Copyright Office has recognized this principle in the context of research and scholarship, as demonstrated by its approval of Authors Alliance’s petition for an exemption from DMCA restrictions for text and data mining.

The question of the copyright status of AI-generated works is an important one. Most if not all of the stakeholders participating in this discussion agreed with the Copyright Office’s recent guidance regarding registration in AI-generated works: under ordinary copyright principles, the lack of human authorship means these texts are not protected by copyright. This being said, we also recognize that there may be challenges in reconciling existing copyright principles with these new types of works and the questions about authorship, creativity, and market competition that they might pose. 

But importantly, while this technology is still in its early stages, it serves the core purposes of copyright—furthering the progress of science and the useful arts by incentivizing new creation—to allow these systems to develop and confront new legal challenges as they emerge. Copyright is not only about protecting the exclusive rights of copyright holders (a concern that underlies many arguments against generative AI as a fair use), but incentivizing creativity for the public benefit. The new forms of creation made possible through generative AI can incentivize people who would not otherwise create expressive works to do so, bringing more people into creative industries and adding new creative expression to the world to the benefit of the public.

The listening sessions were recorded, and will be available on the Copyright Office website in the coming weeks. And these listening sessions are only the beginning of the Office’s investigation of copyright in AI-generated works. Other listening sessions on visual works, music, and audiovisual works will be held in the coming weeks, and the Office has indicated that there will be an opportunity for written public comments in order for stakeholders to weigh in further. We are committed to remaining involved in these cutting-edge issues, through written comments and otherwise, and we will keep our readers informed as policy around generative AI continues to evolve.

Authors Alliance Submits Comment to Copyright Office Regarding Ex Parte Communications

Posted April 4, 2023
Photo by erica steeves on Unsplash

Yesterday, Authors Alliance submitted a comment to the U.S. Copyright Office in response to a notice of proposed rulemaking asking for feedback from the public on new rules to govern ex parte communications. “Ex parte communications” refer to communications outside the normal, permitted channels of communication—in this case, to communications between organizations or members of the public and Copyright Office staff outside of hearings or other formal proceedings. Ex parte communications with the Copyright Office are important, because they allow stakeholders and the Office to work out open questions in rulemakings or other proceedings outside of the formal channels. Authors Alliance relied on our ability to make ex parte communications during the last Section 1201 rulemaking cycle (where we obtained our text and data mining exemption) in order to clarify certain issues. Now, the Office is proposing establishing formal rules for how these communications can be made, as well as establishing transparency around them. We support this proposal, and shared our thoughts in a comment. You can read our full comment here.

Judge Rules Against Internet Archive on Controlled Digital Lending

Posted March 28, 2023
Photo by Wesley Tingey on Unsplash

On Friday, Southern District of New York Judge John Koeltl issued a much-anticipated decision in Hachette Book Group v. Internet Archive. Unfortunately, as many of our members and allies are aware, the judge ruled against the Internet Archive, finding that its CDL program was not protected by the doctrine of fair use and granting the publishers’ motion for summary judgment. You can read the 47-page decision for yourself here.

In his fair use analysis, Judge Koeltl found that each of the four fair use factors weighed in favor of the publishers, emphasizing above all else his view that IA’s controlled digital lending program was not transformative, an important consideration under the first fair use factor, which considers the purpose and character of the use. This inquiry also involves asking whether the use in question was commercial. To the surprise of many, the decision stated that IA’s use of the publishers’ works was commercial, because the Open Library is part of the IA’s website, which it uses “to attract new members, solicit donations, and bolster its standing in the library community.” The judge found this to be the case in spite of the fact that IA “does not make a monetary profit” from CDL. In other words, the judge held that the indirect, attenuated benefits the Internet Archive (which is, after all, a nonprofit) reaps from operating the Open Library make its CDL program commercial.

Judge Koeltl gave less attention to the fourth factor in the fair use analysis, “the effect of the use on the potential market for the work,” which is often held up to be of significant importance. One consideration under this factor is whether the use creates a competing substitute for the original work. Unfortunately, on this point too, the court—in our view—missed the mark. This is because the decision does not draw a distinction between CDL scans and ebooks, going so far as to call CDL scans “ebooks” throughout. As we explained in our summary of the proceedings last week, many features of CDL scans and ebooks make them both functionally and aesthetically distinct from one another. By glossing over these differences, the judge reached the conclusion that CDL scans are direct substitutes for licensed ebooks.

Authors Alliance is deeply concerned about the ramifications of this decision, which was exceedingly broad in scope, striking a tremendous blow to the CDL model, rather than only IA’s implementation of it. Local libraries across the country practice CDL, and library patrons and authors alike depend on it to read, research, and participate in academic discourse. 

As it stands, this decision only applies to Internet Archive and is only about the 127 books on which the publishers based their lawsuit. It does not set a binding precedent for any other library, but if left in place (or worse, if affirmed on appeal), it could cause libraries to avoid digitizing and lending books under a CDL model, which in our view would not serve the interests of many authors. This decision makes it harder for those authors to reach wide audiences: CDL enables many authors to reach more readers than they could otherwise, and authors like our members who write to be read would not be served if fewer readers could access their books. 

The decision also hampers efforts to preserve books—aside from IA’s scanning program, there are few if any centralized efforts to preserve books in digital format once their commercial life is over. Without CDL, those books could quite literally disappear, and the knowledge they advance could be lost. IA’s scanning operations do preserve such books, which is one reason we have strongly supported them in this lawsuit. By the same token, if this decision stands, it will also limit authors’ ability to conduct efficient research online. The CDL survey we launched last year revealed that CDL is an effective research tool for authors who need to consult other books as part of their writing process, and in many cases it enables them to access far more works than they could at their local library alone. Authors who rely on CDL in this way would be harmed by this decision, as they could well be forced to undergo a more time-consuming research process, detracting from time that could be spent writing. 

The Internet Archive has already indicated that it will be appealing Judge Koeltl’s ruling, and we look forward to supporting those efforts. We will continue to keep our readers and members apprised of updates as this case moves forward.

Judge Hears Oral Arguments in Hachette Book Group v. Internet Archive

Posted March 20, 2023
Photo by Timothy L Brock on Unsplash

Earlier today, Judge John Koeltl of the Southern District of New York heard oral arguments in Hachette Book Group v. Internet Archive—a case Authors Alliance has been following since the lawsuit was first filed back in 2020. The case is about—among other things—whether Internet Archive’s controlled digital lending program qualifies as a fair use. Authors Alliance submitted an amicus brief in support of the Internet Archive back in July, arguing that CDL serves the interests of authors who write to be read. IA’s attorney cited to our brief during oral argument, and we are pleased that we were able to magnify the voices of authors who write to be read through its submission. You can learn more about the case and read our brief here.

In the hearing, the judge considered each party’s motion for summary judgment. The parties hotly contested a number of key issues in the case, including whether each side’s experts had properly demonstrated market harm (or lack thereof), what the appropriate market to consider was for purposes of fair use analysis, the commerciality of IA’s use, and what legal cases supported arguments in favor of and against fair use. Judge Koeltl asked the Internet Archive’s attorney a number of probing questions on these points, grappling with the difficult questions in this case. The judge further implied that there may be open issues of fact in this case, which could indicate the need for additional briefings or hearings.

CDL and Commerciality

The parties disagreed on the commerciality of IA’s use when it produces and makes CDL scans available. The publishers’ attorney argued that IA’s CDL operations are “intertwined” with its other functions, such as its ownership of the book vendor Better World Books, and further emphasized the argument that CDL loans result in lost revenue for the publishers—in other words, that the supposed commercial harm to the publishers that results from CDL lending makes the CDL lending itself commercial. The Internet Archive’s attorney answered that IA is a nonprofit organization that does not profit at all from its CDL program. He pointed to the fact that traditional library lending is not commercial in nature and does not provide libraries like IA with commercial benefits.

CDL and Market Effects

The plaintiffs’ attorney began by setting forth plaintiffs’ views on the issue of market harm—the fourth factor in fair use analysis, often cited as one of the most important factors in the inquiry. Plaintiffs discussed what they see as massive financial harm stemming from IA’s CDL program, which they estimated to amount to “millions of dollars in licensing revenues.” Plaintiffs also emphasized that, were CDL “given the green light,” or upheld as a fair use, the plaintiffs would suffer even greater losses. Throughout her argument, plaintiffs’ attorney emphasized that the “basic economic principle and common sense is that you cannot compete with free.” In other words, the publishers argue that the ebook library licensing market could collapse altogether if CDL were allowed to continue. Yet this misses the point that CDL is a longstanding and established practice, which has seen adoption and growth in libraries across the country while the ebook licensing market has continued to thrive.

Judge Koeltl, however, pressed the publishers on whether they had shown evidence of actual market harm, i.e., proof that IA’s CDL program had directly harmed their bottom line. In response, plaintiffs criticized the expert evidence offered by IA’s experts to show that no such harm had occurred. This is a difficult question because the party asserting a fair use defense typically has the burden of showing that the use has not harmed the market, but it is exceedingly difficult to prove a negative.

The judge also questioned whether CDL actually could represent such a loss: the publishers’ argument rests on the premise that libraries loan out CDL scans in lieu of paying to license ebooks, and were CDL not permitted under the law, IA and other libraries would instead choose to pay licensing fees to lend out ebooks. The judge pointed out that the result might in fact be that libraries would choose not to lend digital copies of works out at all, or would instead lend out physical books, undercutting the lost licensing revenue argument. 

IA’s attorney argued that the publishers had not offered empirical evidence of market harm in this case, focusing on the fact that when a library lends out a CDL scan, it does so in lieu of a physical book, “simulating the limitations of physical books.” This is due to CDL’s “owned to loaned” ratio requirement: a library can loan out only as many CDL scans as it has physical books in its collection, and can loan each scan out to only one patron at a time. When a library lends out a CDL scan, it does so in lieu of loaning the physical book, for which it has already paid. And while the plaintiffs mentioned harm to authors (who are, after all, the people that copyright law is intended to protect) several times during their argument, they did this in a way that linked authors with publishers as parties that are financially invested in a work’s sale—author interests and the finer details of the economics of author income and library lending were absent from the discussion.

The parties also disagreed about which market was the appropriate one to look to when discussing market harm in the context of fair use analysis. The publishers argued, and the judge seemed to assume, that the proper market is the library ebook licensing market. The judge opined that libraries could, instead of using CDL to lend out their books, simply purchase an ebook license. He seemed to view CDL scans and licensed ebooks as one and the same, despite the fact that there are several key differences between these types of loans, both in form and function, as explained in other amicus briefs in the case. Moreover, missing from the argument was the fact that, in many cases, libraries loan out CDL scans because no ebook is available to them: particularly for older books in a publisher’s backlist, or for books that are no longer available commercially, there is often no ebook at all, or no ebook available to libraries. Library patrons with print or mobility disabilities in need of digital copies of these kinds of works in order to read them would be greatly harmed if CDL were no longer permitted.

CDL and Transformativeness

The publishers’ attorney started from the premise that CDL as a use was not transformative, explaining that a licensed ebook and a CDL scan served precisely the same function. In response, IA’s attorney argued that CDL is a transformative use because it “utilizes technology to achieve the transformative purpose of improving efficiency of delivering content without unreasonably encroaching on the rights of the rightsholder.” He further explained that fair uses are favored when they serve the key purpose of copyright: incentivizing new creation for the public benefit without harming the interests of rightsholders. To illustrate these benefits, he cited Authors Alliance’s amicus brief, in which we explained the myriad ways that CDL benefits authors and can even incentivize the creation of new works.

Adding to its transformativeness argument, IA explained that, when it comes to speculative or actual market harm, such an effect must be balanced against the public benefit that results from the use. And when it comes to CDL, this public benefit is tremendous: numerous amici, as well as Authors Alliance, explained that CDL serves the interests of library patrons, authors, and the public writ large. 

What’s Next?

Now that the judge has heard both sides’ arguments, he will issue a decision in the case. While there is no way of knowing exactly when this will happen, Judge Koeltl is known for issuing decisions fairly quickly, so we may have a decision as soon as later this week. As always, we will keep our members and readers apprised of any developments in this pivotal case as it moves forward.

Copyright Office Issues Opinion Letter on Copyright in AI-Generated Images

Posted March 8, 2023
Photo by Michael Dziedzic on Unsplash

In late February, the Copyright Office issued a letter revoking a copyright registration it had previously granted artist Kristina Kashtanova for a comic that used images generated using Midjourney, a generative AI program that creates images in response to user prompts. While this may seem minor, or simply another data point in the ongoing fight about copyright protection for AI-generated works, the determination is quite significant: it comes at a moment when AI-generated art has captured public attention, and moreover shows the Copyright Office’s thoughts on the important question of whether an artist who relies on a program like Midjourney can obtain copyright protection for an original compilation of AI-generated works. In today’s post, we explain the Copyright Office letter, contextualize it within the growing debate over AI and copyright, and share our thoughts on what all of this might mean for authors who write to be read. 

Copyright and Human Authorship

As technology has advanced to allow the creation of works without the direct involvement of a human, courts have grappled with whether these creations are entitled to copyright protection. In the late 19th century, the Supreme Court established that copyright was intended to protect the products of human labors and creativity, creating the “human authorship” requirement. In an early case on the topic, the Court held that a photograph was copyrightable despite the fact that a camera literally created the image, since photographs were “representatives of original intellectual conceptions of the author.” It cautioned, however, that when it came to creations resulting from processes that were “merely mechanical,” lacking “novelty, invention, or originality” by a human author, such hypothetical works might be beyond the scope of copyright protection.

This principle was tested in the 2010s: in 2011, an Indonesian crested macaque monkey named Naruto seized a photographer’s camera and took hundreds of images of himself. The photographer, David Slater, shared some of these images online, which promptly went viral. Several websites posted these images as well, prompting Slater to assert that he owned the copyright in the images and request their removal. The Wikimedia Foundation, which had uploaded the image to Wikimedia Commons, a repository of public domain and free license content, argued that the image was a part of the public domain due to the lack of a human creator. Several years later, Slater published a book of nature photographs which included Naruto’s selfie. Then, in 2015, the People for the Ethical Treatment of Animals (PETA) filed a lawsuit in the Northern District of California on Naruto’s behalf, asserting that the macaque owned the copyright in the image and requesting damages. The district court judge held that Naruto could not own the copyright in the image due to copyright’s human authorship requirement. However, the judge did indicate that Congress might be free to do away with the human authorship requirement and permit copyright ownership by animals, suggesting that the requirement was not a constitutional one, but indicating that it was beyond the power of the judiciary to decide. The Ninth Circuit Court of Appeals later affirmed the district court’s ruling.

Currently, the Copyright Office is defending a lawsuit in the D.C. district court brought by AI system developer Dr. Stephen Thaler regarding the constitutionality of copyright law’s human authorship requirement. Thaler argues that the Copyright Act does not forbid treating AI systems as “authors” for the purpose of copyright law, and contends that the human authorship principle is unsupported by contemporary case law. While it seems unlikely that Thaler will prevail on this argument, the case will at the very least draw new attention to the human authorship requirement and how it fits into creation in the digital age.

The Creativity Requirement and Zarya of the Dawn

Kashtanova’s assertion of copyright ownership in her comic, Zarya of the Dawn, is in many ways similar to the photographer David Slater’s claim that he owned the copyright in Naruto’s selfie. In each case, the Copyright Office indicated that when a work is not the product of human authorship, a human may not claim copyright in that work (the latest compendium of Copyright Office practices lists “a photograph taken by a monkey” as an example of work that is not entitled to copyright protection since it does not meet the human authorship requirement). 

Kashtanova’s attorney had argued that Midjourney served “merely as an assistive tool,” and that Kashtanova should be considered the work’s author. But the Office likened Midjourney to a “merely mechanical process” lacking “novelty, invention, or originality” by a human creator, quoting the Supreme Court’s warning about the limits of copyright protection in the 19th century case discussed earlier in this post. And it was not only the human authorship requirement that made Zarya of the Dawn beyond the scope of copyright protection, but also copyright’s creativity requirement: for a work to be copyrightable, it must possess at least a “modicum” of creativity, a very low bar that rarely forecloses copyright protection for works of human authorship. 

The Office explained that Midjourney generates images in response to user prompts, “text commands entered in one of Midjourney’s channels.” But these are not “specific instructions” for generating an image, rather input data that Midjourney compares to its training data before generating an image. The Office also argued that these images lack human authorship because the process is “unpredictable” and “not controlled by the user.” In other words, the “creativity” in these images comes not from the human entering prompts, but from the interaction between the prompt and Midjourney’s training data. This makes it different from a tool like a camera over which a user exercises total control—there is little to no unpredictability when we use digital cameras to photograph the world around us, rather all creative choices come from the human using the device. 

The Office also noted that this opinion was not necessarily the final word on AI-generated images, as “other [generative] AI offerings” might operate differently, such that the creativity and human authorship requirements could be met. Kashtanova argued that minor edits she had made to the images were sufficiently creative to give her copyright ownership in the work as a whole. While the Office disagreed in this specific case (the before and after images demonstrating the editing were nearly identical), it did leave this possibility intact for future cases. Moreover, the Office granted Kashtanova ownership in the comic’s text, which she alone had written, as well as copyright ownership in the compilation of Midjourney-generated images. Compilations of uncopyrightable subject matter can sometimes be protected by copyright, because both the human authorship and creativity requirements are met when a human selects and arranges the material. The copyright owner does not own a copyright in the material itself, but in the original compilation they have created.

What Does this Mean for Authors?

The Copyright Office’s denial of registration in the Midjourney-generated images has important implications for the public domain and authors’ abilities to use new forms of technology as assistive tools in the creation of their works. But the Office’s action also leaves some open questions about the copyright status of images generated by Midjourney and similar systems. One possibility, as was asserted by Wikimedia in the case of Naruto’s selfie, is that these images are a part of the public domain. Were that to be the case, it could be a boon for artists and creators. Recall that once a work is in the public domain, it becomes free for all to use without fear of copyright infringement. The case of the monkey selfie is further instructive here, as the owner of the camera in that case did not prevail on claiming his own copyright in Naruto’s selfie. By the same token, it is unlikely that the creators of Midjourney could claim a copyright in images like those used by Kashtanova, despite their role in creating and making available the “assistive tool.”

If AI systems could be used to generate infinite public domain content—whether through text-based systems like ChatGPT or image-generating systems like Midjourney—this would greatly expand public domain content. The public domain can be a boon for creators, as they are free to do anything they wish with this material. On the other hand, some have expressed fear that, should all AI-produced works be considered a part of the public domain, these public domain works could compete with works produced by human authors. It is also important to remember the practical economic realities of systems like Midjourney. Whether or not the Copyright Office and other policymakers determine that AI-generated content is a part of the public domain, the creators of those systems could employ other means to assert ownership or forbid onward uses of the content created by these systems. Contractual override, the employment of so-called “digital locks” like DRM, or other legal and technical mechanisms could conceivably limit authors’ ability to use AI-generated works the way they might use more traditional public domain materials. 

The First Copyright Small Claims Court Judgment

Posted March 6, 2023

Authors Alliance members will recall the posts we’ve made over the years about the enactment and implementation of a new copyright small claims court, the “Copyright Claims Board,”  housed within the U.S. Copyright Office. 

Late last week, the CCB issued its very first judgment. It came in a case brought by photographer David Oppenheimer against a California attorney, David Prutton, who had used an unlicensed copy of one of Oppenheimer’s photos (a picture of the federal courthouse in Oakland) on his solo-practitioner website (h/t to Plagiarism Today, where we first saw reporting about the case, here).

Screenshot of Prutton’s website, showing use of Oppenheimer’s photo of the Federal Courthouse in Oakland (twin buildings on the right).

The case had a head start because it was originally filed in federal district court, where the parties voluntarily agreed to dismiss the federal case and have the case referred to the CCB. You can read the entire history, including all the filings, here. The CCB ruled in favor of Oppenheimer and awarded the photographer statutory damages of $1,000, significantly less than the $30,000 (the maximum amount available to claimants in CCB proceedings) that Oppenheimer originally sought.

In many ways, this was a pretty easy case for the CCB. Prutton readily admitted that he had used Oppenheimer’s unlicensed photo, in whole, on his website. Though Prutton raised a fair use defense, he argued only one of the four fair use factors. Prutton’s sole contention was that the impact on the market was so minimal, and Oppenheimer’s evidence of harm so lacking, that Prutton should win on the fourth fair use factor.

The CCB, noting that the fair use factors need to be balanced and weighed together, did its own analysis of all the fair use factors but concluded—rightly, I think—that for the other three fair use factors: 

  • Prutton’s use was not particularly transformative or for a new purpose, weighing against the use;
  • Oppenheimer’s original photo was creative (certainly enough for copyright protection, though reasonable minds might disagree on the extent of the creativity and therefore how strongly this factor should weigh in its favor), weighing against the use;
  • Prutton had used the whole work, not a small portion of it, weighing against the use.

For the fourth fair use factor, Prutton argued that because Oppenheimer showed essentially no history of licensing revenue from this photograph, along with a history of other litigation that tended to indicate that Oppenheimer’s business was primarily oriented toward generating revenue through litigation, there was no meaningful market harm. The CCB disagreed, essentially concluding that it was Prutton’s job to show a lack of market harm (which the Board said he did not do), and that the burden did not rest on Oppenheimer to show evidence of a market. However, because Oppenheimer didn’t show any actual evidence of financial harm, the CCB, when assessing damages, granted an award far below Oppenheimer’s request: his original demand of $30,000 in damages was reduced to just $1,000.

Where the case was a little more interesting was how the CCB addressed Prutton’s defense of “unclean hands,” in which he essentially asked the CCB to excuse his use because Oppenheimer had acted improperly. If you do a quick search for “David Oppenheimer” and “copyright” you will find that Oppenheimer is frequently in court over alleged infringement of rights in his photographs, with fact patterns very similar to the one in this case, including heavy-handed negotiation tactics and aggressive use of litigation. In several of those cases, such as this case in the Western District of North Carolina, courts refused to grant Oppenheimer easy wins, concluding that Oppenheimer’s litigation tactics could reasonably be viewed as so problematic as to block his assertion of rights by the defense of “copyright misuse.”

The CCB dismissed Prutton’s “unclean hands” defense by highlighting how unusual and extreme a plaintiff’s conduct has to be to fall subject to that general defense. The CCB didn’t, however, really assess Prutton’s more substantial “copyright misuse” defense, perhaps because Prutton didn’t raise it as a separate defense. In my view, copyright misuse may well have been a valid defense in this case. 

As the Western District of North Carolina explained in a previous case brought by Oppenheimer,  “misuse of copyright is a valid affirmative defense where the use of a copyright is contrary to the public policy upon which copyrights are granted. . . . Typically, the defense applies when seeking to avoid anti-competitive behavior, but it can also apply to other scenarios where a copyright owner attempts to extend the copyrights beyond their intended reach. . . . The underlying policy principles behind copyrights extend from the United States Constitution, with the relevant policy here being to promote the ‘useful arts.’” The court in that case concluded that if Oppenheimer’s “purpose in copyrighting the Copyrighted Work was to license it for use when individuals or companies need [his photo] then Plaintiff is likely not misusing his copyrights. Yet, a reasonable jury could find Plaintiff is using copyrights to derive an income from infringement suits and this issue is one of fact that the Court should not decide.” 

Lessons Learned

As this is the very first decision of the CCB, I don’t think we should draw sweeping conclusions from it about how the CCB will do its work. But it is interesting to see that this first case wasn’t exactly a suit between legal amateurs: Oppenheimer is a seasoned litigant who has brought many copyright cases, and Prutton is an attorney (albeit not one who specializes in copyright). Both made significant missteps in the presentation of their cases. And so, one observation I think we can make is that while the copyright small claims system is meant to have low barriers to participation, and the CCB seems inclined to go to extra lengths to help parties understand the process and present cogent filings, the CCB is not going to excuse incomplete argumentation. At least in this case, the CCB refused to assume facts or arguments not presented by the parties. That was true both for the plaintiff and defendant: plaintiffs who make damage assertions are going to need to show evidence of actual harm in order to get awards close to their requested amounts. And defendants who raise defenses will need to fully argue them; glossing over three of the four fair use factors is not a winning strategy. Nor does it seem that passing references to defenses such as “unclean hands” and “copyright misuse” will work without adequate support.

Jack Daniels v. VIP Products and the Freedom to Parody and Comment in the United States

Posted March 2, 2023

This post was written for the Kluwer Copyright Blog, and is based in part on an amicus brief filed last week by the Harvard Cyberlaw Clinic on behalf of Authors Alliance and ComicMix before the United States Supreme Court in Jack Daniels v. VIP Products.

Ordinarily, authors who write parodies look to copyright limitations and exceptions to protect their rights. In the United States, the doctrine of fair use has been held to permit parody in uses ranging from rap music to children’s books. These fair use rights, the courts have said, have their roots in the U.S. Constitution’s First Amendment protections for freedom of speech.

In a recent case before the U.S. Supreme Court, Jack Daniels v. VIP Products, those parody rights are at risk. In a twist, however, it is not copyright law, but rather an expansive view of trademark law, that poses this threat.

The facts of this case are straightforward: Jack Daniels, creator of the famous Tennessee Whiskey, brought the trademark suit to stop VIP Products from producing a dog toy, which it titled “Bad Spaniels,” in the shape of Jack Daniels’ iconic whiskey bottle and label. Jack Daniels asserts that the Bad Spaniels toy infringes on its trademark and dilutes its brand. VIP Products counters that the toy is meant to parody Jack Daniels’ bottle and is protected speech under the U.S. Constitution’s First Amendment.

Jack Daniel’s Whiskey Bottle (left) and VIP Products’ “Bad Spaniels” dog toy (right). From Jack Daniels Properties, Inc. v. VIP Products, LLC, Case No. 22-148, U.S. Supreme Court, Brief for Petitioner (11 January 2023), page 3, available here.

Although dog toys and whiskey bottles seem relatively inconsequential to literature, parody, and creative work, this case could have a dramatic impact on how authors write about, and parody, famous brands.

Trademarks are a cornerstone of our shared cultural vernacular. Popular brands are woven into the fabric of our national identity, recognizable by and meaningful to those from many different backgrounds. Authors often draw on these shared associations in their literary works, sending beloved fictional characters to real colleges, serving them familiar cereals, and outfitting them in well-known clothing labels. Whether to evoke nostalgia or to immerse their readers, authors use trademarks both to simulate reality and to critique it.

While trademark law aims to protect consumers and prevent confusion as to the source of goods or services, it must be enforced in a manner consistent with the speech protections guaranteed by the First Amendment of the U.S. Constitution. The freedom of authors to use trademarks in their works could be stifled by the threat of litigation. Overenforcement of trademark law runs contrary to both the purpose of intellectual property law and the U.S. constitutional legacy of protecting free expression. Protections for parody in other areas of the law, such as copyright’s fair use doctrine, will be undermined by a trademark ruling that allows for expansive enforcement.

If heightened First Amendment protections are not put in place, the threat of costly legal proceedings may cause creators to avoid the use of trademarks in their artistic works. While trademark law does have other mechanisms to protect authors of parody and commentary, such as a showing that an author’s use does not pose a likelihood of confusion, the process for successfully defending a trademark infringement case is remarkably expensive. In 2020, the American Intellectual Property Law Association reported that the median cost of trademark litigation in the U.S. before even going to trial ranged from $150,000 to $588,000. In the American system, litigants ordinarily bear their own costs, and so even an author who successfully defends such a suit would be on the hook for a large amount in legal fees. While litigation is commonplace for large corporations with significant legal resources, even a single lawsuit could be career-ending for an author without the resources to handle it.

If the threat of legal sanction hangs over the heads of writers, their literary characters may no longer use iPhones, eat at McDonald’s, or visit Disneyland. These uses offer meaningful expressive value to authors. Brands are often intentionally selected as cultural signifiers, chosen for the implicit associations they convey to readers. Cory Doctorow’s Down and Out in the Magic Kingdom (a Disney theme park) would have a different meaning if it were instead titled Down and Out in an Amusement Park. Nor is The Devil Wears Luxury Clothing as evocative as The Devil Wears Prada.

Even when trademarks are evoked in literary circumstances that their owners find distasteful, these uses are still expressive and noncommercial, thus worthy of the highest First Amendment protection. Prioritizing the pecuniary interests of trademark owners over the First Amendment rights of creative artists could lead to a catastrophic chilling effect on authors’ speech based on the perceived risk of litigation, whether or not such risk is actualized. This result is both untenable and entirely unnecessary. It is possible to ensure that trademark owners still have access to a wide variety of robust and reasonable remedies in cases of true infringement without creating unnecessary panic in many other circumstances.

The Supreme Court has a clear doctrinal path to avoiding a speech-suppressive environment. In Rogers v. Grimaldi, 875 F.2d 994 (2d Cir. 1989), the Second Circuit Court of Appeals struck a balance between the interests of trademark owners and First Amendment speech by crafting a clear and efficient test for infringement with appropriate protections for speech. The Rogers court recognized the mark owner’s interest in preventing confusion while ensuring adequate protection for the vital free speech principles at play, and provided a rule to determine at the outset of litigation–before incurring substantial costs–when expressive works infringe trademark rights. Rogers, in short, provided that in cases of artistic or creative works, trademark infringement should only be considered “where the public interest in avoiding consumer confusion outweighs the public interest in free expression.” The court explained that this rule “will normally not support [the] application of [trademark law] unless the title has no artistic relevance to the underlying work whatsoever, or, if it has some artistic relevance, unless the title explicitly misleads as to the source or the content of the work.”

A ruling that substantially adopts a test like that in Rogers would continue to protect the rights of trademark owners, while also giving authors who reference popular brands a clear, consistent, and efficient rule to rely on. A ruling in favor of Jack Daniels, however, could strike fear into the hearts of risk-averse creators, chilling their speech by discouraging them from using certain trademarks in their works altogether. It would undermine the otherwise strong protections that U.S. courts have identified for parodists and other authors in U.S. copyright law, under the doctrine of fair use.

You can read more about our views on the interaction between trademark law and authors’ free expression rights in our amicus brief filed in Jack Daniels v. VIP Products, available here.

Fair Use Week 2023: Looking Back at Google Books Eight Years Later

Posted February 24, 2023
Photo by Patrick Tomasso on Unsplash

This post is authored by Authors Alliance Senior Staff Attorney, Rachel Brooke. 

More recent members and readers may not be aware that Authors Alliance was founded in the wake of Authors Guild v. Google,  a class action fair use case in the Second Circuit that was litigated for nearly a decade, and finally resolved in favor of Google in 2015. The case concerned the Google Books project—an initiative launched by Google whereby the company partnered with university libraries to scan books in their collections. These scans would ultimately be made available as a full-text searchable database for the public to search through for particular terms, with short “snippets” displayed accompanying the search results. Users could not, however, view or read the scanned books in their entirety. The Authors Guild, along with several authors, filed a lawsuit against Google alleging that scanning the books and displaying these snippets constituted copyright infringement.

In addition to Authors Guild representing its members in the litigation, its associated plaintiffs brought the case as a class action, claiming to bring the case on behalf of a broad group of authors:  “[a]ll persons residing in the United States who hold a United States copyright interest in one or more Books reproduced by Google as part of its Library Project” who were either authors or the authors’ heirs.

But many of these authors did not agree with the Authors Guild’s stance in the case, and felt that the Google Books project served their interests in sharing knowledge, seeing their creations be preserved, and reaching readers interested in their work. A group of authors and scholars, many of whom would soon become founding members of Authors Alliance, came together to share their views with the district court. Many of those same authors signed on to amicus briefs before both the district court and Second Circuit explaining why they opposed the litigation and supported Google’s fair use defense. Then, in 2014, Authors Alliance submitted its first amicus brief to the Second Circuit, supporting Google’s ultimately successful fair use defense. The plaintiffs later appealed the Second Circuit’s ruling, asking the Supreme Court to weigh in, but the Court ultimately declined to hear the case, leaving the Second Circuit’s ruling intact.

Nearly a decade later, the effects of Google Books can still be seen in fair use decisions and copyright policy developments involving the challenges of adapting copyright to the digital world. In today’s post, I’ll reflect on how Google Books can be contextualized within today’s fair use landscape and share my thoughts on what the case can tell us about copyright in the digital world. 

Google Books and Transformativeness

A major question in Authors Guild v. Google was whether Google’s use of the copyrighted works was “transformative,” a key component of the fair use inquiry. When a use is found to be transformative, this in practice weighs heavily in favor of a finding of fair use. In the case, the court found that Google’s scanning, as well as the search and snippet display functions, were transformative because the service “augments public knowledge by making available information about [the] books without providing the public with a substantial substitute for . . . the original works.” This was because Google Books provided information about the books—such as the author and publisher information—without creating substitutes of the original works. In other words, readers could learn about the books they searched through, but could not read the books in full—to do this, those readers would have to purchase or borrow copies through the normal channels. 

Since the doctrine of transformativeness was established in the 1994 landmark Supreme Court case, Campbell v. Acuff-Rose Music, there have been myriad questions about the precise contours of what it means for a use to be transformative. Campbell established that a use is transformative when it endows the secondary work with a “new meaning or message,” but it can be difficult to apply this test in practice, particularly in the context of new or nascent technologies. Google Books tells us that scanning works in order to create a full-text searchable database with limited snippet displays is a transformative use based on its new and different purpose from the purpose of the works themselves. Furthermore, it reinforces the notion that a use is particularly likely to be considered transformative when it serves the underlying purpose of copyright law: incentivizing new creation for the benefit of the public and “enriching public knowledge.” By highlighting that Google contributed to public knowledge about books through its scanning activities and the Google Books search function, the court helped bring fair use for scholarship and research—two key prototypical uses established in the 1976 Copyright Act—into the digital age, setting an important precedent for later cases. 

Google Books and Derivative Works

One of the plaintiffs’ arguments in Google Books was that Google’s full-text searchable database constituted a derivative work. One of a copyright holder’s exclusive rights is the right to prepare derivative works—such as adaptations, abridgements, or translations of the original work—and the plaintiffs alleged that this right had been infringed. The court disagreed, finding that Google’s use had a transformative purpose, whereas derivative works tend to involve a transformation in form, such as the adaptation of a novel into a movie or an audiobook. Furthermore, the court explained that derivative works are “those that re-present the protected aspects of the original work, i.e., its expressive content, converted into an altered form[.]” In contrast, the Google Books project provided information about the books and offered a limited “snippet” view, but did not re-present the expressive content: the full text of the books themselves.

The distinction the court drew between transformative fair uses and derivative works in Google Books is an important one, as it can often be a close question whether a work involves a transformative purpose or merely re-presents the same work in a new form, without enough added to tip the scales towards fair use. And it is a question that continues to arise in fair use cases today: just last year, the Supreme Court agreed to hear Warhol Foundation v. Goldsmith, a case about whether Andy Warhol’s creation of a series of screenprints of the late musical artist Prince, which drew on a photograph taken by photographer Lynn Goldsmith, qualified as a fair use. We’ve covered this case extensively on our blog over the past few years, and submitted an amicus brief in the case. Our brief argues (among other things) that Warhol’s screenprints involve much more than a transformation in form: they are stylistically and visually distinct from Goldsmith’s photograph, and endow the photograph with a new meaning or message, making the use highly transformative. 

As in Google Books, the parties and amici in Goldsmith grapple with the line between transformative uses and the creation of derivative works, an often complicated and fact-sensitive determination. In this context, Google Books serves as a reminder that fair use is not a one-size-fits-all determination. Yet it also provides support for arguments advanced by Authors Alliance and others that the mere existence of a transformation in form (in the Google Books case, the transformation from a print book to a scanned copy, and in Goldsmith, the transformation of a black-and-white photo to a series of colorful screenprints) does not mean that a secondary use cannot be a fair one. Warhol’s use did not merely “re-present the protected aspects of the original work[’s] . . . expressive content,” but was transformative in the different “purpose, character, expression, meaning, and message” it conveyed.

Google Books and Controlled Digital Lending

The practice of controlled digital lending (“CDL”)—and the arguments in favor of it constituting a fair use—can be traced back in part to the fair use principles established and reinforced in Google Books. As I argue in our amicus brief in Hachette Books v. Internet Archive, a case about—among other things—whether CDL constitutes a fair use, Google Books shows that copying the entirety of a work in the process of making a transformative use of it can be fully consistent with fair use. 

Another important suggestion in the Google Books case, made at the district court level, was that the Google Books search function could actually drive book sales: the search results were accompanied by links to purchase the book, and research suggested that this could enhance sales of those books. This is analogous to the effects of library lending: library readers often purchase books by authors they first discovered at the library, an effect that can apply with equal force when the library patron borrows a CDL scan. Indeed, several other amici in Hachette Books argue that the finding that the Google Books search was a fair use lends substantial support to the argument that CDL is a fair use, based on both the factual similarities between the two initiatives and their shared objective of “enriching public knowledge.” 

Like Google Books, CDL helps authors reach readers who could not otherwise access their books, and achieves this through scanning books on library shelves. And also like Google Books, CDL helps solve the problem of 20th century works “disappearing”: the commercial life of a book tends to be much shorter than the term of copyright, so when books under copyright go out of print, they can disappear into obscurity. Scanning these books to preserve them ensures that the knowledge they advance will not be lost. 

Google Books and Text Data Mining

Text data mining, the process of using automated techniques to quantitatively analyze text and other data, is also widely considered to be a fair use, and this determination similarly rests in part on the foundation established in Google Books. As was the case in Google Books, the results of text data mining research provide information about the works being studied, and cannot in any way serve as substitutes for the content of the works. In fact, one important aspect of the new exemption to DMCA liability for text data mining, which Authors Alliance successfully petitioned for in 2021, is that researchers are not able to use the works in the text data mining corpus for consumptive purposes. And as with Google Books’s snippet view, researchers are able to view the content in a limited manner to verify their findings. The new TDM exemption was a huge win for Authors Alliance members, and something to celebrate for all scholars engaged in this important research. Importantly, the precedent established by Google Books strongly supported its adoption and the Register of Copyrights’ suggestion that text data mining was likely to be a fair use.

Looking Forward: Google Books and Artificial Intelligence

In recent years, scholars and researchers have grappled with the implications of copyright protection for AI-generated content and AI models more generally. The holding in Google Books provides some support for companies’ and researchers’ ability to engage in these activities: one important factor in the case was that Google Books did not harm the market for the books at issue in the case, since the search results and snippets the database displayed could not serve as substitutes for the books themselves. Similarly, when copyrighted works are used to train AI, the output cannot serve as a substitute for the copyrighted works, and the market for those works is not harmed, even if, like the plaintiffs in Google Books, the copyright holders might prefer that their works not be used in this way. Google Books establishes that the use of copyrighted works as “input” in a given model does not, by itself, mean that the outputs constitute infringement. It is also worth noting that the court found Google’s use to be fair despite the fact that it was a use by a commercial, profit-seeking entity. While a commercial use can sometimes tip the scales against a finding of fair use, this can be overcome by a socially beneficial, transformative purpose. This could arguably apply with equal force to AI models trained on copyrighted works which contribute to our understanding of the world, despite the fact that commercial entities are often the ones deploying these technologies. 

Eight years after it was decided, the legacy of Google Books endures in policy debates and copyright lawsuits that capture the public’s attention. Policymakers and judges would be wise to heed the lessons it teaches about the value of advancing public knowledge through digitization and the use of copyrighted works for new and socially beneficial purposes. As we await policy developments regarding text data mining and wait for decisions in Goldsmith and Hachette Books, it is my hope that this legacy will live on, reminding us all of the vast capabilities of information technology to enrich our understanding of the world and advance the progress of knowledge, which, after all, is what copyright law is all about.