Disney's copyright lawyers cannot touch me. (Maybe.)

published June 8, 2022

Archives have been navigating the treacherous waters of copyright since their inception. For most archival items, determining copyright status is a fairly straightforward (if cumbersome) process: there are charts that can be consulted, donor agreements that can be reviewed, and human-led entities that can be contacted. Even in situations where a work – such as a film – does not have one “author” or “creator,” copyright is usually assigned to a company or producer, and the rights are (usually) not difficult to suss out. As it has in many fields, however, the advent of artificial intelligence (“A.I.”) has upended traditional notions of creation and ownership. There is a sizable gap in the existing literature about this subject, particularly when approached from the angle of archival studies. Archives will need to contend with this change in approach on a theoretical level in order to be prepared for the upcoming practical realities it will bring. This piece explores the ways in which A.I. creates, the copyright issues A.I.-created works raise, and the implications for archives and other memory institutions, using art creation program DALL·E 2 as a case study.

DALL·E 2 (referred to from this point forward as “DALL·E”) is an A.I. program designed to create images and art based on a user-provided textual statement. One example given on the home page for the DALL·E project is “a painting of a fox sitting in a field at sunrise in the style of Claude Monet.” DALL·E is trained (through the analysis of a massive volume of images) to recognize the shape of a fox, or the texture of a Monet painting, and then produce an image that fits the above description. Interestingly, and perhaps somewhat chillingly, OpenAI “does not allow users to generate violent, adult, or political [emphasis added] content, among other categories.” The specific process by which DALL·E is able to discern items, styles, and combinations is beyond the scope of this paper, but the key takeaway is that it is coded to create new images by learning from patterns in other images.

In order to learn, however, this art generator must draw (in both senses of the word) from pre-existing resources. A.I. programs like DALL·E can “create” only by being trained to recognize shapes, textures, and colors from the images that they are fed alongside text cues. The final creation is more like a remix or a reimagining, as if a hip-hop artist made a “new” track only by sampling the work of others. Crucially, these resources are all created by humans, and many are still under copyright. DALL·E’s documentation reveals that the program can even create images featuring characters and logos that are explicitly trademarked: “the model can generate known entities including trademarked logos and copyrighted characters.” The documentation goes on to suggest that “OpenAI [the company behind the project] will evaluate different approaches to handle potential copyright and trademark issues, which may include allowing such generations as part of ‘fair use’ or similar concepts, filtering specific types of content, and working directly with copyright/trademark owners on these issues.” As of now, however, OpenAI does not work with copyright holders regarding either the works used to train the system or the works it creates.

This confluence of “work” from several different angles provokes questions about the nature of ownership, particularly in the copyright and provenance of art. Historically, copyright has been extended only to works created by humans. As Hirtle, Hudson, and Kenyon make clear in their discussion of copyright issues for cultural institutions: “the common element of protected works is that they have been created through human effort. There are no copyright issues when digitizing items from natural collections, such as fossils, plant or animal specimens, and geological formations” (p. 32). The big question, then, is who “owns” A.I.-generated art? Several potential answers present themselves: the creators of the A.I., the people whose art was used to train the program, the users who entered a prompt, and the A.I. itself.

The last option is, at least in the United States, an impossibility under current law. (In the United Kingdom and European Union, this possibility is more complicated due to a 2017 European Parliament resolution proposing a legal status of “electronic persons.”) As settled in the Ninth Circuit case Naruto v. Slater, a non-human entity (in this case, a selfie-taking monkey) does not have standing in court and thus cannot claim authorship or copyright over a work. While this case was not specifically about A.I., Matthew Hooker explains that “under a sort of ‘transitive property’ principle, it could be inferred that unless Congress explicitly grants standing to AI entities, then the works created by those entities cannot be ‘authored’ by the AI. The Copyright Office’s Compendium’s standard would also preclude AI entities from being authors” (p. 28). However, Hooker goes on to suggest that “if a human created the AI software, then it is possible that the human creator might hold the copyright, even if the AI cannot” (p. 31). It is also possible that no person owns the copyright. There is even an argument to be made that while single works of art created by DALL·E could not be copyrighted, the collective work it creates could be considered a piece of conceptual art, owned and copyrighted by the creator – but only as a whole unit. That these are only possibilities and not matters of settled case law makes any of them a potentially dangerous conclusion to draw in a vacuum.

Consider the following hypothetical: an A.I. is programmed specifically to make minor tweaks to characters and logos under existing copyright, trained on hundreds of thousands of frames of Mickey Mouse and Bart Simpson. A user “creates” a work by typing in a prompt regarding a character, and the A.I. produces the corresponding image. Klaris and Bedat argue that “the person in control of the bot is the author worthy of Constitutional protection,” but consider a painter who creates a portrait of Mickey Mouse and attempts to sell it, only to be shut down by Disney. Why would programming and training an A.I. to do the same thing in a digital realm rather than a physical one provide the programmer with protections that they would not otherwise be afforded? Such a scenario may seem needlessly fantastical, but it is not entirely unlike what DALL·E is programmed to do. DALL·E’s specific mission is to make pieces of art that are recognizable – although perhaps not easily attributable to a single artist or creator. The works it creates are of shapes and styles captured or created by humans who are capable of exercising their copyright claim in a court of law.

For memory institutions, the implications of the above explanations are staggering. If the author of an A.I. like DALL·E donates a number of works created by that new program to an archive, and it is later decided in court that copyright is owned by a different entity, the archive could face legal repercussions for even acquiring the pieces, to say nothing of making them publicly accessible or placing them online. With an A.I.-generated piece having sold in 2018 at Christie’s auction house for $432,500, archives and museums have no choice but to consider this A.I.-led movement alongside human art styles. Simply not accepting or seeking out these works until the law is settled – a process that could take years – would risk missing perhaps the only opportunity that will be available to preserve them. Digital information, and by extension digital art, vanishes quickly; archives must be proactive in its preservation. Art generated by an A.I. is nonetheless work that belongs in memory institutions, and art created in an emerging field involving new technology has an even greater claim to importance.

This set of hurdles further complicates the preservation of born-digital items, a field already wrestling with a number of pressing concerns. Copyright issues plague software preservation circles – a glance at the “Law & Policy” page of the Software Preservation Network (SPN) will provide a sense of how complicated determining the ownership of even the most basic software can be. An A.I., which may be built on open-source code, trained with the work of human artists, and operated by a creative end user, is undoubtedly an even more knotted copyright mess to untangle. To preserve the art created by an A.I. is one thing; to preserve the A.I. itself (arguably equally important for contextual purposes) poses another set of problems.

Thus far, however, the field has barely begun to discuss these upcoming challenges. The vast majority of archival studies literature centered on A.I. has focused on its potential as a preservation or accessibility aid. Few scholars have discussed the potential hazards and difficulties of preserving A.I.-generated art – a search of several journals and databases returned no salient results, and scanning the Society of American Archivists section pages (Web Archiving, Preservation, etc.) offered no further direction for research. This gap in the literature is immense. If preserving A.I. work is of importance to archivists – and it is the position of this paper that it is – then the profession needs to be at the forefront of the theory underpinning an approach to the subject. It is of pressing importance that scholars begin considering the full implications of A.I.-created art.

Artificial intelligence poses many obstacles in a number of areas of study, but archives are uniquely poised to encounter issues that will resonate beyond their walls – issues of legality, ownership, and preservation, among others discussed above. It is imperative that the field undertake the work necessary to advocate for a clearer understanding of what copyright protections exist for A.I.-generated materials. Should it fail to do so, the repercussions, though artificial at the moment, could become very real.