When Meta’s services went down this past October, users were unable to access all of Meta’s applications, including Instagram, Messenger, and WhatsApp. This digital outage had physical consequences, as some Meta employees got locked out of their offices. The effects rippled outside of Meta’s own ecosystem, as some consumers soon discovered they were unable to log in to shop on select e-commerce websites, while others quickly found out that they could no longer access the accounts used to control their smart TVs or smart thermostats. Drawn by the ease of using Facebook accounts to log into websites, users had come to allow their Facebook account to act as a kind of digital identity. The outage, along with revelations from a fortuitously timed whistleblower, reminded users just how much individuals and governments depend on the “critical infrastructure” Facebook provides. Lawmakers in the U.S. have struggled with the question of how Meta should be regulated, or how its power should be reined in. One step towards mitigating Meta’s power would be to develop alternative digital Identity Management (“IdM”) systems.
Technology has been used to verify identity for hundreds of years. Back in the third century B.C.E., fingerprints, recorded in wax, were used to authenticate written documents. For centuries, identification technology has allowed strangers to bridge a “trust gap” by authenticating and authorizing.
In the present day, IdM systems have become a critical piece of technology for governments, allowing for the orderly provision of a range of services, like healthcare, voting, and education. IdM systems are also critical for the individual, because they allow a person to “prove one’s status as a person who can exercise rights and demand protection under the law.” The UN went so far as to describe an individual’s ability to prove a legal identity as a “fundamental and universal human right.”
Currently, there are over one billion people who live in the “identity gap” and cannot prove their legal identity. Put another way, one billion people lack a fundamental, universal human right. What makes this issue more pernicious is that the majority of individuals in the identity gap are women, children, stateless individuals and refugees. The lack or loss of legal identity credentials is correlated with increased risk for displacement, underage marriage, and child trafficking. Individuals living in the “identity gap” face significant barriers to receiving “basic social opportunities.”
The legal and social issues created by the “identity gap” are now evolving. In addition to the individuals who can’t prove their legal identity at all, there are over 3.4 billion people who have a legally recognized identification, but cannot use that identification in the digital world.
A 2017 European Commission Report found that an individual’s ability to have a digital identity “verg[es] on a human right.” The report then argued that one of the deep flaws of the internet is that there is no reliable, secure method to identify people online. The New York Times called this “one of biggest failures of the… internet.” Still, proving digital identity isn’t just a human rights issue; it’s also critical for economic development. A McKinsey report posited that a comprehensive digital IdM system would “unlock economic value equivalent to 3 to 13 percent of GDP in 2030.”
Digital IdM systems, however, are not without risk. These systems are often developed in conjunction with biometric databases, creating systems that are “ripe for exploitation and abuse.”
The most common IdM scheme is a “centralized” system; in a centralized IdM scheme, a single entity is responsible for issuing and maintaining the identification and corresponding information. In centralized IdM schemes, identity is often linked to a certain benefit or right. One popular example in the United States is the Social Security Number (“SSN”); SSNs are issued by the Social Security Administration, which then uses that number to maintain information about what social security benefits an individual is eligible to receive. Having an SSN is linked to the right to participate in the social security system.
Centralized IdM schemes typically verify identity in one of two ways: via a physical document with anti-forgery mechanisms or via a registry. These systems have proved remarkably resilient for a few reasons. They are easily stored for long periods of time and can be easily presented for many different kinds of purposes. Still, both ways have shortcomings, including function creep and lack of security.
Identity systems that rely on anti-forgery mechanisms, like signatures, watermarks, or special designs, can also have security flaws. First, these documents require the checking party to validate every anti-forgery mechanism; this might require high levels of skill, time, or expertise. Additionally, once a physical identification is issued, the issuing party is generally unable to revoke or control the information. Finally, anti-forgery measures constantly need to be updated because parties have great incentives to create fake documents.
Another security shortcoming of centralized IdM systems is that they rely on registries to contain all their data. Registries are problematic because they have a single point of failure. If one registry is compromised, an entire verification system can be undone. For instance, if SSNs became public, the SSN would become worthless; the value is in the secrecy.
Equally significant is the possibility of function creep, which can happen when a user loses control of their identification. SSNs, for example, were designed for a single purpose: the provisioning of social security benefits. Today, SSNs serve as a ubiquitous government identifier that is “now used far beyond its original purpose.” This is problematic because SSNs contain “no authenticating information” and can easily be forged. It’s not just governments, however, that allow function creep in centralized IdM systems. This happens for privately managed identity systems as well, as the Facebook outage showed.
The Alternatives: Individualistic and Federated IdM Models
Another type of IdM system is an individualistic or “user-centric” system. The goal of these systems is to allow the user to have “full control of all the transactions involving [their] identity” by requiring a user’s explicit approval of how their identity data is released and shared. Unlike those in “centralized” schemes, these types of identification do not grant any inherent rights. Instead, they give individuals the ability to define, manage, and prove their own identity.
To date, technical hurdles have prevented the widespread adoption of these “user-centric” systems. Governments and private companies alike have proposed using blockchain to create IdM systems that allow individuals to access their own data “without the need of constant recourse to a third-party intermediary to validate such data or identity.” There is hope that blockchain can provide the technical support to create an “individualist” IdM system that is both secure and privacy-friendly. Still, these efforts are in their infancy.
The last major type of IdM system is a federated model. Federated IdM systems require a high degree of cooperation between identity providers and service providers; the benefit is single sign-on (SSO) capabilities, whereby a user can use their credentials from one site to access other sites. This is similar to the Facebook model of “identity.” The linchpin of any such system, however, is the identity of the “trusted external party” that acts as the verifier. The risk is that these systems lack transparency, meaning users might not know how their data is used.
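The division of labor in a federated model can be illustrated with a minimal sketch. It assumes, purely for illustration, that the identity provider and the service provider share a secret key and that the provider signs a small set of claims with an HMAC. The names `SHARED_SECRET`, `issue_assertion`, `verify_assertion`, and `example-idp` are hypothetical; real deployments typically rely on asymmetric signatures and standards such as SAML or OpenID Connect rather than this simplified scheme.

```python
import hashlib
import hmac
import json

# Hypothetical secret agreed in advance between the identity provider
# and the service provider (an assumption made for this sketch only).
SHARED_SECRET = b"demo-secret-agreed-between-provider-and-site"


def issue_assertion(user_id: str, secret: bytes = SHARED_SECRET) -> dict:
    """Identity provider: sign a statement that vouches for the user."""
    claims = json.dumps({"sub": user_id, "issuer": "example-idp"}, sort_keys=True)
    signature = hmac.new(secret, claims.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"claims": claims, "signature": signature}


def verify_assertion(assertion: dict, secret: bytes = SHARED_SECRET) -> bool:
    """Service provider: accept the login only if the signature checks out."""
    expected = hmac.new(
        secret, assertion["claims"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, assertion["signature"])


# The service provider never sees a password; it trusts the provider's signature.
token = issue_assertion("alice")

# A forged assertion: the claims are altered but the old signature is reused.
forged = {
    "claims": token["claims"].replace("alice", "mallory"),
    "signature": token["signature"],
}

print(verify_assertion(token))
print(verify_assertion(forged))
```

The sketch also shows why the choice of verifier matters: every participating site depends on the one party that holds the signing key, so when that party is unavailable, no site can confirm anyone's identity.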
Using Facebook to verify identity online is quick and easy. Yet this system is inadequate. An individual’s ability to state, verify, and prove their digital identity will be “the key to survival,” particularly given how difficult it is to create trust in the digital space. Proving identity is a technical problem, but this technical problem is closely linked with an individual’s ability to act as a citizen, in person or online. Governments and corporations alike have recognized the importance of improved digital identity systems and have begun advocating for more standardized identity systems. Detractors of digital identification systems argue that an individual’s identity should not depend on the conferral of documents by a third party, and that relying on these types of documents is contrary to the idea that humans have inherent rights. They’ll then quickly point to examples of authoritarian governments who use identity tracking for evil purposes. These criticisms ignore the reality that proving identification is already an essential part of life and that many rights are only conferred when you have the proper identification. Further, these criticisms fail to recognize that superior identification systems will provide benefits that will accrue to society as a whole. They could be used to record vaccination status, fight identity fraud, or even to create taxation systems based on consumption.
Identification and identity are closely linked. As we transition towards even more digital services, taking steps to ensure that we have control over our digital identity will be more than a technology or privacy problem. Our ability to have and control our identity will continue to be a key driver of social and economic mobility.
 In this context, authentication is the ability to prove that a user is who they say they are, and an authorization function shows that the user has the rights to do what they’re asking to do.
 Function creep occurs when a piece of information or technology is used for more purposes than originally intended.
Henry Rittenberg is a 2nd year student in Northwestern’s JD-MBA program.
When a musician desires to record a cover version of a song (i.e., their own version of a song written or made famous by someone else), the process for obtaining the rights to do so is quite simple: they obtain a mechanical license, a compulsory license that can be obtained by paying the appropriate fee to the copyright holder or their representative, but only after the copyright holder has exercised their right of first publishing. And that’s all there is to it. The artist records their version of the song, releases it into the world, pays royalties to the copyright holder, and, so long as they have abided by all the aforementioned steps, they likely do not run into any issues relating to this process. This process is simple, but at a time when musicians liberally borrow material from others and the resulting songwriting credits read like novellas, it creates issues in a discrete subset of cases.
17 USC § 115(a)(2) provides that
A compulsory license includes the privilege of making a musical arrangement of the work to the extent necessary to conform it to the style or manner of interpretation of the performance involved, but the arrangement shall not change the basic melody or fundamental character of the work, and shall not be subject to protection as a derivative work under this title, except with the express consent of the copyright owner.
Under this provision, arrangements of musical works prepared under mechanical licenses do not receive copyright protection unless the copyright holder expressly consents. For most musicians preparing recordings under this type of license, this is not a very concerning issue: they still retain a copyright in their sound recording, and they are likely to be the only users of their arrangement. But consider the following situation: an artist records a cover under a mechanical license using an original arrangement, and a later artist borrows elements of that arrangement for a new song.
What happens here? The arrangement is not provided copyright protection, but the new artist has to credit someone for the ideas that they repurposed in their new song. So, the songwriting credit for the portion borrowed goes to the original songwriters, even though they were not involved in writing the particular musical ideas.
Consider the following real-life example in order to understand the strange results that occur in these situations. In 1979, “And the Beat Goes On” was released by The Whispers. In 2002, “Reggae Beat Goes On,” a reggae cover of “And the Beat Goes On,” was released by Family Choice. Then in 2019, “How Long?”, a song containing elements of “Reggae Beat Goes On,” was released by Vampire Weekend.
“Reggae Beat Goes On” is a cover of “And the Beat Goes On,” and uses a very original arrangement in order to shift the genre from disco to reggae. The melody and lyrics remain the same and the chord structure is nearly the same as well, but the rest of the musical setting is different. “How Long?” takes elements from the musical setting of “Reggae Beat Goes On” and interpolates them in the new song. Namely, the guitar part from “Reggae Beat Goes On” is played on bass guitar in “How Long?” There are string parts brought over to “How Long?” from “Reggae Beat Goes On,” and the chords in both songs match as well.
Who are the credited writers for “How Long?”? There are five: Ezra Koenig, member of Vampire Weekend; Ariel Rechtshaid, producer of the record; and William Shelby, Stephen Shockley, and Leon F. Sylvers III, all writers of “And the Beat Goes On.” Bill Campbell, arranger of “Reggae Beat Goes On,” receives no songwriting credit for “How Long?” despite having written the external elements that are present in “How Long?”, while Shelby, Shockley, and Sylvers receive credit despite not having written the material that was borrowed.
The result here seems absurd: the people receiving credit (and thus, also receiving royalties) did not actually write the borrowed material. However, the result is in line with the purposes of and justifications for US copyright law. US copyright law is explicitly founded upon a utilitarian theory. Under a utilitarian theory, lawmakers have to decide which types of works to prioritize for copyright protection. And in reading 17 USC § 115(a)(2), it is clear that Congress chose not to give priority to arrangements of songs prepared under mechanical licenses. If US copyright law were founded upon a natural rights theory and all new works were automatically granted protection, this problem would likely not exist, but that is not the case; Congress must weigh the costs and benefits of expanding protection. Here, the absurd results justify Congress intervening.
One way to remedy this problem is to revise 17 USC § 115(a)(2) to automatically extend protection to arrangements prepared under mechanical licenses, even if the copyright owner has not expressly granted consent for the arrangement to receive copyright protection. If drafted carefully, the consequences would be minimal: the law would have to delineate which elements of the new arrangement are not eligible for protection, and in order to do that, the arranger would only have to look back to see what elements of the original song are afforded protection. Those protected elements would then be excluded from the protection afforded to the arrangement, while the rest of the elements, as well as the arrangement as a whole, would receive copyright protection. A revised version of 17 USC § 115(a)(2) could read as follows:
A compulsory license includes the privilege of making a musical arrangement of the work to the extent necessary to conform it to the style or manner of interpretation of the performance involved, but the arrangement shall not change the basic melody or fundamental character of the work. All elements of the arrangement that are not subject to copyright protection under the original copyright shall be subject to protection as a derivative work under this title.
Congress made an explicit choice to not automatically extend copyright protections to arrangements prepared under mechanical licenses. In order to prevent absurd results that end with authors receiving credit for material they did not write and thus, receiving royalties for this work that they did not create, it would be prudent of Congress to revise 17 USC § 115(a)(2) to automatically provide copyright protection to arrangements of musical works prepared under mechanical licenses.
Michael Pranger is a 2nd year JD student at Northwestern Pritzker School of Law.
What’s The Issue?
It seems logical that the creator of a work would own the rights to that work. This general idea imports easily into some industries but creates problems in the music industry. The reality is that the main rights holders of a creative musical work are often not the musicians but collective management organizations (CMOs). After pouring countless hours, days, months, and years into perfecting a single musical work or album, the musician often ends up not having total control over his or her work. The music industry is driven by smoke and mirrors, where the distributors and record labels often do not disclose who owns the rights to which musical work. George Howard, co-founder of a digital music distributor called TuneCore and professor at Berklee College of Music, describes the music industry as one that lacks transparency. He explains that the music industry is built on asymmetry where the “under-educated, underrepresented, or under-experienced” musicians are deprived of their rights because they are often kept in the dark about their rights as creators.
As a result of the industry having only a few power players, profit is meager for musicians. Back in the day, musicians and their labels were able to get a somewhat steady source of income through physical album sales. However, with the prominence of online streaming, their main source of income has changed. The source of this issue seems to stem from how creators’ rights are tracked and managed.
A piece of music has two copyrights, one for the composition and one for the sound recording, and it is often difficult to keep track of both because the ownership of these rights is split amongst several songwriters and performers. The music industry does not have a way to keep track of these copyrights, and this is an issue especially when there are several individuals involved in creating a single musical work. With the development of digital ledger technology and its influence in various industries, it could be time that this development makes its way into the music industry and provides a solution to compensate musicians for their lost profits.
Blockchains: the solution?
Lately, blockchain technology has been at the forefront of conversations. For example, the variation in Bitcoin’s pricing has been a hot topic. Blockchain technology seems like a mouthful, but it is simply a “database maintained by a distributed network of computers.” Blockchains allow information to be recorded, distributed across decentralized ledgers, and stored in a network that is secure against outside tampering.
With the advancement of online music streaming, and entertainment going digital, blockchain seems like the perfect tool to be used in this industry. Since the issue of weakened profits seems to stem from disorganized tracking and monitoring of creators, blockchain technology could be utilized to improve the systems used for licensing and royalty payments. A blockchain ledger would allow a third party to track the process of a creative work and be an accessible way of managing intellectual property rights of these creative works. By tracking and monitoring their works, musicians could potentially gain back their profits, or at least recuperate some of their losses.
In 1998, several companies came together to create a centralized database to organize copyrights for copyright owners so that royalty payments would be made in an orderly fashion. This effort was called the Secure Digital Music Initiative (SDMI) and its purpose was to “create an open framework for sharing encrypting music by not only respecting copyrights, but also allowing the use of them in unprotected formats.” Unfortunately, this initiative failed to provide a universal standard for encrypting music.
The latest venture was the Global Repertoire Database (GRD) which aimed to “create a singular, compiled, and authoritative ledger of ownership and control of musical works around the world.” This was a very ambitious move and required two rounds of financing which consisted of the initial startup funds and the funds to cover the budgeting for the year. Although there were significant contributions to this mission, some collection societies, such as the American Society of Composers, Authors and Publishers (ASCAP), started to pull out of the fund due to GRD’s failure and debt that it accumulated.
Even though this venture failed to provide a centralized database that could resolve royalty and licensing issues, there is now a growing consensus in the music industry for a global, digital database that properly, and efficiently, manages copyright ownership information. The next venture could utilize blockchain technology because of the advantages for storage, tracking, and security that it offers. In addition, not only could blockchain provide a centralized database so that music content information is accurately organized, it could provide a way to close the gap between creators and consumers and dispose of intermediaries. This would allow for a more seamless experience and transparency for the consumer and allow the creators to have more control over their works. Further, this ledger would allow these creators to upload all of their musical work elements, such as the composition, lyrics, cover art, video performances and licensing information, to a single, uniform database. This information would be available globally in an easily verified peer-to-peer system.
On the other hand, since blockchains are tamper-resistant, the data could not be “changed or deleted without affecting the entire system” even with a central authority. This means that if someone decides to delete a file from the system, such a deletion will disrupt the whole chain. There could also be issues with implementing such a large network of systems, or computers, due to the sheer amount of music that is globally available. Additionally, to identify each registered work, the right holders have to upload digital copies of their works which would require an extensive amount of storage and computational power to save entire songs.
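The tamper-resistance described above can be sketched in a few lines. The example below is a simplified, single-machine illustration and not how any particular blockchain is implemented: each ledger entry stores the hash of the entry before it, so removing or editing one entry breaks the links that follow. The function names (`entry_hash`, `build_chain`, `is_valid`) and the sample works are hypothetical, and a real network would add distributed consensus on top of this structure.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(record: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records: list) -> list:
    """Link each record to its predecessor through the stored hash."""
    chain = []
    prev = GENESIS
    for record in records:
        current = entry_hash(record, prev)
        chain.append({"record": record, "prev": prev, "hash": current})
        prev = current
    return chain


def is_valid(chain: list) -> bool:
    """Recompute every link; any deletion or edit makes a link fail."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if entry_hash(block["record"], block["prev"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True


# Hypothetical ownership records for three musical works.
works = [
    {"title": "Song A", "owner": "Writer 1"},
    {"title": "Song B", "owner": "Writer 2"},
    {"title": "Song C", "owner": "Writer 3"},
]
chain = build_chain(works)

# Deleting the middle entry leaves the third entry pointing at a missing hash.
with_deletion = chain[:1] + chain[2:]

# Rewriting an owner without recomputing hashes is detected as well.
with_edit = [dict(block) for block in chain]
with_edit[0]["record"] = {"title": "Song A", "owner": "Someone Else"}

print(is_valid(chain), is_valid(with_deletion), is_valid(with_edit))
```

Note that the sketch stores only small ownership records. Storing the audio itself on the ledger, as the paragraph above contemplates, is what would drive the storage and computation costs.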
Nevertheless, blockchain could provide the base for implementing a centralized database using a network of systems, or computers, in order to organize royalty payments for these musicians. Proponents contend that, with the help of Congress, this could be made possible. Congress recently introduced Bill HR 3350, Transparency in Music Licensing and Ownership Act. This act, if passed, will require musicians to register their songs in a federal database or else forfeit the ability to enforce their copyright, which would prevent them from collecting their royalties for those works. Although this might seem like an ultimatum, this proposed Act would provide the best way of changing how the music industry stores its information to provide an efficient way to distribute royalties and licensing payments to these artists.
People are split in their opinions about blockchain technology in the music industry. There are some who see this as a more accurate way of managing “consumer content ownership in the digital domain.” Others do not see this as a viable plan due to its lack of scalability to compensate for the vast amount of musical works. Even with the development of the music industry into the digital field, the goal is always to protect the artists’ works. Plan [B]lockchain ledger may not completely solve the royalty problem in the music industry, but it can provide a starting point in creating a more robust metadata database and, in combination with legislative change, the musical works could remain in the hands of their respective owners.
Jenny Kim is a second-year law student at Northwestern Pritzker School of Law.
Throughout the past two years, AI-powered stem-splitting services have emerged online, allowing users to upload any audio file and access extracted, downloadable audio stems. A “stem” is an audio file that contains a mixture of a song’s similarly situated musical components. For example, if one records a mix of twenty harmonized vocal tracks, that recording constitutes a vocal stem. Stems’ primary purpose is to ease integrating or transferring their contents into either a larger project or a different work. Traditionally, only producers or engineers created and accessed stems. Even when stem sharing became commonplace, it was only for other industry insiders or those with licenses. But AI stem-splitting technology has transformed stem access. For the first time, anyone with internet access can obtain a stem through stem extraction software, which will likely push music production’s creative envelope into new realms. One inevitable consequence, however, is the question of copyright protection over stems extracted from copyrighted works.
Section 102 of Title 17 extends copyright protection not only over the stems’ original copyrighted audio source but also over that source’s components, such as the stems. Any modification of that work, such as extracting a stem and using it elsewhere, likely qualifies as a “derivative work” under Section 103. Importantly, Section 106 allows only copyright owners to authorize making derivative works. In light of this regime, what flexibility, if any, do artists have in using AI-extracted, copyrighted stems? Three considerations shed light on an answer: fair use, de minimis use, and the use of content recognition software coupled with licensing.
Codified under Section 107, the fair use defense provides a possible safeguard for would-be infringers. To establish this affirmative defense, a court would need to find the statute’s four factors sufficiently weigh toward “fair use.” Unfortunately, courts reach incongruous interpretations of what permissible fair use includes, rendering the defense a muddled construct for many artists. Squaring the four factors with stem usage, however, may offer guidance.
In the seminal music fair use case, Campbell v. Acuff-Rose Music, Inc., the Supreme Court emphasized that the first factor, the purpose and character of the use, will likely weigh toward fair use when the work is “transformative.” The Court went so far as to note “the more transformative the new work, the less will be the significance of other factors . . . .” Thus, an artist fearing infringement should strive toward transforming the copyrighted material into something distinct, used for noncommercial purposes. In the absence of a bright-line rule from the Court, however, artists will still need to use reasonable judgment about what types of stem usage are “distinct.” For example, suppose Artist A extracts a strings stem from a copyrighted work and only uses two seconds of it within another work that comprises numerous other instruments and melodies. Meanwhile, Artist B extracts the same strings stem; however, Artist B uses the entire strings melody within their work and only adds percussion and minor counter-melodies. Artist A would likely be in a more favorable legal position than Artist B given A’s efforts to materially transform the copyrighted audio.
The commercial nature of copyrighted stem usage is also unclear. An artist may choose to work with stems for solely experimental purposes. For example, an artist who shares their work via Soundcloud or YouTube does not expect another person to use those platforms to directly purchase the work. With stems’ increasing public accessibility, many will simply want to experiment with a music tool that, until recently, has largely remained a foreign concept. If this issue reaches a court, the court would need to conduct an analysis set against the landscape of such heightened accessibility. An increase in this noncommercial, creative use may offer hope to artists in the future, but it is too soon to tell.
The second factor, the nature of the copyrighted work, favors artists borrowing from copyrighted works with lower creative value. Unfortunately, music is typically found to be one of the most creative forms of copyrighted work. For example, a district court in UMG Recordings, Inc. v. MP3.Com, Inc. analyzing this second factor noted that the disputed material, copyrighted musical works, was “close to the core of intended copyright protection” and “far removed from the more factual or descriptive work more amenable to ‘fair use.’”
Though courts’ future inquiries into stem usage may differ from previous analyses of sample usage, the inquiry will likely change very little for this factor. Although a stem could potentially represent only a minute portion of the song, this factor’s inquiry focuses on the source of the stem, rather than the stem itself. Consequently, rarely will this factor work to a potentially infringing artist’s benefit, even if their stem usage is quite minor.
The third factor, however, may offer hope for such minor stem usage. Courts will undoubtedly reach differing interpretations about how minimal the copyrighted portion’s “amount” and “substantiality” must be for this factor to weigh toward fair use. A court will need to weigh numerous variables and how they intersect. For example, is an artist using a thirty-second loop of a vocal stem or a five-second loop? Does that vocal stem include the chorus of a song? What about any distinct lyrics? Just minor humming? These considerations are not entirely novel. Artists purporting to use copyrighted samples have long been able to argue—with little success—that their samples’ amount and substance pass muster under this factor. Yet, stems are not samples; in fact, they typically represent a considerably smaller portion of a work. Given just how recent and novel their public accessibility is, it remains unclear whether a court would treat stem use any differently under this factor than it has treated instances of sample use. Carefully using a minor portion of a vestigial stem to avoid a work’s core substance, therefore, could potentially facilitate a favorable outcome.
The fourth and final factor of fair use is “[u]ndoubtedly the single most important element.” This factor examines both the infringement’s effect on the potential market and “whether unrestricted and widespread conduct of the sort engaged in by the defendant . . . would result in a substantially adverse impact on the potential market for the original.”
In the sampling realm, this factor has tipped the scales before. For example, in Estate of Smith v. Cash Money Records, Inc., a district court found fair use when the defendants inserted a thirty-five-second “spoken-word criticism of non-jazz music” into a hip-hop track. In its analysis of this fourth factor, the court emphasized that “there [was] no evidence” that pointed to overlapping markets between the spoken jazz track and the hip-hop track. The court, noting this factor’s high probative value, then weighed this factor in the defendants’ favor.
In the stem realm, the novel nature of widespread public use means courts will need to determine both whether this factor should remain highly probative and how much deference to give stem users in analyzing market overlap. After all, an artist who incorporates a stem into a work intended for a twenty-five-person YouTube following likely affects the original work’s market differently than an artist who disseminates that work to millions of followers. This factor’s outcome will also rely on the stem’s source. Similar to sampling, if an artist uses a stem in a drastically different arena than the one for which the stem was created, this factor will weigh more toward fair use.
For example, suppose Artist A locates an insurance advertisement jingle. Artist A then extracts a stem from that advertisement audio and uses the stem in a new hip-hop track. The advertisement’s potential market is likely different from the hip-hop track’s potential market. Artist A’s work would likely have little impact, if any, on the advertisement’s market. Artist B, meanwhile, creates a hip-hop track but uses a stem from another hip-hop song produced twenty-five years ago. Though Artist B may believe the stem from the older hip-hop track no longer caters to the same hip-hop market that Artist B is targeting, a court may be more inclined to find a material impact on the older track’s market: it would provide another way in which music listeners, particularly hip-hop listeners, could hear that older track. Nonetheless, it would remain for a court to decide.
De Minimis Use
Artists might also be able to use extracted, copyrighted stems if such use is de minimis. The Ninth Circuit in Newton v. Diamond held de minimis use—“when the average audience would not recognize the appropriation”—is permissible. Yet, following the Newton decision, the Sixth Circuit in Bridgeport Music, Inc. v. Dimension Films foreclosed the possibility of de minimis copying. Similar to fair use analysis, it is difficult for an artist to determine whether the use of a stem in their work is de minimis under this standard.
Thus, an artist who loops only a small, relatively generic-sounding portion of a stem may find additional legal protection. But they might not. If they are in a jurisdiction that does not recognize de minimis use, or they use a stem in a way that extends beyond what a court considers de minimis in a de minimis jurisdiction, this avenue will be unavailable.
Content Recognition Software
Beyond legal defenses, a newer scheme of licensing deals coupled with content recognition software may offer protection for stem usage. For example, if a user on the content platform TikTok uploads content with copyrighted audio, TikTok’s content recognition software recognizes the audio, then pays the appropriate royalties to the audio’s copyright holder through preexisting licensing deals. Yet, because schemes like TikTok’s and stem usage are both relatively new, it remains unclear whether artists could find the same protection through individual stem use. Indeed, if an artist uses only a small part of a single stem, it may very well be impossible to detect the stem’s source; however, emerging technology may change this soon. Further, these licensing deals restrict such artists to sharing work only on particular platforms (notably, neither SoundCloud nor YouTube). Ultimately, this protection carries promising potential for expanded, authorized stem use. But perhaps not quite yet.
Matthew Danaher is a second-year law student at Northwestern Pritzker School of Law.
What are Copyright Bots?
Digital media platforms like Instagram, YouTube, and TikTok are now the venues of choice for creators, new and old, to showcase their work. The accessibility of these platforms opens an untapped market of creativity, allowing anyone with a smartphone to disseminate their work. With this great freedom, though, the potential for copyright infringement skyrockets.
This is where copyright bots come in. Also known as content recognition software, copyright bots are automated systems programmed into digital media platforms that compare uploaded content against an archive of copyrighted content to recognize similar works. The copyright owners, or the platforms utilizing copyright bots, can then examine the similar works to determine whether the work is an actual copy, whether the copy is licensed, and whether legal action to remove the work is merited. Some bots can even handle the enforcement of the copyright by sending out notices and handling appeals.
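The matching step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each work is a list of integer samples and that a "fingerprint" is the set of hashes of every overlapping window of samples. The names `fingerprint` and `find_matches`, the window length, and the threshold are all hypothetical and do not describe any platform's actual system.

```python
import hashlib

WINDOW = 4        # hypothetical window length, in samples
THRESHOLD = 0.5   # hypothetical share of the upload that must match


def fingerprint(samples, window=WINDOW):
    """Return the set of hashes of every overlapping window of samples."""
    prints = set()
    for i in range(len(samples) - window + 1):
        chunk = ",".join(str(s) for s in samples[i:i + window])
        prints.add(hashlib.sha256(chunk.encode()).hexdigest())
    return prints


def find_matches(upload, archive, threshold=THRESHOLD):
    """Return (title, overlap) pairs whose overlap meets the threshold."""
    upload_prints = fingerprint(upload)
    if not upload_prints:
        return []
    matches = []
    for title, reference in archive.items():
        shared = upload_prints & fingerprint(reference)
        overlap = len(shared) / len(upload_prints)
        if overlap >= threshold:
            matches.append((title, overlap))
    return matches


# Toy archive of "copyrighted" works (made-up data).
archive = {
    "protected_song": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "other_song": [50, 51, 52, 53, 54, 55, 56, 57],
}
upload = [3, 4, 5, 6, 7, 8, 90, 91]  # reuses a run from protected_song
flagged = find_matches(upload, archive)
print(flagged)
```

Note what the sketch does not do: it reports that an upload resembles an archived work, but it has no way to tell whether the resemblance is licensed, de minimis, or a fair use. That judgment is left to whoever, or whatever, acts on the match.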
The Digital Millennium Copyright Act (DMCA) of 1998 is one explanation for the popular use of copyright bots. Under the Act, online service providers can be held liable for direct infringements of copyrighted work committed by their users when the providers are aware of them, regardless of whether the providers are responsible for or receive any kind of financial benefit from the infringement. Fortunately, the safe harbor provisions of the DMCA provide immunity from secondary infringement liability as long as the online provider swiftly removes or disables access to the infringing material. This incentivizes providers to use automated systems capable of acting expeditiously.
Another explanation for the use of copyright bots is necessity. There is simply too much new content being uploaded for humans to manually search on their own. To put it in perspective, 720,000 hours of video are uploaded to YouTube every day. If human reviewers are equivalent to a guard dog, then copyright bots are an entire army, able to identify, analyze, and compare uploaded content at a rate that is not viable for humans.
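To make the scale concrete, the following back-of-the-envelope sketch converts the 720,000-hour figure into a headcount. The 8-hour shift and real-time viewing speed are assumptions chosen only for illustration, and the function name is hypothetical.

```python
HOURS_UPLOADED_PER_DAY = 720_000  # figure cited in the text
REVIEW_HOURS_PER_SHIFT = 8        # assumption: each reviewer watches 8 hours a day


def reviewers_needed(uploaded_hours, shift_hours):
    """Reviewers required to watch one day's uploads in real time."""
    return -(-uploaded_hours // shift_hours)  # ceiling division


needed = reviewers_needed(HOURS_UPLOADED_PER_DAY, REVIEW_HOURS_PER_SHIFT)

# Time for a single person watching around the clock to get through
# one day's uploads, expressed in years.
years_nonstop = HOURS_UPLOADED_PER_DAY / 24 / 365

print(needed)                   # 90000
print(round(years_nonstop, 1))  # 82.2
```

Under these assumptions, simply watching each day's uploads once would occupy 90,000 full-time reviewers, before any comparison against copyrighted works even begins.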
The Benefits and Risks of Copyright Bots
Copyright bots provide substantial benefits to copyright owners and digital media platforms. Without bots, it would take a lifetime for copyright owners to police a fraction of YouTube’s content alone. Humans’ limited capacity to identify similar works leaves thousands, even millions, of works slipping under the radar.
Copyright bots, on the other hand, provide a copyright owner with an all-inclusive overview of similar works because they can sift through content almost instantly. They can also enforce a copyright owner’s rights without any oversight. Consequently, this automated method identifies all matches, allowing copyright owners to comprehensively enforce their rights against all, instead of just a fraction, of unauthorized uses.
Digital media platforms also benefit significantly from the use of copyright bots. According to YouTube, over 98% of copyright issues on the platform are handled by its content recognition software. With the imminent risk of liability under the DMCA, it’s no wonder that YouTube and other platforms use overzealous copyright bots.
Nevertheless, copyright bots pose several risks. First, copyright bots are far from perfect and tend to produce false positives. The automated systems often flag content that uses a de minimis amount of copyrighted material, or material that is in the public domain. In particular, bots do not have an ear for classical music, which exists as a frequently revisited collection of public-domain works that are distinguishable through slight variations in performance.
Copyright bots’ tendency to produce false positives disproportionately favors the interests of digital media platforms and copyright owners. Both groups have an incentive to remove as many close matches as possible, even when they are false positives that don’t constitute infringement. This is especially true for digital media platforms that are motivated to rigorously comply with the DMCA’s notice-and-takedown process to shield themselves from liability. Overall, there is little to no deterrent for digital media platforms and copyright owners to eliminate any and all content identified by the bots.
On the other hand, the odds are against content creators when they appeal a takedown notice. If they lose on the appeal, they risk having their digital media accounts blocked and being forced to pay a high licensing fee or settlement payment. Content creators fear receiving a takedown notice every time they post new content. Doug Walker, a popular YouTube host, said that his fear of takedowns when administered by a bot has made him never feel safe posting a video, even though “the law states that [he] should be safe” posting his videos. As a result, many content creators, like Walker, choose to censor themselves instead of gambling on an appeal.
Second, copyright law is intended to allow for flexibility and discretion in the analysis of whether a new work infringes a protected work. Bots, however, are unable to distinguish fair uses of a copyrighted work (works that are a parody, intended for educational purposes, or have been transformed) from works that are infringing.
For example, Richard Prince’s series of paintings called “Canal Zone,” in which Prince added an animated guitar and eyes to an existing photographic work of a Rastafarian man, would likely be flagged and taken down by copyright bots. But the court in Cariou v. Prince found that “all but five of Prince’s works do make fair use of Cariou’s copyrighted photographs.”
Bots’ limited ability to recognize the limitations on copyright law risks granting monopolies to the original owners, which is something copyright law is intended to prevent.
Copyright Bots’ Impact on the Policy Goals of Copyright Law
In addition to the short-term effect on content creators, copyright bots’ deficiencies ultimately harm copyright law’s policy goals. Copyright law stems from Article I, Section 8, Clause 8 of the Constitution: “Congress shall have power to… promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries…” Under the United States’ utilitarian perspective, the exclusive rights of reproduction, adaptation, publication, performance, and display afforded to copyright owners are intended to foster creativity and innovation to benefit the public. What copyright law does not grant, however, is a monopoly over original works with a modicum of creativity.
Copyright bots do a good job of protecting owners’ rights across a broader range, and they help keep digital media platforms within the safe harbors of the DMCA. However, overreliance on copyright bots risks granting monopolies to copyright owners and turns a blind eye to the intended limitations on copyright law. This is especially true when the bots themselves are also in charge of enforcement.
Moreover, the bots’ rigorous administration of owners’ rights fails to acknowledge the lawfulness of works that, rather than merely superseding the objects of the original creation, add something new with a further purpose or different character. Transforming a copyrighted work or using it for educational purposes benefits society and achieves copyright law’s policy goals.
If digital media platforms continue to take a back seat while bots drive enforcement, copyright law in the online world will discourage creativity and innovation. Ultimately, this will halt the advancement of science and the useful arts. Even worse, creators may start focusing on ways to avoid copyright bot detection instead of spending time adding value, information, or new aesthetics to copyrighted material, thereby failing to further progress and protect owners’ rights.
Copyright law’s design leaves room for discretion to balance the tension between copyright owners’ rights and creators’ liberty. Digital media platforms should embrace the permitted use of copyrighted material to maintain the peace between copyright owners and creators instead of unleashing their bot armies on the slightest hint of a potential infringement. As human oversight dwindles, and the use of bots rises, creators will be forced to learn the bots’ algorithm to avoid attacks instead of enjoying and capitalizing on their freedom under the fair use doctrine.
The easiest solution would be to eliminate copyright bots entirely. However, bots are necessary considering the sheer volume of new content uploaded every minute. Nonetheless, they need revision to be more discerning about works that are similar to or contain copied material. There also needs to be human oversight, especially during the enforcement stage, until copyright bots can differentiate between infringing works and fair uses. It’s time for copyright bots to receive a much-needed tune-up before they rewire content creators and deprive the online world of their artistry.
On July 1, 2021, the original source code for the World Wide Web sold for $5.4 million at a Sotheby’s auction. This was not the actual source code, which is open source and freely available in the public domain, but rather a non-fungible token (NFT) version of the source code. NFTs, a new blockchain-based asset, have been making waves as everyone, from artists and sports leagues to exiled whistleblowers and institutional investors, looks to participate in the craze by creating and selling digital assets for millions of dollars. So, what are NFTs, and what property rights and liabilities do buyers take on with their purchases?
What are Non-Fungible Tokens?
Non-fungible tokens, or NFTs, are unique digital assets stored on a blockchain and used to create authenticated digital ownership of a scarce asset. The concept was first introduced by Vitalik Buterin, creator of the fungible cryptocurrency Ethereum, in December 2012 with “Colored Coins.”
An NFT uses “smart contracts,” which are open-sourced blockchain protocols outlining specific terms and conditions for the transfer of digital ownership. The smart contract is then permanently “minted,” or stored onto a blockchain token (most commonly Ethereum) creating an immutable record of the token’s history, from its creation all the way to its most recent sale. The secure digital storage of ownership records is one of the chief benefits of NFTs as a vehicle for storing wealth, especially compared with similar markets, like art, which are often plagued with authentication problems.
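The idea of an immutable ownership record can be sketched with a simplified, in-memory model in which every entry stores a hash of the entry before it, so that altering any past sale breaks the chain. This is an illustration only: a real blockchain adds distributed consensus and standardized smart contract interfaces, and the class and method names below (`TokenHistory`, `transfer`, `is_intact`) are hypothetical.

```python
import hashlib
import json


def entry_hash(entry):
    """Hash an entry's contents in a stable key order."""
    payload = json.dumps(entry, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


class TokenHistory:
    """Append-only ownership record for one token (toy model)."""

    def __init__(self, token_id, creator):
        self.token_id = token_id
        self.entries = []
        self._append("mint", None, creator)

    def _append(self, event, seller, buyer):
        # Each entry points at the hash of the previous entry.
        prev = self.entries[-1]["hash"] if self.entries else None
        body = {"token_id": self.token_id, "event": event,
                "from": seller, "to": buyer, "prev": prev}
        self.entries.append({**body, "hash": entry_hash(body)})

    def owner(self):
        return self.entries[-1]["to"]

    def transfer(self, seller, buyer):
        if seller != self.owner():
            raise ValueError("only the current owner can transfer the token")
        self._append("sale", seller, buyer)

    def is_intact(self):
        """Recompute every hash and check each link to its predecessor."""
        prev = None
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry_hash(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


history = TokenHistory("token-1", "alice")  # made-up names
history.transfer("alice", "bob")
history.transfer("bob", "carol")
print(history.owner(), history.is_intact())
```

The record tracks who held the token and when. It says nothing about whether the person who minted the token had any rights in the underlying work, which is a separate question from authentication.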
In 2017, the potential for digital assets as a means to store and create wealth was first realized when CryptoPunks launched the world’s first marketplace for NFT digital art on the Ethereum blockchain. The project’s creators distributed 10,000 different claimable cartoon characters to entice market participants, and the characters were quickly claimed and subsequently traded in a freely formed secondary market.
State of the Market Today
Today, the NFT market is exploding. For example, on August 28, 2021, just four years after launch, CryptoPunks crossed $1 billion in all-time sales. Much of this growth has occurred over the past twelve months. During 2020, NFT sales are estimated to have totaled a mere $94.7 million; however, for 2021 NFT sales have skyrocketed all the way to $24.9 billion. Much of this growth occurred in the second half of the year with sales volume growing from a combined $2.1 billion during Q1 and Q2 to a combined $22.3 billion during Q3 and Q4. Despite current volatile equity markets, NFTs continue to remain popular during 2022, with OpenSea, the largest NFT marketplace, announcing a record $5 billion in transaction volume for the month of January.
Much of the NFT growth has been attributed to increased demand and acceptance for cryptocurrencies, coupled with an increased participation in public consumer trading during the COVID-19 pandemic. The explosive growth has drawn increased media attention, which has further fueled a virtuous cycle. Today, blue-chip companies like Coca-Cola are issuing their own NFTs, while institutional investors and fund advisors like Andreessen Horowitz are investing heavily in the NFT marketplace, demonstrating confidence in its continued growth. Accordingly, this formerly fringe product has come under increased legal and regulatory scrutiny.
The scope of transferred property rights with NFTs
The rights acquired by an NFT purchaser vary widely based on how and where the NFT was acquired. In most cases an NFT holder, similar to a purchaser of a physical collectible, is purchasing a non-exclusive license to the underlying intellectual property rights of the asset for a non-commercial purpose. However, unlike a physical collectible, a digital NFT is easily and cheaply created, reproduced, or downloaded. A crucial distinction is that an NFT holder is merely acquiring the rights to the blockchain token imprinted with the intellectual property, and not the underlying intellectual property itself. Furthermore, for many NFTs the blockchain is unable to store the actual underlying digital asset, so what is actually being bought is an imprinted link to the asset. As a result, the underlying copyright only transfers if the copyright’s owner assents to such transfer in writing alongside the transfer of the digital asset.
The totality of the rights associated with the token is defined by the smart contract. Thus, the drafting of the smart contract coded into the token during an NFT transaction is hugely important, an element not present in a physical collectible sale.
Smart contracts are the coded instructions of the NFT defining the scope and limitations of use. Depending on where the NFT is acquired, the license agreement can vary immensely. The NBA Top Shot platform is an example of a proprietary marketplace, meaning the NFTs are created by a single market operator who does not allow third-party transactions. For its platform, Top Shot has promulgated an internal standard NFT license agreement. These kinds of agreements stipulate that the buyer receives, “(i) a personal license to use and display the art associated with the NFT, as well as (ii) a commercial license to make merchandise that displays that art associated with the NFT, a license subject to a $100,000 gross revenue per year limit.” Given the market operator is minting all listed products, this concentrates even more control in the market operator and typically transfers the fewest rights to the subsequent owner, as evidenced by Top Shot limiting the ability of future owners to commercialize their purchase.
On the other end of the spectrum is OpenSea, an example of an “open” marketplace that allows anyone to mint and sell NFTs and enables customizable licenses. For example, an NFT of a tungsten cube sold for $200,000 and included the right to touch a physical one-ton tungsten cube stored at Midwest Tungsten Service headquarters once per year. This market offers financial freedom, but also features high risk of fraud due to ease of access, lack of market oversight, and difficulty in tracing actual wallet ownership.
In the middle are curated marketplaces, which require artists to apply to have their items listed by the market operator; however, unlike proprietary marketplaces, any creator can apply to have their NFT listed so long as the NFT conforms with the market operator’s regulations. Approval generally requires the NFT seller to abide by the market’s standard license agreement, similar to the proprietary market.
Regardless of who is dictating the terms, NFT license agreements cover a variety of issues tied to which property rights are being transferred. Typically, this includes many common terms and conditions such as indemnification clauses, buyer and seller rights, and definitions for various duties and obligations, among others. However, the intrinsic qualities of an NFT also necessitate some atypical clauses covering ownership of platform IP, digital wallet verification, and commercialization or transfer rights. The most important and heavily scrutinized term of an agreement, though, is how the agreement handles the ownership of the copyright connected to the token. Thus, understanding the terms contained in these agreements is essential to evaluating and participating in this fast-growing and evolving digital marketplace.
The potential for copyright infringement and liability for NFTs
While blockchain can authenticate the transactions, blockchain is unable to inform the buyer whether the seller has rights to the underlying copyrighted work or is merely selling someone else’s copyrighted work. This is important because section 504 of the Copyright Act holds even an innocent actor who unknowingly violated somebody else’s copyright automatically liable for either actual or statutory infringement damages. Recently, an NFT of Whales produced by a 12-year-old programmer was sold for thousands of dollars, only for the images to later be discovered to have been copied directly from another project. Assuming the use constituted infringement (the issue has not yet been litigated), the buyer at a minimum is liable to return the NFT to its rightful owner and hand over any profits. Furthermore, if the buyer knew there was a potential infringement and still sold the NFT, the statutory liability could be over a million dollars.
NFT case law is still being developed, with the first cases just now working their way through the legal system. As the market continues to grow, more regulatory agencies, including the SEC, are taking a closer look at security status and potential acts of fraud. However, until the regulatory status is clarified, NFTs exist in a caveat emptor market. The onus is on the purchaser to verify the terms of the license agreement they are acquiring, the status of the seller, and the underlying intellectual property. While blockchain makes the verification and authentication process easier, it still has gaps creating additional liability, especially for copyright infringement.
TRIPS Waiver and COVID-19 Policy
As COVID continues to impact nations across the world, policymakers are left trying to find ways to better deal with a global event of this magnitude. Public health concerns forced leaders to re-think current laws and agreements. Intellectual property law is not an outlier. To mitigate some of the effects of the pandemic, some countries have tried to waive certain intellectual property rights for knowledge sharing. In October of 2020, for instance, India and South Africa proposed a TRIPS waiver related to the COVID-19 pandemic, which sparked a global debate around balancing intellectual property rights and managing a global health crisis.
Countries that are members of the WTO have all signed an agreement on trade and intellectual property protection known as the TRIPS agreement. Governments that are party to the TRIPS agreement can propose waivers, such as the one proposed now, in times when they believe the public good may be better served without certain IP protections. The idea is that waiving certain obligations pertaining to patents on COVID-19-related inventions or discoveries would allow greater access to COVID-19 vaccines and drugs. India and South Africa’s proposal was amended in May of 2021, with the support of 60 low-income countries. The main amendment was the addition of a clause that limited the waiver to cover a period of three years. The waiver would cover “health products and technologies, including diagnostics, therapeutics, vaccines, medical devices, personal protective equipment, their materials or components, and their methods and means of manufacture for the prevention, treatment, or containment of COVID-19.”
Support for this waiver has expanded to more countries, including China, but notably it did not receive initial backing from the U.S. The U.S. has historically been opposed to such intellectual property waivers. However, on May 5, 2021, the Biden Administration released a statement in support of a TRIPS waiver for COVID-19 vaccines, specifically. It stated, “The Administration believes strongly in intellectual property protections, but in service of ending this pandemic, supports the waiver of those protections for COVID-19 vaccines.”
More than a year after India and South Africa’s initial TRIPS proposal, discussions continue among WTO nations, but no agreement has been reached. Countries like Germany, Switzerland, Norway, and the UK are standing firm in their objections. Though as of January 2022, roughly 10 billion doses of the COVID-19 vaccine have been administered, Our World in Data reports that only 10% of people in low-income countries have received at least one dose.
Would Climate Change Fit Under a TRIPS Waiver?
If a TRIPS waiver cannot be approved for something as global and life-threatening as the COVID-19 pandemic, can a TRIPS waiver ever be approved for something else? In the face of impending climate change, for example, could a TRIPS waiver be effective in the sharing and access of climate technologies, much as a COVID-19 waiver would be effective in sharing technologies related to the pandemic?
Simply put, yes. Climate change could be similarly categorized as a type of disaster that could be noted in a TRIPS waiver proposal to temporarily waive intellectual property rights, specifically patent protections. However, at this point, a TRIPS waiver for climate change has not been formally proposed.
Possibility of Extended TRIPS Waiver to Combat Climate Change
Just as with the COVID-19 pandemic, low-income countries are likely to suffer the most from climate change. As with COVID-19 relief, knowledge sharing of climate change technologies may be needed and desired. A TRIPS waiver could promote “technology transfer” related to climate change, which in turn could promote access to climate technologies, especially in low-income countries. This sharing of knowledge would ideally lead to developments in climate technology and further access around the world.
However, opponents of TRIPS waivers state that without intellectual property protections, inventors are not sufficiently incentivized to continue to create new technologies. It is argued that stripping away financial incentives would hinder the development of climate technology and that the TRIPS agreement is in place primarily to guard the property rights of patent, copyright, and trademark owners. To secure a TRIPS waiver for climate change technologies, approval would be needed from many different countries. Low-income countries would potentially be more willing to sign on to this type of waiver, as evidenced by their support of the COVID-19 waiver. As has been the case with other proposed waivers, it will potentially be difficult to influence higher-income countries like the United States. Yet, since the Biden administration has supported a limited, temporary COVID-19 waiver, perhaps this administration or future administrations would support similar proposals relating to climate change.
Other high-income countries would need to sign on for the waiver to be effective and lead to knowledge sharing. As James Bacchus emphasizes in his report for a WTO climate waiver, securing such a waiver will depend on the political persuasion of the countries that hold the most power. It will take these countries’ persuasion to convince the WTO that this type of waiver is necessary to battle the potential destruction of climate change. Drafting a climate change waiver and waiving certain intellectual property rights temporarily to create knowledge sharing of climate technologies would be the easy part of this venture. Getting the necessary support from the countries that have the most power would be much harder.
Alternate Routes of Knowledge Sharing
A TRIPS waiver could allow access to intellectual property, specifically patents, within a timeframe normally unachievable under the current intellectual property regime. But is there another way to increase knowledge sharing, without relying on a TRIPS waiver being passed?
One alternative may be patent pools. The creation of patent pools has improved access to public health when intellectual property laws may have previously limited knowledge sharing. Patent pools are agreements between patent owners and/or third parties to license a patent for others to use, create, or make. They work by allowing patent holders to license their patents to a shared pool. Then, manufacturers, developers, and other inventors that are part of this pool can use the rights to make the patented invention at lower costs. The patent holder gets royalties from what is sold, which allows them to still benefit from their work. The Medicines Patent Pool, for example, operates to increase access to HIV treatments. It continues to create lower-cost drugs for those living with HIV/AIDS.
Applying the same idea, a patent pool could be created for climate-change-related technologies. The benefit of a patent pool is that it would not need broad approval from countries in the WTO, as a TRIPS waiver would. A patent pool for climate change technologies would instead rely on the licensing of technology from patent holders themselves. Once the patent pool has patents that can be used, manufacturers can start to make and sell these inventions at lower costs, which would hopefully provide more access. The downside to a patent pool solution is that, though licensing patent holders would collect royalties, all participants would need to agree on the licensing details to make the pool work. This would require that patent holders independently decide to share knowledge and then agree with the other patent holders in the group. Without buy-in from governments, the patent pool system relies on the goodwill and good faith of individual patent owners.
Ultimately, it would likely be difficult to set in place a TRIPS waiver for climate change technologies, even though such a waiver can be highly effective in promoting knowledge sharing. As a TRIPS waiver has not yet been approved for COVID-19-related relief, trying to obtain buy-in for a potential crisis that may be slower-paced than COVID-19, like climate change, would be challenging. However, without waiting for sign-on from every country, effective knowledge sharing can still take place by adopting alternative plans like patent pools. Because patent owners would still receive royalties and be able to opt in of their own accord, patent pools may be a particularly effective way to promote the sharing of climate change technologies.
Madeline Thompson is a second-year law student at Northwestern Pritzker School of Law.
Gambling, in one way or another, has been part of American life for centuries—early colonists participated in activities such as lotteries, betting on cock fights, and other games of chance. Throughout our history, gambling has remained a source of moral debate. On one hand, it is argued that Americans should be free to use their money how they see fit, and gambling typically consists of ‘harmless’ games. On the other, gambling can become a crippling addiction that leads many to economic hardship and is often tied to corruption and crime. This tension has resulted in a complex regulatory relationship between the government and the gambling industry. Nevertheless, gambling has only grown in popularity over the years and is now a massive industry that generated over 40 billion dollars in revenue in 2019 alone. The United States is now facing even more regulatory complexity as the rise of the internet and cryptocurrencies has rendered the existing regulations largely ineffective.
Throughout American history, regulating gambling has largely been the responsibility of each state. The federal government has taken a back-seat role and typically gets involved only to supplement and support state law in the face of interstate gambling.
As with many aspects of modern life, the rise of the internet has quickly and drastically changed the gambling landscape. Online gambling sites began cropping up in the 90s, and people who lived in states where gambling was illegal were suddenly able to sign up online and make bets from anywhere. Even people in states where gambling was legal took advantage of the convenience of online gambling.
Many of these gambling sites were hosted abroad, resulting in the shift of a large portion of potential gambling revenue to offshore operations. Further, these online gambling hubs lacked any regulation and were much more likely to rig odds in their own favor or participate in money laundering schemes.
Finally, in 2006, the federal government deemed it necessary to take a more active role in regulating online gambling and passed The Unlawful Internet Gambling Enforcement Act (UIGEA). The UIGEA did not go so far as to outlaw online gambling. In fact, it didn’t impose any liability on individual online gamblers and included a clarification in its purpose section that emphasized that the Act should not be construed as “altering, limiting, or extending any Federal or State law . . . prohibiting, permitting, or regulating gambling within the United States.” Instead, the Act prohibited businesses from accepting payments from people engaging in any form of illegal online gambling. The main provision reads: “No person engaged in the business of betting or wagering may knowingly accept [any form of payment], in connection with the participation of another person in unlawful Internet gambling.”
In effect, the Act put online gambling purveyors that accept payments from players in the United States in violation of the law unless they are directly authorized by a state and acting in accordance with that state’s laws and all bets or wagers are initiated and received within that singular state. It also prohibited financial institutions from processing any gambling payments.
Naturally, many online gambling sites stopped allowing players in the United States to gamble on their websites in order to comply with the law. But of course, some online gambling sites remained in operation. Online poker sites, for example, hoped that the vague language of the UIGEA did not cover poker, since poker can be characterized as a game of skill rather than of chance.
In 2011, though, the Department of Justice undertook its first major enforcement of the UIGEA and shut down these online poker sites, while at the same time freezing millions of players’ accounts, many of which held enormous sums of money. Mass chaos ensued. Avid online poker players refer to the date of the shutdown as “Black Friday.” As a result of the shutdown, ‘illegal’ online gambling largely tapered off for several years.
Currently, online gambling is legal in six states: New Jersey, Connecticut, Delaware, Michigan, Pennsylvania, and West Virginia. Gamblers in these states can access and legally gamble on a small selection of state-authorized casino websites. These states have seen massive benefits from the legalization of online gambling. New Jersey, for instance, has generated over $600 million in tax revenue since it first authorized online casinos in 2013.
However, those located in the remaining 44 states where online gambling is illegal have largely been out of luck. Recently, though, thanks to the rise of individual VPN use and cryptocurrencies, online gambling has had a huge resurgence. But of course, with this resurgence come new legal implications and gray areas.
A Virtual Private Network (VPN) is a service that allows you to encrypt your data and mask your computer’s IP address when you surf the web. It does so by connecting you to the internet via a secured private server, rather than through your own internet service provider. When using the internet through a VPN, websites you visit do not know your IP address, which normally connects your location and identity to your online activity. Instead, sites only know the IP address of the VPN. VPNs themselves are legal and a valuable way to use the internet while also protecting your personal data. But they also give people the freedom to do things they wouldn’t, or couldn’t, do if their activity were tied to their personal IP address.
American online gamblers have taken to using VPNs located in more gambling-friendly countries in order to circumvent online casinos’ bans on American users. However, most of these sites ultimately require users to provide personal information in order to validate users’ identities prior to allowing them to deposit and withdraw funds. This validation step effectively filters out most American online gambling attempts. That is, until the rise of cryptocurrencies.
On a basic level, cryptocurrency is a type of digital money that is encrypted and decentralized. Cryptocurrencies allow users to anonymously participate in transactions without the help of a traditional bank or financial institution. Over the past few years, as cryptocurrencies have gained major traction, online casinos have kept up with the trend and many now accept cryptocurrencies. Some casinos have even cropped up that run exclusively on cryptocurrencies. The anonymous nature of using a cryptocurrency has made it possible for virtually any American, especially if they are masking their IP address with a foreign-based VPN, to gamble online at offshore online casinos that accept cryptocurrency. And not only are they able to do so, but it is virtually risk free.
Since the UIGEA didn’t criminalize the act of online gambling for players, there is currently no federal legal risk for those who choose to skirt location restrictions and gamble on sites that technically are not allowed to operate in the United States. And while some states have laws that make it a misdemeanor to participate in unauthorized gambling, these laws are seldom enforced and rarely specify the rules surrounding online gambling in particular.
The online “crypto casinos” themselves also face little risk. First, it is not yet established whether cryptocurrencies fall under the definition of ‘payments’ set forth in the UIGEA. And second, cryptocurrencies make it much harder, if not impossible, to trace the physical location origins of payments. Therefore, it is much harder to pin any misconduct on crypto casinos.
Online gambling has recently become even more widespread and has reached younger audiences it otherwise may not have via live streaming on sites such as Twitch. Popular online personalities and influencers have developed a new niche in which they live stream themselves online gambling, most often playing slots, and often for hours on end. Thousands of people tune in to watch them win, or squander away, incredibly large amounts of money. These influencers also often share incentives or bonuses for their viewers to sign up with these online casinos, which opens another can of legal worms given it is unclear whether the streamers are gambling with their own money or have the odds rigged in their favor to draw in more customers. This is on top of the fact that they are influencing people to partake in a legally ambiguous activity (and often they are influencing children to gamble, which is certainly illegal).
There are no signs of the online gambling fad slowing, which leaves the United States in murky waters with respect to the legal and policy implications it raises. Some have suggested that criminalizing the act of online gambling is the only viable solution. Others argue that the UIGEA should be repealed and that online gambling should be legalized and further regulated to make it safer and more fair for American residents, as well as to generate tax revenue. The government will likely need to determine a course of action sooner rather than later to combat the undoubted consequences of unchecked online gambling.
Mari Earhart-Price is a second-year law student at Northwestern Pritzker School of Law.
The Development of Artificial Intelligence
Today, Artificial Intelligence (AI) has developed into deep learning. Deep learning is the ability of an AI system not only to learn but also to independently make decisions without human intervention. With the development of deep learning, services that automatically compose music or draw a picture are appearing. For example, Google’s experimental “Auto Draw” tool uses deep learning algorithms to suggest complete drawings as users roughly sketch out their ideas. With the development of these types of AI services, there are copyright issues relating to both the inputs and outputs of such systems.
These services are still in the early stages, but as these services develop, they have the potential to produce results of commercial value. Therefore, their development may produce copyright issues. This post will (i) explain some copyright issues that may arise in relation to such deep learning services in two main categories and (ii) introduce the current situations in the U.S., South Korea, and Japan.
Deep Learning and Copyright
There are two major copyright issues related to services using deep learning. The first issue concerns the use of third-party data that is necessary for the learning process of a deep learning system. In general, copyright law requires permission from the rights holders for all such data. However, in many cases, it is practically impossible to obtain consent for all of the data that deep learning services use.
The second issue relates to the rights of AI-generated works. If contracts or legal documents created by AI have commercial value, then who owns the copyright of those works? In such situations, there is the possibility of a dispute between the AI service provider and the user using that service over the rights to any profit generated.
The current state of the law in the U.S., South Korea and Japan
A. United States
The first issue, permissive use, is whether using training data constitutes unauthorized reproduction, thereby giving rise to copyright infringement liability. Circuit courts are divided on this issue.[i] However, even if infringement occurs during machine learning, training AI with copyrighted works would likely be excused by the ‘fair use’ doctrine.[ii] For example, in Authors Guild v. Google, Inc.[iii], Google had scanned digital copies of books and established a publicly available search function. The plaintiffs alleged that this constituted infringement of copyrights. The Second Circuit held that Google’s works were non-infringing fair uses because the purpose of the copying was highly transformative, the public display of text was limited, and the revelations did not provide a significant market substitute for the protected aspects of the originals. The court also said that a profit motivation in and of itself did not justify denial of fair use.
On the ownership issue, it is not clear whether the U.S. Copyright Act itself explicitly requires the author of a creative work to be human. However, the U.S. Copyright Office, by publishing “The Compendium II of Copyright Practices,” went beyond the statutory text in requiring that an author be human in order for the work to be eligible for copyright protection.[iv] And in Naruto v. Slater[v], Naruto, a crested macaque monkey, took several self-portrait photographs with a photographer’s unattended camera. The Ninth Circuit dismissed the copyright claims brought by Naruto’s representative, PETA.
B. South Korea
Currently, regarding permissive use, there are no regulations relating to AI learning data in the Copyright Act of South Korea. Therefore, when copyright-protected materials are used as AI learning data under the current law, that use may conceivably be judged as copyright infringement. However, Article 35-3(1) of South Korean law states that “… where a person does not unreasonably prejudice an author’s legitimate interest without conflicting with the normal exploitation of works, he/she may use such works.” In other words, the fair use doctrine is also possible under South Korean law, and there is a possibility that this provision will apply to AI learning.
Thus, with no legal provisions in place and with no relevant precedent cases, this legal uncertainty acts as an obstacle to the use of data in AI learning in South Korea. However, the South Korean Ministry of Culture, Sports and Tourism is working on a revision of the Copyright Act to include a clause that does not require consent of the copyright holder to the extent that the material is used for AI learning and big data analyses. The revised Act was proposed in the Korean National Assembly on January 15, 2021, and the procedure is currently underway.
Regarding the ownership issue, the Copyright Act of South Korea defines the term “work” to mean “a creative production that expresses human thoughts and emotions” in Article 2(1), and “author” to mean “a person who creates a ‘work’” in Article 2(2). As such, unless the current law is amended, only humans can be the authors of creative endeavors. Therefore, AI-generated works are not protected by the current law.
However, South Korea may also see progress on this front. The Presidential Council on Intellectual Property formed the AI-Intellectual Property Special Expert Committee in June of 2020 to establish a pan-government AI policy. This committee will discuss a variety of policy issues, such as 1) whether AI should be recognized as an author, 2) whether the works created by AI should be protected to the same level as those by humans, and 3) who owns the work created by AI.
C. Japan
Japan has already solved the problem of permissive use through legislation. Japan revised related regulations through the revision of the Copyright Act on May 25, 2018. According to Article 30-4 of the new Copyright Act, it is permissible to exploit a work as necessary if it is used in data analysis. As a result, there are no restrictions on the subject, purpose, and method of data analysis, and there is no obligation to compensate the copyright holder. It is now also permitted to provide learning data in cooperation with multiple corporations.
Regarding ownership, under Japan’s Copyright Act Article 2(1), a copyright-protected work is defined as a creation expressing “human thoughts and emotions.” Thus, it appears difficult for AI to become the author of its own creations under the current law. To address this, the “Intellectual Property Strategy Headquarters” of the Prime Minister’s Office has suggested specific policies for AI copyright policy in its “Intellectual Property Promotion Plan 2016.” Specifically, this plan says that in order to promote AI creation, incentives to those involved in AI creation must be guaranteed. Thus, it is necessary to recognize the copyright of AI creations as well. However, the policy also stated that granting IP protection to all AI-created works may be subject to excessive protection. Thus, it is necessary to limit the content and scope of recognition of rights in consideration of the need for such protection.
Conclusions and Final Thoughts
Regarding the permissive use issue, unlike the U.S. and Japan, South Korea still has legal uncertainty. In South Korea, more specific legal provisions are required for using copyright-protected works as learning data for AI as a fair use. If the Copyright Act is amended, this problem can be resolved. Thus, it is necessary to watch for future progress.
In terms of the ownership issue, each of these countries currently faces the problem that the author of a copyrighted work must be human. Therefore, it is necessary to amend the copyright law to allow attribution of rights in AI-generated works. However, a more detailed and careful discussion of who precisely will hold that copyright is still needed.
Seung Hoon Park is a third-year law student at Northwestern Pritzker School of Law.
[i] Jessica L. Gillotte, Copyright Infringement in AI-Generated Artworks, 53 U.C. Davis L. Rev. 2655, 2674-76 (2020).
[ii] Id. at 2659.
[iii] Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
[iv] Shlomit Yanisky-Ravid, Generating Rembrandt: Artificial Intelligence, Copyright, and Accountability in the 3A Era — The Human-Like Authors Are Already Here — A New Model, 2017 MICH. ST. L. REV. 659, 718-19 (2017).
[v] Naruto v. Slater, 888 F.3d 418 (9th Cir. 2018).
To clients, legal billing can seem “like a black box.” Many clients worry about lawyers overstating their billable hours (“bill padding”) or charging exorbitant hourly rates. By using data analytics to optimize legal billing (“data-driven billing”), law firms can stand out from their competitors and win more business.
1 – Problems that Exist in Legal Billing
There is cause for concern that legal costs are inflated. Bill padding is partially caused by high billable hours targets. As of 2016, the average associate was required to log 1,892 billable hours per year. These targets “virtually assure that some clients will be overbilled.” The economic model that governs law firm billing incentivizes firms to assess lawyers based entirely “upon the ability to generate revenue through the billable hour.” This type of assessment pressures lawyers to bill as many hours as possible. “[P]erceived billing expectations” have led at least some lawyers to inflate their logged billable hours. This sort of fraudulent billing makes law firms look bad and frustrates their clients.
The Cost of a Billable Hour
Despite concerns that billing by the hour may incentivize lawyers to be inefficient, law firms remain “wedded to the billable hour” because of the difficulty in estimating how much time legal work will take. Some firms set prices using cost-based, competition-based, or value-based pricing. Law firms also consider the firm’s expertise in the subject, market rates in the jurisdiction, the type of matter, and the type of client when setting prices. In general, the pricing strategy for many law firms is not data-driven. This lack of optimization suggests that current market prices are likely inflated. The opaqueness of legal billing practices leads to clients dreading the results.
2 – How Data-Driven Billing Can Help
Data-driven billing can help detect bill padding and intelligently set the price of billable hours. Billing software can identify bill padding through “scrutiniz[ing] bills to see irregularities and billing guideline violations” by comparing billable hour submissions against budgets and industry-wide data. Law firms can use pricing analytics to set pricing that will maximize the chance of getting not just a new client but also a profitable one. Using data analytics to minimize bill padding and intelligently set pricing allows law firms to differentiate themselves from competitors and maximize their profitability. “Pricing analytics is a huge untapped opportunity” for law firms. Although clients do not base their decisions purely on price, using data analytics to be an industry leader in pricing can help a law firm stand out in a competitive legal market.
The use of data analytics to analyze legal billing has grown rapidly. In 2014, auditing legal fees was identified as a growth industry. Since then, legal departments have expanded their use of data-driven billing and “have grown increasingly comfortable asking for and analyzing billing-related data.” There are now a large number of private companies that offer data-driven billing software.
3 – How Data-Driven Billing Works
Most data-driven billing software expands upon the Uniform Task-Based Management System (“UTBMS”) legal billing codes used to log billable hours. UTBMS codes were introduced by the American Bar Association in the mid-1990s to standardize billing practices. While UTBMS codes “brought some clarity” to billing, the codes suffer “significant limitations” because of how broad they are. Legal billing data analytics software can use natural language processing, a form of artificial intelligence, to analyze the text descriptions accompanying time logs to determine exactly what a lawyer was doing and classify it more precisely than the default UTBMS codes can. The analytics software can then compare those classifications against historical data from the law firm and legal industry as a whole to flag potential bill padding. It can also help law firms efficiently price their services by providing them with historical billing data to use when determining pricing. This sort of data-driven pricing could become more important going forward if more law firms begin providing fixed-price quotes to clients instead of using billable hours to determine fees.
To learn more about how data-driven billing works, I spoke with Joe Tiano, the founder and chief executive officer of Legal Decoder. He explained that Legal Decoder uses proprietary natural language processing to analyze billing records and determine what a lawyer did at a more precise level. Legal Decoder has developed a set of proprietary billing categories that are more precise than the standard UTBMS codes. For example, a UTBMS code may indicate that a lawyer was working on a discovery matter, but it will not say whether it was a discovery motion, a deposition, or a motion to compel.
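The refinement step described above can be pictured with a minimal sketch. This is not Legal Decoder’s proprietary method: plain keyword matching stands in for natural language processing, and the sub-category names, keywords, sample time entries, and the `refine_entry` helper are all hypothetical, chosen only to mirror the discovery example.

```python
# Hypothetical sub-categories for time entries logged under a broad
# "discovery" billing code. Keyword matching is only a stand-in for the
# natural language processing that commercial tools would use.
SUBCATEGORY_KEYWORDS = {
    "motion to compel": ["motion to compel", "compel"],
    "deposition": ["deposition", "depose", "deposing"],
    "discovery motion": ["motion"],
}


def refine_entry(description: str) -> str:
    """Return the first sub-category whose keyword appears in the description."""
    text = description.lower()
    # More specific categories are listed first, so they are checked first.
    for subcategory, keywords in SUBCATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return subcategory
    return "unclassified discovery work"


# Illustrative time-entry descriptions (invented for this sketch).
entries = [
    "Draft motion to compel production of documents",
    "Prepare outline for deposition of plaintiff's expert",
    "Revise discovery motion re: protective order",
    "Telephone conference with client",
]
refined = [refine_entry(entry) for entry in entries]

for entry, label in zip(entries, refined):
    print(f"{label:28} <- {entry}")
```

Ordering the categories from most to least specific keeps a motion to compel from being swallowed by the broader “motion” keyword; a real system would need far more robust language handling than this.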
In the process of classifying each billable hour, Legal Decoder’s Compliance Engine looks for three types of problems with legal billing. First, it examines staffing efficiency, which asks whether the most competent and lowest cost lawyer was assigned to a task. For example, the software can detect if a firm is having a partner work on a task that an associate could handle. Second, it examines workflow efficiency, which asks whether a lawyer’s work is redundant or inefficient. Third, it examines billing hygiene, which ensures that the billable hour entries accurately record the time spent on a task.
After Legal Decoder’s proprietary software drills down to precisely classify each billable hour, it then analyzes how long each assignment took. Using historical data from industry-wide benchmarks, Legal Decoder’s Pricing Engine estimates how long each task should take and analyzes whether each matter was handled efficiently. Joe explained that although there can be variance in how long each task will take, the software can, for the most part, effectively estimate how long a task should take. Legal Decoder then presents the results from its Pricing and Compliance Engines in intuitive Tableau dashboards for its clients to analyze.
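The benchmarking idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not a description of Legal Decoder’s Pricing Engine: the historical hours are invented, the median is used as one plausible estimate of how long a task “should” take, and the 1.5x tolerance and the `expected_hours` and `flag_entry` helpers are hypothetical.

```python
from statistics import median

# Hypothetical historical hours per task type (illustrative numbers only).
HISTORICAL_HOURS = {
    "deposition": [6.0, 7.5, 8.0, 6.5, 7.0],
    "motion to compel": [4.0, 5.0, 4.5, 5.5, 4.0],
}


def expected_hours(task_type: str) -> float:
    """Estimate how long a task should take as the median of past entries."""
    return median(HISTORICAL_HOURS[task_type])


def flag_entry(task_type: str, billed_hours: float, tolerance: float = 1.5) -> bool:
    """Flag an entry billed at more than `tolerance` times the expected hours."""
    return billed_hours > tolerance * expected_hours(task_type)


# New time entries to review: (task type, hours billed).
new_entries = [
    ("deposition", 7.0),
    ("motion to compel", 12.0),
]
flags = [flag_entry(task, hours) for task, hours in new_entries]

for (task, hours), flagged in zip(new_entries, flags):
    status = "review" if flagged else "ok"
    print(f"{task}: billed {hours}h, expected ~{expected_hours(task)}h -> {status}")
```

A flag here is only a prompt for human review. Because the time a task takes can legitimately vary, an entry above the benchmark is not by itself evidence of bill padding.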
The following screenshots of the Legal Decoder dashboard were provided courtesy of Joe Tiano. All data displayed in the screenshots are from bankruptcy data and are in the public domain. © 2019 Legal Decoder, Inc. All rights reserved.
4 – How Increased Implementation of Data-Driven Billing Will Impact the Legal Industry
While describing how Legal Decoder’s software works, Joe explained that law firms are sitting on treasure troves of data that they are currently not leveraging. The success of companies like Legal Decoder demonstrates how valuable data-driven billing can be. The continued expansion of applying data analytics to legal billing will likely lead to several changes in the legal industry.
Data-Driven Billing Makes Law Firms More Attractive and Increases Predictability
It may seem like data-driven billing benefits clients and harms law firms by giving clients leverage to negotiate better pricing. Joe pushed back on that idea by explaining that through data-driven billing, law firms can increase their realization rate, which measures the percentage of logged billable time that is ultimately paid for by the client. According to Joe’s previous research, as of 2016 roughly $60 billion in billable hours was lost due to the 83% net realization rate across the legal industry. Cost-conscious clients have increasingly begun to push back against what they perceive as inflated bills. In 2015, 68% of law departments received discounted fees by negotiating billing with outside counsel. Inside counsel for clients view receiving a discount as a way to flaunt their efficiency to their chief financial officer. Joe has previously written that by using data analytics to analyze legal billing, law firms can “operate more efficiently (and more profitably) with greater client attraction, retention, and satisfaction.” Put another way, by using data-driven billing, law firms can differentiate themselves from competitors and thus win more business.
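The realization rate discussed above reduces to a simple ratio. The sketch below works through one hypothetical matter; the dollar figures are invented to match the 83% industry rate, and `realization_rate` is an illustrative helper rather than a standard formula from any billing product.

```python
def realization_rate(value_of_time_logged: float, amount_collected: float) -> float:
    """Share of the value of logged billable time that the client ultimately pays."""
    return amount_collected / value_of_time_logged


# Hypothetical matter: $100,000 of time logged at standard rates,
# of which the client ultimately pays $83,000.
value_logged = 100_000
collected = 83_000

rate = realization_rate(value_logged, collected)
written_off = value_logged - collected

print(f"realization rate: {rate:.0%}, written off: ${written_off:,}")
```

On these assumed numbers, every point of realization a firm recovers is worth $1,000 on the matter, which is why firms have an incentive of their own to adopt data-driven billing.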
Data Analytics Can Affect Partnership Decisions
Data-driven billing gives law firms another way to assess potential partners. Joe explained some of his clients use Legal Decoder’s software to analyze the work of potential partners. Without Legal Decoder’s software, law firms would evaluate an associate based on their total billable hours and overall feedback. Legal Decoder’s software lets law firms drill down into an associate’s billable hours and look for potential issues that would otherwise go unnoticed.
Pricing Will Become a More Important Differentiator in a Post-Covid Market
Most lawyers want to continue working remotely, even when it is safe to return to offices again. Working remotely could allow lawyers to live in lower cost of living areas, allowing firms to pay lower salaries for these lawyers. Law firms are reconsidering their expensive office leases and may look to downsize their square footage in the future. Paying for less office space could lead to law firms lowering their fixed costs. Clients are “smarter than ever before” and have exerted “a continual downward pressure on fees.” All of these trends indicate that pricing will be a more important differentiator between law firms than ever before. Data-driven billing can help a firm stand out by intelligently pricing its services.
Kyle Stenseth is a second-year law student at Northwestern Pritzker School of Law.
Underlying the U.S. and global patent systems is the belief that granting a limited monopoly will incentivize innovation. Although climate change comes to mind as a particularly controversial topic, according to Pew Research, six in ten Americans and majorities in other surveyed countries see climate change as a major threat. Patent filings seemed to reflect that concern as climate change mitigation technology patents more than doubled between 2005 and 2012. However, beginning in 2012, patent filings for climate change mitigation technologies plummeted, down 44% for carbon capture and storage and 29% for clean energy patents. Why, in a world of increased awareness and acceptance of climate change, did the U.S. and global patent systems fail to deliver on the promise that patents were enough to incentivize innovation?
There are several potential explanations for the green tech patent drop-off. From a technological perspective, there is some evidence that green tech matured quickly and capped, leaving room only for improvement patents. From a policy perspective, many have argued that continued fossil fuel and carbon subsidies, along with the lack of a carbon pricing system, have disincentivized green energy and made it more difficult to compete. From a global market perspective, did the U.S.-China trade war for independence and dominance over the $300B semiconductor market divert China, which was the largest filer of green tech patents, from filing patents in the biotech, chemical, and green tech sectors? What is the solution to reversing the green tech patent drop-off? From a legal and patent perspective, I argue that the U.S. and global patent systems need to provide fast-tracking for green tech patent applications and reduced standards.
Patent Filings for Climate Change Mitigation Technology Plummeted
As the United States and the rest of the world moved toward embracing technology to remove and reduce carbon emissions, the innovation theorists appeared correct. Worldwide patent filings for climate change mitigation technologies more than doubled between 2005 and 2012. During that period, the growth in the green tech sector was increasing at a faster rate than other technologies. But while technologies in the health, engineering, and information and communication fields continued on their normal trajectories, in 2012 green tech did what few could have anticipated: it defied the innovation push and plummeted. For conservationists and technologists alike, this unexpected nosedive came in the form of a reduction by 44% for carbon capture and storage and 29% for clean energy patents. Only a few related fields avoided this trend: patents that enable power system integration of climate change mitigation and patents for regulated maritime and air vessels.
Possible Causes for the Green Tech Patent Drop-Off
Old technologies are constantly being replaced by new technologies, but in a field that was already heavily digitized (40%) and therefore not needing to be retrofitted, green technology certainly did not appear to be on the verge of a swift exit. There are, however, particular reasons why green technology patents dropped off.
Technology Perspective: Green Tech Matured and Capped Quickly
First, some energy and economic reports noted that some green technology was uniquely susceptible to maturing and capping earlier in the innovation phase. However, the International Energy Agency surveyed 400 technologies to model commercial readiness and reported the opposite: by 2070, still less than 25% of “key technologies the energy sector needs to reach net-zero emissions” will reach maturity, 41% will be in the early adoption stage, 17% in the demonstration stage, and 17% in the prototype stage. In particular, electricity infrastructure and electrification of heavy industry remain the furthest from zero-carbon maturity.
Policy Perspective: Green Tech Struggling to Compete with Well-Funded Oil and Gas Industries
A second possible cause for the green technology patent drop-off is that less subsidized renewable technologies struggled to compete against heavily funded fossil fuel industries. A nascent and unsubsidized industry that is not yet commercially viable has a much greater likelihood of extinction compared to subsidized industries that are well-established and commercialized. The U.S. Congressional Research Service reported that between 2009 and 2018, renewables received 19% of research and development funding while fossil energy received 21%. However, both are dwarfed by the $100 billion in subsidies or 29% of the R&D funding that nuclear energy received in that same time period. Nonetheless, the relatively similar percentage of R&D funding of renewables and fossil energy may be misleading, at least according to the International Energy Agency, which notes that despite increased urgency, low-carbon energy R&D is actually “below the levels in the 1980s[.]”
Market Perspective: China, the leader in Green Tech Patent Filings, Shifting R&D to Win Semiconductor Trade War with U.S.
A third possible cause for the green technology patent drop-off is less the result of internal U.S. policies, and more the result of external China-U.S. foreign relations. Between 2000 and 2011, China was leading the global growth in environmentally related patents with a more than 1,040% increase in applications, according to the OECD. Thus, any shifts away from green R&D and patenting would likely be significant. When both the United States and China began to engage in a trade war for greater control over the semiconductor industry, which is the most R&D-intensive industry, China doubled down on its plan to invest $118 billion over five years into semiconductors. This U.S.-China trade war may partly explain why China shifted political energy and funding away from green technology and into semiconductors.
Regardless of the initial cause of the shift away from green technology, unlike the U.S., China appears to be on the rebound. Specifically, in 2018-2019, UK commercial law firm EMW reported that China filed 81% of the world’s renewable energy patents, a 28% increase from the year before, compared to the United States, which filed 8% of the world’s renewable energy patents. Additionally, there are some in the semiconductor industry who believe semiconductors can actually play a constructive role in fighting climate change.
How Fast-Tracking Patent Applications for Climate Change Mitigation Technology is the Fastest Way to Reverse This Trend
In order for the U.S. to reverse the trend away from needed and important patent applications for climate change mitigation technology, the U.S. should begin by restarting the Green Technology Pilot Program that it once championed to fast-track these technologies. Before the program ended on March 30, 2012, the USPTO accorded special status to 3,500 applications related to environmental quality, energy conservation, renewable energy development, and greenhouse gas emission reductions. These accelerated examination programs allowed patentees to receive a final disposition within about 12 months. Despite the seemingly premature ending, there remain significant and promising technological inventions that have yet to be widely patented or enabled, including patents in relation to grids, batteries, and carbon capture technology. Because patents are an essential tool to combat climate change, the USPTO and the federal government should actively consider expanding and improving the fast-track process.
The Consequences of Not Fast-Tracking Patent Applications for Climate Change Mitigation Technology Are Dire
According to 98% of climate scientists, the warnings have not been heeded. Much has been discussed about the rapidly deteriorating state of icebergs at the polar caps, and of shifting weather patterns that would result in increased drought, starvation, human migration, and conflict. More innovation and thus more innovators are needed to respond to this growing threat. Innovation needs to be re-injected into green technology: the world does not yet have a fully zero-emission fleet of vehicles; homes are not being constructed with materials that are resilient to increased extreme weather events; it will take years to commercialize fungi to break down plastic; it will take a decade for the alternative meat industry to capture 10% of the market; and it will take until 2040 for the Ocean Cleanup’s proprietary system to clean the Great Pacific Garbage Patch. If the U.S. and the world continue to ignore extreme shifts in climate, green innovators risk losing what traction they have. That is a loss we will all share.
Climate change is a critical problem that requires a solution, but the traditional solution, the patent system, has stopped delivering on its essential promise to drive innovation. Beginning in 2012, while climate change awareness and acceptance grew, research and development sputtered. Fewer patents followed: patents were down 44% for carbon capture and storage and 29% for clean energy. Various causes may underlie this problem, including some technological limitations, policy prioritization of fossil fuel and nuclear energy, and competing R&D concerns like the semiconductor trade war. Regardless of the cause, the green technology patent drop-off has gone unnoticed and uncorrected for too long. Here in the U.S., the USPTO and the federal government have an important role to play in finding ways to get the patent system back on track. Fast-tracking green technology patent applications is the best way to accomplish this because time is what the green sector does not have enough of.
Melissa Hurtado is a second-year law student at Northwestern Pritzker School of Law.
A Primer on Training Sets and Machine Learning.
Merging computer processing and the practice of transactional law is a concept that has been around longer than you might think. Technologies for automating contract management and drafting tasks existed as early as the 1970s, and consumer-facing software for automating tasks like incorporation and estate planning has been a fixture in the legal service market for well over a decade. Both the number and the utility of legal technology innovations used in contracting are growing exponentially, and much of that growth is being driven by machine learning technology.
Machine learning is a form of artificial intelligence in which computer algorithms allow software to “learn” from data and improve its performance on its own. Training sets are the initial samples presented to the software, and for the purposes of AI contracting, training sets take the form of documents (contracts, forms, filings, etc.) provided to the software. The quantity of data included in a training set can affect both the software’s ability to produce new outputs and the accuracy of those outputs. In other words, the more contracts fed to machine learning software, the better it can learn to discover patterns in order to manage and classify contracts, analyze qualities such as risk, suggest changes, or even predict operative results from contract drafts. The more complex and variable the contract-related output to be produced by the software, the larger the training set required for the software to function accurately.
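The link between training set size and accuracy can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the clauses, labels, and the word-overlap "classifier" are all hypothetical and invented for this example, and real contract-analysis products use far more sophisticated models. It trains the same simple classifier on two clauses and then on six, and scores each against the same four unseen clauses.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each item is (clause text, clause label).
SMALL_SET = [
    ("the supplier shall indemnify the buyer against all losses", "indemnification"),
    ("this agreement is governed by the laws of illinois", "governing_law"),
]
LARGE_SET = SMALL_SET + [
    ("each party shall hold harmless the other party from third party claims", "indemnification"),
    ("any dispute shall be resolved under delaware law", "governing_law"),
    ("the receiving party shall keep all confidential information secret", "confidentiality"),
    ("neither party may disclose confidential information to others", "confidentiality"),
]
# Clauses the software has never seen, with the labels a lawyer would assign.
TEST_CLAUSES = [
    ("seller will hold harmless purchaser from claims", "indemnification"),
    ("confidential information may not be disclosed", "confidentiality"),
    ("delaware law applies to any dispute", "governing_law"),
    ("this agreement is governed by new york law", "governing_law"),
]

def train(examples):
    """Count how often each word appears under each clause label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    best = max(scores, key=scores.get)
    # With no overlapping words at all, the model has nothing to go on.
    return best if scores[best] > 0 else "unknown"

def accuracy(counts, labelled_clauses):
    """Share of clauses for which the predicted label matches the true one."""
    correct = sum(classify(counts, text) == label for text, label in labelled_clauses)
    return correct / len(labelled_clauses)

if __name__ == "__main__":
    for name, data in [("2 training clauses", SMALL_SET), ("6 training clauses", LARGE_SET)]:
        print(name, accuracy(train(data), TEST_CLAUSES))
```

On this toy data, the model trained on two clauses labels only one of the four test clauses correctly, while the model trained on six labels all four. The numbers mean nothing beyond the example, but the pattern is the one described above: more, and more varied, examples give the software more patterns to recognize.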
The quality of data included in the training set also affects the software’s ability to perform its purpose. The term “Garbage in, garbage out” describes the concept that the overall quality of the training set used to develop a software’s learning capability will affect both its ability to accurately analyze problems and the overall quality of its outputs. Even setting aside judgments on the quality of the inputs, form and content of outputs will often mirror the inputs, especially when dealing with textual data. In short, the value of contracting software will depend on the quality (including context and similarity) and quantity of training set data.
Issues with Contract Language.
Momentum for another fascinating shift in contract drafting is building alongside the proliferation of artificial intelligence: a push for simplifying contract language. Legal tech companies are investing in adapting Natural Language Processing (NLP) techniques to more easily code legalese and convoluted sentence structures commonly found in legal contracts. LawGeex, for instance, developed algorithms that can comprehend unfamiliar legalese, and the company’s product can perform contract review tasks more accurately than human lawyers. The need for this interpretation creates barriers for companies that are not as well funded or cannot absorb the multi-year lead time required to train such systems.
While technical contract language alone presents issues for training set availability, interpreting complex legal language is not just an issue for software and machines. The digitalization of consumer-facing contracts has fueled a demand for simpler, more natural language of the kind customers use in their day-to-day lives. There is growing support for reducing the overall complexity of drafting language, and contracting organizations like World Commerce and Contracting have already issued guidance for implementing this shift. Ironically, this trend presents potential future issues for software built on current training sets that include convoluted language. As Professors Daniel W. Linna Jr. and Helena Haapio put it, “[W]e have a disconnect between people developing AI for contracting and people working to improve contracting through simplification and redesign.” This disconnect affects the quality of training sets, as we might soon face a situation where the expectations of legal language that shape how companies train software do not match society’s expectations of contract language.
Issues with Availability and Quality of Contracts.
The availability of contracts on which to train software is important for developers of new AI contracting software because machine learning processes can learn more from larger training sets. The issue of availability is a simple one: the vast majority of contracts are not public documents. Additionally, public databases that do exist, such as the SEC’s EDGAR database, relate mostly to specific practices and types of contracting parties. Still, LexPredict, Bloomberg and Contract Standards are among the many legal tech providers who utilize public databases to account for at least part of their training sets. Public filings are included in databases for reasons of disclosure rather than for their intrinsic value or paradigmatic drafting, and their use for training set data presents issues of quality.
Also problematic is the fact that public filings often represent only an end product. Especially for emerging predictive applications of AI contracting, AI needs to, first, learn which contract provisions are standard in order to create baseline precedent documents that can be customized and, second, gather information about specific situations and conditions in which specific, non-standard customizations are to be applied. The emphasis on disclosure over process, and on the end product, creates a paradox in which outcomes are transparent but process is opaque, creating yet another potential difficulty in utilizing public documents for training purposes.
Issues for Specific Practices and Contract Types.
Both quality and quantity (availability) issues affect the supply of satisfactory training sets for particular types of contracts. As previously stated, legal tech companies pre-train their software with large, often public, data sets. Software often will not properly recognize unfamiliar terms and contracts that are not part of the existing set, and so the software will not be useful for more niche or particularly complex contracts. Additionally, when developing newer predictive contracting technology, it will be difficult to train models to accurately predict outputs for circumstances which occur infrequently, such as “bet the company” situations. Finally, legal market conditions that arise within particular practice areas might present unique roadblocks to innovation, leading to fewer training sets being generated. In his paper on the inefficiency of precedent selection in the M&A field, Professor Robert Anderson IV notes that pressure to standardize agreements often comes from the client side, and firms are less likely to invest in standardizing deal documentation via technology in matters like bankruptcy or acquisitions where clients are unlikely to be repeat players. He also notes that a particularly small number of firms dominate the M&A field, and reputational barriers may exclude new and innovative firms from entering the marketplace and leveraging technology to challenge the status quo.
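The problem of unfamiliar terms in niche contracts can be illustrated with a rough sketch. The Python below assumes a deliberately crude measure, the share of a new contract's words that never appear in the training set, and uses invented sample sentences; the function names are hypothetical and no actual product is claimed to work this way.

```python
def vocabulary(documents):
    """Collect every distinct word that appears in the training documents."""
    return {word for doc in documents for word in doc.lower().split()}

def unseen_share(training_docs, new_doc):
    """Fraction of the new document's words that the training set never contained."""
    vocab = vocabulary(training_docs)
    words = new_doc.lower().split()
    unseen = [w for w in words if w not in vocab]
    return len(unseen) / len(words)

# Hypothetical training set drawn from ordinary share-purchase language.
TRAINING_DOCS = [
    "the buyer shall pay the purchase price at closing",
    "the seller shall deliver the shares at closing",
]

# A clause close to the training data, and a niche shipping clause.
FAMILIAR_CLAUSE = "the buyer shall deliver the purchase price"
NICHE_CLAUSE = "the charterer shall redeliver the vessel at demurrage rates"

if __name__ == "__main__":
    print("familiar:", unseen_share(TRAINING_DOCS, FAMILIAR_CLAUSE))
    print("niche:", unseen_share(TRAINING_DOCS, NICHE_CLAUSE))
```

Every word of the familiar clause appears in the training set, while five of the nine words in the shipping clause do not. A model trained only on the first kind of document has little basis for handling the second, which is why niche and infrequent contract types suffer most from thin training sets.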
Conclusion and Beyond.
This post briefly surveys a few contract-specific issues that might limit the quality and availability of training sets for AI contract drafting, but there are undoubtedly many more. In addition to technological issues, other issues relate to the way law firms, other legal service providers, and clients behave. In a future post, I hope to survey more of these, along with a couple of proposed solutions for issues with training sets for AI contracting. Suggested solutions include affording greater IP protection to contracts and sharing contract management resources among firms. These solutions, if implemented, may in turn pose new issues, such as ones involving cost-sharing and privacy concerns. AI contracting, following the path of machine learning in a more general sense, is a rapidly advancing technology, and new issues with its implementation will likely continue to arise.
Zach Frankel is a second-year law student at Northwestern Pritzker School of Law.