
Most answers I have seen to the question of whether AI-generated content is copyrightable cite law in the US and other countries limiting copyright to human-created content. However, all the answers I have seen fail to address how a government copyright office, and potentially later a rights holder, copyright enforcement company, or court, would be able to prove that the content was AI-generated.

AI platforms are often vague about retention and/or use of user input and AI output, and it is not clear to me how accessible this data would be to government copyright offices, courts, and copyright enforcement companies for purposes of determining whether a particular lyric was AI-generated.

In some cases, already copyrighted content may appear verbatim in AI output, and to the extent this can be easily recognized as an existing work, the AI authorship question becomes moot. However, most of the time this will not be the case, especially where AI only enhances a user's original work rather than composing something from scratch or with minimal prompts.

As an example of information that would partly answer this question: it would be possible, if it meets standards of evidence, to prove AI authorship if an AI tool's conversation logs were accessible to the court. Legal professionals will know whether courts have accepted, or would likely accept, this as evidence and whether they could obtain it (and similar evidence). Experience with how much attention courts pay to stylistic tells in AI-generated content would also help answer it, as would information about other specific kinds of proof a court would ask for in such cases.


4 Answers


Possible kinds of evidence

As explained in the comments by Nate Eldredge:

There only really is a general answer. You present evidence and see if a jury believes you. There's no specific set of evidence that they're required to accept or reject; it's always case-by-case.

There isn't much that is inherently legal about how you prove one case versus another. Proof of facts depends on what information you can obtain, basic logic, and emotional impact. The rules of evidence exclude some kinds of evidence, but they don't tell you how to affirmatively prove particular things.

Testimony

The most obvious way to prove authorship is to get a statement under oath from the person who allegedly created the work, from some other person with personal knowledge of the facts, or from someone who heard a person with personal knowledge of the facts say so. Cross-examination might also be needed to clarify points that are vague in that testimony.

There is a common misconception, reflected in statements like "What proof do you have?", that testimony from a party to a dispute isn't evidence. In fact, it is the most common kind of evidence, is generally admissible, and is routinely used to prove facts in legal disputes.

Propensity evidence

Propensity evidence might be admissible to show an established modus operandi of this particular person generating content with AI and then claiming they made it (or of not doing so), or to impeach the credibility of a sworn statement from someone providing evidence about the source of the work.

Someone who has previously been caught submitting AI-generated works to contests and in copyright applications is going to have a harder time proving that they made the work themselves than someone with a long history of painstakingly making original works without AI assistance.

Forensic review of computer files and browsing histories

One could subpoena the Internet browsing history of the person suspected of using AI, along with the metadata of the file being advanced as human-created, to show an opportunity to use AI and a timeline that favors an AI or non-AI source.

For example, if a large, complicated work was created in five minutes, shortly after the browsing history shows access to an AI-generator website, that would be strong evidence that it was created with AI. Conversely, the absence of any access to an AI-generator website, together with scores of versions of the same work saved over a period of months from an application on the person's computer capable of producing such a work, would be strong evidence that the work was created by a person.
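
To give a concrete, if simplified, sense of what such a timeline review looks at, here is a minimal Python sketch that reads basic file-system metadata. The file name is a hypothetical placeholder, and real forensic examination goes far deeper (document-internal revision histories, backups, cloud sync logs, and so on).

```python
# Minimal sketch only: the kind of file metadata a forensic reviewer might
# correlate with a browsing history. The path below is a hypothetical example.
from datetime import datetime, timezone
from pathlib import Path

def describe_file(path: str) -> None:
    """Print size and timestamps for one file."""
    st = Path(path).stat()
    # Note: st_ctime is creation time on Windows but metadata-change time
    # on most Unix systems, so an examiner must interpret it carefully.
    created = datetime.fromtimestamp(st.st_ctime, tz=timezone.utc)
    modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    print(f"{Path(path).name}: {st.st_size} bytes, "
          f"created {created:%Y-%m-%d %H:%M}, last modified {modified:%Y-%m-%d %H:%M}")
    # Many saved drafts spread over months suggest gradual human authorship;
    # a single file appearing minutes after an AI site was visited suggests
    # pasted AI output. Neither is conclusive on its own.

describe_file("manuscript_draft.docx")  # hypothetical file
```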

Review of public output from AI-generating websites

Some AI-generating websites publish works created with them to the general public, so it may be possible to search those sites for a republished version of the work generated from user input.

If the work submitted shows up in the AI-generating website's public output, it is almost surely AI-generated.
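
As a rough illustration of how such a comparison might work once candidate texts have been collected, here is a small Python sketch that compares a submitted work against items from a hypothetical public gallery; the file names, the 0.9 threshold, and the collection step itself are all assumptions made for illustration.

```python
# Toy comparison of a submitted text against texts taken from an AI
# generator's public gallery. File names and the threshold are hypothetical.
from difflib import SequenceMatcher
from pathlib import Path

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two texts."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

submitted = Path("submitted_work.txt").read_text()          # hypothetical file
for item in ["gallery_item_1.txt", "gallery_item_2.txt"]:   # hypothetical gallery
    score = similarity(submitted, Path(item).read_text())
    if score > 0.9:  # near-verbatim match
        print(f"{item}: near-identical match (ratio {score:.2f})")
```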

Expert testimony

Another way to prove this would be to get expert witness testimony from experts in distinguishing AI-produced works from non-AI-produced works (a growing cottage industry). Short of that, there is software designed for the purpose of distinguishing the two, although even if its output is admissible as evidence in court, it would probably be given much less weight than expert witness testimony.

The details of how this is done by expert witnesses are beyond the scope of Law.SE and call for information technology and artificial intelligence expertise rather than legal knowledge. But the answer by Barmar explores some of them in a cursory way, to illustrate the kind of testimony that expert witnesses might provide.

What counts as AI-generated for copyright purposes?

None of this engages with the legally hard question of where AI generation that is not entitled to copyright ends, and where the use of computer-generated graphics or editing software in the course of preparing a human-generated work begins. This is another reason that expert testimony may be required.

Does Photoshopping a picture cross the line? Is telling a computer to color in everything between certain lines a certain shade of blue AI-generation? Is hand-drawing six images in a twenty-four-frame-per-second animation and asking a computer to interpolate the other eighteen frames AI-generation? Is using a 3D model of a common part of the set as a starting point for drawing something in an animation or manga AI-generation? Is having a computer do the shading from the direction of a light source identified by the creator AI-generation? What about using an AI generator to produce a trailer or storyboard to be used to inspire a longer or more complete work that is created the old-fashioned way?

Is following AI advice in a grammar editor to break up run-on sentences, correct grammatical and spelling mistakes, and avoid words considered derogatory AI-generation? Is using an AI search to find references and quotes to consider using in a research paper AI-generation if the text that uses those references and quotes is human-written?

Many of the gray area questions remain unresolved legally.

feetwet
ohwilleke

You prove you wrote it the same way you prove you wrote anything now

By providing sufficient evidence to convince the court on the balance of probabilities. If the person you are suing asserts that the material is AI-generated and not your work, they will need evidence to support that assertion, which your evidence must overcome.

Dale M

While AI is likely to improve over time, at the moment AI has some quirks that make its output relatively easy to distinguish from human-generated text. For instance, when people post AI-generated responses on Stack Exchange sites, someone will almost always recognize it -- it rarely reads like something a human would have written. Experts also have statistics-based tools that can help them distinguish the two; some of these are probably being used now by schools to detect when students submit AI-generated essays (similar to the way they use tools to detect plagiarism).
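
Purely as an illustration of the kind of surface statistics such tools can look at (real detectors rely on trained models and are far more sophisticated), here is a toy Python sketch that measures how uniform the sentence lengths in a text are; the input file is a hypothetical placeholder, and a number like this proves nothing by itself.

```python
# Toy illustration only: a crude "burstiness" statistic over sentence lengths.
# Real AI-detection tools use trained statistical models, not a single number.
import re
import statistics

def sentence_length_spread(text: str) -> float:
    """Population standard deviation of sentence lengths, in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

sample = open("disputed_text.txt").read()  # hypothetical file
print(f"sentence-length spread: {sentence_length_spread(sample):.1f}")
# Very uniform sentence lengths (a low spread) are sometimes cited as a
# stylistic tell of machine-generated prose, but this is weak evidence at best.
```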

In the case of images, there are some mistakes that AIs make that a human would never make, like drawing the wrong number of fingers on a person's hand.

So you could get an expert witness to examine the work in question. Using their expert knowledge and tools, they will likely be able to tell whether it was human- or AI-generated.

Barmar

As you've been told (at the proposed duplicate, in DaleM's answer, and in ohwilleke's answer), all evidence that, if believed, would increase or decrease the likelihood that a work was AI-generated is considered relevant.

This can include:

  • testimony;
  • affidavits;
  • admissions;
  • digital forensics;
  • expert opinion evidence;
  • general factual evidence that is more consistent with one version of events than another version of events;
  • etc.

To the extent that you're asking more specifically what that evidence would be, those are technical questions for subject matter experts, not legal questions. E.g.: if you are asking "what would an expert say?", that is a technical question, not a legal question. If you are asking "what would the digital forensics show?", that is a technical question, not a legal question. Still, ohwilleke and Barmar give some suggestions for what this evidence might entail.

If you really want to know how AI generation leaves its traces in the world, that is better suited for a technical stackexchange. And you have received what appear to be two helpful answers in that regard here: https://ai.stackexchange.com/q/46057

Jen