GitHub Copilot is an AI tool developed by GitHub, a Microsoft subsidiary, together with OpenAI. It was trained on code available online, including code hosted on GitHub. Given a prompt such as a function header, it outputs code that may solve the problem presented.

There is at least one instance of GitHub Copilot reproducing a verbatim copy of GPL'ed code; see the image below, this Twitter thread and the original source. There has been a question about the ownership of the generated code, but what about the Copilot application itself? From the point of view of an informatician, it would seem that to be able to replicate this code the Copilot program must contain that information, and therefore be a derivative work of the replicated code, but I do not know how that relates to the legal definition of a derivative work.

The code shown here is licensed under the GPL v2, which I think would not require Microsoft to release the source of a derivative work if they are not releasing the binary. If the copyright owner of code under a license that does not allow this (for example the Affero GPL) were to identify their work being reproduced by GitHub Copilot, would they be likely to succeed in arguing that Copilot is a derivative work, and so should be made available under the Affero GPL?

GitHub Copilot generating Quake III code

User65535

1 Answer

Something used to create a derivative work is not itself a derivative work by virtue of that fact, any more than a copying machine is a derivative work of a book it is used to copy.

Derivative means "based upon". If the software used to make a derivative work was not itself based upon the original work, then the software isn't a derivative work of it.

Furthermore, since software programs lack legal personality, the software program itself can't infringe on the copyright of another. Only a person utilizing the software program can infringe on someone else's copyright.

ohwilleke