Manually migrating this from here because the answers were unsatisfactory: https://datascience.stackexchange.com/questions/26079/is-it-legal-to-scrape-youtube-videos-for-training-data

Can a machine learning researcher scrape YouTube, or use YouTube videos or snippets of them, to train a model? Would this violate (1) the creators' copyrights or (2) Google's distribution/use rights?

This question does not concern datasets Google has freely released, but rather any other publicly available videos.

belkarx

1 Answer

Downloading or scraping the videos would be a prima facie copyright infringement if the content is protected by copyright (most is) and not licensed for such use.

This would be excused only if a fair use defence were established, which is a case-by-case assessment as described here: "In the US, when is fair use a defense to copyright infringement?"

The fact that the intended use is to train a machine-learning model would bear on just one of the factors in the fair use analysis (the purpose and character of the use), and it is not determinative. Other factors can pull the analysis in the opposite direction, and we are no better placed than you to predict the outcome of such a defence.

Jen