One man's look at copyright law

This article by Dan Polansky looks at copyright law, especially United States law.

Protected works

As per Wikisource: United States Code/Title 17/Chapter 1/Sections 102 and 103: "Works of authorship include the following categories:

(1) literary works;
(2) musical works, including any accompanying words;
(3) dramatic works, including any accompanying music;
(4) pantomimes and choreographic works;
(5) pictorial, graphic, and sculptural works;
(6) motion pictures and other audiovisual works;
(7) sound recordings; and
(8) architectural works."

In the following, we pay specific attention to computer software.

Computer software

The U.S. copyright law codification itself does not list computer programs among protected work categories. However, they are mentioned in "House Report No. 94-1476 (Extract)" in Wikisource: United States Code/Title 17/Chapter 1/Sections 102 and 103:

'The term "literary works" does not connote any criterion of literary merit or qualitative value: it includes catalogs, directories, and similar factual, reference, or instructional works and compilations of data. It also includes computer data bases, and computer programs to the extent that they incorporate authorship in the programmer's expression of original ideas, as distinguished from the ideas themselves.'

As per Wikisource: United States Code/Title 17/Chapter 1/Section 101:

"‘‘Literary works’’ are works, other than audiovisual works, expressed in words, numbers, or other verbal or numerical symbols or indicia, regardless of the nature of the material objects, such as books, periodicals, manuscripts, phonorecords, film, tapes, disks, or cards, in which they are embodied."

Arguably, this is a terminological stretch, since computer programs are not literary works in a naive sense of the term. Technical writing does not quite match "literary" either, unless the word means "by means of letters". But then, what if the work is written in Chinese characters, which are not letters? Be that as it may, if "literary" means "by means of letters", a computer program in 7-bit ASCII is a literary work. Since the term "literary work" is defined as quoted above, this paragraph has no material impact.

Is the binary executable of a computer program a literary work, and if so, by what standard? Sure enough, we can "disassemble" (translate) the binary into assembly language, which uses mnemonics, and then we come closer to something like works "expressed in words, numbers, or other verbal or numerical symbols or indicia". In any case, under the naive sense of "literary work", calling a binary executable a literary work is a stretch.

As an aside, the above language of works "expressed in [...] numbers" opens the door to all digital objects being "literary works", since e.g. a PNG raster image is a work expressed in numbers. That was probably not intended. Indeed, even a 7-bit ASCII file containing English text would be a work expressed in numbers (via its digital storage), whereas the intuitive understanding would be that it is a work expressed in words.
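To make the point about digital storage concrete, here is a minimal Python sketch. The sample word "cat" and the variable names are chosen purely for illustration; the sketch only shows that the same content can be viewed either as words or as numbers.

```python
# The same three-letter word, viewed as text and as stored numbers.
text = "cat"

# Encoding to 7-bit ASCII yields one number (0-127) per character.
codes = list(text.encode("ascii"))
print(codes)  # [99, 97, 116]

# The numbers carry exactly the same content: decoding restores the word.
restored = bytes(codes).decode("ascii")
print(restored)  # cat
```

Whether such a file counts as "expressed in words" or "expressed in numbers" thus depends on the level of description, not on the content.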

Notably, a computer game is a package of different kinds of elements: the computer code embodying algorithms, screen/room layouts, graphical elements, background music, sound, etc. Thus, a computer game is something like a rich compound, and its different parts come under different work categories listed by the copyright law.

Fixing in tangible form

As per Wikisource: United States Code/Title 17/Chapter 1/Sections 102 and 103:

  • "Copyright protection subsists, in accordance with this title, in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device."

When someone gives a lecture purely by speech, with no slides, and a student takes lecture notes, is the student the copyright holder rather than the teacher? One might think so: the teacher did not fix the lecture in any tangible medium of expression. Indeed, lecture notes taken by students are sometimes published online. On this reading, sound waves (changes in air pressure) are not a tangible medium. However, whether this is a standard or accepted legal interpretation needs to be clarified.

If someone makes a sound recording of an originally improvised musical performance by someone else, is it the recorder who holds the copyright?

One may ponder whether computer files are really a tangible medium, given that one cannot touch the files, unlike a sheet of paper, a book, a painting, a photograph or a physical photographic film. For the purposes of copyright law they almost certainly are, but it is not clear how or whether this is codified.

A corollary seems to be that when a journalist interviews someone by means of speech (not e.g. email), it is the journalist that is the copyright holder of the whole interview, and the interviewee has no copyright to what he or she said. This should be better sourced since it may be a non-standard analysis.

Originality

As per Wikisource: United States Code/Title 17/Chapter 1/Sections 102 and 103:

  • "Copyright protection subsists, in accordance with this title, in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device."

What does original mean? For one thing, it means "not copied". But does it mean something else?

Let us produce a list of pseudo-random numbers seeded from the clock (say, with the Mersenne Twister used by Python). Let this list be found only in our publication, nowhere else. Is it an original work protected by copyright? It is the work of an algorithm, not a human. Moreover, a statement of the form "X is the Y-th number generated by the Mersenne Twister generator seeded from seed S" is a statement of fact. Can the algorithm be claimed to be an author and to engage in authorship? But then, under physicalism, human brains are something like embodiments of huge complexes of algorithms, and then, counter-intuitive as it may seem, human authorship is also the result of an algorithm, just that no one knows exactly what that algorithm is. This requires clarification.
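The thought experiment can be sketched in Python, whose standard `random` module uses the Mersenne Twister. The function name `pseudo_random_list`, the list length and the value range are assumptions made only for this illustration.

```python
import random
import time


def pseudo_random_list(seed, count=5):
    """Return `count` pseudo-random integers produced from `seed`."""
    rng = random.Random(seed)  # a private Mersenne Twister generator
    return [rng.randrange(1_000_000) for _ in range(count)]


# Seed from the clock, as in the thought experiment.
seed = time.time_ns()
numbers = pseudo_random_list(seed)
print(seed, numbers)

# Anyone who knows the seed can regenerate exactly the same list,
# which is why "X is the Y-th number for seed S" reads as a fact.
assert pseudo_random_list(seed) == numbers
```

The list looks arbitrary, yet it is fully determined by the seed and the algorithm; the only input that is not mechanical is the moment at which the clock was read.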

Let us produce a list of random numbers by a method that can claim to produce genuinely random numbers rather than pseudo-random ones. Then, the result is not produced by a deterministic algorithm. Is the result protected by copyright?

Idea-expression dichotomy

As per Wikisource: United States Code/Title 17/Chapter 1/Sections 102 and 103:

"(b) In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work."

Moreover, "[...] the fundamental axiom of copyright law that no one may copyright facts or ideas" as per Wikisource: Feist Publications v. Rural Telephone Service.

Feist v. Rural

In Feist v. Rural (1991), the U.S. Supreme Court ruled that a telephone directory is a mere listing of facts with no original selection or arrangement and that it is therefore not copyright protected.

Court actions and courts involved:

  • The District Court granted summary judgment to Rural, agreeing with Rural that telephone directories are protected by copyright.
  • The Court of Appeals affirmed.
  • The Supreme Court reversed the judgment of the Court of Appeals.

Merger doctrine

From Murray 2006: "The merger doctrine in copyright states that if an idea and the expression of the idea are so tied together that the idea and its expression are one - there is only one conceivable way or a drastically limited number of ways to express and embody the idea in a work - then the expression of the idea is uncopyrightable because ideas may not be copyrighted."[1]

Sweat of the brow

The "sweat of the brow" doctrine is the idea that effort alone (of collecting information) is worth protection regardless of originality. In the U.S., the doctrine was rejected in Feist v. Rural in 1991.

Fair use

As per copyright.gov: "Under the fair use doctrine of the U.S. copyright statute, it is permissible to use limited portions of a work including quotes, for purposes such as commentary, criticism, news reporting, and scholarly reports."

Wikimedia Commons does not allow fair use.[2] The rationale is that the project intends to serve wikis in different languages, and different countries support or interpret fair use differently.

De minimis

Sources indicate that a de minimis defense exists in U.S. copyright law; the name abbreviates the phrase de minimis non curat lex. It seems to be distinct from fair use rather than part of it. It remains to be clarified what exactly it is; different sources seem to use the phrase differently.

Compilation

As per copyright.gov, 'Compilations of data or compilations of preexisting works (also known as “collective works”) may also be copyrightable if the materials are selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes a new work. When the collecting of the preexisting material that makes up the compilation is a purely mechanical task with no element of original selection, coordination, or arrangement, such as a white-pages telephone directory, copyright protection for the compilation is not available.'

Government works

Works of the U.S. government are not protected by copyright, per Wikisource: United States Code/Title 17/Chapter 1/Sections 105 and 106.

Copyright term

Term (duration) in the U.S.:

  • Generally the life of the author plus 70 years, but it depends on when the work was created and on other things.

U.S. constitution

The U.S. Constitution on copyright, per Wikisource: Constitution of the United States of America:

"The Congress shall have Power [...] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;"

Above, the language is of "Writings"; thus, read literally, e.g. paintings would not be protected.

Above, the language is of "useful" rather than "beautiful" or "pleasant"; narrowly construed, novels would not be protected, but technical and scientific writing would be protected.

Copyright notice

As per Circular 1, "Notice was required for works published in the United States before March 1, 1989. Works published without notice before that date may have entered the public domain in this country."

As per Circular 3, "Copyright notice is optional for works published on or after March 1, 1989, unpublished works, and foreign works; however, there are legal benefits for including notice on your work."

Berne Convention

The Berne Convention is an international copyright treaty. Its main goal is the mutual protection of copyright between signatory countries.

As per Wikisource: Convention for the Protection of Literary and Artistic Works, the general minimum term of protection is the life of the author plus 50 years, but there are different conditions for certain classes of works.

Paraphrasing

According to Wikipedia:Close paraphrasing, paraphrasing the source does not necessarily avoid copyright violation. This seems strange since it is the expression that is protected, not the fact or idea. This requires more deliberation and research.

Meta: Wikilegal/Close Paraphrasing indicates that the answer to "Is close paraphrasing of a copyrighted work a copyright infringement?" is yes. But then, there is close paraphrasing and there is non-close paraphrasing. The page does not explain what "close" means, nor does it give any examples. The Wikilegal page makes an exception to the answer it has given, admitting that if one closely paraphrases a copyrighted work published under CC-BY-SA in a work under CC-BY-SA and traces the sentence to that work, there is no copyright violation. Whether that page was properly proofread and vetted is unclear; it was created in 2012 by an anonymous IP address.

Plagiarism

The concept of plagiarism relates to copyright violation, but is not exactly the same thing.

According to Britannica 1911, plagiarism is "an appropriation or copying from the work of another, in literature or art, and the passing off of the same as original or without acknowledgment of the real authorship or source."

One point of contrast to copyright violation: If a publisher publishes a copyrighted work without the author's (or the rightful publisher's) permission but correctly states the author of the work, it would be a copyright violation but not plagiarism, since the transgressing publisher does not misrepresent or hide the authorship.

Even passing off someone's ideas as one's own is plagiarism, according to multiple sources. That is a point of contrast to copyright, which does not protect ideas. How and to what extent giving credit for ideas to other people and sources is practicable and practiced is unclear: surely authors would all too often read something somewhere, forget where, and later use the ideas of forgotten provenance in their writing or other production.

Plagiarism is not illegal in the U.S., unlike copyright violation; plagiarism is an ethical concern.

Translation pairs

Translation pairs, e.g. "cat" --> "Katze", are one area where copyright law applies. Arguably, they often constitute facts expressed in an obvious canonical form (a pair), and would thereby not be protected by copyright.

However, one can argue that insofar as the chosen translation pairs differ between dictionaries, there is something original to them to an extent. This requires clarification.

Precautions against copyright infringement that I use:

  • Use multiple sources for each translation pair rather than blindly copying a single source.
  • Double check and question rather than blindly accepting what the sources indicate.
  • Check definitions of the items in the pair for match.
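The first precaution, relying on agreement between several sources rather than on one, can be sketched as follows. This is only an illustration: the function name `confirmed_pairs`, the threshold `min_sources` and the three tiny word lists are made up for the example and do not come from any real dictionary.

```python
from collections import Counter


def confirmed_pairs(sources, min_sources=2):
    """Return translation pairs found in at least `min_sources` sources.

    Each source is a dict mapping a word to its translation.
    """
    counts = Counter()
    for source in sources:
        for pair in source.items():
            counts[pair] += 1
    return sorted(pair for pair, n in counts.items() if n >= min_sources)


# Hypothetical mini word lists, invented for this sketch.
source_a = {"cat": "Katze", "dog": "Hund", "gift": "Geschenk"}
source_b = {"cat": "Katze", "dog": "Hund", "gift": "Gift"}
source_c = {"cat": "Katze"}

print(confirmed_pairs([source_a, source_b, source_c]))
# [('cat', 'Katze'), ('dog', 'Hund')]
```

The pair for "gift" is dropped because the sources disagree; such cases are exactly the ones that call for the manual double-checking described in the list above.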

Dictionary definitions

Above, we touched on translation pairs, but there are other dictionary artifacts potentially subject to copyright, especially definitions.

Arguably, dictionary definitions are copyrighted, given the variation in their formulation across dictionaries. On the other hand, one could argue that they are not protected, given the merger doctrine: there is only a handful of ways to define a word accurately, one might think. The Richards v. Merriam Webster, Inc. case suggests definitions are protected.

Dictionary quotations

Some dictionaries contain short quotations of word use from literature (a sentence or two), stating the work title, author and publication date. Since the authorship is properly credited, it is not plagiarism but it could still be a copyright violation in principle. This practice is probably allowed via fair use. Finding a good source on the subject would be worthwhile.

Slogans and short phrases

As per copyright.gov, "Copyright does not protect names, titles, slogans, or short phrases. In some cases, these things may be protected as trademarks."

Photographs

One could think that, unlike paintings, photographs are a straightforward capture of facts: how the world looks at a point in time at a certain place from a certain angle, etc. However, since photographs are copyright protected, there must be a rationale for protecting them. Finding out the rationale requires more research.

A contrast can be drawn between 1) merely making a photograph and 2) arranging things to be photographed and then making a photograph. Thus, the lawgiver could consider "plain" photographs to be free from copyright protection.

Charts and graphs

Charts and graphs would seem to be free from copyright insofar as they are a straightforward presentation of data, since data and facts are not copyrightable. However, one might reckon that color schemes and similar somewhat arbitrary choices could make a chart copyrightable.

Wikimedia Commons has Template:PD-chart, which labels charts that are considered uncopyrightable; one may investigate the particular charts to get an idea. Moreover, Commons:Threshold of originality#Charts links to multiple deletion discussions.

Computer-made art

It is unclear which computer-made art is subject to copyright, and to what extent. One can think e.g. of iterated function system fractals or Mandelbrot set images, which are drawn by the computer. In the case of the Mandelbrot set, the human can set the location, the zoom level and the color palette; it is not clear to what extent this limited choice is creative. Arguably, Mandelbrot set images are not created but rather computationally discovered.
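How little the human contributes can be seen in a minimal Python sketch of the usual escape-time rendering of the Mandelbrot set, here drawn with text characters instead of colors. The function names, default values and character palette are assumptions made for the illustration; the only human inputs are `center`, `zoom` and `palette`.

```python
def escape_time(c, max_iter=50):
    """Count iterations of z -> z*z + c before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter  # never escaped: treated as inside the set


def render(center=-0.5 + 0j, zoom=1.0, palette=" .:*#",
           width=60, height=24, max_iter=50):
    """Render the region around `center` as lines of text."""
    rows = []
    for row in range(height):
        y = center.imag + (0.5 - row / (height - 1)) * 2.0 / zoom
        line = ""
        for col in range(width):
            x = center.real + (col / (width - 1) - 0.5) * 3.0 / zoom
            n = escape_time(complex(x, y), max_iter)
            # Map the iteration count onto the chosen palette.
            line += palette[n * (len(palette) - 1) // max_iter]
        rows.append(line)
    return "\n".join(rows)


print(render())
```

Calling `render` twice with the same three choices yields the same picture, which is one way to read the claim that such images are discovered rather than created.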

A rather different category is pseudo-AI-generated art, which we cover in a dedicated section below.

According to Compendium of U.S. Copyright Office Practices, human authorship is required for copyright: 'The copyright law only protects “the fruits of intellectual labor” that “are founded in the creative powers of the mind.”'

AI-generated art

In pseudo-AI-generated art, the human can supply a brief textual prompt and the pseudo-AI produces an impressive-looking image. It is not clear to whom the copyright (if any) of the output image belongs.

AI-generated text

The copyright questions about AI-generated text seem similar to those about AI-generated visual art.

The New York Times sued OpenAI and Microsoft in December 2023, alleging copyright infringement in connection with ChatGPT.[3][4]

Screenshots of computer software

Screenshots of computer software are copyrighted. There may be a fair use defense for their use; this needs to be clarified.

The English Wikipedia has Template:Non-free video game screenshot to label non-free screenshots and to state fair use as the rationale.

Videogame long play videos

YouTube contains many videogame long play videos, showing someone play a video game from start to end. One would think that since these videos show graphical elements and play the music, they are copyright violations. On the other hand, one could argue that the videos do not provide as much entertainment as actually playing the game and do not effectively reduce the merchantability of the game, so ethically (as contrasted to legally) the practice is tolerable. Then again, even a single screenshot from a computer game is copyrighted, and an argument protecting a whole long play video from a lawsuit would seem to protect a single screenshot equally well. The long play publishing practice possibly rests on the game publishers not launching any lawsuits or requests for video removal, given that the videos do not reduce their revenue and probably cause them no other harm.

Game mechanics

Multiple experts on Quora indicate that the mechanics of computer games are not copyrightable but are patentable. However, visual artwork and music are very likely copyright protected.

As for the mechanics, one has to be clear about what is meant by that. The general idea of a game may not be copyrightable, but specific room layouts possibly could be. This requires more research.

Internet and the web

The web presents some new questions concerning copyright. For instance, one can ask whether a browser's making RAM copies and even temporary files to view content sent by the web server is a copyright violation in that it is an unauthorized making of copies. Arguably, it is the intent of the web server provider that user software makes such copies and therefore, there is an implied license to make such copies, solely for the purpose of activity approved by the web server, that of viewing the content.

These fine considerations can play a role in an analysis of the legality of training large language models (LLMs, a kind of pseudo-AI) on content obtained from web pages. Arguably, as long as the web page provider did not grant anyone an express license to make copies on their servers for the purpose of LLM training, making these copies is a copyright violation.

Database right

Database right is distinct from copyright. It is not available in the U.S., but it is available in the EU.

Database right had to be codified as distinct from copyright since in copyright, information is not protected, merely its expression, and a database is a collection of standardized structured records, arguably showing no originality in expression. (But, arguably, one might claim some originality in the data model.)

Wikipedia

As per Wikipedia:Copyrights#Governing copyright law:

"The Wikimedia Foundation is based in the United States and accordingly governed by United States copyright law. Regardless, according to Jimbo Wales, the co-founder of Wikipedia, Wikipedia contributors should respect the copyright law of other nations, even if these do not have official copyright relations with the United States."

The above applies to the English Wikipedia. It probably applies to non-English Wikipedias as well since the organization and the servers are located in the U.S.

Wikisource

If one assumes that the decisive factor for copyright jurisdiction is that the servers and the organization are located in the U.S., one might think that all Wikisource domains (de.wikisource.org, fr.wikisource.org, etc.) would be under the same U.S. jurisdiction. Nonetheless, some non-English works are hosted at wikisource.org rather than on the subdomain matching the language. For instance, there is https://wikisource.org/wiki/Słownik_geograficzny_Królestwa_Polskiego. This page carries the template "Template:PD-US-1923-abroad/PL". There are other such templates: Template:PD-US-1923-abroad/CS, Template:PD-US-1923-abroad/DE, etc. Therefore, the Foundation seems to be playing it safe and separates non-English works whose inclusion is based on the PD-US rationale onto a dedicated domain. However, how this separation can possibly be material from the standpoint of copyright jurisdiction is unclear: it is hard to understand how a mere switch of the domain from e.g. pl.wikisource.org to wikisource.org (keeping the language of the work the same) magically impacts the applicable copyright jurisdiction. wikisource.org still seems to serve the pages to viewers in various jurisdictions across the world; by contrast, some U.S.-based websites responded to the EU-imposed GDPR by refusing to serve pages to viewers located in the EU. Thus, hosting the pages on a separate domain seems to be some kind of game more than anything else, but I (Dan Polansky) am not a lawyer.

References

  1. Copyright, Originality, and the End of the Scenes a Faire and Merger Doctrines for Visual Works by Michael D. Murray, 2006
  2. Commons: Commons:Fair use
  3. The New York Times sues OpenAI and Microsoft for copyright infringement, 27 Dec 2023, theverge.com
  4. pdf attachment: the NYT complaint, nytimes.com
