Who owns AI-generated content?

Ownership of AI-generated content remains unsettled globally. The US Copyright Office has ruled that purely AI-generated works without human authorship cannot be copyrighted. The UK provides limited protection for computer-generated works. The EU is developing guidance on AI and IP rights.

Can AI be trained on copyrighted material?

This is actively debated and litigated. The EU AI Act requires transparency about training data. The US has ongoing lawsuits (e.g., NY Times v. OpenAI). Japan has a relatively permissive text-and-data mining exception, while the EU allows it with opt-out rights for copyright holders.

AI Copyright & IP Law | AI Governance Reference

1. Overview

Generative AI has created the most significant challenge to intellectual property law since the internet. Two fundamental questions dominate the global debate:

Input: Is training AI models on copyrighted works — without permission or payment — legal?
Output: Can AI-generated content be protected by copyright, and if so, who owns it?

These questions are being answered differently across jurisdictions, creating a patchwork of approaches that affects every AI developer, content creator, publisher, and user worldwide.

Stakes: The economic implications are enormous. The creative industries generate trillions in annual revenue. AI companies have trained models on vast corpora of copyrighted text, images, music, and code. Litigation seeking billions of dollars in damages is pending. The outcome will determine whether AI training is treated as transformative fair use, as a use that requires licensing agreements, or as outright infringement.

2. Training Data & Copyright

2.1 The Technical Process

AI model training involves ingesting and processing copyrighted works at massive scale:

Data collection: Web scraping, licensed datasets, public domain corpora, user-uploaded content
Processing: Tokenization, embedding, and statistical pattern extraction
Storage: Models usually do not store verbatim copies of training works; they encode statistical patterns that can sometimes reproduce or closely approximate training data
Key datasets: Common Crawl, The Pile, LAION-5B, Books3 (controversial), C4
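
The point above about models sometimes reproducing training data can be made concrete with a simple overlap measure. The following is a minimal sketch, not a method used in any of the cases discussed here: it assumes that shared word 5-grams are a reasonable rough signal of verbatim reproduction, and the function names and sample strings are hypothetical.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams in the text."""
    words = text.lower().split()
    # If the text has fewer than n words, the range is empty and so is the set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, source: str, n: int = 5) -> float:
    """Fraction of the output's n-grams that also appear in the source."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return 0.0
    return len(out_grams & word_ngrams(source, n)) / len(out_grams)


# Hypothetical sample strings, for illustration only.
source = "the quick brown fox jumps over the lazy dog near the river bank"
verbatim = "the quick brown fox jumps over the lazy dog"
paraphrase = "a fast brown fox leaped over a sleepy dog"

print(overlap_ratio(verbatim, source))    # high: the output repeats the source
print(overlap_ratio(paraphrase, source))  # low: no shared 5-word run
```

A high ratio only flags candidate passages for human review; it says nothing by itself about whether a reproduction is legally infringing.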

2.2 The Core Legal Question

Position	Argument	Proponents
Training is fair use / permitted	Training is transformative (creates new capability, not copies); no market substitution (model is not a copy of any work); analogous to human learning	AI companies (OpenAI, Google, Meta, Stability AI); some legal scholars
Training requires licensing	Mass copying of copyrighted works requires permission; market harm (AI outputs compete with originals); scale distinguishes from human learning	Publishers (NYT, Getty); authors (Authors Guild); music industry (RIAA/UMG); visual artists
Training is infringement	Unauthorized reproduction at scale; derivative work creation; no existing exception covers this use	Some rights holders; some jurisdictions without broad fair use

3. US Copyright Law & AI

3.1 Fair Use Doctrine (17 U.S.C. § 107)

The US fair use defense is the central legal battleground for AI training. Courts consider four factors:

Factor	Application to AI Training	Likely Outcome
1. Purpose and character of use	Is AI training "transformative"? It creates a new tool rather than substituting for the original works. Commercial purpose weighs against fair use.	Contested — transformativeness is the key question; Google v. Oracle and Andy Warhol Foundation v. Goldsmith set conflicting precedents
2. Nature of the copyrighted work	Training uses both factual and highly creative works	Mixed — creative works get stronger protection
3. Amount and substantiality	AI training typically copies entire works	Weighs against fair use, but Google Books held that copying entire works can be fair use when the use is transformative
4. Effect on the market	Do AI outputs substitute for the original works? Do they create a licensing market that AI companies are bypassing?	Most contested factor — depends on whether AI outputs compete with training data sources

3.2 US Copyright Office Guidance

Registration guidance (March 2023): AI-generated content is not copyrightable without significant human authorship. Works must have human creative control, not just AI prompting.
Zarya of the Dawn (2023): Reviewing the registration of a graphic novel that used Midjourney images, USCO ruled that the text and arrangement were copyrightable but the individual AI-generated images were not
Notice of Inquiry (2023): USCO requested public comments on AI and copyright — received 10,000+ submissions covering training, outputs, and policy
Report on AI (2025): USCO published comprehensive report analyzing training data copyright, output copyrightability, and recommended legislative approaches

3.3 Proposed US Legislation

Bill	Sponsor(s)	Key Provisions	Status
AI SHIELD Act	Rep. Eshoo	Require AI companies to disclose training data; create licensing framework	Introduced (pending)
COPIED Act	Bipartisan	Require consent and compensation for use of copyrighted works in AI training	Introduced (pending)
NO FAKES Act	Bipartisan Senate	Protect individuals from AI-generated replicas of their voice or likeness	Introduced (pending)
Generative AI Copyright Disclosure Act	Rep. Schiff	Require AI developers to disclose copyrighted works used in training	Introduced (pending)

4. EU Copyright & AI

4.1 Text and Data Mining Exception

The EU has the most developed legal framework for AI training data, through the Digital Single Market Directive (DSMD, 2019/790):

Article	Scope	Conditions	AI Training Impact
Article 3	TDM for scientific research	Research organizations and cultural heritage institutions with lawful access	Academic AI research can mine copyrighted works freely
Article 4	General TDM exception	Anyone with lawful access; subject to rights holder opt-out (machine-readable reservation of rights)	Commercial AI training is permitted UNLESS the rights holder has opted out

4.2 The Opt-Out Mechanism

robots.txt and the Opt-Out: Article 4 of the DSMD allows rights holders to reserve their rights against TDM (opt out). For online content, this must be done in a machine-readable format. The question of whether robots.txt constitutes a valid opt-out is being debated. Major publishers and media organizations are implementing AI-specific opt-out signals. The EU AI Act (Article 53) requires providers of general-purpose AI models to respect these opt-outs and to publish summaries of training data.
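
As an illustration of what a machine-readable reservation might look like in practice, the sketch below checks a robots.txt file with Python's standard `urllib.robotparser` module. It assumes, for illustration only, that a publisher expresses its opt-out as a `Disallow` rule aimed at an AI crawler; the user agent `ExampleAIBot`, the helper `may_crawl`, and the URLs are hypothetical, and whether robots.txt counts as a valid Article 4 opt-out remains debated.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The AI crawler is refused; an unrelated crawler falls under the "*" rule.
print(may_crawl(ROBOTS_TXT, "ExampleAIBot", "https://example.com/articles/1"))
print(may_crawl(ROBOTS_TXT, "OtherBot", "https://example.com/articles/1"))
```

A crawler that runs a check like this before collecting a page can record why each page was included or skipped, which may also help when preparing the training data summaries mentioned above.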

4.3 EU AI Act Transparency Requirements

Article 53(1)(c): Providers of GPAI models must put in place a policy to comply with EU copyright law, in particular the opt-out under Article 4 DSMD
Article 53(1)(d): Must draw up and make publicly available a sufficiently detailed summary of the content used for training the model
Template: The AI Office is developing a template for training data summaries

5. UK Copyright & AI

5.1 Current Law

Section 29A CDPA (1988): Existing TDM exception limited to non-commercial research. Does NOT cover commercial AI training.
Computer-generated works (s.9(3)): Uniquely, UK law provides copyright protection for computer-generated works with no human author — authorship is attributed to the person who made the arrangements for its creation. This 1988 provision (predating modern AI) could apply to AI outputs.

5.2 Failed Reform Attempt

The UK government proposed a broad TDM exception for AI training in 2022 but withdrew it after fierce opposition from creative industries. The proposal and its outcome:

Scope: Would have allowed TDM for any purpose (including commercial AI training) with lawful access
Comparison: Similar to EU Article 4 but without the opt-out mechanism
Opposition: The creative sector argued it would devastate their industries
Outcome: The government withdrew the proposal and opted for a voluntary code of practice approach

5.3 Current Approach

Code of Practice: UK IPO convening stakeholders to develop voluntary licensing frameworks between AI companies and rights holders
Transparency: AI companies expected to disclose training data sources
Legislative uncertainty: Without reform, commercial AI training in the UK likely requires licensing (no fair use doctrine in UK law)

6. Other Jurisdictions

Jurisdiction	AI Training Data Law	AI Output Copyright	Key Developments
Japan	Most permissive: 2018 Copyright Act amendment (Art. 30-4) allows reproduction for computational analysis regardless of purpose. Training on copyrighted works broadly permitted.	No AI authorship; human involvement required	Government considering whether to narrow the exception after creator backlash; cultural industry concerns
China	No specific TDM exception; fair use is narrow. Beijing Internet Court ruled (2024) that unauthorized AI training may infringe copyright.	Beijing court ruled (2023) AI-generated images can be copyrighted if human has sufficient creative control	Rapidly evolving; courts taking case-by-case approach; generative AI regulations require training data compliance
Canada	Fair dealing (narrower than US fair use); no specific TDM exception. Government consulting on AI and copyright reform.	No AI authorship under current law	Copyright Board studying AI issues; legislative reform expected
South Korea	Limited fair use; no specific TDM exception. Copyright Act reform under discussion.	No AI authorship	Korean Copyright Commission studying AI training; industry consultations ongoing
Brazil	No specific TDM exception. AI Bill includes provisions on training data.	Under discussion in AI Bill	AI Bill (PL 2338/2023) includes training data transparency requirements
India	Fair dealing; narrow exceptions. No specific TDM provision.	Copyright Board has not addressed AI authorship	Delhi High Court considering AI and copyright cases; reform discussions ongoing
Australia	Fair dealing (narrow). No TDM exception.	No AI authorship (Copyright Act requires human author)	Australian Law Reform Commission recommended TDM exception; government considering

7. Major Litigation

Case	Jurisdiction	Parties	Claims	Status / Significance
NYT v. OpenAI/Microsoft	US (S.D.N.Y.)	New York Times vs. OpenAI & Microsoft	Copyright infringement; unfair competition; training on NYT articles; ChatGPT reproducing NYT content	Pending; most high-profile AI copyright case; NYT demonstrated verbatim output reproduction; seeking billions in damages
Authors Guild v. OpenAI	US (S.D.N.Y.)	Authors Guild + individual authors vs. OpenAI	Training on copyrighted books (Books3 dataset) without permission	Pending; class action; represents thousands of authors
Getty v. Stability AI	US (D. Del.) & UK	Getty Images vs. Stability AI	Training Stable Diffusion on 12M+ Getty images; outputs reproduce watermarks	Pending in both jurisdictions; visual arts case
Andersen v. Stability AI et al.	US (N.D. Cal.)	Artists vs. Stability AI, Midjourney, DeviantArt	Training image generators on copyrighted art without consent	Partially dismissed; amended complaint proceeding; first artist class action
Thomson Reuters v. ROSS Intelligence	US (D. Del.)	Thomson Reuters vs. ROSS Intelligence	Training legal AI on Westlaw content	Summary judgment for Thomson Reuters on infringement (February 2025); court rejected ROSS's fair use defense; significant for legal AI
UMG/RIAA v. AI Music Companies	US	Major record labels vs. Suno, Udio	Training music AI on copyrighted recordings	Filed 2024; music industry copyright claims; seeking statutory damages of up to $150K per infringed work
Concord Music v. Anthropic	US (M.D. Tenn.)	Music publishers vs. Anthropic	Claude reproducing copyrighted song lyrics	Pending; addresses output infringement specifically

8. AI-Generated Content Ownership

8.1 The Authorship Question

Jurisdiction	Can AI Be an Author?	Can AI-Assisted Works Get Copyright?	Key Authority
United States	No — copyright requires human authorship	Yes, if human contribution is sufficient (not just prompting)	USCO guidance (2023); Thaler v. Perlmutter (D.D.C. 2023)
European Union	No — CJEU requires author’s own intellectual creation	Yes, if human creative choices are reflected	Infopaq (CJEU); Painer (CJEU)
United Kingdom	No human author required for computer-generated works (s.9(3) CDPA)	Yes — uniquely, UK law protects computer-generated works	CDPA 1988 s.9(3); authorship goes to person making arrangements
China	Evolving — court ruled human creative control sufficient	Yes, if human demonstrates creative involvement	Beijing Internet Court (2023) — AI image copyright case
Japan	No — requires human thought or emotion	Yes, if human creativity is substantial	Copyright Act requires human author
Canada	No — requires human author under Copyright Act	Likely yes if human contribution is significant	No AI-specific case law yet

9. AI & Patents

9.1 Can AI Be a Patent Inventor?

The question of whether AI can be named as an inventor on a patent has been tested globally through the DABUS cases (Device for the Autonomous Bootstrapping of Unified Sentience):

Jurisdiction	Ruling	Court/Authority
United States	No — inventor must be a natural person (Thaler v. Vidal, Fed. Cir. 2022)	Federal Circuit; affirmed USPTO position
United Kingdom	No — inventor must be a person (Thaler v. Comptroller-General, UK Supreme Court 2023)	UK Supreme Court; unanimous
European Patent Office	No — inventor must be a natural person	EPO Boards of Appeal
Australia	Reversed — Initially yes (Federal Court, 2021); reversed on appeal (Full Federal Court, 2022) — inventor must be human	Full Federal Court of Australia
South Africa	Yes — granted patent with AI inventor (no substantive examination; formality-based system)	CIPC (2021) — not precedent-setting due to process

9.2 AI-Assisted Inventions

The Practical Question: While AI cannot be named as an inventor, AI-assisted inventions (where a human uses AI as a tool) are patentable. The USPTO issued guidance (February 2024) confirming that AI-assisted inventions are not automatically unpatentable but that a natural person must have made a "significant contribution" to the invention. This creates a spectrum from unpatentable (purely AI-generated) to patentable (AI-assisted with significant human contribution).

10. Comparative Analysis

Dimension	USA	EU	UK	Japan
Training Data	Fair use (case-by-case; pending litigation)	TDM exception with opt-out (Art. 4 DSMD)	No commercial TDM exception; licensing expected	Broad TDM exception (Art. 30-4)
AI Output Copyright	No — requires human authorship	No — requires human intellectual creation	Yes — computer-generated works provision (s.9(3))	No — requires human author
AI as Patent Inventor	No (Thaler v. Vidal)	No (EPO)	No (UK Supreme Court)	No
Transparency Req.	Proposed bills (pending)	AI Act Art. 53 (training data summaries)	Voluntary code of practice	Under discussion
Approach	Litigation-driven (courts deciding)	Legislative (DSMD + AI Act)	Voluntary / market-based	Legislative (permissive exception)

11. Trends & Future Outlook

Litigation Will Set US Law

The NYT v. OpenAI and Authors Guild v. OpenAI cases will likely reach appellate courts by 2026-2027. Their outcomes will establish whether AI training constitutes fair use under US law — a precedent with global implications given the dominance of US AI companies.

Licensing Frameworks Emerging

Major licensing deals are already being struck: OpenAI with AP, Axel Springer, Le Monde; Google with Reddit; various AI companies with stock photo agencies. A licensing market for AI training data is forming, though terms and economics remain highly contested.

Regulatory Divergence

Japan’s permissive approach, the EU’s opt-out regime, and the UK’s uncertainty create regulatory arbitrage opportunities and compliance challenges for global AI companies. Pressure for international harmonization is growing but no convergence is imminent.

Deepfakes & Right of Publicity

AI-generated replicas of real people (voice clones, digital likenesses) are driving new legislation (NO FAKES Act in US, various state laws) and raising questions at the intersection of copyright, right of publicity, and personality rights.

12. References & Resources

Official Sources

US Copyright Office — Artificial Intelligence — USCO AI initiative page with guidance, NOI, and reports
USCO — Copyright and Artificial Intelligence Report (2025) — Comprehensive policy analysis
EUR-Lex — Digital Single Market Directive (2019/790) — Full text including TDM exceptions
UK IPO — AI and Intellectual Property — Consultation documents and government response
USPTO — Artificial Intelligence — Patent and trademark guidance for AI

Key Court Decisions

Thaler v. Perlmutter (D.D.C. 2023) — AI-generated art not copyrightable without human authorship
Thaler v. Comptroller-General (UK Supreme Court 2023) — AI cannot be patent inventor
Thaler v. Vidal (Fed. Cir. 2022) — AI cannot be patent inventor (US)

Pending Litigation

New York Times v. Microsoft/OpenAI — Complaint (2023) — Full text of NYT complaint
Authors Guild v. OpenAI — Authors Guild case page

Academic & Research

WIPO — Artificial Intelligence and Intellectual Property — World IP Organization AI policy portal
SSRN — Fair Learning (Sobel) — Influential paper on AI training and fair use
