AI, Copyright, and Chaos: What Every Company Needs to Know Now

You know that AI raises copyright issues. You have heard the buzz, but you haven’t had time to pull the pieces together.

No worries. This short primer breaks down the legal issues, the court battles, and the things you need to know right now about copyright and AI. Two legal questions are at the heart of it:

  1. Does using copyrighted content to train an AI model qualify as fair use under U.S. copyright law—or is it infringement?
  2. Who owns the copyrightable output of AI?

Let’s unpack the key issues, legal frameworks, and where this rapidly evolving controversy may be headed.


Fair Use or Infringement?

In the United States, copyright law protects original works of authorship—from novels and news articles to source code and digital art. But there are exceptions to copyright protection.

One of the most important exceptions is the fair use doctrine, which allows limited use of copyrighted material without permission, typically for purposes such as criticism, commentary, education, news reporting, or research.

To determine whether a use qualifies as fair use, courts apply a four-factor test:

  1. The purpose and character of the use, including whether it is commercial or educational, and whether it is “transformative” (i.e., adds new expression, meaning, or message).
  2. The nature of the copyrighted work—factual works are more likely to support fair use than purely creative ones.
  3. The amount and substantiality of the portion used, both in quantity and quality.
  4. The effect of the use on the potential market for the original work.

AI companies argue that training models on large datasets that include copyrighted works qualifies as fair use because the use is transformative. In other words, they claim that the AI isn’t republishing the original content; instead, it is learning patterns from the content to generate new text or images.

On the other hand, copyright holders argue that scraping and using their protected works without permission or compensation—especially for commercial products—is not transformative and undermines the creators’ ability to monetize their work.


Key Legal Battles to Watch

Several high-profile lawsuits are testing these arguments in court. These cases will likely shape the future legal landscape for AI and copyright:

  • The New York Times v. OpenAI and Microsoft (2023): The NYT filed a lawsuit accusing OpenAI and its partner Microsoft of using millions of its articles without permission to train language models like ChatGPT. The complaint claims that the AI models can reproduce near-verbatim excerpts of NYT articles and that this undermines the newspaper’s business model. The NYT argues this use is not fair use and seeks damages and removal of NYT content from the AI models.
  • Sarah Silverman v. Meta and OpenAI (2023): Comedian and author Sarah Silverman joined other authors in a class-action lawsuit, alleging that their books were illegally used to train AI models. These cases challenge the idea that large-scale scraping of copyrighted books (often from sources like shadow libraries) is permissible.
  • Andersen v. Stability AI (2023): Visual artists sued Stability AI, the company behind the image generator Stable Diffusion, claiming that their copyrighted artwork was used to train the model and that AI-generated images can closely mimic their styles.

So far, courts have dismissed some parts of these cases. But other parts—especially around the legality of training data and market harm—are proceeding. There is no final answer yet, and each case could set important precedent.


Ownership of AI Outputs

Beyond training data, another hot topic is who owns the copyrightable outputs generated by AI. U.S. copyright law only protects works of human authorship. If AI generates a poem, an image, or a piece of code with no meaningful human input, it likely isn’t protected by copyright at all. This creates two concerns for companies using generative AI:

  1. Lack of protection: If a company generates content using AI and there is no human authorship, the company may not be able to stop others from copying or reusing it.
  2. Infringement risk: If a company uses AI-generated output that mimics a copyrighted work too closely, the company could be liable for infringement, even if it was not aware of the overlap.

The U.S. Copyright Office reinforced its stance on this issue in a January 2025 report on copyrightability:

  • Works generated solely by AI lack copyright protection unless substantial human creative input is involved.
  • Simple prompts alone do not establish copyright ownership over AI outputs.
  • If a human makes creative arrangements or modifications to an AI-generated output that reflect originality, those portions may qualify for protection.


Practical Takeaways for Legal and Compliance Teams

While the courts work through these complex issues, in-house legal and compliance professionals need to think proactively about managing AI copyright risks. Here are five practical steps to consider:

  1. Understand the training data: If your company builds or licenses AI tools, ask vendors for transparency around training data sources. Some companies now offer “clean” models trained only on licensed or public domain content.
  2. Document human input: For outputs your company creates using AI—like marketing content, software code, or product designs—keep records of human involvement. This can help establish copyright ownership and defend against infringement claims.
  3. Monitor outputs for similarities: Use plagiarism checkers, reverse image searches, and other tools to review AI-generated content for unintentional copying of protected works (a rough illustrative check appears after this list).
  4. Review contracts and terms: Ensure contracts with AI vendors address IP indemnity, data use rights, and output ownership. This is especially critical if AI outputs form part of your product or service offerings.
  5. Follow the litigation: Stay informed about key lawsuits and emerging guidance from courts and regulators. Legal interpretations are evolving, and decisions in the next year could redefine the landscape.
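
For step 3, here is a minimal sketch of what automated review might look like for text outputs: it compares an AI-generated draft against a folder of reference documents using word-shingle overlap. This is an illustration only, not a substitute for commercial plagiarism or reverse-image tools, and the file names, shingle size, and 10% threshold are all hypothetical choices made for this example.

    # Minimal sketch: flag AI-generated text that overlaps heavily with known
    # reference material. The paths, the 5-word shingle size, and the 10%
    # threshold are illustrative assumptions, not recommendations.
    from pathlib import Path


    def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
        """Split text into overlapping n-word shingles for comparison."""
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


    def overlap_ratio(candidate: str, reference: str, n: int = 5) -> float:
        """Fraction of the candidate's shingles that also appear in the reference."""
        cand = shingles(candidate, n)
        if not cand:
            return 0.0
        return len(cand & shingles(reference, n)) / len(cand)


    if __name__ == "__main__":
        draft = Path("ai_draft.txt").read_text()                  # hypothetical AI output
        for ref_file in Path("reference_corpus").glob("*.txt"):   # hypothetical corpus
            ratio = overlap_ratio(draft, ref_file.read_text())
            if ratio > 0.10:  # flag anything sharing over 10% of its shingles
                print(f"Review needed: {ref_file.name} ({ratio:.0%} shingle overlap)")

A high overlap score does not prove infringement, and a low one does not rule it out; it simply tells a reviewer where to look first.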


Where Is This Headed?

There is a real possibility that U.S. courts will eventually split on these issues, creating legal uncertainty for companies operating across jurisdictions. Some industry groups are calling for Congress to step in and clarify the rules, particularly around fair use in AI training.

Meanwhile, a new market for licensed training data is emerging. Companies like Shutterstock, Getty Images, and certain book publishers are striking deals to license their content for AI training, providing a middle ground between unrestricted scraping and total exclusion.

Long term, we may see a dual track emerge: one for general-purpose models trained on open or licensed data, and another for industry-specific models trained on fully vetted datasets. Either way, companies that proactively address copyright risks in their AI strategy will be better positioned for the regulatory shifts to come.

Comments

Lisa Rangel

Executive & Board Resume Writer endorsed by Recruiters | Ex-Executive Search | 200+ monthly LinkedIn Recos over 10 yrs | FreeExecJobSearchTraining.com | Recruiting AI Agent Company Board Member | Exec Job Landing Experts

So much still needs to be developed, and so much to keep up with!

Volodymyr Boretskyi

AI & Technology Lawyer | EU AI Act | Legal Compliance | Data & Privacy | 19+ yrs in Litigation & Business Law

One of the most complex areas at the intersection of AI and copyright. Companies need to have a clear understanding of how third-party data is used to train models, maintain transparent documentation, and be prepared to respond to inquiries about data sources. This aligns fully with the AI Act’s requirements on traceability and accountability. It is essential - and urgent - to start transforming risk management approaches in legal and compliance departments now (actually, yesterday 😄).

Andrea Davis

Vice President and State Counsel

Thank you for this. It is going to be an interesting few years while copyright law catches up. I always try to identify anything AI-generated so folks know what is human-generated (by me).

Jessie Brown, JD, PCC

Career Coach for Lawyers | Executive Coach | Former Big Law | Retreat & Workshop Facilitator | Forest Bathing Guide | Meditation Teacher

This is so interesting! Many of us use AI to improve or lightly revise our writing, almost as if AI is our co-author. I had never thought about copyright issues around co-created content like that, but it’s a super relevant issue.

Jenn Deal

Trademark Lawyer | Lawyer Well-being Advocate

There’s so much interesting stuff going on at the intersection of IP and AI.
