'New York Times' considers legal action against OpenAI as copyright tensions swirl

NPR | By Bobby Allyn

Published August 16, 2023 at 4:53 PM CDT

OpenAI, maker of ChatGPT, has in recent weeks been hit with lawsuits from comedian Sarah Silverman, U.S. novelists, and others alleging copyright infringement. Now, the New York Times is weighing possible legal action.

The New York Times and OpenAI could end up in court.

Lawyers for the newspaper are exploring whether to sue OpenAI to protect the intellectual property rights associated with its reporting, according to two people with direct knowledge of the discussions.

For weeks, the Times and the maker of ChatGPT have been locked in tense negotiations over a licensing deal in which OpenAI would pay the Times for incorporating its stories in the tech company's AI tools. But the discussions have become so contentious that the paper is now considering legal action.

The individuals who confirmed the potential lawsuit requested anonymity because they were not authorized to speak publicly about the matter.

A lawsuit from the Times against OpenAI would set up what could be the most high-profile legal tussle yet over copyright protection in the age of generative AI.

A top concern for the Times is that ChatGPT is, in a sense, becoming a direct competitor with the paper by creating text that answers questions based on the original reporting and writing of the paper's staff.

It's a fear heightened by tech companies using generative AI tools in search engines. Microsoft, which has invested billions into OpenAI, is now powering its Bing search engine with ChatGPT.

If, when someone searches online, they are served a paragraph-long answer from an AI tool that refashions reporting from the Times, the need to visit the publisher's website is greatly diminished, said one person involved in the talks.

So-called large language models like ChatGPT have scraped vast parts of the internet to assemble data that inform how the chatbot responds to various inquiries. The data-mining is conducted without permission. Whether hoovering up this massive repository is legal remains an open question.

If OpenAI is found to have violated any copyrights in this process, federal law allows for the infringing articles to be destroyed at the end of the case.

In other words, if a federal judge finds that OpenAI illegally copied the Times' articles to train its AI model, the court could order the company to destroy ChatGPT's dataset, forcing the company to recreate it using only work that it is authorized to use.

Federal copyright law also carries stiff financial penalties, with violators facing statutory damages of up to $150,000 for each infringement "committed willfully."

"If you're copying millions of works, you can see how that becomes a number that becomes potentially fatal for a company," said Daniel Gervais, the co-director of the intellectual property program at Vanderbilt University who studies generative AI. "Copyright law is a sword that's going to hang over the heads of AI companies for several years unless they figure out how to negotiate a solution."

The Times' talks with OpenAI follow reports that the paper will not join other media organizations in attempting to negotiate with tech companies over use of content in AI models. A person at the Times said not participating is unrelated to any potential litigation against OpenAI, which declined to comment through a spokesperson.

While a spokesman for the Times would not comment, the paper's executives have publicly nodded at the tension.

In June, Times CEO Meredith Kopit Levien said at the Cannes Lions Festival that it is time for tech companies to pay their fair share for tapping the paper's vast archives.

"There must be fair value exchange for the content that's already been used, and the content that will continue to be used to train models," she said.

The same month, Alex Hardiman, the paper's chief product officer, and Sam Dolnick, a deputy managing editor, described in a memo to staff a new internal initiative designed to capture the potential benefits of artificial intelligence.

They cited "protecting our rights" among their chief fears: "How do we ensure that companies that use generative AI respect our intellectual property, brands, reader relationships and investments?"

A Times suit would join other copyright holders taking aim at AI companies

Any potential suit the Times files would join other similar legal actions leveled against OpenAI in recent weeks.

Comedian Sarah Silverman joined a class-action suit against the company, alleging that she never gave ChatGPT permission to ingest a digital version of her 2010 memoir The Bedwetter, which she says the company swallowed up from an illegal online "shadow library."

Other generative AI companies, like Stability AI, which distributes the image generator Stable Diffusion, have also been hit with copyright lawsuits.

Getty Images is suing Stability AI for allegedly training an AI model on more than 12 million Getty Images photos without authorization.

"Copyright holders see these instances as reckless, and AI companies see it as gutsy," Vanderbilt's Gervais said. "As always, the final answer will be determined by who ends up winning these lawsuits."

Legal experts say AI companies are likely to invoke a defense citing what is known as "fair use doctrine," which allows for the use of a work without permission in certain instances, including teaching, criticism, research and news reporting.

Key question for AI suits: Will 'fair use' apply?

There are two legal precedents that will likely play a part in the pending AI copyright disputes.

The first is a 2015 federal appeals court ruling that found that Google's digital scanning of millions of books for its Google Books library was a legally permissible "fair use," and not copyright infringement.

The court found that Google's digital library of books did not create a "significant market substitute" for the books, meaning it did not compete with the original works.

Legal experts say proving the same in the AI cases will be a major hurdle for OpenAI.

The second case expected to be relevant to the AI copyright suits is the Andy Warhol Foundation case the Supreme Court decided in May.

In it, the high court found that Andy Warhol was not protected by fair use doctrine when he altered a photograph of Prince taken by Lynn Goldsmith. Importantly, the court found that both Warhol and Goldsmith were selling the images to magazines.

Therefore, the court wrote, the original and the copied work shared "the same or highly similar purposes, or where wide dissemination of a secondary work would otherwise run the risk of substitution for the original or licensed derivatives of it."

Lawyers for the Times believe OpenAI's use of the paper's articles to spit out descriptions of news events should not be protected by fair use, arguing that it risks becoming something of a replacement for the paper's coverage.

NPR's David Folkenflik contributed to this report.