NPR News

'New York Times' sues ChatGPT creator OpenAI, Microsoft, for copyright infringement

NPR | By Bobby Allyn

Published December 27, 2023 at 1:47 PM EST

The New York Times filed a federal lawsuit against OpenAI and Microsoft seeking to end the practice of using its published material to train chatbots.

The New York Times sued OpenAI and its biggest backer, Microsoft, over copyright infringement on Wednesday, alleging the creator of ChatGPT used the newspaper's material without permission to train the massively popular chatbot.

In August, NPR reported that lawyers for OpenAI and the Times were engaged in tense licensing negotiations that had turned acrimonious, with the Times threatening to take legal action to protect against the unauthorized use of its stories, which were being used to generate ChatGPT answers in response to user questions.

And now, the newspaper has done just that.

OpenAI has said using news articles is "fair use"

In the suit, attorneys for the Times claimed it sought "fair value" in its talks with OpenAI over the use of its content, but the two sides could not reach an agreement.

OpenAI leaders have insisted that the company's mass scraping of large swaths of the internet, including articles from the Times, is protected under a legal doctrine known as "fair use."

It allows for material to be reused without permission in certain instances, including for research and teaching.

Courts have said fair use of a copyrighted work must generate something new that is "transformative," or that comments on or refers back to an original work. The Times argues that standard does not apply to how OpenAI reproduces the paper's original reporting.

"There is nothing 'transformative' about using The Times's content without payment to create products that substitute for The Times and steal audiences away from it," Times lawyers wrote in the suit on Wednesday.

Suit seeks damages over alleged unlawful copying

The suit seeks to hold OpenAI and Microsoft responsible for the "billions of dollars in statutory and actual damages that they owe for the unlawful copying and use" of the Times' articles. In addition, the Times' legal team is asking a court to order the destruction of all large language models and training datasets, including those behind ChatGPT, that rely on the publication's copyrighted works.

OpenAI and Microsoft did not respond to a request for comment.

Some news publishers have been leery about partnering with tech companies after becoming reliant on online traffic ushered in through search and social media, only to see Big Tech pivot away from distributing news in recent years. At the same time, the tech industry continued to pocket large sums of online advertising dollars, as the news industry struggled.

Media executives do not want to repeat the same pattern with AI, and the Times' legal battle with OpenAI could result in sweeping ramifications for the entire digital publishing industry.

The Times is the first major media organization to drag OpenAI to court over the thorny and still-unresolved question of whether artificial intelligence companies broke intellectual property law by training AI models with copyrighted material.

Over the past several months, OpenAI has tried to contain the conflict by striking licensing deals with publishers, including with the Associated Press and German media conglomerate Axel Springer, which publishes Business Insider and Politico.

The Times' suit joins a growing number of legal actions filed against OpenAI over copyright infringement. Writers, comedians, artists and others have filed complaints against the tech company, saying OpenAI's models illegally used their material without permission.

Another issue highlighted in the Times' suit is ChatGPT's tendency to "hallucinate," or produce information that sounds believable but is in fact completely fabricated.

Lawyers for the Times say that ChatGPT sometimes miscites the newspaper, claiming it reported things that were never reported, causing the paper "commercial and competitive injury."

These so-called "hallucinations" can be amplified to millions of users when tech companies incorporate chatbot answers in search engine results, as Microsoft is already doing with its Bing search engine.

Lawyers for the paper wrote in the suit: "Users who ask a search engine what The Times has written on a subject should be provided with neither an unauthorized copy nor an inaccurate forgery of a Times article."