Wednesday, August 16, 2023

TorrentFreak's Latest News

 

Anti-Piracy Group Takes Prominent AI Training Dataset "Books3" Offline
Ernesto Van der Sar, 16 Aug 11:05 AM

Generative AI models such as ChatGPT have captured the imaginations of millions of people, offering a glimpse of what an AI-assisted future might look like.

There is little doubt that generative AI will lead to new breakthroughs, some with the potential to revolutionize many aspects of day-to-day life. At the same time, AI is causing grave concerns within the copyright industries.

The copyright angle is the topic of many debates and has already made its way to court in a few cases. It's high on the agendas of governments around the world, which are poised to accommodate generative AI within copyright legislation.

While lawyers and lawmakers are working hard to explore this novel area, anti-piracy agencies are taking concrete action. A few weeks ago we reported that the RIAA had taken down datasets used to create voice models, for example.

Books3 AI Training Database

This week, Rights Alliance entered the arena with one of the most high-profile takedowns thus far. The Danish anti-piracy outfit sent a DMCA takedown notice to The Eye, targeting the "Books3" training dataset.

Books3 doesn't sound as exciting as 'The Lord of the Rings' or 'A Song of Ice and Fire', but these titles are likely included in the plaintext collection of 196,640 books, which is nearly 37GB in size.

The dataset, which contains all books from the pirate site Bibliotik, was first published on The Eye in late 2020 and since then has been used to train several AI models, including Meta's.

Initial 'release' in 2020


The notion that AI models are trained on pirated books isn't new. According to a recent lawsuit, which also mentions Books3, OpenAI also used books datasets that rightsholders believe were sourced from shadow libraries such as LibGen, Z-Library and Sci-Hub.

Anti-Piracy Group Targets Books3

The Eye managed to keep the Books3 database online for years but recently removed the archive following Rights Alliance's takedown notice.

The anti-piracy group acted on behalf of Danish book publishers whose works were featured in the database. They see this as an important step to limit access to unauthorized AI training materials, which can be exploited by commercial AI initiatives.

"It is absolutely crucial that we can prevent AI from being trained on illegal content," Rights Alliance Director Maria Fredenslund says, commenting on the takedown.

"We have a big task ahead of us in detecting and taking down illegal training datasets like Books3, but also in dealing with AI that has already been trained on illegal content and is now spreading on the internet."

Rights Alliance stresses that it should be up to rightsholders to control how their works are used, so the crackdown on unauthorized datasets will continue.

Books3 is Down, But not Everywhere

While the original and most widely circulated Books3 download link is offline now, the dataset hasn't completely disappeared from the web. The file is still backed up by the Internet Archive's Wayback Machine and alternative download links are also being shared.

Shawn Presser, who first shared the Books3 dataset on X years ago, points out that it is still available elsewhere. For example, Books3 is part of 'The Pile', an AI training dataset compiled by EleutherAI. A torrent for this dataset is still hosted on The Eye at the time of writing.

August 2023 Update…


In addition, the Books3 dataset is also available from direct download sources. In this sense, it's not much different from traditional pirated books and movies, which are hard to take down permanently.

This shows that AI doesn't just promise new technological breakthroughs; it also adds a new task to the roster of anti-piracy groups.

From: TF, for the latest news on copyright battles, piracy and more.

Court Orders SportsBay to Pay Almost Half a Billion Dollars For Violating DMCA
Andy Maxwell, 15 Aug 08:47 PM

In July 2021, U.S. broadcaster DISH Network and subsidiary Sling TV filed a copyright lawsuit in a Texas district court against the unknown operators of four websites – SportsBay.org, SportsBay.tv, Live-NBA.stream, and Freefeds.com.

The complaint alleged that the unknown defendants circumvented (and provided technologies and services that circumvented) security measures employed by Sling and thereby provided "DISH's television programming" to users of their websites.

According to the complaint, the defendants circumvented technological measures contrary to 17 U.S.C. § 1201(a)(1)(A), and trafficked in circumvention technology and services contrary to 17 U.S.C. § 1201(a)(2) through their operation of the websites. The plaintiffs requested a permanent injunction, control of the defendants' domains, and damages of up to $2,500 for each violation of the DMCA's anti-circumvention provisions.

In early September 2021, District Judge Charles Eskridge granted DISH's request to start serving subpoenas on third-party service providers including Namecheap and WhoisGuard, Tucows, Cloudflare, DigitalOcean, Google, Facebook and Twitter, with the aim of identifying the still-unknown operators of the sites. In the same month, the sites listed in the complaint disappeared.

DISH Names Defendants in Argentina

According to DISH's first amended complaint filed in January 2022, information obtained from the third-party service providers enabled the company to identify two men responsible for operating the SportsBay sites.

Juan Barcan, an individual residing in Buenos Aires, Argentina, used his PayPal account to make payments to Namecheap and GitHub. Juan Nahuel Pereyra, also of Buenos Aires, used his PayPal account to make payments to Namecheap.

On January 20, 2022, DISH sent a request to the Argentine Central Authority to serve Barcan and Pereyra under the Hague Convention.

On October 31, 2022, the Central Authority informed DISH that Pereyra was served in Buenos Aires on September 14, 2022. Barcan was not served, so after obtaining permission from the court, DISH served him via a Gmail address used to make payments to Namecheap for the Sportsbay.org, Live-nba.stream, and Freefeds.com domain names.

When the defendants failed to appear, DISH sought default judgment. As part of that process, the plaintiffs provided the following information to describe the functions of the four websites listed in the complaint.

When Defendants and users selected or clicked on a channel on Sportsbay.org or Sportsbay.tv, the websites connected to Defendants' Freefeds.com website by embedding in an iframe content originating from a Freefeds.com Uniform Resource Locator. The Freefeds.com iframe then accessed the encrypted Sling programing originating from Sling's computer server and delivered it to the embedded iframe on the Sportsbay.org and Sportsbay.tv websites.

The Freefeds.com iframe then connected to Defendants' Live-nba.stream server in order to obtain the DRM keys necessary to decrypt the Sling programming so that it was displayed to Defendants and users on the Sportsbay.org and Sportsbay.tv websites.

DISH informed the court that each time a user accessed Sling programming from the links on the websites, "a connection was made with Live-nba.stream to obtain encryption keys to decrypt Sling's transmission."

By framing each visit to the Live-nba.stream website as a circumvention violation under section 1201(a)(2) of the DMCA, and nominating a six-month period where that domain reportedly received 2,469,250 visits from users in the United States, DISH arrived at a "reasonable and conservative claim" based on minimum statutory damages of just $200 for each violation.

"The Court should grant Plaintiffs' Motion, award statutory damages in the amount of $493,850,000 for Defendants' 2,469,250 violations of section 1201(a)(2) of the DMCA, and enter a permanent injunction barring Defendants from further violations," DISH informed the court.
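The arithmetic behind the claim can be checked with a short sketch. It assumes, as DISH's filing does, that each of the 2,469,250 U.S. visits counts as one violation; the function and constant names are illustrative and do not come from the court documents. The $2,500 figure is the per-violation upper bound mentioned in the original complaint.

```python
def statutory_damages(violations: int, per_violation_usd: int) -> int:
    """Total statutory damages for a flat per-violation amount (illustrative helper)."""
    return violations * per_violation_usd


# Figures reported in the article; one visit is assumed to equal one violation.
US_VISITS = 2_469_250
MIN_PER_VIOLATION_USD = 200    # minimum DISH used for its "conservative" claim
MAX_PER_VIOLATION_USD = 2_500  # upper bound requested in the original complaint

claimed = statutory_damages(US_VISITS, MIN_PER_VIOLATION_USD)
ceiling = statutory_damages(US_VISITS, MAX_PER_VIOLATION_USD)

print(f"Claimed at minimum rate: ${claimed:,}")
print(f"Ceiling at maximum rate: ${ceiling:,}")
```

At the $200 minimum the total matches the $493,850,000 DISH requested; at the $2,500 maximum the same visit count would exceed $6.1 billion.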

DISH Awarded Nearly Half a Billion in Damages

In his order handed down yesterday, District Judge Charles Eskridge entered a default judgment against Juan Barcan and Juan Nahuel Pereyra for violations of the DMCA's anti-circumvention provisions.

The defendants and anyone acting in concert with them are permanently enjoined from circumventing any technological protection measure that controls access to Sling or DISH programming, including through the use of websites or any similar internet streaming service. Then comes the award for damages.

"Plaintiffs are awarded $493,850,000 in statutory damages against Defendants, jointly and severally, for Defendants' 2,469,250 violations of section 1201(a)(2) of the DMCA," the order reads.


The order can be found here (pdf)




 
 