Saturday, July 1, 2023

TorrentFreak's Latest News

 

Over 900 RARBG Magnet Link Repos Anonymously Nuked From GitHub
Andy Maxwell, 01 Jul 10:10 AM

The most dependable things in life tend to be the things most easily taken for granted. In the piracy ecosystem, that certainly applied to torrent site RARBG.

RARBG was never likely to win any prizes for being the best-looking site with bleeding-edge features. Nor would its operators hope to win any. What the site did was what any indexer of any content should strive for: plenty of well-organized and readily searchable content, all of it supported by ancillary sources of complementary data, with very little downtime and zero drama.

Until the site threw in the towel in May, RARBG met all of these requirements and made it look easy. The decision to shut down obviously came as a shock, but the complete lack of notice took everyone by surprise. There would be no closing down event, and no last few days to grab whatever people had taken for granted would always be available.

Backups Are Boring

As computing tasks go, backups are indeed pretty boring. The same can't be said about not having a backup when you absolutely need one. A split-second decision to back up The Pirate Bay in 2006 saved the site and, thanks to the silent work of archivists over several years, RARBG's massive magnet link database didn't die along with the site.

Following the RARBG shutdown, magnet link databases appeared on forums, file-hosting sites, even packaged up as torrents themselves. One early upload of more than 270,000 links appeared on GitHub and then took on a life of its own as contributors added to the database and created their own forks.

A subsequent readme file suggests that the archive later contained over six million magnet links, possibly one of the largest collections ever seen.

But just as RARBG gave no notice of its demise, these backups of backups also abruptly disappeared this week, removed by GitHub in response to a copyright complaint.

Entire Repositories Declared Infringing

Given the sheer volume of magnet links in the original repository, the chances of being hit with a DMCA takedown notice from one or more rightsholders from a pool of thousands were always relatively high. A mainstream rightsholder in the movie or music industry would've been a relatively safe prediction but would've been wide of the mark in this case.

"This letter is a Notice of Infringement as authorized in 512(c) of the U.S. Copyright Law," the DMCA notice reads.

"I am the copyright owner of the works and the following is true and accurate. I own full rights to these videos. I also have 2257's, ID's and model releases for them. I also have signed documents from the original producer(s) certifying I own them."

Unusually, the DMCA notice published by GitHub lists no original works whatsoever. Presumably, they were present in the original notice and for some reason GitHub decided to redact them. What we can deduce from the above is that the mention of '2257' is a reference to 18 U.S.C. § 2257 and the legal requirement to keep name and age verification records relating to performers in adult movies.

The DMCA notice initially lists four repositories (1,2,3,4) along with a request to remove them in their entirety.

"These entire repositories are infringing as they shares multiple links to content for which I own copyrights to. They also shares and provide ways to facilitate piracy of said content [sic]," the notice adds.

GitHub removed them but the end result was much more comprehensive.

It's Over 900!

In addition to the first four named repos, the complaint demanded the removal of forks. Lots and lots of forks.

A note from GitHub states that because the reported network containing the allegedly infringing content was larger than 100 repos, and the submitter alleged that the forks were "infringing to the same extent" as the parent repository, the notice was actioned against the entire network.

What began as a takedown of a network of 45 repositories ended up as a comprehensive takedown of 900 repositories, parent repository included.

Identity of DMCA Notice Sender: Unknown

In the interests of privacy, GitHub quite rightly redacts personal information from DMCA notices, but only rarely does it completely redact all information that would enable the identification of the sender. In this case, all information has indeed been redacted.

We can only speculate why GitHub made this decision, but one possibility is that the sender was acting in a personal capacity (rather than on behalf of a corporate entity). It's possible that they have a personal interest in the content beyond simply owning it, but those are the kind of details the redaction is designed to cloak.

The final question relates to the magnet links and the allegedly infringing content claimed to be connected to them. RARBG has been offline for more than a month, plenty long enough for there to be zero seeds or peers left in any number of swarms.

So, on one hand the magnet links referenced in the notice could be considered as facilitating infringement, if they still work. On the other, if the swarm has already died, those magnets are just strings of text and of use to no one. If anyone has a backup of the backup of the backup, over six million magnets need to be checked, just to be sure.

The DMCA takedown notice can be found here.

From: TF, for the latest news on copyright battles, piracy and more.

Authors Accuse OpenAI of Using Pirate Sites to Train ChatGPT
Ernesto Van der Sar, 30 Jun 06:21 PM

Generative AI models such as ChatGPT have captured the imagination of millions of people, offering a glimpse of what an AI-assisted future might look like.

The new technology also brings up novel copyright questions. Several rightsholders are worried that their work is being used to train AI without any form of compensation, for example.

How these and other copyright questions will be dealt with is not entirely clear. Governments around the world are taking different approaches, with the U.S. Congress recently stating that it doesn't plan to overreact. Meanwhile, rightsholders don't intend to stand idly by.

Authors Sue OpenAI for Copyright Infringement

This week, authors Paul Tremblay and Mona Awad filed a class action lawsuit against OpenAI, accusing ChatGPT's parent company of copyright infringement and violating the DMCA, among other things. According to the authors, ChatGPT was partly trained on their copyrighted works, without permission.

The proof for this claim is seemingly simple. The authors never gave OpenAI permission to use their works, yet ChatGPT can provide accurate summaries of their writings. This information must have come from somewhere.

"Indeed, when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs' copyrighted works—something only possible if ChatGPT was trained on Plaintiffs' copyrighted works," the complaint reads.

Pirate Training

While these types of claims are not new, this week's lawsuit alleges that OpenAI used pirate websites as training input. This potentially includes Z-Library, a shadow library of millions of pirated books that's at the center of a criminal prosecution by the U.S. Department of Justice.

OpenAI hasn't disclosed the datasets that ChatGPT is trained on, but an older paper references two databases: "Books1" and "Books2". The first contains roughly 63,000 titles and the second around 294,000 titles.

These numbers are meaningless in isolation. However, the authors note that OpenAI must have used pirated resources, as legitimate databases with that many books don't exist.

"The only 'internet-based books corpora' that have ever offered that much material are notorious 'shadow library' websites like Library Genesis (aka LibGen), Z-Library (aka Bok), Sci-Hub, and Bibliotik. The books aggregated by these websites have also been available in bulk via torrent systems."

Based on these data points, the complaint concludes that OpenAI committed copyright infringement. As compensation, the plaintiffs demand statutory damages, which can reach $150,000 per work. Additional damages for the alleged removal of copyright management information, in violation of the DMCA, are also on the table.

AI, Piracy and Copyright

There is no direct evidence that OpenAI used pirate sites to train ChatGPT. That said, it is no secret that some AI projects have trained on pirated material in the past, as an excellent summary from Search Engine Journal highlights.

The mainstream media has picked up this issue too. The Washington Post previously reported that the "C4 data set," which Google and Facebook used to train their AI models, included Z-Library and various other pirate sites.

"At least 27 other sites identified by the U.S. government as markets for piracy and counterfeits were present in the data set," the article added.

The present lawsuit will be closely watched by AI enthusiasts and rightsholders. It may result in OpenAI having to disclose some of its training data, which would be interesting in its own right.

Even if it transpires that ChatGPT was trained with pirated books, the court would still have to decide whether that amounted to copyright infringement. Some experts believe that this type of AI training can be considered fair use.

Fair use protects transformative uses of copyrighted works that don't compete with the original content. According to several experts, that defense could likely apply to AI training cases.

A copy of the complaint filed against OpenAI in the federal court for the Northern District of California is available here (pdf).

Egyptian Authorities Shut Down Movizland and Arrest Operator
Ernesto Van der Sar, 30 Jun 04:11 PM

The Alliance for Creativity and Entertainment (ACE) is the most active anti-piracy coalition, assisting enforcement efforts around the world.

The group is backed by prominent rightsholders such as Apple, BBC, beIN, Canal+, Disney, Sky, Netflix, and Warner Bros, as it systematically hunts down key piracy players.

Through new partnerships and connections, ACE expanded its work in the MENA region last year. This includes Egypt, where the coalition teamed up with local law enforcement to tackle several large streaming portals over the past months.

Movizland

This week, another casualty was added to this growing list. ACE announced that it assisted the authorities in taking down Movizland, a popular pirate streaming site that has been in business for over a decade.

As part of the enforcement action, the alleged owner was arrested. The site was originally launched in 2012 by an Egyptian national, who operated it out of the Egyptian capital Cairo.

Movizland provided access to a library of roughly 34,000 movies and series and had around 12 million monthly visitors, spread across several domain names.

The streaming site was popular in the Middle East, with Egypt being the top traffic source. In recent months, Movizland was among the top 100 most visited websites in the country.

Taking Out Major Players

ACE boss Jan van Voorn informs TorrentFreak that his organization brought this case to the attention of the authorities after it identified the operator. This person was located through an in-house investigation by ACE, where third-party subpoenas also proved useful.

The anti-piracy group regularly targets third-party intermediaries, including domain name registries and CDN provider Cloudflare, with subpoenas that request information on pirate sites. This occasionally leads to useful information.

The takedown of Movizland follows earlier successes against various sports streaming sites in Egypt, as well as other popular pirate streaming portals such as MyCima and Shahed4U.

"We are thankful for the continuous hard work of the Egyptian authorities to address these criminal networks," Van Voorn tells TorrentFreak, adding that there is "more to come."

Persistent Pirates

These enforcement efforts undoubtedly have an effect, and not just on the operators. Based on the responses on social media, many people are disappointed to see their favorite pirate streaming site offline.

Whether they will do much to eliminate piracy in the region is another question. For now, we have seen that pirate brands such as Egybest, MyCima, Shahed4U, Yalla-Shoot and Yallakora live on, presumably under different operators.

These alternatives typically start as clones and copycats but can eventually become just as popular as the original.

ACE hopes, however, that by continuing to apply pressure, most pirate site operators will eventually give up. That may work, but there's plenty of work left to do; despite the takedowns, Egypt's top 100 most visited websites list still features dubious streaming sites.
