Wednesday, May 1, 2024

TorrentFreak's Latest News

 

Newspapers Sue OpenAI for Copyright Infringement and 'Fake News' Hallucinations
Ernesto Van der Sar, 01 May 12:26 PM

Since last year, various rightsholders have filed lawsuits against companies that develop AI models.

The list of complainants includes record labels, book authors, visual artists, a chip maker, and news publications. These rightsholders all object to the presumed use of their work without proper compensation.

Keeping pace with the constant stream of legal paperwork is a challenge, but a complaint filed at a New York federal court yesterday deserves to be highlighted. In this case, eight major news publications are suing OpenAI and Microsoft for copyright infringement.

U.S. Newspapers Sue OpenAI and Microsoft

The New York Daily News, Chicago Tribune, Orlando Sentinel, Sun-Sentinel, Mercury News, Denver Post, Pioneer Press, and Orange County Register claim that the AI companies used their publications to train and develop ChatGPT models without obtaining permission.

In addition, ChatGPT can recall large parts of their copyright-protected articles, which effectively bypasses their paywalls. This has a direct effect on the newspapers' revenues, they argue.

"Defendants are taking the Publishers' work with impunity and are using the Publishers' journalism to create GenAI products that undermine the Publishers' core businesses by retransmitting 'their content'—in some cases verbatim from the Publishers' paywalled websites—to their readers."

Training On and Reproducing Copyrighted Articles

The complaint alleges that the newspapers' articles are prominent parts of the training material for OpenAI's models. GPT-3, for example, has 175 billion parameters and was trained on datasets including 'WebText2' and 'Common Crawl', both of which contain material owned by the plaintiffs.

This alleged unauthorized use remains ongoing, the newspapers claim, and it will likely continue in the future.

"On information and belief, Microsoft and OpenAI are currently or will imminently commence making additional copies of the Publishers' Works to train and/or fine-tune the next generation GPT-5 LLM," the complaint adds.

The plaintiffs show that ChatGPT can reproduce content from copyrighted news articles when prompted. In addition, third-party services in the OpenAI store are specifically marketed to bypass their paywalls, they say.

These tools include a custom GPT called "Remove Paywall" and a tool such as "News Summarizer", which promises to "save on subscription costs" and "skip paywalls just using the link text or URL."


OpenAI and Microsoft have previously argued that the use of copyrighted works to train their models falls under fair use. In addition, they called out the lack of specific copyright infringements by third parties.

This lawsuit is likely to trigger similar defenses, but copyright infringement allegations are just part of the newspapers' complaint.

'Fake News Hallucinations'

The newspapers are not only concerned by the unauthorized use of their works; they also allege that the AI tools cause commercial and competitive injury by spreading false claims.

The plaintiffs cite various examples where ChatGPT allegedly links dubious news reporting to their newspapers.

"As if plagiarizing the Publishers' work were not enough, Defendants' products are often subject to 'hallucinations' where those products malign the Publishers' credibility by falsely attributing inaccurate reporting to the Publishers' newspapers.

"Beyond just profiting from the theft of the Publishers' content, Defendants are actively tarnishing the newspapers' reputations and spreading dangerous disinformation."

One example is the spurious claim that disinfectants can cure Covid. While many newspapers reported on these claims, they didn't endorse them.


These hallucinations dilute and injure the reputation of the newspapers, the complaint alleges. This claim comes on top of the various copyright infringement accusations for which they request compensation.

Ultimately, the newspapers are not against Artificial Intelligence, but they do want OpenAI and Microsoft to pay for the content they use and, ideally, ensure that their reputations are not harmed in the process.

"This lawsuit is about how Microsoft and OpenAI are not entitled to use copyrighted newspaper content to build their new trillion-dollar enterprises, without paying for that content.

"As this lawsuit will demonstrate, Defendants must both obtain the Publishers' consent to use their content and pay fair value for such use," the newspapers conclude.

A copy of the complaint, filed by the newspapers at the U.S. District Court for the Southern District of New York, is available here (pdf)

From: TF, for the latest news on copyright battles, piracy and more.

CJEU Gives File-Sharer Surveillance & Data Retention a Green Light
Andy Maxwell, 30 Apr 09:13 PM

As part of an anti-piracy scheme featuring warning letters, fines, and ISP disconnections, France has monitored and stored data on millions of internet users since 2010.

Digital rights groups insist that as a general surveillance and data retention scheme, the 'Hadopi' program violates fundamental rights.

Any program that monitors citizens' internet activities, retains huge amounts of data, and then links identities to IP addresses, must comply with EU rules. Activists said that under EU law, only "serious crime" qualifies and since petty file-sharing fails to make the grade, the whole program represents a mass violation of EU citizens' fundamental rights.

Surveillance and Serious Crime

Seeking confirmation at the highest level, La Quadrature du Net, Federation of Associative Internet Service Providers, French Data Network, and Franciliens.net began their challenge in France. The Council of State referred the matter to the Constitutional Council, which in turn referred questions to the Court of Justice of the European Union (CJEU) for interpretation under EU law.

EU member states may not pass national laws that allow for the general and indiscriminate retention of traffic and location data. Retention of traffic and location data is permitted on a targeted basis as a "preventative measure" but only when the purpose of retention is to fight "serious crime."

In his non-binding opinion, CJEU Advocate General Szpunar described Hadopi's access to personal data corresponding to an IP address as a "serious interference with fundamental rights," the clearest sign yet that the right to privacy had already taken a blow.

CJEU judgments have balanced citizens' rights and rightsholders' right to copy many times over the years but here, case law was deemed potentially problematic. So much so, in fact, that AG Szpunar proposed "readjustment of the case-law of the Court" to ensure that rightsholders would not be left in a position where it was impossible to enforce their rights on BitTorrent and similar networks.

EU Law Shouldn't Rule Surveillance Out

By last September, it was clear that a legal basis needed to be found to allow Hadopi and similar programs to continue. For example, the fluid nature of dynamic IP addresses was mentioned as an obstacle to comprehensive tracking.

Well-constructed arguments stated that balance could be found in securing the harvested data and, to protect fundamental rights, in limiting how much data could be used in the event an alleged file-sharer was prosecuted.

Ultimately, however, when infringement occurs exclusively online, an IP address may be the only means to track down an alleged infringer, leading to the conclusion that retention and access to civil identifying data is both "necessary" and "wholly proportionate."

Copyrights Trump Privacy Rights

In its decision handed down Tuesday, initially only in French, the CJEU leaves no stone unturned in delivering a win for rightsholders. Despite the problematic case law, the judgment builds a framework for how monitoring and data retention can be conducted within the requirements of EU law.

The judgment deals with three key questions, summarized as follows:

1. Is civil identity data corresponding to an IP address included among the traffic and location data which, in principle, requires prior review by a court or administrative entity?

2. If yes, is EU law to be interpreted as precluding national legislation that provides for the collection of such data, corresponding to users' IP addresses, without prior review by a court or administrative entity?

3. If yes, does EU law preclude the review from being performed in an adapted fashion, for example as an automated review?

In other words, are member states precluded from having a national law that authorizes a copyright authority to access stored IP addresses and civil identity data relating to users, collected by rightsholders monitoring their activities on the internet, for the purpose of taking further action, without a review by a court or administrative body?

The data collected includes the date and time of the alleged infringement, IP address, peer-to-peer protocol, user pseudonym, details of the copyright works, filename, and ISP name.

Ensuring Privacy and Data Security

The judgment notes that IP addresses can constitute both traffic data and personal data. However, IP addresses that are public and visible, as they are in file-sharing swarms, are not being used in connection with the provision of an 'electronic communication service'.

The judgment also states that, if Member States seek to impose "an obligation to retain IP addresses in a general and indiscriminate manner, in order to attain an objective linked to combating criminal offenses in general", they should lay down clear and precise rules in legislation relating to retention of data, meeting strict requirements.

IP and civil identity data must be separated from each other and all other data, in a secure and reliable computer system. When IP addresses and civil data need to be linked, a process that does not undermine the "watertight separation" should be used, and regularly inspected for effectiveness. When these rules are followed, even citizens' data gathered indiscriminately cannot result in "serious interference" to fundamental rights.

The judgment notes that EU law does not "preclude the Member State concerned from imposing an obligation to retain IP addresses, in a general and indiscriminate manner, for the purposes of combating criminal offenses in general."

Balancing Competing Rights

The CJEU says that while EU citizens using internet services "must have a guarantee that their privacy and freedom of expression" will be respected, those fundamental rights are not absolute. The prevention of crime or the protection of the rights and freedoms of others may see those rights deemed less important.

Then, with some fluidity, the CJEU pulls the rug on excuses and upgrades petty file-sharing to something, well, a bit more serious.

To prevent crime, it may be strictly necessary and proportional for IP addresses to be captured and retained for "combating criminal offenses such as offenses infringing copyright or related rights committed online."

Indeed, not allowing the above "would carry a real risk of systemic impunity not only for criminal offenses infringing copyright or related rights, but also for other types of criminal offenses committed online or the commission or preparation of which is facilitated by the specific characteristics of the internet."

Pirate Privacy? Not Here

The judgment adds that despite the strict security guarding private information, there's always a chance that a person might find themselves profiled. And that, the court suggests, may be of their own making.

[S]uch a risk to privacy may arise, inter alia, where a person engages in activities infringing copyright or related rights on peer-to-peer networks repeatedly, or on a large scale, in connection with protected works of particular types that can be grouped together on the basis of the words in their title, revealing potentially sensitive information about aspects of that person's private life.

Thus, in the present case, in the context of the graduated response administrative procedure, a holder of an IP address may be particularly exposed to such a risk to his or her privacy where that procedure reaches the stage at which Hadopi must decide whether or not to refer the matter to the public prosecution service with a view to the prosecution of that person for conduct liable to constitute the minor offense of gross negligence or the offense of counterfeiting.

Throughout the course of the next few paragraphs, the judgment mentions processing data for the "prevention, investigation, detection or prosecution of criminal offenses," and a quote from the French government stating that "the measures adopted by Hadopi in the context of the graduated response procedure 'are of a pre-criminal nature directly linked to the judicial proceedings'."

That leads to the predictable conclusion that EU law does not preclude national legislation that allows for the surveillance of internet users and the retention of their data, for the purpose of identifying users and taking legal action against them.

Member states just need to follow the rules to ensure that those who didn't have their privacy breached when their data was collected don't have it breached or leaked as they wait for whatever punishment arrives in the mail.

La Quadrature du Net says it's disappointed with the judgment.

"[T]his decision from the CJEU has, above all, validated the end of online anonymity. While in 2020 it stated that there was a right to online anonymity enshrined in the ePrivacy Directive, it is now abandoning it.

"Unfortunately, by giving the police broad access to the civil identity associated with an IP address and to the content of a communication, it puts a de facto end to online anonymity."

The judgment is available here


Rightsholders Want U.S. "Know Your Customer" Proposal to Include Domain Name Services
Ernesto Van der Sar, 30 Apr 02:12 PM

Many people see optional anonymity as a key feature of the Internet, but increasingly there are calls for stricter identity checks.

While a full-blown 'Internet passport' is not required anywhere just yet, 'know your customer' requirements are increasingly common as a means to deter fraud and abuse.

"Know Your Customer" Rules

In recent years, copyright holders and industry groups have argued to expand these identity verification rules to tackle the online piracy problem. These efforts have begun to pay off in Europe and, over in the United States, similar calls are heard.

In 2021, President Donald Trump signed an executive order aiming to stop foreign cybercriminals from using US-based Infrastructure as a Service (IaaS) services. This kicked off a proposed rulemaking process to require advanced services to implement "Know Your Customer" regimes.

Last year, this proposal was followed up by an executive order from President Biden, with an added focus on potential abuses of online services to train Artificial Intelligence models. If adopted, the rules would put an end to anonymity for users of online cloud services.

Before taking the matter forward, the Department of Commerce asked the public for input on its plans which resulted in some noteworthy responses.

Rightsholder Coalition Chimes In

The Coalition for Online Accountability filed a response yesterday. While not generally known to the wider public, the coalition's members are seven well-known copyright industry players: the RIAA, MPA, ESA, Broadcast Music, Disney, Warner Bros, and NBCUniversal.

Given the makeup of the coalition, it doesn't come as a surprise that its submission has a strong focus on piracy.

"There is no doubt that the motion picture, music, and video game industries have long suffered from widespread online piracy and other abuses," the coalition writes.

The proposed rule could help to tackle the piracy problem. After all, pirate sites use cloud hosting and other IaaS services. Making it easier to identify the owners would greatly help to hold them accountable. However, the coalition sees a major shortcoming too, as the proposal doesn't include domain name services.

The proposal is very clear about this exclusion. Since domain name registrars and registries don't host any content, they fall outside its scope.

"It does not, however, capture domain name registration services for which a consumer registers a specific domain name with a third party, as that third party does not provide any processing, storage, network, or other fundamental computing resource to the consumer."

'U.S. Domain Name Services Should be Included'

The Coalition urges the Department of Commerce to reconsider its position. According to the rightsholders, domain name registrars must be included in the IaaS category, as they are broadly abused by pirate sites and services, with little recourse.

"Currently, many domain name registrars turn a blind eye on the rampant domain name abuse practices. They provide the means and instrumentalities for impersonation making no effort to collect true and correct data about their clients," the coalition writes.

To address this, the Coalition for Online Accountability proposes two main changes to the proposed rules.

The first one states that U.S. domain name service providers, including Verisign and GoDaddy, should be classified as IaaS providers. In addition, domain registries must ensure that the identifying information they collect is accurate.

"[I]t is important that all U.S. domain name registries be required by the forthcoming regulations to maintain complete and accurate databases of the identity and contact information of all registrants for the domain names that such registries administer," the coalition writes.


U.S. domain registries such as Verisign and PIR already require customers to supply accurate information, but pirate sites typically don't do so. In addition, it's not always easy for rightsholders to access shielded Whois information.

The coalition proposes to make domain name registration directories openly accessible, free of charge. Many domain name services shielded this data for privacy reasons when the EU adopted the GDPR, but the rightsholders would like to go back to the old system, where all information is public.

If registrars use proxy or anonymizer services, these too should be required to disclose a domain name's owner in response to good faith claims of abuse.

Suspend Pirate Domain Names

As the cherry on top, the coalition goes beyond the "Know Your Customer" framework by proposing that domain name services should also take enforcement actions if "Trusted Notifiers" flag a domain for abuse.

This means that U.S. domain registrars and registries should suspend or disable a domain that is reported by a 'trusted' rightsholder representative.

"The domain name registrar, registry, privacy service, proxy service, or other domain name registration authority must disable, disrupt, or suspend any domain names used for domain name abuse within 48 hours of receipt of a notice submitted in good faith from a Trusted Notifier."


This type of scheme is not unprecedented. The MPA and RIAA already have trusted notifier status at several online intermediaries. These include the domain registries Identity Digital and Radix, which regularly take action against piracy-related domains.

Whether the Department of Commerce is open to broadening the scope of its proposal remains to be seen. Rightsholders previously argued for similar expansions during the earlier inquiry, which didn't lead to the inclusion of domain name services.

A copy of the comments submitted by the Coalition for Online Accountability is available here (pdf)

