
Google Can Use Publisher Content Despite AI Opt-Out


Google Search products may use publishers' content even when those publishers have opted out of having it used to train artificial intelligence (AI) systems. The detail emerged in testimony by a Google DeepMind executive in the US Justice Department's ongoing antitrust case against Google, where the executive also stated that opted-out content is not incorporated into DeepMind’s AI models. Google has indicated that content used in Search is governed by a separate mechanism based on the robots.txt web standard.

Google’s Distinct Approach to AI Models and Search Products

A report by Bloomberg indicates that Eli Collins, Vice President of Product at Google DeepMind, confirmed the differing approaches towards publishers’ preferences for AI training between the DeepMind models and Google’s Search products.

During the proceedings, Diana Aguilar, an attorney for the Department of Justice, presented a document showing that approximately 80 billion of the 160 billion tokens used for training Google’s AI models came from publishers that had opted out of AI training. In response, Collins maintained that once a publisher opts out, DeepMind’s models do not use that content.

However, when Aguilar inquired whether the Gemini AI model could utilize this same content if integrated into Search, Collins affirmed that this is indeed permissible, provided the application remains within Search. This includes functionalities such as Google’s AI Overviews and the recently introduced AI Mode powered by Gemini models.

This situation indicates that conventional opt-out measures may be insufficient to prevent Google from leveraging publisher content. Following updates to its privacy policy in June 2023, Google specified its intent to utilize all publicly available Internet data for training its language models. The term “publicly available” pertains to any websites that do not restrict access through paywalls or mandatory sign-up processes.

A spokesperson for Google later clarified to Bloomberg that different rules govern Search-oriented AI tools: publishers can refuse the use of their data in Search AI only if they also opt out of being indexed for search. They can do so through the robots.txt standard, by adding a rule that blocks Google’s crawler bots from accessing their content for indexing purposes.

However, blocking the crawler would also prevent those web pages from appearing in Google’s search results, effectively leaving publishers little choice but to let Google use their content in its Search AI features.
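The trade-off described above can be illustrated with a small robots.txt check. This is a minimal sketch, not a description of how Google evaluates these files: it uses Python's standard `urllib.robotparser`, and it assumes the user-agent tokens `Google-Extended` (the token Google documents for AI training opt-outs) and `Googlebot` (the Search crawler), neither of which is named in the reporting. The URL and the helper name `crawl_permissions` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of AI training, stay in the search index.
# The two user-agent tokens are assumptions, not taken from the article.
AI_OPT_OUT_ONLY = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""

# Hypothetical robots.txt: additionally block the Search crawler.
FULL_OPT_OUT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Disallow: /
"""


def crawl_permissions(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each user-agent token, whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


AGENTS = ["Google-Extended", "Googlebot"]
URL = "https://example.com/article"  # placeholder URL

ai_only = crawl_permissions(AI_OPT_OUT_ONLY, URL, AGENTS)
full = crawl_permissions(FULL_OPT_OUT, URL, AGENTS)

print("AI opt-out only:", ai_only)
print("Full opt-out:   ", full)
```

Under the first file the Search crawler can still fetch the page, so, according to the testimony, the content remains available to Search features such as AI Overviews. Only the second file blocks the Search crawler, and it does so at the cost of removing the page from search results.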

The ongoing antitrust lawsuit aims to demonstrate that Google maintains a monopoly in both search and AI sectors. The Department of Justice is advocating for measures that would compel Google to divest Google Chrome and share the data used in search results. Notably, there have been no proposals concerning the company’s AI products.
