Google hit with lawsuit over data used to train its AI products

Tech giant Google has been hit with a lawsuit this week alleging that it used data from millions of users online without their consent and that it violated copyright laws so that it could train and develop its artificial intelligence systems, according to legal documents seen by MARKETING-INTERACTIVE.

"It has very recently come to light that Google has been secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans. Google has taken all our personal and professional information, our creative and copywritten works, our photographs, and even our emails—virtually the entirety of our digital footprint—and is using it to build commercial artificial intelligence (AI) products like Bard, the chatbot Google recently released to compete with OpenAI’s ChatGPT," it read, adding:

"For years, Google harvested this data in secret, without notice or consent from anyone."

The lawsuit alleged that, as part of its theft of personal data, Google illegally accessed restricted, subscription-based websites to take the content of millions without permission and infringed on at least 200 million materials explicitly protected by copyright, including previously stolen property from websites known for pirated collections of books and other creative works.

"This mass theft of personal information has stunned internet users around the world," it said. It added that Google is not the only "bad actor" in the new AI economy though. Rather, it noted that the entire tech industry is looking to "vacuum up as much data as they can find" in the AI race.

Personal data of every kind, especially conversational data between humans, is critical to the AI training process, it added. "This is how products like Bard develop human-like communication capabilities. Creative and expressive works are just as valuable because that is how AI products learn to 'create' art," the lawsuit said.

However, the lawsuit alleged that despite warnings by governing bodies not to break the law in their pursuit of machine learning, Google decided to "quietly 'update' its online privacy policy" last week to double down on its position that "everything on internet is fair game for the company to take for private gain and commercial use".

It said:

"It was the company’s first public acknowledgement of what it had been doing in secret for years: scraping the entire internet to take anything it could."

Without this mass theft of private and copyrighted information belonging to real people, communicated to unique communities for specific purposes, and targeting specific audiences, many of Google’s AI products including Bard would not exist, it said.

Google’s "sudden" notice and admission regarding its scraping practices comes three days after OpenAI was sued for theft and commercial misappropriation of personal data on the internet as part of its own massive “scraping” operation, also done in secret, without notice or consent from anyone whose personal information was taken.

According to legal documents, Google has since responded to the backlash by inviting the world to engage in a "dialogue" about what data collection and protection efforts should look like in the new era of AI. However, the lawsuit classified this as "too little too late".

The lawsuit comes shortly after it was revealed that Google may have misled advertisers, including Fortune 500 brands, the US federal government, and many small businesses, with regard to its ad viewership, according to a new report by advertising research organisation Adalytics.

The discrepancy likely stemmed from Google’s proprietary TrueView skippable in-stream video ads. Adalytics estimates that these cost media buyers up to "billions of digital ad dollars", which were ultimately spent on "small, muted, out-stream, auto-playing or interstitial video ad units running on independent websites and mobile apps".

TrueView is Google's proprietary cost-per-view ad format that is displayed on YouTube, apps, and across the web. With TrueView, advertisers pay only when people actually view their ads, rather than per impression. That said, TrueView gives users the option to skip the video ad they are watching after five seconds.

Adalytics said it found that many advertisers who were paying for TrueView ads outside YouTube were not getting what they paid for and, additionally, were not getting a consumer experience that meets Google's stated quality standards.

"For example, many TrueView in-stream ads were served muted and auto-playing as out-stream video or as obscured video players on independent sites. Often, there was little to no organic video media content between ads, the video units simply played ads only," said Adalytics.
