Google does not own the internet, says AI data lawsuit

Google, DeepMind, and parent company Alphabet have been accused of “secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans” to build their own AI chatbot, Bard.

Eight pseudonymous individuals – including two minors, aged 13 and 6 – are seeking to lead millions of netizens in a class action effort alleging 10 charges against Alphabet’s answer to OpenAI’s ChatGPT, among other items in Google’s “suite of AI products.”

The complaint was filed in a California federal court and alleges the company breached several state and US federal laws including the DMCA, California’s Unfair Competition law, and a state invasion of privacy rule. It also accuses Bard’s maker of larceny/receipt of stolen property.

The lawsuit was keen to note Google’s recent update of its privacy policy confirming it scrapes public data from the internet to train its AI models and services – including both Bard and its cloud-hosted products. The suit claims the move was made in response to the FTC’s warning that: “Machine learning is no excuse to break the law… The data you use to improve your algorithms must be lawfully collected… companies would do well to heed this lesson.”

The unnamed plaintiffs claim that the “update” of Google’s online privacy policy was effectively a doubling-down on its position.

The news comes hot on the heels of similar lawsuits accusing Microsoft-backed OpenAI of privacy breaches and misusing scraped data, as well as of copyright infringement.

Like the Microsoft and OpenAI class action filed in June, yesterday’s lawsuit also namechecks Reg articles – twice – when delving into the technical side of matters. The first time is to explain how Google’s immense public C4 dataset ingests material to build its next-gen machine learning systems. The second mention is to further its argument that Google profits from the data, footnoting our April coverage of the internal Google presentation titled “AI-powered ads 2023” outlining Google’s plan to roll out generative AI tools to its advertising platform.

DeepMind, still run by co-founder Demis Hassabis after being acquired by Google in 2014, is cited in the suit due to its work on developing the Language Model for Dialogue Applications (LaMDA), considered instrumental in Bard’s development as well as in other Google AI products.

The suit claims that Google’s moves breach both privacy rights and property rights.

The plaintiffs are looking for at least $5 billion, injunctive relief, and implementation of “effective cybersecurity safeguards” to protect the data subjects.

In a statement, Google general counsel Halimah DeLaine Prado said the company had been “clear for years that we use data from public sources – like information published to the open web and public datasets – to train the AI models behind services like Google Translate, responsibly and in line with our AI Principles.”

She added: “American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims.” ®
