Meta's New Image-Generating AI Is Trained on Your Instagram and Facebook Posts

Cashing In

Earlier this week, Meta announced a new AI image generator dubbed "Imagine with Meta AI."

And while it may seem like an otherwise conventional tool meant to compete with the likes of OpenAI's DALL-E 3, Stable Diffusion, and Midjourney, Meta's underlying "Emu" image-synthesis model has a dirty little secret.

What's that? Well, as Ars Technica points out, the social media company trained it using a whopping 1.1 billion Instagram and Facebook photos, per the company's official documentation — the latest example of Meta squeezing every last drop out of its user base and its ever-valuable data.

In many ways, it's a data privacy nightmare waiting to unfold. While Meta claims to only have used photos that were set to "public," it's likely only a matter of time until somebody finds a way to abuse the system. After all, Meta's track record is abysmal when it comes to ensuring its users' privacy, to say the least.

So Creative

Meta is selling its latest tool, which was made available exclusively in the US this week, as a "fun and creative" way to generate "content in chats."

"This standalone experience for creative hobbyists lets you create images with technology from Emu, our image foundation model," the company's announcement reads. "While our messaging experience is designed for more playful, back-and-forth interactions, you can now create free images on the web, too."

Meta's Emu model uses a process called "quality-tuning" to compare the "aesthetic alignment" of comparable images, setting it apart from the competition, as Ars notes.

Other than that, the tool is exactly what you'd expect. With a simple prompt, it can spit out four photorealistic images of skateboarding teddy bears or an elephant walking out of a fridge, which can then be shared on Instagram or Facebook — where, perhaps, they'll be scraped by the next AI.

Earlier this year, Meta's president for global affairs Nick Clegg told Reuters that the company has been crawling through Facebook and Instagram posts to train its Meta AI virtual assistant as well as its Emu image model.

At the time, Clegg claimed that Meta was excluding private messages and posts, and was avoiding public datasets with a "heavy preponderance of personal information."

Rather than triggering an immediate outcry and lawsuits over possible copyright infringement, as its competitors have, the social media company can crawl its own datasets, which come courtesy of its users and its expansive terms of service.

But relying on Instagram selfies and Facebook family albums comes with its own inherent risks, which may well come back to haunt the Mark Zuckerberg-led social giant.

