The Face Images Used to Train AI Are Taken Without Consent

IBM recently published a dataset for facial recognition AI made up of images that it scraped off of Flickr without asking for permission. — *Image: nikolapeskova via Pixabay/Tag Hartman-Simkins*

Move Fast and Break Things

The countless pictures of people used to train facial recognition artificial intelligence systems are often taken without consent.

Most notably, IBM downloaded and used almost 1 million pictures from public Flickr accounts without notifying the pictures’ photographers or subjects, according to NBC News.

“This is the dirty little secret of AI training sets,” NYU law professor Jason Schultz told NBC. “Researchers often just grab whatever images are available in the wild.”

Positive Twist

When IBM announced its new dataset, in which each picture was annotated with measurements and descriptions of the people shown, the company framed it as a way to make facial recognition systems fairer. That’s because before training AI on user-generated images became common, many algorithms learned only from limited datasets, such as celebrity images taken from the web.

But the reality isn’t that rosy. Particularly concerning is that images of people’s faces are being used without their knowledge or consent to train algorithms that may then be used to surveil and monitor them. For example, The Intercept reported that IBM sold the NYPD surveillance systems that let cops scan CCTV footage and filter by people’s race.

“People gave their consent to sharing their photos in a different internet ecosystem,” Meredith Whittaker, co-director of the AI Now Institute, told NBC. “Now they are being unwillingly or unknowingly cast in the training of systems that could potentially be used in oppressive ways against their communities.”

Claiming Dibs

IBM claims it will accept people’s requests to take certain images down, but the company doesn’t make it easy. It only sends the dataset to academic and corporate research teams, so there’s no way that someone would even know their picture had been used, according to NBC.

NBC reports that IBM made one photographer jump through hoops to get his photos removed, and deleted only four of the more than 1,000 of his photos included in the set.

“This is the type of mass collection and use of biometric data that can be easily abused,” class-action lawyer Jay Edelson told NBC, “and appears to be taking place without the knowledge of those in the photos.”
