I remember checking like a year ago and they still had the word "gorilla" blacklisted (i.e. it never returns anything even if you have gorilla images).
Gotta love such a high quality fix. When your uber high tech, state-of-the-art algorithm learns racist patterns, just blocklist the word and move on. Don't worry about why it learned such patterns in the first place.
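In other words, probably something as blunt as this. (A made-up sketch of a post-hoc term blocklist, obviously not their actual code; the label set and function names are hypothetical.)

```python
# Hypothetical sketch of the "fix" being described: a hard-coded term blocklist
# applied after the classifier runs, with no change to the model itself.
BLOCKED_LABELS = {"gorilla"}  # assumed label set, not any real product's list

def search_photos(query, photo_labels):
    """Return photo ids whose predicted labels match the query,
    unless the query term is blocklisted outright."""
    if query.lower() in BLOCKED_LABELS:
        return []  # pretend nothing matched, regardless of what the model predicted
    return [photo_id for photo_id, labels in photo_labels.items()
            if query.lower() in labels]
```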
But this is not an algorithm. It's a trained neural network, which is practically a black box. The best they can do is train it on different data sets, but that's impractical.
That's exactly the problem I was trying to reference. The algorithms and data models are black boxes - we don't know what they learned or why they learned it. That setup can't be intentionally fixed, and more importantly we wouldn't know if it was fixed, because we can only validate input/output pairs.
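To put it concretely, all we can really do from the outside is something like this. (A rough sketch, assuming a generic image classifier and a hand-built test set; the `predict` API is an assumption, not any specific library.)

```python
# Rough sketch of black-box validation: we can only check input/output pairs,
# not inspect what the model actually learned internally.
def validate(model, test_cases):
    """test_cases: list of (image, forbidden_label) pairs we care about."""
    failures = []
    for image, forbidden_label in test_cases:
        predicted_labels = model.predict(image)  # assumed classifier interface
        if forbidden_label in predicted_labels:
            failures.append((image, forbidden_label, predicted_labels))
    # An empty failure list tells us these cases pass, not *why* the model
    # behaves this way or whether other inputs still trigger the problem.
    return failures
```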
You do understand that this has nothing to do with humans in general, right? This isn't AI recognizing some evolutionary pattern and drawing comparisons to humans and primates -- it's racist content that specifically targets black people that is present in the training data.
I don't know nearly enough about the inner workings of their algorithm to make that assumption.
The internet is surely full of racist photos that could teach the algorithm. The algorithm could also have bugs that miscategorize the data.
The real problem is that those building and managing the algorithm don't fully know how it works or, more importantly, what it has learned. If they did, the algorithm would be fixed without a term blocklist.
Do we have enough info to say that decisively?
Ideally we would see the training data, though it's probably reasonable to assume a random collection of internet content includes racist imagery. My understanding, though, is that the algorithm and the model it learned are still a black box that people can't parse and understand.
How would we know for sure racist output is due to the racist input, rather than a side effect of some part of the training or querying algorithms?
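About the only way I can imagine answering that would be a controlled retraining experiment, something like the sketch below (hypothetical helpers throughout - `load_dataset`, `train_model`, `looks_racist` are all assumptions - and clearly impractical at this scale):

```python
# Hypothetical data-ablation sketch: train once on everything, once with the
# suspect content removed, then compare outputs on the same probe images.
def ablation_check(load_dataset, train_model, looks_racist, probe_images):
    full_data = list(load_dataset())
    clean_data = [example for example in full_data if not looks_racist(example)]

    model_full = train_model(full_data)
    model_clean = train_model(clean_data)

    # If the offensive labels only show up from model_full, the training data is
    # the likely cause; if both models produce them, suspect the training or
    # querying code instead.
    return {image: (model_full.predict(image), model_clean.predict(image))
            for image in probe_images}
```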