Intro
(Aditi) We work on a bunch of problems where we generally assume that we know the inputs accurately, but that's not always the case. "What's the truth? Well, that's a difficult thing; we are not gonna talk about that here." When you google something, you get a subset of things, and that subset may be biased and distort human perception (which can be discouraging for a little girl who googles "scientist" and sees a bunch of white males). Another concern is algorithms that differentiate based on race, colour, gender, etc.
The problem
We ask these questions in different scenarios, but the important thing is to not forget the context.
Subset selection
n items, each with a known non-negative weight (utility). Pick a subset of size k to maximise the total utility.
Assumptions: utilities correlate with "quality", are known exactly, and add up across the chosen items… How realistic is that?
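A minimal sketch of this baseline (my own illustration, not code from the talk): under the additivity assumption, the optimal subset is simply the k items of highest utility.

```python
def select_top_k(utilities, k):
    """Return the indices of the k items with highest utility."""
    order = sorted(range(len(utilities)), key=lambda i: utilities[i], reverse=True)
    return order[:k]

# Hypothetical utilities; the optimum is just the two largest values.
utilities = [3.0, 7.5, 1.2, 9.0, 4.4]
print(select_top_k(utilities, k=2))  # -> [3, 1]
```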
Other similar problems
Ranking
Weighted bipartite matching: placing object i in position j gives utility u_ij; we want to find the matching of objects to positions that maximises total utility. (How is this related to ranking?)
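A sketch of the matching view (my own example; the factored utilities u_i * v_j, i.e. item quality times a position discount, are an assumption, not something stated in the talk). With utilities of that form the optimal matching just sorts items by quality, which is how this recovers ranking.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

u = np.array([9.0, 7.5, 4.4])      # hypothetical item qualities
v = np.array([1.0, 0.5, 0.25])     # hypothetical position discounts (top slot counts most)
W = np.outer(u, v)                 # W[i, j] = utility of placing item i in position j

rows, cols = linear_sum_assignment(W, maximize=True)
for i, j in zip(rows, cols):
    print(f"item {i} -> position {j}")
```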
Supervised Learning
Inputs may not be accurate
What is your level of education on a numerical scale? This question doesn't quite make sense… One can add noise to hide attributes like race (which is what people do for privacy), but that can cause (fair) algorithms to fail.
We want to come up with algorithms that provide guarantees with respect to the original data, not the noisy data we received. Noise and bias (explicit or implicit) are being introduced, and we would like to undo them. (We are not looking at this from a coding-theory perspective.)
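A toy illustration (mine, not from the talk) of the issue: adding Laplace noise to the utilities, as one might for differential privacy, can change which items a top-k selection returns, so a guarantee proved on the noisy inputs need not say anything about the original ones.

```python
import numpy as np

rng = np.random.default_rng(0)
true_utilities = np.array([5.0, 4.9, 4.8, 1.0, 0.5])
noisy_utilities = true_utilities + rng.laplace(scale=1.0, size=true_utilities.shape)

def top_k(u, k):
    """Indices of the k largest entries of u."""
    return set(np.argsort(u)[::-1][:k])

print("true top-3 :", top_k(true_utilities, 3))
print("noisy top-3:", top_k(noisy_utilities, 3))  # may differ from the true top-3
```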
Back to the problem
Our model: n people, and some multiplicative bias (say a factor β ≤ 1 scaling the observed utilities) towards a certain group.
Rooney Rule: spend more time with (interview) at least one person from an underrepresented group. Someone has shown that this can increase the latent (total) utility.
Generalisation: interview at least some fixed number (say r) of people from the underrepresented group.
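A sketch of this setup (my own simulation; the factor β = 0.5, the group sizes, and the helper below are illustrative assumptions). It draws latent utilities, scales down the observed utilities of one group, and compares the latent utility of an unconstrained top-k selection with one forced to include at least r candidates from that group.

```python
import numpy as np

def constrained_top_k(observed, groups, k, r):
    """Pick k candidates by observed utility, with at least r from group 'B'."""
    order = np.argsort(observed)[::-1]
    forced = [i for i in order if groups[i] == "B"][:r]           # best r group-B candidates
    rest = [i for i in order if i not in forced][: k - len(forced)]
    return forced + rest

rng = np.random.default_rng(1)
true_u = rng.uniform(0, 1, size=20)                 # latent utilities
groups = np.array(["A"] * 10 + ["B"] * 10)
beta = 0.5                                          # hypothetical bias factor
observed = np.where(groups == "B", beta * true_u, true_u)

plain = constrained_top_k(observed, groups, k=4, r=0)
rooney = constrained_top_k(observed, groups, k=4, r=1)
print("latent utility, unconstrained   :", true_u[plain].sum())
print("latent utility, Rooney-style r=1:", true_u[rooney].sum())
```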
- Why can we not just get rid of the names, genders, ages before assessment? Where is the bias being introduced?
- Can we try to find an approximation for the bias factor β and just "unmultiply" it?
- Why would the overall order be preserved after bias? If it is, can't we just pick the highest few? Is it within a group? Ah okay!