There is a joke among pilots—of course, I never heard any pilot actually tell the joke, but I found it online in plenty of commentaries about automation—according to which the best aeroplane crew is composed of a computer, a pilot, and a dog. The computer flies the plane. The pilot feeds the dog. The dog’s task is to bite the pilot whenever they try to interfere with the computer’s work. In other words: let the machine do the job for you.
[picture from unsplash.com]
Algorithms today surely have a bad reputation. As you may have heard, machines control more and more of our daily lives. Algorithms are no longer a topic only for philosophers and engineers: machines decide what we see online and what drugs we should take given our symptoms, and, if not now, then in a not-too-remote future, they will drive our cars and decide whether or not we get a job. There are two main lines of criticism about how algorithms are employed today. According to the first, algorithms reproduce the biases of the programmers that created them, or of the data used to train them. According to the second, algorithms are opaque to users: in fact, because of their growing complexity and interconnectedness, they are opaque to their creators too. Both criticisms are correct and, I think, neither is decisive.
Automatic job screening does look like a scary development. There are plenty of intuitive reasons that make the idea that a machine can have the power to take life-changing decisions unpleasant, especially if I am the subject of those decisions. Not surprisingly, there is no shortage of accounts of ‘algorithmic bias’ in hiring. It has been reported that an algorithm used by Amazon discriminated against women. The algorithm was fed with the previous decade of hiring decisions, and, since in the majority of cases males were hired, it learned to give a negative weight to CVs whenever it found mentions of women-only colleges or membership of women-only associations (‘women’s chess club’): in fact, to give negative weight to the bare presence of the term ‘woman’ or ‘women’.
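To see how this happens mechanically, here is a minimal toy sketch. The data and the scoring rule are entirely invented by me for illustration (they have nothing to do with Amazon’s actual system): a naive rule trained on biased past decisions ends up assigning a negative weight to the mere presence of the word ‘women’.

```python
# Hypothetical toy data: (CV snippet, past hiring decision 1/0). The past
# decisions are biased: CVs mentioning 'women' were systematically rejected.
past_decisions = [
    ("captain of chess club", 1),
    ("captain of women's chess club", 0),
    ("member of debate society", 1),
    ("member of women's coding association", 0),
    ("volunteer firefighter", 1),
    ("president of women's debate society", 0),
]

def token_weight(token, data):
    """Naive learned 'weight' for a token: the hire rate of CVs containing
    the token, minus the hire rate of CVs without it."""
    with_token = [hired for text, hired in data if token in text]
    without_token = [hired for text, hired in data if token not in text]
    return sum(with_token) / len(with_token) - sum(without_token) / len(without_token)

# The rule faithfully reproduces the bias in the data: the mere presence of
# the token 'women' gets a strongly negative weight.
w = token_weight("women", past_decisions)
```

Nothing in the rule itself mentions gender; the bias lives entirely in the training data, which is exactly the point of the criticism.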
Algorithms are stupid—I cannot resist mentioning a SMBC comic where ‘The Rise of the Machines’ fails as robots fight humans with spears and rocks, their machine-learning algorithm having been fed with data on historical battles, in which ‘the vast majority of battle-winners used pre-modern weaponry’—but that is indeed the point. Biases will surely be present, but this is not a reason to throw the baby out with the bathwater: it is a reason to abandon an overly naïve view according to which algorithms, as such, are spotless procedures always giving a perfect result (I am not sure who would support this view, but I have often found it presented when algorithms are criticised). The crux is not whether algorithms are biased (yes, they are!) but whether we can recognise and correct biases in algorithms better or worse than we can with humans. This is, you will concede, an open question.
Let’s put it another way. I am sure you can think of a committee composed of humans that decided not to hire you for reasons that were, in your opinion, debatable, and perhaps not fair. Would an algorithm have been better? I am a non-native English speaker and it is a reasonable assumption that this will disadvantage me in a job interview (for the rest, I am your ‘old white dude’, so this should favour me). I am not talking about biases against my demographic group (that is, interviewers being against non-native English speakers as such), but about the possibility that I may not understand a subtlety in a question, that my answers will look less brilliant because, say, my vocabulary is more limited than a native speaker’s, or that I will appear not too fond of informal verbal interactions because they are more demanding for me than for a native speaker. Would an automatic process be preferable? Would I prefer to be interviewed by a computer in front of a camera, and have my answers analysed by an algorithm (assuming it would be able to understand what I am talking about)? Well, I’d say I am not against it.
If we think that the hypothetical bias against non-native speakers is real, and that it is a problem, we could design the algorithm to rescale up the scores of non-native speakers, so that a non-native speaker getting a score slightly lower than a native one would receive the same final evaluation from the algorithm. Is that a desirable outcome? Perhaps there are good reasons to think that a candidate who understands subtleties slightly better and has a bigger vocabulary will do a better job. And what if a non-native speaker had a very good education in the English language and speaks like a native with minimal effort? Should this candidate be evaluated higher than a native English speaker who got the same score? That looks unfair. But perhaps the fact that they have mastered a second language in depth says something interesting about the candidate? And what about bilinguals?
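As a concrete illustration of the dilemma, the rescaling could be implemented as a simple multiplicative uplift. Everything below is made up by me (the 5% uplift and the scores are arbitrary, illustrative numbers):

```python
def adjusted_score(raw_score, non_native, uplift=0.05):
    """Rescale up the raw interview score for non-native speakers.
    The 5% uplift is an arbitrary, purely illustrative choice."""
    return raw_score * (1 + uplift) if non_native else raw_score

# A non-native speaker scoring slightly lower than a native one now
# receives a comparable (here, even higher) final evaluation...
non_native_final = adjusted_score(0.80, non_native=True)
native_final = adjusted_score(0.82, non_native=False)

# ...but a perfectly fluent non-native speaker gets the same uplift,
# overtaking a native speaker with an identical raw score.
fluent_final = adjusted_score(0.82, non_native=True)
```

The correction is trivial to implement; deciding whether applying it is fair, as the text argues, is not.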
In sum, subtle biases and their effects are very complicated to assess, both directly by humans and indirectly by programming algorithms, but what about obvious shortcomings, like the Amazon algorithm penalising women? In a way, that is a positive story: the bias was recognised and—one hopes—corrected by Amazon’s engineers. Research is far from unanimous in finding shortcomings in algorithms and, when procedures based on algorithms are explicitly compared with human activity, machines tend to do better than their fleshy counterparts. The selection of board directors is one of the many hiring decisions in which human biases can have an undesirable influence. Researchers compared how a machine-learning algorithm would fare against real human decisions. They found that candidates that were chosen in reality but scored low according to the algorithm were later unpopular with shareholders, and that candidates not chosen in reality but that scored high according to the algorithm were successful with shareholders in other companies. Why was that the case? Human decisions tended to be biased by features like gender (males were preferred), financial background, and previous experience with other boards. The algorithm was less biased: ‘the algorithm is telling us exactly what institutional shareholders have been saying for a long time: that directors who are not old friends of management and come from different backgrounds both do a better job in monitoring management and are often overlooked.’
The second broad criticism is that algorithms are opaque to users and, more worryingly, to programmers themselves. Algorithms are opaque to users. Our social media feeds, our Google searches, our Amazon and Netflix suggestions (not to mention personalised advertisements on the majority of websites) do not provide us with a random, unfiltered selection of information. This seems inevitable. Amazon has a catalogue of hundreds of millions of products: the probability that random suggestions would have any relevance for us is infinitesimal. But should I know why that exact book was suggested to me this morning instead of another one?
Not necessarily. Cultural evolution works by making available to us tools and ideas that we did not have to invent by ourselves and, more to the point, in many cases we do not need to understand the nitty-gritty details. Imagine if we started to worry that we do not know how our laptop works under the hood and stopped using it. Perhaps this looks somehow attractive, but the same logic applies to almost all tools and ideas. How does a bike’s brake work? Why do we cook potatoes and eat lettuce raw? If we were really concerned with the opacity of how things around us work, we would be paralysed in our daily life. There are many reasons to ‘just do’ as we have been told or shown. Opacity, in sum, is not a problem by itself, and we placidly live with opacity in many, if not all, domains of life.
We are—rightly—suspicious of opacity when we fear that someone is misleading us, or when things do not work as they should. If, when I pull the brake, my bike keeps going as if nothing happened, I have very good reasons to want to know more about the mechanism. Probably, however, I will defer to an expert in the domain and let them repair the brake, so the mechanism will still remain opaque to me. A problem arises if the experts themselves no longer know how the brake works. In a world of increasing complexity and increasing interconnectedness, the functioning of many pieces of technology becomes opaque to everybody, experts included. It is perfectly plausible, and I cannot see how it could be otherwise, that there is not a single person in the world who knows how a laptop, an aeroplane, or indeed the Italian legal system functions in its entirety (I am not making fun of the Italian legal system, I promise). Does anybody know how the algorithm that Google uses to present search results to us works? This is an interesting question, but the answer may be more complicated than it seems.
In a sense, no. Like all products of cultural evolution, the Google algorithm stands on the shoulders of many giants and many dwarves, big and small innovations accumulated through time. Luckily, there is no need for programmers to know in detail how the libraries (collections of pre-written functions that can be used as-is) they use work, nor how high-level languages, such as Python or C++, are translated into low-level machine languages. Complicated algorithms are created by teams of programmers, and a programmer who worked on the graphical output will not know much, if anything, of the machine-learning technique used to produce the results (and vice versa).
In addition, outputs like Google searches or the Facebook newsfeed are likely the result of the optimization of hundreds of parameters. Facebook may take into account your ‘likes’ in the last day, or week, or month (possibly all of them, weighted differently), the links you clicked, the ‘likes’ of your friends (how many of them? how long back in time?), the people you interacted with in Messenger, and who knows what else. The weights of the parameters can be fine-tuned according to their effects on users. Simplifying a bit, one can imagine that engineers at Facebook test modifications such as giving 1% more influence to your last-day ‘likes’ with respect to your last-week ‘likes’ and, if we spend a few seconds more on the social media as a result (multiply this by billions of users), the modification is kept. However, some weight modifications will work better when accompanied by others. Perhaps giving more influence to recent ‘likes’ will have a positive effect—positive, it goes without saying, for Facebook—only when the weight of your friends’ recent ‘likes’ is also increased. In truth, it will depend on how all the other parameters are set. Perhaps the success of the recent-‘likes’ modification depends in a non-obvious way on favouring clicks on pictures, or reading less news, or anything else you can imagine and that can be tracked. The point is, the algorithm is probably optimizing hundreds if not thousands of parameters, and why exactly a 1% increase in the importance of recent ‘likes’ (or anything else) works is not obvious to the programmers themselves.
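The tuning loop described above can be sketched in a few lines. Everything here is hypothetical: the signal names, the starting weights, and especially the stand-in ‘engagement’ function, which in reality would be an A/B measurement over actual users rather than a formula.

```python
# Hypothetical feed-ranking weights over tracked signals.
weights = {"likes_last_day": 1.0, "likes_last_week": 0.5, "friend_likes_recent": 0.3}

def feed_score(signals, weights):
    """Score a post as a weighted sum of its tracked signals."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

def keep_if_better(weights, name, factor, engagement):
    """Try a small weight modification (e.g. factor=1.01 for a 1% increase)
    and keep it only if the measured engagement improves."""
    candidate = dict(weights)
    candidate[name] *= factor
    return candidate if engagement(candidate) > engagement(weights) else weights

# Toy stand-in for 'seconds spent on the site': this one happens to prefer
# a slightly higher weight on last-day likes. In reality, the measurement
# would come from billions of users, not from a known function.
def toy_engagement(w):
    return -abs(w["likes_last_day"] - 1.2)

new_weights = keep_if_better(weights, "likes_last_day", 1.01, toy_engagement)
```

With hundreds of parameters tuned this way, each kept tweak is justified only by the metric going up, which is why nobody can say *why* a given 1% change worked.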
Let’s go back to our algorithm replacing the hiring committee. Through an optimization like the one just described, we find out that it gives some importance to, say, wearing multiple rings (I use this example both because I have two—a wedding one, and a wooden one that I bought ten years ago in a market in Phnom Penh—and because I cannot think of any way in which this could be useful information for hiring a person). This is not a desirable outcome, right? Again, this should caution us against the naïve view that we are in full control of algorithms. We are not: but, again, the key is to compare the opacity of the algorithm to the opacity of a human hiring committee. We will never find out whether the president of the committee likes rings, or dark hair, or any other feature we are pretty sure should have no influence on the choice. Humans are more opaque than algorithms. Even if we do not know exactly why the algorithm produces this result, as its procedure is opaque, we do know that the result depends on the weight of the ‘ring’ parameter. We can go back to the desk and work out what is going on.
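This difference can be made concrete. With a human committee we can only guess; with the algorithm we can, for instance, zero out the suspect weight and measure its contribution to a candidate’s score directly (the weights, feature names, and candidate below are, of course, invented for illustration):

```python
# Hypothetical learned weights, including the puzzling 'rings' parameter.
weights = {"years_experience": 0.9, "relevant_degree": 0.7, "rings_worn": 0.3}

def score(candidate, weights):
    """Evaluate a candidate as a weighted sum of their features."""
    return sum(weights[f] * candidate.get(f, 0.0) for f in weights)

candidate = {"years_experience": 5.0, "relevant_degree": 1.0, "rings_worn": 2.0}

baseline = score(candidate, weights)
# Ablation: set the suspect weight to zero and measure the difference.
corrected_weights = dict(weights, rings_worn=0.0)
rings_contribution = baseline - score(candidate, corrected_weights)
```

No such audit is possible on the committee president’s private fondness for rings: we cannot set a human bias to zero and re-run the interview.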
What we should be worried about, in sum, is not that algorithms are biased and opaque. They are, and they always will be. We need, however, to be fully aware that they are, and not let the ‘owners’ of the algorithms that decide which information we see online (the case I am mainly interested in) mislead us. Algorithms should be open, and they should probably offer (some) level of personalization, in a more transparent way than what happens now.