The Peruvian researcher Omar Flórez is preparing for a "very, very close" future in which the streets will be full of surveillance cameras capable of recognizing our faces and gathering information about us as we walk through the city.
He explains that they will do so without our permission, since these are public spaces and most of us do not cover our faces when we leave home.
Our face will become our password: when we enter a store, it will recognize us and analyze data such as whether we are new or regular customers, or where we were before walking through the door. How that society treats us will depend on all the information it collects.
Flórez wants to prevent aspects like our gender or skin color from being part of the criteria these companies evaluate when deciding whether we deserve a discount or special attention. Something that can happen without the companies themselves noticing it.
Artificial intelligence is not perfect: even if it is not programmed to do so, software can teach itself to discriminate.
This engineer was born in Arequipa 34 years ago, earned his doctorate in Computer Science at Utah State University (United States) and currently works as a researcher at the bank Capital One.
He is one of the few Latin Americans studying the ethical aspects of machine learning, a process he defines as "the ability to predict the future with data from the past using computers".
It is a technology based on algorithms that has been used to develop driverless cars and to detect diseases such as skin cancer, among other applications.
Flórez is working on an algorithm that allows computers to recognize faces without being able to decipher the sex or ethnicity of the person. His dream is that, when that future arrives, companies will include his algorithm in their computer systems to avoid making racist or sexist decisions without even knowing it.
We always say that we cannot be objective precisely because we are human. We tried to trust machines to be objective instead, but it seems they cannot be either…
Because they are programmed by a human being. In fact, we recently realized that the algorithm itself is an opinion. I can solve a problem with algorithms in different ways, and each of them, in some way, incorporates my view of the world. Even choosing the correct way to judge an algorithm is already an opinion about the algorithm itself.
Let's say I want to predict the probability that someone will commit a crime. To do so, I collect photos of people who have committed crimes, where they live, what their race is, their age, and so on. I then use this information to maximize the accuracy of the algorithm, predicting who might commit a crime later or even where the next crime might occur. This prediction could lead the police to focus more on areas where there happen to be more people of African descent, because more crimes are recorded there, or to start stopping Latinos because they are statistically more likely not to have their documents in order.
So for someone who has legal residence, or who is of African descent and lives in that area but does not commit crimes, it will be twice as difficult to shake off the stigma assigned by the algorithm. For the algorithm, you are part of a family, a distribution, so statistically it is much harder for you to leave that family or distribution. In a way, you are negatively affected by the reality that surrounds you. Essentially, up to now, we have been codifying the stereotypes we hold as human beings.
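The feedback loop Flórez describes can be sketched with a toy frequency model. Everything here is invented for illustration (neighborhood names, arrest counts); it is not his system, just a minimal demonstration of how skewed historical records become skewed predictions:

```python
from collections import Counter

# Toy data: arrest records are concentrated in one neighborhood because
# policing is heavier there, not because the underlying crime rate differs.
arrests = ["north"] * 80 + ["south"] * 20

counts = Counter(arrests)
total = sum(counts.values())

def predicted_risk(neighborhood):
    # A frequency-based "predictor": risk is simply the share of past
    # arrests recorded in that neighborhood.
    return counts[neighborhood] / total

print(predicted_risk("north"))  # 0.8
print(predicted_risk("south"))  # 0.2
```

The model faithfully reproduces the patrol pattern, not the crime rate, and sending more police to the "high-risk" area would generate more arrests there and reinforce the prediction.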
That subjective element is in the criteria you choose when programming the algorithm.
Exactly. There is a chain of processes in creating a machine learning algorithm: collecting data, choosing which features are important, choosing the algorithm itself… Then you test it to see how it works, reduce errors, and finally release it for public use. We have realized that prejudices can enter at each of these stages.
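The chain of stages he lists can be made concrete in a toy pipeline. All function names, data, and the trivial threshold "model" are invented; the point is only that each stage is a human choice where bias can slip in:

```python
# Each function is one stage of the chain Flórez describes.

def collect_data():
    # Stage 1: which records exist at all depends on who was sampled.
    return [{"zip": "90001", "age": 30, "label": 1},
            {"zip": "10001", "age": 45, "label": 0}]

def choose_features(record):
    # Stage 2: deciding which attributes "matter" is itself an opinion
    # (here, the zip code quietly enters as a feature).
    return [int(record["zip"]) % 100, record["age"]]

def train(rows):
    # Stage 3: the choice of model and objective encodes more decisions.
    # Here, a trivial threshold rule on the age feature.
    ages = [choose_features(r)[1] for r in rows]
    threshold = sum(ages) / len(ages)
    return lambda r: 1 if choose_features(r)[1] < threshold else 0

def evaluate(model, rows):
    # Stage 4: the metric we optimize decides which errors we tolerate.
    return sum(model(r) == r["label"] for r in rows) / len(rows)

rows = collect_data()
model = train(rows)
print(evaluate(model, rows))  # 1.0 on this toy data
```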
An investigation by ProPublica found in 2016 that the judicial systems of several US states used software to determine which defendants were most likely to reoffend. ProPublica discovered that the algorithms favored whites and penalized blacks, even though the forms used to collect the data included no questions about skin color… In a way, the machine guessed it and used it as an evaluation criterion even though it was not designed to do so, right?
What happens is that some data already encode race, and you do not even realize it. For example, in the United States we have the postal code. There are areas where only, or mainly, African-American people live, and areas, for example in Southern California, where mostly Latino people live. So if you use the postal code as a feature to feed a machine learning algorithm, you are also encoding the ethnic group without realizing it.
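This proxy effect is easy to demonstrate with invented toy records (the zip codes and group labels below are illustrative, not real demographics): when one group dominates each area, the zip code alone "predicts" ethnicity well, so any model fed the zip code is implicitly fed the ethnic group.

```python
from collections import defaultdict, Counter

# Invented (zip_code, ethnicity) records: in each zip one group dominates,
# as in the segregated areas Flórez mentions.
records = [
    ("90001", "latino"), ("90001", "latino"), ("90001", "latino"), ("90001", "white"),
    ("30310", "black"), ("30310", "black"), ("30310", "black"), ("30310", "white"),
]

by_zip = defaultdict(Counter)
for zip_code, ethnicity in records:
    by_zip[zip_code][ethnicity] += 1

# How well does the zip alone "predict" ethnicity? (majority-class accuracy)
correct = sum(c.most_common(1)[0][1] for c in by_zip.values())
accuracy = correct / len(records)
print(f"zip -> ethnicity accuracy: {accuracy:.2f}")  # 0.75
```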
Is there a way to avoid it?
Apparently, at the end of the day, the responsibility falls on the human being who programs the algorithm, and on how ethical that person can be. That is, if I know that my algorithm will run with 10% more error once I stop using a feature that could be sensitive for characterizing an individual, then I simply remove it and accept the consequences, perhaps economic, that my company may face. So there is certainly an ethical barrier in deciding what goes into the algorithm and what does not, and it often falls on the programmer.
Algorithms are supposed to simply process large volumes of information and save time. Is there no way to make them infallible?
Infallible, no, because they are always an approximation of reality; it is expected that they carry some degree of error. However, there is currently very interesting research work in which the presence of sensitive data is explicitly penalized. The human being basically chooses which data may be sensitive, and the algorithm either stops using it or uses it in a way that shows no correlation. Honestly, though, to the computer everything is a number: a 0, a 1, or some value in between; it attaches no meaning to it. Although there is much interesting work that lets us try to avoid prejudices, there is an ethical part that always falls on the human being.
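One simple way to "explicitly penalize" a sensitive feature is to add an extra regularization term on its coefficient only. The sketch below is an invented toy (the data, penalty strength, and approach are illustrative, not the specific method from the research he cites): least squares by gradient descent, with an L2 penalty applied solely to the sensitive column's weight.

```python
# Each row: [neutral_feature, sensitive_feature]; the target correlates
# with both, so an unpenalized fit leans on the sensitive column.
X = [[1.0, 1.0], [2.0, 1.0], [3.0, 0.0], [4.0, 0.0]]
y = [2.0, 3.0, 3.5, 4.5]

def fit(X, y, sensitive_idx, penalty, steps=5000, lr=0.01):
    w = [0.0, 0.0]
    n = len(X)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j in range(2):
                grad[j] += 2 * err * xi[j] / n
        # Extra pressure pushing only the sensitive weight toward zero.
        grad[sensitive_idx] += 2 * penalty * w[sensitive_idx]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

w_plain = fit(X, y, sensitive_idx=1, penalty=0.0)
w_fair = fit(X, y, sensitive_idx=1, penalty=10.0)
# The sensitive coefficient shrinks sharply under the penalty, at the cost
# of a slightly worse fit: the accuracy/fairness trade-off he describes.
print(w_plain[1], w_fair[1])
```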
Is there any area that, in your view as an expert, should not be left to artificial intelligence?
I think that right now we should be prepared to use the computer to assist rather than to automate. The computer should be able to tell you: in a judicial system, these are the cases you should process first. But it should also be able to tell you why. This is called interpretability, or transparency, and machines should be able to report the reasoning that led them to a given decision.
Computers have to make decisions based on models, but aren't stereotypes standard models? Aren't they useful for the system to detect patterns?
If, for example, you want to minimize error, it is numerically a good idea to use prejudices, because that gives you a more accurate algorithm. However, the developer needs to realize that doing so carries an ethical component. There are regulations that prohibit the use of certain features in things like credit analysis, or even in the use of security videos, but they are very incipient. Perhaps what we need is precisely this: to know that reality is unjust and that it is full of prejudices.
The interesting thing is that, despite this, some algorithms allow us to try to minimize this level of prejudice. That is, I can use skin tone, but without it carrying more weight, or making it equally relevant across all ethnic groups. So, to answer your question: yes, you might think that using this will give more accurate results, and often it does. Once again, the ethical component is this: I want to sacrifice a certain level of accuracy so as not to give the user a negative experience, or not to use any kind of prejudice.
Amazon specialists realized that an IT tool they had designed for recruiting discriminated against résumés that included the word "woman" and favored terms more commonly used by men. This is quite surprising, because avoiding such prejudices would require guessing which terms men use more often than women in their résumés.
That is difficult to achieve even for a human being.
But at the same time, nowadays we try not to draw gender distinctions, to say that words or clothes are not masculine or feminine but can be used by everyone. Machine learning seems to go in the opposite direction, since you have to acknowledge the differences between men and women and study them.
Algorithms only capture what happens in reality, and the reality is that, yes, men use certain words that women perhaps do not. And the reality is that sometimes people connect better with those words, because those doing the evaluating are also men. So to say otherwise is to go against the data. This problem can be avoided by collecting the same number of résumés from men and women; then the algorithm will assign the same weight to both, and to the words each sex uses. If you just take the 100 résumés you have on the table, maybe only two are from women and 98 from men. Then you create a prejudice, because you are modeling only what happens in the universe of men for that job.
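The rebalancing he describes can be sketched by oversampling the minority group until the two groups match in size. This is a toy illustration with invented records (only a gender field), not Amazon's system or any specific tool:

```python
import random

random.seed(0)

# Invented pool mirroring the example: 98 résumés from men, 2 from women.
resumes = [{"gender": "m"} for _ in range(98)] + [{"gender": "f"} for _ in range(2)]

def balance(pool):
    # Oversample the minority group (with replacement) until both groups
    # are the same size, so training gives them equal weight.
    men = [r for r in pool if r["gender"] == "m"]
    women = [r for r in pool if r["gender"] == "f"]
    small, large = sorted([men, women], key=len)
    small = small + random.choices(small, k=len(large) - len(small))
    return small + large

balanced = balance(resumes)
counts = {g: sum(r["gender"] == g for r in balanced) for g in ("m", "f")}
print(counts)  # {'m': 98, 'f': 98}
```

Duplicating two résumés 49 times each is a crude fix, which is why collecting genuinely balanced data, as he suggests, is preferable when possible.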
So it is not a science for people who worry about being politically correct, because you have to dig into the differences…
You have touched on an important point, which is empathy. The stereotype of the engineer is someone very analytical and perhaps not very social. It turns out we are starting to need things in engineers that we thought were not so relevant: empathy, ethics… We need to develop these sensibilities, because we make so many decisions while implementing an algorithm, and many times there is an ethical component. If you are not even aware of it, you do not notice it.
Do you notice differences between an algorithm designed by one person and one designed by 20?
In theory, an algorithm made by several people should have fewer prejudices. The problem is that the group is often made up of very similar people: maybe they are all men, or all Asian. It is good to have women who notice things the rest of the group does not generally see. That is why diversity is so important today.
Can we say that an algorithm reflects the prejudices of its author?
And that there are algorithms with biases precisely because of the lack of diversity among those who create them?
Not only because of that, but it is an important part. I would say it is also partly due to the data itself, which reflects reality. For the last 50 years we have tried to create algorithms that reflect reality. Now we have realized that, many times, reflecting reality also reinforces the stereotypes people hold.
Do you think there is enough awareness in industry and in science that algorithms can have prejudices, or is it something that is not given much importance?
On a practical level, it is not given the importance it should have. At the research level, many companies are starting to investigate this problem seriously, creating groups known as FAT: fairness, accountability and transparency.
This article is part of the digital version of the Hay Festival Arequipa 2018, a gathering of writers and thinkers that takes place in that Peruvian city from 8 to 11 November.