Written on Oct 18, 2019
ANYONE WHO HAS ever visited Jones Beach on Long Island, New York, will have driven under a series of bridges on their way to the ocean. These bridges, primarily built to filter people on and off the highway, have an unusual feature. As they gently arc over the traffic, they hang extraordinarily low, sometimes leaving as little as 9 feet of clearance from the tarmac. There’s a reason for this strange design. In the 1920s, Robert Moses, a powerful New York urban planner, was keen to keep his newly finished, award-winning state park at Jones Beach the preserve of white and wealthy Americans. Knowing that his preferred clientele would travel to the beach in their private cars, while people from poor black neighbourhoods would get there by bus, he deliberately tried to limit access by building hundreds of low-lying bridges along the highway. Too low for the 12-foot buses to pass under. Racist bridges aren’t the only inanimate objects that have had a quiet, clandestine control over people.
And the author's central thesis is that, just as there are objects that were built with a clear intent, algorithms (algos, for short) can have that same effect if we don't keep them in check. The dictionary definition of an algorithm is fairly bland:
algorithm (noun): A step-by-step procedure for solving a problem or accomplishing some end, especially by a computer.
But the author develops it a bit further, in the context in which she is going to treat it throughout the book:
Usually, algorithms refer to something a little more specific. They still boil down to a list of step-by-step instructions, but these algorithms are almost always mathematical objects. They take a sequence of mathematical operations – using equations, arithmetic, algebra, calculus, logic and probability – and translate them into computer code. They are fed with data from the real world, given an objective and set to work crunching through the calculations to achieve their aim.
And she gives us a quick overview of the types of algorithms out there, classified by what they do:
But broadly speaking, it can be useful to think of the real-world tasks they perform in four main categories:
*Prioritization: making an ordered list (Google Search)
*Classification: picking a category. As soon as I hit my late twenties, I was bombarded by adverts for diamond rings on Facebook.
*Association: finding links. Association is all about finding and marking relationships between things. Dating algorithms such as OKCupid have association at their core, looking for connections between members and suggesting matches based on the findings.
*Filtering: isolating what’s important. Algorithms often need to remove some information to focus on what’s important, to separate the signal from the noise. Sometimes they do this literally: speech recognition algorithms, like those running inside Siri, Alexa and Cortana, first need to filter out your voice from the background noise before they can get to work on deciphering what you’re saying.
The vast majority of algorithms will be built to perform a combination of the above. Take UberPool, for instance, which matches prospective passengers with others heading in the same direction. Given your start point and end point, it has to filter through the possible routes that could get you home, look for connections with other users headed in the same direction, and pick one group to assign you to – all while prioritizing routes with the fewest turns for the driver, to make the ride as efficient as possible.
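To make that combination concrete, here is a minimal sketch in Python of how filtering, association, prioritization and classification might chain together in a toy ride-pooling matcher. This is emphatically not Uber's algorithm: every name, number and distance rule below is invented purely for illustration.

```python
# Toy sketch of the four categories working together in a ride-pooling matcher.
# NOT Uber's algorithm; all data and thresholds are made up.
from math import dist

riders = [
    {"name": "Ana",   "start": (0, 0), "end": (5, 5)},
    {"name": "Bruno", "start": (1, 0), "end": (5, 6)},
    {"name": "Carla", "start": (9, 9), "end": (0, 1)},
]

def match_pool(me, others, max_detour=3.0):
    # Filtering: discard riders whose starting point is nowhere near mine.
    nearby = [r for r in others if dist(me["start"], r["start"]) <= max_detour]
    # Association: keep only riders heading to roughly the same destination.
    same_way = [r for r in nearby if dist(me["end"], r["end"]) <= max_detour]
    # Prioritization: order the candidates by total extra distance, best first.
    ranked = sorted(
        same_way,
        key=lambda r: dist(me["start"], r["start"]) + dist(me["end"], r["end"]),
    )
    # Classification: pick the single pool I get assigned to (or none).
    return ranked[0]["name"] if ranked else None

print(match_pool(riders[0], riders[1:]))  # -> 'Bruno'
```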
And by how they work:
Rule-based algorithms: The first type are rule-based. Their instructions are constructed by a human and are direct and unambiguous. You can imagine these algorithms as following the logic of a cake recipe. Step one: do this. Step two: if this, then that. That’s not to imply that these algorithms are simple – there’s plenty of room to build powerful programs within this paradigm.
Machine-learning algorithms: The second type are inspired by how living creatures learn. To give you an analogy, think about how you might teach a dog to give you a high five. You don’t need to produce a precise list of instructions and communicate them to the dog. As a trainer, all you need is a clear objective in your mind of what you want the dog to do and some way of rewarding her when she does the right thing. It’s simply about reinforcing good behaviour, ignoring bad, and giving her enough practice to work out what to do for herself. The algorithmic equivalent is known as a machine-learning algorithm, which comes under the broader umbrella of artificial intelligence or AI. You give the machine data, a goal and feedback when it’s on the right track – and leave it to work out the best way of achieving the end.
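To see the contrast in code, here is a minimal sketch with entirely made-up messages and labels: a rule-based filter whose steps a human wrote out explicitly, next to a crude "learning" loop that only gets examples plus feedback and nudges its own threshold until it stops being wrong. It is the dog-training idea in miniature, nothing like a real machine-learning library.

```python
# Toy contrast between the two families described above. All data is invented.
import random

# Rule-based: a human wrote the steps explicitly ("if this, then that").
def rule_based_spam(message):
    return "free money" in message.lower() or message.count("!") > 3

# "Machine-learning" in miniature: only examples and feedback are supplied;
# the program adjusts its own threshold whenever its guess turns out wrong.
examples = [("hello there", False), ("WIN NOW!!!!!", True),
            ("meeting at 3", False), ("free money!!!!", True)]

threshold = 5.0                                # deliberately bad starting guess
for _ in range(100):
    msg, is_spam = random.choice(examples)
    guess = msg.count("!") > threshold
    if guess != is_spam:                       # feedback: the guess was wrong
        threshold += 0.5 if guess else -0.5    # adjust and keep practising

print(rule_based_spam("Claim your FREE MONEY now"))  # True, by the written rule
print(threshold)                               # settles around 3.5, learned from examples
```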
And she makes a common-sense warning that I love:
Although AI has come on in leaps and bounds of late, it is still only ‘intelligent’ in the narrowest sense of the word. It would probably be more useful to think of what we’ve been through as a revolution in computational statistics than a revolution in intelligence. I know that makes it sound a lot less sexy (unless you’re really into statistics), but it’s a far more accurate description of how things currently stand. For the time being, worrying about evil Artificial Intelligence is a bit like worrying about overcrowding on Mars.
After the introduction come the six chapters the author focuses on (she could have chosen more): Data, Justice, Medicine, Transport, Crime and Art.
DATA
We have heard a lot about this one, starting with the now classic story of how Target, the American retail chain, discovered that when a young woman started buying unscented deodorant, she would usually sign up for newborn-product offers a few months later. They had found a signal in the data. They acted on that signal and started sending baby-product advertising to women who switched to fragrance-free deodorant. And then a furious father turned up because his 15-year-old daughter was being sent baby stuff and he thought it was a disgrace, until a couple of months later he found out his daughter was pregnant.
But the use of the information we leave behind on the Internet, as we have recently seen with the Cambridge Analytica scandal (which showed tailored ads to people whose characteristics it knew, with the intention of changing their vote), goes much further:
[Palantir] was founded in 2003 by Peter Thiel (of PayPal fame), and at the last count was estimated to be worth a staggering $20 billion. That’s about the same market value as Twitter, although chances are you’ve never heard of it. And yet – trust me when I tell you – Palantir has most certainly heard of you.
[...] every time you sign up for a newsletter, or register on a website, or enquire about a new car, or fill out a warranty card, or buy a new home, or register to vote – every time you hand over any data at all – your information is being collected and sold to a data broker. Remember when you told an estate agent what kind of property you were looking for? Sold to a data broker. Or those details you once typed into an insurance comparison website? Sold to a data broker. In some cases, even your entire browser history can be bundled up and sold on.
There are a thousand cases in this chapter, some more unsettling than others. By way of conclusion, a reflection and a recommendation:
That was the deal that we made. Free technology in return for your data and the ability to use it to influence and profit from you. The best and worst of capitalism in one simple swap.
Whenever we use an algorithm – especially a free one – we need to ask ourselves about the hidden incentives. Why is this app giving me all this stuff for free? What is this algorithm really doing? Is this a trade I’m comfortable with? Would I be better off without it? That is a lesson that applies well beyond the virtual realm, because the reach of these kinds of calculations now extends into virtually every aspect of society. Data and algorithms don’t just have the power to predict our shopping habits. They also have the power to rob someone of their freedom.
JUSTICE
This chapter frightens me. In the US they are using algos that predict, given a defendant's characteristics (age, place of residence, crime committed, etc.), the probability of reoffending and/or the probability of skipping bail. And based on those recommendations, many judges hand down sentences simply by following whatever the algo says. The author goes through a whole series of cases and explains how it is impossible to reduce the number of false negatives (letting a guilty person go free) without increasing the number of false positives (convicting an innocent person), and vice versa:
Let’s assume our murderer-detection algorithm has a prediction rate of 75 per cent. That is to say, three-quarters of the people the algorithm labels as high risk are indeed Darth Vaders. Eventually, after stopping enough strangers, you’ll have 100 people flagged by the algorithm as potential murderers. To match the perpetrator statistics, 96 of those 100 will necessarily be male. Four will be female. Now, since the algorithm predicts correctly for both men and women at the same rate of 75 per cent, one-quarter of the females, and one-quarter of the males, will really be Luke Skywalkers: people who are incorrectly identified as high risk, when they don’t actually pose a danger. Once you run the numbers, more innocent men than innocent women will be incorrectly accused, just by virtue of the fact that men commit more murder than women. This has nothing to do with the crime itself, or with the algorithm: it’s just a mathematical certainty. The outcome is biased because reality is biased. More men commit homicides, so more men will be falsely accused of having the potential to murder.
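To make the arithmetic in that passage explicit, here it is as a few lines of Python; the 96/4 split and the 75 per cent rate are the quote's own numbers.

```python
# Back-of-the-envelope check of the numbers quoted above: 100 people flagged,
# 96 male and 4 female, with a 75 per cent prediction rate for both groups.
flagged_men, flagged_women = 96, 4
precision = 0.75  # share of flagged people who really are "Darth Vaders"

innocent_men = flagged_men * (1 - precision)      # 24 men wrongly flagged
innocent_women = flagged_women * (1 - precision)  # 1 woman wrongly flagged

print(innocent_men, innocent_women)  # 24.0 1.0
```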
And how, of course, even though the defendant's race is not an input to the system, the algos tend to come down harder on black defendants, because they tend to live in worse areas with higher crime rates and end up convicted more often for the same offence.
Legally, race, gender and class should not influence a judge’s decision. (Justice is supposed to be blind, after all.) And yet, while the vast majority of judges want to be as unbiased as possible, the evidence has repeatedly shown that they do indeed discriminate. Studies within the US have shown that black defendants, on average, will go to prison for longer, are less likely to be awarded bail, are more likely to be given the death penalty, and once on death row are more likely to be executed. Other studies have shown that men are treated more severely than women for the same crime, and that defendants with low levels of income and education are given substantially longer sentences. Just as with the algorithm, it’s not necessarily explicit prejudices that are causing these biased outcomes, so much as history repeating itself.
When the companies that design these algos are asked to account for the racial bias, they refuse to give out any information:
Any company that profits from analysing people’s data has a moral responsibility (if not yet a legal one) to come clean about its flaws and pitfalls. Instead, Equivant (formerly Northpointe), the company that makes COMPAS, continues to keep the insides of its algorithm a closely guarded secret, to protect the firm’s intellectual property.
And at the same time, psychologists show that the sentences of judges who don't use algorithmic assistance suffer from biases of their own.
An example of the anchoring effect in another context:
Like those signs in supermarkets that say ‘Limit of 12 cans of soup per customer’. They aren’t designed to ward off soup fiends from buying up all the stock, as you might think. They exist to subtly manipulate your perception of how many cans of soup you need. The brain anchors with the number 12 and adjusts downwards. One study back in the 1990s showed that precisely such a sign could increase the average sale per customer from 3.3 tins of soup to 7.
MEDICINE
Here the algos are used in a more technical context: detecting cancerous cells in pathology samples, for example. There was a VERY interesting experiment a while back:
They gave 16 testers a touch-screen monitor and tasked them with sorting through images of breast tissue. The pathology samples had been taken from real women, from whom breast tissue had been removed by a biopsy, sliced thinly and stained with chemicals to make the blood vessels and milk ducts stand out in reds, purples and blues. All the tester had to do was decide whether the patterns in the image hinted at cancer lurking among the cells. After a short period of training, the testers were set to work, with impressive results. Working independently, they correctly assessed 85 per cent of samples. But then the researchers realized something remarkable. If they started pooling answers – combining votes from the individual testers to give an overall assessment on an image – the accuracy rate shot up to 99 per cent. What was truly extraordinary about this study was not the skill of the testers. It was their identity. These plucky lifesavers were not oncologists. They were not pathologists. They were not nurses. They were not even medical students. They were pigeons. Pathologists’ jobs are safe for a while yet – I don’t think even the scientists who designed the study were suggesting that doctors should be replaced by plain old pigeons. But the experiment did demonstrate an important point: spotting patterns hiding among clusters of cells is not a uniquely human skill. So, if a pigeon can manage it, why not an algorithm?
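The jump from 85 to 99 per cent is roughly what you would expect from pooling independent votes. Here is a rough sketch of that effect, assuming each voter is independently right 85 per cent of the time and the pool decides by simple majority; the actual study pooled the birds' responses somewhat differently, so treat this as the textbook version of the idea.

```python
# Rough sketch: accuracy of a simple majority vote among n independent voters,
# each correct with probability p. Assumptions (independence, majority rule)
# are mine, not the study's exact pooling method.
from math import comb

def majority_vote_accuracy(p=0.85, n=16):
    # Probability that strictly more than half of the n voters are correct.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

print(round(majority_vote_accuracy(0.85, 1), 3))   # 0.85  - a single voter
print(round(majority_vote_accuracy(0.85, 16), 3))  # ~0.999 - a flock of 16
```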
I'm running out of characters for this review!
The conclusion the author reaches is that algos are a wonderful tool when treated as an aid to a human who stays in charge. They should provide data that helps the human make the decision, not make the decision themselves. The human-plus-machine combination is far more powerful than either part on its own.
The book is fantastic, highly recommended. It offers curious stories as well as background knowledge and elements for forming your own judgment.