remo
Written on Oct 27, 2017
The origin of the book is fantastic:
[Nathan] Silver found that the single factor that best correlated with Donald Trump’s support in the Republican primaries was that measure I had discovered four years earlier. Areas that supported Trump in the largest numbers were those that made the most Google searches for “nigger.”
Which leads the author to make a statement of intent:
I am now convinced that Google searches are the most important dataset ever collected on the human psyche. [...] In fact, at the risk of sounding grandiose, I have come to believe that the new data increasingly available in our digital age will radically expand our understanding of humankind.
The author takes us on a tour of a ton of interesting topics. One of my favorites is the old, very old question: what percentage of the population is gay? An old study from the 60s-70s said that one in 10 people is gay. The author lowers that to 1 in 20, and he explains how he reached that conclusion so well that he has completely convinced me. It is a figure that seemed impossible to know with any precision, and the guy goes and does it.
There is plenty of other data, such as our likes and preferences on Facebook, that is also useful for some things, but not for everything. Our Google searches are always sincere. Our Facebook likes are not. Of two magazines with the same circulation, one a gossip magazine and the other a literary one, the literary one had more than twice as many likes on FB as the gossip one. The same thing happens with surveys:
Many people underreport embarrassing behaviors and thoughts on surveys. They want to look good, even though most surveys are anonymous. This is called social desirability bias.
More interesting data:
After making their decision—either to reproduce (or adopt) or not—people sometimes confess to Google that they rue their choice. This may come as something of a shock but post-decision, the numbers are reversed. Adults with children are 3.6 times more likely to tell Google they regret their decision than are adults without children.
About 28 percent of girls are overweight, while 35 percent of boys are. Even though scales measure more overweight boys than girls, parents see—or worry about—overweight girls much more frequently than overweight boys.
On weekends with a popular violent movie, the economists found, crime dropped.
Students who were taught fractions via a game tested worse than those who learned fractions in a more standard way.
The author applies big data to self-help when he says that we shouldn't trust everything people post on Instagram:
In fact, I think Big Data can give a twenty-first-century update to a famous self-help quote: “Never compare your insides to everyone else’s outsides.” A Big Data update may be: “Never compare your Google searches to everyone else’s social media posts.”
He also drops some gems of humor:
February 27, 2000, started as an ordinary day on Google’s Mountain View campus. The sun was shining, the bikers were pedaling, the masseuses were massaging, the employees were hydrating with cucumber water.
And he shows signs of philosophical depth:
Milan Kundera, the Czech-born writer, has a pithy quote about this in his novel The Unbearable Lightness of Being: “Human life occurs only once, and the reason we cannot determine which of our decisions are good and which bad is that in a given situation we can make only one decision; we are not granted a second, third or fourth life in which to compare various decisions.”
There are many, many more things. What differences in life can we expect between the last student admitted to a prestigious school and the first one rejected? Which words in a loan application are a clear sign that the person is less likely to pay it back? Is it legitimate to use this knowledge to deny loans?
The book goes on and on. It has a ton of notes, all placed at the end (almost a third of the book [!!]), and it closes with a plea in favor of big data:
The days of academics devoting months to recruiting a small number of undergrads to perform a single test will come to an end. Instead, academics will utilize digital data to test a few hundred or a few thousand ideas in just a few seconds. We’ll be able to learn a lot more in a lot less time. [...] How do ideas spread? How do new words form? How do words disappear? How do jokes form?
Fascinating. Fun. Instructive. Fantastic. Essential.