Will the expansion of Big Data harm democracy? - Pensiero Critico

Key Points

Big Data gives access to a huge amount of data about individual human beings and their relations with the environment. The actors (commercial, social and political) who access such data are able to correlate them and derive predictions of behaviour.


A developed culture is not only concerned with accumulating data; it also takes care of them, analyzes them, puts them in context and interprets them. Today, however, the speed of information erases and rewrites in a continuous loop. Unfortunately, the two jobs at greatest risk of extinction today are those of librarians and journalists. History is likely to be reduced to a perpetual here and now, and we could end up imprisoned in a perpetual 'now'. (Luciano Floridi)


The analysis and evaluation of sensitive data are increasingly carried out by non-institutional entities with business interests and doubtful ethics. Big Data therefore raises a number of epistemological challenges concerning, for example, the fiduciary relationship between people and the ethics of academic research, given the possibility that the data are exploited commercially. The ethical issues raised by the propensities of individuals that might emerge from correlations have not yet been the subject of policy analysis, but they already interest philosophers.

When your supermarket knows that you are pregnant before you have told anyone
Supermarkets that "give away" a loyalty card entitling us to discounts on products gain access to an amount of data that lets them make correlations of various kinds and, if we also give our address, they can send flyers and personalized discount coupons to our home. If the supermarket also provides credit cards and phone cards, it will use these data by making them converge on our identifier (Guest ID). In the USA this profiling is already a reality; in European countries we are not yet at this level. Charles Duhigg writes in the book "The Power of Habit" (p. 279):

"With the identification code we have your name, address and payment method, we know you have a Target Visa, debit cards, and we can connect all your shopping at the supermarket," said Pole [Target's statistical expert] in front of an audience of retail statistics experts at a conference in 2010. The company can link to a specific person about half of all in-store sales, almost all online sales and about a quarter of online browsing. At that conference Pole showed a slide describing an example of the data collected by Target. [see image] The problem with all these data, however, is that they are meaningless without the interpretation given by statisticians: to a layman, two customers who both buy the same orange juice may look identical, and it takes a mathematician to determine whether one is a thirty-four-year-old woman who buys the juice for her children (and thus might like a coupon for a Thomas the Tank Engine DVD) and the other is a twenty-eight-year-old bachelor who drinks orange juice after jogging (and may therefore be interested in discounts on running shoes). Pole and the other fifty members of Target's Guest Data and Analytical Service were able to find the habits hidden in the data.
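
The linking step described in the quotation can be illustrated with a minimal sketch. It assumes that every purchase record already carries a guest identifier; the record layout, the identifiers and the items are hypothetical and are not taken from Target's systems.

```python
from collections import defaultdict

# Hypothetical purchase records: (guest_id, channel, item).
purchases = [
    ("G001", "store", "orange juice"),
    ("G001", "online", "children's DVD"),
    ("G002", "store", "orange juice"),
    ("G002", "online", "running shoes"),
]

def build_profiles(records):
    """Group every purchase under the guest ID it has been linked to."""
    profiles = defaultdict(list)
    for guest_id, channel, item in records:
        profiles[guest_id].append((channel, item))
    return dict(profiles)

profiles = build_profiles(purchases)
for guest_id, history in sorted(profiles.items()):
    print(guest_id, history)
```

Both guests buy the same orange juice, but once their other purchases converge on a single identifier the two histories no longer look alike.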
A society "digitized" by Big Data

Big Data is a term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. The MIT computer scientist Sam Madden (see bibliography) has defined it as "Data that is too big, too fast, or too hard for existing tools to process".

Viktor Mayer-Schönberger and Kenneth Cukier analyzed the advantages and disadvantages of Big Data in the book "Big Data".

The benefits credited to Big Data are mainly three (p. 33):

  1. Quantity: the ability to analyze huge amounts of data on a given subject, instead of having to settle for small subsets (statistical samples)

  2. Messiness: the willingness to accept the messiness of real data, rather than insisting on exactness

  3. Correlations: the ability to find correlations across the entire data set, rather than looking for causality in partial samples

Regarding the first and second points, a well-known example is online translation. At the end of the 1980s IBM took an innovative approach to machine translation based on statistics: the translation software used probability calculations to choose which word or phrase was the most suitable. IBM applied this process to ten years of transcripts of parliamentary debates in French and English but abandoned the field after a few years. In 2006 Google launched a completely different approach (based on Big Data), which used a much larger but messier data set: the entire Internet. Taking advantage of its access to documents on the web and to the translations of official documents into various languages, Google trained its software on billions of pages of translations of uneven quality. This approach proved successful: while giving up the accuracy of high-quality translations, it exploited the sheer number of phrases on the Internet to infer the best translation. The Google online translator that we use today is the result, and it performs well despite the mistakes it sometimes produces.
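
The statistical idea of choosing the most probable translation from a large, noisy collection of examples can be sketched in a few lines. This is a toy frequency model written for illustration only; it is not the system used by IBM or by Google, and the phrase pairs below are invented.

```python
from collections import Counter, defaultdict

# Hypothetical aligned phrase pairs (French, English), as they might be
# harvested from documents of uneven quality. One pair is deliberately wrong.
aligned_pairs = [
    ("bonjour", "hello"),
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("bonjour", "goodbye"),  # noisy, incorrect pair
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]

def train(pairs):
    """Count how often each target phrase appears for each source phrase."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return counts

def translate(counts, phrase):
    """Return the most frequent translation and its estimated probability."""
    candidates = counts.get(phrase)
    if not candidates:
        return None, 0.0
    best, freq = candidates.most_common(1)[0]
    return best, freq / sum(candidates.values())

model = train(aligned_pairs)
print(translate(model, "bonjour"))
print(translate(model, "merci"))
```

The wrong pair ("bonjour", "goodbye") does not change the outcome, because correct examples outnumber it. This is the sense in which quantity can compensate for messiness.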

Regarding the third point, the importance of correlations, an emblematic case is the retail chain Target (see bibliography), which monitors consumer purchases through its loyalty cards and can thereby predict changes in the consumer's family and "act" in advance with a "targeted" offer of discount coupons. A case that made headlines is that of pregnant women, as Viktor Mayer-Schönberger and Kenneth Cukier write (p. 83):

Whether a customer might be pregnant is important for retail chains, because pregnancy is a real watershed in the life of couples, as it marks an important change in their purchasing habits. The team reviewed the purchases made by the women who had signed up for the baby gift registry. It noted that those women were buying large quantities of unscented lotion around the third month of pregnancy, and a few weeks later tended to buy food supplements such as magnesium, sodium and zinc. Eventually the team identified a dozen products, used as representative indicators, that allowed the company to calculate a "pregnancy prediction" score for each customer who paid with a credit card, used a loyalty card or redeemed mailed coupons. The correlations have enabled Target to estimate with reasonable reliability even the due date, so that it can send customers discount coupons tailored to each stage of pregnancy.
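
A score of this kind can be pictured as a sum of points awarded for each indicator product found in a customer's purchase history. The sketch below is only an illustration of that idea: the point values and the threshold are hypothetical, and the text does not describe the actual model used by Target.

```python
# Hypothetical indicator products and points. The products are those named in
# the passage above; the point values are invented for illustration.
INDICATOR_POINTS = {
    "lotion": 40,
    "magnesium supplement": 30,
    "zinc supplement": 30,
}

# Hypothetical cut-off above which coupons would be sent.
THRESHOLD = 60

def pregnancy_score(purchases):
    """Add up the points of every indicator product present in the purchases."""
    bought = set(purchases)
    return sum(points for product, points in INDICATOR_POINTS.items() if product in bought)

def should_send_coupons(purchases):
    """Decide whether the score reaches the (hypothetical) threshold."""
    return pregnancy_score(purchases) >= THRESHOLD

history = ["lotion", "zinc supplement", "bread"]
print(pregnancy_score(history), should_send_coupons(history))
```

Nothing in such a score explains why the products go together; it only exploits the fact that they do, which is the shift from causality to correlation described in the third point.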

Among the disadvantages of big data cited by Viktor Mayer-Schönberger and Kenneth Cukier are three factors:

  1. Privacy: the Internet has made it easier to collect personal data, both for those who were already collecting them (government intelligence agencies) and for those who could expand their commercial activity by profiling their customers (private enterprises).

  2. Propensity: Big Data offers the possibility of using predictions about an individual to judge and punish that person before he or she acts; if this possibility is used there will be serious risks for democracy. As the authors write (p. 213): "The scene of the film "Minority Report" portrays a society in which the forecasts are so accurate as to allow the police to arrest individuals before they commit crimes. The people are not imprisoned for what they did but for what they are going to do, even if they have not yet committed any crime."

  3. Algorithm evaluation: because of the enormous amount of data available on every aspect of social and individual reality, the problem shifts to the design of the algorithms that extract forecasts, trends, tendencies, etc. from the data. Evaluating the goals of the algorithms will become crucial in determining their legality. The authors write (p. 205): "Despite its prowess in the collection and storage of information, there were many things that the Stasi (GDR) was not able to do. It could not keep track of everyone's movements and could not know, except through great effort, whom people talked to. But today, most of this information is collected by mobile network operators. The government of the GDR was not able to know how many people would become dissidents, but the police are beginning to employ algorithmic models to determine where and when to carry out patrols, which gives us an idea of the coming developments."
The Big Data that did not save the life of Steve Jobs
It is well known that the founder of Apple, Steve Jobs, died of pancreatic cancer but, thanks to his wealth, he tried to exploit Big Data applied to his own body (his entire DNA) to choose the most effective treatments.
Viktor Mayer-Schönberger and Kenneth Cukier write in the book "Big Data" (p.42):

In his personal fight against cancer, the legendary Apple CEO Steve Jobs took a completely different approach. He was one of the first people in the world to have his entire DNA and that of his tumor sequenced. For this purpose, he paid a six-figure sum - several hundred times the price charged by 23andMe. In return he received not a sample, a mere series of markers, but a file with the genetic codes in their entirety. In choosing the treatment to administer to a cancer patient, doctors must hope that the patient's DNA is sufficiently similar to that of the patients who took part in the drug trials. But the doctors who took care of Steve Jobs were able to select therapies based on how effective they would be given the specific composition of his genetic code. Every time a treatment lost effectiveness because the tumor had mutated and managed to evade it, they could switch to another drug - jumping, as he put it, "from one lily pad to another." "I will be one of the first to defeat a tumor of this type, or one of the last to die," he liked to repeat. Unfortunately, his prediction did not come true ...
[Image: Steve Jobs's DNA]
A taxonomy of Big Data

In defining "Big Data", Bob Hayes cites research by the University of California, Berkeley, which asked a large number of companies (operating in different fields) for their definition of the term and received about 40 different definitions.

Bob Hayes has divided and organized these definitions into six categories (indicating the number of occurrences in brackets):

  1. Characteristics of the data: Big Data is about the traditional 3 Vs (Volume, Velocity, Variety) of data (N = 19) and the non-routine computing resources needed to process those data (N = 11). Big Data is "data that contains enough observations to demand unusual handling because of its sheer size."

  2. Insights: Big Data is about the insights/results/value (N = 17) we get from data and the people necessary for extracting these insights (N = 3). Big Data “enchants us with the promise of new insights.”

  3. Analytics: Big Data is about analytics and modeling methods (N = 12) and their application in improving decision-making (N = 4). Big Data allows us the “opportunity to gain a more complex understanding of the relationships between different factors and to uncover previously undetected patterns in data.”

  4. Data Integration: Big Data is about the integration of various disparate data sources and harnessing their combined power (N = 6). What’s big in Big Data is “the big number of data sources we have, as digital sensors and behavior trackers migrate across the world.”

  5. Visualisation and Storytelling: Big Data is about being able to tell a story (N = 1) through visualization (N = 2). Big data is “storytelling – whether it is through information graphics or other visual aids that explain it in a way that allows others to understand across sectors.”

  6. Ethics: Big Data is about being concerned with how we use the vast quantities of data we have available today (N = 1). Big Data can provide us with “endless possibilities or cradle-to-grave shackles.”

The political use of Big Data

The 2012 US elections are credited (see Wadhwa in the bibliography) as the first in which big data was used to predict voters' intentions. As shown in the slide alongside, those intentions may be inferred from the electorate's trivial personal preferences (even from a favorite beverage). Micro-targeting techniques, once impossible to implement, now allow voters to be profiled according to the voting intentions revealed by their personal consumption, and customized messages to be sent to them online, as shown in the following slide of the entire process. The online and offline data of the individual voter converge in a central engine (CampaignGrid) that "decides" what kind of message is sent to each voter.

[Slide: Big Data Elections]
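
The process described above, in which online and offline data converge on one profile and an engine then selects a message, can be sketched as a simple rule lookup. Everything in this sketch is hypothetical: the field names, the rules and the messages are invented for illustration, and nothing here describes how CampaignGrid actually works.

```python
# Hypothetical rule: a consumer preference stands in for an inferred leaning.
LEANING_BY_BEVERAGE = {"Coke": "party A", "Pepsi": "party B"}

# Hypothetical messages tailored to each inferred leaning.
MESSAGE_BY_LEANING = {
    "party A": "message tailored to likely party A voters",
    "party B": "message tailored to likely party B voters",
}
DEFAULT_MESSAGE = "generic campaign message"

def merge_profile(offline_data, online_data):
    """Combine offline and online records about one voter into one profile."""
    return {**offline_data, **online_data}

def choose_message(profile):
    """Pick the message that matches the leaning inferred from the profile."""
    leaning = LEANING_BY_BEVERAGE.get(profile.get("favorite_beverage"))
    return MESSAGE_BY_LEANING.get(leaning, DEFAULT_MESSAGE)

voter = merge_profile({"favorite_beverage": "Coke"}, {"last_site_visited": "news portal"})
print(choose_message(voter))
```

A voter for whom no preference is known falls back to the generic message, which shows how much the targeting depends on the data collected about each individual.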

Computer scientists A. Oboler, C. Welsh and L. Cruz have analyzed Big Data in the light of the social changes it entails, highlighting the risks of aggregations and correlations. The philosopher Luciano Floridi (see bibliography) warns about the actors who govern the data and the correlations drawn from them, paraphrasing Orwell's 1984: "Who controls the questions shapes the answers, and who controls the answers gives form to reality." In fact, the media suggested that the US election would be won by those who had been able to use Big Data and Social Media. They failed.

[Slide: US elections 2012: voting preferences in terms of what you drink (Coke or Pepsi?)]
[Cartoon: "They want to make us change our habits." "With you, I think that will be hard."]


Bibliography (those who read well are less easily manipulated)

Recommended books for those who want to understand how companies use Big Data
Viktor Mayer-Schönberger and Kenneth Cukier, "Big Data"
How to become a "critical thinker"
It has been demonstrated over the last 30-40 years by several psychologists, including Amos Tversky, Daniel Kahneman, Gerd Gigerenzer and others, that humans believe they are rational but are not. When individuals have to make decisions under uncertainty, they more often rely on "intuitive thinking", using heuristics, i.e. mental shortcuts acquired in the course of evolution. In most everyday situations heuristic decisions turn out to be right, but in more complex situations, which have appeared only with modernity, heuristics lead to distortions of judgment (biases) that result in wrong decisions.
According to Daniel Kahneman (pp. 464-465 of the book "Thinking, Fast and Slow"), our intuitive thinking is not easily educated, and it hinders recognition of the environmental signals that in some cases are needed to switch to rational and critical thinking. An outside observer is less emotionally involved than the person who takes decisions and performs actions. We should therefore commit to building a "critical society", in which there are "critical observers" who know how to warn us of the dangers inherent in certain decision-making situations. This is a primary task of institutions, which need to invest in programs that train school educators in "critical thinking". At the individual level, here are some practical activities:

  1. Critical attitude: try to take a critical stance that opposes the innate human tendency to jump to conclusions and make impulsive decisions. To learn more, go to the page: critical attitude.
  2. Reading: various studies confirm that reading improves brain activity, counteracting cognitive deficits and brain aging. To learn more, go to the page: Reading and Brain. In addition, the critical reading of (non-narrative) texts further strengthens the brain.
  3. Language learning: recent studies have confirmed that learning languages other than one's own (even in old age) improves brain performance. To learn more, go to the page: Bilingualism and cognitive growth.


Page updated 24 November 2016

Copyright 2012-2017, Creative Commons License
The contents are distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 3.0 Italy License.