Big Data gives access to a huge amount of data about individual human beings and their relations with the environment. The actors (commercial, social and political) that have access to such data can correlate them and derive predictions of behaviour.
A developed culture is not only concerned with accumulating data; it also takes care of them, analyzes them, puts them in context and interprets them. Today, however, the speed of information erases and rewrites in a continuous loop. Unfortunately, the two professions at greatest risk of extinction today are librarians and journalists. History is likely to be reduced to a perpetual here and now, and we could end up imprisoned in a perpetual 'now'. (Luciano Floridi)
The analysis and evaluation of sensitive data are increasingly carried out by non-institutional entities with business interests and doubtful ethics. Big Data therefore raises a number of epistemological challenges regarding, for example, the fiduciary relationship between people and the ethics of academic research, given the possibility that the data are exploited commercially. The ethical issues raised by the propensities of individuals that might emerge from correlations have not yet been the subject of policy analysis, but they already interest philosophers.
"With the identification code we have your name, address and payment method; we know you have a Target Visa, a debit card, and we can connect all your purchases in the store," said Pole [Target's statistical expert] to an audience of retail statisticians at a conference in 2010. The company can link to a specific person about half of all in-store sales, almost all online sales and about a quarter of web browsing. At that conference Pole showed a slide describing an example of the data collected by Target. [see image] The problem with all these data, however, is that they have no meaning without the interpretation given by statisticians: to the eyes of a layman, two customers who both buy the same orange juice may look identical, and you need a mathematician to determine whether one is a thirty-four-year-old woman who buys the juice for her children (and thus might like a coupon for a Thomas the Tank Engine DVD) and the other is a twenty-eight-year-old bachelor who drinks orange juice after jogging (and who may therefore be interested in discounts on running shoes). Pole and the other fifty members of Target's Guest Data and Analytical Service were able to find the habits hidden in the data.
Big Data is a term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. The MIT computer scientist Sam Madden (see bibliography) has defined it as "Data that is too big, too fast, or too hard for existing tools to process".
Viktor Mayer-Schönberger and Kenneth Cukier analyzed the advantages and disadvantages of Big Data in their book "Big Data".
Whether a customer might be pregnant is important for retail chains, because pregnancy is a real watershed in the life of a couple: it marks an important change in their purchasing habits. The team reviewed the purchases made by women who had signed up for the baby gift registry. It noticed that those women bought large quantities of unscented lotion around the third month of pregnancy, and a few weeks later tended to buy food supplements such as magnesium, calcium and zinc. Eventually the team identified around two dozen products that, used as proxy indicators, allowed the company to calculate a "pregnancy prediction" score for each customer who paid with a credit card, used a loyalty card or redeemed mailed coupons. The correlations enabled Target to estimate even the due date with reasonable reliability, so that it could send customers discount coupons tailored to each stage of pregnancy.
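As an illustration of how a score of this kind could be computed, here is a minimal sketch. It is not Target's method, which the source does not describe: the product names, the weights, the baseline and the choice of a logistic function are all hypothetical assumptions made for the example.

```python
import math

# Hypothetical indicator products and weights (illustrative values only,
# not taken from Target or from the source text).
INDICATOR_WEIGHTS = {
    "lotion": 1.2,
    "magnesium supplement": 0.9,
    "zinc supplement": 0.8,
}

# Hypothetical baseline: without any indicator product the score stays low.
BASELINE = -3.0


def pregnancy_score(purchases):
    """Return a score between 0 and 1 from the set of products a customer bought.

    The weights of the indicator products found in the purchases are added
    to the baseline, and the sum is squashed into (0, 1) by a logistic function.
    """
    total = BASELINE + sum(
        weight for item, weight in INDICATOR_WEIGHTS.items() if item in purchases
    )
    return 1.0 / (1.0 + math.exp(-total))


if __name__ == "__main__":
    # A customer who bought two of the indicator products and one unrelated item.
    basket = {"lotion", "zinc supplement", "orange juice"}
    print(round(pregnancy_score(basket), 3))
```

In this sketch each indicator product found in a customer's purchases raises the score, while unrelated products leave it unchanged. A real system would presumably estimate the weights from historical data, such as the registry purchases mentioned above.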
In his personal fight against cancer, the legendary Apple CEO Steve Jobs took a completely different approach. He was one of the first people in the world to have his entire DNA sequenced, together with that of his tumor. For this purpose he paid a six-figure sum, several hundred times the price charged by 23andMe. In exchange he received not a sample, a mere series of markers, but a file with the genetic codes in their entirety. When choosing a treatment for a cancer patient, doctors must hope that the patient's DNA is sufficiently similar to that of the patients who took part in the drug trials. The doctors who took care of Steve Jobs, however, were able to select therapies on the basis of how effective they would be given the specific composition of his genetic code. Every time a treatment lost its effectiveness because the tumor had mutated and managed to work around it, they could switch to another drug, jumping, as he put it, "from one lily pad to another." "I will be one of the first to defeat a tumor of this type, or one of the last to die from it", he liked to repeat. Unfortunately, his prediction did not come true ...
In defining "Big Data", Bob Hayes cites research by the University of California, Berkeley, which asked a large number of companies (operating in different fields) for their definition of the term and received about 40 different definitions.
The 2012 US elections are credited (see Wadhwa in the bibliography) as the first in which big data were used to predict the voting intentions of voters. As shown in the side slide, those intentions may be inferred from trivial personal preferences of the electorate (even from a favorite beverage). Micro-targeting techniques, once impossible to implement, now make it possible to profile voters according to the voting intentions revealed by their personal consumption and to send them customized messages online, as shown in the following slide of the entire process. The online and offline data of the individual voter converge in a central engine (CampaignGrid) that "decides" what kind of message to send to each voter.
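The flow described above (offline and online data merged into one profile, then a central engine choosing a message) can be sketched as follows. This is only an illustrative toy, not CampaignGrid's actual logic, which the source does not document: the attribute names, the rules and the messages are all hypothetical.

```python
def merge_profile(offline, online):
    """Combine the offline and online records of one voter into a single profile."""
    profile = dict(offline)
    profile.update(online)  # online attributes override offline ones on conflict
    return profile


# Hypothetical rules as (attribute, value, message) triples; first match wins.
RULES = [
    ("favorite_beverage", "coffee", "message A"),
    ("favorite_beverage", "tea", "message B"),
]
DEFAULT_MESSAGE = "general message"


def choose_message(profile):
    """Return the message of the first rule matching the profile, else the default."""
    for attribute, value, message in RULES:
        if profile.get(attribute) == value:
            return message
    return DEFAULT_MESSAGE


if __name__ == "__main__":
    voter = merge_profile({"age": 40}, {"favorite_beverage": "tea"})
    print(choose_message(voter))
```

A production system would presumably replace the hand-written rules with statistical models, but the structure (data converging on one profile, one decision point per voter) is the same as in the description.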
Computer scientists A. Oboler, C. Welsh and L. Cruz have analyzed Big Data in the light of the social changes it entails, highlighting the risks of aggregations and correlations. The philosopher Luciano Floridi (see bibliography) warns about the ethics of those who govern the data and the correlations drawn from them, paraphrasing Orwell's 1984: "Who controls the questions shapes the answers, and who controls the answers gives form to reality." In fact, the media suggested that the US election would be won by those who had been able to use Big Data and Social Media. They failed.
Sam Madden (2012), From Databases to Big Data (PDF)
Viktor Mayer-Schönberger, Kenneth Cukier (2014), Big Data: A Revolution That Will Transform How We Live, Work, and Think (PDF)
Bob Hayes (2014), Six Ways to Define Big Data
Page updated 24 November 2016