According to a new study published in Science, 90% of individuals can be reidentified using information about the time and place of their financial transactions. This finding reveals how fragile the anonymity of bank users can be.
“From a policy perspective, our findings highlight the need to reform our data protection mechanisms beyond PII and anonymity and toward a more quantitative assessment of the likelihood of reidentification. Finding the right balance between privacy and utility is absolutely crucial to realizing the great potential of metadata,” the authors of the study say.
Researchers at universities and private companies increasingly use big datasets to tackle a wide range of questions. The availability of detailed and relatively cheap information is considered one of the major innovations of our time, one that will help to achieve exciting scientific breakthroughs. This information is produced by new technologies, which constantly generate traces of our behaviour.
On the one hand, public availability of this information is necessary for scientific progress. “In science, it is essential for the data to be available and shareable. Sharing data allows scientists to build on previous work, replicate results, or propose alternative hypotheses and models,” the researchers say. On the other hand, public availability can violate anonymity if individuals can be identified from the data.
To evaluate what proportion of credit card users can be recognized from sparse information, Yves-Alexandre de Montjoye and his colleagues at the Massachusetts Institute of Technology analyzed the credit card records of over one million people. “Let’s say that we are searching for Scott in a simply anonymized credit card dataset. We know two points about Scott: he went to the bakery on 23 September and to the restaurant on 24 September.
Searching through the data set reveals that there is one and only one person in the entire data set who went to these two places on these two days,” they explain. Statistical analysis showed that nine out of ten users can be recognized from only the time and place of just four of their transactions. In addition, knowing one more piece of information, the price of a transaction, increased the likelihood of reidentification by 22% on average.
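The "searching for Scott" idea can be illustrated with a minimal Python sketch. This is not the authors' code: the toy records, the user labels, and the function names (`build_traces`, `matching_users`, `unicity`) are all hypothetical, and the real study worked with far larger data. The sketch only shows the basic logic: collect each user's (shop, day) points, check how many users are consistent with a few known points, and measure the share of users who are the only match.

```python
import random
from collections import defaultdict

# Hypothetical toy data: (user_id, shop, day) rows of a "simply anonymized"
# dataset, i.e. names removed but each user's transactions still linked.
RECORDS = [
    ("u1", "bakery", "09-23"),
    ("u1", "restaurant", "09-24"),
    ("u1", "cinema", "09-25"),
    ("u2", "bakery", "09-23"),
    ("u2", "grocery", "09-24"),
    ("u2", "cinema", "09-25"),
    ("u3", "bakery", "09-23"),
    ("u3", "grocery", "09-24"),
    ("u3", "cinema", "09-25"),
]


def build_traces(records):
    """Group the records into one set of (shop, day) points per user."""
    traces = defaultdict(set)
    for user, shop, day in records:
        traces[user].add((shop, day))
    return dict(traces)


def matching_users(traces, known_points):
    """Return the sorted list of users whose trace contains every known point."""
    known = set(known_points)
    return sorted(user for user, points in traces.items() if known <= points)


def unicity(traces, n_points, seed=0):
    """Share of users singled out by n_points drawn at random from their own trace."""
    rng = random.Random(seed)
    unique = 0
    eligible = 0
    for user, points in traces.items():
        if len(points) < n_points:
            continue  # not enough points to draw from
        eligible += 1
        sample = rng.sample(sorted(points), n_points)
        if matching_users(traces, sample) == [user]:
            unique += 1
    return unique / eligible if eligible else 0.0


traces = build_traces(RECORDS)
# The two facts known about "Scott" in the example above.
scott = [("bakery", "09-23"), ("restaurant", "09-24")]
print(matching_users(traces, scott))
print(unicity(traces, n_points=3))
```

In this toy data only `u1` matches both known points, while a single bakery visit matches all three users. Adding another attribute to each point, such as the price, would make the points more specific and can only shrink the set of matching users, which is the intuition behind the increased reidentification risk reported in the study.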
Article: de Montjoye Y.-A., Radaelli L., Singh V. K., Pentland A. S., 2015, Unique in the shopping mall: On the reidentifiability of credit card metadata, Science, Vol. 347, no. 6221, pp. 536-539, DOI: 10.1126/science.1256297.