I have read around 200 papers this year. A large fraction of them were very technical, some were reviews, and others were simply very fashionable. Among them, I would like to highlight the ones that I consider the best. This is a personal selection, based less on technical merit than on impact. Specifically, these two papers change the way we look at privacy and show how our actions reveal important information about us that was not obvious in the first place. These are the two papers I consider to be the best of 2013:
- Private traits and attributes are predictable from digital records of human behavior by Michal Kosinski, David Stillwell, and Thore Graepel, published in PNAS
- Unique in the crowd: the privacy bounds of human mobility by Yves-Alexandre de Montjoye, César A. Hidalgo, Michel Verleysen & Vincent D. Blondel, published in Scientific Reports
The first paper shows how Facebook “likes” can reveal important information about us, such as where we live, our sexual orientation, ethnicity, religious or political views, intelligence, and happiness. More worrisome still is the potential to predict use of addictive substances, parental separation, personality traits, etc. This has important implications for online personalization and privacy. Not only can commercial companies access information that individuals may not have intended to share, but one can also imagine situations in which such predictions pose a threat to an individual’s freedom. You can check what your likes and friends reveal about you in the web app that the authors built as a demonstration: http://youarewhatyoulike.com
The second paper addresses an important question for big data applications and for scientific research: how much data is needed to identify a particular individual? That is, how much anonymity is there in the data we leave behind in our everyday lives? Most people (including me) would think that a large volume of anonymous data is needed to identify us, and thus that our privacy is safe as long as we do not reveal much information about ourselves. But the researchers found that just 4 geolocalized phone calls are enough to uniquely identify 95% of the people in their dataset. Just 4 calls! The reason is that our mobility is highly predictable, so just 4 points in the dataset are enough to unveil that personal mobility pattern. Given the amount of geolocalized data that can be accessed from social networks, mobile phone records, etc., these results add to the growing concern that very little information is needed to identify a targeted individual, even in a completely anonymous dataset.
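To make the "just 4 points" idea concrete, here is a minimal sketch of how one could estimate this kind of uniqueness on a toy dataset. It is not the authors' code: the function name `unicity`, the `(antenna, hour)` point format, and the toy traces are all assumptions made for illustration. The idea is to pick a random person, reveal `p` random points of their trace, and count how many traces in the dataset contain all of those points. If only one does, that person is identified.

```python
import random


def unicity(traces, p, trials=200, seed=0):
    """Estimate the fraction of users uniquely identified by p known points.

    traces: dict mapping a user id to a set of (antenna, hour) tuples.
    This is an illustrative sketch, not the procedure used in the paper.
    """
    rng = random.Random(seed)
    # Only users with at least p points can have p points revealed.
    users = sorted(u for u, pts in traces.items() if len(pts) >= p)
    if not users:
        return 0.0
    unique = 0
    for _ in range(trials):
        user = rng.choice(users)
        # Reveal p random points from this user's trace.
        known = set(rng.sample(sorted(traces[user]), p))
        # Count every trace (including the user's own) containing all of them.
        matches = sum(1 for pts in traces.values() if known <= pts)
        if matches == 1:
            unique += 1
    return unique / trials


# Hypothetical toy data: three people who share some antennas and hours.
toy_traces = {
    "a": {(1, 8), (2, 9), (3, 18)},
    "b": {(1, 8), (2, 9), (4, 18)},
    "c": {(1, 8), (5, 9), (3, 18)},
}

for p in (1, 2, 3):
    print(p, unicity(toy_traces, p))
```

Even in this tiny example, knowing more points can only narrow down the set of matching people, which is the intuition behind the paper's result on real mobility data.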