Social media fingerprints of unemployment

Alejandro Llorente, Manuel García-Herránz, Manuel Cebrián and Esteban Moro (2014), PLoS ONE 10(5): e0128692

Summary

Publicly available social media data can be used to quantify deviations from typical patterns of behavior and uncover how these deviations signal the socio-economical status of regions. Using data from geolocalized Twitter messages, we find that unemployment is correlated with technology adoption, daily activity, diversity in mobility patterns, and correctness in communication style. These behavioral metrics serve to build simple, interpretable, and cost-effective socio-economical predictors from these novel digital datasets. Our extensive investigation allows us not only to build accurate behavioral models of how unemployment impacts diverse geographical areas, but also to assess the relevance and uniqueness of previously reported social media datasets for understanding economic development.

Recent wide-spread adoption of electronic and pervasive technologies has enabled the study of human behavior at an unprecedented level, uncovering universal patterns underlying human activity, mobility, and inter-personal communication. In the present work, we investigate whether deviations from these universal patterns may reveal information about the socio-economical status of geographical regions. We quantify the extent to which deviations in diurnal rhythm, mobility patterns, and communication styles across regions relate to their unemployment incidence. For this we examine a country-scale publicly articulated social media dataset, where we quantify individual behavioral features from over 145 million geo-located messages distributed among more than 340 different Spanish economic regions, inferred by computing communities of cohesive mobility fluxes. We find that regions exhibiting more diverse mobility fluxes, earlier diurnal rhythms, and more correct grammatical styles display lower unemployment rates. As a result, we provide a simple model able to produce accurate, easily interpretable reconstruction of regional unemployment incidence from their social-media digital fingerprints alone. Our results show that cost-effective economical indicators can be built based on publicly-available social media datasets.
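
As an illustration of how "diversity in mobility fluxes" might be quantified, the sketch below computes a normalized Shannon entropy over the destinations of trips leaving a region. The choice of entropy as the metric, the function name `mobility_diversity`, and the toy trip lists are assumptions made for this example; they are not details taken from the summary above.

```python
import math
from collections import Counter


def mobility_diversity(destinations):
    """Normalized Shannon entropy of trip destinations for one region.

    Returns a value in [0, 1]: 0 when every trip goes to the same place,
    1 when trips are spread evenly over all observed destinations.
    This is an illustrative metric, not necessarily the one used in the paper.
    """
    counts = Counter(destinations)
    if len(counts) < 2:
        # Zero or one distinct destination: no diversity to measure.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum possible entropy for this many destinations.
    return entropy / math.log(len(counts))


# Hypothetical toy data: destinations of trips leaving two made-up regions.
concentrated = ["A"] * 9 + ["B"]
spread_out = ["A", "B", "C", "D", "E"] * 2

print(mobility_diversity(concentrated))
print(mobility_diversity(spread_out))
```

In this toy setting the region whose trips are spread evenly across destinations scores higher than the one whose trips mostly go to a single place. A score like this could then be compared against regional unemployment rates.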


  • See the video of the thousands of trips used to characterize mobility between municipalities in Spain

3 Responses

  1. Do you think you would be able to predict changes in unemployment rates before official data is published by tracing Twitter activity? Thanks. A fascinating paper.

  2. admin says:

    Thanks for your comment, Gonzalo. Our results show that there is a lot of information in Twitter activity regarding unemployment. It might be that we can construct a model, but most probably it will take us a long time (and journey) to be able to predict something. Having said that, we are taking this journey 😉

  3. Keep us posted! Thanks.
