Detecting home countries of social media users with machine-learned ranking approach: a case study in Hong Kong

Abstract

Inferring an individual’s home country from geotagged footprints is a common task in human mobility research. Previous studies mainly used simple empirical methods based on intuitive assumptions. Because the exact relationship between users’ home countries and their geotagged footprints has not been quantitatively established, these methods rely on human intuition and past experience and yield only rough approximations. In this study, we propose a machine-learning approach to home country detection that formulates the task as a query-ranking problem and solves it with a machine-learned ranking model. The model is a Multiple Additive Regression Trees (MART) framework that ranks candidate regions, and the region ranked first is designated as the user’s home country. Our approach is data-driven and adaptively learns the unknown function from input (geotagged footprints) to output (the user’s home country), thus alleviating the bias introduced by previous empirical methods. We conduct experiments on real-world datasets, and the results demonstrate that our approach outperforms previous empirical methods. We also investigate the model’s parameter sensitivity; the results show that a user’s origin may affect the approach’s performance and that the approach performs robustly well across various parameter settings.
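
As a rough illustration of the query-ranking formulation described in the abstract, the following minimal Python sketch treats each user as a query and each visited region as a candidate to be ranked, with the top-ranked region taken as the home country. Everything here is an assumption made for illustration: the input format, the two features (share of posts and share of active days), and the names `region_features`, `rank_regions`, `detect_home`, and `toy_score` do not come from the paper. In particular, `toy_score` is a hand-set linear scorer standing in for the trained MART model, which is not reproduced here.

```python
from collections import defaultdict


def region_features(footprints):
    """Aggregate one user's footprints into per-region features.

    footprints: list of (region, day) tuples (hypothetical input format).
    Returns {region: (share_of_posts, share_of_active_days)}.
    """
    posts = defaultdict(int)
    days = defaultdict(set)
    for region, day in footprints:
        posts[region] += 1
        days[region].add(day)
    total_posts = sum(posts.values())
    total_days = sum(len(d) for d in days.values())
    return {
        r: (posts[r] / total_posts, len(days[r]) / total_days)
        for r in posts
    }


def rank_regions(footprints, score):
    """Rank the user's candidate regions by descending score."""
    feats = region_features(footprints)
    return sorted(feats, key=lambda r: score(feats[r]), reverse=True)


def detect_home(footprints, score):
    """Return the top-ranked region, or None if there are no footprints."""
    ranking = rank_regions(footprints, score)
    return ranking[0] if ranking else None


def toy_score(features):
    """Placeholder scorer (assumption): a learned ranker would replace this."""
    post_share, day_share = features
    return 0.3 * post_share + 0.7 * day_share


# A visitor who posts heavily during a short stay in Hong Kong ("HK")
# but posts on more distinct days in another region ("UK").
footprints = [
    ("HK", "d1"), ("HK", "d1"), ("HK", "d1"), ("HK", "d2"), ("HK", "d2"),
    ("UK", "d3"), ("UK", "d4"), ("UK", "d5"),
]
print(rank_regions(footprints, toy_score))
print(detect_home(footprints, toy_score))
```

In this toy case a simple most-posts rule would select "HK", whereas the scorer, which also weighs active days, ranks "UK" first. In a learned setting the scoring function would be fitted from users with known home countries rather than set by hand.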

Publication
Applied Geography, 134 (2021): 102532
Zhewei Liu

My research interests include spatial big data analytics, volunteered geographic information, and human mobility.