Essays on social dynamics and application of machine learning
Abstract
This dissertation consists of two essays on social interactions and one essay on an application of machine learning. The first chapter develops a game-theoretic model of favor exchange in which an agent can request indirect favors through a chain of contacts in a network. I study the cooperative behavior fostered by potential collective sanctions, provide a full characterization of “renegotiation-proof” networks, and propose a robustness refinement. When the maximum length of contact chains is greater than 3, only star-shaped (i.e., highly centralized) networks achieve the highest robustness. Using network data from rural Indian villages, I provide empirical evidence of higher centrality in social networks.
The second chapter studies whether there could be an evolutionary basis for discrimination against traits that are irrelevant to payoffs. It shows that discriminatory behaviors can arise from non-discriminatory preferences. In each period, the population is updated by a deterministic payoff-based update rule and is subject to stochastic mutations. Without traits, agents coordinate on the risk-dominant action in the long-run equilibrium. When traits are present and can change through population updates and mutation, the long-run equilibria become the set of Pareto-efficient equilibria, among which a trait can be eliminated from the population and agents choose different actions based on the opponent's trait. In an alternative setting where traits cannot mutate or be adjusted and another, inferior location is introduced, there exists an update rule that leads to the Pareto-inefficient outcome in which both locations are populated.
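The contrast between risk dominance and Pareto efficiency that drives this result can be illustrated with a minimal sketch. It assumes a symmetric 2x2 coordination game, which the abstract does not specify; the payoff labels `a`, `b`, `c`, `d`, the function names, and the stag-hunt numbers are all hypothetical and chosen only for illustration.

```python
def risk_dominant_action(a, b, c, d):
    """Return the risk-dominant action in a symmetric 2x2 coordination game.

    Assumed payoff labels for the row player (not from the dissertation):
    a = u(A, A), b = u(A, B), c = u(B, A), d = u(B, B).
    """
    if not (a > c and d > b):
        raise ValueError("not a coordination game: need a > c and d > b")
    loss_from_leaving_a = a - c  # cost of deviating from (A, A)
    loss_from_leaving_b = d - b  # cost of deviating from (B, B)
    if loss_from_leaving_a > loss_from_leaving_b:
        return "A"
    if loss_from_leaving_b > loss_from_leaving_a:
        return "B"
    return None  # neither equilibrium risk-dominates the other


def pareto_efficient_action(a, d):
    """Return the action whose symmetric equilibrium pays more to both players."""
    if a > d:
        return "A"
    if d > a:
        return "B"
    return None


# Hypothetical stag-hunt payoffs: A is "stag", B is "hare".
stag_hunt = {"a": 4, "b": 0, "c": 3, "d": 3}
risk = risk_dominant_action(**stag_hunt)
efficient = pareto_efficient_action(stag_hunt["a"], stag_hunt["d"])
print(risk, efficient)
```

In this example the two criteria select different equilibria, which is the kind of game in which it matters whether long-run play settles on the risk-dominant or the Pareto-efficient outcome.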
The third chapter focuses on techniques for predicting health outcomes at the census tract level and uses machine learning methods to reduce the overfitting that traditional methods suffer from because of the large number of variables used. The study demonstrates how extensive data on sociodemographic characteristics can improve health outcome predictions relative to the previous literature, which mostly relies on health-related variables. Using survey data from 2010 to 2015, I compare various regularization methods, tuned by cross-validation, on out-of-sample metrics and obtain high-quality estimators of the regional prevalence of 12 chronic diseases.
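As a rough illustration of tuning a regularization method by cross-validation, the sketch below fits ridge regression over a grid of penalties and picks the one with the lowest k-fold out-of-sample mean squared error. Ridge is an assumed example, since the abstract does not name the methods compared; the data are synthetic, and every name here (`ridge_fit`, `cv_mse`, the penalty grid) is hypothetical rather than taken from the dissertation.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge coefficients (no intercept): solve (X'X + lam * I) beta = X'y."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(A, b)


def mse(X, y, beta):
    """Mean squared prediction error of beta on (X, y)."""
    errors = [yi - sum(xj * bj for xj, bj in zip(row, beta))
              for row, yi in zip(X, y)]
    return sum(e * e for e in errors) / len(errors)


def cv_mse(X, y, lam, k=5):
    """Average held-out MSE over k interleaved folds."""
    n = len(y)
    total = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        X_te = [X[i] for i in sorted(held_out)]
        y_te = [y[i] for i in sorted(held_out)]
        total += mse(X_te, y_te, ridge_fit(X_tr, y_tr, lam))
    return total / k


# Synthetic data (an assumption): many predictors, few of which matter.
rng = random.Random(0)
n, p = 60, 20
true_beta = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0, 1) for row in X]

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical penalty grid
cv_scores = {lam: cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_scores, key=cv_scores.get)
print("selected penalty:", best_lam)
```

The same loop could compare other penalized estimators by swapping out `ridge_fit`, with the held-out error serving as the common out-of-sample metric.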