Edge Formation and its Influence on Machine Learning
Social networks are ubiquitous structures that we generate and enrich every day as we connect with people through social media platforms, emails, and other types of interaction. While these structures are intangible to us, they are very important carriers of information. For instance, the political leaning of our friends can serve as a proxy for our own political preferences. This explanatory power is being leveraged in public policy, business decision-making, and scientific research because it helps machine learning techniques make accurate predictions. However, these generalizations often benefit the majority of people, who shape the general structure of the network, and put underrepresented groups at a disadvantage; as a consequence, these groups are subject to unfair treatment such as segregation and discrimination. It is therefore crucial first to understand how social networks form, and then to verify to what extent our connections reinforce social inequalities through a feedback loop in machine learning.
To this end, in the first part of this thesis, I propose HopRank and Janus, two methods to characterize the mechanisms of edge formation in any given real-world network. HopRank is a biased random walker built on a model of information foraging on networks that uses transition probabilities between k-hop neighborhoods. Janus is a Bayesian framework that allows us to identify and rank plausible hypotheses of edge formation in cases where nodes carry additional metadata. In the second part of this thesis, I investigate the implications of these mechanisms for machine learning. Specifically, I study the influence of homophily, preferential attachment, edge density, the fraction of minorities, and the directionality of links on both the performance and fairness of collective classification, and on the visibility of minorities in top-k ranks. My findings demonstrate a strong correlation between network structure and machine learning outcomes. This suggests that algorithmic bias on networks can be (i) anticipated from the type of network, and (ii) mitigated by connecting strategically with certain people.
In this talk, I will focus on my most recent work, in which I study the influence of edge formation on ranking algorithms.
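To make the notion of minority visibility in top-k ranks concrete, the following is a minimal sketch and not the model or code used in the thesis. It assumes a simplified growth process in which each new node links to existing nodes with probability proportional to (degree + 1) times a homophily weight, and it ranks nodes by degree as a stand-in for a ranking algorithm. The names `grow_network` and `minority_share_top_k` and all parameter values are hypothetical.

```python
import random


def grow_network(n, m, minority_fraction, homophily, seed=0):
    """Grow an undirected network of n nodes (illustrative assumption only).

    Each new node links to m distinct existing nodes. A target v is drawn with
    weight (degree[v] + 1) * h, where h = homophily if v belongs to the same
    group as the new node and 1 - homophily otherwise.
    """
    if not 0.0 < homophily < 1.0:
        raise ValueError("homophily must lie strictly between 0 and 1")
    rng = random.Random(seed)
    # Group 1 is the minority, group 0 the majority.
    groups = [1 if rng.random() < minority_fraction else 0 for _ in range(n)]
    degree = [0] * n
    for new in range(m, n):  # the first m nodes act as unconnected seeds
        candidates = list(range(new))
        weights = [
            (degree[v] + 1)
            * (homophily if groups[v] == groups[new] else 1.0 - homophily)
            for v in candidates
        ]
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choices(candidates, weights=weights)[0])
        for t in targets:
            degree[t] += 1
            degree[new] += 1
    return groups, degree


def minority_share_top_k(groups, scores, k):
    """Fraction of minority nodes among the k highest-scoring nodes."""
    ranked = sorted(range(len(scores)), key=lambda v: scores[v], reverse=True)
    top = ranked[:k]
    return sum(groups[v] for v in top) / len(top)


if __name__ == "__main__":
    # Compare minority visibility in the top 20 across homophily levels.
    for h in (0.2, 0.5, 0.8):
        groups, degree = grow_network(
            n=300, m=2, minority_fraction=0.2, homophily=h, seed=1
        )
        share = minority_share_top_k(groups, degree, k=20)
        print(f"homophily={h:.1f}  minority share in top-20={share:.2f}")
```

Comparing the printed share with the overall minority fraction (0.2 in this sketch) indicates whether the minority is over- or under-represented at the top of the ranking for a given homophily level.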
19.11.20 - 10:15
via Big Blue Button