Learning in Blotto Games and applications to modeling attention in social network
Résumé rédigé par
Directeur de thèse:
Unité de recherche UMR 7102 Laboratoire de recherche d'EURECOM
The Colonel Blotto game is a fundamental model of strategic resource allocation: two players allocate a fixed amount of resources to a fixed number of battlefields with given values, each battlefield is then won by the player who allocated more resources to it, and each player maximizes the aggregate value of battlefields he wins. It recently gained a very high interest in theoretical and applied research communities because of its potential to model many important problems of resource allocation in strategic settings ranging from international war to computer security. In particular, it provides a good model of competition for attention of users in social networks. There, battlefields correspond to users and their values correspond to the value of convincing the users (e.g.,in advertisement, it would be the value of the product bought by the user). Theoretical solutions of the Colonel Blotto game could therefore enable important progresses in designing strategies to allocate resources optimally to capture the attention of users in a social network, a topic of high importance in the online world with applications for instance to advertisement campaigns or information propagation.
Applications of the Colonel Blotto game, however, have remained limited so far mostly due to the lack of solutions of the game in realistic cases. Indeed, although it was originally proposed by Borel in 1921, the first Nash equilibrium solution of the game was given in 1950 only in a simple case (2 or 3 battlefields). In 2006, a Nash equilibrium solution was given for an arbitrary number of battlefields, but only if all battlefields have the same value, which is not realistic in applications.
Another barrier for applications is that the Nash equilibrium solution assumes complete information on the players payoffs, which is not always appropriate. The machine learning community has been very active in recent year to develop sequential learning methods in order to adjust the strategies while learning the unknown payoff parameters, in particular in the classical setting of the multi-armed bandit problem. These methods, however, are not adapted in a competitive environment such as the one modeled by the Colonel Blotto game and developing learning algorithms in fully game-theoretic settings is currently an open problem.
The overall goal of this thesis will be to develop solutions of the Blotto games in order to use it to model competition for attention of users in social networks. Specifically, we will look to address the two key barriers mentioned above, that is:
(i) First, we will look for a general Nash equilibrium solution. In particular, we will include arbitrary battlefield values, more than two players (in order to be able to model more than two competitors) and to take into account externalities on a graph (i.e., the fact that winning a battlefield has an effect on the value of neighboring battlefields in the social network). We will leverage preliminary ideas mentioned above and propose and analyze heuristics to compute approximate Nash equilibria in cases where the exact solution is not possible.
(ii) Second, we will develop sequential learning methods adapted to the game-theoretic setting where several competitors are concurrently performing a learning task whose outcome depends on each other; and apply those to design strategies for the competitors to dynamically adjust their resource allocation while learning the users value. To this end, we will combine ideas from the multi-armed bandit literature with game theoretic ideas from repeated games.