We consider online learning and its relationship to game theory. In an online decision-making problem, as in Singer's lecture, one typically makes a sequence of decisions and receives feedback immediately after each one. As far back as the 1950s, game theorists gave algorithms for these problems with strong regret guarantees: without making statistical assumptions, these algorithms are guaranteed to perform nearly as well as the best single decision, where the best is chosen with the benefit of hindsight. We discuss applications of these algorithms to complex learning problems where one receives very little feedback. Examples include online routing, online portfolio selection, online advertising, and online data structures. We also discuss applications to learning Nash equilibria in zero-sum games and learning correlated equilibria in general two-player games.
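To make the regret guarantee concrete, here is a minimal sketch of one standard algorithm of this kind, the multiplicative-weights (Hedge) rule. The abstract does not say which algorithms the lecture covers, so this is only an illustration. The function name `hedge`, the learning rate `eta`, and the assumption that every loss lies in [0, 1] and is revealed for all actions after each round are choices made here, not taken from the lecture.

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run the Hedge (multiplicative-weights) rule on a sequence of loss vectors.

    loss_rounds[t][i] is the loss of action i at round t, assumed to lie in [0, 1].
    Returns (expected cumulative loss of the learner,
             cumulative loss of the best single action in hindsight).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n        # one weight per action, initially uniform
    cumulative = [0.0] * n     # running loss of each fixed action
    learner_loss = 0.0

    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]   # play action i with probability probs[i]
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            cumulative[i] += l
            weights[i] *= math.exp(-eta * l)   # shrink the weight of actions that did badly

    return learner_loss, min(cumulative)


if __name__ == "__main__":
    rng = random.Random(0)     # arbitrary losses; no statistical model is needed
    T, n = 1000, 5
    rounds = [[rng.random() for _ in range(n)] for _ in range(T)]
    eta = math.sqrt(8 * math.log(n) / T)
    learner, best = hedge(rounds, eta)
    print("regret:", learner - best)
    print("bound: ", math.sqrt(T * math.log(n) / 2))
```

With `eta = sqrt(8 ln(n) / T)`, the standard analysis bounds the learner's expected regret by `sqrt(T ln(n) / 2)` for any sequence of losses in [0, 1], so the average regret per round vanishes as T grows. This version assumes full feedback; the low-feedback settings mentioned above need modified algorithms.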
Attribution: The Open Education Consortium
http://www.ocwconsortium.org/courses/view/071dcfc7799e0b4afa6cc5c02db49627/
Course Home: http://videolectures.net/mlss05us_kalai_olgt/