In a bandit problem, agents that are initially unaware of the stochastic evolution of the environment (the arms) aim to maximize a common objective based on the history of actions and observations. The classical difficulty in a bandit problem is the exploration-exploitation dilemma, which necessitates careful algorithm design to balance gathering new information against exploiting the best available information to achieve optimal performance. The motivation to study bandit problems comes from their diverse applications, including cognitive radio networks, opportunistic spectrum access, network routing, web advertising, and many others. In this talk we provide an agent-centric approach to designing online learning algorithms for bandit problems that accounts for communication, computation, and switching costs.
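As a concrete illustration of the exploration-exploitation dilemma, the sketch below implements the classical UCB1 index policy; this is a standard textbook baseline, not the agent-centric algorithms presented in the talk, and the Bernoulli arm means and horizon are illustrative assumptions.

```python
# A minimal sketch of the exploration-exploitation trade-off via the
# classic UCB1 index policy. Arm means and horizon are illustrative
# assumptions, not taken from the talk.
import math
import random

def ucb1(arm_means, horizon):
    """Pull each arm once, then always pull the arm with the highest UCB index."""
    n_arms = len(arm_means)
    counts = [0] * n_arms        # times each arm has been pulled
    totals = [0.0] * n_arms      # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1          # exploration: try every arm once
        else:
            # index = empirical mean + confidence bonus; the bonus shrinks as
            # an arm is sampled more, gradually shifting play toward exploitation
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                              + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if random.random() < arm_means[arm] else 0.0  # Bernoulli arm
        counts[arm] += 1
        totals[arm] += reward
    return totals, counts

totals, counts = ucb1(arm_means=[0.3, 0.5, 0.7], horizon=10_000)
print(counts)  # the 0.7 arm should dominate the pull counts
```

The confidence bonus is large for rarely pulled arms (encouraging exploration) and decays as evidence accumulates, so play concentrates on the empirically best arm; cost-aware settings such as those in the talk require additionally weighing communication, computation, and switching costs against this information-gathering benefit.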