Emerging multi-hop wireless networks provide a low-cost and flexible infrastructure that can be simultaneously utilized by multiple users for a variety of applications, including delay-sensitive multimedia applications. However, this wireless infrastructure is often unreliable and provides dynamically varying resources with only limited QoS support for multimedia applications. Due to the informationally decentralized nature of multi-user multimedia transmission over multi-hop networks, a centralized solution is not practical. Hence, we propose a distributed packet-based cross-layer approach that maximizes the decoded multimedia quality of multiple users engaged in simultaneous real-time streaming sessions over the same multi-hop wireless network. Our distributed approach explicitly models the interactions among the various network agents (source and relay nodes) in the wireless network by considering the packet-based distortion impact and delay constraints of the applications, and it optimizes the transmission strategies across the protocol layers as well as across the multi-hop network. The proposed solution is enabled by the scalable coding of the multimedia content, which allows priority-based adaptation to varying channel conditions and available resources. The cross-layer strategy (the choice of relay, the MAC retransmission strategy, and the PHY modulation and coding scheme) is optimized per packet, at each node, in a fully distributed manner. The main component of the proposed solution is a low-complexity, distributed, and dynamic routing algorithm, referred to as the self-learning policy, which relies on prioritized queuing to select the path and time reservation for the various packets, while explicitly considering instantaneous channel conditions, queuing delays, and the resulting interference.
The results show that, compared to existing state-of-the-art solutions, our distributed method significantly improves the agents' utilities under realistic dynamic network conditions and network topologies.