The increasing complexity of configuring cellular networks suggests that machine learning (ML) can effectively improve 5G technologies. Deep learning has proven successful in ML tasks such as speech processing and computer vision, with performance that scales with the amount of available data. However, the lack of large datasets hinders the growth of deep learning applications in wireless communications and, in particular, in 5G. This paper presents a methodology that combines a vehicle traffic simulator with a ray-tracing simulator to generate channel realizations representing 5G scenarios in which both the transceivers and the blocking objects are mobile. The paper then describes a specific dataset for investigating beam-selection techniques in vehicle-to-infrastructure communication using millimeter waves. Experiments applying deep learning to classification, regression, and reinforcement learning problems illustrate how datasets generated with the proposed methodology can be used.