Provisioning resources in cloud-scale data centers is hard because there is no reliable way to predict demand. Operators and users are often uncertain about their future requirements and tend to grossly over-provision to avoid shortages, which leads to poor overall utilization. We find that, by combining automated user classification with ensemble forecasting, we can predict aggregate future usage more accurately than traditional methods. If time permits, we will also discuss the challenges of provisioning storage workloads given the declining ratio of IOPS to capacity in disk drives, potential approaches based on caching, and their limitations.