Maximizing utilization has been the approach for shared homogeneous clusters (hence join-shortest-queue, least-work-left, etc.). With compute co-located with data, as in Hadoop/MapReduce clusters, tasks are instead scheduled to maximize data locality. But is utilization, or data locality, the right metric in this case?
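To make the contrast concrete, here is a minimal sketch of the two dispatch styles named above, using hypothetical Server/Task structures (not from any real scheduler): join-shortest-queue sends a task to the least-loaded server, while a locality-first policy prefers a server that already stores the task's input block and only falls back to load when no local copy exists.

```python
# Minimal sketch of the two dispatch policies contrasted above.
# Server/Task fields are hypothetical, for illustration only.
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    queue: list = field(default_factory=list)        # tasks waiting to run
    local_blocks: set = field(default_factory=set)   # data blocks stored here

@dataclass
class Task:
    tid: str
    input_block: str  # the data block this task reads

def join_shortest_queue(servers, task):
    """Utilization-oriented: send the task to the least-loaded server."""
    target = min(servers, key=lambda s: len(s.queue))
    target.queue.append(task)
    return target

def locality_first(servers, task):
    """Locality-oriented: prefer a server that already stores the input."""
    local = [s for s in servers if task.input_block in s.local_blocks]
    candidates = local if local else servers  # fall back to any server
    target = min(candidates, key=lambda s: len(s.queue))
    target.queue.append(task)
    return target

if __name__ == "__main__":
    servers = [
        Server("s1", local_blocks={"blk-A"}),
        Server("s2", local_blocks={"blk-B"}),
        Server("s3", local_blocks={"blk-A", "blk-C"}),
    ]
    print("JSQ picks:", join_shortest_queue(servers, Task("t1", "blk-B")).name)
    print("Locality picks:", locality_first(servers, Task("t2", "blk-B")).name)
```

The two policies can disagree: in the example, JSQ sends the task to whichever server is least loaded regardless of where blk-B lives, while the locality-first policy routes it to the server holding blk-B even if another server is idle, which is exactly the tension the question above is getting at.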