TetriSched: Global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters

TetriSched is a scheduler that works in tandem with a calendaring reservation system to continuously re-evaluate the immediate-term scheduling plan for all pending jobs (both those with reservations and best-effort jobs) on each scheduling cycle. TetriSched leverages information supplied by the reservation system about jobs' deadlines and estimated runtimes to plan ahead in deciding whether to wait for a busy preferred resource type (e.g., a machine with a GPU) or fall back to less preferred placement options. Plan-ahead affords significant flexibility in handling mis-estimates in job runtimes specified at reservation time. Integrated with the main reservation system in Hadoop YARN, TetriSched is experimentally shown to achieve significantly higher SLO attainment and cluster utilization than the best-configured YARN reservation and CapacityScheduler stack deployed on a real 256-node cluster.
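
The wait-or-fall-back decision described above can be illustrated with a small sketch. This is not the paper's formulation or code; it is a simplified, greedy, per-job illustration. The names `Option` and `choose_placement`, the fields, and the numbers are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Option:
    """One placement choice for a job (hypothetical structure for illustration)."""
    name: str           # e.g. "gpu" (preferred) or "cpu" (fallback)
    free_at: float      # estimated time at which this resource becomes available
    est_runtime: float  # estimated runtime of the job on this resource


def choose_placement(now: float, deadline: float,
                     options: List[Option]) -> Optional[Option]:
    """Return the option with the earliest estimated finish that meets the deadline.

    Returns None when no option is expected to finish by the deadline.
    """
    best: Optional[Option] = None
    best_finish: Optional[float] = None
    for opt in options:
        start = max(now, opt.free_at)      # wait if the resource is still busy
        finish = start + opt.est_runtime
        if finish > deadline:
            continue                       # this option would miss the deadline
        if best_finish is None or finish < best_finish:
            best, best_finish = opt, finish
    return best


# The GPU is busy until t=30 but runs the job in 20; the CPU is free now but needs 60.
wait_case = choose_placement(0, 70, [Option("gpu", 30, 20), Option("cpu", 0, 60)])

# If the GPU's current occupant is now expected to run until t=50, falling back wins.
fallback_case = choose_placement(0, 70, [Option("gpu", 50, 20), Option("cpu", 0, 60)])
```

Here `wait_case` selects the GPU (estimated finish at 50 versus 60), while `fallback_case` selects the CPU (60 versus 70). Re-running the decision on every scheduling cycle with refreshed `free_at` estimates is what lets a plan adapt when an earlier runtime estimate turns out to be wrong.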

Metadata

Work Title: TetriSched: Global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters
Access: Open Access
Creators:
  1. Alexey Tumanov
  2. Timothy Zhu
  3. Jun Woo Park
  4. Michael A. Kozuch
  5. Mor Harchol-Balter
  6. Gregory R. Ganger
License: In Copyright (Rights Reserved)
Work Type: Conference Proceeding
Publication Date: April 18, 2016
Publisher Identifier (DOI): https://doi.org/10.1145/2901318.2901355
Source: EuroSys '16: Proceedings of the Eleventh European Conference on Computer Systems
Deposited: August 03, 2023
