Resources to Download

  1. Install CodeGreen CLI Tool
  2. Download Greenlight Dataset
  3. CodeGreen Tool Repository
  4. Paper Preprint

Abstract

Deep learning (DL) models are widely deployed in real-world applications, but their usage remains computationally intensive
and energy-hungry. While prior work has examined model-level energy usage, the energy footprint of the DL frameworks, such as TensorFlow and PyTorch, used to train and build these models has not been thoroughly studied. We present Greenlight, a large-scale dataset containing fine-grained energy profiling information for 1,284 TensorFlow API calls. We developed a command-line tool called CodeGreen to curate this dataset. CodeGreen is based on our previously proposed framework FECoM, which employs static analysis and code instrumentation to isolate invocations of TensorFlow operations and measure their energy consumption precisely. By executing API calls on representative workloads and measuring the consumed energy, we construct detailed energy profiles for the APIs. Several factors, such as input data size and the type of operation, significantly impact energy footprints. Greenlight provides a ground-truth dataset capturing energy consumption along with relevant factors, such as input parameter size, taking a first step towards the optimization of energy-intensive TensorFlow code. The Greenlight dataset opens up new research directions such as predicting API energy consumption, automated optimization, modeling efficiency trade-offs, and empirical studies into energy-aware DL system design.
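
To illustrate the general idea of instrumentation-based energy profiling described above, the sketch below wraps a single TensorFlow API call with energy readings. This is a simplified illustration, not the actual CodeGreen/FECoM implementation: it assumes a Linux machine exposing an Intel RAPL package-energy counter at the hypothetical path /sys/class/powercap/intel-rapl:0/energy_uj (readable with sufficient permissions), and it omits baseline-power subtraction and counter-wraparound handling that a real tool would need.

```python
import time
import tensorflow as tf

# Assumed RAPL counter location; varies by machine and may require root access.
RAPL_ENERGY_FILE = "/sys/class/powercap/intel-rapl:0/energy_uj"


def read_energy_uj() -> int:
    """Read the cumulative package energy counter in microjoules."""
    with open(RAPL_ENERGY_FILE) as f:
        return int(f.read().strip())


def profile_api_call(fn, *args, **kwargs):
    """Run one API call and return (result, energy_joules, runtime_seconds)."""
    start_energy = read_energy_uj()
    start_time = time.perf_counter()
    result = fn(*args, **kwargs)
    runtime = time.perf_counter() - start_time
    # A robust tool would correct for counter wraparound and subtract the
    # machine's idle baseline energy over the same interval.
    energy_j = (read_energy_uj() - start_energy) / 1e6
    return result, energy_j, runtime


# Example: profile a single matrix multiplication for one input size.
a = tf.random.uniform((2048, 2048))
b = tf.random.uniform((2048, 2048))
_, joules, seconds = profile_api_call(tf.linalg.matmul, a, b)
print(f"tf.linalg.matmul: {joules:.3f} J over {seconds:.3f} s")
```

Repeating such measurements across different APIs and input sizes is, in spirit, how per-API energy profiles like those in Greenlight can be assembled.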