Carbon-Aware Model Training: Scheduling GPU Workloads Around Electricity Carbon Intensity


Training ML models has an environmental cost that most practitioners do not measure. A model trained during peak grid hours, when coal and gas plants are meeting high demand, can emit significantly more CO2 than the same model trained during off-peak hours when renewables dominate the grid. The carbon intensity of electricity can vary by a factor of 2–5x over the course of a day, yet most training pipelines ignore this entirely.

Carbon-Aware Model Training Pipeline is a PyTorch-based training pipeline that monitors real-time electricity carbon intensity, delays training until a low-carbon window is available, reduces GPU memory footprint through gradient accumulation, and tracks CO2 emissions throughout training using CodeCarbon. A comparison report then quantifies the carbon savings against a baseline run.

Carbon-Aware Scheduling - real-time carbon intensity monitoring with smart training delays until low-carbon windows are detected.

Gradient Accumulation - reduces GPU memory footprint while maintaining effective batch size.

Emissions Tracking - real-time CO2 monitoring via CodeCarbon with comprehensive JSON reports.

Modular Design - YAML-based configuration with separate scheduler, tracker, and trainer modules.

GPU Optimized - automatic CUDA detection with mixed precision training (FP16).

Comparative Analysis - automated reporting quantifying carbon savings against a baseline run.

The pipeline runs in four stages:

Stage 1 - Carbon-Aware Scheduling

Real-time monitoring checks electricity carbon intensity via APIs. Smart delays wait for low-carbon windows before starting training. Fallback mechanisms use realistic mock data when APIs are unavailable - with diurnal patterns simulating peak intensity at 18:00 and trough at 03:00. Configurable thresholds allow customization for different regions.
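A mock source with that diurnal shape could be sketched as follows. This is only an illustration, not the project's `scheduler.py`: the function name `mock_intensity` and the half-cosine interpolation are assumptions, and the 450 and 200 gCO2/kWh extremes are borrowed from the approximate mock-data figures quoted in the results.

```python
import math

PEAK_HOUR, TROUGH_HOUR = 18.0, 3.0   # peak and trough times from the description
PEAK_G, TROUGH_G = 450.0, 200.0      # assumed extremes in gCO2/kWh


def mock_intensity(hour: float) -> float:
    """Return a mock carbon intensity (gCO2/kWh) for an hour of the day in [0, 24)."""
    rise = (PEAK_HOUR - TROUGH_HOUR) % 24       # 15 h climbing from trough to peak
    since_trough = (hour - TROUGH_HOUR) % 24    # hours elapsed since 03:00
    if since_trough <= rise:
        phase = since_trough / rise             # 0 at the trough, 1 at the peak
    else:
        # Remaining 9 h of the day: fall from the peak back to the trough
        phase = 1 - (since_trough - rise) / (24 - rise)
    # Half-cosine easing keeps the curve smooth at both extremes
    weight = (1 - math.cos(math.pi * phase)) / 2
    return TROUGH_G + (PEAK_G - TROUGH_G) * weight
```

Because the rise takes 15 hours and the fall only 9, a single sine wave cannot hit both times exactly, which is why the sketch interpolates the two segments separately.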

Stage 2 - Gradient Accumulation

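Each optimizer step is split into smaller micro-batches whose gradients are accumulated before the weights are updated, so the effective batch size is maintained with less memory. The number of accumulation steps is configurable (2, 4, 8, 16) to fit hardware constraints. Because the effective batch size is unchanged, convergence should closely track the baseline.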
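The reason accumulation preserves the effective batch size can be checked with a small dependency-free sketch. It uses a one-parameter linear model rather than the project's PyTorch loop, and the helper names are hypothetical. Each micro-batch gradient is scaled by `1 / steps`, which corresponds to dividing the loss by the number of accumulation steps before the backward pass.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error for the 1-D model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each scaled by 1 / accumulation steps.

    Assumes len(xs) is a multiple of micro_batch_size.
    """
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


xs = [float(i) for i in range(1, 65)]   # 64 samples
ys = [3.0 * x + 1.0 for x in xs]
full = mse_grad(0.5, xs, ys)                                  # one batch of 64
accum = accumulated_grad(0.5, xs, ys, micro_batch_size=16)    # 4 steps of 16
```

Up to floating-point rounding, `full` and `accum` agree, so the weight update is the same while only 16 samples need to be held in memory at a time.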

Stage 3 - Emissions Tracking

CodeCarbon integration monitors CO2 emissions in real-time. Energy metrics track power consumption in Watts and energy in kWh. Comprehensive reports generate JSON summaries with all metrics. Comparative analysis quantifies carbon savings versus the baseline.
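The units above are related by simple arithmetic, sketched here with hypothetical helper names. This is not CodeCarbon's internal code, which also handles hardware detection and regional data; it only shows how watts, kWh, and gCO2/kWh combine into kilograms of CO2.

```python
def energy_kwh(avg_power_watts: float, seconds: float) -> float:
    """Energy in kWh from an average power draw sustained over a duration."""
    return avg_power_watts * seconds / 3_600_000.0   # W*s (joules) -> kWh


def emissions_kg(energy_kwh_used: float, intensity_g_per_kwh: float) -> float:
    """Kilograms of CO2 from energy use and grid carbon intensity."""
    return energy_kwh_used * intensity_g_per_kwh / 1000.0   # grams -> kilograms


# Illustrative figures: 0.15 kWh consumed on a 285.3 gCO2/kWh grid
example_kg = emissions_kg(0.15, 285.3)
```

With these inputs `example_kg` comes to roughly 0.043 kg, which shows why the same energy use emits less when the grid intensity is lower.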

Stage 4 - GPU Optimization

Mixed precision training (FP16) reduces memory use and increases speed. Automatic CUDA detection uses the GPU when one is available, and pinned memory enables faster host-to-device data transfers. When no GPU is available, the pipeline falls back gracefully to CPU.

┌─────────────────────────────────────────────────────────────────┐
│                     Training Configuration                      │
│                       (YAML Config File)                        │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
         ┌────────────────────────────────────┐
         │   Carbon Intensity Scheduler       │
         │   - API/Mock data fetch            │
         │   - Threshold comparison           │
         │   - Wait for low-carbon window     │
         └────────────────┬───────────────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │   Start Training?     │
              │   Intensity < 300?    │
              └─────┬─────────────┬───┘
                    │ NO          │ YES
                    ▼             ▼
            ┌───────────┐   ┌──────────────┐
            │   Wait    │   │ Start Tracker│
            │ & Recheck │   │ (CodeCarbon) │
            └───────────┘   └──────┬───────┘
                                   │
                                   ▼
                  ┌────────────────────────────────┐
                  │   PyTorch Training Loop        │
                  │   - Gradient Accumulation      │
                  │   - Mixed Precision (FP16)     │
                  │   - Checkpointing              │
                  └────────────────┬───────────────┘
                                   │
                                   ▼
                  ┌────────────────────────────────┐
                  │   Emissions Tracking           │
                  │   - CO2 (kg)                   │
                  │   - Energy (kWh)               │
                  │   - Power (Watts)              │
                  └────────────────┬───────────────┘
                                   │
                                   ▼
              ┌───────────────────────────────────┐
              │   Save Results                    │
              │   - Model checkpoint              │
              │   - Training summary (JSON)       │
              │   - Emissions log (CSV)           │
              └───────────────────────────────────┘
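The decision loop in the diagram could be sketched as below. This is a minimal illustration under stated assumptions, not the project's scheduler: `wait_for_low_carbon` as a function name, `check_interval_s`, and the injectable `sleep` argument are hypothetical, and the returned keys are chosen to resemble the scheduler metrics in the training summary.

```python
import time
from typing import Callable


def wait_for_low_carbon(
    get_intensity: Callable[[], float],
    threshold: float = 300.0,            # gCO2/kWh, as in the diagram
    check_interval_s: float = 600.0,     # assumed re-check period
    max_wait_s: float = 6 * 3600.0,      # assumed cap on total waiting
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Block until intensity drops below threshold or max_wait_s elapses."""
    waited = 0.0
    initial = current = get_intensity()
    while current >= threshold and waited < max_wait_s:
        sleep(check_interval_s)          # wait, then re-check
        waited += check_interval_s
        current = get_intensity()
    return {
        "wait_time_seconds": waited,
        "initial_intensity": initial,
        "training_intensity": current,
        "below_threshold": current < threshold,
    }
```

Passing `sleep` as a parameter keeps the loop testable without real delays, and the cap on waiting ensures training eventually starts even if the grid never drops below the threshold.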

Prerequisites

git clone https://github.com/dakshjain-1616/CarbonAwareModelTraining---by-NEO.git
cd CarbonAwareModelTraining---by-NEO

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

pip install -r requirements.txt

Required packages: torch>=2.0.0, torchvision>=0.15.0, codecarbon>=2.3.0, pyyaml>=6.0, numpy.

source venv/bin/activate

export PYTHONPATH="$PWD/src:$PYTHONPATH"
python src/train.py configs/baseline.yaml

python src/train.py configs/optimized.yaml

python generate_comparison.py

This runs three steps: baseline training without carbon awareness, optimized training with carbon-aware scheduling and gradient accumulation, and a comparison report that quantifies carbon savings and performance metrics.

Configure carbon-aware training in configs/optimized.yaml:

scheduler:
  enabled: true
  carbon_threshold: 300           # gCO2/kWh
  wait_for_low_carbon: true

training:
  batch_size: 16
  gradient_accumulation_steps: 4  # Effective batch = 64
  epochs: 3

Run optimized training:

python src/train.py configs/optimized.yaml

Output:

Carbon Intensity Check:
Current Intensity: 420.5 gCO2/kWh
Threshold: 300 gCO2/kWh
Status: ⏳ Waiting for low-carbon window...

[10 minutes later]
Current Intensity: 285.3 gCO2/kWh
Status: ✅ Starting training now!

Training Progress:
Epoch 1/3 - Loss: 0.324 - Accuracy: 91.2%
CO2 Emissions: 0.042 kg
Energy Consumed: 0.15 kWh

CO2 Reduction: 32.5% (0.024 kg saved)
GPU Memory Reduction: 45.8%
Accuracy: 93.1% (baseline: 93.4%)


**Carbon-Aware Scheduling Only**

Disable gradient accumulation, enable scheduling:

scheduler:
  enabled: true
  carbon_threshold: 250

training:
  gradient_accumulation_steps: 1  # No accumulation


**Gradient Accumulation Only**

Disable scheduling, enable memory optimization:

scheduler:
  enabled: false

training:
  batch_size: 8
  gradient_accumulation_steps: 8  # Effective batch = 64
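Picking `batch_size` and `gradient_accumulation_steps` for a target effective batch is simple arithmetic, sketched here with a hypothetical helper. `plan_accumulation` and `max_micro_batch` are illustrative names, not part of the project; `max_micro_batch` stands for the largest micro-batch that fits in GPU memory on your hardware.

```python
def plan_accumulation(effective_batch: int, max_micro_batch: int,
                      allowed_steps=(1, 2, 4, 8, 16)) -> tuple:
    """Return (batch_size, gradient_accumulation_steps) for a target effective batch.

    Tries the fewest accumulation steps first, since each extra step adds
    another forward/backward pass per weight update.
    """
    for steps in allowed_steps:
        divides = effective_batch % steps == 0
        if divides and effective_batch // steps <= max_micro_batch:
            return effective_batch // steps, steps
    raise ValueError("no allowed step count fits the memory limit")
```

For an effective batch of 64, a memory limit of 16 samples yields `(16, 4)` and a limit of 8 yields `(8, 8)`, matching the two configurations shown in this article.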


**Real Carbon Intensity API**

Configure for production with a real API:

scheduler:
  enabled: true
  use_mock_data: false
  api_endpoint: "https://api.carbonintensity.org.uk/intensity"
  region: "GB"
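Reading a value out of the endpoint's response might look like the sketch below. It assumes the JSON shape publicly documented for the GB Carbon Intensity API (a `data` list whose entries carry an `intensity` object with `forecast` and `actual` fields); verify this against the live API before relying on it. The function name `parse_intensity` and the sample payload are illustrative, and the project's own parsing may differ.

```python
import json


def parse_intensity(payload: str) -> float:
    """Extract gCO2/kWh from a Carbon Intensity API style JSON payload."""
    entry = json.loads(payload)["data"][0]["intensity"]
    value = entry.get("actual")
    if value is None:
        # The current period may not have a measured value yet,
        # so fall back to the forecast figure.
        value = entry["forecast"]
    return float(value)


# Illustrative payload, not a real API reading
sample = (
    '{"data": [{"from": "2018-01-20T12:00Z", "to": "2018-01-20T12:30Z", '
    '"intensity": {"forecast": 266, "actual": null, "index": "moderate"}}]}'
)
current = parse_intensity(sample)
```

Keeping the parsing separate from the HTTP request makes it easy to test against saved payloads and to swap in a different provider later.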


**Custom Model Integration**

Replace `SimpleCNN` in `src/train.py`:

```python
import torch

from my_models import MyCustomModel


def prepare_model(config):
    # Use the GPU when CUDA is available, otherwise fall back to CPU
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = MyCustomModel(
        input_channels=config['training']['input_channels'],
        num_classes=config['training']['num_classes']
    ).to(device)
    return model, device
```

Output Format

Training summary JSON saved to output/summary_optimized.json:

{
  "run_name": "optimized",
  "training_metrics": {
    "final_accuracy": 93.1,
    "final_loss": 0.124,
    "epochs": 3,
    "total_time_seconds": 245
  },
  "carbon_metrics": {
    "total_emissions_kg": 0.042,
    "energy_consumed_kwh": 0.15,
    "avg_power_watts": 145.2
  },
  "scheduler_metrics": {
    "wait_time_seconds": 600,
    "initial_intensity": 420.5,
    "training_intensity": 285.3
  },
  "gpu_metrics": {
    "peak_memory_mb": 2048,
    "gradient_accumulation_steps": 4,
    "effective_batch_size": 64
  }
}

Comparison report saved to output/comparison_report.json:

{
  "carbon_savings": {
    "baseline_emissions_kg": 0.074,
    "optimized_emissions_kg": 0.042,
    "reduction_kg": 0.032,
    "reduction_percentage": 43.2
  },
  "accuracy_impact": {
    "baseline_accuracy": 93.4,
    "optimized_accuracy": 93.1,
    "degradation_percentage": 0.3
  },
  "memory_savings": {
    "baseline_memory_mb": 4096,
    "optimized_memory_mb": 2048,
    "reduction_percentage": 50.0
  }
}
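The report fields can be derived from two training summaries as in the sketch below. This is an assumption about how such a report could be computed, not the actual `generate_comparison.py`; it also assumes the baseline summary has the same structure as the optimized one shown earlier. `build_comparison` is a hypothetical name.

```python
def build_comparison(baseline: dict, optimized: dict) -> dict:
    """Build a comparison report from two training-summary dictionaries."""

    def pct_drop(before: float, after: float) -> float:
        # Relative reduction, e.g. 0.074 -> 0.042 is a 43.2% drop
        return round((before - after) / before * 100, 1)

    b_co2 = baseline["carbon_metrics"]["total_emissions_kg"]
    o_co2 = optimized["carbon_metrics"]["total_emissions_kg"]
    b_acc = baseline["training_metrics"]["final_accuracy"]
    o_acc = optimized["training_metrics"]["final_accuracy"]
    b_mem = baseline["gpu_metrics"]["peak_memory_mb"]
    o_mem = optimized["gpu_metrics"]["peak_memory_mb"]

    return {
        "carbon_savings": {
            "baseline_emissions_kg": b_co2,
            "optimized_emissions_kg": o_co2,
            "reduction_kg": round(b_co2 - o_co2, 3),
            "reduction_percentage": pct_drop(b_co2, o_co2),
        },
        "accuracy_impact": {
            "baseline_accuracy": b_acc,
            "optimized_accuracy": o_acc,
            # Absolute difference in percentage points, not a relative change
            "degradation_percentage": round(b_acc - o_acc, 1),
        },
        "memory_savings": {
            "baseline_memory_mb": b_mem,
            "optimized_memory_mb": o_mem,
            "reduction_percentage": pct_drop(b_mem, o_mem),
        },
    }
```

Note that the emissions and memory figures are relative reductions, while the accuracy figure is a difference in percentage points; mixing the two up is an easy way to misreport results.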

Evaluated on MNIST training - 3 epochs, RTX 3090 GPU:

Carbon Intensity Patterns (Mock Data):

Peak hours 18:00–22:00: ~450 gCO2/kWh

Off-peak hours 02:00–06:00: ~200 gCO2/kWh

Average reduction: 35–45% CO2 by scheduling during low-carbon windows

GPU Memory Savings:

Gradient accumulation 2x: ~30% memory reduction

Gradient accumulation 4x: ~50% memory reduction

Gradient accumulation 8x: ~60% memory reduction

Convergence Validation:

Accuracy degradation under 1% across all tested configurations

Loss convergence matches baseline within 2% tolerance

No divergence observed

CarbonAwareModelTraining---by-NEO/
├── src/
│   ├── scheduler.py                # Carbon intensity API & scheduling
│   ├── tracker.py                  # CodeCarbon emissions tracking
│   ├── train.py                    # Main training pipeline
│   └── utils.py                    # Config & logging
├── configs/
│   ├── baseline.yaml               # Baseline training config
│   └── optimized.yaml              # Carbon-aware optimized config
├── output/
│   ├── summary_baseline.json       # Baseline training summary
│   ├── summary_optimized.json      # Optimized training summary
│   ├── comparison_report.json      # Comparative analysis
│   ├── emissions.csv               # CodeCarbon emissions log
│   └── training_*.log              # Detailed training logs
├── models/
│   ├── model_baseline.pt           # Baseline model checkpoint
│   └── model_optimized.pt          # Optimized model checkpoint
├── data/                            # MNIST dataset (auto-downloaded)
├── requirements.txt                 # Python dependencies
├── generate_comparison.py          # Comparison report generator
└── README.md

Why Carbon-Aware Scheduling?

Carbon intensity varies 2–5x throughout the day. Scheduling training during low-carbon windows reduces emissions without affecting model quality. Low-carbon periods also often correlate with cheaper electricity.

Why Gradient Accumulation?

Gradient accumulation enables training larger models on limited hardware by processing smaller micro-batches and updating weights less frequently. The same technique is used when training BERT, GPT, and other large-scale models.

Why CodeCarbon?

CodeCarbon estimates emissions from measured or estimated energy use combined with the carbon intensity of the local grid. It supports CPU, GPU, and multi-device setups, and because it is open source its calculations are transparent and community-reviewed. It tracks energy, power, and emissions in a single library.

Why YAML Configuration?

YAML configs are version-controlled, human-readable, and separate code from experiment parameters - enabling reproducible A/B comparisons between baseline and optimized runs.

Validate installation:

python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import codecarbon; print('CodeCarbon: OK')"
python -c "import yaml; print('PyYAML: OK')"
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"

Run a quick 5-minute test:

python src/train.py configs/test.yaml

Validate carbon savings:

python src/train.py configs/baseline.yaml
python src/train.py configs/optimized.yaml
python generate_comparison.py
cat output/comparison_report.json

CUDA Out of Memory

Reduce batch_size and increase gradient_accumulation_steps in the config.

Carbon Intensity API Timeout

No action needed - the pipeline automatically falls back to mock data and training proceeds.
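The fallback behaviour could be implemented along these lines. This is a sketch under assumptions, not the project's code: `intensity_with_fallback` and its two callable parameters are hypothetical, and the set of caught exceptions is one reasonable choice for network and parsing failures.

```python
import logging
from typing import Callable, Tuple

log = logging.getLogger("scheduler")


def intensity_with_fallback(fetch_live: Callable[[], float],
                            fetch_mock: Callable[[], float]) -> Tuple[float, str]:
    """Try the live source first; on a network or parsing error use mock data.

    Returns the intensity together with a label naming which source was used.
    """
    try:
        return fetch_live(), "live"
    except (OSError, ValueError, KeyError) as exc:
        # TimeoutError and urllib's URLError are OSError subclasses;
        # malformed JSON raises a ValueError subclass.
        log.warning("carbon intensity API unavailable (%s); using mock data", exc)
        return fetch_mock(), "mock"
```

Returning the source label alongside the value lets the training summary record whether a run was scheduled on real or simulated grid data, which matters when reporting savings.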

Module Import Errors

export PYTHONPATH="$PWD/src:$PYTHONPATH"

CodeCarbon Tracking Fails

pip install --upgrade codecarbon

Training continues without emissions tracking if CodeCarbon fails.

Scheduler Waits Too Long

Decrease max_wait_seconds, raise carbon_threshold, or set wait_for_low_carbon: false in the config.

This project was built using NEO. NEO is a fully autonomous AI engineering agent that can write code and build solutions for AI/ML tasks, including AI model evals, prompt optimization, and end-to-end AI pipeline development.

The requirement was a PyTorch training pipeline that schedules GPU workloads based on real-time carbon intensity, reduces memory footprint through gradient accumulation, and tracks emissions with CodeCarbon, producing a side-by-side comparison report. NEO built the full implementation: the carbon intensity scheduler in scheduler.py with API integration and mock fallback, the CodeCarbon emissions tracker in tracker.py, the main training pipeline in train.py with gradient accumulation and mixed precision FP16, the config and logging utilities in utils.py, the YAML configs for baseline and optimized runs, the comparison report generator in generate_comparison.py, and the full output structure covering JSON summaries, emissions CSV, and model checkpoints.

Use it to measure the carbon cost of your existing training runs.

Run python src/train.py configs/baseline.yaml on your own model and data by replacing SimpleCNN in src/train.py with your model. The CodeCarbon tracker produces a JSON summary with CO2 in kg, energy in kWh, and average power in Watts, giving you a baseline measurement before any optimization.

Use the comparison report to justify scheduling infrastructure.

Run both the baseline and optimized configs on the same dataset. The comparison_report.json gives you a concrete before and after: percentage reduction in emissions, energy, and memory, alongside accuracy degradation. That makes the case for carbon-aware scheduling with real numbers from your own hardware.

Use mock data for development and real API for production.

Set use_mock_data: true during development so training always proceeds without waiting. Switch to use_mock_data: false with a real api_endpoint for production runs where actual carbon savings matter.

Extend the scheduler with additional carbon intensity sources.

The scheduler in scheduler.py fetches from a configurable api_endpoint. Adding support for additional regional carbon intensity APIs (Electricity Maps, WattTime, or a custom internal source) means updating the fetch logic in scheduler.py without touching the training loop, tracker, or reporting pipeline.

Carbon intensity varies throughout the day, and most training pipelines ignore it. A 43% reduction in CO2 emissions with less than 1% accuracy degradation, achieved by scheduling when the grid is cleaner and accumulating gradients to reduce memory, shows that sustainable ML is a practical engineering choice, not just an aspiration.

The code is at https://github.com/dakshjain-1616/CarbonAwareModelTraining

You can also build with NEO in your IDE using the VS Code extension or Cursor.

You can use NEO MCP with Claude Code: https://heyneo.com/claude-code

source & further reading

dev.to: original article