{"slug": "deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu", "title": "Deploying Gemma 4 26B on Proxmox: IaC Setup with Terraform, Ansible & AMD iGPU", "summary": "A developer automated the deployment of Gemma 4 26B on Proxmox VE using Terraform and Ansible, enabling hardware acceleration via AMD iGPU passthrough. The setup includes Ollama and Open-WebUI, with a workaround for unsupported AMD GPUs using HSA_OVERRIDE_GFX_VERSION.", "body_md": "Originally published at [woitzik.dev]\n\nRunning large language models (LLMs) like Gemma 4 26B locally usually requires massive Nvidia clusters. But what if you want to run it in a home lab or a constrained edge environment using Infrastructure as Code (IaC)?\n\nIn this guide, I will show you how to automate a complete local AI stack on Proxmox VE using **Terraform** for the infrastructure and **Ansible** for provisioning. We will cover the quirks of the Proxmox Terraform provider, setting up Ollama, and deploying Open-WebUI as our frontend.\n\nAs a bonus, I will show you how to enable hardware acceleration by passing through an unsupported AMD iGPU to the LXC container.\n\n[View the complete Proxmox IaC source code on GitHub 🐙](https://github.com/dwoitzik/homelab-infrastructure)\n\nMy current environment for this deployment runs on a compact, highly efficient node. For testing and baseline deployments, the 8-core Ryzen handles CPU inference surprisingly well.\n\nWe use Terraform (via the `bpg/proxmox` provider) to spin up dedicated, unprivileged LXC containers. To keep the environment secure and segmented, the containers are split across different VLANs.\n\nHere is the configuration for the AI stack container. 
Note the `device_passthrough` blocks; these are strictly required if you want to hand the host's iGPU over to the container for rendering.\n\n```\nresource \"proxmox_virtual_environment_container\" \"ct_srv_ai_01\" {\n  vm_id        = 201\n  node_name    = \"pve-mgmt-01\"\n  started      = true\n  unprivileged = true\n\n  initialization {\n    hostname = \"ct-srv-ai-01\"\n  }\n\n  cpu {\n    cores = 8\n  }\n\n  memory {\n    dedicated = 32768\n    swap      = 8192\n  }\n\n  features {\n    nesting = true\n  }\n\n  disk {\n    datastore_id = \"local-zfs\"\n    size         = 80\n  }\n\n  network_interface {\n    name        = \"eth0\"\n    bridge      = \"vmbr0\"\n    mac_address = \"bc:24:11:55:aa:f5\"\n    vlan_id     = 20\n    firewall    = true\n  }\n\n  # Optional: iGPU Passthrough for Hardware Acceleration\n  device_passthrough {\n    path = \"/dev/dri/renderD128\"\n  }\n\n  device_passthrough {\n    path = \"/dev/dri/card0\"\n  }\n\n  operating_system {\n    template_file_id = \"usb-templates:vztmpl/debian-13-standard_13.1-2_amd64.tar.zst\"\n    type             = \"debian\"\n  }\n\n  lifecycle {\n    ignore_changes = [\n      description,\n      initialization[0].user_account,\n      operating_system[0].template_file_id,\n      network_interface[0].mac_address,\n      features,\n    ]\n  }\n}\n```\n\n`ignore_changes` Workaround\n\nIf you manually enable features like `keyctl`, `fuse`, or `nesting` via the Proxmox Web UI, Terraform will often attempt to overwrite them or throw state errors on the next `apply`. 
Adding `features` to the `ignore_changes` lifecycle block prevents Terraform from actively fighting the Web UI overrides, keeping your deployments stable.\n\nNext, we use Ansible to install Ollama and pull the Gemma model.\n\nIf you enabled the `device_passthrough` blocks in Terraform to utilize the integrated AMD Radeon Vega GPU, you will hit a roadblock: ROCm (AMD's compute stack) officially supports only a limited set of GPUs, and this iGPU is not among them. We can force Ollama to utilize the Vega iGPU by overriding the GFX version in the systemd service using `HSA_OVERRIDE_GFX_VERSION`.\n\n```\n---\n- name: Ensure required dependencies are installed (curl, zstd)\n  ansible.builtin.apt:\n    name:\n      - curl\n      - zstd\n    state: present\n    update_cache: true\n\n- name: Check if Ollama is already installed\n  ansible.builtin.stat:\n    path: /usr/local/bin/ollama\n  register: ollama_check_bin\n\n- name: Download and execute official Ollama install script\n  ansible.builtin.shell: |\n    set -o pipefail\n    curl -fsSL https://ollama.com/install.sh | sh\n  args:\n    executable: /bin/bash\n  when: not ollama_check_bin.stat.exists\n  changed_when: true\n\n- name: Ensure Ollama user is in video and render groups\n  ansible.builtin.user:\n    name: ollama\n    groups: video, render\n    append: true\n\n- name: Ensure systemd override directory for Ollama exists\n  ansible.builtin.file:\n    path: /etc/systemd/system/ollama.service.d\n    state: directory\n    owner: root\n    group: root\n    mode: '0755'\n\n- name: Configure Ollama environment variables\n  ansible.builtin.copy:\n    dest: /etc/systemd/system/ollama.service.d/override.conf\n    owner: root\n    group: root\n    mode: '0644'\n    content: |\n      [Service]\n      Environment=\"OLLAMA_HOST=0.0.0.0\"\n      # Only needed if utilizing the AMD iGPU passthrough\n      Environment=\"HSA_OVERRIDE_GFX_VERSION=9.0.0\"\n  # Requires a handler named Restart Ollama (not shown in this snippet)\n  notify: Restart Ollama\n\n- name: Ensure Ollama service is 
enabled and started\n  ansible.builtin.systemd:\n    name: ollama\n    state: started\n    enabled: true\n\n- name: Pull the Gemma 4 26B-A4B model\n  ansible.builtin.command: ollama pull gemma4:26b\n  register: ollama_pull_result\n  changed_when: \"'downloading' in ollama_pull_result.stdout\"\n```\n\n*(Note: Downloading a massive 26B model takes time. Your Ansible playbook might look like it's hanging during the `ollama pull` task. Be patient, it's just processing gigabytes of data.)*\n\nTo interact with Gemma comfortably, we deploy Open-WebUI as a Docker container within our server stack.\n\n```\n---\n- name: Ensure Open-WebUI directory exists\n  ansible.builtin.file:\n    path: /opt/open-webui\n    state: directory\n    owner: root\n    group: root\n    mode: '0755'\n\n- name: Deploy Open-WebUI docker-compose configuration\n  ansible.builtin.copy:\n    dest: /opt/open-webui/docker-compose.yml\n    content: |\n      services:\n        open-webui:\n          image: ghcr.io/open-webui/open-webui:main\n          container_name: open-webui\n          restart: unless-stopped\n          ports:\n            - \"3005:8080\"\n          environment:\n            - OLLAMA_BASE_URL=http://10.0.20.251:11434\n            - WEBUI_AUTH=True\n          volumes:\n            - open-webui-data:/app/backend/data\n\n      volumes:\n        open-webui-data:\n\n- name: Ensure Open-WebUI stack is running\n  ansible.builtin.command: docker compose up -d\n  args:\n    chdir: /opt/open-webui\n  register: openwebui_start\n  changed_when: \"'Started' in openwebui_start.stdout or 'Created' in openwebui_start.stdout or 'Pulled' in openwebui_start.stdout\"\n```\n\nBy explicitly setting the `OLLAMA_BASE_URL` to point to the dedicated IP of our AI LXC container, the WebUI immediately connects to the Gemma model without requiring manual API configuration in the interface.\n\nBuilding a private AI environment doesn't require cloud instances. 
With Proxmox, Terraform, and Ansible, you can treat your edge node or home lab exactly like an enterprise data center. The entire stack is ephemeral, version-controlled, and reproducible in minutes.\n\nThe same IaC patterns — Terraform for provisioning, Ansible for configuration — apply directly to enterprise cloud environments. If you are building regulated Azure infrastructure, the [Enterprise Terraform Blueprints](https://dev.to/templates) cover the network isolation layer.", "url": "https://wpnews.pro/news/deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu", "canonical_source": "https://dev.to/dwoitzik/deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu-1i90", "published_at": "2026-06-14 12:01:54+00:00", "updated_at": "2026-06-14 12:10:58.691823+00:00", "lang": "en", "topics": ["large-language-models", "developer-tools", "mlops", "ai-infrastructure"], "entities": ["Proxmox VE", "Terraform", "Ansible", "Ollama", "Open-WebUI", "AMD", "Gemma 4", "GitHub"], "alternates": {"html": "https://wpnews.pro/news/deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu", "markdown": "https://wpnews.pro/news/deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu.md", "text": "https://wpnews.pro/news/deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu.txt", "jsonld": "https://wpnews.pro/news/deploying-gemma-4-26b-on-proxmox-iac-setup-with-terraform-ansible-amd-igpu.jsonld"}}