Portrait Generation Benchmark Q1 2026: Flux.2 vs SDXL vs Proprietary


Source: dev.to · Published: 2026-06-18T20:55Z · Topic: machine-learning

Runflow benchmarked 8 image generation models across 12,000 real inference jobs, scoring each on quality, cost, and latency. Flux.2 [dev] achieved a composite quality score of 95, matching or exceeding the proprietary models. It is the first open-source model to lead the portrait generation category in Runflow's benchmarks.

Every quarter, we benchmark every major image generation model against real production workloads from our platform. These are not synthetic tests: they are actual jobs from customers generating AI headshots at scale.

This quarter, we tested 8 models across 12,000 inference jobs, scoring each on quality (FID, CLIP, human eval), cost per image, and p95 latency. Here’s the full breakdown.

Most model comparisons use academic datasets: ImageNet, LAION, or curated prompt sets. That’s useful for research, but it tells you nothing about how a model performs on your workload.

At Runflow, we route tens of thousands of real inference jobs per day. We see exactly how models perform on corporate headshots, e-commerce product photos, and creative portraits: the actual use cases customers care about.

Our Sentinel evaluation engine scores every output automatically across three dimensions:

We tested the following models, all running on our multi-cloud orchestration layer to normalize for infrastructure differences:

| Model | Version | License | Deployment |
|---|---|---|---|
| Flux.2 [dev] | v2.0.1 | Open Source | Self-hosted |
| Flux.2 [schnell] | v2.0.1 | Open Source | Self-hosted |

The composite quality score combines FID (40%), CLIP alignment (30%), and human evaluation (30%). All scores are normalized to a 0–100 scale.
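As a minimal sketch of the weighting described above, the composite can be read as a weighted mean of three component scores. The weights come from the text; the function name, the dictionary keys, and the example values are hypothetical. The sketch also assumes each component has already been normalized to 0–100 with higher meaning better (for FID, where lower raw values are better, that implies some inversion step the article does not describe).

```python
from typing import Mapping

# Weights taken from the text: FID 40%, CLIP alignment 30%, human evaluation 30%.
WEIGHTS = {"fid": 0.40, "clip": 0.30, "human": 0.30}


def composite_quality(scores: Mapping[str, float]) -> float:
    """Weighted mean of component scores.

    Assumes every component is already normalized to 0-100, higher is better.
    """
    for name in WEIGHTS:
        value = scores[name]
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} score {value} is outside the 0-100 range")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only (not benchmark data).
example = {"fid": 90.0, "clip": 80.0, "human": 70.0}
print(composite_quality(example))
```

Because the weights sum to 1, the composite stays on the same 0–100 scale as its inputs.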

The headline: Flux.2 [dev] scored 95, matching or exceeding proprietary models across all three evaluation dimensions. For the first time in our benchmarks, an open-source model leads the portrait generation category outright.

Cost calculations include GPU compute, orchestration overhead, and our platform fee. All models were run on equivalent hardware (A100 80GB) through our multi-cloud orchestration layer.
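One way to picture the three cost components is as a simple sum per image. This is only a sketch under stated assumptions: the article does not say how the components are combined, so the additive model, the field names, and every number below are hypothetical rather than Runflow's actual pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostInputs:
    gpu_seconds: float             # GPU time spent on one image
    gpu_hourly_rate: float         # price of one GPU-hour (hypothetical)
    orchestration_overhead: float  # per-image orchestration cost (hypothetical)
    platform_fee: float            # per-image platform fee (hypothetical)


def cost_per_image(c: CostInputs) -> float:
    """Sum the three components named in the text (assumed to be additive)."""
    gpu_compute = c.gpu_seconds * c.gpu_hourly_rate / 3600.0
    return gpu_compute + c.orchestration_overhead + c.platform_fee


# Illustrative values only; none of these figures come from the benchmark.
example = CostInputs(
    gpu_seconds=4.0,
    gpu_hourly_rate=3.60,
    orchestration_overhead=0.001,
    platform_fee=0.002,
)
print(f"{cost_per_image(example):.4f}")
```

Keeping the components as separate fields makes it easy to see which one dominates when comparing a self-hosted model against an API-priced one.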

Latency was measured end-to-end from API request to image delivery, including model loading (cold start) and network transfer. All measurements are p95 across the full 12,000-job dataset.
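For readers unfamiliar with the metric, p95 is the latency that 95% of jobs complete at or under. The sketch below uses the nearest-rank definition, which is one common convention; the article does not say which percentile method was used, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def p95_nearest_rank(latencies_ms: Sequence[float]) -> float:
    """Return the 95th percentile using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # rank = ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * n + 99) // 100
    return ordered[rank - 1]


# Hypothetical end-to-end latencies in milliseconds (not benchmark data).
samples = [820, 790, 1450, 805, 910, 3100, 860, 840, 795, 880]
print(p95_nearest_rank(samples))
```

Reporting p95 rather than the mean keeps occasional slow requests, such as cold starts, visible in the result instead of letting them be averaged away.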

All benchmark results are reproducible. We publish our evaluation pipeline, reference datasets, and scoring rubrics in our open benchmark repository. If you find discrepancies, we want to know—open an issue or reach out directly.

Models labeled “Proprietary A” and “Proprietary B” are anonymized per our testing agreements. We’ll name them explicitly once we have permission from the providers.

Q2 benchmarks will expand to include video generation models (Wan2.6, Kling 2.1, Seedance) and our new virtual try-on pipeline. We’re also adding latency-under-load testing to simulate real production traffic patterns.

Want to run these benchmarks on your own workload? Talk to our team — we’ll set up a custom evaluation against your production data.

Originally published on Runflow.

Read original on dev.to → dev.to/ricardoghekiere/portrait-generation-bench…
