Head to head: Bernini-R Edit Video vs Wan v2.6 Image to Video

Bernini-R Edit Video defeated Wan v2.6 Image to Video 16.2 to 9.8 in a head-to-head test of video editing and image-to-video generation, with Bernini-R demonstrating superior prompt adherence and temporal consistency across two industrial scene tasks.

Bernini-R Edit Video wins this matchup comfortably, 16.2 to 9.8, because it does the thing that matters most in edit and image-to-video work: it actually obeys the prompt. Wan v2.6 Image to Video can look slick in isolated frames, but in these tests that polish repeatedly came at the expense of content accuracy and temporal discipline.

In Foundry skylight dim, Bernini-R is plainly closer to the assignment. It gives you a defunct foundry-style bay, rows of copper impeller-like forms, an overhead crane trolley, and a believable shift from warm amber illumination into cooler dimness where the lamps start to read more strongly. Wan’s output is attractive, but it drifts into the wrong scene grammar: a person appears in frame, the repeated half-finished impellers collapse into a single finished turbine rotor, and the specified tracking feel and lighting transition lose definition.

The gap widens in Brine pump yard drift. Bernini-R nails the desalination-yard context, the low tracking perspective, the lime-green maintenance robot, and the layered ambient motion that makes the environment feel active rather than staged. It is not perfect (it oddly spawns a second machine and looks a bit rougher), but those are survivable flaws inside an otherwise prompt-faithful sequence. Wan starts with a cleaner-looking first frame, then promptly falls apart, veering into unrelated grass-only shots and showing much weaker temporal consistency.

That pattern defines the verdict. Wan is the model you might mistake for stronger in a thumbnail; Bernini-R is the model that survives contact with an actual brief.
When one system keeps the foundry a foundry and the brine yard a brine yard, while the other keeps inventing new scenes, the decision is easy.

Final call: Bernini-R Edit Video is the clear winner. It is more reliable, more controllable, and far better at sustaining prompt intent across a sequence.

How they were tested

We ran 2 fresh video tasks, generated on the fly for this matchup so neither model could prepare in advance, and had gpt-5.4 score each one. Bernini-R Edit Video scored 16.2 to Wan v2.6 Image to Video's 9.8.

1. Foundry skylight dim

Inside Bay 7 of the defunct Marrow Vale turbine foundry, a soot-streaked overhead crane trolley glides slowly along its rail above a row of half-finished copper impellers while the camera dollies sideways at catwalk height, keeping pace in a medium close tracking shot; midway through the clip a ragged storm cloud covers the late-afternoon sun pouring through broken clerestory windows, causing the hot amber shafts across the dust-filled air and steel floor to fade smoothly into a cooler dimness as small work lamps become more prominent, with a tense, hushed mood, 16:9

Winner: Bernini-R Edit Video. Model A (Bernini-R Edit Video) matches the prompt better: it shows a defunct foundry-like bay with rows of copper impeller-like forms, an overhead crane trolley, and a convincing transition from warm amber light to cooler dimness with lamps becoming prominent. Model B (Wan v2.6 Image to Video) is visually appealing but deviates notably with a person in frame, a single finished turbine rotor instead of rows of half-finished impellers, and weaker adherence to the specified tracking setup and lighting change.

2. Brine pump yard drift

At the edge of the Pelican-3 desalination yard, a lime-green maintenance robot rolls carefully along a grated walkway beside three humming brine pumps while the camera creeps forward in an intimate low tracking shot parallel to it; behind and around the subject, continuous ambient motion fills the scene as tall saltgrass bends in gusts, pale steam unfurls from a relief vent, shallow runoff water ripples in oily bands, loose warning tape flutters against a chain-link fence, and layered dawn clouds slide steadily overhead in cold blue light, creating a calm but slightly eerie industrial mood, 16:9

Winner: Bernini-R Edit Video. Model A (Bernini-R Edit Video) matches the desalination-yard setting, low tracking perspective, lime-green maintenance robot, and layered ambient motion much better, though it oddly introduces a second machine and is less polished. Model B (Wan v2.6 Image to Video) has cleaner rendering in its first frame, but it quickly abandons the prompt with unrelated grass-only shots and weaker temporal consistency.

See every prompt and the full side-by-side outputs in the interactive Head-to-Head: /head-to-head/head-to-head-bernini-r-edit-video-vs-wan-v2-6-image-to-video