Mission Control Case Study | Published May 9, 2026
The Architect's Stillness: a 10-model Spark video sweep.
We ran the same dense psychological-horror prompt through every Spark video model, archived the completed clips to the shared movie depot, captured GPU vitals, scored prompt fidelity, and wrote up the runtime failures that blocked the other seven lanes.
Five text-to-video lanes and five image-to-video lanes were submitted through the public video queue.
Wan 2.1, CogVideoX 2B, and CogVideoX 5B returned MP4 artifacts for review.
CogVideoX 5B had the strongest atmosphere match, but still missed the narrative beats.
GPU vitals: 292 samples, 96% p95 utilization, 83 °C peak temperature, 93.66 W peak power.
Executive read
What the sweep proved
- The queue and movie depot worked: all ten jobs reached terminal state, three MP4s were copied into /srv/neonflux/shared/chat-assets/movie/architects-stillness-20260508T232959Z, and the final Spark process table was clean.
- The completed models understood mood better than story. They produced dark architecture, cracked glass, and cold metallic texture, but none rendered the child, doll, cyborg brain, soldier control cutaway, or reflected-eye finale as requested.
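- The image-to-video lanes are not ready as configured. Four I2V failures were adapter or API mismatches (HunyuanVideo I2V, SkyReels V1 Hunyuan I2V, SkyReels V2 I2V 14B 540P, and Stable Video Diffusion XT), and LTX Video 13B Distilled ran out of CUDA memory even with the supplied source keyframe.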
- The source keyframe was a weak base for I2V. It captured wet scale and lone figures, but drifted toward an industrial hall instead of the obsidian cathedral brain chamber.
Completed outputs
Three reviewable clips
The strongest completed result was CogVideoX 5B. It gave the closest cathedral/dome atmosphere, but remained mostly an environmental shot rather than a full horror sequence.
Visual evidence
Source frame and contact sheets
Model table
Outcome by model
| Model | Task | Status | Latency | Prompt match | Finding |
|---|---|---|---|---|---|
| Wan 2.1 T2V 1.3B | T2V | completed | 343.94s | 0.5/10 | Valid MP4, but effectively black and missing all prompt elements. |
| CogVideoX 2B | T2V | completed | 209.27s | 2.5/10 | Gothic/industrial texture match, but static and missing narrative beats. |
| CogVideoX 5B | T2V | completed | 714.54s | 3.5/10 | Best mood match; suggests cracked dome/cathedral but not the requested sequence. |
| Mochi 1 Preview | T2V | failed | 20.73s | n/a | CUDA out of memory. |
| SkyReels V2 DF 14B 540P | T2V | failed | 7200.01s | n/a | Hit the configured two-hour runtime timeout at 960x544, 49 frames. |
| LTX Video 13B Distilled | I2V | failed | 48.88s | n/a | CUDA out of memory despite supplied source keyframe. |
| HunyuanVideo I2V | I2V | failed | 300.56s | n/a | Indexing/source-conditioning error after image handoff. |
| SkyReels V1 Hunyuan I2V | I2V | failed | 5.75s | n/a | Repository layout mismatch: missing model_index.json. |
| SkyReels V2 I2V 14B 540P | I2V | failed | 607.73s | n/a | Tensor channel mismatch in the image-conditioning path. |
| Stable Video Diffusion XT | I2V | failed | 7.71s | n/a | Pipeline rejected unsupported guidance_scale argument. |
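
The table can be tallied mechanically, which is a quick check on the counts quoted in the executive read. The sketch below hard-codes the rows from the table. The failure-class labels (`oom`, `timeout`, `adapter_or_api`) are our own grouping for illustration, not fields emitted by the queue.

```python
from collections import Counter

# (model, task, status, latency_s, failure_class) copied from the table above.
# failure_class is an illustrative grouping, not a field from the queue.
RESULTS = [
    ("Wan 2.1 T2V 1.3B", "T2V", "completed", 343.94, None),
    ("CogVideoX 2B", "T2V", "completed", 209.27, None),
    ("CogVideoX 5B", "T2V", "completed", 714.54, None),
    ("Mochi 1 Preview", "T2V", "failed", 20.73, "oom"),
    ("SkyReels V2 DF 14B 540P", "T2V", "failed", 7200.01, "timeout"),
    ("LTX Video 13B Distilled", "I2V", "failed", 48.88, "oom"),
    ("HunyuanVideo I2V", "I2V", "failed", 300.56, "adapter_or_api"),
    ("SkyReels V1 Hunyuan I2V", "I2V", "failed", 5.75, "adapter_or_api"),
    ("SkyReels V2 I2V 14B 540P", "I2V", "failed", 607.73, "adapter_or_api"),
    ("Stable Video Diffusion XT", "I2V", "failed", 7.71, "adapter_or_api"),
]


def tally(results):
    """Count rows by (task, status) and failed rows by failure class."""
    by_task_status = Counter((task, status) for _, task, status, _, _ in results)
    by_failure = Counter(cls for *_, cls in results if cls is not None)
    return by_task_status, by_failure


by_task_status, by_failure = tally(RESULTS)
print(dict(by_task_status))
print(dict(by_failure))
```

Under this grouping, the I2V rows show four adapter or API failures and one out-of-memory failure, which matches the executive read.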
Operator notes
What to fix before the next sweep
- Add low-memory presets for Mochi, LTX, and the 14B SkyReels lanes before treating them as public defaults.
- Strip unsupported kwargs per pipeline, especially Stable Video Diffusion's guidance_scale mismatch.
- Fix the Hunyuan/SkyReels image-conditioning adapters before scoring I2V creative quality.
- Make SSH timeouts terminate the remote process group, not just the local SSH child, so stale GPU jobs cannot overlap later queue entries.
- For prompts this dense, move to a storyboard/keyframe workflow instead of asking one short clip to bind every object and camera beat.
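
Two of the fixes above are mechanical enough to sketch. The first function drops keyword arguments that a pipeline callable does not declare, using `inspect.signature`. The second runs a command in its own session and signals the whole process group on timeout, which is the behavior the SSH fix needs from whatever wrapper runs on the remote host. The names `filter_supported_kwargs`, `run_with_group_timeout`, and `fake_i2v_pipeline` are hypothetical and do not come from the Spark queue code. The process-group part assumes a POSIX host.

```python
import inspect
import os
import signal
import subprocess
import sys


def filter_supported_kwargs(call, kwargs):
    """Split kwargs into (accepted, dropped) using the callable's signature."""
    params = inspect.signature(call).parameters
    # A **kwargs catch-all hides what is unsupported, so pass everything through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs), {}
    by_keyword = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                  inspect.Parameter.KEYWORD_ONLY)
    names = {n for n, p in params.items() if p.kind in by_keyword}
    accepted = {k: v for k, v in kwargs.items() if k in names}
    dropped = {k: v for k, v in kwargs.items() if k not in names}
    return accepted, dropped


def run_with_group_timeout(argv, timeout_s, grace_s=10.0):
    """Run argv in a new session; on timeout, signal its whole process group.

    Returns (returncode, timed_out). POSIX only.
    """
    # start_new_session=True makes the child a group leader, so pgid == pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        pass
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            break  # the group is already gone
        try:
            proc.wait(timeout=grace_s)
            break
        except subprocess.TimeoutExpired:
            continue  # escalate from SIGTERM to SIGKILL
    proc.wait()
    return proc.returncode, True


def fake_i2v_pipeline(image, num_frames=25):
    """Hypothetical stand-in for a pipeline that takes no guidance_scale."""
    return {"image": image, "num_frames": num_frames}


requested = {"image": "keyframe.png", "num_frames": 49, "guidance_scale": 6.0}
accepted, dropped = filter_supported_kwargs(fake_i2v_pipeline, requested)
result = fake_i2v_pipeline(**accepted)
```

Logging `dropped` rather than discarding it silently keeps the mismatch visible in the job record. Signalling the process group instead of a single PID is what stops worker subprocesses from outliving the job and holding the GPU.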