UlazAI developer docs
Video model capability matrix
This matrix is sourced from the Video Studio model registry. Use it to validate model-specific constraints before sending generation requests.
| Model | Engine | Inputs | Aspect ratios | Durations | Quality modes | Credits estimate |
|---|---|---|---|---|---|---|
| Veo 3.1 Lite (`veo31_lite`). Most cost-effective Veo 3.1 mode (40 credits) for text-to-video and image-to-video. | Veo 3.1 | text, image | 16:9, 9:16, Auto | 8s | - | base: 40 |
| Veo 3.1 Fast (`veo31_fast`). Fast 8-second generation with text-to-video and image-to-video. | Veo 3.1 | text, image | 16:9, 9:16, Auto | 8s | - | base: 100 |
| Veo 3.1 Quality (`veo31_quality`). Higher-fidelity Veo 3.1 output with the same 8-second duration. | Veo 3.1 | text, image | 16:9, 9:16, Auto | 8s | - | base: 220 |
| Kling 3.0 (`kling_3_0`). Supports text, image, and frame-based generation with optional elements. | Kling 3.0 | text, image | 16:9, 9:16, 1:1 | 3 to 15s (whole seconds) | std, pro | std no audio per second: 20, std audio per second: 30, pro no audio per second: 27, pro audio per second: 40 |
| Kling 3.0 Motion Control (`kling_3_0_motion_control`). Requires exactly one image URL plus one motion reference video URL. | Kling 3.0 Motion Control | image, video | - | 3 to 30s (whole seconds) | 720p, 1080p | 720p per second: 12, 1080p per second: 20 |
| Kling 2.6 (`kling_2_6`). Stable 5s/10s generation with optional audio. | Kling 2.6 | text, image | 16:9, 9:16, 1:1 | 5, 10s | - | 5s no audio: 55, 10s no audio: 110, 5s audio: 110, 10s audio: 220 |
| Seedance 2.0 (`seedance_2`). Supports text, first-frame, first+last-frame, and multimodal image/video/audio references. For real-person footage, use pre-registered `asset://` IDs. | Seedance 2.0 | text, image, video | 1:1, 4:3, 3:4, 16:9, 9:16, 21:9, adaptive | 4 to 15s (whole seconds) | 480p, 720p | 480p 4s with video input: 86, 480p 4s no video input: 76, 480p 5s with video input: 108, 480p 5s no video input: 95, +44 more |
| Seedance 2.0 Fast (`seedance_2_fast`). Offers the same first/last-frame and multimodal reference options as Seedance 2.0. For real-person footage, use pre-registered `asset://` IDs. | Seedance 2.0 Fast | text, image, video | 1:1, 4:3, 3:4, 16:9, 9:16, 21:9, adaptive | 4 to 15s (whole seconds) | 480p, 720p | 480p 4s with video input: 71, 480p 4s no video input: 62, 480p 5s with video input: 88, 480p 5s no video input: 78, +44 more |
| Seedance 1.5 Pro (`seedance_1_5_pro`). Audio-video model with fixed lens, multiple aspect ratios, and 4/8/12s durations. | Seedance 1.5 Pro | text, image | 1:1, 21:9, 4:3, 3:4, 16:9, 9:16 | 4, 8, 12s | 480p, 720p, 1080p | 480p 4s silent: 10, 480p 4s audio: 20, 480p 8s silent: 20, 480p 8s audio: 30, +14 more |
| Wan 2.6 (`wan_2_6`). Text/image to video with 720p/1080p quality modes. | Wan 2.6 | text, image | 16:9 | 5, 10, 15s | 720p, 1080p | 720p 5s: 70, 720p 10s: 140, 720p 15s: 210, 1080p 5s: 105, +2 more |
| Wan 2.6 Video Remix (`wan_2_6_v2v`). Video remix flow that requires one source video input URL. | Wan 2.6 | image, video | 16:9 | 5, 10s | 720p, 1080p | 720p 5s: 70, 720p 10s: 140, 1080p 5s: 105, 1080p 10s: 210 |
| Grok Imagine Video (`grok_imagine_video`). The quality mode combines a mode and a resolution (for example `normal\|720p`). | Grok Imagine Video | text, image | 1:1, 2:3, 3:2, 9:16, 16:9 | 6 to 30s (whole seconds) | `normal\|480p`, `normal\|720p`, `fun\|480p`, `fun\|720p`, `spicy\|480p`, `spicy\|720p` | 6s 480p: 10, 7s 480p: 13, 8s 480p: 15, 9s 480p: 18, +46 more |
| Hailuo 2.3 Standard (`hailuo_2_3_standard`). Image-to-video model that requires one reference image URL. | Hailuo 2.3 | image | 16:9 | 6, 10s | 768P, 1080P | 768P 6s: 30, 768P 10s: 50, 1080P 6s: 50 |
| Hailuo 2.3 Pro (`hailuo_2_3_pro`). Higher-cost Hailuo tier with improved quality presets. | Hailuo 2.3 | image | 16:9 | 6, 10s | 768P, 1080P | 768P 6s: 45, 768P 10s: 90, 1080P 6s: 80 |
| Sora 2 (`sora_2`). Sora 2 generation with 10s or 15s durations. | Sora 2 | text, image | landscape, portrait | 10, 15s | - | per second: 8 |
| Sora 2 Pro (`sora_2_pro`). Sora 2 Pro with high and standard quality modes. | Sora 2 | text, image | landscape, portrait | 10, 15s | high, standard | per second: 8 |
| Sora 2 Pro Storyboard (`sora_2_pro_storyboard`). Storyboard workflow with optional prompt and longer duration mode. | Sora 2 | text, image | landscape, portrait | 10, 15, 25s | - | 10s: 150, 15s or 25s: 270 |
| Wan 2.7 Text to Video (`wan_2_7_t2v`) | Video engine | text | 16:9, 9:16, 1:1, 4:3, 3:4 | 2 to 15s (whole seconds) | 720p, 1080p | 720p 2s: 32, 720p 3s: 48, 720p 4s: 64, 720p 5s: 80, +24 more |
| Wan 2.7 Image to Video (`wan_2_7_i2v`) | Video engine | text, image | 16:9, 9:16, 1:1, 4:3, 3:4 | 2 to 15s (whole seconds) | 720p, 1080p | 720p 2s: 32, 720p 3s: 48, 720p 4s: 64, 720p 5s: 80, +24 more |
| Wan 2.7 Video Edit (`wan_2_7_videoedit`) | Video engine | image, video | 16:9, 9:16, 1:1, 4:3, 3:4 | 0, 2 to 10s (whole seconds) | 720p, 1080p | 720p 2s: 32, 720p 3s: 48, 720p 4s: 64, 720p 5s: 80, +16 more |
| Wan 2.7 R2V (`wan_2_7_r2v`) | Video engine | text, image | 16:9, 9:16, 1:1, 4:3, 3:4 | 2 to 10s (whole seconds) | 720p, 1080p | 720p 2s: 32, 720p 3s: 48, 720p 4s: 64, 720p 5s: 80, +14 more |
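
As an illustration of validating a request against this matrix before sending it, here is a minimal Python sketch. It hand-copies three rows and the Kling 3.0 per-second rates from the matrix. The names `ModelSpec`, `MATRIX`, `validate_request`, and `estimate_kling_3_0_credits` are hypothetical and not part of any UlazAI SDK, and the parameter names do not reflect the real request payload. The sketch only checks that the supplied inputs are allowed, not that required inputs are present, and it assumes a per-second rate is simply multiplied by the duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    inputs: frozenset
    aspect_ratios: frozenset
    durations: frozenset      # allowed durations in seconds
    quality_modes: frozenset  # empty when the matrix shows "-"


# Three rows copied by hand from the matrix (illustrative subset).
MATRIX = {
    "veo31_fast": ModelSpec(
        inputs=frozenset({"text", "image"}),
        aspect_ratios=frozenset({"16:9", "9:16", "Auto"}),
        durations=frozenset({8}),
        quality_modes=frozenset(),
    ),
    "kling_3_0": ModelSpec(
        inputs=frozenset({"text", "image"}),
        aspect_ratios=frozenset({"16:9", "9:16", "1:1"}),
        durations=frozenset(range(3, 16)),  # 3 to 15 seconds inclusive
        quality_modes=frozenset({"std", "pro"}),
    ),
    "hailuo_2_3_standard": ModelSpec(
        inputs=frozenset({"image"}),
        aspect_ratios=frozenset({"16:9"}),
        durations=frozenset({6, 10}),
        quality_modes=frozenset({"768P", "1080P"}),
    ),
}

# Kling 3.0 per-second credit rates from the matrix, keyed by (quality, audio).
KLING_3_0_RATES = {
    ("std", False): 20,
    ("std", True): 30,
    ("pro", False): 27,
    ("pro", True): 40,
}


def validate_request(model_id, input_types, aspect_ratio, duration, quality=None):
    """Return a list of problems; an empty list means the request fits the matrix."""
    spec = MATRIX.get(model_id)
    if spec is None:
        return [f"unknown model: {model_id}"]
    problems = []
    unsupported = set(input_types) - spec.inputs
    if unsupported:
        problems.append(f"unsupported inputs: {sorted(unsupported)}")
    if aspect_ratio not in spec.aspect_ratios:
        problems.append(f"aspect ratio {aspect_ratio!r} not in {sorted(spec.aspect_ratios)}")
    if duration not in spec.durations:
        problems.append(f"duration {duration}s not in {sorted(spec.durations)}")
    if spec.quality_modes and quality not in spec.quality_modes:
        problems.append(f"quality {quality!r} not in {sorted(spec.quality_modes)}")
    if not spec.quality_modes and quality is not None:
        problems.append("this model lists no quality modes")
    return problems


def estimate_kling_3_0_credits(duration, quality, audio):
    """Assumption: total credits = per-second rate * duration in seconds."""
    return KLING_3_0_RATES[(quality, audio)] * duration


print(validate_request("kling_3_0", ["text"], "16:9", 5, "pro"))
print(validate_request("veo31_fast", ["text"], "1:1", 8))
print(estimate_kling_3_0_credits(5, "pro", audio=True))
```

Returning a list of problems rather than raising on the first one lets a caller report every constraint violation at once. In practice the specs would be loaded from the registry rather than copied by hand.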
Model selection guidance
- Use `veo31_lite` for the lowest Veo 3.1 cost profile (40 credits) in text-to-video and image-to-video flows.
- Use `veo31_fast` when speed and predictable 8s output matter most.
- Use `kling_3_0` for flexible durations, quality modes, and frame controls.
- Use `wan_2_6_v2v` for source-video remix workflows.
- Use `hailuo_2_3_*` only when you can provide a reference image.
- Use `sora_2_pro_storyboard` for multi-shot storyboard planning flows.
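
The guidance above could be encoded as a small lookup, sketched here in Python. The use-case keys (`lowest_veo_cost`, `fast_8s`, and so on) and both function names are hypothetical labels invented for this example; only the model IDs come from the list above.

```python
from fnmatch import fnmatchcase

# Hypothetical use-case labels mapped to the model IDs recommended above.
GUIDANCE = {
    "lowest_veo_cost": "veo31_lite",
    "fast_8s": "veo31_fast",
    "flexible_durations": "kling_3_0",
    "video_remix": "wan_2_6_v2v",
    "storyboard": "sora_2_pro_storyboard",
}


def pick_model(use_case):
    """Return the recommended model ID, or raise KeyError for an unknown use case."""
    return GUIDANCE[use_case]


def requires_reference_image(model_id):
    """True for the hailuo_2_3_* family, which needs a reference image."""
    return fnmatchcase(model_id, "hailuo_2_3_*")


print(pick_model("video_remix"))
print(requires_reference_image("hailuo_2_3_pro"))
```

A caller could use `requires_reference_image` as a guard before submitting a Hailuo request without an image URL.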