UlazAI docs

Image and video APIs

Video model capability matrix

This matrix is sourced from the Video Studio model registry. Use it to validate model-specific constraints before sending generation requests.

Model	Model ID	Description	Engine	Inputs	Aspect ratios	Durations	Quality modes	Credits estimate
Veo 3.1 Lite	veo31_lite	Most cost-effective Veo 3.1 mode (40 credits) for text-to-video and image-to-video.	Veo 3.1	text, image	16:9, 9:16, Auto	8s	-	base: 40
Veo 3.1 Fast	veo31_fast	Fast 8-second generation with text-to-video and image-to-video.	Veo 3.1	text, image	16:9, 9:16, Auto	8s	-	base: 100
Veo 3.1 Quality	veo31_quality	Higher-fidelity Veo 3.1 output with the same 8-second duration.	Veo 3.1	text, image	16:9, 9:16, Auto	8s	-	base: 220
Kling 3.0	kling_3_0	Supports text, image, and frame-based generation with optional elements.	Kling 3.0	text, image	16:9, 9:16, 1:1	3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	std, pro, 4K	std no audio per second: 20, std audio per second: 30, pro no audio per second: 27, pro audio per second: 40, +1 more
Kling 3.0 Motion Control	kling_3_0_motion_control	Requires exactly one image URL plus one motion reference video URL.	Kling 3.0 Motion Control	image, video	-	3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p per second: 20, 1080p per second: 27
Kling 2.6	kling_2_6	Stable 5s/10s generation with optional audio.	Kling 2.6	text, image	16:9, 9:16, 1:1	5, 10s	-	5 no audio: 55, 10 no audio: 110, 5 audio: 110, 10 audio: 220
Seedance 2.0	seedance_2	Supports text, first-frame, first+last-frame, and multimodal image/video/audio references. For real-person footage, use pre-registered asset:// IDs.	Seedance 2.0	text, image, video	1:1, 4:3, 3:4, 16:9, 9:16, 21:9, adaptive	4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	480p, 720p, 1080p	480p 4 with video input: 86, 480p 4 no video input: 76, 480p 5 with video input: 108, 480p 5 no video input: 95, +68 more
Seedance 2.0 Fast	seedance_2_fast	Seedance 2 Fast with the same first/last frame and multimodal reference options. For real-person footage, use pre-registered asset:// IDs.	Seedance 2.0 Fast	text, image, video	1:1, 4:3, 3:4, 16:9, 9:16, 21:9, adaptive	4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	480p, 720p	480p 4 with video input: 71, 480p 4 no video input: 62, 480p 5 with video input: 88, 480p 5 no video input: 78, +44 more
Seedance 1.5 Pro	seedance_1_5_pro	Audio-video model with fixed lens, multi aspect ratios, and 4/8/12s.	Seedance 1.5 Pro	text, image	1:1, 21:9, 4:3, 3:4, 16:9, 9:16	4, 8, 12s	480p, 720p, 1080p	480p 4 silent: 10, 480p 4 audio: 20, 480p 8 silent: 20, 480p 8 audio: 30, +14 more
Wan 2.7 Video	wan_2_7_t2v	Text-to-video generation with 720p/1080p quality modes and 2-15s durations.	Wan 2.7	text	16:9, 9:16, 1:1, 4:3, 3:4	2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p 2: 32, 720p 3: 48, 720p 4: 64, 720p 5: 80, +24 more
Wan 2.7 Image to Video	wan_2_7_i2v	Image-to-video generation with first-frame reference and 2-15s durations.	Wan 2.7	text, image	16:9, 9:16, 1:1, 4:3, 3:4	2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p 2: 32, 720p 3: 48, 720p 4: 64, 720p 5: 80, +24 more
Wan 2.7 Video Edit	wan_2_7_videoedit	Prompt-based video editing that requires one source video URL.	Wan 2.7	image, video	16:9, 9:16, 1:1, 4:3, 3:4	0, 2, 3, 4, 5, 6, 7, 8, 9, 10s	720p, 1080p	720p 2: 32, 720p 3: 48, 720p 4: 64, 720p 5: 80, +16 more
Wan 2.7 R2V	wan_2_7_r2v	Reference-to-video flow for image/video references and 2-10s durations.	Wan 2.7	text, image	16:9, 9:16, 1:1, 4:3, 3:4	2, 3, 4, 5, 6, 7, 8, 9, 10s	720p, 1080p	720p 2: 32, 720p 3: 48, 720p 4: 64, 720p 5: 80, +14 more
HappyHorse Video	happyhorse_t2v	HappyHorse text-to-video with 3-15s durations and 720p/1080p quality modes.	HappyHorse	text	16:9, 9:16, 1:1, 4:3, 3:4	3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p 3: 93, 720p 4: 124, 720p 5: 155, 720p 6: 186, +22 more
HappyHorse Image to Video	happyhorse_i2v	HappyHorse image-to-video with up to two image references.	HappyHorse	text, image	16:9, 9:16, 1:1, 4:3, 3:4	3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p 3: 93, 720p 4: 124, 720p 5: 155, 720p 6: 186, +22 more
HappyHorse Reference to Video	happyhorse_r2v	HappyHorse reference-to-video with up to nine image references.	HappyHorse	text, image	16:9, 9:16, 1:1, 4:3, 3:4	3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p 3: 93, 720p 4: 124, 720p 5: 155, 720p 6: 186, +22 more
HappyHorse Video Edit	happyhorse_videoedit	HappyHorse prompt-based video editing that requires one source video URL.	HappyHorse	image, video	16:9, 9:16, 1:1, 4:3, 3:4	3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15s	720p, 1080p	720p 3: 93, 720p 4: 124, 720p 5: 155, 720p 6: 186, +22 more
Grok Imagine Video	grok_imagine_video	Mode + resolution quality selector (for example normal|720p).	Grok Imagine Video	text, image	1:1, 2:3, 3:2, 9:16, 16:9	6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30s	normal|480p, normal|720p, fun|480p, fun|720p, spicy|480p, spicy|720p	6 480p: 10, 7 480p: 13, 8 480p: 15, 9 480p: 18, +46 more
Hailuo 2.3 Standard	hailuo_2_3_standard	Image-to-video model that requires one reference image URL.	Hailuo 2.3	image	16:9	6, 10s	768P, 1080P	768P 6: 30, 768P 10: 50, 1080P 6: 50
Hailuo 2.3 Pro	hailuo_2_3_pro	Higher-cost Hailuo tier with improved quality presets.	Hailuo 2.3	image	16:9	6, 10s	768P, 1080P	768P 6: 45, 768P 10: 90, 1080P 6: 80
Sora 2	sora_2	Sora 2 generation with 10s or 15s durations.	Sora 2	text, image	landscape, portrait	10, 15s	-	per second: 8
Sora 2 Pro	sora_2_pro	Sora 2 Pro with high and standard quality modes.	Sora 2	text, image	landscape, portrait	10, 15s	high, standard	per second: 8
Sora 2 Pro Storyboard	sora_2_pro_storyboard	Storyboard workflow with optional prompt and longer duration mode.	Sora 2	text, image	landscape, portrait	10, 15, 25s	-	10 seconds: 150, 15 25 seconds: 270
Gemini Omni Video	gemini_omni_video	-	Video engine	image, video	16:9, 9:16, 1:1, 4:3, 3:4	4, 6, 8, 10s	720p, 1080p, 4k	720p 4 no video input: 90, 720p 6 no video input: 120, 720p 8 no video input: 150, 720p 10 no video input: 180, +11 more
Gemini Omni Audio	gemini_omni_audio	-	Video engine	text	-	4s	-	base: 50
Gemini Omni Character	gemini_omni_character	-	Video engine	text, image	-	4s	-	base: 50
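
As a rough illustration of validating a request against the matrix before sending it, the Python sketch below checks inputs, aspect ratio, duration, and quality mode for three rows copied by hand from the table. The names ModelSpec, MATRIX, and validate_request are hypothetical and not part of the UlazAI API. The sketch also assumes that the Inputs column lists every input kind a model accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    inputs: frozenset
    aspect_ratios: frozenset
    durations: frozenset       # seconds
    quality_modes: frozenset   # empty when the matrix shows "-"


# Hand-copied subset of the matrix (hypothetical structure, not the real registry).
MATRIX = {
    "veo31_fast": ModelSpec(
        inputs=frozenset({"text", "image"}),
        aspect_ratios=frozenset({"16:9", "9:16", "Auto"}),
        durations=frozenset({8}),
        quality_modes=frozenset(),
    ),
    "kling_2_6": ModelSpec(
        inputs=frozenset({"text", "image"}),
        aspect_ratios=frozenset({"16:9", "9:16", "1:1"}),
        durations=frozenset({5, 10}),
        quality_modes=frozenset(),
    ),
    "wan_2_7_t2v": ModelSpec(
        inputs=frozenset({"text"}),
        aspect_ratios=frozenset({"16:9", "9:16", "1:1", "4:3", "3:4"}),
        durations=frozenset(range(2, 16)),
        quality_modes=frozenset({"720p", "1080p"}),
    ),
}


def validate_request(model_id, input_types, aspect_ratio, duration, quality=None):
    """Return a list of problems; an empty list means the request fits the matrix."""
    spec = MATRIX.get(model_id)
    if spec is None:
        return [f"unknown model: {model_id}"]

    problems = []
    unsupported = set(input_types) - spec.inputs
    if unsupported:
        problems.append(f"unsupported inputs: {sorted(unsupported)}")
    if aspect_ratio not in spec.aspect_ratios:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    if duration not in spec.durations:
        problems.append(f"unsupported duration: {duration}s")
    if spec.quality_modes:
        if quality not in spec.quality_modes:
            problems.append(f"unsupported quality mode: {quality}")
    elif quality is not None:
        problems.append("model has no quality modes")
    return problems
```

For example, validate_request("kling_2_6", {"text"}, "16:9", 7) reports an unsupported duration, because the matrix lists only 5s and 10s for that model.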

Model selection guidance

Use veo31_lite for the lowest Veo 3.1 cost profile (40 credits) in text-to-video and image-to-video flows.
Use veo31_fast when speed and predictable 8s output matter most.
Use kling_3_0 for flexible durations, quality modes, and frame controls.
Use wan_2_7_videoedit for source-video remix workflows; it requires one source video URL.
Use hailuo_2_3_* only when you can provide a reference image.
Use sora_2_pro_storyboard for multi-shot storyboard planning flows.
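
To compare cost profiles when applying the guidance above, the following Python sketch estimates credits for the Veo 3.1 tiers and for kling_3_0, using figures copied from the matrix. The function estimate_credits and both lookup tables are hypothetical helpers, not part of the UlazAI API. The sketch assumes that "base" is a flat charge per generation and that per-second rates scale linearly with duration.

```python
# Flat per-generation credits for the Veo 3.1 tiers, copied from the matrix.
BASE_CREDITS = {"veo31_lite": 40, "veo31_fast": 100, "veo31_quality": 220}

# kling_3_0 per-second rates keyed by (quality mode, audio enabled).
# The matrix elides one further rate ("+1 more"), so it is not listed here.
KLING_3_0_PER_SECOND = {
    ("std", False): 20,
    ("std", True): 30,
    ("pro", False): 27,
    ("pro", True): 40,
}


def estimate_credits(model_id, duration=8, quality="std", audio=False):
    """Estimate credits for one generation (assumes linear per-second pricing)."""
    if model_id in BASE_CREDITS:
        return BASE_CREDITS[model_id]
    if model_id == "kling_3_0":
        if not 3 <= duration <= 15:
            raise ValueError("kling_3_0 durations range from 3s to 15s")
        return KLING_3_0_PER_SECOND[(quality, audio)] * duration
    raise KeyError(f"no pricing data in this sketch for {model_id}")
```

Under these assumptions, an 8-second kling_3_0 clip at std without audio comes to 160 credits, compared with 100 for veo31_fast and 40 for veo31_lite.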
