Prompting Guide
How to write prompts that turn into great videos. This guide covers the prompt field in POST /videos.
ℹ️ How It Works
Your prompt is the creative brief. The AI expands it into a complete video with narration and hand-drawn visuals. You do not write the narration or draw the storyboard; the AI handles that.
How It Becomes a Video
The AI first turns your prompt into a script. After you review and approve the script (POST /videos/{id}/approve), the system generates the images and audio and renders the final video. Everything after approval is automatic: you just write the brief.
Themes
The theme parameter controls how visuals are structured:
| Theme | Style | Best For |
|---|---|---|
| Explainer Video | Drawings appear one-at-a-time as the narrator speaks. Each scene builds up on a blank whiteboard. | Tutorials, educational content, step-by-step explanations |
| Storyboard | Each scene is one complete illustration drawn all at once. Discrete, separated elements. | Storytelling, narratives, visual summaries, concept overviews |
💡 Choosing a Theme
Use Explainer Video when you want the "teacher drawing on a whiteboard" feel, where concepts build as they're explained. Use Storyboard when you want illustrated frames, where each scene is a complete picture that advances the story.
Writing Your Prompt
Your prompt is a natural language brief. It does not appear in the video. The AI uses it to decide what to say and what to draw.
What to Include
| Element | Why | Example |
|---|---|---|
| Topic | Defines the subject | How photosynthesis works |
| Audience | Adjusts language complexity | For a 10th grade biology class |
| Tone | Shapes narration style | Warm, encouraging, simple |
| Structure | Guides scene breakdown | Cover light absorption, water splitting, glucose production, oxygen release |
| Key points | Ensures specific content is covered | Compare chloroplasts to tiny kitchens |
| Call to action | Gives the video a conclusion | End with a motivational message |
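The elements in the table above can be assembled into a single brief. The following is a minimal sketch, not part of the API: `build_prompt` and its argument names are hypothetical helpers that simply join the pieces into sentences.

```python
def build_prompt(topic, audience="", tone="", structure="",
                 key_points="", call_to_action=""):
    """Join the brief elements into one natural-language prompt.

    Hypothetical helper: the API only sees the final string.
    """
    first = topic.strip()
    if audience:
        first += " " + audience.strip()
    sentences = [first]
    # Optional elements are appended only when provided.
    for piece in (key_points, structure, call_to_action):
        if piece:
            sentences.append(piece.strip())
    if tone:
        sentences.append("Tone: " + tone.strip())
    # Make sure every sentence ends with exactly one period.
    return " ".join(s.rstrip(".") + "." for s in sentences)


prompt = build_prompt(
    "Explain how photosynthesis works",
    audience="for a 10th grade biology class",
    tone="warm, encouraging, simple",
    structure="Cover light absorption, water splitting, "
              "glucose production, oxygen release",
    key_points="Compare chloroplasts to tiny kitchens",
    call_to_action="End with a motivational message",
)
print(prompt)
```

Only the topic is required here; leaving out audience or tone still produces a valid string, though the guide recommends always setting them.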
Constraints
- Max prompt length: 2,000 characters
- Max scenes: 40 per video (~10 min at 15s/scene)
- Concurrent videos: 3 max in progress at once
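These limits can be checked on the client before sending a request. The sketch below is illustrative only: `check_prompt` and `approx_minutes` are hypothetical names, and it assumes the 2,000-character limit counts Unicode characters as Python's `len` does, which the guide does not specify.

```python
MAX_PROMPT_CHARS = 2000   # from the guide
MAX_SCENES = 40           # from the guide
SECONDS_PER_SCENE = 15    # the guide's rough per-scene figure


def check_prompt(prompt):
    """Return a list of problems; an empty list means the prompt looks fine."""
    problems = []
    if not prompt.strip():
        problems.append("prompt is empty")
    # Assumption: the server counts characters the same way len() does.
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt is {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}"
        )
    return problems


def approx_minutes(scenes):
    """Rough video length for a scene count, at about 15 seconds per scene."""
    if not 1 <= scenes <= MAX_SCENES:
        raise ValueError(f"scenes must be between 1 and {MAX_SCENES}")
    return scenes * SECONDS_PER_SCENE / 60
```

For example, `approx_minutes(40)` gives 10.0, matching the "~10 min" figure above.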
Example Prompts
Educational (Explainer Video):
Explain how photosynthesis works for a 10th grade biology class. Use simple analogies: compare chloroplasts to tiny kitchens. Cover light absorption, water splitting, glucose production, and oxygen release. Keep it warm and encouraging.
Product demo (Explainer Video):
Explain how our meal-planning app works in 6 scenes. Start with the chaos of deciding what to cook daily. Show the app asking for dietary preferences, AI generating a weekly plan, auto-building a shopping list, cooking with step guidance. End with a happy family eating together. Tone: friendly, relatable.
Storytelling (Storyboard):
Tell the story of a startup founder's first year. Scene 1: the spark of an idea in a coffee shop. Scene 2: building the first prototype in a garage. Scene 3: launching and hearing crickets. Scene 4: pivoting and finding product-market fit. Scene 5: celebrating 10,000 users. End with a motivational message about persistence.
Prompt Structure Tips
Do
- Describe the concept, not the drawing. Say "explain budgeting", not "draw a pie chart"
- Specify 5-8 key ideas; each becomes a scene with multiple drawings
- Set the audience and tone; this has the biggest impact on narration quality
- Give scene structure hints, such as "start with a problem, then show 3 solutions"
- Mention characters by name when using Cast & Props
Don't
- Write the full narration yourself; the AI generates natural speech suited for text-to-speech
- Describe specific drawing layouts ("draw a flowchart with 5 boxes"); say "explain the 5 steps" and let each step become a standalone illustration
- Exceed 2,000 characters; be concise and let the AI expand
- Skip audience/tone info; it's the most impactful part of the prompt
Scene Structure
The AI decides scene count based on your prompt. Each scene contains:
- Speech: natural narration text, written to be spoken aloud rather than read
- Visual description: what gets drawn on the canvas
- Start time: when the scene begins in the video
Each scene has a minimum of 5 standalone drawings (Explainer Video) or 5 separated elements (Storyboard). The canvas is always a white background with drawings spread across the full surface; nothing clusters in the center.
💡 Better Prompts = Better Visuals
The more vivid and specific your prompt is about concepts, the better the visuals will be. Focus on what you want taught; the system handles the artistic style.
Audio
Single Voice
Default mode. One narrator speaks throughout. Select a voice via the voice_id parameter or let the system auto-select based on content language.
Multi Voice
Activate by passing speaker_ids with 2+ character asset IDs. Each character gets their own voice. The AI writes dialogue that reflects each character's personality, not forced Q&A ping-pong.
| Mode | When | Setup |
|---|---|---|
| Single (default) | 0 or 1 speaker selected | Pass `voice_id` or use auto-select |
| Multi voice | 2+ speakers selected | Pass `speaker_ids` with character asset IDs |
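The mode rule in the table can be expressed as a small helper. This is a sketch under assumptions: `audio_fields` is a hypothetical function, and the guide does not say whether `audio_mode` must be sent alongside `speaker_ids` or how a single selected speaker should be passed, so the single-speaker case below only sets the mode.

```python
def audio_fields(voice_id=None, speaker_ids=()):
    """Pick audio-related request fields from the selected speakers.

    Hypothetical helper. Field names come from the guide; sending
    audio_mode explicitly is an assumption.
    """
    speaker_ids = list(speaker_ids)
    if len(speaker_ids) >= 2:
        # Two or more character asset IDs activate multi voice.
        return {"audio_mode": "multi_voice", "speaker_ids": speaker_ids}
    # Zero or one speaker: single narrator.
    fields = {"audio_mode": "single"}
    if voice_id:
        fields["voice_id"] = voice_id  # otherwise the system auto-selects
    return fields
```

Calling `audio_fields()` with no arguments leaves voice selection to the system, which the guide describes as auto-selecting based on content language.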
Cast & Props
Use the Assets API to create characters and objects, then reference them in your video:
- Characters: can speak (toggle speaker mode), have a voice, and optionally have a reference image
- Objects: props that appear in scenes but don't speak
Reference images guide the AI to draw characters consistently across scenes. The AI describes what characters are doing, their pose, and their expression, not their appearance (the reference handles that).
ℹ️ Character Descriptions
Character descriptions (via the Assets API) affect both script tone and image generation. For example, "A wise old owl with round glasses" tells the AI to write professorial dialogue and to draw the character consistently.
Regeneration
After script generation (status: refining), you can regenerate with feedback:
curl -X POST https://api.sketchpen.app/api/v1/videos/{id}/regenerate \
-H "X-Api-Key: sk_live_..." \
-H "Content-Type: application/json" \
-d '{"feedback": "Make the tone more casual and add humor"}'
The system feeds your previous script plus your feedback back into the AI, which builds on what worked and fixes what didn't. Rate limit: 5 per hour and 3 per 10 minutes.
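The same regenerate call can be built from Python using only the standard library. This is a minimal sketch: `video_action_request` is a hypothetical helper, `VIDEO_ID` is a placeholder, and the request is constructed but not sent. The URL, headers, and `feedback` field mirror the curl example; that the response is JSON is an assumption.

```python
import json
import urllib.request

BASE_URL = "https://api.sketchpen.app/api/v1"  # taken from the curl example


def video_action_request(video_id, action, api_key, body=None):
    """Build (but do not send) a POST to /videos/{id}/{action}."""
    data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/videos/{video_id}/{action}",
        data=data,
        method="POST",
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
    )


req = video_action_request(
    "VIDEO_ID",            # placeholder: use the ID of your video
    "regenerate",
    "sk_live_...",         # placeholder API key, as in the curl example
    {"feedback": "Make the tone more casual and add humor"},
)

# To actually send it (assumes a JSON response body):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Because regeneration is rate limited, a client that retries automatically would want to space its calls rather than loop.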
Useful Feedback Examples
- "Make the tone more casual and add humor"
- "Scene 3 is too complex, simplify it"
- "Add a scene about the history of this concept"
- "The ending is weak, make it more impactful"
- "Target a younger audience, high school level"
Credit Costs
| Operation | Cost |
|---|---|
| Script generation | Free |
| Image per scene (High quality) | 1.5 credits |
| Image per scene (Medium quality) | 1 credit |
| Audio generation | 1 credit per 1,000 characters |
| Regenerate | Free (re-approval costs credits) |
💡 Cost Estimate
A 5-scene video at High quality costs 7.5 credits for images plus roughly 2-3 credits for audio, or about 10 credits in total.
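The arithmetic behind this estimate can be sketched as a small function. `estimate_credits` is a hypothetical helper using the rates from the table; it assumes audio is billed pro rata per character rather than rounded up to whole blocks of 1,000, which the guide does not state.

```python
# Rates from the Credit Costs table.
IMAGE_CREDITS_PER_SCENE = {"high": 1.5, "medium": 1.0}
AUDIO_CREDITS_PER_1000_CHARS = 1.0


def estimate_credits(scenes, narration_chars, scene_quality="high"):
    """Estimate total credits for one video (script generation is free).

    Assumption: audio cost scales linearly with narration length.
    """
    images = scenes * IMAGE_CREDITS_PER_SCENE[scene_quality]
    audio = narration_chars / 1000 * AUDIO_CREDITS_PER_1000_CHARS
    return images + audio


# 5 scenes at High quality with 2,500 characters of narration:
# 5 * 1.5 = 7.5 for images, 2.5 for audio.
print(estimate_credits(5, 2500))
```

The narration length is only known after the script is generated, so an estimate made before approval has to guess it.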
Quick Reference
| Parameter | Default | Options |
|---|---|---|
| theme | Explainer Video | Explainer Video, Storyboard |
| bg_mode | white | white, dynamic |
| scene_quality | high | high (1.5 cr/scene), medium (1 cr/scene) |
| aspect_ratio | 16:9 | 16:9, 9:16, 1:1 |
| audio_mode | single | none, single, multi_voice |
| fps | 24 | 1-60 |
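The defaults and options above can be captured in a request-body builder that rejects values outside the table. This is a sketch, not the API's schema: `video_payload` is a hypothetical helper, and it assumes these parameters sit in a flat JSON body next to `prompt` and that `fps` is an integer.

```python
DEFAULTS = {
    "theme": "Explainer Video",
    "bg_mode": "white",
    "scene_quality": "high",
    "aspect_ratio": "16:9",
    "audio_mode": "single",
    "fps": 24,
}
ALLOWED = {
    "theme": {"Explainer Video", "Storyboard"},
    "bg_mode": {"white", "dynamic"},
    "scene_quality": {"high", "medium"},
    "aspect_ratio": {"16:9", "9:16", "1:1"},
    "audio_mode": {"none", "single", "multi_voice"},
}


def video_payload(prompt, **overrides):
    """Merge overrides into the documented defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"prompt": prompt, **DEFAULTS, **overrides}
    for name, allowed in ALLOWED.items():
        if payload[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    # Assumption: fps is a whole number in the documented 1-60 range.
    fps = payload["fps"]
    if not isinstance(fps, int) or not 1 <= fps <= 60:
        raise ValueError("fps must be an integer from 1 to 60")
    return payload
```

For instance, `video_payload("Explain budgeting", theme="Storyboard", aspect_ratio="9:16")` keeps every other parameter at its default.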