Prompting Guide

How to write prompts that turn into great videos. This guide covers the prompt field in POST /videos.

โ„น๏ธ How It Works

Your prompt is the creative brief. The AI expands it into a complete video with narration and hand-drawn visuals. You do not write the narration or draw the storyboard; the AI handles that.


How It Becomes a Video

The AI first expands your prompt into a script. After you review and approve that script (POST /videos/{id}/approve), the system generates images and audio, then renders the final video. The whole pipeline is automatic; you just write the brief.
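
The two calls in this pipeline can be sketched in shell. This is a minimal sketch, not a definitive client. The base URL and the X-Api-Key header are copied from the regenerate example in the Regeneration section, and `prompt` and `theme` are parameters this guide names. The JSON body shape, the `SKETCHPEN_API_KEY` variable, and the `create_video` and `approve_video` helper names are assumptions made for illustration. Nothing is sent until you call the functions yourself.

```bash
# Sketch only: base URL and header come from the regenerate example in this
# guide; the request body shape is an assumption, not a documented schema.
BASE_URL="https://api.sketchpen.app/api/v1"

# This prompt contains no quotes or backslashes, so printf is enough here.
# Use a real JSON encoder for arbitrary text.
PROMPT="Explain how photosynthesis works for a 10th grade biology class."
PAYLOAD=$(printf '{"prompt": "%s", "theme": "%s"}' "$PROMPT" "Explainer Video")

# Step 1: submit the brief (hypothetical helper name).
create_video() {
  curl -sS -X POST "$BASE_URL/videos" \
    -H "X-Api-Key: $SKETCHPEN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
}

# Step 2: approve the reviewed script; $1 is the video id (hypothetical helper name).
approve_video() {
  curl -sS -X POST "$BASE_URL/videos/$1/approve" \
    -H "X-Api-Key: $SKETCHPEN_API_KEY"
}

# Show the body that create_video would send.
echo "$PAYLOAD"
```

Between the two steps you would read the generated script and, if needed, regenerate it with feedback before approving.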


Themes

The theme parameter controls how visuals are structured:

| Theme | Style | Best For |
| --- | --- | --- |
| Explainer Video | Drawings appear one at a time as the narrator speaks. Each scene builds up on a blank whiteboard. | Tutorials, educational content, step-by-step explanations |
| Storyboard | Each scene is one complete illustration drawn all at once. Discrete, separated elements. | Storytelling, narratives, visual summaries, concept overviews |

💡 Choosing a Theme

Use Explainer Video when you want the "teacher drawing on a whiteboard" feel, where concepts build as they're explained. Use Storyboard when you want illustrated frames, where each scene is a complete picture that advances the story.


Writing Your Prompt

Your prompt is a natural language brief. It does not appear in the video. The AI uses it to decide what to say and what to draw.

What to Include

| Element | Why | Example |
| --- | --- | --- |
| Topic | Defines the subject | How photosynthesis works |
| Audience | Adjusts language complexity | For a 10th grade biology class |
| Tone | Shapes narration style | Warm, encouraging, simple |
| Structure | Guides scene breakdown | Cover light absorption, water splitting, glucose production, oxygen release |
| Key points | Ensures specific content is covered | Compare chloroplasts to tiny kitchens |
| Call to action | Gives the video a conclusion | End with a motivational message |

Constraints

  • Max prompt length: 2,000 characters
  • Max scenes: 40 per video (~10 min at 15s/scene)
  • Concurrent videos: 3 max in progress at once
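
A client-side check against these limits might look like the sketch below. The limits are the ones listed above. The function names are hypothetical, and whether the API counts characters or bytes is not stated in this guide, so treat the length check as approximate for non-ASCII prompts.

```bash
MAX_PROMPT_CHARS=2000
SECONDS_PER_SCENE=15

# check_prompt TEXT: succeeds when the prompt fits the documented limit.
# ${#1} counts characters in a UTF-8 locale and bytes in the C locale.
check_prompt() {
  local len=${#1}
  if [ "$len" -gt "$MAX_PROMPT_CHARS" ]; then
    echo "too long: $len chars (max $MAX_PROMPT_CHARS)"
    return 1
  fi
  echo "ok: $len chars"
}

# scene_minutes COUNT: approximate runtime in whole minutes at 15s per scene.
scene_minutes() {
  echo $(( $1 * SECONDS_PER_SCENE / 60 ))
}

check_prompt "Explain budgeting for first-year college students."
scene_minutes 40
```

The last call reproduces the figure in the list: 40 scenes at 15 seconds each is 600 seconds, or 10 minutes.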

Example Prompts

Educational (Explainer Video):

Explain how photosynthesis works for a 10th grade biology class. Use simple analogies: compare chloroplasts to tiny kitchens. Cover light absorption, water splitting, glucose production, and oxygen release. Keep it warm and encouraging.

Product demo (Explainer Video):

Explain how our meal-planning app works in 6 scenes. Start with the chaos of deciding what to cook daily. Show the app asking for dietary preferences, AI generating a weekly plan, auto-building a shopping list, cooking with step guidance. End with a happy family eating together. Tone: friendly, relatable.

Storytelling (Storyboard):

Tell the story of a startup founder's first year. Scene 1: the spark of an idea in a coffee shop. Scene 2: building the first prototype in a garage. Scene 3: launching and hearing crickets. Scene 4: pivoting and finding product-market fit. Scene 5: celebrating 10,000 users. End with a motivational message about persistence.


Prompt Structure Tips

Do

  • Describe the concept, not the drawing. Say "explain budgeting" not "draw a pie chart"
  • Specify 5-8 key ideas: each becomes a scene with multiple drawings
  • Set the audience and tone: this has the biggest impact on narration quality
  • Give scene structure hints, such as "start with a problem, then show 3 solutions"
  • Mention characters by name when using Cast & Props

Don't

  • Write the full narration yourself: the AI generates natural speech suited for text-to-speech
  • Describe specific drawing layouts ("draw a flowchart with 5 boxes"); say "explain the 5 steps" instead and let each step become a standalone illustration
  • Exceed 2,000 characters: be concise and let the AI expand
  • Skip audience/tone info: it's the most impactful part of the prompt

Scene Structure

The AI decides scene count based on your prompt. Each scene contains:

  • Speech: natural narration text (written to be spoken aloud, not read on a page)
  • Visual description: what gets drawn on the canvas
  • Start time: when the scene begins in the video

Each scene has a minimum of 5 standalone drawings (Explainer Video) or 5 separated elements (Storyboard). The canvas is always a white background with drawings spread across the full surface; nothing clusters in the center.

💡 Better Prompts = Better Visuals

The more vivid and specific your prompt is about concepts, the better the visuals will be. Focus on what you want taught; the system handles the artistic style.


Audio

Single Voice

Default mode. One narrator speaks throughout. Select a voice via the voice_id parameter or let the system auto-select based on content language.

Multi Voice

Activate by passing speaker_ids with 2+ character asset IDs. Each character gets their own voice. The AI writes dialogue that reflects each character's personality, not forced Q&A ping-pong.

| Mode | When | Setup |
| --- | --- | --- |
| Single (default) | 0 or 1 speaker selected | Pass `voice_id` or use auto-select |
| Multi voice | 2+ speakers selected | Pass `speaker_ids` with character asset IDs |
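
The rule in this table (2 or more speakers selects multi voice, otherwise single) can be sketched as a small shell helper that emits the audio-related fields of a request body. This is an illustration under assumptions: `audio_fields` is a hypothetical helper, the asset IDs are made up, and this guide does not say whether `audio_mode` must be sent alongside `speaker_ids` or is inferred from them.

```bash
# audio_fields [ASSET_ID...]: print the audio part of a JSON request body.
# With 2+ IDs it selects multi voice; with 0 or 1 it falls back to single.
audio_fields() {
  if [ "$#" -ge 2 ]; then
    local ids="" id
    for id in "$@"; do
      # Append each ID as a quoted JSON string, comma-separated.
      ids="$ids${ids:+, }\"$id\""
    done
    printf '"audio_mode": "multi_voice", "speaker_ids": [%s]\n' "$ids"
  else
    printf '"audio_mode": "single"\n'
  fi
}

audio_fields char_owl char_fox   # hypothetical character asset IDs
audio_fields                     # no speakers selected
```

In single mode you would add `voice_id` yourself or leave it out to use auto-select.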

Cast & Props

Use the Assets API to create characters and objects, then reference them in your video:

  • Characters: can speak (toggle speaker mode), have a voice, and optionally a reference image
  • Objects: props that appear in scenes but don't speak

Reference images guide the AI to draw characters consistently across scenes. The AI describes what characters are doing, their pose, and expression, not their appearance (the reference handles that).

โ„น๏ธ Character Descriptions

Character descriptions (via the Assets API) affect both script tone and image generation. "A wise old owl with round glasses" tells the AI to write professorial dialogue and draw consistently.


Regeneration

After script generation (status: refining), you can regenerate with feedback:

```bash
curl -X POST https://api.sketchpen.app/api/v1/videos/{id}/regenerate \
  -H "X-Api-Key: sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"feedback": "Make the tone more casual and add humor"}'
```

The system feeds your previous script and your feedback back into the AI, which builds on what worked and fixes what didn't. Rate limit: 5 regenerations per hour and 3 per 10 minutes.
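
If you script regenerations, a client-side guard can avoid hitting both limits. The sketch below assumes the limits behave as sliding windows, which this guide does not specify, and `can_regenerate` is a hypothetical helper that takes the current epoch time followed by the epoch times of your earlier regenerate calls.

```bash
# can_regenerate NOW [TIMESTAMP...]: succeeds if one more call at NOW stays
# under 5 calls in the last hour and 3 calls in the last 10 minutes.
can_regenerate() {
  local now=$1 ts age in_hour=0 in_10min=0
  shift
  for ts in "$@"; do
    age=$(( now - ts ))
    if [ "$age" -lt 3600 ]; then in_hour=$(( in_hour + 1 )); fi
    if [ "$age" -lt 600 ]; then in_10min=$(( in_10min + 1 )); fi
  done
  [ "$in_hour" -lt 5 ] && [ "$in_10min" -lt 3 ]
}

now=$(date +%s)
# Two earlier calls, 2 minutes and 1 minute ago.
if can_regenerate "$now" $(( now - 120 )) $(( now - 60 )); then
  echo "ok to regenerate"
else
  echo "wait before regenerating"
fi
```

The server remains the source of truth; this only reduces avoidable rejections.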

Useful Feedback Examples

  • "Make the tone more casual and add humor"
  • "Scene 3 is too complex, simplify it"
  • "Add a scene about the history of this concept"
  • "The ending is weak, make it more impactful"
  • "Target a younger audience, high school level"

Credit Costs

| Operation | Cost |
| --- | --- |
| Script generation | Free |
| Image per scene (High quality) | 1.5 credits |
| Image per scene (Medium quality) | 1 credit |
| Audio generation | 1 credit per 1,000 characters |
| Regenerate | Free (re-approval costs credits) |

💡 Cost Estimate

A 5-scene video at High quality costs about 7.5 credits for images plus roughly 2-3 credits for audio, or about 10 credits total.
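
The same arithmetic can be scripted from the rates in the Credit Costs table. In this sketch `estimate_credits` is a hypothetical helper, and it assumes audio is billed pro rata per character; the guide does not say whether partial thousands are rounded up.

```bash
# estimate_credits SCENES QUALITY NARRATION_CHARS
# QUALITY is "high" (1.5 credits/scene) or "medium" (1 credit/scene).
estimate_credits() {
  awk -v scenes="$1" -v quality="$2" -v chars="$3" 'BEGIN {
    per_scene = (quality == "high") ? 1.5 : 1.0
    images = scenes * per_scene
    audio = chars / 1000   # assumption: billed pro rata, not rounded up
    printf "images=%.1f audio=%.1f total=%.1f\n", images, audio, images + audio
  }'
}

estimate_credits 5 high 2500
# prints: images=7.5 audio=2.5 total=10.0
```

With 2,500 characters of narration this matches the estimate above: 7.5 credits for images plus 2.5 for audio.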


Quick Reference

| Parameter | Default | Options |
| --- | --- | --- |
| theme | Explainer Video | Explainer Video, Storyboard |
| bg_mode | white | white, dynamic |
| scene_quality | high | high (1.5 cr/scene), medium (1 cr/scene) |
| aspect_ratio | 16:9 | 16:9, 9:16, 1:1 |
| audio_mode | single | none, single, multi_voice |
| fps | 24 | 1-60 |
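
Put together, a request body that spells out every default from the quick reference might look like the sketch below. The parameter names and values come from this guide. The flat JSON layout and the exact string forms (for example, passing the theme name verbatim) are assumptions, so check them against the full spec before relying on them.

```bash
# Assumed JSON layout; each value is the documented default.
PAYLOAD=$(cat <<'EOF'
{
  "prompt": "Explain budgeting for first-year college students. Tone: friendly.",
  "theme": "Explainer Video",
  "bg_mode": "white",
  "scene_quality": "high",
  "aspect_ratio": "16:9",
  "audio_mode": "single",
  "fps": 24
}
EOF
)
echo "$PAYLOAD"
```

Since these are all defaults, sending only `prompt` should presumably be equivalent; listing them makes it easy to change one at a time.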

Need the raw text for your AI agent? View MDX · Full spec at /llms.txt