talking-head-production
Talking head video production with AI avatars, lipsync, and voiceover. Covers portrait requirements, audio quality, OmniHuman, PixVerse lipsync, Dia TTS. Use for: spokesperson videos, course content, social media, presentations, demos. Triggers: talking head, avatar video, lipsync, lip sync, ai spokesperson, virtual presenter, ai presenter, omnihuman, talking avatar, video presenter, ai talking head, presenter video, ai face video
COMMAND
/talking-head-production
CATEGORY
Productivity
REPOSITORY
inf-sh/skills
SKILL PROMPT
---
name: talking-head-production
description: "Talking head video production with AI avatars, lipsync, and voiceover. Covers portrait requirements, audio quality, OmniHuman, PixVerse lipsync, Dia TTS. Use for: spokesperson videos, course content, social media, presentations, demos. Triggers: talking head, avatar video, lipsync, lip sync, ai spokesperson, virtual presenter, ai presenter, omnihuman, talking avatar, video presenter, ai talking head, presenter video, ai face video"
allowed-tools: Bash(infsh *)
---
# Talking Head Production
Create talking head videos with AI avatars and lipsync via the [inference.sh](https://inference.sh) CLI.
## Quick Start
> Requires inference.sh CLI (`infsh`). Get installation instructions: `npx skills add inference-sh/skills@agent-tools`
```bash
infsh login

# Generate dialogue audio
infsh app run falai/dia-tts --input '{
  "prompt": "[S1] Welcome to our product tour. Today I will show you three features that will save you hours every week."
}'

# Create talking head video with OmniHuman
infsh app run bytedance/omnihuman-1-5 --input '{
  "image": "path/to/portrait.png",
  "audio": "path/to/dialogue.mp3"
}'
```
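Hand-written JSON inside single quotes is easy to break when paths come from variables. The sketch below shows one possible way to build the OmniHuman input from two file paths. The helper names `json_escape` and `omnihuman_input` are hypothetical and not part of the `infsh` CLI; only the `image` and `audio` fields and the `infsh app run ... --input` form come from the example above.

```bash
#!/usr/bin/env bash

# Escape backslashes and double quotes so a path is safe inside a JSON string.
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Hypothetical helper: print the --input JSON for bytedance/omnihuman-1-5.
# Fails early if either file is missing or unreadable.
omnihuman_input() {
  local image="$1" audio="$2" f
  for f in "$image" "$audio"; do
    if [ ! -r "$f" ]; then
      echo "cannot read: $f" >&2
      return 1
    fi
  done
  printf '{"image": "%s", "audio": "%s"}\n' \
    "$(json_escape "$image")" "$(json_escape "$audio")"
}

# Usage (assumes both files exist locally):
#   infsh app run bytedance/omnihuman-1-5 \
#     --input "$(omnihuman_input portrait.png dialogue.mp3)"
```

The escaping covers only backslashes and double quotes, which is enough for typical file paths; paths containing control characters would need a full JSON encoder.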
## Portrait Requirements
The source portrait image largely determines output quality: a poor portrait produces a poor video.
### Must Have
| Requirement | Why | Spec |
|------------|-----|------|
| **Center-framed** | Avatar needs face in predictable position | Face centered in frame |
| **Head and shoulders** | Body visible for natural gestures | Crop below chest |
| **Eyes to camera** | Creates connection with viewer | Direct frontal gaze |
| **Neutral expression** | Starting point for animation | Slight smile OK, not laughing/frowning |
| **Clear face** | Model needs to detect features | No sunglasses, heavy shadows, or obstructions |
| **High resolution** | Detail preservation | Face region at least 512x512 px, ideally 1024x1024 px or larger |
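The resolution requirement can be partly checked before uploading. The sketch below reads the pixel dimensions from a PNG header using only `od`, and rejects images smaller than a minimum size. It assumes the portrait is a PNG, and it measures the whole image rather than the face region, so passing is necessary but not sufficient for the 512x512 face spec. The function names `png_dimensions` and `check_portrait` are hypothetical.

```bash
#!/usr/bin/env bash

# Print "WIDTH HEIGHT" for a PNG file.
# In a PNG, the IHDR chunk stores width at bytes 16-19 and height at
# bytes 20-23, both as big-endian unsigned integers.
png_dimensions() {
  local file="$1"
  # od prints the 8 bytes as decimals; set -- splits them into $1..$8.
  set -- $(od -An -tu1 -j16 -N8 "$file")
  echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 )) $(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))"
}

# Return non-zero if either side of the image is below the minimum (default 512).
check_portrait() {
  local file="$1" min="${2:-512}" w h
  read -r w h < <(png_dimensions "$file")
  if (( w < min || h < min )); then
    echo "too small: ${w}x${h} (need at least ${min}x${min})"
    return 1
  fi
  echo "ok: ${w}x${h}"
}

# Usage:
#   check_portrait path/to/portrait.png        # minimum 512
#   check_portrait path/to/portrait.png 1024   # stricter target
```

Framing, gaze, and expression still need a visual check; this only catches undersized source files.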
### Background
| Type | When to Use |
|------|-------------|
| Solid color | Professional, clean, easy to composite |
[... prompt truncated for preview ...]