gyoridavid

Short Video Maker


Creates short videos for TikTok, Instagram Reels, and YouTube Shorts using the Model Context Protocol (MCP) and a REST API.

Short Video Maker

An open-source automated video creation tool for generating short-form video content. Short Video Maker combines text-to-speech, automatic captions, background videos, and music to create engaging short videos from simple text inputs.

This repository was open-sourced by the AI Agents A-Z YouTube channel. We encourage you to check out the channel for more AI-related content and tutorials.

Example

{
  "scenes": [
    {
      "text": "Hello world! Enjoy using this tool to create awesome AI workflows",
      "searchTerms": ["rainbow"]
    }
  ],
  "config": {
    "paddingBack": 1500,
    "music": "happy"
  }
}
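The shape of this payload can be sketched as TypeScript types. The field names come from the example above; the field types, the optionality of `config`, and the reading of `paddingBack` as milliseconds are assumptions inferred from the sample values, not a documented schema. `isShortVideoRequest` is a hypothetical helper name.

```typescript
// Assumed shape, inferred from the example payload; not an official schema.
interface Scene {
  text: string; // narration that is spoken and captioned
  searchTerms: string[]; // keywords used to look up a background video
}

interface RenderConfig {
  paddingBack?: number; // assumed to be a duration in milliseconds
  music?: string; // a music tag such as "happy" or "chill"
}

interface ShortVideoRequest {
  scenes: Scene[];
  config?: RenderConfig;
}

// Hypothetical runtime check that narrows unknown JSON to ShortVideoRequest.
function isShortVideoRequest(value: unknown): value is ShortVideoRequest {
  if (typeof value !== "object" || value === null) return false;
  const { scenes, config } = value as { scenes?: unknown; config?: unknown };
  if (!Array.isArray(scenes)) return false;
  const scenesOk = scenes.every((scene: unknown) => {
    if (typeof scene !== "object" || scene === null) return false;
    const { text, searchTerms } = scene as { text?: unknown; searchTerms?: unknown };
    return (
      typeof text === "string" &&
      Array.isArray(searchTerms) &&
      searchTerms.every((term: unknown) => typeof term === "string")
    );
  });
  if (!scenesOk) return false;
  if (config === undefined) return true;
  if (typeof config !== "object" || config === null) return false;
  const { paddingBack, music } = config as { paddingBack?: unknown; music?: unknown };
  return (
    (paddingBack === undefined || typeof paddingBack === "number") &&
    (music === undefined || typeof music === "string")
  );
}

const parsed: unknown = JSON.parse(
  '{"scenes":[{"text":"Hello world!","searchTerms":["rainbow"]}]}'
);
console.log(isShortVideoRequest(parsed)); // prints true
```

A check like this catches malformed input before it reaches the server; the project lists Zod as a dependency, so its own validation may be stricter than this sketch.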

Features

  • Generate complete short videos from text prompts
  • Text-to-speech conversion with multiple voice options
  • Automatic caption generation and styling
  • Background video search and selection via Pexels
  • Background music with genre/mood selection
  • Serve as both REST API and Model Context Protocol (MCP) server

How It Works

Short Video Maker takes simple text inputs and search terms, then:

  1. Converts text to speech using Kokoro TTS
  2. Generates accurate captions via Whisper
  3. Finds relevant background videos from Pexels
  4. Composes all elements with Remotion
  5. Renders the final short video with captions synchronized to the narration
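The data flow through these steps can be illustrated with a conceptual sketch. Every function below is a hypothetical stub that stands in for the real stage (Kokoro TTS, Whisper, Pexels); the fixed 400 ms per word and the evenly spaced captions are placeholders chosen only to make the example runnable, and none of this is the project's actual code.

```typescript
// Conceptual sketch: each stage is a hypothetical stub, not the project's code.
interface Scene {
  text: string;
  searchTerms: string[];
}

interface Caption {
  text: string;
  startMs: number;
  endMs: number;
}

interface ScenePlan {
  audioMs: number;
  captions: Caption[];
  backgroundQuery: string;
}

// Stand-in for step 1 (text-to-speech): pretend every word takes 400 ms to speak.
function synthesizeSpeech(text: string): { words: string[]; audioMs: number } {
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  return { words, audioMs: words.length * 400 };
}

// Stand-in for step 2 (captioning): spread one caption per word evenly over the audio.
function transcribe(words: string[], audioMs: number): Caption[] {
  const slotMs = words.length > 0 ? audioMs / words.length : 0;
  return words.map((text, i) => ({
    text,
    startMs: i * slotMs,
    endMs: (i + 1) * slotMs,
  }));
}

// Stand-in for step 3 (background search): join the search terms into one query.
function buildBackgroundQuery(searchTerms: string[]): string {
  return searchTerms.join(" ");
}

// Steps 4 and 5 (composition and rendering) would consume this plan.
function planScene(scene: Scene): ScenePlan {
  const { words, audioMs } = synthesizeSpeech(scene.text);
  return {
    audioMs,
    captions: transcribe(words, audioMs),
    backgroundQuery: buildBackgroundQuery(scene.searchTerms),
  };
}
```

The point of the sketch is the ordering: the narration audio is produced first, and its timing drives both the caption timestamps and how long the background footage must run.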

Dependencies

Dependency     Version     License        Purpose
Remotion       ^4.0.286    MIT            Video composition and rendering
Whisper CPP    v1.5.5      MIT            Speech-to-text for captions
FFmpeg         ^2.1.3      LGPL/GPL       Audio/video manipulation
Kokoro.js      ^1.2.0      MIT            Text-to-speech generation
Pexels API     N/A         Pexels Terms   Background video sourcing
Express        ^5.1.0      MIT            API server framework
MCP SDK        ^1.9.0      MIT            Model Context Protocol support
React          ^19.1.0     MIT            UI components for video composition
Zod            ^3.24.2     MIT            Type validation

Prerequisites

  • Node.js (v18 or higher)
  • FFmpeg installed on your system
  • Pexels API key
  • Docker (for containerized deployment)
  • NVIDIA GPU (optional, for improved performance)

Running the Project

Using NPX (recommended)

This is the easiest way to run the project, and it supports GPUs out of the box:

PEXELS_API_KEY=your_pexels_api_key npx @ai-agents-az/shorts-creator

Using Docker

# Standard run
docker run -it --rm --name short-video-maker -p 3123:3123 \
  -e PEXELS_API_KEY=your_pexels_api_key \
  gyoridavid/shorts-creator

# For NVIDIA GPUs, add --gpus=all
docker run -it --rm --name short-video-maker -p 3123:3123 \
  -e PEXELS_API_KEY=your_pexels_api_key --gpus=all \
  gyoridavid/shorts-creator

Local Development

See the CONTRIBUTING.md file for instructions on setting up a local development environment.

API Usage

REST API

The following REST endpoints are available:

  1. GET /api/video/:id - Get a video by ID
  2. POST /api/video - Create a new video
    {
      "scenes": [
        {
          "text": "This is the text to be spoken in the video",
          "searchTerms": ["nature sunset"]
        }
      ],
      "config": {
        "paddingBack": 3000,
        "music": "chill"
      }
    }
    
  3. DELETE /api/video/:id - Delete a video by ID
  4. GET /api/music-tags - Get available music tags
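A minimal client sketch for the endpoints listed above, using the global `fetch` available in Node.js 18+. The base URL assumes the server is reachable on the port published in the Docker example (3123); the helper names (`getVideo`, `createVideo`, `deleteVideo`, `listMusicTags`, `send`) are hypothetical. Response formats are not described in this README, so `send` returns the raw `Response` and leaves decoding to the caller.

```typescript
// Assumption: the server listens where the Docker example publishes it.
const BASE_URL = "http://localhost:3123";

interface ApiCall {
  method: "GET" | "POST" | "DELETE";
  url: string;
  body?: string; // JSON-encoded request body, set only for POST
}

// Pure builders, one per endpoint in the list above.
function getVideo(id: string): ApiCall {
  return { method: "GET", url: `${BASE_URL}/api/video/${encodeURIComponent(id)}` };
}

function createVideo(payload: object): ApiCall {
  return { method: "POST", url: `${BASE_URL}/api/video`, body: JSON.stringify(payload) };
}

function deleteVideo(id: string): ApiCall {
  return { method: "DELETE", url: `${BASE_URL}/api/video/${encodeURIComponent(id)}` };
}

function listMusicTags(): ApiCall {
  return { method: "GET", url: `${BASE_URL}/api/music-tags` };
}

// Sends a call and rejects on a non-2xx status.
async function send(call: ApiCall): Promise<Response> {
  const init: RequestInit = { method: call.method };
  if (call.body !== undefined) {
    init.headers = { "Content-Type": "application/json" };
    init.body = call.body;
  }
  const response = await fetch(call.url, init);
  if (!response.ok) {
    throw new Error(`${call.method} ${call.url} failed with HTTP ${response.status}`);
  }
  return response;
}
```

With a server running, `await send(listMusicTags())` would fetch the available tags, and `await send(createVideo({ scenes: [...], config: {...} }))` would submit a render request. Keeping the builders separate from `send` makes the request construction easy to test without a network.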

Model Context Protocol (MCP)

The service also implements the Model Context Protocol:

  1. GET /mcp/sse - Server-sent events for MCP
  2. POST /mcp/messages - Send messages to MCP server

Available MCP tools:

  • create-short-video - Create a video from a list of scenes
  • get-video-status - Check video creation status
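MCP tool invocations travel as JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such a request for `create-short-video`. It assumes the tool accepts the same `scenes`/`config` shape as `POST /api/video`, which this README does not confirm; `buildToolCall` is a hypothetical helper name.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumption: the tool arguments mirror the REST request body.
const createCall = buildToolCall(1, "create-short-video", {
  scenes: [{ text: "Hello world!", searchTerms: ["rainbow"] }],
  config: { paddingBack: 1500, music: "happy" },
});

console.log(JSON.stringify(createCall, null, 2));
```

In practice an MCP client library handles this envelope together with the initialization handshake and the SSE session, so hand-building messages is mainly useful for debugging the `/mcp/messages` endpoint.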

License

This project is licensed under the ISC License.
