u2n4

Gemini Media MCP

All-in-one MCP toolkit for AI media generation -- VEO 3.1 video + NanoBanana Pro 2 images + prompting skills

What's Included • Quick Start • VEO Server • NanoBanana Server • Skills • Contributing

| CLI | Install check | Version check |
| --- | --- | --- |
| veo-mcp-server | veo-mcp-server --help | veo-mcp-server --version |
| nanobanana-imagen-mcp | nanobanana-imagen-mcp --help | nanobanana-imagen-mcp --version |
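
Before running the checks above, it can help to confirm that the two console scripts are on your PATH at all. The following is a minimal sketch using only the standard library; the helper name find_installed_clis is hypothetical and not part of either package. Note that a uvx-based setup does not install the scripts permanently, so "not found" is expected in that case.

```python
import shutil

# Console script names taken from the CLI table above.
CLI_NAMES = ("veo-mcp-server", "nanobanana-imagen-mcp")


def find_installed_clis(names=CLI_NAMES):
    """Map each CLI name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_installed_clis().items():
        status = path if path else "not found on PATH"
        print(f"{name}: {status}")
```

A None entry only means the script is not on the current PATH, for example because the virtual environment is not activated.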

What is This?

Gemini Media MCP bundles two MCP servers and two prompting skills in a single repository. From any MCP-compatible AI assistant you can generate video with VEO 3.1 (up to 4K), generate images with NanoBanana Pro 2, and write structured prompts with the built-in skills.

What's Included

MCP Servers

| Server | Description | Tools |
| --- | --- | --- |
| VEO 3.1 | AI video generation (text-to-video, image-to-video, extend, interpolate) | 9 tools |
| NanoBanana | AI image generation with NanoBanana Pro 2 (Pro + Flash models) | 4 tools |

Claude Code Skills (Plugin Marketplace)

| Skill | Description |
| --- | --- |
| VEO Prompting | 7-layer prompt engineering for cinematic VEO 3.1 videos |
| NanoBanana Prompting | 7-layer prompt engineering for photorealistic NanoBanana Pro 2 images |

Install skills via Claude Code:

/plugin marketplace add u2n4/gemini-media-mcp

Quick Start

This repo hosts two independent MCP servers, each published to PyPI as its own package. Install either one, or both.

Option A — uvx (zero-install, recommended)

Add the matching server block to your MCP client config (Claude Desktop, Claude Code, Cursor, VS Code, Windsurf). The blocks are listed under VEO Server and NanoBanana Server below.

Option B — clone + editable install

git clone https://github.com/u2n4/gemini-media-mcp.git
cd gemini-media-mcp

# Create a virtual environment (uv pip install requires one — or pass --system)
uv venv
source .venv/bin/activate     # macOS / Linux
# .venv\Scripts\activate     # Windows PowerShell

# Install one or both sub-packages
uv pip install -e servers/veo
uv pip install -e servers/nanobanana

VEO Server

uvx (zero-install):

{
  "mcpServers": {
    "veo": {
      "command": "uvx",
      "args": ["veo-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your_key",
        "VIDEO_OUTPUT_DIR": "./videos"
      }
    }
  }
}
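
If you would rather script this change than edit the file by hand, the block above can be merged into an existing client config without touching other servers. This is a minimal sketch: the add_server helper and the mcp.json path in the usage note are hypothetical, and each client keeps its config in its own location, which this README does not list.

```python
import json
from pathlib import Path

# Same values as the JSON block above; replace "your_key" with a real key.
VEO_BLOCK = {
    "command": "uvx",
    "args": ["veo-mcp-server"],
    "env": {"GEMINI_API_KEY": "your_key", "VIDEO_OUTPUT_DIR": "./videos"},
}


def add_server(config_path, name, block):
    """Insert or replace one entry under "mcpServers", keeping other entries."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = block
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_server("mcp.json", "veo", VEO_BLOCK) would write the entry into a file named mcp.json in the current directory; point it at your client's actual config file instead. The sketch assumes the existing file, if any, contains valid JSON.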

Claude Code:

claude mcp add veo -s user -e GEMINI_API_KEY=your_key -- uvx veo-mcp-server

pip install:

pip install veo-mcp-server

NanoBanana Server

uvx (zero-install):

{
  "mcpServers": {
    "nanobanana": {
      "command": "uvx",
      "args": ["nanobanana-imagen-mcp"],
      "env": {
        "GEMINI_API_KEY": "your_key"
      }
    }
  }
}

Claude Code:

claude mcp add nanobanana -s user -e GEMINI_API_KEY=your_key -- uvx nanobanana-imagen-mcp

pip install:

pip install nanobanana-imagen-mcp

VEO Server

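AI video generation powered by Google VEO 3.1. Generation runs as an asynchronous job: the request starts work in the background and returns a job ID right away, and the client polls that ID with veo_check_job until the job completes or fails. Because no single call waits for the render, long generations do not run into request timeouts.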
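
The polling side of this pattern can be sketched in a few lines of Python. This is an illustration, not the server's code: check_job is a caller-supplied function standing in for a veo_check_job tool call, and the {"status": ...} result shape is an assumption. The "completed" and "failed" states and the 15 second interval come from the veo_check_job description in this section.

```python
import time


def wait_for_job(check_job, job_id, interval=15.0, max_wait=1800.0, sleep=time.sleep):
    """Poll check_job(job_id) until its status is "completed" or "failed".

    check_job is a hypothetical stand-in for a veo_check_job tool call;
    the dict it returns is assumed to carry a "status" field.
    """
    waited = 0.0
    while True:
        result = check_job(job_id)
        if result.get("status") in ("completed", "failed"):
            return result
        if waited >= max_wait:
            raise TimeoutError(f"job {job_id} still running after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays. In an assistant session the model normally does this polling itself by calling the tool repeatedly.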

Tools

| Tool | Description |
| --- | --- |
| veo_generate_video | Generate video from text prompt. Supports 720p/1080p/4K, 16:9 or 9:16, 4, 6, or 8 second duration, negative prompts, reference images, seed control, and batch generation (1-4 videos). |
| veo_image_to_video | Animate a reference image with a motion prompt. |
| veo_interpolate_video | Create a smooth transition between two frames (first frame + last frame). |
| veo_extend_video | Extend an existing VEO video by ~7 seconds. 720p only, max 148 seconds total. |
| veo_check_job | Check async job status. Call every 15-20 seconds until completed or failed. |
| veo_list_jobs | List all generation jobs and their current status. |
| veo_api_status | Check API key status -- keys configured, active key, keys remaining. |
| veo_pricing_info | Show pricing per second for standard and fast models at all resolutions. |
| veo_show_output_stats | Display generation statistics -- video count, total size, file details, job statuses. |
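
The limits quoted for veo_extend_video imply a ceiling on how many times a clip can be extended. The arithmetic below is a rough sketch that treats each extension as exactly 7 seconds, although the table only says "~7 seconds", so read the result as an estimate. The function name max_extensions is hypothetical.

```python
MAX_TOTAL_SECONDS = 148  # stated cap for an extended video
EXTENSION_SECONDS = 7    # approximation; the table says "~7 seconds"


def max_extensions(initial_seconds):
    """Whole number of 7-second extensions that fit under the 148-second cap."""
    if initial_seconds > MAX_TOTAL_SECONDS:
        raise ValueError("initial clip already exceeds the cap")
    return (MAX_TOTAL_SECONDS - initial_seconds) // EXTENSION_SECONDS


if __name__ == "__main__":
    # The three durations veo_generate_video offers.
    for duration in (4, 6, 8):
        print(duration, max_extensions(duration))
```

Under this approximation, all three starting durations leave room for 20 extensions.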

VEO Configuration

| Variable | Description | Default |
| --- | --- | --- |
| GEMINI_API_KEY | Primary API key (required) | -- |
| GEMINI_API_KEY_BACKUP | Backup key for auto-rotation | -- |
| VIDEO_OUTPUT_DIR | Output directory for videos | ~/veo-videos |
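
To make the defaults concrete, here is one way these three variables could be resolved in Python. It is a sketch under assumptions, not the server's actual code: the load_veo_settings name, the returned dict shape, and the primary-then-backup key order are all hypothetical. Only the variable names, the required flag, and the ~/veo-videos default come from the table.

```python
import os
from pathlib import Path


def load_veo_settings(env=None):
    """Resolve the three documented variables from an environment mapping."""
    env = os.environ if env is None else env
    primary = env.get("GEMINI_API_KEY")
    if not primary:
        raise RuntimeError("GEMINI_API_KEY is required")
    # Assumed rotation order: primary key first, backup key second.
    keys = [primary]
    backup = env.get("GEMINI_API_KEY_BACKUP")
    if backup:
        keys.append(backup)
    # Fall back to the documented default and expand "~" to the home directory.
    output_dir = Path(env.get("VIDEO_OUTPUT_DIR", "~/veo-videos")).expanduser()
    return {"api_keys": keys, "output_dir": output_dir}
```

Passing a plain dict as env makes the function easy to try without changing your real environment.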

VEO Models

| Tier | Model | Best For |
| --- | --- | --- |
| Standard | veo-3.1-generate-preview | Higher quality output |
| Fast | veo-3.1-fast-generate-preview | Quicker generation |

NanoBanana Server

AI image generation powered by NanoBanana Pro 2. Two models are available, Pro (maximum quality) and Flash (fast), and output defaults to 4K resolution.

Tools

| Tool | Description |
| --- | --- |
| generate_image | Generate images using NanoBanana Pro 2 (Pro or Flash). Supports aspect ratio, resolution (up to 4K), negative prompts, thinking level, grounding, and reference images. |
| upload_file | Upload a reference image for editing or conditioning. |
| show_output_stats | Display generation statistics -- image count, total size, file details. |
| maintenance | Server maintenance and cleanup -- clear caches, remove temporary files. |

Models

| Model | Engine | Best For |
| --- | --- | --- |
| Pro | Gemini 3 Pro Image | Maximum quality, complex scenes |
| Flash | Gemini 3.1 Flash Image | Fast generation, simple scenes |

Prompting Skills

VEO Prompting Skill

7-layer prompt engineering system for VEO 3.1:

  1. Cinematography (camera, shot type, lens, angles)
  2. Subject (characters, objects, material cues)
  3. Action (force-based verbs, timestamp beats)
  4. Environment (time of day, weather, depth layers)
  5. Lighting & Mood (physical light sources, color temperature)
  6. Audio Design (dialogue, SFX, ambient, music)
  7. Technical Controls (negative prompts, style anchors, film stocks)
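
As an illustration of the layering idea, the seven layers above can be treated as named slots that are joined in a fixed order. This is only a sketch: the skill's real assembly rules live in SKILL.md and are not reproduced here, so the slot names, the build_veo_prompt helper, and the plain space-joined output are all assumptions.

```python
# Slot names mirror the seven-item list above, in the same order (assumed names).
VEO_LAYERS = (
    "cinematography", "subject", "action", "environment",
    "lighting_mood", "audio", "technical",
)


def build_veo_prompt(**layers):
    """Join the supplied layers into one prompt, in the documented order."""
    unknown = set(layers) - set(VEO_LAYERS)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    # Skip layers that were not supplied or are empty.
    parts = [layers[name].strip() for name in VEO_LAYERS if layers.get(name)]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_veo_prompt(
        subject="A red kite with a frayed tail.",
        cinematography="Wide shot, 24mm lens, low angle.",
        audio="Wind gusts and distant surf.",
    ))
```

Keyword order at the call site does not matter; the output always follows the layer order, so the camera description comes first here.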

NanoBanana Prompting Skill

7-layer prompt engineering system for NanoBanana Pro 2:

  1. Style & Art Direction (visual DNA)
  2. Scene Description (environment, atmosphere)
  3. Main Subject (hero element with extreme specificity)
  4. Camera & Lens (real camera specs for realism)
  5. Lighting (natural, studio, color temperature)
  6. Texture, Material & Color (tactile detail)
  7. Negative Prompts (quality guards)
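
One practical use of a fixed layer list is as a checklist for a draft prompt. The sketch below reports which of the seven layers above a draft has not filled in yet. The slot names and the missing_layers helper are hypothetical and are not taken from the skill itself.

```python
# Slot names mirror the seven-item list above, in the same order (assumed names).
NANOBANANA_LAYERS = (
    "style", "scene", "subject", "camera", "lighting", "texture_color", "negative",
)


def missing_layers(draft):
    """Return the layer names that are absent or blank in a draft mapping."""
    return [
        name for name in NANOBANANA_LAYERS
        if not (draft.get(name) or "").strip()
    ]


if __name__ == "__main__":
    draft = {"style": "35mm film still", "subject": "glazed ceramic teapot"}
    print(missing_layers(draft))
```

An empty list means every layer has some text; it says nothing about whether that text is specific enough.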

Architecture

gemini-media-mcp/
├── servers/
│   ├── veo/                       # VEO 3.1 MCP Server (PyPI: veo-mcp-server)
│   │   ├── pyproject.toml
│   │   ├── requirements.txt
│   │   └── src/
│   │       └── veo_mcp_server/
│   │           ├── __init__.py
│   │           ├── __main__.py
│   │           └── server.py
│   └── nanobanana/                # NanoBanana MCP Server (PyPI: nanobanana-imagen-mcp)
│       ├── pyproject.toml
│       ├── requirements.txt
│       └── nanobanana_mcp_server/ # Package
├── skills/
│   ├── veo-prompting/             # VEO prompting skill
│   │   └── SKILL.md
│   └── nanobanana-prompting/      # NanoBanana prompting skill
│       └── SKILL.md
├── plugins/                       # Claude Code Plugin Marketplace
│   ├── veo-prompting/
│   │   ├── .claude-plugin/
│   │   │   └── plugin.json
│   │   └── skills/
│   │       └── veo-prompting/
│   │           └── SKILL.md
│   └── nanobanana-prompting/
│       ├── .claude-plugin/
│       │   └── plugin.json
│       └── skills/
│           └── nanobanana-prompting/
│               └── SKILL.md
├── .claude-plugin/
│   └── marketplace.json
├── .env.example
├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── llms.txt
└── llms-install.md

Contributing

See CONTRIBUTING.md.

License

MIT -- see LICENSE.

Credits

  • NanoBanana MCP Server: inspired by the nano-banana naming convention used across the MCP community. This is an independent implementation.
  • VEO 3.1 by Google DeepMind

Support

If you find this useful, please star this repository!

Made with ❤️ in the Eastern Province of Saudi Arabia.
