jkawamoto

Florence-2 MCP Server

An MCP server for processing images using Florence-2

You can process images or PDF files stored locally or on a web server to extract text using OCR (Optical Character Recognition) or to generate descriptive captions summarizing their content.

Installation

For Claude Desktop

To configure this server for Claude Desktop, edit the claude_desktop_config.json file and add the following entry under mcpServers:

{
  "mcpServers": {
    "florence-2": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/jkawamoto/mcp-florence2",
        "mcp-florence2"
      ]
    }
  }
}

After editing, restart the application. For more information, see: For Claude Desktop Users - Model Context Protocol.
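
The manual edit described above can also be scripted. The following is a minimal sketch, not part of this project: the function names add_florence2 and update_config_file are hypothetical, and the location of claude_desktop_config.json depends on your operating system, so the path is left as a parameter. It merges the florence-2 entry into an existing configuration without touching other servers.

```python
import copy
import json
from pathlib import Path

# Entry copied from the configuration shown in this section.
FLORENCE2_ENTRY = {
    "command": "uvx",
    "args": [
        "--from",
        "git+https://github.com/jkawamoto/mcp-florence2",
        "mcp-florence2",
    ],
}


def add_florence2(config: dict) -> dict:
    """Return a copy of config with the florence-2 server registered.

    Hypothetical helper: existing entries under mcpServers are preserved.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["florence-2"] = copy.deepcopy(FLORENCE2_ENTRY)
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config at path (or start empty), add the entry, write it back.

    The caller supplies the path; this sketch does not guess where
    Claude Desktop stores its configuration.
    """
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_florence2(config)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling update_config_file(Path(...)) with the path to your own claude_desktop_config.json applies the change; restart Claude Desktop afterwards, as with a manual edit.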

For Goose CLI

To enable the Florence-2 extension in Goose CLI, edit the configuration file ~/.config/goose/config.yaml to include the following entry:

extensions:
  florence-2:
    name: Florence-2
    cmd: uvx
    args: [ --from, git+https://github.com/jkawamoto/mcp-florence2, mcp-florence2 ]
    enabled: true
    type: stdio

For Goose Desktop

Add a new extension with the following settings:

  • Type: Standard IO
  • ID: florence-2
  • Name: Florence-2
  • Description: An MCP server for processing images using Florence-2
  • Command: uvx --from git+https://github.com/jkawamoto/mcp-florence2 mcp-florence2

For more details on configuring MCP servers in Goose Desktop, refer to the documentation: Using Extensions - MCP Servers.

Tools

ocr

Process an image file or URL using OCR to extract text.

Arguments:
  • src: A file path or URL to the image file that needs to be processed.

caption

Process an image file or URL and generate captions for the image.

Arguments:
  • src: A file path or URL to the image file that needs to be processed.
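
Both tools take a single src argument, so a call to either one has the same shape. As an illustration only, the sketch below builds the JSON-RPC 2.0 message that an MCP client sends for a tools/call request. The helper name tool_call is hypothetical. In normal use, the client (Claude Desktop or Goose) constructs and sends this message for you, after the MCP initialization handshake has completed.

```python
from itertools import count

# Tool names documented in this section.
KNOWN_TOOLS = ("ocr", "caption")

# JSON-RPC requests need a unique id; a simple counter is enough for a sketch.
_request_ids = count(1)


def tool_call(name: str, src: str) -> dict:
    """Build an MCP tools/call request for a tool taking one src argument.

    Hypothetical helper for illustration; src is a file path or URL.
    """
    if name not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": {"src": src}},
    }
```

For example, tool_call("ocr", "https://example.com/scan.png") yields a request whose params name the ocr tool and pass the URL as src; the URL here is a placeholder, not a real resource.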

License

This application is licensed under the MIT License. See the LICENSE file for more details.
