Cheap Research v1.0.0

A bounded evidence review engine. Give it a claim and a document corpus - it extracts relevant evidence, detects contradictions, and produces auditable evidence packets. No hallucinations, no global truth claims, no open-web research.

What it does

Ingest PDF, TXT, MD, DOCX, DOC, HTML, RTF, and EPUB documents
Extract evidence spans relevant to your claim
Detect contradictions between the claim and the corpus
Render markdown evidence packets with caveats
Audit every action through an immutable, hash-chained audit trail

Trust posture

Fails closed - enforcement engine blocks invalid authority transitions
Bounded corpus - cannot make claims outside provided documents
Explicit caveats - every packet states limitations
No global truth - an unsupported claim is reported as "not established in corpus", never as "false"
Human review required - claims need explicit reviewer sign-off

MCP Tools

| Tool | Description |
| --- | --- |
| `research_start` | Ingest corpus, assess claim, produce evidence packet |
| `research_status` | Get run state and evidence summary |
| `research_getEvidence` | Get top relevant spans + contradictions |
| `research_getPacket` | Get full markdown assessment report |
| `research_audit` | Run compliance integrity audit |
| `corpus_topics` | List available paper topics for download |
| `corpus_list` | List all papers in curated library (filterable) |
| `corpus_load` | Download seminal ML papers to corpus directory |

Quick Start: Download Papers

Use corpus_load to download a curated set of foundational ML papers:

// Download ALL papers to your corpus
corpus_load({ corpus_path: "/path/to/cheap-research/corpus" })

// Or download specific topics
corpus_load({
  corpus_path: "/path/to/corpus",
  topics: ["transformers", "optimization"]
})

// Or specific papers by ID
corpus_load({
  corpus_path: "/path/to/corpus",
  paper_ids: ["attention", "bert", "resnet"]
})

Available topics: transformers, foundations, optimization, reinforcement, generative, representations, efficiency

Papers included: Attention Is All You Need, BERT, GPT, GPT-3, ResNet, AlexNet, LSTM, Adam, Batch Norm, Dropout, GANs, VAE, Diffusion Models, Vision Transformer, Word2Vec, DQN, Knowledge Distillation, and more (22 papers total).

Install

git clone https://github.com/your-org/cheap-research.git
cd cheap-research
npm install
npm run build

Adding to Your AI Assistant

This server uses the Model Context Protocol (MCP). Here's how to connect it to your AI assistant:

Claude (Claude Code CLI / Claude Desktop)

Open your MCP config file:
- Claude Code: ~/.claude.json
- Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)
Add the server to mcpServers:

{
  "mcpServers": {
    "cheap-research": {
      "command": "node",
      "args": [
        "/ABSOLUTE/PATH/TO/cheap-research/node_modules/.bin/tsx",
        "--tsconfig",
        "/ABSOLUTE/PATH/TO/cheap-research/tsconfig.base.json",
        "/ABSOLUTE/PATH/TO/cheap-research/apps/mcp-server/src/index.ts"
      ]
    }
  }
}

Replace /ABSOLUTE/PATH/TO/ with the actual path where you cloned the repo.
Restart Claude Code or Claude Desktop.
Type /mcp in Claude Code to verify the server is connected.

Other MCP-compatible AI Tools

Any AI tool that supports MCP can use this server. Configure it to launch the server with npm run mcp from the repository root, or reuse the config format above. Check your tool's documentation for how it registers MCP servers.

Run as MCP Server (manual)

npm run mcp

This starts the server in stdio mode, ready for MCP clients.

Example

// Start research on a claim
research_start({
  task_type: "bounded_doc_claim_assessment",
  target_claim: "Attention mechanisms enable Transformers to model long-range dependencies",
  entity: "Transformer architecture",
  feature: "attention mechanism",
  scope: "Assess based on provided neural network papers",
  corpus_path: "/path/to/corpus"
})

// Returns: { run_id, status: "completed", assessment: "qualified", ... }
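
Before sending a request like the one above, a client may want to check that its arguments are complete. The sketch below types the arguments using only the field names shown in the example. It assumes every field is required and that task_type accepts only the value shown; neither is confirmed by this README, and ResearchStartArgs and missingFields are hypothetical names, not part of the server.

```typescript
// Hypothetical client-side type for research_start arguments.
// Field names come from the example above; treating all of them as
// required is an assumption, not documented behavior.
interface ResearchStartArgs {
  task_type: "bounded_doc_claim_assessment";
  target_claim: string;
  entity: string;
  feature: string;
  scope: string;
  corpus_path: string;
}

// Returns the names of fields that are absent or blank.
// An empty result means the request looks complete.
function missingFields(args: Partial<ResearchStartArgs>): string[] {
  const required: (keyof ResearchStartArgs)[] = [
    "task_type",
    "target_claim",
    "entity",
    "feature",
    "scope",
    "corpus_path",
  ];
  return required.filter((key) => {
    const value = args[key];
    return typeof value !== "string" || value.trim() === "";
  });
}
```

Calling missingFields({ target_claim: "...", corpus_path: "/path/to/corpus" }) would list the four remaining fields, which a client could surface before the request reaches the server.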

Supported Formats

| Format | Extension | Parser |
| --- | --- | --- |
| PDF | `.pdf` | pdf-parse |
| Text | `.txt` | native |
| Markdown | `.md` | native |
| Word | `.docx` | mammoth |
| Word (legacy) | `.doc` | mammoth + fallback |
| HTML | `.html`, `.htm` | cheerio |
| RTF | `.rtf` | rtf-parser |
| EPUB | `.epub` | epub2 |
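
The table above amounts to a lookup from file extension to parser. A minimal sketch of that lookup follows; PARSER_BY_EXTENSION and parserFor are hypothetical names used for illustration, and the project's actual dispatch code may look different.

```typescript
// Illustrative only: extension-to-parser mapping taken from the table above.
// The names here are hypothetical, not the project's API.
const PARSER_BY_EXTENSION: Record<string, string> = {
  ".pdf": "pdf-parse",
  ".txt": "native",
  ".md": "native",
  ".docx": "mammoth",
  ".doc": "mammoth + fallback",
  ".html": "cheerio",
  ".htm": "cheerio",
  ".rtf": "rtf-parser",
  ".epub": "epub2",
};

// Returns the parser name for a file, or null when the format is unsupported.
function parserFor(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return null; // no extension, or a bare dotfile
  const ext = fileName.slice(dot).toLowerCase();
  return PARSER_BY_EXTENSION[ext] ?? null;
}
```

Returning null for unknown extensions, rather than falling back to a plain-text read, matches the fail-closed posture described earlier: a file the engine cannot parse reliably is skipped instead of guessed at.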

What's NOT included (intentional)

Open-web research
Autonomous recommendations
Confidence scores
Free-form LLM synthesis
Automatic operational acceptance
Vector embeddings / semantic search

Architecture

corpus/           → ingestCorpus() → MemoryStore
                                  ↓
claim + evidence  → findRelevantSpans() → scored spans
                                  ↓
                  → detectContradictions() → flagged spans
                                  ↓
                  → createClaimCandidate() → enforcement check
                                  ↓
                  → renderPacketMarkdown() → evidence packet
                                  ↓
                  → publishPacket() → audit log
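
To make the span-scoring stage concrete, here is a rough sketch of what a function named findRelevantSpans could do without vector embeddings. The README does not document the real scoring method, so the term-overlap score below is a stand-in, and the Span and ScoredSpan types, the tokenize helper, and the topK parameter are all assumptions.

```typescript
// Hypothetical shapes for illustration; the project's real types may differ.
interface Span {
  docId: string;
  text: string;
}

interface ScoredSpan extends Span {
  score: number;
}

// Lowercases the text and returns its distinct alphanumeric terms.
function tokenize(text: string): string[] {
  const terms = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return Array.from(new Set(terms));
}

// Stand-in scoring: the fraction of distinct claim terms that also occur
// in the span. Spans sharing no terms with the claim are dropped.
function findRelevantSpans(claim: string, spans: Span[], topK = 3): ScoredSpan[] {
  const claimTerms = tokenize(claim);
  if (claimTerms.length === 0) return [];
  return spans
    .map((span) => {
      const spanTerms = new Set(tokenize(span.text));
      const hits = claimTerms.filter((term) => spanTerms.has(term)).length;
      return { ...span, score: hits / claimTerms.length };
    })
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A lexical score like this is deterministic and easy to audit, which fits the project's decision to leave out embeddings and confidence scores, but it misses paraphrases, so treat it only as an illustration of the data flow between stages.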

State Machines

Claims, findings, and packets each follow a strict state machine enforced by the enforcement engine. Illegal transitions are rejected, and the engine reports an explicit blocker for each rule that failed.
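
As an illustration of a fail-closed transition check, consider the sketch below. The state names, the ALLOWED table, and checkTransition are hypothetical; the README does not list the real states, so read this as one possible shape and not the engine's actual rules.

```typescript
// Hypothetical claim states; the real state set is not documented here.
type ClaimState = "candidate" | "under_review" | "accepted" | "rejected";

// Allow-list of transitions. Anything not listed is denied (fail closed).
const ALLOWED: Record<ClaimState, readonly ClaimState[]> = {
  candidate: ["under_review"],
  under_review: ["accepted", "rejected"],
  accepted: [],
  rejected: [],
};

interface TransitionRequest {
  from: ClaimState;
  to: ClaimState;
  reviewerId?: string; // assumed way of recording reviewer sign-off
}

type TransitionResult =
  | { ok: true; state: ClaimState }
  | { ok: false; blockers: string[] };

// Collects every blocker instead of stopping at the first one,
// so the caller sees all reasons a transition was refused.
function checkTransition(req: TransitionRequest): TransitionResult {
  const blockers: string[] = [];
  if (!ALLOWED[req.from].includes(req.to)) {
    blockers.push(`illegal transition: ${req.from} -> ${req.to}`);
  }
  if (req.to === "accepted" && !req.reviewerId) {
    blockers.push("reviewer sign-off is required before acceptance");
  }
  return blockers.length === 0
    ? { ok: true, state: req.to }
    : { ok: false, blockers };
}
```

With these assumed rules, moving a claim from under_review to accepted without a reviewerId is refused with one blocker, and jumping straight from candidate to accepted is refused with two.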

Audit Trail

The audit log is hash-chained and immutable. Every action is recorded with:

Actor context (who)
Object reference (what)
Event type (action)
Payload (details)
Chain hash (integrity)
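
The fields above can be tied together in a hash chain, where each entry's hash covers its own content plus the previous entry's hash. The following is a minimal sketch of that general technique using Node's built-in crypto module. The field names, the SHA-256 choice, the all-zero genesis value, and the JSON serialization are assumptions, not the project's actual log format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape mirroring the list above.
interface AuditEvent {
  actor: string; // who
  objectRef: string; // what
  eventType: string; // action
  payload: unknown; // details
}

interface AuditEntry extends AuditEvent {
  prevHash: string; // hash of the preceding entry
  hash: string; // chain hash (integrity)
}

// Assumed starting value for the first entry's prevHash.
const GENESIS_HASH = "0".repeat(64);

// Hashes the previous hash together with the event fields.
// JSON.stringify is key-order sensitive for object payloads, so a real
// implementation would likely need a canonical serialization.
function hashEntry(prevHash: string, event: AuditEvent): string {
  const body = JSON.stringify([
    prevHash,
    event.actor,
    event.objectRef,
    event.eventType,
    event.payload,
  ]);
  return createHash("sha256").update(body).digest("hex");
}

// Returns a new log with the event appended; the input log is not mutated.
function appendEvent(log: readonly AuditEntry[], event: AuditEvent): AuditEntry[] {
  const last = log[log.length - 1];
  const prevHash = last ? last.hash : GENESIS_HASH;
  return [...log, { ...event, prevHash, hash: hashEntry(prevHash, event) }];
}

// Recomputes every hash from the start; any edited entry breaks the chain.
function verifyChain(log: readonly AuditEntry[]): boolean {
  let prevHash = GENESIS_HASH;
  for (const entry of log) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== hashEntry(prevHash, entry)) return false;
    prevHash = entry.hash;
  }
  return true;
}
```

Because each hash depends on the one before it, changing any recorded payload invalidates that entry and every later link, which is the kind of property an integrity audit can check by recomputing the chain.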

License

MIT
