# wpa-mcp

An MCP (Model Context Protocol) server that turns Windows WPR `.etl` traces into structured, LLM-friendly performance insights — using WPAExporter + xperf under the hood, and optionally emitting flamegraph-ready folded stacks.
wpa-mcp bridges two worlds:

- Windows Performance Analyzer (WPA) — the gold standard for analyzing ETW / WPR traces, but GUI-heavy and hard to automate.
- LLMs (Claude, Copilot, GPT, …) — great at reasoning across evidence, but blind to `.etl` files.
This server exposes a small set of MCP tools so an LLM can:
- Validate a trace (does it actually contain the events needed for analysis?)
- Export the right WPA tables to CSV via predefined profiles
- Summarize the CSVs into a compact JSON (Top N processes, hot stacks, ready-thread latency, DPC/ISR offenders, UI jank)
- Render a Brendan-Gregg-style folded stack file for flamegraphs — or for the LLM to read directly
## Table of contents

- Architecture
- Prerequisites
- Install
- Capture a trace
- MCP tools
- Built-in WPA profiles
- Analysis examples
  - Example 1: Runaway CPU
  - Example 2: UI hang / "not responding"
  - Example 3: Audio/mouse glitch caused by a driver
  - Example 4: Feeding folded stacks to the LLM
- Client configuration
- Release process
- Troubleshooting
- FAQ
## Architecture

```
+------------------+     stdio (MCP)     +--------------------+
|  LLM / MCP host  | <-----------------> |   wpa-mcp server   |
| (Claude, VSCode) |                     |    (this repo)     |
+------------------+                     +----------+---------+
                                                    |
                                         subprocess |
                                                    v
                              +---------------------+---------------------+
                              |      xperf.exe      |   wpaexporter.exe   |
                              |  (validate / stats) |   (+ .wpaProfile)   |
                              +---------------------+---------------------+
                                                    |
                                                    v
                                    CSV tables (per profile)
                                                    |
                                                    v
                         summarizer -> JSON / flamegraph -> .folded
```

Everything the LLM sees is structured JSON or compact folded-stack text — never raw gigabyte CSVs.
## Prerequisites

- Windows 10/11 (required; the analysis tools are Windows-only)
- Windows Performance Toolkit (WPT) installed (ships with the Windows ADK / Windows SDK; provides `wpaexporter.exe` and `xperf.exe`)
- Python 3.10+

If WPT is installed to a non-default path, set:

```
setx WPAEXPORTER_PATH "C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\wpaexporter.exe"
setx XPERF_PATH "C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\xperf.exe"
```
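The lookup can be sketched in Python. This is a hedged illustration of one sensible resolution order — environment override, then `PATH`, then the default WPT install directory; `resolve_tool` is a hypothetical helper, not necessarily the server's actual logic:

```python
import os
import shutil

DEFAULT_WPT_DIR = r"C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit"

def resolve_tool(env_var: str, exe_name: str) -> str:
    """Hypothetical lookup: explicit env override, then PATH, then the default WPT dir."""
    override = os.environ.get(env_var)
    if override:
        return override
    on_path = shutil.which(exe_name)
    if on_path:
        return on_path
    return os.path.join(DEFAULT_WPT_DIR, exe_name)

xperf = resolve_tool("XPERF_PATH", "xperf.exe")
```

The env override comes first so a `setx` (as above) always wins over whatever happens to be on `PATH`.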
## Install

### Via pipx (recommended — works once published to PyPI)

```
pipx install wpa-mcp
wpa-mcp   # starts the MCP stdio server
```

### From source

```
git clone https://github.com/Jialong-zhong/wpr-xperf-mcp-server.git
cd wpr-xperf-mcp-server
pip install -e .
wpa-mcp
```
## Capture a trace

The server's analyses are only as good as the providers you captured. Recommended capture for the four problem classes this server targets:

```
# Run as Administrator
wpr -start CPU ^
    -start GeneralProfile ^
    -start DesktopComposition ^
    -start Registry ^
    -filemode

# ... reproduce the issue ...

wpr -stop C:\traces\case01.etl "repro notes here"
```
| WPR profile | What it adds that wpa-mcp uses |
|---|---|
| `CPU` | Sampled CPU, CSwitch, ReadyThread, StackWalk |
| `GeneralProfile` | Processes, images, DPC/ISR |
| `DesktopComposition` | DWM frame timing, Window-in-focus (UI hang evidence) |
| `Registry` | Registry activity (optional; useful for startup/UI hangs) |
If you skip `CPU`, the most valuable analyses (hot stacks, scheduling latency) won't work — `validate_trace` will tell you so.
## MCP tools

| Tool | Purpose | Typical caller |
|---|---|---|
| `validate_trace(etl_path)` | Run `xperf -a stats` and report which providers / stacks exist | LLM, always first |
| `export_tables(etl_path, profile)` | Run one WPA profile via `wpaexporter` and return CSV paths | Advanced / targeted |
| `analyze_etl(etl_path, focus)` | Validate → export (by focus) → summarize; returns one structured JSON | LLM, default entry point |
| `render_flamegraph(out_dir)` | Aggregate CPU Usage (Sampled) stacks into Brendan-Gregg folded format | After `analyze_etl` with CPU focus |
### analyze_etl input schema

```json
{
  "etl_path": "C:\\traces\\case01.etl",
  "focus": "cpu | latency | ui | dpc_isr | all",
  "out_dir": "optional override",
  "top_n": 20
}
```
### analyze_etl output shape (abbreviated)

```json
{
  "etl": "C:\\traces\\case01.etl",
  "focus": "all",
  "validation": {
    "duration_sec": 42.7,
    "has_cpu_sampling": true,
    "has_cswitch": true,
    "has_readythread": true,
    "has_stacks": true,
    "has_dpc_isr": true,
    "has_dwm": true,
    "warnings": []
  },
  "exports": ["...\\cpu\\CPU Usage (Sampled)_...csv", "..."],
  "summary": {
    "cpu_top_processes": [{"process": "chrome.exe", "weight_ms": 8421.3}],
    "cpu_top_modules": [{"module": "ntdll.dll", "weight_ms": 2310.0}],
    "cpu_hot_stacks": [{"stack": "ntdll!... ; app!hot_fn", "weight_ms": 1240.0}],
    "ready_latency_top": [{"process": "explorer.exe", "tid": 1234, "p95_ms": 187.0}],
    "dpc_isr_top": [{"driver": "ndis.sys", "total_ms": 95.2, "max_us": 820}],
    "ui_focus_top": [{"process": "myapp.exe", "focus_ms": 5400.0}],
    "dwm_slow_frames": {"count": 38, "p95_ms": 41.7, "max_ms": 128.0}
  }
}
```
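Several summary fields (`ready_latency_top.p95_ms`, `dwm_slow_frames.p95_ms`) report 95th percentiles. For readers unfamiliar with the statistic, a nearest-rank p95 over per-event samples looks like this — an illustrative sketch of the statistic itself, not wpa-mcp's exact summarizer code:

```python
import math

def p95_ms(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n) in sorted order."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))   # 1-based rank
    return ordered[rank - 1]

# e.g. 100 ready-thread waits where the slowest 10% took 187 ms each:
waits = [1.0] * 90 + [187.0] * 10
print(p95_ms(waits))   # 187.0 — a 10% slow tail shows up clearly at p95
```

This is why a single outlier barely moves p95, but a consistent ~10% tail dominates it — useful when reading `ready_latency_top`.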
## Built-in WPA profiles

Each profile is a `.wpaProfile` XML that tells `wpaexporter` which WPA tables and columns to dump.

| Focus key | File | Tables exported |
|---|---|---|
| `cpu` | `wpa/profiles/cpu_hotpath.wpaProfile` | CPU Usage (Sampled) |
| `latency` | `wpa/profiles/scheduling_latency.wpaProfile` | CPU Usage (Precise), Ready Thread |
| `ui` | `wpa/profiles/ui_hang.wpaProfile` | Window In Focus, DWM Frame Details |
| `dpc_isr` | `wpa/profiles/dpc_isr.wpaProfile` | DPC/ISR Duration |

Column sets are deliberately minimal to keep CSVs small and summarizer-friendly.
## Analysis examples

These are end-to-end, copy-pasteable walkthroughs. Each shows the user prompt, the tool calls the LLM should make, the JSON shape you can expect, and the conclusions a well-prompted LLM should draw.
### Example 1: Runaway CPU

User: "C:\traces\cpu_spike.etl — some process is pinning my CPU at 100%. Find it and tell me which function."

LLM tool calls:

```js
// 1) validate
validate_trace({ "etl_path": "C:\\traces\\cpu_spike.etl" })

// 2) analyze focused on CPU
analyze_etl({ "etl_path": "C:\\traces\\cpu_spike.etl", "focus": "cpu", "top_n": 10 })

// 3) (optional) generate folded stacks for the hottest paths
render_flamegraph({
  "out_dir": "C:\\traces\\cpu_spike_wpa_out\\cpu",
  "top_n": 200,
  "min_weight_ms": 5
})
```
Typical summary returned:

```json
{
  "cpu_top_processes": [
    { "process": "myapp.exe", "weight_ms": 38210.0 },
    { "process": "chrome.exe", "weight_ms": 2100.0 }
  ],
  "cpu_top_modules": [
    { "module": "myapp.exe", "weight_ms": 31020.0 },
    { "module": "ucrtbase.dll", "weight_ms": 4100.0 }
  ],
  "cpu_hot_stacks": [
    { "stack": "ntdll!RtlUserThreadStart;kernel32!BaseThreadInitThunk;myapp!worker_loop;myapp!compress_chunk",
      "weight_ms": 27800.0 }
  ]
}
```
What the LLM should conclude:

- Trace duration is 42 s and `myapp.exe` consumed ~38 s of CPU ≈ ~90% of wall time on a single-core equivalent.
- Nearly all weight is in `myapp!compress_chunk` via `worker_loop` → this is a hot loop in the app itself, not the runtime or OS.
- Recommend: inspect `compress_chunk`, consider SIMD / chunking, and capture a second trace with `-start CPU -start FileIO` to see if the loop is also I/O-bound.
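The "~90% of one core" figure follows directly from the summary numbers:

```python
# CPU weight (ms) for myapp.exe from cpu_top_processes, trace duration from validation.
weight_ms = 38210.0
duration_sec = 42.7

# Fraction of one logical core the process kept busy over the whole trace.
core_share = weight_ms / (duration_sec * 1000.0)
print(f"{core_share:.0%}")   # 89%
```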
### Example 2: UI hang / "not responding"

User: "When the user clicks Save, the window freezes for ~2 seconds. Trace at C:\traces\save_hang.etl."

LLM tool calls:

```js
validate_trace({ "etl_path": "C:\\traces\\save_hang.etl" })
analyze_etl({ "etl_path": "C:\\traces\\save_hang.etl", "focus": "ui" })
analyze_etl({ "etl_path": "C:\\traces\\save_hang.etl", "focus": "latency" })
```

Expected signals:

```json
{
  "ui_focus_top": [{ "process": "myapp.exe", "focus_ms": 5400 }],
  "dwm_slow_frames": { "count": 42, "p95_ms": 58.0, "max_ms": 1980.0 },
  "ready_latency_top": [
    { "process": "myapp.exe", "tid": 7321, "p95_ms": 14.0 }
  ]
}
```
Reasoning rubric the LLM should apply:

- `dwm_slow_frames.max_ms ≈ 1980` while the app is in focus → the hang is real and UI-thread-bound.
- `ready_latency_top` for the UI thread is low (≈14 ms p95) → the thread wasn't starved for CPU; it was busy on its own work (otherwise we'd see hundreds of ms of ready-wait).
- Therefore the UI thread is doing synchronous work during Save. Likely culprits: synchronous disk/network I/O, lock contention, or a heavy compute path on the UI thread.
- Next step: re-capture with `-start CPU -start FileIO -start DiskIO` and focus on the UI thread's stacks with `render_flamegraph` to see exactly what it's doing during those 2 seconds.

Contrast: if `ready_latency_top` had been, say, `p95_ms: 900` — then the UI thread was ready but not scheduled. That's a scheduling/contention story, not a "heavy work on UI thread" story.
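The rubric above can be written down as a small decision function. The thresholds here are illustrative rules of thumb, not wpa-mcp constants, and `classify_ui_hang` is a hypothetical helper:

```python
def classify_ui_hang(ready_p95_ms: float, dwm_max_frame_ms: float) -> str:
    """Illustrative triage: distinguish a busy UI thread from a starved one.

    Thresholds are rough rules of thumb, not calibrated wpa-mcp values.
    """
    if dwm_max_frame_ms < 100:
        return "no significant hang in this trace"
    if ready_p95_ms > 100:
        # Thread was runnable but not scheduled: a scheduling/contention story.
        return "UI thread starved: investigate CPU contention and priorities"
    # Thread ran promptly whenever it became ready, yet frames still stalled:
    # it was busy doing synchronous work on the UI thread.
    return "UI thread busy: look for sync I/O or heavy compute on the UI thread"

print(classify_ui_hang(ready_p95_ms=14.0, dwm_max_frame_ms=1980.0))
```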
### Example 3: Audio/mouse glitch caused by a driver

User: "Audio crackles and the mouse skips for a split second every few minutes. C:\traces\glitch.etl."

LLM tool calls:

```js
validate_trace({ "etl_path": "C:\\traces\\glitch.etl" })
analyze_etl({ "etl_path": "C:\\traces\\glitch.etl", "focus": "dpc_isr" })
```

Expected signal:

```json
{
  "dpc_isr_top": [
    { "driver": "Netwtw10.sys", "total_ms": 312.4, "max_us": 4120, "count": 1820 },
    { "driver": "ndis.sys", "total_ms": 95.1, "max_us": 820, "count": 4300 },
    { "driver": "nvlddmkm.sys", "total_ms": 60.0, "max_us": 410, "count": 2100 }
  ]
}
```
What the LLM should conclude:

- `Netwtw10.sys` (Intel Wi-Fi driver) has a single DPC over 4 ms — that's well above the ~1 ms "don't cause audio glitches" rule of thumb.
- Correlation with symptom: Wi-Fi DPC storms typically line up with mouse/audio skips because DPCs run at elevated IRQL and block the audio/HID stack.
- Recommend: update the Wi-Fi driver; if the problem persists, disable power-saving for the Wi-Fi adapter and re-capture.

Quality rules wpa-mcp's prompting guide bakes in: any driver with `max_us > 1000` is suspicious; `>= 500` is worth mentioning.
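Those cutoffs can be applied mechanically to `dpc_isr_top`. A minimal sketch, using the 1000 µs / 500 µs thresholds from the rule of thumb above (`rate_driver` is a hypothetical helper, not wpa-mcp code):

```python
def rate_driver(max_us: int) -> str:
    """Map a driver's worst-case DPC/ISR duration (µs) to a severity label."""
    if max_us > 1000:
        return "suspicious"        # long enough to starve audio/HID processing
    if max_us >= 500:
        return "worth mentioning"
    return "ok"

dpc_isr_top = [
    {"driver": "Netwtw10.sys", "max_us": 4120},
    {"driver": "ndis.sys", "max_us": 820},
    {"driver": "nvlddmkm.sys", "max_us": 410},
]
verdicts = {d["driver"]: rate_driver(d["max_us"]) for d in dpc_isr_top}
```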
### Example 4: Feeding folded stacks to the LLM

After `analyze_etl` with `focus="cpu"`, you can ask the LLM to drill deeper:

```js
render_flamegraph({
  "out_dir": "C:\\traces\\cpu_spike_wpa_out\\cpu",
  "output_path": "C:\\traces\\cpu_spike.folded",
  "top_n": 300,
  "min_weight_ms": 2
})
```

Returns:

```json
{
  "folded_file": "C:\\traces\\cpu_spike.folded",
  "source_csv": "C:\\traces\\cpu_spike_wpa_out\\cpu\\CPU Usage (Sampled)_....csv",
  "line_count": 287,
  "total_weight_ms": 39120.0,
  "preview": "ntdll!RtlUserThreadStart;kernel32!BaseThreadInitThunk;myapp!worker_loop;myapp!compress_chunk 27800\\nntdll!... ; myapp!parse_header 410\\n..."
}
```
You can now either:

- Render an SVG flamegraph (requires Perl + Brendan Gregg's script):

  ```
  flamegraph.pl C:\traces\cpu_spike.folded > C:\traces\cpu_spike.svg
  ```

- Or just let the LLM read the `preview` — the folded format is already much easier for an LLM than raw CSV.
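The folded format is also trivially machine-readable — one `frame1;frame2;... weight` line per stack — so a consumer can re-aggregate it in a few lines. A sketch (the weights are the ms values wpa-mcp emits):

```python
from collections import Counter

def leaf_weights(folded_text: str) -> Counter:
    """Sum folded-stack weights by leaf frame (the function actually on-CPU)."""
    totals: Counter = Counter()
    for line in folded_text.strip().splitlines():
        stack, weight = line.rsplit(" ", 1)   # weight is the last token
        leaf = stack.split(";")[-1]
        totals[leaf] += float(weight)
    return totals

folded = (
    "ntdll!RtlUserThreadStart;myapp!worker_loop;myapp!compress_chunk 27800\n"
    "ntdll!RtlUserThreadStart;myapp!worker_loop;myapp!parse_header 410\n"
    "ntdll!RtlUserThreadStart;myapp!worker_loop;myapp!compress_chunk 1200\n"
)
print(leaf_weights(folded).most_common(1))   # [('myapp!compress_chunk', 29000.0)]
```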
## Client configuration

### Claude Desktop — `%APPDATA%\Claude\claude_desktop_config.json`

```json
{
  "mcpServers": {
    "wpa": {
      "command": "wpa-mcp",
      "env": {
        "WPAEXPORTER_PATH": "C:/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit/wpaexporter.exe",
        "XPERF_PATH": "C:/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit/xperf.exe"
      }
    }
  }
}
```
### VS Code (GitHub Copilot Chat / MCP) — `.vscode/mcp.json`

Already included in this repo. It points at `server.py` in the workspace.
### Custom MCP host

Any MCP client that speaks stdio works. Launch `wpa-mcp` (or `python server.py`) as a child process and send `tools/list` + `tools/call` over stdio.
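For reference, a raw `tools/call` request is a JSON-RPC 2.0 message; building one looks roughly like the sketch below. The shape follows the MCP specification as I understand it — verify framing details against your client library before relying on it:

```python
import json

def make_tools_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one JSON-RPC 2.0 message."""
    msg = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(msg)

line = make_tools_call(1, "validate_trace", {"etl_path": r"C:\traces\case01.etl"})
```

A real client also performs the `initialize` handshake first; this only shows the per-call message shape.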
## Release process

This repo publishes to PyPI via GitHub Actions + PyPI trusted publishing (OIDC) — no secrets required.

One-time PyPI setup:

1. Claim the `wpa-mcp` project on PyPI.
2. Add a Trusted Publisher:
   - Owner: `Jialong-zhong`
   - Repository: `wpr-xperf-mcp-server`
   - Workflow: `publish.yml`
   - Environment: `pypi`
Then, to ship a new version:

```
# bump version in pyproject.toml, commit, then:
git tag v0.2.0
git push origin v0.2.0
```

The **Publish to PyPI** workflow (on tag `v*`) will build the sdist + wheel and publish automatically.
## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `wpaexporter` not found | WPT not installed or path wrong | Install the Windows Performance Toolkit; set `WPAEXPORTER_PATH` |
| `xperf` stats failed | ETL corrupted or not a WPR trace | Re-capture; ensure you ran `wpr -stop <file>` successfully |
| Columns missing in summarizer | Your WPA version renamed columns | Open the corresponding `.wpaProfile` and adjust `<Column Name=...>` to match your WPA |
| `has_stacks: false` in validation | `-start CPU` not used during capture, or not run as admin | Re-capture with `-start CPU` as Administrator |
| Empty `dwm_slow_frames` | `DesktopComposition` profile wasn't enabled | Re-capture with `-start DesktopComposition` |
| `ready_latency_top` all near zero during a hang | The thread isn't ready-waiting → it's doing work | Run `render_flamegraph` on the CPU exports to see what work |
## FAQ

**Q: Does this need the WPA GUI installed?**
No. Only `wpaexporter.exe` and `xperf.exe` (both from the Windows Performance Toolkit) are called. The WPA GUI never launches.

**Q: Can I use this on Linux/macOS?**
The MCP server itself is pure Python, but `wpaexporter` / `xperf` only exist on Windows, so analysis must run on Windows. A common setup: capture on Windows, copy the ETL to a Windows analysis box, and run wpa-mcp there.

**Q: Why not parse ETL directly in Python?**
ETL parsing is deep. Microsoft already ships an excellent, correct parser (`wpaexporter`) that understands every kernel + provider schema. Reusing it is cheaper and more accurate than reimplementing it.

**Q: Can I add my own WPA profile?**
Yes. Drop a `.wpaProfile` into `wpa/profiles/`, add a key to `PROFILE_MAP` in `server.py`, and (optionally) add a summarizer in `wpa/summarizer.py`.

**Q: Does the LLM see the full CSV?**
No — by design. The LLM sees compact summary JSON plus (optionally) folded-stack text. Raw CSVs stay on disk and are referenced by path.
## License

MIT. See LICENSE.