LoyDgIk

GBMCP

An MCP server for searching and downloading Chinese and international standards documents.

GBMCP is a Python stdio MCP server for querying standards metadata and downloading standard documents from supported public sources.

Scope

  • Search national, industry, local, international ISO, foreign DIN, and group standard metadata.
  • Download only public GB/GB-T/GB-Z PDFs from the national standard full-text disclosure system.
  • Download TTBZ social organization procedure/management documents when they are published on the organization detail page.
  • Download food safety national standard body files from CFSA when the public detail page exposes a file entry.
  • Download ecology/environment document attachments from MEE detail pages when the public page exposes station-local attachments.
  • Download health/medical standard attachments from NHC detail pages or direct attachment links.
  • Download emergency-management standard attachments from MEM detail pages or direct attachment links.
  • Download metrology technical specification PDFs from RESMEA direct download or preview endpoints.
  • Download public English ITU-T Recommendation PDFs from ITU recommendation detail/download links.
  • Download supplemental Doc88 documents from shared/search Doc88 links through EBT/SWF conversion.
  • Enterprise standards are intentionally not included.
  • The project references public request flows from related projects but does not copy their source code.
  • Data source adapters prefer public JSON/AJAX endpoints. HTML parsing is used only for endpoint responses that are HTML fragments or when no JSON endpoint is exposed.

Installation

1. Create a Python environment

GBMCP requires Python 3.10+.

cd <PROJECT_ROOT>
python -m venv .venv
.\.venv\Scripts\python.exe -m pip install --upgrade pip
.\.venv\Scripts\python.exe -m pip install -e ".[test]"

For a runtime-only install, omit the test extras:

.\.venv\Scripts\python.exe -m pip install -e .

2. Start the MCP server

.\.venv\Scripts\python.exe -m gbmcp.server

or:

gbmcp

3. Configure Codex

Edit %USERPROFILE%\.codex\config.toml and add one of the following configurations.

If GBMCP is installed in the Python environment used by Codex:

[mcp_servers.gbmcp]
command = "python"
args = ["-m", "gbmcp.server"]

[mcp_servers.gbmcp.env]
PYTHONIOENCODING = "utf-8"

If Codex should use the project virtual environment directly, replace <PROJECT_ROOT> with the local project path:

[mcp_servers.gbmcp]
command = "<PROJECT_ROOT>\\.venv\\Scripts\\python.exe"
args = ["-m", "gbmcp.server"]

[mcp_servers.gbmcp.env]
PYTHONIOENCODING = "utf-8"

Restart Codex after editing the configuration. Verify the server with get_config or a read-only search tool.

4. Install the companion Codex skill

The repository includes a companion skill at skills/gbmcp-standards. Copy it into Codex's skills directory:

New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.codex\skills" | Out-Null
Copy-Item -Recurse -Force .\skills\gbmcp-standards "$env:USERPROFILE\.codex\skills\gbmcp-standards"

Restart Codex after copying the skill. The skill records GBMCP source boundaries, batch-download rules, and Doc88 mode differences.

5. Optional Doc88 downloader tools

Doc88 downloads require external conversion tools. Install and expose the following tools before using doc88_download_document:

  • Java, with Java 17 or later recommended.
  • FFDec, with ffdec.jar available.
  • svg2pdf, required by the default fast Doc88 route.

Supported discovery paths:

  • Java: java on PATH, or GBMCP_DOC88_JAVA
  • FFDec: GBMCP_DOC88_FFDEC_JAR, or ffdec\ffdec.jar under the project root
  • svg2pdf: GBMCP_DOC88_SVG2PDF, svg2pdf on PATH, svg2pdf.exe under the project root, or svg2pdf\svg2pdf.exe

Check dependency discovery:

.\.venv\Scripts\python.exe -c "from gbmcp.service import doc88_check_dependencies; import json; print(json.dumps(doc88_check_dependencies(), ensure_ascii=False, indent=2))"
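
The discovery paths listed above can be sketched roughly as follows. This is a minimal illustration, not GBMCP's actual implementation: `find_doc88_tools` and its return shape are hypothetical, and the sketch assumes that an environment variable takes precedence over PATH and project-root lookups. Only the variable names and relative paths come from the list above.

```python
import os
import shutil
from pathlib import Path


def find_doc88_tools(project_root, env=None):
    """Resolve Java, ffdec.jar, and svg2pdf locations (hypothetical helper)."""
    env = os.environ if env is None else env
    root = Path(project_root)

    def first_existing(*candidates):
        # Return the first candidate path that exists on disk, else None.
        for candidate in candidates:
            if candidate and Path(candidate).exists():
                return str(candidate)
        return None

    # Assumption: an explicit environment variable wins over auto-detection.
    java = env.get("GBMCP_DOC88_JAVA") or shutil.which("java")
    ffdec_jar = first_existing(
        env.get("GBMCP_DOC88_FFDEC_JAR"),
        root / "ffdec" / "ffdec.jar",
    )
    svg2pdf = (
        env.get("GBMCP_DOC88_SVG2PDF")
        or shutil.which("svg2pdf")
        or first_existing(root / "svg2pdf.exe", root / "svg2pdf" / "svg2pdf.exe")
    )
    return {"java": java, "ffdec_jar": ffdec_jar, "svg2pdf": svg2pdf}
```

A value of `None` for any key means the tool was not found by this sketch; the real `doc88_check_dependencies` output may be shaped differently.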

6. Package for distribution

Build Python distributions and a portable project archive:

.\scripts\package.ps1

Build without running tests:

.\scripts\package.ps1 -SkipTests

Include local Doc88 downloader tools (ffdec and svg2pdf) in the archive:

.\scripts\package.ps1 -IncludeDoc88Tools

package.bat forwards the same parameters:

.\package.bat -IncludeDoc88Tools

Outputs are written under dist\, including Python wheel/sdist files and gbmcp-<version>-package.zip. The zip contains source code, README, the companion skill, and an install.ps1 helper. By default, Doc88 external tools are not bundled. Use -IncludeDoc88Tools to include local ffdec and svg2pdf directories. Java is not bundled and must be installed on the target machine.

Install from the package archive:

Expand-Archive .\gbmcp-0.1.0-package.zip -DestinationPath .\GBMCP
cd .\GBMCP
.\install.ps1

Tools by source

Source capability matrix

Use source-prefixed tools where possible. The old unprefixed tools are only compatibility aliases.

| Source prefix | Best for | Detail support | Download support | Do not use for |
| --- | --- | --- | --- | --- |
| openstd_ + download_standard | Public GB/GB-T/GB-Z national standards with download/preview capability | GB metadata from the full-text disclosure system | Public GB PDFs only | Industry/local/group/food/environment/medical/metrology/ITU standards |
| samr_ | Ordinary SAMR metadata for GB, industry, local, international ISO, foreign DIN, plans; GB advanced metadata; committees; standard samples | SAMR metadata/detail pages | No general file download; use download_standard only for openstd GB | Food-specific, environment, medical, MEM, metrology, ITU, TTBZ body downloads |
| samr_search_standard_samples | 国家标准样品 (national standard samples) / GSB reference materials only | GSM sample details | No standard-document download | Ordinary standards such as "无人机" (drones) |
| ttbz_ | 团体标准 (group standard) metadata and social organization records | Organization and group-standard details | Only organization documents: 标准制定程序文件 and 标准化文件管理制度 | Group standard body PDF download |
| foodmate_ | Food laws/regulations and FoodMate food standards | Law details and food-standard details | FoodMate standard body files when detail exposes /standard/down.php | Non-food general standards; law/regulation text downloads |
| cfsa_ | 食品安全国家标准 (national food safety standards) from CFSA | CFSA food safety standard detail and related standards from announcements | Body file only when public detail exposes file_guid | General food laws or non-CFSA standards |
| mee_ | Ecology/environment standards, regulations, and MEE documents | MEE detail pages, standard number when visible, attachments | Station-local MEE attachments only | Arbitrary external attachments or guaranteed standard PDFs |
| nhc_ | Health/medical standards from NHC | NHC standard details and station-local attachments | NHC station-local attachments/direct file URLs | General SAMR metadata; sources blocked without required cookies |
| mem_ | Emergency-management/safety-production standards in MEM standard text column | MEM details and station-local attachments | MEM station-local attachments only | Full-site MEM news search or cross-page exhaustive keyword search |
| resmea_ | Metrology technical specifications such as JJG/JJF | RESMEA detail metadata and preview/download capability | Direct PDF or preview PDF when exposed by the source | Non-metrology standards |
| itu_ | ITU-T Recommendations | Recommendation details, series, approval/status metadata | Public English PDF links only | ISO/DIN/SAMR searches |
| doc88_ | Supplemental Doc88/道客巴巴 download path for standards found in metadata-only sources; may contain many standard types but search is not guaranteed | Doc88 page metadata and FFDec conversion capability | Fast EBT/SWF/SVG PDF by default; high-fidelity FFDec PDF only on explicit request | Authoritative metadata, guaranteed search coverage, login/paid/captcha bypass, screenshot/mobile preview download |

Tool selection guidelines:

  • For ordinary Chinese standard metadata, start with samr_search_standard.
  • For downloadable public national standards, start with openstd_search_gb_downloadable, then use download_standard.
  • For 国家标准样品 / GSB reference materials only, use samr_search_standard_samples.
  • For source-specific domains, use the source prefix: foodmate_, cfsa_, mee_, nhc_, mem_, resmea_, itu_, or ttbz_.
  • Do not call a download tool unless the matching detail/search result reports a downloadable source or attachment.
  • When there are two or more standards/documents to download, prefer batch_download_documents over multiple separate source-specific download calls.

Batch download

  • batch_download_documents

batch_download_documents downloads a list of already-known targets. It is not a search tool. Use source-specific search/detail tools first, then pass detail URLs, direct download URLs, standard numbers, or Doc88 p-...html links. If multiple standards/documents need downloading, use this tool instead of making several independent download calls.

Supported target sources are openstd GB downloads, FoodMate standards, CFSA food safety standards, MEE/NHC/MEM station-local attachments, RESMEA metrology PDFs, ITU-T public English PDFs, Doc88 FFDec downloads, and TTBZ organization documents. Ordinary SAMR metadata records, TTBZ group-standard body PDFs, and law/regulation body pages are not downloadable through this batch tool.

items may be a string list:

[
  "https://www.mem.gov.cn/fw/flfgbz/bz/bzwb/202605/t20260521_604344.shtml",
  "JJG 1189.2-2026"
]

or an object list:

[
  {
    "target": "https://www.doc88.com/p-69511323154.html",
    "output_file": "GB_T_8643-2002.pdf"
  },
  {
    "target": "https://www.ttbz.org.cn/organizationDetail/123.html",
    "document_type": "标准制定程序文件",
    "output_file": "团体程序文件.pdf"
  }
]

Do not use Doc88 high_fidelity / ebt_swf_ffdec_pdf in batch mode; those downloads are intentionally rejected because two high-fidelity Doc88 conversions can exceed the MCP tool timeout.
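
As an illustration of the two accepted `items` shapes and the batch-mode restriction, a caller-side check might look like the sketch below. `normalize_batch_items` is a hypothetical helper, and the assumption that a per-item `method` key carries the Doc88 mode is not confirmed by this README.

```python
HIGH_FIDELITY_METHODS = {"high_fidelity", "ebt_swf_ffdec_pdf"}


def normalize_batch_items(items):
    """Turn a mixed list of strings and objects into uniform dicts (hypothetical helper)."""
    normalized = []
    for index, item in enumerate(items):
        if isinstance(item, str):
            # A bare string is treated as the download target.
            entry = {"target": item}
        elif isinstance(item, dict) and "target" in item:
            entry = dict(item)
        else:
            raise ValueError(f"item {index}: expected a string or an object with 'target'")
        # Assumption: the Doc88 mode travels in a per-item "method" key.
        if entry.get("method") in HIGH_FIDELITY_METHODS:
            raise ValueError(f"item {index}: high-fidelity Doc88 methods are rejected in batch mode")
        normalized.append(entry)
    return normalized
```

Running a check like this before the tool call surfaces a rejected item early instead of partway through a batch.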

www.doc88.com

  • doc88_search_documents
  • doc88_get_document
  • doc88_download_document
  • doc88_check_dependencies

doc88_search_documents uses https://www.doc88.com/tag/xx as the search entry. query_or_url may be a plain keyword or a full /tag/... URL.

Doc88 can contain many kinds of standards, but it is not authoritative and does not guarantee search coverage. Use it mainly as a supplemental download route after an authoritative or metadata-only source has identified a standard but cannot provide the text download.

doc88_get_document and doc88_download_document accept a Doc88 document URL such as https://www.doc88.com/p-69511323154.html, a mobile Doc88 URL, or a raw p_code. Do not pass page source HTML.

doc88_download_document downloads EBT fragments from the Doc88 detail config and reconstructs SWF pages. The default route is the faster EBT/SWF -> SVG -> PDF conversion, which uses Java, FFDec, and svg2pdf. Java 17 or later is recommended for FFDec compatibility. Use the slower high-fidelity EBT/SWF -> FFDec PDF route only when the user explicitly asks for higher quality; this route uses Java and FFDec. The tool does not use screenshot, mobile GIF, or browser-preview PDF generation because the output quality of those approaches is unstable.

Download-time method parameters:

  • method: auto, fast, high_fidelity, ebt_swf_ffdec_pdf, or ebt_swf_svg2pdf. auto, fast, and ebt_swf_svg2pdf use the faster SVG route. high_fidelity and ebt_swf_ffdec_pdf use slower FFDec direct PDF.
  • convert_method: auto, ffdec_pdf, svg2pdf, ebt_swf_ffdec_pdf, or ebt_swf_svg2pdf. With method=auto, convert_method=auto resolves to svg2pdf for speed.
  • swf2svg: force SWF -> SVG -> PDF conversion. This is the normal fast path and needs svg2pdf.
  • svgfontface: set FFDec textExportExportFontFace=true for SVG export. This may preserve text differently, but can also introduce font/shape conversion problems.
  • fix_displayrect: run FFDec header edits before conversion to repair rare SWF canvas size mismatches.
  • clean: remove intermediate EBT/SWF/PDF/SVG files after conversion. Set false to keep work files for diagnosis.
  • keep_workdir: also keeps the working directory for this call. This is useful for debugging failed conversions.
  • replace_jna_tmp_path: set FFDec jnaTempDirectory to the Doc88 work directory, useful on Windows temp-path issues.
  • convert_workers: number of SWF conversion worker threads for this call.
  • pdf_scale: FFDec PDF zoom scale, default 2.0. Smaller values can be faster; larger values can reduce some overly thick font artifacts.
  • max_pages: optional page limit for tests or quick inspection.
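
The `method` values above fall into two groups. The sketch below shows that grouping and the external tools each route needs; `resolve_doc88_route` and the returned dictionary are hypothetical, and the interaction with `convert_method` is deliberately left out because this README does not spell it out in full.

```python
FAST_METHODS = {"auto", "fast", "ebt_swf_svg2pdf"}
HIGH_FIDELITY_METHODS = {"high_fidelity", "ebt_swf_ffdec_pdf"}


def resolve_doc88_route(method="auto"):
    """Map a `method` value to its conversion route and required tools (hypothetical helper)."""
    if method in FAST_METHODS:
        # Fast path: SWF -> SVG -> PDF, which additionally needs svg2pdf.
        return {"route": "svg2pdf", "requires": ["java", "ffdec", "svg2pdf"]}
    if method in HIGH_FIDELITY_METHODS:
        # Slow path: FFDec renders the PDF directly.
        return {"route": "ffdec_pdf", "requires": ["java", "ffdec"]}
    raise ValueError(f"unknown Doc88 method: {method!r}")
```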

doc88_check_dependencies reports Java, ffdec.jar, and svg2pdf availability. Set GBMCP_DOC88_JAVA, GBMCP_DOC88_FFDEC_JAR, or GBMCP_DOC88_SVG2PDF to override auto-detection. Place ffdec/ffdec.jar and svg2pdf/svg2pdf.exe in the project root if environment variables are not set.

openstd.samr.gov.cn

  • openstd_search_gb_downloadable
  • download_standard

openstd_search_gb_downloadable searches the national standard full-text disclosure system used by downloads. download_standard downloads only public GB/GB-T/GB-Z documents when the source allows direct download or preview reconstruction. Use openstd_search_gb_downloadable when the user asks for downloadable public GB national standards.

std.samr.gov.cn

  • samr_search_standard
  • samr_get_standard
  • samr_search_gb_metadata
  • samr_search_standard_samples
  • samr_get_standard_sample
  • samr_search_technical_committees
  • samr_get_technical_committee

samr_search_standard is the ordinary standards metadata search. Use it for normal standard documents such as GB, industry, local, international, foreign, and plan records.

samr_search_gb_metadata uses the advanced GB metadata JSON endpoint at https://std.samr.gov.cn/gb/search/gbAdvancedSearchPage.

samr_search_standard_samples uses https://std.samr.gov.cn/gsm/search/gsmPage and searches 国家标准样品 (national standard samples) / GSB reference materials only. It is not an ordinary standards search. Do not use it for topic searches such as "无人机" (drones) unless the user explicitly asks for standard samples/reference materials. Use level=0 for sample plans and level=1 for the sample library. For level=1, state="0" means currently valid and state="1" means expired.

samr_search_technical_committees uses https://std.samr.gov.cn/org/search/orgCommiteeInfoPage. Use samr_get_technical_committee with the returned id or detail_url.

www.ttbz.org.cn

  • ttbz_search_organizations
  • ttbz_get_organization
  • ttbz_download_organization_document
  • ttbz_search_standards
  • ttbz_get_standard

ttbz_search_organizations uses https://www.ttbz.org.cn/organization.html via POST /cms-proxy/ms/bus/organList/portal/queryOrgan. Use query for the organization name keyword and province for the registration/issuing area text.

ttbz_get_organization accepts an organUniqueId or an organizationDetail/{id}.html URL. If the detail page exposes 标准制定程序文件 or 标准化文件管理制度, the result includes them under available_documents.

ttbz_download_organization_document downloads only organization documents. document_type must be exactly 标准制定程序文件 or 标准化文件管理制度. It does not download group standard body PDFs.

ttbz_search_standards uses https://www.ttbz.org.cn/standard.html via POST /cms-proxy/ms/portal/standardInfo/getPortalStandardList. Supported filters include query, standard_no, standard_name, org_code, org_name, title_en, status, can_sell, and is_public. The default status=1 matches the website's default current/published list behavior.

ttbz_get_standard accepts a standardUniqueId or a standardDetail/{id}.html URL. It returns metadata and notice-file links, but downloadable=false because standard body preview/download depends on the original site's login and permission flow.
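
Both TTBZ detail tools accept either a bare unique ID or a detail-page URL. A minimal sketch of telling the two apart is shown below; `parse_ttbz_reference` is a hypothetical helper, and the ID format is not assumed beyond "whatever sits between the path segment and .html".

```python
import re

# Matches organizationDetail/{id}.html and standardDetail/{id}.html in a URL or relative path.
_DETAIL_URL = re.compile(r"(?:^|/)(organizationDetail|standardDetail)/([^/?#]+)\.html")


def parse_ttbz_reference(value):
    """Return (kind, unique_id); kind is None when a bare ID is passed (hypothetical helper)."""
    match = _DETAIL_URL.search(value)
    if match:
        kind = "organization" if match.group(1) == "organizationDetail" else "standard"
        return kind, match.group(2)
    # Anything else is treated as an already-extracted unique ID.
    return None, value.strip()
```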

FoodMate

  • foodmate_search_laws
  • foodmate_get_law
  • foodmate_search_standards
  • foodmate_get_standard
  • foodmate_download_standard

foodmate_search_laws uses https://law.foodmate.net/rule/search.php. Parameter mapping: query -> kw, fields uses 0=标题, 1=智能, 2=全文, 3=简介; status uses 0=所有, 1=即将实施, 2=现行有效, 3=已经废止, 6=部分有效, 8=即将废止, 10=阶段性文件; category uses 0=不限, 39=国家法规, 1873=国外法规, 1330=地方法规, 1829=法规动态, 1881=法规解读, 338=其他法规.

foodmate_get_law accepts a show-{id}.html URL or id. It returns law/regulation metadata, text summary, source link, and attachment links. It does not download regulation text.

foodmate_search_standards uses https://down.foodmate.net/standard/search.php. Parameter mapping: query -> kw, include_enterprise=False -> corpstandard=2, include_enterprise=True -> corpstandard=1; status uses 0=所有, 1=即将实施, 2=现行有效, 3=已经废止, 6=部分有效, 8=即将废止; common category values include 4=国内标准, 6=国家标准, 12=地方标准, 24=团体标准, 56=食品安全企业标准, 5=国外标准.
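
To make the mapping concrete, the sketch below builds a search URL from the tool-level arguments. `build_foodmate_standard_query` is hypothetical; the `kw` and `corpstandard` names come from the mapping above, while sending the status code under a query-string key named `status` is an assumption.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://down.foodmate.net/standard/search.php"
VALID_STATUS_CODES = {0, 1, 2, 3, 6, 8}


def build_foodmate_standard_query(query, include_enterprise=False, status=0):
    """Build a FoodMate standard search URL (hypothetical helper)."""
    if status not in VALID_STATUS_CODES:
        raise ValueError(f"unsupported status code: {status}")
    params = {
        "kw": query,
        # Enterprise standards are excluded unless explicitly enabled.
        "corpstandard": 1 if include_enterprise else 2,
        "status": status,  # assumed query-string name
    }
    return f"{SEARCH_URL}?{urlencode(params)}"
```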

foodmate_get_standard accepts a FoodMate standard detail URL or itemid. foodmate_download_standard downloads only the standard body via the detail page's /standard/down.php?auth={itemid} entry and rejects HTML, verification, or empty responses.

sppt.cfsa.net.cn:8086

  • cfsa_search_standards
  • cfsa_get_standard
  • cfsa_download_standard

cfsa_search_standards uses the public JSON endpoint behind https://sppt.cfsa.net.cn:8086/db: POST /db?task=indexSearch. Parameter mapping: query -> keyword; standard_code -> keyword for standard-number searches; standard_type -> standard_type; status -> status; date filters map to startTime, endTime, s_impl_date, and e_impl_date.

Common standard_type values include 1022=污染物, 1023=微生物, 1024=食品添加剂, 1025=食品产品, 1026=生产经营规范, 1027=食品相关产品, 1028=营养与特殊膳食食品, 1029=理化检验方法与规程, 1030=微生物检验方法与规程, 1031=毒理学评价方法与程序, 2031=食品中放射性物质, 1574=食品营养强化剂, 2032=标签, 1077=农药残留, 2030=兽药残留, and 2033=修改单.

status uses 1=现行有效, 2=废止, 3=即将实施; dates use YYYY-MM-DD.

cfsa_get_standard accepts a CFSA guid or /db?type=2&guid=... URL. If the detail page has a 标准公告 link, it opens that announcement page and returns all bottom 标准文本 entries under raw.related_standards.

cfsa_download_standard downloads the body file through the detail page's file_guid and POST /cfsa_aiguo. It rejects empty or HTML/verification responses instead of keeping a bad file. If the upstream preview/download service is unavailable, the tool returns a structured error.

www.mee.gov.cn

  • mee_search_documents
  • mee_get_document
  • mee_download_document

mee_search_documents searches ecology/environment documents through the backend endpoint behind https://www.mee.gov.cn/searchnew/: GET /was5/web/search with fixed channelid=270514. Parameter mapping: query -> searchword, page -> page, orderby -> orderby, publish_date_from -> timestart, publish_date_to -> timeend.

category defaults to laws_standards, which maps to chnls=5 and corresponds to the site's 法规标准 category. Use category="all" to search all categories. Raw site category numbers are also accepted: 3=政策文件, 4=环境质量, 5=法规标准, 6=业务工作, 7=机关党建, 8=信息公开, 9=政务服务, 10=互动交流, 11=专题专栏.
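
The category handling can be pictured with the following sketch, which builds the query parameters described above. `build_mee_search_params` is a hypothetical helper; omitting `chnls` for `category="all"` is an assumption about how "all categories" is expressed in the request.

```python
MEE_CATEGORY_ALIASES = {"laws_standards": 5}  # named alias documented above


def build_mee_search_params(query, page=1, category="laws_standards",
                            publish_date_from=None, publish_date_to=None):
    """Build MEE search parameters (hypothetical helper)."""
    params = {"channelid": 270514, "searchword": query, "page": page}
    if category != "all":
        # Accept the documented alias or a raw site category number (3 to 11).
        chnls = int(MEE_CATEGORY_ALIASES.get(category, category))
        if not 3 <= chnls <= 11:
            raise ValueError(f"unknown MEE category: {category!r}")
        params["chnls"] = chnls
    if publish_date_from:
        params["timestart"] = publish_date_from
    if publish_date_to:
        params["timeend"] = publish_date_to
    return params
```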

mee_get_document accepts a detail URL returned by search. It extracts title, column/source metadata, standard number when visible, implementation date, summary text, and station-local attachments. Some MEE detail pages have no attachment; those return downloadable=false and unsupported_reason=attachment_not_found.

mee_download_document downloads only attachments parsed from the MEE detail page and hosted under mee.gov.cn. Use attachment_index to choose among multiple attachments. Empty files and HTML error/verification pages are rejected and removed.
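
Rejecting empty files and HTML error pages can be done with a small content check before a download is kept. The sketch below is a heuristic illustration only; `reject_bad_download` is a hypothetical name, and GBMCP's real checks may be stricter or look at response headers as well.

```python
def reject_bad_download(payload, expect_pdf=False):
    """Raise ValueError for empty bodies, HTML pages, or (optionally) non-PDF content."""
    if not payload:
        raise ValueError("empty response")
    # Look only at the start of the body; HTML error pages announce themselves early.
    head = payload[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        raise ValueError("HTML page returned instead of a file")
    if expect_pdf and not payload.startswith(b"%PDF-"):
        raise ValueError("response is not a PDF")
    return payload
```

A caller that has already written the bytes to disk would delete that file when this check raises.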

www.nhc.gov.cn

  • nhc_search_standards
  • nhc_get_standard
  • nhc_download_standard

nhc_search_standards searches health/medical standards through POST https://www.nhc.gov.cn/cms-search/wsbz/wsbSearchList.htm, not the GET page. Form mapping: query -> keyword, page -> page, tree_name -> treeName, tree_code_id -> treeCodeId, and pagination_input is sent empty.

The NHC site may require browser challenge cookies. If the search or detail tool returns blocked_by_waf, export the request Cookie header from a working browser/HAR and start the MCP server with GBMCP_NHC_COOKIE set to that value.

nhc_get_standard accepts an NHC detail URL or a direct station-local attachment URL. Detail pages expose metadata such as 标准号, 标准名, 发布时间, 实施时间, and attachment links under files/*.pdf.

nhc_download_standard downloads only nhc.gov.cn station-local attachments. Direct PDF attachment URLs can be downloaded even when the HTML detail/search pages are blocked. Empty files and HTML error/verification pages are rejected and removed. If Windows certificate verification fails for NHC file downloads, set GBMCP_NHC_VERIFY_SSL=false and retry.
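
How the two NHC environment variables could translate into request options is sketched below. `nhc_request_options` and its return shape are hypothetical, and treating `false`, `0`, and `no` as "disable verification" is an assumption; this README only documents the value `false`.

```python
import os


def nhc_request_options(env=None):
    """Derive headers and TLS verification from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    headers = {}
    cookie = env.get("GBMCP_NHC_COOKIE", "").strip()
    if cookie:
        # Cookie header exported from a working browser session or HAR file.
        headers["Cookie"] = cookie
    # Assumption: verification stays on unless explicitly switched off.
    flag = env.get("GBMCP_NHC_VERIFY_SSL", "true").strip().lower()
    return {"headers": headers, "verify": flag not in {"false", "0", "no"}}
```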

www.mem.gov.cn

  • mem_search_standards
  • mem_get_standard
  • mem_download_standard

mem_search_standards searches the emergency-management standards text column at https://www.mem.gov.cn/fw/flfgbz/bz/bzwb/. Page 1 uses the column homepage, and later pages use index_{page-1}.shtml. The site also exposes /was5/web/search, but that endpoint returns broad full-site mixed results, so it is not used as the default standards search source.

query filters only the fetched standards list page by standard number/title. Use page to inspect more list pages when the term is not found on the current page.
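
The paging rule and the page-local filtering can be sketched as follows. Both functions are hypothetical, and the row keys `standard_no` and `title` are assumed names for the parsed list entries.

```python
MEM_COLUMN_URL = "https://www.mem.gov.cn/fw/flfgbz/bz/bzwb/"


def mem_list_page_url(page=1):
    """Page 1 is the column homepage; page N (N > 1) is index_{N-1}.shtml."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if page == 1:
        return MEM_COLUMN_URL
    return f"{MEM_COLUMN_URL}index_{page - 1}.shtml"


def filter_mem_rows(rows, query):
    """Keep rows whose standard number or title contains the query (case-insensitive)."""
    needle = query.strip().lower()
    return [
        row for row in rows
        if needle in f"{row.get('standard_no', '')} {row.get('title', '')}".lower()
    ]
```

Because the filter only sees one fetched page, a miss on page 1 says nothing about later pages.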

mem_get_standard accepts a MEM detail URL or a direct station-local attachment URL. It extracts title, column metadata, standard number when visible, related links, and station-local attachments.

mem_download_standard downloads only mem.gov.cn station-local attachments parsed from the detail page or passed directly as attachment URLs. Use attachment_index to choose among multiple attachments. Empty files and HTML error/verification pages are rejected and removed.

jjg.spc.org.cn/resmea

  • resmea_search_standards
  • resmea_get_standard
  • resmea_download_standard
  • resmea_list_standard_types
  • resmea_list_committees

resmea_search_standards searches national metrology technical specifications through GET https://jjg.spc.org.cn/resmea/api/standard/advanced/page. Parameter mapping: code -> code, title -> title, std_type -> stdtype, committee_name -> committeeName, statuses -> repeated status, publish_date_from/publish_date_to -> publishDateStart/publishDateEnd, and implement_date_from/implement_date_to -> implementDateStart/implementDateEnd. query is a convenience value: JJG/JJF-like text maps to code, and other text maps to title. Default statuses are 现行, 即将实施, 被代替, and 废止.
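
The `query` convenience rule and the repeated `status` parameter can be illustrated with the sketch below. `build_resmea_params` is hypothetical, and the regular expression is only one plausible reading of "JJG/JJF-like text"; the real heuristic may differ.

```python
import re

DEFAULT_STATUSES = ["现行", "即将实施", "被代替", "废止"]
# Assumption: text starting with JJG or JJF (not followed by another letter) is a code.
_CODE_LIKE = re.compile(r"^\s*JJ[GF](?![A-Za-z])", re.IGNORECASE)


def build_resmea_params(query=None, code=None, title=None, std_type=None, statuses=None):
    """Return (name, value) pairs so that `status` can repeat (hypothetical helper)."""
    if query and not (code or title):
        if _CODE_LIKE.match(query):
            code = query.strip()
        else:
            title = query.strip()
    params = []
    if code:
        params.append(("code", code))
    if title:
        params.append(("title", title))
    if std_type:
        params.append(("stdtype", std_type))
    for status in statuses or DEFAULT_STATUSES:
        params.append(("status", status))
    return params
```

A list of pairs can be passed straight to `urllib.parse.urlencode`, which emits one `status=...` entry per value.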

resmea_get_standard accepts a code such as JJG 1189.2-2026 or a /resmea/standard/detail.html?standno=... URL. It combines search metadata with detail-page window.__STANDARD_DETAIL__ fields and detects whether the page exposes preview/download buttons.

resmea_download_standard first tries the direct detail-page download endpoint /resmea/standard/downPdf?stdno=... when the download button exists. If that fails or direct download is unavailable, it posts a100=<标准号> to /resmea/view/stdonline, extracts the temporary token, and downloads the preview PDF from /resmea/view/onlinereading?token=.... Empty, HTML, or non-PDF responses are rejected and removed.

resmea_list_standard_types returns site filter values such as 检定规程, 检定系统表, 型评大纲, 校准规范, and 其他计量技术规范. resmea_list_committees returns committee filter names.

www.itu.int

  • itu_search_recommendations
  • itu_get_recommendation
  • itu_download_recommendation

itu_search_recommendations searches ITU-T Recommendations through GET https://www.itu.int/ITU-T/recommendations/search.aspx. Supported filters include title and optional status: omit or use Z for any, F for In Force, and O for Superseded.

itu_get_recommendation accepts a search result id, a rec.aspx?id=... URL, or a dologin_pub.asp?...!!PDF-E&type=items download URL. It extracts Recommendation number, title/summary, series title, approval date/process, status, and English PDF download link when exposed.

itu_download_recommendation downloads only ITU English PDF links under /rec/dologin_pub.asp with type=items and !!PDF-E. ITU may return a WAF Request Rejected HTML page when headers are incomplete; GBMCP sends browser-like headers and rejects HTML/non-PDF responses instead of saving bad files.
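
The URL restriction described above amounts to a simple predicate. The following sketch is an illustration under stated assumptions: `is_itu_english_pdf_url` is a hypothetical name, and requiring the host to be exactly `www.itu.int` is an assumption drawn from this section's heading.

```python
from urllib.parse import unquote, urlsplit


def is_itu_english_pdf_url(url):
    """True for /rec/dologin_pub.asp links that carry !!PDF-E and type=items."""
    parts = urlsplit(url)
    if parts.hostname != "www.itu.int":
        return False
    if parts.path.lower() != "/rec/dologin_pub.asp":
        return False
    # Decode percent-escapes so an encoded "!!" (%21%21) is still recognized.
    query = unquote(parts.query)
    return "!!PDF-E" in query and "type=items" in query
```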

Shared

  • get_config
  • set_config

Deprecated compatibility aliases

The old unprefixed tool names are still registered as compatibility aliases:

  • search_standard -> samr_search_standard
  • search_gb_downloadable -> openstd_search_gb_downloadable
  • search_gb_metadata -> samr_search_gb_metadata
  • get_standard -> samr_get_standard
  • search_standard_samples -> samr_search_standard_samples
  • get_standard_sample -> samr_get_standard_sample
  • search_technical_committees -> samr_search_technical_committees
  • get_technical_committee -> samr_get_technical_committee

Avoid deprecated aliases in new MCP calls. In particular, search_standard_samples is only a compatibility alias for 国家标准样品 / GSB reference materials, not an ordinary standards search.

samr_search_gb_metadata parameter rules

This tool maps to the national standard advanced search form. Omit any field you do not want to filter by.

  • standard_no: GB standard number. Partial values work best, for example 18030, GB18030, or exact GB 18030-2022. GB 18030 is normalized to 18030 for this endpoint. For GB/T, use values like GB/T22320 or GB/T 22320-2025.
  • name_cn, name_en: Chinese or English title keywords, fuzzy match.
  • status: 现行, 即将实施, or 废止.
  • standard_kind: 强制性, 推荐性, or 指导性技术文件.
  • standard_sort: 产品, 基础, 方法, 管理, 安全, 卫生, 环保, or 其他.
  • plan_no: national standard plan number.
  • replaced_standard_no: replaced GB standard number.
  • ics, ccs: classification code text, for example 35.040 or L71.
  • issue_date_from, issue_date_to, implement_date_from, implement_date_to: YYYY-MM-DD.
  • drafting_org: drafting organization keyword.
  • adopted_standard_no: adopted international standard number.
  • adopted_standard_type: ISO, IEC, ISO/IEC, ITU, ISO确认的国际标准, , or 其他.
  • adoption_degree: 修改, 等同, 等效, or 非等效.
  • page_size: use one of the website page sizes: 10, 15, 20, 30, 50.
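
A caller can check the filter values against these rules before invoking the tool. The sketch below is a hypothetical pre-flight check, not part of GBMCP; it covers only the page-size and date-format rules and passes all other fields through unchanged.

```python
import re
from datetime import date

ALLOWED_PAGE_SIZES = (10, 15, 20, 30, 50)
_DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_gb_metadata_filters(page_size, **filters):
    """Validate page_size and date fields, dropping omitted filters (hypothetical helper)."""
    if page_size not in ALLOWED_PAGE_SIZES:
        raise ValueError(f"page_size must be one of {ALLOWED_PAGE_SIZES}")
    cleaned = {"page_size": page_size}
    for name, value in filters.items():
        if value in (None, ""):
            continue  # omitted fields are not sent as filters
        if name.endswith(("_date_from", "_date_to")):
            if not _DATE_SHAPE.fullmatch(value):
                raise ValueError(f"{name} must use YYYY-MM-DD")
            date.fromisoformat(value)  # also rejects impossible dates such as 2022-02-30
        cleaned[name] = value
    return cleaned
```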

Downloaded files default to data/downloads. Cache defaults to data/cache.

OCR

Captcha OCR is pluggable. By default GBMCP tries pytesseract if it is installed. You can also provide a command:

$env:GBMCP_OCR_COMMAND = "my-ocr --image {image}"

The command must print the recognized captcha text to stdout.
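
One way such a command template could be executed is sketched below. This is an assumption about the contract, not GBMCP's actual code: `run_ocr_command` is a hypothetical helper, and it treats `{image}` as a placeholder substituted after the template has been split into arguments, so image paths containing spaces stay a single argument.

```python
import shlex
import subprocess


def run_ocr_command(template, image_path, timeout=30):
    """Run an OCR command template and return the recognized text from stdout."""
    # Split first, substitute second: the image path is never re-parsed by a shell.
    argv = [part.replace("{image}", str(image_path)) for part in shlex.split(template)]
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout.strip()
```

`shlex.split` uses POSIX quoting rules, so a Windows template containing backslashes would need quoting or forward slashes in this sketch.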

Compliance

Only public GB documents that are allowed for download or preview are downloaded. TTBZ organization procedure/management documents are downloadable when directly published by the organization detail page. FoodMate and CFSA standard body files are downloaded only through each detail page's explicit download entry. MEE, NHC, and MEM attachments are downloaded only when parsed from public detail pages or provided as station-local attachment URLs. RESMEA files are downloaded only through the public direct PDF endpoint or the online preview PDF endpoint. ITU files are downloaded only through public English PDF Recommendation links. Group standard body PDFs from TTBZ, enterprise standards unless explicitly enabled for FoodMate search, and other metadata-only sources return metadata and source links only.

Author

@LoyDgIk

License

This project is released under the MIT License, a permissive open-source license. Users may use, copy, modify, merge, publish, distribute, sublicense, and sell copies of the software, subject to the license notice and disclaimer.
