MCP Server

Model Context Protocol server for AI assistant integration.

Overview

The MCP server exposes pvl-webtools functionality to AI assistants like Claude via the Model Context Protocol. It provides tools for web searching and content fetching.

Running the Server

Command Line

# Set SearXNG URL for search
export SEARXNG_URL="http://localhost:8888"

# Run with uvx
uvx pvl-webtools-mcp

Verbose Logging

Set LOG_LEVEL (e.g., DEBUG, INFO, TRACE) or the convenience flag VERBOSE=1 to emit detailed MCP server logs to stderr without interfering with the stdio transport. TRACE additionally disables dependency filtering, so every FastMCP/Docket message appears as well.
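For example, to capture debug logs in a file while keeping stdout free for MCP stdio traffic (the log file name is arbitrary):

```shell
# Route detailed logs to a file; stdout stays reserved for the stdio transport
export SEARXNG_URL="http://localhost:8888"
LOG_LEVEL=DEBUG uvx pvl-webtools-mcp 2> mcp-server.log

# Or use the convenience flag for the same effect
VERBOSE=1 uvx pvl-webtools-mcp 2> mcp-server.log
```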

Programmatic

from pvlwebtools.mcp_server import run_server

# Standard I/O transport (for Claude integration)
run_server()

# HTTP transport (for web clients)
run_server(transport="http", host="0.0.0.0", port=8080)

Available Tools

search

Search the web via SearXNG metasearch engine.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| query | string | required | Search query |
| max_results | int | 5 | Maximum results (1-20) |
| domain_filter | string | null | Limit to domain |
| recency | string | "all_time" | Time filter |
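Over the wire, an MCP client invokes this tool with a standard tools/call request; the argument values below are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "python async best practices",
      "max_results": 5,
      "domain_filter": "wikipedia.org",
      "recency": "month"
    }
  }
}
```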

fetch

Fetch and extract content from a URL.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| url | string | required | URL to fetch |
| extract_mode | string | "markdown" | Extraction mode |

Extract Modes: markdown, article, raw, metadata

check_status

Check the status of web tools.

Returns: Status information including SearXNG availability.
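Based on the fields the server reports, a healthy configuration might look like this (values illustrative; they depend on your environment):

```python
# Illustrative check_status payload when SearXNG is configured and reachable
status = {
    "searxng_configured": True,
    "searxng_url": "http://localhost:8888",
    "searxng_healthy": True,
    "web_fetch_available": True,
}

# When SEARXNG_URL is unset, the search-related fields degrade gracefully
unconfigured = {
    "searxng_configured": False,
    "searxng_url": None,
    "searxng_healthy": False,
    "web_fetch_available": True,
}
```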

Claude Desktop Integration

Add to your Claude Desktop configuration (~/.config/claude/mcp.json):

{
  "mcpServers": {
    "pvl-webtools": {
      "command": "uvx",
      "args": ["pvl-webtools-mcp"],
      "env": {
        "SEARXNG_URL": "http://localhost:8888"
      }
    }
  }
}
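If you prefer to generate the file programmatically, a minimal sketch (same path and values as the snippet above; the write is left commented out so you can review first):

```python
import json
from pathlib import Path

# Build the Claude Desktop entry shown above
config = {
    "mcpServers": {
        "pvl-webtools": {
            "command": "uvx",
            "args": ["pvl-webtools-mcp"],
            "env": {"SEARXNG_URL": "http://localhost:8888"},
        }
    }
}

path = Path.home() / ".config" / "claude" / "mcp.json"
# Uncomment to write the config in place:
# path.parent.mkdir(parents=True, exist_ok=True)
# path.write_text(json.dumps(config, indent=2))

# Sanity check: the structure round-trips cleanly as JSON
assert json.loads(json.dumps(config)) == config
```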

API Reference

mcp_server

MCP server for pvl-webtools.

This module provides an MCP (Model Context Protocol) server that exposes web search and fetch capabilities to AI assistants and other MCP clients.

The server provides three tools:

  • search: Search the web via SearXNG metasearch engine
  • fetch: Fetch and extract content from URLs
  • check_status: Check availability of configured services

Running the Server

Via command line::

uvx pvl-webtools-mcp

Or programmatically::

from pvlwebtools.mcp_server import run_server
run_server(transport="stdio")

Configuration

Set the SEARXNG_URL environment variable for web search::

export SEARXNG_URL="http://localhost:8888"

Note

Requires the mcp extra: pip install pvl-webtools[mcp]

run_server(transport='stdio', host='127.0.0.1', port=8000)

Run the MCP server.

Starts the MCP server with the specified transport. For integration with AI assistants like Claude, use stdio transport. For HTTP clients, use http transport.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| transport | Literal['stdio', 'http'] | Transport protocol: 'stdio' is standard I/O (default, for Claude integration); 'http' runs an HTTP server (for web clients). | 'stdio' |
| host | str | Host to bind to for HTTP transport. | '127.0.0.1' |
| port | int | Port to bind to for HTTP transport. | 8000 |

Example

from pvlwebtools.mcp_server import run_server
run_server()  # Runs with stdio transport

Or with HTTP::

run_server(transport="http", host="0.0.0.0", port=8080)

Source code in src/pvlwebtools/mcp_server.py
def run_server(
    transport: Literal["stdio", "http"] = "stdio",
    host: str = "127.0.0.1",
    port: int = 8000,
) -> None:
    """Run the MCP server.

    Starts the MCP server with the specified transport. For integration
    with AI assistants like Claude, use ``stdio`` transport. For HTTP
    clients, use ``http`` transport.

    Args:
        transport: Transport protocol:

            - ``'stdio'``: Standard I/O (default, for Claude integration)
            - ``'http'``: HTTP server (for web clients)

        host: Host to bind to for HTTP transport. Default ``'127.0.0.1'``.
        port: Port to bind to for HTTP transport. Default ``8000``.

    Example:
        >>> from pvlwebtools.mcp_server import run_server
        >>> run_server()  # Runs with stdio transport

        Or with HTTP::

        >>> run_server(transport="http", host="0.0.0.0", port=8080)
    """
    if transport == "http":
        mcp.run(transport="http", host=host, port=port)
    else:
        mcp.run()

search(query, max_results=5, domain_filter=None, recency='all_time') async

Search the web using SearXNG metasearch engine.

Use this tool to search the web for information on any topic. Results include title, URL, snippet, and optionally published date.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| query | str | Search query string. Be specific for better results. Examples: "python async best practices", "climate change 2024 report". | required |
| max_results | int | Maximum number of results to return (1-20). | 5 |
| domain_filter | str \| None | Optional domain to limit search to. Examples: "wikipedia.org", "github.com", "arxiv.org". | None |
| recency | Literal['all_time', 'day', 'week', 'month', 'year'] | Time filter for results: 'all_time' (default), 'day', 'week', 'month', 'year'. | 'all_time' |

Returns:

| Type | Description |
|------|-------------|
| list[dict] | List of search results with title, url, snippet, and published_date. |

Note

Requires SEARXNG_URL environment variable to be set.

Source code in src/pvlwebtools/mcp_server.py
@mcp.tool
async def search(
    query: str,
    max_results: int = 5,
    domain_filter: str | None = None,
    recency: Literal["all_time", "day", "week", "month", "year"] = "all_time",
) -> list[dict]:
    """Search the web using SearXNG metasearch engine.

    Use this tool to search the web for information on any topic.
    Results include title, URL, snippet, and optionally published date.

    Args:
        query: Search query string. Be specific for better results.
               Examples: "python async best practices", "climate change 2024 report".
        max_results: Maximum number of results to return (1-20, default 5).
        domain_filter: Optional domain to limit search to.
                       Examples: "wikipedia.org", "github.com", "arxiv.org".
        recency: Time filter for results. One of:
                 'all_time' (default), 'day', 'week', 'month', 'year'.

    Returns:
        List of search results with title, url, snippet, and published_date.

    Note:
        Requires SEARXNG_URL environment variable to be set.
    """
    max_results = max(1, min(20, max_results))

    client = get_searxng_client()

    logger.debug(
        "search(query=%r, max_results=%s, domain_filter=%s, recency=%s)",
        _truncate(query),
        max_results,
        domain_filter,
        recency,
    )

    if not client.is_configured:
        return [{"error": "SearXNG not configured. Set SEARXNG_URL environment variable."}]

    try:
        results: list[SearchResult] = await client.search(
            query=query,
            max_results=max_results,
            domain_filter=domain_filter,
            recency=recency,
        )

        return [
            {
                "title": r.title,
                "url": r.url,
                "snippet": r.snippet,
                "published_date": r.published_date,
            }
            for r in results
        ]

    except WebSearchError as e:
        logger.warning("Search failed: %s", e)
        return [{"error": str(e)}]

fetch(url, extract_mode='markdown') async

Fetch and extract content from a URL.

Use this tool to retrieve the content of a web page. Supports multiple extraction modes optimized for different use cases.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| url | str | URL to fetch (must start with http:// or https://). | required |
| extract_mode | Literal['markdown', 'article', 'raw', 'metadata'] | How to extract content: 'markdown' converts to LLM-friendly markdown (default; preserves headings, lists, links, code blocks); 'article' extracts main article text (uses trafilatura); 'raw' returns raw HTML (truncated to 50k chars); 'metadata' extracts title, description, and Open Graph tags only. | 'markdown' |

Returns:

| Type | Description |
|------|-------------|
| dict | Dictionary with url, content, content_length, extract_mode, and truncated. |

Note

Rate-limited to 1 request per 3 seconds to avoid abuse.
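The one-request-per-interval behavior can be sketched with a minimal async limiter. This is an illustrative sketch only, not pvl-webtools' actual implementation; a short 0.2 s interval is used here so the demo runs quickly, where the server uses 3 s:

```python
import asyncio
import time


class MinIntervalLimiter:
    """Allow at most one call per `interval` seconds (illustrative sketch)."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._next_allowed = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        # The lock serializes callers; each one sleeps until its slot opens
        async with self._lock:
            now = time.monotonic()
            if now < self._next_allowed:
                await asyncio.sleep(self._next_allowed - now)
            self._next_allowed = time.monotonic() + self.interval


async def demo() -> float:
    limiter = MinIntervalLimiter(0.2)
    start = time.monotonic()
    for _ in range(3):  # first call passes immediately, the next two wait
        await limiter.wait()
    return time.monotonic() - start


elapsed = asyncio.run(demo())
```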

Source code in src/pvlwebtools/mcp_server.py
@mcp.tool
async def fetch(
    url: str,
    extract_mode: Literal["markdown", "article", "raw", "metadata"] = "markdown",
) -> dict:
    """Fetch and extract content from a URL.

    Use this tool to retrieve the content of a web page. Supports
    multiple extraction modes optimized for different use cases.

    Args:
        url: URL to fetch (must start with http:// or https://).
        extract_mode: How to extract content:
            - 'markdown': Convert to LLM-friendly markdown (default).
              Preserves headings, lists, links, code blocks.
            - 'article': Extract main article text (uses trafilatura).
            - 'raw': Return raw HTML (truncated to 50k chars).
            - 'metadata': Extract title, description, Open Graph tags only.

    Returns:
        Dictionary with url, content, content_length, and extract_mode.

    Note:
        Rate-limited to 1 request per 3 seconds to avoid abuse.
    """
    logger.debug("fetch(url=%s, extract_mode=%s)", url, extract_mode)

    try:
        result: FetchResult = await web_fetch(url=url, extract_mode=extract_mode)

        return {
            "url": result.url,
            "content": result.content[:10000],  # Truncate for token efficiency
            "content_length": result.content_length,
            "extract_mode": result.extract_mode,
            "truncated": result.content_length > 10000,
        }

    except WebFetchError as e:
        logger.warning("Fetch failed for %s: %s", url, e)
        return {"error": str(e), "url": url}
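The 10,000-character response truncation in the code above is easy to reproduce in isolation. `truncate_content` is a hypothetical helper for illustration, not part of the library:

```python
def truncate_content(content: str, limit: int = 10000) -> dict:
    """Mirror the fetch tool's response truncation (illustrative helper)."""
    return {
        "content": content[:limit],          # truncated for token efficiency
        "content_length": len(content),      # original length, pre-truncation
        "truncated": len(content) > limit,   # flag so callers know content was cut
    }


short = truncate_content("hello")
long_page = truncate_content("x" * 12000)
```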

check_status()

Check the status of web tools.

Returns:

| Type | Description |
|------|-------------|
| dict | Status information including SearXNG availability. |

Source code in src/pvlwebtools/mcp_server.py
@mcp.tool
def check_status() -> dict:
    """Check the status of web tools.

    Returns:
        Status information including SearXNG availability.
    """
    client = get_searxng_client()

    logger.debug("check_status invoked")

    return {
        "searxng_configured": client.is_configured,
        "searxng_url": client.url if client.is_configured else None,
        "searxng_healthy": client.check_health() if client.is_configured else False,
        "web_fetch_available": True,
    }