MCP Server

Model Context Protocol server for AI assistant integration.

Overview

The MCP server exposes pvl-webtools functionality to AI assistants like Claude via the Model Context Protocol. It provides tools for web searching and content fetching.

Running the Server

Command Line

# Set SearXNG URL for search
export SEARXNG_URL="http://localhost:8888"

# Run with uvx
uvx pvl-webtools-mcp

Verbose Logging

Set LOG_LEVEL (e.g., DEBUG, INFO, TRACE) or the convenience flag VERBOSE=1 to emit detailed MCP server logs to stderr without interfering with the stdio transport. TRACE additionally disables dependency filtering, so every FastMCP/Docket message appears as well.
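For example, to capture debug logs in a file while keeping stdout free for MCP stdio traffic (the log file name is arbitrary):

```shell
# Route detailed logs to a file; stdout stays reserved for the stdio transport
export SEARXNG_URL="http://localhost:8888"
LOG_LEVEL=DEBUG uvx pvl-webtools-mcp 2> mcp-server.log

# Or use the convenience flag for the same effect
VERBOSE=1 uvx pvl-webtools-mcp 2> mcp-server.log
```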

Programmatic

from pvlwebtools.mcp_server import run_server

# Standard I/O transport (for Claude integration)
run_server()

# HTTP transport (for web clients)
run_server(transport="http", host="0.0.0.0", port=8080)

Available Tools

search

Search the web via SearXNG metasearch engine.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| query | string | required | Search query |
| max_results | int | 5 | Maximum results (1-20) |
| domain_filter | string | null | Limit to domain |
| recency | string | "all_time" | Time filter |
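Over the wire, an MCP client invokes this tool with a standard tools/call request; the argument values below are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "python async best practices",
      "max_results": 5,
      "domain_filter": "wikipedia.org",
      "recency": "month"
    }
  }
}
```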

fetch

Fetch and extract content from a URL.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| url | string | required | URL to fetch |
| extract_mode | string | "markdown" | Extraction mode |

Extract Modes: markdown, article, raw, metadata

check_status

Check the status of web tools.

Returns: Status information including SearXNG availability.
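Based on the fields the server reports, a healthy configuration might look like this (values illustrative; they depend on your environment):

```python
# Illustrative check_status payload when SearXNG is configured and reachable
status = {
    "searxng_configured": True,
    "searxng_url": "http://localhost:8888",
    "searxng_healthy": True,
    "web_fetch_available": True,
}

# When SEARXNG_URL is unset, the search-related fields degrade gracefully
unconfigured = {
    "searxng_configured": False,
    "searxng_url": None,
    "searxng_healthy": False,
    "web_fetch_available": True,
}
```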

Claude Desktop Integration

Add to your Claude Desktop configuration (~/.config/claude/mcp.json):

{
  "mcpServers": {
    "pvl-webtools": {
      "command": "uvx",
      "args": ["pvl-webtools-mcp"],
      "env": {
        "SEARXNG_URL": "http://localhost:8888"
      }
    }
  }
}
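If you prefer to generate the file programmatically, a minimal sketch (same path and values as the snippet above; the write is left commented out so you can review first):

```python
import json
from pathlib import Path

# Build the Claude Desktop entry shown above
config = {
    "mcpServers": {
        "pvl-webtools": {
            "command": "uvx",
            "args": ["pvl-webtools-mcp"],
            "env": {"SEARXNG_URL": "http://localhost:8888"},
        }
    }
}

path = Path.home() / ".config" / "claude" / "mcp.json"
# Uncomment to write the config in place:
# path.parent.mkdir(parents=True, exist_ok=True)
# path.write_text(json.dumps(config, indent=2))

# Sanity check: the structure round-trips cleanly as JSON
assert json.loads(json.dumps(config)) == config
```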

API Reference

mcp_server

MCP server for pvl-webtools.

This module provides an MCP (Model Context Protocol) server that exposes web search and fetch capabilities to AI assistants and other MCP clients.

The server provides three tools:

  • search: Search the web via SearXNG metasearch engine
  • fetch: Fetch and extract content from URLs
  • check_status: Check availability of configured services

Running the Server

Via command line::

uvx pvl-webtools-mcp

Or programmatically::

from pvlwebtools.mcp_server import run_server
run_server(transport="stdio")

Configuration

Set the SEARXNG_URL environment variable for web search::

export SEARXNG_URL="http://localhost:8888"

Note

Requires the mcp extra: pip install pvl-webtools[mcp]

run_server(transport='stdio', host='127.0.0.1', port=8000)

Run the MCP server.

Starts the MCP server with the specified transport. For integration with AI assistants like Claude, use stdio transport. For HTTP clients, use http transport.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| transport | Literal['stdio', 'http'] | Transport protocol: 'stdio' is standard I/O (default, for Claude integration); 'http' runs an HTTP server (for web clients). | 'stdio' |
| host | str | Host to bind to for HTTP transport. | '127.0.0.1' |
| port | int | Port to bind to for HTTP transport. | 8000 |

Example

from pvlwebtools.mcp_server import run_server
run_server()  # Runs with stdio transport

Or with HTTP::

run_server(transport="http", host="0.0.0.0", port=8080)

Source code in src/pvlwebtools/mcp_server.py
def run_server(
    transport: Literal["stdio", "http"] = "stdio",
    host: str = "127.0.0.1",
    port: int = 8000,
) -> None:
    """Run the MCP server.

    Starts the MCP server with the specified transport. For integration
    with AI assistants like Claude, use ``stdio`` transport. For HTTP
    clients, use ``http`` transport.

    Args:
        transport: Transport protocol:

            - ``'stdio'``: Standard I/O (default, for Claude integration)
            - ``'http'``: HTTP server (for web clients)

        host: Host to bind to for HTTP transport. Default ``'127.0.0.1'``.
        port: Port to bind to for HTTP transport. Default ``8000``.

    Example:
        >>> from pvlwebtools.mcp_server import run_server
        >>> run_server()  # Runs with stdio transport

        Or with HTTP::

        >>> run_server(transport="http", host="0.0.0.0", port=8080)
    """
    if transport == "http":
        mcp.run(transport="http", host=host, port=port)
    else:
        mcp.run()

search(query, max_results=5, domain_filter=None, recency='all_time') async

Search the web using SearXNG metasearch engine.

Use this tool to search the web for information on any topic. Results include title, URL, snippet, and optionally published date.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| query | str | Search query string. Be specific for better results. Examples: "python async best practices", "climate change 2024 report". | required |
| max_results | int | Maximum number of results to return (1-20). | 5 |
| domain_filter | str \| None | Optional domain to limit search to. Examples: "wikipedia.org", "github.com", "arxiv.org". | None |
| recency | Literal['all_time', 'day', 'week', 'month', 'year'] | Time filter for results: 'all_time' (default), 'day', 'week', 'month', 'year'. | 'all_time' |

Returns:

| Type | Description |
|------|-------------|
| list[dict] | List of search results with title, url, snippet, and published_date. |

Note

Requires SEARXNG_URL environment variable to be set.

Source code in src/pvlwebtools/mcp_server.py
@mcp.tool
async def search(
    query: str,
    max_results: int = 5,
    domain_filter: str | None = None,
    recency: Literal["all_time", "day", "week", "month", "year"] = "all_time",
) -> list[dict]:
    """Search the web using SearXNG metasearch engine.

    Use this tool to search the web for information on any topic.
    Results include title, URL, snippet, and optionally published date.

    Args:
        query: Search query string. Be specific for better results.
               Examples: "python async best practices", "climate change 2024 report".
        max_results: Maximum number of results to return (1-20, default 5).
        domain_filter: Optional domain to limit search to.
                       Examples: "wikipedia.org", "github.com", "arxiv.org".
        recency: Time filter for results. One of:
                 'all_time' (default), 'day', 'week', 'month', 'year'.

    Returns:
        List of search results with title, url, snippet, and published_date.

    Note:
        Requires SEARXNG_URL environment variable to be set.
    """
    max_results = max(1, min(20, max_results))

    client = get_searxng_client()

    logger.debug(
        "search(query=%r, max_results=%s, domain_filter=%s, recency=%s)",
        _truncate(query),
        max_results,
        domain_filter,
        recency,
    )

    if not client.is_configured:
        return [{"error": "SearXNG not configured. Set SEARXNG_URL environment variable."}]

    try:
        results: list[SearchResult] = await client.search(
            query=query,
            max_results=max_results,
            domain_filter=domain_filter,
            recency=recency,
        )

        return [
            {
                "title": r.title,
                "url": r.url,
                "snippet": r.snippet,
                "published_date": r.published_date,
            }
            for r in results
        ]

    except WebSearchError as e:
        logger.warning("Search failed: %s", e)
        return [{"error": str(e)}]

fetch(url, extract_mode='markdown') async

Fetch and extract content from a URL.

Use this tool to retrieve the content of a web page. Supports multiple extraction modes optimized for different use cases.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| url | str | URL to fetch (must start with http:// or https://). | required |
| extract_mode | Literal['markdown', 'article', 'raw', 'metadata'] | How to extract content: 'markdown' converts to LLM-friendly markdown (default; preserves headings, lists, links, code blocks); 'article' extracts main article text (uses trafilatura); 'raw' returns raw HTML (truncated to 50k chars); 'metadata' extracts title, description, and Open Graph tags only. | 'markdown' |

Returns:

| Type | Description |
|------|-------------|
| dict | Dictionary with url, content, content_length, extract_mode, and truncated. |

Note

Rate-limited to 1 request per 3 seconds to avoid abuse.
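The one-request-per-interval behavior can be sketched with a minimal async limiter. This is an illustrative sketch only, not pvl-webtools' actual implementation; a short 0.2 s interval is used here so the demo runs quickly, where the server uses 3 s:

```python
import asyncio
import time


class MinIntervalLimiter:
    """Allow at most one call per `interval` seconds (illustrative sketch)."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._next_allowed = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        # The lock serializes callers; each one sleeps until its slot opens
        async with self._lock:
            now = time.monotonic()
            if now < self._next_allowed:
                await asyncio.sleep(self._next_allowed - now)
            self._next_allowed = time.monotonic() + self.interval


async def demo() -> float:
    limiter = MinIntervalLimiter(0.2)
    start = time.monotonic()
    for _ in range(3):  # first call passes immediately, the next two wait
        await limiter.wait()
    return time.monotonic() - start


elapsed = asyncio.run(demo())
```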

Source code in src/pvlwebtools/mcp_server.py
@mcp.tool
async def fetch(
    url: str,
    extract_mode: Literal["markdown", "article", "raw", "metadata"] = "markdown",
) -> dict:
    """Fetch and extract content from a URL.

    Use this tool to retrieve the content of a web page. Supports
    multiple extraction modes optimized for different use cases.

    Args:
        url: URL to fetch (must start with http:// or https://).
        extract_mode: How to extract content:
            - 'markdown': Convert to LLM-friendly markdown (default).
              Preserves headings, lists, links, code blocks.
            - 'article': Extract main article text (uses trafilatura).
            - 'raw': Return raw HTML (truncated to 50k chars).
            - 'metadata': Extract title, description, Open Graph tags only.

    Returns:
        Dictionary with url, content, content_length, and extract_mode.

    Note:
        Rate-limited to 1 request per 3 seconds to avoid abuse.
    """
    logger.debug("fetch(url=%s, extract_mode=%s)", url, extract_mode)

    try:
        result: FetchResult = await web_fetch(url=url, extract_mode=extract_mode)

        return {
            "url": result.url,
            "content": result.content[:10000],  # Truncate for token efficiency
            "content_length": result.content_length,
            "extract_mode": result.extract_mode,
            "truncated": result.content_length > 10000,
        }

    except WebFetchError as e:
        logger.warning("Fetch failed for %s: %s", url, e)
        return {"error": str(e), "url": url}
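The 10,000-character response truncation in the code above is easy to reproduce in isolation. `truncate_content` is a hypothetical helper for illustration, not part of the library:

```python
def truncate_content(content: str, limit: int = 10000) -> dict:
    """Mirror the fetch tool's response truncation (illustrative helper)."""
    return {
        "content": content[:limit],          # truncated for token efficiency
        "content_length": len(content),      # original length, pre-truncation
        "truncated": len(content) > limit,   # flag so callers know content was cut
    }


short = truncate_content("hello")
long_page = truncate_content("x" * 12000)
```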

check_status()

Check the status of web tools.

Returns:

| Type | Description |
|------|-------------|
| dict | Status information including SearXNG availability. |

Source code in src/pvlwebtools/mcp_server.py
@mcp.tool
def check_status() -> dict:
    """Check the status of web tools.

    Returns:
        Status information including SearXNG availability.
    """
    client = get_searxng_client()

    logger.debug("check_status invoked")

    return {
        "searxng_configured": client.is_configured,
        "searxng_url": client.url if client.is_configured else None,
        "searxng_healthy": client.check_health() if client.is_configured else False,
        "web_fetch_available": True,
    }