web_search
Web search via the SearXNG metasearch engine.
Overview
The web_search module provides async functions for performing web searches through a SearXNG metasearch instance. SearXNG aggregates results from multiple search engines while preserving user privacy.
Configuration
Set the SEARXNG_URL environment variable to the URL of your SearXNG instance, for example: export SEARXNG_URL="http://localhost:8888"
Quick Example
```python
import asyncio
from pvlwebtools import web_search, SearXNGClient

async def main():
    # Simple search
    results = await web_search("python async", max_results=5)
    for r in results:
        print(f"{r.title}: {r.url}")

    # With domain filter
    results = await web_search(
        "machine learning",
        domain_filter="arxiv.org",
        recency="year",
    )

    # Using client directly
    client = SearXNGClient(url="http://localhost:8888")
    if client.check_health():
        results = await client.search("query")

asyncio.run(main())
```
Recency Filters
| Value | Description |
|---|---|
| all_time | No time restriction (default) |
| day | Last 24 hours |
| week | Last 7 days |
| month | Last 30 days |
| year | Last 365 days |
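The windows in the table can be written as time deltas. The following is a minimal sketch for illustration, not the library's implementation: `RECENCY_WINDOWS` and `recency_cutoff` are hypothetical names introduced here, and this page does not say how the filter is actually applied.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping of recency values to the windows listed in the table.
RECENCY_WINDOWS = {
    "all_time": None,
    "day": timedelta(hours=24),
    "week": timedelta(days=7),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}

def recency_cutoff(recency, now=None):
    """Return the earliest timestamp a recency value admits, or None for no limit."""
    if recency not in RECENCY_WINDOWS:
        raise ValueError(f"unknown recency value: {recency!r}")
    window = RECENCY_WINDOWS[recency]
    if window is None:
        return None
    now = now or datetime.now(timezone.utc)
    return now - window

print(recency_cutoff("week", now=datetime(2024, 1, 8, tzinfo=timezone.utc)))
# 2024-01-01 00:00:00+00:00
```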
API Reference
web_search
Web search via SearXNG metasearch engine.
This module provides async functions for performing web searches through a SearXNG metasearch instance. SearXNG aggregates results from multiple search engines while preserving user privacy.
Example
```python
import asyncio
from pvlwebtools import web_search

async def main():
    results = await web_search("python async", max_results=5)
    for r in results:
        print(f"{r.title}: {r.url}")

asyncio.run(main())
```
Configuration
Set the SEARXNG_URL environment variable to your SearXNG instance:
```bash
export SEARXNG_URL="http://localhost:8888"
```
Alternatively, pass the URL directly to SearXNGClient or the web_search function.
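The two configuration routes imply a simple precedence: an explicitly passed URL is used, otherwise the environment variable is read. A minimal sketch of that lookup follows. `resolve_searxng_url` is a hypothetical helper, not part of the documented API, and the trailing-slash normalization and the second URL are assumptions made for the example.

```python
import os

def resolve_searxng_url(explicit_url=None):
    """Prefer an explicitly passed URL, else fall back to SEARXNG_URL."""
    url = explicit_url or os.environ.get("SEARXNG_URL")
    # Assumption: strip a trailing slash so endpoint paths append uniformly.
    return url.rstrip("/") if url else None

os.environ["SEARXNG_URL"] = "http://localhost:8888/"
print(resolve_searxng_url())                         # falls back to the environment
print(resolve_searxng_url("http://127.0.0.1:8080"))  # explicit argument wins
```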
RecencyType = Literal['all_time', 'day', 'week', 'month', 'year']
module-attribute
Type alias for valid recency filter options.
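Because the alias is a `Literal`, its allowed values can be recovered at runtime with `typing.get_args`. The sketch below redefines the alias locally so it runs on its own. `check_recency` is a hypothetical helper, and raising `ValueError` is an assumption made for the example rather than documented library behavior.

```python
from typing import Literal, get_args

# Local copy of the documented alias so the sketch is self-contained.
RecencyType = Literal["all_time", "day", "week", "month", "year"]

def check_recency(value):
    """Return value unchanged, or raise ValueError if it is not a RecencyType option."""
    allowed = get_args(RecencyType)
    if value not in allowed:
        raise ValueError(f"recency must be one of {allowed}, got {value!r}")
    return value

print(check_recency("month"))
```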
SearXNGClient
Client for SearXNG metasearch engine.
Provides methods for searching the web through a SearXNG instance. SearXNG is a privacy-respecting metasearch engine that aggregates results from multiple search engines.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| url | str \| None | SearXNG instance URL. If not provided, reads from the SEARXNG_URL environment variable. | None |
| timeout | float | Request timeout in seconds. Default 10.0. | DEFAULT_TIMEOUT |
Attributes:
| Name | Type | Description |
|---|---|---|
| url | | The configured SearXNG instance URL. |
| timeout | | Request timeout in seconds. |
Example
```python
# Run inside an async function.
client = SearXNGClient(url="http://localhost:8888")
if client.check_health():
    results = await client.search("python tutorial")
    for r in results:
        print(r.title)
```
Source code in src/pvlwebtools/web_search.py
is_configured
property
Check whether a SearXNG URL is configured.
Returns:
| Type | Description |
|---|---|
| bool | Whether a SearXNG URL is configured. |
check_health()
Check whether the SearXNG instance is reachable and healthy.
Makes a request to the /healthz endpoint. Results are cached after the first check.
Returns:
| Type | Description |
|---|---|
| bool | Whether the instance is reachable and healthy. |
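The caching behavior described for check_health() can be sketched as a probe that remembers its first answer. This is an illustration under stated assumptions, not the library's code: `HealthProbe`, its `fetch` hook, and the choice to treat only HTTP 200 as healthy are all hypothetical.

```python
import urllib.error
import urllib.request

class HealthProbe:
    """Hypothetical helper: probe /healthz once and remember the answer."""

    def __init__(self, base_url, timeout=10.0, fetch=None):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self._fetch = fetch or self._http_status
        self._healthy = None  # None means "not checked yet"

    def _http_status(self, url):
        # Return the HTTP status code of a GET request to url.
        with urllib.request.urlopen(url, timeout=self.timeout) as resp:
            return resp.status

    def check_health(self):
        if self._healthy is None:
            try:
                # Assumption: only a 200 response counts as healthy.
                self._healthy = self._fetch(f"{self.base_url}/healthz") == 200
            except (urllib.error.URLError, OSError):
                self._healthy = False
        return self._healthy

# Usage with a stand-in fetcher, so no running instance is needed.
calls = []

def fake_fetch(url):
    calls.append(url)
    return 200

probe = HealthProbe("http://localhost:8888", fetch=fake_fetch)
print(probe.check_health(), probe.check_health(), len(calls))
```

In this sketch a failed first check is cached as well, so a fresh probe object would be needed once the instance comes back.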
search(query, max_results=5, domain_filter=None, recency='all_time')
async
Perform a web search via SearXNG.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | Search query string. Cannot be empty. | required |
| max_results | int | Maximum number of results to return (1-20). Default 5. | 5 |
| domain_filter | str \| None | Limit search to a specific domain, for example arxiv.org or nature.com. | None |
| recency | RecencyType | Time filter for results: all_time, day, week, month, or year. | 'all_time' |
Returns:
| Type | Description |
|---|---|
| list[SearchResult] | List of SearchResult objects. |

Raises:
| Type | Description |
|---|---|
| WebSearchError | If SearXNG is not configured, query is empty, domain filter is invalid, or the request fails. |
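The argument constraints listed for search() can be sketched as a pre-flight check. Everything here is hypothetical: `SearchInputError` stands in for WebSearchError, `validate_search_args` is not a documented function, the domain pattern is an assumed rule, and this page does not say whether an out-of-range max_results is rejected or clamped.

```python
import re

class SearchInputError(ValueError):
    """Hypothetical stand-in for WebSearchError in this sketch."""

# Assumed rule: two or more dot-separated labels of letters, digits, and
# hyphens, with no label starting or ending in a hyphen.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
_DOMAIN_RE = re.compile(rf"^{_LABEL}(?:\.{_LABEL})+$")

def validate_search_args(query, max_results=5, domain_filter=None):
    """Check arguments against the documented constraints before searching."""
    if not query or not query.strip():
        raise SearchInputError("query cannot be empty")
    if not 1 <= max_results <= 20:
        raise SearchInputError("max_results must be between 1 and 20")
    if domain_filter is not None and not _DOMAIN_RE.match(domain_filter):
        raise SearchInputError(f"invalid domain filter: {domain_filter!r}")
    return query.strip(), max_results, domain_filter

print(validate_search_args("climate change", 5, "nature.com"))
```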
Example
```python
# Run inside an async function, with client created as shown earlier.
results = await client.search(
    "climate change",
    domain_filter="nature.com",
    recency="year",
)
```
SearchResult
dataclass
A single search result from a web search query.
Attributes:
| Name | Type | Description |
|---|---|---|
| title | str | The title of the search result page. |
| url | str | The URL of the search result. |
| snippet | str | A text snippet or excerpt from the page content. |
| published_date | str \| None | Publication date if available (format varies). |
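To illustrate how the attributes might be populated, the sketch below defines a stand-in dataclass with the same four fields and fills it from one entry of a SearXNG JSON response. `ResultRecord` and `from_searxng_item` are hypothetical names, and the JSON keys used ("title", "url", "content", "publishedDate") are an assumption about the SearXNG response shape, not something this page documents.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResultRecord:
    """Stand-in mirroring the documented SearchResult attributes."""
    title: str
    url: str
    snippet: str
    published_date: Optional[str] = None

def from_searxng_item(item):
    """Build a record from one result entry (assumed JSON keys)."""
    return ResultRecord(
        title=item.get("title", ""),
        url=item.get("url", ""),
        snippet=item.get("content", ""),
        published_date=item.get("publishedDate"),  # may be absent
    )

record = from_searxng_item({
    "title": "Python Tutorial",
    "url": "https://python.org/tutorial",
    "content": "Learn Python programming...",
})
print(f"{record.title}: {record.url} (published: {record.published_date})")
```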
Example
```python
result = SearchResult(
    title="Python Tutorial",
    url="https://python.org/tutorial",
    snippet="Learn Python programming...",
)
print(f"{result.title}: {result.url}")
# Python Tutorial: https://python.org/tutorial
```
WebSearchError
Bases: Exception
Exception raised when web search fails.
This exception is raised for various failure conditions including:
- SearXNG not configured (missing SEARXNG_URL)
- Empty search query
- Invalid domain filter format
- HTTP errors from SearXNG
- Network timeouts or connection failures
Example
```python
# Run inside an async function.
try:
    results = await web_search("")
except WebSearchError as e:
    print(f"Search failed: {e}")
```
web_search(query, max_results=5, domain_filter=None, recency='all_time', searxng_url=None)
async
Search the web using SearXNG.
This is a convenience function that creates a SearXNGClient and performs a single search. For multiple searches, create a client instance directly to avoid repeated initialization.
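The advice to reuse one client for several searches can be sketched as follows. To keep the example runnable without a SearXNG server, `StubClient` is a hypothetical stand-in for SearXNGClient that fabricates placeholder strings instead of real results, and `search_many` is an illustrative helper, not part of the library.

```python
import asyncio

class StubClient:
    """Stand-in for SearXNGClient so the sketch runs without a server."""
    instances = 0  # counts how many clients were constructed

    def __init__(self):
        StubClient.instances += 1

    async def search(self, query, max_results=5):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [f"{query} result {i}" for i in range(1, max_results + 1)]

async def search_many(client, queries):
    # The same client instance serves every query, one after another.
    found = {}
    for q in queries:
        found[q] = await client.search(q, max_results=2)
    return found

results = asyncio.run(search_many(StubClient(), ["python async", "searxng"]))
print(StubClient.instances, results)
```

With the real library, the same pattern would pass a single SearXNGClient to a loop like this instead of calling web_search once per query.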
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | Search query string. Cannot be empty. | required |
| max_results | int | Maximum number of results to return (1-20). Default 5. | 5 |
| domain_filter | str \| None | Limit search to a specific domain, for example arxiv.org or nature.com. | None |
| recency | RecencyType | Time filter for results. One of: all_time, day, week, month, year. | 'all_time' |
| searxng_url | str \| None | SearXNG instance URL. If not provided, reads from the SEARXNG_URL environment variable. | None |
Returns:
| Type | Description |
|---|---|
| list[SearchResult] | List of SearchResult objects. |

Raises:
| Type | Description |
|---|---|
| WebSearchError | If search fails. |
Example
```python
import asyncio
from pvlwebtools import web_search

async def main():
    results = await web_search(
        "python async best practices",
        max_results=5,
    )
    for r in results:
        print(f"{r.title}: {r.url}")

asyncio.run(main())
```