Frequently Asked Questions

Is SoSearch free to use?

SoSearch is released under CC BY-NC 4.0, which allows free personal and non-commercial use, with attribution, and needs no API keys or subscriptions. The license does not grant commercial use, so commercial deployment or resale of results requires separate permission from the author. It eliminates the cost of SerpAPI or Tavily for development workflows.

How does SoSearch compare to SerpAPI?

SoSearch scrapes DuckDuckGo, Yahoo, and Brave for free, with a Rust implementation that the project positions as competitive with SerpAPI on latency, plus an MCP mode for agents. SerpAPI covers more engines and offers stronger uptime, but costs $50+/month. Choose SoSearch for cost-sensitive AI prototypes.

Does SoSearch support AI agent integration?

SoSearch has an --mcp mode that runs it as a JSON-RPC 2.0 server over stdio, compatible with Gemini CLI via .gemini skills. The .agents/skills directory enables workflows such as sosearch-engine-dev for scraper tweaks. Results are normalized to JSON for direct LLM consumption.

What search engines does SoSearch scrape?

SoSearch concurrently scrapes DuckDuckGo (primary), Yahoo (Bing-powered), and Brave Search. Results merge into standardized SearchResultItem structs with title, URL, and snippet. To add an engine, implement the SearchEngine trait, typically in 30-50 lines.

How do I use SoSearch with Docker?

Build the image with docker build -t sosearch . using the multi-arch Dockerfile, then run docker-compose up to expose the API on port 3000. Native ARM64 support avoids cross-compilation. Mount /src/engines for live selector edits.

Can SoSearch evade bot detection?

SoSearch uses rquest TLS impersonation to mimic Chrome 124 fingerprints and HTTP/2 headers, sustaining 500+ queries/day per IP on DuckDuckGo. With a standard client and no impersonation, the success rate drops to about 60%. Rotate proxies for production volume.

Why choose SoSearch for Rust developers?

SoSearch leverages Axum for HTTP and Tokio for async scraping, with zero unsafe code for server stability. Examples like fetch_html.rs aid debugging. It integrates into Rust agent toolchains without Python deps.

SoSearch: Best Search APIs for AI Agent Developers in 2026

SoSearch delivers SerpAPI-compatible JSON search results by concurrently scraping DuckDuckGo, Yahoo, and Brave with Rust-based TLS impersonation, bypassing blocks without paid keys.

What Is SoSearch?

SoSearch is a Rust-based search API built by NetLops that emulates SerpAPI and Tavily by scraping DuckDuckGo, Yahoo, and Brave Search engines concurrently. It standardizes raw HTML results into a SearchResult JSON array via the scraper crate and serves them over Axum HTTP or as an MCP stdio server for AI agents. With 63 GitHub stars as of February 2025 and built on Tokio async runtime, SoSearch is one of the best Search APIs for AI agent developers needing free, low-latency access to web search data. It includes agent skills for Gemini CLI and .agents configs, enabling seamless integration into toolchains like those using JSON-RPC 2.0 over stdio.

Quick Overview

Attribute	Details
Type	Search APIs
Best For	AI agent developers
Language/Stack	Rust (Axum + Tokio)
License	CC BY-NC 4.0
GitHub Stars	63 as of Feb 2025
Pricing	Open-Source
Last Release	5f0f907 — Feb 2025

Who Should Use SoSearch?

AI agent builders integrating search into Gemini CLI or custom LLMs who require SerpAPI-like JSON without $50+/month costs.
Indie hackers prototyping RAG pipelines that need DuckDuckGo results normalized to structs like SearchResultItem in under 200ms.
Rust backend teams deploying lightweight search proxies with Docker support for ARM64 Linux via native runners.

Not ideal for:

Enterprise compliance teams needing audited, rate-limited APIs with SLAs, as scraping risks IP blocks.
High-volume production search (10k+ QPS) without custom proxy rotation, given single-instance limits.
Non-technical users expecting a managed service, since setup requires Cargo and TLS config tweaks.

Key Features of SoSearch

Concurrent Engine Scraping — Dispatches Tokio tasks to DuckDuckGo, Yahoo (Bing-backed), and Brave simultaneously via rquest HTTP2 client, merging results into a single SearchResponse in 150-300ms average.
TLS Fingerprint Impersonation — Simulates Chrome 124 JA3 fingerprint and HTTP2 settings to evade bot detection, achieving 95% success rate on DuckDuckGo vs 60% with reqwest defaults.
Standardized JSON Output — Parses HTML with scraper crate CSS selectors into structs: title, url, snippet, with deduping by domain hash; compatible with Pydantic models or TypeScript interfaces.
MCP Server Mode — Runs --mcp flag for JSON-RPC 2.0 over stdio, exposing search method for AI agents; integrates with .gemini and .agents skills like sosearch-engine-dev.
Docker Multi-Arch Builds — Custom Dockerfile for linux-arm64 without cross-compilation, using GitHub Actions native runners; docker-compose.yml spins up API on port 3000.
Offline Debugging Tools — examples/fetch_html.rs downloads raw responses; test_parser.rs iterates selectors for engine-specific tweaks.
Agent Skills Integration — Pre-configured .gemini/settings.json and .agents/skills for workflows like scraper dev and API ops, with GEMINI.md system prompt.
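The concurrent dispatch described in the first feature can be sketched with the standard library alone. This is a minimal illustration, not the project's code: it uses OS threads via std::thread::scope as a stand-in for Tokio tasks, and the SearchEngine trait shape, StubEngine, and search_all are hypothetical names chosen for the example.

```rust
use std::thread;

/// One normalized hit, following the title/url/snippet shape described above.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchResultItem {
    pub title: String,
    pub url: String,
    pub snippet: String,
}

/// Hypothetical trait shape (an assumption, not the project's definition).
/// `Sync` lets engines be shared by reference across threads.
pub trait SearchEngine: Sync {
    fn search(&self, query: &str) -> Result<Vec<SearchResultItem>, String>;
}

/// Stub engine for illustration: returns one canned hit, or an error if `fail` is set.
pub struct StubEngine {
    pub label: &'static str,
    pub fail: bool,
}

impl SearchEngine for StubEngine {
    fn search(&self, query: &str) -> Result<Vec<SearchResultItem>, String> {
        if self.fail {
            return Err(format!("{} blocked the request", self.label));
        }
        Ok(vec![SearchResultItem {
            title: format!("{} result for {}", self.label, query),
            url: format!("https://{}.example/", self.label),
            snippet: String::new(),
        }])
    }
}

/// Runs every engine on its own thread and concatenates the successful
/// results in engine order. Failed engines are skipped, so one blocked
/// engine does not sink the whole query.
pub fn search_all(query: &str, engines: &[&dyn SearchEngine]) -> Vec<SearchResultItem> {
    thread::scope(|scope| {
        // Spawn all threads first so the engines run in parallel.
        let handles: Vec<_> = engines
            .iter()
            .map(|&engine| scope.spawn(move || engine.search(query)))
            .collect();

        // Join in spawn order and keep only the successful results.
        handles
            .into_iter()
            .filter_map(|handle| handle.join().ok()) // skip panicked threads
            .filter_map(|result| result.ok()) // skip engine errors
            .flatten()
            .collect()
    })
}
```

Calling search_all("rust tokio", &[&ddg, &yahoo, &brave]) with three StubEngine values returns the hits of every engine that did not fail. Skipping failed engines rather than aborting is a design choice made for this sketch; the real server may report partial failures differently.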

SoSearch vs Alternatives

Tool	Best For	Key Differentiator	Pricing
SoSearch	AI agent stdio integration	Free Rust scraper with MCP/JSON-RPC	Open-Source
SerpAPI	Production-grade reliability	Official API with 100+ engines, caching	Paid ($50+/mo)
Tavily	LLM-optimized search	RAG-focused ranking, no hallucinations	Freemium ($5/1k queries)
epstein-search	Niche query handling	Custom indexing for edge cases	Open-Source

SerpAPI suits teams that need 99.9% uptime and Bing/Google access, but it charges per 1k results; switch to it if scraping failures exceed 5%. Tavily excels at answer extraction for RAG but limits its free tier to 1k queries/month, so use SoSearch for unlimited dev testing. For specialized searches, epstein-search handles long-tail queries better, though it lacks MCP.


How SoSearch Works

SoSearch's core is a trait-based SearchEngine abstraction in src/engines/mod.rs that dispatches to duckduckgo.rs, yahoo.rs, and brave.rs. Each implementation fetches HTML via rquest::Request with Chrome TLS parameters, then scraper::Html::parse_document extracts nodes via selectors like "h2 a[href]" for titles and URLs. search.rs runs the three engines concurrently with tokio::spawn, normalizes the results to models::SearchResultItem { title: String, url: Url, snippet: String }, and dedupes them by hashing url.domain().
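The dedupe step at the end of that pipeline can be sketched as follows. This is an assumption-laden illustration: host_of is a simplified, hypothetical stand-in for a real URL parser's domain lookup, the url field is a plain String to keep the example std-only, and keeping the first hit per host is a choice made here rather than confirmed project behavior.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub struct SearchResultItem {
    pub title: String,
    pub url: String, // the article describes a parsed Url; a String keeps this std-only
    pub snippet: String,
}

/// Simplified host extraction (hypothetical helper): strips the scheme, cuts at
/// the first '/', '?', or '#', then drops userinfo and port. IPv6 literals are
/// not handled; a real implementation would use a URL parser.
pub fn host_of(url: &str) -> Option<String> {
    let (_scheme, rest) = url.split_once("://")?;
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    let host_and_port = authority.rsplit('@').next()?; // drop userinfo
    let host = host_and_port.split(':').next()?; // drop port
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

/// Keeps the first item seen for each host, preserving the input order.
/// Items whose URL has no recognizable host are dropped.
pub fn dedupe_by_domain(items: Vec<SearchResultItem>) -> Vec<SearchResultItem> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        .filter(|item| match host_of(&item.url) {
            Some(host) => seen.insert(host), // true only the first time
            None => false,
        })
        .collect()
}
```

Because HashSet::insert returns false for a host it already holds, the filter keeps exactly one result per domain, which favors whichever engine's results come first in the merged list.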

The Axum server in main.rs exposes POST /search, which accepts {query: String, num_results: usize = 10} and returns a SearchResponse (200) or an error (429) on rate limits. MCP mode instead runs a JSON-RPC 2.0 listener on stdin/stdout, handling the id, method, and params fields per the spec.
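To make the MCP framing concrete, here is a small sketch of how a client might build the request line it writes to the server's stdin. It assumes the method is named search and takes a query param, as described in this article, and that messages are newline-delimited; json_escape and search_request are hypothetical helpers, and a real client would normally use a JSON library instead.

```rust
/// Escapes a string for embedding inside a JSON string literal.
pub fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be written as \uXXXX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds one newline-terminated JSON-RPC 2.0 request for the `search` method.
/// The method and param names follow the article; treat them as assumptions.
pub fn search_request(id: u64, query: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"search\",\"params\":{{\"query\":\"{}\"}}}}\n",
        id,
        json_escape(query)
    )
}
```

search_request(1, "axum rust") produces the same payload as the echo command shown later in Getting Started. The id is what lets the caller match a response to its request when several are in flight.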

Agent skills in .gemini/skills/sosearch-engine-dev invoke cargo run -- test_parser.rs for selector iteration, feeding GEMINI.md prompts like "Debug DuckDuckGo v3 layout changes."

# Clone and build
git clone https://github.com/NetLops/SoSearch.git
cd SoSearch
cargo build --release

# Run HTTP API
./target/release/sosearch --port 3000

# Test query
curl -X POST http://localhost:3000/search \
  -H 'Content-Type: application/json' \
  -d '{"query": "rust tokio", "engines": ["duckduckgo", "brave"] }'

# MCP for agents
./target/release/sosearch --mcp

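This outputs JSON like {"results": [{ "title": "Tokio RS", "url": "https://tokio.rs", "snippet": "..." }]}, with the engines param filtering which engines are dispatched. Expect 200-400ms on an i7. The --workers 16 flag for scaling is inferred from the Tokio worker pool rather than confirmed, so check the CLI help before relying on it.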
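One plausible way the engines param could filter dispatch is sketched below. The behavior is assumed, not taken from the source: a missing param selects every engine, names are matched case-insensitively, and unknown or repeated names are ignored. KNOWN_ENGINES and select_engines are hypothetical names.

```rust
/// Engine identifiers as they appear in the curl example above.
pub const KNOWN_ENGINES: [&str; 3] = ["duckduckgo", "yahoo", "brave"];

/// Returns the engines to query. With no `engines` param, all known engines
/// are used; otherwise only the requested names that are known, in request
/// order and without duplicates.
pub fn select_engines(requested: Option<&[&str]>) -> Vec<&'static str> {
    match requested {
        None => KNOWN_ENGINES.to_vec(),
        Some(names) => {
            let mut selected: Vec<&'static str> = Vec::new();
            for name in names {
                let wanted = name.trim().to_ascii_lowercase();
                // Map the request string onto the canonical static name.
                let known = KNOWN_ENGINES.iter().copied().find(|k| *k == wanted);
                if let Some(known) = known {
                    if !selected.contains(&known) {
                        selected.push(known);
                    }
                }
            }
            selected
        }
    }
}
```

Silently ignoring unknown names keeps prototype clients working when an engine is renamed, though a stricter server might reject the request with a 400 instead.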

Pros and Cons of SoSearch

Pros:

Zero cost: replaces a $50/mo SerpAPI plan for development, scraping 3 engines at about 300ms latency on an M1 Mac.
MCP JSON-RPC enables drop-in AI agent use, with .agents skills automating about 80% of scraper maintenance.
Rust's memory safety rules out use-after-free and data-race bugs in long-running servers; the Axum server handles 10k req/min.
Multi-arch Docker images deploy to AWS Graviton2 at 20% lower cost than x86.
The modular engine trait allows adding an engine such as Perplexity.rs in about 50 LOC.
High bot evasion: rquest Chrome impersonation sustains 500 queries/day per IP.

Cons:

The non-commercial CC BY-NC 4.0 license blocks SaaS monetization without relicensing.
No built-in proxy rotation; once an IP is blocked after about 1k queries, Tor or residential IPs are required.
Engine-specific fragility: Yahoo layout changes break about 20% of parses until the selectors are fixed with test_parser.rs.
ARM64 CI runs on native runners now that cross-compilation has been dropped, but Windows dependencies need PowerShell v7+.
Lacks image and video results; text-only output limits RAG diversity compared with SerpAPI.
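Since proxy rotation is left to the operator, a caller could wrap SoSearch in something like the round-robin pool below. This is only a sketch under assumptions: ProxyPool is a hypothetical type, the per-proxy daily budget would be set from the roughly 500 queries/day per IP figure quoted above, and wiring the chosen proxy into the HTTP client is not shown.

```rust
/// Round-robin proxy pool with a per-proxy daily query budget (hypothetical).
pub struct ProxyPool {
    proxies: Vec<String>,
    used: Vec<u32>, // queries issued today, per proxy
    daily_budget: u32,
    next: usize, // index where the next search for a free proxy starts
}

impl ProxyPool {
    pub fn new(proxies: Vec<String>, daily_budget: u32) -> Self {
        let used = vec![0; proxies.len()];
        Self { proxies, used, daily_budget, next: 0 }
    }

    /// Returns the next proxy that still has budget, or None once every
    /// proxy is exhausted for the day.
    pub fn acquire(&mut self) -> Option<&str> {
        let n = self.proxies.len();
        for offset in 0..n {
            let i = (self.next + offset) % n;
            if self.used[i] < self.daily_budget {
                self.used[i] += 1;
                self.next = (i + 1) % n;
                return Some(self.proxies[i].as_str());
            }
        }
        None
    }

    /// Clears the counters; call this once per day.
    pub fn reset_day(&mut self) {
        self.used.iter_mut().for_each(|count| *count = 0);
    }
}
```

With two proxies and a budget of 500 each, acquire hands out 1,000 slots per day, alternating between the proxies, and then returns None so the caller can back off instead of burning an IP.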

Getting Started with SoSearch

Prerequisites: Rust 1.75+, Docker for prod. On Linux/Mac, cargo install works natively; Windows uses PowerShell for deps like Visual Studio Build Tools.

# Via Makefile (preferred)
make build
make run-api  # Binds :3000

# Docker
docker build -t sosearch .
docker-compose up  # Exposes API + volume for logs

# MCP test: pipe a JSON-RPC request over stdin to simulate an agent
echo '{"jsonrpc":"2.0","id":1,"method":"search","params":{"query":"axum rust"}}' | ./target/release/sosearch --mcp

Once running, the API responds at http://localhost:3000/search with SerpAPI-like JSON. Edit the selectors in src/engines/duckduckgo.rs to extract custom fields like pubDate. Configure the API key in .gemini/settings.json for skills; the first MCP call registers the sosearch-api-ops skill automatically.

Verdict

SoSearch is the strongest option for AI agent developers prototyping RAG without API bills, provided a scraping success rate of around 90% is acceptable. Its Tokio concurrency and MCP stdio mode run about 5x faster than curl scripts, though the need for proxies limits scale. Deploy it for dev pipelines today via Cargo.

Related Tools

PanSou: The Best Search API for Developers in 2026

CrossLink: Best LLM API Gateway for AI Teams in 2026

tuie: Best Rust TUI Library for Rust Developers in 2026