Source: ai-research/firecrawl-product-overview-2026-07-03.md (firecrawl.dev) + ai-research/firecrawl-pricing-2026-07-03.md + ai-research/firecrawl-monitor-product-page-2026-07-03.md + ai-research/firecrawl-monitor-docs-overview-2026-07-03.md + ai-research/firecrawl-monitor-web-scale-docs-2026-07-03.md + ai-research/firecrawl-monitor-launch-2026-07-03.md (May 27 2026) + ai-research/firecrawl-monitor-web-scale-launch-2026-07-03.md (Jul 1 2026) + ai-research/firecrawl-github-readme-2026-07-03.md
Firecrawl (Y Combinator-backed) is “the context API to search, scrape, and interact with the web at scale” — a hosted API that turns messy human-facing websites into clean, LLM-ready markdown/JSON, plus (new as of 2026-07-01) a web-scale /monitor primitive that watches the entire web for new, relevant pages rather than just diffing pages you already know about. This article resolves a standing wiki gap: Firecrawl had no home article despite already being installed as 13 first-class skills in this environment (firecrawl-scrape, firecrawl-crawl, firecrawl-map, firecrawl-search, firecrawl-interact, firecrawl-agent, plus several firecrawl-build-* workflow skills and pp-firecrawl).
Key Takeaways
- Claimed scale: 1.25M+ developers, 150,000+ companies (Apple, Canva, Shopify, Zapier, Lovable, Replit, DoorDash cited as users/case studies), 5B+ requests served, 96% web coverage, P95 latency 3.4s, “93% fewer input tokens” than raw HTML.
- Open-core, not fully open. The core repo (firecrawl/firecrawl) is AGPL-3.0 (copyleft, which matters if you’d embed or resell it) and self-hostable, with ~130K+ GitHub stars (one of the top ~100 repos on GitHub); SDKs and some UI components are MIT. The hosted cloud API adds proprietary infra (“Fire-engine”), Interact, a dashboard, and other cloud-only features not fully enumerated in the public docs.
- Eight core capabilities: Search (full-page content, not just snippets), Scrape (URL → markdown/HTML/screenshot/structured JSON, handles JS rendering automatically), Interact (scrape then click/fill/navigate via AI prompts or code), Crawl (follow links site-wide, respects robots.txt), Map (instant full-site URL discovery), Agent (natural-language autonomous data gathering via “Spark 1” models), Batch Scrape, and /parse (PDF/DOCX → LLM-ready text).
- /monitor is the differentiated primitive: no equivalent exists elsewhere in this wiki’s toolkit (not in crawl4ai, not in Tavily). Two modes: page/site monitoring (May 26-27 2026 launch: watch named URLs or crawl a site on schedule, diff against the last snapshot) and web-scale monitoring (July 1 2026 launch: give it search queries + a goal instead of URLs; it discovers new pages across the whole web, not just changes to known ones). An AI judge scores every diff/result against a plain-English goal so only meaningful changes fire; Firecrawl claims up to 90% fewer tokens reach the receiving agent.
- Already in this Claude Code environment. 13 firecrawl-* skills are installed, matching Firecrawl’s official Claude Code plugin (firecrawl/firecrawl-claude-plugin), a stronger “already in hand” signal than most competing tools get.
- Pricing is fully public and tiered, from a free 1,000-page/month tier (plus a keyless 1,000-credit/month no-signup tier added mid-June 2026) up to a $599/mo Scale plan and custom Enterprise.
/monitor — the primitive that prompted this article
Page and site monitoring (launched May 26-27, 2026)
Replaces the DIY cron + snapshot-storage + diff + webhook stack with one endpoint. Every monitor has a schedule (cron or natural language, 5-minute minimum), an optional plain-English goal, and an AI judge that scores each diff against the goal — a copyright-year bump or rotated testimonial won’t alert; a price change will. Two target types:
- scrape: watch named URLs, diff against the last snapshot.
- crawl: crawl a whole site on schedule, catch added/changed/removed pages.
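To make the crawl-target behaviour concrete, here is a minimal sketch of classifying pages as added, changed, or removed between two snapshots. It is not Firecrawl's implementation: the `{url: markdown}` snapshot shape and the names `fingerprint` and `diff_snapshots` are assumptions chosen for illustration.

```python
import hashlib


def fingerprint(markdown: str) -> str:
    """Hash page content so two snapshots can be compared cheaply."""
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


def diff_snapshots(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify URLs by comparing two {url: markdown} snapshots (hypothetical shape)."""
    prev_urls, curr_urls = set(previous), set(current)
    return {
        # Present now, absent in the last snapshot
        "added": sorted(curr_urls - prev_urls),
        # Present in the last snapshot, absent now
        "removed": sorted(prev_urls - curr_urls),
        # Present in both, but the content hash differs
        "changed": sorted(
            url
            for url in prev_urls & curr_urls
            if fingerprint(previous[url]) != fingerprint(current[url])
        ),
    }
```

A scrape target is the same comparison restricted to a fixed list of URLs, so only the "changed" bucket is interesting there. The goal-based judging step is separate and is not modelled here.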
Delivery via signed webhooks (monitor.page, monitor.check.completed) with custom headers, or email with the diff inline. Supports markdown-diff mode, JSON-schema field-level diff mode (extract and diff only specific fields like price/headline), or both combined.
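On the receiving side, a signed webhook is typically verified before the payload is trusted. The sketch below uses only Python's standard library and assumes an HMAC-SHA256 signature over the raw request body, sent as a hex digest with an optional `sha256=` prefix. That scheme, and the name `verify_signature`, are assumptions; the sources summarized here say only that the webhooks are signed, so check Firecrawl's webhook docs for the actual header and format.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if the header matches an HMAC-SHA256 of the raw body.

    Assumed scheme (not confirmed by the source): hex digest, optionally
    prefixed with "sha256=".
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.removeprefix("sha256=")
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The comparison must run against the exact bytes received, before any JSON parsing or re-serialization, because re-encoding the body can change whitespace and break the digest.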
Web-scale monitoring (launched July 1, 2026 — 2 days old as of this article)
Flips the model from diffing known pages to discovering unknown ones. Instead of naming URLs, you give it 1-12 search queries, a recency searchWindow (5 minutes to 7 days), and a required goal. Each check runs the queries, dedupes by canonical URL, judges new results against the goal, and alerts only on genuinely new and relevant pages. Pitched for regulatory filings, competitor moves, and breaking news — not just tracked pricing pages. Same webhook/email/judge machinery as page/site monitoring underneath.
from firecrawl import Firecrawl
firecrawl = Firecrawl(api_key="fc-YOUR-API-KEY")
monitor = firecrawl.create_monitor(
    name="AI coding assistant launches",
    schedule={"text": "every 30 minutes", "timezone": "UTC"},
    goal="Alert when a new open-source AI coding assistant is announced. Ignore funding rounds and unrelated AI news.",
    targets=[{"type": "search", "queries": ["open source AI coding assistant launch"], "searchWindow": "24h", "maxResults": 10}],
    notification={"email": {"enabled": True, "recipients": ["alerts@example.com"], "includeDiffs": True}},
)
Both /monitor launches were authored by cofounder Eric Ciarla and are live for all users today, not previews. An active bounty (5,000 credits) for detailed /monitor feedback suggests it’s still being tuned.
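The "dedupe by canonical URL, alert only on new pages" step of a web-scale check can be sketched locally. This is an illustration under assumptions, not Firecrawl's logic: the canonicalization rules (lowercase host, drop fragment, drop `utm_*` parameters, strip trailing slash) and the names `canonicalize` and `new_results` are hypothetical, and the AI-judge step is omitted.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking-parameter prefixes to ignore when comparing URLs
TRACKING_PREFIXES = ("utm_",)


def canonicalize(url: str) -> str:
    """Normalize a URL so trivially different links compare equal."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    # Fragment is dropped; remaining query parameters are sorted for stability
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(sorted(query)), "")
    )


def new_results(results: list[dict], seen: set[str]) -> list[dict]:
    """Return results whose canonical URL has not been seen; update `seen` in place."""
    fresh = []
    for result in results:
        key = canonicalize(result["url"])
        if key not in seen:
            seen.add(key)
            fresh.append(result)
    return fresh
```

Persisting `seen` between checks is what turns repeated searches into a discovery feed: each run surfaces only pages that no earlier run returned, and only those would then go to the judge.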
Pricing (firecrawl.dev/pricing)
| Plan | Price/mo | Pages (credits)/mo | Concurrency |
|---|---|---|---|
| Free | $0 | 1,000 | 2 |
| Hobby | $16 | 5,000 | 5 |
| Standard (recommended) | $83 | 100,000 | 50 |
| Growth | $333 | 500,000 | 100 |
| Scale | $599 | 1,000,000 | 150 |
| Enterprise | Custom | Unlimited | Custom + SLA, zero-data-retention, SSO |
Credit costs: Scrape/Crawl/Map/Monitor = 1 credit/page; Search = 2 credits/10 results; Interact = 2 credits/browser-minute; JSON-schema extraction = 5 credits/page (1 base + 4). In mid-June 2026 Firecrawl added a keyless free tier (1,000 free credits/month, no signup) specifically to lower the onboarding bar for agent/CLI use.
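The per-operation rates above can be turned into a rough budget estimate. The sketch below is arithmetic only; the function name `estimate_credits` is hypothetical, and rounding search usage up to whole blocks of 10 results and Interact usage up to whole minutes are assumptions, since the pricing summary does not say how partial units are billed.

```python
import math


def estimate_credits(pages: int = 0, search_results: int = 0,
                     interact_minutes: float = 0, json_pages: int = 0) -> int:
    """Estimate credits from the listed rates (rounding rules are assumptions)."""
    plain = pages * 1                               # Scrape/Crawl/Map/Monitor: 1 credit per page
    search = math.ceil(search_results / 10) * 2     # 2 credits per 10 results, rounded up
    interact = math.ceil(interact_minutes) * 2      # 2 credits per browser-minute, rounded up
    extraction = json_pages * 5                     # 5 credits per page (1 base + 4)
    return plain + search + interact + extraction
```

Under these assumptions, the 1,000 free credits would cover about 200 JSON-schema extraction pages (1,000 / 5), compared with 1,000 plain scrapes.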
How it compares
| | crawl4ai | Firecrawl |
|---|---|---|
| License | Apache-2.0 (permissive) | AGPL-3.0 core (copyleft) + MIT SDKs |
| Deploy | Self-hosted first, free, unmetered | Hosted first; self-hosting secondary |
| Pricing | Free (Cloud API in closed beta) | Fully public tiered pricing |
| Scale (GitHub stars) | ~68.7K | ~130K+ |
| Scheduled change/discovery detection | None built in | /monitor — page, site, and web-scale, AI-judged |
The two genuinely overlap only on the “scrape → clean markdown” use case — crawl4ai’s own article already tells readers to benchmark it against Firecrawl on cost/token-count for that overlap. Monitor is a capability class crawl4ai simply doesn’t have: no scheduler, diff engine, AI judge, or notification layer.
Against Tavily (this vault’s default research-search MCP): Tavily’s own toolset (search, extract, crawl, map, research) overlaps heavily with Firecrawl’s Search/Scrape/Crawl/Map surface, so a Firecrawl article pitched purely as “another way to search the web” would be redundant with what’s already wired in. The differentiated angle is Monitor — Tavily’s tools are all pull/request-response, with nothing that runs on a schedule, diffs, judges, and pushes a webhook.
Try It
- Free-tier smoke test: the keyless 1,000-credit/month tier needs no signup, so try firecrawl-scrape or firecrawl-search (already installed as skills in this environment) against a page you care about before committing to a paid plan.
- Replace a DIY watchlist with /monitor. This vault’s own Watch operation (karpathy-obsidian-vault-main-2/CLAUDE.md § Watch) hand-rolls a simplified version of exactly what Firecrawl Monitor does (fetch wiki/watchlist.md URLs, diff against ai-research/watchlist-snapshots/, filter cosmetic vs. meaningful changes) but only runs when a session manually triggers “check watchlist.” A Firecrawl page/site monitor with a webhook could make that genuinely autonomous instead of session-triggered.
- Try web-scale monitoring for competitor or regulatory tracking: a search-type monitor with a tight goal (e.g. “alert only on new dental-marketing-agency service launches, ignore blog posts”) is a lighter-weight alternative to a full crawl4ai/Tavily research pass for “did anything new happen” questions that recur on a schedule.
- Benchmark Scrape against crawl4ai on the same 20 URLs for token count and per-page cost before picking one as your default; see crawl4ai’s own Try It section for the mirror version of this suggestion.
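For the benchmark suggestion, a small harness can compare markdown size per tool once both outputs are collected. This sketch assumes you have already saved each tool's markdown as `{tool: {url: markdown}}`; the names `approx_tokens` and `mean_tokens_per_page` are hypothetical, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer.

```python
from statistics import mean


def approx_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token (assumption)."""
    return (len(text) + 3) // 4


def mean_tokens_per_page(outputs: dict[str, dict[str, str]]) -> dict[str, float]:
    """Mean approximate tokens per page, over URLs every tool returned."""
    if not outputs:
        raise ValueError("no tool outputs supplied")
    # Compare only URLs that all tools succeeded on, so failures don't skew the mean
    shared = set.intersection(*(set(pages) for pages in outputs.values()))
    if not shared:
        raise ValueError("tools have no URLs in common")
    return {
        tool: mean(approx_tokens(pages[url]) for url in shared)
        for tool, pages in outputs.items()
    }
```

Restricting the comparison to shared URLs keeps the token figure fair; coverage (how many of the 20 URLs each tool returned at all) is worth recording as a separate number.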
Related
- crawl4ai — the self-hosted, free, Apache-2.0 counterpart; genuinely overlapping on scrape/crawl, no equivalent to Monitor.
- TinyFish — Web Infra APIs for AI Agents — another managed fetch/search/browser API in the same competitive set; crawl4ai’s article already names Firecrawl as a competitor to both.
- Browserbase Autobrowse — managed browser harness; Firecrawl’s Interact capability covers similar click/fill/navigate ground via a different architecture.
- CloakBrowser — dedicated anti-bot/stealth tool; worth comparing against Firecrawl’s 96%-coverage claim for hard-to-scrape targets.
- The Agent-Readable Web — the thesis Firecrawl’s whole product surface (and especially web-scale Monitor) instantiates: making the existing human web legible and actionable for agents without the site’s cooperation.
- Cowork + Apify Scraping — a per-actor marketplace alternative for platform-specific scraping Firecrawl’s generic crawler doesn’t reach as deeply.
Open Questions
- Is /monitor (with its AI judge) available on the self-hosted AGPL deployment, or is it cloud/hosted-only? The docs’ code samples use an API key in a way consistent with either, and the GitHub README’s “Open Source vs Cloud” comparison doesn’t enumerate this specifically. Worth confirming before recommending Monitor to a reader who wants to self-host.
- GitHub star counts were inconsistent across Firecrawl’s own pages during this research (93.9k/130K+/142.9K/143.4K cited in different places), so treat “~130K+, one of the top 100 GitHub repos” as directionally correct, not a precise figure.