Clawdbot Competitive Intelligence
Source: /Users/jonathon/Auto1111/Claude/clawdbot/ project files, MEMORY.md project notes, clawdbot-reports.md
Multi-source competitive intelligence system tracking 16 dental marketing competitors across 8 data channels with monthly reports. Clawdbot collects YouTube videos, blog posts, podcast episodes, social media metrics, ad intelligence, Google Business Profile data, website changes, and SEO metrics, then synthesizes everything into weighted competitive position scores and trend analysis. Reports are delivered to Google Docs by default (Jonathon’s preference) or saved as a local .docx with the --local flag.
What It Does
- Tracks 16 competitor agencies in the dental marketing space
- Collects data across 8 independent channels monthly
- Calculates weighted competitive position scores
- Generates PDF and DOCX reports with charts, rankings, and takeaways
- Flags WEO Marketly (both legacy WEO Media and Marketly Digital domains) as `isOwn: true` for accurate positioning against competitors
8 Data Channels
- YouTube API — Video count, views, subscriber growth, upload frequency, topic analysis
- Blog RSS — Post frequency, topic coverage, content length, freshness
- Podcast RSS + Podchaser GraphQL — Episode frequency, guest analysis, topic trends
- Social Media — Facebook, Instagram, LinkedIn (Apify + Brave fallback), Twitter/X via Apify scrapers. `totalFollowers` = FB + IG + LI + X combined
- Ad Intelligence — Paid advertising activity and creative analysis
- Google Business Profile — Apify `compass~crawler-google-places`, health score 0-100 across 6 components. Batched in groups of 4 with 5-minute timeout each. Place IDs cached to `data/gmb-place-ids.json`
- Website Monitoring — Change detection with baselines stored in `data/website-baselines/`. Tracks structural changes, new pages, content updates
- SEO Metrics — Domain authority, backlink profiles, ranking data
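The baseline comparison behind Website Monitoring can be sketched roughly as follows. This is a minimal illustration, not Clawdbot's actual code: the function names (`hashContent`, `detectChange`), the baseline shape, and the use of a SHA-256 content hash are all assumptions. Only the idea of comparing against a stored baseline comes from the notes above.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: fingerprint page content so a baseline stays small.
function hashContent(html) {
  return crypto.createHash('sha256').update(html).digest('hex');
}

// Compare a freshly fetched page against its stored baseline.
// Assumed baseline shape: { hash: string, pages: string[] }.
// A missing baseline means this is the first run for that site.
function detectChange(baseline, html, pages) {
  const hash = hashContent(html);
  if (!baseline) {
    return { status: 'new-baseline', next: { hash, pages } };
  }
  const known = new Set(baseline.pages);
  const newPages = pages.filter((p) => !known.has(p));
  const contentChanged = baseline.hash !== hash;
  return {
    status: contentChanged || newPages.length > 0 ? 'changed' : 'unchanged',
    contentChanged,
    newPages,
    next: { hash, pages }, // what would be written back as the new baseline
  };
}
```

Returning `next` instead of writing to disk keeps the comparison pure, so the caller decides when a baseline file is actually replaced.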
7-Channel Competitive Position Model
Weighted scoring across channels (GBP is the 8th data source but feeds into overall positioning, not scored independently in the position model):
| Channel | Weight |
|---|---|
| Content | 19% |
| Social | 19% |
| SEO | 16% |
| YouTube | 15% |
| Blog | 11% |
| Ads | 11% |
| GBP | 9% |
Social breadth bonus: Up to 1.15x multiplier for presence on 4+ platforms (FB, IG, LI, X).
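As a rough sketch of how the weights and the breadth bonus could combine, the snippet below uses the seven weights from the table. Several things here are assumptions rather than documented behavior: that each channel score is normalized to 0-100, that the bonus applies to the social channel score (not the total), that it scales linearly up to 1.15x at four platforms, and that the boosted score is capped at 100. The function names are hypothetical.

```javascript
// Weights copied from the table above (they sum to 100).
const WEIGHTS = { content: 19, social: 19, seo: 16, youtube: 15, blog: 11, ads: 11, gbp: 9 };

// Assumption: linear scaling from 1.0x (one platform) to 1.15x (four or more).
function socialBreadthMultiplier(platformCount) {
  const capped = Math.min(Math.max(platformCount, 1), 4);
  return 1 + 0.15 * ((capped - 1) / 3);
}

// Assumption: channelScores holds 0-100 values keyed like WEIGHTS;
// a missing channel counts as 0.
function positionScore(channelScores, platformCount) {
  let total = 0;
  for (const [channel, weight] of Object.entries(WEIGHTS)) {
    let score = channelScores[channel] ?? 0;
    if (channel === 'social') {
      // Apply the breadth bonus to social only, capped at 100.
      score = Math.min(100, score * socialBreadthMultiplier(platformCount));
    }
    total += score * (weight / 100);
  }
  return Math.round(total * 10) / 10; // one decimal place
}
```

If the real model applies the multiplier differently (for example to the overall score), only the `social` branch would need to move.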
Own-Company Tracking
WEO Marketly (covering both legacy WEO Media and Marketly Digital domains) is flagged as isOwn: true across all 5 data modules (YouTube, blogs, podcasts, social, GMB) plus the snapshot pipeline and competitive position scorecard. This ensures:
- They appear in rankings but are visually distinguished
- Self-comparison trends are tracked separately
- Competitor rankings exclude own companies when calculating relative position
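One way to read the three points above is sketched below: own entries stay in the ranked list and carry a highlight flag, while relative position is counted against competitors only. Only the `isOwn` flag comes from the notes. The `score`, `competitorRank`, and `highlight` fields and the function name are hypothetical.

```javascript
// Rank all tracked companies, but measure relative position against
// competitors only (entries without isOwn).
function rankWithOwnFlag(companies) {
  const sorted = [...companies].sort((a, b) => b.score - a.score);
  const competitors = sorted.filter((c) => !c.isOwn);
  return sorted.map((company) => {
    // Count only competitors that outscore this company.
    const ahead = competitors.filter((c) => c.score > company.score).length;
    return {
      ...company,
      // For a competitor: its rank among competitors.
      // For an own company: where it would slot in among them.
      competitorRank: ahead + 1,
      highlight: Boolean(company.isOwn), // renderers can style this row differently
    };
  });
}

const ranked = rankWithOwnFlag([
  { name: 'Competitor A', score: 80 },
  { name: 'WEO Marketly', score: 70, isOwn: true },
  { name: 'Competitor B', score: 60 },
]);
```

In this example Competitor B keeps competitor rank 2 even though it is third in the displayed list, because the own company is not counted against it.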
Marketly Digital context: Formerly roadsidedental.com. Legacy content may appear under the old domain. The Educational Hub mixes videos, articles, and podcasts on one page. RSS at /feed/ pulls all content types.
Dual Report Rendering
Two completely separate code paths for report generation:
- PDF — Uses `scripts/lib/report-html.js` (HTML templates + Playwright for rendering). Includes radar charts, bar charts, ranking tables
- DOCX — Uses `buildChannelReport()` in the main script via the `docx` library. Generates Word documents with tables and formatting
Critical rule: New data sources must be added to BOTH rendering paths. They do not share code. Missing one path means the data appears in PDF but not DOCX (or vice versa).
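Because the two paths share no code, a small guard can catch a data source that was wired into only one of them. The sketch below is hypothetical: Clawdbot does not necessarily have a renderer registry like `channels`, and `findMissingRenderers` is an illustrative name. It only shows the shape such a check could take.

```javascript
// Hypothetical registry: one entry per data source, each expected to
// expose a renderer for both output paths.
function findMissingRenderers(channels) {
  const problems = [];
  for (const [name, renderers] of Object.entries(channels)) {
    for (const target of ['pdf', 'docx']) {
      if (typeof renderers[target] !== 'function') {
        problems.push(`${name}: missing ${target} renderer`);
      }
    }
  }
  return problems;
}

const channels = {
  youtube: { pdf: () => '<section>...</section>', docx: () => [] },
  gbp: { pdf: () => '<section>...</section>' }, // DOCX renderer forgotten
};

const problems = findMissingRenderers(channels);
if (problems.length > 0) {
  console.warn(`Renderer gaps found:\n${problems.join('\n')}`);
}
```

Run at the start of report generation, a check like this would turn a silently incomplete DOCX or PDF into a visible warning.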
Report Delivery
- Default: Google Docs upload (Jonathon’s preference — never save .docx to Desktop)
- Override: `--local` flag for .docx to Desktop
- Run command: `node scripts/generate-monthly-report.js` from `/Users/jonathon/Auto1111/Claude/clawdbot/`
GBP Implementation Details
- Scraper: Apify `compass~crawler-google-places`
- Batching: Groups of 4 places per batch, 5-minute timeout per batch
- Caching: Place IDs cached to `data/gmb-place-ids.json` to avoid re-lookup
- Health score: 0-100 across 6 components (reviews, photos, posts, hours, description, categories)
- Known issue: GBP data vanishes on Apify 400 errors — needs graceful “data unavailable” fallback
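The batching, timeout, and fallback behavior described above might look something like this. It is a sketch under stated assumptions: `scrapeBatch` is a hypothetical async function standing in for the Apify actor call, and the placeholder row shape (`available`, `reason`) is invented for illustration. The batch size and timeout are the values from the notes.

```javascript
const BATCH_SIZE = 4; // groups of 4 places per batch
const BATCH_TIMEOUT_MS = 5 * 60 * 1000; // 5-minute timeout per batch

function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Reject if the promise has not settled within ms; always clear the timer
// so a finished batch does not keep the process alive.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// scrapeBatch: hypothetical (placeIds) => Promise<result[]>.
async function collectGbp(placeIds, scrapeBatch, timeoutMs = BATCH_TIMEOUT_MS) {
  const results = [];
  for (const batch of chunk(placeIds, BATCH_SIZE)) {
    try {
      results.push(...(await withTimeout(scrapeBatch(batch), timeoutMs)));
    } catch (err) {
      // Keep one placeholder row per place so the report can show
      // "data unavailable" instead of dropping the competitor.
      for (const placeId of batch) {
        results.push({ placeId, available: false, reason: err.message });
      }
    }
  }
  return results;
}
```

A failed or timed-out batch then costs four placeholder rows and the run continues with the next batch.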
Social Expansion (Feb 2026)
- LinkedIn: Apify scraper with Brave Search fallback
- Twitter/X: Apify scraper
- Website change detection: Baselines in `data/website-baselines/`
- GBP posts: Now scraped alongside place data
- Skip flags: `--skip-gmb`, `--skip-websites` for faster runs during development
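The three flags mentioned on this page (`--local`, `--skip-gmb`, `--skip-websites`) could be parsed with Node's built-in `util.parseArgs`. This is a sketch only: the notes do not say how `generate-monthly-report.js` actually reads its arguments, and the returned option names (`delivery`, `collectGmb`, `collectWebsites`) are hypothetical.

```javascript
const { parseArgs } = require('node:util');

// Flag names come from the notes; the returned shape is illustrative.
function parseReportFlags(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      local: { type: 'boolean' },
      'skip-gmb': { type: 'boolean' },
      'skip-websites': { type: 'boolean' },
    },
  });
  return {
    // Google Docs is the default; --local switches to a .docx on disk.
    delivery: values.local ? 'local-docx' : 'google-docs',
    collectGmb: !values['skip-gmb'],
    collectWebsites: !values['skip-websites'],
  };
}

// Example: a fast development run that skips the GBP scrape.
const devRun = parseReportFlags(['--local', '--skip-gmb']);
```

In a real script the argument list would typically be `process.argv.slice(2)`. `parseArgs` is strict by default, so an unrecognized flag throws instead of being ignored.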
Channel 9 — Community Signal via last30days (proposed, 2026-05-14)
Clawdbot’s 8 channels are all publisher-side (what competitors publish) or platform-side (GBP, SEO, ad libraries). They have no visibility into community-side signal — what real people say on Reddit, X, YouTube comments, Hacker News, Polymarket, GitHub about competitors or dental categories. The last30days-skill aggregates exactly those platforms and ranks by engagement (upvotes, likes, views, real money).
Integration spec. Wire bin/last30dental from the karpathy project into Clawdbot’s monthly report assembly. Two artifact buckets:
| Bucket | Invocation | Output |
|---|---|---|
| Per-competitor briefs | bin/last30dental "<competitor name>" --emit=html per tracked competitor | One HTML brief per competitor at ~/Documents/Last30Days/WEO/<slug>-brief.html |
| Service-line topic briefs | bin/last30dental "<service line>" --emit=html --competitors=2 per WEO service line | 3-way comparison brief with auto-discovered peer entities |
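The two buckets in the table can be expressed as job descriptions that a report runner would execute once per competitor or service line. The sketch below is illustrative: the slug rule in `slugify` is an assumption (the real wrapper may derive slugs differently), the function names are hypothetical, and the competitor name is a made-up placeholder. The command, flags, and output directory come from the table.

```javascript
const path = require('node:path');
const os = require('node:os');

// Assumption: slugs are lowercase and hyphen-separated.
function slugify(name) {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Per-competitor brief: one HTML file per tracked competitor.
function competitorBriefJob(name) {
  return {
    command: 'bin/last30dental',
    args: [name, '--emit=html'],
    expectedOutput: path.join(
      os.homedir(), 'Documents', 'Last30Days', 'WEO', `${slugify(name)}-brief.html`
    ),
  };
}

// Service-line brief: 3-way comparison with two auto-discovered peers.
function serviceLineBriefJob(serviceLine) {
  return {
    command: 'bin/last30dental',
    args: [serviceLine, '--emit=html', '--competitors=2'],
  };
}

const job = competitorBriefJob('Example Dental Agency'); // placeholder name
```

Keeping the arguments as an array means a name with spaces can be handed to something like `child_process.execFile` without shell quoting.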
The Karpathy-side wrappers (bin/last30dental, bin/last30days-to-raw) and the wiki-side Tier-2 refresh hint were built 2026-05-14. The Clawdbot-side integration — adding a renderChannel9_CommunitySignal() function to both scripts/lib/report-html.js (PDF) AND buildChannelReport() (DOCX) per Clawdbot’s dual-rendering-pipeline rule — is the open follow-up. See last30days as Clawdbot’s 9th Channel for the full integration spec and clash analysis.
Free-tier cost profile. Reddit / HN / Polymarket / GitHub are zero-cost. X uses birdclaw cookie sessions (free). YouTube uses yt-dlp (free). Web search uses Brave (2k queries / month free tier, well above our 16-competitor × monthly cadence ≈ 100 calls/month). TikTok / Instagram / Threads / Pinterest are ScrapeCreators-gated — currently skipped, can be added later if patient-acquisition use cases warrant it.
Known Issues
- GBP data vanishes on Apify 400 errors (needs fallback, not silent omission)
- Social growth columns show "---" on first snapshot (no baseline to compare against)
- Page 5 key takeaways section is sparse (needs more competitive insight generation)
- PDF radar chart updated to 6-axis (was incorrectly 5-axis before Feb 2026 QA fix)
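The "---" placeholder in the social growth columns follows from having no earlier snapshot to compare against. A formatter along these lines would produce it. This is a guess at the shape, not the actual implementation: `formatGrowth` is a hypothetical name, and treating a zero baseline as missing is an added assumption.

```javascript
// Format growth between two snapshots as a signed percentage.
// Returns "---" when there is no usable baseline (first snapshot).
function formatGrowth(current, previous) {
  if (previous == null || previous === 0) return '---';
  const pct = ((current - previous) / previous) * 100;
  const sign = pct > 0 ? '+' : '';
  return `${sign}${pct.toFixed(1)}%`;
}
```

Both rendering paths would need to treat "---" as "no baseline yet" and not as zero growth.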
Key Takeaways
- 8-channel data collection provides competitive intelligence that no single tool offers — Hermes integration adds cron scheduling and Telegram delivery
- Channel 9 (proposed): the last30days community-signal layer fills the blind spot left by the publisher-side and platform-side channels, adding the Reddit / X / YouTube comments / HN / Polymarket / GitHub coverage that the current 8 channels miss. Free-tier mode operates with Brave Search + yt-dlp + birdclaw, no ScrapeCreators required. Full integration spec at last30days as Clawdbot’s 9th Channel.
- The dual rendering pipeline is the most fragile part of the system — every new data source requires updating two separate code paths
- Apify is powerful but flaky; every scraper needs a “data unavailable” fallback, not silent failure
- Own-company tracking must be explicitly separated from competitor tracking for accurate positioning
- Social breadth bonus (1.15x for 4 platforms) incentivizes multi-platform presence in the scoring model
- Google Docs delivery by default saves the manual download-and-upload step
- See SEO Content Marketing Pipeline for how Clawdbot feeds into the full content optimization loop
Related
- ecosystem-architecture — How Clawdbot feeds into the content loop
- blog-agent-worker — Competitive data informs content gap analysis
- gsc-autonomous-seo — Competitive positioning data helps prioritize queries
- seo-patterns-learned — Patterns from building the reporting system
- _index — Hermes Agent scheduling for automated reports
- _index — Marketing automation via Hermes
- _index — Client-facing dashboard integration
- essential-mcp-servers — MCP servers (Railway, Apify) used in infrastructure
Try It
- Navigate to `/Users/jonathon/Auto1111/Claude/clawdbot/`
- Run a report: `node scripts/generate-monthly-report.js` (uploads to Google Docs by default)
- For local testing: `node scripts/generate-monthly-report.js --local` (saves .docx to Desktop)
- Review `data/gmb-place-ids.json` for cached GBP place IDs
- Check `data/website-baselines/` for website change detection baselines
- After adding any new data source, verify it appears in BOTH `scripts/lib/report-html.js` (PDF) AND `buildChannelReport()` (DOCX)