Clawdbot Competitive Intelligence

Source: /Users/jonathon/Auto1111/Claude/clawdbot/ project files, MEMORY.md project notes, clawdbot-reports.md

A multi-source competitive intelligence system that tracks 16 dental marketing competitors across 8 data channels and produces monthly reports. Clawdbot collects YouTube videos, blog posts, podcast episodes, social media metrics, ad intelligence, Google Business Profile (GBP) data, website changes, and SEO metrics, then synthesizes them into weighted competitive position scores and trend analysis. Reports are delivered to Google Docs by default (Jonathon’s preference) or saved as a local .docx with the --local flag.

What It Does

  • Tracks 16 competitor agencies in the dental marketing space
  • Collects data across 8 independent channels monthly
  • Calculates weighted competitive position scores
  • Generates PDF and DOCX reports with charts, rankings, and takeaways
  • Flags WEO Marketly (both legacy WEO Media and Marketly Digital domains) as isOwn: true for accurate positioning against competitors

8 Data Channels

  1. YouTube API — Video count, views, subscriber growth, upload frequency, topic analysis
  2. Blog RSS — Post frequency, topic coverage, content length, freshness
  3. Podcast RSS + Podchaser GraphQL — Episode frequency, guest analysis, topic trends
  4. Social Media — Facebook, Instagram, LinkedIn (Apify + Brave fallback), Twitter/X via Apify scrapers. totalFollowers = FB + IG + LI + X combined
  5. Ad Intelligence — Paid advertising activity and creative analysis
  6. Google Business Profile — Apify compass~crawler-google-places, health score 0-100 across 6 components. Batched in groups of 4 with 5-minute timeout each. Place IDs cached to data/gmb-place-ids.json
  7. Website Monitoring — Change detection with baselines stored in data/website-baselines/. Tracks structural changes, new pages, content updates
  8. SEO Metrics — Domain authority, backlink profiles, ranking data
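
As an illustration of channel 7, website change detection can be sketched as a diff between a stored baseline and a fresh crawl. The baseline shape used here (a map from URL to content hash) and the function name are assumptions for the example; the actual files in data/website-baselines/ may be structured differently.

```javascript
// Hypothetical baseline shape: { [url]: contentHash }.
// The real baseline files may hold more detail than this.
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

function diffAgainstBaseline(baseline, current) {
  const added = [];
  const changed = [];
  const removed = [];

  for (const [url, hash] of Object.entries(current)) {
    if (!has(baseline, url)) added.push(url); // new page
    else if (baseline[url] !== hash) changed.push(url); // content update
  }
  for (const url of Object.keys(baseline)) {
    if (!has(current, url)) removed.push(url); // page no longer present
  }
  return { added, changed, removed };
}
```

On a first run the baseline is empty, so every crawled page is reported as added.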

7-Channel Competitive Position Model

Weighted scoring across channels (GBP is the 8th data source but feeds into overall positioning, not scored independently in the position model):

| Channel | Weight |
| --- | --- |
| Content | 19% |
| Social | 19% |
| SEO | 16% |
| YouTube | 15% |
| Blog | 11% |
| Ads | 11% |
| GBP | 9% |

Social breadth bonus: up to a 1.15x multiplier for presence on all 4 tracked platforms (FB, IG, LI, X).
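
A minimal sketch of how the weighted score could be computed, using the weights listed above. Several details are assumptions, not confirmed behavior: that each channel is scored 0 to 100, that the breadth bonus multiplies the social channel score (capped at 100), and that it switches on only at 4 platforms. The source does not say how "up to" scales below that.

```javascript
// Weights in percent, taken from the weight table in this section.
const WEIGHTS = { content: 19, social: 19, seo: 16, youtube: 15, blog: 11, ads: 11, gbp: 9 };

// channelScores: hypothetical 0-100 score per channel; missing channels count as 0.
// socialPlatformCount: how many of FB, IG, LI, X the company is present on.
function positionScore(channelScores, socialPlatformCount) {
  // Assumption: the full 1.15x applies at 4 platforms, nothing below that.
  const bonus = socialPlatformCount >= 4 ? 1.15 : 1;
  let total = 0;
  for (const [channel, weight] of Object.entries(WEIGHTS)) {
    let score = channelScores[channel] ?? 0;
    if (channel === "social") score = Math.min(100, score * bonus);
    total += weight * score;
  }
  // Weights sum to 100, so dividing by 100 gives a 0-100 result.
  return Math.round((total / 100) * 10) / 10;
}
```

With every channel at 50, this returns 50 without the bonus and 51.4 with all four platforms present.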

Own-Company Tracking

WEO Marketly (covering both legacy WEO Media and Marketly Digital domains) is flagged as isOwn: true across all 5 data-collection modules (YouTube, blogs, podcasts, social, GMB) plus the snapshot pipeline and competitive position scorecard. This ensures:

  • Own companies appear in rankings but are visually distinguished
  • Self-comparison trends are tracked separately
  • Competitor rankings exclude own companies when calculating relative position
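
One way to read the three rules above: own companies stay in the sorted list but do not consume a rank number. The sketch below assumes each entry carries a numeric score and the isOwn flag; the function name and the non-WEO company names in the usage example are hypothetical.

```javascript
// Sort by score, keep own companies in the list, but number only competitors.
function rankCompetitors(companies) {
  const sorted = [...companies].sort((a, b) => b.score - a.score);
  let rank = 0;
  return sorted.map((company) => {
    if (company.isOwn) {
      // Shown in the table (and styled differently), not counted in the ranking.
      return { ...company, rank: null };
    }
    rank += 1;
    return { ...company, rank };
  });
}

// Usage with made-up scores:
const ranked = rankCompetitors([
  { name: "Agency A", score: 80 },
  { name: "WEO Marketly", score: 90, isOwn: true },
  { name: "Agency B", score: 70 },
]);
// ranked order: WEO Marketly (rank null), Agency A (rank 1), Agency B (rank 2)
```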

Marketly Digital context: Formerly roadsidedental.com. Legacy content may appear under the old domain. The Educational Hub mixes videos, articles, and podcasts on one page. RSS at /feed/ pulls all content types.

Dual Report Rendering

Two completely separate code paths for report generation:

  • PDF — Uses scripts/lib/report-html.js (HTML templates + Playwright for rendering). Includes radar charts, bar charts, ranking tables
  • DOCX — Uses buildChannelReport() in the main script via the docx library. Generates Word documents with tables and formatting

Critical rule: New data sources must be added to BOTH rendering paths. They do not share code. Missing one path means the data appears in PDF but not DOCX (or vice versa).
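
Because the two paths share no code, a small parity check can catch a data source that was wired into only one of them. The sketch below assumes a hypothetical registry in which each source lists its PDF-path and DOCX-path renderer; Clawdbot itself does not necessarily have such a registry, and renderHtml and renderDocx are illustrative names.

```javascript
// Returns one entry per data source that is missing a renderer in either path.
function findRenderGaps(sources) {
  return sources
    .map((source) => ({
      name: source.name,
      missing: [
        typeof source.renderHtml === "function" ? null : "pdf",
        typeof source.renderDocx === "function" ? null : "docx",
      ].filter(Boolean),
    }))
    .filter((entry) => entry.missing.length > 0);
}

// Usage: a source added to the PDF path only is reported as missing "docx".
const gaps = findRenderGaps([
  { name: "youtube", renderHtml() {}, renderDocx() {} },
  { name: "communitySignal", renderHtml() {} },
]);
// gaps: [{ name: "communitySignal", missing: ["docx"] }]
```

Running a check like this before report generation turns a silent omission into an explicit failure.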

Report Delivery

  • Default: Google Docs upload (Jonathon’s preference; never save .docx to Desktop unless --local is passed)
  • Override: --local flag for .docx to Desktop
  • Run command: node scripts/generate-monthly-report.js from /Users/jonathon/Auto1111/Claude/clawdbot/

GBP Implementation Details

  • Scraper: Apify compass~crawler-google-places
  • Batching: Groups of 4 places per batch, 5-minute timeout per batch
  • Caching: Place IDs cached to data/gmb-place-ids.json to avoid re-lookup
  • Health score: 0-100 across 6 components (reviews, photos, posts, hours, description, categories)
  • Known issue: GBP data vanishes on Apify 400 errors — needs graceful “data unavailable” fallback
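
The batching, timeout, and fallback rules above can be sketched together. Here scrapeBatch is a hypothetical stand-in for the Apify call (it takes an array of place IDs and resolves to an array of rows); it is not the Apify client API. The point of the sketch is that a failed batch produces explicit "unavailable" rows instead of disappearing.

```javascript
// Split a list into groups of `size`.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// scrapeBatch(placeIds) is hypothetical: resolves to [{ placeId, ...fields }].
async function collectGbp(placeIds, scrapeBatch, { batchSize = 4, timeoutMs = 5 * 60 * 1000 } = {}) {
  const results = [];
  for (const batch of chunk(placeIds, batchSize)) {
    try {
      const rows = await withTimeout(scrapeBatch(batch), timeoutMs);
      results.push(...rows.map((row) => ({ status: "ok", ...row })));
    } catch (err) {
      // Graceful fallback: keep a row per place so the report can say "data unavailable".
      for (const placeId of batch) {
        results.push({ placeId, status: "unavailable", reason: String(err.message) });
      }
    }
  }
  return results;
}
```

Both renderers can then branch on status and print a placeholder row for unavailable places.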

Social Expansion (Feb 2026)

  • LinkedIn: Apify scraper with Brave Search fallback
  • Twitter/X: Apify scraper
  • Website change detection: Baselines in data/website-baselines/
  • GBP posts: Now scraped alongside place data
  • Skip flags: --skip-gmb, --skip-websites for faster runs during development
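
A minimal sketch of how the documented flags (--local, --skip-gmb, --skip-websites) could map onto run options. The option names in the returned object are illustrative assumptions; the real script's argument handling is not shown in these notes.

```javascript
// argv: the arguments after the script name, e.g. process.argv.slice(2).
function parseRunFlags(argv) {
  const flags = new Set(argv);
  return {
    delivery: flags.has("--local") ? "local-docx" : "google-docs",
    collectGmb: !flags.has("--skip-gmb"),
    collectWebsites: !flags.has("--skip-websites"),
  };
}

// Usage: a fast development run that skips the two slowest collectors.
const devRun = parseRunFlags(["--local", "--skip-gmb", "--skip-websites"]);
// devRun: { delivery: "local-docx", collectGmb: false, collectWebsites: false }
```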

Channel 9 — Community Signal via last30days (proposed, 2026-05-14)

Clawdbot’s 8 channels are all publisher-side (what competitors publish) or platform-side (GBP, SEO, ad libraries). They have no visibility into community-side signal: what real people say about competitors or dental categories on Reddit, X, YouTube comments, Hacker News, Polymarket, and GitHub. The last30days-skill aggregates exactly those platforms and ranks results by engagement (upvotes, likes, views, real money).

Integration spec. Wire bin/last30dental from the karpathy project into Clawdbot’s monthly report assembly. Two artifact buckets:

| Bucket | Invocation | Output |
| --- | --- | --- |
| Per-competitor briefs | bin/last30dental "<competitor name>" --emit=html per tracked competitor | One HTML brief per competitor at ~/Documents/Last30Days/WEO/<slug>-brief.html |
| Service-line topic briefs | bin/last30dental "<service line>" --emit=html --competitors=2 per WEO service line | 3-way comparison brief with auto-discovered peer entities |
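
The two buckets can be sketched as job descriptions that the report assembly step would hand to a process runner. The flags and output path come from the spec above; the slug rule (lowercase, hyphen-separated) is an assumption, since the notes do not define how <slug> is derived, and the function names are hypothetical.

```javascript
// Assumed slug rule: lowercase, runs of non-alphanumerics collapsed to one hyphen.
function slugify(name) {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// One job per tracked competitor.
function competitorBriefJob(competitorName) {
  return {
    command: "bin/last30dental",
    args: [competitorName, "--emit=html"],
    expectedOutput: `~/Documents/Last30Days/WEO/${slugify(competitorName)}-brief.html`,
  };
}

// One job per WEO service line; --competitors=2 requests the 3-way comparison.
function serviceLineBriefJob(serviceLine) {
  return {
    command: "bin/last30dental",
    args: [serviceLine, "--emit=html", "--competitors=2"],
  };
}
```

Passing the name as a separate argument, not as part of a shell string, avoids quoting problems with names that contain spaces or punctuation.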

The Karpathy-side wrappers (bin/last30dental, bin/last30days-to-raw) and the wiki-side Tier-2 refresh hint were built 2026-05-14. The Clawdbot-side integration — adding a renderChannel9_CommunitySignal() function to both scripts/lib/report-html.js (PDF) AND buildChannelReport() (DOCX) per Clawdbot’s dual-rendering-pipeline rule — is the open follow-up. See last30days as Clawdbot’s 9th Channel for the full integration spec and clash analysis.

Free-tier cost profile. Reddit / HN / Polymarket / GitHub are zero-cost. X uses birdclaw cookie sessions (free). YouTube uses yt-dlp (free). Web search uses Brave (2k queries / month free tier, well above our 16-competitor × monthly cadence ≈ 100 calls/month). TikTok / Instagram / Threads / Pinterest are ScrapeCreators-gated — currently skipped, can be added later if patient-acquisition use cases warrant it.
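
The Brave headroom claim is simple arithmetic and can be checked with a few lines. The figure of 6 calls per competitor is an assumption back-derived from the "≈ 100 calls/month" estimate (16 × 6 = 96); the actual per-competitor call count is not stated in these notes.

```javascript
// callsPerCompetitor = 6 is an assumed value chosen to match the ~100 calls/month estimate.
function braveBudget({ competitors = 16, callsPerCompetitor = 6, freeTier = 2000 } = {}) {
  const monthlyCalls = competitors * callsPerCompetitor;
  return {
    monthlyCalls,
    remaining: freeTier - monthlyCalls,
    withinFreeTier: monthlyCalls <= freeTier,
  };
}

const budget = braveBudget();
// budget: { monthlyCalls: 96, remaining: 1904, withinFreeTier: true }
```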

Known Issues

  • GBP data vanishes on Apify 400 errors (needs fallback, not silent omission)
  • Social growth columns show "---" on first snapshot (no baseline to compare against)
  • Page 5 key takeaways section is sparse (needs more competitive insight generation)
  • Resolved: PDF radar chart is now 6-axis (it was incorrectly 5-axis before the Feb 2026 QA fix)
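
The first-snapshot behavior of the growth columns can be sketched as follows. The function name and the percentage format are assumptions; only the "---" placeholder for a missing baseline comes from the notes above.

```javascript
// Returns a signed percentage string, or "---" when there is no usable baseline.
function formatGrowth(current, previous) {
  if (previous == null || previous === 0) return "---"; // first snapshot or empty baseline
  const pct = ((current - previous) / previous) * 100;
  const sign = pct > 0 ? "+" : "";
  return `${sign}${pct.toFixed(1)}%`;
}
```

For example, formatGrowth(1100, 1000) gives "+10.0%", while formatGrowth(1100, undefined) gives "---".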

Key Takeaways

  • 8-channel data collection provides competitive intelligence that no single tool offers — Hermes integration adds cron scheduling and Telegram delivery
  • Channel 9 (proposed): the last30days community-signal layer fills the community-side blind spot left by the 8 publisher-side and platform-side channels: Reddit / X / YouTube comments / HN / Polymarket / GitHub coverage that they miss. Free-tier mode operates with Brave Search + yt-dlp + birdclaw, no ScrapeCreators required. Full integration spec at last30days as Clawdbot’s 9th Channel.
  • The dual rendering pipeline is the most fragile part of the system — every new data source requires updating two separate code paths
  • Apify is powerful but flaky; every scraper needs a “data unavailable” fallback, not silent failure
  • Own-company tracking must be explicitly separated from competitor tracking for accurate positioning
  • Social breadth bonus (1.15x for 4 platforms) incentivizes multi-platform presence in the scoring model
  • Google Docs delivery by default saves the manual download-and-upload step
  • See SEO Content Marketing Pipeline for how Clawdbot feeds into the full content optimization loop

Try It

  1. Navigate to /Users/jonathon/Auto1111/Claude/clawdbot/
  2. Run a report: node scripts/generate-monthly-report.js (uploads to Google Docs by default)
  3. For local testing: node scripts/generate-monthly-report.js --local (saves .docx to Desktop)
  4. Review data/gmb-place-ids.json for cached GBP place IDs
  5. Check data/website-baselines/ for website change detection baselines
  6. After adding any new data source, verify it appears in BOTH scripts/lib/report-html.js (PDF) AND buildChannelReport() (DOCX)