Source: ai-research/ghl-2026-05-01/support-solutions-articles-48001060529-highlevel-api-documentation.md

HighLevel applies rate limits on its public V2 APIs to keep the platform stable. Limits are scoped per Marketplace app per resource (Location or Company), so a single agency installation does not consume budget that belongs to other tenants. Five X-RateLimit-* response headers expose remaining capacity on every call, which makes client-side throttling and back-off straightforward.

Key Takeaways

  • Burst limit: 100 requests per 10 seconds for each Marketplace app (i.e., client) per resource (i.e., Location or Company).
  • Daily limit: 200,000 requests per day for each Marketplace app per resource.
  • “Per resource” means per Location or Company. A multi-tenant Marketplace app installed on 50 sub-accounts has its own bucket for each sub-account — limits do not pool across tenants.
  • Limits apply to V2 APIs using OAuth. V1 reached end-of-support on December 31, 2025, and receives no further updates; V2 is the supported tier.
  • Five response headers report current usage on every call:
    • X-RateLimit-Limit-Daily — the daily limit
    • X-RateLimit-Daily-Remaining — remaining requests for the day
    • X-RateLimit-Interval-Milliseconds — the time interval for burst requests
    • X-RateLimit-Max — the maximum request limit in the specified time interval
    • X-RateLimit-Remaining — remaining requests in the current burst window
  • Treat headers as the source of truth. Read them on every response, log the deltas, and back off before you hit zero — there is no separate quota dashboard called out in the docs.
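A minimal sketch of reading those five headers into a usable structure. The header names come from the list above; the dict shape and function name are illustrative, not part of the API:

```python
def parse_rate_limit_headers(headers: dict) -> dict:
    """Extract rate-limit state from a V2 API response's header mapping."""
    return {
        "daily_limit": int(headers["X-RateLimit-Limit-Daily"]),
        "daily_remaining": int(headers["X-RateLimit-Daily-Remaining"]),
        "interval_ms": int(headers["X-RateLimit-Interval-Milliseconds"]),
        "burst_max": int(headers["X-RateLimit-Max"]),
        "burst_remaining": int(headers["X-RateLimit-Remaining"]),
    }
```

Run this on every response and log the result keyed by (app, resource) so the deltas are visible over time.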

Design Patterns to Stay Under Limits

  • Token-bucket client. Track X-RateLimit-Remaining per (app, location). When it drops below a safety floor (say, 10), defer non-urgent requests.
  • Exponential back-off on 429. If you do exceed the burst limit, the API responds with a rate-limit error. Back off with jitter rather than retrying in a tight loop.
  • Prefer bulk/upsert endpoints over per-record loops. Contacts has bulk and upsert endpoints; using them turns N requests into 1.
  • Cache reference data. Pipelines, calendars, custom fields, tags — fetch once per session, not per record.
  • Batch webhook processing. When an inbound webhook triggers an outbound API call, debounce or batch so a burst of webhooks does not produce a burst of API calls.
  • Per-tenant queueing. Because limits are per Location/Company, the throttling layer should key on the resource ID, not the app.
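The token-bucket, back-off, and per-tenant patterns above can be sketched together. This assumes the client reads X-RateLimit-Remaining from each response; the safety floor of 10 and the (app, resource) key shape are illustrative choices from the bullets, not API requirements:

```python
import random
from collections import defaultdict

SAFETY_FLOOR = 10  # defer non-urgent requests below this many remaining

class TenantThrottle:
    """Throttle keyed per (app, resource), since limits do not pool across tenants."""

    def __init__(self):
        # Last-seen X-RateLimit-Remaining for each (app_id, resource_id) pair
        self.remaining = defaultdict(lambda: None)

    def update(self, key, headers):
        """Record burst capacity from the latest response for this tenant."""
        self.remaining[key] = int(headers["X-RateLimit-Remaining"])

    def should_defer(self, key, urgent=False):
        """True when non-urgent work should wait for the burst window to refill."""
        r = self.remaining[key]
        return (not urgent) and r is not None and r < SAFETY_FLOOR

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential back-off with full jitter for retrying after a 429."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Full jitter (a uniform draw up to the exponential ceiling) spreads retries across instances, which matters when many workers hit the same per-resource bucket.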

Monitoring Usage

The five headers are the only published telemetry surface. Common patterns:

  • Log every response’s X-RateLimit-Remaining and X-RateLimit-Daily-Remaining against (app, locationId) — this becomes a poor-man’s dashboard.
  • Emit a metric to your APM (Datadog, Sentry, etc.) on every call so trends are visible across deploys.
  • Set an alert when X-RateLimit-Daily-Remaining < 10,000 (5% of the 200,000 budget). Daily limits reset each calendar day.
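A sketch of the logging-plus-alert pattern above. `emit_metric` and `alert` stand in for whatever your APM client exposes; the 5% threshold mirrors the bullet and is a tunable choice:

```python
ALERT_FRACTION = 0.05  # fire when less than 5% of the daily budget remains

def check_daily_budget(headers, emit_metric, alert):
    """Emit remaining daily capacity as a metric and alert near exhaustion.

    emit_metric(name, value) and alert(message) are hypothetical hooks
    into your observability stack (Datadog, Sentry, etc.).
    """
    limit = int(headers["X-RateLimit-Limit-Daily"])
    remaining = int(headers["X-RateLimit-Daily-Remaining"])
    emit_metric("ghl.rate_limit.daily_remaining", remaining)
    if remaining < limit * ALERT_FRACTION:
        alert(f"daily rate-limit budget low: {remaining}/{limit}")
```

Calling this on every response gives you the "poor-man's dashboard" without any extra polling.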

Try It

  1. Make any authenticated V2 API call (e.g., GET /contacts/) and inspect the response headers.
  2. Pipe the five X-RateLimit-* headers into your logger so every call is annotated with remaining capacity.
  3. Set up an alert in your APM that fires when X-RateLimit-Daily-Remaining drops below 5% of X-RateLimit-Limit-Daily for any (app, locationId) pair.
  4. Replace any per-record contact-update loop with the upsert endpoint to compress request count.
  5. Add jitter to your retry-on-429 logic — synchronized retries across instances make the next burst worse.
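Step 5 can be sketched as a retry wrapper. `send` is a placeholder for your HTTP call returning a (status, headers) pair; the attempt cap and base delay are illustrative defaults:

```python
import random
import time

def call_with_retry(send, max_attempts=5, base=1.0, cap=60.0):
    """Call `send` and retry on 429 with jittered exponential back-off.

    `send()` is any zero-argument callable returning (status_code, headers).
    The uniform jitter keeps synchronized instances from retrying in lockstep.
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    raise RuntimeError("still rate limited after max retries")
```

Wrap every outbound V2 call in something like this so a 429 never propagates as a hard failure on the first hit.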