Source: ai-research/ghl-2026-05-01/support-solutions-articles-48001060529-highlevel-api-documentation.md

HighLevel applies rate limits on its public V2 APIs to keep the platform stable. Limits are scoped per Marketplace app per resource (Location or Company), so a single agency installation does not consume budget that belongs to other tenants. Five X-RateLimit-* response headers expose remaining capacity on every call, which makes client-side throttling and back-off straightforward.

Key Takeaways

  • Burst limit: 100 requests per 10 seconds for each Marketplace app (i.e., client) per resource (i.e., Location or Company).
  • Daily limit: 200,000 requests per day for each Marketplace app per resource.
  • “Per resource” means per Location or Company. A multi-tenant Marketplace app installed on 50 sub-accounts has its own bucket for each sub-account — limits do not pool across tenants.
  • Limits apply to V2 APIs using OAuth. V1 reached end-of-support on December 31, 2025, and receives no further updates; V2 is the supported tier.
  • Five response headers report current usage on every call:
    • X-RateLimit-Limit-Daily — the daily limit
    • X-RateLimit-Daily-Remaining — remaining requests for the day
    • X-RateLimit-Interval-Milliseconds — the time interval for burst requests
    • X-RateLimit-Max — the maximum request limit in the specified time interval
    • X-RateLimit-Remaining — remaining requests in the current burst window
  • Treat headers as the source of truth. Read them on every response, log the deltas, and back off before you hit zero — there is no separate quota dashboard called out in the docs.
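A minimal sketch of reading those five headers into a usable structure. The header names come from the list above; the dict shape and function name are illustrative, not part of the API:

```python
def parse_rate_limit_headers(headers: dict) -> dict:
    """Extract rate-limit state from a V2 API response's header mapping."""
    return {
        "daily_limit": int(headers["X-RateLimit-Limit-Daily"]),
        "daily_remaining": int(headers["X-RateLimit-Daily-Remaining"]),
        "interval_ms": int(headers["X-RateLimit-Interval-Milliseconds"]),
        "burst_max": int(headers["X-RateLimit-Max"]),
        "burst_remaining": int(headers["X-RateLimit-Remaining"]),
    }
```

Run this on every response and log the result keyed by (app, resource) so the deltas are visible over time.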

Design Patterns to Stay Under Limits

  • Token-bucket client. Track X-RateLimit-Remaining per (app, location). When it drops below a safety floor (say, 10), defer non-urgent requests.
  • Exponential back-off on 429. If you do exceed the burst limit, the API responds with a rate-limit error. Back off with jitter rather than retrying in a tight loop.
  • Prefer bulk/upsert endpoints over per-record loops. Contacts has bulk and upsert endpoints; using them turns N requests into 1.
  • Cache reference data. Pipelines, calendars, custom fields, tags — fetch once per session, not per record.
  • Batch webhook processing. When an inbound webhook triggers an outbound API call, debounce or batch so a burst of webhooks does not produce a burst of API calls.
  • Per-tenant queueing. Because limits are per Location/Company, the throttling layer should key on the resource ID, not the app.
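The token-bucket, back-off, and per-tenant patterns above can be sketched together. This assumes the client reads X-RateLimit-Remaining from each response; the safety floor of 10 and the (app, resource) key shape are illustrative choices from the bullets, not API requirements:

```python
import random
from collections import defaultdict

SAFETY_FLOOR = 10  # defer non-urgent requests below this many remaining

class TenantThrottle:
    """Throttle keyed per (app, resource), since limits do not pool across tenants."""

    def __init__(self):
        # Last-seen X-RateLimit-Remaining for each (app_id, resource_id) pair
        self.remaining = defaultdict(lambda: None)

    def update(self, key, headers):
        """Record burst capacity from the latest response for this tenant."""
        self.remaining[key] = int(headers["X-RateLimit-Remaining"])

    def should_defer(self, key, urgent=False):
        """True when non-urgent work should wait for the burst window to refill."""
        r = self.remaining[key]
        return (not urgent) and r is not None and r < SAFETY_FLOOR

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential back-off with full jitter for retrying after a 429."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Full jitter (a uniform draw up to the exponential ceiling) spreads retries across instances, which matters when many workers hit the same per-resource bucket.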

Monitoring Usage

The five headers are the only published telemetry surface. Common patterns:

  • Log every response’s X-RateLimit-Remaining and X-RateLimit-Daily-Remaining against (app, locationId) — this becomes a poor-man’s dashboard.
  • Emit a metric to your APM (Datadog, Sentry, etc.) on every call so trends are visible across deploys.
  • Set an alert when X-RateLimit-Daily-Remaining < 10,000 (5% of the 200,000 budget). Daily limits reset each calendar day.
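A sketch of the logging-plus-alert pattern above. `emit_metric` and `alert` stand in for whatever your APM client exposes; the 5% threshold mirrors the bullet and is a tunable choice:

```python
ALERT_FRACTION = 0.05  # fire when less than 5% of the daily budget remains

def check_daily_budget(headers, emit_metric, alert):
    """Emit remaining daily capacity as a metric and alert near exhaustion.

    emit_metric(name, value) and alert(message) are hypothetical hooks
    into your observability stack (Datadog, Sentry, etc.).
    """
    limit = int(headers["X-RateLimit-Limit-Daily"])
    remaining = int(headers["X-RateLimit-Daily-Remaining"])
    emit_metric("ghl.rate_limit.daily_remaining", remaining)
    if remaining < limit * ALERT_FRACTION:
        alert(f"daily rate-limit budget low: {remaining}/{limit}")
```

Calling this on every response gives you the "poor-man's dashboard" without any extra polling.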

Try It

  1. Make any authenticated V2 API call (e.g., GET /contacts/) and inspect the response headers.
  2. Pipe the five X-RateLimit-* headers into your logger so every call is annotated with remaining capacity.
  3. Set up an alert in your APM that fires when X-RateLimit-Daily-Remaining drops below 5% of X-RateLimit-Limit-Daily for any (app, locationId) pair.
  4. Replace any per-record contact-update loop with the upsert endpoint to compress request count.
  5. Add jitter to your retry-on-429 logic — synchronized retries across instances make the next burst worse.
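Step 5 can be sketched as a retry wrapper. `send` is a placeholder for your HTTP call returning a (status, headers) pair; the attempt cap and base delay are illustrative defaults:

```python
import random
import time

def call_with_retry(send, max_attempts=5, base=1.0, cap=60.0):
    """Call `send` and retry on 429 with jittered exponential back-off.

    `send()` is any zero-argument callable returning (status_code, headers).
    The uniform jitter keeps synchronized instances from retrying in lockstep.
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    raise RuntimeError("still rate limited after max retries")
```

Wrap every outbound V2 call in something like this so a 429 never propagates as a hard failure on the first hit.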