Rate Limits
To ensure fair use and maintain high performance across all customers, Doti’s API enforces rate limits. This guide explains our policies and how to design resilient applications around them.
We enforce two types of rate limits:
Search API Rate Limits – control how many search requests can be made per minute
Request Token Limits – cap the size of an individual request payload
Search API Rate Limits
Doti applies a fixed-window rate limiting algorithm on search endpoints to manage usage.
Default Policy
Default Limit: 100 search requests per minute
Scope: Company-wide (shared quota)
Reset Interval: Every 60 seconds
Configurable?: Yes, contact support
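To make the policy above concrete, here is a minimal sketch of a fixed-window counter. It is illustrative only and is not Doti's server implementation. The class name, the clock-aligned window boundaries, and the returned dictionary keys are all assumptions made for this example.

```python
import math
import time
from typing import Callable, Dict, Optional


class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, not Doti's code)."""

    def __init__(self, capacity: int = 100, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.capacity = capacity
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_id: Optional[int] = None
        self.count = 0

    def check(self) -> Dict[str, object]:
        now = self.clock()
        # Assumption: windows are aligned to multiples of window_seconds.
        window_id = int(now // self.window_seconds)
        if window_id != self.window_id:
            # A new window has started, so the shared counter resets.
            self.window_id = window_id
            self.count = 0
        if self.count < self.capacity:
            self.count += 1
            return {"allowed": True,
                    "remaining": self.capacity - self.count,
                    "retry_after": 0}
        # Over quota: report the seconds left until the window rolls over.
        window_end = (window_id + 1) * self.window_seconds
        return {"allowed": False,
                "remaining": 0,
                "retry_after": math.ceil(window_end - now)}
```

Because the quota is shared company-wide, every client in the same company would draw from one such counter, so a burst from one integration can exhaust the window for the others.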
If the rate limit is exceeded, the API will respond with:
HTTP 429 Too Many Requests
🧾 Error Response
{
  "error": "Rate limit exceeded",
  "message": "Too many search requests. Please try again later.",
  "retryAfter": 59,
  "remaining": 0,
  "capacity": 100
}
Response Headers
X-RateLimit-Remaining: Remaining requests in the current window
X-RateLimit-Capacity: Max requests allowed per window
X-RateLimit-RetryAfter: Seconds until next request is accepted
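The headers and the error body carry the same three values, so a client can read them into one structure. The following is a sketch under assumptions: the `RateLimitStatus` type and `parse_rate_limit` function are hypothetical names, and the code treats any header as possibly absent because this guide does not state which responses include them.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    remaining: Optional[int]
    capacity: Optional[int]
    retry_after: Optional[int]


def _to_int(value: object) -> Optional[int]:
    """Convert a header or JSON value to int, or None if missing/invalid."""
    try:
        return int(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


def parse_rate_limit(headers: Mapping[str, str],
                     body: Optional[Mapping[str, object]] = None) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    body = body or {}

    def pick(header_name: str, body_key: str) -> Optional[int]:
        # Prefer the header; fall back to the JSON field from the 429 body.
        value = _to_int(lowered.get(header_name))
        return value if value is not None else _to_int(body.get(body_key))

    return RateLimitStatus(
        remaining=pick("x-ratelimit-remaining", "remaining"),
        capacity=pick("x-ratelimit-capacity", "capacity"),
        retry_after=pick("x-ratelimit-retryafter", "retryAfter"),
    )
```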
Request Token Limits
To ensure optimal performance and avoid heavy payloads, Doti limits each API call by token count, where a token is a chunk of text.
Token Limit Policy
Default Limit: 40,000 tokens
Trigger: Based on payload size before processing
If a request exceeds the limit, it will return an error:
Maximum tokens per request is 40000. Your messages resulted in 41234 tokens. Please reduce the length of the messages.
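A client can read the limit and the actual count out of this message to decide how much to trim. The sketch below assumes the error text keeps exactly the wording shown above, which this guide does not guarantee. The function names are hypothetical, and matching on message strings is inherently brittle.

```python
import re
from typing import Optional, Tuple

# Pattern based on the example error message; an assumption, not a contract.
_TOKEN_ERROR = re.compile(
    r"Maximum tokens per request is (\d+)\. "
    r"Your messages resulted in (\d+) tokens"
)


def parse_token_error(message: str) -> Optional[Tuple[int, int]]:
    """Return (limit, actual) if the message matches, otherwise None."""
    match = _TOKEN_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def tokens_to_trim(message: str) -> Optional[int]:
    """Number of tokens the request must shed to fit under the limit."""
    parsed = parse_token_error(message)
    if parsed is None:
        return None
    limit, actual = parsed
    return max(0, actual - limit)
```

For the example message above, `tokens_to_trim` returns 1234, meaning the request needs to be shortened by at least that many tokens before it is resent.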
Best Practices
To ensure smooth usage and prevent interruptions:
Implement Retry Logic
Handle 429 responses by waiting for the number of seconds given in the X-RateLimit-RetryAfter header or the retryAfter field in the response body before retrying. Use exponential backoff between repeated retries.
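One way to combine the server's retry hint with exponential backoff is sketched below. It is not an official client. The `send` callable and the `(status, headers, body)` tuple it returns are hypothetical stand-ins for whatever HTTP library is in use, and the base delay, cap, and attempt count are arbitrary example values.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, parsed JSON body).
Response = Tuple[int, Mapping[str, str], Mapping[str, object]]


def wait_seconds(headers: Mapping[str, str], body: Mapping[str, object],
                 attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Prefer the server's hint; otherwise use capped backoff with jitter."""
    lowered = {name.lower(): value for name, value in headers.items()}
    hint = lowered.get("x-ratelimit-retryafter")
    if hint is None:
        hint = body.get("retryAfter")
    try:
        return max(0.0, float(hint))  # type: ignore[arg-type]
    except (TypeError, ValueError):
        # No usable hint: exponential backoff with full jitter.
        return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, body = response
        if status != 429:
            return response
        if attempt < max_attempts - 1:
            sleep(wait_seconds(headers, body, attempt))
    return response  # still 429 after the final attempt
```

Injecting `sleep` keeps the function easy to test, and the jitter in the fallback path helps avoid many clients retrying at the same instant when the window resets.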
Monitor Rate Usage
Use the X-RateLimit-Remaining header to throttle requests dynamically based on usage.
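As a rough sketch of dynamic throttling, the helper below turns the remaining and capacity values into a pause before the next request. The function name, the 20% low-water mark, and the pacing rule are assumptions chosen for illustration. Since the client does not know how far into the window it is, the rule is deliberately conservative and spreads the remaining quota over a full window.

```python
def pacing_delay(remaining: int, capacity: int,
                 window_seconds: float = 60.0,
                 low_water: float = 0.2) -> float:
    """Seconds to pause before the next request (hypothetical policy)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if remaining <= 0:
        # Quota exhausted: wait out a full window in the worst case.
        return window_seconds
    if remaining / capacity > low_water:
        # Plenty of quota left, no throttling needed.
        return 0.0
    # Running low: spread the remaining requests across one window.
    return window_seconds / remaining
```

With the default limit of 100, a client that sees 10 requests remaining would pause about 6 seconds between calls, while one that sees 80 remaining would not pause at all.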
Cache Results
Avoid redundant queries by caching frequent responses where applicable.
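A small time-based cache is one way to avoid repeating identical searches. This is a minimal sketch: the `TTLCache` class, the `fetch` callable, and the five-minute lifetime are all hypothetical, and whether cached results are acceptable depends on how fresh the data needs to be.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache search results per query for a fixed time (illustrative)."""

    def __init__(self, fetch: Callable[[str], Any],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.fetch = fetch            # function that performs the real search
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, query: str) -> Any:
        # Collapse whitespace so trivially different queries share an entry.
        key = " ".join(query.split())
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]           # fresh hit, no API request spent
        value = self.fetch(key)
        self._entries[key] = (now, value)
        return value
```

Each cache hit saves one request from the shared per-minute quota, which matters most for queries that many users repeat.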
Fail-Open Policy
In rare cases of internal issues, our rate-limiting infrastructure fails open, allowing requests to pass through temporarily to avoid disruptions.
Need Higher Limits?
For high-throughput use cases or enterprise-scale automation, contact us to explore custom limits for your workspace.