Ingest API - Document Ingestion
The Documents API allows you to ingest documents into a custom collection. It provides two endpoints: one for ingesting a single document and one for ingesting multiple documents in a batch.
Authentication
All requests to the Doti API require a Bearer token. Include it in the Authorization header:
Authorization: Bearer {YOUR_ACCESS_TOKEN}
❗ Tokens must be generated by the Doti team. They are not yet user-creatable via the Portal.
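As an illustration, the header can be built with a small helper. This is a minimal sketch: the function name auth_headers is hypothetical, and the Content-Type value is an assumption based on the JSON request bodies shown in this document.

```python
def auth_headers(access_token: str) -> dict[str, str]:
    """Build request headers carrying the Bearer token."""
    if not access_token:
        raise ValueError("access_token must be a non-empty string")
    return {
        "Authorization": f"Bearer {access_token}",
        # Assumption: request bodies are JSON, so a JSON content type is sent.
        "Content-Type": "application/json",
    }
```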
API Endpoints
Ingest a Single Document into a Custom Collection
Use this endpoint to add or update a single document in a specified custom collection. Best for quick ingestion with immediate feedback.
Endpoint:
POST https://api.doti.ai/api/v2/collections/:collectionId/documents
URL Parameters
collectionId (string): The external identifier of the collection.
Request Body
The body must contain a single document. Max payload size is 10MB.
pageContent (string, required): The content of the document.
metadata (object, required): Metadata object for the document.
Metadata fields:
id (string, required): A unique identifier for the document.
url (string, required): The document’s source URL.
urlDescription (string, required): A short description of the URL.
lastModified (string, required): Last modified timestamp (ISO 8601).
additionalData (object, required): Additional metadata (e.g., author).
score (number, optional): Optional relevance score.
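The required fields listed above can be checked on the client before a request is sent. The following is a rough sketch, not the API's own validation: find_missing_fields is a hypothetical helper, and it only checks presence, not types or timestamp format.

```python
# Required metadata fields, as listed in the field reference above.
REQUIRED_METADATA_FIELDS = ("id", "url", "urlDescription", "lastModified", "additionalData")


def find_missing_fields(document: dict) -> list[str]:
    """Return the names of required fields that are absent from a document."""
    missing = []
    content = document.get("pageContent")
    # An empty string is treated as missing, mirroring the EMPTY_CONTENT error.
    if not isinstance(content, str) or not content:
        missing.append("pageContent")
    metadata = document.get("metadata")
    if not isinstance(metadata, dict):
        missing.append("metadata")
        return missing
    for name in REQUIRED_METADATA_FIELDS:
        if name not in metadata:
            missing.append(f"metadata.{name}")
    return missing
```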
Example Request
{
  "pageContent": "This is the content of the document.",
  "metadata": {
    "id": "doc-1",
    "url": "http://example.com/doc-1",
    "urlDescription": "Document Example",
    "lastModified": "2023-01-01T12:00:00Z",
    "additionalData": { "author": "John Doe" }
  }
}
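A request like the one above could be assembled in Python with the standard library. This is a sketch under stated assumptions: build_single_ingest_request is a hypothetical helper, the Content-Type header is assumed, and "10MB" is read conservatively as 10,000,000 bytes. The function only builds the request and does not send it.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.doti.ai/api/v2"
# Assumption: a conservative reading of the documented 10MB payload limit.
MAX_PAYLOAD_BYTES = 10_000_000


def build_single_ingest_request(
    collection_id: str, document: dict, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the single-document endpoint."""
    body = json.dumps(document).encode("utf-8")
    if len(body) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds the 10MB limit")
    # Percent-encode the collection id so it is safe as a path segment.
    url = f"{BASE_URL}/collections/{quote(collection_id, safe='')}/documents"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",  # assumed, not stated in the docs
        },
    )
```

The returned object can be passed to urllib.request.urlopen to perform the call. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the error bodies must be read from the exception.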
Example Responses
201 Created
{
  "message": "Document processed successfully",
  "success": true,
  "documentId": "doc-1"
}
400 Bad Request
{
  "message": "Document content is empty",
  "success": false,
  "documentId": "doc-1",
  "error": "Document content cannot be empty",
  "errorCode": "EMPTY_CONTENT"
}
409 Conflict
{
  "message": "Document id doc-1 already exists, please use a different id",
  "success": false,
  "documentId": "doc-1",
  "error": "Document already exists in another collection",
  "errorCode": "DOCUMENT_CONFLICT"
}
500 Internal Server Error
{
  "message": "Document processing failed due to an internal error",
  "success": false,
  "documentId": "doc-1",
  "error": "Internal processing error",
  "errorCode": "PROCESSING_ERROR"
}
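Client code typically branches on the status codes shown above. The sketch below maps each documented response to a coarse category; classify_single_response and its category labels are hypothetical names chosen for illustration, and any status code not listed in this document falls through to "unexpected".

```python
def classify_single_response(status_code: int, body: dict) -> str:
    """Map a single-document response to a coarse outcome label."""
    if status_code == 201 and body.get("success") is True:
        return "ingested"
    if status_code == 400:
        return "invalid_request"  # e.g. errorCode EMPTY_CONTENT
    if status_code == 409:
        return "id_conflict"  # errorCode DOCUMENT_CONFLICT
    if status_code >= 500:
        return "server_error"  # e.g. errorCode PROCESSING_ERROR
    return "unexpected"
```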
Ingest Multiple Documents into a Custom Collection (Batch)
Use this endpoint to add or update multiple documents in a collection at once. Best for efficiency when handling bulk ingestion.
Endpoint:
POST https://api.doti.ai/api/v2/collections/:collectionId/documents/batch
URL Parameters
collectionId (string): The external identifier of the collection.
Request Body
The body must contain an array of documents. Max 100 documents per request, max payload 10MB.
documents (object[], required): Array of document objects.
Each document has the same structure as the single ingestion request.
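A larger document set has to be split so that each request respects both limits. One possible approach is sketched below; split_into_batches and payload_size are hypothetical helpers, and "10MB" is again read conservatively as 10,000,000 bytes. Re-serializing the candidate batch on every step is simple rather than fast.

```python
import json
from typing import Iterator

MAX_DOCS_PER_BATCH = 100
# Assumption: a conservative reading of the documented 10MB payload limit.
MAX_PAYLOAD_BYTES = 10_000_000


def payload_size(documents: list[dict]) -> int:
    """Size in bytes of the JSON body that would carry these documents."""
    return len(json.dumps({"documents": documents}).encode("utf-8"))


def split_into_batches(documents: list[dict]) -> Iterator[list[dict]]:
    """Yield consecutive batches that stay within the count and size limits."""
    batch: list[dict] = []
    for doc in documents:
        candidate = batch + [doc]
        too_many = len(candidate) > MAX_DOCS_PER_BATCH
        too_big = payload_size(candidate) > MAX_PAYLOAD_BYTES
        if batch and (too_many or too_big):
            yield batch
            # A single oversized document still ends up alone in a batch;
            # it would have to be handled separately by the caller.
            batch = [doc]
        else:
            batch = candidate
    if batch:
        yield batch
```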
Example Request
{
  "documents": [
    {
      "pageContent": "This is the content of the first document.",
      "metadata": {
        "id": "doc-1",
        "url": "http://example.com/doc-1",
        "urlDescription": "First Document",
        "lastModified": "2023-01-01T12:00:00Z",
        "additionalData": { "author": "John Doe" }
      }
    },
    {
      "pageContent": "This is the content of the second document.",
      "metadata": {
        "id": "doc-2",
        "url": "http://example.com/doc-2",
        "urlDescription": "Second Document",
        "lastModified": "2023-01-02T15:30:00Z",
        "additionalData": { "author": "Jane Smith" }
      }
    }
  ]
}
Example Responses
202 Accepted
{
  "message": "All documents processed successfully",
  "processedCount": 2,
  "faultyDocuments": []
}
207 Multi-Status
{
  "message": "1 documents processed successfully, 1 documents failed",
  "processedCount": 1,
  "faultyDocuments": [
    {
      "documentId": "doc-2",
      "error": "Document content is empty",
      "errorCode": "EMPTY_CONTENT",
      "metadata": {
        "url": "http://example.com/doc-2"
      }
    }
  ]
}
400 Bad Request
{
  "message": "0 documents processed successfully, 2 documents failed",
  "processedCount": 0,
  "faultyDocuments": [
    {
      "documentId": "doc-1",
      "error": "Document id doc-1 already exists, please use a different id",
      "errorCode": "DOCUMENT_CONFLICT",
      "metadata": {
        "existingCollections": ["other-collection-id"],
        "targetCollection": "internal-collection-id"
      }
    },
    {
      "documentId": "doc-2",
      "error": "Schema validation failed",
      "errorCode": "VALIDATION_ERROR",
      "metadata": {
        "validationErrors": []
      }
    }
  ]
}
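Because a batch can partially succeed, callers usually need to know which documents failed and why. The sketch below groups the entries of faultyDocuments by errorCode; group_failures_by_code is a hypothetical helper, and the "UNKNOWN" fallback is an assumption for entries without an errorCode.

```python
from collections import defaultdict


def group_failures_by_code(response_body: dict) -> dict[str, list[str]]:
    """Group failed document ids from a batch response by their errorCode."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for faulty in response_body.get("faultyDocuments", []):
        code = faulty.get("errorCode", "UNKNOWN")  # fallback label is an assumption
        grouped[code].append(faulty.get("documentId"))
    return dict(grouped)
```

For the 400 example above, this returns one entry for DOCUMENT_CONFLICT containing doc-1 and one for VALIDATION_ERROR containing doc-2.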
Rate Limiting
The Documents API uses a points-based rate limiting system:
Single Document Endpoint: 1 point per request
Batch Endpoint: 10 points per request
Example (with 100 points/minute limit):
100 single-document requests, OR
10 batch requests, OR
A mix (e.g., 50 single + 5 batch).
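The arithmetic behind these combinations can be written out directly. In this sketch, points_used and fits_budget are hypothetical helpers, and the default limit of 100 points per minute is only the example figure used above, not a documented quota.

```python
SINGLE_REQUEST_POINTS = 1
BATCH_REQUEST_POINTS = 10


def points_used(single_requests: int, batch_requests: int) -> int:
    """Total rate-limit points consumed by a mix of requests."""
    return single_requests * SINGLE_REQUEST_POINTS + batch_requests * BATCH_REQUEST_POINTS


def fits_budget(single_requests: int, batch_requests: int, limit: int = 100) -> bool:
    """Check whether a planned mix of requests stays within the per-minute limit."""
    return points_used(single_requests, batch_requests) <= limit
```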
When to Use Which Endpoint
Single Document (/documents) → Best for individual documents, simple error handling, immediate feedback.
Batch (/documents/batch) → Best for multiple documents, efficient API usage, but allows partial success scenarios.
Both endpoints ensure documents are validated, stored, and associated with the specified custom collection.