/store-document

The /store-document endpoint processes a document and stores its embeddings in the user’s Pinecone database. Supported document types include PDFs, GitHub repositories, YouTube videos, and websites.

Request

URL: /store-document
Method: POST
Headers:
- x-api-key (string): The Ragapi API key required for authorization.

Request Body Parameters

Parameter	Type	Required	Description
`documentUrl`	`string`	Yes	The URL of the document to be stored (PDF, GitHub repo, YouTube video, or website).
`pineconeIndexName`	`string`	Yes	The name of the Pinecone index where embeddings will be stored.
`contextType`	`string`	Yes	Specifies the type of document being stored. Must be one of `pdf`, `github_repo`, `youtube_video`, or `website`.
`crawledPagesLimit`	`number`	No	The maximum number of pages to crawl (for websites only).
`githubAccessToken`	`string`	No	Access token for accessing private GitHub repositories.
`githubBranch`	`string`	No	GitHub branch name to retrieve content from (if not provided, defaults to `main`).
`pineconeNamespace`	`string`	No	Custom namespace within Pinecone for organizing stored embeddings. If not provided, a new namespace will be generated automatically.
`waitToBeIndexed`	`boolean`	No	Defaults to `true`. Data stored in Pinecone needs a few moments to become queryable, so by default the request waits until indexing finishes. If you don't need to query the document immediately, pass `false` and the request will return sooner.
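
The parameters above can be combined as in the following minimal sketch. `buildStoreDocumentBody` is a hypothetical helper, not part of Ragapi. It checks the required fields and the allowed `contextType` values from the table. Dropping `crawledPagesLimit` and the GitHub fields when they don't match the `contextType` is an assumption made here for tidiness, not documented API behavior.

```javascript
// Hypothetical helper (not part of Ragapi): builds a /store-document request body.
const CONTEXT_TYPES = ["pdf", "github_repo", "youtube_video", "website"]

function buildStoreDocumentBody(options) {
  const { documentUrl, pineconeIndexName, contextType } = options

  // Required parameters from the table above.
  if (!documentUrl || !pineconeIndexName) {
    throw new Error("documentUrl and pineconeIndexName are required")
  }
  if (!CONTEXT_TYPES.includes(contextType)) {
    throw new Error(`contextType must be one of: ${CONTEXT_TYPES.join(", ")}`)
  }

  const body = { documentUrl, pineconeIndexName, contextType }

  // Optional parameters that apply to every document type.
  if (options.pineconeNamespace !== undefined) {
    body.pineconeNamespace = options.pineconeNamespace
  }
  if (options.waitToBeIndexed !== undefined) {
    body.waitToBeIndexed = options.waitToBeIndexed
  }

  // Assumption: type-specific parameters are only sent with the matching contextType.
  if (contextType === "website" && options.crawledPagesLimit !== undefined) {
    body.crawledPagesLimit = options.crawledPagesLimit
  }
  if (contextType === "github_repo") {
    if (options.githubAccessToken !== undefined) {
      body.githubAccessToken = options.githubAccessToken
    }
    if (options.githubBranch !== undefined) {
      body.githubBranch = options.githubBranch
    }
  }

  return body
}

// Example: a website crawl limited to 10 pages (placeholder URL and index name).
const websiteBody = buildStoreDocumentBody({
  documentUrl: "https://example.com",
  pineconeIndexName: "YOUR_PINECONE_INDEX",
  contextType: "website",
  crawledPagesLimit: 10,
  githubBranch: "dev", // dropped by this helper: not a github_repo request
})
console.log(JSON.stringify(websiteBody))
```

The returned object can be passed to `JSON.stringify` as the `body` of the POST request.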

Sample Request

const serviceUrl = "https://api.ragapi.tech/store-document"
const apiKey = "YOUR_RAGAPI_API_KEY"
const pineconeIndexName = "YOUR_PINECONE_INDEX"
const documentUrl =
  "https://core.ragapi.tech/storage/v1/object/public/ragapi-public/example.pdf"

const response = await fetch(serviceUrl, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": apiKey,
  },
  body: JSON.stringify({
    pineconeIndexName,
    documentUrl,
    contextType: "pdf",
  }),
})

const result = await response.json()

// A Pinecone index is partitioned into namespaces.
// Pass this namespace when you query the stored document.
console.log(result.data.pineconeNamespace)

Response

Field	Type	Description
`success`	`boolean`	Indicates if the request was successful.
`data.contextType`	`string`	The document type (e.g., `pdf`, `github_repo`, `youtube_video`, `website`).
`data.storedDocumentId`	`string`	ID of the stored document.
`data.tokensUsed`	`number`	Tokens used during the embedding process.
`data.pineconeNamespace`	`string`	The Pinecone namespace where the embeddings were stored. If no namespace was provided in the request, this is the autogenerated one.
`data.title`	`string`	(Optional) Title of the document if applicable (e.g., YouTube video or website title).
`data.thumbnail`	`string`	(Optional) Thumbnail URL of the document (for YouTube videos).
`data.description`	`string`	(Optional) Description of the document.
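
As a rough sketch of how the response fields might be consumed, the hypothetical `summarizeStoreResult` helper below checks `success` and fills in `null` for the optional fields. The shape of an unsuccessful response is not documented here, so the error handling is an assumption, and the sample payload is made up for illustration.

```javascript
// Hypothetical helper (not part of Ragapi): extracts the documented fields
// from a parsed /store-document response.
function summarizeStoreResult(result) {
  // Assumption: an unsuccessful response has `success` set to something other than true.
  if (!result || result.success !== true || !result.data) {
    throw new Error("store-document request did not succeed")
  }

  const { data } = result
  return {
    namespace: data.pineconeNamespace,
    documentId: data.storedDocumentId,
    tokensUsed: data.tokensUsed,
    // Optional fields: fall back to null when absent.
    title: data.title ?? null,
    thumbnail: data.thumbnail ?? null,
    description: data.description ?? null,
  }
}

// Made-up payload for illustration only.
const sampleResult = {
  success: true,
  data: {
    contextType: "pdf",
    storedDocumentId: "doc_123",
    tokensUsed: 1200,
    pineconeNamespace: "ns_abc",
  },
}

const summary = summarizeStoreResult(sampleResult)
console.log(summary.namespace, summary.title) // prints "ns_abc null"
```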

Additional Notes on Key Parameters

pineconeIndexName: This specifies the index within Pinecone where your document embeddings are stored. The index must have a dimensionality of 3072 to be compatible with the text-embedding-3-large model. Ensuring the correct dimensionality is essential for accurate and performant querying of the stored data.
pineconeNamespace: The namespace parameter provides a way to organize and isolate document embeddings within Pinecone. Documents within a namespace are accessible only by queries using the same namespace, enhancing security and data organization. To group multiple documents together, set the same pineconeNamespace for each, making them accessible within that namespace. If pineconeNamespace is not provided, an autogenerated namespace will be assigned.
waitToBeIndexed: Pinecone requires a few moments to index data before it becomes queryable. By default, waitToBeIndexed is set to true, meaning the function will wait until the document is fully indexed before returning a response. This setting ensures the document is immediately ready for queries. If you don’t need instant querying and prefer faster storage, set waitToBeIndexed to false to skip the wait time and enhance performance. You can read more in Pinecone's docs.
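
To illustrate the grouping described above, here is a minimal sketch that prepares request bodies for several documents sharing one namespace and skipping the indexing wait. `buildSharedNamespaceBodies`, the example URLs, and the `team-handbook` namespace are all hypothetical.

```javascript
// Hypothetical helper (not part of Ragapi): prepares one request body per
// document, all targeting the same index and namespace.
function buildSharedNamespaceBodies(documents, pineconeIndexName, pineconeNamespace) {
  return documents.map(({ documentUrl, contextType }) => ({
    documentUrl,
    contextType,
    pineconeIndexName,
    pineconeNamespace, // same value for every document groups them together
    waitToBeIndexed: false, // skip the indexing wait for faster storage
  }))
}

const bodies = buildSharedNamespaceBodies(
  [
    { documentUrl: "https://example.com/handbook.pdf", contextType: "pdf" },
    { documentUrl: "https://example.com", contextType: "website" },
  ],
  "YOUR_PINECONE_INDEX",
  "team-handbook"
)
console.log(bodies.map((body) => body.pineconeNamespace))
```

Each body would be sent in its own POST request with the same headers as in the sample request. Because `waitToBeIndexed` is `false` here, the documents may need a few moments before they are queryable.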
