
/conversation

The /conversation endpoint enables interaction with stored documents via natural language queries. It retrieves relevant information from the stored embeddings and generates AI-based responses.

Request

  • URL: /conversation
  • Method: POST

Headers:

  • x-api-key (string): The Ragapi API key required for authorization.

Request Body Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pineconeIndexName | string | Yes | The name of the Pinecone index where embeddings were stored. |
| pineconeNamespace | string | Yes | The namespace in Pinecone for the relevant document embeddings. |
| query | string | Yes | The question or query to ask based on the stored document. |
| streaming | boolean | No | Whether the response should be streamed (default: false). |
| model | string | No | Model version to use, either gpt-4o or gpt-4o-mini. Defaults to gpt-4o. |
| chatHistory | array of objects | No | Previous conversation context in this format: [ { role: "user", content: "question" }, { role: "assistant", content: "answer" } ]. |
| tone | string | No | Specifies the desired tone for the response. Options include professional, friendly, creative, witty, etc. Default is neutral. More below. |
| maxTokensRetriever | number | No | The maximum token limit for contextualizing queries in the history-aware retriever. Default: 1500. Max: 4000. |
| maxTokensAnswer | number | No | The maximum token limit for generating the final AI response. Default: 1500. Max: 4000. |
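The chatHistory format above can be maintained with a small helper across turns. A minimal sketch (the appendTurn helper is illustrative, not part of the API):

```javascript
// Append one user/assistant exchange to a chat history array
// in the shape the /conversation endpoint expects.
function appendTurn(chatHistory, question, answer) {
  return [
    ...chatHistory,
    { role: "user", content: question },
    { role: "assistant", content: answer },
  ]
}

// Build up history across turns; pass the result as chatHistory
// in the next request body.
let history = []
history = appendTurn(
  history,
  "Where does Ignatius the blockchain come from?",
  "The Northern Highlands."
)
console.log(history.length) // 2: one user entry, one assistant entry
```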

Sample Request

const serviceUrl = "https://api.ragapi.tech/conversation"
const apiKey = "YOUR_RAGAPI_API_KEY"
const pineconeIndexName = "YOUR_PINECONE_INDEX"
const pineconeNamespace = "NAMESPACE_FROM_PREVIOUS_STEP"
const query = "Where does Ignatius the blockchain come from?"

const response = await fetch(serviceUrl, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": apiKey,
  },
  body: JSON.stringify({
    pineconeIndexName,
    pineconeNamespace,
    query,
  }),
})

const data = await response.json()

// Should contain "Northern Highlands"
console.log(data.response)
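When streaming: true is set, the body arrives incrementally rather than as one JSON object. The exact wire format is not specified here, so the sketch below simply assumes a stream of UTF-8 text chunks and drains it with the standard ReadableStream reader API; adjust the parsing if the endpoint actually uses SSE or JSON lines:

```javascript
// Collect a streamed fetch response body into a single string.
// Assumes plain UTF-8 text chunks (an assumption, not a documented format).
async function collectStream(readable) {
  const reader = readable.getReader()
  const decoder = new TextDecoder()
  let text = ""
  for (;;) {
    const { done, value } = await reader.read()
    if (done) break
    text += decoder.decode(value, { stream: true })
  }
  return text + decoder.decode() // flush any trailing bytes
}

// Usage with a streaming request:
// const res = await fetch(serviceUrl, { ...options, body: JSON.stringify({ ...fields, streaming: true }) })
// const fullAnswer = await collectStream(res.body)
```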

Response

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Indicates if the request was successful. |
| data.answer | string | The AI-generated answer based on the stored document embeddings. |

Errors

  • 400: Invalid parameters or missing required fields.
  • 500: Unexpected error during conversation processing.
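A thin wrapper can surface these error cases instead of failing silently. A sketch (the askDocument name is illustrative; field names follow the tables above — note the sample request reads data.response while the response table lists data.answer, so both spellings are checked):

```javascript
// Send a query and turn the documented 400/500 cases into thrown errors.
async function askDocument(serviceUrl, apiKey, body) {
  const res = await fetch(serviceUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    body: JSON.stringify(body),
  })
  if (res.status === 400) throw new Error("Invalid parameters or missing required fields")
  if (res.status === 500) throw new Error("Unexpected error during conversation processing")
  const payload = await res.json()
  if (!payload.success) throw new Error("Request reported failure")
  // Prefer the documented data.answer field, falling back to response.
  return payload.data?.answer ?? payload.response
}
```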

Available tone Options

The tone parameter allows users to specify the tone or style of the AI-generated responses. Below are the available options and their descriptions:

| Value | Description |
| --- | --- |
| neutral | (Default) A neutral and balanced tone. |
| professional | Formal and precise, suitable for business or academic contexts. |
| friendly | Warm and conversational, ideal for general audiences. |
| creative | Imaginative and engaging, good for brainstorming or storytelling. |
| witty | Playful and humorous, adding a lighthearted touch to the responses. |
| encouraging | Supportive and motivational, inspiring confidence and positivity. |
| critical | In-depth and analytical, focusing on detailed examination and nuanced insight. |
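Since an unlisted value isn't documented to do anything useful, it can be worth validating the tone client-side before sending it. A minimal sketch using the documented values:

```javascript
// Tones documented for the /conversation endpoint.
const TONES = [
  "neutral", "professional", "friendly",
  "creative", "witty", "encouraging", "critical",
]

// Fall back to the documented default rather than sending an unknown tone.
function resolveTone(tone) {
  return TONES.includes(tone) ? tone : "neutral"
}

console.log(resolveTone("witty"))     // "witty"
console.log(resolveTone("sarcastic")) // not documented, so "neutral"
```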

Details About maxTokensRetriever and maxTokensAnswer

These two parameters allow you to fine-tune the balance between context awareness and response depth:

maxTokensRetriever

  • Purpose: Controls the maximum token limit allocated for the history-aware retriever. It determines how much of the user’s query and chat history is used to formulate a refined question for retrieving relevant embeddings.
  • Default: 1500
  • Maximum: 4000 (Use with caution — see below)
  • Impacts:
    • Higher Values: Suited to complex queries with lengthy chat histories; the retriever can draw on more of the conversation when reformulating the question, at the cost of increased latency and spend.
    • Lower Values: Efficient for short or direct queries, saving resources and ensuring faster responses.

maxTokensAnswer

  • Purpose: Sets the maximum token limit for the final answer generation. This dictates how detailed and comprehensive the response can be.
  • Default: 1500
  • Maximum: 4000 (Use with caution — see below)
  • Impacts:
    • Higher Values: Ideal for queries requiring extensive, detailed responses. However, this increases API costs and might slow down response times.
    • Lower Values: Results in concise and focused responses. Cost-efficient but risks truncation for complex questions.
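Because both parameters share the same bounds, it can be worth clamping them client-side before building a request. A sketch assuming the documented default (1500) and maximum (4000):

```javascript
// Clamp a token-limit parameter to the documented range.
// Undefined or non-numeric input falls back to the documented default.
function clampTokenLimit(value, { min = 1, max = 4000, fallback = 1500 } = {}) {
  if (typeof value !== "number" || Number.isNaN(value)) return fallback
  return Math.min(max, Math.max(min, Math.floor(value)))
}

console.log(clampTokenLimit(undefined)) // 1500 (default)
console.log(clampTokenLimit(9000))      // 4000 (capped at the maximum)
console.log(clampTokenLimit(800))       // 800 (already in range)
```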

Note on Maximum Values

While both maxTokensRetriever and maxTokensAnswer have a maximum limit of 4000, these values should be used cautiously:

  1. Performance Implications:
    • Latency: Requests with higher token limits take longer to process, potentially leading to slower responses.
  2. Cost Considerations:
    • Token Usage: Since pricing for GPT models is token-based, higher token limits result in higher costs. A request using 4000 tokens for both the retriever and answer generation can quickly deplete user credits or incur substantial billing.
  3. Recommendations:
    • Optimize for Use Case: Use higher limits (close to 4000) only for tasks requiring deep context or lengthy, detailed responses.
    • Default Limits for Most Cases: Keep 1500 or lower for general use cases to maintain a balance between cost and performance.
    • Monitor and Notify: Provide users with tools to track their token usage and alert them if they approach excessive consumption levels.
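Tracking usage can start from a rough client-side estimate. A common heuristic is roughly four characters per token for English text — an approximation only, not an exact tokenizer:

```javascript
// Very rough token estimate (~4 characters per token for English).
// Use a real tokenizer for billing-accurate counts.
function estimateTokens(text) {
  return Math.ceil(text.length / 4)
}

// Estimate how much of the retriever budget a chat history consumes.
const turns = [
  { role: "user", content: "Where does Ignatius the blockchain come from?" },
  { role: "assistant", content: "The Northern Highlands." },
]
const total = turns.reduce((sum, turn) => sum + estimateTokens(turn.content), 0)
console.log(total) // 18 with the strings above
```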

By thoughtfully managing token limits, you can maximize the utility of the API while ensuring efficiency and cost-effectiveness.