Pinecone Terms
When using Ragapi with Pinecone, two key concepts are central to organizing and retrieving your data: indexes and namespaces. Here’s how they work and when to use them.
What is a Pinecone Index?
A Pinecone index is a container for storing data as vectors, allowing for efficient storage and retrieval. Ragapi uses indexes to save document embeddings and facilitate fast queries.
When to Create a New Index
You do not need a new index to separate concerns; that is what namespaces are for. A new index is necessary only when:
- Completely Different Use Case: You’re working on a project or use case that is entirely separate and doesn’t overlap with your existing data.
- Index Capacity: Your current index is full and cannot accommodate more data.
Avoid Overusing Indexes
For most situations, namespaces are sufficient for organizing and isolating data within a single index. Using namespaces instead of creating multiple indexes helps keep your setup simple and efficient.
What is a Namespace?
A namespace is a logical subdivision within an index. It acts like a folder, grouping related data within the same index while keeping it isolated for specific use cases.
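To make the folder analogy concrete, here is a toy in-memory model of an index with namespaces. It is only an illustration of the idea, not how Pinecone or Ragapi are implemented; the `ToyIndex` class and its method names are made up for this sketch.

```python
import math
from collections import defaultdict


class ToyIndex:
    """Toy stand-in for a vector index (hypothetical, for illustration only)."""

    def __init__(self):
        # Each namespace maps record IDs to vectors, like a folder of records.
        self._namespaces = defaultdict(dict)

    def upsert(self, namespace, record_id, vector):
        """Store a vector under the given namespace."""
        self._namespaces[namespace][record_id] = vector

    def query(self, namespace, vector, top_k=3):
        """Return the IDs of the most similar records in one namespace."""
        # Only records in the requested namespace are candidates.
        records = self._namespaces.get(namespace, {})
        ranked = sorted(
            records.items(),
            key=lambda item: _cosine(vector, item[1]),
            reverse=True,
        )
        return [record_id for record_id, _ in ranked[:top_k]]


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


index = ToyIndex()
index.upsert("user-a", "doc-1", [1.0, 0.0])
index.upsert("user-b", "doc-2", [1.0, 0.0])

# Both vectors are identical, but the query only sees the "user-a" namespace.
matches = index.query("user-a", [1.0, 0.0])
```

Even though both records hold the same vector, a query against one namespace never returns records stored in another.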
Key Characteristics of Namespaces
- Unique: Every namespace name is unique within its index, so each name refers to exactly one set of data.
- Automatic Creation: When you call the /store-document endpoint, if no namespace is specified, Ragapi automatically creates one for you.
- Data Sharing Within a Namespace: Using the same namespace for multiple /store-document requests adds all those documents to that namespace, making them available for querying in the same context.
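As a rough sketch of the data-sharing behavior, the snippet below builds (but does not send) two /store-document requests that reuse one namespace. The base URL and the JSON field names `namespace` and `url` are assumptions made for illustration, and authentication is omitted; check the Ragapi reference for the actual request shape.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute your real Ragapi base URL.
BASE_URL = "https://ragapi.example.com"


def build_store_request(namespace, document_url):
    """Build (but do not send) a POST request for /store-document."""
    # Assumption: "namespace" and "url" are hypothetical field names.
    payload = {"namespace": namespace, "url": document_url}
    return urllib.request.Request(
        f"{BASE_URL}/store-document",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Reusing the same namespace groups both documents into one query context.
handbook_requests = [
    build_store_request("team-handbook", url)
    for url in ("https://example.com/a.pdf", "https://example.com/b.pdf")
]
```

Passing each request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.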
Why Use Namespaces?
Namespaces are ideal for separating concerns and managing data efficiently:
- Data Isolation: For private or user-specific data, assign a unique namespace (e.g., user123-private-docs). This keeps that user's data separate from everyone else's.
- Combined Context: Adding multiple documents to the same namespace makes them all available as a unified context for queries.
- Targeted Queries: When calling the /conversation endpoint, you must specify a namespace to tell Ragapi which subset of data to query.
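Because /conversation needs a namespace, a client can check for one before building the request. The following is a minimal sketch under stated assumptions: the base URL and the `namespace` and `question` field names are hypothetical, and authentication is omitted.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute your real Ragapi base URL.
BASE_URL = "https://ragapi.example.com"


def build_conversation_request(namespace, question):
    """Build (but do not send) a POST request for /conversation."""
    # Fail early on the client side if no namespace was given.
    if not namespace or not namespace.strip():
        raise ValueError("a namespace is required for /conversation")
    # Assumption: "namespace" and "question" are hypothetical field names.
    payload = {"namespace": namespace, "question": question}
    return urllib.request.Request(
        f"{BASE_URL}/conversation",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_conversation_request(
    "user123-private-docs", "What does the contract say about renewal?"
)
```

Validating locally avoids a round trip for a request that would target no data.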
Example Use Cases
- User Data: Use namespaces like user-{{UUID}} or project-{{UUID}} to store data specific to an individual user or team.
- Multi-Document Queries: Store related documents in the same namespace, so they’re treated as a single context for answering questions.
- Privacy Controls: Isolate sensitive or private data in separate namespaces for different users or projects.
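The naming patterns above can be generated with a pair of small helpers. This is one possible sketch; the function names are invented for illustration and are not part of Ragapi.

```python
import uuid


def user_namespace(user_id: uuid.UUID) -> str:
    """Return a per-user namespace following the user-{{UUID}} pattern."""
    return f"user-{user_id}"


def project_namespace(project_id: uuid.UUID) -> str:
    """Return a per-project namespace following the project-{{UUID}} pattern."""
    return f"project-{project_id}"


example_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
print(user_namespace(example_id))     # user-12345678-1234-5678-1234-567812345678
print(project_namespace(example_id))  # project-12345678-1234-5678-1234-567812345678
```

Deriving the namespace from a stable ID, rather than from a display name, keeps it unchanged if the user or project is renamed.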
By leveraging namespaces for data organization and only creating new indexes when necessary, you can keep your system efficient and scalable with Ragapi and Pinecone. For more information, refer to Pinecone’s official documentation.