Ensuring Data Separation
Overview
Ragapi uses Pinecone namespaces to manage and separate user data. By organizing stored documents within unique namespaces, you can ensure that data does not overlap across different users or projects. This means that documents stored within a specific namespace are isolated and accessible only when querying within that same namespace, preventing cross-referencing between unrelated data.
How Namespaces Work
- When Storing Documents: Each time you store a document, you have the option to specify a pineconeNamespace. If a namespace is not provided, Ragapi will automatically generate a unique namespace (as a UUID). This helps maintain data separation by isolating documents within a unique identifier.
- When Querying with Conversation: To interact with specific documents, pass the pineconeNamespace in the /conversation request. Only documents stored within that namespace will be retrieved for the query, ensuring that unrelated data remains isolated.
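The two steps above can be sketched as request payloads. This is a minimal illustration rather than Ragapi's documented client: only pineconeNamespace and the /conversation path come from this page. The store path and the "url" and "query" field names are placeholders assumed for the example, so check the API reference for the real ones.

```python
import json

CONVERSATION_PATH = "/conversation"  # named on this page
STORE_PATH = "/store-document"       # hypothetical path, for illustration only


def build_store_request(source_url, namespace=None):
    """Build a store request.

    When namespace is None the field is left out, so the service would
    generate a UUID namespace itself, as described above.
    """
    body = {"url": source_url}  # "url" is an assumed field name
    if namespace is not None:
        body["pineconeNamespace"] = namespace
    return {"path": STORE_PATH, "body": body}


def build_conversation_request(question, namespace):
    """Build a /conversation request scoped to a single namespace."""
    if not namespace:
        raise ValueError("pass the namespace the documents were stored under")
    # "query" is an assumed field name; pineconeNamespace comes from this page.
    body = {"query": question, "pineconeNamespace": namespace}
    return {"path": CONVERSATION_PATH, "body": body}


store = build_store_request("https://example.com/handbook.pdf", "project-resources")
ask = build_conversation_request("What does the handbook cover?", "project-resources")
print(json.dumps(store["body"]))
print(json.dumps(ask["body"]))
```

Using the same namespace string in both calls is what ties the conversation to the stored document. If you let Ragapi generate the namespace, you would need to keep the generated value and pass it in later queries.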
Best Practices for Avoiding Data Overlap
If you want to ensure data is separated by user or project, use a unique pineconeNamespace for each. By setting distinct namespaces, you prevent documents from overlapping or being accessible in unintended queries. This isolation is essential for secure, organized data management across multiple users or contexts.
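One way to keep namespaces unique per user or project is to derive them from an identifier you already control. The helper below is a hypothetical sketch: the function name, the "user-" and "project-" prefixes, and the allowed character set are assumptions made for this example, not Ragapi requirements.

```python
import re

# Accept only characters that cannot be confused after joining with the prefix.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def namespace_for(kind, identifier):
    """Return a stable namespace such as 'user-42' or 'project-acme'."""
    if kind not in ("user", "project"):
        raise ValueError("kind must be 'user' or 'project'")
    identifier = str(identifier)
    # Reject rather than normalize, so two different identifiers can never
    # collapse into the same namespace.
    if not _SAFE_ID.fullmatch(identifier):
        raise ValueError("identifier may contain only letters, digits, '_' and '-'")
    return f"{kind}-{identifier}"


print(namespace_for("user", 42))
print(namespace_for("project", "acme"))
```

Because the function is deterministic, the value used when storing a document can be recomputed when querying, so the namespace does not have to be saved separately.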
Example of Namespace Isolation
Suppose you store the following documents:
- Namespace 1 (e.g., "project-resources"):
  - A PDF document
  - A YouTube video
  - A website
  These three documents will be accessible together whenever "project-resources" is specified as the pineconeNamespace in a query.
- Namespace 2 (e.g., "user-specific-documents"):
  - A single PDF document
  This document is isolated from the documents in "project-resources" and will only be accessible when "user-specific-documents" is used as the pineconeNamespace in a query.
Documents stored in different namespaces do not "know" about each other, meaning that queries in "project-resources" will only retrieve information from the PDF, YouTube, and website documents stored in that namespace, while "user-specific-documents" queries will only retrieve information from its isolated PDF document.
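The example above can be modeled with a small in-memory stand-in. This toy class is an assumption made for illustration: it is not Ragapi or Pinecone code and performs no embedding or similarity search. It only shows that a lookup consults a single namespace. The document names are placeholders.

```python
from collections import defaultdict


class NamespaceStore:
    """Toy stand-in for a namespaced index, keyed by namespace string."""

    def __init__(self):
        self._docs = defaultdict(list)

    def store(self, namespace, document):
        self._docs[namespace].append(document)

    def documents_for_query(self, namespace):
        # Only the requested namespace is read; other namespaces are never touched.
        return list(self._docs.get(namespace, []))


index = NamespaceStore()
for doc in ("handbook.pdf", "youtube-video", "website"):
    index.store("project-resources", doc)
index.store("user-specific-documents", "contract.pdf")

print(index.documents_for_query("project-resources"))
# ['handbook.pdf', 'youtube-video', 'website']
print(index.documents_for_query("user-specific-documents"))
# ['contract.pdf']
```

A query against a namespace that was never written to returns an empty list in this model, which mirrors the idea that a mistyped namespace would find no context rather than leaking another project's documents.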
Summary
- Use unique namespaces for different users or projects to keep data separated and secure.
- Pass the same namespace during /conversation queries to retrieve context only from the intended documents within that namespace.
By managing namespaces carefully, you can maintain strict data separation and ensure that each query accesses only the documents intended for that specific context.