Your RAG Is Underperforming Because Your Embeddings Are Too Simple

Cohere's new Compass embedding model addresses the limitations of standard single-vector embeddings in enterprise RAG systems by preserving the structure of multi-aspect data. Instead of compressing documents into single vectors, Compass accepts structured JSON inputs to capture relationships between distinct concepts, reducing the need for brittle post-retrieval filters. The model is currently in private beta with an open-source Python SDK available for testing.

Most production RAG systems are built on a simple premise: convert documents into single vectors and find the ones closest to a query vector. This works for simple documents, but fails on the messy, multi-aspect data that defines enterprise reality. Cohere's Compass is a new embedding model designed for this specific problem, and it suggests a necessary evolution in how we build retrieval systems.

Standard embedding models, including powerful ones like Cohere's own Embed v3, map an entire document to a single point in semantic space. This is a lossy compression. If a document contains multiple distinct concepts (like an invoice with a specific sender, due date, and line items), the resulting vector is an average of all those concepts. The relationships between them are lost.

This leads to retrieval errors that are painfully familiar to anyone who has shipped a RAG product. A search for a "red T-shirt" might return "blue and yellow jeans" because the vector for colors is muddled with the vector for clothing type. In an enterprise context, a query for an invoice from a specific person might fail because the contextual link between the sender and the attached document was severed during the chunking and embedding process.

To compensate, engineers build brittle, complex classification layers and metadata filters on top of the vector search. This is a workaround, not a solution.
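To make the "relationships are lost" point concrete, here is a toy sketch in plain Python. It is not how Compass or Embed v3 works internally: the four hand-picked concept vectors, the mean-pooling step, and the `contains_item` helper are all assumptions chosen for illustration. It only shows that averaging concepts into one vector cannot tell which color belongs to which garment.

```python
# Hypothetical one-hot "concept vectors"; real embeddings are learned and far larger.
CONCEPTS = {
    "red": (1.0, 0.0, 0.0, 0.0),
    "blue": (0.0, 1.0, 0.0, 0.0),
    "t-shirt": (0.0, 0.0, 1.0, 0.0),
    "jeans": (0.0, 0.0, 0.0, 1.0),
}


def pooled_vector(items):
    """Average every concept in a document into a single vector (mean pooling)."""
    terms = [term for item in items for term in item.values()]
    dims = zip(*(CONCEPTS[t] for t in terms))
    return tuple(sum(d) / len(terms) for d in dims)


def contains_item(items, color, garment):
    """Structure-aware check: color and garment must appear in the same item."""
    return any(i["color"] == color and i["garment"] == garment for i in items)


# Two documents with the same concepts but different color/garment pairings.
doc_a = [{"color": "red", "garment": "t-shirt"}, {"color": "blue", "garment": "jeans"}]
doc_b = [{"color": "blue", "garment": "t-shirt"}, {"color": "red", "garment": "jeans"}]

pooled_a = pooled_vector(doc_a)
pooled_b = pooled_vector(doc_b)

# The pooled vectors are identical, so no query vector can separate the two documents.
print(pooled_a == pooled_b)  # True

# Keeping the structure makes the "red T-shirt" query answerable.
print(contains_item(doc_a, "red", "t-shirt"))  # True
print(contains_item(doc_b, "red", "t-shirt"))  # False
```

Real embedding models are not literal averages of word vectors, so the collapse is rarely this exact, but the same pressure applies whenever many aspects share one point in vector space.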
Such layers treat the symptom (poor retrieval quality) instead of the underlying disease: an embedding model that doesn't understand the structure of your data.

Compass is designed to handle this multi-aspect data directly. Instead of feeding it a raw text chunk, you provide a JSON document that preserves the data's inherent structure. The model then creates a multi-aspect representation that can be stored in any vector database, capturing the relationships between the different concepts.

For example, a traditional RAG pipeline might index an email and its PDF attachment as two separate, unrelated chunks. The crucial context, that this specific PDF was sent by a particular person at a specific time, is lost. The Compass workflow uses an SDK to parse the email and its attachments into a single, structured JSON object. This JSON is then passed to the embedding model, which generates a representation that encodes the document's internal relationships.

This approach moves the complexity from post-retrieval filtering into the embedding model itself, where it can be handled more effectively. It allows for more precise, context-aware data retrieval without the need for manual classification layers.

The workflow involves using a dedicated SDK to prepare and index your data. While the full system is in private beta, the open-source Python SDK shows the intended structure. You would use a parser client to convert your raw files into the structured JSON format, and then an index client to handle the embedding and storage.

Here is a conceptual look at how you might use the Python client to index documents:

    from cohere_compass.clients.compass import CompassClient
    from cohere_compass.clients.parser import CompassParserClient
    from cohere_compass.models.config import MetadataConfig, MetadataStrategy

    # Configuration for your Compass instance
    COMPASS_API_URL = "