{"slug": "building-with-gemini-embedding-2-agentic-multimodal-rag-and-beyond", "title": "Building with Gemini Embedding 2: Agentic multimodal RAG and beyond", "summary": "General Availability of Gemini Embedding 2, a unified multimodal model that maps text, images, video, audio, and documents into a single embedding space supporting over 100 languages. It enables use cases like agentic multimodal RAG and visual search, processing up to 8,192 text tokens, 6 images, 120 seconds of video, 180 seconds of audio, and 6 PDF pages in a single call. The model has improved retrieval accuracy for users like Harvey (3% increase in legal benchmarks) and Supermemory (40% increase in search accuracy), and boosted Nuuly's visual search Match@20 accuracy from 60% to nearly 87%.", "body_md": "Last week, we announced the General Availability (GA) of Gemini Embedding 2 via the Gemini API and Gemini Enterprise Agent Platform. It’s the first embedding model in the Gemini API that maps text, images, video, audio, and documents into a single embedding space, supporting over 100 languages.\nIn this post, we will explore the diverse use cases this unified model unlocks, from agentic multimodal RAG to visual search, and show you exactly how to start building them.\nThe model handles an expansive range of inputs in a single call: up to 8,192 text tokens, 6 images, 120 seconds of video, 180 seconds of audio, and 6 pages of PDFs. 
By mapping different modalities into the same semantic space, developers can build diverse experiences that “see” and “hear” proprietary data.\nThe true power of Gemini Embedding 2 is its ability to process interleaved inputs, such as a combination of text and images, in a single request:\nfrom google import genai\nfrom google.genai import types\nclient = genai.Client()\nwith open('dog.png', 'rb') as f:\n    image_bytes = f.read()\nresult = client.models.embed_content(\n    model='gemini-embedding-2',\n    contents=[\n        \"An image of a dog\",\n        types.Part.from_bytes(\n            data=image_bytes,\n            mime_type='image/png',\n        ),\n    ]\n)\nprint(result.embeddings)\nThis enables a more accurate, holistic understanding of complex, real-world data. If you need separate embeddings for individual inputs instead of one aggregated vector, use the Batch API (support coming soon for Agent Platform).\nMultimodal embeddings enable AI agents to execute multi-step reasoning tasks, such as scanning hundreds of files to fix a codebase or cross-referencing disparate PDFs, with improved understanding and accuracy.\nTo build these pipelines with the Gemini API, you can use task prefixes based on the agent’s goal. 
These prefixes optimize the resulting embeddings for your specific task, helping the model bridge the gap between short queries and long documents:\n# Format the query text for your task before embedding it:\ndef prepare_query(query):\n    return f\"task: question answering | query: {query}\"\n    # return f\"task: fact checking | query: {query}\"\n    # return f\"task: code retrieval | query: {query}\"\n    # return f\"task: search result | query: {query}\"\n# Format the document text for an asymmetric retrieval task before embedding it:\ndef prepare_document(content, title=None):\n    if title is None:\n        title = \"none\"\n    return f\"title: {title} | text: {content}\"\nApplying these prefixes at both index time and query time can significantly improve retrieval accuracy.\nMany users are already seeing a positive impact from adopting Gemini Embedding 2. Harvey, a legal research platform for law firms and enterprises, has seen a 3% increase in Recall@20 precision on legal-specific benchmarks compared to their previous embeddings, leading to more accurate citations and answers.\nSupermemory is building a “vector database for memory” that enables conceptual searching across disjointed memos. Since integrating the model, they’ve achieved a 40% increase in search Recall@1 accuracy and leveraged these embeddings to drive performance across their core retrieval pipelines, spanning indexing, search, and Q&A.\nYou can also use Gemini Embedding 2 to build tools that search across data based on a multimodal input. To perform this task, you would use the following prefix: \"task: search result | query: {content}\".\nNuuly, URBN’s clothing rental company, uses Gemini Embedding 2 for their in-house visual search tool that matches photos taken on the warehouse floor against their catalog to identify untagged garments. 
This implementation pushed their Match@20 accuracy from 60% to nearly 87%, and their total successful product identification rate from 74% to over 90%.\nFor retrieval pipelines, you can use embeddings to rerank initial results and surface the most relevant answers. To do this, you can calculate similarity metrics, such as cosine similarity or dot product scores, between the embedded search results and the user’s query:\nimport numpy as np\n# 1. Define a function to calculate the dot product\n# (equal to cosine similarity when the embeddings are L2-normalized)\ndef dot_product(a: np.ndarray, b: np.ndarray):\n    return (np.array(a) @ np.array(b).T)\n# 2. Retrieve your embeddings\n# (Assuming 'summaries' is your list of search results and\n# 'get_embeddings' is your own helper that returns their embeddings)\nsearch_res = get_embeddings(summaries)\nembedded_query = get_embeddings([query])\n# 3. Calculate similarity scores\nsim_value = dot_product(search_res, embedded_query)\n# 4. Select the most relevant result\nbest_match_index = np.argmax(sim_value)\nBy prompting the model to generate a baseline hypothetical answer to a query using its internal knowledge, you can embed that hypothetical answer and compare its similarity score against your retrieved data to rank the most accurate and contextually rich match.\nLearn how in the search reranking notebook.\nEmbeddings are useful for grasping relationships between data by creating clusters based on similarities. 
You can also quickly identify hidden trends or outliers, making this same technique a solid foundation for sentiment analysis and anomaly detection.\nUnlike the asymmetric retrieval tasks above, these are symmetric use cases where you use the same task prefix for both the query and the document:\n# Format the text for both the query and the document of your task before embedding it.\ndef prepare_query_and_document(content):\n    return f'task: clustering | query: {content}'\n    # return f'task: sentence similarity | query: {content}'\n    # return f'task: classification | query: {content}'\nTry these tasks out in the clustering, text classification, and anomaly detection notebooks.\nYou can store your embeddings in vector databases like Agent Platform Vector Search, Pinecone, Weaviate, Qdrant, or ChromaDB.\nGemini Embedding 2 is trained using Matryoshka Representation Learning (MRL), so you can truncate the default 3072-dimensional vectors down to smaller dimensions using the output_dimensionality parameter for more efficient storage. (We recommend 1536 or 768 for highest efficiency.)\nresult = client.models.embed_content(\n    model=\"gemini-embedding-2\",\n    contents=\"What is the meaning of life?\",\n    config={\"output_dimensionality\": 768}\n)\nThis results in lower costs while maintaining high accuracy out of the box. For additional cost-efficiency, the Batch API achieves much higher throughput at 50% of the default embedding price.\nWe’re excited to see how natively multimodal embeddings improve understanding of complex data across industries and use cases.\nReady to get started? 
Explore the model in Gemini API or Agent Platform.", "url": "https://wpnews.pro/news/building-with-gemini-embedding-2-agentic-multimodal-rag-and-beyond", "canonical_source": "https://developers.googleblog.com/building-with-gemini-embedding-2/", "published_at": "2026-05-20 03:11:52.057757+00:00", "updated_at": "2026-05-20 03:11:55.610402+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "large-language-models", "developer-tools", "enterprise-software"], "entities": ["Gemini Embedding 2", "Gemini API", "Gemini Enterprise Agent Platform", "Google"], "alternates": {"html": "https://wpnews.pro/news/building-with-gemini-embedding-2-agentic-multimodal-rag-and-beyond", "markdown": "https://wpnews.pro/news/building-with-gemini-embedding-2-agentic-multimodal-rag-and-beyond.md", "text": "https://wpnews.pro/news/building-with-gemini-embedding-2-agentic-multimodal-rag-and-beyond.txt", "jsonld": "https://wpnews.pro/news/building-with-gemini-embedding-2-agentic-multimodal-rag-and-beyond.jsonld"}}