Transform Video Into Instantly Searchable, Actionable Intelligence with AI Agents and Skills

NVIDIA released a new version of its Metropolis Blueprint for video search and summarization (VSS) that transforms live and recorded video into searchable, actionable intelligence using AI agents and skills. The platform enables enterprises to monitor operations, detect trends, and make faster decisions by integrating vision-language models, large language models, and retrievers for real-time video analytics. The latest VSS update introduces a modular design, advanced fusion search, and skills that allow developers to automate deployment and integration into custom applications through a simple agentic chat interface.

In today’s data-driven world, organizations increasingly rely on video to capture critical information, yet extracting meaningful, real-time insights from massive amounts of footage remains a challenge. The NVIDIA Metropolis Blueprint for video search and summarization (VSS) https://build.nvidia.com/nvidia/video-search-and-summarization overcomes this hurdle by transforming millions of live video streams or hours of recorded video into instantly searchable, actionable intelligence.

VSS provides a reference architecture for building video analytics AI agents https://www.nvidia.com/en-us/use-cases/video-analytics-ai-agents/ that perceive, reason, and act in real time on massive volumes of live video streams and recorded data. It uses accelerated vision-based microservices, vision-language models (VLMs) https://www.nvidia.com/en-us/glossary/vision-language-models/, large language models (LLMs) https://www.nvidia.com/en-us/glossary/large-language-models/, and retrievers for real-time video intelligence, agentic search, and automated reporting. VSS helps enterprises monitor operations, detect trends, and make informed decisions faster than ever.
The latest version of VSS brings a new modular design, an advanced fusion search capability, and a set of skills for easy integration with autonomous agents. In this post, you will learn how to use the new VSS skills https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization/tree/main/skills with coding agents to automate VSS deployment and integration into custom applications and to build autonomous video analytics AI agents, followed by a deep dive into the technology behind VSS 3. You can also watch a recording to learn how to build a video analytics AI agent with VSS skills.

Build a video AI agent with VSS skills and coding agents

In the past, developers had to manually configure, deploy, and integrate the rich set of microservices that VSS provides for video management, search, summarization, and more to build video analytics applications. Today, coding agents augmented with VSS skills can automate the deployment, usage, and integration of VSS, all through a simple agentic chat interface.

VSS skills are hosted in the VSS GitHub repository and follow the agent skills specification https://agentskills.io/specification, allowing them to be used with a wide variety of agents. Using these skills requires a system that is set up to run VSS and a skills-compatible agent such as Codex, Claude Code, OpenClaw, or NemoClaw.

First, we will show how to add VSS skills to Codex and use it to deploy the VSS search profile.
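
Because the skills follow the agent skills specification, each one is a directory containing a SKILL.md file, so a checkout can be inspected before any agent is involved. The following is a minimal sketch, not part of VSS: it assumes each SKILL.md opens with a `---` delimited frontmatter block carrying flat `name` and `description` fields, and it assumes the repository is cloned to ~/video-search-and-summarization. The helpers `read_frontmatter` and `list_skills` are hypothetical names introduced only for illustration.

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading '---' delimited block.

    Assumption: only top-level scalar fields matter here, so nested or
    multi-line YAML values are skipped rather than parsed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        # Skip indented (nested) lines, comments, and lines without a colon.
        if sep and key and not key[0].isspace() and not key.startswith("#"):
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # no closing delimiter: treat the file as having no frontmatter


def list_skills(root: Path) -> list[dict[str, str]]:
    """Return name, description, and directory for every SKILL.md under root."""
    skills = []
    for skill_file in sorted(root.rglob("SKILL.md")):
        meta = read_frontmatter(skill_file.read_text(encoding="utf-8"))
        skills.append({
            # Fall back to the directory name if the field is missing.
            "name": meta.get("name", skill_file.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_file.parent),
        })
    return skills


if __name__ == "__main__":
    # Assumed checkout location; adjust to wherever the repository lives.
    root = Path.home() / "video-search-and-summarization" / "skills"
    if root.is_dir():
        for skill in list_skills(root):
            print(f"{skill['name']}: {skill['description']}")
    else:
        print(f"No skills directory found at {root}")
```

Listing the skills this way is only a sanity check on what the agent will be asked to install; the agent itself reads the same SKILL.md files when it loads a skill.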
Then, we will show how to add VSS skills to OpenClaw, which lets you interact with the VSS deployment through nearly any chat interface to search and analyze large volumes of video.

Setting up the VSS prerequisites

The first step is to prepare a system to run VSS. The easiest way to do this is to use the NVIDIA Brev Launchable for VSS. Go to the VSS launchable documentation page https://docs.nvidia.com/vss/latest/cloud-brev.html, click the “Launch Blueprint” button, and then click “Deploy Launchable.” Once it is deployed, click the Open Notebook button and navigate to the /video-search-and-summarization/scripts/deploy_vss_launchable.ipynb notebook. Paste your NGC_CLI_API_KEY from NGC https://catalog.ngc.nvidia.com/ into the first cell and then execute the entire notebook, including the tear-down section. This ensures the system is fully set up for VSS, so you can then use the deployment skill to manage your VSS deployment from your coding agent.

Once the notebook has run to completion, install the Brev CLI on your host system, launch VSCode, and remotely connect to your Brev instance by following the Using Brev CLI SSH section of your Launchable page, as shown in Figure 2. Once remote access is configured, you can install Codex through its VSCode extension to use as the coding agent.

Deploying VSS with Codex

In VSCode, use the extensions tab to search for and install Codex. Once it is installed, you need to install the VSS skills. You can do this by telling Codex to install the VSS skills itself and giving it the location of the VSS GitHub repository, as shown in the following prompt:

Read ~/video-search-and-summarization/skills/README.md and every SKILL.md file under ~/video-search-and-summarization/skills/. For each skill in the catalog, install it for this host so I can invoke it from a shell or chat session. Use the host's standard skills directory: Claude Code: ~/.claude/skills/