{"slug": "building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-and", "title": "Building Supervised Fine-Tuning Data from NVIDIA Open-SWE-Traces: Trajectory Parsing, Patch Analysis, Token Budgets, and Tool-Use Metrics", "summary": "NVIDIA released Open-SWE-Traces, a dataset of agentic software-engineering trajectories for fine-tuning AI models. The tutorial processes the data from Hugging Face, normalizing conversations, parsing patches, and curating a subset for supervised fine-tuning based on success labels, token limits, and language filters.", "body_md": "In this tutorial, we work with NVIDIA's Open-SWE-Traces dataset to study agentic software-engineering trajectories for fine-tuning. We stream the data directly from Hugging Face, so we can process it efficiently in Google Colab without downloading everything locally. We normalize multi-turn agent conversations, parse final code patches, and build an analysis DataFrame covering trajectory length, tool usage, patch size, language distribution, and resolution outcomes. We then curate a supervised fine-tuning subset using success labels, token limits, language filters, and patch availability.\n\nThe post [Building Supervised Fine-Tuning Data from NVIDIA Open-SWE-Traces: Trajectory Parsing, Patch Analysis, Token Budgets, and Tool-Use Metrics](https://www.marktechpost.com/2026/06/26/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-parsing-patch-analysis-token-budgets-and-tool-use-metrics/) appeared first on [MarkTechPost](https://www.marktechpost.com).", "url": "https://wpnews.pro/news/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-and", "canonical_source": "https://www.marktechpost.com/2026/06/26/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-parsing-patch-analysis-token-budgets-and-tool-use-metrics/", "published_at": "2026-06-27 00:02:33+00:00", "updated_at": "2026-06-27 00:08:19.674078+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-research", "ai-tools", "developer-tools"], "entities": ["NVIDIA", "Open-SWE-Traces", "Hugging Face", "Google Colab", "MarkTechPost"], "alternates": {"html": "https://wpnews.pro/news/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-and", "markdown": "https://wpnews.pro/news/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-and.md", "text": "https://wpnews.pro/news/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-and.txt", "jsonld": "https://wpnews.pro/news/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-and.jsonld"}}