{"slug": "how-to-fine-tune-rf-detr-keypoints-on-custom-data", "title": "How to fine-tune RF-DETR Keypoints on Custom Data", "summary": "Roboflow released a tutorial on fine-tuning RF-DETR Keypoints on custom data, demonstrating the process using a basketball court detection dataset with 33 landmarks. The model extends the RF-DETR architecture for real-time keypoint detection, predicting bounding boxes and keypoint coordinates in a single forward pass without NMS or heatmaps.", "body_md": "[RF-DETR Keypoint](https://blog.roboflow.com/real-time-keypoint-detection-with-rf-detr/) is a real-time transformer model for keypoint detection. It extends the RF-DETR architecture, which is already state-of-the-art for object detection and instance segmentation. The model predicts bounding boxes and keypoint coordinates in a single forward pass, with no NMS and no heatmaps. Each keypoint comes with confidence scores and an uncertainty ellipse derived from a learned covariance matrix.\n\nThe default checkpoint is trained on COCO person pose with 17 keypoints. RF-DETR Keypoint is not limited to that skeleton: you can fine-tune on any keypoint layout attached to any object class. In this tutorial, we walk through that fine-tuning workflow step by step.\n\nYou can follow along interactively. All code blocks in this tutorial are ready to run in our companion notebook.\n\n[Open the Colab Notebook](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/rf-detr-keypoint-detection.ipynb?ref=blog.roboflow.com)\n\nTo demonstrate the fine-tuning process, we will train a custom court detector using [basketball-court-detection-2](https://universe.roboflow.com/roboflow-jvuqo/basketball-court-detection-2?ref=blog.roboflow.com). This dataset provides 33 landmarks mapped to specific locations on a basketball court. 
We will walk through the entire pipeline, from initializing the Preview checkpoint to evaluating the model on held-out test images and NBA broadcast footage.\n\n## Run RF-DETR Keypoint Preview on COCO\n\nRun the COCO Preview checkpoint on a sample image before fine-tuning. You get instance boxes, 17 body keypoints, and per-joint confidence scores for each detection. The model also predicts a covariance matrix for each keypoint, which supervision renders as uncertainty ellipses. Use the code below to predict and visualize the result.\n\n## Build Your COCO Keypoint Dataset\n\nWhen building a custom keypoint dataset in Roboflow Annotate, consistency is critical. For the basketball court, every one of the 33 points must be placed at its exact geometric location (like a specific lane corner or line intersection) across all frames. This ensures the model learns the true spatial structure regardless of the camera angle. See the [keypoint detection guide](https://blog.roboflow.com/keypoint-detection-on-roboflow/) for skeleton setup.\n\nIn Roboflow Annotate, use visible when the court feature appears in the image and occluded when it is hidden but geometrically known. Skip points that are off-screen entirely. COCO stores that as visibility per triplet: `v=2` visible, `v=1` occluded, `v=0` not in the annotation.\n\nYour training data must be exported in COCO Keypoint format so the code can read these visibility flags correctly. Every annotated object carries a sequence of x, y, and visibility values for each named keypoint in your skeleton. Generate a dataset version from Roboflow Annotate as COCO Keypoint JSON. Each split folder will then contain images plus `_annotations.coco.json`.\n\nFine-tuning initializes from the same Preview checkpoint you ran above. You replace the default 17-keypoint person head configuration with `num_keypoints_per_class` and `class_names` from your annotation JSON. 
For basketball, that means 33 court landmarks on the court class. From here, you have two options: train the model with no code using the Roboflow platform, or write a custom training loop using our open source Python package.\n\n## Train on Roboflow\n\nIf you prefer a no-code approach, you can train RF-DETR Keypoint directly within the Roboflow platform. After labeling your dataset, navigate to the Train tab and select the Roboflow RF-DETR Preview architecture. The Preview release currently uses the X Large model size. You can choose to start from COCO pre-trained weights or a checkpoint you previously trained to speed up convergence.\n\nBefore training begins, configure your dataset preprocessing. Standardize your images by applying Auto-Orient and resizing them to 576x576 to match the model's native input resolution. Once you generate the dataset version, review the summary and click Start Training.\n\nThe platform handles the infrastructure and hyperparameter optimization automatically. As the model trains, you can monitor live progress charts tracking keypoint mAP, Precision, Recall, and various loss metrics. You can let the training run for the full duration or use the Early Stop feature if the metrics stabilize. Once training is complete, your model is immediately available for deployment via the Roboflow API.\n\n## Train with the Open Source Python Package\n\n### Install Dependencies\n\nRun the pip command below in Colab or your local virtual environment. Use `rfdetr>=1.8.1` with the train and visual extras and `supervision>=0.29.1` for Keypoint Preview training and ellipse visualization. Add `roboflow` for Universe download.\n\n```\npip install \"rfdetr[train,visual]>=1.8.1\" roboflow \"supervision>=0.29.1\"\n```\n\n### Download the Dataset\n\nPull the public basketball court dataset from Universe using the Roboflow SDK below. Choose the COCO export format. 
The SDK extracts three splits under `DATASET_DIR`, each with paired images and `_annotations.coco.json`.\n\n### Configure API Key\n\nThe download cell above uses your Roboflow API key from the environment. In Colab, store it under Secrets as `ROBOFLOW_API_KEY`. Locally, export the same variable in your shell before re-running the download snippet.\n\n## Fine-Tune RF-DETR Keypoints on Your Dataset\n\n### Infer Your Keypoint Schema\n\nYour annotation file defines how many keypoints each class has and what the classes are called. The helper below extracts that structure so you do not hard-code numbers that drift when the dataset changes. Keep `NUM_CLASSES`, `NUM_KEYPOINTS_PER_CLASS`, and `KEYPOINT_OKS_SIGMAS` for the inference section later.\n\n### Configure Training\n\nA Colab T4 is sufficient for this tutorial. Set `RESOLUTION = 576` to match the Preview input size. The hyperparameter block below sets `BATCH_SIZE=2` and `GRAD_ACCUM_STEPS=2` for that hardware profile. `EPOCHS`, `BATCH_SIZE`, and `GRAD_ACCUM_STEPS` control training length and effective batch size. A larger GPU can take a higher batch size without lowering resolution.\n\nGradient accumulation lets you train with large-batch stability on a small GPU. With `BATCH_SIZE=2` and `GRAD_ACCUM_STEPS=2`, each optimizer step averages gradients over four images. Increase `GRAD_ACCUM_STEPS` if you want a larger effective batch without raising memory use.\n\n### Fine-Tune the Model\n\nThe code block below initializes the model with your custom schema and starts the training loop. `trainer.fit` handles the optimization process, automatically saving the best weights to `checkpoint_best_total.pth`. All training and validation scalars are logged to `metrics.csv` in your output directory for later analysis.\n\n## Evaluate Training Results\n\nThe plot helpers below visualize the data recorded in `metrics.csv`. 
The most important metric for checkpoint selection is `val/keypoint_map_50_95`. A widening gap where training loss drops but validation keypoint mAP stalls is a clear indicator of overfitting to the training set.\n\nTimelapse frames replay the same eval images through checkpoints saved during training. Predictions start far from labels and improve epoch over epoch. Ellipse size shrinks as the model gains confidence on each court feature. Colored dots are model output; hollow rings are ground truth in the clips below.\n\nSome eval frames show predictions where no ground-truth point was labeled. That happens when the broadcast crop cuts off part of the court, when a point was left off-screen during annotation, or when crowd occlusion made the feature hard to label. The model may still emit a low-confidence keypoint with a large uncertainty ellipse. Compare ellipse size and confidence, not label overlap alone, on those frames.\n\n## Run Inference with Your Fine-Tuned Model\n\n### Load Your Checkpoint\n\nLoad your best checkpoint with `RFDETRKeypointPreview.from_checkpoint`. Pass the same `num_classes` and `num_keypoints_per_class` you used during training so the head shapes match the saved weights. The path below points to `checkpoint_best_total.pth` in your output directory.\n\n### Run Inference on a Test Image\n\nOne forward pass returns court boxes and 33 keypoints with covariance. Apply bbox and keypoint thresholds before calling the supervision annotators. Filter weak joints using keypoint confidence before drawing. The snippet below loads a test frame and plots the result.\n\n## What You Can Build Next\n\nWith a reliable court keypoint model, you can map broadcast video directly to a 2D tactical board. This homography transformation turns raw video into structured spatial data. 
Combine it with tracking models to render player radars and visualize tactical formations.\n\n## Conclusion\n\nThis tutorial walked from pre-trained keypoint detection to a custom basketball court model. You fine-tuned RF-DETR Keypoint on a Roboflow dataset, reviewed training metrics, and ran inference with your saved checkpoint. Repeat the same steps with your own skeleton and dataset.\n\nFork the notebook, point the download cell at your Roboflow project, and start training on your own data. You can run this fine-tuning pipeline using our open source repository or train directly within the Roboflow platform. As you expand your computer vision projects, remember that RF-DETR is also the state-of-the-art model for object detection and instance segmentation.\n\n**Cite this Post**\n\nUse the following entry to cite this post in your research:\n\n[Piotr Skalski](/author/skalskip/). (Jun 26, 2026).\nHow to fine-tune RF-DETR Keypoints on Custom Data. Roboflow Blog: https://blog.roboflow.com/train-rf-detr-keypoint/", "url": "https://wpnews.pro/news/how-to-fine-tune-rf-detr-keypoints-on-custom-data", "canonical_source": "https://blog.roboflow.com/train-rf-detr-keypoint/", "published_at": "2026-06-26 13:02:46+00:00", "updated_at": "2026-06-26 13:08:26.038548+00:00", "lang": "en", "topics": ["computer-vision", "machine-learning", "ai-tools"], "entities": ["Roboflow", "RF-DETR", "COCO", "Colab", "NBA"], "alternates": {"html": "https://wpnews.pro/news/how-to-fine-tune-rf-detr-keypoints-on-custom-data", "markdown": "https://wpnews.pro/news/how-to-fine-tune-rf-detr-keypoints-on-custom-data.md", "text": "https://wpnews.pro/news/how-to-fine-tune-rf-detr-keypoints-on-custom-data.txt", "jsonld": "https://wpnews.pro/news/how-to-fine-tune-rf-detr-keypoints-on-custom-data.jsonld"}}