{"slug": "pp-ocrv6-on-hugging-face-50-language-ocr-from-1-5m-to-34-5m-parameters", "title": "PP-OCRv6 on Hugging Face: 50-Language OCR from 1.5M to 34.5M Parameters", "summary": "Baidu released PP-OCRv6, a family of OCR models scaling from 1.5M to 34.5M parameters, supporting 50 languages. The medium model achieves 86.2% detection Hmean and 83.2% recognition accuracy, improving over PP-OCRv5_server by 4.6 and 5.1 percentage points respectively. The models are available on Hugging Face with ONNX Runtime CPU backend for production deployment.", "body_md": "#### PP-OCRv6 Online Demo\n\nPP-OCRv6 OCR with ONNX Runtime CPU backend\n\nEvaluate PP-OCRv6 online, then integrate lightweight, production-ready OCR with PaddlePaddle, Transformers, or ONNX Runtime backend.\n\nPP-OCRv6 is the latest generation of PaddleOCR’s universal OCR model family. It is designed for real-world text detection and recognition across documents, screenshots, multilingual images, digital displays, industrial labels, and scene text.\n\nThe model family scales from **1.5M to 34.5M parameters**, with three tiers: **tiny**, **small**, and **medium**. The medium and small tiers support **50 languages**, including Simplified Chinese, Traditional Chinese, English, Japanese, and 46 Latin-script languages. Try PP-OCRv6 online quickly: [PP-OCRv6 Online Demo](https://huggingface.co/spaces/PaddlePaddle/PP-OCRv6_Online_Demo).\n\nOn PaddleOCR’s official in-house multi-scenario OCR benchmarks, **PP-OCRv6_medium** reaches **86.2% detection Hmean** and **83.2% recognition accuracy**. Compared with PP-OCRv5_server, it improves text detection by **+4.6 percentage points** and text recognition by **+5.1 percentage points**.\n\nPP-OCRv6 focuses on a practical OCR need: producing accurate, structured text outputs with small models and flexible deployment options. 
For a deeper discussion of why specialized OCR models remain useful in the VLM era, see our previous blog: [PP-OCRv5 on Hugging Face: A Specialized Approach to OCR](https://huggingface.co/blog/baidu/ppocrv5).\n\nPP-OCRv6 introduces architecture, training, and data improvements across detection and recognition. The main design goal is to improve OCR accuracy while keeping model sizes suitable for different deployment settings.\n\nPP-OCRv6 provides three model tiers, covering different model sizes and OCR accuracy levels.\n\n| Model | Model size | Detection Hmean | Recognition accuracy | Typical application scenarios |\n|---|---|---|---|---|\n| PP-OCRv6_tiny | 1.5M params | 80.6% | 73.5% | Edge devices, lightweight local OCR, latency-sensitive demos, constrained environments |\n| PP-OCRv6_small | 7.7M params | 84.1% | 81.3% | Mobile, desktop, balanced OCR services, multilingual OCR with lower compute cost |\n| PP-OCRv6_medium | 34.5M params | 86.2% | 83.2% | Accuracy-oriented OCR, server-side pipelines, industrial OCR, document ingestion, multilingual OCR |\n\nPP-OCRv6 uses **PPLCNetV4** as a unified backbone for text detection and text recognition.\n\nFor developers, the main benefit is consistency across the model family. The tiny, small, and medium tiers are not unrelated models; they are part of the same OCR family and share a common architectural direction.\n\nText detection is the first stage of the OCR pipeline. Detection quality affects the crops sent to the recognizer, and poor crops often lead to poorer recognition.\n\nPP-OCRv6 upgrades the detection module with **RepLKFPN**, a lightweight large-kernel feature pyramid network designed for multi-scale text detection while keeping inference efficient.\n\nThis is relevant for real-world OCR inputs, where text may be small, dense, rotated, low-resolution, or embedded in complex backgrounds.\n\nFor text recognition, PP-OCRv6 uses **EncoderWithLightSVTR**. 
It combines local context modeling with global attention to improve recognition quality on challenging text crops.\n\nThe recognition improvements are especially relevant for multilingual text, screen text, industrial characters, special symbols, dense text, and noisy image regions.\n\nThe medium and small tiers support **50 languages** in one model family, covering Simplified Chinese, Traditional Chinese, English, Japanese, and 46 Latin-script languages.\n\nThis helps reduce the need for separate OCR models across common multilingual OCR scenarios.\n\nInstall PaddleOCR:\n\n```\npip install paddleocr\n```\n\nRun OCR with Paddle Inference (default backend):\n\n``` python\nfrom paddleocr import PaddleOCR\n\n# Model: PP-OCRv6_medium (default)\n# Backend: Paddle Inference (default)\nocr = PaddleOCR(\n    use_doc_orientation_classify=False,\n    use_doc_unwarping=False,\n    use_textline_orientation=False,\n)\nresult = ocr.predict(\"https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png\")\n\nfor res in result:\n    res.print()\n    res.save_to_img(\"output\")\n    res.save_to_json(\"output\")\n```\n\nThe OCR result can be saved as visualization images and structured JSON output. The structured output can then be used by downstream systems such as document parsing, search, extraction, RAG, analytics, or agent workflows.\n\nPP-OCRv6 can be used with multiple inference backends through PaddleOCR. 
**PaddleOCR 3.7** provides a unified inference-engine interface, where `engine` selects the underlying runtime and related configuration can be passed through the pipeline or module API.\n\n| Backend | Description |\n|---|---|\n| Transformers | Hugging Face / PyTorch-oriented inference path for supported PaddleOCR models |\n| ONNX Runtime | Portable inference path for ONNX-based deployment environments |\n| Paddle Inference | Native Paddle inference format |\n\nFor Hugging Face users, PaddleOCR supports running selected OCR and document parsing models with a Transformers backend. This can be enabled with:\n\n```\nengine=\"transformers\"\n```\n\nFor more details on how the Transformers backend works in PaddleOCR, see:\n\n[PaddleOCR: Running OCR and Document Parsing Tasks with a Transformers Backend](https://huggingface.co/blog/PaddlePaddle/paddleocr-transformers)\n\nRun PP-OCRv6 with the Transformers backend:\n\n``` python\nfrom paddleocr import PaddleOCR\n\n# Model: PP-OCRv6_medium (default)\n# Backend: Transformers\nocr = PaddleOCR(\n    use_doc_orientation_classify=False,\n    use_doc_unwarping=False,\n    use_textline_orientation=False,\n    engine=\"transformers\",\n)\nresult = ocr.predict(\"https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png\")\n```\n\nONNX variants are also available in the [PP-OCRv6 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv6) for environments that use ONNX Runtime through `engine=\"onnxruntime\"`:\n\n``` python\nfrom paddleocr import PaddleOCR\n\n# Model: PP-OCRv6_medium (default)\n# Backend: ONNX Runtime\nocr = PaddleOCR(\n    use_doc_orientation_classify=False,\n    use_doc_unwarping=False,\n    use_textline_orientation=False,\n    engine=\"onnxruntime\",\n)\nresult = ocr.predict(\"https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png\")\n```\n\nTogether, these backend options make PP-OCRv6 available across different runtime environments while 
keeping the same OCR model family on the Hugging Face Hub.\n\nPP-OCRv6 extends PaddleOCR with a lightweight, multilingual OCR model family for real-world text detection and recognition.\n\nThe release includes three model tiers from **1.5M to 34.5M parameters**, up to **50-language OCR support**, improved detection and recognition accuracy over PP-OCRv5_server, and multiple model formats on the Hugging Face Hub, including **safetensors**, **Paddle inference models**, and **ONNX models**.\n\nTogether with the hosted Hugging Face Space and the available PaddleOCR inference backends, PP-OCRv6 provides several entry points for evaluation and integration:\n\n**Online Demo**: [PP-OCRv6 Online Demo](https://huggingface.co/spaces/PaddlePaddle/PP-OCRv6_Online_Demo)\n\n**Model Collection**: [PP-OCRv6 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv6)\n\n**Transformers Backend Blog**: [PaddleOCR with Transformers Backend](https://huggingface.co/blog/PaddlePaddle/paddleocr-transformers)\n\n**PaddleOCR Documentation**: [PP-OCRv6 Documentation](https://www.paddleocr.ai/latest/en/version3.x/algorithm/PP-OCRv6/PP-OCRv6.html)\n\n**PaddleOCR Official Website**: [https://www.paddleocr.com](https://www.paddleocr.com)\n\nYou can evaluate PP-OCRv6 with the online demo, explore the available model assets in the Collection, and use the inference backend that matches your own OCR workflow.", "url": "https://wpnews.pro/news/pp-ocrv6-on-hugging-face-50-language-ocr-from-1-5m-to-34-5m-parameters", "canonical_source": "https://huggingface.co/blog/PaddlePaddle/pp-ocrv6", "published_at": "2026-06-22 13:18:56+00:00", "updated_at": "2026-06-23 23:50:00.678186+00:00", "lang": "en", "topics": ["computer-vision", "machine-learning", "ai-products", "ai-tools", "natural-language-processing"], "entities": ["Baidu", "PaddleOCR", "PP-OCRv6", "Hugging Face", "ONNX Runtime", "PaddlePaddle", "PPLCNetV4", "RepLKFPN"], "alternates": {"html": 
"https://wpnews.pro/news/pp-ocrv6-on-hugging-face-50-language-ocr-from-1-5m-to-34-5m-parameters", "markdown": "https://wpnews.pro/news/pp-ocrv6-on-hugging-face-50-language-ocr-from-1-5m-to-34-5m-parameters.md", "text": "https://wpnews.pro/news/pp-ocrv6-on-hugging-face-50-language-ocr-from-1-5m-to-34-5m-parameters.txt", "jsonld": "https://wpnews.pro/news/pp-ocrv6-on-hugging-face-50-language-ocr-from-1-5m-to-34-5m-parameters.jsonld"}}