12:55
2026-06-03
huggingface.co
large-language-models
Direct Preference Optimization Beyond Chatbots
Dharma-AI released DharmaOCR, a structured OCR model, and published a paper demonstrating that Direct Preference Optimization (DPO) reduced text degeneration rates by an average of 59.4% across all te…