04:00
2026-06-19
arxiv.org
large-language-models
Closing the Social-Semantic Gap: SPSD for Edge-Based Prompt Compression in Cloud LLM Inference
Researchers propose SPSD, an edge-based pipeline that compresses user prompts using a small language model before sending them to a cloud LLM, reducing input tokens by an average of 99.9 per call whil…