Published my first open-source Python package: llmslim.
It compresses prompts, chat histories, and RAG contexts using semantic chunking + extractive ranking before sending them to an LLM.
Example:
2847 tokens → 1138 tokens (60% reduction)
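To give a rough idea of what chunking plus extractive ranking looks like, here is a minimal standard-library sketch. It is not llmslim's actual code or API: the function names (`split_sentences`, `compress`, `reduction_percent`) are hypothetical, word overlap with the query stands in for real semantic scoring, and word counts stand in for model tokens.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive chunker (assumption): one chunk per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text: str) -> list[str]:
    """Lowercased alphanumeric words; a crude proxy for model tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def compress(text: str, query: str, budget: int) -> str:
    """Keep the chunks most relevant to the query within a word budget."""
    chunks = split_sentences(text)
    query_terms = set(tokenize(query))

    scored = []
    for idx, chunk in enumerate(chunks):
        words = tokenize(chunk)
        # Score = fraction of the chunk's words that also appear in the query.
        overlap = sum(1 for w in words if w in query_terms)
        score = overlap / len(words) if words else 0.0
        scored.append((score, idx, len(words)))

    kept, used = [], 0
    # Greedily take the best-scoring chunks that still fit the budget.
    for score, idx, size in sorted(scored, key=lambda s: (-s[0], s[1])):
        if score > 0 and used + size <= budget:
            kept.append(idx)
            used += size

    # Re-emit the kept chunks in their original order.
    return " ".join(chunks[i] for i in sorted(kept))


def reduction_percent(before: int, after: int) -> float:
    """Percentage of tokens removed by compression."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    context = "The cat sat on the mat. Paris is the capital of France. I like tea."
    print(compress(context, "What is the capital of France?", budget=6))
    print(f"{reduction_percent(2847, 1138):.0f}% reduction")
```

A real pipeline would swap the overlap score for embedding similarity and count tokens with the target model's tokenizer; the greedy select-then-reorder step is the part this sketch is meant to show.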
Looking for feedback from the HF community.
Contributions and criticism welcome.