{"slug": "dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv", "title": "[Dataset] Efficient LLM papers (quantization, LoRA, MoE, FlashAttention) from arXiv + Semantic Scholar — 1,734 records, quality-scored, JSONL", "summary": "A new dataset, fineset-io/efficient-llm-papers, compiles 1,734 records of arXiv and Semantic Scholar papers on efficient LLM techniques like quantization, LoRA, MoE, and FlashAttention, each quality-scored in JSONL format. The dataset aims to serve as a reference for state-of-the-art efficiency methods and a clean corpus for fine-tuning models to reason about these techniques.", "body_md": "Most of us aren’t training frontier models; we’re trying to fit a good one onto the hardware we actually have. The research that makes that possible (quantization, LoRA/PEFT, mixture-of-experts, FlashAttention, KV-cache tricks, Mamba/SSMs) is scattered across hundreds of arXiv papers, and it’s some of the fastest-moving work in ML right now.\n\nSo I assembled it into one dataset: fineset-io/efficient-llm-papers\n\nI find it useful as a “what’s the current state of the art for making this cheaper” reference, and as a clean corpus if you’re fine-tuning a model to reason about efficiency techniques.\n\nHappy to take suggestions on gaps or answer questions about how the pipeline works.", "url": "https://wpnews.pro/news/dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv", "canonical_source": "https://discuss.huggingface.co/t/dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv-semantic-scholar-1-734-records-quality-scored-jsonl/176811#post_1", "published_at": "2026-06-15 09:16:09+00:00", "updated_at": "2026-06-15 09:18:06.426122+00:00", "lang": "en", "topics": ["large-language-models", "machine-learning", "ai-research", "ai-tools", "ai-infrastructure"], "entities": ["arXiv", "Semantic Scholar", "fineset-io/efficient-llm-papers", "LoRA", "FlashAttention", "Mamba",
"MoE", "KV-cache"], "alternates": {"html": "https://wpnews.pro/news/dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv", "markdown": "https://wpnews.pro/news/dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv.md", "text": "https://wpnews.pro/news/dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv.txt", "jsonld": "https://wpnews.pro/news/dataset-efficient-llm-papers-quantization-lora-moe-flashattention-from-arxiv.jsonld"}}