[Dataset] Efficient LLM papers (quantization, LoRA, MoE, FlashAttention) from arXiv + Semantic Scholar — 1,734 records, quality-scored, JSONL


Source: discuss.huggingface.co · Published: 2026-06-15T09:16Z · Topic: large-language-models

A new dataset, fineset-io/efficient-llm-papers, compiles 1,734 records of arXiv and Semantic Scholar papers on efficient LLM techniques like quantization, LoRA, MoE, and FlashAttention, each quality-scored in JSONL format. The dataset aims to serve as a reference for state-of-the-art efficiency methods and a clean corpus for fine-tuning models to reason about these techniques.

Most of us aren’t training frontier models; we’re trying to fit a good one onto the hardware we actually have. The research that makes that possible (quantization, LoRA/PEFT, mixture-of-experts, FlashAttention, KV-cache tricks, Mamba/SSMs) is scattered across hundreds of arXiv papers, and it’s some of the fastest-moving work in ML right now.

So I assembled it into one dataset: fineset-io/efficient-llm-papers. I find it useful as a “what’s the current state of the art for making this cheaper” reference, and as a clean corpus if you’re fine-tuning a model to reason about efficiency techniques.
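
For anyone who wants to work with the records locally, a minimal sketch of reading JSONL and keeping only the higher-scored entries might look like the following. The field name `quality_score`, the threshold, and the sample rows are assumptions for illustration only; the actual schema should be checked on the dataset card.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_by_quality(records: Iterable[dict], min_score: float,
                      score_field: str = "quality_score") -> list[dict]:
    """Keep records whose score field is numeric and at least min_score.

    "quality_score" is a hypothetical field name, not confirmed by the source.
    Records without the field are dropped rather than guessed at.
    """
    kept = []
    for rec in records:
        score = rec.get(score_field)
        if isinstance(score, (int, float)) and score >= min_score:
            kept.append(rec)
    return kept


if __name__ == "__main__":
    # Made-up sample rows, not taken from the dataset.
    sample = [
        '{"title": "Paper A", "quality_score": 0.9}',
        '',
        '{"title": "Paper B", "quality_score": 0.4}',
        '{"title": "Paper C"}',
    ]
    for rec in filter_by_quality(iter_jsonl(sample), min_score=0.5):
        print(rec["title"])
```

With a downloaded file, the same functions accept an open file handle in place of the sample list, since iterating a text file yields one line at a time.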

Happy to take suggestions on gaps or answer questions about how the pipeline works.


Read original on discuss.huggingface.co → discuss.huggingface.co/t/dataset-efficient-llm-p…
