grep -l @llama /news/*.json | wc -l → 19

LLaMA

mentions: 19 · type: Organization

// recent coverage (19 mentions)

04:00

2026-07-29

arxiv.org

large-language-models

TimeCapsule: Generative Hallucination as a Method for Historical Sensemaking

A 1.2B-parameter LLaMA-style causal model called TimeCapsule, trained exclusively on Victorian texts from 1800-1875, achieves a 45.4% perplexity reduction over a GPT-2 baseline on held-out Victorian p…

04:00

2026-07-22

machinebrief.com

artificial-intelligence

For What Reason? Interpreting Models' Encoding of Causation and Antithesis

A new study from researchers using LLaMA and Mistral models finds that instruction-tuned Transformers encode discourse relations such as causation and antithesis asymmetrically across layers, with ear…

00:00

2026-07-17

dibi8.com

artificial-intelligence

FastChat — Build Your Own Open-Source ChatGPT Clone with LLM Chatbots

LMSYS FastChat is the most comprehensive open-source platform for building, serving, and evaluating LLM-powered chatbots in 2026, offering tools for training instruction-tuned models like Vicuna and A…

04:55

2026-07-16

thegustafson.com

large-language-models

Language Modeling as Next-Token Prediction

A language model assigns a probability to the next token given all previous tokens, a task that the chain rule shows is sufficient to capture any pattern in language. Claude Shannon demonstrated in 19…

18:09

2026-07-14

machinebrief.com

artificial-intelligence

AI Tutors: Bridging the Gap or Widening It?

A study auditing four AI language models as history tutors found that safety-aligned models blocked 76.7% of educational requests from students perceived as low-tier, and exhibited biases such as usin…

20:26

2026-07-10

machinebrief.com

large-language-models

ARCQuant: Redefining Efficiency in LLM Inference with NVFP4

ARCQuant, a new framework for Large Language Model inference, uses the NVFP4 numerical format to achieve up to 3x speedup on GPUs while maintaining accuracy comparable to full-precision baselines. The…

17:08

2026-07-10

machinebrief.com

artificial-intelligence

Breaking Down Long-Context Transformer Bottlenecks

Researchers have developed a new approach to overcome the quadratic cost of causal self-attention in long-context transformers, using state update design and structural interventions like sink tokens …

04:33

2026-07-08

arxiv.org

large-language-models

From Words to Watts: Benchmarking the Energy Costs of LLM Inference (2023)

Researchers benchmarked the energy costs of inference for Meta AI's LLaMA models on NVIDIA V100 and A100 GPUs across up to 32 GPUs, finding that inference energy consumption is significant and ofte…

10:52

2026-07-07

dev.to

machine-learning

LayerNorm vs BatchNorm: why Transformers normalize per token, not per batch

A developer explains the difference between LayerNorm and BatchNorm, showing that LayerNorm normalizes per token rather than per batch, which makes it ideal for Transformers. The post includes an inte…

00:00

2026-07-04

dev.to

large-language-models

Mastering Local Deployment of SOTA LLMs: Jamesob’s Guide to Overcoming Resource Constraints

Jamesob's guide provides developers with actionable strategies to deploy state-of-the-art large language models locally on consumer-grade hardware. The framework covers model quantization, pruning, ef…

08:05

2026-06-27

dev.to

artificial-intelligence

I Rented Out My GPU for Passive Income — Here's What Happened After My First Week

A developer earned $11.56 in the first week by renting out an idle RTX 3060 on Vast.ai, a GPU marketplace for AI compute. The card was used for LLM inference and training, with utilization varying fro…

08:27

2026-06-24

discuss.huggingface.co

large-language-models

🧠 I built a novel triple-hybrid LLM (Mamba + Attention + 32-expert MoE) from scratch for ~$50 — Titan v1 complete, Titan v2 first cycle done, expanding dataset now

A developer built a novel triple-hybrid LLM combining Mamba, Attention, and a 32-expert Mixture of Experts architecture from scratch for approximately $50, completing Titan v1 and the first training c…

01:14

2026-06-18

github.com

large-language-models

Rust port of transformers (1M lines of code)

TrustformeRS 0.1.1, a pure Rust port of Hugging Face Transformers with over 1.4 million lines of code, was released on April 25, 2026, delivering 49+ transformer architectures and up to 1.67x speedup …

00:00

2026-06-18

mindstudio.ai

large-language-models

How to Run Local AI Models with Ollama: A Beginner's Setup Guide for 2026

Ollama, an open-source tool for running large language models locally, offers a beginner-friendly setup for 2026 with privacy, cost savings, and data control. Users can install it on macOS, Windows, o…

22:18

2026-06-15

dev.to

developer-tools

Three GPU affiliate programs I wired into an AI tool directory

A developer integrated three GPU cloud affiliate programs—RunPod, Vast.ai, and Hetzner Cloud—into an AI tools directory after finding Amazon's conversion weak for developer-adjacent products. The mone…

04:00

2026-06-05

arxiv.org

large-language-models

LoRi: Low-Rank Distillation for Implicit Reasoning

Researchers have developed LoRi, a low-rank distillation framework that improves implicit reasoning in large language models by aligning teacher and student reasoning trajectories within a shared low-…

02:01

2026-05-30

dev.to

large-language-models

I Built a Q&A Bot for My Docs and Almost Gave Up (Here's What Worked)

A developer built a Retrieval-Augmented Generation (RAG) pipeline for a documentation Q&A bot after multiple failed attempts, including token limits, high costs, and hallucination issues with direct L…

07:23

2026-05-26

dev.to

generative-ai

Master RAG Systems: Build an End-to-End LangChain Pipeline with Milvus, Reranking & Azure OpenAI 🚀

A developer has built an end-to-end Retrieval-Augmented Generation (RAG) pipeline using LangChain, Milvus, reranking, and Azure OpenAI to reduce hallucination in large language models. The system retr…

15:19

2026-05-20

dev.to

large-language-models

What did gemma see? - Thinking in comments...

The Gemma 4 26B model was the first local AI to achieve a perfect score on the HumanEval benchmark, including solving the notoriously difficult problem 145. This problem requires sorting integers by t…

// co-occurs with (top 8 entities)

Qwen (6) · Mistral (5) · GPT-2 (4) · GPT-4 (3) · Hugging Face (2) · BERT (2) · NVIDIA (2) · Ollama (2)

// topics (top 6 topics)

large language models (18) · ai infrastructure (11) · artificial intelligence (10) · machine learning (8) · ai tools (7) · developer tools (7)