TruthfulQA

mentions: 6 · type: Organization

// recent coverage 6 mentions

08:25

2026-07-11

machinebrief.com

artificial-intelligence

Uncertainty: A Breakthrough in Neural Network Prediction

Researchers have developed a lightweight method for quantifying uncertainty in neural network predictions using two key approximations: a first-order Taylor expansion and an isotropy assumption. The a…

04:00

2026-07-09

arxiv.org

large-language-models

Comprehensive Evaluation of Large Language Model Responses: A Multi-Factor Scoring System

Researchers introduced a multi-factor scoring system to evaluate large language model responses, assessing accuracy, conciseness, factual consistency, readability, and coherence. Tests on the Truthful…

09:44

2026-06-14

lesswrong.com

ai-safety

I Bet Abliteration's Cost Was Sloppy Implementation. I Was Wrong

A researcher found that a clean implementation of abliteration on Qwen3.5-27B costs only about 1.4 TruthfulQA points, far less than the 5.5+ points lost by HuiHui AI's crude method, confirming that mo…

04:00

2026-06-03

arxiv.org

large-language-models

Hallucination Is Linearly Decodable from Mid-Layer Hidden States in Quantized LLMs

Researchers found that a linear probe applied to mid-layer hidden states of quantized large language models can detect hallucinations with up to 1.000 AUROC, significantly outperforming sampling-based…

04:00

2026-05-29

arxiv.org

large-language-models

SERC: LDPC-Inspired Semantic Error Correction for Retrieval-Augmented Generation

Researchers have developed SERC, a semantic error correction framework inspired by LDPC codes that treats LLM text generation as a noisy communication channel to detect and fix hallucinations. The tra…

04:00

2026-05-29

arxiv.org

large-language-models

MechELK: A Mechanistic Interpretability Framework for Eliciting Latent Knowledge in Large Language Models

Researchers have developed MechELK, a three-stage framework that uses mechanistic interpretability to extract hidden factual and reasoning knowledge from large language models. The framework, which co…

// co-occurs with top 8 entities

SERC (1), LDPC (1), LongForm Bio (1), Llama-3-8B (1), Qwen2.5-14B (1), HuiHui AI (1), Qwen3.5-27B (1), Arditi (1)