Months of self-testing: Citations shine, other features remain unproven.


[ARTICLE · art-13970] src=dev.to ↗ pub=2026-05-26T01:38Z topic=ai-products verified=true sentiment=↑ positive


A developer built UUMuse, a PDF workspace tool that indexes documents and lets users ask questions and get answers with verifiable citations across multiple AI models. After months of self-testing, the developer reports that clickable citations and cross-model memory work well, while first-time onboarding still feels clunky and the model can still sound confident when retrieval misses. The tool is aimed at students, researchers, PMs, and legal professionals who need to trust document-based answers and reuse context across GPT, Claude, DeepSeek, and other models.

4 min read · published May 26, 2026

ok so full disclosure: I'm the person who built this. I'm not here to drop a pitch deck on you — I just need people who actually live in PDF hell to tell me if I'm solving a real problem or cosplaying as a startup.

how I used to work (embarrassing)

my "system" was: files in Drive → open ChatGPT → paste half a PDF → pray → switch to Claude → paste again → forget which paragraph the model hallucinated from.

NotebookLM was the first time I went "oh it actually read the doc." but I'd still end up with a chat thread that's useless outside that window. couldn't hand it to a teammate. couldn't put it on a landing page. couldn't switch models without feeling like I'm starting over.

that's basically why I started building UUMuse. yes that's the name. yes my friends say it sounds like a muse who uwu's. moving on.

what it is, if I explain it like I'm drunk at a meetup

you throw files in a workspace. it indexes them. you ask questions. answers try to show little [1][2] things so you can click and be like "ok yeah that's actually on page 4" instead of vibes-based trust.
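for the curious, the [1][2] loop can be sketched in a few lines. this is a minimal illustration, not UUMuse's actual code: the `Chunk` type, the function names, and the sample lease text are all made up for the example. the idea is to number the retrieved chunks in the prompt, then map each marker in the answer back to a document and page, dropping any marker that points at nothing.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrieved passage (hypothetical shape, for illustration only)."""
    doc: str
    page: int
    text: str


def build_prompt_context(chunks):
    """Number each retrieved chunk so the model can cite it as [n]."""
    return "\n".join(f"[{i}] {c.text}" for i, c in enumerate(chunks, start=1))


def resolve_citations(answer, chunks):
    """Map every [n] marker in the answer back to its source chunk.

    Markers outside the retrieved set are dropped rather than guessed.
    """
    resolved = {}
    for match in re.finditer(r"\[(\d+)\]", answer):
        n = int(match.group(1))
        if 1 <= n <= len(chunks):
            resolved[n] = chunks[n - 1]
    return resolved


# Made-up sample data.
chunks = [
    Chunk("lease.pdf", 4, "The tenant may terminate with 60 days notice."),
    Chunk("lease.pdf", 9, "Late rent incurs a 5% fee."),
]
answer = "You can leave with 60 days notice [1]. Another claim cites [7]."
cited = resolve_citations(answer, chunks)
for n, chunk in cited.items():
    print(f"[{n}] -> {chunk.doc}, page {chunk.page}")
```

the `[7]` marker has no matching chunk, so it never becomes a clickable citation. that is the cheap part of "verifiable": a citation either resolves to a real page or it does not show up.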

you can bounce between GPT / Claude / DeepSeek / whatever without re-uploading the same stupid 80-page PDF. there's memory that carries over: it learned I hate long intros and prefer bullets. you can also go into settings and delete a memory when it confidently learns the wrong thing about you (ask me how I know).
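one way to picture the memory part, as a rough sketch under assumptions (the `MemoryStore` class, `build_request`, and the model names are hypothetical placeholders, not UUMuse's implementation or any provider's API): memory lives outside the chat thread, every request is built from it, and deleting an entry removes it from every model at once.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class MemoryStore:
    """Provider-independent memory entries, each deletable by id."""
    _entries: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def remember(self, note):
        """Store a note and return its id."""
        entry_id = next(self._ids)
        self._entries[entry_id] = note
        return entry_id

    def forget(self, entry_id):
        """Delete one entry; returns True if it existed."""
        return self._entries.pop(entry_id, None) is not None

    def as_system_prompt(self):
        """Render the surviving entries as a preamble for any model."""
        if not self._entries:
            return ""
        lines = [f"- {note}" for note in self._entries.values()]
        return "Known user preferences:\n" + "\n".join(lines)


def build_request(model, memory, question):
    """Same memory and question for every provider; only the model changes."""
    return {"model": model, "system": memory.as_system_prompt(), "user": question}


memory = MemoryStore()
good = memory.remember("prefers concise bullet answers")
wrong = memory.remember("works as a lawyer")  # a confidently wrong guess
memory.forget(wrong)

# Placeholder model names, not real identifiers.
req_a = build_request("model-a", memory, "Summarize section 3")
req_b = build_request("model-b", memory, "Summarize section 3")
print(req_a["system"])
```

because the preamble is rebuilt on every request, switching models mid-conversation does not lose anything, and a deleted memory cannot leak back in through an old thread.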

there's agent mode when I'm lazy — "summarize this folder and write me a memo" — and there's this multi-expert debate mode (Spark) that I built because I have decision paralysis and wanted AI to argue with itself using my files. ngl that's probably a me problem not a market problem.

how it actually feels to use day to day

the first "oh shit" moment for me was boring: I uploaded a messy PDF, asked something specific, clicked [1], and the highlight was actually the right paragraph. I know that sounds like table stakes in 2026 but emotionally it was the first time I stopped copy-pasting quotes back into the chat to fact-check.

second moment: I switched models mid-conversation and it didn't gaslight me with "I don't have access to your files." small thing. huge relief.

third: memory. I told it once I want concise answers and it… mostly listened. when it didn't, I deleted the memory entry instead of fighting the model in chat like it's an ex.

where it still feels clunky for me:

first-time onboarding tries to be helpful and sometimes feels like homework. I'm trimming it again for the 800th time.

when retrieval misses, the model can still sound confident. we're better than embed-and-pray v1 but it's not magic.

I definitely built "publish a docs page / embed widget / MCP hook" before I had ten humans who loved plain Q&A. classic side project disease. if you only try one thing, try upload → ask → click a citation. ignore the rest until that loop feels good.
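on the retrieval-miss point above: one common guard is to refuse to answer when nothing scores above a similarity threshold, instead of letting the model improvise. the sketch below is an assumption-heavy toy, not how UUMuse works: the two-dimensional vectors stand in for real embeddings, and the `min_score` of 0.75 is an arbitrary number that would need tuning on real data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, indexed, min_score=0.75, top_k=3):
    """Return up to top_k (score, text) pairs that clear min_score."""
    scored = sorted(
        ((cosine(query_vec, vec), text) for text, vec in indexed),
        reverse=True,
    )
    return [(score, text) for score, text in scored[:top_k] if score >= min_score]


def answer_or_abstain(query_vec, indexed):
    """Abstain when retrieval finds nothing strong enough to cite."""
    hits = retrieve(query_vec, indexed)
    if not hits:
        return "I couldn't find this in your files."
    return "Answer grounded in: " + hits[0][1]


# Toy index: (text, embedding) pairs with made-up 2-d vectors.
indexed = [
    ("refund policy paragraph", [1.0, 0.0]),
    ("shipping times paragraph", [0.0, 1.0]),
]

print(answer_or_abstain([0.9, 0.1], indexed))  # close to the refund chunk
print(answer_or_abstain([0.7, 0.7], indexed))  # equally far from both chunks
```

a threshold does not fix bad retrieval; it only turns some confident wrong answers into honest "not found" ones, at the cost of occasionally abstaining when the answer was there.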

who I think it's for (not "everyone")

if you're a student/researcher/PM/legal-ish person with a pile of docs and you don't want to rebuild context every time you change models — you'll probably get it.

if you want a clean notes app, use Obsidian. if you want Google's polished doc chat, use NotebookLM. I'm not trying to win that fight on day one.

I'm trying to win "I trust this enough to act on it" + "I can reuse the same brain elsewhere later."

where we're at

early. like… there's no user count I can flex. Product Hunt soon-ish. I'm posting here because my last attempt read like an ad and got nuked (fair).

built with

Next.js · FastAPI · Postgres/pgvector · Redis/Celery · Stripe · Docker. AI through the usual suspects (OpenAI/Anthropic/DeepSeek/Google). happy to yap about RAG failures in comments.

what I actually want from you

do you still paste docs into chat or did you fix your life with something else?

are citations a must for you or do you not care?

if you bounced off a tool like this — what was the dealbreaker?

if you want the link I'll drop it in a comment — trying not to lead with URL and get flagged again.

thanks for reading. roast welcome. compliments suspicious.


Read original on dev.to → dev.to/owjdie163com_096e40b198/months-of-self-te…
