Our email inboxes carry multiple decades of messages (100K-500K). This is a good proxy for all the important things that happened in your life, the projects you have done and the people that you have connected with. With the chronological view of messages in the inbox, these details remain hidden. What if we could turn this archive into a personal wiki that you can search and curate? That is Memento.
For the information architecture of such a wiki, Memento takes an opinionated view of creating four high level dimensions - People (like a CRM view of your contacts), Projects (life events that are bounded by some start and end dates), Concepts (evergreen topics) and Newsletters. Inboxes group messages by sender address, so the same person shows up many times across work, personal, and alias addresses. In the People dimension, Memento resolves all of that into one canonical person using deterministic algorithms and brings the people who are related to that person through graph algorithms. With just these two, you get an already populated CRM from your life history. No LLMs involved. From here, you can further enrich each person's wiki page by adding additional personal notes and create a cohesive narrative using LLM. The same applies to other dimensions as well.
How can we query this wiki? This is where Memento uses an agentic search over this curated dataset. Your emails are stored in SQLite DB, indexed using FTS and vector embeddings and kept up-to-date, using another open-source project called Msgvault [1]. Memento extends this DB with its own tables to store the output of various algorithms - canonical people discovery, graph algorithms to find clusters of connections etc. For the agentic search to be effective, Memento exposes the underlying FTS, vector and graph data in a structured way to the agent as tools. The agent can further refine the search with additional tools like get-message-details, message-cluster etc. What we discovered is that the resulting search is much more powerful that the typical ‘Ask Gmail’ search exposed by Google.
Every factual claim exposed by Memento can be traced back to the real email it came from. Any additional notes added to the wiki are incorporated in the next generation. So Memento becomes richer and more personal to you over time.
This worked really well for our personal email archive and we were pleasantly surprised by the things that Memento was able to uncover. In order to demonstrate this capability at scale without exposing our private info, we connected Memento to the public Enron dataset that contains hundreds of mailboxes. With the SQLite store, Go backend and Next.js UI, Memento handles this 5 GB dataset with ease. Now you can query this archive and run agentic searches to re-discover the Enron scandal yourself - you can see the demo set here [2].
The app is a single binary that serves on localhost and treats your archive as read-only, and you can point it at any OpenAI-API-compatible LLM, whether that's a local model or a cloud one. It's open source, so you can inspect how your data is handled.
You can try it out today without connecting your archive - use the hosted demo [2] or download the GitHub release and run it with a synthetic local archive ./memento app --demo
[2] Hosted demo (Enron data): [https://memento-demo.latentsignal.org/home](https://memento-demo.latentsignal.org/home)
Demo video: [https://www.youtube.com/watch?v=Ms1KeAYCN2A](https://www.youtube.com/watch?v=Ms1KeAYCN2A)
Project home: [https://latentsignal.org/projects/memento](https://latentsignal.org/projects/memento)
GitHub: [https://github.com/latentsignal-org/memento](https://github.com/latentsignal-org/memento)
We are George and Ann, creators of Memento.
Comments URL: [https://news.ycombinator.com/item?id=48557937](https://news.ycombinator.com/item?id=48557937)
Points: 4