
RLHF for PKMs

Marco Porcellato proposes applying Reinforcement Learning from Human Feedback (RLHF) to Personal Knowledge Management (PKM) graphs, weighting nodes dynamically based on user interactions. The system, being developed for Matryca Brain, uses implicit signals such as transclusion and focus mode, plus explicit upvotes and downvotes, to surface high-value ideas and decay obsolete ones. A reference implementation in Python shows how node weights are updated and how temporal decay is computed.


Date: June 21, 2026 (Summer Solstice)
Status: Request for Comments / Conceptual Framework
Author: Marco Porcellato (MarcoPorcellato)
Context: This architecture is currently being researched and developed for the core engine of Matryca Brain, but is hereby released to the open-source community to foster experimentation in the Personal and Business Knowledge Management (PKM and BKM) space.

Current-generation Personal Knowledge Management (PKM) tools like Obsidian, Logseq, and Roam Research are built on "flat" graphs. Every node (note or block) holds a static weight of 1.0. Search algorithms rely entirely on text frequency (BM25) or static semantic proximity (Vector Embeddings).

However, human memory and cognition do not work this way. Some ideas are foundational pillars; others are fleeting thoughts or deprecated drafts.

This document proposes a new paradigm: Applying Reinforcement Learning from Human Feedback (RLHF) to Personal Knowledge Graphs. By treating nodes like neurons and interactions as synaptic reinforcements, the PKM learns to surface high-value thoughts and naturally decay obsolete ones, without forcing the user to manually rate their notes.

In this proposed architecture, every block or node in the graph database receives a dynamic weight (salience) attribute.
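As a rough illustration of what a per-node weight attribute could look like in storage, the sketch below uses SQLite. The table and column names (`nodes`, `uuid`, `weight`, `last_interacted_at`) follow the reference implementation in this document; the `content` column, the default of 1.0, and the helper names `create_store` and `add_node` are assumptions made for the example.

```python
import sqlite3
import time

# Hypothetical schema. Only nodes/uuid/weight/last_interacted_at appear in the
# reference code; the content column and the 1.0 default are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    uuid               TEXT PRIMARY KEY,
    content            TEXT NOT NULL,
    weight             REAL NOT NULL DEFAULT 1.0,
    last_interacted_at REAL NOT NULL
)
"""


def create_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the nodes table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_node(conn: sqlite3.Connection, uuid: str, content: str, now=None) -> None:
    """Insert a node; it starts at the default weight of 1.0."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO nodes (uuid, content, last_interacted_at) VALUES (?, ?, ?)",
        (uuid, content, now),
    )


conn = create_store()
add_node(conn, "node-a", "Spaced repetition notes")
start_weight = conn.execute(
    "SELECT weight FROM nodes WHERE uuid = ?", ("node-a",)
).fetchone()[0]
print(start_weight)  # 1.0
```

Starting every node at 1.0 keeps a fresh graph equivalent to today's "flat" behaviour until interactions begin to move the weights.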

The system passively listens to the user's natural workflow to increase node weights:

Transclusion / Block References: If Node A is embedded into Node B (e.g., ((uuid))), it strongly signals that Node A is foundational. (+0.5 weight)

Focus Mode / Zooming: Clicking into a block to isolate its sub-tree indicates active work or review. (+0.1 weight)

AI Context Usage: A block is repeatedly selected to feed the context window of a local LLM agent. (+0.2 weight)

Two implicit signals lower weights instead:

Search Scroll-past: If a user searches "Machine Learning", the system returns 5 results, and the user clicks the 4th result, the unclicked top 3 results receive a micro-penalty for that specific context.

Temporal Decay (Forgetting Curve): Nodes that have not been read, modified, or linked in X months undergo a logarithmic weight decay, mimicking the time-based fading described by the human forgetting curve. They are never deleted, but they sink to the bottom of global searches.
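One way to read "for that specific context" is that the scroll-past penalty is stored per query rather than subtracted from a node's global weight. The sketch below takes that interpretation. The `ContextPenalties` class, its method names, and the penalty size of 0.05 are all assumptions; the RFC only says "micro-penalty".

```python
from collections import defaultdict

# Assumed magnitude; the RFC does not give a number for the micro-penalty.
SCROLL_PAST_PENALTY = 0.05


class ContextPenalties:
    """Per-query penalties, kept separate from a node's global weight."""

    def __init__(self):
        # (normalised query, node uuid) -> accumulated penalty
        self._penalties = defaultdict(float)

    @staticmethod
    def _key(query: str) -> str:
        return query.strip().lower()

    def record_click(self, query: str, ranked_uuids: list, clicked_uuid: str) -> None:
        """Penalise every result that was ranked above the clicked one."""
        clicked_rank = ranked_uuids.index(clicked_uuid)
        for skipped in ranked_uuids[:clicked_rank]:
            self._penalties[(self._key(query), skipped)] += SCROLL_PAST_PENALTY

    def penalty(self, query: str, uuid: str) -> float:
        """Accumulated penalty for this node under this query (0.0 if none)."""
        return self._penalties.get((self._key(query), uuid), 0.0)


penalties = ContextPenalties()
results = ["n1", "n2", "n3", "n4", "n5"]
penalties.record_click("Machine Learning", results, clicked_uuid="n4")

print(penalties.penalty("machine learning", "n1"))  # 0.05
print(penalties.penalty("machine learning", "n4"))  # 0.0
```

Results ranked below the click (here `n5`) are left alone, since the user may never have seen them.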

Similar to LLM outputs, users can explicitly upvote/downvote specific blocks in the UI, marking them as "Core" or "Deprecated/Erratum", instantly altering the retrieval heuristic.

To ground this concept, here is the reference implementation logic we use when applying this to a generic Graph Database or relational index.

import time
import math

class KnowledgeGraphRLHF:
    def __init__(self, db_connection):
        self.db = db_connection  # Generic Graph or SQLite FTS bridge

    def apply_interaction_reward(self, node_uuid: str, interaction_type: str):
        """Updates the synaptic weight of a node based on implicit RLHF."""
        rewards = {
            "transclusion": 0.50,
            "focus_zoom": 0.10,
            "ai_context_inclusion": 0.20,
            "explicit_upvote": 1.00,
            "explicit_downvote": -2.00
        }
        
        reward = rewards.get(interaction_type, 0.0)
        if reward == 0:
            return
            
        query = """
            UPDATE nodes 
            SET weight = weight + ?, last_interacted_at = ?
            WHERE uuid = ?
        """
        self.db.execute(query, (reward, time.time(), node_uuid))

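    def calculate_decay(self, node_uuid: str, current_time: float) -> float:
        """Returns the node's weight after logarithmic temporal decay (does not write it back)."""
        # Assumes a sqlite3-style connection whose execute() returns a cursor.
        row = self.db.execute(
            "SELECT weight, last_interacted_at FROM nodes WHERE uuid = ?", (node_uuid,)
        ).fetchone()
        if row is None:
            raise KeyError(f"Unknown node: {node_uuid}")
        weight, last_interacted_at = row
        days_since_interaction = (current_time - last_interacted_at) / 86400

        # Grace period: no decay during the first 30 days of inactivity.
        if days_since_interaction < 30:
            return weight

        decay_factor = math.log10(days_since_interaction / 30 + 1)
        # Floor at 0.1 so decayed nodes sink in rankings but are never zeroed out.
        return max(0.1, weight - decay_factor)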
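To get a feel for the decay curve, the formula can be pulled out into a pure function and evaluated at a few idle periods. The constants 30 and 0.1 come from the reference code above; the function name `decayed_weight` and the sample durations are chosen only for illustration.

```python
import math

GRACE_DAYS = 30     # from the reference code: no decay before 30 idle days
WEIGHT_FLOOR = 0.1  # from the reference code: weights never drop below 0.1


def decayed_weight(weight: float, days_idle: float) -> float:
    """Same arithmetic as calculate_decay, without the database lookup."""
    if days_idle < GRACE_DAYS:
        return weight
    decay = math.log10(days_idle / GRACE_DAYS + 1)
    return max(WEIGHT_FLOOR, weight - decay)


# A default-weight node (1.0) left untouched for increasing periods.
for days in (10, 30, 90, 270):
    print(days, round(decayed_weight(1.0, days), 3))
# 10 1.0
# 30 0.699
# 90 0.398
# 270 0.1
```

Under these constants the weight steps down by log10(2) ≈ 0.30 as soon as the grace period ends, and a node that never rose above the default reaches the floor after roughly nine months of inactivity. Heavily reinforced nodes last longer because the same decay is subtracted from a larger starting weight.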

When querying the knowledge base (e.g., via Cmd+K global search), the ranking algorithm is no longer pure text matching. It becomes a composite score:

Final_Score = (BM25_Text_Rank OR Vector_Similarity) * Node_Weight

This ensures that heavily used paradigms naturally float to the top of the user's workflow.
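The composite score can be sketched as a small re-ranking step applied to whatever the text or vector search returns. The `Candidate` type and `rerank` function are hypothetical names, and two assumptions are baked in: relevance is normalised so that higher means better (raw scores from some engines, such as SQLite FTS5's `bm25()`, are smaller for better matches and would need flipping first), and negative weights are clamped to zero so that a downvoted node cannot invert the ordering.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    uuid: str
    relevance: float  # BM25 or vector similarity, assumed "higher is better"
    weight: float     # node weight (salience)


def final_score(candidate: Candidate) -> float:
    """Final_Score = relevance * weight, with negative weights clamped to 0."""
    # The clamp is an assumption: the reference rewards allow weights below
    # zero via downvotes, which would otherwise flip the sign of the product.
    return candidate.relevance * max(candidate.weight, 0.0)


def rerank(candidates: list) -> list:
    """Return candidates sorted by composite score, best first."""
    return sorted(candidates, key=final_score, reverse=True)


hits = [
    Candidate("draft-note", relevance=0.90, weight=0.1),
    Candidate("core-idea", relevance=0.60, weight=2.5),
    Candidate("plain-note", relevance=0.70, weight=1.0),
]
ranked = [c.uuid for c in rerank(hits)]
print(ranked)  # ['core-idea', 'plain-note', 'draft-note']
```

In this toy example the best textual match (`draft-note`) drops to last place because its weight has decayed to the floor, while a frequently reinforced node overtakes it despite a weaker match.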

This conceptual framework, architecture, and accompanying pseudo-code are released under the Apache License 2.0. My goal is to push the boundaries of the PKM ecosystem. You are free to implement this Implicit RLHF paradigm in your own tools, Obsidian plugins, Logseq forks, or standalone applications.

Requirement (NOTICE): If you incorporate this architecture or core logic into your software, the Apache 2.0 license requires you to include the accompanying NOTICE file in your repository, providing clear attribution to this original RFC and referencing its origin from the Matryca Brain research initiative.
