RLHF for PKMs


[ARTICLE · art-40447] src=gist.github.com ↗ pub=2026-06-21T11:54Z topic=machine-learning verified=true sentiment=· neutral


Marco Porcellato proposes applying Reinforcement Learning from Human Feedback (RLHF) to the knowledge graphs behind Personal Knowledge Management (PKM) tools, dynamically weighting nodes based on user interactions. The system, being developed for Matryca Brain, uses implicit signals like transclusion and focus mode, plus explicit upvotes/downvotes, to surface high-value ideas and decay obsolete ones. A reference implementation in Python demonstrates updating node weights and applying temporal decay.


Date: June 21, 2026 (Summer Solstice)

Status: Request for Comments / Conceptual Framework

Author: Marco Porcellato / MarcoPorcellato

Context: This architecture is currently being researched and developed for the core engine of Matryca Brain, but is hereby released to the open-source community to foster experimentation in the Personal and Business Knowledge Management (PKM and BKM) space.

Current-generation Personal Knowledge Management (PKM) tools like Obsidian, Logseq, and Roam Research are built on "flat" graphs. Every node (note or block) holds a static weight of 1.0. Search algorithms rely entirely on text frequency (BM25) or static semantic proximity (vector embeddings).

However, human memory and cognition do not work this way. Some ideas are foundational pillars; others are fleeting thoughts or deprecated drafts.

This document proposes a new paradigm: Applying Reinforcement Learning from Human Feedback (RLHF) to Personal Knowledge Graphs. By treating nodes like neurons and interactions as synaptic reinforcements, the PKM learns to surface high-value thoughts and naturally decay obsolete ones, without forcing the user to manually rate their notes.

In this proposed architecture, every block or node in the graph database receives a dynamic weight (salience) attribute.
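A minimal sketch of how such an attribute might be stored, assuming a SQLite-backed index. The `nodes` table and its `uuid`, `weight`, and `last_interacted_at` columns follow the names used in the reference logic later in this document; the `content` column, the helper names, and the default weight of 1.0 are illustrative assumptions.

```python
import sqlite3
import time


def create_node_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a hypothetical `nodes` table with a dynamic weight column."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS nodes (
            uuid TEXT PRIMARY KEY,
            content TEXT NOT NULL,              -- assumed column, for illustration
            weight REAL NOT NULL DEFAULT 1.0,   -- salience; starts at the flat baseline
            last_interacted_at REAL NOT NULL    -- Unix timestamp of the last interaction
        )
        """
    )
    return conn


def add_node(conn: sqlite3.Connection, uuid: str, content: str, now=None) -> None:
    """Insert a node; its weight falls back to the column default."""
    conn.execute(
        "INSERT INTO nodes (uuid, content, last_interacted_at) VALUES (?, ?, ?)",
        (uuid, content, time.time() if now is None else now),
    )


conn = create_node_store()
add_node(conn, "node-a", "Spaced repetition notes", now=0.0)
initial_weight = conn.execute(
    "SELECT weight FROM nodes WHERE uuid = ?", ("node-a",)
).fetchone()[0]
```

Every node starts at the same weight a flat graph would give it, so an index with no recorded interactions ranks exactly as it did before.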

The system passively listens to the user's natural workflow to increase node weights:

- Transclusion / Block References: If Node A is embedded into Node B (e.g., ((uuid))), it strongly signals that Node A is foundational. (+0.5 weight)
- Focus Mode / Zooming: Clicking into a block to isolate its sub-tree indicates active work/review. (+0.1 weight)
- AI Context Usage: A block that is repeatedly selected to feed the context window of a local LLM agent gains weight each time it is included. (+0.2 weight)

Other signals lower a node's weight:

- Search Scroll-past: If a user searches "Machine Learning", the system returns 5 results, and the user clicks the 4th result, the unclicked top 3 results receive a micro-penalty for that specific context.
- Temporal Decay (Forgetting Curve): Nodes that have not been read, modified, or linked in X months undergo a logarithmic weight decay, loosely mimicking the forgetting curve of human memory. They are never deleted, but they sink to the bottom of global searches.
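The scroll-past signal is not covered by the reference logic further down, so here is one way it could be sketched. The penalty magnitude, the in-memory `context_penalties` store, and the function name are all assumptions; the proposal only specifies a "micro-penalty" scoped to the search context.

```python
from collections import defaultdict

SCROLL_PAST_PENALTY = 0.02  # assumed magnitude; the proposal only says "micro-penalty"

# (normalized query, node uuid) -> accumulated penalty for that query context only
context_penalties = defaultdict(float)


def record_search_click(query: str, ranked_uuids: list, clicked_uuid: str) -> list:
    """Penalize results ranked above the clicked one, scoped to this query."""
    if clicked_uuid not in ranked_uuids:
        return []
    context = query.strip().lower()
    # Everything listed before the clicked result was seen and skipped.
    skipped = ranked_uuids[: ranked_uuids.index(clicked_uuid)]
    for uuid in skipped:
        context_penalties[(context, uuid)] += SCROLL_PAST_PENALTY
    return skipped


skipped = record_search_click(
    "Machine Learning", ["n1", "n2", "n3", "n4", "n5"], clicked_uuid="n4"
)
```

Keying the penalty on the query rather than subtracting it from the node's global weight keeps a skipped result from being demoted in unrelated searches, which is one reading of "for that specific context".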

Similar to LLM outputs, users can explicitly upvote/downvote specific blocks in the UI, marking them as "Core" or "Deprecated/Erratum", instantly altering the retrieval heuristic.

To ground this concept, here is the reference implementation logic we use when applying this to a generic Graph Database or relational index.

import math
import time


class KnowledgeGraphRLHF:
    # Reward applied to a node's weight for each interaction type.
    REWARDS = {
        "transclusion": 0.50,
        "focus_zoom": 0.10,
        "ai_context_inclusion": 0.20,
        "explicit_upvote": 1.00,
        "explicit_downvote": -2.00,
    }

    def __init__(self, db_connection):
        # Assumes a DB-API style connection (e.g. sqlite3) with "?" placeholders.
        self.db = db_connection  # Generic Graph or SQLite FTS bridge

    def apply_interaction_reward(self, node_uuid: str, interaction_type: str):
        """Updates the synaptic weight of a node based on implicit RLHF."""
        reward = self.REWARDS.get(interaction_type)
        if reward is None:
            return  # Unknown interaction types are ignored.

        query = """
            UPDATE nodes
            SET weight = weight + ?, last_interacted_at = ?
            WHERE uuid = ?
        """
        # The caller is responsible for committing the transaction.
        self.db.execute(query, (reward, time.time(), node_uuid))

    def calculate_decay(self, node_uuid: str, current_time: float) -> float:
        """Returns the node's weight after logarithmic temporal decay (read-only)."""
        row = self.db.execute(
            "SELECT weight, last_interacted_at FROM nodes WHERE uuid = ?",
            (node_uuid,),
        ).fetchone()
        if row is None:
            raise KeyError(f"Unknown node: {node_uuid}")
        weight, last_interacted_at = row
        days_since_interaction = (current_time - last_interacted_at) / 86400

        if days_since_interaction < 30:
            return weight  # 30-day grace period: no decay yet.

        decay_factor = math.log10(days_since_interaction / 30 + 1)
        # Floor at 0.1, but never raise a weight that is already below the
        # floor (e.g. after an explicit downvote).
        return min(weight, max(0.1, weight - decay_factor))
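The decay arithmetic can be checked in isolation. The standalone sketch below repeats the same formula (a 30-day grace period, then `log10(days / 30 + 1)` subtracted, with a floor of 0.1) in a hypothetical helper named `decayed_weight`. It adds one guard that is an assumption rather than part of the proposal: decay never raises a weight that is already below the floor.

```python
import math

GRACE_DAYS = 30      # no decay before this many idle days
WEIGHT_FLOOR = 0.1   # decay never pushes a weight below this value


def decayed_weight(weight: float, idle_days: float) -> float:
    """Return the weight after logarithmic decay for a given idle period."""
    if idle_days < GRACE_DAYS:
        return weight
    decay = math.log10(idle_days / GRACE_DAYS + 1)
    # Assumed guard: a weight already under the floor is left unchanged.
    return min(weight, max(WEIGHT_FLOOR, weight - decay))


# Print the decayed weight of a node that started at 2.0.
for days in (10, 30, 270, 2970):
    print(days, round(decayed_weight(2.0, days), 3))
```

For a node with weight 2.0, this gives no change at 10 idle days, roughly 1.699 at 30 days, 1.0 at 270 days, and the 0.1 floor at 2970 days. The logarithm makes the first months of neglect cost the most, while long-dormant nodes lose weight ever more slowly.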

When querying the knowledge base (e.g., via Cmd+K global search), the ranking algorithm is no longer pure text matching. It becomes a composite score:

Final_Score = (BM25_Text_Rank OR Vector_Similarity) * Node_Weight

This ensures that heavily used paradigms naturally float to the top of the user's workflow.
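A small sketch of the composite score in use. The `Candidate` type and `rank` function are hypothetical names, and the sketch assumes the relevance value (BM25 rank or vector similarity) has already been normalized so that it is non-negative and higher means a better match.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    uuid: str
    relevance: float  # BM25 rank or vector similarity, assumed higher-is-better
    weight: float     # dynamic node weight (salience)


def rank(candidates: list) -> list:
    """Order candidates by relevance * weight, highest composite score first."""
    return sorted(candidates, key=lambda c: c.relevance * c.weight, reverse=True)


results = rank([
    Candidate("draft", relevance=0.9, weight=0.1),
    Candidate("pillar", relevance=0.6, weight=2.5),
    Candidate("note", relevance=0.7, weight=1.0),
])
order = [c.uuid for c in results]
```

Here the best textual match ("draft") ends up last because its weight has decayed to the floor, while the heavily reinforced "pillar" node ranks first despite a weaker match. Some engines return scores where lower is better (SQLite FTS5's `bm25()` is one example), so those would need converting before the multiplication. Negative weights, which an explicit downvote can produce, would also need clamping or separate handling.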

This conceptual framework, architecture, and accompanying pseudo-code are released under the Apache License 2.0. My goal is to push the boundaries of the PKM ecosystem. You are free to implement this Implicit RLHF paradigm in your own tools, Obsidian plugins, Logseq forks, or standalone applications.

Requirement (NOTICE): If you incorporate this architecture or core logic into your software, the Apache 2.0 license requires you to include the accompanying NOTICE file in your repository, providing clear attribution to this original RFC and referencing its origin from the Matryca Brain research initiative.


Read original on gist.github.com → gist.github.com/MarcoPorcellato/9e5226408c56048b…
