Hyperdimensional computing for structured querying on tabular data embeddings

Source: arxiv.org · Published: 2026-06-15T04:00Z · Topic: machine-learning

Researchers propose using hyperdimensional computing (HDC) for structured querying on tabular data embeddings, addressing the lack of interpretable similarity scores in current methods. The HDC approach, based on Holographic Reduced Representations, enables principled threshold setting for zero-match detection and outperforms a graph-based baseline on row retrieval tasks.

1 min read · Published Jun 15, 2026

arXiv:2606.13871v1

Abstract: Tabular data embeddings have become a cornerstone of data profiling and data integration pipelines, enabling tasks such as entity annotation and resolution; schema matching; column type detection; and table search, among others. Existing approaches embed rows, columns, or entire tables into a vector space and rely on nearest-neighbor search to retrieve candidate matches. A fundamental limitation of current embedding methods is the lack of interpretable similarity scores: the concrete similarity value between a query and its nearest neighbour carries no intrinsic meaning, making it impossible to determine whether that neighbour is a true match or simply the least-dissimilar item in a corpus that contains no valid answer. This inability to set principled thresholds for retrieval undermines practical deployment, particularly for zero-match detection.

We investigate the use of HyperDimensional Computing (HDC), specifically the Holographic Reduced Representations (HRR) model, as a framework for tabular row embeddings when the retrieval task corresponds to answering structured select-project queries in vector space. Exploiting the algebraic properties of HDC operations, we derive closed-form expected similarity values for both equality and non-equality retrieval predicates, which converge to interpretable values as dimensionality increases, and use these to identify suitable retrieval thresholds.

We evaluate HDC against EmbDI, a graph-based baseline, on two real-world datasets across varying table sizes and predicate lengths. Our results show that HDC matches or outperforms EmbDI for row retrieval across all configurations, handles non-equality predicates more robustly, and achieves perfect attribute projection accuracy at sufficient dimensionality -- while uniquely enabling reliable identification of zero-match predicates through its principled thresholds.
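To make the select-project idea concrete, here is a minimal sketch of HRR-style row encoding in plain Python. It is not the paper's implementation: the role-filler encoding, the names `vec`, `encode_row`, `select_score` and `project`, the dimensionality `DIM` and the cut-off `THRESHOLD` are all assumptions made for illustration. The sketch relies only on standard HRR facts: binding is circular convolution, unbinding convolves with the involution of the role vector, and the cosine between independent random vectors concentrates around 0 with a spread on the order of 1/sqrt(d).

```python
import math
import random

DIM = 512          # hypothetical dimensionality, chosen only for this demo
THRESHOLD = 0.35   # illustrative cut-off, NOT the threshold derived in the paper

_rng = random.Random(0)
_codebook = {}

def vec(symbol):
    """Return a fixed random HRR vector (i.i.d. N(0, 1/DIM) elements) per symbol."""
    if symbol not in _codebook:
        sd = 1.0 / math.sqrt(DIM)
        _codebook[symbol] = [_rng.gauss(0.0, sd) for _ in range(DIM)]
    return _codebook[symbol]

def bind(a, b):
    """Circular convolution: c[k] = sum_j a[j] * b[(k - j) mod d]."""
    d = len(a)
    return [sum(a[j] * b[(k - j) % d] for j in range(d)) for k in range(d)]

def unbind(c, a):
    """Approximate inverse of bind: convolve with the involution a*[k] = a[-k mod d]."""
    d = len(a)
    return bind(c, [a[(-k) % d] for k in range(d)])

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

def encode_row(row):
    """Bundle (element-wise sum) one role-filler binding per attribute."""
    pairs = [bind(vec("attr:" + k), vec("val:" + v)) for k, v in row.items()]
    return [sum(xs) for xs in zip(*pairs)]

def select_score(row_vec, attr, value):
    """Similarity between a row and the probe for the predicate attr = value."""
    return cosine(row_vec, bind(vec("attr:" + attr), vec("val:" + value)))

def project(row_vec, attr, candidates):
    """Unbind the attribute role and return the closest candidate value."""
    noisy = unbind(row_vec, vec("attr:" + attr))
    return max(candidates, key=lambda v: cosine(noisy, vec("val:" + v)))

row_vec = encode_row({"city": "Paris", "country": "France"})
paris_score = select_score(row_vec, "city", "Paris")    # predicate holds
berlin_score = select_score(row_vec, "city", "Berlin")  # zero-match predicate
print(paris_score > THRESHOLD, berlin_score > THRESHOLD)
print(project(row_vec, "city", ["Paris", "Berlin", "France"]))
```

The point of the sketch is that the two scores land in clearly separated ranges, so a fixed cut-off can reject a predicate that matches nothing instead of returning the least-dissimilar row. The quadratic-time convolution keeps the example dependency-free; a practical version would likely use an FFT.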

source & further reading

Read original on arxiv.org → arxiv.org/abs/2606.13871
