BioELX: Cross-lingual Biomedical Entity Linking via Alias-based Retrieval and LLM Ranking

wpnews.pro




Researchers have developed BioELX, a two-stage cross-lingual biomedical entity linking framework that requires no task-specific annotated training data. The system improves candidate retrieval by enriching SapBERT training with multilingual aliases from Wikidata, then performs context-aware disambiguation with a pre-trained LLM ranker. The authors report new state-of-the-art results across five benchmarks, improving average Recall@1 on XL-BEL by +19.2, with the largest gains on low-resource languages such as Thai (+30.8) and Korean (+22.1).
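For readers unfamiliar with the metric, Recall@1 is commonly computed as the share of mentions whose correct knowledge-base identifier is the system's top-ranked prediction. The sketch below illustrates that usual definition; the function name `recall_at_k` and the toy identifiers are hypothetical, and it assumes scores are reported as percentages so that a gain is a difference in percentage points, which the abstract does not state explicitly.

```python
def recall_at_k(ranked_predictions, gold_ids, k=1):
    """Percentage of mentions whose gold ID appears in the top-k predictions.

    ranked_predictions: one best-first list of candidate IDs per mention.
    gold_ids: the correct ID for each mention, in the same order.
    """
    if len(ranked_predictions) != len(gold_ids):
        raise ValueError("need exactly one prediction list per gold ID")
    if not gold_ids:
        return 0.0
    hits = sum(gold in preds[:k] for preds, gold in zip(ranked_predictions, gold_ids))
    return 100.0 * hits / len(gold_ids)


# Toy data (invented): four mentions with ranked candidates and gold IDs.
predictions = [["A", "B"], ["C", "D"], ["E", "F"], ["G", "H"]]
gold = ["A", "D", "E", "X"]

r1 = recall_at_k(predictions, gold, k=1)  # gold is ranked first for 2 of 4 mentions
r2 = recall_at_k(predictions, gold, k=2)  # gold is in the top two for 3 of 4 mentions
```

Under this reading, a figure such as "+30.8 on Thai" would be the difference between two such percentages, one per system, measured on the same test set.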

Published May 28, 2026

arXiv:2605.27380v1. Abstract: Cross-lingual biomedical entity linking (BEL) maps mentions in any language to unique identifiers in a biomedical knowledge base (KB), supporting clinical and biomedical NLP applications. However, expert-annotated training data for BEL are costly, especially for low-resource languages. Moreover, many cross-lingual BEL systems rely on SapBERT-based retrievers trained on predominantly English aliases in the KB, leading to poor generalization to unseen non-English mentions and limited context-aware disambiguation. We propose BioELX, a two-stage cross-lingual BEL framework that requires no task-specific annotated training corpora. In Stage 1, we enrich SapBERT training with Wikidata-derived multilingual aliases and use the resulting retriever to improve cross-lingual candidate retrieval. In Stage 2, we perform context-aware disambiguation with a pre-trained LLM ranker that jointly considers the mention context and candidate, eliminating the need for supervised training. Experiments on five benchmarks (XL-BEL, EMEA, Patent, WikiMed-DE, and MedMentions) show that BioELX achieves new state-of-the-art performance. It improves average Recall@1 on XL-BEL by +19.2, with especially large gains for low-resource languages, e.g., +21.6 on Turkish, +22.1 on Korean, +30.8 on Thai, and delivers consistent improvements on EMEA (+6.2), Patent (+5.4), and WikiMed-DE (+12.8). Code and resources will be released upon publication.
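The two-stage design described in the abstract follows a retrieve-then-rerank pattern, which can be sketched in a few lines. This is a toy illustration only, not the authors' implementation (their code has not been released): character n-gram similarity stands in for the SapBERT retriever, a word-overlap score stands in for the LLM ranker, and the knowledge base, identifiers, aliases, and function names are all invented for the example.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy KB: identifier -> (description, multilingual aliases).
KB = {
    "C001": ("disease with high blood sugar",
             ["diabetes mellitus", "Diabetes", "diabète"]),
    "C002": ("disease with excess urine and thirst",
             ["diabetes insipidus", "Diabetes insipidus"]),
    "C003": ("high blood pressure", ["hypertension", "Bluthochdruck"]),
}


def ngrams(text, n=3):
    """Character n-gram counts: a crude stand-in for a SapBERT embedding."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(mention, k=2):
    """Stage 1: score each KB entry by its best-matching alias, keep the top k."""
    query = ngrams(mention)
    scored = [(max(cosine(query, ngrams(alias)) for alias in aliases), cid)
              for cid, (_, aliases) in KB.items()]
    return [cid for _, cid in sorted(scored, reverse=True)[:k]]


def overlap_ranker(context, description):
    """Stage 2 stand-in: word overlap in place of an LLM's judgment."""
    return len(set(context.lower().split()) & set(description.lower().split()))


def link(mention, context, k=2, ranker=overlap_ranker):
    """Retrieve k candidates by alias, then pick the one that best fits the context."""
    candidates = retrieve(mention, k)
    return max(candidates, key=lambda cid: ranker(context, KB[cid][0]))


ambiguous = link("Diabetes", "patient reports excess thirst and urine output")
french = link("diabète", "high blood sugar after meals")
```

In this toy setup the French alias lets the retriever surface the right entry for a non-English mention, and the context decides between the two "diabetes" candidates, mirroring the division of labor the abstract describes. A real system would swap in learned embeddings for `ngrams` and an LLM call for `overlap_ranker`.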

source & further reading

arxiv.org — original article


Read original on arxiv.org → arxiv.org/abs/2605.27380

