{"slug": "miqrabert-regression-based-sentence-bert-finetuning-for-biblical-hebrew-parallel", "title": "MiqraBERT: Regression-Based Sentence-BERT Finetuning for Biblical Hebrew Parallel Detection", "summary": "Researchers introduced MiqraBERT, a Sentence-BERT model finetuned from AlephBERT for detecting verse-level semantic similarity in Biblical Hebrew. The model improves distributional separation 2.7-fold over the pre-trained baseline and reduces the ambiguous overlap region from roughly 24% to about 6%, though it is reliable only on narrative textual reuse (recall@10 of 87.1%), not on poetic parallels (below 9%).", "body_md": "arXiv:2606.19638v1\nAbstract: Textual reuse pervades the Hebrew Bible, yet the computational methods used to detect it still rest largely on lexical overlap, and they falter once a parallel involves paraphrase, lexical substitution, or syntactic reworking. This paper introduces MiqraBERT, a Sentence-BERT model finetuned from AlephBERT (a Modern Hebrew encoder) for verse-level semantic similarity in Biblical Hebrew. The training set comprises 1,650 labeled verse and half-verse pairs: 825 true parallels drawn from the Chronicles synoptic material and from foundational studies of poetic parallelism, balanced against 825 randomly sampled negatives. Through cosine-similarity regression, the model learns an embedding space in which parallel verses cluster together and unrelated verses move apart. We evaluate separation with distribution-based metrics, Wasserstein distance and the overlap coefficient, across ten random seeds. MiqraBERT improves distributional separation 2.7-fold over the pre-trained baseline and reduces the ambiguous overlap region from roughly 24% to about 6%. Narrative synoptic parallels reach a recall@10 of 87.1%; poetic parallels remain difficult, below 9%. This genre-dependent asymmetry confines the model's reliable scope to narrative textual reuse.
MiqraBERT is publicly available at https://huggingface.co/davidmsmiley/MiqraBERT", "url": "https://wpnews.pro/news/miqrabert-regression-based-sentence-bert-finetuning-for-biblical-hebrew-parallel", "canonical_source": "https://arxiv.org/abs/2606.19638", "published_at": "2026-06-19 04:00:00+00:00", "updated_at": "2026-06-19 04:05:59.187300+00:00", "lang": "en", "topics": ["natural-language-processing", "machine-learning", "large-language-models"], "entities": ["MiqraBERT", "AlephBERT", "Sentence-BERT", "Hebrew Bible", "Chronicles"], "alternates": {"html": "https://wpnews.pro/news/miqrabert-regression-based-sentence-bert-finetuning-for-biblical-hebrew-parallel", "markdown": "https://wpnews.pro/news/miqrabert-regression-based-sentence-bert-finetuning-for-biblical-hebrew-parallel.md", "text": "https://wpnews.pro/news/miqrabert-regression-based-sentence-bert-finetuning-for-biblical-hebrew-parallel.txt", "jsonld": "https://wpnews.pro/news/miqrabert-regression-based-sentence-bert-finetuning-for-biblical-hebrew-parallel.jsonld"}}