# Swedish AI decodes 17th-century Borg Cipher

> Source: <https://letsdatascience.com/news/swedish-ai-decodes-17th-century-borg-cipher-38671b24>
> Published: 2026-06-04 05:51:11.014226+00:00

Researchers affiliated with Stockholm University have developed an AI-based pipeline that accelerates decoding of historical ciphers, combining character recognition, pattern reconstruction, and language modelling, according to AzerNEWS. AzerNEWS reports the system deciphered the **Borg Cipher**, a **400**-page 17th-century manuscript, in **28 minutes**, a task that previously took weeks of manual work. Stockholm University materials describe the broader DECRYPT research effort and list tools such as **TranscripTool** and **CrypTool**, and databases including **DECODE** and **HistCorp** for training and evaluation. Stockholm University reports releasing more than **7,000** encrypted sources in its collections, while AzerNEWS reports the project database contains over **20,000** encrypted documents. AzerNEWS also notes scholars estimate roughly **1%** of archival manuscripts remain encrypted. The reporting emphasizes the team's position that AI assists but does not replace historians and philologists.

### What happened

According to AzerNEWS, researchers in Sweden developed an artificial intelligence pipeline that substantially shortens the time needed to decode historical ciphers. AzerNEWS reports that, per project leader Professor Beata Megyesi, the system deciphered the **Borg Cipher**, a **400**-page 17th-century manuscript, in **28 minutes**; the article contrasts that runtime with prior manual efforts that took weeks. AzerNEWS describes the cipher as using **34** different symbols and containing recipes and pharmaceutical knowledge.

Stockholm University describes the longer-running interdisciplinary project led by Professor Beata Megyesi and documents tools and data releases associated with the work. The Stockholm University pages identify the DECRYPT project, note tool names **TranscripTool** and **CrypTool**, and report the release of over **7,000** encrypted sources in the project collections **DECODE** and **HistCorp**.

### Technical details

Per AzerNEWS, the new system combines character recognition, pattern reconstruction, and language modelling to reconstruct the original plaintext from cipher symbols. Stockholm University materials describe **TranscripTool** as a component for converting cipher images into machine-readable transcriptions and **CrypTool** as an application to assist decipherment workflows; Stockholm University also documents language-model resources and diplomatic transcriptions used for evaluation.
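To make the language-modelling stage concrete, here is a minimal Python sketch of how a candidate key for a simple substitution cipher might be ranked. It is not the DECRYPT pipeline: the corpus, the symbol sequence, the two keys, and the function names `train_bigrams`, `log_likelihood`, and `decode` are all hypothetical, and the symbols are assumed to have been transcribed already.

```python
import math
from collections import Counter


def train_bigrams(corpus):
    """Count character bigrams in a reference text (toy language model)."""
    pairs = Counter(zip(corpus, corpus[1:]))
    firsts = Counter(corpus[:-1])
    vocab_size = len(set(corpus))
    return pairs, firsts, vocab_size


def log_likelihood(text, model):
    """Add-one smoothed bigram log-likelihood of a candidate plaintext."""
    pairs, firsts, vocab_size = model
    return sum(
        math.log((pairs[(a, b)] + 1) / (firsts[a] + vocab_size))
        for a, b in zip(text, text[1:])
    )


def decode(symbols, key):
    """Apply a candidate key; symbols missing from the key become '?'."""
    return "".join(key.get(s, "?") for s in symbols)


# Hypothetical data: a tiny reference corpus and a transcribed symbol sequence.
corpus = "take the herb and the root and heat the water "
model = train_bigrams(corpus)
symbols = [1, 2, 3, 4, 5, 4, 1, 2, 5, 1, 2, 6, 7]

key_a = {1: "h", 2: "e", 3: "a", 4: "t", 5: " ", 6: "r", 7: "b"}
key_b = {**key_a, 2: "t", 4: "e"}  # same key with two letters swapped

for name, key in (("key_a", key_a), ("key_b", key_b)):
    text = decode(symbols, key)
    print(name, repr(text), round(log_likelihood(text, model), 2))
```

The key that yields more language-like text receives the higher score. A real system would search over many keys and use a far larger model trained on period-appropriate text.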

### Industry context

Editorial analysis: The project exemplifies a broader trend where computational linguistics, computer vision, and statistical language models are applied to archival and humanities problems. Comparable efforts typically combine OCR-like symbol recognition, probabilistic or neural sequence modelling, and human-in-the-loop validation to handle noise, sparse training data, and diverse scripts. For practitioners, this pattern highlights recurring challenges: domain-specific token inventories, limited parallel corpora for training, and error propagation between transcription and decipherment stages.

### Context and significance

Editorial analysis: For digital humanities and NLP practitioners, systematic, reproducible pipelines and shared datasets matter more than single-case runtimes. Publicly archived encrypted corpora (Stockholm University reports **7,000+** items) plus tools that expose intermediate representations enable replication and incremental improvement by the research community. If the larger database count reported by AzerNEWS (**20,000+** encrypted documents) is accurate, that would materially expand the data available for training symbol-recognition and language models. The two sources report different figures, and readers should keep that discrepancy in mind.

### What to watch

Editorial analysis: Observers should look for:

- releases of the underlying transcription and alignment data
- code and model checkpoints for **TranscripTool** and **CrypTool** so methods can be reproduced
- benchmark evaluations across multiple ciphers and languages to measure robustness
- documentation of human-in-the-loop steps and uncertainty estimates to understand how transcription errors affect final decipherment

Increased dataset availability and explicit error metrics would make it easier for ML practitioners to adapt contemporary sequence models and vision backbones to this domain.
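One common form of explicit error metric for a transcription stage is a symbol error rate: the edit distance between a reference transcription and the system output, divided by the reference length. The sketch below is an illustration under that assumption, not a metric the sources say the project uses; the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def symbol_error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical symbol ids: the system dropped one of four reference symbols.
reference = [1, 2, 3, 4]
hypothesis = [1, 2, 4]
print(symbol_error_rate(reference, hypothesis))  # 0.25
```

Reporting such a rate per cipher and per language would let readers compare transcription quality separately from decipherment quality.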

### Attribution note

The claims about the **28-minute** decoding run and the **20,000+**-document database appear in AzerNEWS. Stockholm University's public pages describe the DECRYPT project, the **TranscripTool**/**CrypTool** tools, and the release of **7,000+** encrypted sources in project collections. AzerNEWS and Stockholm University together provide the factual basis for the technical and data descriptions above.

### Practical implication for practitioners

Editorial analysis: Researchers building models for low-resource historical scripts should prioritize modular pipelines (separate symbol recognition and language-model components), explicit uncertainty propagation, and tooling for human correction. Shared datasets and code would accelerate progress and enable cross-validation on different cipher systems.
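As a rough illustration of uncertainty propagation between modular stages, the Python sketch below keeps several candidate readings per symbol position and lets a downstream scorer choose among them, instead of committing to the recognizer's single best guess. Everything in it is assumed for the example: the candidate probabilities, the `cipher_key` mapping, the two-word lexicon, and the names `best_reading` and `toy_score`.

```python
import itertools
import math


def best_reading(candidates, downstream_score, weight=1.0):
    """Pick the symbol sequence maximising recognition log-probability
    plus a weighted downstream score. Exhaustive, so only for short spans."""
    best, best_total = None, -math.inf
    for combo in itertools.product(*(c.items() for c in candidates)):
        symbols = tuple(s for s, _ in combo)
        recognition = sum(math.log(p) for _, p in combo)
        total = recognition + weight * downstream_score(symbols)
        if total > best_total:
            best, best_total = symbols, total
    return best, best_total


# Hypothetical recognizer output: per-position {symbol id: probability}.
candidates = [{1: 0.9, 5: 0.1}, {4: 0.55, 2: 0.45}, {3: 0.6, 6: 0.4}]
cipher_key = {1: "h", 2: "e", 3: "r", 4: "c", 5: "b", 6: "n"}
lexicon = {"her", "hen"}


def toy_score(symbols):
    """Stand-in for a language model: reward decoded words in the lexicon."""
    word = "".join(cipher_key[s] for s in symbols)
    return math.log(0.5) if word in lexicon else math.log(0.001)


hard_choice = tuple(max(c, key=c.get) for c in candidates)
soft_choice, _ = best_reading(candidates, toy_score)
print(hard_choice, soft_choice)  # (1, 4, 3) (1, 2, 3)
```

Taking the top symbol at each position decodes to "hcr", while the combined score recovers "her". Keeping the stages separate also gives a human corrector a natural place to inspect and override low-confidence positions.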

## Scoring Rationale

The result is a notable, domain-specific application of OCR and language modelling that accelerates a historically manual task and releases useful datasets and tools. It matters to NLP and digital-humanities practitioners, but it is not a frontier-model release.
