The Unreadable Book

The roughly 600-year-old Voynich Manuscript has defeated every cryptographer, linguist, and AI that has tried to read it. Why it might never be solved.

[Image: Voynich Manuscript, folio 34r, showing unidentified plant illustrations alongside undeciphered script, c. 1404–1438. Beinecke Rare Book & Manuscript Library, Yale University (MS 408)]

Yale University's Beinecke Rare Book and Manuscript Library houses, under catalogue number MS 408, a 240-page vellum codex written in an unknown script, illustrated with drawings of unidentifiable plants, astronomical diagrams of uncertain meaning, and images of small human figures bathing in interconnected pools of green liquid. The text flows with the fluency and visual consistency of a natural language. No one has ever been able to read a single word of it.

[Image: Voynich Manuscript, folio 25v, showing unidentified botanical illustrations alongside undeciphered text in an unknown script, c. 1404–1438. Beinecke Rare Book and Manuscript Library, Yale University (MS 408)]

The Voynich Manuscript, named after the Polish book dealer Wilfrid Voynich who acquired it in 1912, has been radiocarbon-dated to the early fifteenth century (between 1404 and 1438). Its vellum, inks, and pigments are consistent with that dating. Beyond these material facts, almost everything about the manuscript is contested.

The script comprises approximately 20 to 30 distinct characters (depending on how one defines character boundaries) arranged in patterns whose statistical properties resemble those of natural language. The text shows regular word-length distributions, approximately follows Zipf's law (a word's frequency is roughly inversely proportional to its frequency rank, so the second most common word appears about half as often as the most common one), and displays positional constraints on character occurrence that resemble the phonotactic rules of known languages. These properties are extremely difficult to produce artificially, which argues strongly against the manuscript being a random or meaningless hoax.
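The Zipf's-law check mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is assumed to be a list of already-separated words from some transliteration, the function names `rank_frequency` and `zipf_slope` are hypothetical, and the toy corpus is invented for the example rather than taken from the manuscript.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending count; break ties alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]


def zipf_slope(tokens):
    """Least-squares slope of log(count) against log(rank).

    An idealized Zipf distribution would give a slope of -1.
    """
    table = rank_frequency(tokens)
    if len(table) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(n) for _, _, n in table]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy corpus (invented, NOT Voynich text): the word at rank r occurs 1200 // r times.
toy = [f"w{r}" for r in range(1, 13) for _ in range(1200 // r)]
print(zipf_slope(toy))  # close to -1 for this artificial corpus
```

A slope near -1 on real text would be consistent with Zipf-like behaviour, though a straight-line fit on a log-log plot is a coarse test and is sensitive to how words are segmented.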

Attempts at decipherment have been continuous since at least the 1920s. William Friedman, the cryptanalyst whose U.S. Army team broke Japan's PURPLE cipher during World War II, spent decades on the Voynich without success. The manuscript has been subjected to analysis using frequency counting, index of coincidence, entropy measurements, neural networks, and large language models. None has produced a convincing reading. The fundamental obstacle is the absence of a bilingual text or any confirmed external reference point. Without a Rosetta Stone, the cipher, if it is a cipher, remains opaque.
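Two of the classical measurements named above, the index of coincidence and first-order entropy, are simple enough to sketch. The code below is a minimal illustration, assuming the text has already been transliterated into a sequence of symbols; the function names are hypothetical and the sample string is made up, not manuscript data.

```python
import math
from collections import Counter


def index_of_coincidence(symbols):
    """Probability that two symbols drawn without replacement are identical."""
    n = len(symbols)
    if n < 2:
        raise ValueError("need at least two symbols")
    counts = Counter(symbols)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def shannon_entropy(symbols):
    """First-order Shannon entropy in bits per symbol."""
    n = len(symbols)
    if n == 0:
        raise ValueError("need at least one symbol")
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Made-up sample: two symbols, each used twice.
sample = "aabb"
print(index_of_coincidence(sample))  # 4 / 12, i.e. one third
print(shannon_entropy(sample))       # 1 bit per symbol
```

Both figures describe only how symbols are distributed, not what they mean, which is one reason such measurements can characterize a text without yielding a reading of it.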

Several competing hypotheses coexist. The manuscript may be an elaborate hoax—perhaps the most successful in history—though the statistical properties of the text make this difficult to sustain. It may be written in a constructed language or a natural language rendered in a unique script. It may employ a steganographic technique that conceals meaningful content within an apparently linguistic structure. It may be a pharmacological or alchemical text whose referents are deliberately obscured to protect proprietary knowledge. Each hypothesis has adherents, and none has been definitively excluded.

The Voynich Manuscript endures as an object lesson in the limits of analytical method. We possess extraordinary tools for pattern recognition and code-breaking. We have the computational power to test millions of decryption hypotheses per second. And yet a fifteenth-century book, written by an unknown author for an unknown purpose, remains as opaque as the day it was made. Some problems resist solution not because we lack sufficient power, but because we lack sufficient context.
