A primer on computational methods for historical research. How do we read a million books? How do we map the Republic of Letters?
Transforming the chaos of the archive into structured, analyzable datasets.
Historical records are often messy scans. To a computer, these are just pixels, not information.
We use Named Entity Recognition (NER) to identify People, Locations, and Organizations.
import spacy

# Load the small English pipeline and run it over the OCR'd archive text.
nlp = spacy.load("en_core_web_sm")
doc = nlp(archive_text)  # archive_text: the raw string extracted from the scan

for ent in doc.ents:
    print(ent.text, ent.label_)
Execute the script to tag entities within the noise.
The result is a clean, queryable database. What was once a chaotic image is now a Network of Relations.
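A minimal sketch of that "queryable database" step, using Python's built-in sqlite3. The entity list here is hypothetical stand-in output from the NER pass above, not real archive data:

```python
import sqlite3

# Hypothetical output of the NER step: (surface text, label) pairs.
entities = [
    ("Erasmus", "PERSON"),
    ("Basel", "GPE"),
    ("Thomas More", "PERSON"),
    ("London", "GPE"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (text TEXT, label TEXT)")
conn.executemany("INSERT INTO entity VALUES (?, ?)", entities)

# The archive is now queryable: e.g., list every person mentioned.
people = [row[0] for row in
          conn.execute("SELECT text FROM entity WHERE label = 'PERSON'")]
print(people)  # ['Erasmus', 'Thomas More']
```

In practice the table would also record source document, page, and character offsets, so every query can be traced back to the original scan.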
History is defined by connections, not just individuals.
Traditionally, we study historical figures in isolation. We read their biographies, diaries, and works as solitary endeavors.
But by mapping correspondence, we reveal the Social Gravity.
Force-directed algorithms simulate how individuals pull together into communities or push apart into factions.
How do you read 10,000 newspapers at once? You don't. You model them.
Algorithms like Latent Dirichlet Allocation (LDA) find clusters of words that co-occur frequently, revealing hidden themes.
We visualize these themes as a river, tracking their rise and fall over decades.
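The river's underlying data is just topic weight aggregated by decade. A sketch with invented (year, topic, weight) rows, such as LDA might assign to dated articles:

```python
from collections import defaultdict

# Hypothetical per-article topic weights: (year, topic, weight).
rows = [
    (1846, "railways", 0.8), (1853, "railways", 0.6), (1868, "railways", 0.3),
    (1846, "cholera", 0.1), (1853, "cholera", 0.4), (1868, "cholera", 0.6),
]

# Sum each topic's weight per decade: these totals are the river's bands.
river = defaultdict(lambda: defaultdict(float))
for year, topic, weight in rows:
    decade = (year // 10) * 10  # 1846 -> 1840
    river[topic][decade] += weight

for topic, by_decade in river.items():
    print(topic, dict(sorted(by_decade.items())))
```

Here "railways" falls across the decades while "cholera" rises; plotted as stacked streams, those totals become the widening and narrowing bands of the river.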
Immersive 3D environments allow us to experience historical spaces that no longer exist.
We stand here today in the West Kowloon Cultural District. The vertical screen of M+ dominates the skyline.
Rewind 30 years. This land did not exist. It was a construction site, part of the massive airport reclamation project.
Drag to look around. Click years to time travel.
These tools do not replace close reading; they enhance it. They allow us to toggle between the micro-history of a single letter and the macro-history of an entire civilization.