What historian has time to read tens of millions of news articles from more than a century of British history? None of them. So computer scientists and historians have taught computers how to do the job instead, analysing billions of words of news reports to take a new look at the 19th and early 20th centuries.
The study, published in the journal PNAS, marks the early steps of the emerging field of "culturomics".
Computers analysed a total of 28.6 billion words from 35 million British regional news stories published between 1800 and 1950, which made up about 14% of the total output of the regional press in that period.
For comparison, the average adult has a reading speed of about 300 words per minute. At that rate, it would take someone about 180 solid years to do all that reading, not including a lunch break. The computer algorithms did all that in about eight weeks, study author Nello Cristianini, a computer scientist at the University of Bristol, told IBTimes UK.
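As a rough check on that reading-time figure, the arithmetic can be sketched in a few lines. The word count and the reading speed come from the figures above. The assumption of reading around the clock, every day of the year, is ours, as are the variable names.

```python
# Back-of-the-envelope check of the reading-time estimate.
# Assumption: non-stop reading, 24 hours a day, 365.25 days a year.
TOTAL_WORDS = 28.6e9        # words analysed in the study
WORDS_PER_MINUTE = 300      # typical adult reading speed quoted above

minutes_needed = TOTAL_WORDS / WORDS_PER_MINUTE
minutes_per_year = 60 * 24 * 365.25
years_needed = minutes_needed / minutes_per_year

print(f"About {years_needed:.0f} years of continuous reading")
```

Under those assumptions the result comes out at roughly 181 years, in line with the "about 180" quoted.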
The first step of the study was a sense-check to make sure the computers could pick up real historical events from the papers. The researchers tested whether the algorithms could accurately pinpoint events such as coronations, known disease epidemics and wars.
The next step was the interesting part: seeing whether the algorithms could pick up historical shifts that historians using traditional methods could not.
"Here we're looking for something a bit less clear – for example, uptake of technology," said Cristianini. "We can see that around 1900 technology changed. But then we can check at a more subtle signal, we can see how quickly telegraph, telephone and radio become established. Then we found a faster and faster rate. Now, for Twitter or Facebook to be accepted it just takes a year."
The AI analysis went beyond simple word counts, which have been done for large swathes of digitised literature before. This time the researchers used AI techniques such as natural language processing to get a sense of the context and meaning of the text. Think of it as the ultimate skim-reading.
So what did they find? Here's the AI view of British history.
When did electricity outpace steam?
1898. This is the moment when electricity overtakes steam in terms of coverage in the news, as one technology replaces the other.
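To make the idea of one term "overtaking" another concrete, here is a minimal sketch of how such a crossover could be found from yearly word frequencies. It is not the study's actual pipeline. The function names, the tiny corpus and the year labels 1 to 3 are all invented for illustration and carry no historical meaning.

```python
import re
from collections import defaultdict

def yearly_relative_frequency(docs, term):
    """Return {year: share of all words that year that equal `term`}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in docs:
        words = re.findall(r"[a-z]+", text.lower())
        totals[year] += len(words)
        hits[year] += words.count(term)
    return {year: hits[year] / totals[year] for year in totals if totals[year]}

def first_overtake(freq_a, freq_b):
    """First year in which term A is more frequent than term B,
    having been less frequent or level the year before."""
    years = sorted(set(freq_a) & set(freq_b))
    for prev, cur in zip(years, years[1:]):
        if freq_a[prev] <= freq_b[prev] and freq_a[cur] > freq_b[cur]:
            return cur
    return None

# Invented toy corpus of (year label, text) pairs; NOT data from the study.
toy_docs = [
    (1, "steam engine steam power"),
    (1, "electricity is a novelty"),
    (2, "steam ships and electricity works"),
    (3, "electricity lights the town electricity tram"),
    (3, "steam mill closes"),
]

steam = yearly_relative_frequency(toy_docs, "steam")
electricity = yearly_relative_frequency(toy_docs, "electricity")
crossover = first_overtake(electricity, steam)
print(crossover)
```

Frequencies are taken relative to the total number of words per year, so that a year with more surviving newspapers does not automatically look like a year with more interest in a topic.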
When did trains become more popular than horses?
Just four years later: 1902. The railway age dawned in the 1840s, when the national railway network in Britain began to boom. But in terms of what was considered news, it took more than half a century for the train to become more important than the horse.
When did people stop talking about slavery?
Mentions of slavery peaked during the time of the abolitionist movement, from about 1830 to 1870, and during the American Civil War, from 1861 to 1865. Mentions died down considerably after about 1870.
When did reporters start to cover the suffrage movement?
1906. A dramatic peak followed in 1913, when suffragette Emily Wilding Davison threw herself under the king's horse at the Epsom Derby.
When did women get equal news coverage with men?
Well, never. There was an upward trend in the coverage of women that began in the 20th century, with a spike around the time of the Second World War. But throughout the period there were roughly three men for every one woman mentioned in the news. In the 21st century, the figure is a little closer to two men for every one woman, but the shift hasn't been particularly dramatic.
When did courage matter most?
Perhaps unsurprisingly, also around the periods of the First World War and Second World War. Victorian values such as perseverance showed a steady decline throughout the period. But in times of war, values like endurance and courage showed a clear spike.
When did Britishness emerge?
The idea of Britishness took off in the early 20th century, with its first spike around 1900, followed by two much larger spikes coinciding with the First World War and Second World War. This finding has sparked disagreement with many historians, who argue that Britishness became a popular idea earlier on.
When did the economy become a catchphrase?
"Political economy" was consistently the more commonly used term compared with "the economy" up until about 1900, when the picture became less clear. At the start of the 20th century the two terms were roughly equally popular for a decade or so, after which "the economy" took off as the more popular term, with several dramatic spikes in usage before the term began a steady rise in popularity.