
Sooraj Shah

Contributing Editor

Sooraj Shah is Contributing Editor of New Statesman Tech with a focus on C-level IT leader interviews. He is also a freelance technology journalist.

How the National Library of Scotland aims to preserve millions of books

The National Library of Scotland is Scotland’s largest reference library, with about 31 million items, more than seven million manuscripts and two million maps.

At a time when content is increasingly delivered and consumed digitally, Stuart Lewis has an intriguing role as associate director of digital at the library. He explains that his department of 45 people brings together a number of teams that work in unison.

This includes the classic IT function, which incorporates an infrastructure team, support team and software development team. They have to work together with a digitisation team, which focuses on digitising the library’s content, and a digital preservation team.

“Our heritage goes back to the 1600s. We know how to look after books for hundreds or even thousands of years, but we don’t yet know how to look after digital files for that long, so having a digital preservation team means we’re looking at that from the creation of our digital assets, through to how we store them on the infrastructure, and then how we preserve them in the long-term,” Lewis tells NS Tech.

The library has an ongoing digital strategy with a number of targets, one of which is that it will have a third of its collections in a digital format by 2025. Considering the library has about 31 million items that it looks after on-site, this is an ambitious aim.

“The two ways that will happen is through digitisation, where we digitise physical materials such as maps, exam papers or books, and then we have new types of content that are born digital including one million electronic journal articles and thousands of e-books per year, which we’ll also factor in as part of our aim,” he says.

In the past, the library would buy storage-area network (SAN) block storage, but it would reach capacity within about 18 months, meaning it would have to buy an additional SAN. Currently, the organisation has seven SANs storing digital content.

“We wanted to move away from that to a platform that could grow with the library,” Lewis explains.

The library also needed to shift away from backing up its content in the same way that corporate organisations back up their data – there is no need for weekly back-ups, for example.

Instead, the National Library of Scotland wanted to shift to a multiple-copy approach, where it would store three copies of each piece of content in three different locations. Two of the copies would replicate automatically between the library’s two data centres, one in Edinburgh and one in Glasgow. The library would also be able to begin using cloud-type technologies on-premises.

“This is why we specified we wanted Amazon S3 as the interface as we can use it for our cloud copy but also for our on-premise copies,” Lewis explains.
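The idea of one interface for all three copies can be illustrated with a short Python sketch. It is not the library's code: the `ObjectStore` class is a hypothetical in-memory stand-in for any store that offers S3-style put and get by key, the store names and the object key are invented, and the checksum comparison is simply one common way to confirm that copies match, which the article does not describe. The sketch also writes each copy explicitly, whereas Lewis describes the two on-site copies replicating automatically.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical stand-in for a store with S3-style put/get by key."""
    name: str
    _objects: dict = field(default_factory=dict)

    def put_object(self, key: str, body: bytes) -> None:
        self._objects[key] = body

    def get_object(self, key: str) -> bytes:
        return self._objects[key]


def sha256(data: bytes) -> str:
    """Hex SHA-256 digest used to compare copies."""
    return hashlib.sha256(data).hexdigest()


def store_copies(key: str, body: bytes, stores: list) -> str:
    """Write the same object to every store and return its checksum."""
    for store in stores:
        store.put_object(key, body)
    return sha256(body)


def verify_copies(key: str, expected_digest: str, stores: list) -> list:
    """Return the names of stores whose copy is missing or does not match."""
    damaged = []
    for store in stores:
        try:
            ok = sha256(store.get_object(key)) == expected_digest
        except KeyError:
            ok = False  # the copy is missing from this store
        if not ok:
            damaged.append(store.name)
    return damaged


# Assumed names: two on-site copies and one cloud copy, as in the article.
stores = [ObjectStore("edinburgh"), ObjectStore("glasgow"), ObjectStore("cloud")]
digest = store_copies("maps/example.tif", b"example image bytes", stores)
print(verify_copies("maps/example.tif", digest, stores))  # prints []
```

Because every store is addressed through the same two calls, the code that writes and checks a copy does not need to know whether that copy sits in a data centre or in the cloud, which is the benefit Lewis attributes to standardising on the S3 interface.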

In addition, the library wanted a product that could cope with a petabyte of data, and enable the library to grow substantially.

After a thorough procurement process, the library selected Scality’s RING object storage software, running on HP Apollo hardware. The product allowed the library to store one copy in each of its data centres, and a third copy in AWS Glacier Deep Archive.

Lewis says that Scality was selected primarily because of its replication features and pricing model.

“The licensing model is essentially for all the data you protect, but as we’re protecting one copy of the data, we’re essentially getting two for the price of one because we’re getting two replicas,” he says.

“Scality’s system is also very user friendly, easy to use and operate, particularly in terms of the statistics it can provide. This is one of the real strengths, as we can drill right down into it and the amount of information that it conveys to us really helps us to understand what the system is doing,” he adds.

This is just one of the projects that the library is working on. Another is making large data sets drawn from its collections available to the public.

“We can give you the first hundred years of the Encyclopaedia Britannica as a text file, so if you wanted to do research on how knowledge has evolved over time, then rather than having to read every copy and stitch it together yourself, you can download it all in a 44GB file with 155,000 images and 155,000 associated text files,” Lewis says, adding that 50 years of Scottish exam papers are also available in a similar format.

For Lewis and the National Library of Scotland, digitisation is just one part of a wider digital transformation – the library wants to preserve content, but also to make it more accessible, enabling people to do more with it than ever before.