Sunday, 13 November 2011
Read all about it! British Newspaper Archive beta site
Of course, Chris Paton beat me to it with a detailed description on his own blog, so I won't repeat what he has said; instead I will add a few observations of my own. Overall, my impression of the site is very positive, and it compares well with other newspaper sites - no doubt the smart people at brightsolid have looked at them and learnt. It is, of course, a beta version, so more content and features could be added before the final version goes live, but it looks a lot more finished than some beta sites I have tested.
I have always been particularly interested in newspapers and periodicals as sources for history and genealogy. For many years I was lucky enough to live within easy reach of the Newspaper Library at Colindale, so I have looked at the original versions of quite a number of the titles included here. And some of them really were originals; although a lot of newspapers have been microfilmed, many others have not, so some of the ones in this collection have been copied for the first time. You can easily tell the ones that have been scanned from microfilm because they are in pure black and white, while the first-timers are in colour. In practical terms, of course, 'colour' mostly means black and beige, but the quality of these is particularly good, because they have been scanned with the latest equipment, while scans from film will be only as good as the technology at the time when the filming was done.
Indexing of newspapers is done by OCR (optical character recognition) because the sheer volume of printed material makes manual indexing impractical. This of course has its limitations, although it is getting better all the time. I can remember when no-one thought it would ever be possible to use the technique on newspapers at all. It works best on nice clear print or typescript, so the results from older papers with very small print and the occasional archaic long 's' that looks like a lower-case 'f' are variable, to say the least. Maybe OCR will be able to cope with this sometime in the future.
In the meantime, the British Newspaper Archive deals with this in an interesting way. Search results include the first few lines of the raw OCR text, so you can see at a glance if this publication is one of the dodgier ones, and for each article you can view the full OCR text and submit corrections.
There are many useful features on the site, such as the basic and advanced searches that we have come to expect, with filtering options that will be familiar to anyone who has used the Times Digital Archive, although it has a cleaner look and is a little more user-friendly. There are day, month and year options so that you can limit your search to a particular date range, or to a specific date of issue. I particularly like the fact that you can select a range of years without having to also select a day and month from the drop-down menus. It's a minor point, but one that irritates me when I use the (otherwise excellent) London Gazette site.
You can choose to have your search results sorted by relevance, the default setting, or by date, and once you have a set of search results you can filter them using a range of date and place options, or by tags. The site uses tags that have already been assigned, such as 'classified', 'illustrations' and so on, but there is a facility for users to add their own public tags, which could be interesting. You can also bookmark items you have looked at, and create menus for them within your own 'My Research' area. This area contains an edit function for adding notes, which unfortunately does not work on the beta site. Navigation options within the digitised page images are very good, but to get back to your search results you need to use the 'back' button or the breadcrumb trail. A 'back to search results' option would be nice.
The beta site does not allow saving or printing options, so we will have to see how they work out later.
I wish I had more time to explore and comment on the site, but my first impressions are that it will be very good indeed, and I can't wait for the full release.