
Business information – selected slides from June 2016 workshops

Some of the slides that I used as part of my June 2016 workshops on Business Information are now available on both SlideShare and authorSTREAM. The workshop run in the last week of June inevitably included a session on the EU referendum and the Brexit result. A few of those extra slides are included in this edited version of the presentation.

Business Information - key web resources

More UK information vanishes into GOV.UK

Just when you’ve finally worked out how to search some of the key UK government web resources they disappear into the black hole that is GOV.UK.

The statistics publication hub went over a few weeks ago and the link http://www.statistics.gov.uk/ now redirects to http://www.gov.uk/government/statistics/announcements. Similarly, Companies House is now to be found at http://www.gov.uk/government/organisations/companies-house and the Land Registry is at http://www.gov.uk/government/organisations/land-registry. Most of the essential data, such as company information and ownership of properties, can still be found via GOV.UK and in fact some remains in databases on the original websites. For example, following the links on GOV.UK for information on a company eventually leads you to the familiar WebCHeck service at http://wck2.companieshouse.gov.uk/. Companies House's useful list of overseas registries, however, at first seems to have disappeared altogether but is in fact hidden in a general section covering all government “publications” (http://www.gov.uk/government/publications/overseas-registries#reg).

Documents may no longer be directly accessible from the new departmental home pages so a different approach is needed if you are conducting in-depth research. GOV.UK is fine for finding out how to renew your car tax or book your driving theory test – two of the most popular searches at the moment – but its search engine is woefully inadequate when it comes to locating detailed technical reports or background papers. Using the site: command in Google or Bing to search GOV.UK is the only way to track them down quickly, for example biofuels public transport site:www.gov.uk. Note that you need to include the ‘www’ in the site: command, as site:gov.uk would also pick up articles published on local government websites. This assumes, though, that the document you are seeking has been transferred over to GOV.UK.
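As a rough illustration of the site: approach, the Python sketch below builds the search URL for a site-restricted query. The helper name site_search_url is hypothetical, and the sketch assumes only the standard "q" query parameter that both Google and Bing accept; it does not run the search itself.

```python
from urllib.parse import urlencode

# Search endpoints; both accept the query text in a "q" parameter.
SEARCH_ENGINES = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}


def site_search_url(terms, site="www.gov.uk", engine="google"):
    """Return a search URL restricted to one site (hypothetical helper).

    The default site keeps the 'www' so that local government
    websites under gov.uk are excluded.
    """
    query = f"{terms} site:{site}"
    return SEARCH_ENGINES[engine] + "?" + urlencode({"q": query})


print(site_search_url("biofuels public transport"))
print(site_search_url("biofuels public transport", engine="bing"))
```

Passing site="gov.uk" instead would widen the search to every site under gov.uk, which is occasionally useful but usually returns too much.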

There have been complaints from researchers, including myself, that an increasing number of valuable documents and research papers have gone AWOL as more departments and agencies are assimilated Borg-like by GOV.UK. Some of the older material has been moved to the UK Government Web Archive at http://www.nationalarchives.gov.uk/webarchive/.

This offers you various options including an A-Z of topics and departments and a search by keyword, category or website. The search is slow and clunky, with a tendency to keel over when presented with complex queries. I have spent hours attempting to refine my search and wading through page after page of results only to find that the article I need is not there, nor anywhere else. Several of my colleagues have had the same experience. This has led to conspiracy theories suggesting that the move to GOV.UK has provided a golden opportunity to “lose” documents.

I am reminded of a scene from Yes Minister:

James Hacker: [reads memo] This file contains the complete set of papers, except for a number of secret documents, a few others which are part of still active files, some correspondence lost in the floods of 1967…

James Hacker: Was 1967 a particularly bad winter?

Sir Humphrey Appleby: No, a marvellous winter. We lost no end of embarrassing files.

James Hacker: [reads] Some records which went astray in the move to London and others when the War Office was incorporated in the Ministry of Defence, and the normal withdrawal of papers whose publication could give grounds for an action for libel or breach of confidence or cause embarrassment to friendly governments.

James Hacker: That’s pretty comprehensive. How many does that normally leave for them to look at?

James Hacker: How many does it actually leave? About a hundred?… Fifty?… Ten?… Five?… Four?… Three?… Two?… One?… *Zero?*

Sir Humphrey Appleby: Yes, Minister.

From “Yes Minister” The Skeleton in the Cupboard (TV Episode 1982) – Quotes – IMDb http://www.imdb.com/title/tt0751825/quotes

For “floods of 1967” substitute “transfer of files to GOV.UK”.

The case of the disappearing press release

UK government departments and organisations frequently change their names, merge or disappear altogether. The same applies to their websites and documents held on those sites. Tracking down copies of older reports, data and superseded guidelines and regulations is becoming increasingly difficult, especially as so many sites are now being closed down. Information is supposed to be transferred to the new GOV.UK web site (http://www.gov.uk/) but historical information is in danger of vanishing altogether.

I recently needed to get back to a press release issued by the Potato Council (yes, there really is such a thing!) dated November 9, 2007. The title of the document was “Provisional Estimate of GB Potato Supply for 2007” and I had the original URL in my notes. The URL is no longer on the Potato Council’s web site and searching the site failed to turn up the document. Searching the Potato Council’s web site using the Google site: command also failed to find it. I next ran the URL through Google, Bing and DuckDuckGo and found two references to it in research papers but not the press release itself.

As I had the URL my next stop was the Internet Archive Wayback Machine (http://www.archive.org/) but the archive found nothing. The Wayback Machine periodically takes snapshots of web sites and lets you browse those copies by date. You can enter the URL of a home page or an individual page. The snapshots are not taken every time a website changes so there are gaps in its coverage, and a page or document can be missed. Hoping that the URL might have changed at some point I browsed copies of the Potato Council’s site for late 2007 and early 2008, but no joy.
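A minimal sketch of how these Wayback Machine checks could be scripted in Python follows. It assumes the Internet Archive's availability endpoint (https://archive.org/wayback/available) and the usual web.archive.org/web/<timestamp>/<url> address pattern. The function names are hypothetical and the example.org address is a placeholder, not the Potato Council URL. Fetching is left out so that the sketch runs offline.

```python
import json
from urllib.parse import urlencode


def availability_url(page_url, timestamp=None):
    """Build an availability query for page_url.

    timestamp is an optional YYYYMMDD string; the service reports
    the snapshot closest to that date, if one exists.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's URL from the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def monthly_browse_urls(site_url, year_months):
    """Addresses for browsing archived copies near the start of each month."""
    return [f"https://web.archive.org/web/{ym}01/{site_url}" for ym in year_months]


# Placeholder address standing in for a URL kept in research notes.
print(availability_url("http://www.example.org/press/release.pdf", "20071109"))

# An empty "archived_snapshots" object means nothing was found.
print(closest_snapshot('{"archived_snapshots": {}}'))

# Home page copies to browse for late 2007 and early 2008.
for url in monthly_browse_urls("http://www.example.org/", ["200711", "200712", "200801"]):
    print(url)
```

An empty reply from the availability check only shows that no snapshot of that exact address exists, which is why browsing dated copies of the home page is still worth trying.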

Next I tried the UK Government Web Archive at the National Archives (http://www.nationalarchives.gov.uk/webarchive/). This is similar to the Wayback Machine but concentrates on UK government sites and related official bodies. One of the options is to browse the A-Z directory. I found fewer archive copies than in the Wayback Machine but hoped that the one entry for 2008 might come up trumps. Unfortunately it did not.

Archive copies of the Potato Council web site

Another possibility was that Zanran (http://www.zanran.com/) might have a copy. Zanran concentrates on indexing and searching information contained in charts, graphs and tables of data. It archives copies of the documents and I have used it several times to track down information that has been removed from the live web. A search on potato supply estimate UK 2007 came up with a list of results with my document at the top.

Zanran search result

At first glance, it did not appear to match the document I was looking for because the title was different. The titles listed by Zanran are not always those of the whole document; they are often the labels or captions associated with individual charts and tables. If you hover over the thumbnail to the left of the entry you can see a preview of a much larger section to make sure you have the right document. Clicking on the thumbnail or title will usually take you to Zanran’s archive copy.

Had I not found the press release on Zanran, I would next have contacted the Potato Council. My experience, though, is that very few organisations are able or willing to supply older documents such as press releases. My last resort would have been to contact the authors of the two papers I had found via Google to see if they had kept copies.

I usually keep copies of all papers and pages that I use as part of my research on major projects but inevitably there are times when I forget. As demonstrated above, there are several tools that can be used to try to track down documents that have disappeared from the web, but success is not guaranteed.