Or: Paranoia ‘r’ us
This is a list of Top 10 Tips that the participants of Assessing the Quality of Information compiled at the end of a workshop held at TFPL in London on 31st October 2006. On a scale of 1 to 10, most of the delegates started out with a paranoia level of around 7 or 8. By the time they had worked through half the exercises, a couple of them had increased that to 25-30! Paranoia had eased off slightly by the end of the day, and at least they had a toolkit at their fingertips to help evaluate and assess the quality and validity of information.
- Check who is behind the domain name of a web site using www.allwhois.com. The contact details sometimes just give the ISP or service that organised the domain name for the web site owner, but at least it is a starting point if you need to contact the owner to discuss any issues about the content. If someone really wishes to hide, they can use an agent to do the registration for them, and in that case there is little one can do to track down the real owner. Note that you can only find out who owns a domain name; you cannot take a person’s or company’s name and find out which domain names they own.
- Try the Wayback Machine (Internet Archive) (www.archive.org) for tracking down pages or sites that have disappeared. Type in the web site URL or the URL of the document/page you have ‘lost’. This can pick up pages no longer cached by the search engines (see number 3 below). This trick is not guaranteed: some sites have asked to be removed from the archive or have designed their pages so that they automatically refresh to the most recent page. This can also be a useful tool for reviewing how a company presented itself on the web in the past and how organisations have evolved, both of which can be useful components of assessing quality.
- Look at the search engine cached copies of pages for more recent past pages. This is especially useful if the current web page that you found via Google et al does not seem to resemble your search strategy in any way. The cached copy is the copy that the search engine has in its index and it will also highlight your search terms within the page.
- Use links to and from the site or page to find pages that are similar to a known quality page (pages of similar content tend to link to one another), or to see what other people are saying about the page in terms of quality, and the authority of those that link to it. Use Windows Live (www.live.com). For pages that link in to your known or ‘suspect’ page use the link and linkdomain commands. Link will find pages that link to an individual page, for example: link:www.rba.co.uk/sources/stats.htm
Linkdomain will find pages that link to anywhere within a web site, for example:
To find out which pages a site links to (this can give you an idea of bias, political stance, ideology etc.) use linkfromdomain, for example: linkfromdomain:rba.co.uk
- Use ‘hoaxbusting’ sites if you are suspicious about a site or a ‘well known and accepted fact’. Examples are:
www.vmyths.com (concentrates on virus myths and hoaxes)
- If relevant and appropriate, double check information and data with other independent sources (not always possible, and you may find yourself going round in circles chasing sources that quote each other!)
- Use the search engine advanced options to focus your search. For example, use the domain or site command (or box) to limit your search to UK government sites (gov.uk), academic sites (.ac.uk, .edu etc.) or a known trusted site.
- Use different search tools and their features to give you results that are prioritised in a different order or for suggestions on alternative search strategies:
Yahoo – search.yahoo.co.uk – for results sorted in a different order from Google
AlltheWeb Livesearch – Livesearch.alltheweb.com – for results that change as you type and suggestions for alternative search terms
Ask – www.ask.co.uk – for ways of narrowing down or broadening your search
Exalead – www.exalead.com – for its unique advanced search commands and related terms
Windows Live – www.live.com – for its link, linkdomain and linkfromdomain commands
Think about using different types of resources, for example reference sources, video/audio, blogs and RSS feeds (yes, there are some good ones around!). Have a look at Trovando (www.trovando.it) for some starting points. And don’t forget evaluated listings such as Intute (www.intute.ac.uk) and, for business, Alacrawiki (www.alacrawiki.com).
- If you are looking for up-to-date market research etc., use market research content aggregators to identify who is publishing on a topic and then go direct to the publisher. Individual publishers do not always give their full catalogue to the aggregators, may embargo their information for weeks or months, and may have more up-to-date information on their web site. You can also sometimes get a better deal by going direct to the publisher.
- Dates. Compared with structured databases, proper and accurate date searching is almost impossible with Google et al. A web page is assigned a date by the web server when it is loaded or reloaded onto the web site. It is not the date when the information was gathered or written. The web server date is the one that the search engines look at when you use the date option in the advanced search. Neither should you automatically trust the date that so often appears at the bottom of a page. It may be accurate and reflect the date of the content, but pages can be set up to incorporate the date the page was loaded or reloaded onto the site, the date when minor changes are made, or even today’s date 🙁 If the date is not obvious from the content, contact the author.
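A couple of the tips above (the link commands and the Wayback Machine lookup) can be scripted if you find yourself repeating them. What follows is only a rough sketch in Python, not something that was used at the workshop. The function names are made up for illustration, and the Internet Archive availability endpoint is an addition that the tips themselves do not mention. The sketch only builds query strings and reads an already fetched response; it does not check whether any particular search engine still honours the link, linkdomain or linkfromdomain commands.

```python
from typing import Optional
from urllib.parse import urlencode

# Internet Archive availability endpoint (an assumption added here,
# not something mentioned in the workshop tips).
WAYBACK_API = "https://archive.org/wayback/available"

# Search commands named in the tips above.
ALLOWED_OPERATORS = {"link", "linkdomain", "linkfromdomain", "site"}


def link_query(operator: str, target: str) -> str:
    """Build a query such as 'linkdomain:rba.co.uk' (hypothetical helper)."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    # The commands take a bare host or path, so drop any scheme
    # (http://, https://) and any trailing slash.
    bare = target.split("://", 1)[-1].rstrip("/")
    return f"{operator}:{bare}"


def wayback_lookup_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build an availability request for a page, optionally near a date.

    timestamp is a string of digits such as '20061031' (YYYYMMDD).
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urlencode(params)}"


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the archived copy's URL from a decoded JSON response, if any."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

To use it for real you would fetch the address returned by `wayback_lookup_url` (for example with `urllib.request.urlopen`), decode the JSON and pass the result to `closest_snapshot`. A return value of `None` means no archived copy was reported, which, as noted above, can happen when a site has asked to be removed from the archive.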
Two additional general points were made in conclusion:
- it is important to build up your own personal collection of sites, relevant to your sector and applications, and that you have already quality assessed and trust
- errors and misleading information are not new and pre-date the Internet era. Nothing has changed in that mistakes and bias in the media, whatever the form, are a fact of life. What has changed is that everyone now has the opportunity to become involved in creating and perpetuating myths and misinformation, which means that we have to wade through so much more rubbish and spend more time separating the gold from the dross.