Category Archives: Search Strategies

Articles and top tips in eLucidate

The latest eLucidate from UKeiG is now out and available at http://www.cilip.org.uk/uk-einformation-group/elucidate/elucidate-current-issue. My contributions to this issue are Alphabet Soup (about the changes and restructuring of Google), and top tips on Exploiting Google and Kicking the Google Habit.

The two “top tips” articles came out of two workshops I facilitated for UKeiG in Manchester and came from the participants themselves. I am repeating the workshops – significantly updated following recent announcements – next week in London: Essential non-Google Search Tools and New Google, New Challenges. If you are interested and want to learn more, there is still time to book a place on either or both of the workshops.

Wayback Machine gets funding to rebuild and add keyword searching

The Wayback Machine (http://www.archive.org/), also known as the Internet Archive,  is always a popular site on my search workshops. It is a fantastic way of discovering how web pages looked in the past and for tracking down documents that are no longer on the live web.

UKOLUG Home Page 27th April 1999

It isn’t 100% guaranteed to have what you are looking for and at present you need the URL of the web site or document in order to use it. People often ask if keyword searching is possible; it isn’t at the moment but it will be.
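Because the archive is looked up by address rather than by keyword, a link to a snapshot can be put together from the page URL and a date. The following is a minimal Python sketch. It assumes the commonly seen snapshot address pattern of web.archive.org/web/ followed by a timestamp and the original URL, and that the archive moves you to the nearest capture it holds. The function name and the example.org address are illustrative only.

```python
from datetime import date

# Assumed snapshot address pattern; check it against the archive itself.
WAYBACK_PREFIX = "http://web.archive.org/web/"


def wayback_url(page_url: str, when: date) -> str:
    """Build an address asking the archive for a capture close to `when`."""
    timestamp = when.strftime("%Y%m%d")  # e.g. 19990427
    return f"{WAYBACK_PREFIX}{timestamp}/{page_url}"


# Hypothetical page address, using the date of the screenshot above.
print(wayback_url("http://www.example.org/", date(1999, 4, 27)))
```

If there is no capture for that date the archive may offer a different one or none at all, so treat the link as a starting point.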

The Internet Archive has received support from the Laura and John Arnold Foundation (LJAF) and will be re-building the Wayback Machine. When it is completed in 2017, the next generation Wayback Machine will have more webpages that are easier to find and will include keyword indexing of homepages.

Further details of the rebuild are on the Internet Archive blog at http://blog.archive.org/2015/10/21/grant-to-develop-the-next-generation-wayback-machine/

 

Slides from my talk given at the Anybook Oxford Libraries conference

The slides from my talk at the Anybook Oxford Libraries Conference in July 2015 are now available on Slideshare via the Bodleian Staff Development account.

Google: The Answer to Life, The Universe and Everything  http://www.slideshare.net/BodStaffDev/karen-blakeman

As well as advanced Google search features and alternative search tools I comment on the direction Google is going in. Note that this presentation was given before the Alphabet announcement. Those of you who have attended my Google and non-Google search tool workshops should know most of what is in the slides, but they might serve as a useful reminder.

It is also available on authorSTREAM at http://www.authorstream.com/Presentation/karenblakeman-2553775-google-answer-life-universe-everything/

UKeiG Article: New Google, New Challenges

From “Introducing Spot”, Boston Dynamics, Introducing Spot – YouTube https://www.youtube.com/watch?v=M8YjvHYbZ9w

My article on major changes at Google, “New Google, New Challenges”, is now available in UKeiG’s latest issue of eLucidate at http://www.cilip.org.uk/uk-einformation-group/elucidate-ukeigs-journal/elucidate-current-issue/new-google-new-challenges

As well as the general dumbing down and relentless removal of search options, it covers the new technologies that Google is experimenting with: artificial intelligence, driver-less cars, robotics, home environment sensors and controls. Some of this is already being integrated with search and “mobile”.

I am running  a “New Google, New Challenges” workshop for UKeiG this autumn in Manchester and London. It concentrates on search, how the changes at Google are impacting the way it manages our search and presents results, and how to use what is left of the advanced search techniques and specialist databases for more relevant research results.

Google dumps Reading Level search filter

It seems that Google has dumped the Reading Level search filter. This was not one that I used regularly but it was very useful when I wanted more serious, in-depth, research or technically biased articles rather than consumer or retail focused pages. It often featured in the Top Tips suggested by participants of my advanced Google workshops.

It was not easy to find. To use it you had to first run your search and then from the menu above the results select ‘Search tools’, then ‘All results’, and from the drop menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels then appeared just above the results.

Slide showing Google Reading Levels from one of my search workshops

More details of how it worked are in the blog posting I wrote when it was launched in 2010 (http://www.rba.co.uk/wordpress/2010/12/13/x-factor-web-pages-are-advanced-says-googles-reading-level/).

So another tool that helped serious researchers find relevant material bites the dust. I daren’t say what I suspect might be next but, if I’m right, its disappearance could make Google unusable for research.

More UK information vanishes into GOV.UK

Just when you’ve finally worked out how to search some of the key UK government web resources they disappear into the black hole that is GOV.UK.

The statistics publication hub went over a few weeks ago and the link http://www.statistics.gov.uk/ now redirects to http://www.gov.uk/government/statistics/announcements. Similarly, Companies House is now to be found at http://www.gov.uk/government/organisations/companies-house and the Land Registry is at http://www.gov.uk/government/organisations/land-registry. Most of the essential data, such as company information and ownership of properties, can still be found via GOV.UK and in fact some remains in databases on the original websites. For example, following the links on GOV.UK for information on a company eventually leads you to the familiar WebCHeck service at http://wck2.companieshouse.gov.uk/. Companies House’s useful list of overseas registries, however, seems to have totally disappeared but is in fact hidden in a general section covering all government “publications” (http://www.gov.uk/government/publications/overseas-registries#reg).

Documents may no longer be directly accessible from the new departmental home pages so a different approach is needed if you are conducting in-depth research. GOV.UK is fine for finding out how to renew your car tax or book your driving theory test – two of the most popular searches at the moment – but its search engine is woefully inadequate when it comes to locating detailed technical reports or background papers. Using Google’s or Bing’s site command to search GOV.UK is the only way to track them down quickly, for example biofuels public transport site:www.gov.uk.  Note that you need to include the ‘www’ in the site command as site:gov.uk would also pick up articles published on local government websites. This assumes, though, that the document you are seeking has been transferred over to GOV.UK.
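If you run this kind of search often, the address for it can be generated rather than typed. This is a rough Python sketch using only the standard library. It assumes the familiar search addresses for Google UK and Bing with the query in a "q" parameter; the function name is my own.

```python
from urllib.parse import urlencode

GOOGLE_UK = "https://www.google.co.uk/search"  # assumed search address
BING = "https://www.bing.com/search"           # assumed search address


def site_search_url(terms: str, site: str, engine: str = GOOGLE_UK) -> str:
    """Return a search address that limits `terms` to one site."""
    query = f"{terms} site:{site}"
    return f"{engine}?{urlencode({'q': query})}"


# Keep the 'www' so that local government sites under gov.uk are left out.
print(site_search_url("biofuels public transport", "www.gov.uk"))
print(site_search_url("biofuels public transport", "www.gov.uk", engine=BING))
```

The same function works for any other site that has a weak internal search engine.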

There have been complaints from researchers, including myself, that an increasing number of valuable documents and research papers have gone AWOL as more departments and agencies are assimilated Borg-like by GOV.UK. Some of the older material has been moved to the UK Government Web Archive at http://www.nationalarchives.gov.uk/webarchive/.

This offers you various options including an A-Z of topics and departments and a search by keyword, category or website. The latter is slow and clunky with a tendency to keel over when presented with complex queries. I have spent hours attempting to refine my search and wading through page after page of results only to find that the article I need is not there, nor anywhere else, which is an experience several of my colleagues have had. This has led to conspiracy theories suggesting that the move to GOV.UK has provided a golden opportunity to “lose” documents.

I am reminded of a scene from Yes Minister:

James Hacker: [reads memo] This file contains the complete set of papers, except for a number of secret documents, a few others which are part of still active files, some correspondence lost in the floods of 1967…

James Hacker: Was 1967 a particularly bad winter?

Sir Humphrey Appleby: No, a marvellous winter. We lost no end of embarrassing files.

James Hacker: [reads] Some records which went astray in the move to London and others when the War Office was incorporated in the Ministry of Defence, and the normal withdrawal of papers whose publication could give grounds for an action for libel or breach of confidence or cause embarrassment to friendly governments.

James Hacker: That’s pretty comprehensive. How many does that normally leave for them to look at?

James Hacker: How many does it actually leave? About a hundred?… Fifty?… Ten?… Five?… Four?… Three?… Two?… One?… *Zero?*

Sir Humphrey Appleby: Yes, Minister.

From “Yes Minister” The Skeleton in the Cupboard (TV Episode 1982) – Quotes – IMDb  http://www.imdb.com/title/tt0751825/quotes 

For “floods of 1967” substitute “transfer of files to GOV.UK”.

Top ten Google search tips from Oxford

Training room with a view

These Top Ten search tips come from an advanced workshop I recently ran for a group in Oxford. If this is the first Top Tips list that you have read on this blog, here are a few words of explanation as to how the lists are generated. The tips are not my own personal ones but are nominated by people who have attended my full day workshops and tried out the various commands and techniques during the practical sessions.

The participants on this particular workshop were experienced, heavy duty researchers so I was keen to see what they came up with.

 

1. Verbatim
This is a regular in the Top Ten lists on this blog. It is an essential tool for making Google behave and forcing it to run your search the way you want it run but is well hidden. Google automatically looks for variations on your terms and sometimes drops terms from the search. To make Google carry out your search exactly as you have typed it in, first run your search, then click on ‘Search tools’ in the menu above your results. In the second line of options that appears click on ‘All results’ and from the drop down menu select Verbatim. This is very useful when searching for an article by title and Google decides to ignore the double quote marks, which it sometimes does if it thinks you don’t have enough results. If you are carrying out in-depth research it is worth using Verbatim even if your “normal” Google results seem to be OK. You may see very different content in your results list.

2. site: search and -site:
Use the site: command to focus your search on particular types of site, for example site:ac.uk for UK academic websites, or to search inside a large rambling site. If you prefer you can use the Advanced search screen at http://www.google.co.uk/advanced_search and fill in the site or domain box. You can also use -site: to exclude sites from your search.

3. filetype:
Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics or PDF for research papers and industry/government reports.

4. Asterisk * between terms
Use the asterisk between two words to stand in for 1-5 words. This is useful if you want two of your keywords close to one another but suspect that there may often be one or two words separating them. For example solar * panels will find solar photovoltaic panels, solar water heating panels etc.

5. Numeric range search
This command is unique to Google. Use it for anything to do with numbers – years, temperatures, weights, distances, prices etc. Simply type in your two numbers separated by two full stops as part of your search. For example, use it to limit your search to forecasts covering a future time period.

6. Incognito/private browsing
Even if you are not signed in to a Google account, Google personalises your results according to your search and browsing behaviour using the cookies that are stored on your computer. If you want to burst out of the filter bubble, as it is often called, use a private browser window or incognito (Chrome). Google will then ignore tracking and search cookies on your machine. To call up a private browser or incognito window use the following keys:

Chrome –  Ctrl+Shift+N
Firefox – Ctrl+Shift+P
Internet Explorer – Ctrl+Shift+P

7. Public Data Explorer
The Public Data Explorer is one of Google’s best kept secrets. It can be found at http://www.google.com/publicdata/ and allows you to search open data sets from organisations such as the IMF, OECD, Eurostat and the World Bank. You can compare the data in a number of ways and there are several charting options.

8. Repeat search terms
If you are fed up with seeing the same results for a search repeat your main search term or terms. This often changes the emphasis of your search and the order in which the results appear.

9. Change order of terms
Changing the order in which you type in your search terms can change the order of your results. The pages that contain the terms in the order you specified in your search are usually given a higher weighting. This is another useful tip for when you are stuck in a search rut and are seeing the same results over and over again.

10. Different country versions
The country versions of Google give priority to the country’s local content, although it might be in the local language. This is a useful strategy when searching for research groups, companies and people that are active in a specific country. Use the standard ISO two letter country code, for example http://www.google.fr/ for Google France, http://www.google.it/ for Google Italy. It is also worth trying your search in Google.com. Your results may be more international or US focused and Google usually rolls out new search features in Google.com before launching in other country versions. If Google insists on redirecting you to your own local country version, go to the bottom right hand corner of the Google home page and you should see a link to Google.com.
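The commands in tips 2 to 5 can be combined in a single search. As a rough illustration, here is a small Python helper that assembles such a query string. The function and parameter names are hypothetical, and the years in the example are made up for the purpose; it only builds the text that you would paste into the search box.

```python
from typing import Iterable, Optional, Tuple


def build_query(
    terms: str,
    site: Optional[str] = None,
    exclude_sites: Iterable[str] = (),
    filetype: Optional[str] = None,
    number_range: Optional[Tuple[int, int]] = None,
) -> str:
    """Combine keywords with site:, -site:, filetype: and a numeric range."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    parts.extend(f"-site:{s}" for s in exclude_sites)
    if filetype:
        parts.append(f"filetype:{filetype}")
    if number_range:
        low, high = number_range
        parts.append(f"{low}..{high}")  # two numbers separated by two full stops
    return " ".join(parts)


# The asterisk stands in for one or more words between solar and panels.
print(build_query("solar * panels", site="ac.uk", filetype="pdf",
                  number_range=(2015, 2020)))
```

Remember that Google may still vary or drop terms, so run the result with Verbatim switched on if the results look odd.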

Anything but Google – Top Tips

This collection of Top Tips is a combined list nominated by those who attended the autumn and spring UKeiG workshops on “Anything but Google”. The participants came from all sectors and types of company, and included a couple of self employed researchers. The sessions covered both general search tools and specialist services, and the list is an interesting mix of strategies and specific sites. A big “Thank you” to everyone who participated in the workshops.

1. Get to know the advanced search commands and options.
Google is not the only search tool that uses them and they can help focus your search, especially when using general search tools such as Bing.

2. If you are conducting serious research don’t stop with the first reasonable looking results.
Information of dubious quality can infiltrate even the most well respected of specialist websites. Put on your “skeptical goggles” as one delegate said! There are plenty of alternative tools and resources out there so get some corroboration from additional sources before acting on the information you find.

3. Allocate time for your search.
If you are carrying out in-depth research don’t leave it to the last minute. You will probably need to tweak your strategy and try different search tools to ensure that you are retrieving the best information. It can sometimes take longer than you anticipate.

4. Plan your strategy.
Think about the type of search you want to conduct and the type of information you are looking for. For example if you are carrying out a systematic review and want to use Boolean operators forget about Google; head for Bing instead. And if you need official statistics or company information go straight to specialist sites that provide that data.

5. Don’t stick with what you regularly use.
Experiment with other resources, especially if you suspect your default search tool is not telling you the whole story.

6. Country versions of search tools.
Many search tools offer country versions that give priority to the country’s local content, although that might be in the local language. This is a useful strategy when searching for industries, companies and people that are active in a particular country.

7. Learn when to try something else.
If a site’s navigation or internal search engine seems to be returning rubbish don’t struggle with it. Try another route to get to the information. Either try an alternative source of information or use the ‘site:’ command – available in Bing as well as Google – to search inside the site.

8. DuckDuckGo http://www.duckduckgo.com/.
This was recommended for its clean, straightforward layout and the range of resources it offers on a topic. A school librarian commented that the pupils at her school loved it.

9. MillionShort  http://millionshort.com/.
If you are fed up with seeing the same results from Google again and again give MillionShort a try. MillionShort enables you to remove the most popular web sites from the results. Originally, as its name suggests, it removed the top 1 million but you can change the number that you want omitted. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing.

10. Carrot Search http://carrotsearch.com/

Carrot Search foam tree

This was recommended for its clustering of results and also the visualisations of terms and concepts via the circles and “foam tree”. There is a link to the live web demo on the left hand side of the home page.

11. Microsoft Academic Search – charts http://academic.research.microsoft.com/
This is a direct competitor to Google Scholar. The site can be slow to load and it sometimes assigns authors to the wrong institution. Nevertheless, the visualisations such as the co-author and citation maps can be useful in identifying who else is working in a particular area of research. The visualisations can be accessed by clicking on the Citation Graph image to the left of the search results or an author profile.

12. Creative Commons and public domain images.
Use the Bing license option (US version only) to search for images with creative commons or public domain licenses, but do go to the original webpage and check that the license is indeed associated with the image you want to use. Alternatively use one of the following:

Flickr  Creative Commons http://www.flickr.com/creativecommons
Flickr The Commons http://www.flickr.com/commons/
Wikimedia Commons http://commons.wikimedia.org/
MorgueFile.com  http://www.morguefile.com/
Geograph http://www.geograph.org.uk/
Nasa http://www.nasa.gov/

13. Tineye Multicolr http://labs.tineye.com/multicolr/.
“Search 10 million Creative commons Flickr images by colour.” You can specify more than one colour and move the dividing bar between two colours to increase/decrease their prominence within the image. Click through to the original Flickr image to double check the license.

14. Company Check http://www.companycheck.co.uk/
Company Check repackages Companies House data and provides 5 years of accounts, and graphs for some financials free of charge. It also lists the directors of a company. Click on a director’s name and you can view other current and past directorships for that person. It provides more free information than Companies House but you have to register (free) to gain full access. Additional information such as credit risk, CCJs, credit reports, and many Companies House documents are priced or available as part of a subscription.

15. Guardian Data Store http://www.guardian.co.uk/data/
For datasets and visualisations relating to stories currently in the news. As well as the graphs and interactive maps the source of the data is always given and there are links to the original datasets that are used in the articles.

16. Zanran http://zanran.com/
This is a search tool for searching information contained in charts, graphs and tables of data and within formatted documents such as PDFs, Excel spreadsheets and images. Enter your search terms and optionally limit your search by date and/or format type. One delegate said “It has changed my life!”. (We think/hope she meant her working life.)

17. Keep up to date.
Keep up to date with what the search engines are up to, changes to key resources and new sites. Identify blogs and commentators that are relevant to your research interests and subject areas, and follow them using RSS or email alerts.

Google shows missing search terms

Several weeks ago I noticed that Google was displaying the terms it had dropped from your search as ‘Missing’. Google started routinely ignoring selected search terms towards the end of 2011 (see http://www.rba.co.uk/wordpress/2011/11/08/dear-google-stop-messing-with-my-search/). Google’s response to the outcry from searchers was to introduce the Verbatim search option. However, there was no way of checking whether all of your terms appeared in a result other than viewing the whole page. Irritating, to say the least, if you found that the top 10 results did not include all of your keywords.

Fast forward to December 2013, and some people started seeing results lists that showed missing keywords as strikethroughs. I saw them for a few days and then, just as I was preparing a blog posting on the feature, they disappeared! I assumed that they were one of Google’s live experiments never to be seen again but it seems they are back. Two people contacted me today to say that they are seeing strikethroughs on missing terms. I ran my test searches again and, yes, I’m seeing them as well.

I ran the original search that prompted my November 2011 article (parrots heron island Caversham UK) and included -site:rba.co.uk in the strategy to exclude my original blog postings. Sure enough, the first two results were missing parrots and had “Missing parrots” underneath their entry in the list.

Google Missing Terms Shown as Strikethroughs

Remember, though, that Google automatically looks for variations on your search terms. Your original keyword may not be present in the results but a synonym may be, for example birds instead of parrots. I did find one entry in the top 10 results that seemed out of place. It was an article in the local newspaper. There was no ‘Missing….’ underneath the result but I could not find synonyms of parrots anywhere in the story. The closest was the surname of a person mentioned in the article – Parry – and I can only assume that this was Google truncating my ‘parrots’ keyword and looking for variations. But Parry was not highlighted in the entry in the results list as synonyms are. I shall look at the article in more detail to check that I have not missed a lurking parakeet or cockatoo!


I need to carry out more test searches to see how reliable this new feature is but, nevertheless, it is a welcome and useful addition to Google. If your top 10 results are showing strikethroughs then you know you have to either prefix the missing term with ‘intext:’ or use Verbatim for the whole search.
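The intext: fix is easy to apply mechanically once you know which terms Google reported as missing. Here is a minimal Python sketch; the function name is hypothetical and it simply rewrites the query text, leaving you to run the search yourself.

```python
from typing import Iterable


def require_terms(query: str, missing: Iterable[str]) -> str:
    """Prefix each term reported as missing with intext:."""
    wanted = {term.lower() for term in missing}
    words = []
    for word in query.split():
        # Only plain keywords are changed; commands such as -site: are kept as typed.
        if word.lower() in wanted:
            words.append(f"intext:{word}")
        else:
            words.append(word)
    return " ".join(words)


print(require_terms("parrots heron island Caversham UK -site:rba.co.uk",
                    ["parrots"]))
```

For a search where several terms are being dropped, Verbatim is probably the quicker route.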

Top search tips from Exeter and Bristol

A couple of weeks ago I was in Exeter and Bristol leading workshops for NHS South West on “Google & Beyond”. We covered advanced Google commands, Google Scholar and alternatives to Google. Below are the combined top tips from the two sessions. I may have missed a couple from the list as I could not read my writing, so if you attended one of the workshops let me know if I’ve omitted your suggested tip.

  1. Verbatim. Yet again, this has topped the list of useful Google search options. Google automatically looks for variations on your search terms and sometimes drops terms from your search without telling or asking you. To make Google run your search exactly as you have typed it in, first run your search. Then click on ‘Search tools’ in the menu above your results, in the second line of options that appears click on ‘All results’ and from the drop down menu select Verbatim.
  2. Be aware of personalisation. Even if you are not signed in to a Google account Google personalises your results according to your search and browsing behaviour. Personalisation is not necessarily a bad thing but if you want to burst out of the filter bubble, as it is often called, use a private browser window or incognito (Chrome). Google will then ignore tracking and search cookies on your machine and will not personalise your results. To call up a private browser or incognito window use the following keys:

Chrome –  Ctrl+Shift+N
Firefox – Ctrl+Shift+P
Internet Explorer – Ctrl+Shift+P

  3. site: Use the site: command to focus your search on particular types of site, for example site:nhs.uk, or to search inside a large, rambling site. You can also use -site: to exclude sites from your search.
  4. intext: Google’s automatic synonym search can be helpful when looking for alternative terms, but if you want a term to be included in your search exactly as you have typed it then prefix the word with intext:.
  5. filetype: Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics or PDF for research papers and industry/government reports. Note that in Google filetype:ppt and filetype:xls will not pick up the newer .pptx and .xlsx formats so you will need to include those in your strategy, for example filetype:ppt OR filetype:pptx, or run separate searches for each one. In Bing.com, though, filetype:pptx will pick up both .ppt and .pptx files.
  6. Advanced search commands and search options. Learn how to use the search commands (for example intext:, filetype: and site:). Many of these can be used on the advanced search screen that can usually be found under the cog wheel in the upper right hand area of the screen, but that link sometimes disappears so learning the commands is a better bet. A list of the more useful Google commands is at http://www.rba.co.uk/search/SelectedGoogleCommands.shtml.
  7. Combine advanced search commands. Practise combining the advanced search commands for a more precise, focused set of results.
  8. Google Reading level. This changes the type of results that you see. Run your search and from the menu above the results select ‘Search tools’, ‘All results’ and from the drop menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Click on the Advanced option to see results biased towards research. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to publications. It seems to involve an analysis of sentence structure, the length of sentences, the length of the document and whether scientific or industry specific terminology appears in the page.
  9. Numeric range. This command is unique to Google. Use it for anything to do with numbers – years, temperatures, weights, distances, prices etc. Simply type in your two numbers separated by two full stops as part of your search. This is a good way of limiting your search, for example, to forecasts over the next few years.
  10. Limiting your search by date. To limit your search by date, for example the last month or year, first run your search. Then click on ‘Search tools’ in the menu above the results and from the second row of options that appears click on ‘Any time’. Select your time period or a custom range from the drop down menu.
  11. Use the minus sign to exclude documents containing a word. If you do not want documents containing a specific word prefix that term with a minus sign. The minus sign can also be used with commands such as site: and filetype: to remove an individual site or type of document from your results.
  12. Million Short http://millionshort.com/. If you are fed up with seeing the same results from Google again and again give Million Short a try. Million Short runs your search and you can choose to remove the most popular web sites from the results. Originally, as its name suggests, it automatically removed the top 1 million but now you can choose to remove the most popular 100, 1000, 10k, 100k or million sites. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing.
  13. Creative commons searches for images. Rather than search for images and go through them individually to find one that you can legally use in your document or presentation, use advanced search options or tools that allow you to select the appropriate license from the start. In Google, use the usage rights menu on the image advanced search screen to search for images with the license you need. The US version of Bing images includes a license option in the menu at the top of your results.

Bing Image License option
Double check the license of the photo on the website or blog hosting it. The license you need may be associated with a different image and yours could, for example, be ‘all rights reserved’. Flickr has a page where you can search for images with a specific Creative Commons license at http://www.flickr.com/creativecommons.

  14. Compare in Google. This is not a Google command but if you type in a search such as compare carrots with cabbage Google will create a table comparing the properties of the two items. Google has been known to get some of the data wrong, though, so it’s worth double checking the figures before you use them.
  15. Web archives. Want to see what was on a website a few years ago or trying to track down a document that seems to have vanished from the web? Try the Internet Archive Wayback Machine at http://www.archive.org/. Enter the URL of the website or document and you should then see a calendar of the snapshots that the archive has of the site or document. Choose a date from the calendar to view the page. The archive does not have everything but it is worth a try. See also the UK National Archives of old government websites and pages at http://www.nationalarchives.gov.uk/webarchive/ and the UK Web Archive at http://www.webarchive.org.uk/ukwa/.
  16. Statistics sites. Although you can often find statistics via Google, you may find dedicated official statistics sites quicker and more reliable. Some of the sites we covered during the workshops were:

    NHS Statistics Links http://www.nhs.uk/Pages/LinkListing.aspx?CategoryId=Statistics
    UK National Statistics Publication Hub http://www.statistics.gov.uk/
    Office for National Statistics http://www.ons.gov.uk/
    Welsh Government Statistics http://wales.gov.uk/topics/statistics/
    Welsh Assembly Government StatsWales http://statswales.wales.gov.uk/
    UK Open data http://data.gov.uk/
    Eurostat http://ec.europa.eu/eurostat/
    European Union Open Data Portal http://open-data.europa.eu/en/
    Zanran http://www.zanran.com/
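Two of the tips above, the filetype: note about older and newer Office formats and the minus sign, lend themselves to a small helper. The Python sketch below is illustrative only: the names are hypothetical, the mapping covers just the two format pairs mentioned in the filetype tip, and the topic words in the example are invented.

```python
# Older Office extensions and the newer ones that Google treats separately.
LEGACY_TO_MODERN = {"ppt": "pptx", "xls": "xlsx"}


def filetype_clause(extension: str) -> str:
    """Return a filetype: clause that also covers the newer format, if any."""
    ext = extension.lower().lstrip(".")
    modern = LEGACY_TO_MODERN.get(ext)
    if modern is None:
        return f"filetype:{ext}"
    return f"filetype:{ext} OR filetype:{modern}"


def exclude(*items: str) -> str:
    """Prefix each word or command with a minus sign."""
    return " ".join(f"-{item}" for item in items)


# Invented topic words, restricted to NHS sites and presentations.
query = " ".join(["hand hygiene audit", "site:nhs.uk",
                  filetype_clause("ppt"), exclude("site:example.com")])
print(query)
```

In Bing the extra OR clause should not be needed for presentations, as noted in the filetype tip.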