Category Archives: Search Strategies

Google offers to include missing search terms – sometimes

Google has been omitting terms from searches for several years. For me, the matter came to a head way back in November 2011 (see Dear Google, stop messing with my search). Many of us had noticed it happening for a while but what suddenly made it more frustrating was that one could no longer prefix a term with a plus sign to force its inclusion in a search. Furthermore, surrounding terms and phrases with double quote marks did not always work either.

Google’s Dan Russell explained why in a comment to my blog posting:

“When you do a multi-term query on Google (even with quoted terms), the algorithm sometimes backs-off from hard ANDing all of the terms together. It’s a kind of “soft” backoff. Why? Because it’s clear that people will often write long queries (with anywhere from 5 to 10 terms) for which there are no results. Google will then selectively remove the terms that are the lowest frequency to give you some results (rather than none). Bear in mind that 99% of searchers have no idea why they’d want to hard AND, and just get frustrated when they get no results. The soft AND is a way to reduce the overall frustration and give the searcher something to examine (and with luck, a chance to reformulate their query).”

He added:

“But I see what you mean about wanting to know if there are NO hits to a given query. I’ll pass this information along to the Google design team and see if we can’t do something with this.”

Well,  Google did do something about it and some weeks later Verbatim, which could be applied to your entire search and make Google run it without omissions or variations,  was added as a tool. The other option that existed then, and still does, is to prefix individual terms or phrases with ‘intext:’.

If you did not use Verbatim you were still left guessing as to whether or not all of your terms or their synonyms were present in a particular document until you actually clicked on it and viewed it in its entirety. A couple of years ago, Google started to include information on omitted terms in the results snippets by adding a “Missing: ” statement underneath the entry. At least we now had something to work with. Google has now added a search option to it. It started to appear 2-3 months ago, disappeared for a while, but now seems to be a permanent feature. It enables you to tell Google that it must include the missing term. Let’s work through the example that first alerted me to it: a search for broad beans called Eleonora and supplied by Tamar Organics.

Before you ask, the reason I did not go directly to the Tamar Organics website was because it was quicker to go via Google than to work through the seed supplier’s site search and navigation system. Also, please note that if you try this search out yourselves you will probably get very different results. When we tried this in a workshop of 20 people we ended up with 11 variations on the theme!

First, the quick and simple approach of just throwing in a few terms:

broad beans eleonora tamar organics

The first two results were relevant and exactly what I was looking for,  but 8 results seemed a bit low especially as Google had indicated on the next two in the list that the term “eleonora”  was missing.  (We’ll come back to the “Must include: ” in a moment.) Going to the bottom of the results page there was the usual message that similar entries had not been displayed.

Erm… but, Google, you displayed 8 not 15 as you claim.  Let’s play along, anyway, and repeat the search by clicking on the link Google gives us. This time I was given 11 results.  We know that Google often gets the count wrong when using the repeat search option but I still thought that the number of results was rather low if it was omitting terms.  What would happen if I decided to take Google up on its offer of “Must include: eleonora”? Two, three or perhaps just four results?  I clicked on the eleonora link and …. 20,700 results!

In the search bar above the results we can see that Google has put eleonora in double quote marks to force its inclusion.

The first three results were fine but when I looked in detail at the fourth document it was missing both tamar and organics, and there was no indication in the snippets provided by Google that these, or any other terms, had been omitted.

Going back to my first set of results and looking further down the list I saw that, as well as one from which eleonora was omitted, there was another that had left out both eleonora and tamar, and a third with just tamar missing.

If the “Must include:” option has more than one term, you can only choose one of them. You cannot have all of them.  Choosing tamar gave me  43,500  results but this time Google did tell me when eleonora was missing from the documents. Most of the results were totally irrelevant.

How would I normally deal with missing terms?  I generally start off with a quick and dirty search and, unless I am looking for a particular type of document such as a presentation or industry report, I don’t always use advanced commands.  I just type in the separate words and in this case I did get what I wanted at the top of the page. But what if I hadn’t?

I was interested in the variety of broad beans called Eleonora but Google was omitting it from some of the results.  I could have done what Google did and use quote marks around eleonora but my experience is that Google sometimes ignores those if the number of results is low. My usual strategy is to use ‘intext:’ before the missing word, for example:

broad beans intext:eleonora tamar organics

This gave me 18,400 results with, again, most of them missing one or more terms.

Deciding to trust Google not to ignore double quote marks I changed my search to:

“broad beans” “eleonora” “tamar organics”

This time it was just 3 results, and when I repeated the search to include the omitted results I saw 5 but nothing from the Tamar Organics website itself. The reason for this was the presence of the phrase “broad beans” in the search string.  Looking at the results in my very first  search, I saw that Google was picking up the phrases “broad bean” and “beans (broad)” so I was now missing out on the top and most relevant results. A reminder that one needs to think very carefully about how and in what order search terms may appear in documents before applying phrase searching.

For comparison I applied Verbatim to the original quick and dirty search and got 411 results.  The main problem with that set was that Tamar and Organics were appearing in the documents separated by several words or even sentences.  When I applied Verbatim to the search string:

broad beans eleonora “Tamar Organics”

I was presented with a respectable list of 18 relevant results.
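The query variants tried above differ only in how individual terms are marked up, so they are easy to generate side by side. Below is a minimal sketch, assuming a hypothetical helper called `build_query` (it is not part of any Google tool). It only assembles the search string and cannot guarantee that Google will honour the quote marks or `intext:`.

```python
def build_query(terms, force=(), method="quotes"):
    """Join search terms, marking those listed in `force` for mandatory inclusion.

    method="quotes" wraps a forced term in double quote marks.
    method="intext" prefixes it with intext: (a phrase is quoted as well).
    """
    forced = {t.lower() for t in force}
    parts = []
    for term in terms:
        is_phrase = " " in term
        if term.lower() in forced:
            if method == "intext":
                # intext: applies to the token that follows it, so a phrase needs quotes
                part = f'intext:"{term}"' if is_phrase else f"intext:{term}"
            else:
                part = f'"{term}"'
        else:
            # Terms that are not forced are left as loose words
            part = term
        parts.append(part)
    return " ".join(parts)


terms = ["broad beans", "eleonora", "tamar organics"]
print(build_query(terms))                                       # quick and dirty
print(build_query(terms, force=["eleonora"], method="intext"))  # intext: on one term
print(build_query(terms, force=terms))                          # everything quoted
```

Keeping the variants together like this makes it easier to note which one was used for which set of results when the counts differ as much as they did here.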

So, is the “Must include:” option worth using? It is quick and easy to apply, especially on a mobile device, and I suspect that is why it has been introduced. However, it all starts to get very messy and complicated if you try to use it on subsequent sets of results. When I’m searching on my laptop, or on a desktop, I sometimes try the link but if that set of results is disappointing and Google drops a different selection of terms I go back to my practice of using intext and/or Verbatim. I also try double quotes around terms and phrases but my experience is that Google still occasionally ignores them. It is entirely up to you which approach you use. How well each works does vary from one search to another, and on whether or not you are allowing Google to adjust results according to your search history and behaviour. The important thing is to be aware of the options available to you and to be willing to experiment.

Google makes it harder to change location for country specific research

Google has made a major change to search and it does not bode well. Results are now based on your current location. So what’s new?  Google has always looked at your location, even down to city/town level, and changed the results accordingly. That is fine if you are travelling and want to find the nearest Thai restaurant via your mobile, for example. Presenting a list of eateries in my home town of Reading is no good to me if I’m away in Manchester and getting very hungry!

The problems start if you are researching a person, company or industry based in a country other than your own – let’s use Norway as an example – or just want the latest news from that country.  The trick used to be to go to the relevant country version of Google, in this case www.google.no, run your search and Google would give preference to Norwegian content. It is a great way to get alternative viewpoints on a topic and more relevant “local” information on a subject. Now, regardless of which version of Google you go to, you will see the same results tailored for your home location.

In a blog posting Making search results more local and relevant Google says:

Today, we’ve updated the way we label country services on the mobile web, the Google app for iOS, and desktop Search and Maps. Now the choice of country service will no longer be indicated by domain. Instead, by default, you’ll be served the country service that corresponds to your location. So if you live in Australia, you’ll automatically receive the country service for Australia, but when you travel to New Zealand, your results will switch automatically to the country service for New Zealand. Upon return to Australia, you will seamlessly revert back to the Australian country service.

This confirms that mobile search is what Google is concentrating on. After all it is, one assumes, where Google makes most of its money but it does not help professional researchers.

There is a way around it but it is rather long-winded. You need to go to Settings – use either the link in the bottom right hand corner of your Google home page or the one near the top of a search results page – and click on Advanced Search.

Google Settings Menu

On the Advanced Search screen scroll down to “Then narrow your results by…” and use the pull down menu in the region box to select the country.

Google Advanced Search Region

I ran a search on Brexit in google.co.uk, google.no and a few other country versions of Google. All gave me essentially the same results.

Google UK results for Brexit

Using the region filter and selecting Norway as the country I am given the following by Google:

Google Norway Region Filter

Notice, though, that Google is giving me English articles or English versions of them. Google has decided that I would prefer English articles and I have to scroll down to number 10 and beyond to see pages in Norwegian. To get a  broader view of what is being said in Norway about Brexit I have to go back into settings, click on Languages and choose Norwegian/Norsk.

Brexit search with region and language filter on
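If you repeat this kind of search often, going through the Settings menus each time is tedious. The sketch below builds a search URL with the region and language filters already applied. It assumes that the filters correspond to the URL parameters `cr` (for example `countryNO`) and `lr` (for example `lang_no`); that is an assumption on my part and is not confirmed by Google's announcement, so compare the result with the address bar after setting the filters by hand. The function name `google_search_url` is hypothetical.

```python
from urllib.parse import urlencode


def google_search_url(query, country=None, language=None, host="www.google.com"):
    """Build a Google search URL with optional region and language filters.

    ASSUMPTION: `cr` and `lr` are the parameters behind the Advanced Search
    region and language filters. Verify against your own address bar.
    """
    params = {"q": query}
    if country:
        params["cr"] = f"country{country.upper()}"   # e.g. countryNO for Norway
    if language:
        params["lr"] = f"lang_{language.lower()}"    # e.g. lang_no for Norwegian
    return f"https://{host}/search?{urlencode(params)}"


print(google_search_url("Brexit", country="NO", language="no"))
```

The URL can be bookmarked per country, which at least saves repeating the menu steps, although it is no protection against Google removing the filters altogether.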

Oh – and you get slightly different results if you go through a VPN and set Norway as the country.

What worries me even more is that Google could do away with the advanced search screen and the region filter with it.

Google says:

We’re confident this change will improve your Search experience, automatically providing you with the most useful information based on your search query and other context, including location.

No, Google. You have just made things more difficult for those of us who conduct serious, in-depth research. The way I feel about this change at the moment is that if you were a person I would take a baseball bat to your head!

UPDATE: In response to David Pearson’s comment and reminder below.
Including a site command e.g. site:no in the search works relatively well for this particular example (Norway) and gives good but slightly different results. It will, of course, miss Norwegian sites that are registered as .com or other international domains. The amount of overlap (or lack of it) will vary depending on the country. It’s another one to add to the list of strategies, which I am sure will become longer,  for dealing with this problem.

Essential Non-Google Search Tools for Researchers – Top Tips

This is the list of Top Tips that delegates attending the UKeiG workshop on 7th September 2016 in London came up with at the end of the training day.  Some of the usual suspects such as the ‘site:’ command, Carrot Search and Offstats are present but it is good to see Yandex included in the list for the first time.

  1. Carrotsearch http://search.carrotsearch.com/carrot2-webapp/search or http://carrotsearch.com/ and click on the “Live Demo” link on the left hand side of the page.
    This was recommended for its clustering of results and also the visualisations of terms and concepts via the circles and “foam tree”. The Web Search uses eTools.ch for the general searches and there is also a PubMed option.

    Carrot Search Foam PubMed Foam Tree
  1. Advanced Twitter Search http://twitter.com/search-advanced
    The best way to search Twitter! Use the Advanced Search http://twitter.com/search-advanced or click on “More Options” on the results page. There is a detailed description of the commands and how they can be used at https://blog.bufferapp.com/twitter-advanced-search
  1. Yandex http://www.yandex.com/
    The international version of the Russian search engine with a collection of advanced commands – including a proximity operator – that makes it a worthy competitor to Google. Run your search and on the results page click on the two lines next to the search box.

    Yandex Advanced Search

    Alternatively, use the search operators. Most of them are listed at https://yandex.com/support/search/how-to-search/search-operators.xml. There is also a /n operator that enables you to specify that words/phrases must appear within a certain distance of each other, for example:

    "University of Birmingham" nanotechnology /2 2020

    There are country versions of Yandex for Russia, Ukraine, Belarus, Kazakhstan and Turkey. You will, though, need to know the languages to get the best out of them and apart from Turkey they use a different alphabet.

  1. MillionShort http://millionshort.com/
    If you are fed up with seeing the same results from Google again and again give MillionShort a try. MillionShort enables you to remove the most popular web sites from the results. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so specialised that it never makes it into the top results in Google or Bing. Originally, as its name suggests, it removed the top 1 million but you can change the number that you want omitted. There are filters to the left of the results enabling you to remove or restrict your results to ecommerce sites, sites with or without advertising, live chat sites and location. The sites that have been excluded are listed to the right of the results.
  1. site: command
    Use the site: command to focus your search on particular types of site, for example include site:ac.uk in your search for UK academic websites. Or use it to search inside large rambling sites with useless navigation, for example site:www.gov.uk. You can also use -site: to exclude individual sites or a type of site from your search. All of the major web search engines support the command.
  1. Microsoft Academic Search http://academic.research.microsoft.com/
    An alternative to Google Scholar. “Semantic search provides you with highly relevant search results from continually refreshed and extensive academic content from over 80 million publications.” This was recently revamped and although it now loads and searches faster than it used to, the new version has lost the citation and co-author maps that were so useful. It can be a useful way of identifying researchers, publications and citations but do not rely on the information too much. It can get things very wrong indeed. For example, I’ve found that for some reason the affiliation of several authors from the Slovak Technical University in Bratislava is given as the Technical University of Kenya!
  1. Wolfram Alpha https://www.wolframalpha.com/
    This is very different from the typical search engine in that it uses its own curated data. Whether or not you get an answer from it depends on the type of question and how you ask the question. The information is pulled from its own databases and for many results it is almost impossible to identify the original source, although it does provide a possible list of resources. If you want to see what WolframAlpha can do try out the examples and categories that are listed on its home page.
  1. OFFSTATS – The University of Auckland Library http://www.offstats.auckland.ac.nz/
    This is a great starting point for locating official statistical sources by country, region or subject. All of the content in the database is assessed by humans for quality and authority, and is freely available.
  1. Meltwater IceRocket http://www.icerocket.com/
    IceRocket specialises in real-time search and was recommended for inclusion in the Top Tips for its blog search and advanced search options. There is also a Trends tool that shows you the frequency with which terms are mentioned in blogs over time and which enables you to compare several terms on the same graph.

    IceRocket Trends

    Very useful for comparing, for example, mentions of products, companies, people in blogs.

  1. Behind the Headlines NHS Choices http://www.nhs.uk/news/Pages/NewsIndex.aspx
    Behind the headlines provides an unbiased and evidence-based analysis of health stories that make the news. It is a good source of information for confirming or debunking the health/medical claims made by general news reporting services, including the BBC. For each “headline” it summarises in plain English the story, where it came from and who did the research, what kind of research it was, results, researcher’s interpretation, conclusions and whether the headline’s claims are justified.

Alternatives to Google: Carrot Search and eTools.ch

Two of the services I cover in my workshop for researchers on alternatives to Google are Carrot Search and eTools.ch, and recently one of the people who had attended the session in April asked me to confirm what Carrot Search used  to provide its main results. Strictly speaking, neither Carrot Search nor eTools are Google free: eTools is a metasearch tool that has Google as one of its sources and Carrot Search uses eTools for its web search. At the start of the year, Carrot Search offered 7 options for searching under tabs across the top of the search screen including Web, “wiki”, Bing, News, Images, PubMed and Jobs. Web search used eTools.ch to provide the results.

Carrot Search – beginning of 2016

The range of options has now been reduced to just three: the more transparently labelled eTools Web Search, PubMed and Jobs.

Carrot Search options July 2016

This makes sense as the number of accesses to Bing via the api was always limited and I could never get the news or images options to work. eTools in any case is a metasearch engine covering 17 tools including Google, Bing and Wikipedia so the extra Carrot Search tabs did seem to be unnecessary. The full list can be seen on the eTools home page.

eTools list of search engines

This is where it gets interesting. It appears that Carrot Search does not just copy the results from a search on eTools.  I ran a search on Brexit in Carrot Search and compared the results from eTools Worldwide and eTools United Kingdom. All of the sets  were different so Carrot Search must be doing some additional analysis and processing.
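One way to put a number on “different” is to compare the result URLs from each tool. The sketch below uses the Jaccard measure (shared URLs divided by all distinct URLs). The function name `overlap` is hypothetical and the two lists are placeholder example.com addresses, not the actual results from Carrot Search or eTools.

```python
def overlap(results_a, results_b):
    """Jaccard overlap between two result lists, ignoring order and duplicates."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # two empty lists are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Placeholder URLs for illustration only
carrot = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
etools_uk = ["https://example.com/b", "https://example.com/c", "https://example.com/d"]

print(f"Overlap: {overlap(carrot, etools_uk):.2f}")
```

A value of 1.0 means the same set of pages and 0.0 means nothing in common. The measure ignores ranking, so two tools that return the same pages in a different order would still score 1.0.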

Carrot Search doesn’t just list the results but also organises them into topics or Folders that are displayed on the left hand side of the screen. These can be a useful way of narrowing down your search.

Carrot Search Brexit results

Carrot Search offers two other ways of displaying results: Circles and Foam Tree.

Carrot Search Circles

Carrot Search Foam Tree – 13th July 2016

Both show the density of terms in the top 100 results and allow you to click on an area to add the term or phrase to the search.  In addition I am finding that the Foam Tree is an interesting way of monitoring changes in news coverage and social media discussions on a topic, product or company. Yesterday, when I ran the search on Brexit, there was an area representing Theresa May.  Today, that had been replaced with one for David Cameron. I assume that is because the news coverage has been concentrating on David Cameron’s last day as Prime Minister and his last Prime Minister’s Questions (PMQ) in Parliament . Later he goes to see the Queen to officially resign as Prime Minister. Tomorrow,  with Theresa May as our new Prime Minister and a new Cabinet, the Foam Tree could have a very different structure so I shall be looking at it periodically to see if and how it reflects changes in events.

As I mentioned earlier eTools.ch, which is behind the main Carrot Search web search, is a metasearch engine covering 17 tools. It also has options to select a country from a drop down list (Worldwide, Switzerland, Liechtenstein, Germany, Austria, France, Italy, Spain, UK) and a language (All, English, German, French, Italian, Spanish). Either or both of these give you completely different views and opinions on a subject.

eTools – Switzerland, all languages

eTools – Switzerland, French

eTools – Spain, all languages

It is a convenient way of gathering a range of foreign language information, especially on European events, and is easier than searching individual country versions of Google or Bing. The disadvantages are that the range of countries and languages is limited and many of the articles will not be in English. Nevertheless, I often find it helpful at the start of a piece of research as I get a general feel for the type and range of information that is available.

Carrot Search and eTools.ch are just two of the tools that I cover in my workshop on alternatives to Google. If you are interested in finding out more, the next session is being organised by UKeiG and will be held in London on Wednesday, 7th September 2016. Further details are available on the UKeiG website.

Searching for the height of Ben Nevis – how hard can it be?

If you have attended one of my recent search workshops, or glanced through the slides, you will have noticed that I have a new test query: the height of Ben Nevis. It didn’t start out as a test search but as a genuine query from me.  A straightforward search, I thought, even for Google.

I typed in the query ‘height of ben nevis’ and across the top of the screen Google emblazoned the answer: 1345 metres.  That sort of rang a bell and sounded about right, but as with many of Google’s Quick Answers there was no source and I do like to double or even triple check anything that Google comes up with.

Ben_Nevis_1

To the right of the screen was a Google Knowledge Graph with an extract from Wikipedia telling me that Ben Nevis stands at not 1345 but 1346 metres above sea level. Additional information below that says the mountain has an elevation of 1345 metres and a prominence of 1344 metres (no sources given). I now have three different heights – and what is ‘prominence’?

Ben-Nevis-3

After a little more research I discovered that prominence is not the same as elevation, but I shall leave  you to investigate that for yourselves if you are interested. The main issue for me was that Google was giving me at least three slightly different answers for the height of Ben Nevis, so it was time to read some of the results in full.

Before I got around to clicking on the first of the two articles at the top of the results, alarm bells started ringing.  One of the metres to feet conversions in the snippets did not look right.

Height of Ben Nevis search results 3

So I ran my own conversions for both sets of metres to feet and in the other direction (feet to metres):

1344m = 4409.449ft, rounded down to 4409ft

4406ft = 1342.949m, rounded up to 1343m

1346m = 4416.01ft, rounded down to 4416ft

4414ft = 1345.387m, rounded down to 1345m

As if finding three different heights was not bad enough, it seems that the contributors to the top two articles are incapable of carrying out simple ft/m conversions, but I suspect that rounding the figures up or down before the calculations were carried out is the cause of the discrepancies.
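Conversions like these are quick to check for yourself. The short sketch below relies on the definition of the international foot as exactly 0.3048 metres and prints each conversion to three decimal places, leaving any rounding to whole units to the reader. The function names are my own.

```python
METRES_PER_FOOT = 0.3048  # the international foot is defined as exactly 0.3048 m


def metres_to_feet(metres):
    return metres / METRES_PER_FOOT


def feet_to_metres(feet):
    return feet * METRES_PER_FOOT


# The figures that appeared in the snippets
for m in (1344, 1346):
    print(f"{m}m = {metres_to_feet(m):.3f}ft")
for ft in (4406, 4414):
    print(f"{ft}ft = {feet_to_metres(ft):.3f}m")
```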

The above results came from a search on Google.co.uk. Google.com gave me similar results but with a Quick Answer in feet, not metres.

Ben-Nevis-4

We still do not have a reliable answer regarding the height of Ben Nevis.

Three articles below the top two results were from BBC News, The Guardian and Ordnance Survey – the most relevant and authoritative for this query –  and were about the height of Ben Nevis having been remeasured earlier this year using GPS. The height on the existing Ordnance Survey maps had been given as 1344m but the more accurate GPS measurements came out at 1344.527m or 4411ft 2in. The original Ordnance Survey article explains that this is only a few centimetres different from the earlier 1949 assessment but it means that the final number has had to be rounded up rather than down. The official height on OS maps has therefore been increased from 1344m to 1345m.  So Google’s Quick Answer at the top of the results page was indeed correct.

Why make a fuss about what are, after all, relatively small variations in the figures? Because there is one official height for the mountain and one of the three figures that Google was giving me (1346m) was neither the current nor the previous height. Looking at the commentary behind the Wikipedia article, which gave 1346m, it seems that the contributors were trying to reconcile the height in metres with the height in feet but carrying out the conversion using rounded up or rounded down figures. As one of my science teachers taught me long ago, you should always carry forward to the next stage of your calculations as many figures after the decimal point as possible. Only when you get to the end do you round up or down, if it is appropriate to do so. And imagine if your Pub Quiz team lost the local championship because you had correctly answered 1345m  to this question but the MC  had 1346m down as the correct figure? There’d be a riot if not all out war!
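The effect of rounding too early can be shown with the Ordnance Survey figure itself. The sketch below is only an illustration of the rule of carrying full precision forward. It is not a reconstruction of what the Wikipedia contributors actually did.

```python
METRES_PER_FOOT = 0.3048
measured_m = 1344.527  # GPS measurement reported by Ordnance Survey

# Route 1: keep full precision and round once, at the end
rounded_at_end = round(measured_m)

# Route 2: convert to feet, round to whole feet, convert back, round again
whole_feet = round(measured_m / METRES_PER_FOOT)
rounded_via_feet = round(whole_feet * METRES_PER_FOOT)

print(rounded_at_end)     # 1345
print(whole_feet)         # 4411
print(rounded_via_feet)   # 1344
```

Rounding to whole feet throws away about 0.18ft, which is enough to pull the figure back below 1344.5m, so the second route loses the metre that the remeasurement added.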

That’s what Google gave us. How did Bing fare?

The US and UK versions of Bing gave results that looked very similar to Google’s but  with two different quick answers in feet, and neither gave sources:

Bing UK

Ben-Nevis-Bing-UK

Bing US

Bing-Ben-Nevis-US

I won’t bore you with all of the other search tools that I tried except for Wolfram Alpha. This gave me 1343 meters or 4406 ft. At least the conversion is correct but there is no direct information on where the data has been taken from.

Ben-Nevis-WA

The sources link was of no help whatsoever and referred me to the home pages of the sites and not the Ben Nevis specific data. On some of the sites, when I did find the Ben Nevis pages, the figures were different from those shown by Wolfram Alpha so I have no idea how Wolfram arrived at 1343 meters.

So, the answer to my question “How high is Ben Nevis?” is 1344.527m rounded up on OS maps to 1345m.

And the main lessons from this exercise are:

  1. Never trust the quick answers or knowledge graphs from any of the search engines, especially if no source is given. But you knew that anyway, didn’t you?
  2. If you are seeing even small variations in the figures, and there are calculations or conversions involved, double check them yourself.
  3. Don’t skim read the results and use information highlighted in the snippets – read the full articles and from more than one source.
  4. Make sure that the articles you use are not just copying what others have said.
  5. Try and find the most relevant and authoritative source for your query, and ideally a primary source. In this case it was Ordnance Survey. GB officially taller – Ben Nevis  https://www.ordnancesurvey.co.uk/about/news/2016/gb-officially-taller-ben-nevis.html

Bing extends date search option

Bing has at last extended its date search options. Until recently one could only limit results to the past 24 hours, past week or the past month, and then only in Bing US.  Bing has now added a custom range on a par with Google.

Bing_Date_US_2

The UK version of Bing has not had a date option until now but bizarrely has added the old, limited US selection.

Bing-Date-UK-2

It seems very strange that they haven’t implemented the full US list. One can but hope that it will happen soon rather than in several years’ time, which is how long it has taken for this version to appear in Bing UK.

Advanced Google workshop – Top Tips

This collection of Top Tips is a combined list nominated by those who attended the UKeiG workshop on “New Google, New Challenges”. The next UKeiG Google workshop will be run on 8th September 2016.

1. Do not trust Google’s facts and answers
Google tries to provide facts and quick answers to your queries at the top and to the right of your results. These are computer generated extracts from pages and several different sources may be used to produce an “answer”. They are sometimes misleading or completely wrong. At the time of writing, the answer provided for a search on frugivore is an excellent example. (It explains why your cat is so fussy over its food – it is obviously craving its 5 a Day!) Always go to the original source to double check the information, but this is not always provided by Google.

2. Country versions of Google and /ncr
Country versions of Google give priority to the local content. This is a useful strategy when searching for research groups, companies and people that are active or working in a particular country. Use the standard ISO two letter country code, for example http://www.google.fr/ for Google France, http://www.google.it/ for Google Italy.

It is also worth trying your search in Google.com. Your results will probably be more international or US focused but you may see new search features or layouts in Google.com that are not yet available elsewhere. If Google insists on redirecting you to your own country version, go to the bottom right hand corner of the Google home page and you should see a link to Google.com. If there is no link then add ‘/ncr’ to the Google URL, for example http://www.google.com/ncr .

The downside of using country versions of any search tool is that the prioritised information is likely to be in the local language.

3. Search history
Your search history, which is recorded and available for you to view if you are signed in to your Google account, is used by Google to help personalise your results but it can also be useful as a record of past searches. If a user comes back to you having forgotten or lost the search and documents you gave them, your search history should be able to help you find both. On any search results page click on the cog wheel in the upper right hand area of the screen and select History. You can then browse your history or select a date from the calendar (upper right hand area of the History screen).

4. Verbatim
This is an essential tool for making Google carry out your search the way you want it run. Google automatically looks for variations on your terms and sometimes drops terms from your search, which is not always helpful. To use Verbatim, first run your search. Then click on ‘Search tools’ in the menu that runs across the top of your results page. A second row of options should appear. Click on ‘All results’ and from the drop down menu select Verbatim. Google will then search for your terms without any variations or omissions. Note that Google will search for documents and pages in which the words appear in any order. If you are searching on the title of a paper place the title within double quote marks to force an exact phrase match. If Google still alters your search then run Verbatim. 

Verbatim-Factsheet
If you are carrying out in-depth research it is worth trying out Verbatim even if the “normal” Google results seem OK. You may see very different and possibly more relevant content.
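
For anyone who wants to bookmark a Verbatim search rather than click through the menus each time, here is a minimal Python sketch that builds a results URL. It rests on an assumption that is not covered anywhere in this post: that adding tbs=li:1 to the results URL switches Verbatim on. That parameter is undocumented as far as can be told and may stop working without notice. The function name verbatim_url is made up for this illustration.

```python
from urllib.parse import urlencode


def verbatim_url(query, base="https://www.google.com/search"):
    """Build a Google results URL with Verbatim requested for the query."""
    # ASSUMPTION: tbs=li:1 is the (undocumented) URL parameter for Verbatim.
    params = {"q": query, "tbs": "li:1"}
    # urlencode takes care of spaces, quote marks and the colon.
    return base + "?" + urlencode(params)


print(verbatim_url('"plasmonic nanoparticles" review'))
```

If the parameter is ever withdrawn the URL should still run the search, just without Verbatim, so check the Search tools menu on the results page to confirm it has been applied.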

5. filetype: command.
An important advanced search command that is available not only in Google but in many alternative search tools. Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics, or PDF for research papers and industry/government reports.

For example:

plasmonic nanoparticles filetype:ppt

The command must be all lower case and there must be no spaces between the colon and the command or the file extension, otherwise Google will treat the command as a searchable word. Also you must search for pre and post Office 2007 file extensions separately as Google does not automatically pick up both.

For example:

plasmonic nanoparticles filetype:ppt OR filetype:pptx

Note that Google’s Advanced Search screen pull down menu for filetype: only searches for pre Office 2007 extensions.
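
Because both the old and the new Office extensions have to be typed in, it is easy to forget one of them. The following Python sketch adds the pair automatically. The function names and the OFFICE_PAIRS table are invented for this example and are not part of any Google tool.

```python
# Hypothetical lookup table: pre Office 2007 extension -> newer equivalent.
OFFICE_PAIRS = {"ppt": "pptx", "xls": "xlsx", "doc": "docx"}


def filetype_clause(extension):
    """Return a filetype: clause, adding the newer Office extension if there is one."""
    ext = extension.lower().lstrip(".")
    variants = [ext]
    if ext in OFFICE_PAIRS:
        variants.append(OFFICE_PAIRS[ext])
    # Lower case command, no space after the colon, OR in capitals.
    return " OR ".join("filetype:" + v for v in variants)


def with_filetype(terms, extension):
    """Append the filetype: clause to the search terms."""
    return terms.strip() + " " + filetype_clause(extension)


print(with_filetype("plasmonic nanoparticles", "ppt"))
```

For formats with only one extension, such as PDF, the clause is just a single filetype: command.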

6. Minus sign to exclude information
Use the minus sign immediately before a term to exclude documents containing that term, but use with care as you may lose valuable information. It can also be used with commands to exclude file formats or websites from your search.

For example:

occupational asthma UK site:gov.uk -site:hse.gov.uk -site:nationalarchives.gov.uk
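
If you regularly exclude the same sites, a search string like the one above can be assembled rather than retyped. This is a small Python sketch under that assumption; site_search is a hypothetical helper name, not a Google feature.

```python
def site_search(terms, include_site, exclude_sites=()):
    """Build a query limited to one site or domain, minus any excluded sites."""
    parts = [terms.strip(), "site:" + include_site]
    # The minus sign goes immediately before the command, with no space.
    parts += ["-site:" + site for site in exclude_sites]
    return " ".join(parts)


query = site_search(
    "occupational asthma UK",
    "gov.uk",
    ["hse.gov.uk", "nationalarchives.gov.uk"],
)
print(query)
```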

7. Combine search commands
Combine multiple commands such as filetype: and site: to focus your search. Use the OR command to search for alternatives, for example:

occupational asthma UK site:ac.uk filetype:ppt OR filetype:pptx

8. Personalise Google News
Personalise your Google News page (http://news.google.co.uk) when signed in to your account and change what content is automatically displayed or add your own searches. Click on the Personalise button at the top of the right hand column.

9. Google Scholar Cite feature
Click on the Cite link under a reference in Google Scholar and Google will give you options to import a citation in MLA, APA, Chicago, Harvard or Vancouver style into BibTeX, EndNote, RefMan or RefWorks. Note that if the article is only available online you may need to add a DOI or a URL, and the date of access.

10. Use Google site: search on Google Scholar
This is one I had not thought of but was recommended by one of the delegates as a way of using Google’s advanced search commands on Google Scholar instead of Scholar’s own. (I have not had time to test this one out myself.)

Google advanced search – get it right!

When running advanced search workshops, and especially Google sessions, I prefer not to dwell on commands and search options that are no longer supported. They are gone and that is that, and it is far better to concentrate on how to get the best out of what is left. Of course it is unavoidable when your slides have been prepared several days before the event and  Google decides to pull the plug on one of your favourite search features just before you start! Similarly I tend not to show “this is how NOT to use….” a command  or incorrect syntax. It is often the incorrect format that one remembers.  Recently, though, I have added slides  to my presentations that cover both defunct commands and errors in syntax and format.

The problem is that not only are many people unaware that some search options are no longer available but also some fact sheets and articles covering advanced search are getting it wrong. The recent Guardian article on top search tips for Google almost got it right  but referred to the tilde, which was dropped in 2013, and did not really understand how Google automatically looks for synonyms and variations on a term (see my earlier blog posting Guardian’s top search tips for Google not quite tiptop). I have also seen a couple of recently produced Google  fact sheets riddled with mistakes.

The wonderful thing about Google is that it can take the most tortuous and error ridden search string and still come back with something that is sensible – most of the time.  The downside of this is that one assumes the search query has worked as intended when in fact Google has totally rewritten the search for you.  At some point, though, Google will rewrite the search in such a way that it brings back rubbish. So, it is important to know what commands are available and how they should be used.

Let’s get started.

Plus (+) sign before a word to force an exact match.

This was discontinued in October 2011 because Google intended to use it as a way to search for Google+ pages. That has been abandoned and it is now a searchable character.  If you want to force an exact match search on a term precede the term with intext: for example intext:agriculture.

I have also seen examples claiming that a plus sign between words acts as a Boolean AND. No, it doesn’t.  If you do get different results when using + it is because Google is searching for that as well as your terms.

Tilde  (~) for synonyms

This was withdrawn in  June 2013  because not many people used it and it was no longer needed. Google now looks for synonyms by default.

thesaurus: for alternative terms

‘thesaurus:’ sort of works because Google treats ‘thesaurus’, having ignored the colon, as a search term. So ‘thesaurus:eclectic’ will give you links to pages and websites of dictionaries and definitions that give synonyms for eclectic. It does not give you a straightforward list of alternatives in the same way that ‘define’ does. If you use thesaurus you have to go to the websites in turn to view the synonyms.

eclectic-thesaurus

eclectic-define

The asterisk *

The asterisk (*) is a placeholder for terms between two words e.g. solar * panels finds solar photovoltaic panels, solar PV panels, solar thermal panels. It is NOT a truncation symbol. Again, you might think it is because Google ignores the asterisk and automatically looks for  words that begin with the letters you have typed in.

The example I gave in my earlier blog posting was a search on phenobarb*. I expected Google to pick up references to phenobarbitone. It picked up 76,000 results including phenobarbital but there was no mention of phenobarbitone in the first 100.  Phenobarb without the asterisk picked up the exact same results.  A search on phenobarbitone, with and without the asterisk came up with 241,000 results. I have no idea how or when Google decides to stop looking for variations on your string but it is obvious from the above example that the asterisk is not a truncation symbol.
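
The difference between a placeholder and a truncation symbol can be shown with ordinary regular expressions. The Python sketch below is only an illustration of the two ideas. It says nothing about how Google works internally, and the function names are invented for this example.

```python
import re


def placeholder_pattern(query):
    """Regex for a query where a lone * stands in for one or more whole words."""
    pieces = []
    for word in query.split():
        if word == "*":
            # One or more complete words in place of the asterisk.
            pieces.append(r"\w+(?:\s+\w+)*")
        else:
            pieces.append(re.escape(word))
    return re.compile(r"\b" + r"\s+".join(pieces) + r"\b", re.IGNORECASE)


def truncation_pattern(stem):
    """Regex for classic truncation: any word that begins with the stem."""
    return re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)


placeholder = placeholder_pattern("solar * panels")
for text in ["solar photovoltaic panels", "solar PV panels", "solar panels"]:
    print(text, "->", bool(placeholder.search(text)))

truncation = truncation_pattern("phenobarb")
for text in ["phenobarbital", "phenobarbitone"]:
    print(text, "->", bool(truncation.search(text)))
```

A placeholder fills the gap between two words with whole words. Truncation extends a single word stem, and it is the second behaviour that Google's asterisk does not provide.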

Do NOT capitalise the first letter of commands, and NO spaces

Commands such as intitle:, intext:, filetype: and site: must be all lower case, with NO spaces between the colon and the search term. Capitalise the first letter or add a space after the colon or both and Google treats the command as an ordinary searchable word.

The correct format for an intitle: search is, for example, intitle:caversham and finds the following:

Google_Intitle_Correct

Capitalise the first letter of the command or insert a space or both and you find:

Google_Intitle_Wrong

I do understand why so many fact sheets, and presentations, show commands with an initial capital letter. You spend ages preparing your information and when you have sent off your slides for printing or converted your document to a PDF you discover that Microsoft Office has changed the format of the command. Because your search example is on a separate line with the command at the start, Office, bless it, decides to auto-correct and capitalise the first letter. I know, it has happened to me! So, please, check and double check your support materials.
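
One way to double check search examples before they go to print is to run them through a small script. The Python sketch below flags the two mistakes described above, a capital letter in the command and a space after the colon. The list of commands covers only those mentioned in these posts, and the function name syntax_problems is invented for this example.

```python
import re

# Commands mentioned in these posts; extend the list as needed.
COMMANDS = ("intitle", "intext", "inurl", "filetype", "site", "related")


def syntax_problems(query):
    """Return a list of format problems found in the commands of a search string."""
    problems = []
    # Find every word followed by a colon, plus any spaces after the colon.
    for match in re.finditer(r"\b([A-Za-z]+):(\s*)", query):
        command, gap = match.group(1), match.group(2)
        if command.lower() not in COMMANDS:
            continue  # not a search command, for example http:
        if command != command.lower():
            problems.append(f"'{command}:' should be all lower case")
        if gap:
            problems.append(f"remove the space after '{command}:'")
    return problems


print(syntax_problems("intitle:caversham"))
print(syntax_problems("Intitle: caversham"))
```

An empty list means that no problems of these two kinds were found. It does not prove that Google will run the search as intended.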

Google searches for all of your terms by default

Not always. If your search, as it stands, finds zero or a low number of results then Google will drop one or more terms that are usually shown as strikethroughs.  In the above screenshot you can see that the third entry in the results has a “Missing: caversham” at the end of the snippet.

If Google is dropping a term that is essential to your search then prefix it with intext:, for example intext:caversham. If you want all of your terms to be included, and without any variations, then use the Verbatim search option. If you are using a desktop or laptop run your search and then click on the Search tools option at the top of your results. A second line of options will appear. Click on All results and select Verbatim. The layout and location of Verbatim on mobile devices will usually be different.

Double quotation marks around phrases

Placing double quotation marks around phrases, titles of papers, song titles, famous sayings etc. works most of the time. But, again, if Google finds zero or only a handful of results it will ignore the marks. Google may also alter the spelling of one or more words within the double quotation marks. Use Verbatim if you are sure that the phrase is correct and you want to bring Google to heel.

Full nested Boolean search

Google has NEVER supported full nested Boolean search. I still meet people who are adamant that Google does, but when pushed they admit that they often get unexpected results. You can, though, use OR for alternative terms and the minus sign before a term to exclude documents containing that term.

This is how Google interprets the search (confectionery OR chocolate) AND (production OR manufacture) AND (france OR Germany OR UK OR switzerland) NOT belgium

Google_Boolean

Note that pages containing Belgium are included rather than excluded.

Remove the ‘NOT Belgium’ and this is what we see:

Google_Boolean_2

Add ‘-belgium’ to the end of the search instead of ‘NOT belgium’ and we get:

Google_Boolean_3

Running Verbatim on our original Boolean search shows that Google is treating AND and NOT as lower case, searchable words:

Google_Boolean_Verbatim

If you really want to use full Boolean, then get thee hence to Bing.
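
If you are staying with Google, a Boolean strategy can be rewritten using only the two operators that do work, OR and the minus sign. The Python sketch below drops AND, which Google treats as a searchable word, and turns NOT into a minus sign. It assumes that Google applies OR to the terms on either side of it before combining the rest, which is not tested in this post, so check the results. The function name google_query is invented for this example.

```python
def google_query(or_groups, exclude=()):
    """Rewrite groups of alternative terms and exclusions in Google's own syntax."""
    # Each group becomes term OR term; groups are separated by a space,
    # because Google combines terms by default and AND is not needed.
    parts = [" OR ".join(group) for group in or_groups]
    # NOT is not supported, so use the minus sign immediately before the term.
    parts += ["-" + term for term in exclude]
    return " ".join(parts)


print(
    google_query(
        [
            ["confectionery", "chocolate"],
            ["production", "manufacture"],
            ["france", "Germany", "UK", "switzerland"],
        ],
        exclude=["belgium"],
    )
)
```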

If you want to learn more about Google search, Dan Russell, who works at Google, is currently running an online course on Power Searching with Google. Alternatively, if you want a more business or academic research and UK/European oriented workshop on what Google can do I am running an advanced Google workshop with UKeiG on April 13th, 2016.

Guardian’s top search tips for Google not quite tiptop

I have just been alerted by fellow search expert Alison McNab to an article by Samuel Gibbs (@SamuelGibbs) in the Guardian on top search tips for Google.  I had to double check the date of the article because although it is OK for the most part it has got a few things wrong, one of the commands was withdrawn some time ago,  and it has missed what I consider to be one of the most important Google search options.

So let’s have a look at the tips one by one.

1. Exact phrase

Yes, placing double quote marks around words usually makes Google search for the exact phrase. However, Google does sometimes ignore the quote marks.

2. Exclude terms

Yes, preceding a term with a minus sign will exclude documents containing that term.

3. Either OR

Yes, the OR command does work when searching on alternative terms – most of the time. Make sure the OR is in capital letters.

4. Synonym search

Tilde symbol (~) for a synonym search? No! Google withdrew it over two years ago  because not many people used it. Google now looks for synonyms by default. If you precede a term with a tilde Google ignores it and carries on as normal. I’ve just tried several searches with and without the tilde and get exactly the same results.

5. Search within a site

Yes. The site: command is one of the most powerful advanced search commands and enables you to search within a single site, for example site:www.gov.uk, or a type of site, for example site:ac.uk for UK academic sites.

6. The power of the asterisk

Yes, the asterisk can stand in for one or more terms between two words, for example solar * panels. No, it is not a truncation symbol.

The example given by The Guardian  is a search on architect*, which finds “architect, but also architectural, architecture, architected, architecting and any other word which starts with architect.” As with synonyms, Google searches for variations on a word by default.

I ran a search on phenobarb* expecting Google to pick up references to phenobarbitone. It picked up 76,000 results including phenobarbital but there was no mention of phenobarbitone in the first 100 documents.  Phenobarb without the asterisk picked up the exact same results. Excluding phenobarbitone by using the minus sign gave me 70,000 results.  A search on phenobarbitone, with and without the asterisk came up with 241,000 results.

7. Searching between two values

Yes. The number range search does work and is great for searching within a range of values or years.  For example:

chocolate consumption forecasts 2016..2020

top 10..100 UK car insurance companies

toblerone 1..5 kg
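
The two full stops must sit directly between the numbers with no spaces. A minimal Python sketch for building such searches follows; the function names and the {range} marker are invented for this example.

```python
def number_range(low, high):
    """Format a number range for Google: two full stops and no spaces."""
    if low > high:
        # Put the smaller value first if the arguments arrive the wrong way round.
        low, high = high, low
    return f"{low}..{high}"


def range_query(template, low, high):
    """Fill the {range} marker in a search template with the number range."""
    return template.format(range=number_range(low, high))


print(range_query("chocolate consumption forecasts {range}", 2016, 2020))
print(range_query("top {range} UK car insurance companies", 10, 100))
```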

8. Search for word in the body, title or URL of a page

This covers the commands intext:, intitle: and inurl:.  All correct but intext: is especially useful in that it forces Google to include that term in the search. It is invaluable if you find Google dropping key terms from your strategy, which it does if you are likely to retrieve zero results or it thinks the number of results is too low.

9. Search for related sites

The related: command looks for similar sites, for example related:theguardian.com finds other news organisations. It works but only shows you 20-30 sites. Worthwhile using, though, if you want to broaden your search to other but similar organisations and only have one or two to start with.

10. Combine them

I wholeheartedly agree with this one. Once you have a few advanced commands under your belt you can really start to focus your search and retrieve more relevant results.

What’s missing?

I’m surprised that filetype: was not included. It is nearly always on the list of top tips that my advanced search workshop participants suggest at the end of the day. It’s a quick and easy way of finding presentations (filetype:ppt, filetype:pptx), government documents and research papers (filetype:pdf) and datasets (filetype:xls, filetype:xlsx, filetype:csv).

The major omission for me, though, is Verbatim. It is different from the rest in that it is not a command that you can type in. You have to run your search first. From the menu at the top of the results select ‘Search tools’, followed by ‘All results’  and  then ‘Verbatim’. Use this when Google is wreaking havoc on your search by leaving out terms and using weird and wonderful terms that have nothing to do with your subject. Verbatim will search on all of your terms without dropping any or looking for variations and synonyms.

Verbatim

If you are interested in learning more about advanced search in Google and other search tools, some of my past presentations and fact sheets are available at http://rba.co.uk/as/.  If you are interested in attending a workshop my public access training schedule for 2016 is at http://www.rba.co.uk/training/ (more events will be added shortly).