Tag Archives: advanced search

Google dumps Reading Level search filter

It seems that Google has dumped the Reading Level search filter. This was not one that I used regularly but it was very useful when I wanted more serious, in-depth, research or technically biased articles rather than consumer or retail focused pages. It often featured in the Top Tips suggested by participants of my advanced Google workshops.

It was not easy to find. To use it you had to first run your search and then from the menu above the results select ‘Search tools’, then ‘All results’, and from the drop menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels then appeared just above the results.

Google Reading Level comparison
Slide showing Google Reading Levels from one of my search workshops

More details of how it worked are in the blog posting I wrote when it was launched in 2010 (http://www.rba.co.uk/wordpress/2010/12/13/x-factor-web-pages-are-advanced-says-googles-reading-level/).

So another tool that helped serious researchers find relevant material bites the dust. I daren’t say what I suspect might be next but, if I’m right, its disappearance could make Google unusable for research.

Is Bing dropping search terms?

Google has been automatically dropping terms from searches that give few or no results for some time. It now looks as though Bing may be doing the same. Unfortunately I cannot give the details of the search that brought this to light as it was confidential research. In general, though, what we were searching for were announcements or news articles about two companies involved in a particular project. We hadn’t found anything in Google so we tried various alternative search engines including Bing (http://www.bing.com/). The results seemed quite promising until we started looking at the individual pages. None of them had all of our terms. It is possible that the missing terms appeared in links to the pages but the content of the documents suggested that this was unlikely, and there is no reliable free tool that shows you who is linking to a specific page. So it looks as though Bing is now dropping terms in the same way that Google does.

There are two ways to stop Bing doing this. The first is to use the Boolean AND operator between all of your terms. The second is to prefix the term that must be present in a document with ‘inbody:’, for example inbody:aardvark.
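The two approaches can be sketched as simple query builders. This is only an illustration that assembles the text you would type into Bing's search box; the function names `bing_and_query` and `bing_inbody_query` are hypothetical and nothing here contacts Bing.

```python
def bing_and_query(terms):
    """Join every term with the Boolean AND operator (first approach)."""
    return " AND ".join(terms)


def bing_inbody_query(terms, required):
    """Prefix each term listed in `required` with inbody: (second approach).

    Terms that are not in `required` are left exactly as typed.
    """
    parts = []
    for term in terms:
        if term in required:
            parts.append(f"inbody:{term}")
        else:
            parts.append(term)
    return " ".join(parts)


# Example terms are invented for illustration only.
print(bing_and_query(["aardvark", "habitat", "survey"]))
print(bing_inbody_query(["aardvark", "habitat", "survey"], required={"aardvark"}))
```

The first call produces aardvark AND habitat AND survey and the second produces inbody:aardvark habitat survey. Whether Bing then honours every term still needs checking against the pages it returns.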

Did we find anything that answered our question? No, but sometimes I don’t expect to and it is frustrating when the search engine thinks it knows best and unilaterally decides to rewrite the search strategy.

For a list of all of the Bing advanced search commands go to http://msdn.microsoft.com/en-us/library/ff795620.aspx

Top search tips from Exeter and Bristol

A couple of weeks ago I was in Exeter and Bristol leading workshops for NHS South West on “Google & Beyond”. We covered advanced Google commands, Google Scholar and alternatives to Google. Below are the combined top tips from the two sessions. I may have missed a couple from the list as I could not read my writing, so if you attended one of the workshops let me know if I’ve omitted your suggested tip.

  1. Verbatim. Yet again, this has topped the list of useful Google search options. Google automatically looks for variations on your search terms and sometimes drops terms from your search without telling or asking you. To make Google run your search exactly as you have typed it in, first run your search. Then click on ‘Search tools’ in the menu above your results, click on ‘All results’ in the second line of options that appears, and select Verbatim from the drop down menu.
  2. Be aware of personalisation. Even if you are not signed in to a Google account Google personalises your results according to your search and browsing behaviour. Personalisation is not necessarily a bad thing but if you want to burst out of the filter bubble, as it is often called, use a private browser window or incognito (Chrome). Google will then ignore tracking and search cookies on your machine and will not personalise your results. To call up a private browser or incognito window use the following keys:

Chrome –  Ctrl+Shift+N
Firefox – Ctrl+Shift+P
Internet Explorer – Ctrl+Shift+P

  1. site: Use the site: command to focus your search on particular types of site, for example site:nhs.uk, or to search inside a large, rambling site. You can also use -site: to exclude sites from your search.
  2. intext: Google’s automatic synonym search can be helpful when looking for alternative terms, but if you want a term to be included in your search exactly as you have typed it then prefix the word with intext:.
  3. filetype: Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics or PDF for research papers and industry/government reports. Note that in Google filetype:ppt and filetype:xls will not pick up the newer .pptx and xlsx formats so you will need to include those in your strategy, for example filetype:ppt OR filetype:pptx, or run separate searches for each one. In Bing.com, though, filetype:pptx will pick up both .ppt and .pptx files.
  4. Advanced search commands and search options. Learn how to use the search commands (for example intext:, filetype: and site:). Many of these can be used on the advanced search screen that can usually be found under the cog wheel in the upper right hand area of the screen, but that link sometimes disappears so learning the commands is a better bet. A list of the more useful Google commands is at http://www.rba.co.uk/search/SelectedGoogleCommands.shtml.
  5. Combine advanced search commands. Practise combining the advanced search commands for a more precise, focused set of results.
  6. Google Reading level. This changes the type of results that you see. Run your search and from the menu above the results select ‘Search tools’, ‘All results’ and from the drop menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Click on the Advanced option to see results biased towards research. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to publications. It seems to involve an analysis of sentence structure, the length of sentences, the length of the document and whether scientific or industry specific terminology appears in the page.
  7. Numeric range. This command is unique to Google. Use it for anything to do with numbers – years, temperatures, weights, distances, prices etc. Simply type in your two numbers separated by two full stops as part of your search. This is a good way of limiting your search, for example, to forecasts over the next few years.
  8. Limiting your search by date. To limit your search by date, for example the last month or year, first run your search. Then click on ‘Search tools’ in the menu above the results and from the second row of options that appears click on ‘Any time’. Select your time period or a custom range from the drop down menu.
  9. Use the minus sign to exclude documents containing a word. If you do not want documents containing a specific word, prefix that word with a minus sign. The minus sign can also be used with commands such as site: and filetype: to remove an individual site or type of document from your results.
  10. Million Short http://millionshort.com/. If you are fed up with seeing the same results from Google again and again give Million Short a try. Million Short runs your search and you can choose to remove the most popular web sites from the results. Originally, as its name suggests, it automatically removed the top 1 million but now you can choose to remove the most popular 100, 1000, 10k, 100k or million sites. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing.
  11. Creative commons searches for images. Rather than search for images and go through them individually to find one that you can legally use in your document or presentation, use advanced search options or tools that allow you to select the appropriate license from the start. In Google, use the usage rights menu on the image advanced search screen to search for images with the license you need. The US version of Bing images includes a license option in the menu at the top of your results.

Bing Image License option
Double check the license of the photo on the website or blog hosting it. The license you need may be associated with a different image and yours could, for example, be ‘all rights reserved’. Flickr has a page where you can search for images with a specific Creative Commons license at http://www.flickr.com/creativecommons.

  1. Compare in Google. This is not a Google command but if you type in a search such as compare carrots with cabbage Google will create a table comparing the properties of the two items. Google has been known to get some of the data wrong, though, so it’s worth double checking the figures before you use them.
  2. Web archives. Want to see what was on a website a few years ago or trying to track down a document that seems to have vanished from the web? Try the Internet Archive Wayback Machine at http://www.archive.org/. Enter the URL of the website or document and you should then see a calendar of the snapshots that the archive has of the site or document. Choose a date from the calendar to view the page. The archive does not have everything but it is worth a try. See also the UK National Archives of old government websites and pages at http://www.nationalarchives.gov.uk/webarchive/ and the UK Web Archive at http://www.webarchive.org.uk/ukwa/.
  3. Statistics sites. Although you can often find statistics via Google, you may find dedicated official statistics sites quicker and more reliable. Some of the sites we covered during the workshops were:

    NHS Statistics Links http://www.nhs.uk/Pages/LinkListing.aspx?CategoryId=Statistics
    UK National Statistics Publication Hub http://www.statistics.gov.uk/
    Office for National Statistics http://www.ons.gov.uk/
    Welsh Government Statistics http://wales.gov.uk/topics/statistics/
    Welsh Assembly Government StatsWales http://statswales.wales.gov.uk/
    UK Open data http://data.gov.uk/
    Eurostat http://ec.europa.eu/eurostat/
    European Union Open Data Portal http://open-data.europa.eu/en/
    Zanran http://www.zanran.com/
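
Several of the command tips in this list (site:, -site:, the minus sign and the numeric range) can be combined in one search. The sketch below shows one way to assemble such a query string. The helper name `build_query` and its parameters are hypothetical, the example keywords are for illustration only, and the code does nothing more than build the text you would type into the Google search box.

```python
def build_query(keywords, site=None, exclude_sites=(), exclude_words=(), numeric_range=None):
    """Assemble a Google query string from keywords and optional commands."""
    parts = [keywords]
    if numeric_range is not None:
        low, high = numeric_range
        # Numeric range: two numbers separated by two full stops.
        parts.append(f"{low}..{high}")
    if site:
        # site: focuses the search on one site or type of site.
        parts.append(f"site:{site}")
    # -site: removes individual sites from the results.
    parts.extend(f"-site:{s}" for s in exclude_sites)
    # A leading minus sign excludes documents containing a word.
    parts.extend(f"-{w}" for w in exclude_words)
    return " ".join(parts)


print(build_query("oil demand forecasts",
                  numeric_range=(2015, 2030),
                  exclude_sites=["bbc.co.uk"]))
# prints: oil demand forecasts 2015..2030 -site:bbc.co.uk
```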

Search Strategies new articles

New Search Strategies articles are now available.

“Excluding sites from your search” (subscribers only) is at http://www.rba.co.uk/search/subscribers/ExcludeSites.shtml

“From tourism to research information: how to change the emphasis of results” (subscribers only) covers techniques for changing the type of information returned by the search engines, for example consumer vs. more research focused pages (http://www.rba.co.uk/search/subscribers/Emphasis.shtml).

“Free Search Tools for Finding Research Information” is a 42 page PDF covering five things you need to know about Google, advanced searching in Google, alternative web search tools, institutional repositories and specialist tools. If you do not wish to purchase an annual subscription to the whole of Search Strategies, this article can be purchased on its own for £5.99. See http://www.rba.co.uk/search/ResearchInformationTools.shtml for further details.

 A full list of Search Strategies fact sheets and articles is at http://www.rba.co.uk/search/.

Search Strategies covers facts and tips, reviews of search tools and detailed strategies for more effective searching. Some information such as the fact sheets and Top Tips are available free of charge. The more detailed information on strategies is available on subscription. Annual individual subscription rates are £48/year (£40 + £8 VAT). Multi-user and corporate rates are available on request.

Details of how to purchase a subscription are at http://www.rba.co.uk/search/purchase.shtml

Forthcoming workshops

I am running three workshops in April on business information and search. All three have a practical element so that you can try out resources and techniques for yourself.

Introduction to Business Research

This is being organised by TFPL and will be held in London on Thursday, 18th April. This course provides an introduction to many areas of business research including statistics, official company information, market information, biographical information and news sources. It will cover explanations of the jargon and terminology, regulatory issues, assessing the quality of information, primary and secondary sources. Further information is available on the TFPL web site at http://www.tfpl.com/services/coursedesc.cfm?id=TR1116&pageid=-9&cs1=&cs2=f

Business information: key web resources

This is also being organised by TFPL in London and is being held on Friday, 19th April. This workshop looks in more detail at the resources that are available for different types of information, alerting services and free vs. fee. It also covers search strategies for tracking down industry, market and corporate reports. Further information is available at http://www.tfpl.com/services/coursedesc.cfm?id=TR945&pageid=-9&cs1=&cs2=f

Make Google behave: techniques for better results

This is a very popular workshop and is being organised by UKeiG. It is being held in Manchester on Tuesday, 30th April.

Topics include:

  • How Google works
  • Recent developments and their impact on search results
  • How Google personalises your results and can you stop it?
  • How to use existing and new features to focus your search and control Google
  • How and when to use Google’s specialist tools and databases
  • What Google is good at and when you should consider alternatives

The workshop will be repeated in London on Wednesday, 30th October. Details and booking information are on the UKeiG website at http://www.ukeig.org.uk/trainingevent/make-google-behave-techniques-better-results-karen-blakeman

Top search tips from North Wales

August is usually a quiet month for me with respect to work. Time for a holiday away and then a couple of weeks ambling along the Thames Path or pottering around the garden. This year, though, as soon as I was back from my travels I was knuckling down and updating my notes for two search workshops in North Wales. Both were for the North Wales Library Partnership (NWLP), the first taking place at Coleg Menai in Bangor and the second at Deeside College. Both venues had excellent training facilities and IT, which meant we could concentrate on getting to grips with what Google is doing with search and experiment with different approaches to making Google do what we want it to do.

At the end of the workshops both groups were asked to come up with a list of  Top 10 Tips. I’ve combined the two lists and removed the duplicates to generate the list of 16 tips below.

  1. Repeat one or more of your search terms one or more times
    Fed up with seeing the same results for your search?  Repeat your main search term or terms to change the order of your results.
  2. Menus on left hand side of Google results pages
    Use the menus on the left hand side of the results page to focus your search and see extra search features. To see all of the options click on the ‘More’ and ‘More search tools’ links. The content of the menus changes with the type of search you are running, for example Image search has a colour option.
  3. Verbatim
    Google automatically looks for variations of your terms and no longer looks for all of your terms in a document. If you want Google to run your search exactly as you have typed it in, click on the ‘More search tools’ options at the bottom of the left hand menu on your results page and then on Verbatim at the bottom of the extended menu that appears.
  4. intext:
    Google’s automatic synonym search can be helpful in looking for alternative terms but if you want just one term to be included in your search exactly as you typed it in then prefix the word with intext:. For example carbon emissions buses intext:biofuels flintshire. The command sometimes has the effect of prioritizing pages where your term is the main focus of the article.
  5. Advanced search screen and search commands
    Use the options on the advanced search screen or the search commands (for example filetype: and site:) in the standard search box to narrow down your search. A link to the advanced search screen can usually be found under the cog wheel in the upper right hand area of the screen. If you can’t see a cog wheel or the link has disappeared from the menu go to http://www.google.co.uk/advanced_search. A list of the more useful Google commands is at http://www.rba.co.uk/search/SelectedGoogleCommands.shtml
  6. Try something different
    Get a fresh perspective by trying something different. The two most popular alternatives during these two workshops seemed to be DuckDuckGo (http://duckduckgo.com/) and Millionshort (http://millionshort.com). Other search engines to try include Bing (http://www.bing.com/) and Blekko (http://blekko.com/).
  7. Use the country versions of Google for information that is country specific
    This will ensure that the country’s local content will be given priority, although it might be in the local language. Useful for companies and people who are based in or especially active in a particular country, or to research holiday destinations. Use Google followed by the standard ISO two letter country code, for example http://www.google.de/ for Google Germany or http://www.google.no/ for Google Norway.
  8.  Filetype to search for document formats or types of information
    For example PowerPoint for experts or presentations, spreadsheets for data and statistics, or PDF for research papers and industry/government reports. Note that filetype:ppt will not pick up the newer .pptx so you will need to include both in your search, for example filetype:ppt OR filetype:pptx. You will also need to look for .xlsx if you are searching for Excel spreadsheets and .docx for Word documents. The Advanced Search screen file type box does not search for the newer Microsoft Office extensions.
  9. Clear cookies
    Even if you are logged out of your Google account when you search, information on your activity is stored in cookies on your computer. These can personalise your results according to your past search and browsing history. Many organisations have set up their IT systems so that these tracking cookies are automatically deleted at least once a day or whenever a person logs in or out of their computer account. At home, your anti-virus/firewall software may perform the same function. If you want to make sure that cookies are deleted or want to control them manually, “How to delete cookies” at http://aboutcookies.org/Default.aspx?page=2 has instructions on how to do this for most browsers.
  10. Looking for research papers? Google Scholar (http://scholar.google.com/) is one place to look but there may be additional material hidden somewhere on an academic institution’s web site. Include advanced search commands, for example filetype:pdf site:ac.uk, in your search.
  11. For the latest news, comments and analysis on what is happening in an industry or research area carry out a  Google blog search and limit your search by date. Simply run your search as usual in the standard Google search box. On the results page click on Blogs in the menu on the left hand side of the screen and then select the appropriate time option.
  12. site: and -site:
    Use the site: command to search within a single site or type of site.

    For example:

    2011 carbon emissions public transport site:statistics.gov.uk

    to search just the UK official statistics web site

    asthma prevalence wales site:gov.uk OR site:nhs.uk

    to search all UK government and NHS web sites

    If you are fed up with a site dominating your results use -site: to exclude it from your search.

    For example:

    Dylan Thomas -site:bbc.co.uk

  13. Reading level – from tourism to research
    Use this option in the menus on the left hand side of your results page to change the type of information. For example, run a search on copper mines north wales. Then click on Reading Level in the left hand menus. Selecting “Basic” from the options that appear at the top of the results gives you pages on tourism and holiday attractions. “Advanced” gives you research papers, journal articles and mineral databases. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to books. It could involve sentence structure, grammar, the length of sentences on a web page, the length of the document, the terminology used and doubtless many other criteria.
  14. Google.com
    Apart from presenting your search results in a different order Google.com is where Google tries out new features. As well as seeing pages that may not be highly ranked in Google.co.uk you will get an idea of how Google search may look in the UK version in the future.
  15. Numeric range search
    Use this for anything to do with numbers – years, temperatures, weights, distances, prices etc. Use the boxes on the Advanced Search screen or just type in your two numbers separated by two full stops as part of your search.

    For example:

    world oil demand forecasts 2015..2030
  16. An understanding of copyright is important if you intend to re-use information found in the web and absolutely essential if you are going to use images. Creative Commons licences clearly state what you can and can’t do with an image but they are not all the same. The list at Creative Commons http://creativecommons.org/licenses/ outlines the terms and conditions. “FAQs – Copyright – University of Reading” at http://www.reading.ac.uk/internal/imps/Copyright/imps_copyrightfaqs.aspx gives some guidance on copyright but if in doubt always ask! An example of what can happen if you get it wrong is demonstrated by “Bloggers Beware: You CAN Get Sued For Using Pics on Your Blog” http://www.roniloren.com/blog/2012/7/20/bloggers-beware-you-can-get-sued-for-using-pics-on-your-blog.html.
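
The filetype tip above notes that the older Microsoft Office extensions do not pick up the newer ones, so both have to be included. A minimal sketch of that expansion follows. The names `NEWER_FORMATS`, `filetype_clause` and `search_with_filetype` are hypothetical, and the code only builds the query text; it does not run a search.

```python
# Older Office extensions paired with their newer equivalents, as listed in the tip.
NEWER_FORMATS = {"ppt": "pptx", "xls": "xlsx", "doc": "docx"}


def filetype_clause(extension):
    """Return a filetype: clause that also covers the newer Office format."""
    ext = extension.lower().lstrip(".")
    variants = [ext]
    if ext in NEWER_FORMATS:
        variants.append(NEWER_FORMATS[ext])
    return " OR ".join(f"filetype:{v}" for v in variants)


def search_with_filetype(keywords, extension, site=None):
    """Combine keywords with a filetype: clause and an optional site: command."""
    parts = [keywords, filetype_clause(extension)]
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


print(search_with_filetype("carbon emissions buses", "ppt"))
# prints: carbon emissions buses filetype:ppt OR filetype:pptx
print(search_with_filetype("diabetic retinopathy", "pdf", site="ac.uk"))
# prints: diabetic retinopathy filetype:pdf site:ac.uk
```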


IFEG advanced search presentation now available

The presentation on advanced web searching that I gave to the Information for Energy Group on April 3rd, 2012 in London is now available. It can be found on:

authorSTREAM at http://www.authorstream.com/Presentation/karenblakeman-1383280-ifeg-20120403/


Slideshare at http://www.slideshare.net/KarenBlakeman/advanced-web-searching-ifeg-3rd-april-2012

If you have problems accessing it on either of those sites it is temporarily available as a PowerPoint file on my web site at http://www.rba.co.uk/as/

Order matters with Google advanced search commands

The great thing about running search workshops is that you have so many people experimenting with advanced commands that someone is bound to spot an anomaly that you haven’t. We’ve become used to seeing different results when changing the order in which we enter keywords but not when using advanced search commands. During one of my workshops we had a couple of people playing around with Google’s allintitle command. This tells Google to look for all of the keywords following allintitle in the title of a document.

The search that was initially used was allintitle:diabetic retinopathy and came back with 277,000 results. Restricting the search to UK academic sites by using allintitle:diabetic retinopathy site:ac.uk reduced the number to about 2,190 and gave sensible results. But changing the order of the commands to site:ac.uk allintitle:diabetic retinopathy gave  two very bizarre results:

Site and Allintitle  Commands

Both results are from academic sites but allintitle seems to have been ignored as a search command. The first entry includes intitle, diabetic and retinopathy and the second has allintitle, diabetic and retinal. Using the Verbatim option from the menus on the left hand side of the results page gave us zero!

Next we tried combining allintitle with filetype:pdf.

allintitle:diabetic retinopathy filetype:pdf

gave us 3490 results of which at least the first 100 were relevant.

Switching the order to:

filetype:pdf allintitle:diabetic retinopathy

gave 495,000 results some of which were relevant but many did not contain all of our terms nor did they contain both diabetic and retinopathy in the title. Google was also looking for variations on our terms.

Order of advanced search commands


Using Verbatim on this search gave us zero again.

Advanced Commands and Verbatim

When we looked at the advanced search screen Google had put everything in the right boxes. If we used the advanced search screen to enter our terms afresh the search worked with Google putting the allintitle command at the start of the search.

Was this a general problem or just with allintitle? We then played around with the intitle command.

intitle:diabetic intitle:retinopathy site:ac.uk – 2220 sensible results (slightly more than our original allintitle search)

site:ac.uk intitle:diabetic intitle:retinopathy – 2220 sensible results identical to those above

intitle:diabetic intitle:retinopathy filetype:pdf – 3480 sensible results

filetype:pdf intitle:diabetic intitle:retinopathy – 3480 sensible results same as previous search

We then tried using a phrase after intitle:

intitle:"diabetic retinopathy" site:ac.uk – 2130 sensible results

site:ac.uk intitle:"diabetic retinopathy" – 2130 sensible results identical to previous search

Following a suggestion made by Tamara Thompson of PIBuzz (http://pibuzz.com/), changing the search slightly to site:ac.uk "intitle:diabetic intitle:retinopathy" gave exactly the same results.

Just to make sure that it wasn’t just us in the UK seeing this I asked fellow members of AIIP (http://www.aiip.org/) to run the original two allintitle searches. They saw exactly the same thing.

It seems, then, that there is a problem when allintitle is not the first command in a search. The intitle alternatives appear more reliable. If you prefer to use the command line rather than fill in the boxes on the Advanced Search screen remember that order sometimes matters.
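
One way to sidestep the ordering problem is to rewrite an allintitle search as a series of intitle terms, which behaved the same in either order in the tests above. The sketch below does this for simple queries. The function name `rewrite_allintitle` is hypothetical, it assumes every plain word after allintitle: is meant to be a title word, and it does not handle quoted phrases.

```python
def rewrite_allintitle(query):
    """Replace allintitle:word1 word2 with intitle:word1 intitle:word2.

    Other commands such as site: or filetype: are kept in their original
    position. Quoted phrases are not handled in this sketch.
    """
    prefix = "allintitle:"
    out = []
    in_title = False
    for token in query.split():
        if token.lower().startswith(prefix):
            in_title = True
            word = token[len(prefix):]
            if word:
                out.append(f"intitle:{word}")
        elif ":" in token:
            # Another command (site:, filetype: ...): leave it untouched.
            out.append(token)
        elif in_title:
            # A plain keyword following allintitle becomes its own intitle term.
            out.append(f"intitle:{token}")
        else:
            out.append(token)
    return " ".join(out)


print(rewrite_allintitle("site:ac.uk allintitle:diabetic retinopathy"))
# prints: site:ac.uk intitle:diabetic intitle:retinopathy
print(rewrite_allintitle("allintitle:diabetic retinopathy filetype:pdf"))
# prints: intitle:diabetic intitle:retinopathy filetype:pdf
```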

Does this affect other combinations of commands? I left it at allintitle and intitle but I wouldn’t be at all surprised.

x-Factor web pages are “advanced” says Google’s reading level

Google has rolled out a new search option that assigns a reading level to the pages in your results list. Don’t be surprised if you haven’t spotted it yet; it is hidden on the advanced search screen. Under the “Need more tools?” section you can choose from the drop down menu to see all of the results with reading level annotations, basic results, intermediate results or advanced results.

Google Reading Level

Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to books. It could involve sentence structure, grammar, the length of sentences on a web page, the length of the document, the terminology used and doubtless many other criteria. But Google isn’t saying.

If you have opted to see the annotations, at the top of your results page you will see a graphic showing the percentages for each of the categories. Under the title of each entry in your results list is the reading level.

Google Reading Level Results

Click on the Basic, Intermediate or Advanced links next to the bar chart to see pages for that reading level. The eagle-eyed amongst you will have spotted that Google appears to be mathematically challenged because the numbers do not add up to 100%. In all of the searches I have done so far 1 or 2% are missing from the statistics. Looking through the lists of results some pages have no reading level assigned to them and they seem to be documents that contain very little information, have more numbers than text, and some are formatted files. Note, though, that most file formats do have a reading level so why some are not picked up remains a mystery to me. Some Daily Mail articles do not have a reading level either but many would argue that they fall into the ‘very little information’ category!

Once you have used the Reading Level in the advanced search screen you can change your search on the results page and it remains as part of your search strategy until you close down your browser or tab.

You can also check out an entire web site by using the site command, for example site:rba.co.uk

Google Reading Level for RBA site

And this is where you can start to have some fun comparing sites (WARNING – this is addictive!). Phil Bradley has done some in his blog posting “Google adds reading level” (http://philbradley.typepad.com/phil_bradleys_weblog/2010/12/google-adds-reading-level.html). He also highlights some potential problems with labelling pages in this way. For example ‘basic’ does not necessarily mean stupid, but some people may be deterred from selecting basic pages because of the tag.

Most of my pages are classed as intermediate and I am happy with that. Many of them are listings and analyses of business information sources. My husband’s blog on the other hand is 71% advanced and 27% intermediate. This comes as no surprise to me as he has a habit of littering his postings with complex calculations on topics such as wind turbine energy generation and the EROEI of tar sands oil production. (Just the sort of thing not to read before you have had your second cup of coffee of the day.) That plus the industry specific jargon that he uses makes an advanced tag inevitable.

Google Reading Level Energy Balance Blog

The evidence so far seems to be suggesting that using terms or jargon that are relatively uncommon in the whole of the Google database is a heavy factor in determining the reading level. Let’s look at what one might consider to be an intellectually challenging topic: the use of zeolites in environmental remediation.

Google Reading Level Zeolites search

That seems to confirm it.

As a final test and for a bit of fun let’s look at what Google makes of a search on the recent x Factor final.

Google Reading Level xFactor

Noooooo! Surely some mistake? The X factor home page is rated as basic but 93% of the results are advanced. There is indeed a mistake but it was my sloppy search strategy. Changing the x factor part of the search to a phrase gives what I would expect and a switch to 53% basic, 40% intermediate and 6% advanced.


Out of curiosity, I looked at the content of the advanced pages and am now totally bemused. I cannot see how they could ever have been classified as such, but then this is Google we’re talking about. Perhaps Google cannot comprehend the scoring system, why so many people watch it or why the programme exists at all?

Google Reading Level xFactor

I have experimented with several other searches. Some came up with results as bizarre as those for the x Factor search but it is interesting how the breakdown can be changed by slightly modifying your search strategy, for example by using phrases when appropriate or a plus sign before a term to force an exact match search. Google’s Reading Level could be useful as a training tool to show how small alterations to a search strategy can radically change the results. But as with all things Google, we do not know how it works and the results can sometimes be very strange. Use with caution.

IFEG Advanced Search, Statistics & Market Research

I have now uploaded the slides for my workshop at the Information for Energy Group (IFEG). As usual, I have uploaded them to several different web sites in case one or more are blocked by corporate firewalls. If you have problems accessing any of the locations, let me know and I’ll sort out some other means of getting the presentation to you.

Workshop: Advanced Internet Searching for Energy Information & Market Research
Organised for:
Information for Energy Group
Venue: The Energy Institute, New Cavendish Street, London.
Date: Thursday 13 May 2010

PowerPoint Presentation (download from the RBA site – 7.5 MB)