Update: On further investigation the example given below is not Google rewriting
webpage titles (see the comments section). However, Google has said that they
do rewrite under certain circumstances so please let me know if you come across
any good examples.
Most of us are used to Google rewriting our searches and personalising results and know how to stop Google doing it, but Google also rewrites the titles of some pages on the results page. This is something that I and my colleagues have noticed on and off for a while but it is now official (See Google’s Matt Cutts: Why Google Will Ignore Your Page Title Tag & Write Its Own http://searchengineland.com/googles-matt-cutts-look-title-match-query-190039).
According to the article Google checks that the title of a page is relatively short, a good description of the page and relevant to the query. If the existing page title fits those criteria then Google leaves it alone. If not then Google may use other content on the page such as H1 content, anchor text links pointing to the page and/or use the Open Directory Project. The aim, Matt Cutts says, is to ensure that the title helps a user assess whether or not the page has the information they are looking for.
During a search workshop I was running last week, one of the participants came across an example of what we think was a rewritten page title. Their search was mindfulness in school as crime prevention uk site:ac.uk. Top of the list was the home page of JournalTOCs and the title that Google gave was “Implementing mindfulness and yoga in urban schools: a…”.
This looked relevant to the search but clicking on the link took us to the home page of JournalTOCs where none of the original search terms were mentioned.
The source code of the page showed that the original title is simply JournalTOCs.
Did JournalTOCs have the keywords on an earlier version of its homepage that is currently in Google’s cache or did Google rewrite the search as well as the title of the page? When I tried to view Google’s cached copy of the page I got a 404 error!
I reran the search and applied Verbatim to it. There were four JournalTOCs pages in the first 100 results that were relevant but none matched the title that Google gave in the original results. I ran a search on that title in JournalTOCs but found nothing. Searching elsewhere I found that the article does exist. Also, the URL of the JournalTOCs page in the original results seems to include a reference to an article page, so I am not sure what is going on here. Did Google really rewrite the title? Or was the article once listed in JournalTOCs but no longer there and Google’s cached copies of JournalTOCs are out of date? Either way Google’s results were inaccurate, misleading and very confusing.
I have several Google accounts used for different purposes. I set up the first in the very early days of Google – long before even Gmail arrived on the scene – in order to manage analytics and what I then called “serious stuff” related to my business website. I subsequently used it for managing my YouTube videos. I set up a second account when Google Labs and Gmail came along and regarded that as my experimental account. Gradually, I used the second one more and more as my main account but kept the first for my business website applications. When Google+ came along I “upgraded” the second account and set up a profile.
Everything was fine until one day I tried to access my YouTube videos that were linked to my first, non-Google+ account. YouTube encouraged me to set up a Google+ profile for this account but I declined. YouTube responded by making my videos invisible to everyone, including myself! So I gave in and set up a second Google+ profile.
If only that had been the end of it. People started adding this new profile to their circles rather than my main one. I tried to find ways around this but in the end decided to just abandon the YouTube videos and delete the superfluous Google+ profile. It is easily done via your Google+ settings page but of course there are numerous dire warnings of all the wonderful things that you will no longer be able to enjoy (not a lot actually!). Despite what has been implied in the past, deleting (or what Google calls “downgrading”) your Google+ account does NOT delete your ordinary Google account.
Use the site: command to focus your search on particular types of site, for example site:nhs.uk for UK NHS websites, or to search inside a large rambling site. If you prefer you can use the Advanced search screen at http://www.google.co.uk/advanced_search and fill in the site or domain box.
An essential tool for making Google behave and run your search the way you want it run. Google automatically looks for variations on your terms and sometimes drops terms from the search. To make Google run your search exactly as you have typed it in, first run your search. Then click on ‘Search tools’ in the menu above your results. In the second line of options that appears click on ‘All results’ and from the drop down menu select Verbatim.
Google’s automatic synonym search can be helpful in looking for alternative terms but if you want a term to be included in your search exactly as you have typed it in then prefix the word with intext:. For example heron island intext:parrots caversham UK.
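As a rough illustration, the site: and intext: commands can be assembled into a search URL with a few lines of Python. The function name is my own invention, and passing the query in the `q` parameter of Google's search URL is an assumption for the purposes of this sketch rather than anything documented here.

```python
from urllib.parse import urlencode


def google_search_url(terms, site=None, must_include=()):
    """Build a Google search URL (hypothetical helper, for illustration only).

    terms        -- ordinary keywords, left for Google to interpret
    site         -- optional domain for the site: command, e.g. "nhs.uk"
    must_include -- words to prefix with intext: so they are searched as typed
    """
    parts = []
    for word in terms:
        # Prefix only the words that must appear exactly as typed.
        parts.append(f"intext:{word}" if word in must_include else word)
    if site:
        parts.append(f"site:{site}")
    query = " ".join(parts)
    # Assumption: the search string is passed in the "q" URL parameter.
    return "https://www.google.co.uk/search?" + urlencode({"q": query})


print(google_search_url(["heron", "island", "parrots", "caversham", "UK"],
                        must_include={"parrots"}))
```

The commands themselves are just text typed into the search box, so the same strings work equally well pasted in by hand.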
4. Incognito/Private browsing
Even if you are not signed in to a Google account, Google personalises your results according to your search and browsing behaviour. If you want to burst out of the filter bubble, as it is often called, use a private browser window or incognito (Chrome). Google will then ignore tracking and search cookies on your machine. To call up a private browser or incognito window use the following keys:
Chrome – Ctrl+Shift+N
Firefox – Ctrl+Shift+P
Internet Explorer – Ctrl+Shift+P
5. Reading level
This changes the emphasis of the results that you see. Run your search and from the menu above the results select ‘Search tools’, then ‘All results’, and from the drop menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Click on the Advanced option to see results biased towards research.
To limit your search by date, for example the last month or year, first run your search. Then click on ‘Search tools’ in the menu above the results and from the second row of options that appears click on ‘Any time’. Select your time period or a custom range from the drop down menu. Unfortunately, this does not work with Verbatim. You could use the ‘daterange:’ command instead to specify your dates and then apply Verbatim, but you first have to convert your dates to Julian format. The Julian Date Converter at http://aa.usno.navy.mil/data/docs/JulianDate.php tells you more about the format and provides a tool for converting dates. Alternatively, use something like Gmacker (http://gmacker.com/web/content/gDateRange/gdr.htm). This enables you to enter your search terms and select your dates from a calendar. It then runs your search and on the Google results page you can apply Verbatim in the usual way.
7. Cached
The cached option enables you to view the copy of the page that Google has in its database. This is useful when the current version of a page seems to differ significantly from the one described in the Google search results. Click on the little green arrow next to the URL of the page on the results list and then select Cached.
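If you would rather do the Julian conversion yourself, the sketch below shows one way in Python. It assumes that daterange: expects whole Julian day numbers separated by a hyphen, which is how I understand the command to work; the function names are hypothetical.

```python
from datetime import date

# Offset between Python's date ordinal (1 January of year 1 = 1) and the
# Julian day number that begins at noon on the same calendar date.
JULIAN_DAY_OFFSET = 1721425


def julian_day(d):
    """Return the whole Julian day number for a calendar date."""
    return d.toordinal() + JULIAN_DAY_OFFSET


def daterange_term(start, end):
    """Build a daterange: term (assumed format: daterange:START-END)."""
    return f"daterange:{julian_day(start)}-{julian_day(end)}"


# 1 January 2000 is Julian day 2451545, a handy value for checking the sums.
print(julian_day(date(2000, 1, 1)))
print(daterange_term(date(2014, 1, 1), date(2014, 1, 31)))
```

Add the resulting term to the rest of your search and then apply Verbatim in the usual way.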
Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics or PDF for research papers and industry/government reports. One workshop participant found it to be a great way to track down conference poster presentations by combining PDF and PowerPoint filetypes with keywords and the term ‘poster’.
9. Country versions of Google
The country versions of Google give priority to the country’s local content, although it might be in the local language. This is a useful strategy when searching for industries, companies and people that are active in a particular country. Use the standard ISO two letter country code, for example http://www.google.fr/ for Google France, http://www.google.it/ for Google Italy.
10. Books – About this magazine
Several people were interested in Google Books and in the magazine archives in particular. Google does not, though, make it easy to browse a magazine’s archives. Once you have identified a series that is of interest it would seem logical to click on “Browse all issues” to view a list of what is available.
However, it seems to list the years of the issues randomly. Selecting “About this magazine” brings up some brief information about the title and links that enable you to browse past issues by year.
This collection of Top Tips is a combined list nominated by those who attended the autumn and spring UKeiG workshops on “Anything but Google”. The participants came from all sectors and types of company, and included a couple of self employed researchers. The sessions covered both general search tools and specialist services, and the list is an interesting mix of strategies and specific sites. A big “Thank you” to everyone who participated in the workshops.
1. Get to know the advanced search commands and options.
Google is not the only search tool that uses them and they can help focus your search, especially when using general search tools such as Bing.
2. If you are conducting serious research don’t stop with the first reasonable looking results.
Information of dubious quality can infiltrate even the most well respected of specialist websites. Put on your “skeptical goggles” as one delegate said! There are plenty of alternative tools and resources out there so get some corroboration from additional sources before acting on the information you find.
3. Allocate time for your search.
If you are carrying out in-depth research don’t leave it to the last minute. You will probably need to tweak your strategy and try different search tools to ensure that you are retrieving the best information. It can sometimes take longer than you anticipate.
4. Plan your strategy.
Think about the type of search you want to conduct and the type of information you are looking for. For example if you are carrying out a systematic review and want to use Boolean operators forget about Google; head for Bing instead. And if you need official statistics or company information go straight to specialist sites that provide that data.
5. Don’t stick with what you regularly use.
Experiment with other resources, especially if you suspect your default search tool is not telling you the whole story.
6. Country versions of search tools.
Many search tools offer country versions that give priority to the country’s local content, although that might be in the local language. This is a useful strategy when searching for industries, companies and people that are active in a particular country.
7. Learn when to try something else.
If a site’s navigation or internal search engine seems to be returning rubbish don’t struggle with it. Try another route to get to the information. Either try an alternative source of information or use the ‘site:’ command – available in Bing as well as Google – to search inside the site.
This was recommended for its clean, straightforward layout and the range of resources it offers on a topic. A school librarian commented that the pupils at her school loved it.
If you are fed up with seeing the same results from Google again and again give MillionShort a try. MillionShort enables you to remove the most popular web sites from the results. Originally, as its name suggests, it removed the top 1 million but you can change the number that you want omitted. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing.
This was recommended for its clustering of results and also the visualisations of terms and concepts via the circles and “foam tree”. There is a link to the live web demo on the left hand side of the home page.
11. Microsoft Academic Search – charts http://academic.research.microsoft.com/
This is a direct competitor to Google Scholar. The site can be slow to load and it sometimes assigns authors to the wrong institution. Nevertheless, the visualisations such as the co-author and citation maps can be useful in identifying who else is working in a particular area of research. The visualisations can be accessed by clicking on the Citation Graph image to the left of the search results or an author profile.
12. Creative Commons and public domain images.
Use the Bing license option (US version only) to search for images with creative commons or public domain licenses, but do go to the original webpage and check that the license is indeed associated with the image you want to use. Alternatively use one of the following:
13. Tineye Multicolr http://labs.tineye.com/multicolr/
“Search 10 million Creative commons Flickr images by colour.” You can specify more than one colour and move the dividing bar between two colours to increase/decrease their prominence within the image. Click through to the original Flickr image to double check the license.
14. Company Check http://www.companycheck.co.uk/
Company Check repackages Companies House data and provides 5 years of accounts, and graphs for some financials free of charge. It also lists the directors of a company. Click on a director’s name and you can view other current and past directorships for that person. It provides more free information than Companies House but you have to register (free) to gain full access. Additional information such as credit risk, CCJs, credit reports, and many Companies House documents are priced or available as part of a subscription.
15. Guardian Data Store http://www.guardian.co.uk/data/
For datasets and visualisations relating to stories currently in the news. As well as the graphs and interactive maps the source of the data is always given and there are links to the original datasets that are used in the articles.
This is a search tool for searching information contained in charts, graphs and tables of data and within formatted documents such as PDFs, Excel spreadsheets and images. Enter your search terms and optionally limit your search by date and/or format type. One delegate said “It has changed my life!”. (We think/hope she meant her working life.)
17. Keep up to date.
Keep up to date with what the search engines are up to, changes to key resources and new sites. Identify blogs and commentators that are relevant to your research interests and subject areas, and follow them using RSS or email alerts.
My Twitter feed and other social media this morning are full of posts and updates saying that Getty Images is making all of its images freely available. It is not. Read the “Embedded Viewer” section of its Terms and Conditions at http://www.gettyimages.co.uk/Corporate/Terms.aspx for what you can and cannot do.
They are making a limited selection of images available for “editorial purposes (meaning relating to events that are newsworthy or of public interest).”
“Embedded Getty Images Content may not be used: (a) for any commercial purpose (for example, in advertising, promotions or merchandising) or to suggest endorsement or sponsorship“.
Getty also reserve the right “to place advertisements in the Embedded Viewer or otherwise monetise its use without any compensation to you.”
Ignore these T&Cs at your financial peril!
As for the image associated with this article, it is not from Getty but one of my own. It is a decommissioned composting toilet at Barracks Lane Community Garden, Oxford. Please feel free to use as you wish.
Google has been automatically dropping terms from searches that give few or no results for some time. It now looks as though Bing may be doing the same. Unfortunately I cannot give the details of the search that brought this to light as it was confidential research. In general, though, what we were searching for were announcements or news articles about two companies involved in a particular project. We hadn’t found anything in Google so we tried various alternative search engines including Bing (http://www.bing.com/). The results seemed quite promising until we started looking at the individual pages. None of them had all of our terms. It is possible that the missing terms appeared in links to the pages but the content of the documents suggested that this was unlikely, and there is no reliable free tool that shows you who is linking to a specific page. So it looks as though Bing is now dropping terms in the same way that Google does.
There are two ways to stop Bing doing this. The first is to use the Boolean AND operator between all of your terms. The second is to prefix the term that must be present in a document with ‘inbody:’, for example inbody:aardvark.
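Both approaches are simply a matter of how the search string is written. The sketch below builds each form in Python; the function names are hypothetical and the extra search terms are placeholders, not the confidential search mentioned above.

```python
def bing_all_terms(terms):
    """Join every term with the Boolean AND operator."""
    return " AND ".join(terms)


def bing_require_in_body(terms, required):
    """Prefix the terms that must appear in the document with inbody:."""
    return " ".join(f"inbody:{t}" if t in required else t for t in terms)


# Placeholder terms for illustration only.
terms = ["aardvark", "zebra", "merger"]
print(bing_all_terms(terms))
print(bing_require_in_body(terms, required={"aardvark"}))
```

The first form forces every term to be present; the second lets Bing stay flexible on everything except the terms you mark.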
Did we find anything that answered our question? No, but sometimes I don’t expect to and it is frustrating when the search engine thinks it knows best and unilaterally decides to rewrite the search strategy.
This is a feature which I have been seeing on and off for a few months so I’m not sure if it is one of their experiments or if it is being rolled out gradually. It’s very simple: advertisements that appear at the top of your results lists and in the panel to the right are marked with a little yellow box with ‘Ad’ written inside.
Over the years it has become harder to identify ads at the top of results as the pale pastel backgrounds to them became more subtle. It has been suggested that the more obvious marker is a consequence of discussions between Google and various regulatory authorities.
On one of my recent workshops I was asked if I used Google as my default search tool, especially when conducting business research. The short answer is “It depends”. The long answer is that it depends on the topic and type of information I am looking for. Yes, I do use Google a lot but if I need to make sure that I have covered as many sources as possible I also use Google alternatives such as Bing, Millionshort, Blekko etc. On the other hand and depending on the type of information I require I may ignore Google and its ilk altogether and go straight to one or more of the specialist websites and databases.
Here are just a few of the free and pay-per-view resources that I use.
There are at least a dozen statistics sites that I use on a regular basis but if I’m unsure of where to look or want to make sure I haven’t missed anything I use OFFSTATS – Official Statistics on the Web at http://www.offstats.auckland.ac.nz. A great starting point for official statistical sources by country, region, subject or a combination of categories. All of the content in the database is in the public domain and available through the Internet and has been quality assessed by staff at The University of Auckland Library.
Official company information
If I want to confirm the existence of a company or obtain filings and accounts I usually go direct to the relevant official company registry. I have a list of the registries that can be searched online at http://www.rba.co.uk/sources/registers.htm. As many of my enquiries are for UK companies I am a frequent visitor to Companies House at http://www.companieshouse.gov.uk/. Some information is free but filings and accounts are priced. There are several companies that repackage Companies House data and sometimes make extra data or analysis free of charge, for example Company Check at http://www.companycheck.co.uk/, which enables you to search by company or director’s name. Risk reports, information on CCJs, and some official filings are priced if you do not have a subscription to the full service.
Share price information
For free share price information I use Yahoo Finance (http://uk.finance.yahoo.com/) and Google Finance (https://www.google.co.uk/finance). Both of these services provide charts and news on shares on the major stock markets. Google’s graphs are ‘annotated’ with labels that link to news articles listed to the right of the graph, so you can see whether or not a particular event or announcement has affected a share price. Both offer free, daily historical share prices in figures. As well as viewing historical graphs for share prices you can download the data as a spreadsheet.
For news alerts I use a mixture of bookmarked searches, Google email and RSS alerts, and RSS feeds from a wide range of blogs and news sites. I find that Google alerts are erratic and unreliable but they do sometimes pick up something unique so I still include them in the mix. RSS feeds are my main source of current awareness and when a news feed is rather broad in coverage I use my RSS reader’s search function to identify the articles that are of interest to me. I use a desktop reader called RSSOwl (http://www.rssowl.org/) but Inoreader (http://inoreader.com/) is a web based service that offers similar features and options.
Who is behind a site?
If I am to use any information from the web for business purposes I need to know who is behind the website. DomainTools http://www.domaintools.com/ is one of many services that will tell you who owns a domain name, unless they are hiding behind an agent or privacy protection service. There is also a Whois+ extension for Chrome (my default browser) that can be used to run a quick and easy check on the domain name of a displayed page.
If you are interested in finding out more about business information resources I am running a workshop for TFPL in London on June 6th 2014 (originally March 6th). Details are on the TFPL website.
Update: please note change of date for the next business information workshop. It is now being held on June 6th, 2014.
Several weeks ago I noticed that Google was displaying the terms it had dropped from your search as ‘Missing’. Google started routinely ignoring selected search terms towards the end of 2011 (see http://www.rba.co.uk/wordpress/2011/11/08/dear-google-stop-messing-with-my-search/). Google’s response to the outcry from searchers was to introduce the Verbatim search option. However, there was no way of checking whether all of your terms appeared in a result other than viewing the whole page. Irritating, to say the least, if you found that the top 10 results did not include all of your keywords.
Fast forward to December 2013, and some people started seeing results lists that showed missing keywords as strikethroughs. I saw them for a few days and then, just as I was preparing a blog posting on the feature, they disappeared! I assumed that they were one of Google’s live experiments never to be seen again but it seems they are back. Two people contacted me today to say that they are seeing strikethroughs on missing terms. I ran my test searches again and, yes, I’m seeing them as well.
I ran the original search that prompted my November 2011 article (parrots heron island Caversham UK) and included -site:rba.co.uk in the strategy to exclude my original blog postings. Sure enough, the first two results were missing parrots and had “Missing parrots” underneath their entry in the list.
Remember, though, that Google automatically looks for variations on your search terms. Your original keyword may not be present in the results but a synonym may be, for example birds instead of parrots. I did find one entry in the top 10 results that seemed out of place. It was an article in the local newspaper. There was no ‘Missing….’ underneath the result but I could not find synonyms of parrots anywhere in the story. The closest was the surname of a person mentioned in the article – Parry – and I can only assume that this was Google truncating my ‘parrots’ keyword and looking for variations. But Parry was not highlighted in the entry in the results list as synonyms are. I shall look at the article in more detail to check that I have not missed a lurking parakeet or cockatoo!
I need to carry out more test searches to see how reliable this new feature is but, nevertheless, it is a welcome and useful addition to Google. If your top 10 results are showing strikethroughs then you know you have to either prefix the missing term with ‘intext:’ or use Verbatim for the whole search.
A couple of weeks ago I was in Exeter and Bristol leading workshops for NHS South West on “Google & Beyond”. We covered advanced Google commands, Google Scholar and alternatives to Google. Below are the combined top tips from the two sessions. I may have missed a couple from the list as I could not read my writing, so if you attended one of the workshops let me know if I’ve omitted your suggested tip.
Verbatim. Yet again, this has topped the list of useful Google search options. Google automatically looks for variations on your search terms and sometimes drops terms from your search without telling or asking you. To make Google run your search exactly as you have typed it in, first run your search. Then click on ‘Search tools’ in the menu above your results, in the second line of options that appears click on ‘All results’ and from the drop down menu select Verbatim.
Be aware of personalisation. Even if you are not signed in to a Google account Google personalises your results according to your search and browsing behaviour. Personalisation is not necessarily a bad thing but if you want to burst out of the filter bubble, as it is often called, use a private browser window or incognito (Chrome). Google will then ignore tracking and search cookies on your machine and will not personalise your results. To call up a private browser or incognito window use the following keys:
Chrome – Ctrl+Shift+N
Firefox – Ctrl+Shift+P
Internet Explorer – Ctrl+Shift+P
site: Use the site: command to focus your search on particular types of site, for example site:nhs.uk, or to search inside a large, rambling site. You can also use -site: to exclude sites from your search.
intext: Google’s automatic synonym search can be helpful when looking for alternative terms, but if you want a term to be included in your search exactly as you have typed it then prefix the word with intext:.
filetype: Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics or PDF for research papers and industry/government reports. Note that in Google filetype:ppt and filetype:xls will not pick up the newer .pptx and .xlsx formats so you will need to include those in your strategy, for example filetype:ppt OR filetype:pptx, or run separate searches for each one. In Bing.com, though, filetype:pptx will pick up both .ppt and .pptx files.
Advanced search commands and search options. Learn how to use the search commands (for example intext:, filetype: and site:). Many of these can be used on the advanced search screen that can usually be found under the cog wheel in the upper right hand area of the screen, but that link sometimes disappears so learning the commands is a better bet. A list of the more useful Google commands is at http://www.rba.co.uk/search/SelectedGoogleCommands.shtml.
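Remembering to include both the old and new extensions is easy to forget, so here is a small Python sketch that expands a format into the full OR clause for Google. The mapping and function name are my own, made up for illustration, and the search terms in the example are placeholders.

```python
# Hypothetical mapping from a document type to the extensions that
# need to be searched separately in Google.
FORMAT_EXTENSIONS = {
    "powerpoint": ["ppt", "pptx"],
    "excel": ["xls", "xlsx"],
    "pdf": ["pdf"],
}


def filetype_clause(document_type):
    """Return filetype: terms joined with OR for one document type."""
    extensions = FORMAT_EXTENSIONS[document_type]
    return " OR ".join(f"filetype:{ext}" for ext in extensions)


print("mindfulness schools " + filetype_clause("powerpoint"))
```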
Combine advanced search commands. Practise combining the advanced search commands for a more precise, focused set of results.
Google Reading level. This changes the type of results that you see. Run your search and from the menu above the results select ‘Search tools’, ‘All results’ and from the drop menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Click on the Advanced option to see results biased towards research. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to publications. It seems to involve an analysis of sentence structure, the length of sentences, the length of the document and whether scientific or industry specific terminology appears in the page.
Numeric range. This command is unique to Google. Use it for anything to do with numbers – years, temperatures, weights, distances, prices etc. Simply type in your two numbers separated by two full stops as part of your search. This is a good way of limiting your search, for example, to forecasts over the next few years.
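As a minimal sketch of the syntax, the Python below formats two numbers as a numeric range and adds it to a search. The function name, keywords and years are placeholders of my own choosing.

```python
def numeric_range(low, high):
    """Format two numbers as a Google numeric range: low..high."""
    if low > high:
        # Swap so the smaller number always comes first.
        low, high = high, low
    return f"{low}..{high}"


# Placeholder keywords and years, for illustration only.
print("wind energy forecasts " + numeric_range(2015, 2018))
```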
Limiting your search by date. To limit your search by date, for example the last month or year, first run your search. Then click on ‘Search tools’ in the menu above the results and from the second row of options that appears click on ‘Any time’. Select your time period or a custom range from the drop down menu.
Use the minus sign to exclude documents containing a word. If you do not want documents containing a specific word prefix that term with a minus sign. The minus sign can also be used with commands such as site: and filetype: to remove an individual site or type of document from your results.
Million Short http://millionshort.com/. If you are fed up with seeing the same results from Google again and again give Million Short a try. Million Short runs your search and you can choose to remove the most popular web sites from the results. Originally, as its name suggests, it automatically removed the top 1 million but now you can choose to remove the most popular 100, 1000, 10k, 100k or million sites. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing.
Creative commons searches for images. Rather than search for images and go through them individually to find one that you can legally use in your document or presentation, use advanced search options or tools that allow you to select the appropriate license from the start. In Google, use the usage rights menu on the image advanced search screen to search for images with the license you need. The US version of Bing images includes a license option in the menu at the top of your results.
Double check the license of the photo on the website or blog hosting it. The license you need may be associated with a different image and yours could, for example, be ‘all rights reserved’. Flickr has a page where you can search for images with a specific Creative Commons license at http://www.flickr.com/creativecommons.
Compare in Google. This is not a Google command but if you type in a search such as compare carrots with cabbage Google will create a table comparing the properties of the two items. Google has been known to get some of the data wrong, though, so it’s worth double checking the figures before you use them.
Web archives. Want to see what was on a website a few years ago or trying to track down a document that seems to have vanished from the web? Try the Internet Archive Wayback Machine at http://www.archive.org/. Enter the URL of the website or document and you should then see a calendar of the snapshots that the archive has of the site or document. Choose a date from the calendar to view the page. The archive does not have everything but it is worth a try. See also the UK National Archives of old government websites and pages at http://www.nationalarchives.gov.uk/webarchive/ and the UK Web Archive at http://www.webarchive.org.uk/ukwa/.
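If you check archived copies regularly, the lookup can be reduced to building a URL. The sketch below relies on my understanding of two Internet Archive conventions: a web.archive.org address that includes a timestamp and takes you to the nearest snapshot, and an "availability" address that reports the closest snapshot. Treat both as assumptions to verify; the function names are hypothetical.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(page_url, timestamp):
    """URL for the archived copy nearest to timestamp (YYYYMMDD)."""
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def wayback_availability_url(page_url, timestamp=None):
    """URL that asks the archive which snapshot is closest to timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


print(wayback_snapshot_url("http://www.rba.co.uk/", "20120101"))
print(wayback_availability_url("rba.co.uk", "20120101"))
```

Neither function contacts the archive; they only build the address to paste into a browser.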
Statistics sites. Although you can often find statistics via Google, you may find dedicated official statistics sites quicker and more reliable. Some of the sites we covered during the workshops were: