Tag Archives: search

But it’s all free on Google

This is the presentation I am giving tomorrow (27th September 2012) at the East of England Information Services Group of CILIP at the Bury St Edmunds Library. The event is “Information integrity on a decreasing budget”. Many of the slides are just images so I am not really giving that much away by releasing them early. And before anyone asks, no, I will not be providing notes and I will not be providing a version with a voice over – at least not for free. Some will soon be made available in the resurrected Search Strategies section of my web site at http://www.rba.co.uk/search/index.shtml, in the subscription area only.

The straightforward, no notes, no voice over slides are available at

http://www.rba.co.uk/as/ – available for a few weeks only

http://www.authorstream.com/Presentation/karenblakeman-1549834-39-free-google/

http://www.slideshare.net/KarenBlakeman/but-its-all-free-on-google

And the next Google killer is….Google!

Many of us have been saying for a while that the search engine that will kill Google is Google itself. It has come so close in the past, two of the more recent incidents being the removal of the plus sign from general web search and stopping the ‘ANDing’ of search terms. Prefixing search terms with the plus sign enabled searchers to disable Google’s synonym and variation search so that it carried out an exact match search. It still works in Google Scholar but not in general web search; Google is now using the ‘+’ prefix within Google+ to help users find Google+ business pages, for example +BASF will quickly take you to the BASF business page.  Google redeemed itself to some extent by hastily bringing in the Verbatim option, which can be found in the left hand menu of your results page. This will run your search exactly as you specify it (Google: Verbatim for exact match search http://www.rba.co.uk/wordpress/2011/11/18/google-verbatim-for-exact-match-search/). However, while it works with Google commands such as ‘filetype:’ and ‘site:’ it gives up as soon as you start using some of the options in the left hand menu on the results page, such as date.

And now enter Google+ and Search Plus Your World (SPYW). If you decided to add Google+ to your Google account, Google has altered (I would say seriously messed up) the way it handles your search if you are logged in. It now incorporates and gives priority to results from your Google+ network. (For more details from Google see Search, plus Your World – Inside Search http://insidesearch.blogspot.com/2012/01/search-plus-your-world.html). At present it is only available if you are signed in on Google.com and searching in English. “Search Plus Your World” is now the default and personalizes your results based on your own behaviour, your social connections, and content that has been shared with you through Google+. Phil Bradley has written an excellent posting on how this works (Why Google Search Plus is a disaster for search http://philbradley.typepad.com/phil_bradleys_weblog/2012/01/why-google-search-plus-is-a-disaster-for-search.html).

Initially I was in two minds about SPYW. I thought I might find it useful if I wanted to check what people in my Google circles were saying about a particular issue, but then realised that most of them prefer to post on Twitter rather than in Google+, and Google+ does not cover Twitter! The Search+ results include:

  • listings from the web
  • pages from the web that have been given priority because of your search behavior
  • pages from the web given priority because of your social connections
  • both public and private (or limited) Google+ posts, photos and Google Picasa photos

When it comes to serious research Search+ includes far too much irrelevant information. So how easy is it to turn it off? If you are logged in when you run your search you will see a message above your results that tells you the number of personal results and “other results” that have been found. There is also a toggle that enables you to switch between personalised and unpersonalised results. You can also switch it off permanently within your search settings.

You can of course just log out of your Google account before you run a search, or never sign up for Google+ in the first place. But Google is making the latter increasingly difficult. Let’s look at the results that might be popping up on your screen and as an example I’ll use a search on Phil Bradley, search and social media expert and President of CILIP. First of all a search on Phil Bradley before Search+ arrived:

On my screen I see pages from his web site, his blog and a Wikipedia entry (which is not the Phil Bradley I am looking for!). When I sign in to a Google account that has Google+ associated with it I see something completely different:

 

Phil’s Google+ profile is given priority above everything else and takes up most of the screen regardless of whether or not it is the most relevant or most up to date (Real-Life Examples Of How Google’s “Search Plus” Pushes Google+ Over Relevancy http://searchengineland.com/examples-google-search-plus-drive-facebook-twitter-crazy-107554).  And don’t think you can escape with a Google account that does not include Google+. Google has ways of enticing you to “upgrade”:

Even worse, if you sign up now for a new Gmail, YouTube or Blogger account you are automatically joined to Google+ (http://searchenginewatch.com/article/2140440/New-Gmail-YouTube-Blogger-Users-Join-Google-by-Default).

Search+ has even tainted the suggestions that pop up as you type in your search:

Phil’s Google+ profile is given prominence and if you click on the link without having an account yourself you are invited to join:

To see what the suggestions should look like a group called Focus on the User (http://www.focusontheuser.org/) has produced a bookmarklet for Chrome, Firefox and Safari and extensions for Chrome and Firefox. This tries – and succeeds most of the time – to display your search results without the intrusion of Google+ results. For my search on Phil his Google+ profile is replaced with Twitter.

When I run a search on my own name my Google+ entry is supplanted by my LinkedIn profile.

“What Google should be” does not, though, remove the extra “content” that Search+ sometimes adds to the right of your results. Run a subject search and you may see “People and Pages on Google+” that are supposedly related to your search terms.

I have not yet found these entries to be more relevant than standard search results and the link “Learn how you could appear here too” indicates that Google sees this as another way of persuading people and organisations to join Google+. Switching it off is not easy. It is still there if you are logged out of your Google account. It is still there if you add &pws=0 to the search URL (in fact &pws=0 does not seem to work any more at all for depersonalising results). It does disappear, though, if you use Incognito in Chrome. The intrusion of Google+ is most obvious when running searches with just one or two terms or more consumer biased searches. As soon as you start building more complex searches involving filetype: or site: for example, or research more scientific subjects, Google+ takes a back seat.
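For anyone who wants to repeat the &pws=0 test, here is a minimal Python sketch of how the parameter is appended to a search URL. The function name and the default host are my own choices for illustration, and as noted above the parameter may no longer have any effect on the results.

```python
from urllib.parse import urlencode


def google_search_url(query, depersonalise=True, host="www.google.co.uk"):
    """Build a Google search URL, optionally appending pws=0.

    The host and this helper are illustrative assumptions, not an official API.
    """
    params = {"q": query}
    if depersonalise:
        # pws=0 is the parameter discussed in the post; it may no longer work.
        params["pws"] = "0"
    return f"https://{host}/search?{urlencode(params)}"


url = google_search_url("Phil Bradley")
print(url)
```

Pasting the printed URL into a browser, with and without the parameter, is a quick way to compare the two result pages side by side.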

Search+ is not all that is affecting how Google presents results. Google is simplifying its privacy policies and combining user data from all of its services (Official Google Blog: Updating our privacy policies and terms of service http://googleblog.blogspot.com/2012/01/updating-our-privacy-policies-and-terms.html). It sounds innocent enough but I’ve already spotted major changes. Google knows I live in Reading because I have told it and I do find that useful when I am carrying out local searches for restaurants, builders etc. Google has now decided, though, to bombard my YouTube home page with videos about Reading.

The videos of the Reading railway station redevelopment are vaguely interesting but I see enough of that in real life on a daily basis when I pass through the centre of town. The football videos are of no interest to me whatsoever. So the crossover of content has already started and I am not looking forward to what Google decides to put in my web search results as a consequence of my YouTube activity!

It is becoming increasingly difficult to make Google behave. Using advanced search commands is one way but many searches do not require them. The best method I have found so far is to use Chrome as your browser and open an incognito window. This depersonalises your results, ignores your web history and existing cookies, and leaves no traces of your search activity. Alternatively, since Google has clearly lost the plot when it comes to search, try another service. The three that I would currently recommend are Bing (http://www.bing.com/), DuckDuckGo (http://duckduckgo.com/) and Blekko (http://blekko.com/).

AROUND: Google proximity search operator

Several people have already blogged about Google’s AROUND proximity operator: Digital Inspiration, ResearchBuzz, SearchReSearch and Phil Bradley to name just four. According to SearchReSearch the command has been available for 5-6 years, which begs the question “Why has no-one picked up on it before now?” Could it possibly be because the operator does not do what it says on the tin? Perish the thought and wash my brain out with soap and water for even considering such a thing.

The AROUND command allows you to specify the maximum number of words that separate your search terms. The syntax is firstword AROUND(n) secondword. For example oil AROUND(2) production.
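To make the syntax concrete, the following Python sketch builds an AROUND query string and checks locally what "at most n words between the two terms" would mean for a piece of text. It is only an illustration of the idea: the function names are hypothetical, matching in either order is my assumption, and it says nothing about how Google actually implements the operator.

```python
import re


def around_query(first, second, n):
    """Return a query string in the form: first AROUND(n) second."""
    return f"{first} AROUND({n}) {second}"


def within_n_words(text, first, second, n):
    """True if first and second occur with at most n words between them.

    Assumption: the terms may appear in either order.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) - 1 <= n
        for i in first_pos
        for j in second_pos
        if i != j
    )


print(around_query("oil", "production", 2))
print(within_n_words("oil and gas production rose", "oil", "production", 2))
print(within_n_words("oil prices fell while domestic production rose", "oil", "production", 2))
```

The first sentence has two words between the terms and so passes the check for n=2; the second has four and does not.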

The reason I have not commented on AROUND so far is because – how can I put this politely – I am finding it difficult to find a search in which it is of practical value. I shall illustrate with just one of my searches, macular degeneration, but my experiences with other test and “real” searches are similar. When testing search features the relevance of the documents that appear on the first few pages of the results is more important than the number of hits, especially as the latter are often guesstimates from Google and can vary enormously depending on which version of Google you use. Nevertheless, the numbers are interesting even if they only serve to confuse us further and I have included them with the screen shots. All of the following searches were run in Google.co.uk.

Let’s kick off with a very basic version of my test search: macular degeneration

Number of results: 7,340,000

Macular Degeneration simple search

The results are relevant and as usual Google appears to be listing first those pages where the terms appear next to one another. If we did want to be more precise and reduce the number we could search for the phrase: "macular degeneration".

Number of results: 1,690,000

Macular degeneration phrase search

Not surprisingly the number of results has been reduced significantly to 1,690,000.

Let us now say that my enquirer has come back with an amendment to the original request. They have been told that there are several forms of macular degeneration, for example macular disciform degeneration, and they want a selection of articles covering as many of them as possible. I have a biomedical background and can easily identify the relevant phrases and run separate searches on them, but what if I didn’t have a clue where to start? I could use Google’s asterisk (*) between my two terms to stand in for one or more words.

The strategy macular * degeneration gives us a massive 21,500,000 results, far more than our first basic search if the numbers are to be believed.

Macular degeneration asterisk search

In just the first 6 results we have picked up vitelliform and disciform degeneration, and more are picked up in the subsequent 20-30 results.
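If you have copied a batch of result summaries into a file, the expanded phrases can be pulled out automatically. The Python sketch below is one way to do that with a regular expression; the snippets are invented for illustration, the function name is hypothetical, and the limit of three intervening words is an arbitrary assumption.

```python
import re


def expanded_phrases(snippets, first, second, max_gap=3):
    """Collect phrases where 1 to max_gap words sit between first and second."""
    pattern = re.compile(
        rf"\b{re.escape(first)}((?:\s+\w+){{1,{max_gap}}})\s+{re.escape(second)}\b",
        re.IGNORECASE,
    )
    found = []
    for snippet in snippets:
        for match in pattern.finditer(snippet):
            # Normalise case and whitespace so duplicates are easy to spot.
            phrase = " ".join(match.group(0).lower().split())
            if phrase not in found:
                found.append(phrase)
    return found


# Invented example snippets, not real Google results.
snippets = [
    "Macular disciform degeneration is a late stage of the disease.",
    "Best disease is also described as macular vitelliform degeneration.",
    "Age-related macular degeneration affects central vision.",
]
print(expanded_phrases(snippets, "macular", "degeneration"))
```

The plain phrase in the third snippet is deliberately skipped because the pattern requires at least one word between the two terms.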

Google’s search tips say “If you include * within a query, it tells Google to try to treat the star as a placeholder for any unknown term(s) and then find the best matches.” It is not clear from this whether the asterisk stands in for one or more terms. Adding more asterisks to the search does not alter the number of results, which in any case are only an estimate. We do, though, see very different content and now variations on our terms (for example macula)  are appearing emboldened in the page summaries.

Comparison of asterisk searches

We could try and force an exact match search by placing a plus sign before macular in our strategy, but let’s try and keep this exercise simple.

Now for three searches using AROUND(n). Note that AROUND must be in capital letters, otherwise Google will treat it as just another search term. Specifying the number of separating words as 1, 2 and 3 gave me 1,710,000, 1,710,000 and 1,720,000 results respectively.

Google AROUND operator

The results are very different from the searches incorporating the asterisk, and the results for AROUND(2) and AROUND(3) were identical. Also, it seems that with the AROUND operator Google is giving priority to documents where the terms are a phrase and not separated by any other words. It was only when I reached around result 650 that I started to see phrases where my two terms were separated by one other word.

Using just AROUND without any number gave me 1,610,000 results that looked very similar to those obtained with AROUND(1).

Logically, one might think that macular AROUND(0) degeneration would be the same as a search on the phrase "macular degeneration". It isn’t!

Phrase versus AROUND(0)

Not only is the number of results different (AROUND(0) comes back with 4,250,000 compared with 1,690,000 from the phrase search) but so is the content.

Finally, I decided to follow Phil Bradley’s lead and see what happens when I try and exclude the phrase from the AROUND(0) search: macular AROUND(0) degeneration -"macular degeneration". I got 43,000 results in which the terms seemed to appear anywhere within the document, in any order and separated by any number of other words.

In conclusion, despite what I said earlier I think AROUND does work but it is difficult to test because Google always seems to give priority to pages in which your terms appear as a phrase and not separated by any other words. Its effect is probably more obvious if you are dealing with a topic that would otherwise return a very small number of results. The ranking and sorting of the results changes significantly, though, when you use AROUND so it might be worth trying if you are fed up with seeing the same documents and sites again and again. In all of the test searches I have carried out so far I still prefer the asterisk, especially if I want to be able to identify expanded phrases quickly and easily. But, as the saying goes, your mileage may vary. Feedback on your own experiences, please.

Your Google results are about to get weirder

Persuading Google to recreate the same list of results for a search is impossible. Google continually updates its database and index with new and updated pages. Even a few minutes between repeat searches can make a significant difference. Add into this mix the fact that your search will probably be diverted to a different server from the one that gave you your initial results (Google has thousands of servers) and that the second server may have been updated at a different time with different pages. Oh, and Google may have decided to play around with the ranking algorithms and display options on this particular server as an experiment. And are you sure you have entered your search terms in exactly the same order as before, because that can make a difference as well? And we haven’t even started to consider the difference of searching in Google.co.uk vs. Google.com vs. Google.ca etc.

Now we have Google personalized search, and by ‘we’ I mean all of us by default.

Search personalization is nothing new. In 2005 Google announced a new feature that was enabled if you were logged in with your Google account: web history and personal search (see Official Google Blog: Search gets personal http://googleblog.blogspot.com/2005/06/search-gets-personal.html). If you were logged in to your Google account and had your web history enabled – a record of your searches and sites that you selected from your results – future search results would be adjusted, or personalized, accordingly. And then we had (have) Google Searchwiki (see Begone Searchwiki http://www.rba.co.uk/wordpress/2008/12/11/begone-searchwiki/). Searchwiki – you have to opt-in for it – enables you to delete results from your search results, or move a result up or down in the list depending on how relevant you think it is. Your actions are saved and remembered when you next run the search.

The new Google personalized search is different. You do not have to be signed in to a Google account and by default it is switched on.  The claim is that Google is “helping people get better search results”:

“For example, since I always search for [recipes] and often click on results from epicurious.com, Google might rank epicurious.com higher on the results page the next time I look for recipes. Other times, when I’m looking for news about Cornell University’s sports teams, I search for [big red]. Because I frequently click on www.cornellbigred.com, Google might show me this result first, instead of the Big Red soda company or others.”

The customization is based on 180 days of search activity linked to an anonymous cookie in your browser. See the “Official Google Blog: Personalized Search for everyone” http://googleblog.blogspot.com/2009/12/personalized-search-for-everyone.html for further details.
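To show the general idea, here is a toy Python sketch of click-based re-ranking over a 180-day window. It is emphatically not Google's algorithm: the function name, the click log, the "bigredsoda.example" domain and the simple "most clicked first" rule are all assumptions made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def personalise(results, click_log, today, window_days=180):
    """Re-rank domains so that recently clicked ones come first.

    results: list of domains in their original order.
    click_log: list of (date, domain) tuples.
    """
    cutoff = today - timedelta(days=window_days)
    clicks = Counter(domain for day, domain in click_log if day >= cutoff)
    # sorted() is stable, so domains with equal click counts keep their order.
    return sorted(results, key=lambda domain: -clicks[domain])


today = date(2009, 12, 10)
click_log = [
    (date(2009, 11, 1), "www.cornellbigred.com"),
    (date(2009, 11, 20), "www.cornellbigred.com"),
    (date(2009, 1, 5), "bigredsoda.example"),  # outside the window, ignored
]
results = ["bigredsoda.example", "en.wikipedia.org", "www.cornellbigred.com"]
print(personalise(results, click_log, today))
```

Even in this crude form the site clicked twice in the last 180 days jumps from third place to first, while the click from January has no influence at all.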

This might sound at first to be a useful additional feature, but think it through. Let us say that in the run-up to Christmas your boss has asked you to look up recipes for chocolate desserts, cakes or puddings for the office party. When your results list comes up you repeatedly click on links for recipes or videos of how to make that extra complicated chocolate soufflé. In your regular day job, though, you are researching the pharmacological properties of the various compounds to be found in cocoa. Your searches are now starting to come up with some very odd results, but at least they will be on the same topic. For those of us who research a wide range of subjects Google’s personalized search is going to lose the plot very quickly.

There is then the question of which computer are you using? Do you always use the same computer at work or at home (we have three here)?  What are you going to see when you go to an Internet cafe? And what results will Google present you with if you are a CILIP member and use the IT facilities in the members’ information centre?

Whatever PC you use for your Google search, look in the top right hand corner of the results page. You should see an option for Web History:

Click on Web History and then Disable Customisations based on search activity:

When this first went live, I found that disabling the customisation was not saved from one session to the next. Today the setting seems to have been saved from my previous search session, but if you want to ensure that customisation has been disabled I would recommend that you check the setting at the start of every day.

Internet Search: a challenging and ever changing landscape

CILIP in the Thames Valley evening meeting

Date & Time: Tuesday 6th October 2009,  1800 for 1830 hrs
Location: Great Expectations, 33 London St, Reading

Google threatens to go hyper with its “caffeine” search. Bing is taking over Yahoo. Image search options are expanding: creative commons, colour, similar images. More specialist search tools for the “hidden web” are emerging and Web 2.0 is now an essential part of the search mix. Karen Blakeman will look at the new services that are being pushed out by the major search engines and the alternatives.

This is a free event followed by free refreshments and networking opportunities with colleagues.

An invitation is extended to anyone with a professional interest in the topic.

Contact: Norman Briggs, nwbriggs@pcintell.co.uk to advise attendance for catering purposes.