There was a time when Google would aggregate pages from the same website in your search results. There might be just a couple of entries for the site, with a “More from…” link next to the result.
Alternatively, you might see a mini sitemap.
This has the advantage that you are not swamped with results from a single website but are given instead a variety of options that might provide you with a better answer to your question.
Not any more.
You may have noticed that multiple entries from a single website have started appearing in your results. For example, rather than just one Wikipedia entry you see four, five, six or even more. On the other hand, you might not have noticed anything at all: some of my colleagues are seeing this and some are not. Google tests new features and algorithms on a small percentage of its users to see how they react, so test features are not seen by everyone (see How Google makes improvements to its search algorithm – YouTube http://www.youtube.com/watch?v=J5RZOU6vK4Q). As far as I’m concerned this particular “improvement” is a disaster.
I was running a very general search on the use of biofuels by public transport in the UK. I just wanted to get an idea of some of the issues being discussed before refining my search and went, by default, to Google. My first screen had nothing but results from the UK government Department for Transport (DfT).
I scrolled down and saw more DfT pages. I scrolled down further and saw yet MORE DfT pages. OK, Google, so dft.gov.uk is a good place for me to look at biofuels in public transport. I get the message. STOP! There were 27 DfT pages in total flooding the top of my results page, which I have set to display 100 entries at a time. Creeping in at number 28 came the Guardian with 5 results.
The Friends of the Earth website had 7 results, and then at last I started to see more variety in my results at around number 40, but still with a lot of repetition.
Google may think that the DfT is a very important source of information on the topic but I want to decide whether or not to explore more of a particular site. Spamming my results list annoys me and makes me want to go elsewhere. So I did.
DuckDuckGo (http://www.duckduckgo.com/) is my main Google alternative and it came up with a decent and varied set of results without repetition, hesitation or deviation.
Bing (http://www.bing.com/) and Yandex (http://www.yandex.com/) came up with similar, non-repetitive results.
Blekko (http://www.blekko.com/) came up with some interesting alternative pages for me to consider. These would not have been that useful to me in the earlier stages of my research but this test confirmed my feeling that Blekko is good at pulling up information that explores more than the mainstream issues.
If you want to stay with Google, how do you deal with multiple listings from the same site? The most obvious approach is to add a ‘-site:’ command to your search to exclude that site, for example:
biofuels public transport -site:dft.gov.uk
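If you are comfortable with a little scripting, a minimal Python sketch along the following lines could add the exclusions for you. The function name exclude_sites and the list of blocked domains are my own illustrative assumptions, not anything provided by Google; the sketch only builds the query text, which you would then paste into the search box.

```python
def exclude_sites(query, blocked_sites):
    """Append a -site: term for each blocked domain to a search query.

    exclude_sites is a hypothetical helper written for illustration;
    it only assembles text and does not contact any search engine.
    """
    terms = [query.strip()]
    for site in blocked_sites:
        site = site.strip().lower()
        if site:  # skip empty entries
            terms.append(f"-site:{site}")
    return " ".join(terms)


# Example: rebuild the query shown above.
print(exclude_sites("biofuels public transport", ["dft.gov.uk"]))
```

Adding further domains to the list adds one ‘-site:’ term per domain, in the order given.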
If you are conducting in-depth research and are likely to be running many variations on a search, incorporating ‘-site:’ each time can become a chore. Google’s own browser, Chrome, has a Personal Blocklist extension that enables you to block selected sites from results (https://chrome.google.com/webstore/detail/nolijncfnkgaikbjbdaogikpmpbdcdef). Once it is installed, a block link appears next to each entry in your results. Click on the link to block the site from all future results. When a search would normally include pages from a blocked site, a message appears at the bottom of the results warning you about the exclusions.
The ‘show’ link displays and highlights the previously blocked pages and offers an option to unblock them.
Neither the -site: option nor the Blocklist approach should be necessary. There was nothing wrong with the previous ways of offering additional pages from a site in search results. It wasn’t broke but Google did break it by trying to fix it. For me, there are now several Google alternatives that produce quality results with less irritation. I shall be using them more in future.