Category Archives: Search Engines

Anything but Google – Top Tips

This collection of Top Tips is a combined list nominated by those who attended the autumn and spring UKeiG workshops on “Anything but Google”. The participants came from all sectors and types of company, and included a couple of self-employed researchers. The sessions covered both general search tools and specialist services, and the list is an interesting mix of strategies and specific sites. A big “Thank you” to everyone who participated in the workshops.

1. Get to know the advanced search commands and options.
Google is not the only search tool that uses them and they can help focus your search, especially when using general search tools such as Bing.

2. If you are conducting serious research don’t stop with the first reasonable looking results.
Information of dubious quality can infiltrate even the most well respected of specialist websites. Put on your “skeptical goggles” as one delegate said! There are plenty of alternative tools and resources out there so get some corroboration from additional sources before acting on the information you find.

3. Allocate time for your search.
If you are carrying out in-depth research don’t leave it to the last minute. You will probably need to tweak your strategy and try different search tools to ensure that you are retrieving the best information. It can sometimes take longer than you anticipate.

4. Plan your strategy.
Think about the type of search you want to conduct and the type of information you are looking for. For example if you are carrying out a systematic review and want to use Boolean operators forget about Google; head for Bing instead. And if you need official statistics or company information go straight to specialist sites that provide that data.

5. Don’t stick with what you regularly use.
Experiment with other resources, especially if you suspect your default search tool is not telling you the whole story.

6. Country versions of search tools.
Many search tools offer country versions that give priority to the country’s local content, although that might be in the local language. This is a useful strategy when searching for industries, companies and people that are active in a particular country.

7. Learn when to try something else.
If a site’s navigation or internal search engine seems to be returning rubbish don’t struggle with it. Try another route to get to the information. Either try an alternative source of information or use the ‘site:’ command – available in Bing as well as Google – to search inside the site.
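As a rough illustration of the ‘site:’ route, the Python sketch below builds a search URL that restricts a query to a single site. The function name site_search_url, the example site and the search terms are invented for illustration, and the base URLs assume the public search address pattern that Bing and Google use at the time of writing, which may change.

```python
from urllib.parse import urlencode

# Base search URLs: an assumption, not something the search engines guarantee.
SEARCH_BASES = {
    "bing": "https://www.bing.com/search",
    "google": "https://www.google.com/search",
}


def site_search_url(engine: str, site: str, terms: str) -> str:
    """Return a search URL restricted to one site via the site: command."""
    query = f"site:{site} {terms}"
    # urlencode escapes the colon and turns spaces into plus signs.
    return f"{SEARCH_BASES[engine]}?{urlencode({'q': query})}"


print(site_search_url("bing", "ons.gov.uk", "household spending"))
```

Pasting the printed URL into a browser should run the same search as typing site:ons.gov.uk household spending into the search box.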

8. DuckDuckGo http://www.duckduckgo.com/.
This was recommended for its clean, straightforward layout and the range of resources it offers on a topic. A school librarian commented that the pupils at her school loved it.

9. MillionShort  http://millionshort.com/.
If you are fed up with seeing the same results from Google again and again give MillionShort a try. MillionShort enables you to remove the most popular web sites from the results. Originally, as its name suggests, it removed the top 1 million but you can change the number that you want omitted. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing.
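The idea of removing the most popular sites can be sketched in a few lines of Python. This is not how MillionShort itself works internally, which is not documented here; the popularity ranking, the result URLs and the function name drop_popular are all hypothetical.

```python
from urllib.parse import urlparse


def drop_popular(results, popular_domains, top_n):
    """Remove results hosted on any of the top_n most popular domains."""
    blocked = set(popular_domains[:top_n])
    kept = []
    for url in results:
        host = urlparse(url).hostname or ""
        # Treat www.example.com and example.com as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host not in blocked:
            kept.append(url)
    return kept


# Hypothetical popularity ranking, most popular first.
ranking = ["wikipedia.org", "bbc.co.uk", "amazon.co.uk"]
results = [
    "https://www.wikipedia.org/wiki/Thorium",
    "https://smallsociety.example/thorium-notes",
    "https://www.bbc.co.uk/news/science",
]
print(drop_popular(results, ranking, top_n=2))
```

Changing top_n mirrors the option to alter how many of the top sites are omitted.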

10. Carrot Search http://carrotsearch.com/

Carrot Search foam tree

This was recommended for its clustering of results and also the visualisations of terms and concepts via the circles and “foam tree”. There is a link to the live web demo on the left hand side of the home page.

11. Microsoft Academic Search – charts http://academic.research.microsoft.com/
This is a direct competitor to Google Scholar. The site can be slow to load and it sometimes assigns authors to the wrong institution. Nevertheless, the visualisations such as the co-author and citation maps can be useful in identifying who else is working in a particular area of research. The visualisations can be accessed by clicking on the Citation Graph image to the left of the search results or an author profile.

12. Creative Commons and public domain images.
Use the Bing license option (US version only) to search for images with Creative Commons or public domain licenses, but do go to the original webpage and check that the license is indeed associated with the image you want to use. Alternatively use one of the following:

Flickr  Creative Commons http://www.flickr.com/creativecommons
Flickr The Commons http://www.flickr.com/commons/
Wikimedia Commons http://commons.wikimedia.org/
MorgueFile.com  http://www.morguefile.com/
Geograph http://www.geograph.org.uk/
NASA http://www.nasa.gov/

13. Tineye Multicolr http://labs.tineye.com/multicolr/.
“Search 10 million Creative commons Flickr images by colour.” You can specify more than one colour and move the dividing bar between two colours to increase/decrease their prominence within the image. Click through to the original Flickr image to double check the license.

14. Company Check http://www.companycheck.co.uk/
Company Check repackages Companies House data and provides 5 years of accounts, and graphs for some financials free of charge. It also lists the directors of a company. Click on a director’s name and you can view other current and past directorships for that person. It provides more free information than Companies House but you have to register (free) to gain full access. Additional information such as credit risk, CCJs, credit reports, and many Companies House documents are priced or available as part of a subscription.

15. Guardian Data Store http://www.guardian.co.uk/data/
For datasets and visualisations relating to stories currently in the news. As well as the graphs and interactive maps the source of the data is always given and there are links to the original datasets that are used in the articles.

16. Zanran http://zanran.com/
This is a search tool for searching information contained in charts, graphs and tables of data and within formatted documents such as PDFs, Excel spreadsheets and images. Enter your search terms and optionally limit your search by date and/or format type. One delegate said “It has changed my life!”. (We think/hope she meant her working life.)

17. Keep up to date.
Keep up to date with what the search engines are up to, changes to key resources and new sites. Identify blogs and commentators that are relevant to your research interests and subject areas, and follow them using RSS or email alerts.

Compare Google and Bing results with Bingiton

Bingiton comparison results

Just over a year ago Bing launched a website called Bingiton (http://www.bingiton.com/), which enabled you to compare search results from Google and Bing side by side and then decide which set was best. You had to run five searches and then Bingiton told you which search engine you had chosen for each. After a couple of weeks the site was restricted to US users but it has now been relaunched in the UK.

The principle remains the same: you type in five searches, Bingiton displays the two sets of results side by side, and you decide which you prefer or go for the draw option (“can’t decide”).

I ran several batches of searches through Bingiton and Google won 4 of the rounds. The fifth, which consisted of searches for cake and jam recipes, was a draw with me being unable to decide. Two other rounds had to be declared null and void because “Scholarly articles” links (Google Scholar) appeared at the top of one set of results indicating that they were from Google. Another included what was obviously a Google map!


For me Bing seemed to be better at recipes and shopping enquiries than research oriented queries. Google consistently came out on top for local information and current news. Phil Bradley has also blogged about Bingiton (http://philbradley.typepad.com/phil_bradleys_weblog/2013/10/bings-bing-it-on-challenge-returns-to-the-uk.html) and invited people to comment on their own results. It is an interesting mix and Google does not always win or win outright. Take the Bingiton challenge yourself at http://www.bingiton.com/.

Google adds in-depth articles to results

Google is rolling out a new addition to search results called “In-depth articles” (http://insidesearch.blogspot.ca/2013/08/discover-great-in-depth-articles-on.html):

“To understand a broad topic, sometimes you need more than a quick answer. Our research indicates perhaps 10% of people’s daily information needs fit this category — topics like stem cell research, happiness, and love, to name just a few. That’s why over the next few days we’ll be rolling out a new feature to help you find relevant in-depth articles in the main Google Search results.”

The articles appear as a block of three at the bottom of your results, if you only display 10 results per page, or in the middle of the page if you display more. As Google says, they appear if your search is fairly broad and they do not appear for every query. I had to run several different searches before I found an example. At present it is only available in Google.com.

My search on thorium reactor started with a Wikipedia article at the top, which seems all too often to be the default.

 Search on thorium reactors top results

Further down the page was a block of three “in-depth” articles from Wired, Cosmos Magazine and Nature.

Google in-depth articles on thorium reactors

They do not appear at all if you use a Chrome Incognito window or your browser’s private browsing option. They also disappear if you apply Verbatim to your results.

How useful are these articles? They are certainly lengthy and in depth but only the one from Nature was fairly recent (December 2012). The one from Wired was published in 2009 and the Cosmos Magazine article appeared in 2006. I tried limiting my search to articles published in just the last year using Search Tools, Any time, Past year. The documents in the main results changed but the in-depth articles remained the same. The Nature article is highly relevant but there are more recent documents to be found than those from Wired and Cosmos. This raises the question as to how these articles are selected. I have not yet found any reliable information on how it is done, although Google’s Webmaster Central Blog has provided a checklist that may help get an article into the triumvirate (http://googlewebmastercentral.blogspot.co.uk/2013/08/in-depth-articles-in-search-results.html). The Moz Blog has run an analysis on 352 searches and found that the major news sources feature heavily (see http://moz.com/blog/inside-indepth-articles for further details).

In theory, in-depth articles are a good way to find an overview of a topic but do check the dates. They may be horrendously out of date.

Farewell AltaVista

Yahoo is finally pulling the plug on AltaVista on July 8th. It appears as a one line entry in Yahoo’s latest list of closures (http://yahoo.tumblr.com/post/54125001066/keeping-our-focus-on-whats-next) with the comment “Please visit Yahoo! Search for all of your searching needs”. AltaVista was started by Digital Equipment in 1995 and quickly became the default search engine for many of us. I still meet people who have remained loyal to AltaVista even though it lost its unique search features a long time ago. Danny Sullivan has written a short history and eulogy for the search engine at http://searchengineland.com/altavista-eulogy-165366 – “A Eulogy for AltaVista, The Google of its Time”. Great though it was, some of us had already defected to the Inktomi powered search engine HotBot by the time Google had arrived on the scene. Alas, HotBot is now a shadow of its former self and AlltheWeb, which Yahoo had also acquired, was closed down in April 2011.

I’ve unearthed the AltaVista chapter and summary that I wrote for an early edition of Search Strategies. The chapter is at http://www.rba.co.uk/search/altavista/AltaVistaChapter.pdf and the summary sheet at http://www.rba.co.uk/search/altavista/avsumm.pdf

Google – you can say “NO!”

Picture the scene: an obviously distressed researcher is hunched over a computer screen, sobbing hysterically. All they wanted was a list of donkey sanctuaries in Surrey. How difficult is that? But Google decided that what they really wanted was a field guide to identifying buttercups. Our researcher tries all the advanced search commands and options they know but to no avail. It seems that Google has locked them into its dreaded live experiments (1) with no possibility of escape, and the information is needed NOW.

There is hope, though. There are other search engines out there. Bing may seem consumer/retail focused, but its list of advanced search commands is great at unearthing serious research information that Google buries at around the 2 millionth entry in your results list. My comparison and summary of search commands at http://www.rba.co.uk/search/compare.shtml lists the Bing commands that you are most likely to need. Or if you just want a no nonsense summary of your topic without all of Google’s personalisation and experiments look no further than DuckDuckGo. But should you even be using Google or similar, generic search engines in the first place? Think about the type of information you are looking for.

For news, RSS feeds are still a great way to pull together updates from your favourite newspapers, blogs and websites. Google Reader is about to disappear into a black hole but there are other, better RSS readers out there. I use a desktop client called RSS Owl (http://www.rssowl.org/) but if that doesn’t suit you Phil Bradley has a list of alternatives on his blog at http://philbradley.typepad.com/phil_bradleys_weblog/2013/03/20-alternatives-to-google-reader.html. Or you could try a different approach: create a Twitter list of essential news sources, or use Paper.li to create daily “newspapers” using keyword searches or hashtags. See my own “daily” at http://paper.li/karenblakeman or the paper.li on biofuels at http://paper.li/karenblakeman/1321447614
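If you would rather pull headlines out of a feed yourself, the Python sketch below reads an RSS 2.0 document using only the standard library. The sample feed, its headlines and links, and the function name feed_items are made up for illustration; a real feed would be downloaded first and may use a different format such as Atom, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# A tiny, invented RSS 2.0 feed standing in for a real download.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example search news</title>
    <item>
      <title>First example headline</title>
      <link>https://blog.example/first</link>
    </item>
    <item>
      <title>Second example headline</title>
      <link>https://blog.example/second</link>
    </item>
  </channel>
</rss>"""


def feed_items(rss_xml, limit=10):
    """Return up to `limit` (title, link) pairs in feed order."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items[:limit]


for title, link in feed_items(SAMPLE_FEED):
    print(f"{title} ({link})")
```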

Interested in statistics and open data? Try the University of Auckland’s statistics portal (http://www.offstats.auckland.ac.nz/) or the Guardian’s Datastore (http://www.guardian.co.uk/data).

If you are looking for images Flickr.com is an obvious alternative. For photos you can re-use without fear of being dragged through the courts for copyright infringement try Geograph (http://www.geograph.org.uk/) or Morguefile (http://www.morguefile.com/).

And when it comes to free search tools for tracking down open access and research information there are dozens, some of which are listed at http://www.rba.co.uk/search/links.shtml#research.

These and many more are covered in my workshop “Anything but Google”, which is being held in Newcastle later this month. Further details are on the UKeiG web site at http://www.ukeig.org.uk/trainingevent/anything-google-karen-blakeman.

We may not be able to avoid Google completely but there are equally good, if not better, tools available. Take the first step and say “No” to Google.

(1) Just Testing: Google Users May See Up To A Dozen Experiments http://searchengineland.com/just-testing-google-searchers-may-see-up-to-a-dozen-experiments-141570

Search tools for research information – Kindle version

At last! I’ve managed to convert my article on “Free search tools for research information”  into a Kindle version (http://www.amazon.com/dp/B00C11XLVQ). It took me four attempts to get it right (and I hope it is indeed OK). The Amazon instructions are here, there and everywhere. Amazon’s general guide on producing a Kindle version is OK, but it’s the detailed stuff that is hard to find. The link I have given takes you to Amazon.com. If your “local” Amazon is different you’ll need to search for either the title or my name in the Kindle store.

How search works – sort of

Google has put together a site showing how Google search works (http://www.google.com/insidesearch/howsearchworks/thestory/). The main page is a scrolling animated graphic that just gives you some elementary facts but there are links to more detailed information and videos on the main topics of crawling and indexing, the searching and ranking algorithms, fighting spam and Google’s general policies. They are a useful set of pages for anyone who does not already know the basics of how Google works, but if you are looking for something that tells you how to get sensible results from Google you’ll be disappointed. As Phil Bradley says:

“…. boils down to ‘we find some stuff, do magic to it, filter out the crap that our magic didn’t get and then give it to you.’ Yes folks, an entire site to say that. Wasted opportunity.”

Top tips for finding research information

Free Search Tools for Finding Research Information

This week I was in Canterbury leading a workshop and discussion on Google and Google Scholar for finding research information. Although the emphasis was on Google we also covered other specialist tools designed to search for scientific and research information. We also had an interesting discussion on h-index, other citation indices and services such as ORCID and ResearchGate. The slides for the session are available on authorSTREAM (http://www.authorstream.com/Presentation/karenblakeman-1706478-google-scholar-research-information/), Slideshare (http://www.slideshare.net/KarenBlakeman/scholar-research-information) and temporarily at http://www.rba.co.uk/as/.

Anyone who has attended one of my workshops knows that I ask the group to propose their top tips at the end of the session. These are the Canterbury group’s top 10 tips.

1. What’s going on?
Try and find out what’s going on behind the scenes and how the different search tools work. For example, Google and Google Scholar are quite different in the way they manage your search. Understanding how they operate means that you can adapt your search strategy accordingly and also manage your expectations; for example Google Scholar does not use the publishers’ metadata so author and date searches are unreliable.

2. Personalisation and ‘unpersonalisation’
Google personalises your search based on past activity, who is in your social networks, and a whole host of other ‘stuff’. You can quickly ‘unpersonalise’ your results by using a separate browser window that does not use cookies or your web history as part of the search algorithm.

If you use Chrome as your browser, open what is called an incognito window. In the top right hand corner of your screen there is an icon with three lines. Click on it and from the drop down menu select New incognito window. Alternatively press Ctrl+Shift+N on your keyboard.

If you use Firefox, from the menu at the top of the screen select Tools followed by Start Private Browsing.

In Internet Explorer select Tools followed by InPrivate Browsing. If you cannot see InPrivate under Tools try looking under the Safety option.

3. Advanced search commands
Use Google advanced commands such as filetype: to focus on PDFs, presentations and spreadsheets containing data, and site: to look for information on just one site or a range of sites such as UK government. Although the advanced search screen has boxes for you to fill in for the commands, the file format or filetype option is limited. It does not include options for the newer Microsoft Office formats such as .pptx and .xlsx. Use filetype: as part of your search strategy, for example:

nasa dark energy dark matter filetype:pptx

Google Scholar commands are more limited – see slide 28 of the presentation.

4. intext:
Google automatically looks for variations on your terms and sometimes omits words from your search if it thinks the number of results is too low. Prefixing a term with intext: tells Google that it must be included in your search and exactly as you have typed it in. For example:

UK public transport intext:biodiesel statistics

tells Google that biodiesel must be included in the search and exactly as typed in.
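If you run the same kind of search repeatedly it can help to assemble the commands in a consistent way. The Python sketch below simply joins search terms with filetype:, site: and intext: commands into one query string. The function build_query and its parameter names are hypothetical, and the sketch only builds text for you to paste into the search box; it does not contact Google.

```python
def build_query(terms, filetype=None, site=None, must_include=()):
    """Join plain search terms with optional advanced search commands."""
    parts = [terms]
    # intext: asks for each word to be present exactly as typed.
    parts.extend(f"intext:{word}" for word in must_include)
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


print(build_query("nasa dark energy dark matter", filetype="pptx"))
print(build_query("UK public transport statistics", must_include=["biodiesel"]))
```

The first call reproduces the filetype: example from tip 3; the second produces a query equivalent to the intext: example, with the command placed after the plain terms.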

5. Reading Level
Use Reading level if Google is failing to return any research oriented documents for a query. Run the search and from the menu above the results select Search tools, then All results, and then from the drop-down menu Reading level. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to publications. It seems to involve an analysis of sentence structure, the length of sentences, the length of the document and whether scientific or industry specific terminology appears in the page.

6. Date options
In Google web search, use the date options in the menus at the top of the results page to restrict your results to information that has been published within the last hour, day, week, month, year or your own date range. Click on Search tools, then Any time and select an option. This works best with news, discussion boards, and blogs and web sites that use blogging software to generate pages, but Google is getting better at identifying the correct date of a web page.

Google Scholar handles publication dates differently. On the results page you can select a date range from the menu on the left hand of the page. Alternatively, you can run a Google advanced search and enter your publication years. However, Google Scholar looks for publication years in the area of the document where the date is most likely to be. As a result it may identify a page number or part of an author’s address as a year!
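To see why guessing a year from its position in a document goes wrong, consider the deliberately naive Python sketch below. It is an illustration only and makes no claim about how Google Scholar actually extracts dates; the sample text, the pattern and the function name candidate_years are invented.

```python
import re

# Any four-digit number starting 19 or 20 is treated as a possible year.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def candidate_years(text):
    """Return every number in the text that looks like a year."""
    return [int(match) for match in YEAR_PATTERN.findall(text)]


header = (
    "Example Journal 12(3), pp. 2001-2015. "
    "A. Author, 1950 Example Road. Published 2012."
)
print(candidate_years(header))
```

Only the last number is the publication year. The page range and the street number look just as plausible to a pattern like this, which is the kind of mistake described above.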

7. Google Scholar alerts
To be used with caution as the searches periodically stop without warning, and so have to be set up again, and they sometimes include documents that are several years old. Whatever your search you can set up an alert by selecting Create alert from the menu on the left hand side of the results page.

If the author has created a profile on Google Scholar, from their profile page you can follow new articles and/or new citations for that author. From past experience I warn you that this is not entirely reliable.

Google Scholar Follow Author

8. Metrics – top publications
Although it claims to search all scholarly literature, Google Scholar does not always cover all of the key journals in a subject area. There is no complete source list but there is a list of top publications for subjects and languages under the ‘Metrics’ link in the upper right hand corner of the Scholar home page.

9. Microsoft Academic Search – visualisations
Microsoft Academic Search (http://academic.research.microsoft.com/) is a direct competitor to Google Scholar. The site is sometimes slow to load and it often assigns authors to the wrong institution. Nevertheless, the visualisations such as the co-author and citation maps can be useful in identifying who else is working in a particular area of research. The visualisations can be accessed by clicking on the Citation Graph image to the left of the search results or author profile.

Microsoft academic search citation graph
Author Citation Graph


10. Mednar visual
Deep Web Technologies has developed in conjunction with various institutions a number of science and research specific portals, some of which are publicly available. The sources that they cover are different but they all have similar search and display options. Results are automatically ranked by relevance but this can be changed to date, title or author. In addition to the standard relevance ranked list of results the portals create clusters of topics on the left hand side of the screen. The topics include broad subject headings, authors, publications, publishers, and year of publication and are a useful tool for narrowing down a search. Some of the portals, such as Mednar (http://mednar.com/), offer a clickable ‘visual’ of topics and sub-topics.

Mednar Macular Degeneration Visual

Forthcoming workshops

I am running three workshops in April on business information and search. All three have a practical element so that you can try out resources and techniques for yourself.

Introduction to Business Research

This is being organised by TFPL and will be held in London on Thursday, 18th April. This course provides an introduction to many areas of business research including statistics, official company information, market information, biographical information and news sources. It will cover explanations of the jargon and terminology, regulatory issues, assessing the quality of information, primary and secondary sources. Further information is available on the TFPL web site at http://www.tfpl.com/services/coursedesc.cfm?id=TR1116&pageid=-9&cs1=&cs2=f

Business information: key web resources

This is also being organised by TFPL in London and is being held on Friday, 19th April. This workshop looks in more detail at the resources that are available for different types of information, alerting services and free vs. fee. It also covers search strategies for tracking down industry, market and corporate reports. Further information is available at http://www.tfpl.com/services/coursedesc.cfm?id=TR945&pageid=-9&cs1=&cs2=f

Make Google behave: techniques for better results

This is a very popular workshop and is being organised by UKeiG. It is being held in Manchester on Tuesday, 30th April.

Topics include:

  • How Google works
  • Recent developments and their impact on search results
  • How Google personalises your results and can you stop it?
  • How to use existing and new features to focus your search and control Google
  • How and when to use Google’s specialist tools and databases
  • What Google is good at and when you should consider alternatives

The workshop will be repeated in London on Wednesday, 30th October. Details and booking information are on the UKeiG website at http://www.ukeig.org.uk/trainingevent/make-google-behave-techniques-better-results-karen-blakeman

New Search Strategies articles

There are three new articles available in the subscribers area of Search Strategies:

Searching for research information: Institutional Repositories HTML article and PDF

Mendeley as a search tool for research papers. Available as an HTML article and PDF

Scirus. Available as an HTML article and PDF

Annual individual subscription rates are £48/year (£40 + £8 VAT). Multi-user and corporate rates are available on request. For further details contact Karen Blakeman publications@rba.co.uk.

To purchase a subscription go to http://www.rba.co.uk/search/purchase.shtml