Tales from the Terminal Room

September 2012, Issue No. 103


Please Note: This is an archive copy of the newsletter. The information and links that it contains are not updated.

Tales from the Terminal Room ISSN 1467-338X
September 2012, Issue No. 103
Editor: Karen Blakeman
Published by: RBA Information Services

Tales from the Terminal Room (TFTTR) is an electronic newsletter that includes reviews and comparisons of information sources; useful tools for managing information; technical and access problems on the Net; and news of RBA's training courses and publications. Many of the articles will have already appeared on Karen Blakeman's Blog at http://www.rba.co.uk/wordpress/

Tales from the Terminal Room can be delivered via email as plain text or as a PDF with active links. You can join the distribution list by going to http://www.rba.co.uk/tfttr/index.shtml and filling in the form. You will be sent an email asking you to confirm that you want to be added to the list. TFTTR is also available as an RSS feed. The URL for the feed is http://www.rba.co.uk/rss/tfttr.xml

In this issue:

  • Search tools
    • Search Strategies goes electronic
    • Million Short: unearthing stuff hidden in the dungeons of Google's results
    • Rediscovering BananaSlug for “long tail” search
  • Top search tips from North Wales
  • Business Information
    • Doing Business in the United Kingdom and France
    • Company information: Luxembourg and Belgium
  • Twitter Notes

Search tools

Search Strategies goes electronic

My publication “Search Strategies for the Internet” is changing and going electronic. There will be shorter articles covering specific search techniques and search engine features. In addition there will be screencasts and presentations. The new structure means that updates to the content will be easier and more frequent. Initially there will be no “book” but I may eventually combine some of the articles into a single publication.

As before, some information such as the fact sheets and Top Tips are available free of charge but the detailed information and screencasts will be available to subscribers only. See http://www.rba.co.uk/search/ for further details.

Annual individual subscription rates are £48/year (£40 + £8 VAT). Multi-user and corporate rates are available on request. If you don't want to commit to an annual subscription some of the articles will be available separately for purchase.

“How to make Google run the search you want” is already available in the subscribers' area and covers Verbatim, daterange, using the tilde and intext. Several topics are in preparation and to help me get an idea of what people are interested in I have set up a quick poll at http://www.rba.co.uk/wordpress/2012/10/14/search-strategies-goes-electronic. It would be great if you could rank the seven topics with '1' being the one you would like to see most. If there is a search topic or feature that you would like to see covered and it is not in the list use the comment box at the bottom of this posting. The topic with the most votes will be made available in the free content area.

Million Short: unearthing stuff hidden in the dungeons of Google's results

Fed up with seeing the same results from Google again and again? Wondering if that elusive document is buried somewhere at the bottom of Google's 2,000,000 hits? Then get thee hence to Million Short (http://millionshort.com/).

Million Short runs your search and then removes the most popular web sites from the results. Originally it removed the top 1 million, as its name suggests, but the default has changed to the top 10,000. The principle remains the same, though: exclude the more popular sites and you could uncover a real gem. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results. Million Short does not say what it uses for search results or how it determines which web sites are the most popular. According to Webmonkey: “Sanjay Arora, founder of Exponential Labs, tells Webmonkey that Million Short is using ‘the Bing API… augmented with some of our own data’ for search results. What constitutes a ‘top site’ in Million Short is determined by Alexa and Million Short's own crawl data.” (http://www.webmonkey.com/2012/05/million-short-a-search-engine-for-the-very-long-tail/).

Using Million Short is straightforward. Type in your search and select how many sites you want to exclude (top 10K, top million, top 100). The results page includes a list of the sites that have been removed and you can opt to add one or more back in. You can also block a site using a link next to it in the results or click on “Boost!” so that pages from the site go to the top.
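
Million Short does not publish how it works, so the following is only a minimal Python sketch of the general idea described above: drop results from the most popular sites, then let the user add a site back, block one, or boost one to the top. The function names, the popular_sites ranking and the example URLs are all hypothetical; this is not Million Short's code or API.

```python
from urllib.parse import urlparse


def site_of(url):
    """Return the host name of a URL, lower-cased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def filter_results(results, popular_sites, top_n,
                   add_back=(), blocked=(), boosted=()):
    """Filter a ranked list of result URLs.

    results       -- URLs in their original rank order
    popular_sites -- site names, most popular first (a made-up ranking)
    top_n         -- how many of the most popular sites to exclude
    add_back      -- excluded sites the user wants to see again
    blocked       -- sites the user never wants to see
    boosted       -- sites whose pages should move to the top
    """
    excluded = (set(popular_sites[:top_n]) - set(add_back)) | set(blocked)
    kept = [url for url in results if site_of(url) not in excluded]
    # Sites that actually appeared in the results and were taken out.
    removed = sorted({site_of(url) for url in results if site_of(url) in excluded})
    # Stable sort: boosted sites first, original order otherwise preserved.
    kept.sort(key=lambda url: site_of(url) not in boosted)
    return kept, removed


if __name__ == "__main__":
    results = [
        "http://www.wikipedia.org/wiki/Zeolite",
        "http://smallblog.example/zeolites",
        "http://bbc.co.uk/news/1",
        "http://nichelab.example/paper.pdf",
    ]
    popular = ["wikipedia.org", "bbc.co.uk", "smallblog.example"]
    kept, removed = filter_results(results, popular, top_n=2)
    print(kept)
    print(removed)
```

The sketch matches sites by exact host name only, so a subdomain such as en.wikipedia.org would not be caught; a real service would need something more careful.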

Million Short automatically tries to detect which country you are in but you can change it under “Manage Settings and Country”. I didn't notice much difference when I changed countries but then most of the queries I pass through Million Short tend to be scientific or technical. On the same page you can manage sites that you have blocked, added or boosted.

Does it work? I would not use it instead of the existing major search engines such as Google, Bing or DuckDuckGo but as an additional tool to surface material that is not easily found in the likes of Google. As well as web search there are image and news searches, but I'm not convinced that I'd find those all that useful.

If you are interested in comparing Million Short with Google try Million Short It On at http://www.millionshortiton.com/index.html . I had several goes at this and most of the results were a draw. That is no surprise as the searches I ran were very specific and I wanted to see if Million Short would pull up additional information, which it did. Million Short won outright on a couple and Google on one. The Google win was by default because Million Short did not come up with anything for comparison (the search in question was biofuels public transport carbon emissions).

There are a number of techniques that you can use to improve Google results, for example changing the order of the words in your search, Verbatim, filetype or Reading Level, but I would also recommend trying Million Short. The results should at least be different and may reveal vital information for your research.

Rediscovering BananaSlug for “long tail” search

I think it must have been seeing Phil Bradley the other night that made me think of revisiting BananaSlug.com (http://bananaslug.com/). I don't mean that Phil reminds me of a banana slug but he did introduce me to the search tool via his blog way back in 2005. I have been looking at ways of getting out of what I call “search ruts”. You keep seeing the same results again and again but suspect that there may be something more relevant if only you could get to it. Million Short (see above) is one way to tackle the problem. BananaSlug takes a different approach to what is known as long tail search. It adds a random term to your search and pulls up pages buried way down in the results list that you would probably never see. Just type in your search and then select a category, for example Animals, Great Ideas, Random Number, Themes from Shakespeare. BananaSlug then adds a random word from that category to your terms.
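
The idea of adding a random term to a search can be illustrated with a short Python sketch. The category names follow those mentioned above, but the word lists and the function name are invented for illustration; BananaSlug's own lists and code are not shown here.

```python
import random

# Hypothetical word lists; BananaSlug's real lists are not reproduced here.
CATEGORIES = {
    "Animals": ["otter", "heron", "badger"],
    "Colors": ["red", "green", "blue"],
    "Themes from Shakespeare": ["storm", "jealousy", "ambition"],
}


def long_tail_query(query, category, rng=random):
    """Append one randomly chosen word from the category to the query.

    Returns the new query and the word that was added, so that the
    searcher can see what changed.
    """
    word = rng.choice(CATEGORIES[category])
    return f"{query} {word}", word


if __name__ == "__main__":
    new_query, word = long_tail_query(
        "zeolites environmental remediation", "Themes from Shakespeare"
    )
    print(f"Added '{word}': {new_query}")
```

Each run may pick a different word, which is the point: the extra term nudges the search engine towards pages that would otherwise sit far down the results list.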

At first glance this approach to search may seem appropriate for frivolous, fun stuff only but I find that it works really well with serious research topics. Running one of my test searches on zeolites in environmental remediation through the categories pulled up information that could have taken me hours or even days to find otherwise.

Bear in mind that BananaSlug uses Google so synonyms and variations of the random word will be included in the search. When I selected Colors as my category red was added to my search and Google included reddish and reds.

Most of the categories came up with something useful although Random Number, inevitably for this type of search, came up with page numbers of journal articles. I didn't think Themes from Shakespeare would work but the random word it suggested was storm and there were several interesting papers on storm water management and treatment.

This may seem a bizarre way to explore search alternatives but if you are stuck for ideas give it a go.

Note: for more information on the banana slug (Ariolimax) see http://en.wikipedia.org/wiki/Banana_slug. The Pacific banana slug is the second-largest species of terrestrial slug in the world, growing up to 25 centimetres (9.8 in) long.

Top search tips from North Wales

August is usually a quiet month for me with respect to work. Time for a holiday away and then a couple of weeks ambling along the Thames Path or pottering around the garden. This year, though, as soon as I was back from my travels I was knuckling down and updating my notes for two search workshops in North Wales. Both were for the North Wales Library Partnership (NWLP), the first taking place at Coleg Menai in Bangor and the second at Deeside College. Both venues had excellent training facilities and IT, which meant we could concentrate on getting to grips with what Google is doing with search and experiment with different approaches to making Google do what we want it to do.

At the end of the workshops both groups were asked to come up with a list of Top 10 Tips. I've combined the two lists and removed the duplicates to generate the list of 16 tips below.

  1. Repeat one or more of your search terms one or more times
    Fed up with seeing the same results for your search? Repeat your main search term or terms to change the order of your results.
  2. Menus on left hand side of Google results pages
    Use the menus on the left hand side of the results page to focus your search and see extra search features. To see all of the options click on the ‘More’ and ‘More search tools’ links. The content of the menus changes with the type of search you are running, for example Image search has a colour option.
  3. Verbatim
    Google automatically looks for variations of your terms and no longer looks for all of your terms in a document. If you want Google to run your search exactly as you have typed it in, click on the ‘More search tools’ options at the bottom of the left hand menu on your results page and then on Verbatim at the bottom of the extended menu that appears.
  4. intext:
    Google's automatic synonym search can be helpful in looking for alternative terms but if you want just one term to be included in your search exactly as you typed it in then prefix the word with intext: . For example carbon emissions buses intext:biofuels flintshire. The command sometimes has the effect of prioritizing pages where your term is the main focus of the article.
  5. Advanced search screen and search commands
    Use the options on the advanced search screen or the search commands (for example filetype: and site:) in the standard search box to narrow down your search. A link to the advanced search screen can usually be found under the cog wheel in the upper right hand area of the screen. If you can't see a cog wheel or the link has disappeared from the menu go to http://www.google.co.uk/advanced_search. A list of the more useful Google commands is at http://www.rba.co.uk/search/SelectedGoogleCommands.shtml.
  6. Try something different
    Get a fresh perspective by trying something different. The two most popular alternatives during these workshops seemed to be DuckDuckGo (http://duckduckgo.com/) and Million Short (http://millionshort.com). Other search engines to try include Bing (http://www.bing.com/) and Blekko (http://blekko.com/).
  7. Use the country versions of Google for information that is country specific
    This will ensure that the country's local content will be given priority, although it might be in the local language. Useful for companies and people who are based in or especially active in a particular country, or to research holiday destinations. Use Google followed by the standard ISO two letter country code, for example http://www.google.de/ for Google Germany or http://www.google.no/ for Google Norway.
  8. Filetype to search for document formats or types of information
    For example PowerPoint for experts or presentations, spreadsheets for data and statistics, or PDF for research papers and industry/government reports. Note that filetype:ppt will not pick up the newer .pptx so you will need to include both in your search, for example filetype:ppt OR filetype:pptx . You will also need to look for .xlsx if you are searching for Excel spreadsheets and .docx for Word documents. The Advanced Search screen file type box does not search for the newer Microsoft Office extensions.
  9. Clear cookies
    Even if you are logged out of your Google account when you search, information on your activity is stored in cookies on your computer. These can personalise your results according to your past search and browsing history. Many organisations have set up their IT systems so that these tracking cookies are automatically deleted at least once a day or whenever a person logs in or out of their computer account. At home, your anti-virus/firewall software may perform the same function. If you want to make sure that cookies are deleted or want to control them manually “How to delete cookies” at http://aboutcookies.org/Default.aspx?page=2 has instructions on how to do this for most browsers.
  10. Looking for research papers?
    Google Scholar (http://scholar.google.com/) is one place to look but there may be additional material hidden somewhere on an academic institution's web site. Include advanced search commands, for example filetype:pdf site:ac.uk, in your search.
  11. For the latest news, comments and analysis on what is happening in an industry or research area carry out a Google blog search and limit your search by date. Simply run your search as usual in the standard Google search box. On the results page click on Blogs in the menu on the left hand side of the screen and then select the appropriate time option.
  12. site: and -site:
    Use the site: command to search within a single site or type of site. For example:

    2011 carbon emissions public transport site:statistics.gov.uk

    to search just the UK official statistics web site and

    asthma prevalence wales site:gov.uk OR site:nhs.uk

    to search all UK government and NHS web sites.

    If you are fed up with a site dominating your results use -site: to exclude it from your search.

    For example:

    Dylan Thomas -site:bbc.co.uk
  13. Reading level – from tourism to research
    Use this option in the menus on the left hand side of your results page to change the type of information. For example run a search on copper mines north wales . Then click on Reading Level in the left hand menus. Selecting “Basic” from the options that appear at the top of the results gives you pages on tourism and holiday attractions. “Advanced” gives you research papers, journal articles and mineral databases. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to books. It could involve sentence structure, grammar, the length of sentences on a web page, the length of the document, the terminology used and doubtless many other criteria.
  14. Google.com
    Apart from presenting your search results in a different order Google.com is where Google tries out new features. As well as seeing pages that may not be highly ranked in Google.co.uk you will get an idea of how Google search may look in the UK version in the future.
  15. Numeric range search
    Use this for anything to do with numbers – years, temperatures, weights, distances, prices etc. Use the boxes on the Advanced Search screen or just type in your two numbers separated by two full stops as part of your search. For example: world oil demand forecasts 2015..2030
  16. An understanding of copyright is important if you intend to re-use information found on the web and absolutely essential if you are going to use images. Creative Commons licences clearly state what you can and can't do with an image but they are not all the same. The list at Creative Commons http://creativecommons.org/licenses/ outlines the terms and conditions. “FAQs – Copyright – University of Reading” at http://www.reading.ac.uk/internal/imps/Copyright/imps_copyrightfaqs.aspx gives some guidance on copyright but if in doubt always ask! An example of what can happen if you get it wrong is demonstrated by “Bloggers Beware: You CAN Get Sued For Using Pics on Your Blog” http://www.roniloren.com/blog/2012/7/20/bloggers-beware-you-can-get-sued-for-using-pics-on-your-blog.html
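
Several of the commands in the tips above (intext:, filetype:, site:, -site: and the numeric range) are simply text typed into the search box, so they are easy to assemble in a script if you run the same kind of search regularly. The Python sketch below is one possible way of doing that. The helper names are my own invention, and the /search?q= address used in search_url is an assumption about how Google accepts a query in a URL rather than something covered in the workshops.

```python
from urllib.parse import urlencode


def any_filetype(*extensions):
    """filetype:ppt OR filetype:pptx, to catch old and new Office formats."""
    return " OR ".join(f"filetype:{ext}" for ext in extensions)


def any_site(*sites):
    """site:gov.uk OR site:nhs.uk, to search several sites or types of site."""
    return " OR ".join(f"site:{site}" for site in sites)


def exclude_site(site):
    """-site:bbc.co.uk, to remove a site that dominates the results."""
    return f"-site:{site}"


def exact_term(term):
    """intext:biofuels, to include one term exactly as typed."""
    return f"intext:{term}"


def number_range(low, high):
    """2015..2030, two numbers separated by two full stops."""
    return f"{low}..{high}"


def build_query(*parts):
    """Join the search terms and commands with single spaces."""
    return " ".join(str(part) for part in parts if part)


def search_url(query, base="http://www.google.co.uk/search"):
    """Turn a query into a URL (assumed form: base?q=encoded-query)."""
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_query("asthma prevalence wales", any_site("gov.uk", "nhs.uk")))
    print(build_query("Dylan Thomas", exclude_site("bbc.co.uk")))
    print(build_query("world oil demand forecasts", number_range(2015, 2030)))
    print(build_query("biofuels public transport", any_filetype("ppt", "pptx")))
    print(search_url(build_query("Dylan Thomas", exclude_site("bbc.co.uk"))))
```

The first three printed queries are the same as the examples given in tips 12 and 15. Options that live in the results page menus, such as Verbatim and Reading Level, are not covered by this sketch.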

Business Information

Doing Business in the United Kingdom and France

Compiled and published by Bryan Cave LLP, Doing Business in the UK (http://www.bryancave.com/files/Uploads/Documents/DoingBusiness-UK2012.pdf) is an excellent summary of what is involved in setting up a business in the UK and the associated legislation. As well as describing the various types of company it also covers directors' duties, UK taxation, employment law, business immigration, intellectual property, data protection and competition law. There is a similar publication on Doing Business in France (http://www.bryancave.com/files/Uploads/Documents/DoingBusiness-France2012.pdf). Both are free of charge.

Company information: Luxembourg and Belgium

I am updating the official registries section of my business sources listings (http://www.rba.co.uk/sources/registers.htm) and there are changes to the entries for Luxembourg and Belgium.

The Registre de Commerce et des Sociétés – Accueil (http://www.rcsl.lu/) is the official register of companies and associations in Luxembourg. The search options are limited to company name or number and the interface is in German and French. Searching is free of charge, as are company names, addresses and contact details. Documents are priced.

Legilux Sociétés et Associations has more search options at http://www.legilux.public.lu/entr/search/index.php and it provides a history of the documents filed by a company. This is a free service but for the documents themselves you have to go back to the Registre de Commerce et des Sociétés where there is a charge per document.

In Belgium the KBO Public Search (http://economie.fgov.be/nl/ondernemingen/KBO/Pubd/PuS/) enables you to search for public information on every registered active enterprise and establishment. Search by company number, name, branch number or name, address and municipality. Each record provides name, company number, activities, address, contact details and links to other sites for official documents and annual reports. The search interface is available in Dutch and French. The information is in Dutch or French and is free.

The CBSO (Central Balance Sheet Office) section of the National Bank of Belgium (http://www.bnb.be/) has the accounts of companies, associations and foundations active in Belgium. The search interface is available in Dutch, French, German and English.

Many thanks to Inez de Bois for the information and updates.

Twitter Notes

The following are some of my recent tweets and retweets. They are selected because they contain links to resources or announcements that may be of general interest. I have unshortened the shortened URLs.

September 4th

Karenblakeman: Market research just asked me when my household first got Internet access. It was 1992. Earliest option in researcher's script is 1997 #fail

September 6th

Karenblakeman: “HootSuite acquires Seesmic, Seesmic customers to be transitioned to HootSuite” VentureBeat http://venturebeat.com/2012/09/05/hootsuite-acquires-seesmic-seesmic-customers-to-be-transitioned-to-hootsuite/

MT @AlisonMcNab: RT @dmuleicester: Digital detective will save thousands of research hrs by tracking historic photos http://www.dmu.ac.uk/about-dmu/news/2012/september/digital-detective-will-save-thousands-of-research-hours-by-tracking-down-historic-photos.aspx

September 9th

Karenblakeman: Google Testing Search Options Listed Above Results, Rather Than To Side : http://searchengineland.com/google-testing-search-options-listed-above-results-rather-than-to-side-132531 via @sengineland

September 12th

Karenblakeman: It was rubbish anyway RT @LocusCommunis: As of September 30th, Google Reader will be turning off track changes http://www.confidentialresource.com/2012/09/10/where-there-is-no-rss/

September 16th

Karenblakeman: Google to stop supporting IE8 in November. Google Apps update alerts – Explorer 8 support discontinued http://googleappsupdates.blogspot.co.uk/2012/09/supporting-modern-browsers-internet.html

Karenblakeman: RT @stephendale: Useful for Twitter newbies: The Guide to Twitter's Language http://thenextweb.com/twitter/2012/09/15/a-list-twitters-language/

September 18th

Karenblakeman: MT @librarygirlknit RT @DrSustainable Can librarians trust resources found on Google Scholar? Yes …and no. http://blogs.lse.ac.uk/impactofsocialsciences/2012/09/17/can-science-students-and-researchers-trust-resources-found-on-google-scholar-yes-and-no/

September 20th

Karenblakeman: Thanks to @awareci for this. “Close your Facebook account” http://www.youtube.com/watch?v=3sThcwmx3rs&feature=youtu.be Spoof alert: note the onion news network logo. [Love the comment about Twitter: “400 billion tweets and not one useful bit of data was transmitted”]

September 25th

Karenblakeman: Power Searching with Google video tutorials available again at http://www.powersearchingwithgoogle.com/

September 26th

Karenblakeman: RT @JamesFirth: Murdoch backs down in war with ‘parasite’ Google, Times will be indexed. (last laugh: Google refuses to index??) http://www.telegraph.co.uk/finance/newsbysector/mediatechnologyandtelecoms/9566353/Rupert-Murdoch-backs-down-in-war-with-parasite-Google.html

karenblakeman: RT @FocusOnInfo: Happy Birthday, DuckDuckGo: Homegrown Search Engine Turns 4 http://searchenginewatch.com/article/2208454/Happy-Birthday-DuckDuckGo-Homegrown-Search-Engine-Turns-4

September 28th

karenblakeman: RT @awareci: Google Carousel - a roundabout of images but not for all searches http://awareci.com/2012/09/28/google-carousel-a-roundabout-of-images-but-not-for-all-searches/

TFTTR Contact Information

Karen Blakeman, RBA Information Services
Twittername: karenblakeman (http://twitter.com/karenblakeman)
UK Tel: 0118 947 2256, Int. Tel: +44 118 947 2256
UK Fax: 020 8020 0253, Int. Fax: +44 20 8020 0253
Address: 88 Star Road, Caversham, Berks RG4 5BE, UK


TFTTR archives: http://www.rba.co.uk/tfttr/archives/index.shtml

Subscribe and Unsubscribe

To subscribe to the newsletter fill in the online registration form at http://www.rba.co.uk/tfttr/index.shtml

To unsubscribe, use the registration form at http://www.rba.co.uk/tfttr/index.shtml and check the unsubscribe radio button.

Privacy Statement

Subscribers' details are used only to enable distribution of the newsletter Tales from the Terminal Room. The subscriber list is not used for any other purpose, nor will it be disclosed by RBA or made available in any form to any other individual, organisation or company.

Creative Commons License
Tales from the Terminal Room by Karen Blakeman is licensed under a Creative Commons Attribution-Non-Commercial 3.0 Unported License.
For permissions beyond the scope of this license contact Karen Blakeman by email at karen.blakeman@rba.co.uk or via http://www.rba.co.uk/about/contactkb.htm.

This page was last updated on 14 October 2012 Copyright © 2012 Karen Blakeman.