Tag Archives: Google

Google experiments with Image Swirl

Having made Google Image Options (including colour) and Similar Images available as part of their standard image search, Google are now playing around with Image Swirl in Google Labs. According to Google it “builds on new computer vision research to cluster similar images into representative groups in a fun, exploratory interface”. In practice it is a combination of similar images and the Wonderwheel.

One of my image test searches is Edvard Munch and Swirl came back with 12 thumbnails of stacked images (12 is the standard number of stacks):

Click on a group of stacked images and another set of images “swirl” into view in the form of the Wonderwheel:

And you can keep on clicking on groups/stacks of images but still keep the “history” of your selections.

I was pleasantly surprised by the clustering or stacking of the images. I thought that by the time I had reached ‘level 3’ of my browsing each stack would be just different versions of the same image or images with similar colour composition. My Edvard Munch level 3 selection, however, came up with a selection of landscapes with different colours. They did, though, seem to have similar ‘patterns’, for example paths, or what could be interpreted as paths, as a major component of the image.

Phil Bradley has also reviewed Google Swirl and comments “Bing is going to have their work cut out to try and catch up.” Far too polite, Phil. I’d say “Bing, eat your heart out!”

Google Swirl looks very promising and I shall be monitoring its progress with interest.

Twitter search in Bing and Google

Bing and Google have both announced that they have done a deal with Twitter that enables them to offer ‘real time’ Twitter searches. The Bing service is live now at http://www.bing.com/Twitter/. SearchEngineWatch has an overview of the service at Bing.com/Twitter: A Visual Tour. It looks impressive but as is so often the case with Bing the reality does not live up to expectations.

I have just returned from a conference on chemical information held in Sitges – hashtag #icic09. This should be an easy one for Bing Twitter to handle I thought. Silly me. Up came “We did not find any Twitter results or links for icic09”. I tried it with and without the hashtag – still nothing. And yet both search.twitter.com and www.twazzup.com had no problem finding tweets from the conference.

Bing Twitter results on #icic09

Twazzup results on #icic09

It also appears that you cannot search on a username. I then compared the results of searches on keywords and names that I knew had been tweeted at the conference: chemspider, chemspiderman, David Walsh, semantic mediawiki, markush. Nothing! It seems that the whole conference has been boycotted by Bing Twitter. I did begin to suspect that the service is not really up and running but searching on Nick Griffin came up with plenty of results and it found a tweet from one of my Twitter network about chickpea curry that had been posted a few minutes before.

There is something seriously wrong with Bing Twitter. Until they fix it and can present credible results I recommend that you give it a miss.

So what of Google’s offering? It isn’t live yet but there is useful discussion and comments on Google Social Search Is Coming & More On Google-Twitter. The main question for us as searchers is whether or not the Twitter search will be integrated into the standard web search or made available as a separate option. Tweets are already included in the web search as I discovered when I did a search on icic09 but they are spread out amongst the results. It would make sense to have a separate search tool such as Google’s Blogsearch. Another option would be to incorporate it into the side bar under “Show options” (See Google new search and display options).

Bing have yet again snatched defeat from the jaws of victory. As for Google Twitter, we shall just have to wait and see.

Presentation: Internet Search – a challenging and ever changing landscape

CILIP in the Thames Valley, 6th October 2009, Great Expectations, Reading

The presentation I gave to CILIP in the Thames Valley on 6th October is now available in a number of locations. At least one of these should be accessible through your firewall!

PowerPoint presentation – RBA web site
Slideshare
Authorstream
Slideboom

Some of the slides have annotations from my blog and new comments so make sure you check out the notes to the slides. Many of the slides are screen shots so they won’t make much sense without the notes or unless you were at the live presentation.

Google new search and display options

Some of you may have spotted that Google has introduced some excellent new search and display options. Many of you probably have not – the link to them is very discreet, almost as though Google does not want you to find out about them. Carry out a standard Google search and to the left just above your search results you will see a “Show options” link.

Click on ‘Show options’ or the plus sign and additional search and sort options will appear to the left of your search results.

At the top of the list you can choose to limit your search to videos, blogs, forums or reviews.

Below that are options to restrict your search to “recent results”, the past hour, past 24 hours, past week, past year or to your own specific date range. Not surprisingly the past hour, 24 hours and week pull up mostly blog postings and news articles. “Recent results” seems to pick results that go back about a couple of months.

As soon as you select any of the time options apart from the specific date range, additional options to sort by relevance or by date appear but the date option only sorts with most recent first. For some inexplicable reason sorting by date disappears if you want to specify your own range of weeks, months, or years; results are automatically sorted by relevance.

A word of warning about Google’s date sorting: the “date” of many of the web pages bears no relationship whatsoever to the real date of publication or when the content was actually written. In these cases Google is using the date and time stamp assigned to the page by the hosting web server. Most web sites have been revamped and reloaded at least once in their lifetime and some pages are dynamically created at the time of search. The dates of blog postings and news articles are a little more reliable, although there too you can find anomalies.

If you want to quickly identify articles that fall within a specific time period you may be better off selecting the Timeline but this seems to only include articles from Google Current News and Google Archive News. Also, the list of results below the Timeline graphic does not include every year. You have to click on the bar representing the required years and only then are all the articles displayed.

Related searches is obvious: this comes up with alternative search strategies that you might want to try. For me they would be far more useful displayed at the top of the standard search results rather than being hidden under  “Show options”.

The Wonder wheel is difficult to describe in words as it is a clustering and visualisation tool combined. Click on a link on the first wheel and a second pops up with a different set of clustered links for you to follow. Try it and see if it works for you.

“Images from the page” adds thumbnails of images found on the page next to the text entry in your results list.

The “More text” option gives you a larger extract from each of the pages in the results list making it easier for you to decide which are most relevant for your needs.

And if you are fed up with seeing shopping sites in your lists or perhaps want more, Google has thought of that as well. Simply click on “Fewer shopping sites” or “More shopping sites”. This works very well and reminds me of Yahoo’s Mindset experiment that allowed you to move a slider bar between research and shopping to change the emphasis of the results. Sadly, Yahoo never incorporated it into its standard search and abandoned the project a while ago.

Overall, Google has come up with a winner here. I would not want to use every option for every search so having a bar from which you can easily select and combine them is a great idea. It is a pity that Google has not made the additional options more obvious.

Internet Search: a challenging and ever changing landscape

CILIP in the Thames Valley evening meeting

Date & Time: Tuesday 6th October 2009,  1800 for 1830 hrs
Location: Great Expectations, 33 London St, Reading

Google threatens to go hyper with its “caffeine” search. Bing is taking over Yahoo. Image search options are expanding: creative commons, colour, similar images. More specialist search tools for the “hidden web” are emerging and Web 2.0 is now an essential part of the search mix. Karen Blakeman will look at the new services that are being pushed out by the major search engines and the alternatives.

This is a free event followed by free refreshments and networking opportunities with colleagues.

An invitation is extended to anyone with a professional interest in the topic.

Contact: Norman Briggs, nwbriggs@pcintell.co.uk to advise attendance for catering purposes.

Searching for images by colour

This is not a frequently asked question on my workshops but when it is raised by one of the participants it generates a great deal of interest amongst the rest. So far I have come across three tools that I would recommend trying.

The first is Exalead’s Chromatik, which is part of the Exalead Labs experimental area. This enables you to search a selection of Flickr images by colour and optionally by keyword. You first select one or more colours or hues from a palette which are added to a bar below the palette. You can adjust the proportions of  the colours in the photos by moving the separators between the colours in the bar. Luminosity can be toggled between bright and dark, and saturation between colourful and grey levels. The last option in the list is to search for specific images using keywords (I assume this searches the titles, tags and descriptions associated with the Flickr images). The implication is that once you have selected your colours you can then limit your search to particular objects. In practice, if you search for colour followed by keyword, Chromatik ignores your colour choices and searches only on your keywords. If, for example, you want to search for apples of a particular colour you must first search on apples and then pick your colours.

It pays to keep the number of colour choices to two or three, even if you require very specific colours, as this will give you a wider range of images to choose from. When the thumbnails are displayed you can hover over the best match and select “show images with same colors”. Click on an image and it is displayed full size, but in order to see further information about it you have to right click and select properties. This will give you a URL for the original image on Flickr but only for the image itself. It does not take you to the “full” Flickr page for the photo, which means that  you cannot check ownership and copyright.

The second tool is Multicolr Search Lab from Idée Inc. This uses “10 million of the most “interesting” Creative Commons images on Flickr”. As with Chromatik you select colours from a palette. You can select up to ten colours and click on the same colour several times if you wish to increase its prominence in the photo. Unfortunately there is no keyword search. On the plus side, if you find an image you like simply click on the image to go straight to its page on Flickr where you can double check the copyright situation.

And of course there is Google’s image search. Carry out a search on your keywords in Google images and above the results there is an option to select a colour. There are only twelve colours from which to choose and you can only select one but it works well enough. If you want to search only Creative Commons images then carry out the first stage of your search in the Advanced Image Search screen and select the appropriate option from the Usage Rights menu.

Google compiles industry stats for the UK – sort of

Google has launched a new page that pulls together industry stats for the UK. Google – Internet Stats, which is biased towards information on electronic and online services and products, gathers data from third party vendors many of which are priced. A list is available at the bottom of the Internet Stats page. You can, though, submit your own “killer fact”. All submissions are vetted by Google.

There are five categories: Technology, Macro Economic Trends, Media Landscape, Media Consumption and Consumer Trends. Each section has further sub-categories.

This is not the answer to a market/industry researcher’s prayer. The number of statistics is very limited and the search option only searches within the browsable statistics on the landing page. Do not expect to be able to search for and find data on, for example, UK chocolate consumption! If your query falls within one of the listed categories you may be in luck.

Exactly where Google is going with this and why they have introduced it is not clear. This is a UK-only initiative at present and there is no link to it from either the .com or .co.uk main Google search pages. Neither is it listed in Google Labs. Even the official announcement on “Google Barometer: New! Internet Stats all in one place” gives very little further information.

Searching for file types made easy

One of the Top 10 Tips that participants of my advanced search workshops regularly come up with is using file format options to focus your search. If you are looking for an expert on a topic, a conference presentation or a quick overview of a topic then seek out PowerPoint files; government and industry reports are often stored as PDFs; and substantial collections of statistics may be left in Excel format. Both Google and Yahoo have options for file type searches on their advanced search screens, but if you want a quick and easy way of searching both of these search tools for the four main file types (Word, Excel, PDF, PowerPoint), then head for DocJax.

Simply type your search terms into the box and DocJax will pull up a list of all four file formats in Yahoo and Google that contain your terms. You can then limit your search to just one file type by clicking on one of the four logos at the top of the list.

I have only one minor quibble with DocJax, which is that it does not deduplicate the results. Other than that, it is an excellent tool for filetype searching. Many thanks to Peter Guillaume for alerting me to the service.

If you prefer to search Yahoo and Google separately, then try Browsys Advanced Finder. Select Files from the menu at the top of the screen, enter your search terms and click on Yahoo or Google for your preferred file type. There is no need to re-enter your search terms for each search – just click your way through the list.

I usually berate such services for not including Bing (formerly Microsoft Live Search) in their lists because Bing does sometimes come up with unique content. Although not included in Bing’s advanced search options one used to be able to simply incorporate the filetype: command followed by the file extension in the search. On testing it today, though, I discovered that the filetype command no longer works in Bing. Like the link and linkdomain commands, it has been obliterated from their search system. Another example of Bing dumbing down their search. This does not bode well for Yahoo: as part of the recent Microsoft deal, Microsoft will power Yahoo search and as a result Yahoo will lose many of its current search features. I’m afraid that rather than stealing market share from Google, Bing’s current approach to search will encourage users to stay with the big G.
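For search tools that still accept the filetype: command, the query strings for the four main formats are easy to put together yourself. The following is only a minimal sketch in Python: the function name and the choice of the older doc, xls and ppt extensions are my own assumptions for illustration, not anything taken from DocJax or Browsys.

```python
# Extensions for the four main file types discussed above.
# Assumption: the older Office extensions are used; add docx, xlsx, pptx if needed.
FILE_TYPES = {"Word": "doc", "Excel": "xls", "PDF": "pdf", "PowerPoint": "ppt"}


def filetype_queries(terms: str) -> dict[str, str]:
    """Return one query string per file type, e.g. 'markush filetype:pdf'."""
    return {name: f"{terms} filetype:{ext}" for name, ext in FILE_TYPES.items()}


if __name__ == "__main__":
    # Print a ready-to-paste query for each format.
    for name, query in filetype_queries("semantic mediawiki").items():
        print(f"{name}: {query}")
```

Each string can then be pasted into the search box of any engine that supports the command.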

Free-to-use images might not be

You may have already read that Google now includes a creative commons license filter option in its Advanced Image search screen. Creative Commons is a series of licenses that can be applied to a variety of works such as images, video and PowerPoint presentations and they specify what you can and cannot do with those works. Information on the licenses can be found on the Creative Commons web site at http://www.creativecommons.org/.

Google does not use the CC terminology but has instead generated a pull down menu with the options: labelled for reuse, labelled for commercial reuse, labelled for reuse with modification, and labelled for commercial reuse with modification.

There is another option at the top of the list that is the default: not filtered by license. I had to think twice about this one because my first thoughts were that this was for public domain images. It is not. The “not filtered” option is all images. I ran the license options past a few people over the past week and they all immediately assumed that the default option is for images that you can use as you want.  A couple, though, then asked how “labelled for reuse” differed from this and then they became totally confused by the whole thing. To make it worse,  the licenses as listed by Google do not cover all the possible CC license conditions, for example attribution and share alike. So once you have done your search you still have to check the full license for the image that you wish to use. Furthermore, very few people are aware that you have to cite the license and any attribution as requested by the author.
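To see why the menu cannot tell you everything, it helps to set the six standard Creative Commons licenses against the two questions the menu asks: is commercial use allowed, and is modification allowed? The sketch below is my own illustration in Python. The license conditions are those published by Creative Commons, but the way they are matched to the menu options is an assumption on my part, not a description of how Google actually does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CCLicense:
    code: str           # short Creative Commons code
    commercial: bool    # commercial use permitted
    modification: bool  # derivative works permitted
    share_alike: bool   # derivatives must carry the same license


# All six licenses require attribution (the "BY" element).
LICENSES = [
    CCLicense("BY", True, True, False),
    CCLicense("BY-SA", True, True, True),
    CCLicense("BY-ND", True, False, False),
    CCLicense("BY-NC", False, True, False),
    CCLicense("BY-NC-SA", False, True, True),
    CCLicense("BY-NC-ND", False, False, False),
]


def licenses_for(need_commercial: bool, need_modification: bool) -> list[str]:
    """Return the license codes that satisfy the two yes/no requirements."""
    return [
        lic.code
        for lic in LICENSES
        if (lic.commercial or not need_commercial)
        and (lic.modification or not need_modification)
    ]


if __name__ == "__main__":
    print("reuse:", licenses_for(False, False))
    print("commercial reuse:", licenses_for(True, False))
    print("reuse, modification:", licenses_for(False, True))
    print("commercial reuse, modification:", licenses_for(True, True))
```

Even the most permissive combination leaves both BY and BY-SA in the list, so attribution and share alike still have to be checked on the original page.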

Google says in its help files:

“By returning these search results, Google isn’t making any representation that the linked content is actually or lawfully offered under a Creative Commons license. It’s up to you to verify the terms under which the content is made available and to make your own assessment as to whether these terms are lawfully applied to the content.”

The accuracy and validity of the Google implied license was raised recently in The Register: The tragedy of the Creative Commons. It comments:

“Since there’s no guarantee that the licence really allows you to use the photo as claimed, then the publisher (amateur or professional) must still perform the due diligence they had to anyway. So it’s safer (and quicker) not to use it at all.”

I disagree with that: I recommend using it as a first level filter but then check with the original web site regarding the details of the license. At least you won’t be spending hours wading through “all rights reserved” images.

If you do use the license filter you will notice that many of the photos come from Flickr, which is owned by Yahoo!. Yahoo! has had a Creative Commons filter on its Image Advanced Search screen for a long time but only on the US site, not the UK. A far better way of searching CC Yahoo images is to go straight into Flickr at http://www.flickr.com/creativecommons/. This gives you a description of the different licenses and you can search images assigned that license. This assumes, of course, that the person who has uploaded the image is the owner of that image and there are stories that this is not always so. But how paranoid do you have to be? With respect to Flickr my approach is to take the photographer’s word for it unless there are serious inconsistencies in their photostream, for example the metadata associated with the photos suggests that they were in Armenia, New Zealand and Peru on the same day!
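The “same day, three countries” test is simple enough to sketch in code. The Python below is purely illustrative: the photo records are made up, and I am assuming you have already extracted a date and a country for each photo by whatever means. It does not call Flickr or read real metadata.

```python
from collections import defaultdict
from datetime import date


def suspicious_days(photos, max_countries=1):
    """Group (date, country) pairs by day and return the days with too many countries."""
    countries_by_day = defaultdict(set)
    for taken_on, country in photos:
        countries_by_day[taken_on].add(country)
    return {
        day: sorted(countries)
        for day, countries in countries_by_day.items()
        if len(countries) > max_countries
    }


if __name__ == "__main__":
    # Hypothetical photostream: three countries on 1 July, one on 2 July.
    photostream = [
        (date(2009, 7, 1), "Armenia"),
        (date(2009, 7, 1), "New Zealand"),
        (date(2009, 7, 1), "Peru"),
        (date(2009, 7, 2), "Peru"),
    ]
    for day, countries in suspicious_days(photostream).items():
        print(day.isoformat(), countries)
```

A flagged day is only a prompt to look more closely; a photographer crossing a border legitimately would trip the same check.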

So where do you go for images that really are free to use? There is a trick you can use in Google to pull up just public domain images. Carry out your search on the standard Image search screen and when the results come up add

&as_rights=cc_publicdomain

to the end of the string in your browser address bar, and press enter. (Thanks to Barry Schwartz at Search Engine Land for this tip.) The test searches I have tried so far come up with photos from NASA, US government sites and Wikimedia Commons. NASA is a safe bet for public domain images, as are US government web pages; there are a few exceptions, but these are clearly labelled with any copyright restrictions. A recent spat between Wikimedia Commons and the UK’s National Portrait Gallery – National Portrait Gallery bitchslaps Wikipedia: Hands off our photos! – has thrown suspicion on the validity of CC and public domain licenses attached to Wikimedia Commons photos. This appears to have been an isolated incident, though, and the high resolution images have now been removed if you are accessing the site from the UK.
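If you would rather not edit the address bar by hand, the same parameter can be added with a few lines of Python. This is a minimal sketch using only the standard library; the example results URL is hypothetical and the function name is my own. The only part that comes from the tip itself is the as_rights=cc_publicdomain parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_public_domain_filter(url: str) -> str:
    """Return the URL with as_rights=cc_publicdomain added to its query string."""
    parts = urlsplit(url)
    # Keep the existing parameters but drop any earlier as_rights value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "as_rights"
    ]
    params.append(("as_rights", "cc_publicdomain"))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Hypothetical image search results URL, for illustration only.
    print(add_public_domain_filter("http://images.google.com/images?q=saturn"))
```

Parsing the query string rather than simply gluing text on the end avoids a doubled-up parameter if the URL already contains an as_rights value.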

Another source of public domain images is MorgueFile, which is a small database of high resolution photos but you may have to play around with your search terms before you find exactly what you want.

If you are looking for photos of buildings or locations in the UK then head straight for Geograph.  This aims to collect geographically representative photographs and information for every square kilometre of Great Britain and Ireland. Anyone can upload photos provided that they adhere to the guidelines and attach a Creative Commons Attribution-Share Alike 2.0 Generic license.

Geograph has saved me so much time. A few months ago I was trying to find a photo of the Great Expectations pub in Reading, Berkshire. Google, Yahoo and Live (now Bing) insisted on giving me photos of people reading a copy of Charles Dickens’s Great Expectations while sitting in a pub in Berkshire. The image I wanted was probably somewhere in the list but I was not prepared to trawl through hundreds of results to find it. I typed Great Expectations Reading into Geograph and I was there in a couple of seconds. Brilliant!

If you are interested in finding out more about finding and using images head for JISC Digital Media – Still Images.

Google lets you turn off SearchWiki

At long last Google now lets you turn off SearchWiki. If you don’t know what that is see my posting Begone Searchwiki. You first of all need to be logged into the Google account on which you enabled SearchWiki. Then go to Preferences and tick the SearchWiki box that says “Hide the ability to share, promote, remove, comment, or add your own results.” That’s the good news.

The bad news is that if you have promoted, demoted or deleted results from a search your changes will remain in place every time you log in to your Google account and run a search. You will have to re-enable SearchWiki, run a search and at the bottom of the results page click on “See all my SearchWiki notes”. From there you can undo all of your changes. Then go back into Preferences and disable SearchWiki.

For more details see Search Engine Land’s Google SearchWiki: You Can ‘Check Out,’ But Your Results Don’t Leave