Google has a pretty reliable algorithm for determining page rank on text searches. Legitimate and unscrupulous Search Engine Optimization schemes alike make all sorts of attempts to game the system, but those strategies always seem to fail in the long run. Put a search query into Google Images, though, especially with safe-search turned off, and you'll get a bizarre assortment of pictures to look at. That's because image searches generally just comb through the text associated with each image. It's why an ornithologist looking for pictures of birds with bright blue feet might see a different sort of boobies when searching for them. Now Google says it's got a software solution to more closely align Image Search Page Rank with reality.
The research paper, “PageRank for Product Image Search,” is focused on a subset of the images that the giant search engine has cataloged because of the tremendous computing costs required to analyze and compare digital images. To do this for all of the images indexed by the search engine would be impractical, the researchers said. Google does not disclose how many images it has cataloged, but it asserts that its Google Image Search is the “most comprehensive image search on the Web.”
The company said that in its research it had concentrated on the 2,000 most popular product queries on Google’s product search, words such as iPod, Xbox and Zune. It then sorted the top 10 images from both its ranking system and the standard Google Image Search results. With a team of 150 Google employees, it created a scoring system for image “relevance.” The researchers said the retrieval returned 83 percent less irrelevant images.
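For a sense of what a figure like that 83 percent means, here is a minimal sketch of the arithmetic, assuming it is a relative reduction in the number of images judged irrelevant. The function name and the counts are hypothetical, chosen only to make the math easy to follow; they are not taken from the paper.

```python
def irrelevant_reduction(baseline_irrelevant, new_irrelevant):
    """Fractional drop in irrelevant images relative to the baseline ranking."""
    if baseline_irrelevant <= 0:
        raise ValueError("baseline must contain at least one irrelevant image")
    return (baseline_irrelevant - new_irrelevant) / baseline_irrelevant


# Hypothetical counts of images judged irrelevant (not figures from the paper).
standard_search = 100  # irrelevant images returned by the standard ranking
new_ranking = 17       # irrelevant images returned by the new ranking

reduction = irrelevant_reduction(standard_search, new_ranking)
print(f"{reduction:.0%} reduction in irrelevant images")
```

With these made-up counts, going from 100 irrelevant images to 17 is an 83 percent reduction. The percentage alone says nothing about how many irrelevant images the standard results contained to begin with.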
The approach doesn't sound likely to scale to all the images on the web, but it's a good start. On another topic, the New York Times has forgotten the difference between "less" and "fewer," so it's official: writing standard English is dead.