NachoFoto vs. Other Existing Image Search Engines
When AltaVista, the first major image search provider, launched its image search engine back in 1998, the concept of displaying thumbnails in the search results became an instant hit among users hunting for images across the web.
Since then, many image search providers have appeared, most of them adopting one of the following technologies:
1) Text based: These image search engines crawl the web for images and index the text associated with each one, i.e., title, filename, "alt" attribute, etc. E.g., Google, Yahoo Image Search.
However, there are some major flaws in this technology:
a) Complete guesswork:
To detect who is actually in the image, the algorithms essentially make a guess based on the text surrounding it. The quality of the results therefore depends entirely on the quality of the textual content found around the image, which explains why the results from such engines are often inaccurate or unsatisfactory.
b) Stale results:
It usually takes weeks to index the entire web. This means that the image results are usually served from a stale database that can be many weeks old. And since sites keep modifying their content, clicking on a thumbnail will sometimes lead to a broken link, or to a page that no longer contains the desired image!
c) Revenue model:
Google has yet to discover a way to monetize its "Google Image Search". Approximately 18% of all Google searches are image searches, which makes image search a multi-billion dollar market!
2) Detecting the image itself: These image search engines use image recognition techniques to identify an image. The analysis is based on pixel-level information. This technology is still in the research stage and has yet to make its mark in the market.
E.g., like.com.
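To give a feel for what "pixel-level" analysis means, here is a minimal sketch of one very simple technique, an average hash. This is only an illustration and not how like.com or any other engine actually works; the function names and the tiny 4x4 grayscale "images" are made up for the example.

```python
def average_hash(pixels):
    """Return a bit string: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical 4x4 grayscale images (0 = black, 255 = white).
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# The same picture, slightly brightened.
brighter = [[min(255, value + 20) for value in row] for row in original]
# A different picture: top half bright, bottom half dark.
different = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

# A small distance suggests similar images; a large one suggests different images.
print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(different)))
```

The hash only looks at the pixels, never at the text around the image, which is the whole point of this approach. Real image recognition is of course far more involved than this.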
3) Detecting information provided by humans: Instead of using an automated web crawler, human power is used to collect and create content for forming a rich database; e.g., nachofoto.com.
Since humans supply both the images and the textual content associated with them, there is no guesswork involved in figuring out whose image it is. Users can also change the textual content to make the results more relevant to future search queries.
Unlike the traditional method, where crawlers have to crawl the web to fetch fresh data, which might take weeks, this approach only has to add the index of the newly submitted data to the existing index. The index can therefore be updated much more frequently.
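The idea of adding new submissions to an existing index can be sketched roughly as follows. This is a toy in-memory inverted index, not NachoFoto's actual implementation; the `ImageIndex` class, its methods, and the example.com URLs are all hypothetical names invented for the illustration.

```python
from collections import defaultdict


class ImageIndex:
    """A tiny in-memory inverted index: word -> set of image URLs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, image_url, description):
        """Index one newly submitted image without rebuilding anything."""
        for word in description.lower().split():
            self.postings[word].add(image_url)

    def search(self, query):
        """Return the images whose descriptions contain every query word."""
        words = query.lower().split()
        if not words:
            return set()
        results = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            results &= self.postings.get(word, set())
        return results


index = ImageIndex()
index.add("http://example.com/a.jpg", "Eiffel Tower at night")
index.add("http://example.com/b.jpg", "Eiffel Tower in the morning")

# A new submission arrives: only its own words are added to the index.
index.add("http://example.com/c.jpg", "London bridge at night")

print(index.search("eiffel tower"))
print(index.search("night"))
```

Each `add` call touches only the words of the new description, so the cost of an update depends on the size of the submission and not on the size of the whole index. That is why frequent updates are cheap here compared with a full re-crawl.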
Admittedly, the index won't be as large as one built by web crawlers. But how many of us go beyond page 5 of the search results? I think people are looking for quality more than quantity!
-Anuj