Business Today

HELP
How Do You Set Up A Search-Engine?

A primer on how to start up biz on the Net. First up: search-engines.

By Rakhi Mazumdar

Your search for a successful e-Biz model could end here. Then again, so could your hope for one. With 85 per cent of the Net traffic making at least 1 trip to a search-engine every time it logs on, such a service is brimming with potential when it comes to capturing eyeballs-and, by extension, advertising. Unfortunately, it is also brimming with competition. At the last count, there were at least 1,000 search-engines on the Net, covering both varieties of the product: all-purpose search-engines like AltaVista, Hotbot, Yahoo!, Excite, Lycos, or Alltheweb; and services focused on specific subjects: Medhunt, which deals with medical information only; Inomics, which is devoted to economics; or Lawcrawler, which concentrates on legal affairs.

To be a winner in this game-where the aim is to provide a search service that is so effective that it attracts enough people for you to sell advertising and make money-you must differentiate your service dramatically. The primary route: technology. Few e-Biz ventures are as driven by technology as search-engines.

So, either you have to create this technology for yourself, which means writing the necessary programs, or buy an off-the-shelf package, such as those sold by Inktomi, Excite, Infoseek, or Xyindex. Buying such a technology means using the entire index of Webpages compiled by that service as well as its searching method-paying on the basis of usage. But you can mix-and-match since there are 3 components to the technology, and customisation is necessary to make your search-engine different from those of your competitors.

The first is the agent that trawls the Web, recording the contents of every Webpage. This is an enormous task since the Web consists of some 330 million different pages at present. So, you have a choice as to just how you want your agent-termed a robot or a spider-to pick the pages it will scan. One option is to start at the beginning of an alphabetical list of all URLs and systematically scroll through them all.

A second is to begin with a shorter list, and follow the links on each of them to store the contents of a continuously-increasing number of pages. A third: invite webmasters to submit their URLs, and then scan the contents of those pages. A fourth is to focus on sites with specific contents, and limit your trawling to those-the result of which will be a specialised search-engine dedicated to specific subjects like, say, India or Hollywood films.
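The second option, starting from a short list and following links, can be sketched in a few lines. This is a minimal illustration, not how any named engine works: the miniature Web (`TOY_WEB`), the function names, and the `max_pages` limit are all assumptions made for the example, and a real spider would fetch pages over the network instead of reading a dictionary.

```python
from collections import deque

# Hypothetical miniature "Web": each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/": ["b.example/", "c.example/"],
    "b.example/": ["c.example/", "d.example/"],
    "c.example/": ["a.example/"],
    "d.example/": [],
}

def toy_fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return TOY_WEB.get(url, [])

def crawl(seed_urls, fetch_links, max_pages=100):
    """Visit pages breadth-first, starting from a short seed list."""
    seeds = list(dict.fromkeys(seed_urls))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited

pages = crawl(["a.example/"], toy_fetch_links)
```

Starting from a single seed, the spider reaches all four pages of the toy Web; the `seen` set is what stops it from looping when pages link back to each other.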

The second component converts the contents of different pages recorded by the robots into gigantic indices. Different technologies use different forms of indexing, which is where one search-engine can vary from another. The differentiation can be built-in, in the form of speed-the sooner a search-engine technology can convert a spidered page into an indexed page, the quicker the fruits of the robot's efforts will be available to your customers.
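One common form of index is the inverted index, which maps each word to the pages that contain it. The sketch below is only an assumption about what such an index might look like in its simplest form; the function names and sample pages are hypothetical, and commercial engines use far more elaborate structures.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical spidered pages: URL -> recorded text.
sample_pages = {
    "films.example/": "Hollywood films and reviews",
    "india.example/": "India news and films",
}
index = build_index(sample_pages)
india_films = search(index, "India films")
```

Because the index is built once and then merely looked up, answering a query does not require rereading any page.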

It is in the third and final stage-the way in which you set up your search process through the index-that your engine will stand out from others. For, depending on how you configure your searching option, running a search for a particular word, phrase, or subject will throw up sharply different results-especially in terms of the order in which the Webpages containing those words are listed. The menu of strategic options?

THE POPULARITY STRATEGY. Adopted by search-engines like Direct Hit, this method monitors the pages that surfers click on after running searches for particular words or phrases on other search-engines. It then uses these figures to sort the results so that pages with the most clicks appear on top. Google applies a related idea, ranking pages by link popularity rather than by clicks.
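Sorting by clicks can be illustrated roughly as follows. The click log format, a list of (query, clicked URL) pairs, and the function name are assumptions for this sketch; the article does not describe how any engine actually stores or weighs its click data.

```python
from collections import Counter

def rank_by_clicks(click_log, query, candidates):
    """Order candidate URLs by how often they were clicked for this query."""
    clicks = Counter(url for q, url in click_log if q == query)
    # sorted() is stable, so pages with equal clicks keep their original order.
    return sorted(candidates, key=lambda url: -clicks[url])

# Hypothetical log of (query, clicked URL) pairs.
click_log = [
    ("jazz", "a.example/"),
    ("jazz", "b.example/"),
    ("jazz", "b.example/"),
    ("rock", "a.example/"),
]
ranked = rank_by_clicks(click_log, "jazz", ["a.example/", "b.example/", "c.example/"])
```

Clicks recorded for other queries ("rock" here) are ignored, so a page popular for one subject does not automatically rise for another.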

THE COMPREHENSIVENESS STRATEGY. Visible at AltaVista and FAST, the method involves the use of raw power to search the entire index-which aims at including as many Webpages as possible-and then present the results in an order that depends on how close to the title or to the top of the text in each page the word or phrase being searched for appears.
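Ordering results by how near the title or the top of the text a word appears might look something like this. The scoring rule (a title match beats any body match, and earlier body matches beat later ones) is an assumption chosen to mirror the description above, not the actual formula of AltaVista or FAST.

```python
def position_score(title, body, term):
    """Lower is better: 0 for a title match, else 1 + offset in the body."""
    term = term.lower()
    if term in title.lower():
        return 0
    pos = body.lower().find(term)
    return float("inf") if pos == -1 else 1 + pos

def rank_by_position(pages, term):
    """Return URLs containing the term, best-placed matches first."""
    scored = [
        (position_score(title, body, term), url)
        for url, (title, body) in pages.items()
    ]
    return [url for score, url in sorted(scored) if score != float("inf")]

# Hypothetical indexed pages: URL -> (title, body text).
sample_pages = {
    "x.example/": ("Jazz Guide", "about music"),
    "y.example/": ("Music", "all about jazz clubs"),
    "z.example/": ("Music", "jazz history"),
    "w.example/": ("Cooking", "recipes"),
}
ranked = rank_by_position(sample_pages, "jazz")
```

Pages that do not contain the word at all are dropped, and ties are broken alphabetically by URL.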

THE QUESTION-AND-ANSWER STRATEGY. Used by sophisticated engines like Ask Jeeves and Electric Monk, the objective of this strategy is to allow searchers to type in questions and, then, provide the answers. The technologies vary: Ask Jeeves compares every query to a vast database of questions and Webpages with the answers, and matches new questions to these. Electric Monk translates the query into a complex search that can be run on AltaVista, and provides the answers.

THE PAY-FOR-VIEW STRATEGY. Search-engines like GoTo and even AltaVista use a simple policy: Webpages whose owners pay a fee are listed near the top of every search for words or phrases in these pages.

THE HUMAN-COMPILATION STRATEGY. People who run search-engines like Snap and Open Directory believe automated searches for keywords or phrases on Websites produce too many irrelevant results. Only an index of pages read and classified by humans can be used to offer results to surfers in search of specific subjects. The USP: every page that is thrown up by a search will have something the searcher is really looking for.

THE METASEARCH STRATEGY. Adopted by search-engines like Dogpile, it runs the search on every other major search-engine, and offers results from each. The benefit to customers: a one-stop shop for all results.
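In outline, a metasearch service passes the query to several engines and collects what each returns. In the sketch below the engines are stand-in functions with made-up names and results; recording which engines returned each page is a design choice for the example, not a description of Dogpile's method.

```python
def metasearch(query, engines):
    """Run the query on each engine and note which engines returned each URL."""
    merged = {}
    for name, search_fn in engines.items():
        for url in search_fn(query):
            merged.setdefault(url, []).append(name)
    return merged

# Hypothetical engines: each is a function from a query to a list of URLs.
engines = {
    "engine_a": lambda query: ["u1.example/", "u2.example/"],
    "engine_b": lambda query: ["u2.example/", "u3.example/"],
}
results = metasearch("jazz", engines)
```

A page returned by more than one engine appears only once, with every source listed against it.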

Whether you are setting up a horizontal or a vertical search-engine, you don't have to limit yourself to one of these options. For instance, in a search for a word on Hotbot, the first 10 results come from Direct Hit's popularity-based listing, but number 11 onwards come from Inktomi's indices. Using a combination of technologies can ensure that different customer-segments get the results they are looking for.
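A combination of this kind, with the top of the list drawn from one source and the remainder from another, can be sketched as below. The function name and the removal of duplicates are assumptions; the article does not say how Hotbot handles a page that appears in both lists.

```python
def blend(primary, secondary, top_n=10):
    """Take the first top_n results from primary, then fill from secondary."""
    head = primary[:top_n]
    taken = set(head)
    # Assumption: skip secondary results already shown in the head.
    tail = [url for url in secondary if url not in taken]
    return head + tail

# Hypothetical result lists from two different technologies.
popularity_results = ["p1.example/", "p2.example/", "p3.example/"]
index_results = ["p2.example/", "s1.example/", "s2.example/"]
blended = blend(popularity_results, index_results, top_n=2)
```

With `top_n=2`, the first two entries come from the popularity list and the rest from the index list, minus the page already shown.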

Be warned, though, that unless you can develop your own distinctive search technology, or find a way to extend the universe that you search, it is the way in which you package your search results that will give you a distinctive positioning. A classic example of such innovation: Egosurf, which specialises in pulling out references on the Web to an individual's name-and then updates the results every day for a period of 7 days and mails them to the searcher so that she does not have to keep running the search repeatedly for new results. It is, probably, by identifying such specialised needs-whether in terms of specific subjects, or in terms of unique service offerings-that the search-engine-based business model has its best chance of success tomorrow.

THE WEB OF SEARCH-ENGINES

AltaVista: www.altavista.com/
One of the largest search-engines on the Web in terms of pages indexed.

Ask Jeeves: www.askjeeves.com/
A human-powered service that directs searchers to the Webpage that answers questions.

Direct Hit: www.directhit.com/
Monitors what users of other engines click on from their results and ranks them accordingly.

Excite: www.excite.com/
Offers a medium-sized index, but integrates material like company information and sports scores.

FAST Search: www.alltheweb.com/
Aims to index the entire Web, and was the first to break the 200-million page index milestone.

Go/Infoseek: www.go.com/
Consistently provides quality results thanks to its ESP search algorithm.

GoTo: www.goto.com/
Companies can pay to be placed higher in the search results, which GoTo feels improves relevance.

Google: www.google.com/
Makes heavy use of link popularity as a primary way to rank Websites.

HotBot: www.hotbot.com/
First page of results comes from the Direct Hit service, and secondary results from Inktomi.

Inktomi: www.inktomi.com/
Powers several other services, but provides ways for its partners to distinguish themselves.

LookSmart: www.looksmart.com/
It is the closest rival to Yahoo! in terms of being a human-compiled directory of the Web.

Lycos: www.lycos.com/
Started out as a search-engine, but shifted to a directory model similar to Yahoo!'s.

Northern Light: www.northernlight.com/
Features the largest index of the Web along with the ability to cluster documents by topic.

Open Directory: dmoz.org/
Uses volunteer editors to catalogue the Web, and has an open licence arrangement.

Snap: www.snap.com/
A human-compiled directory of Websites, supplemented by search results from Inktomi.

WebCrawler: www.webcrawler.com/
Has the smallest index of any major engine, but provides less overwhelming results to general searches.

Yahoo!: www.yahoo.com/
The largest human-compiled guide to the Web, employing about 150 editors to categorise the Web.

 

India Today Group Online


© Living Media India Ltd
