


  












HELP
How Do You Set Up A Search-Engine?
A primer on how to start up a biz on the Net. First up: search-engines.
By Rakhi Mazumdar
Your search for a successful e-Biz model could end here. Then
again, so could your hope for one. With 85 per cent of the Net traffic making at least 1
trip to a search-engine every time it logs on, such a service is brimming with potential
when it comes to capturing eyeballs-and, by extension, advertising. Unfortunately, it is
also brimming with competition. At the last count, there were at least 1,000
search-engines on the Net, covering both varieties of the product: all-purpose
search-engines like AltaVista, Hotbot, Yahoo!, Excite, Lycos, or Alltheweb; and services
focused on specific subjects: Medhunt, which deals with medical information only; Inomics,
which is devoted to economics; or Lawcrawler, which concentrates on legal affairs.
To be a winner in this game-where the aim is to provide a
search service that is so effective that it attracts enough people for you to sell
advertising and make money-you must differentiate your service dramatically. The primary
route: technology. Few e-Biz ventures are as driven by technology as search-engines.
So, either you have to create this technology for yourself,
which means writing the necessary programmes, or buy an off-the-shelf package, such as
those sold by Inktomi, Excite, Infoseek, or Xyindex. Buying such a technology means using
the entire index of Webpages compiled by that service as well as its searching
method-paying on the basis of usage. But you can mix-and-match since there are 3
components to the technology, and customisation is necessary to make your search-engine
different from those of your competitors.
The first is the agent that trawls the Web, recording the
contents of every Webpage. This is an enormous task since the Web consists of some 330
million different pages at present. So, you have a choice as to just how you want your
agent-termed a robot or a spider-to pick the pages it will scan. One option is to start at
the beginning of an alphabetical list of all URLs and systematically scroll through them
all.
A second is to begin with a shorter list, and follow the
links on each of them to store the contents of a continuously-increasing number of pages.
A third: invite webmasters to submit their URLs, and then scan the contents of those
pages. A fourth is to focus on sites with specific contents, and limit your trawling to
those-the result of which will be a specialised search-engine dedicated to specific
subjects like, say, India or Hollywood films.
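The second option above, starting from a short list and following links, can be sketched roughly as follows. This is only an illustration: the miniature `TOY_WEB` dictionary, its example URLs, and the `crawl` function are hypothetical stand-ins for real page fetching, which a working spider would have to add.

```python
from collections import deque

# Hypothetical miniature "Web": each URL maps to (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("welcome to a", ["b.example/", "c.example/"]),
    "b.example/": ("films from hollywood", ["c.example/"]),
    "c.example/": ("news about india", ["a.example/", "d.example/"]),
    "d.example/": ("more india news", []),
}


def crawl(seeds, web, max_pages=100):
    """Breadth-first crawl: start from the seed URLs and follow links outward."""
    seen = set(seeds)      # URLs already queued, so no page is visited twice
    queue = deque(seeds)
    store = {}             # url -> recorded page contents
    while queue and len(store) < max_pages:
        url = queue.popleft()
        if url not in web:  # dead link: nothing to record
            continue
        text, links = web[url]
        store[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return store


pages = crawl(["a.example/"], TOY_WEB)
```

Starting from a single seed, the spider here reaches all four toy pages. Restricting the seed list or filtering links by subject would give the specialised variant.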
The second component converts the contents of different pages
recorded by the robots into gigantic indices. Different technologies use different forms
of indexing, which is where one search-engine can vary from another. The differentiation
can be built-in, in the form of speed-the sooner a search-engine technology can convert a
spidered page into an indexed page, the quicker the fruits of the robot's efforts will be
available to your customers.
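One common form of index, assumed here purely for illustration, is the inverted index: a table that maps every word to the pages containing it. The article does not say which form any particular engine uses, and the names `build_index`, `search`, and `sample_pages` are hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical spidered pages: url -> recorded text.
sample_pages = {
    "b.example/": "Films from Hollywood",
    "c.example/": "News about India",
    "d.example/": "More India news",
}
index = build_index(sample_pages)
india_news = search(index, "india news")
```

Because the index is built once and consulted many times, a search only touches the entries for the words typed in, not every stored page.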
It is in the third and final stage-the way in which you set
up your search process through the index-that your engine will stand out from others. For,
depending on how you configure your searching option, a search for
a particular word, phrase, or subject will throw up sharply different results-especially
in terms of the order in which the Webpages containing those words are listed. The menu of
strategic options?
THE POPULARITY STRATEGY. Adopted by
search-engines like Direct Hit, this method monitors the pages that surfers
click on after running searches for particular words or phrases on other search-engines.
It then uses these figures to sort the results so that pages with the most clicks appear
on top. Google relies on a related measure, link popularity, to rank pages.
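A click-based ordering of this kind might look like the sketch below. It assumes click counts are kept per search phrase; the `ClickRanker` class and its methods are hypothetical and do not describe any engine's actual system.

```python
from collections import Counter, defaultdict


class ClickRanker:
    """Orders results by how often surfers clicked them for the same query."""

    def __init__(self):
        # query -> Counter of url -> number of clicks
        self.clicks = defaultdict(Counter)

    def record_click(self, query, url):
        self.clicks[query.lower()][url] += 1

    def rank(self, query, candidates):
        counts = self.clicks.get(query.lower(), Counter())
        # sorted() is stable, so pages with equal clicks keep their given order.
        return sorted(candidates, key=lambda url: -counts[url])


ranker = ClickRanker()
ranker.record_click("india news", "d.example/")
ranker.record_click("india news", "d.example/")
ranker.record_click("india news", "c.example/")
ordered = ranker.rank("India News", ["c.example/", "d.example/", "e.example/"])
```

Pages that nobody has clicked on simply fall to the bottom in their original order.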
THE COMPREHENSIVENESS STRATEGY. Visible at
AltaVista and FAST, the method involves the use of raw power to search the entire
index-which aims at including as many Webpages as possible-and then present the results in
an order that depends on how close to the title or to the top of the text in each page the
word or phrase being searched for appears.
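The position-based ordering can be illustrated with a simple scoring rule. The weights are assumptions chosen for the example (a title match scores 2, a match in the text scores between 0 and 1 depending on how early it appears), and `position_score`, `rank_by_position`, and the sample `docs` are hypothetical.

```python
def position_score(term, title, body):
    """Score a page higher the closer the term is to the title or top of text."""
    term = term.lower()
    if term in title.lower():
        return 2.0                      # assumed weight for a title match
    pos = body.lower().find(term)
    if pos == -1:
        return 0.0                      # term does not appear at all
    # Earlier matches score closer to 1, later matches closer to 0.
    return 1.0 - pos / max(len(body), 1)


def rank_by_position(term, docs):
    """docs is a list of (url, title, body); returns matching URLs, best first."""
    scored = [(position_score(term, title, body), url) for url, title, body in docs]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [url for score, url in ranked if score > 0]


docs = [
    ("x.example/", "Travel notes",
     "A long preamble comes first, and only later is cricket mentioned."),
    ("y.example/", "Cricket scores", "Latest results."),
    ("z.example/", "Sport", "Cricket is covered at the very top of this page."),
]
ranking = rank_by_position("cricket", docs)
```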
THE QUESTION-AND-ANSWER STRATEGY. Used by
sophisticated engines like Ask Jeeves and Electric Monk, the objective of this strategy is
to allow searchers to type in questions and, then, provide the answers. The technologies
vary: Ask Jeeves compares every query to a vast database of questions and Webpages with
the answers, and matches new questions to these. Electric Monk translates the query into a
complex search that can be run on AltaVista, and provides the answers.
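Matching a new question against a database of known questions could be approximated by word overlap, as in this sketch. The overlap measure, the 0.5 threshold, the `KNOWN` database, and its URLs are all assumptions for illustration; the article does not describe how Ask Jeeves actually performs the match.

```python
import re

# Hypothetical database of known questions and the pages that answer them.
KNOWN = {
    "how do i set up a search engine": "help.example/search-engine",
    "what is the capital of india": "facts.example/india",
}


def words(text):
    """Lower-cased set of the words in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def answer(question, known=KNOWN, threshold=0.5):
    """Return the page for the most similar stored question, or None."""
    asked = words(question)
    best_url, best_score = None, 0.0
    for stored, url in known.items():
        stored_words = words(stored)
        union = asked | stored_words
        # Share of words the two questions have in common (Jaccard overlap).
        score = len(asked & stored_words) / len(union) if union else 0.0
        if score > best_score:
            best_url, best_score = url, score
    return best_url if best_score >= threshold else None


page = answer("What is the capital of India?")
```

A question with too little in common with anything stored returns no answer rather than a poor guess.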
THE PAY-FOR-VIEW STRATEGY. Search-engines
like GoTo and even AltaVista use a simple policy: Webpages whose owners pay a fee are
listed near the top of every search for words or phrases in these pages.
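In code, paid placement amounts to a re-ordering of results that already match. In this sketch, ordering the paying pages by the size of the fee is an added assumption, and `paid_first` and the `fees` table are hypothetical.

```python
def paid_first(matches, fees):
    """Move pages whose owners paid a fee to the top of the matching results.

    matches: URLs that already match the search, in their normal order.
    fees:    url -> fee paid (pages absent from the table paid nothing).
    """
    paid = [url for url in matches if fees.get(url, 0) > 0]
    unpaid = [url for url in matches if fees.get(url, 0) <= 0]
    # Assumption: among paying pages, the larger fee is listed first.
    paid.sort(key=lambda url: -fees[url])
    return paid + unpaid


listing = paid_first(
    ["a.example/", "b.example/", "c.example/", "d.example/"],
    {"c.example/": 5, "b.example/": 10},
)
```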
THE HUMAN-COMPILATION STRATEGY. People who
run search-engines like Snap and Open Directory believe automated searches for keywords or
phrases of Websites produce too many irrelevant results. Only an index of pages read and
classified by humans can be used to offer results to surfers in search of specific
subjects. The USP: every page that is thrown up by a search will have something the
searcher is really looking for.
THE METASEARCH STRATEGY. Adopted by
search-engines like Dogpile, it runs the search on every other major search-engine, and
offers results from each. The benefit to customers: a one-stop shop for all results.
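A metasearch service can be sketched as a loop over other engines. Here the engines are stand-in functions returning fixed lists; a real service would send the query over the network, and the `metasearch` function and the `engines` table are hypothetical.

```python
def metasearch(query, engines):
    """Run one query on several engines and record who returned each page.

    engines: name -> function taking a query and returning a list of URLs.
    Returns a dict of url -> list of engine names, in order of first appearance.
    """
    merged = {}
    for name, engine in engines.items():
        for url in engine(query):
            merged.setdefault(url, []).append(name)
    return merged


# Stand-in engines with canned results (assumptions for the example).
engines = {
    "engine_one": lambda query: ["a.example/", "b.example/"],
    "engine_two": lambda query: ["b.example/", "c.example/"],
}
combined = metasearch("india news", engines)
```

Recording which engines returned a page also gives a crude signal of agreement: a page found by several engines may deserve a higher place.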
Whether you are setting up a horizontal or a vertical
search-engine, you don't have to limit yourself to one of these options. For instance, in
a search for a word on Hotbot, the first 10 results come from Direct Hit's
popularity-based listing, but number 11 onwards come from Inktomi's indices. Using a
combination of technologies can ensure that different customer-segments get the results
they are looking for.
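A blend of the kind described for Hotbot, where the first results come from one source and the rest from another, might be put together as below. Dropping duplicates from the second list is an added assumption, and `blend` is a hypothetical name.

```python
def blend(primary, secondary, top_n=10):
    """Take the first top_n results from primary, then fill in from secondary."""
    head = primary[:top_n]
    seen = set(head)
    # Assumption: a page already shown in the head is not repeated further down.
    tail = [url for url in secondary if url not in seen]
    return head + tail


popularity_results = ["p1.example/", "p2.example/", "p3.example/"]
index_results = ["p2.example/", "s1.example/", "s2.example/"]
page_of_results = blend(popularity_results, index_results, top_n=2)
```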
Be warned, though, that unless you can develop your own
distinctive search technology, or find a way to extend the universe that you search, it is
the way in which you package your search results that will give you a distinctive
positioning. A classic example of such innovation: Egosurf, which specialises in pulling
out references on the Web to an individual's name-and then updates the results every day
for a period of 7 days and mails them to the searcher so that she does not have to keep
running the search repeatedly for new results. It is, probably, by identifying such
specialised needs-whether in terms of specific subjects, or in terms of unique service
offerings-that the search-engine-based business model has its best chance of success
tomorrow.
THE WEB OF SEARCH-ENGINES
AltaVista: www.altavista.com/
One of the largest search-engines on the Web in terms of pages indexed.
Ask Jeeves: www.askjeeves.com/
A human-powered service that directs searchers to the Webpage that answers
questions.
Direct Hit: www.directhit.com/
Monitors what users of other engines click on from their results and ranks them
accordingly.
Excite: www.excite.com/
Offers a medium-sized index, but integrates material like company information and
sports scores.
FAST Search: www.alltheweb.com/
Aims to index the entire Web, and was the first to break the 200-million page
index milestone.
Go/Infoseek: www.go.com/
Consistently provides quality results thanks to its ESP search algorithm.
GoTo: www.goto.com/
Companies can pay to be placed higher in the search results, which GoTo feels improves
relevance.
Google: www.google.com/
Makes heavy use of link popularity as a primary way to rank Websites.
HotBot: www.hotbot.com/
First page of results comes from the Direct Hit service, and secondary results from
Inktomi.
Inktomi: www.inktomi.com/
Powers several other services, but provides ways for its partners to distinguish
themselves.
LookSmart: www.looksmart.com/
It is the closest rival to Yahoo! in terms of being a human-compiled directory of the Web.
Lycos: www.lycos.com/
Started out as a search-engine, but shifted to a directory model similar to
Yahoo!'s.
Northern Light: www.northernlight.com/
Features the largest index of the Web along with the ability to cluster documents
by topic.
Open Directory: dmoz.org/
Uses volunteer editors to catalogue the Web, and has an open licence arrangement.
Snap: www.snap.com/
A human-compiled directory of Websites, supplemented by search results from
Inktomi.
WebCrawler: www.webcrawler.com/
Has the smallest index of any major engine, but provides less overwhelming
results to general searches.
Yahoo!: www.yahoo.com/
The largest human-compiled guide to the Web, employing about 150 editors to
categorise the Web.