Computers Today

December 1-15, 1998                                                                           THE NET

SEARCH THE WEB
Strings to Make it Simple

Relevance ranking is determined by the breadth of match, the frequency of the query terms and their density in the retrieved documents.

Sarita Agarwal

On the surface, searching the Web is a very simple process. Just type in your keywords and off you go. However, it helps if you understand certain concepts about the process.

Relevance Ranking: A technique used by search engines to arrange a set of retrieved records so that those most likely to be relevant to your request are shown first. As you move down the ranking, you move toward less relevant records. The rating is usually given as a percentage or as a value between 0 and 1. A document with a relevance ranking of 98 percent is more likely to satisfy your query than one with a ranking of 75 percent.

What determines relevance? A combination of the following indicators:

  • Breadth of Match: The more distinct query terms that appear in a document, the higher its relevance weight.
  • Inverse Document Frequency: Terms that are rare across the whole collection receive a higher weight than common ones.
  • Frequency: The number of times a query term occurs in a document; more occurrences raise the weight.
  • Density: How often the query terms occur relative to the length of the retrieved document.

Note that stopwords, such as 'the', 'and' and 'of', are not considered for relevance ranking.
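
The indicators above can be combined in many ways, and each search engine keeps its own formula to itself. What follows is only a minimal sketch in Python: the function names, the short stopword list, the way the four indicators are multiplied together and the scaling of scores against the top result are all assumptions made for illustration, not the method of any particular service.

```python
import math

# Illustrative stopword list (an assumption; real engines use longer lists).
STOPWORDS = {"the", "and", "of", "a", "an", "in", "on", "to"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?'\"()") for w in text.lower().split())
    return [w for w in words if w and w not in STOPWORDS]

def score(query, doc, corpus):
    """Raw relevance of one document for a query."""
    q_terms = set(tokenize(query))
    d_terms = tokenize(doc)
    if not q_terms or not d_terms:
        return 0.0
    total = 0.0
    matched = 0
    for term in q_terms:
        freq = d_terms.count(term)                 # frequency
        if freq == 0:
            continue
        matched += 1
        docs_with_term = sum(1 for d in corpus if term in tokenize(d))
        # Inverse document frequency: rarer terms get a larger weight.
        idf = math.log(1 + len(corpus) / max(1, docs_with_term))
        density = freq / len(d_terms)              # frequency relative to length
        total += idf * density
    breadth = matched / len(q_terms)               # share of query terms matched
    return breadth * total

def rank(query, corpus):
    """Return (rating, document) pairs, best first, ratings scaled to 0..1."""
    scored = [(score(query, d, corpus), d) for d in corpus]
    top = max(s for s, _ in scored) or 1.0         # avoid dividing by zero
    return sorted(((s / top, d) for s, d in scored), reverse=True)

corpus = [
    "Heart attack symptoms and heart attack treatment",
    "The police launched an attack on the gang",
    "Cooking recipes for the weekend",
]
ranked = rank("heart attack", corpus)
```

Here the first document matches both query terms and comes out on top, the second matches only 'attack' and gets a much lower rating, and the third is rated zero.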

Feedback Searching: Using the results from a search to refine your query.

Concept Searching: Searching for a general idea. This is useful in cases where you can't immediately articulate your needs. Here, you can do a concept search and build on the results.

Field Restricted Searching: Searching within a specific field of a page, such as the title, links or images. The logic here is that a match found within a specific field makes the retrieved item more likely to be relevant.
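
As a rough illustration, field restricted searching can be pictured as looking in only one named part of each page. The sketch below assumes each page is stored as a Python dictionary with hypothetical 'title' and 'body' fields, and uses simple substring matching; real services define their own fields and syntax.

```python
def field_search(pages, field, term):
    """Return the pages whose given field contains the term (case-insensitive)."""
    term = term.lower()
    return [p for p in pages if term in p.get(field, "").lower()]

# Hypothetical pages, invented for illustration.
pages = [
    {"title": "Heart attack warning signs", "body": "Chest pain is common."},
    {"title": "City news", "body": "Doctors discuss heart attack risks."},
]

in_title = field_search(pages, "title", "heart")   # only the first page
in_body = field_search(pages, "body", "heart")     # only the second page
```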

Phrase Searching: Searching for the occurrence of multiple words as a phrase. Again, this is a way to increase the precision of your search. For example, look for 'heart attack' as a phrase, with the words appearing together in that order. If you search for just the words 'heart' and 'attack' instead, your search engine might also pick up a sentence such as: 'the police launched an attack on the Mirchi gang operating from Mumbai'.
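
The difference between the two kinds of search can be sketched in a few lines of Python. The function names are made up for this example, and the word-splitting is deliberately crude.

```python
def words(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    cleaned = (w.strip(".,;:!?'\"()") for w in text.lower().split())
    return [w for w in cleaned if w]

def contains_phrase(text, phrase):
    """True only if the phrase's words appear together, in the same order."""
    doc, target = words(text), words(phrase)
    if not target:
        return False
    n = len(target)
    return any(doc[i:i + n] == target for i in range(len(doc) - n + 1))

def contains_any_word(text, query):
    """True if at least one of the query words appears anywhere in the text."""
    doc = set(words(text))
    return any(w in doc for w in words(query))

sentence = "The police launched an attack on the Mirchi gang operating from Mumbai"
loose = contains_any_word(sentence, "heart attack")   # True: 'attack' matches
strict = contains_phrase(sentence, "heart attack")    # False: no such phrase
```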

Proximity Searching: Searching for term(s) within N words of other term(s). This allows you to pick up differently worded yet relevant phrases, such as 'technology transfer' and 'transfer of technology', or 'J.S. Raju', 'Raju, J.S.' and 'J. Srihari Raju'.
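
A proximity test might be sketched as follows. It assumes that 'within N words' means the positions of the two words differ by at most N, in either order; individual search engines may count differently.

```python
def word_list(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    cleaned = (w.strip(".,;:!?'\"()") for w in text.lower().split())
    return [w for w in cleaned if w]

def within(text, a, b, n):
    """True if words a and b occur within n positions of each other."""
    doc = word_list(text)
    pos_a = [i for i, w in enumerate(doc) if w == a.lower()]
    pos_b = [i for i, w in enumerate(doc) if w == b.lower()]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)

near_1 = within("technology transfer", "technology", "transfer", 2)
near_2 = within("transfer of technology", "technology", "transfer", 2)
far = within("technology is costly and any transfer is slow",
             "technology", "transfer", 2)
```

Both word orders are caught by the same query, while the last sentence, where the two words sit five positions apart, is not.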

Wildcard Searching: Searching for different word endings with the use of a truncation wildcard symbol. This lets you find similar forms of the same word, and accounts for plurals, noun/verb forms, American/British spelling variations, and more. The truncation symbol is typically * or $, but it must be used judiciously to avoid picking up junk. For example, car* will return cars, but also such unrelated matches as carom and cartoon.
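
The car* example can be tried out with Python's standard fnmatch module, whose * wildcard behaves much like a truncation symbol. The word list below is invented for illustration.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, vocabulary):
    """Return the words that match a pattern in which * stands for any ending."""
    pattern = pattern.lower()
    return [w for w in vocabulary if fnmatchcase(w.lower(), pattern)]

vocabulary = ["car", "cars", "carom", "cartoon", "scar", "truck"]
hits = wildcard_match("car*", vocabulary)
# hits == ['car', 'cars', 'carom', 'cartoon']
```

The wanted forms come back, but so do carom and cartoon, which is why a longer stem usually gives cleaner results.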

Synonym Searching: Searching for multiple words that share a meaning. For example, 'purpose' could have been described as 'aim', 'end' or 'goal'. In general, the more synonyms you can come up with, the more complete your search.
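
One simple way to put this into practice is to expand a term into a query that joins its synonyms with OR. The sketch below assumes a small hand-made synonym table; both the table and the function name are hypothetical.

```python
# Hypothetical hand-made synonym table.
SYNONYMS = {"purpose": ["aim", "end", "goal"]}

def expand_query(term, synonyms=SYNONYMS):
    """Join a term and its known synonyms into a single OR query."""
    return " OR ".join([term] + synonyms.get(term, []))

query = expand_query("purpose")
# query == 'purpose OR aim OR end OR goal'
```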

Boolean Searching: Searching based on Boolean logic. Queries are constructed using connective terms. The basic connective terms are AND (represented by '+' in some search engines), OR, and NOT (represented by '-' in some search engines). So, searching for 'cats OR dogs' will retrieve items containing either term, thus broadening the search. Searching for 'cats AND dogs' will retrieve only items containing both terms, narrowing the search. Searching for 'Delhi NOT New' will retrieve items that contain 'Delhi' but not 'New', which leaves out items about New Delhi.

NOT is especially useful when searching for homographs (words that are spelled the same but differ in meaning or pronunciation). For example, since the word 'polish' could be what you do to shoes or something from Poland, 'polish NOT shoe' would filter out most items about the former.

Use parentheses () to group portions of Boolean queries together for more complicated searches; they force the search engine to evaluate the query in a particular order. Each service explains its connective terms for Boolean searching in its help or FAQ file. Some systems default to a certain connective term; for example, 'cats dogs' is sometimes treated as 'cats OR dogs'.
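
Boolean connectives map neatly onto set operations, which makes them easy to experiment with. The sketch below builds a tiny index over four made-up documents and uses Python's set operators |, & and - to stand in for OR, AND and NOT. The documents and names are invented for illustration, and real search engines apply their own rules for the order of evaluation.

```python
# Made-up documents, keyed by a document number.
docs = {
    1: "cats and dogs make good pets",
    2: "cats sleep most of the day",
    3: "dogs need daily walks",
    4: "birds sing at dawn",
}

def build_index(docs):
    """Map each word to the set of document numbers that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

index = build_index(docs)

def has(term):
    """Documents containing the term (empty set if there are none)."""
    return index.get(term.lower(), set())

cats_or_dogs = has("cats") | has("dogs")     # OR broadens:  {1, 2, 3}
cats_and_dogs = has("cats") & has("dogs")    # AND narrows:  {1}
cats_not_dogs = has("cats") - has("dogs")    # NOT excludes: {2}

# Parentheses change the result. In Python, - is evaluated before |.
grouped = (has("cats") | has("birds")) - has("dogs")    # {2, 4}
ungrouped = has("cats") | has("birds") - has("dogs")    # {1, 2, 4}
```

The last two lines show why grouping matters: the same three terms and the same two connectives give different answers depending on which part is evaluated first.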

 

© Living Media India Ltd