INDEXING TECHNIQUES OF DIFFERENT SEARCH ENGINES
GOOGLE The heart of Google's indexing technique is "PageRank", a system for ranking web pages. PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. Google interprets a link from page A to page B as a vote, by page A, for page B. Google also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
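
The voting idea above can be illustrated with a small iterative calculation. This is only a minimal sketch of the commonly published PageRank iteration, not Google's production system. The damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the four-page link graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank scores for a small link graph.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so a vote from a high-ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # vote evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A and C both vote for B, B votes for A, D votes for C.
links = {"A": ["B"], "C": ["B"], "B": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this graph, pages A and C each receive exactly one vote. A's vote comes from B, which is itself heavily linked, while C's vote comes from D, which nobody links to, so A ends up ranked well above C.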
YAHOO Yahoo, the most popular hierarchically organized search engine, uses robots to discover sites but relies on humans to index them. Each index entry includes the URL, the HTML title tag, and a very short description of the site.
ALTAVISTA AltaVista uses crawlers that attempt to visit every site on the web and indexes all the pages they find there. It also uses meta tags for indexing.
LYCOS Lycos indexes the title, URL, headings and subheadings, and the first twenty lines of text of each page. It reindexes its database every two weeks.
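
Selective indexing of this kind (title, headings, and the first lines of text) can be sketched with Python's standard html.parser module. This is a rough illustration and not Lycos's actual code. The class name SelectiveFieldExtractor, the function extract_fields, the sample page, and the reading of "first twenty lines" as the first twenty non-empty lines of visible text are all assumptions.

```python
from html.parser import HTMLParser


class SelectiveFieldExtractor(HTMLParser):
    """Collects the title, the headings, and the first lines of visible text."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    HIDDEN = {"title", "script", "style"}  # not counted as visible body text

    def __init__(self, max_lines=20):
        super().__init__()
        self.max_lines = max_lines
        self.title = ""
        self.headings = []
        self.first_lines = []
        self._open = []  # stack of currently open heading/hidden tags

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS or tag in self.HIDDEN:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        current = self._open[-1] if self._open else None
        text = data.strip()
        if not text:
            return
        if current == "title":
            self.title += text
            return
        if current in self.HIDDEN:
            return  # skip script and style contents
        if current in self.HEADINGS:
            self.headings.append(text)
        # Keep only the first max_lines non-empty lines of visible text.
        for line in data.splitlines():
            line = line.strip()
            if line and len(self.first_lines) < self.max_lines:
                self.first_lines.append(line)


def extract_fields(url, html, max_lines=20):
    """Return the fields a selective indexer might store for one page."""
    parser = SelectiveFieldExtractor(max_lines)
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": parser.title,
        "headings": parser.headings,
        "first_lines": parser.first_lines,
    }


# Hypothetical page used only for illustration.
sample = (
    "<html><head><title>Tea Guide</title></head><body>"
    "<h1>Green Tea</h1><p>Brewing notes.</p>"
    "<h2>Storage</h2><p>Keep it dry.</p></body></html>"
)
record = extract_fields("http://example.com/tea", sample)
print(record)
```

Storing only these fields keeps the index small, at the cost of missing terms that appear deeper in the page.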
EXCITE Excite uses robots to do full-text indexing. It also employs "Intelligent Concept Extraction", which relies on the clustering of words to detect the presence of concepts. It indexes more than the first level of headings.
INFOSEEK Infoseek uses robots to do full-text indexing. Meta descriptor tags are also indexed. It also indexes the third and fourth levels.
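
Full-text indexing, as opposed to indexing only selected fields, is commonly implemented with an inverted index that maps every word to the pages containing it. The following is a minimal sketch of that general technique and not the code of any engine named here. The function names tokenize, build_index, and search, the simple lowercase tokenizer, and the sample documents are assumptions.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: word -> set of URLs whose text contains it.

    documents maps each URL to the full text of the page. Text taken from
    meta description tags could simply be appended to that text.
    """
    index = defaultdict(set)
    for url, text in documents.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical documents used only for illustration.
docs = {
    "http://example.com/a": "Green tea brewing guide",
    "http://example.com/b": "Black tea and coffee",
    "http://example.com/c": "Coffee brewing basics",
}
index = build_index(docs)
print(search(index, "brewing tea"))
```

Because every word of every page is recorded, a query can match text anywhere in a document, which is the practical difference from the title-and-headings approach.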
(www.searchenginewatch.com maintains a large database on the strengths and weaknesses of the major search engines.)
Search Engine  | % of searchable web indexed*
Northern Light | 16.0
Snap           | 15.5
AltaVista      | 15.5
HotBot         | 11.3
Microsoft      | 8.5
Infoseek       | 8.0
Google         | 7.8
Yahoo          | 7.4
Excite         | 5.6
Lycos          | 2.5
Euroseek       | 2.2
*According to computer scientists Steve Lawrence and C. Lee Giles at the NEC Research Institute in Princeton, New Jersey.