You're deep inside searchlores
Updated April 2009, version 1.97
Machine translation ~ Files repos ~ 'Deep web' searching ~ Yahoo!/AllTheWeb's image search syntax ~ Short term searching ~ Long term searching ~ Golden rules ~ Yahoo's (Inktomi's) search syntax ~ Search engines' operators ~ Powerbrowsing ~ Learning to transform questions into effective queries ~ Search Engines Anti-Optimization ~ Fishing for troubles ~ Music searching ~ Catching the rabbit's ears ~ When your search fails ~ Follow Links in the Underground ~ Google's wild side ~ Using Fuzzy Logic ~ A Re-ranking trilogy ~ Searching scarcity ~ Searching Historical Information
pda searches low band searches (good for GPRS)
Instructions & caveats (read this)
Quick forms (use them)
(Why should anyone use a "not google" search engine at all?)
You would be well advised to try (at least some of) the various search engines listed on this page, which has been defined as both "a fine tool and a powerful weapon for searchers". Search engines in fact use quite different algos, which yields indexes that do not overlap that much and thus gives searchers the possibility to fish out results that they would never even see if they stuck to just one index.
In other words: if you always limit yourself to google you'll just cover (far) less than one half of the "visible" web (and probably not even 1/5 ~ 1/10 of the hidden one). This is true even if google offers -as it does- very good precision and fairly broad recall (yet check our own relevance comparisons and heed anyway the spamming problems google is subjected to).
Despite the previous advice to always use more than one search engine when searching, there are good reasons
to get familiar with google's advanced parameters.
Since the usage of google, relative to all other engines, has actually further increased (march 2009: google 80% | yahoo 11% | msn 3% | aol 2% | msn-live 2% | ask 1% | all other s.e. 1%), and since google is gaining one percentage point per trimester (no matter what the other engines offer, and we doubt that CUIL will break this hold), we have prepared an in-depth, specific google page that seekers are encouraged to visit.
As said, this page is both a tool and a useful weapon, especially when preparing a long term search. Just copy this page (or even better: the quick forms page) onto your harddisk as c:\main.htm (or whatever), then bookmark it there and use it (after having edited or thrown away anything you fancy) in order to perform effective searches on the web using any main search engine and starting from an unpolluted jumping off place: a page that has as few frills as possible and as many useful forms as we know of, a page that you can modify -and ameliorate- yourself (feedback, in that case, would be appreciated).
The main reason you should use more than one main search engine is that search engines' results overlap FAR less than you would think. Ad hoc studies point out that around 3/4 of the results of a given search are UNIQUE for each search engine.
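You can check the overlap claim on your own queries. What follows is only a minimal sketch in Python, assuming you have already saved each engine's result URLs as plain lists. The engine names, the sample URLs and the helpers `normalize` and `unique_share` are all hypothetical, made up for illustration; they are not part of any search engine's interface.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so that trivial variants compare equal."""
    parts = urlsplit(url.strip().lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def unique_share(results_by_engine):
    """For each engine, the fraction of its results that no other engine returned."""
    normalized = {
        name: {normalize(u) for u in urls}
        for name, urls in results_by_engine.items()
    }
    shares = {}
    for name, urls in normalized.items():
        # Union of everything the *other* engines returned.
        others = set().union(*(v for k, v in normalized.items() if k != name))
        shares[name] = len(urls - others) / len(urls) if urls else 0.0
    return shares


# Hypothetical sample data: four results per engine, one of them shared.
sample = {
    "engine_a": [
        "http://www.example.org/a",
        "http://example.org/b/",
        "http://example.net/c",
        "http://example.com/d",
    ],
    "engine_b": [
        "http://example.org/a/",
        "http://example.edu/x",
        "http://example.edu/y",
        "http://example.edu/z",
    ],
}

print(unique_share(sample))
```

With this made-up sample each engine shares one result out of four, so both come out at 0.75 unique. Real figures depend heavily on the query, on how many results you collect, and on how aggressively you normalize the URLs.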
Remember that search engines list only the first part of any BIG DOCUMENT: the size varies.
Google had a famous limit of 101K, which was abolished in January 2005; the new limit should be around 150K. These limits are very annoying when dealing with large documents (or on-line books).
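To see what such a limit means in practice, here is a small Python sketch that tells you whether a phrase sits in the part of a document an engine would never have indexed. It rests on assumptions: that "K" means 1024 bytes, that the cut is made on the raw bytes of the file, and that the hypothetical helper `beyond_index_limit` is a fair model of the truncation. None of this is documented behaviour of any engine.

```python
def beyond_index_limit(document: bytes, phrase: bytes, limit_kb: int = 101) -> bool:
    """True if the phrase is in the document but not wholly inside the first limit_kb KB."""
    limit = limit_kb * 1024  # assumption: 1K = 1024 bytes
    if phrase not in document:
        return False  # not there at all, so the limit is not to blame
    return phrase not in document[:limit]


# Hypothetical document: 101K of filler, then the text you are seeking.
doc = b"x" * (101 * 1024) + b"needle"

print(beyond_index_limit(doc, b"needle"))                # checked against 101K
print(beyond_index_limit(doc, b"needle", limit_kb=150))  # checked against 150K
```

Run it against a locally saved copy of a big page (read with `open(path, "rb").read()`) before concluding that a search engine "does not have" a passage: the page may be indexed while the passage is not.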
Note also that just because one, a hundred, or a thousand pages from a given site are crawled and made searchable through one of the main search engines, this does not guarantee that every page from an indexed site has really been crawled and indexed. This shortcoming hits not only 'new' pages, which can take MONTHS to be indexed: beehives of spiders harvesting a site often MISS whole subdirectories, old and new. Useful material may be all but invisible to those that only use 'main' search tools to seek. Moreover anyone that regularly uses google (for instance, but other search engines are not that different) will have noticed how polluting the results from commercial sites are nowadays. Should a search engine introduce a new, simple "please hide all commercial sites from your SERPs" (Search Engine Result Pages) option, or switch, or slide, it would probably become king of the hill in a couple of months.
Therefore, given the commercially oriented pollution of the web, you would be well advised to use regional engines, usenet and other specialized or targeted search tools and combing techniques, and also to rely on your own bots, when searching your various targets.
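As long as no engine offers a "hide commercial sites" switch, a bot of your own can do a crude version of it after the fact. The Python sketch below assumes you keep your own blocklist of hosts; the helper `hide_hosts` and the sample URLs are hypothetical. It filters a list of result URLs you have already collected and does not query any engine.

```python
from urllib.parse import urlsplit


def hide_hosts(result_urls, blocked_hosts):
    """Drop results whose host equals, or is a subdomain of, a blocked host."""
    blocked = {h.lower() for h in blocked_hosts}
    kept = []
    for url in result_urls:
        host = (urlsplit(url).hostname or "").lower()
        labels = host.split(".")
        # "shop.example.com" -> {"shop.example.com", "example.com", "com"}
        suffixes = {".".join(labels[i:]) for i in range(len(labels))}
        if not suffixes & blocked:
            kept.append(url)
    return kept


# Hypothetical results and blocklist.
results = [
    "http://shop.example.com/buy",
    "http://example.org/essay",
    "http://notexample.com/x",
]
print(hide_hosts(results, ["example.com"]))
```

Matching on whole dot-separated labels, rather than on a plain substring, keeps `notexample.com` from being hidden just because its name ends in `example.com`.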
Note that you can also easily search and find targets that do not exist any more :-)
A useful tool to compare results in google and yahoo:
"The allmighty google monopole..."
"...should not blind seekers into using only one engine"
"They don't overlap that much..."
(main s.e. claimed index sizes & relative web-coverage)
(Use the MAPA to navigate)
"As the web grows and evolves, web search needs to grow and evolve too. Swicki technology improves on existing general web search by enabling vertical, community site and web searches to be initiated from any website. A swicki's strength is in its community appeal and dynamic buzzcloud (sic :-(
As such, collaboration between groups of people with similar interests using a swicki will quickly produce much more relevant, tailored results for that group than a generic search engine"