leblancsatpittsburgh: 2008-11-02

Web Search Engines: Part 1 and Part 2 by David Hawking

The main point of these two articles is the nature and infrastructure of a search engine. Oddly enough, the author believes that search engines should not index every web page. From what I understood, indexing every page slows down the search, and the crawler keeps fetching "low-value" pages. However, indexing has proven to be an effective strategy for finding information. I did not find his arguments convincing, and crawling sounds a lot like indexing to me, which leads me to Part 2 of his article.

In Part 2 he feebly attempts to explain indexing algorithms. I got to the second paragraph and reread it over and over again; it was difficult to comprehend.
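
To help separate crawling from indexing, here is a minimal sketch of an inverted index, the structure that (as far as I can tell) Part 2 is describing. Crawling is the step that fetches the page text; indexing is the step that turns that text into a term-to-document lookup. The sample documents and the function names build_inverted_index and search are my own made-up illustrations, not anything taken from the article, and a real engine would add ranking, compression, and much more.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenization: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        # Intersect the postings lists so only documents with all terms remain.
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical "crawled" pages: in practice a crawler would have fetched these.
docs = {
    1: "web search engines crawl pages",
    2: "search engines index pages",
    3: "crawlers fetch web pages",
}
index = build_inverted_index(docs)
print(search(index, "search pages"))
```

The point of the sketch is that a query never touches the page text itself: it only looks up each term's list of document IDs and intersects them, which is why the index has to be built before any searching can happen.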

Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting by Sarah L. Shreeves, Thomas G. Habing, Kat Hagedorn, and Jeffrey A. Young

An interesting article that discusses current developments in the Open Archives Initiative (OAI) and its projects. For example, the Protocol for Metadata Harvesting is a tool developed by the initiative to facilitate interoperability between different collections, building on standards such as XML, HTTP, and Dublin Core.
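
To make the XML, HTTP, and Dublin Core part more concrete, here is a rough sketch of how a harvesting request is put together. An OAI-PMH request is an ordinary HTTP request whose query string names a verb (such as ListRecords) and, for record requests, a metadata format (oai_dc is the Dublin Core format); the repository answers in XML. The base URL https://example.org/oai and the helper name oai_request_url are placeholders I made up, not a real repository or a library function.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a base URL, a verb, and extra arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Hypothetical repository address, used only for illustration.
BASE_URL = "https://example.org/oai"

# Ask for all records in unqualified Dublin Core.
list_records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask the repository to describe itself.
identify_url = oai_request_url(BASE_URL, "Identify")

print(list_records_url)
print(identify_url)
```

The sketch only builds the URLs; actually fetching them and parsing the XML response would be the next step for a real harvester.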

Friday, November 7, 2008
