Web Search Engines: Part 1 and Part 2 by David Hawking
The main point of these two articles is the nature and infrastructure of a search engine. Oddly enough, the author believes that search engines should not index every web page. From what I understood, indexing every page slows down the search and keeps pulling in "low-value" pages, yet indexing has proven to be an effective strategy for finding information. I did not find his arguments convincing, and crawling sounds a lot like indexing to me, which leads me to Part 2 of his article.
In Part 2 he feebly attempts to explain indexing algorithms; I got to the second paragraph and reread it over and over again, and it was still difficult to comprehend.
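Since the distinction tripped me up, here is a minimal sketch (my own, not Hawking's actual design) of how crawling and indexing differ: crawling fetches pages and discovers links, while indexing is a separate pass that inverts those pages into a term-to-document map. The fetch and extract_links helpers are assumed to be supplied by the caller.

```python
from collections import defaultdict

def crawl(seed_urls, fetch, extract_links, limit=100):
    """Fetch pages breadth-first, returning {url: text}.
    `fetch` and `extract_links` are assumed helper functions."""
    seen, frontier, pages = set(), list(seed_urls), {}
    while frontier and len(pages) < limit:
        url = frontier.pop(0)
        if url in seen:
            continue
        seen.add(url)
        text = fetch(url)
        pages[url] = text
        frontier.extend(extract_links(text))  # discovery happens here
    return pages

def build_index(pages):
    """Indexing is a separate pass: invert {url: text} into {term: set(urls)}."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index
```

Seen this way, Hawking's point is that you can crawl selectively and still index everything you fetched; deciding which pages to fetch and building the index are two separate stages.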
Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting by Sarah L. Shreeves, Thomas G. Habing, Kat Hagedorn, and Jeffrey A. Young
An interesting article that discusses current developments in the Open Archives Initiative and its projects. For example, the Protocol for Metadata Harvesting is a tool developed by the initiative to facilitate interoperability between collections that use different metadata standards; it is built on XML and HTTP, with Dublin Core as its common metadata format.
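To make the "tool" part concrete, here is a minimal harvesting sketch. The endpoint URL in the usage comment is hypothetical, but the ListRecords verb, the oai_dc metadata prefix, and the resumptionToken paging mechanism all come straight from the OAI-PMH specification.

```python
import urllib.request
import urllib.parse
import xml.etree.ElementTree as ET

# Namespace used by every OAI-PMH 2.0 response document.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def harvest(base_url, metadata_prefix="oai_dc"):
    """Yield <record> elements from ListRecords responses, following
    resumptionTokens until the repository reports no more pages."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            root = ET.fromstring(resp.read())
        for record in root.iter(OAI_NS + "record"):
            yield record
        token = root.find(f".//{OAI_NS}resumptionToken")
        if token is None or not (token.text or "").strip():
            return
        # Subsequent pages take only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text}

# Example against a hypothetical repository endpoint:
# for rec in harvest("https://example.org/oai"):
#     header = rec.find(OAI_NS + "header")
#     print(header.find(OAI_NS + "identifier").text)
```

Because every repository speaks this same small set of verbs over plain HTTP and returns Dublin Core by default, one harvester like this can pull metadata from any compliant collection, which is the interoperability the article is describing.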
Friday, November 7, 2008
3 comments:
Algorithms aren't fun, but this is interesting because I took a math class for liberal arts majors where we learned efficiency algorithms, and computers usually use the same ones to do their work. I think crawling could be considered selective indexing, if that makes his argument any clearer.
Yeah, getting to read about Dublin Core again makes me giddy with excitement. I wish I could pay 25 grand all over again in hopes of reading "Dublin Core" somewhere, anywhere. This is really the best of all possible worlds.
I think that "Dublin Core" could be a new genre of Irish Hardcore.