Bibliography Agent
| Spider: | A robot program that crawls ("spiders") the internet to collect or find documents. |
| BibTeX: | A metadata record that describes the bibliography of a document, usually a paper. |
| Agent: | A computer program that autonomously performs jobs delegated by humans. |
Given a BibTeX entry, how can we find the full text or abstract of the document?
First of all, this was a term project for a course called "Intelligent Agents". We also had some reasons to think agents were a good approach for this domain. Let me give you an example first:
Assume that you're looking for anything about "Agents" written by "Jane, Yung-jen Hsu" and published by AAAI. What will you do?
1. Use a general search engine
2. Connect to the homepage of your closest library
3. Try to connect to AAAI and look for it there
4. ...
As you can see, there are many ways to do it, and each of them needs a different "search" mechanism. Thus we decided to have many "specialized" agents cooperate to finish this job. Moreover, an agent is autonomous; we'd like each agent to operate autonomously, to keep running, and to adapt itself.
Many agents in our community need to search starting from a certain URL. Instead of duplicating that work in every agent, we decided to build a Spider which, given the BibTeX entry, searches through the indicated web pages or URLs and those near them. It then returns the possible links, their scores, the way each link was found, and the BibTeX fields that matched, so that the agents can use this feedback to improve themselves.
| URL / HTML page: | The URL or HTML page from which the search starts. (An agent that supplies more than one URL would construct an HTML page and deliver it to Spider.) |
| MatchingList: | Some agents, according to previous feedback, know how to find this document. They directly tell Spider "how to get it". For example, they might tell Spider "try to match 'Author' first, then 'Title', then...". |
| ExploreLevel: | Some agents are certain the document resides at a certain depth of the HTML hierarchy, so they can tell Spider how deep to explore. |
| Timeout: | Sometimes the network is just very slow. Spider, as an agent, should always give feedback within a reasonable time, which can be adjusted by the agent that delegates the job to it. |
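To make these inputs concrete, here is a minimal Python sketch of how a request to Spider might be represented and how ExploreLevel and Timeout could bound the search. It is not the project's actual implementation: the names `SpiderRequest`, `crawl`, and `fetch_links` are hypothetical, ExploreLevel is assumed to mean a maximum link depth, and the link extractor is passed in as a function so that the sketch runs without network access.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class SpiderRequest:
    """Hypothetical container for the inputs listed above."""
    entry: Dict[str, str]        # BibTeX fields, e.g. {"author": ..., "title": ...}
    start_urls: List[str]        # URL(s) from which the search starts
    matching_list: List[str] = field(default_factory=lambda: ["author", "title"])
    explore_level: int = 2       # assumed meaning: maximum link depth to follow
    timeout: float = 30.0        # seconds Spider may spend before answering


def crawl(request: SpiderRequest,
          fetch_links: Callable[[str], Iterable[str]],
          clock: Callable[[], float] = time.monotonic) -> List[Tuple[str, int]]:
    """Breadth-first walk from the start URLs, bounded by depth and time.

    Returns (url, depth) pairs in the order the pages were visited.
    """
    deadline = clock() + request.timeout
    starts = list(dict.fromkeys(request.start_urls))  # drop duplicate start URLs
    queue = deque((url, 0) for url in starts)
    seen = set(starts)
    visited: List[Tuple[str, int]] = []
    while queue and clock() < deadline:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= request.explore_level:
            continue             # do not expand pages at the depth limit
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

In a real Spider, `fetch_links` would download the page and extract its anchors. For experimenting, a dictionary that maps each URL to its outgoing links is enough, for example `crawl(req, lambda u: graph.get(u, []))`.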
| HTML: | To keep its output consistent, Spider returns a formatted HTML page. |
| Number: | Number of links found |
| Matched: | More than one result may be returned; this stores each candidate URL. |
| MatchedList: | For each matched URL, this records "how Spider found the URL". |
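The output fields above can be sketched in Python as follows. This is only an illustration under stated assumptions: the page does not say how Spider computes scores or lays out its HTML, so the case-insensitive substring matching, the score (fraction of requested fields that matched), the markup, and the names `Candidate`, `score_page`, and `render_html` are all hypothetical.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, List


@dataclass
class Candidate:
    url: str                  # a "Matched" candidate URL
    matched_list: List[str]   # BibTeX fields found on the page, in the order tried
    score: float              # assumed score: fraction of requested fields matched


def score_page(url: str, page_text: str,
               entry: Dict[str, str], matching_list: List[str]) -> Candidate:
    """Try each field of the MatchingList, in order, against the page text."""
    text = page_text.lower()
    matched = [name for name in matching_list
               if entry.get(name) and entry[name].lower() in text]
    score = len(matched) / len(matching_list) if matching_list else 0.0
    return Candidate(url, matched, score)


def render_html(candidates: List[Candidate]) -> str:
    """Format the candidates, best score first, as one HTML page."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    items = []
    for c in ranked:
        href = escape(c.url, quote=True)
        fields = escape(", ".join(c.matched_list))
        items.append(f'<li><a href="{href}">{href}</a> '
                     f'score={c.score:.2f} matched={fields}</li>')
    body = "\n".join(items)
    return (f"<html><body><p>Number: {len(ranked)}</p>\n"
            f"<ul>\n{body}\n</ul></body></html>")
```

Returning the matched fields alongside each URL is what lets the calling agent adjust its MatchingList the next time it delegates a search.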
Last Updated : 7/19/2005 by Bo-chieh Yang (bcyang@alumni.cmu.edu)