
Subject-Oriented Search Engines

Author: ZhaoBei
Tutor: XunYaling

Abstract

Because the vast body of information on the Web changes constantly, general-purpose search engines find it increasingly difficult to offer users a high-quality, comprehensive, and up-to-date search service. Their limitation is that they try to index all Web information and to serve queries on every topic. In contrast, a subject-oriented search engine covers only the regions of the Web related to a specific topic. It can therefore search that content in greater depth and refresh its index on a shorter cycle, which meets users' need for fast and accurate access to information resources.

This paper first introduces the background and significance of developing a subject-oriented search engine system, analyzes the feasibility of the development, and briefly reviews the theory involved. It then carries out the requirements analysis, overall design, and detailed design, which identify the main functions the system must provide. A functional module diagram of the system is drawn, and program flow charts describe the processing in each module. Finally, the system is implemented.

The system implements administrator login, keyword addition, discovery of topic resource information, downloading of topic resources, and user search. Because the system downloads relatively few web pages, user searches return relatively few results.

Keywords: search engines; Nutch; Tomcat; Cygwin




