- About 53,700 characters
- About 70 pages
- Published 2019-05-11 in Shanghai
Abstract
With the development of search engines, vertical search applications that serve specific domains have begun to emerge. Vertical search emphasizes specialization and structural analysis, and it presupposes the availability of structured data related to the subject. How to acquire this structured data accurately and promptly has therefore become a major research topic in the vertical search field.
The crawler is the information source of a search engine: it automatically extracts the hyperlinks on pages and downloads information from the web. When it comes to acquiring structured data, however, it cannot meet the needs of a vertical search engine. This work therefore proposes a focused crawler oriented toward vertical search engines to solve this problem.
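As a rough illustration of the hyperlink-extraction step described above, the following minimal sketch collects the links on one page and keeps only those that look relevant to a topic. It uses only the Python standard library. The names `LinkExtractor`, `extract_links`, `is_on_topic`, and `focused_links`, and the keyword list, are assumptions made for this example; they are not taken from the system described in this work, and a real focused crawler would use a far richer relevance test.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every hyperlink found in the given HTML, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def is_on_topic(url, keywords=("job", "position")):
    """Hypothetical relevance test: keep URLs that mention a topic keyword."""
    lowered = url.lower()
    return any(keyword in lowered for keyword in keywords)


def focused_links(html, base_url):
    """Keep only the links that pass the (assumed) topic filter."""
    return [url for url in extract_links(html, base_url) if is_on_topic(url)]
```

A general-purpose crawler would follow everything returned by `extract_links`; the filter in `focused_links` is where a focused crawler narrows the frontier to the target domain.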
After briefly introducing the technical background of vertical search and crawlers, this work centers on the focused crawler for vertical search engines and completes the following main research and application work:
It systematically describes the concept of the focused crawler for vertical search engines, its main working principle and workflow, and its key technologies, and discusses its development trends.
For the two largest and most basic stages of a focused crawler, page capture and information extraction, it draws on advanced open-source projects from abroad: the Heritrix crawler and the Web-Harvest tool. This also lays the technical groundwork for the application that follows.
Building on existing research, it introduces a real vertical search engine project for job advertisements and, based on the application requirements of one concrete site (the "ZhiLian" website), designs and implements a focused crawler system that solves the project's structured data acquisition problem. The system has good extensibility and modifiability, and has good practical a