Analysis and System Implementation of a Text Classification Algorithm Based on the Web
- Approx. 34,600 characters
- Approx. 51 pages
- Published 2018-05-18 in Shanghai
Abstract

In recent years, with the rapid development of the Internet, the volume of electronic text published on the Web has grown geometrically. How to organize and manage this data effectively, and how to deliver the required information to users comprehensively, accurately, and quickly, is an important challenge for current research in information technology. Text categorization addresses the problem of finding information accurately and quickly in disorganized data. Automatic text categorization is an important technology for organizing and processing large quantities of text.

Early text categorization worked only on plain text. With the growing popularity of the Internet and the rapid development of Web technology, more and more digital information is presented as web pages, and the Web is becoming the most important channel through which users obtain information. How to find useful information quickly in a distributed, heterogeneous, semi-structured Web environment, and how to extract knowledge from web pages, have become core issues in data mining and knowledge management.

This thesis discusses the implementation of a Web-based text categorization system consisting of two parts: extracting text from web pages, and text categorization. We first describe recent research on automatic text categorization, both domestically and internationally. We then discuss text collection and text categorization in depth and propose solutions for both. For the crawler, we give methods for page analysis and URL reduction, and we propose a mask-based text extraction method. We also present approaches to word segmentation, feature extraction, and text categorization.

The main result of our research is the system prototype TCViewer (Text Categorization Viewer). At the end of the thesis, we carry out two experiments, one on text collection and one on text categorization, to verify the effectiveness of the system.

Key Words: Text Collection, Tex