Analysis of Key Technologies for Web Usage Mining (thesis, Management Science and Engineering)

About 125,100 characters
About 119 pages
Published 2018-09-06 (Shanghai)


Huazhong University of Science and Technology Doctoral Dissertation

… Therefore, the classical DNA and protein sequence alignment algorithms from bioinformatics can be adapted and applied to measuring Web session similarity.

Determining the number of clusters, the initial point of each cluster, and the criterion function used to assign data points are the three key difficulties that a Web session clustering algorithm must address. The similarity-increase-based Web session clustering algorithm WSCBSI (Web Session Clustering Based on the Increase of Similarity) handles them as follows. It determines the number of clusters from an analysis of domain knowledge. It determines the initial point of each cluster with the ROCK clustering algorithm, which produces high-quality clusters but has high time and space complexity on large data sets. It defines the criterion function by the contribution that assigning a Web session to a given cluster makes to the increase in global similarity. This overcomes the poor clustering quality of traditional algorithms that consider only local similarity, and it also reduces the time and space complexity of the clustering process.

Keywords: Web usage mining; data preprocessing; Web session similarity; Web session clustering

Abstract

Web usage mining aims to discover users' visiting patterns and to predict their visiting behavior by mining Web log records, so that Web-based applications can be better understood and better served. Its results are typically the shared behavior and interests of users, and the search preferences, habits, and patterns of individual users. It is therefore of both theoretical and practical importance for offering personalized service and customization, improving the structure and performance of Web systems, restructuring websites, supporting business intelligence for commercial organizations, and recommending Web pages to users.

Web content is complex, diverse, and unstructured; the organizational structure of the Web is dynamic and changeable; and Web usage data is inaccurate. These characteristics make Web usage mining difficult. On the one hand, traditional data mining techniques cannot be applied directly to Web data. On the other hand, the same difficulties offer more challenges and opportunities for further study of Web mining theories and techniques.

The quality of the data preprocessing results directly affects the quality of the data mining results. The data for Web usage mining may stem from the server side, the client side, proxy servers, site files, registration information, or remote agents. Each type of data collection differs not only in terms of the location of the data sou