Analysis and Application of a Text Clustering Algorithm Based on Maximum Frequent Itemsets K-means
Abstract

With the rapid development of information technology, network information is expanding geometrically. How to use information resources more efficiently through information fusion has become an urgent problem. Text clustering, an unsupervised method of organizing information, plays an important role. It groups a large number of documents into a small number of meaningful classes, such that the documents within each class are highly related to each other and the documents in different classes differ greatly. In this way, text clustering achieves the goal of organizing document information effectively.

The paper surveys the existing text clustering algorithms and analyzes and summarizes the characteristics of each algorithm; it then introduces the concept of frequent itemsets from association rules and the frequent-itemset-based clustering algorithm. Finally, the paper proposes a novel iterative text clustering method based on maximal frequent itemsets and K-means, which greatly improves the clustering result.

The paper establishes a document database for mining maximal frequent itemsets, and the documents that contain the same longest maximal frequent itemset are gathered to form initial clusters. The paper then designs a new selection method that picks reasonable base clusters as the initial centers for the K-means clustering algorithm. Because documents are unevenly distributed and cluster sizes differ, the paper proposes a two-step clustering idea. In the first step, the clusters that have obvious characteristics or large granularity are obtained by one pass of the maximal frequent itemsets K-means algorithm. These documents are then removed from the document database, and the maximal frequent itemset mining and base cluster selection algorithms are applied again to mine cluster centers from the remaining documents. Finally, these cluster centers are combined with the centers produced in the first step, and the remaining unclassified documents are clustered by the K-means algorithm.

The improvements and innovations of the proposed text clustering algorithm include the following aspects. First, the initial clusters are generated from the shared longest maximal frequent itemsets, and the concept of the representation ability of a maximal frequent itemset for a cluster is proposed. Then, we design a crite
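The first clustering step described in the abstract (mine maximal frequent itemsets, group documents that share the same longest one, and use those groups to seed K-means) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the brute-force miner, the binary term vectors, the squared Euclidean distance, the toy documents, and every function name here are assumptions, and the paper's base-cluster selection method and second clustering step are not reproduced.

```python
from collections import defaultdict
from itertools import combinations


def maximal_frequent_itemsets(docs, min_support):
    """Brute-force miner (tiny inputs only): frequent itemsets with no frequent proper superset."""
    vocab = sorted(set().union(*docs))
    frequent = []
    for size in range(1, len(vocab) + 1):
        level = [frozenset(c) for c in combinations(vocab, size)
                 if sum(1 for d in docs if d.issuperset(c)) >= min_support]
        if not level:
            break  # Apriori property: no larger itemset can be frequent
        frequent.extend(level)
    return [s for s in frequent if not any(s < t for t in frequent)]


def initial_clusters(docs, mfis):
    """Group document indices by the longest maximal frequent itemset each document contains."""
    groups = defaultdict(list)
    for i, d in enumerate(docs):
        covered = [m for m in mfis if m <= d]
        if covered:  # documents covered by no itemset are left for K-means to assign
            best = max(covered, key=lambda m: (len(m), sorted(m)))
            groups[best].append(i)
    return dict(groups)


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, centers, max_iter=20):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda k: sq_dist(v, centers[k]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    return labels


# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"apple", "banana", "fruit"},
    {"apple", "banana", "juice"},
    {"apple", "banana"},
    {"python", "code", "bug"},
    {"python", "code", "test"},
    {"python", "code"},
    {"apple", "pie"},  # contains no maximal frequent itemset
]
vocab = sorted(set().union(*docs))
vectors = [[1.0 if t in d else 0.0 for t in vocab] for d in docs]

mfis = maximal_frequent_itemsets(docs, min_support=3)
groups = initial_clusters(docs, mfis)
seeds = [centroid([vectors[i] for i in members]) for members in groups.values()]
labels = kmeans(vectors, seeds)
print(labels)
```

In this toy run the two mined itemsets seed two centers, and the last document, which contains neither itemset, is assigned by K-means alone. A real corpus would need a scalable miner and weighted term vectors in place of the brute-force enumeration and binary vectors assumed here.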