Journal of Hefei University of Technology (Natural Science), Vol. 33 No. 9, Sept. 2010
doi: 10.3969/j.issn.1003-5060.2010.09.025

Algorithm for mining maximum frequent itemsets based on FP-tree

JIANG Cui-qing, HU Jun-yan
(School of Management, Hefei University of Technology, Hefei 230009, China)

Abstract: Existing algorithms for mining maximum frequent itemsets perform well when the support threshold is high. When the support threshold is relatively low, however, the number of candidate itemsets grows rapidly and their performance is often unsatisfactory. This paper proposes DMFIF (discover maximum frequent itemsets based on FP-tree), an algorithm built on the frequent pattern tree (FP-tree) storage structure. It takes the branches of the FP-tree as the initial candidate itemsets, sorts them in descending order of dimension and support, and mines the maximum frequent itemsets by top-down search combined with a subset pruning strategy. Experimental results show that the algorithm performs well when mining long patterns in dense datasets at low support thresholds.

Keywords: data mining; knowledge discovery; algorithm for mining maximum frequent itemsets; pattern discovery

CLC number: TP301  Document code: A  Article ID: 1003-5060(2010)09-1387-05

0 Introduction

Association rule mining is an important research topic in data mining and has been widely applied in practice. The association rule model and the Apriori algorithm were first proposed in [1, 2]. Most existing frequent itemset mining algorithms are Apriori [...]

[...] subsets are all frequent itemsets, which greatly reduces the mining workload. In addition, some data mining applications only need to discover the maximum frequent itemsets rather than all frequent itemsets, so discovering maximum frequent itemsets is of great significance to data mining.
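The excerpt does not include the DMFIF pseudocode, so the following Python fragment is only an illustrative sketch of the idea outlined in the abstract, not the authors' implementation. It assumes a bare prefix tree in place of a full FP-tree (no node counts or header table), recomputes support by scanning the transactions, and expands an infrequent candidate into its subsets with one item removed. The names fp_tree_branches and maximal_frequent_itemsets, and the parameter min_count (an absolute support count), are hypothetical.

```python
from collections import Counter
import heapq


def fp_tree_branches(transactions, min_count):
    """Build a bare prefix tree over frequent items; return its root-to-leaf paths."""
    counts = Counter(item for t in transactions for item in t)
    frequent = {item for item, c in counts.items() if c >= min_count}
    root = {}
    for t in transactions:
        # Insert items in descending global support (ties broken by item name).
        ordered = sorted(t & frequent, key=lambda i: (-counts[i], i))
        node = root
        for item in ordered:
            node = node.setdefault(item, {})
    branches = []

    def walk(node, path):
        if not node and path:  # reached a leaf: record the whole branch
            branches.append(frozenset(path))
        for item, child in node.items():
            walk(child, path + [item])

    walk(root, [])
    return branches


def maximal_frequent_itemsets(transactions, min_count):
    """Top-down search from the tree branches with subset pruning (sketch)."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    heap, seen, maximal = [], set(), []

    def push(candidate):
        if candidate and candidate not in seen:
            seen.add(candidate)
            # Larger itemsets first, then higher support; the sorted list breaks ties.
            entry = (-len(candidate), -support(candidate), sorted(candidate))
            heapq.heappush(heap, entry)

    for branch in fp_tree_branches(transactions, min_count):
        push(branch)

    while heap:
        _, neg_support, items = heapq.heappop(heap)
        candidate = frozenset(items)
        if any(candidate <= m for m in maximal):
            continue  # subset pruning: already covered by a known result
        if -neg_support >= min_count:
            maximal.append(candidate)
        else:
            # Infrequent: descend to every subset with one item removed.
            for item in candidate:
                push(candidate - {item})
    return maximal


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}, {"b", "c"}]
    for itemset in maximal_frequent_itemsets(data, 2):
        print(sorted(itemset))
```

Because candidates are processed in order of decreasing size, any frequent itemset that is not contained in an already accepted result cannot have a frequent superset, which is why it can be accepted as maximum immediately. The sketch is exponential in the worst case and is meant to show the search order and the pruning step only.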
