Web Big Data, Part 2 (Frequent Itemset Mining).pdf

  • Published 2019-03-06 in Hubei
Web Big Data Mining (II)
Association Rules and Frequent Pattern Mining

Slides adapted from

Outline
  • Association Rules
  • Frequent Itemset Mining Algorithms
  • Sequential Pattern Mining Algorithms

Association Rules

The Market-Basket Model
  • A large set of items, e.g., things sold in a supermarket
  • A large set of baskets, each of which is a small set of the items, e.g., the things one customer buys on one day

TID   Items
1     Beer, Diaper, Milk
2     Coke, Diaper, Eggs
3     Beer, Coke, Diaper, Eggs
4     Coke, Eggs

The Market-Basket Model
  • A general many-many mapping (association) between two kinds of things
  • But we ask about connections among "items", not "baskets"
  • The technology focuses on common events, not rare events ("long tail")

Some Definitions: Support
An itemset is supported by a basket (transaction) if every item of the itemset is included in the basket. The support of an itemset is the fraction of baskets that support it.

Market-Basket transactions
TID   Items
1     Beer, Diaper, Milk
2     Coke, Diaper, Eggs
3     Beer, Coke, Diaper, Eggs
4     Coke, Eggs

{Beer, Diaper} is supported by baskets 1 and 3, so its support is 2/4 = 50%.

Some Definitions: Frequent Itemset
If the support of an itemset exceeds a user-specified min_support threshold, the itemset is called a frequent itemset (pattern).

Market-Basket transactions
TID   Items
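
The support and frequent-itemset definitions above can be sketched in a few lines of Python on the four example baskets. This is only an illustrative sketch: the names `baskets`, `support`, and `frequent_itemsets` are hypothetical and do not come from the slides, and the brute-force enumeration of every candidate itemset is a stand-in that works for toy data, not one of the mining algorithms listed in the outline. Following the wording of the definition, an itemset counts as frequent only when its support strictly exceeds `min_support`.

```python
from itertools import combinations

# The four example baskets from the slides, keyed by TID.
baskets = {
    1: {"Beer", "Diaper", "Milk"},
    2: {"Coke", "Diaper", "Eggs"},
    3: {"Beer", "Coke", "Diaper", "Eggs"},
    4: {"Coke", "Eggs"},
}


def support(itemset, baskets):
    """Fraction of baskets that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in baskets.values() if itemset <= items)
    return hits / len(baskets)


def frequent_itemsets(baskets, min_support):
    """Brute-force search (assumption: fine for toy data only).

    Returns a dict mapping each itemset (as a sorted tuple) whose
    support strictly exceeds `min_support` to that support value.
    """
    all_items = sorted(set().union(*baskets.values()))
    result = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            s = support(candidate, baskets)
            if s > min_support:
                result[candidate] = s
    return result


if __name__ == "__main__":
    print(support({"Beer", "Diaper"}, baskets))
    for itemset, s in frequent_itemsets(baskets, min_support=0.4).items():
        print(itemset, s)
```

Because the comparison is strict, calling `frequent_itemsets(baskets, min_support=0.5)` would leave out {Beer, Diaper}, whose support is exactly 50%. Whether the threshold should be treated as inclusive is a convention that varies between presentations.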
