chapter10ClusBasic
Similarity Defined by SimTree
- Path-based node similarity: simp(n7, n8) = s(n7, n4) × s(n4, n5) × s(n5, n8)
- Similarity between two nodes is the average similarity between objects linked with them in other SimTrees
- Adjustment ratio for x = (average similarity between x and all other nodes) / (average similarity between x's parent and all other nodes)
[Figure: a SimTree with nodes n1 to n9 and edge similarities 0.9, 1.0, 0.9, 0.8, 0.2, 0.3, 0.8, 0.9; callouts mark the similarity between two sibling nodes n1 and n2 and the adjustment ratio for node n7]

LinkClus: Efficient Clustering via Heterogeneous Semantic Links
Method
- Initialize a SimTree for objects of each type
- Repeat until stable:
  - For each SimTree, update the similarities between its nodes using similarities in other SimTrees
    - Similarity between two nodes x and y is the average similarity between objects linked with them
  - Adjust the structure of each SimTree
    - Assign each node to the parent node that it is most similar to
For details: X. Yin, J. Han, and P. S. Yu, "LinkClus: Efficient Clustering via Heterogeneous Semantic Links", VLDB'06

Initialization of SimTrees
- Initializing a SimTree: repeatedly find groups of tightly related nodes, which are merged into a higher-level node
- Tightness of a group of nodes: for a group of nodes {n1, …, nk}, its tightness is defined as the number of leaf nodes in other SimTrees that are connected to all of {n1, …, nk}
[Figure: nodes n1 and n2 linked to leaf nodes 1 to 5 in another SimTree; the tightness of {n1, n2} is 3]

Finding Tight Groups by Freq. Pattern Mining
- Finding tight groups is reduced to frequent pattern mining
- Procedure of initializing a tree:
  - Start from leaf nodes (level-0)
  - At each level l, find non-overlapping groups of similar nodes with frequent pattern mining
[Figure: nodes n1 to n4 linked to leaf nodes 1 to 9 in another SimTree; groups g1 and g2]
Transactions: {n1}, {n1, n2}, {n2}, {n1, n2}, {n1, n2}, {n2, n3, n4}, {n4}, {n3, n4}, {n3, n4}
- The tightness of a group of nodes is the
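
The tightness definition from the slides can be illustrated with a minimal Python sketch. It uses the nine transactions listed on the slide and assumes, for illustration only, that they correspond to leaf nodes 1 to 9 in order, each transaction holding the nodes that leaf is linked to. The function names `tightness` and `tight_pairs` are hypothetical, and the exhaustive pair search is a simple stand-in for a real frequent pattern miner, not the procedure used in LinkClus.

```python
from itertools import combinations

# Transactions as listed on the slide. Mapping them to leaf nodes 1..9
# in order is an assumption made for this sketch.
TRANSACTIONS = [
    {"n1"}, {"n1", "n2"}, {"n2"}, {"n1", "n2"}, {"n1", "n2"},
    {"n2", "n3", "n4"}, {"n4"}, {"n3", "n4"}, {"n3", "n4"},
]


def tightness(group, transactions):
    """Count the leaf nodes (transactions) linked to every node in `group`."""
    group = set(group)
    return sum(1 for linked in transactions if group <= linked)


def tight_pairs(transactions, min_tightness):
    """Brute-force stand-in for frequent pattern mining, limited to pairs.

    Returns a dict mapping each node pair to its tightness, keeping only
    pairs whose tightness is at least `min_tightness`.
    """
    nodes = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(nodes, 2):
        t = tightness(pair, transactions)
        if t >= min_tightness:
            result[pair] = t
    return result


if __name__ == "__main__":
    # First five leaves only, as in the two-node example on the slides.
    print(tightness({"n1", "n2"}, TRANSACTIONS[:5]))
    # All nine leaves, keeping pairs shared by at least two leaves.
    print(tight_pairs(TRANSACTIONS, min_tightness=2))
```

Under these assumptions, the tightness of a group is simply the number of transactions that contain the whole group, which is why a frequent pattern miner can be used to find tight groups. Pair enumeration was chosen here for brevity; it does not scale to larger groups or large node sets.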