Posted 2016-05-25 (Anhui)
Graph-Based Substructure Pattern Mining
Abstract
We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts a depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.
1. Introduction
Frequent substructure pattern mining is an emerging data mining problem with many scientific and commercial applications. As a general data structure, the labeled graph can be used to model much more complicated substructure patterns among data. Given a graph dataset D = {G0, G1, ..., Gn}, support(g) denotes the number of graphs in D in which g is a subgraph. The problem of frequent subgraph mining is to find every subgraph g such that support(g) ≥ minSup, where minSup is a minimum support threshold. To reduce the complexity of the problem, and because the hidden structures of interest are connected in most situations, only frequent connected subgraphs are studied in this paper.
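The definition of support(g) can be made concrete with a small sketch. The code below is not gSpan and is not taken from the paper: it is a brute-force illustration, under the assumption that graphs are undirected with labeled vertices and edges, and that minSup is given as an absolute count. The names `LabeledGraph`, `contains`, `support`, and `is_frequent` are hypothetical and chosen only for this example.

```python
from itertools import permutations


class LabeledGraph:
    """Undirected graph with vertex labels and labeled edges (illustrative only)."""

    def __init__(self, vertex_labels, edges):
        # vertex_labels[i] is the label of vertex i
        self.vlabels = list(vertex_labels)
        # edges is a list of (u, v, edge_label) triples
        self.edges = list(edges)
        # lookup table: unordered vertex pair -> edge label
        self.elabels = {frozenset((u, v)): lab for u, v, lab in self.edges}


def contains(big, small):
    """Brute-force check that `small` is a (not necessarily induced) subgraph of `big`.

    Tries every injective mapping of small's vertices into big's vertices,
    so it is exponential and only suitable for tiny examples.
    """
    n = len(small.vlabels)
    for image in permutations(range(len(big.vlabels)), n):
        # vertex labels must match under the mapping
        if any(small.vlabels[i] != big.vlabels[image[i]] for i in range(n)):
            continue
        # every edge of small must exist in big with the same label
        ok = True
        for u, v, lab in small.edges:
            key = frozenset((image[u], image[v]))
            if key not in big.elabels or big.elabels[key] != lab:
                ok = False
                break
        if ok:
            return True
    return False


def support(dataset, g):
    """Number of graphs in the dataset that contain g as a subgraph."""
    return sum(1 for G in dataset if contains(G, g))


def is_frequent(dataset, g, min_sup):
    """True when support(g) >= min_sup (min_sup assumed to be an absolute count)."""
    return support(dataset, g) >= min_sup


# A toy dataset of three small graphs (labels are made up for illustration).
D = [
    LabeledGraph(["C", "C", "O"], [(0, 1, "single"), (1, 2, "double")]),
    LabeledGraph(["C", "O"], [(0, 1, "double")]),
    LabeledGraph(["C", "C"], [(0, 1, "single")]),
]
g = LabeledGraph(["C", "O"], [(0, 1, "double")])

print(support(D, g))         # number of graphs in D containing g
print(is_frequent(D, g, 2))  # whether g meets a threshold of 2
```

In this toy dataset the pattern `g` occurs in the first two graphs only, so its support is 2. The exhaustive mapping search makes the cost of each containment test visible, which is why practical miners try to avoid redundant tests.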
The kernel of frequent subgraph mining is the subgraph isomorphism test. Many well-known pairwise isomorphism testing algorithms have been developed, but the frequent subgraph mining problem itself has not been explored well. Recently, Inokuchi et al. [4] proposed an Apriori-based algorithm, called AGM, to discover all frequent substructures, both connected and disconnected. Kuramochi and Karypis [5] further developed the idea using an adjacency representation of graphs and an edge-growing strategy. Their algorithm, called FSG, is able to find all frequent connected subgraphs in a chemical compound dataset in 10 minutes with 6.5% minimum support.