- About 38,300 words
- About 49 pages
- Published 2019-05-14 in Shanghai
Research on Feature Selection Methods for High-Dimensional Data Mining
Title: The Research on Feature Selection Methods in High Dimensional Data Mining
Major: Computer Software and Theory
Name: Sun Jinghan
Supervisor: Yin Ban
Abstract
Data mining is one of the frontiers of research in the fields of databases and decision support systems (DSS). High-dimensional data is frequently encountered in practice, and as it increasingly becomes the mainstream form of real-life data, research on high-dimensional data mining grows more and more important. However, mining high-dimensional data is extremely difficult because of its particular characteristics. Therefore, specific methods must be adopted to solve these problems.
This paper starts with the concept of data mining and the characteristics of high-dimensional data, and takes feature selection methods in high-dimensional data mining as its core content. Feature selection methods for text data and for gene expression data are discussed in turn.
To address problems in text clustering with the k-means algorithm, such as the curse of dimensionality, sparse vectors, and the random selection of initial centers, we propose an improved k-means algorithm for text clustering. The improvements are feature selection, sparse vector filtering, and an initial center search based on density and scattering. Experiments on the 20 Newsgroups dataset show that the proposed algorithm outperforms the standard k-means algorithm in both clustering quality and stability.
For gene expression data, we propose a new feature selection method in which a genetic algorithm performs the feature subset search, and the fitness of each feature subset is evaluated by a separability measure based on boundary points. The experimental results show that the proposed algorithm can find feature subsets with good separability, which yields low-dimensional data and good classification accuracy. Based on the research mentioned above, another feature selection method
combining filter and
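As an illustration of the density- and scattering-based initial center search mentioned above for text clustering, the following is a minimal sketch, not the thesis's actual procedure. The abstract does not give the scoring rule, so the function name `initial_centers`, the `radius` parameter, and the score (neighbor count times distance to the nearest already-chosen center) are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_centers(points, k, radius):
    """Pick k initial centers that are both dense and far apart.

    Assumed rule (hypothetical, for illustration only):
    density = number of other points within `radius`;
    the first center is the densest point; each later center
    maximizes (density + 1) * distance to the nearest chosen center.
    """
    density = [
        sum(1 for j, q in enumerate(points)
            if j != i and euclidean(p, q) <= radius)
        for i, p in enumerate(points)
    ]
    # Start from the densest point (ties resolved by lowest index).
    chosen = [max(range(len(points)), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            spread = min(euclidean(points[i], points[c]) for c in chosen)
            return (density[i] + 1) * spread
        candidates = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return [points[i] for i in chosen]


# Two well-separated toy clusters in 2-D.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10)]
centers = initial_centers(points, k=2, radius=1.5)
```

Unlike random initialization, this choice is deterministic, so repeated runs of k-means start from the same centers. That is one plausible reading of the stability improvement the abstract reports.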