Posted on 2019-08-18 from Tianjin
An Algorithm for Text Classification Based on KNN-CORE.PDF
ISSN 1009-3044    E-mail: xsjl@
Computer Knowledge and Technology (电脑知识与技术), Vol. 8, No. 7, March 2012
Tel: +86-551-5690963, 5690964
An Algorithm for Text Classification Based on KNN
YU Yue-meng, HUANG Xiao-bin
(School of Information Science and Engineering, Xiamen University, Xiamen, Fujian 361005, China)
Abstract: KNN (K-Nearest Neighbor) is one of the best text classification algorithms in the vector space model. However, when the sample set is large and the text vectors are high-dimensional, both the efficiency and the accuracy of KNN classification drop sharply. This paper proposes an improved algorithm that raises the classification efficiency of KNN, together with an improved similarity computation that judges high-dimensional text vectors over large sample sets more accurately. During training, the algorithm computes the distribution range of each class of texts in the vector space. During classification, it narrows the K-nearest-neighbor search range according to where the text vector to be classified lies in the sample space. Experiments confirm that the improved algorithm significantly increases classification efficiency while keeping the classification performance of KNN essentially unchanged.
Keywords: text classification; K-nearest neighbor (KNN); algorithm
CLC number: TP301    Document code: A    Article ID: 1009-3044(2012)07-1564-03
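The idea summarized in the abstract (record where each class lies during training, then search for neighbors only among the classes whose range covers the new text) can be illustrated with a minimal Python sketch. This excerpt does not give the paper's actual definition of a class's distribution range or its improved similarity measure, so the sketch rests on assumptions: the range is modelled as a class centroid plus the lowest cosine similarity between that centroid and any class member, plain cosine similarity stands in for the paper's measure, and the names `cosine`, `train`, and `classify` are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def train(samples):
    """Build {label: (centroid, min_sim, vectors)} from (vector, label) pairs.

    ASSUMPTION: the "distribution range" of a class is its centroid together
    with the smallest centroid-to-member cosine similarity. The paper's own
    definition is not available in this excerpt.
    """
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)
    model = {}
    for label, vecs in by_label.items():
        centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
        min_sim = min(cosine(centroid, v) for v in vecs)
        model[label] = (centroid, min_sim, vecs)
    return model


def classify(model, query, k=3):
    """Vote among the k most similar samples, searching only covering classes."""
    # Keep the classes whose assumed range covers the query vector.
    candidates = [
        label
        for label, (centroid, min_sim, _) in model.items()
        if cosine(centroid, query) >= min_sim
    ]
    if not candidates:
        # No class range covers the query: fall back to ordinary full KNN.
        candidates = list(model)
    scored = [
        (cosine(vec, query), label)
        for label in candidates
        for vec in model[label][2]
    ]
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = Counter(label for _, label in top)
    return votes.most_common(1)[0][0]
```

The saving comes from the candidate filter: the query is compared against one centroid per class first, and the per-sample similarity computations run only inside the classes that pass. The fallback to a full search is a design choice of this sketch, not something the abstract specifies.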
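The rapid development of the Internet has brought humanity into the information age. The arrival of the information age has in turn caused the volume of information worldwide to grow rapidly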