Analysis of Text Classification Algorithm Based on Imbalanced Data Sets
- About 49,000 characters
- About 51 pages
- Published 2018-05-18 in Shanghai
... filtering the imbalance of the data distribution to give a relatively balanced data set, which is used to train a classifier. Random oversampling typically leads to over-fitting in classification, while random under-sampling cannot avoid deleting some samples that play an important role in classification, which may degrade the classification result. An improved combined re-sampling method is therefore proposed: it uses SMOTE, which often performs well, for oversampling, together with an under-sampling method based on an improved clustering algorithm. The experimental results show that the new method produces a better classification result.

Keywords: imbalanced data sets, text classification, CHI-square selection method, data distribution, re-sampling

Contents
Chinese Abstract
English Abstract
1 Introduction
1.1 Research background and significance
1.2 Research status at home and abroad
1.3 Research content and chapter structure of this thesis
2 Technologies related to text classification
2.1 Definition and process of text classification
2.1.1 Definition of text classification
2.1.2 Basic process of text classification
2.2 Text representation models
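The combined re-sampling idea described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the "improved clustering algorithm" is not shown in this excerpt, so plain k-means is swapped in for the under-sampling step, and the SMOTE step is the basic interpolation variant. The function names (`smote`, `kmeans_undersample`, `combined_resample`) and the `target` parameter are hypothetical, and samples are assumed to be numeric feature vectors (for example, term weights after feature selection), with at least two minority samples and `target` no larger than the majority class size.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Basic SMOTE: build n_new synthetic points, each interpolated between a
    minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority samples, ordered by distance to the chosen base point.
        others = sorted((p for p in minority if p is not base),
                        key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def _assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        i = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        clusters[i].append(p)
    return clusters


def kmeans_undersample(majority, n_keep, iters=20, rng=None):
    """Under-sample with plain k-means (a stand-in for the thesis's improved
    clustering): keep, from each non-empty cluster, the real sample closest
    to the centroid. Returns at most n_keep samples."""
    rng = rng or random.Random(0)
    centroids = [list(p) for p in rng.sample(majority, n_keep)]
    for _ in range(iters):
        for i, members in enumerate(_assign(majority, centroids)):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    kept = []
    for c, members in zip(centroids, _assign(majority, centroids)):
        if members:
            kept.append(min(members, key=lambda p: math.dist(p, c)))
    return kept


def combined_resample(minority, majority, target, k=3, seed=0):
    """Grow the minority class to `target` samples with SMOTE and shrink the
    majority class to at most `target` samples with cluster-based selection."""
    rng = random.Random(seed)
    new_minority = minority + smote(minority, target - len(minority), k, rng)
    new_majority = kmeans_undersample(majority, target, rng=rng)
    return new_minority, new_majority
```

Calling `combined_resample(minority, majority, target=8)` on a 4-sample minority class and a 20-sample majority class returns 8 minority samples (the 4 originals followed by 4 synthetic ones) and up to 8 representative majority samples. Keeping real samples near cluster centres, rather than dropping samples at random, is one way to reduce the risk the abstract mentions of deleting samples that matter for classification.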