Analysis of Text Classification Algorithm Based on Imbalanced Data Sets

  • About 49,000 characters
  • About 51 pages
  • Published 2018-05-18 in Shanghai

filtering the imbalance of the data distribution to give a relatively balanced data set, which is used to train a classifier. Random oversampling tends to cause over-fitting in classification, while random under-sampling cannot avoid deleting some samples that play an important role in classification, which may degrade the classification result. An improved combined re-sampling method is therefore proposed: it uses SMOTE, which often performs well, for oversampling, together with an under-sampling method based on an improved clustering algorithm. The experimental results show that the new method produces a better classification result.

Keywords: imbalanced data sets, text classification, CHI-square selection method, data distribution, re-sampling

Contents

Chinese Abstract
English Abstract
1 Introduction
1.1 Research Background and Significance
1.2 Research Status at Home and Abroad
1.3 Research Content and Chapter Structure of This Thesis
2 Text Classification Techniques
2.1 Definition and Process of Text Classification
2.1.1 Definition of Text Classification
2.1.2 Basic Process of Text Classification
2.2 Text Representation Models
