* On the J2EE dataset, we see an improvement of about 24.7% in terms of F1-Measure. These experimental results are consistent with the results on the BR dataset.

* Finally, we summarize our work.

* In this paper, we have investigated the problem of email data cleaning and formalized it as non-text filtering and text normalization. We have proposed a cascaded approach to the task and implemented it using Support Vector Machines. Experimental results show that our approach significantly outperforms baseline methods for email data cleaning. When applying it to term extraction from emails, we observe a significant improvement in extraction accuracy.

* Another interesting direction is to make the detection adaptive to new domains. We analyzed headers in different datasets and found that they follow different patterns, so header models learned from one dataset may not work well on other datasets. We conducted a preliminary experiment: we collected a new dataset, called WWW, from the enterprise track of TREC 2005, and applied the trained SVM models directly to it. We found a drop of about 10% in terms of F1-Measure. We plan to address this problem with semi-supervised methods.

Outline
- Motivation and Problem Description
- Related Work
- Our Approach
- Implementation
- Experimental Results
- Summary

Summary
- Formalized email data cleaning as non-text filtering and text normalization
- Conducted email cleaning in a 'cascaded' approach
- Used SVM models for header, signature, program code, and extra line break detection
- Our approach significantly outperforms baseline methods
- When applied to term extraction, a significant improvement in extraction accuracy can be obtained

Thanks!

Examples of List and Table in Email

Supppoesdly this is what it does:
*New Layout
*Pentium-safe Compiled
*Fixed Various bugs in ftp scan
*Fixed R