10/10/2006
Natural Language Processing (3b)
Zhao Hai 赵海
Department of Computer Science and Engineering
Shanghai Jiao Tong University
2010-2011
zhaohai@cs.sjtu.edu.cn

Outline
Lexicons and Lexical Analysis
Collocation

Lexicons and Lexical Analysis (254) / Collocation (35) / Hypothesis Testing (1)
One difficulty that we have glossed over so far is that high frequency and low variance can be accidental. For example, if the two constituent words of a frequent bigram like "new companies" are themselves frequently occurring words (as "new" and "companies" are), then we expect the two words to co-occur a lot just by chance, even if they do not form a collocation.

Lexicons and Lexical Analysis (255) / Collocation (36) / Hypothesis Testing (2)
What we really want to know is whether two words occur together more often than chance. Assessing whether or not something is a chance event is one of the classical problems of statistics. It is usually couched in terms of hypothesis testing.

Lexicons and Lexical Analysis (256) / Collocation (37) / Hypothesis Testing (3)
We formulate a null hypothesis H0 that there is no association between the words beyond chance occurrences, compute the probability p that the event would occur if H0 were true, and then reject H0 if p is too low (typically if beneath a significance level of p < 0.05, 0.01, 0.005, or 0.001) and retain H0 as possible otherwise.

Lexicons and Lexical Analysis (257) / Collocation (38) / Hypothesis Testing (4)
How can we apply the methodology of hypothesis testing to the problem of finding collocations? We first need to formulate a null hypothesis which states what should be true if two words do not form a collocation. For such a free combination of two words we will assume that each of the words w1 and w2 is generated completely independently of the other, and so their chance of coming together is simply given by:

P(w1 w2) = P(w1) P(w2)

Lexicons and Lexical Analysis (258) / Collocation (39) / Hypothesis Testing (5)
The model imp
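The independence assumption behind the null hypothesis can be illustrated with a minimal sketch. It assumes maximum-likelihood estimates of P(w1) and P(w2) from raw token counts; the function name `independence_expectation` and the toy corpus are hypothetical and not taken from the slides. The sketch only compares the observed bigram count with the count expected under H0; it does not perform a significance test.

```python
from collections import Counter


def independence_expectation(tokens, w1, w2):
    """Return (observed, expected) counts of the bigram (w1, w2).

    The expected count assumes H0: w1 and w2 are generated independently,
    so P(w1 w2) = P(w1) * P(w2), with probabilities estimated by
    relative frequency (an assumption of this sketch).
    """
    n_tokens = len(tokens)
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))

    p_w1 = unigram_counts[w1] / n_tokens
    p_w2 = unigram_counts[w2] / n_tokens
    p_bigram_h0 = p_w1 * p_w2          # probability of the bigram under H0

    n_bigrams = n_tokens - 1           # number of adjacent word pairs
    expected = p_bigram_h0 * n_bigrams
    observed = bigram_counts[(w1, w2)]
    return observed, expected


# Toy corpus, invented purely for illustration.
toy_tokens = (
    "new companies hire new staff and old companies buy new companies"
).split()
observed, expected = independence_expectation(toy_tokens, "new", "companies")
print(f"observed={observed}, expected under H0={expected:.3f}")
```

An observed count well above the expected one is what a hypothesis test would then have to judge as significant or not; on a corpus this small the comparison is only illustrative.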