
Part-of-Speech Tagging (Foreign English-Language Material).doc


Once the part-of-speech tag set has been fixed and the part of speech of every word in the dictionary has been determined, the process of converting an input word string into the corresponding string of part-of-speech tags is called part-of-speech tagging.

Part-of-speech tagging has two problems to solve: (1) how to determine the part of speech of a word in context; (2) how to guess the part of speech of an unregistered (out-of-vocabulary) word.

Influence of multi-category words on syntactic analysis: although multi-category words make up only a small proportion of the vocabulary, they occur with high frequency, so they directly affect syntactic analysis.

Part-of-speech tagging methods: probabilistic methods; tagging based on the hidden Markov model; methods that learn rules by machine learning; transformation-based, error-driven tagging.

Part-of-speech tagging from the viewpoint of a statistical model: given a word string $W = w_1 w_2 \ldots w_n$, let $T = t_1 t_2 \ldots t_n$ be a string of part-of-speech tags for $W$. Tagging $W$ is defined as the process of finding $T$ given $W$ and a lexicon annotated with parts of speech. Suppose that $W$ has multiple possible tag strings $T_1, T_2, \ldots, T_i$. Tagging $W$ then means finding the tag string that maximizes $P(T \mid W)$ for the known $W$:

$$T = \arg\max_{T_i} P(T_i \mid W)$$

For example, in the word string the/this / / report/editor / / / some words have more than one part-of-speech tag (multi-category words), so the word string has more than one corresponding tag string. The total number of candidate taggings is the product of the numbers of candidate tags of the individual words: 4 × 1 × 1 × 2 × 2 × 2 × 3 = 96. The task of part-of-speech tagging is to find, among these possibilities, the most likely tag string T for the word string. The tag string corresponding to the example above is PRVNVMQ.

For a part-of-speech tagging system, the tag string judged most likely is not necessarily the correct one; it may be wrong. To make it convenient, do the follow
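The counting argument and the argmax definition above can be illustrated with a minimal Python sketch. Only the per-word tag counts (4, 1, 1, 2, 2, 2, 3) come from the text. The placeholder tag indices, the function names, and in particular `toy_score` are assumptions made for illustration: `toy_score` is a hypothetical stand-in for P(T | W), not a trained model.

```python
from itertools import product
from math import prod

# Number of candidate tags per word, as given in the example in the text.
TAG_COUNTS = [4, 1, 1, 2, 2, 2, 3]


def candidate_sequences(tag_counts):
    """Enumerate every possible tag sequence.

    Tags are placeholder indices 0..n-1 per word (hypothetical names,
    since the text does not list the candidate tags of each word).
    """
    return product(*(range(n) for n in tag_counts))


def toy_score(tags):
    """Hypothetical stand-in for P(T | W); NOT a real model.

    A tag with index j contributes the factor 1 / (j + 1), so the
    sequence made of index-0 tags scores highest.
    """
    return prod(1.0 / (j + 1) for j in tags)


def best_tagging(tag_counts, score):
    """Brute-force argmax: return the candidate sequence with the top score."""
    return max(candidate_sequences(tag_counts), key=score)


total = prod(TAG_COUNTS)
print(total)                                 # 96
print(best_tagging(TAG_COUNTS, toy_score))   # (0, 0, 0, 0, 0, 0, 0)
```

Exhaustive enumeration is workable for 96 candidates, but the number of candidates grows multiplicatively with sentence length, which is one reason practical taggers do not search this way.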
