Shallow Chinese Language Analysis Techniques for Internet Applications (面向互联网应用的中文浅层语言分析技术)
Research on Chinese Lexical Analysis bit
Outlook: Next Generation Web II
- From information to message
- Interactive, dynamic
- Community
- Instant messenger, online chat: MSN Messenger, ICQ, QQ

Outlook: Next Generation Web III
- From one single dimension to multi-dimension
- Text content, multimedia data
- Timestamp
- Information structure
- Relationship network, such as sender and receiver
[Diagram labels: Sender, time, Receiver, data]

Toward Next Generation Web Computing
- …
- P2P-centered Web platform
- Personalization modeling and personalized Web services
- Message full-scale mining: temporal reasoning, text mining, data mining, knowledge management, community generation …
- Customized shallow Chinese language computing would be popular.

Thanks!

ICTCLAS: HHMM Architecture
[Diagram labels: Corpus, Character String, Word graph, Class-based WS model, Role model, Training, NSP rough segment, Unknown word recognition, Optimized selection, Lexical result, Atom Segment]

ICTCLAS: Word Segmentation
Word class definition. The class ci of a word wi is:
- wi, iff wi is listed in the segmentation lexicon;
- PER, LOC, ORG, TIME or NUM, iff wi is an unknown named entity;
- STR, iff wi is an unknown symbol string;
- BEG, iff wi is the beginning of a sentence;
- END, iff wi is the end of a sentence;
- OTHER, otherwise.
Class-based segmentation model

ICTCLAS: Unknown word recognition
- In unknown word recognition, we mainly deal with named entities, such as person names, location names, organization names, and transliterations of foreign names.
- We use a two-level HMM for unknown word recognition.
- In the 1st-level HMM, we recognize person names, simple location names, transliterations of foreign names, and other proper names.
- In the 2nd-level HMM, we recognize complex location names and organization names, which usually contain some simple unknown words as their components.

ICTCLAS: Unknown word recognition (Cont.)
- We divide the role tag set into an internal tag set and an external tag set.
- The role tags in the internal tag set represent the components of the unknown words.
- The role tags in the external tag set represent the context of the unknown
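
The word class definition given under "ICTCLAS: Word Segmentation" above can be sketched as a small lookup function. This is only an illustrative sketch, not ICTCLAS code: the function name `word_class`, its parameters, the toy lexicon, and the order in which the rules are checked are all assumptions made for this example. In particular, the slides do not say how named-entity types or symbol strings are detected, so the sketch takes them as inputs.

```python
from typing import Optional, Set

# Named-entity classes listed in the word class definition.
NE_CLASSES = {"PER", "LOC", "ORG", "TIME", "NUM"}


def word_class(
    word: str,
    lexicon: Set[str],
    ne_type: Optional[str] = None,     # hypothetical: supplied by an upstream recognizer
    is_symbol_string: bool = False,    # hypothetical: supplied by an upstream tokenizer
    boundary: Optional[str] = None,    # hypothetical: "begin" or "end" for sentence markers
) -> str:
    """Map a word w_i to its class c_i following the slide's case list."""
    if boundary == "begin":
        return "BEG"
    if boundary == "end":
        return "END"
    if word in lexicon:
        # Lexicon words form their own class: c_i = w_i.
        return word
    if ne_type in NE_CLASSES:
        # Unknown named entities collapse into one class per entity type.
        return ne_type
    if is_symbol_string:
        return "STR"
    return "OTHER"


# Toy lexicon, assumed for illustration only.
lexicon = {"我", "在", "工作"}
classes = [
    word_class("", lexicon, boundary="begin"),
    word_class("我", lexicon),
    word_class("在", lexicon),
    word_class("中关村", lexicon, ne_type="LOC"),
    word_class("工作", lexicon),
    word_class("", lexicon, boundary="end"),
]
print(classes)
```

Collapsing every unknown named entity of one type into a single class means the segmentation model only needs statistics per class rather than per individual name, which is the usual motivation for a class-based model.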