- 2017-08-07 发布于浙江
Unexpected Productions May Well be Errors
Tylman Ule
Seminar für Sprachwissenschaft, Universität Tübingen
ule@sfs.uni-tuebingen.de

Kiril Simov
Linguistic Modelling Laboratory, Bulgarian Academy of Sciences
kivs@
Abstract
We present a method for detecting annotation errors in treebanks. It assumes that errors are unexpected small tree fragments. We generate
statistics over configurations of these fragments using a standard statistical test. We use the test result and the characteristics of their
distributions as features to classify unseen configurations as likely errors via machine learning. Evaluation shows that the resulting list
of error candidates is reliable, independent of corpus size, annotation quality, and target language.
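As a rough illustration of the first step described in the abstract, the following minimal sketch counts productions in a toy treebank and flags those that are statistically unexpected. Several points are assumptions made for illustration and are not taken from the paper: productions (a mother label with its sequence of daughter labels) stand in for the "small tree fragments", a hand-written Pearson chi-square on a 2x2 table stands in for the "standard statistical test", and the nested-tuple tree format, the function names, and the thresholds `max_share` and `min_chi2` are hypothetical. The machine learning step that classifies unseen configurations is not shown.

```python
from collections import Counter


def productions(tree):
    """Yield (mother, daughters) for every node that has labelled children.

    A tree is a nested tuple (label, child, ...); a plain string child is a
    word, so preterminals such as ("DT", "the") yield no production.
    """
    label, *children = tree
    kids = [c for c in children if isinstance(c, tuple)]
    if kids:
        yield label, tuple(k[0] for k in kids)
        for k in kids:
            yield from productions(k)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def error_candidates(treebank, max_share=0.1, min_chi2=3.84):
    """Return productions whose mother label is rare for its daughters.

    max_share and min_chi2 are illustrative thresholds (assumptions).
    The chi-square value measures association strength in either
    direction; the share filter keeps only the rare combinations.
    """
    prods = Counter(p for t in treebank for p in productions(t))
    by_mother, by_daughters = Counter(), Counter()
    for (mother, daughters), k in prods.items():
        by_mother[mother] += k
        by_daughters[daughters] += k
    n = sum(prods.values())

    found = []
    for (mother, daughters), a in prods.items():
        b = by_daughters[daughters] - a  # same daughters, other mother
        c = by_mother[mother] - a        # same mother, other daughters
        d = n - a - b - c                # everything else
        share = a / by_daughters[daughters]
        chi2 = chi_square_2x2(a, b, c, d)
        if share <= max_share and chi2 >= min_chi2:
            found.append(((mother, daughters), a, share, chi2))
    return sorted(found, key=lambda row: row[2])
```

Called on a list of trees, `error_candidates` returns tuples of the form `(production, count, share, chi2)`, rarest first. With small counts the chi-square approximation is unreliable, so a ranking like this is only a starting point for manual inspection.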
Setting up language resources involves considerable effort, because human intervention is inevitable and costly. Human annotators are essential, because they usually outperform automatic methods in terms of annotation accuracy, but they still make their own kind of errors. In addition to genuine mistakes, they do not always behave identically each time when presented with the same infrequent problem. Thus one can expect a number of errors to be present [...]

[...] are especially useful when pattern-based approaches are not easily applicable, because patterns are difficult to find. We present such a non-symbolic method that attacks errors and inconsistencies in structural annotation, and that shows good performance across languages and annotation schemes. We detect errors and inconsistencies that appear as unexpected events in a corpus using a variant of Directed