Algorithms 2012, 5, 490-505; doi:10.3390/a5040490
OPEN ACCESS
algorithms
ISSN 1999-4893
/journal/algorithms
Article
The Effects of Tabular-Based Content Extraction on Patent
Document Clustering
Denise R. Koessler, Benjamin W. Martin, Bruce E. Kiefer and Michael W. Berry
EECS Department, Min H. Kao Building Suite 401, University of Tennessee, 1520 Middle Drive,
Knoxville, TN 37996, USA; E-Mails: dkoessle@; bmarti15@
Catalyst Repository Systems, 1860 Blake Street, 7th Floor, Denver, CO 80202, USA;
E-Mail: bkiefer@
* Author to whom correspondence should be addressed; E-Mail: berry@;
Tel.: +1-865-974-3838; Fax: +1-865-974-5483.
Received: 1 July 2012; in revised form: 16 August 2012 / Accepted: 9 October 2012 /
Published: 22 October 2012
Abstract: Data can be represented in many different ways within a particular document
or set of documents. Hence, attempts to automatically process the relationships between
documents or determine the relevance of certain document objects can be problematic. In
this study, we have developed software to automatically catalog objects contained in HTML
files for patents granted by the United States Patent and Trademark Office (USPTO). Once
these objects are recognized, the software creates metadata that assigns a data type to each
document object. Such metadata can be easily processed and analyzed for subsequent text
mining tasks. Specifically, document similarity and clustering techniques were applied to
a subset of the USPTO document collection. Although our preliminary results demonstrate
that tables and nu
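
As a rough illustration of the cataloging step described in the abstract, the sketch below walks an HTML string and records each recognized object together with an assigned data type. It is not the authors' software: the class `ObjectCataloger`, the function `catalog_objects`, and the three type labels (`text`, `table`, `image`) are hypothetical names chosen for this example, and real USPTO patent HTML would likely need more careful handling.

```python
from html.parser import HTMLParser


class ObjectCataloger(HTMLParser):
    """Hypothetical helper: build a flat catalog of typed document objects."""

    def __init__(self):
        super().__init__()
        self.catalog = []       # records of the form {"type": ..., "content": ...}
        self._table_depth = 0   # greater than 0 while inside a <table>
        self._cells = []        # text fragments of the table being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table_depth += 1
            if self._table_depth == 1:
                self._cells = []
        elif tag == "img":
            # Record the image by its source attribute (empty if missing).
            src = dict(attrs).get("src") or ""
            self.catalog.append({"type": "image", "content": src})

    def handle_endtag(self, tag):
        if tag == "table" and self._table_depth > 0:
            self._table_depth -= 1
            if self._table_depth == 0:
                # The outermost table is closed: emit one "table" record.
                self.catalog.append({"type": "table", "content": self._cells})

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._table_depth > 0:
            self._cells.append(text)
        else:
            self.catalog.append({"type": "text", "content": text})


def catalog_objects(html):
    """Return the list of typed object records found in an HTML string."""
    parser = ObjectCataloger()
    parser.feed(html)
    parser.close()
    return parser.catalog


sample = "<p>A widget.</p><table><tr><td>1</td><td>2</td></tr></table><img src='fig1.png'>"
records = catalog_objects(sample)
```

Each record pairs a data type with its content, so a later text mining step could, for example, keep or drop the `table` records before computing document similarity.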