Random Texts Do Not Exhibit the Real Zipf’s Law-Like Rank Distribution
Ramon Ferrer-i-Cancho 1*, Brita Elvevåg 2
1 Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Barcelona, Catalonia, Spain
2 Clinical Brain Disorders Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland, United States of America
Abstract
Background: Zipf’s law states that the relationship between the frequency of a word in a text and its rank (the most
frequent word has rank 1, the 2nd most frequent word has rank 2,…) is approximately linear when plotted on a double
logarithmic scale. It has been argued that the law is not a relevant or useful property of language because simple random
texts - constructed by concatenating random characters including blanks behaving as word delimiters - exhibit a Zipf’s
law-like word rank distribution.
Methodology/Principal Findings: In this article, we examine the flaws of such putative good fits of random texts.
We demonstrate - by means of three different statistical tests - that ranks derived from random texts and ranks derived
from real texts are statistically inconsistent with the parameters employed to argue for such a good fit, even when the
parameters are inferred from the target real text. Our findings are valid for both the simplest random texts composed of
equally likely characters as well as more elaborate and realistic versions where character probabilities are borrowed from a
real text.
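The simplest random text described above (equally likely characters, with a blank acting as the word delimiter) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the alphabet, the blank probability `p_blank`, the text length, and the seed are all assumed values chosen for the example, and the statistical tests mentioned in the abstract are not reproduced here.

```python
import random
from collections import Counter

def random_text(n_chars, alphabet="abcde", p_blank=0.2, seed=0):
    """Concatenate random characters; a blank acts as the word delimiter.

    alphabet, p_blank and seed are illustrative assumptions, not values
    taken from the article.
    """
    rng = random.Random(seed)
    symbols = list(alphabet) + [" "]
    # Letters are equally likely; the blank has probability p_blank.
    p_letter = (1.0 - p_blank) / len(alphabet)
    weights = [p_letter] * len(alphabet) + [p_blank]
    return "".join(rng.choices(symbols, weights=weights, k=n_chars))

def rank_frequencies(text):
    """Return word frequencies in decreasing order (index 0 is rank 1)."""
    # str.split() with no argument discards the empty "words" that
    # consecutive blanks would otherwise produce.
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)

text = random_text(100_000)
freqs = rank_frequencies(text)

# Print the first few (rank, frequency) pairs; plotting all of them on a
# double logarithmic scale gives the rank distribution of the random text.
for rank, freq in enumerate(freqs[:10], start=1):
    print(rank, freq)
```

For the more elaborate variant, the `weights` list could instead be filled with character probabilities estimated from a real text, which is again an assumption about how one might set it up rather than a description of the authors' code.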
Conclusions/Significance: The good fit