- About 9,640 characters
- About 33 pages
- Published 2017-03-09 in Shanghai
Vincent Blondel and Paul Van Dooren, CESAME
Web searching and graph similarity
Vincent Blondel and Paul Van Dooren*
CESAME, Université catholique de Louvain
http://www.inma.ucl.ac.be/
* Thanks to P. Senellart
GAMM, 2003

The web graph
- Nodes = web pages, edges = hyperlinks between pages
- About 3 billion pages (Google searched 3,083,324,625 web pages in 2002)
- Average of 7 outgoing links per page
- Growth of a few percent every month

Outline
1. Structure of the web
2. Methods for searching the web (Google's PageRank and Kleinberg's HITS)
3. Similarity in graphs
4. Application to synonym extraction (Blondel-Senellart)

Structure of the web
- Experiments: two crawls over 200 million pages in 1999 found a giant strongly connected component (the core)
- It contains most prominent sites
- It contains 30% of all pages
- The average distance between nodes is 16
- Small world
Ref: Broder et al., Graph structure in the web, WWW9, 2000

The web is a bowtie
Ref: The web is a bowtie, Nature, May 11, 2000

In- and out-degree distributions
- Power law distribution: the number of pages of in-degree n is proportional to 1/n^2.1 (Zipf law)

A score for every page
- The score of a page is high if the page has many incoming links coming from pages with a high page score
- One browses from page to page by following outgoing links with equal probability
- Score = frequency with which a page is visited
- … some pages may have no outgoing links
- … many pages have zero frequency

PageRank: teleporting random score
- The surfer follows a path by choosing an outgoing link with probability p/d_out(i)
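The random-surfer score described in the slides can be sketched as a power iteration. This is only a minimal illustration, not the authors' implementation. The slides state that an outgoing link of page i is followed with probability p/d_out(i). Everything else here is an assumption taken from the usual PageRank formulation: with the remaining probability 1 - p the surfer jumps to a uniformly random page, a page without outgoing links also jumps uniformly, and p = 0.85. The function name `pagerank` and the four-page graph `web` are hypothetical.

```python
def pagerank(out_links, p=0.85, tol=1e-10, max_iter=200):
    """Approximate visit frequencies of a teleporting random surfer.

    out_links maps every page to the list of pages it links to
    (every linked page must itself be a key).
    p is the probability of following a link; 0.85 is an assumed value.
    """
    pages = sorted(out_links)
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(max_iter):
        # Assumption: pages with no outgoing links jump to a uniform random page.
        dangling = sum(score[i] for i in pages if not out_links[i])
        # Assumption: with probability 1 - p the surfer teleports uniformly.
        base = (1.0 - p) / n + p * dangling / n
        new = {page: base for page in pages}
        for i in pages:
            targets = out_links[i]
            for j in targets:
                # Each outgoing link of i is followed with probability p / d_out(i).
                new[j] += p * score[i] / len(targets)
        delta = sum(abs(new[k] - score[k]) for k in pages)
        score = new
        if delta < tol:
            break
    return score


# Hypothetical four-page web: D has no incoming links.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph page D has no incoming links, so without teleporting its visit frequency would be zero, which is the problem the slides point out. With teleporting it keeps the small baseline score (1 - p)/n.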