The X-Tree: An Index Structure for High Dimensional Data

  • About 7,000 words
  • About 33 pages
  • Published 2017-03-09 in Shanghai


The X-Tree: An Index Structure for High Dimensional Data

Outline

  • Introduction
  • Problems of R-tree based structures
  • X-tree Structure
  • X-tree Algorithms
  • Overlap-Minimal Split
  • Performance Evaluation

Introduction

  • Objective - to index point and spatial data in high-dimensional space
  • Dimensions - a few tens to hundreds
  • Hyper-rectangles
  • Fields - CAD, molecular biology
  • Improves upon the R*-tree
  • Approach
      • 'Minimal Overlap Split'
      • Directory structure organization - 'Supernodes'
  • Performance is better than the R*-tree and the TV-tree by 2 orders of magnitude

Previous work (on High Dimensional Data)

  • Reduce dimensionality - two basic approaches:
      • Data is highly clustered and correlated
          • It occupies only some of the space
          • Algorithms transform it to a lower dimension
          • Index using traditional multi-dimensional index structures
          • E.g. SS-tree
      • A small number of dimensions contain most of the information
          • E.g. TV-tree
  • BUT… the reduced dimensions may still be too high

Problem with R* tree

  • Why R*-trees?
      • Handles both point and spatial data
      • Spatial data is not transformed to point data
  • Performance deteriorates rapidly with dimension
  • Detailed evaluations found that overlap in the directory increases rapidly with growing dimensionality
  • Dimension = 5, Overlap = 90%
  • Overlap ↑, Query performance ↓

Defining Overlap

  • Intuitively - percentage of volume covered by more than one directory hyper-rectangle
  • Overlap - an R-tree node contains n hyper-rectangles {R1, R2, … Rn}
  • Overlap directly corresponds to query performance (only if query objects are uniformly distributed)
  • Query distribution is estimated by the data distribution
  • In high dimensional data, queries and data are clustered

Defining Overlap (contd)

  • Weighted Overlap
      • More accurate
      • Percentage of data objects in overlapping space

Defining Overlap (contd)

  • Multi Overlap - how many Ri's are in the overlapping space?

Overlap in R* Tree

  • Dimensionality ↑, Overlap ↑
  • So multiple paths need to be searched for each query

X-Tree - eXtended node tree

  • Goal - Efficient query processing of high di
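The three overlap notions from the "Defining Overlap" slides can be illustrated with a small sketch. It is not taken from the slides: the function names (`cover_count`, `weighted_overlap`, `volume_overlap`), the representation of a hyper-rectangle as a (lower corner, upper corner) pair, and the Monte Carlo estimate of the volume ratio are all assumptions made here for illustration. It also assumes that "percentage" is measured relative to the region (or the data objects) covered by at least one rectangle.

```python
import random
from typing import List, Sequence, Tuple

# Assumed representation: (lower corner, upper corner), one coordinate per dimension.
Rect = Tuple[Sequence[float], Sequence[float]]


def cover_count(point: Sequence[float], rects: List[Rect]) -> int:
    """Number of rectangles containing the point (the 'multi overlap' count)."""
    return sum(
        all(lo[d] <= point[d] <= hi[d] for d in range(len(point)))
        for lo, hi in rects
    )


def weighted_overlap(points: List[Sequence[float]], rects: List[Rect]) -> float:
    """Fraction of covered data points that fall in more than one rectangle."""
    counts = [cover_count(p, rects) for p in points]
    covered = sum(c >= 1 for c in counts)
    shared = sum(c >= 2 for c in counts)
    return shared / covered if covered else 0.0


def volume_overlap(rects: List[Rect], samples: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of (volume covered by >= 2 rects) / (volume covered by >= 1)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    dims = len(rects[0][0])
    # Sample uniformly inside the bounding box of all rectangles.
    lo = [min(r[0][d] for r in rects) for d in range(dims)]
    hi = [max(r[1][d] for r in rects) for d in range(dims)]
    covered = shared = 0
    for _ in range(samples):
        p = [rng.uniform(lo[d], hi[d]) for d in range(dims)]
        c = cover_count(p, rects)
        covered += c >= 1
        shared += c >= 2
    return shared / covered if covered else 0.0


# Two 2-D rectangles sharing the unit square [1, 2] x [1, 2].
rects = [((0.0, 0.0), (2.0, 2.0)), ((1.0, 1.0), (3.0, 3.0))]
points = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (1.2, 1.8)]

print(volume_overlap(rects))            # estimate of 1/7, the exact area ratio
print(weighted_overlap(points, rects))  # 2 of the 4 points lie in both rectangles
```

Sampling keeps the sketch short and works unchanged in any number of dimensions, but in high-dimensional space the rectangles may fill only a tiny part of their bounding box, so an exact volume computation would be preferable for real measurements.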
