Clustering and Information Retrieval
Weili Wu, Hui Xiong, Shashi Shekhar
Springer Science & Business Media, Nov 30, 2003 - 330 pages

Clustering is an important technique for discovering relatively dense sub-regions or sub-spaces of a multi-dimensional data distribution. Clustering has been used in information retrieval for many different purposes, such as query expansion, document grouping, document indexing, and visualization of search results. In this book, we address issues of clustering algorithms, evaluation methodologies, applications, and architectures for information retrieval.

The first two chapters discuss clustering algorithms. The chapter from Baeza-Yates et al. describes a clustering method for a general metric space, which is a common model of data relevant to information retrieval. The chapter by Guha, Rastogi, and Shim presents a survey as well as a detailed discussion of two clustering algorithms: CURE for numeric data and ROCK for categorical data.

Evaluation methodologies are addressed in the next two chapters. Ertoz et al. demonstrate the use of text retrieval benchmarks, such as TRECS, to evaluate clustering algorithms. He et al. provide objective measures of clustering quality in their chapter.

Applications of clustering methods to information retrieval are addressed in the next four chapters. Chu et al. and Noel et al. explore feature selection using word stems, phrases, and link associations for document clustering and indexing. Wen et al. and Sung et al. discuss applications of clustering to user queries and data cleansing.

Finally, we consider the problem of designing architectures for information retrieval. Crichton, Hughes, and Kelly elaborate on the development of a scientific data system architecture for information retrieval.
Common terms and phrases
approach approximate ART-C attributes centroid clicks cluster separation clustering algorithm clustering methods co-citation common compared comparison methods complexity computing concepts contains CORBA data architecture data cleansing data element Data Mining data model data points data set defined described descriptors different clusters distribution document clustering domain Dublin Core duplicate records edit distance EDRN entropy evaluation example Figure frequent itemsets granular computing granulation graph hierarchical clustering hypernym identify IEEE implementation information retrieval input interface IRSS itemset supports k-means keywords large databases measure metadata metric space nearest neighbor graph node number of clusters number of links number of points OODT outliers output pair of points parameter partition phrases Priority Queue problem Proceedings product server profile server query clustering query expansion query sessions relevant represent retrieval systems sample search engines semantic similarity function string technique threshold transactions vector word stems
Popular passages
Page 83 - The content of this work does not necessarily reflect the position or policy of the government and no official endorsement should be inferred.