Clustering and Information Retrieval

Weili Wu, Hui Xiong, S. Shekhar

Springer Science & Business Media, Dec 1, 2013 - 330 pages

Clustering is an important technique for discovering relatively dense sub-regions or sub-spaces of a multidimensional data distribution. Clustering has been used in information retrieval for many different purposes, such as query expansion, document grouping, document indexing, and visualization of search results. In this book, we address issues of clustering algorithms, evaluation methodologies, applications, and architectures for information retrieval. The first two chapters discuss clustering algorithms. The chapter by Baeza-Yates et al. describes a clustering method for a general metric space, a common model of data relevant to information retrieval. The chapter by Guha, Rastogi, and Shim presents a survey as well as a detailed discussion of two clustering algorithms: CURE for numeric data and ROCK for categorical data. Evaluation methodologies are addressed in the next two chapters. Ertoz et al. demonstrate the use of text retrieval benchmarks, such as TREC, to evaluate clustering algorithms. He et al. provide objective measures of clustering quality in their chapter. Applications of clustering methods to information retrieval are addressed in the next four chapters. Chu et al. and Noel et al. explore feature selection using word stems, phrases, and link associations for document clustering and indexing. Wen et al. and Sung et al. discuss applications of clustering to user queries and data cleansing. Finally, we consider the problem of designing architectures for information retrieval. Crichton, Hughes, and Kelly elaborate on the development of a scientific data system architecture for information retrieval.


Contents

A Robust Clustering Algorithm for Categorical Attributes

Clustering from an Optimization Perspective

On Quantitative Evaluation of Clustering Systems

Techniques for Textual Document Indexing and Retrieval

Introduction

Document Clustering Visualization and Retrieval

Query Clustering in the Web Context

Clustering Techniques for Large Database Cleansing

A Science Data System Architecture


Other editions

Clustering and Information Retrieval
Weili Wu, Hui Xiong, Shashi Shekhar
2003

Clustering and Information Retrieval
Weili Wu, Hui Xiong, Shashi Shekhar
2011

Clustering and Information Retrieval
Weili Wu, Hui Xiong, S. Shekhar
2014

Common terms and phrases

approach approximate ART-C cardinality categorical attributes centroid Charikar Chernoff bounds clicks cluster compactness cluster separation clustering method clustering system co-citation common compared comparison methods complexity contains criterion function data cleansing Data Mining data points data set defined different clusters distribution document clustering document similarity domain duplicate records edit distance evaluation example Figure frequent itemsets granulation hierarchical clustering hypernym identify IEEE information retrieval input IRSS itemset supports Jaccard coefficient k-means K-median Keratoconus keywords large databases merged cluster metric space minsup nearest neighbor graph node number of clusters number of links number of points outliers pair of clusters pair of points pairwise parameter partition phrases Priority Queue problem Proceedings product server profile server query clustering query expansion query sessions relevant search engines semantic similarity function string subset Symposium technique threshold transactions vector weight word stems

Bibliographic information

Title	Clustering and Information Retrieval, Volume 11 of Network Theory and Applications
Editors	Weili Wu, Hui Xiong, S. Shekhar
Edition	illustrated
Publisher	Springer Science & Business Media, 2013
ISBN	1461302277, 9781461302278
Length	330 pages
