Proceedings of the Fifth SIAM International Conference on Data Mining
Hillol Kargupta
SIAM, April 1, 2005 - 648 pages
The Fifth SIAM International Conference on Data Mining continues the tradition of providing an open forum for the presentation and discussion of innovative algorithms as well as novel applications of data mining. Advances in information technology and data collection methods have led to the availability of large data sets in commercial enterprises and in a wide variety of scientific and engineering disciplines. The field of data mining draws upon extensive work in areas such as statistics, machine learning, pattern recognition, databases, and high performance computing to discover interesting and previously unknown information in data. This conference presents results in data mining, including applications, algorithms, software, and systems.
Contents
A Random Walks Perspective on Maximizing Satisfaction and Profit | 12 |
Surveying Data for Patchy Structure | 20 |
Chris Ding and Jieping | 32 |
Summarizing and Mining Skewed Data Streams | 44 |
Online Analysis of Community Evolution in Data Streams | 56 |
Mining Frequent Itemsets from Data Streams with a Time-Sensitive Sliding Window | 68 |
On Abnormality Detection in Spuriously Populated Data Streams | 80 |
Privacy-Preserving Classification of Customer Data without Loss of Accuracy | 92 |
A Load Shedding Scheme for Classifying Data Streams | 346 |
Variational Learning for Noisy-OR Component Analysis | 370 |
A New Statistic for the Structural Break Detection in Time Series | 392 |
Efficient Mining of Maximal Sequential Patterns Using Multiple Samples | 415 |
Correlation Clustering for Learning Mixtures of Canonical Correlation Models | 439 |
Mining Iceberg Cubes from Data Warehouses | 461 |
Decision Tree Induction in High Dimensional Hierarchically Distributed Databases | 466 |
Sparse Fisher Discriminant Analysis for Computer Aided Detection | 476 |
A Feasible Approach for Inverse | 103 |
On Variable Constraints in Privacy Preserving Data Mining | 115 |
Clustering with Model-Level Constraints | 126 |
Feasibility Issues and the k-Means Algorithm | 138 |
A Cutting Algorithm for the Minimum Sum-of-Squared Error Clustering | 150 |
Dynamic Classification of Defect Structures in Molecular Dynamics Simulation Data | 161 |
Simultaneous Mining of Positive and Negative Spatial | 173 |
Finding Young Stellar Populations in Elliptical Galaxies from Independent Components | 183 |
Hybrid Attribute Reduction for Classification Based on a Fuzzy Rough Set Technique | 195 |
Lazy Learning for Classification Based on Query Projections | 227 |
Depth-First Non-Derivable Itemset Mining | 250 |
A Spectral Clustering Approach to Finding Communities in Graphs | 274 |
Learning to Refine Ontology for a New Web Site Using a Bayesian Approach | 298 |
Exploiting Geometry for Support Vector Machine Indexing | 322 |
Making Data Mining Models Useful to Model Nonpaying Customers of Exchange Carriers | 486 |
Cluster Validity Analysis of Alternative Results from Multiobjective Optimization | 496 |
Three Myths about Dynamic Time Warping Data Mining | 506 |
Mining Top-K Itemsets over a Sliding Window Based on Zipfian Distribution | 516 |
On Clustering Binary Data | 526 |
Pushing Feature Selection Ahead of Join | 536 |
Discretization Using Successive Pseudo Deletion at Maximum Information Gain | 546 |
An Algorithm-Independent Approach | 556 |
The Best Nurturers in Computer Science Research | 566 |
Knowledge Discovery from Heterogeneous Dynamic Systems Using Change-Point | 571 |
Near-Neighbor Search in Pattern Distance Spaces | 586 |
Indexing Sequences by Sequential Pattern Analysis | 601 |
Symmetric Statistical Translation Models for Automatic Image Annotation | 616 |
A Flexible Approximation Scheme from Clustered Term-Document Matrices | 631 |
Common terms and phrases
accuracy algorithm analysis applied approach approximation association rules attribute Bayesian candidate class label classification component compute concepts consider constraints correlation corresponding CPDs CUSUMS data mining data points data streams database dataset decision tree defined denoted detection Dirichlet Distributions distance distribution domain efficient entropy error rate estimation evaluated example experiments feature selection feature values Figure framework frequent itemsets function given graph hyperplane iterative k-means K-means algorithm kernel computation kernel matrix lazy learning linear load shedding Loadstar Machine Learning Markov method MSPX nodes number of clusters ontology optimal paper parameter sharing partial orders partition patterns performance prediction problem Proc proposed pruning QPAL query random RelDC rough set sample scheme Section sequence similar solution statistics structure subset support vector support vector machines Table techniques text fragments Theorem threshold top-k transactions weights