Proceedings of the Fourth SIAM International Conference on Data Mining
Michael W. Berry
SIAM, Jan 1, 2004 - 537 pages
The Fourth SIAM International Conference on Data Mining continues the tradition of providing an open forum for the presentation and discussion of innovative algorithms as well as novel applications of data mining. This is reflected in the talks by the four keynote speakers, who discuss data usability issues in systems for data mining in science and engineering, issues raised by new technologies that generate biological data, ways to find complex structured patterns in linked data, and advances in Bayesian inference techniques. The proceedings include 61 research papers.
Contents
Making Time-Series Classification More Accurate Using Learned Constraints | 11 |
A New Model for Clustering Linear Sequences | 23 |
Nonlinear Manifold Learning for Data Stream | 33 |
Text Mining from Site Invariant and Dependent Features for Information Extraction | 45 |
Constructing Time Decompositions for Analyzing Time Stamped Documents | 57 |
Equivalence of Several Two-Stage Methods for Linear Discriminant Analysis | 69 |
A Framework for Discovering Colocation Patterns in Data Sets with Extended Spatial | 78 |
A Top-Down Method for Mining Most Specific Frequent Patterns in Biological Sequences | 90 |
Using Support Vector Machines for Classifying Large Sets of Multi-Represented Objects | 102 |
Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data | 114 |
Training Support Vector Machine Using Adaptive Clustering | 126 |
IREP++, A Faster Rule Learning Algorithm | 138 |
A Single Pass Generalized Incremental Algorithm for Clustering | 147 |
A Distributed Tool for Constructing Summaries of High-Dimensional Discrete | 154 |
Basic Association Rules | 166 |
Hierarchical Clustering for Thematic Browsing and Summarization of Large Sets | 178 |
An Abstract Weighting Framework for Clustering Algorithms | 200 |
Linear Regression and Classification | 222 |
Density-Connected Subspace Clustering for High-Dimensional Data | 246 |
Clustering Categorical Data Using the Correlated-Force Ensemble | 269 |
Enhancing Communities of Interest Using Bayesian Stochastic Blockmodels | 291 |
DOM-Based Information Space Adsorption for Web Information Hierarchy Mining | 312 |
Active Semi-Supervision for Pairwise Constrained Clustering | 333 |
A General Probabilistic Framework for Mining Labeled Ordered Trees | 357 |
A Mixture Model for Clustering Ensembles | 379 |
Visualizing RFM Segmentation | 391 |
Visually Mining through Cluster Hierarchies | 400 |
Class-Specific Ensembles for Active Learning in Digital Imagery | 412 |
Mining Text for Word Senses Using Independent Component Analysis | 422 |
Accelerating Closed Itemset Mining by Deeply Pushing the Length-Decreasing | 432 |
A Recursive Model for Graph Mining | 442 |
Text Mining Using Nonnegative Matrix Factorizations | 452 |
The Aspect Bernoulli Model | 462 |
Iterative Feature and Data Clustering | 472 |
A Foundational Approach to Mining Itemset Utilities from Databases | 482 |
Reservoir-Based Random Sampling with Replacement from Data Stream | 492 |
Classifying Documents without Labels | 502 |
Subspace Clustering of High Dimensional Data | 517 |
Mining Patterns of Activity from Video Data | 532 |
Common terms and phrases
accuracy analysis applied approach association rules attributes binary classification clustering algorithm co-clustering co-location patterns column component computed concept graph conditional probability constraints corresponding Data Mining data set data stream database decomposition defined denoted density distance distribution document set domain efficient EM algorithm ensemble Euclidean distance evaluation extraction feature Figure frequent patterns genes hierarchical IEKA information preserving input International Conference interval IREP++ ISOMAP iteration k-means k-means algorithm kernel keywords Knowledge Discovery labels Lemma linear Machine Learning matrix measure measure function method minimal Mixture Density mixture model money money money node number of clusters objective function obtained optimal parameters partition performance points problem Proc pruning random RIPPER Section sequence similar space spatial statistical subgraph subintervals subset subspace support vector support vector machines Table techniques text mining TOMMS training data training examples transaction tree unsupervised learning UPGMA values wrapper