## Proceedings of the Sixth SIAM International Conference on Data Mining

The Sixth SIAM International Conference on Data Mining continues the tradition of presenting approaches, tools, and systems for data mining in fields such as science, engineering, industrial processes, healthcare, and medicine. The datasets in these fields are large, complex, and often noisy. Extracting knowledge requires sophisticated, high-performance, and principled analysis techniques and algorithms, based on sound statistical foundations. These techniques in turn require powerful visualization technologies; implementations that must be carefully tuned for performance; software systems that are usable by scientists, engineers, and physicians as well as researchers; and infrastructures that support them.

### Contents

CPM: A Covariance-Preserving Projection Method | 24 |

A Latent Dirichlet Model for Unsupervised Entity Resolution | 47 |

Name Reference Resolution in Organizational Email Archives | 70 |

Mining for Outliers in Sequential Databases | 94 |

A Data Mining Approach | 118 |

K-Means Clustering over a Large Dynamic Network | 153 |

Exploring Prototypes for Classification | 176 |

Collaborative Information Extraction and Mining from Multiple Web Documents | 442 |

Cluster Description Formats, Problems and Algorithms | 464 |

Bayesian K-Means as a Maximization-Expectation Algorithm | 474 |

Cone Cluster Labeling for Support Vector Clustering | 484 |

A New Privacy-Preserving Distributed k-Clustering Algorithm | 494 |

Dissimilarity Measures for Detecting Hepatotoxicity in Clinical Trial Data | 509 |

Robust Estimation for Mixture of Probability Tables Based on β-likelihood | 519 |

Risk-Sensitive Learning via Expected Shortfall Minimization | 529 |

A Semantic Approach for Mining Hidden Links from Complementary and Non-interactive | 200 |

Mining Frequent Agreement Subtrees in Phylogenetic Databases | 222 |

Trend Relational Analysis and Grey-Fuzzy Clustering Method | 234 |

Weighted Clustering Ensembles | 258 |

A Top-Down Row Enumeration | 282 |

Discovery of Co-evolving Spatial Event Sets | 306 |

Density-Based Clustering over an Evolving Data Stream with Noise | 328 |

Efficient Mining of Temporally Annotated Sequences | 348 |

Segmentation and Dimensionality Reduction | 372 |

Item Sets That Compress | 395 |

Mining Frequent Closed Itemsets Out of Core | 419 |

Confidence Estimation Methods for Partially Supervised Relation Extraction | 539 |

Learning from Incomplete Ratings Using Non-negative Matrix Factorization | 549 |

Modeling Evolutionary Behaviors for Community-Based Dynamic Recommendation | 559 |

Data-Enhanced Predictive Modeling for Sales Targeting | 569 |

Mining and Validating Localized Frequent Itemsets with Dynamic Tolerance | 579 |

Mining Novel Association Rules from Text | 589 |

Using Compression to Identify Classes of Inauthentic Texts | 604 |

Robust Clustering for Tracking Noisy Evolving Data Streams | 619 |

Finding Sequential Patterns from Massive Number of Spatio-Temporal Events | 634 |

### Common terms and phrases

accuracy algorithm analysis applications approach approximation assigned associated average centers centroids classification closed cluster compared complete computed concepts condition consider contains corresponding cost data mining data set database defined denote described distance distribution documents edge effective efficient error estimate event example experiments extraction Figure frequent function given graph increases initial input itemsets iteration K-means knowledge label learning matrix means measure method mining node normalized objects observed optimal parameters partition patterns performance points positive possible present probability problem produce proposed prototype random references represent respectively sample segmentation selected semantic sequence shown shows similarity space statistics step string structure Table technique threshold transactions tree tuple types values variables vector weight