Data Mining for Scientific and Engineering Applications
R.L. Grossman, C. Kamath, P. Kegelmeyer, V. Kumar, R. Namburu
Springer Science & Business Media, Oct 31, 2001 - 605 pages
Advances in technology are making massive data sets common in many scientific disciplines, such as astronomy, medical imaging, bio-informatics, combinatorial chemistry, remote sensing, and physics. To find useful information in these data sets, scientists and engineers are turning to data mining techniques. This book is a collection of papers based on the first two in a series of workshops on mining scientific datasets. It illustrates the diversity of problems and application areas that can benefit from data mining, as well as the issues and challenges that differentiate scientific data mining from its commercial counterpart. While the focus of the book is on mining scientific data, the work is of broader interest as many of the techniques can be applied equally well to data arising in business and web applications.
Audience: This work would be an excellent text for students and researchers who are familiar with the basic principles of data mining and want to learn more about the application of data mining to their problem in science or engineering.
Contents
ON MINING SCIENTIFIC DATASETS | 1
UNDERSTANDING HIGH DIMENSIONAL AND LARGE DATA SETS: SOME MATHEMATICAL CHALLENGES AND OPPORTUNITIES | 23
DATA MINING AT THE INTERFACE OF COMPUTER SCIENCE AND STATISTICS | 35
MINING LARGE IMAGE COLLECTIONS | 63
MINING ASTRONOMICAL DATABASES | 85
SEARCHING FOR BENT-DOUBLE GALAXIES IN THE FIRST SURVEY | 95
A DATASPACE INFRASTRUCTURE FOR ASTRONOMICAL DATA | 115
DATA MINING APPLICATIONS IN BIOINFORMATICS | 125
MINING RESIDUE CONTACTS IN PROTEINS | 141
KDD SERVICES AT THE GODDARD EARTH SCIENCES DISTRIBUTED ACTIVE ARCHIVE CENTER | 165
DATA MINING IN INTEGRATED DATA ACCESS AND DATA ANALYSIS SYSTEMS | 183
SPATIAL DATA MINING FOR CLASSIFICATION, VISUALISATION AND INTERPRETATION WITH ARTMAP NEURAL NETWORK | 201
REAL TIME FEATURE EXTRACTION FOR THE ANALYSIS OF TURBULENT FLOWS | 223
DATA MINING FOR TURBULENT FLOWS | 239
EVITA: EFFICIENT VISUALIZATION AND INTERROGATION OF TERASCALE DATA | 257
TOWARDS UBIQUITOUS MINING OF DISTRIBUTED DATA | 281
DECOMPOSABLE ALGORITHMS FOR DATA MINING | 307
HDDI: HIERARCHICAL DISTRIBUTED DYNAMIC INDEXING | 319
PARALLEL ALGORITHMS FOR CLUSTERING HIGH-DIMENSIONAL LARGE-SCALE DATASETS | 335
EFFICIENT CLUSTERING OF VERY LARGE DOCUMENT COLLECTIONS | 357
A SCALABLE HIERARCHICAL ALGORITHM FOR UNSUPERVISED CLUSTERING | 383
HIGH-PERFORMANCE SINGULAR VALUE DECOMPOSITION | 401
MINING HIGH-DIMENSIONAL SCIENTIFIC DATA SETS USING SINGULAR VALUE DECOMPOSITION | 425
SPATIAL DEPENDENCE IN DATA MINING | 439
SPARC: SPATIAL ASSOCIATION RULE-BASED CLASSIFICATION | 461
WHAT'S SPATIAL ABOUT SPATIAL DATA MINING: THREE CASE STUDIES | 487
PREDICTING FAILURES IN EVENT SEQUENCES | 515
EFFICIENT ALGORITHMS FOR MINING LONG PATTERNS IN SCIENTIFIC DATA SETS | 541
PROBABILISTIC ESTIMATION IN DATA MINING | 567
CLASSIFICATION USING ASSOCIATION RULES: WEAKNESSES AND ENHANCEMENTS | 591
Common terms and phrases
accuracy amino acids applications approach approximation ARTMAP association rules attributes bent-double bioinformatics candidate dense units catalog classification clustering algorithm collection Computer Science confusion matrix count data analysis data mining data mining techniques data sets database decision tree dimensional dimensions distributed document domain efficient error estimation EVITA example extraction Figure frequent galaxies genes global graph grid hash table identify itemsets join index k-means algorithm Knowledge Discovery log-determinant machine learning matrix method neighbors neural network node observations outliers parallel parameters partition patterns PDDP performance precision recall predictive problem processor protein pruning query random regression remote sensing sample scientific data sequence SIESIP simulation singular value decomposition singular values space spatial data spatial dependence statistical step structure subset swirl Table test set threshold training objects training set transaction variables vector space model visualization wavelet