Data Mining and Knowledge Discovery Handbook

Oded Maimon, Lior Rokach
Springer Science & Business Media, May 28, 2006 - 1383 pages

Data Mining and Knowledge Discovery Handbook organizes all major concepts, theories, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery in databases (KDD) into a coherent and unified repository.

This book first surveys the field, then provides comprehensive yet concise algorithmic descriptions of methods, covering classic methods as well as recently developed extensions and novel methods. The volume concludes with in-depth descriptions of data mining applications in various interdisciplinary industries, including finance, marketing, medicine, biology, engineering, telecommunications, software, and security.

Data Mining and Knowledge Discovery Handbook is designed for research scientists and graduate-level students in computer science and engineering. This book is also suitable for professionals in fields such as computing applications, information systems management, and strategic research management.

 

Contents

Acknowledgments
Discretization Methods
The Curse of Dimensionality
Classification Problem Extensions
References
Decision Trees
Algorithmic Framework for Decision Trees
Univariate Splitting Criteria
Multivariate Splitting Criteria
Pruning Methods
Other Issues
Decision Trees Inducers
Advantages and Disadvantages of Decision Trees
Decision Tree Extensions
References
Web Mining
Graph Properties of the Web
Web Search
Text Classification
Hypertext Classification
Information Extraction and Wrapper Induction
The Semantic Web
Web Usage Mining
Bayesian Networks
Representation
Reasoning
Learning
Bayesian Networks in Data Mining
Data Mining Applications
Conclusions and Future Research Directions
Acknowledgments
Data Mining within a Regression Framework
Some Definitions
Regression Splines
Smoothing Splines
Locally Weighted Regression as a Smoother
Smoothers for Multiple Predictors
Recursive Partitioning
Conclusions
References
Rule Induction
Application to Other Types of Data
Extensions of the Basic Framework
Conclusions
References
Frequent Set Mining
Problem Description
Apriori
Eclat
Optimizations
Concise Representations
Theoretical Aspects
Further Reading
References
Constraint-based Data Mining
Background and Notations
Solving Anti-Monotonic Constraints
Introducing non Anti-Monotonic Constraints
Conclusion
References
Link Analysis
Social Network Analysis
Search Engines
Viral Marketing
Law Enforcement Fraud Detection
Combining with Traditional Methods
Summary
Soft Computing Methods
Evolutionary Algorithms for Data Mining
An Overview of Evolutionary Algorithms
Evolutionary Algorithms for Discovering Classification Rules
Evolutionary Algorithms for Clustering
Evolutionary Algorithms for Data Preprocessing
Multi-Objective Optimization with Evolutionary Algorithms
Conclusions
References
an Overview from a Data Mining Perspective
The Reinforcement-Learning Model
Reinforcement-Learning Algorithms
Extensions to Basic Model and Algorithms
Applications of Reinforcement-Learning
Reinforcement-Learning and Data-Mining
An Instructive Example
References
Neural Networks
A Brief History
Neural Network Models
Data Mining Applications
Conclusions
On the Use of Fuzzy Logic in Data Mining
Fuzzy Sets and Fuzzy Logic
Soft Regression
Fuzzy Association Rules
Conclusions
Granular Computing and Rough Sets
Naive Model for Problem Solving
A Geometric Models of Information Granulations
Information Granulations/Partitions
Non-partition Application: Chinese Wall Security Policy Model
Knowledge Representations
Topological Concept Hierarchy Lattices/Trees
Knowledge Processing
Information Integration
Conclusions
Supporting Methods
Statistical Methods for Data Mining
Statistical Issues in DM
Modeling Relationships using Regression Models
False Discovery Rate (FDR) Control in Hypotheses Testing
Model Variables or Features Selection using FDR Penalization in GLM
Concluding Remarks
References
Logics for Data Mining
Generalized Quantifiers
Some Important Classes of Quantifiers
Some Comments and Conclusion
Acknowledgments
Wavelet Methods in Data Mining
Tao Li, Sheng Ma, and Mitsunori Ogihara
Introduction
Wavelet Background
Data Management
Preprocessing
Core Mining Process
Conclusion
References
Fractal Mining
Daniel Barbara and Ping Chen
Introduction
Fractal Dimension
Clustering Using the Fractal Dimension
Projected Fractal Clustering
Tracking Clusters
Conclusions
Haixun Wang Philip S Yu and Jiawei
Formal Frameworks And Algorithm-Based Techniques
Hybrid Approaches TEG
Text Mining Visualization and Analytics
References
Spatial Data Mining
Spatial Data
Spatial Outliers
Spatial Co-location Rules
Predictive Models
Spatial Clusters
Summary
References
An Overview
Performance Measure
Sampling Strategies
Ensemble-based Methods
Discussion
References
Relational Data Mining
Inductive Logic Programming
Relational Association Rules
Relational Decision Trees
RDM Literature and Internet Resources
References
Collaborative Filtering
Conclusion
A Review of Web Document Clustering Approaches
Nora Oikonomakou and Michalis Vazirgiannis
Introduction
Web Document Clustering Approaches
Comparison
Conclusions and Open Issues
Causal Discovery
Background Knowledge
Theoretical Foundation
Learning a DAG of CN by FDs
Experimental Results
References
Ensemble Methods For Classifiers
Sequential Methodology
Concurrent Methodology
Combining Classifiers
Ensemble Diversity
Ensemble Size
Cluster Ensemble
References
Decomposition Methodology for
Decomposition Advantages
The Elementary Decomposition Methodology
The Decomposer's Characteristics
The Relation to Other Methodologies
Summary
Information Fusion
Preprocessing Data
Building Data Models
Information Extraction
References
Parallel And Grid-Based Data Mining
Antonio Congiusta, Domenico Talia, and Paolo Trunfio
Introduction
Parallel Data Mining
Grid-Based Data Mining
The Knowledge Grid
Summary
References
Collaborative Data Mining
Remote Collaboration
The Data Mining Process
Collaborative Data Mining Guidelines
Discussion
Conclusions
References
Organizational Data Mining
Hamid R. Nemati and Christopher D. Barko
Introduction
Organizational Data Mining
ODM versus Data Mining
Ongoing ODM Research
ODM Evolution
Summary
Mining Time Series Data
Time Series Similarity Measures
Time Series Data Mining
Time Series Representations
Summary
Applications
Data Mining in Medicine
Symbolic Classification Methods
Subsymbolic Classification Methods
Other Methods Supporting Medical Knowledge Discovery
Conclusions
Learning Information Patterns in
Learning Stochastic Pattern Models
Searching for Meta-Patterns
Conclusions
Data Mining for Selection of Manufacturing Processes
Data Mining in Engineering
Selection of Manufacturing Process with a Data Mining Approach
Conclusion
References
Data Mining of Design Products and Processes
Product Design Process
Product Portfolio Management
Conceptual Design
Detailed Design
Business and Manufacturing Process Planning
Text Mining
Observations and Future Advancements
Epilogue
References
Data Mining in Telecommunications
Types of Telecommunication Data
Data Mining Applications
Conclusion
References
Data Mining for Financial Applications
Specifics of Data Mining in Finance
Aspects of Data Mining Methodology in Finance
Data Mining Models and Practice in Finance
Conclusion
References
Data Mining for Intrusion Detection
Data Mining Basics
Data Mining Meets Intrusion Detection
Conclusions and Future Research Directions
References
Data Mining For Software Testing
Mining Software Metrics Databases
Interaction-Pattern Discovery in System Usage Data
Using Data Mining in Functional Testing
Summary
Acknowledgments
Data Mining for CRM
Data Mining and Campaign Management
Customer Acquisition
Data Mining for Target Marketing
Modeling Process
Evaluation Metrics
Segmentation Methods
Predictive Modeling
In-Market Timing
Pitfalls of Targeting
Conclusions
References
Software
Weka
References
Oracle Data Mining
The Mining-in-the-Database Paradigm
ODM Functionality and Algorithms
Text and Spatial Mining
ODM Examples
Conclusions
References
Building Data Mining Solutions with
OLE DB for Data Mining
Data Mining in SQL Server 2000
Building Data Mining Application using OLE DB for Data Mining
for Analysis
Conclusion
References
LERS - A Data Mining System
Input Data
Main Features
Final Remarks
GainSmarts Data Mining System
Accessing GainSmarts
Setting Up the Data for Modeling
Knowledge Evaluation
Software Characteristics
WizSoft's WizWhy
If-Then Rules
Data Summarization
Classifications
WizWhy vs other Data Mining Methods
DataEngine
Intelligent Technologies for Modeling and Control
Work with DataEngine
References
Index