Data Mining and Knowledge Discovery Handbook
Oded Maimon, Lior Rokach
Springer Science & Business Media, May 28, 2006 - 1383 pages
Data Mining and Knowledge Discovery Handbook organizes all major concepts, theories, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery in databases (KDD) into a coherent and unified repository. The book first surveys the field, then provides comprehensive yet concise algorithmic descriptions of methods, covering classic methods as well as recently developed extensions and novel methods. The volume concludes with in-depth descriptions of data mining applications in various interdisciplinary industries including finance, marketing, medicine, biology, engineering, telecommunications, software, and security. Data Mining and Knowledge Discovery Handbook is designed for research scientists and graduate-level students in computer science and engineering. It is also suitable for professionals in fields such as computing applications, information systems management, and strategic research management.
Contents
Acknowledgments | 88 |
Discretization Methods | 113 |
The Curse of Dimensionality | 159 |
Classification Problem Extensions | 161 |
References | 162 |
Decision Trees | 165 |
Web Mining | 899 |
Graph Properties of the Web | 900 |
Web Search | 902 |
Text Classification | 904 |
Hypertext Classification | 905 |
Information Extraction and Wrapper Induction | 907 |
The Semantic Web | 908 |
Web Usage Mining | 909 |
Algorithmic Framework for Decision Trees | 167 |
Univariate Splitting Criteria | 168 |
Multivariate Splitting Criteria | 174 |
Pruning Methods | 175 |
Other Issues | 179 |
Decision Trees Inducers | 181 |
Advantages and Disadvantages of Decision Trees | 183 |
Decision Tree Extensions | 185 |
References | 187 |
Bayesian Networks | 193 |
Representation | 195 |
Reasoning | 198 |
Learning | 200 |
Bayesian Networks in Data Mining | 211 |
Data Mining Applications | 218 |
Conclusions and Future Research Directions | 223 |
Acknowledgments | 226 |
Data Mining within a Regression Framework | 231 |
Some Definitions | 232 |
Regression Splines | 234 |
Smoothing Splines | 236 |
Locally Weighted Regression as a Smoother | 238 |
Smoothers for Multiple Predictors | 239 |
Recursive Partitioning | 242 |
Conclusions | 252 |
References | 253 |
Rule Induction | 277 |
Application to Other Types of Data | 362 |
Extensions of the Basic Framework | 364 |
Conclusions | 372 |
References | 373 |
Frequent Set Mining | 377 |
Problem Description | 378 |
Apriori | 381 |
Eclat | 384 |
Optimizations | 386 |
Concise representations | 388 |
Theoretical Aspects | 391 |
Further Reading | 392 |
References | 393 |
Constraint-based Data Mining | 398 |
Background and Notations | 402 |
Solving Anti-Monotonic Constraints | 404 |
Introducing non Anti-Monotonic Constraints | 406 |
Conclusion | 413 |
References | 414 |
Link Analysis | 417 |
Social Network Analysis | 419 |
Search Engines | 422 |
Viral Marketing | 424 |
Law Enforcement and Fraud Detection | 426 |
Combining with Traditional Methods | 428 |
Summary | 430 |
Soft Computing Methods | 433 |
Evolutionary Algorithms for Data Mining | 435 |
An Overview of Evolutionary Algorithms | 436 |
Evolutionary Algorithms for Discovering Classification Rules | 442 |
Evolutionary Algorithms for Clustering | 447 |
Evolutionary Algorithms for Data Preprocessing | 450 |
Multi-Objective Optimization with Evolutionary Algorithms | 456 |
Conclusions | 459 |
References | 461 |
Reinforcement-Learning: an Overview from a Data Mining Perspective | 468 |
The Reinforcement-Learning Model | 470 |
Reinforcement-Learning Algorithms | 472 |
Extensions to Basic Model and Algorithms | 476 |
Applications of Reinforcement-Learning | 478 |
Reinforcement-Learning and Data-Mining | 479 |
An Instructive Example | 480 |
References | 485 |
Neural Networks | 487 |
A Brief History | 488 |
Neural Network Models | 490 |
Data Mining Applications | 506 |
Conclusions | 508 |
On the use of Fuzzy Logic in Data Mining | 517 |
Fuzzy Sets and Fuzzy Logic | 518 |
Soft Regression | 522 |
Fuzzy Association Rules | 525 |
Conclusions | 532 |
Granular Computing and Rough Sets | 535 |
Naive Model for Problem Solving | 536 |
A Geometric Models of Information Granulations | 538 |
Information Granulations/Partitions | 540 |
Non-partition Application: Chinese Wall Security Policy Model | 541 |
Knowledge Representations | 543 |
Topological Concept Hierarchy Lattices/Trees | 549 |
Knowledge Processing | 553 |
Information Integration | 556 |
Conclusions | 558 |
Supporting Methods | 562 |
Statistical Methods for Data Mining | 565 |
Statistical Issues in DM | 567 |
Modeling Relationships using Regression Models | 573 |
False Discovery Rate (FDR) Control in Hypotheses Testing | 578 |
Model Variables or Features Selection using FDR Penalization in GLM | 582 |
Concluding Remarks | 584 |
References | 585 |
Logics for Data Mining | 588 |
Generalized quantifiers | 590 |
Some important classes of quantifiers | 593 |
Some comments and conclusion | 598 |
Acknowledgments | 599 |
Wavelet Methods in Data Mining | 603 |
Introduction (Tao Li, Sheng Ma, and Mitsunori Ogihara) | 604 |
Wavelet Background | 605 |
Data Management | 610 |
Preprocessing | 611 |
Core Mining Process | 614 |
Conclusion | 622 |
References | 623 |
Fractal Mining | 627 |
Introduction (Daniel Barbara and Ping Chen) | 628 |
Fractal Dimension | 629 |
Clustering Using the Fractal Dimension | 633 |
Projected Fractal Clustering | 641 |
Tracking Clusters | 642 |
Conclusions | 645 |
Haixun Wang, Philip S. Yu, and Jiawei | 778 |
Formal Frameworks And Algorithm-Based Techniques | 808 |
Hybrid Approaches TEG | 814 |
Text Mining Visualization and Analytics | 815 |
References | 820 |
Spatial Data Mining | 832 |
Spatial Data | 834 |
Spatial Outliers | 837 |
Spatial Colocation Rules | 841 |
Predictive Models | 844 |
Spatial Clusters | 848 |
Summary | 849 |
References | 850 |
An Overview | 853 |
Performance Measure | 854 |
Sampling Strategies | 858 |
Ensemble-based Methods | 860 |
Discussion | 862 |
References | 863 |
Relational Data Mining | 868 |
Inductive logic programming | 874 |
Relational Association Rules | 884 |
Relational Decision Trees | 889 |
RDM Literature and Internet Resources | 894 |
References | 895 |
Collaborative Filtering | 910 |
Conclusion | 911 |
A Review of Web Document Clustering Approaches | 921 |
Introduction (Nora Oikonomakou and Michalis Vazirgiannis) | 922 |
Web Document Clustering Approaches | 924 |
Comparison | 935 |
Conclusions and Open Issues | 937 |
Causal Discovery | 945 |
Background Knowledge | 946 |
Theoretical Foundation | 949 |
Learning a DAG of CN by FDs | 950 |
Experimental Results | 953 |
References | 954 |
Ensemble Methods For Classifiers | 956 |
Sequential Methodology | 958 |
Concurrent Methodology | 964 |
Combining Classifiers | 966 |
Ensemble Diversity | 973 |
Ensemble Size | 974 |
Cluster Ensemble | 976 |
References | 977 |
Decomposition Methodology for | 981 |
Decomposition Advantages | 984 |
The Elementary Decomposition Methodology | 986 |
The Decomposers Characteristics | 991 |
The Relation to Other Methodologies | 996 |
Summary | 999 |
Information Fusion | 1005 |
Preprocessing Data | 1006 |
Building Data Models | 1009 |
Information Extraction | 1012 |
References | 1013 |
Parallel And Grid-Based Data Mining | 1017 |
Introduction (Antonio Congiusta, Domenico Talia, and Paolo Trunfio) | 1018 |
Parallel Data Mining | 1019 |
Grid-Based Data Mining | 1027 |
The Knowledge Grid | 1033 |
Summary | 1038 |
References | 1039 |
Collaborative Data Mining | 1042 |
Remote Collaboration | 1044 |
The Data Mining Process | 1047 |
Collaborative Data Mining Guidelines | 1048 |
Discussion | 1052 |
Conclusions | 1053 |
References | 1054 |
Organizational Data Mining | 1057 |
Introduction (Hamid R. Nemati and Christopher D. Barko) | 1058 |
Organizational Data Mining | 1059 |
ODM versus Data Mining | 1060 |
Ongoing ODM Research | 1062 |
ODM Evolution | 1063 |
Summary | 1066 |
Mining Time Series Data | 1069 |
Time Series Similarity Measures | 1071 |
Time Series Data Mining | 1077 |
Time Series Representations | 1088 |
Summary | 1098 |
Applications | 1104 |
Data Mining in Medicine | 1107 |
Symbolic Classification Methods | 1109 |
Subsymbolic Classification Methods | 1120 |
Other Methods Supporting Medical Knowledge Discovery | 1126 |
Conclusions | 1129 |
Learning Information Patterns in | 1139 |
Learning Stochastic Pattern Models | 1141 |
Searching for Meta-Patterns | 1148 |
Conclusions | 1156 |
Data Mining for Selection of Manufacturing Processes | 1159 |
Data Mining in Engineering | 1160 |
Selection of Manufacturing Process with a Data Mining Approach | 1161 |
Conclusion | 1165 |
References | 1166 |
Data Mining of Design Products and Processes | 1167 |
Product Design Process | 1169 |
Product Portfolio Management | 1171 |
Conceptual Design | 1172 |
Detailed Design | 1175 |
Business and Manufacturing Process Planning | 1177 |
Text Mining | 1178 |
Observations and Future Advancements | 1180 |
Epilogue | 1182 |
References | 1183 |
Data Mining in Telecommunications | 1188 |
Types of Telecommunication Data | 1190 |
Data Mining Applications | 1194 |
Conclusion | 1199 |
References | 1200 |
Data Mining for Financial Applications | 1203 |
Specifics of Data Mining in Finance | 1205 |
Aspects of Data Mining Methodology in Finance | 1210 |
Data Mining Models and Practice in Finance | 1214 |
Conclusion | 1219 |
References | 1221 |
Data Mining for Intrusion Detection | 1225 |
Data Mining Basics | 1226 |
Data Mining Meets Intrusion Detection | 1228 |
Conclusions and Future Research Directions | 1235 |
References | 1236 |
Data Mining For Software Testing | 1238 |
Mining Software Metrics Databases | 1241 |
Interaction-Pattern Discovery in System Usage Data | 1242 |
Using Data Mining in Functional Testing | 1243 |
Summary | 1246 |
Acknowledgments | 1247 |
Data Mining for CRM | 1249 |
Data Mining and Campaign Management | 1251 |
Customer Acquisition | 1252 |
Data Mining for Target Marketing | 1261 |
Modeling Process | 1263 |
Evaluation Metrics | 1265 |
Segmentation Methods | 1268 |
Predictive Modeling | 1275 |
In-Market Timing | 1281 |
Pitfalls of Targeting | 1285 |
Conclusions | 1297 |
References | 1299 |
Software | 1302 |
Weka | 1305 |
References | 1313 |
Oracle Data Mining | 1315 |
The Mining-in-the-Database Paradigm | 1317 |
ODM Functionality and Algorithms | 1319 |
Text and Spatial Mining | 1324 |
ODM Examples | 1325 |
Conclusions | 1327 |
References | 1328 |
Building Data Mining Solutions with | 1330 |
OLE DB for Data Mining | 1332 |
Data Mining in SQL Server 2000 | 1336 |
Building Data Mining Application using OLE DB for Data Mining | 1338 |
for Analysis | 1340 |
Conclusion | 1343 |
References | 1344 |
LERS - A Data Mining System | 1347 |
Input Data | 1348 |
Main Features | 1349 |
Final Remarks | 1350 |
GainSmarts Data Mining System | 1352 |
Accessing GainSmarts | 1354 |
Setting Up the Data for Modeling | 1355 |
Knowledge Evaluation | 1360 |
Software Characteristics | 1363 |
WizSoft's WizWhy | 1365 |
If-Then Rules | 1366 |
Data Summarization | 1367 |
Classifications | 1368 |
WizWhy vs other Data Mining Methods | 1369 |
DataEngine | 1371 |
Intelligent Technologies for Modeling and Control | 1372 |
Work with DataEngine | 1374 |
References | 1377 |