Database Support for Data Mining Applications: Discovering Knowledge with Inductive Queries
Rosa Meo, Pier L. Lanzi, Mika Klemettinen
Springer, Jul 28, 2004 - 332 pages
Data mining from traditional relational databases, as well as from non-traditional ones such as semi-structured data, Web data, and scientific databases housing biological, linguistic, and sensor data, has recently become a popular way of discovering hidden knowledge. This book on database support for data mining is devoted to approaches exploiting the available database technology, declarative data mining, intelligent querying, and associated issues, such as optimization, indexing, query processing, languages, and constraints. Attention is also paid to the solution of data preprocessing problems, such as data cleaning, discretization, and sampling. The 16 reviewed full papers presented were carefully selected from various workshops and conferences to provide complete and competent coverage of the core issues. Some papers were developed within an EC-funded project on discovering knowledge with inductive queries.
From inside the book
... define syntactical restrictions on desired patterns, e.g., its “length” is below a user-given threshold. Preprocessing concerns the definition of the database r, the mining phase is often the computation of the specified theory while ...
... definition of a pattern domain is made of the definition of a language of patterns L, evaluation functions that assign a semantics to each pattern in a given database r, languages for primitive constraints that specify the desired ...
... definitions. Definition 3 (Association Rules). An association rule is denoted X ⇒ Y where X ∩ Y = ∅ and X ⊆ Items is the body of the rule and Y ⊆ Items is the head of the rule. Let us now define constraints on itemsets. Definition ...
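As an illustration of Definition 3, the following minimal sketch checks the two conditions stated in the snippet: body and head are both subsets of Items, and they are disjoint. The function name `is_association_rule` and the sample items are hypothetical and do not come from the book.

```python
def is_association_rule(body, head, items):
    """Check the conditions of Definition 3 for a candidate rule X => Y.

    body  -- X, the body of the rule
    head  -- Y, the head of the rule
    items -- the universe Items
    The snippet does not say whether empty bodies or heads are allowed,
    so this sketch does not reject them.
    """
    x, y, universe = frozenset(body), frozenset(head), frozenset(items)
    return x <= universe and y <= universe and x.isdisjoint(y)


# Hypothetical item universe, for illustration only.
ITEMS = {"bread", "butter", "milk"}

print(is_association_rule({"bread"}, {"butter"}, ITEMS))
print(is_association_rule({"bread", "milk"}, {"milk"}, ITEMS))
```

The second call fails the check because "milk" appears in both the body and the head.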
... Definition 7 (Frequency). The absolute frequency of an itemset S in r is defined by Fa (S,r) = |support(S)| where |.| denote the cardinality of the multiset (each transaction is counted with its multiplicity). The relative frequency of ...
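The absolute frequency of Definition 7 can be sketched as follows. The definition of support(S) is not part of the snippet; this sketch assumes it is the multiset of transactions of r that contain every item of S. The database r is modelled as a `Counter` mapping each transaction to its multiplicity, and all names and sample data are hypothetical.

```python
from collections import Counter


def support(itemset, r):
    """Multiset of transactions of r containing every item of itemset.

    Assumption: this is the meaning of support(S) in the book; the
    snippet itself does not show that definition.
    """
    s = frozenset(itemset)
    return Counter({t: n for t, n in r.items() if s <= t})


def absolute_frequency(itemset, r):
    """Fa(S, r) = |support(S)|, counting each transaction with its multiplicity."""
    return sum(support(itemset, r).values())


# Hypothetical database: the transaction {A, B} occurs twice.
r = Counter([frozenset("AB"), frozenset("AB"), frozenset("ABC"), frozenset("C")])

print(absolute_frequency("A", r))
print(absolute_frequency("C", r))
```

Because {A, B} is stored with multiplicity 2, it contributes twice to the frequency of any itemset it contains, which is the point of counting over a multiset.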
... Definition 8 (Closures of Itemsets). The closure of an itemset S in r (denoted by closure(S,r)) is the maximal (for set inclusion) superset of S which has the same support as S. In other terms, the closure of S is the set of items that ...
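A minimal sketch of Definition 8, assuming that support means the transactions containing the itemset. Under that assumption, the largest superset of S with the same support is the intersection of all transactions that contain S, which is what the code computes. The function name and sample data are hypothetical, and the case where no transaction contains S is left undefined here.

```python
def closure(itemset, r):
    """Largest superset of itemset that has the same support in r.

    r is an iterable of transactions (sets of items). Returns None when
    no transaction contains the itemset; this sketch does not define the
    closure in that case.
    """
    s = frozenset(itemset)
    covering = [frozenset(t) for t in r if s <= frozenset(t)]
    if not covering:
        return None
    # Items shared by every transaction that contains the itemset.
    return frozenset.intersection(*covering)


# Hypothetical database, for illustration only.
r = [frozenset("AB"), frozenset("AB"), frozenset("ABC"), frozenset("C")]

print(sorted(closure("A", r)))
print(sorted(closure("AC", r)))
```

In this sample, every transaction containing A also contains B, so the closure of {A} is {A, B}.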
Contents
Declarative Data Mining Using SQL3 | 52 |
Towards a Logic Query Language for Data Mining | 76 |
A Data Mining Query Language for Knowledge Discovery in | 95 |
Towards Query Evaluation in Inductive Databases Using Version Spaces | 117 |
The GUHA Method, Data Preprocessing and Mining | 135 |
Constraint Based Mining of First Order Sequences in SeqLog | 154 |
Frequent Itemset Discovery with SQL Using Universal Quantification | 194 |
Deducing Bounds on the Support of Itemsets | 214 |
Model-Independent Bounding of the Supports of Boolean Formulae | 234 |
Condensed Representations for Sets of Mining Queries | 250 |
Arnaud Giacometti, Dominique Laurent, Cheikh Talibouya Diop | 270 |
Evgueni N. Smirnov, Ida G. Sprinkhuizen-Kuyper, H. Jaap van den Herik | 289 |
Kimmo Hätönen, Mika Klemettinen | 304 |
Artur Bykowski, Thomas Daurel, Nicolas Méger, Christophe Rigotti | 324 |