## Sequence Data Mining

Understanding sequence data, and the ability to utilize this hidden knowledge, will create a significant impact on many aspects of our society. Examples of sequence data include DNA, protein, customer purchase history, web surfing history, and more. This book provides thorough coverage of the existing results on sequence data mining as well as pattern types and associated pattern mining methods. It offers balanced coverage on data mining and sequence data analysis, allowing readers to access the state-of-the-art results in one place.

### Contents

| Section | Page |
| --- | --- |
| | 1 |
| Frequent and Closed Sequence Patterns | 14 |
| Classification, Clustering, Features, and Distances of Sequence Data | 47 |
| Identifying and Characterizing Sequence Families | 67 |
| Mining Partial Orders from Sequences | 88 |
| Distinguishing Sequence Patterns | 113 |
| Related Topics | 131 |
| References | 139 |
| Index | 147 |

### Common terms and phrases

alignment alphabet amino acids anti-monotonic applications Apriori-like bioinformatics biological sequence bitset array breadth-first search candidate sequences classification closed partial orders closed sequential patterns complete set compute considered dataset defined depth-first search discussed distance functions distinguishing sequence patterns DNA sequences element example Frecpo frequent closed partial frequent partial orders gene Gibbs sampling given sequence graph hidden Markov models identify itemsets Markov chain Markov models mask bitset match matrix minimization minimum gap constraint mining process monotonic node occurrence partial periodic patterns pattern-growth patterns with prefix Prefix-growth prefix-monotone PrefixSpan projected database protein sequence pruning Pseudocounts quences recursively regular expression regular expression constraint satisfying the constraint scan sequence clustering sequence data mining sequence database sequence families sequential pattern mining set of items set of sequential similarity subsequence subset suffix total order transaction transitive closure transitive reduction types URLs

### Popular passages

Page 144 - J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth.

Page 65 - κ′(G) of a graph G is the minimum number of edges whose removal from G results in a disconnected graph or a trivial graph.
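
The quantity defined in the page 65 passage is commonly called the edge-connectivity of a graph. As a rough illustration only, the definition can be checked by brute force on small graphs: try removing 0 edges, then 1, then 2, and so on, and stop at the first removal that disconnects the graph. The function names `is_connected` and `edge_connectivity` are hypothetical helpers introduced here, not taken from the book, and returning 0 for a single-vertex graph is an assumed convention. The search is exponential in the number of edges, so this is a sketch for toy inputs rather than a practical algorithm.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Return True if the undirected graph (vertices, edges) is connected."""
    vertices = set(vertices)
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:  # iterative depth-first traversal
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == vertices


def edge_connectivity(vertices, edges):
    """Smallest number of edges whose removal disconnects the graph.

    Assumption: a graph with at most one vertex gets connectivity 0.
    """
    edges = list(edges)
    if len(set(vertices)) <= 1:
        return 0
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not is_connected(vertices, kept):
                return k
    return len(edges)  # not reached when there are two or more vertices


# A 4-cycle stays connected after losing any single edge, but not any two.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(edge_connectivity([1, 2, 3, 4], cycle))
```

For the 4-cycle above the search stops at two removed edges, while a simple path such as 1-2-3 is already disconnected by removing one.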

Page 139 - A. Bateman, E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. R. Eddy, S. Griffiths-Jones, K. L. Howe, M. Marshall, and E. L. L. Sonnhammer. The Pfam protein families database.

Page 144 - Nucleotide sequence of an RNA polymerase binding site at an early T7 promoter, Proc.

Page 25 - Table 2.4. The set of sequential patterns is the collection of patterns found in the above recursive mining process. One can verify that it returns exactly the same set of sequential patterns as what GSP and FreeSpan do.

Page 16 - That is, element (x) is written as x. An item can occur at most once in an element of a sequence, but can occur multiple times in different elements of a sequence. The number of instances of items in a sequence is called the length of the sequence. A sequence with length l is called an l-sequence. A sequence a...
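
The length definition in the page 16 passage can be made concrete with a small sketch. It assumes a sequence is modelled as a list of elements, each element being a set of items (so an item appears at most once per element). The function name `sequence_length` and the sample sequence are illustrative choices, not taken from the book.

```python
from typing import FrozenSet, List

# Assumed representation: a sequence is an ordered list of elements,
# and each element is a set of items.
Sequence = List[FrozenSet[str]]


def sequence_length(sequence: Sequence) -> int:
    """Count item instances across all elements of the sequence."""
    return sum(len(element) for element in sequence)


# Illustrative sequence <a (abc) (ac) d (cf)>; single-item elements such
# as (a) and (d) are written without parentheses, as the passage describes.
example: Sequence = [
    frozenset({"a"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"d"}),
    frozenset({"c", "f"}),
]

# 1 + 3 + 2 + 1 + 2 item instances
print(sequence_length(example))  # 9
```

Under this representation the sample has five elements but nine item instances, so by the quoted definition it is a 9-sequence; item `a` is counted three times because it occurs in three different elements.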