Sequence Data Mining
Springer Science & Business Media, October 31, 2007 - 150 pages
Understanding sequence data, and the ability to utilize this hidden knowledge, will create a significant impact on many aspects of our society. Examples of sequence data include DNA, protein, customer purchase history, web surfing history, and more. This book provides thorough coverage of the existing results on sequence data mining as well as pattern types and associated pattern mining methods. It offers balanced coverage on data mining and sequence data analysis, allowing readers to access the state-of-the-art results in one place.
Contents
Introduction | 1 |
Frequent and Closed Sequence Patterns | 14 |
Classification, Clustering, Features and Distances of Sequence Data | 47 |
Identifying and Characterizing Sequence Families | 67 |
Mining Partial Orders from Sequences | 88 |
Distinguishing Sequence Patterns | 113 |
Related Topics | 131 |
Common terms and phrases
algorithm alignment analysis appear applications approach biological bitset called candidate chapter classification closed partial orders closed sequential patterns clustering complete compute condition considered constraint construction contains count dataset defined Definition described determine discussed distance edges efficient element event example exists expression extension families Figure frequent closed partial function gap constraint given graph identify important interest length Markov match matrix methods minimization Moreover motif node occurrence pairs periodic position possible prefix PrefixSpan present probability problem projected database protein pruning Reference represented respect satisfying scan sequence database sequential pattern mining sequential patterns shown similarity step string structure studied subsequence subset Table task techniques threshold transaction transitive reduction tree types weight window
Popular passages
Page 144 - J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth.
Page 139 - A. Bateman, E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. R. Eddy, S. Griffiths-Jones, K. L. Howe, M. Marshall, and E. L. L. Sonnhammer. The Pfam protein families database.
Page 144 - Nucleotide sequence of an RNA polymerase binding site at an early T7 promoter, Proc.