Constraint-Based Mining of Sequential Patterns over Datasets with Consecutive Repetitions

Leleu, Marion; Rigotti, Christophe; Boulicaut, Jean-François; Euvrard, Guillaume

doi:10.1007/978-3-540-39804-2_28

conference paper

Constraint-based mining of sequential patterns is an active research area motivated by many application domains. In practice, real sequence datasets can contain consecutive repetitions of symbols (e.g., DNA sequences, discretized stock market data). These repetitions can consume so many resources during pattern extraction that even efficient algorithms become unusable. We propose a constraint-based mining algorithm that compacts these consecutive repetitions, drastically reducing the amount of data to process and speeding up extraction. The technique introduced in this paper retains the advantages of existing state-of-the-art algorithms based on occurrence lists while extending their applicability to datasets containing consecutive repetitions. We analyze the benefits on synthetic datasets and show that the approach is of practical interest on real datasets.
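As a rough illustration of what compacting consecutive repetitions can look like, the sketch below run-length encodes a symbol sequence. It is only an assumption made for illustration, not the algorithm of the paper: the function names `compact_repetitions` and `expand_repetitions` are hypothetical, and the record does not describe how the compacted form interacts with occurrence lists.

```python
from itertools import groupby


def compact_repetitions(sequence):
    """Collapse each run of identical consecutive symbols into (symbol, run_length).

    Hypothetical helper for illustration; not taken from the paper.
    """
    return [(symbol, sum(1 for _ in run)) for symbol, run in groupby(sequence)]


def expand_repetitions(compacted):
    """Rebuild the original sequence from (symbol, run_length) pairs."""
    return [symbol for symbol, count in compacted for _ in range(count)]


sequence = list("AAAACCGGGGGT")
compacted = compact_repetitions(sequence)
print(compacted)  # [('A', 4), ('C', 2), ('G', 5), ('T', 1)]
```

In this toy example, twelve symbols shrink to four pairs without losing information, which gives an intuition for why repetition-heavy data such as DNA sequences could benefit from a compact representation.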

Type

conference paper

DOI

10.1007/978-3-540-39804-2_28

Authors

Leleu, Marion

•

Rigotti, Christophe

•

Boulicaut, Jean-François

•

Euvrard, Guillaume

Editors

Lavrač, Nada

•

Gamberger, Dragan

•

Todorovski, Ljupčo

•

Blockeel, Hendrik

Publication date

2003

Publisher

Springer

Published in

Knowledge Discovery in Databases: PKDD 2003

Publisher place

Berlin, Heidelberg

Series title/Series vol.

Lecture Notes in Computer Science; 2838

Start page

303

End page

314

Peer reviewed

REVIEWED

EPFL units

SV

Event name	Event place	Event date
7th European Conference on Principles and Practice of Knowledge Discovery in Databases	Cavtat-Dubrovnik, Croatia	September 22-26, 2003

Available on Infoscience

August 30, 2017

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/139915