Process mining is a technique for extracting process models from event logs recorded by information systems. Process mining approaches normally assume that the log to be mined is complete. Checking log completeness is known to be difficult: except for some trivial cases, no checkable criteria for log completeness are known. We overcome this problem by taking a probabilistic point of view. In this paper, we propose a method to compute the probability that an event log is complete. Our method provides a probabilistic lower bound on log completeness for three subclasses of Petri nets, namely workflow nets, T-workflow nets, and S-workflow nets. Furthermore, based on the complete log obtained by our method, we propose two specialized mining algorithms to discover T-workflow nets and S-workflow nets, respectively. We back up our theoretical work with empirical studies showing that the probabilistic bounds computed by our method are reliable.
Published in: 2011 Fifth International Conference on Research Challenges in Information Science (RCIS)
Date of Conference: 19-21 May 2011