Markovletics: methods and a novel application for learning continuous-time Markov chain mixtures
Files
Published version
Date
2024-05-13
Authors
Spaeh, Fabian
Tsourakakis, Charalampos E.
Version
OA Version
Citation
Fabian Spaeh and Charalampos E. Tsourakakis. 2024. Markovletics: Methods and A Novel Application for Learning Continuous-Time Markov Chain Mixtures. In Proceedings of the ACM on Web Conference 2024 (WWW '24). Association for Computing Machinery, New York, NY, USA, 4160–4171. https://doi.org/10.1145/3589334.3645491
Abstract
Sequential data naturally arises from user engagement on digital platforms like social media, music streaming services, and web navigation, encapsulating evolving user preferences and behaviors through continuous information streams. A notable unresolved task in stochastic processes is learning mixtures of continuous-time Markov chains (CTMCs). While there is progress in learning mixtures of discrete-time Markov chains with recovery guarantees [GKV16,ST23,KTT2023], the continuous scenario uncovers unique unexplored challenges. The intrigue in CTMC mixtures stems from their ability to model intricate continuous-time stochastic processes prevalent in various fields including social media, finance, and biology.
In this study, we introduce a novel framework for exploring CTMCs, emphasizing the influence of observed trails' length and mixture parameters on problem regimes, which demands specific algorithms. Through thorough experimentation, we examine the impact of discretizing continuous-time trails on the learnability of the continuous-time mixture, given that these processes are often observed via discrete, resource-demanding observations. Our comparative analysis with leading methods explores sample complexity and the trade-off between the number of trails and their lengths, offering crucial insights for method selection in different problem instances. We apply our algorithms on an extensive collection of Lastfm's user-generated trails spanning three years, demonstrating the capability of our algorithms to differentiate diverse user preferences. We also pioneer the use of CTMC mixtures on a basketball passing dataset to unveil intricate offensive tactics of NBA teams. This underscores the pragmatic utility and versatility of our proposed framework.
Description
License
© 2024 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution International 4.0 License. This article has been published under a Read & Publish Transformative Open Access (OA) Agreement with ACM.