HoTPP benchmark: Are we good at the long horizon events forecasting?
Forecasting multiple future events within a given time horizon is essential for applications in finance, retail,
social networks, and healthcare. This problem is typically addressed using Marked Temporal Point Processes
(MTPP), which provide a principled framework for modeling both event timing and event labels. While most
existing research focuses on predicting only the next event, forecasting distant future events introduces unique
challenges and requires specialized approaches. To support research in this setting, we introduce HoTPP, the first
benchmark specifically designed to rigorously evaluate long-horizon predictions in MTPPs. We identify shortcomings
in widely used evaluation metrics, develop a theoretically grounded T-mAP metric, present strong statistical
baselines, and provide efficient implementations of popular models. Our empirical results show that modern
MTPP approaches often underperform simple statistical baselines. We further analyze the diversity
of predicted sequences and find that most methods exhibit mode collapse. Finally, we study the impact of
autoregression on predictive performance and outline promising directions for future research. The HoTPP source
code, hyperparameters, and complete evaluation results are available on GitHub.