By: Francois Aubin.
Introduction
Consider the challenge of forecasting demand for parts in industries such as aerospace, where demand is often low-volume, sparse, and subject to sudden shifts. In a recent AI initiative conducted by our team in collaboration with McKinsey & Company for a leading aerospace company, the data scientists' initial instinct was to deploy sophisticated forecasting models. However, a cognitive analysis of expert planners revealed that accurate forecasting alone was not their central concern. Rather, their main challenge lay in effectively supervising inventory levels and adapting forecasts dynamically in response to events such as changes in market demand, economic conditions, or policy shifts.
Consequently, our team proposed a different approach centered on user interface design, allowing planners to integrate and supervise forecasts generated by various models. Ultimately, the project demonstrated that simpler forecasting models combined with expert supervision delivered accuracy close enough to that of more precise but less interpretable deep learning models.
In demand forecasting, particularly in sparse-data scenarios characterized by low-volume demand (~100 units/year), limited historical data (~3 years monthly), and forecasting horizons of approximately 12 months, it is crucial to balance model complexity against data availability. Although advanced deep learning techniques such as Amazon’s DeepAR (a global RNN model) may appear attractive, classical exponential smoothing methods like Holt-Winters (ETS) frequently achieve better outcomes. Below, we examine five critical reasons—ranging from bias-variance tradeoffs to interpretability—highlighting why a simpler ETS model can outperform DeepAR in sparse-demand contexts.
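To make the comparison concrete, the additive Holt-Winters recursions can be sketched in a few lines. The series below is synthetic (a hypothetical sparse parts series: ~3 years of monthly data), and the smoothing parameters are fixed for illustration; in practice they would be chosen by minimizing in-sample error.

```python
import numpy as np

def holt_winters_additive(y, m, alpha=0.2, beta=0.1, gamma=0.1, horizon=12):
    """Additive Holt-Winters: recursive level/trend/seasonal updates.

    y: observed series; m: season length (12 for monthly data).
    """
    # Initialize components from the first two seasons.
    level = np.mean(y[:m])
    trend = (np.mean(y[m:2 * m]) - np.mean(y[:m])) / m
    season = list(y[:m] - level)

    for t in range(m, len(y)):
        last_level = level
        level = alpha * (y[t] - season[t - m]) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        season.append(gamma * (y[t] - level) + (1 - gamma) * season[t - m])

    # h-step-ahead forecast: level + h*trend + matching seasonal index.
    n = len(y)
    return [level + h * trend + season[n - m + (h - 1) % m]
            for h in range(1, horizon + 1)]

# Hypothetical demand: ~100 units/year with mild trend and seasonality.
rng = np.random.default_rng(0)
months = np.arange(36)
demand = (8 + 0.05 * months + 2 * np.sin(2 * np.pi * months / 12)
          + rng.normal(0, 1, 36))
forecast = holt_winters_additive(demand, m=12)
```

The entire model state is three interpretable quantities (level, trend, twelve seasonal indices), which is exactly what makes it tractable on 36 observations.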
Bias-Variance Tradeoff with Sparse Data
Small datasets significantly increase the risk of overfitting with complex models. Deep learning methods like DeepAR, with their numerous parameters and high flexibility, tend either to overfit noise or, when heavily regularized to compensate, to underfit when constrained to limited data (~36 data points per series). Exponential smoothing methods, however, are structurally constrained with fewer parameters, resulting in higher bias but substantially lower variance and better generalization to sparse data. As forecasting expert Rob Hyndman states, “A model that best fits historical data does not necessarily forecast well.” Empirical evidence from the Makridakis forecasting competitions also underscores that simpler models frequently outperform sophisticated methods due to their ability to capture dominant patterns without over-parameterization.
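The tradeoff is easy to demonstrate in miniature. The sketch below stands in for the deep-vs-simple contrast with polynomial degree (a proxy for parameter count, not DeepAR itself), on 36 hypothetical training points drawn from a linear trend plus noise:

```python
import numpy as np

# 36 "monthly" observations from a simple linear process (illustrative).
rng = np.random.default_rng(42)
x_train = np.arange(36.0)
y_train = 2.0 + 0.5 * x_train + rng.normal(0, 3, 36)

# The next 12 months from the same process.
x_test = np.arange(36.0, 48.0)
y_test = 2.0 + 0.5 * x_test + rng.normal(0, 3, 12)

# Low-variance model: 2 parameters. High-variance model: 16 parameters.
simple = np.polyfit(x_train, y_train, deg=1)
flexible = np.polyfit(x_train, y_train, deg=15)

# Out-of-sample squared error: the flexible fit extrapolates wildly.
err_simple = np.mean((np.polyval(simple, x_test) - y_test) ** 2)
err_flexible = np.mean((np.polyval(flexible, x_test) - y_test) ** 2)
```

On this sample the 2-parameter model wins by orders of magnitude: both fit the training window, but only the constrained model generalizes beyond it.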
Robustness to Regime Shifts and External Shocks
ETS models inherently provide greater robustness against sudden regime changes, such as economic crises or geopolitical events. ETS methods weight recent observations more heavily, allowing rapid adaptation to structural changes. Conversely, deep learning models often implicitly assume that historical patterns will persist, making them vulnerable to unexpected shocks. Additionally, ETS methods allow straightforward manual interventions by planners, while adjustments to deep learning models typically necessitate retraining or additional variables. Makridakis et al. confirm that simpler methods exhibit greater resilience during volatile periods, providing crucial adaptability over more complex models.
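The adaptation speed is not a vague property; it follows directly from the geometric weighting of past observations. For simple exponential smoothing, the one-step forecast weights the observation k periods old by alpha·(1 − alpha)^k, so a handful of recent months dominates (alpha = 0.3 below is an illustrative choice):

```python
alpha = 0.3  # illustrative smoothing parameter

# Weight assigned to the observation k periods old: alpha * (1 - alpha)^k
weights = [alpha * (1 - alpha) ** k for k in range(36)]

# Share of total weight carried by the three most recent observations:
# analytically 1 - (1 - alpha)^3, about 66% at alpha = 0.3.
recent_share = sum(weights[:3])
```

With two-thirds of the forecast driven by the last three observations, a regime shift is absorbed within a few periods, without retraining anything.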
Interpretability and Human-in-the-Loop Forecasting
Operational forecasting requires interpretability. ETS models produce clearly decomposed outputs (level, trend, seasonal indices) easily understood by analysts, fostering trust and enabling structured human overrides based on domain insights. Deep learning models such as DeepAR, in contrast, operate as black boxes with opaque internal states, complicating human interventions. This lack of interpretability often undermines stakeholder confidence. Industry practice consistently favors ETS models for their interpretability and the ease with which planners can integrate expert judgment, scenario analyses, and overrides.
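Because the ETS state is just level, trend, and seasonal indices, a structured override is a one-line adjustment rather than a model intervention. The component values below are illustrative placeholders, not fitted output:

```python
# Decomposed ETS state for a monthly parts series (illustrative numbers).
level = 9.0            # current baseline demand per month
trend = 0.05           # drift in units per month
seasonal = [1.2, 0.8, -0.5, -1.0, -0.7, 0.1,
            0.9, 1.4, 0.6, -0.3, -1.1, -1.4]   # additive monthly indices

def forecast(h, level_adj=0.0):
    """h-step-ahead forecast with an optional planner override on the level."""
    return (level + level_adj) + h * trend + seasonal[(h - 1) % 12]

baseline = [forecast(h) for h in range(1, 13)]
# Planner knows a new contract adds roughly 2 units/month starting now:
adjusted = [forecast(h, level_adj=2.0) for h in range(1, 13)]
```

The override is auditable: anyone reviewing the plan can see exactly which component was changed, by how much, and why.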
Challenges with Overfitting and Uncertainty Estimation in Deep Models
Complex models like DeepAR risk significant overfitting in sparse scenarios, often capturing coincidental patterns or noise rather than genuine signals. This issue manifests through unrealistic forecasts, erroneous seasonal predictions, and unreliable uncertainty estimates—either overly narrow (misleadingly confident) or excessively broad (non-informative). ETS models, although simpler, provide transparent prediction intervals that planners can intuitively widen during turbulent periods. Additionally, simpler methods inherently generalize better to unforeseen patterns and novel events, thereby mitigating the risk of large forecasting errors.
Global vs. Local Modeling in Sparse and Irregular Series
DeepAR, as a global model, learns patterns across multiple time series to improve forecast accuracy. However, this strength diminishes when dealing with sparse, irregular, or highly individualized series. Global models require substantial homogeneous data to realize their advantages. In sparse-demand contexts, irregular events can lead global models to erroneously smooth out anomalies or mistakenly propagate series-specific anomalies across others. Local ETS models, in contrast, handle series independently, confining anomalies and enabling targeted adjustments based on local conditions and expert insights. Research emphasizes that global deep learning models are inherently data-intensive and can underperform in data-sparse environments where simpler, local models are better suited.
Error Propagation and Forecast Uncertainty
Improved precision in forecasting models does not translate to reduced total forecast error under conditions of high external uncertainty. Drawing from error propagation principles in physics and engineering, total forecast error can be understood as:
Total Forecast Error ≈ Model Error + Real-World Uncertainty
In scenarios characterized by low-volume demand, external uncertainty dominates, limiting the practical benefit gained from the complex precision of deep models. While DeepAR may theoretically minimize model error, this advantage is negligible compared to irreducible real-world unpredictability. ETS methods, recognizing and embracing this inherent uncertainty, avoid overreacting to random fluctuations, making them robustly aligned with real-world volatility.
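The arithmetic behind this argument is worth spelling out. For independent error sources, the standard propagation rule combines them in quadrature (root-sum-of-squares); the magnitudes below are illustrative, chosen so that external uncertainty dominates model error:

```python
import math

external = 30.0  # irreducible real-world uncertainty (units of demand)

def total_error(model_error):
    """Combine independent error sources in quadrature."""
    return math.sqrt(model_error ** 2 + external ** 2)

ets_total = total_error(10.0)    # simpler model, larger model error
deep_total = total_error(5.0)    # complex model, half the model error

# Halving model error buys under 4% reduction in total error.
improvement = 1 - deep_total / ets_total
```

Cutting model error in half moves total error from about 31.6 to about 30.4, a gain of under 4 percent: the external term dominates, which is precisely why the extra precision of a deep model buys so little here.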
Conclusion
Complexity does not inherently yield better forecasting performance, particularly in sparse-demand scenarios fraught with data limitations and volatility. Exponential smoothing methods thrive precisely because of their parsimony, robustness, adaptability, and interpretability, providing superior performance in practice. While advanced deep learning models excel in environments with abundant and stable data, simpler ETS models often prove more reliable and practical when facing sparse data, structural shifts, and high external uncertainty. The wise integration of human oversight further accentuates the practical advantage of simpler forecasting methods, reaffirming the classical principle that simplicity can indeed be the best sophistication in the volatile world of demand forecasting.