Stop using Prophet out-of-the-box — 3 examples of it going wrong
--
Instead, plot its components, and check that they make sense
Many data scientists approach modelling as a feature engineering task: they gather features, do cross-validation, and if the score isn’t good enough, they keep adding more features until it is. In forecasting, where there is typically little data, this is risky. It can be much more effective to look inside your models and understand what they are doing. Prophet makes this especially easy thanks to its plot_components function. Let’s look at some examples where it helps diagnose poor forecasts, and where it can stop us from using Prophet when we shouldn’t.
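As a rough sketch of that workflow, assuming Prophet is installed as the prophet package and the data is a daily series: the helper names daily_records and fit_and_plot_components are made up for illustration, and the 28-day horizon is arbitrary. The pandas and Prophet imports sit inside the second function so that the data-preparation helper runs on its own.

```python
from datetime import date, timedelta


def daily_records(start, values):
    """Pair consecutive calendar days with values, using Prophet's column names."""
    return [
        {"ds": start + timedelta(days=i), "y": float(v)}
        for i, v in enumerate(values)
    ]


def fit_and_plot_components(records, horizon_days=28):
    """Fit a default Prophet model and plot its decomposition (illustrative helper)."""
    import pandas as pd
    from prophet import Prophet

    df = pd.DataFrame(records)
    df["ds"] = pd.to_datetime(df["ds"])

    model = Prophet()  # out-of-the-box settings
    model.fit(df)

    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)

    # One panel per fitted component: the trend plus each seasonality.
    fig = model.plot_components(forecast)
    return model, forecast, fig
```

The figure is the thing to look at: each panel should match something you actually believe about the data.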
Imaginary seasonality appears like a ghost!
Let’s generate some data completely at random and fit a Prophet model to it. I would not expect a time series forecast to pick up any kind of pattern here. However, it does:
It also performs worse than a naïve forecast (i.e. one that just forecasts the training set mean)!
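To make the comparison with a naïve forecast concrete, here is a minimal sketch using only the standard library. It does not use Prophet: weekday_mean_forecast is a stand-in model that, like a weekly seasonality, learns one level per position in the 7-day cycle. The function names, the seed, and the 92/28 split are all assumptions for illustration.

```python
import random
from statistics import mean


def random_series(n_days, seed=0):
    """Pure Gaussian noise: there is no trend or seasonality to find."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_days)]


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def naive_mean_forecast(train, horizon):
    """Forecast every future day with the training set mean."""
    level = mean(train)
    return [level] * horizon


def weekday_mean_forecast(train, horizon):
    """Forecast each day with the training mean of its slot in the 7-day cycle."""
    by_slot = [[] for _ in range(7)]
    for i, value in enumerate(train):
        by_slot[i % 7].append(value)
    slot_means = [mean(values) for values in by_slot]
    start = len(train)
    return [slot_means[(start + h) % 7] for h in range(horizon)]


series = random_series(120)
train, test = series[:92], series[92:]
print("naive MAE:  ", mae(test, naive_mean_forecast(train, len(test))))
print("weekday MAE:", mae(test, weekday_mean_forecast(train, len(test))))
```

Any weekday pattern the second model finds in this data is noise by construction, so the naïve score is the one a seasonal model has to beat before its extra components are worth trusting.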
To see why, let’s look at this Prophet model decomposed. Now we can see that it’s picking up a weekly seasonality which doesn’t exist:
Solution
If you don’t expect there to be a weekly seasonality, then you should instantiate your model with weekly_seasonality=False.
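In code, that could look like the sketch below. MODEL_KWARGS and build_model are hypothetical names; the only Prophet-specific part is the weekly_seasonality argument. The import is local so the settings can be inspected without Prophet installed.

```python
# Only the weekly component is switched off; yearly and daily seasonality
# are left on Prophet's default "auto" setting.
MODEL_KWARGS = {"weekly_seasonality": False}


def build_model(**overrides):
    """Build a Prophet model without a weekly component (hypothetical helper)."""
    from prophet import Prophet  # local import: only needed when a model is built

    return Prophet(**{**MODEL_KWARGS, **overrides})
```

After fitting a model built this way, the components plot should no longer contain a weekly panel.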
Constant trend, but with a missing day!
Prophet will not use missing data to update its components. So if your weekly total is constant but you have a missing date (say, because your shop closed and its sales got redistributed to the other days of the week), then Prophet will pick up a trend which doesn’t exist:
Decomposing it helps us see why:
Solution
If it’s an event you expect to see again, fill the missing date with 0 and set a custom seasonality. Otherwise, exclude it from training.
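The filling step can be sketched with the standard library alone. The function name fill_missing_days and the dict-of-dates input are assumptions for illustration; the custom seasonality is not shown, since its exact form depends on the event you are modelling.

```python
from datetime import date, timedelta


def fill_missing_days(observations, fill_value=0.0):
    """Return (day, value) pairs for every day between the first and last date.

    observations maps datetime.date -> value; any day without an entry
    gets fill_value, so a closed shop shows up as an explicit zero.
    """
    first, last = min(observations), max(observations)
    n_days = (last - first).days + 1
    filled = []
    for i in range(n_days):
        day = first + timedelta(days=i)
        filled.append((day, observations.get(day, fill_value)))
    return filled


# Hypothetical sales: the shop was closed on 2 January.
sales = {date(2021, 1, 1): 5.0, date(2021, 1, 3): 7.0}
for day, value in fill_missing_days(sales):
    print(day, value)
```

With the zero written in explicitly, the closed day becomes an observation the model has to explain rather than a gap it silently skips.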
The trend changes, but the weekly seasonality doesn’t!
If your trend is steadily increasing but the way each week’s total splits across its days stays constant, then running Prophet out-of-the-box will deliver poor results. This is because, by default, it uses additive (rather than multiplicative) seasonality:
Again, decomposing it helps us notice the issue:
Solution
If you expect your daily quantities to be proportions of your weekly total, use multiplicative weekly seasonality.
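The difference is easy to see with a bit of arithmetic. The numbers below are invented: a weekday that sits 50 units above a trend level of 100, modelled once as a fixed offset and once as a fixed proportion. build_multiplicative_model is a hypothetical helper; seasonality_mode is the Prophet argument, and as far as I know it applies to every seasonal component, not only the weekly one.

```python
def additive(trend, weekday_effect):
    """Additive seasonality: a fixed offset on top of the trend."""
    return trend + weekday_effect


def multiplicative(trend, weekday_share):
    """Multiplicative seasonality: a fixed proportion of the trend."""
    return trend * (1.0 + weekday_share)


# Both models agree at trend = 100 and diverge once the trend doubles.
for trend in (100.0, 200.0):
    print(trend, additive(trend, 50.0), multiplicative(trend, 0.5))


def build_multiplicative_model():
    """Prophet model whose seasonal components scale with the trend."""
    from prophet import Prophet  # local import: only needed when a model is built

    return Prophet(seasonality_mode="multiplicative")
```

When the trend doubles, the additive version still adds 50 while the multiplicative version adds 100, which is what a constant weekly split requires.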
Conclusion
Prophet has some fantastic diagnostic tools available, and they can help us understand when not to use it. Unfortunately, there is a trend among data scientists to approach modelling as purely a feature engineering problem: if the model doesn’t work well, they just try adding extra features until it does. However, especially in forecasting, it can pay dividends to examine what the model is doing and to adjust it accordingly. Prophet makes this very easy thanks to its plot_components function: by inspecting its output, you can figure out why the model might not be performing well.