Instead, plot its components, and check that they make sense
Many data scientists approach modelling as a feature engineering task: they gather features, do cross-validation, and if the score isn’t good enough, they keep adding more features until it is. In forecasting, where there is typically little data, this is risky. It can be much more effective to look inside your models and understand what they are doing. Prophet makes this especially easy thanks to its plot_components function. Let’s look at some examples of when using it can help diagnose poor forecasts.
Let’s generate some data completely at random…
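As a rough illustration of the idea, here is a minimal sketch that builds a series of pure noise and hands it to Prophet. The helper names make_random_series and fit_and_plot_components are hypothetical, the noise is assumed to be Gaussian, and the second function assumes the pandas and prophet packages are installed; it is not the code from the original post.

```python
import random
from datetime import date, timedelta


def make_random_series(n_days=365, seed=0):
    """Return (dates, values): consecutive daily dates and pure Gaussian noise."""
    rng = random.Random(seed)
    start = date(2020, 1, 1)
    dates = [start + timedelta(days=i) for i in range(n_days)]
    values = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
    return dates, values


def fit_and_plot_components(dates, values, horizon=30):
    """Fit Prophet to the series and plot the components of its forecast.

    Assumption: pandas and prophet are installed (older releases of the
    package were published as fbprophet).
    """
    import pandas as pd
    from prophet import Prophet

    # Prophet expects a DataFrame with a 'ds' (date) and a 'y' (value) column.
    df = pd.DataFrame({"ds": pd.to_datetime(dates), "y": values})
    model = Prophet()
    model.fit(df)
    future = model.make_future_dataframe(periods=horizon)
    forecast = model.predict(future)
    # One panel per fitted component (trend, seasonalities, ...).
    return model.plot_components(forecast)


dates, values = make_random_series()
# fit_and_plot_components(dates, values)  # uncomment where prophet is available
```

Since the data is noise by construction, any pronounced trend or seasonal pattern in the plotted components is something the model has invented rather than found.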
Make your return types more precise!
Suppose we have a function which takes a boolean argument inplace and that its return type depends on the value of inplace. If it’s True, it returns None, else it returns an integer:
By inspecting the function, we can see that we expect the return type in line 17 to be None, while the return types in lines 18 and 19 to be
mypy doesn’t peek inside the function, and so believes them both to be Optional[Cat]. Indeed, running the above snippet, we get:
main.py:17: note: Revealed type is 'Union[main.Cat, None]'
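One common way to give mypy the missing information is to combine typing.overload with typing.Literal, so that each value of inplace is tied to its own return type. The sketch below assumes Python 3.8 or later. The rename function and the Cat class body are hypothetical stand-ins, since the original function is not reproduced here.

```python
from typing import Literal, Optional, overload


class Cat:
    def __init__(self, name: str) -> None:
        self.name = name


# Overload 1: inplace=True mutates the argument and returns nothing.
@overload
def rename(cat: Cat, name: str, inplace: Literal[True]) -> None: ...


# Overload 2: inplace=False (the default) returns a new Cat.
@overload
def rename(cat: Cat, name: str, inplace: Literal[False] = ...) -> Cat: ...


# The implementation keeps the broad signature; callers only see the overloads.
def rename(cat: Cat, name: str, inplace: bool = False) -> Optional[Cat]:
    if inplace:
        cat.name = name
        return None
    return Cat(name)


tom = Cat("Tom")
result_inplace = rename(tom, "Felix", inplace=True)  # mypy infers None
result_copy = rename(tom, "Garfield")                # mypy infers Cat
```

With the overloads in place, a type checker no longer has to fall back to the Optional return type of the implementation when the argument is a literal True or False.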
Suppose you’ve tossed a coin 1,000 times and obtained 292 heads. You’d like to know what the probability of obtaining heads is from a single coin toss, but you don’t just want a single estimate: you want an entire distribution. If you define
and then model y as a binomial distribution with n=1,000, then the posterior distribution is very easy to obtain with just a few lines of code:
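For this particular model the posterior can also be written down without a sampler. The sketch below assumes a uniform Beta(1, 1) prior on the probability of heads (an assumption, since the prior is not shown here), in which case the posterior is Beta(1 + 292, 1 + 708). It uses only the standard library and the conjugate closed form rather than a probabilistic-programming library, so it is an alternative route, not the code from the original post.

```python
import random

n_tosses = 1000
n_heads = 292

# Assumption: uniform Beta(1, 1) prior on p, the probability of heads.
prior_a, prior_b = 1.0, 1.0

# A Beta prior is conjugate to the binomial likelihood, so the posterior is Beta too.
post_a = prior_a + n_heads
post_b = prior_b + (n_tosses - n_heads)
posterior_mean = post_a / (post_a + post_b)

# Draw from the posterior to approximate a 95% credible interval.
rng = random.Random(0)
samples = sorted(rng.betavariate(post_a, post_b) for _ in range(10_000))
lower = samples[int(0.025 * len(samples))]
upper = samples[int(0.975 * len(samples))]

print(f"posterior mean: {posterior_mean:.4f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

The sorted samples stand in for the full distribution: any other summary (median, other quantiles, a histogram) can be read off the same list.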
Say you want to show off how many pull requests you’ve submitted this year, or how many you’ve reviewed. Manually counting them on GitHub would be hard work, but there’s a simpler and more efficient way.
In a Python3 virtual environment, install
On your GitHub account, go to “settings”
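For comparison, here is a minimal sketch of one way to get such counts straight from GitHub’s REST search endpoint using only the standard library. This is not necessarily the tool the steps above install. The function names, the octocat username, and the optional token argument are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/issues"


def build_pr_search_url(username, year, role="author"):
    """Build a search URL for PRs created in `year` that `username` authored or reviewed."""
    qualifier = {"author": "author", "reviewer": "reviewed-by"}[role]
    query = f"type:pr {qualifier}:{username} created:{year}-01-01..{year}-12-31"
    # per_page=1 keeps the response small; only the total count is needed.
    return SEARCH_URL + "?" + urllib.parse.urlencode({"q": query, "per_page": 1})


def count_pull_requests(username, year, role="author", token=None):
    """Return the number of matching pull requests (performs a network request)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Assumption: a personal access token, which raises the rate limit.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(
        build_pr_search_url(username, year, role), headers=headers
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["total_count"]


example_url = build_pr_search_url("octocat", 2020)
# count_pull_requests("octocat", 2020)                   # PRs submitted
# count_pull_requests("octocat", 2020, role="reviewer")  # PRs reviewed
```

Unauthenticated search requests are rate-limited, which is one reason to pass a token when running this more than a handful of times.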
We data scientists love Jupyter Notebooks: they enable fast prototyping, let us tell stories with our code, and allow us to explore datasets thoroughly. Yet, as anyone who’s tried to keep a suite of Jupyter Notebooks under version control will tell you, they’re really hard to maintain.
There are many reasons for this, e.g.:
If we want to use any of the following excellent tools:
You’ve just written an amazing Jupyter Notebook, and you’d like to send it to your coworkers. Asking them to install Jupyter isn’t an option, and neither is asking IT for a server on which to host your page. What do you do?
I’ll show you how to export your notebook as a self-contained HTML report which anyone can open in their browser. I’ll start with the simplest possible example of how to export an HTML report, then I’ll show how to hide the input cells, and finally I’ll show how to toggle showing/hiding code cells.
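The first two steps can be sketched with the jupyter nbconvert command line, wrapped here in Python. The helper names and the report.ipynb filename are hypothetical, and the sketch assumes Jupyter with nbconvert is installed on the machine doing the export (not on the readers’ machines). Toggling code cells interactively is not covered by this sketch.

```python
import subprocess


def build_nbconvert_command(notebook_path, hide_input=False):
    """Return the nbconvert command that exports a notebook to standalone HTML."""
    command = ["jupyter", "nbconvert", "--to", "html"]
    if hide_input:
        # --no-input drops the code cells and keeps only their outputs and the markdown.
        command.append("--no-input")
    command.append(notebook_path)
    return command


def export_notebook(notebook_path, hide_input=False):
    """Run the export; the HTML file is written next to the notebook."""
    subprocess.run(build_nbconvert_command(notebook_path, hide_input), check=True)


simple_command = build_nbconvert_command("report.ipynb")
hidden_command = build_nbconvert_command("report.ipynb", hide_input=True)
# export_notebook("report.ipynb", hide_input=True)  # requires jupyter on PATH
```

Building the command separately from running it makes it easy to print and check the exact invocation before any file is written.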
Data Scientist, pandas maintainer, Kaggle competitions expert, Univ. of Oxford MSc