How I won $6,000 in the M6 Forecasting Competition

Feb 19, 2023

Spoiler: not with deep learning

The M6 Forecasting Competition involved two tasks: given 100 assets, you had to

forecast a probability distribution for the quintile in which the log returns of each asset would be at the end of each month;
determine how you would invest a hypothetical unit of money.

The competition ran for 12 whole months, with prizes each quarter. I was lucky enough to land 2nd in Q1, and 10th overall, out of 163 participants. The competition is finally over, so I can now share what I did.

My approach: forecasting

General idea

My workflow was:

for each asset, take the last 200 values of its adjusted close price;
estimate the covariance between their returns, using a covariance estimator from the precise Python package (https://github.com/microprediction/precise);
sample returns from the estimated covariance matrix, and count how often each asset lands in each quintile.
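The steps above can be sketched as follows. This is a minimal illustration, not the code I submitted: it uses NumPy's plain sample covariance as a stand-in for a precise estimator, and it assumes zero-mean, normally distributed returns. The function name and parameters are made up for the example.

```python
import numpy as np


def quintile_probabilities(prices, n_samples=10_000, seed=0):
    """Estimate, by Monte Carlo, how often each asset lands in each quintile.

    prices: array of shape (n_days, n_assets) of adjusted close prices.
    Returns an array of shape (n_assets, 5); each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    log_returns = np.diff(np.log(prices), axis=0)
    # Assumption: sample covariance, standing in for a `precise` estimator.
    cov = np.cov(log_returns, rowvar=False)
    n_assets = cov.shape[0]
    # Assumption: returns are drawn from a zero-mean multivariate normal.
    draws = rng.multivariate_normal(np.zeros(n_assets), cov, size=n_samples)
    # Rank assets within each draw (0 = lowest return), then map ranks to 0..4.
    ranks = draws.argsort(axis=1).argsort(axis=1)
    quintiles = ranks * 5 // n_assets
    counts = np.stack([(quintiles == k).sum(axis=0) for k in range(5)], axis=1)
    return counts / n_samples
```

Under the zero-mean assumption, only the ranking of the sampled returns matters, so it makes no difference whether the covariance is expressed per day or per month.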

One difficulty lies in how to choose which covariance estimator to use. My answer is: cross-validation, cross-validation, cross-validation (this also happens to be my answer to virtually any question about Data Science…).

Cross-validation strategy

Suppose we’re forecasting for the period 2022-03-07 to 2022-04-01. Then, for each covariance estimator in precise, I would:

train on data strictly before 2022-02-07, forecast for the period 2022-02-07 to 2022-03-04;
train on data strictly before 2022-01-10, forecast for the period 2022-01-10 to 2022-02-04;
train on data strictly before 2021-12-13, forecast for the period 2021-12-13 to 2022-01-07.
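The three folds follow a regular pattern: each one starts 4 weeks before the next, and runs from a Monday to the Friday 25 days later. Here is a small sketch that generates them, assuming that pattern holds; `backtest_folds` is a hypothetical helper, not something from precise.

```python
from datetime import date, timedelta


def backtest_folds(forecast_start, n_folds=3):
    """Return (period_start, period_end) for the n_folds previous 4-week periods.

    For each fold, training uses data strictly before period_start.
    """
    folds = []
    for i in range(1, n_folds + 1):
        start = forecast_start - timedelta(weeks=4 * i)
        end = start + timedelta(days=25)  # Monday to the Friday of the 4th week
        folds.append((start, end))
    return folds


for start, end in backtest_folds(date(2022, 3, 7)):
    print(f"train before {start}, forecast {start} to {end}")
```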

Then, calculate the RPS (ranked probability score) for each, take the best-performing covariance estimators, and average them.
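For reference, here is a sketch of the scoring and selection step. The RPS formula is my reading of the competition guidelines (mean squared difference between cumulative forecast and cumulative outcome over the 5 quintiles). `best_estimators` and its `top_k` parameter are hypothetical, and the estimator names in the usage example are placeholders.

```python
import numpy as np


def rps(forecast, actual_quintile):
    """Ranked probability score for one asset; lower is better.

    forecast: 5 probabilities summing to 1.
    actual_quintile: integer in 0..4, the quintile the asset ended up in.
    """
    outcome = np.zeros(5)
    outcome[actual_quintile] = 1.0
    diff = np.cumsum(forecast) - np.cumsum(outcome)
    return float(np.mean(diff ** 2))


def best_estimators(scores, top_k=3):
    """scores: {estimator_name: [RPS per fold]}. Return top_k names by mean RPS."""
    return sorted(scores, key=lambda name: np.mean(scores[name]))[:top_k]


# Placeholder names and made-up scores, just to show the shape of the data.
cv_scores = {"est_a": [0.160, 0.150], "est_b": [0.140, 0.150], "est_c": [0.170, 0.170]}
print(best_estimators(cv_scores, top_k=2))
```

A forecast of 0.2 for every quintile scores 0.16 on average under this formula, which is the number to beat.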

With this approach I was able to beat the benchmark, though, full disclaimer, I did not beat it every single month (just in aggregate). Only 3 participants managed to beat the benchmark every single month, and I’m looking forward to reading their solutions.

My approach: investing

Honestly, my result here was 50% luck and 50% luck. I really don’t know anything about investing and stocks. So here’s what I did:

estimate a covariance matrix using the technique described above;
put that through the portfolio constructors of precise.

Then, I would cross-validate (same setup as above), and choose the portfolio constructors which would have beaten the benchmark in each of the last 3 months. If I could find no such portfolio constructors, I would just use the benchmark.
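The selection rule fits in a few lines. This is a sketch with hypothetical names and made-up scores; it assumes one score per backtest month, with higher being better.

```python
def choose_constructors(fold_scores, benchmark_scores):
    """Keep constructors that beat the benchmark in every backtest fold.

    fold_scores: {constructor_name: [score per fold]}, higher is better.
    benchmark_scores: [benchmark score per fold], in the same fold order.
    Returns the qualifying names, or ["benchmark"] when none qualify.
    """
    winners = [
        name
        for name, scores in fold_scores.items()
        if all(s > b for s, b in zip(scores, benchmark_scores))
    ]
    return winners or ["benchmark"]


# Placeholder constructor names and made-up scores.
scores = {"ctor_a": [0.5, 0.2, 0.9], "ctor_b": [0.5, -0.1, 0.9]}
print(choose_constructors(scores, benchmark_scores=[0.1, 0.0, 0.3]))
```

Requiring a win in every fold is deliberately strict, which is why the fallback to the benchmark matters.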

Magic

I received a question about how, in the last quarter, I was able to “achieve” an IR score of exactly 1.000. The answer lies in the asset DRE, which stopped being publicly traded around Q7 and so its price stayed constant thereafter. At the end of Q8 I noticed that my IR score was barely positive, so I decided to invest everything in DRE for the final quarter, thus “locking in” my IR score until the end of the competition (in fact, guaranteeing it would ever-so-slightly rise).
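Here is a toy illustration of why the overall score gets locked in and creeps up. It assumes IR is the total of the daily portfolio log returns divided by their sample standard deviation, which is my reading of the competition guidelines. The returns are made up.

```python
import numpy as np


def information_ratio(daily_log_returns):
    """Assumed definition: total log return / sample std of daily log returns."""
    r = np.asarray(daily_log_returns, dtype=float)
    return r.sum() / r.std(ddof=1)


rng = np.random.default_rng(0)
# Made-up daily portfolio returns, shifted so the total is barely positive.
before = rng.normal(0.0, 0.01, size=180)
before += (0.01 - before.sum()) / before.size

# Holding only a constant-price asset appends days with exactly zero return.
after = np.concatenate([before, np.zeros(60)])

ir_before = information_ratio(before)
ir_after = information_ratio(after)
print(ir_before, ir_after)
```

The total return cannot change once every new day returns zero, while the extra zero-return days pull the standard deviation down a little, so a positive ratio can only go up.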

In the end, my investing score would’ve been higher if I’d just used the benchmark each time, so more fool me.

OK, where’s the code? Where can I read more?

Here you go: https://github.com/MarcoGorelli/wound-ignite.

Other resources:

M6 competition website: https://m6competition.com
Notebook on how to calculate the evaluation metrics: https://www.kaggle.com/code/marcogorelli/m6-calculation-example-70x-faster
Precise Python package: https://github.com/microprediction/precise
Blog post on worse-than-random participants: https://mikeharrisny.medium.com/m6-competition-and-worse-than-random-participants-68b3b89f8850
Blog post on beating most participants using the options market: https://medium.com/geekculture/the-options-market-beat-94-of-participants-in-the-m6-financial-forecasting-contest-fa4f47f57d33

Closing remarks

When I entered the competition, I had a gut feeling that a simple solution would have a good chance of ending up in the top 10% because most participants would end up over-fitting anyway. Pretty sure this is exactly what ended up happening: only about 23% of participants beat the benchmark in the forecasting track. If there’s one lesson you take away from this, it’s that cross-validation is fundamental to Data Science.

Finally, a note on the team name: for some reason, a Ruscist propagandist started commenting on virtually all LinkedIn posts about the competition, so I thought it’d be apt to piss him off by including “Glory to Ukraine” in my team name and getting it close to the top of the leaderboard.

Thank you for reading, and please follow me on, ermm, GitHub.