Accurate Forecasting is Overrated

Introduction

A very human tendency is to reduce the uncertainty of our environment. As we have seen in the past few weeks, the outbreak of the novel coronavirus, COVID-19, demonstrated this: people panic bought common items such as toilet paper, rice, and pasta, and rapidly sold off assets on the stock market. At the same time, however, this very need to craft an environment with a low level of uncertainty, i.e., high security, has been responsible for our technological progress.

The practice of forecasting is no different. The reason we forecast, estimate, or predict is to reduce uncertainty about the scenarios that may unfold, so that we can better prepare ourselves for them when the time comes.

This post isn’t about the philosophical implications of uncertainty and its management. Instead, it’s about where forecasts and predictions operate effectively, and about strategies for coping when the resulting decisions turn out to be wrong.

The Three Environments

Before we can talk about forecasting, it’s important to know which environment forecasting operates in.

Deterministic

This is the environment with set rules. A set of if/else statements in computer code is the best example here, along with 1 + 1 = 2 and its ilk. There is absolutely zero uncertainty in such a case: there is a direct one-to-one mapping between inputs and outputs. No forecasting is required here because we know exactly what to expect. This is the golden realm that engineers can only dream of.

Stochastic

There is variability in the environment, leading to uncertainty. Unlike the deterministic environment, where there is a direct mapping between cause and effect, the effect here can vary, making it harder to pin down the cause. However, given enough repetitions, patterns can be observed, allowing us to learn from past observations. This is where statistical models apply, because they provide a way to quantify the variability of the environment, and predictions can be made from past experience. How many samples you need in order to learn depends on the system under study.
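To make this concrete, here is a minimal sketch (my own illustration, not a model from this post) of how repeated observations in a stochastic environment let us quantify variability and turn it into a rough prediction interval; the simulated demand process and all of its parameters are assumptions:

```python
# A minimal sketch: quantify variability from repeated observations,
# then make a hedged prediction. The "daily demand" process is assumed.
import random
import statistics

random.seed(0)

# Pretend history: 200 days of demand drawn from a noisy process.
history = [random.gauss(mu=100, sigma=15) for _ in range(200)]

mean = statistics.mean(history)
std = statistics.stdev(history)

# A rough ~95% prediction interval for tomorrow's demand.
low, high = mean - 2 * std, mean + 2 * std
print(f"forecast {mean:.1f}, likely between {low:.1f} and {high:.1f}")
```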

Unknown Unknowns

For lack of a better description, I quote Rumsfeld. This is the regime of step changes, where we don’t know what we don’t know. Nassim Taleb calls these step changes "Black Swans". This is where highly improbable events occur, and where no statistical model can be applied to make a prediction. Examples include the invention of the Internet (a positive black swan), and the Global Financial Crisis as well as the recent COVID-19 outbreak (negative black swans). Black swan events arise because the world is a complex place with many interdependent subcomponents interacting; any highly improbable event can happen, and it’s only a matter of time.

We can see that forecasting lies squarely within the second environment, where there is something to be learned beforehand. In any forecasting method, therefore, the implicit underlying assumption is that the past can help predict the future.

Decision Making with Forecasts

More often than not, forecasting actually takes a backseat to making a decision. We are fascinated by pundits predicting presidential elections or stock prices not so much because we care about the prediction itself, but because of its implications. As a professor at the University of Melbourne, where I was studying, once remarked, the number coming out of the forecast isn’t as important as the decision derived from it.

At its core, any forecast is just a guess. That being said, depending on the environment, some things are easier to guess than others. For instance, the predictions we get from models in physics are very good, because the underlying models are very good approximations of reality. We can trust the Newtonian equation F = ma on this side of the Milky Way as well as in another corner of the universe. We can’t, however, place the same trust in financial models: they depend on human behavior, which is far harder to model because it doesn’t always follow well-defined rules.

So we know that any forecast is potentially wrong. What then can we do about it?

Let’s look at some strategies to handle mistakes with forecasts.

Make lots of guesses and decisions, sized appropriately

This is the method favored by traders, sports bettors, and hedge funds. The latter includes Renaissance Technologies, whose hit rate is just slightly above 50%[1], yet which has made a fortune on the market since 1989. Your forecast model doesn’t have to be right all the time; it just has to be right more often than not. Given enough time, a slight edge becomes a huge one thanks to the wonders of compounding, while losses are constrained by limiting your exposure so that no single bet is catastrophic.

Moreover, a less obvious observation is that the magnitude of the expected reward from your decision is just as important as, if not more important than, getting the forecast right. This is where a strategy like Kelly allocation comes in.
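As a rough illustration of both points, here is a minimal sketch of Kelly sizing for a repeated binary bet with a slight edge; the 52% hit rate, even-money payout, and number of bets are all assumptions, and the function names are mine:

```python
# A minimal sketch: compound a small edge with Kelly-sized bets.
import random

def kelly_fraction(p, b):
    """Fraction of bankroll to stake on a bet won with probability p
    that pays b-to-1 on a win (the classic Kelly formula p - (1-p)/b)."""
    return p - (1 - p) / b

def simulate(p=0.52, b=1.0, n_bets=10_000, seed=42):
    """Compound a bankroll of 1.0 over n_bets, staking the Kelly fraction."""
    random.seed(seed)
    f = kelly_fraction(p, b)
    bankroll = 1.0
    for _ in range(n_bets):
        stake = f * bankroll
        if random.random() < p:
            bankroll += b * stake   # win: gain b times the stake
        else:
            bankroll -= stake       # loss: forfeit the stake
    return bankroll

print(simulate())  # a 52% hit rate, sized sensibly, compounds into a large gain
```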

Update whenever new information arises

John Maynard Keynes is often credited with the following quote:

When the facts change, I change my mind.

Whether Keynes actually said it is beside the point; what matters is that the quote exemplifies intelligent, rational behavior: adapt to circumstances as they change. We live in a complex, non-linear environment, so we should always revisit our assumptions. More importantly, change is only accelerating.

Less than 20 years ago, computers were not powerful enough to run the sophisticated deep learning algorithms we have today. The wide availability of data has given rise to a class of new professions such as the data engineer, data analyst, and data scientist. Robotics is improving rapidly, and soon some professions may become obsolete, changing the very fabric of society. Things change, and they can change very quickly indeed.

One way to cope with this is Bayesian statistics. Put simply, Bayesian statistics assumes that our knowledge about an event is encoded by our prior beliefs and by what we observe. It is a way to model a lack of information by expressing that lack as uncertainty. As new information arises, we factor it in, gradually decreasing our uncertainty.
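As a minimal sketch of the idea (my own illustration, not a model from this post), here is a conjugate Beta-Binomial update of a belief about a forecast’s hit rate; the uniform prior and the batches of observations are assumptions:

```python
# A minimal sketch: Bayesian updating with a Beta-Binomial model.
# As more outcomes are observed, the posterior mean settles and the
# uncertainty (posterior standard deviation) shrinks.

def update_beta(alpha, beta, hits, misses):
    """Posterior Beta(alpha, beta) after observing hits and misses."""
    return alpha + hits, beta + misses

# Start with a vague prior: Beta(1, 1) is uniform over [0, 1].
alpha, beta = 1.0, 1.0

# Each batch of new observations shifts the belief and shrinks uncertainty.
for hits, misses in [(6, 4), (55, 45), (520, 480)]:
    alpha, beta = update_beta(alpha, beta, hits, misses)
    mean = alpha / (alpha + beta)
    var = (alpha * beta) / ((alpha + beta) ** 2 * (alpha + beta + 1))
    print(f"posterior mean {mean:.3f}, std {var ** 0.5:.4f}")
```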

Have backups and spares

The concept of a backup has existed since the dawn of mankind, from Joseph storing up grain during prosperous times in the Bible, to soldiers carrying a second sword or sidearm. Backups and redundant systems are standard in engineering.

It’s a simple idea, but one that doesn’t get the credit it deserves these days, since some modern business models rely on fragile, highly optimized just-in-time supply chains for cost savings. This strategy works relatively well even in the unknown-unknowns environment, since it provides a margin of error against a poor forecast. Sometimes, when a regime change happens, no amount of past learning can help in predicting the future.

The downside of backups is that they incur cost (which is why some low-margin business models skimp on them to turn a profit). For example, running two clusters to maintain a search service at a company with Google-scale traffic clearly costs more than running a single cluster. However, what you are preparing for is a worst-case scenario, as highly optimized systems only work during the good times.

Split your system into independent subcomponents

This is another strategy favored by engineers: any component that is decoupled from the larger system allows failures to be isolated. The idea is simple: if a subsystem is isolated from the wider system, then a failure in that subsystem does not affect the rest. This lets us make forecasts for each subsystem separately, effectively partitioning our failures. The same idea underlies portfolio diversification: we hold multiple uncorrelated assets so that a failure in one doesn’t drag down the overall returns of the portfolio.
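A small sketch of the diversification point, using assumed numbers for the asset risk and correlation: the variance of an equally weighted portfolio of uncorrelated assets falls roughly as 1/N, so no single failure dominates:

```python
# A minimal sketch: portfolio variance shrinks as uncorrelated assets are added.

def portfolio_variance(n_assets, asset_variance=0.04, correlation=0.0):
    """Variance of an equally weighted portfolio of n_assets, each with the
    given variance and pairwise correlation."""
    w = 1.0 / n_assets
    own = n_assets * (w ** 2) * asset_variance
    cross = n_assets * (n_assets - 1) * (w ** 2) * correlation * asset_variance
    return own + cross

for n in (1, 4, 16, 64):
    print(n, round(portfolio_variance(n), 5))  # 0.04, 0.01, 0.0025, 0.00063
```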

If you have to remember one thing, it is this:

Never have a single point of failure.

As an example, the Internet was designed to be a decentralized system. The goal was for the network to continue functioning in the event of a nuclear attack[2].

However, decentralization and independent subsystems incur costs in coordination and communication, and too much diversification dilutes returns. But at least something will survive when bad events happen.

Conclusion

Building models that produce highly accurate forecasts is an interesting intellectual exercise, and has been a staple at Kaggle. In the real world, however, forecasts are almost always tied to decision making, such as figuring out how much stock to hold at a supermarket outlet, or how many customers will want an Uber ride at a specific location and time. We need to come to terms with the limitations of forecasts and be prepared for them. Hopefully the strategies outlined above help us do just that.

References

[1] Gregory Zuckerman, “The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution”, 2019. link

[2] Mitch Waldrop, “DARPA and the Internet Revolution”, DARPA, pp. 28-85, 2015. link