Way back in 2015, I published an article giving away a free, simple, forecasting tool, and talking through use cases for forecasting in SEO. It was a quick, effective way to see if a change to your site traffic is some kind of seasonality you can ignore, something to celebrate, or a worrying sign of traffic loss.
In short: you could enter in a series of data, and it would plot it out on a graph like the image above.
Five years later, I still get people — from former colleagues to complete strangers — asking me about this tool, and more often than not, I’m asked for a version that works directly in spreadsheets.
I find this easy to sympathize with: a spreadsheet is more flexible, easier to debug, easier to expand upon, easier to maintain, and a format that people are very familiar with.
The tradeoff when optimizing for those things is, although I’ve improved on that tool from a few years ago, I’ve still had to keep things manageable in the famously fickle programming environment that is Excel/Google Sheets. That means the template shared in this post uses a simpler, slightly less performant model than some tools with external code execution (e.g. Forecast Forge).
In this post, I’m going to give away a free template, show you how it works and how to use it, and then show you how to build your own (better?) version. (If you need a refresher on when to use forecasting in general, and concepts like confidence intervals, refer to the original article linked above.).
There is one thing I want to expand on before we get into the spreadsheet stuff: the different types of SEO forecast.
Broadly, I think you can put SEO forecasts into three groups:
Type two has its merits, but if you compare the likes of Ahrefs/SEMRush/Sistrix data to your own analytics, you’ll see how hard this is to generalize. As an aside, I don’t think type one is as ridiculous as it looks, but it’s not something I’ll be exploring any further in this post. In any case, the template in this post fits into type three.
Why, nothing at all. One thing you’ll notice about my description of type three above is that it doesn’t mention anything SEO-specific. It could equally apply to direct traffic, for example. That said, there are a couple of reasons I’m suggesting this specifically as an SEO forecast:
I mentioned that type two above is very challenging, and this is because of the highly non-deterministic nature of SEO and the generally poor quality of detailed data in Search Console and other SEO-specific platforms. In addition, to get an accurate idea of seasonality, you’d need to have been warehousing your Search Console data for at least a couple of years.
For many other channels, high quality, detailed historic data does exist, and relationships are far more predictable, allowing more granular forecasts. For example, for paid search, the Forecast Forge tool I mentioned above builds in factors like keyword-level conversion data and cost-per-click based on your historical data, in a way that would be wildly impractical for SEO.
That said, we can still combine multiple types of forecast in the template below. For example, rather than forecasting the traffic of your site as a whole, you might forecast subfolders separately, or brand/non-brand separately, and you might then apply percentage growth to certain areas or build in anticipated ranking changes. But, we’re getting ahead of ourselves…
The first thing you’ll need to do is make a copy (under the “File” menu in the top left, but automatic with the link I’ve included). This means you can enter your own data and play around to your heart’s content, and you can always come back and get a fresh copy later if you need one.
Then, on the first tab, you’ll notice some cells have a green or blue highlight:
You should only be changing values in the colored cells.
The blue cells in column E are basically to make sure everything ends up correctly labelled in the output. So, for example, if you’re pasting session data, or click data, or revenue data, you can set that label. Similarly, if you enter a start month of 2018-01 and 36 months of historic data, the forecast output will begin in January 2021.
On that note, it needs to be monthly data — that’s one of the tradeoffs for simplicity I mentioned earlier. You can paste up to a decade of historic monthly data into column B, starting at cell B2, but there are a couple of things you need to be careful of:
Make sure you also delete any leftovers of my example data in column B.
Once you’ve done that, you can head over to the “Outputs” tab, where you’ll see something like this:
Column C is probably the one you’re interested in. Keep in mind that it’s full of formulas here, but you can copy and paste as values into another sheet, or just go to File > Download > Comma-separated values to get the raw data.
You’ll notice I’m only showing 15 months of forecast in that graph by default, and I’d recommend you do the same. As I mentioned above, the implicit assumption of a forecast is that historical context carries over, unless you explicitly include changed scenarios like COVID lockdowns into your model (more on that in a moment!). The chance of this assumption holding two or three years into the future is low, so even though I’ve provided forecast values further into the future, you should keep that in mind.
The upper and lower bounds shown are 95% confidence intervals — again, you can recap on what that means in my previous post if you so wish.
You may by now have noticed the “Advanced” tab:
Although I said I wanted to keep this simple, I felt that given everything that happened in 2020, many people would need to incorporate major external factors into their model.
In the example above, I’ve filled in column B with a variable for whether or not the UK was under COVID lockdown. I’ve used “0.5” to represent that we entered lockdown halfway through March.
You can probably make a better go of this for the relevant factors for your business, but there are a few important things to keep in mind with this tab:
Here’s some example use cases of this tab for you to consider:
There’s two major differences in method between this template and my old tool:
If you’re seeing a significant difference in the forecast values between the two, it almost certainly comes down to the second reason, and although it adds a little complexity, in the vast majority of cases the new technique is more realistic and flexible.
It’s also far less likely to predict zero or negative traffic in the case of a severe downwards trend, which is nice.
There’s a hidden tab in the template where you can take a peek, but the short version is the “LINEST()” spreadsheet formula.
The inputs I’m using are:
The formula then gives a series of “coefficients” as outputs, which can be multiplied with values and added together to form a prediction like:
You can see in that hidden sheet I’ve labelled and color-coded a lot of the outputs from the Linest formula, which may help you to get started if you want to play around with it yourself.
If you do want to play around with this yourself, here are some areas I personally have in mind for further expansion that you might find interesting:
Richard Fergie, whose Forecast Forge tool I mentioned a couple of times above, also provided some great suggestions for improving forecast accuracy with fairly limited extra complexity:
I may or may not include some or all of the above myself over time, but if so I’ll make sure I use the same link and make a note of it in the spreadsheet, so this article always links to the most up-to-date version.
If you’ve made it this far, what would you like to see? Let me know in the comments!