Building local pages in any amount can be a painful task. It’s hard to strike the right mix of on-topic content, expertise, and location, and the temptation to take shortcuts has always been tempered by the fact that good, unique content is almost impossible to scale.
In this week’s edition of Whiteboard Friday, Russ Jones shares his favorite white-hat technique using natural language generation to create local pages to your heart’s content.
Hey, folks, this is Russ Jones here with Moz again to talk to you about important search engine optimization issues. Today I’m going to talk about one of my favorite techniques, something that I invented several years ago for a particular client and that has just become more and more important over the years.
I call this using natural language generation to create hyper-local content. Now I know that there’s a bunch of long words in there. Some of you are familiar with them, some of you are not.
So let me just kind of give you the scenario, which is probably one you’ve been familiar with at some point or another. Imagine you have a new client and that client has something like 18,000 locations across the United States.
Then you’re told by Google you need to make unique content. Now, of course, it doesn’t have to be 18,000. Even 100 locations can be difficult, not just to create unique content but to create uniquely valuable content that has some sort of relevance to that particular location.
So what I want to do today is talk through one particular methodology that uses natural language generation in order to create these types of pages at scale.
Now there might be a couple of questions that we need to just go ahead and get off of our plates at the beginning. So first, what is natural language generation? Well, one of the earliest practical uses of natural language generation was generating weather forecasts and warnings. You’ve actually probably seen this 100,000 times.
Whenever there’s a thunderstorm or, let’s say, a high wind warning or something, you’ve seen one on the bottom of a television, if you’re older like me, or you’ve gotten one on your cellphone, and it says the National Weather Service has issued some sort of warning about some sort of weather alert that’s dangerous and you need to take cover.
Well, the language that you see there is generated by a machine. It takes into account all of the data that they’ve arrived at regarding the weather, and then they put it into sentences that humans automatically understand. It’s sort of like Mad Libs, but a lot more technical in the sense that what comes out of it, instead of being funny or silly, is actually really useful information.
That’s our goal here. We want to use natural language generation to produce local pages for a business that has information that is very useful.
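To make the Mad Libs comparison concrete, here is a minimal sketch of template-based generation. The alert fields, the county, and the 50 mph cutoff are all hypothetical values chosen for illustration; this is not how any real weather service builds its alerts.

```python
# Hypothetical structured data, as a forecasting system might supply it.
alert = {
    "agency": "The National Weather Service",
    "hazard": "high wind warning",
    "area": "King County",
    "until": "9 PM",
    "max_gust_mph": 60,
}


def render_alert(data: dict) -> str:
    """Turn structured alert data into sentences a person can read."""
    sentence = (
        f"{data['agency']} has issued a {data['hazard']} "
        f"for {data['area']} until {data['until']}."
    )
    # Add a detail sentence only when the data supports it
    # (the 50 mph cutoff is an assumption for this sketch).
    if data.get("max_gust_mph", 0) >= 50:
        sentence += f" Gusts of up to {data['max_gust_mph']} mph are possible."
    return sentence


print(render_alert(alert))
```

The point is that the data decides what gets said: a calmer forecast would simply drop the second sentence.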
Now the question we almost always get or I at least almost always get is: Is this black hat? One of the things that we’re not supposed to do is just auto-generate content.
So I’m going to take a moment towards the end to discuss exactly how we differentiate this type of content creation from the standard, Mad Libs-style approach of just plugging different city names into boilerplate content. What we’re doing here is providing uniquely valuable content to our customers, and because of that it passes the test of being quality content.
So let’s do this. Let’s talk about probably what I believe to be the easiest methodology, and I call this the Google Trends method.
So let’s step back for a second and talk about this business that has 18,000 locations. Now what do we know about this business? Well, businesses have a couple of things that are in common regardless of what industry they’re in.
They have either products or services, and those products and services might have styles or flavors or toppings, just all sorts of things that you can compare about the different items and services that they offer. Therein lies our opportunity to produce unique content across almost any region in the United States.
The tool we’re going to use to accomplish that is Google Trends. So the first step that you’re going to do is you’re going to take this client, and in this case I’m going to just say it’s a pizza chain, for example, and we’re going to identify the items that we might want to compare. In this case, I would probably choose toppings for example.
So we would be interested in pepperoni and sausage and anchovies and God forbid pineapple, just all sorts of different types of toppings that might differ from region to region, from city to city, and from location to location in terms of demand. So then what we’ll do is we’ll go straight to Google Trends.
The best part about Google Trends is that it’s not just providing information at a national level. You can narrow it down to the state level, the metro-area level, or in some cases the city level, and because of this it allows us to collect hyper-local information about this particular category of services or products.
So, for example, this is actually a comparison of the demand for pepperoni versus mushroom versus sausage toppings in Seattle right now. So when people are Googling for pizza, most of them are searching for pepperoni.
So what you would do is you would take all of the different locations and you would collect this type of information about them. So you would know that, for example, here there is probably about 2.5 times more interest in pepperoni than there is in sausage pizza. Well, that’s not going to be the same in every city and in every state. In fact, if you choose a lot of different toppings, you’ll find all sorts of things, not just the comparison of how much people order them or want them, but perhaps how things have changed over time.
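The "2.5 times more interest" comparison can be sketched as a small calculation. The weekly scores below are hypothetical numbers on a 0 to 100 scale, typed in by hand rather than fetched from Google Trends, and the function name is made up for this example.

```python
from statistics import mean

# Hypothetical weekly interest scores (0-100 scale) for one city.
seattle_interest = {
    "pepperoni": [80, 75, 85, 80],
    "sausage": [30, 34, 32, 32],
    "mushroom": [40, 38, 42, 40],
}


def interest_ratios(series_by_term: dict, baseline: str) -> dict:
    """Average each term's interest and express it relative to the baseline term."""
    averages = {term: mean(values) for term, values in series_by_term.items()}
    return {
        term: round(avg / averages[baseline], 2)
        for term, avg in averages.items()
    }


ratios = interest_ratios(seattle_interest, baseline="sausage")
print(ratios)
```

Running the same function over every location gives one comparable set of ratios per city, which is the raw material for the sentences generated later.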
For example, perhaps pepperoni has become less popular. If you were to look in certain cities, that probably is the case as vegetarianism and veganism have increased. Well, the cool thing about natural language generation is that we can automatically extract out those kinds of unique relationships and then use that as data to inform the content that we end up putting on the pages on our site.
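One simple way to pull out a "has this changed over time" relationship is to compare the early part of a series with the late part. This is only a sketch: the 15% threshold, the function name, and the yearly scores are assumptions for illustration, and a production system might prefer a proper regression.

```python
from statistics import mean


def classify_trend(values: list, threshold: float = 0.15) -> str:
    """Label a series rising, falling, or flat by comparing its two halves."""
    if len(values) < 2:
        return "flat"  # not enough data to say anything
    half = len(values) // 2
    early, late = mean(values[:half]), mean(values[half:])
    if early == 0:
        return "rising" if late > 0 else "flat"
    change = (late - early) / early
    if change > threshold:
        return "rising"
    if change < -threshold:
        return "falling"
    return "flat"


# Hypothetical yearly interest scores for one city.
history = {
    "pepperoni": [70, 68, 71, 69],
    "anchovies": [20, 16, 9, 5],
    "mushroom": [30, 34, 40, 46],
}
trends = {topping: classify_trend(scores) for topping, scores in history.items()}
print(trends)
```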
So, for example, let’s say we took Seattle. The system would automatically be able to identify these different types of relationships. Let’s say we know that pepperoni is the most popular. It might also be able to identify that let’s say anchovies have gone out of fashion on pizzas. Almost nobody wants them anymore.
Something of that sort. But what’s happening is we’re slowly but surely coming up with these trends and data points that are interesting and useful for people who are about to order pizza. For example, if you’re going to throw a party for 50 people and you don’t know what they want, you can either do what everybody does pretty much, which is let’s say one-third pepperoni, one-third plain, and one-third veggie, which is kind of the standard if you’re like throwing a birthday party or something.
But if you landed on the Pizza Hut page or the Domino’s page and it told you that in the city where you live people actually really like this particular topping, then you might actually make a better decision about what you’re going to order. So we’re actually providing useful information.
So this is where we’re talking about generating the text from the trends and the data that we’ve grabbed from all of the locales.
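As a rough illustration of turning those data points into copy, the sketch below builds a short paragraph from per-city ratios and trend labels. The input numbers, the sentence wording, and the function name are all hypothetical, and it assumes at least two toppings are available to compare.

```python
def describe_city(city: str, ratios: dict, trends: dict) -> str:
    """Build a short paragraph from per-city interest ratios and trend labels."""
    # Rank toppings from most to least interest.
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    multiple = ratios[top] / ratios[runner_up]
    sentences = [
        f"In {city}, {top} is the most searched-for topping, "
        f"drawing about {multiple:.1f} times the interest of {runner_up}."
    ]
    # Mention only toppings whose interest is actually moving.
    for topping in ranked:
        if trends.get(topping) == "rising":
            sentences.append(f"Interest in {topping} has been climbing.")
        elif trends.get(topping) == "falling":
            sentences.append(f"Interest in {topping} has been dropping off.")
    return " ".join(sentences)


# Hypothetical inputs for one location.
seattle_ratios = {"pepperoni": 2.5, "mushroom": 1.25, "sausage": 1.0, "anchovies": 0.2}
seattle_trends = {"pepperoni": "flat", "mushroom": "rising", "anchovies": "falling"}
print(describe_city("Seattle", seattle_ratios, seattle_trends))
```

Because the sentences are driven by the data rather than by the city name alone, two locations with different search behavior end up with genuinely different paragraphs.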
Now the first step, of course, is just looking at local trends. But local trends aren’t the only place we can look. We can go beyond that. For example, we can compare it to other locations. So it might be just as interesting that in Seattle people really like mushroom as a topping or something of that sort.
But it would also be really interesting to see if the toppings that are preferred, for example, in Chicago, where Chicago style pizza rules, versus New York are different. That would be something that would be interesting and could be automatically drawn out by natural language generation. Then finally, another thing that people tend to miss in trying to implement this solution is they think that they have to compare everything at once.
That’s not the way you would do it. What you would do is you would choose the most interesting insights in each situation. Now we could get technical about how that might be accomplished. For example, we might say, okay, we can look at trends. Well, if all of the trends are flat, then we’re probably not going to choose that information. But if we see that the relationship between one topping and another topping in this city is exceptionally different compared to other cities, well, that might be what gets selected.
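Here is one hedged way that selection step might look: skip flat trends, and keep a cross-city comparison only when the city is an outlier. The city ratios, the z-score cutoff of 1.5, and the function names are assumptions made up for this sketch, not a description of the system used for the original client.

```python
from statistics import mean, pstdev

# Hypothetical pepperoni-to-sausage interest ratios by city.
ratio_by_city = {
    "Seattle": 2.5,
    "Chicago": 0.9,
    "New York": 2.2,
    "Denver": 2.4,
    "Austin": 2.3,
}


def z_scores(values_by_city: dict) -> dict:
    """Measure how far each city's value sits from the cross-city average."""
    values = list(values_by_city.values())
    avg, spread = mean(values), pstdev(values)
    if spread == 0:
        return {city: 0.0 for city in values_by_city}
    return {city: (value - avg) / spread for city, value in values_by_city.items()}


def pick_insights(city: str, trend_labels: dict, city_z: dict, z_cutoff: float = 1.5) -> list:
    """Keep only the data points worth writing about for this city."""
    insights = []
    # Flat trends carry little news, so keep only rising or falling toppings.
    for topping, label in trend_labels.items():
        if label != "flat":
            insights.append(("trend", topping, label))
    # Keep the cross-city comparison only when this city is an outlier.
    if abs(city_z.get(city, 0.0)) >= z_cutoff:
        insights.append(("outlier", "pepperoni vs. sausage", round(city_z[city], 2)))
    return insights


scores = z_scores(ratio_by_city)
chicago = pick_insights("Chicago", {"pepperoni": "flat", "sausage": "rising"}, scores)
seattle = pick_insights("Seattle", {"pepperoni": "flat", "sausage": "flat"}, scores)
print(chicago)
print(seattle)
```

In this made-up data, Chicago stands out from the other cities and gets a comparison insight, while Seattle, with flat trends and a typical ratio, produces nothing from these two checks and would need other data points.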
Now here’s where the question comes in about white hat versus black hat. So we’ve got this local page, and now we’ve generated all of this textual content about what people want on a pizza in that particular town or city. We need to make sure that this content is actually quality. That’s where the final step comes in, which is just human review.
In my opinion, auto-generated content, as long as it is useful and valuable and has gone through the hands of a human editor who has identified that that’s true, is every bit as good as if that human editor had just looked up that same data point and written the same sentences.
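The human review step can be enforced in the publishing pipeline itself, so nothing goes live without an editor's sign-off. This is a minimal sketch; the `Draft` class, its status values, and the helper functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    city: str
    text: str
    status: str = "pending"  # pending, approved, or rejected


def review(draft: Draft, approved: bool) -> Draft:
    """Record a human editor's decision on a generated draft."""
    draft.status = "approved" if approved else "rejected"
    return draft


def publishable(drafts: list) -> list:
    """Return only the drafts a human editor has approved."""
    return [draft for draft in drafts if draft.status == "approved"]


# Hypothetical generated drafts awaiting review.
drafts = [
    Draft("Seattle", "In Seattle, pepperoni is the most searched-for topping."),
    Draft("Chicago", "In Chicago, sausage is the most searched-for topping."),
    Draft("Denver", "In Denver, mushroom is the most searched-for topping."),
]
review(drafts[0], approved=True)
review(drafts[1], approved=False)
# The Denver draft is never reviewed, so it stays pending and is not published.
print([draft.city for draft in publishable(drafts)])
```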
So I think in this case, especially when we’re talking about providing data to such a diverse set of locales across the country, that it makes sense to take advantage of technology in a way that allows us to generate content and also allows us to serve the user the best possible and the most relevant content that we can.
So I hope that you will take this, spend some time looking up natural language generation, and ultimately be able to build much better local pages than you ever have before. Thanks.