The terms that a user types into a search engine are commonly referred to as keywords (sometimes called keyphrases or keyterms), and they form the majority of the input information considered by the search engine when determining the results that should be returned.
Keyword research refers to the process of discovery and analysis a marketer undertakes when seeking to understand how their target market searches. Typically it will seek to understand searcher behavior throughout the buying cycle, from very broad exploratory terms to highly targeted commercial terms.
The process typically works in some key steps:
Insights from keyword research form a valuable input into a variety of stages of a marketing strategy:
All in all, the information we can glean from the ways our target market searches is some of the most cost-effective and valuable market research we can do. The rest of this module goes into more detail on the actual process along with tips and tricks, plus warnings of potential pitfalls.
Search engines have always taken keywords as input. They have varied in the degree to which they have exposed how many people are searching for different things and in how much they have revealed about the ranking algorithms that they use to display results based on those keywords.
There are three primary sources of keyword data:
Most of the other retail sources of keyword data buy or gather their data from one of these sources.
In late 2011, Google began moving from the vast majority of queries happening in the clear, over http (non-secure) connections to almost all queries happening via https (secure) connections. This automatically resulted in search queries being hidden from ISPs as the URLs being requested were obscured within encrypted requests.
At the same time, Google changed the information passed to the destination website in the so-called ‘referrer’ field, which normally shows what URL someone was on before they clicked a link to your website. Google obscured the keyword data in the referrer field so website analytics (and aggregators of website analytics) could no longer analyse the search terms used by most of their visitors from search. In place of the old keyword data, Google started passing (not provided). Things became harder still for website owners when Google stopped showing precise keyword volumes in its Keyword Planner unless you were spending significant amounts on Google Ads.
Between these changes, the current state of play is that keyword search data is only available from:
The other major change to impact how we think about keywords is the increasing diversity of inputs considered by the algorithm. In addition to the words entered by the user, Google in particular will consider a number of implicit aspects of the query in order to customize the results. Among others, these can include:
It used to be that search engines would return the exact same set of results in the same order for any user for a given input. Now that this is no longer true, we need to pay even more attention to the intent of the user and their context.
Probably the best example of this is the increasing utility of very bland searches. Searches like “cinema” would not have returned useful results in the past. These days a search like this is likely to be treated with implicit geographical, social and preference inputs as a search for something much more specific, and you are likely to get recommendations of cinemas nearby.
It would definitely be nice to have access to more granular data than is now available, but the main effect of the changes is to encourage a focus on qualitative keyword research over exact mathematical calculations. In particular:
After digging a little deeper into the data quality, we’ll start learning more about the process itself.
In an ideal world, for any conceivable theme, we would have access to:
We do have some information in all of these areas, but it’s not perfect. In particular, since categories 1-3 are now almost exclusively provided by the search engines and paid search tools, it is hard to sense-check the information provided. There are some generalities that can be used to make judgments, but figuring this out is not an exact science.
The primary source of keyword data from Google is the Keyword Planner. It ostensibly provides volume (if you pay for ads, otherwise it is vague) and seasonality data, but we now have little way of sense-checking it.
For some of the other data pieces, we have to turn to any source we can find for keyword data. SEMrush is probably the best paid tool on the marketplace for keyword data across the globe.
As a general rule, there is a natural tendency for Google to over-emphasise commercial searches, if for no other reason than that the data’s utility comes largely from its use in keyword advertising, which is naturally commercially focused. For this and other similar reasons, it is important to treat Google-provided keyword data as directional and relative, and to use it for idea-mining rather than as an input into traffic modelling, unless you are prepared to work around the caveats.
At Google, the top 20% of searches account for much less than 80% of the search volume. The incredible volume in the “long tail” of searches (each of which may be executed only once a month or even less frequently) is mind-boggling. It’s hard for us to comprehend, but one illustration that can help is to think about the trillions of unique search queries Google has seen up to this point – and to realise that 20-25% of all searches are brand new each day.
The upshot of all of this is that keyword data is naturally limited in its ability to capture the incredible variety of ways that people search. Not only are individual keywords less interesting than aggregated search volume across a wide range of related long-tail keywords, but the fact that many of these long-tail keywords have never been searched for before makes them exceptionally hard to analyze.
This is yet one more reason why our approach is to seek themes, concepts and groups rather than to analyze each keyword individually.
There are a variety of ways of classifying different types of search by intent, but SEO people generally refer to the following segmentation:
It is generally not possible to measure intent exactly, but rather we must seek to infer it and understand more about the kind of thing the searcher is seeking. Our ultimate goal, of course, is to drive profitable user action by aligning what the website is offering with the user’s desires.
In all three of the categories identified above, it can be useful to classify searches additionally as branded or unbranded. Branded searches are those that mention or seek a specific brand or website. Mentions are simple to understand: a search for “Instagram”, for example, has straightforward intent. You may also choose to count certain other kinds of searches as branded, for example those mentioning specific unique product names or other identifiers.
Let’s look at examples of each kind of keyword in both branded and unbranded form.
Although in general, we need to remember that a given keyword can mean different things to different searchers (think about all the different potential meanings for a search for “monopoly” for example), we know our business and it is unusual for us to find keywords that our audience uses in multiple different ways.
So step one of any identification process is to put ourselves in the shoes of our audience (or potential audience) performing the search.
Assuming we know our business well, it is then generally easy to assign any given keyword to one of the six buckets outlined above (branded or unbranded; navigational, informational or transactional). What is harder is making that process scale to dozens, hundreds, or thousands of keywords.
To do this effectively, we must find rules of thumb and heuristics to help us. Typically these rules take the form of Excel or Google Sheets formulae.
Here are some of the rules and heuristics that we have found useful:
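As a minimal sketch of how rules of this kind might be expressed in code rather than as spreadsheet formulae, the Python below assigns a keyword to one of the six buckets by looking for trigger words. The word lists, the brand name `acme` and the function name `classify` are all hypothetical placeholders chosen for illustration; they are not the specific rules referred to above and would need tuning for your own market.

```python
import re

# Hypothetical trigger words; replace with terms that fit your own market.
BRAND_TERMS = {"acme"}  # placeholder brand name
TRANSACTIONAL = {"buy", "price", "cheap", "order", "discount", "deal"}
NAVIGATIONAL = {"login", "website", "contact", "address"}
INFORMATIONAL = {"how", "what", "why", "guide", "tutorial", "tips"}


def classify(keyword):
    """Return a (branded-or-unbranded, intent) pair for one keyword."""
    words = set(re.findall(r"[a-z0-9']+", keyword.lower()))
    brand = "branded" if words & BRAND_TERMS else "unbranded"
    if words & TRANSACTIONAL:
        intent = "transactional"
    elif words & NAVIGATIONAL:
        intent = "navigational"
    elif words & INFORMATIONAL:
        intent = "informational"
    elif brand == "branded":
        # Assumption: a bare brand search is an attempt to reach that site.
        intent = "navigational"
    else:
        intent = "unclassified"  # leave for manual review
    return brand, intent


for kw in ["buy acme widgets", "how to fix a widget", "acme", "widgets"]:
    print(kw, "->", classify(kw))
```

Keywords that fall through every rule are marked `unclassified` rather than guessed at, so they can be reviewed by someone who knows the business.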
Beyond identifying new opportunities, a key goal of keyword research is to match your website content with the intent of the searcher. This is important for a number of reasons:
You can identify intent mismatches by seeking pages and sections with high bounce rates from organic search in analytics.
Of course, it’s impossible to control all the things that a page ranks for, so sometimes you will find high bounce rates where pages rank for an accidental term. Outside of those situations, where you find bounce rates indicating a mismatch, you can use the framework above to modify page content to suit.
The most common mismatch we see is websites that put commercial, conversion-oriented content in front of searchers performing informational queries. Of course, you need to consider the profitable outcomes you are seeking, but if users don’t even stay around to consume any content, you have no chance of converting them. In these situations, you may need to consider smaller conversion steps such as email subscription sign-ups.
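The bounce-rate check described above could be sketched roughly as follows. The field names (`page`, `sessions`, `bounces`), the 70% threshold and the minimum-traffic cut-off are assumptions made for illustration; your analytics export will have its own column names, and a sensible threshold depends on the site.

```python
def flag_intent_mismatches(rows, threshold=0.7, min_sessions=100):
    """Return (page, bounce_rate) pairs at or above the threshold, worst first.

    Each row is assumed to describe organic-search landing-page traffic.
    """
    flagged = []
    for row in rows:
        sessions = row["sessions"]
        if sessions < min_sessions:
            continue  # too little traffic to draw a conclusion
        bounce_rate = row["bounces"] / sessions
        if bounce_rate >= threshold:
            flagged.append((row["page"], round(bounce_rate, 2)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up example rows, not real analytics data.
organic_landing_pages = [
    {"page": "/pricing", "sessions": 400, "bounces": 340},
    {"page": "/blog/guide", "sessions": 1000, "bounces": 350},
    {"page": "/tiny", "sessions": 10, "bounces": 10},
]
print(flag_intent_mismatches(organic_landing_pages))
```

Pages flagged this way are candidates for review only; as noted above, some will simply be ranking for an accidental term.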
You can get a long way by bringing together people who understand the underlying business with people who are used to thinking about searcher behavior. You can do this face-to-face or virtually, but the key is to ask good questions. For example, for a B2B website, think about questions like:
At this stage, you are not necessarily seeking to structure your output particularly carefully, but can just make an unordered list of ideas.
As you begin to exhaust your creativity in brainstorming, you should turn to additional resources for ideas and evidence of how your customers and potential customers think about your services.
Useful insights may be found from:
And any other places you can find website users and customers writing in their own words.
Before working with the list you are building, you will want to gather data from a variety of sources to identify edge cases and also to enhance your master list with whatever other data you can find.
Now that you have a seed list of keywords, you need to expand them into a list that is as comprehensive as possible. Below are a few techniques that you can learn to do this.
Google Suggest can be a great way to expand your seed keyword list and provoke more ideas for you to explore. You won’t be given search volume for the keywords but Google Suggest is largely powered by user intent which means that it gives you a good indicator of what people search for.
Ubersuggest, now owned by well-known internet marketer Neil Patel, has expanded its functionality to provide keywords and rough volumes. Ubersuggest basically scrapes Google Suggest, which means you don’t have to manually enter each possible keyword and note down the results. You can start with a simple search and get a long list of suggestions.
You can select your target language and country as well as the type of results you want, such as web, news or images. It is worth trying these options because different result types often produce very different suggestions.
You only need to run a few variations of searches on Ubersuggest to get a few thousand suggestions very quickly. Once you’ve copied all of these into your spreadsheet, you will need to de-dupe them (remove duplicates), as there will sometimes be crossover.
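If you would rather de-dupe outside the spreadsheet, a minimal Python sketch is shown below. The function name `dedupe_keywords` is made up for this example, and treating keywords that differ only in capitalisation or spacing as duplicates is an assumption you may want to change.

```python
def dedupe_keywords(keywords):
    """Remove duplicates, ignoring case and extra whitespace, keeping first-seen order."""
    seen = set()
    unique = []
    for kw in keywords:
        normalised = " ".join(kw.lower().split())
        if normalised and normalised not in seen:
            seen.add(normalised)
            unique.append(normalised)
    return unique


suggestions = ["Running Shoes", "running  shoes ", "trail shoes", "", "Trail Shoes"]
print(dedupe_keywords(suggestions))
```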
Another way to find more keyword ideas and expand your seed list is to look at related searches at the bottom of Google search results.
We can see that these are slightly different to the suggestions we get, so it’s another source of keyword data that we can use to expand our master list.
You can go pretty deep with this and keep clicking through until you’re at very long-tail keywords.
In order to help prioritise your activities, it is valuable to expand your raw keyword list with additional data. You might seek data like:
Before you start using other tools to add this extra data, you should make sure that your master list of keywords is tidy and free from duplicates.
To gather search volume and seasonality data for your list of keywords, you can use the Keyword Planner tool in Google Ads (formerly AdWords).
You can see data for the following:
These data points are somewhat useful in determining competition but you need to bear in mind that it’s purely competition from an AdWords perspective, not an organic search perspective.
You can then download all of this data back into a new CSV file so that you have your core list of keywords but with lots of additional bits of useful data.
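As a rough sketch of joining the downloaded data back onto your master list, the Python below reads a CSV export and looks up a volume for each keyword. The column names `Keyword` and `Avg. monthly searches` are assumptions about the export format (pass different names if your file differs), and the sample rows are made-up numbers used only to show the mechanics.

```python
import csv
import io


def normalise(keyword):
    """Lower-case and collapse whitespace so lookups match reliably."""
    return " ".join(keyword.lower().split())


def attach_volumes(master_keywords, export_file,
                   keyword_col="Keyword", volume_col="Avg. monthly searches"):
    """Map each master keyword to its exported volume, or None if it is missing."""
    volumes = {}
    for row in csv.DictReader(export_file):
        # Strip thousands separators such as "12,100" before converting.
        volumes[normalise(row[keyword_col])] = int(row[volume_col].replace(",", ""))
    return {kw: volumes.get(normalise(kw)) for kw in master_keywords}


# Made-up sample export; in practice, open the downloaded file instead.
sample_export = io.StringIO(
    "Keyword,Avg. monthly searches\n"
    "running shoes,\"12,100\"\n"
    "Trail Shoes,880\n"
)
result = attach_volumes(["running shoes", "trail shoes", "barefoot shoes"], sample_export)
print(result)
```

Keywords with no match come back as `None` rather than zero, so that missing data is not mistaken for no demand.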
Again, this process is a lot easier with a paid tool such as SEMrush, which is what we use at SEO Services NI.
Many keyword tools, such as Ubersuggest and SEMrush, also allow you to define a geographic location so that you can see how search volume differs by country:
This can be really useful for a couple of reasons:
At the end of this process, you should have a list of relevant keywords along with search volume and you should have a strong idea of their seasonality and geographical differences.
We already know the importance of keyword classification based on searcher intent. Once we are armed with an actual keyword list, we can also slice this data in other ways. One key way that helps integrate your keyword data into a publishing schedule or content calendar is to consider themes and concepts.
While the exact terminology isn’t important, it is important to separate two different ways of grouping keywords:
You can define groups “top down” by creating the groups based on external information such as website sections or folders or “bottom up” by seeking emergent themes in the data.
Either way, you can start to define rules for classifying your keywords – typically either based on lookup relationships to words in the query or by using intermediate lookups that first map words to synonyms and then use lookups to perform the classification.
You can choose how narrow or broad to make your themes depending on the use to which you are putting the data. To guide a website’s information architecture, you may seek fewer, broader categories while a content strategy or calendar may need more fine detail.
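The lookup-based classification described above can be sketched in a few lines of Python. This is only an illustration of the two-step idea (map words to synonyms, then map the canonical words to themes); the `SYNONYMS` and `THEMES` tables and the function name `themes_for` are invented examples, not a recommended taxonomy.

```python
import re

# Hypothetical intermediate lookup: raw query word -> canonical word.
SYNONYMS = {
    "sneakers": "shoes",
    "trainers": "shoes",
    "cheap": "budget",
    "affordable": "budget",
}

# Hypothetical classification lookup: canonical word -> theme.
THEMES = {
    "shoes": "Footwear",
    "budget": "Price-sensitive",
    "marathon": "Events",
}


def themes_for(keyword):
    """Return the sorted list of themes whose trigger words appear in the keyword."""
    found = set()
    for word in re.findall(r"[a-z0-9]+", keyword.lower()):
        canonical = SYNONYMS.get(word, word)
        if canonical in THEMES:
            found.add(THEMES[canonical])
    return sorted(found)


print(themes_for("cheap trainers for marathon"))
print(themes_for("hiking socks"))
```

A keyword can belong to several themes at once, and one that matches nothing returns an empty list, which is a useful prompt to look for emergent “bottom up” themes. Making themes broader or narrower is then a matter of editing the `THEMES` table.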
Competitor information serves two main purposes during keyword research:
At a high level, you can often determine major keyword groupings simply by looking at a competitor’s menu structure and major categories. To dive deeper, you will likely need to gather more data – normally through crawling the site in question or through external data sources like SEO Quake, Scrapebox, Sitebulb or Screaming Frog. The kinds of data that you can seek out include:
It is valuable to perform this kind of analysis both for real-world competitors and for those sites that appear in the search results for words and phrases you are targeting.
For example, it is typically worth checking out:
Below are a few techniques for finding keywords that your competitors are targeting.
You can use a crawler such as Screaming Frog to find on-page keyword targeting from elements of the page such as metadata and header tags.
You can export all of these titles into a spreadsheet and look for the most used keywords and their frequency.
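If the export is large, counting the words programmatically may be quicker than doing it by hand. The sketch below assumes you already have the page titles as a list of strings; the stopword list and the function name `title_term_counts` are illustrative choices, and the sample titles are invented.

```python
import re
from collections import Counter

# A small, illustrative stopword list; extend it to suit your data.
STOPWORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "on", "with", "your"}


def title_term_counts(titles):
    """Count how often each non-stopword appears across a list of page titles."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9]+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Invented example titles standing in for a crawler export.
titles = [
    "Running Shoes for Men | Example Store",
    "Trail Running Shoes | Example Store",
    "Guide to Running",
]
for term, count in title_term_counts(titles).most_common(3):
    print(term, count)
```

Words that appear on every page, such as the site name, will dominate the counts, so you may want to add them to the stopword list.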
Using third party tools
There are tools available that can give you insight into the keywords that competitors rank well for. The two most common ones are Searchmetrics and SEMrush. These tools gather a lot of ranking data for a wide range of keywords which means that you can see a sample of keywords that your competitor is ranking for.
This can be useful in finding competitors who you may not have previously considered.
There are two ways of looking at competition levels:
You can often get a good sense of both of these areas by simply searching around the themes and topics you are targeting, but there is a range of tools that are useful. For example, SEMrush gives an approximate measure of the percentage of traffic a website is getting for a given keyword. This information can be very useful for analysing competitor performance.
You can use the data you gather from this research, combined with an understanding of the authority of your own website to inform your own content strategy. In particular, if you find yourself in an area of high competition and with your own website being lower authority than the established players, you may find it more effective to target the gaps and seek areas that the bigger players haven’t targeted.
In order to work out how much value might accrue to your business from targeting a keyword or theme, you need to combine three kinds of estimation:
Historically, search volume data has come from two main sources:
The ISP data has gradually gone away and the remaining data that comes from the search engines themselves comes with its own downsides:
The data we get from Google is important, but the best we can really do is take the information we get from paid tools while understanding that it is not always accurate.
After factoring in the competitiveness of the keywords and the strength of your own site, you can begin to build models for the potential performance you might see when you build out content targeting the phrases you’ve chosen. This is not an exact science, of course, but by building the model, you can understand the sensitivity to the different assumptions you make along the way.
Once you have a performance model, you can layer on top an estimate of clickthrough rates to build a model for the search traffic that you may be able to capture. As Google’s user interface becomes increasingly complicated, with rich snippets, answer boxes, knowledge graphs etc., these steps are also getting harder. If you have an existing website covering some of the areas you plan to target, you can sense-check your calculations by comparing the output to the organic search traffic that existing pages are already capturing.
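A toy version of such a model is sketched below: each keyword gets a monthly search volume and an expected ranking position, and an assumed click-through rate for that position converts volume into visits. The CTR figures in `ASSUMED_CTR` are placeholders invented for illustration, not measured data, and the function name is hypothetical; substitute your own estimates.

```python
# Illustrative placeholder click-through rates by ranking position.
# These are NOT measured values; replace them with your own estimates.
ASSUMED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}


def estimate_monthly_visits(keywords, ctr_by_position=ASSUMED_CTR):
    """Sum volume * CTR over (monthly_volume, expected_position) pairs.

    Positions missing from the CTR table are treated as capturing no clicks.
    """
    total = 0.0
    for volume, position in keywords:
        total += volume * ctr_by_position.get(position, 0.0)
    return total


# Made-up plan: (monthly search volume, expected ranking position).
plan = [(1000, 1), (500, 3), (200, 8)]
optimistic = estimate_monthly_visits(plan)
# Sensitivity check: what if every page ranks two places lower than hoped?
pessimistic = estimate_monthly_visits([(v, p + 2) for v, p in plan])
print(optimistic, pessimistic)
```

Running the same plan under different ranking or CTR assumptions is a simple way to see how sensitive the traffic estimate is to each input.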
When you are planning the structure of a new website or seeking to improve the structure of an existing website, you are typically trying to understand the kinds of thing that your visitors are going to be looking for. This is where the themes and categories discovered through wide-ranging keyword research can come in useful.
By targeting themes in this way, you not only create sensible classifications for the content you have or that you want, but you are also creating user-centric paths to discovering the content that your users really want.
If you would like to learn more about information architecture and website structure, you should check out our information architecture module.
There are a whole bunch of inputs that are useful in determining the kind of content that you choose to invest in and create, but demand is a key element.
Of course, you will also want to consider brand messaging, the competitive landscape, and your ability to build an audience or drive engagement, but by using your keyword research, you ensure that you are building realistic models and that you are aligning your strategy with the opportunities in the market.
Of course, one of the primary uses of keyword research data is to guide the tactical creation of individual pieces of content. Typically, once you have chosen the subject and the target keyword, it is straightforward to integrate the specific words and phrases that your demand research has identified (if it’s not, that typically points to a mismatch in your content targeting).
Keyword research can be a daunting and time-consuming process. That’s why many companies have dedicated in-house teams for these types of marketing activities or employ outside professional SEO services. If you are planning to do it yourself, we recommend SEMrush: its interface is user-friendly, the data is excellent and the price is very reasonable for what you get. Its features are also updated regularly. A free 7-day pro trial is available.